Principal Component Analysis in Dynamic Force Spectroscopy
- Fig. 1: Interface between EPON resin (gray mark) and PEMMA polymer (red mark). The image was acquired in amplitude modulation mode (excitation frequency -5% of the resonance frequency, setpoint 72% of the free oscillation amplitude). Panel (b) shows the ROI selected from the larger image (a). Panels (c–f) show spectroscopy data taken at positions 1 and 2: the channels acquired are Amplitude (c), Phase (d), Frequency (e) and Amplitude 2 (f); continuous and dotted lines are approach and retract curves, respectively (image reprinted from [1]).
- Fig. 2: First principal component of the dynamic spectroscopy map: Amplitude (a), Second Harmonic (b), Phase (c), Frequency (d), and all channels together (e). Map (f) is the difference between (d) and (e); the color bar scale is in parts per thousand (image reprinted from [1]).
- Fig. 3: 3D topography map of the ROI, overlaid with the dynamic spectroscopy amplitude signal after PCA filtering. The lack of correlation between the topography and PCA maps highlights the effect of hidden structures lying below the surface on the mechanical properties of the sample. The high sensitivity of dynamic modes is also evident from the color distribution.
We present a simple method to directly identify nanometer-sized textures in composite materials by means of AFM spectroscopy, aimed at recognizing nanometer structures embedded in a sample. It consists of acquiring a set of dynamic data organized in spectroscopy maps and subsequently extracting the most valuable information by means of the Principal Component Analysis (PCA) method [1]. In this work we explain the main features of the method and show its application to a nanocomposite sample.
Force Spectroscopy by AFM
Local material properties can be probed with nanometer resolution by means of an atomic force microscope (AFM) performing force spectroscopy experiments. In these experiments, tip-sample interaction forces are measured by acquiring the quasi-static cantilever deflection as a function of separation while the tip is brought into contact with the sample and then retracted from it.
Force-distance (FD) curves contain valuable information about nanoscale material properties [2] such as adhesion, elasticity and plasticity, as well as friction. Several mathematical models have been proposed to describe idealized tip-sample interactions, in which a rounded tip with a well-known radius of curvature interacts with a flat surface (a hypothesis that is not always verified), in the presence (JKR and DMT) or absence (Hertz) of adhesion.
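As a reminder of what these contact models look like in practice, the following minimal sketch evaluates the textbook Hertz and DMT force laws for a sphere on a flat surface, together with the DMT and JKR pull-off forces. The function names and the numerical values are hypothetical and chosen only for illustration; they are not taken from the measurements discussed here.

```python
import math

def hertz_force(indentation, tip_radius, reduced_modulus):
    """Hertz force (N) for a sphere of radius tip_radius (m) indenting a flat
    surface by indentation (m); reduced_modulus is in Pa. No adhesion."""
    if indentation <= 0.0:
        return 0.0  # no contact, hence no force in the Hertz picture
    return (4.0 / 3.0) * reduced_modulus * math.sqrt(tip_radius) * indentation ** 1.5

def dmt_force(indentation, tip_radius, reduced_modulus, work_of_adhesion):
    """DMT force in contact (indentation >= 0): Hertz repulsion minus a
    constant adhesive offset 2*pi*R*w (w = work of adhesion, J/m^2)."""
    adhesion = 2.0 * math.pi * tip_radius * work_of_adhesion
    return hertz_force(indentation, tip_radius, reduced_modulus) - adhesion

def pull_off_force(tip_radius, work_of_adhesion, model="DMT"):
    """Pull-off force (N, negative = attractive): 2*pi*R*w for DMT,
    1.5*pi*R*w for JKR."""
    prefactor = {"DMT": 2.0, "JKR": 1.5}[model]
    return -prefactor * math.pi * tip_radius * work_of_adhesion

# Illustrative values only: 10 nm tip radius, 1 GPa reduced modulus,
# 0.05 J/m^2 work of adhesion, 2 nm indentation.
R, E_star, w = 10e-9, 1e9, 0.05
f_hertz = hertz_force(2e-9, R, E_star)
f_dmt = dmt_force(2e-9, R, E_star, w)
```

Fitting such a law to every curve of a map is the per-curve modeling step that the screening approach discussed in this article aims to postpone.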
An accurate evaluation of each FD curve in a 3D spectroscopy map (a 2D array of FD curves) is, in most cases, time consuming, especially for biological samples, which call for more refined models. If we shift to dynamic AFM, in which the cantilever is dithered close to its resonance frequency, we can use the same approach, the so-called "dynamic force spectroscopy", but the complexity of the system increases considerably. In this case we collect several parameters as a function of distance (such as static deflection, amplitude, phase, higher harmonics, frequency shifts, etc.), which amounts to a much larger data set. As shown by several authors [3,4], approach curves contain valuable information about chemical composition, short- and long-range interaction forces, friction and plasticity.
In this context, introducing secondary oscillations (multifrequency AFM) during imaging has shown interesting results in enhancing material contrast and in going "beneath the surface", but the theoretical understanding is still under development [5].
Applying PCA to Data
This large amount of information requires a heavier computational effort to reconstruct physically meaningful parameters than contact models do. As a result, a fast and easy analysis relying on these dynamic methods is still far from being routinely applied to spectroscopy maps, or is limited to a small part of the available information.
The Principal Component Analysis (PCA) method [6] has been successfully applied to compress complex data series and is a well-suited statistical tool to facilitate the analysis of multi-parameter maps [7,8].
In short, this algorithm takes the D spectroscopy curves (e.g. amplitude, phase, frequency, etc.), each containing P values (depending on the sampling frequency), acquired at each point of an L x C grid, and projects their information onto a small set of L x C maps without any assumption about the sample structure, filtering out redundancies and noise. As a consequence, a huge amount of 3D data is condensed into a few 2D maps that are easy to examine.
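The projection described above can be sketched for a single channel as follows. This is a minimal illustration using a plain SVD in NumPy, not necessarily the implementation used for the results shown here; the function name `pca_maps` and the synthetic two-region data are assumptions made for the example.

```python
import numpy as np

def pca_maps(curves, n_components=3):
    """Project an (L, C, P) stack of spectroscopy curves onto its first
    principal components and return them as (n_components, L, C) maps."""
    n_rows, n_cols, n_points = curves.shape
    # One row per grid point, one column per sample along the curve.
    X = curves.reshape(n_rows * n_cols, n_points).astype(float)
    X -= X.mean(axis=0)  # center each curve sample across the grid
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)  # fraction of variance per component
    maps = scores.T.reshape(n_components, n_rows, n_cols)
    return maps, explained[:n_components], Vt[:n_components]

# Synthetic example (hypothetical data): two regions whose curves have
# different slopes, plus a small amount of noise.
rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 50)
curves = np.empty((8, 8, 50))
curves[:, :4] = 1.0 - 0.5 * z   # left half of the grid
curves[:, 4:] = 1.0 - 0.9 * z   # right half of the grid
curves += 0.01 * rng.standard_normal(curves.shape)

maps, explained, loadings = pca_maps(curves, n_components=2)
```

In this toy case the first map takes opposite signs in the two halves of the grid, which is the kind of contrast used to locate an interface; the sign of a principal component is arbitrary, so only the contrast is meaningful.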
Data from all channels are grouped together and analyzed: in this way the data are reduced to maps of L x C dimensionality that summarize the independent information carried by each parameter individually and by all of them together. This process is intended to find different features within the probed region (in the following example a 5 x 5 µm² interface area) and to evaluate, at the same time, the response of the different dynamic parameters. The results therefore provide a robust and quick screening method to locate regions of interest (ROI) on the sample, toward which further investigations can be directed.
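One possible way to group the channels is sketched below. How the channels were actually scaled and combined is not specified here, so the per-channel standardization and the side-by-side concatenation are assumptions, as are all function names and the synthetic data. The squared loadings of the first component, summed per channel, give a rough measure of the relative weight of each channel.

```python
import numpy as np

def stack_channels(channels):
    """channels: dict name -> (L, C, P) array. Each channel is scaled to zero
    mean and unit spread (an assumed normalization), then all channels are
    concatenated into one (L*C, D*P) matrix."""
    blocks, grid_shape = [], None
    for data in channels.values():
        n_rows, n_cols, n_points = data.shape
        block = data.reshape(n_rows * n_cols, n_points).astype(float)
        block -= block.mean()
        spread = block.std()
        if spread > 0.0:
            block /= spread
        blocks.append(block)
        grid_shape = (n_rows, n_cols)
    return np.hstack(blocks), grid_shape

def first_component(matrix, grid_shape):
    """Return the first principal component as a map, plus its loading vector."""
    centered = matrix - matrix.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return (u[:, 0] * s[0]).reshape(grid_shape), vt[0]

def channel_weights(loading, channels):
    """Sum of squared loadings per channel; the weights add up to 1."""
    weights, start = {}, 0
    for name, data in channels.items():
        stop = start + data.shape[2]
        weights[name] = float(np.sum(loading[start:stop] ** 2))
        start = stop
    return weights

# Synthetic example (hypothetical data): only "frequency" carries contrast.
rng = np.random.default_rng(1)
z = np.linspace(0.0, 1.0, 20)
amplitude = rng.standard_normal((16, 16, 20))
frequency = 0.01 * rng.standard_normal((16, 16, 20))
frequency[:, 8:] += z
channels = {"amplitude": amplitude, "frequency": frequency}

matrix, grid_shape = stack_channels(channels)
pc1_map, loading = first_component(matrix, grid_shape)
weights = channel_weights(loading, channels)
```

With this toy input almost all of the weight falls on the "frequency" entry, mimicking a situation in which one channel dominates the combined map.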
Example of Application
The sample considered is the interface region between EPON resin (left) and PEMMA polymer (right). The selected area, shown in figure 1, is imaged in amplitude modulation mode.
The next step is the acquisition of the approach curves. The area is divided into a 64 x 64 grid, and for each point we obtain 4 vectors corresponding to the 4 channels acquired: Amplitude (variation of the oscillation amplitude of the main vibration mode, also used as trigger channel); Amplitude2 (variation in amplitude of a secondary oscillation, in this case the second harmonic); Phase; and Frequency (shift in frequency of the primary "driven" oscillation). All channels are sampled at 2.147 Hz.

The right column of figure 1 shows the spectroscopy data for two points (1, 2). The plastic behavior, recognizable from the differences between approach and retract curves in channels (e) and (f), is expected; the high sensitivity of the frequency and second harmonic channels compared with the amplitude signal is noteworthy. It is also interesting to note that channel (f), for instance, exhausts its main contribution 50 nm above the Amplitude trigger point in position 2, while the same channel in position 1 still has high sensitivity.

Given the input vectors, PCA is applied. We retained a number of components sufficient to explain a given amount of variance (98%). The results are shown in figure 2: the region of interest (ROI) we focused on lies at the interface between polymer and resin. The two areas have different compositions and, at a smaller scale, different roughness and mechanical properties. The first principal component of each spectroscopy channel is shown: maps (a), (b), (c) and (d) show the results for amplitude, second harmonic, phase and frequency, respectively. Map (e) is the first principal component of the data from all channels.
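The 98% variance criterion used above to choose the number of retained components can be sketched as follows. The helper name `components_for_variance` and the example singular values are hypothetical; the singular values are assumed to be sorted in decreasing order, as returned by a standard SVD.

```python
import numpy as np

def components_for_variance(singular_values, threshold=0.98):
    """Smallest number of leading components whose cumulative explained
    variance reaches the given threshold (0 < threshold <= 1)."""
    variances = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(variances) / np.sum(variances)
    # Index of the first cumulative value >= threshold, converted to a count.
    count = int(np.searchsorted(cumulative, threshold)) + 1
    return min(count, len(cumulative))

# Made-up singular values: the variances are 100, 9, 1 and 0.25, so the
# cumulative explained variance is about 0.907, 0.989, 0.998 and 1.0.
n_keep = components_for_variance([10.0, 3.0, 1.0, 0.5])
```

Here two components are enough to pass the 98% threshold; a stricter threshold of 99% would require three.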
Direct comparison of map (e) with the difference map (f) shows that the most important contribution comes from the frequency shift channel. On some of the maps (frequency and second harmonic channels) we can clearly recognize the interface between the two materials (Arrow 2), whereas the Amplitude channel provides much lower contrast. Surprisingly, a new feature, not visible in the topography data, appears in the left part of all the maps [Arrow 3, see map (b)]; its substructures are highlighted differently by the different channels, which suggests a complex inner structure that can be evaluated. The same happens for the region at point 4.
A direct overlay of PCA-processed amplitude data and sample topography is shown in figure 3. Here the PCA results appear as a colormap "painted over" the 3D-rendered height channel, showing that the main features extracted from the spectroscopy maps are completely uncorrelated with specimen morphology and surface structure. This is regarded as an indication of a compositional texture lying below the surface, not visible in topography but still affecting the mechanical properties of the sample and, therefore, emerging from dynamic force spectroscopy data.
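The lack of correlation between a PCA map and the topography is assessed visually above. A simple way to put a number on it, offered here only as a possible complement, is the Pearson correlation coefficient between the two maps, assuming they are sampled on the same grid. The function name and the synthetic maps below are hypothetical, and the coefficient only captures linear relationships.

```python
import numpy as np

def map_correlation(map_a, map_b):
    """Pearson correlation coefficient between two maps of equal shape."""
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic example (hypothetical data) on a 64 x 64 grid.
rng = np.random.default_rng(2)
topography = rng.standard_normal((64, 64))
independent_map = rng.standard_normal((64, 64))          # unrelated to height
following_map = 2.0 * topography + 0.1 * rng.standard_normal((64, 64))

r_independent = map_correlation(topography, independent_map)
r_following = map_correlation(topography, following_map)
```

A coefficient close to zero is what one would expect for a map dominated by subsurface texture, whereas a map that merely follows the surface relief would give a value close to one in magnitude.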
Without entering into detailed analyses, we have shown a different strategy to address the problem of compositional recognition, introducing an intermediate step that localizes features through a simple screening and does not require detailed modeling of the contact potential.
This method can therefore be used routinely to identify the parts of the sample toward which a more detailed investigation can be directed. Hidden features, invisible in the topography, can be highlighted and recognized. The method does not take topography information into account, since it is filtered out from the data: maps and images are regarded as independent.
Different channels are analyzed and their information is combined into a single map, while noise and redundancies are filtered out by the algorithm itself and the relative weight of the different channels can be highlighted.
[1] Torre B. et al.: Microscopy Research and Technique 73, 973-81 (2010)
[2] Cappella B. and Dietler G.: Surf. Sci. Rep. 34, 1-104 (1999)
[3] Sader J. E. et al.: Appl. Phys. Lett. 84, 1801-1803 (2004)
[4] Giessibl F. J.: Phys. Rev. B 56, 16010-16015 (1997)
[5] Rodriguez T. R. and Garcia R.: Appl. Phys. Lett. 84, 449 (2004)
[6] Jolliffe I. T.: Principal Component Analysis. Springer Verlag, New York (1986)
[7] Jesse S. and Kalinin S. V.: Nanotechnology 20, 085714, 1-7 (2009)
[8] Jesse S. et al.: Nanotechnology 18, 435503, 1-8 (2008)
Dr. Bruno Torre (corresponding author)
Dr. Matteo Lorenzoni
Dr. Marco Cristani
Prof. Dr. Vittorio Murino
Prof. Dr. Alberto Diaspro
Italian Institute of Technology
Dr. Manuele Bicego
Università di Verona