PCA for PPG Analysis

Principal Component Analysis (PCA) projects multi-dimensional PPG data (multi-wavelength, multi-site, or multi-feature) onto orthogonal principal components ordered by variance explained, enabling dimensionality reduction, artifact separation, and identification of dominant physiological signal modes.

PCA computes the eigenvectors of the data covariance matrix and projects the original high-dimensional data onto the directions of maximum variance. For multi-wavelength PPG (e.g., green, red, IR channels), PCA typically concentrates cardiac pulsation in the first principal component (PC1, explaining 80–95% of variance) while distributing motion artifacts across PC2–PC3. When this holds, artifacts can be rejected by reconstructing the signal from PC1 alone. Because components are ordered by variance, the assignment can flip under strong motion, where the artifact may carry more variance than the pulse and dominate PC1.
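The covariance-eigenvector projection and PC1-only reconstruction described above can be sketched as follows. This is a minimal illustration, not a validated pipeline: the function name `pca_reconstruct`, the synthetic three-channel signal, and the channel gains are all assumptions made for the example.

```python
import numpy as np

def pca_reconstruct(X, n_keep=1):
    """X has shape (n_samples, n_channels).

    Returns the reconstruction from the leading n_keep components and the
    explained-variance ratio of every component.
    """
    mean = X.mean(axis=0)
    Xc = X - mean                                # center each channel
    cov = np.cov(Xc, rowvar=False)               # (n_channels, n_channels)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    W = eigvecs[:, :n_keep]                      # leading principal directions
    scores = Xc @ W                              # component time series
    X_hat = scores @ W.T + mean                  # back-projection to channels
    return X_hat, eigvals / eigvals.sum()

# Synthetic stand-in for green/red/IR channels (illustrative values only).
rng = np.random.default_rng(0)
fs = 100
t = np.arange(0, 10, 1 / fs)
cardiac = np.sin(2 * np.pi * 1.2 * t)            # ~72 bpm pulse
motion = 0.3 * np.sin(2 * np.pi * 0.4 * t + 1.0) # weaker, slower artifact
X = (np.outer(cardiac, [1.0, 0.6, 0.8])
     + np.outer(motion, [0.5, -0.7, 0.2])
     + 0.01 * rng.standard_normal((t.size, 3)))

X_hat, ratio = pca_reconstruct(X, n_keep=1)
print("explained variance ratio:", np.round(ratio, 3))
```

In this synthetic case the pulse dominates the variance, so PC1 tracks it. On real recordings it is worth checking the spectrum of PC1 before trusting the reconstruction.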

For beat morphology analysis, PCA applied to a matrix of aligned individual beats (each beat as a row vector) reveals the dominant modes of morphological variation. If the beat matrix is not mean-centered beforehand, PC1 captures the mean beat shape; PC2 then typically captures amplitude variation, PC3 captures timing variation, and PC4 and higher capture respiratory modulation and dicrotic notch variability. With mean-centering, the mean beat is removed first and these variation modes move up by one position. Dimensionality reduction to 5–10 PCs retains >99% of the morphological variance while reducing feature dimensionality by 10–50×.
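A minimal sketch of PCA on a beat matrix follows, using the SVD of the mean-centered beats and keeping the smallest number of components that reaches a variance target. The function name `beat_pca`, the 99% default, and the synthetic two-bump beat template are assumptions for illustration. Because this sketch centers the data, the mean beat is returned separately rather than appearing as PC1.

```python
import numpy as np

def beat_pca(beats, var_target=0.99):
    """beats has shape (n_beats, n_samples_per_beat), one aligned beat per row."""
    mean_beat = beats.mean(axis=0)
    centered = beats - mean_beat
    # Rows of Vt are the principal directions (morphology modes).
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    var_ratio = s ** 2 / np.sum(s ** 2)
    cum = np.cumsum(var_ratio)
    # Smallest k whose cumulative variance reaches the target.
    k = min(int(np.searchsorted(cum, var_target)) + 1, len(s))
    scores = U[:, :k] * s[:k]                    # per-beat feature vectors
    return mean_beat, Vt[:k], scores, var_ratio

# Synthetic beats with amplitude and timing jitter (illustrative only).
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 100)
amp = 1.0 + 0.1 * rng.standard_normal(200)
shift = 0.01 * rng.standard_normal(200)
beats = np.array([
    a * (np.exp(-((x - 0.3 - d) ** 2) / (2 * 0.08 ** 2))
         + 0.3 * np.exp(-((x - 0.6 - d) ** 2) / (2 * 0.10 ** 2)))
    for a, d in zip(amp, shift)
]) + 0.001 * rng.standard_normal((200, 100))

mean_beat, components, scores, var_ratio = beat_pca(beats)
print("components kept:", components.shape[0])
print("feature matrix shape:", scores.shape)
```

Each row of `scores` is a compact feature vector for one beat, and `mean_beat + scores @ components` gives the reduced-rank approximation of the original beats.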

Incremental PCA (IPCA) enables online computation suitable for real-time PPG processing. IPCA updates the eigenvector estimates sample-by-sample using rank-1 matrix update formulas, requiring O(d²) computation per sample where d is the number of dimensions. This is computationally feasible for multi-channel PPG systems with 3–8 channels on embedded processors.
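One simple way to approximate this online behavior is to maintain a running mean and covariance with a rank-1 update per sample (O(d²)) and to re-derive the eigenvectors only when they are needed. This is a simplified sketch rather than a direct eigenvector-update scheme: the class name `RunningPCA` is hypothetical, and the eigendecomposition step costs O(d³), which is small for d of 3–8.

```python
import numpy as np

class RunningPCA:
    """Online mean/covariance tracker with eigenvectors computed on demand."""

    def __init__(self, d):
        self.n = 0
        self.mean = np.zeros(d)
        self.M2 = np.zeros((d, d))   # running sum of deviation outer products

    def update(self, x):
        """Fold one d-channel sample into the statistics (Welford-style)."""
        x = np.asarray(x, dtype=float)
        self.n += 1
        delta = x - self.mean                        # deviation from old mean
        self.mean += delta / self.n
        self.M2 += np.outer(delta, x - self.mean)    # rank-1 update, O(d^2)

    def components(self):
        """Eigenvalues (descending) and matching eigenvectors as columns."""
        cov = self.M2 / (self.n - 1)
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]
        return eigvals[order], eigvecs[:, order]

# Stream synthetic 3-channel samples one at a time (illustrative data).
rng = np.random.default_rng(2)
mixing = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.1, 1.0]])
X = rng.standard_normal((500, 3)) @ mixing.T
model = RunningPCA(d=3)
for sample in X:
    model.update(sample)

eigvals, eigvecs = model.components()
print("eigenvalues:", np.round(eigvals, 3))
```

This version weights all past samples equally. A real-time system tracking nonstationary signals would likely need a forgetting factor or sliding window, which is not shown here.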

Frequently Asked Questions

How does PCA differ from ICA for PPG?

PCA finds orthogonal directions of maximum variance (decorrelation). ICA finds statistically independent components (higher-order independence). PCA is commonly used as a preprocessing step that decorrelates the data and, once each component is scaled to unit variance, whitens it; ICA applied after this step can provide physiologically meaningful source separation.
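The difference between decorrelation and whitening can be made concrete with a short sketch. The function name `pca_whiten`, the small `eps` guard, and the synthetic mixed sources are assumptions for illustration. The ICA stage itself is not shown.

```python
import numpy as np

def pca_whiten(X, eps=1e-12):
    """X has shape (n_samples, n_channels).

    Returns the decorrelated scores and their unit-variance (whitened) version.
    """
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    decorrelated = Xc @ eigvecs                        # diagonal covariance
    whitened = decorrelated / np.sqrt(eigvals + eps)   # identity covariance
    return decorrelated, whitened

# Three non-Gaussian sources mixed into three channels (illustrative only).
rng = np.random.default_rng(3)
sources = rng.laplace(size=(2000, 3))
mixing = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.2, 0.1, 1.0]])
X = sources @ mixing.T

decorrelated, whitened = pca_whiten(X)
print(np.round(np.cov(whitened, rowvar=False), 3))     # close to identity
```

After whitening, the remaining ambiguity is a rotation, which is what an ICA algorithm then estimates using higher-order statistics.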

How many principal components should be retained for PPG?

For cardiac signal extraction from multi-channel data: 1–2 PCs. For beat morphology analysis: 5–10 PCs capturing >99% variance. For ML feature reduction: select PCs explaining >95% cumulative variance.

Can PCA remove motion artifacts from single-channel PPG?

Not directly: PCA requires multiple channels for source separation. However, PCA applied to a Hankel matrix of delayed copies of the signal can separate artifacts from a single channel. This approach is more commonly described as singular spectrum analysis (SSA) or SVD-based decomposition.
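A minimal sketch of the Hankel-matrix approach: embed the signal as delayed copies, take the SVD, keep selected components, and average along anti-diagonals to return to a time series. The function names, the window length of 50, and the choice `keep=[0, 1]` are illustrative assumptions. Which components correspond to the pulse depends on the data and has to be checked, for example by inspecting each component's dominant frequency.

```python
import numpy as np

def hankel_matrix(x, window):
    """Trajectory matrix with H[i, j] = x[i + j], shape (window, n - window + 1)."""
    n_cols = len(x) - window + 1
    idx = np.arange(window)[:, None] + np.arange(n_cols)[None, :]
    return x[idx]

def diagonal_average(H):
    """Average anti-diagonals to map a (near-)Hankel matrix back to a series."""
    window, n_cols = H.shape
    n = window + n_cols - 1
    total = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        total[i:i + n_cols] += H[i]
        counts[i:i + n_cols] += 1
    return total / counts

def ssa_component(x, window, keep):
    """Reconstruct the series from the SVD components listed in `keep`."""
    H = hankel_matrix(x, window)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    H_k = (U[:, keep] * s[keep]) @ Vt[keep]
    return diagonal_average(H_k)

# Single-channel synthetic signal: pulse plus slow artifact (illustrative).
rng = np.random.default_rng(4)
fs = 50
t = np.arange(0, 10, 1 / fs)
x = (np.sin(2 * np.pi * 1.2 * t)
     + 0.5 * np.sin(2 * np.pi * 0.2 * t)
     + 0.02 * rng.standard_normal(t.size))

cardiac_est = ssa_component(x, window=50, keep=[0, 1])
print(cardiac_est.shape)
```

A sinusoid-like oscillation usually occupies a pair of components with similar singular values, which is why two indices are kept together in this example.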

Related Algorithms