PPG Respiratory Rate Estimation: Extracting Breathing Rate from Photoplethysmography

Technical guide to extracting respiratory rate from PPG signals using amplitude, frequency, and intensity modulations with algorithm comparisons and accuracy data.

ChatPPG Research Team

Respiratory rate is the vital sign most sensitive to clinical deterioration, yet it is the least frequently and least accurately recorded in routine clinical practice. Early warning score systems for detecting patient deterioration (NEWS, MEWS) consistently identify respiratory rate as the strongest single predictor of adverse events including cardiac arrest, ICU admission, and death within 24 hours (Churpek et al., 2012; DOI: 10.1016/j.resuscitation.2012.02.009). PPG sensors are already present in pulse oximeters on virtually every hospitalized patient and in billions of consumer wearables, so extracting continuous respiratory rate from PPG signals is one of the most immediately useful clinical applications of PPG technology.

This article provides a technical review of the three primary mechanisms by which respiration modulates the PPG waveform, the signal processing algorithms used to extract respiratory rate, fusion strategies for combining multiple respiratory estimates, and the accuracy benchmarks that define the current state of the art.

Physiological Mechanisms of Respiratory Modulation in PPG

Respiration influences the PPG waveform through three distinct physiological pathways, each producing a characteristic modulation pattern. Understanding these mechanisms is essential for designing robust extraction algorithms.

Respiratory-Induced Frequency Variation (RIFV)

Respiratory sinus arrhythmia (RSA) is the physiological variation in heart rate synchronized with breathing: heart rate increases during inspiration and decreases during expiration. This is mediated primarily by vagal modulation of the sinoatrial node and is modulated by lung stretch receptors, central respiratory drive, and baroreflex interactions.

RSA produces a frequency modulation of the PPG inter-beat interval (IBI) series. The instantaneous heart rate oscillates at the respiratory frequency, and this oscillation can be extracted by analyzing the IBI time series in the frequency domain. The respiratory frequency appears as a spectral peak in the 0.15-0.4 Hz band (corresponding to 9-24 breaths per minute) of the IBI power spectrum, which is the high-frequency (HF) component of heart rate variability.
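As a rough illustration of the RIFV pathway, the sketch below takes beat timestamps that are assumed to be already detected, resamples the IBI series onto a uniform grid, and picks the strongest spectral peak in the 0.15-0.4 Hz band. The function name `rifv_rate`, the 4 Hz resampling rate, and the Hann-windowed periodogram are illustrative assumptions, not choices prescribed by any of the cited studies.

```python
import numpy as np

def rifv_rate(beat_times_s, fs_resample=4.0, band=(0.15, 0.4)):
    """Estimate respiratory rate (breaths/min) from beat timestamps via RIFV.

    Assumes beat detection has been done upstream; fs_resample is an
    illustrative choice.
    """
    beat_times_s = np.asarray(beat_times_s, dtype=float)
    ibi = np.diff(beat_times_s)        # inter-beat intervals in seconds
    ibi_times = beat_times_s[1:]       # stamp each IBI at the beat that ends it

    # The IBI series is unevenly sampled, so interpolate onto a uniform grid.
    grid = np.arange(ibi_times[0], ibi_times[-1], 1.0 / fs_resample)
    ibi_uniform = np.interp(grid, ibi_times, ibi)
    ibi_uniform -= ibi_uniform.mean()  # remove the mean so DC does not dominate

    # Hann-windowed periodogram, zero-padded for a finer frequency grid.
    n_fft = max(4096, len(ibi_uniform))
    window = np.hanning(len(ibi_uniform))
    spectrum = np.abs(np.fft.rfft(ibi_uniform * window, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs_resample)

    # Strongest peak inside the respiratory (HF) band.
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak_freq = freqs[in_band][np.argmax(spectrum[in_band])]
    return 60.0 * peak_freq
```

Restricting the search to the HF band keeps slower heart rate variability components from being mistaken for breathing, at the cost of missing rates outside 9-24 breaths per minute.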

RIFV is the most robust respiratory modulation in young, healthy individuals at rest, where RSA amplitude can reach 10-20 BPM peak-to-peak. However, RSA amplitude decreases markedly with age (Anrep et al., 1936; confirmed by numerous subsequent studies), at high heart rates, in patients with autonomic neuropathy (common in diabetes), and with certain medications (atropine, beta-blockers). In elderly or critically ill patients, RIFV may be too weak to detect reliably.

Respiratory-Induced Amplitude Variation (RIAV)

Changes in intrathoracic pressure during the respiratory cycle alter venous return to the heart and consequently stroke volume. During spontaneous (negative-pressure) inspiration, intrathoracic pressure decreases, increasing venous return to the right heart but initially decreasing left ventricular filling due to pulmonary vascular pooling. This produces beat-to-beat variation in stroke volume and arterial pulse pressure, which manifests as amplitude modulation of the PPG waveform.

The RIAV envelope is extracted by measuring the peak-to-peak amplitude (or systolic peak amplitude) of each PPG pulse and analyzing the resulting amplitude time series. The respiratory frequency appears as an oscillation in this envelope. Lázaro et al. (2014) demonstrated that RIAV provides complementary respiratory information to RIFV and is particularly valuable when RSA is attenuated (DOI: 10.1088/0967-3334/35/7/1407).
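The envelope extraction described above can be sketched as follows, assuming a clean, uniformly sampled PPG segment. The helper name `riav_series` and the 0.4 s minimum beat spacing are assumptions made for illustration; a production beat detector would be considerably more defensive.

```python
import numpy as np
from scipy.signal import find_peaks

def riav_series(ppg, fs, min_beat_interval_s=0.4):
    """Return (times_s, amplitudes): one peak-to-trough amplitude per pulse.

    min_beat_interval_s is an illustrative refractory period (0.4 s
    corresponds to a 150 BPM ceiling).
    """
    ppg = np.asarray(ppg, dtype=float)
    distance = max(1, int(min_beat_interval_s * fs))
    peaks, _ = find_peaks(ppg, distance=distance)     # systolic peaks
    troughs, _ = find_peaks(-ppg, distance=distance)  # pulse onsets

    times, amps = [], []
    for p in peaks:
        earlier = troughs[troughs < p]
        if earlier.size == 0:
            continue  # skip a leading peak that has no preceding trough
        times.append(p / fs)
        amps.append(ppg[p] - ppg[earlier[-1]])
    return np.array(times), np.array(amps)
```

The returned amplitude series is sampled once per beat, so it needs resampling onto a uniform grid before any filtering or spectral analysis.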

RIAV is more prominent during mechanical ventilation (positive-pressure breathing), where the intrathoracic pressure swings are reversed relative to spontaneous breathing and the stroke volume variation is typically larger. This makes RIAV particularly useful in the ICU setting. The same mechanism underlies pulse pressure variation (PPV), a validated clinical indicator of fluid responsiveness in mechanically ventilated patients.

Respiratory-Induced Intensity Variation (RIIV)

Respiratory chest wall movement and changes in venous blood volume in the measurement site produce slow oscillations in the baseline (DC component) of the PPG signal. During inspiration, decreased intrathoracic pressure reduces venous pressure in the upper body, decreasing venous blood volume at the finger or earlobe and reducing the baseline PPG absorption. This creates a baseline intensity modulation at the respiratory frequency.

RIIV is extracted by low-pass filtering the raw PPG signal (below approximately 0.5 Hz) or by analyzing the DC component (baseline) of the PPG waveform. Nilsson et al. (2000) first characterized this mechanism and demonstrated its utility for respiratory rate estimation (DOI: 10.1118/1.1289377).
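A minimal sketch of the low-pass route, assuming a uniformly sampled raw PPG array. The zero-phase Butterworth filter, its order, and the name `riiv_baseline` are illustrative choices rather than the method of Nilsson et al.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def riiv_baseline(ppg, fs, cutoff_hz=0.5, order=4):
    """Low-pass the raw PPG to keep only the slow baseline (RIIV) component.

    cutoff_hz follows the approximate 0.5 Hz figure in the text; the filter
    order is an assumption.
    """
    # Second-order sections stay numerically stable at very low cutoffs.
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    # Forward-backward filtering avoids phase distortion of the baseline.
    return sosfiltfilt(sos, np.asarray(ppg, dtype=float))
```

Because the filter passes everything below the cutoff, slow drift and motion artifacts survive alongside the respiratory component, which is the weakness discussed next.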

RIIV is the simplest modulation to extract computationally, requiring minimal preprocessing. However, it is highly susceptible to motion artifacts, sensor repositioning, and baseline drift, which produce low-frequency variations indistinguishable from respiratory modulation.

Extraction Algorithms

Time-Domain Peak Counting

The simplest respiratory rate estimation approach identifies peaks in the respiratory modulation signal (RIAV envelope, RIIV baseline, or RIFV IBI series) and computes the breathing rate from the inter-peak intervals.

The algorithm involves:

  1. Bandpass filtering the modulation signal to the respiratory frequency band (0.1-0.6 Hz, corresponding to 6-36 breaths/min).
  2. Detecting peaks (or zero crossings) in the filtered signal.
  3. Computing the median inter-peak interval over a window (typically 30-60 seconds).
  4. Converting the interval to respiratory rate in breaths per minute.

Peak counting is computationally trivial and works well when the respiratory signal is strong and regular. It fails with irregular breathing, low signal-to-noise ratio, and when harmonic or subharmonic peaks create ambiguity.
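The four steps above can be sketched as follows, assuming the modulation signal has already been resampled to a uniform rate `fs`. The function name, filter order, and minimum peak spacing are illustrative assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, find_peaks

def rate_by_peak_counting(modulation, fs, band=(0.1, 0.6)):
    """Respiratory rate (breaths/min) from a uniformly sampled modulation signal."""
    # Step 1: bandpass to the respiratory band.
    sos = butter(2, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, np.asarray(modulation, dtype=float))

    # Step 2: detect peaks, spaced at least one shortest-allowed breath apart.
    min_distance = max(1, int(fs / band[1]))
    peaks, _ = find_peaks(filtered, distance=min_distance)
    if len(peaks) < 2:
        return None  # not enough breaths in the window to form an interval

    # Step 3: median inter-peak interval, robust to an occasional missed breath.
    median_interval_s = np.median(np.diff(peaks)) / fs

    # Step 4: convert the interval to breaths per minute.
    return 60.0 / median_interval_s
```

Returning `None` when fewer than two peaks are found lets the caller treat the window as an abstention instead of reporting a spurious rate.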

Frequency-Domain Spectral Analysis

Spectral methods estimate respiratory rate by identifying the dominant spectral peak in the respiratory frequency band. The approach involves:

  1. Computing the power spectral density (PSD) of the modulation signal using FFT, Welch's method, or autoregressive (AR) modeling.
  2. Identifying the spectral peak with the highest power within the respiratory band (0.1-0.6 Hz).
  3. Reporting the peak frequency as the respiratory rate estimate.

AR spectral estimation (typically order 8-12) provides better frequency resolution than FFT for short analysis windows, which is important for respiratory rate estimation where the respiratory signal may contain only 3-6 cycles within a 30-second window. Chon et al. (2009) demonstrated that AR-based spectral analysis of the PPG IBI series achieved respiratory rate MAE of 1.0 breath/min in 29 healthy subjects at rest (DOI: 10.1109/TBME.2009.2024870).
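To make the AR option concrete, here is a sketch that fits an AR model with the Yule-Walker equations and evaluates its spectrum only inside the respiratory band. Yule-Walker is one of several AR fitting methods and is used here as an assumption; the cited study may have used a different estimator. The function name and defaults are likewise illustrative.

```python
import numpy as np

def ar_respiratory_rate(x, fs, order=8, band=(0.1, 0.6), n_freqs=512):
    """Respiratory rate (breaths/min) from the peak of a Yule-Walker AR spectrum."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)

    # Biased autocorrelation estimates r[0..order].
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])

    # Yule-Walker equations: Toeplitz(r[0..order-1]) @ a = r[1..order].
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a = np.linalg.solve(r[lags], r[1:])
    noise_var = r[0] - np.dot(a, r[1:])

    # AR power spectrum: noise_var / |1 - sum_k a_k exp(-j 2 pi f k / fs)|^2.
    freqs = np.linspace(band[0], band[1], n_freqs)
    k = np.arange(1, order + 1)
    denom = 1.0 - np.exp(-2j * np.pi * np.outer(freqs, k) / fs) @ a
    psd = noise_var / np.abs(denom) ** 2
    return 60.0 * freqs[np.argmax(psd)]
```

Because the AR spectrum is a continuous function of frequency, it can be evaluated on an arbitrarily fine grid, which is where the resolution advantage over a short-window FFT comes from.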

A key challenge in spectral methods is distinguishing the true respiratory peak from harmonic and subharmonic interference. The second harmonic of the respiratory frequency can overlap with the fundamental cardiac frequency, and the subharmonic of the cardiac frequency can fall within the respiratory band. Spectral tracking algorithms that enforce temporal continuity of the respiratory frequency estimate can mitigate these ambiguities.
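One simple way to enforce temporal continuity is to restrict each window's peak search to a neighborhood of the previous estimate. The sketch below assumes per-window spectra on a shared frequency grid; the name `track_respiratory_peak` and the 0.05 Hz jump limit are hypothetical choices for illustration.

```python
import numpy as np

def track_respiratory_peak(freqs, spectra, max_jump_hz=0.05, band=(0.1, 0.6)):
    """Pick one peak frequency per window, limiting window-to-window jumps.

    freqs: shared frequency grid (Hz); spectra: one PSD per analysis window.
    max_jump_hz is an illustrative continuity limit.
    """
    freqs = np.asarray(freqs, dtype=float)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    track = []
    previous = None
    for psd in np.asarray(spectra, dtype=float):
        allowed = in_band.copy()
        if previous is not None:
            # Only consider frequencies close to the previous estimate.
            allowed &= np.abs(freqs - previous) <= max_jump_hz
        if not allowed.any():
            allowed = in_band  # fall back to the whole band
        masked = np.where(allowed, psd, -np.inf)
        previous = freqs[np.argmax(masked)]
        track.append(previous)
    return np.array(track)
```

The first window is unconstrained, so a wrong initial pick propagates forward; practical trackers usually add a reset or re-acquisition rule.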

Advanced Signal Processing Methods

Empirical Mode Decomposition (EMD): EMD decomposes the PPG signal into intrinsic mode functions (IMFs) without assuming any particular basis. The IMF corresponding to the respiratory frequency band contains the respiratory modulation. Moody et al. demonstrated that EMD-based respiratory extraction achieves accuracy comparable to spectral methods while being more adaptive to non-stationary signals.

Continuous Wavelet Transform (CWT): The CWT provides time-frequency representation of the modulation signal, enabling tracking of respiratory frequency changes over time. The scalogram (magnitude of the CWT) shows the respiratory frequency as a ridge that can be tracked using ridge extraction algorithms. Addison et al. (2015) pioneered CWT-based respiratory rate estimation from PPG in the Nellcor Respiration Rate (RRp) algorithm, which was the first FDA-cleared algorithm for respiratory rate extraction from pulse oximetry (DOI: 10.1007/s10877-014-9607-0).
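The scalogram-and-ridge idea can be illustrated with a small complex Morlet transform written directly in NumPy. This is a generic sketch, not the Nellcor algorithm: the wavelet normalization, the `n_cycles` parameter, and the per-sample argmax ridge are all simplifying assumptions.

```python
import numpy as np

def morlet_scalogram(x, fs, freqs, n_cycles=5.0):
    """Magnitude of a complex Morlet CWT, one row per analysis frequency."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    scalogram = np.empty((len(freqs), len(x)))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2.0 * np.pi * f)  # Gaussian width in seconds
        # Truncate at 3 sigma and keep the kernel no longer than the signal.
        half = min(int(3 * sigma_t * fs), (len(x) - 1) // 2)
        t = np.arange(-half, half + 1) / fs
        gauss = np.exp(-t**2 / (2 * sigma_t**2))
        # Dividing by the Gaussian sum gives equal gain at every frequency.
        wavelet = gauss * np.exp(2j * np.pi * f * t) / gauss.sum()
        scalogram[i] = np.abs(np.convolve(x, wavelet, mode="same"))
    return scalogram

def ridge_rate(scalogram, freqs):
    """Per-sample respiratory rate (breaths/min) from the scalogram maximum."""
    return 60.0 * np.asarray(freqs)[np.argmax(scalogram, axis=0)]
```

Taking the argmax independently at each sample is the crudest form of ridge extraction; it can jump between competing ridges, which is why practical trackers add a continuity constraint.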

Singular Spectrum Analysis (SSA): SSA decomposes the signal into oscillatory components based on eigendecomposition of the trajectory matrix. It can separate respiratory and cardiac components even when their frequencies are close, and it does not require predefined basis functions or frequency bands.
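A basic SSA decomposition fits in a few lines of NumPy. The sketch below builds the trajectory matrix, takes its SVD, and maps each rank-one term back to a time series by diagonal averaging. The function name and the choice to return the leading components ungrouped are assumptions; deciding which components are respiratory is a separate step that is not shown.

```python
import numpy as np

def ssa_components(x, window, n_components):
    """Return the first n_components SSA elementary series, shape (n_components, len(x))."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    k = n - window + 1

    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j : j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)

    components = []
    for i in range(n_components):
        elem = s[i] * np.outer(u[:, i], vt[i])  # rank-one elementary matrix
        # Diagonal averaging: entries with row + col == d all estimate sample d.
        # Flipping the rows turns those anti-diagonals into ordinary diagonals.
        flipped = elem[::-1]
        series = np.array(
            [flipped.diagonal(d - window + 1).mean() for d in range(n)]
        )
        components.append(series)
    return np.array(components)
```

An oscillation typically occupies a pair of components with similar singular values, so the respiratory modulation is usually recovered by summing such a pair.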

Smart Fusion of Multiple Respiratory Estimates

Since RIFV, RIAV, and RIIV capture respiratory information through different physiological mechanisms, they provide complementary estimates with different strengths and failure modes. Fusing multiple estimates improves robustness and accuracy.

Quality-Weighted Fusion

Karlen et al. (2013) proposed a smart fusion framework that computes respiratory rate estimates from all three modulations independently, assesses the quality of each estimate, and combines them using quality-weighted averaging (DOI: 10.1109/TBME.2013.2246160). Quality is assessed by:

  • Signal quality index (SQI): Measures the regularity and consistency of the modulation signal. Low SQI indicates noise contamination.
  • Spectral peak prominence: A sharp, isolated spectral peak indicates a reliable estimate; a broad or ambiguous peak indicates uncertainty.
  • Cross-modulation agreement: When two or more modulations agree on the respiratory frequency, confidence increases.

In a study of 59 subjects, the smart fusion approach achieved MAE of 0.9 breaths/min compared to reference capnography, outperforming any single modulation source (RIFV alone: 1.4 breaths/min; RIAV alone: 1.6 breaths/min; RIIV alone: 1.8 breaths/min).
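The quality-weighted averaging step can be illustrated with a short sketch. It assumes each modulation already has a rate estimate and a non-negative quality score; how those scores are computed, and the function name itself, are outside what the source specifies, so this should not be read as the published implementation.

```python
def fuse_quality_weighted(estimates, qualities, min_quality=0.0):
    """Quality-weighted mean of per-modulation rate estimates (breaths/min).

    estimates: rates from e.g. RIFV, RIAV, RIIV.
    qualities: non-negative scores; how they are derived is an assumption
    left to the caller.
    """
    # Drop sources whose quality does not exceed the threshold.
    pairs = [(e, q) for e, q in zip(estimates, qualities) if q > min_quality]
    total = sum(q for _, q in pairs)
    if total == 0:
        return None  # abstain when no source is trustworthy
    return sum(e * q for e, q in pairs) / total
```

For example, with estimates of 15, 16, and 30 breaths/min and qualities of 0.9, 0.8, and 0.0, the outlier is ignored and the fused value lands between the two agreeing sources.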

Bayesian Fusion

Pimentel et al. (2017) developed a Bayesian fusion framework that maintains probability distributions over respiratory rate from each modulation source and combines them using Bayes' theorem (DOI: 10.1109/TBME.2016.2613124). The Bayesian approach naturally handles uncertainty: when one source provides an unreliable estimate with high variance, its influence on the fused estimate is automatically reduced. The method also incorporates a temporal prior that penalizes large changes in respiratory rate between consecutive windows, exploiting the physiological constraint that breathing rate changes slowly under normal conditions.
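The variance-driven weighting and the temporal prior can be illustrated with a Gaussian simplification. The sketch assumes each source reports a rate with a known variance and that the true rate follows a random walk between windows. This is a stand-in chosen for clarity, not the published model, and all names and defaults are hypothetical.

```python
def bayes_fuse_step(prior_mean, prior_var, estimates, variances, process_var=1.0):
    """One window of Gaussian Bayesian fusion; returns (posterior_mean, posterior_var).

    Assumes Gaussian estimates with known variances and a random-walk
    temporal prior; process_var is an illustrative setting.
    """
    # Temporal prior: uncertainty grows slightly between consecutive windows.
    mean, var = prior_mean, prior_var + process_var

    # Multiply in each source's Gaussian likelihood, one at a time.
    for z, r in zip(estimates, variances):
        gain = var / (var + r)        # near 0 when the source variance r is large
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
    return mean, var
```

A source with very large variance gets a gain near zero, so it barely moves the fused estimate, while a small `process_var` penalizes large jumps between windows.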

Evaluated on the CapnoBase benchmark dataset (42 subjects, 8 minutes each, with reference capnography), the Bayesian fusion approach achieved MAE of 0.6 breaths/min, which corresponds to roughly 3-5% relative error at typical resting breathing rates of 12-20 breaths/min.

Performance Benchmarks and Datasets

The CapnoBase Benchmark

The CapnoBase dataset (Karlen et al., 2010) has become the standard benchmark for PPG respiratory rate estimation. It contains 42 recordings of 8 minutes each from pediatric and adult patients during elective surgery, with simultaneous finger PPG and capnography (reference respiratory rate). The dataset is publicly available, enabling direct comparison across methods.

State-of-the-art results on CapnoBase:

| Method | MAE (breaths/min) | Coverage |
|--------|-------------------|----------|
| Single best modulation | 1.4-1.8 | 100% |
| Smart fusion (Karlen, 2013) | 0.9 | 100% |
| Bayesian fusion (Pimentel, 2017) | 0.6 | 100% |
| Deep learning (Ravichandran, 2019) | 0.5 | 95% |
| CWT ridge tracking (Addison, 2015) | 1.0 | 92% |

Coverage refers to the percentage of analysis windows for which the algorithm produces an estimate; some methods abstain from low-confidence windows.
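For readers reproducing these two metrics, a minimal sketch follows. It assumes abstentions are represented as `None` and that MAE is computed only over the windows where an estimate was produced; individual papers may define the denominator differently.

```python
def mae_and_coverage(estimates, references):
    """Return (MAE in breaths/min, coverage in percent).

    estimates may contain None where the algorithm abstained; MAE is taken
    over the remaining windows only (an assumption about the convention).
    """
    errors = [abs(e - r) for e, r in zip(estimates, references) if e is not None]
    coverage = 100.0 * len(errors) / len(estimates)
    mae = sum(errors) / len(errors) if errors else float("nan")
    return mae, coverage
```

Reporting both numbers together matters because an algorithm can lower its MAE simply by abstaining on more of the difficult windows.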

The BIDMC Dataset

The BIDMC (Beth Israel Deaconess Medical Center) dataset provides 53 recordings from critically ill adult patients with reference respiratory rate from impedance pneumography. This dataset is more challenging than CapnoBase because ICU patients often have irregular breathing, cardiac arrhythmias, and lower signal quality.

Pimentel et al. (2017) reported MAE of 1.8 breaths/min on BIDMC using Bayesian fusion, compared to 0.6 breaths/min on CapnoBase, illustrating the performance gap between controlled and clinical settings.

Deep Learning Approaches

Recent work has applied deep learning to PPG respiratory rate estimation, bypassing hand-crafted feature extraction.

End-to-End CNN Models

Ravichandran et al. (2019) trained a 1D-CNN to estimate respiratory rate directly from 32-second windows of raw PPG, achieving MAE of 0.5 breaths/min on CapnoBase (DOI: 10.1145/3341163.3347744). The model learned to extract respiratory modulations implicitly, without explicit RIFV/RIAV/RIIV decomposition. Analysis of the learned filters revealed that the first convolutional layer learned bandpass filters in the respiratory frequency range, while deeper layers captured modulation patterns similar to the traditional decomposition.

Temporal CNN with Attention

Bian et al. (2020) applied a temporal convolutional network (TCN) with attention mechanisms to PPG respiratory rate estimation. The attention layer learned to weight different segments of the input window based on signal quality, effectively performing automatic quality-gated estimation. The model achieved MAE of 1.1 breaths/min on the BIDMC dataset, a 39% improvement over the Bayesian fusion baseline (DOI: 10.1109/JBHI.2020.2990423).

Clinical Applications

Hospital Patient Monitoring

The most immediate clinical application is continuous respiratory rate monitoring from existing pulse oximeters. The Nellcor Respiration Rate (RRp) algorithm by Medtronic is FDA-cleared and deployed in hospital monitors, extracting respiratory rate from the pulse oximeter PPG signal without requiring additional sensors. Clinical validation studies have shown MAE of 1-2 breaths/min compared to manual counting by nurses, which itself has significant inter-observer variability (Bergese et al., 2017; DOI: 10.1213/ANE.0000000000001642).

This capability is particularly valuable for post-surgical patients on general wards, where continuous capnography is impractical but early detection of respiratory depression (from opioid analgesia) can prevent critical events.

Wearable Sleep Monitoring

Consumer wearables including Apple Watch, Fitbit, and Garmin devices now report respiratory rate during sleep. The constrained environment (minimal motion, regular breathing) makes sleep an ideal use case for PPG-based respiratory rate estimation. Longitudinal tracking of sleeping respiratory rate can identify trends associated with respiratory infections, heart failure decompensation, and sleep-disordered breathing.

For how respiratory rate estimation integrates with broader sleep staging algorithms, see our companion article. Our algorithms reference and signal processing guide provide additional implementation details for PPG-based respiratory analysis, and our article on PPG motion artifact removal covers the essential preprocessing steps for extracting clean respiratory modulations.

References

  • Addison, P.S. et al. (2015). Journal of Clinical Monitoring and Computing. DOI: 10.1007/s10877-014-9607-0
  • Bergese, S.D. et al. (2017). Anesthesia & Analgesia. DOI: 10.1213/ANE.0000000000001642
  • Bian, Z. et al. (2020). IEEE Journal of Biomedical and Health Informatics. DOI: 10.1109/JBHI.2020.2990423
  • Chon, K.H. et al. (2009). IEEE Transactions on Biomedical Engineering. DOI: 10.1109/TBME.2009.2024870
  • Churpek, M.M. et al. (2012). Resuscitation. DOI: 10.1016/j.resuscitation.2012.02.009
  • Karlen, W. et al. (2013). IEEE Transactions on Biomedical Engineering. DOI: 10.1109/TBME.2013.2246160
  • Lázaro, J. et al. (2014). Physiological Measurement. DOI: 10.1088/0967-3334/35/7/1407
  • Nilsson, L. et al. (2000). Medical Physics. DOI: 10.1118/1.1289377
  • Pimentel, M.A.F. et al. (2017). IEEE Transactions on Biomedical Engineering. DOI: 10.1109/TBME.2016.2613124

Frequently Asked Questions

How does a PPG sensor detect breathing rate?
PPG sensors detect breathing through three physiological mechanisms that modulate the pulse waveform. First, respiratory sinus arrhythmia (RSA) causes heart rate to increase during inspiration and decrease during expiration, creating a frequency modulation of the pulse intervals (RIFV). Second, changes in intrathoracic pressure during breathing alter venous return and stroke volume, causing amplitude modulation of the pulse waveform (RIAV). Third, respiratory chest wall movement and changes in venous blood volume create baseline intensity variations in the PPG signal (RIIV). Algorithms extract one or more of these respiratory-induced modulations to estimate breathing rate.
How accurate is PPG-based respiratory rate compared to a chest strap?
Under controlled, resting conditions, PPG-based respiratory rate estimation achieves mean absolute errors of 1-2 breaths per minute compared to reference methods like impedance pneumography or respiratory inductance plethysmography. This corresponds to roughly 5-17% relative error at normal breathing rates (12-20 breaths/min). Accuracy degrades during exercise, irregular breathing, speech, and in patients with cardiac arrhythmias. Fusion of multiple PPG-derived respiratory signals (RIAV, RIFV, RIIV) using smart fusion techniques can improve accuracy to below 1 breath/min MAE in controlled settings.
Can you measure respiratory rate from a smartwatch PPG sensor?
Yes, several commercial smartwatches and fitness trackers estimate respiratory rate from wrist PPG, including the Apple Watch (respiratory rate during sleep), Garmin devices, and Fitbit/Google Pixel Watch. However, wrist-based measurement is more challenging than finger-based measurement because the PPG signal quality is lower, motion artifacts are more prevalent, and the respiratory modulations are weaker at the wrist due to the smaller vascular bed. Most consumer devices restrict respiratory rate estimation to sleep or rest periods when motion artifacts are minimal.
Why does PPG respiratory rate estimation fail during exercise?
During exercise, several factors degrade PPG respiratory rate estimation. Motion artifacts from arm movement corrupt the PPG waveform, obscuring the subtle respiratory modulations. Increased heart rate reduces the spectral separation between cardiac and respiratory frequencies. Heavy breathing may exceed the normal respiratory frequency band (0.15-0.4 Hz), and breathing patterns become less regular. Additionally, the respiratory modulation mechanisms themselves change during exercise: RSA amplitude decreases at high heart rates, and hemodynamic changes alter the amplitude and intensity modulation patterns. Robust exercise respiratory rate estimation typically requires accelerometer-aided motion artifact removal as a prerequisite.