PPG Pulse Oximetry Accuracy: What SpO2 Readings from Wearables Actually Mean
A technical and clinical analysis of PPG pulse oximetry accuracy in consumer wearables: how SpO2 is measured, when readings are reliable, known biases, and clinical implications.

Pulse oximetry, the technique of estimating blood oxygen saturation (SpO2) from optical sensors, is one of the most widely used measurements in both clinical medicine and consumer health monitoring. The red and infrared LEDs on the underside of modern smartwatches and fitness trackers can estimate SpO2, but there are important gaps between clinical-grade pulse oximetry and what a wrist-worn device provides.
SpO2 is a noninvasive estimate of arterial oxygen saturation (SaO2), the percentage of hemoglobin in arterial blood that is bound to oxygen. Normal values at sea level are 95–100%. Values below 90% are clinically significant and indicate hypoxemia. Wearable SpO2 features are used for sleep apnea screening, altitude acclimatization tracking, and general wellness monitoring.
How Pulse Oximetry Works
The principle exploits the fact that oxygenated hemoglobin (HbO2) and deoxygenated hemoglobin (Hb) absorb light differently at different wavelengths. At 660 nm (red), Hb absorbs more light than HbO2. At 940 nm (infrared), HbO2 absorbs more. By comparing red and infrared absorption in the pulsatile component of the PPG signal, the proportion of oxygenated hemoglobin can be estimated.
This ratio is called the ratio-of-ratios (R), defined as:
R = (AC660/DC660) / (AC940/DC940)
Where AC is the pulsatile (cardiac-driven) component and DC is the baseline (non-pulsatile) component at each wavelength.
R is converted to SpO2 via an empirical calibration curve derived from healthy volunteers breathing hypoxic gas mixtures. This calibration curve is built into the device and is the source of several accuracy limitations.
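As a rough illustration of the two steps above, the sketch below computes R from AC and DC amplitudes and maps it to SpO2 with the linear approximation SpO2 ≈ 110 − 25R, a simplification often quoted for teaching purposes. Real devices use proprietary calibration curves fitted to desaturation-study data, so the coefficients, the function names, and the sample amplitudes here are all assumptions.

```python
def ratio_of_ratios(ac_red, dc_red, ac_ir, dc_ir):
    """R = (AC660/DC660) / (AC940/DC940)."""
    if min(dc_red, dc_ir) <= 0 or ac_ir <= 0:
        raise ValueError("DC levels and the infrared AC amplitude must be positive")
    return (ac_red / dc_red) / (ac_ir / dc_ir)


def spo2_from_r(r, a=110.0, b=25.0):
    """Map R to SpO2 (%) with an assumed linear curve, clamped to 0-100."""
    return max(0.0, min(100.0, a - b * r))


# Hypothetical amplitudes in arbitrary ADC units (not from any real device).
r = ratio_of_ratios(ac_red=10.0, dc_red=1000.0, ac_ir=20.0, dc_ir=1000.0)
print(f"R = {r:.2f}, estimated SpO2 = {spo2_from_r(r):.1f}%")
```

With these made-up inputs R is 0.5 and the assumed curve returns 97.5%. Because the mapping is empirical, any mismatch between the calibration population and the wearer shows up directly as error in the reported SpO2.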
Why Consumer Wearable SpO2 Is Less Accurate Than Clinical Pulse Oximeters
Sensor Placement
Clinical pulse oximeters are placed on fingertips or earlobes, where perfusion is high, arterial pulsations are strong, and light is transmitted through the tissue along a consistent path. Wrist-worn devices instead measure light reflected back from multiple tissue layers (skin, subcutaneous fat, muscle), with weaker pulsatile signals and more variable path lengths.
Weak pulsatile signals (low perfusion index) reduce the signal-to-noise ratio of the SpO2 measurement. Cold hands, hypovolemia, hypotension, and peripheral arterial disease all reduce perfusion index and degrade wrist SpO2 accuracy.
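Perfusion index is conventionally expressed as the pulsatile amplitude as a percentage of the baseline level. The sketch below computes it and flags weak signals; the 0.5% cutoff and the sample amplitudes are assumed values chosen for illustration, not thresholds taken from any particular device.

```python
def perfusion_index(ac, dc):
    """Perfusion index (%) = 100 * pulsatile amplitude / baseline level."""
    if dc <= 0:
        raise ValueError("DC level must be positive")
    return 100.0 * ac / dc


def is_low_perfusion(ac, dc, threshold_pct=0.5):
    """Flag a weak pulsatile signal; the 0.5% cutoff is an assumed example."""
    return perfusion_index(ac, dc) < threshold_pct


# Hypothetical infrared amplitudes: a warm fingertip versus a cold wrist.
print(perfusion_index(ac=30.0, dc=1000.0))
print(is_low_perfusion(ac=2.0, dc=1000.0))
```

A device could use a check like this to suppress or down-weight SpO2 estimates taken when the pulsatile signal is too small to trust.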
Motion Artifacts
The AC component of the PPG signal should reflect only cardiac pulsation. Motion produces mechanical oscillations that contaminate the AC component at both wavelengths. Because the motion artifact frequency spectrum overlaps with cardiac frequencies, it is difficult to isolate cleanly.
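A toy model can show why this contamination matters for SpO2. The sketch assumes that motion adds the same normalized amplitude to the red and infrared AC components, which is a deliberate simplification; the function name and all amplitudes are hypothetical.

```python
def contaminated_r(pulse_red, pulse_ir, motion):
    """R when the same normalized motion amplitude is added to both channels."""
    return (pulse_red + motion) / (pulse_ir + motion)


# Hypothetical normalized AC/DC amplitudes for a clean signal with R = 0.5.
pulse_red, pulse_ir = 0.010, 0.020

for motion in (0.0, 0.005, 0.020, 0.100):
    r = contaminated_r(pulse_red, pulse_ir, motion)
    print(f"motion = {motion:.3f} -> R = {r:.2f}")
```

In this model R climbs from 0.5 toward 1 as the motion term grows. Since a higher R corresponds to a lower SpO2 on the calibration curve, the device would report a falling saturation even though the true value has not changed.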
Most consumer wearable SpO2 measurements are taken at rest (often only during sleep). Continuous SpO2 monitoring during activity remains unreliable on wrist-worn devices.
Calibration Bias and Dark Skin Tones
The empirical SpO2 calibration curve is typically built from data collected in healthy volunteers, historically with limited diversity in skin tone. A landmark study by Sjoding et al., published in the New England Journal of Medicine in 2020 (doi:10.1056/NEJMc2029240), analyzed tens of thousands of paired pulse oximetry and arterial blood measurements from hospitalized patients and found that Black patients had nearly three times the rate of occult hypoxemia (SpO2 of 92–96% by pulse oximetry while SaO2 was actually below 88%) compared to white patients.
This bias has two likely contributors: first, melanin absorbs red light, potentially affecting the R ratio calculation; second, the calibration curves were historically built from less diverse populations.
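The occult hypoxemia metric can be computed from paired readings as sketched below. The SpO2 window (92 to 96% by default) and the SaO2 cutoff of 88% are exposed as parameters, and the sample pairs are invented purely to exercise the function; none of the numbers are study data.

```python
def occult_hypoxemia_rate(pairs, spo2_low=92.0, spo2_high=96.0, sao2_cutoff=88.0):
    """Fraction of reassuring SpO2 readings whose paired SaO2 is below the cutoff.

    pairs: iterable of (spo2, sao2) percentages measured at the same time.
    Returns None when no pair falls inside the SpO2 window.
    """
    in_window = [(s, a) for s, a in pairs if spo2_low <= s <= spo2_high]
    if not in_window:
        return None
    occult = sum(1 for _, a in in_window if a < sao2_cutoff)
    return occult / len(in_window)


# Hypothetical (SpO2, SaO2) pairs, not taken from any study.
sample = [(94, 93), (95, 87), (93, 91), (96, 86), (98, 97), (92, 90)]
print(occult_hypoxemia_rate(sample))
```

Computing this rate separately for each patient group is what allows a comparison of how often a reassuring reading hides a low arterial saturation.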
This finding has significant clinical implications. During the COVID-19 pandemic, widespread use of pulse oximeters for home monitoring likely led to delayed care for some patients with darker skin tones whose devices were showing falsely reassuring SpO2 values.
The FDA issued a safety communication in 2021 acknowledging these limitations. Device manufacturers have been working to improve calibration datasets, but the bias has not been fully eliminated in most consumer devices.
Fetal Hemoglobin, Dyshemoglobinemias, and Other Confounders
Standard pulse oximetry measures the ratio of HbO2 to Hb. It cannot distinguish these from other forms of hemoglobin:
Carboxyhemoglobin (HbCO): Has an absorption spectrum similar to that of HbO2 at 660 nm. In carbon monoxide poisoning, pulse oximetry reads falsely high while the patient is actually hypoxic. Standard two-wavelength pulse oximeters (including consumer wearables) cannot detect CO poisoning.
Methemoglobin (MetHb): Causes SpO2 to trend toward 85%, regardless of actual saturation. Cannot be identified with standard pulse oximetry.
Fetal hemoglobin: Has different oxygen binding curves but similar optical properties; clinically relevant primarily in newborn monitoring.
Consumer wearables are uniformly standard two-wavelength devices and share all these limitations.
Accuracy Data From Validation Studies
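The ISO 80601-2-61 standard for pulse oximeters requires an accuracy root-mean-square error (Arms, equivalent to RMSE against co-oximetry) of ≤4% SpO2 over the 70–100% saturation range. FDA guidance for clinical devices is stricter, recommending ≤3% for transmissive sensors.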
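Validation studies usually report this root-mean-square figure alongside mean bias and 95% limits of agreement. The sketch below computes all three from paired device and reference readings using the standard Bland-Altman definitions; the function name and the sample values are hypothetical.

```python
import math
import statistics


def agreement_stats(device_spo2, reference_sao2):
    """Bias, 95% limits of agreement, and root-mean-square error for paired readings."""
    if len(device_spo2) != len(reference_sao2) or len(device_spo2) < 2:
        raise ValueError("need at least two paired readings of equal length")
    diffs = [d - r for d, r in zip(device_spo2, reference_sao2)]
    bias = statistics.fmean(diffs)          # mean error (device minus reference)
    sd = statistics.stdev(diffs)            # sample standard deviation of errors
    arms = math.sqrt(statistics.fmean([x * x for x in diffs]))
    return {
        "bias": bias,
        "loa_low": bias - 1.96 * sd,
        "loa_high": bias + 1.96 * sd,
        "arms": arms,
    }


# Hypothetical paired readings (percent saturation), not real validation data.
stats = agreement_stats([97, 95, 92, 88, 85, 80], [96, 96, 90, 89, 83, 81])
print({name: round(value, 2) for name, value in stats.items()})
```

The root-mean-square error combines systematic bias and random scatter in a single number, which is why a device can show near-zero bias yet still have limits of agreement several points wide.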
Published validation studies for consumer wearables show:
Apple Watch Series 6 and later: A 2021 study by Pipette et al. found a mean bias of -0.3% SpO2 with 95% limits of agreement of ±3.2% compared to arterial blood co-oximetry. This is at the boundary of clinical acceptability. Apple positions the Blood Oxygen app as a wellness feature; it is not an FDA-cleared medical device.
Garmin: Internal validation data showed RMSE of approximately 2.5% SpO2 in the range 70–100% for the Fenix 7 series under controlled conditions. Performance during sleep monitoring showed higher variability.
Fitbit: Studies on Fitbit Sense found similar mean bias but wider limits of agreement (±4–5%) compared to hospital co-oximeters, suggesting acceptable trend monitoring performance but less precision than clinical devices.
Samsung Galaxy Watch: Samsung published validation data showing RMSE of 1.5% under controlled laboratory conditions, which is within clinical standards. Real-world performance during sleep showed somewhat higher variability.
A critical caveat: most validation studies are conducted under controlled laboratory conditions with induced hypoxia protocols. Real-world accuracy during sleep, with variable posture, motion, and perfusion, is typically worse than laboratory data suggests.
When Consumer SpO2 Is Clinically Useful
Despite limitations, consumer PPG SpO2 monitoring has demonstrated value in specific contexts:
Altitude monitoring: At altitudes above 2,500 meters, SpO2 declines predictably. The absolute accuracy needed to track acclimatization (detecting trends from 98% to 90%) is within consumer device capabilities. Several mountaineering and aviation applications use SpO2 monitoring for altitude sickness risk assessment.
COVID-19 home monitoring: During the pandemic, home pulse oximetry (primarily with finger clip devices rather than wrist-worn devices) was widely recommended for monitoring patients with COVID-19 at home. The SpO2 cutoff for seeking emergency care (<90%) falls within the 70–100% range over which these devices are validated, though skin tone bias remained a concern.
Sleep monitoring for OSA screening: Nocturnal SpO2 dip detection for sleep apnea screening, as evaluated in the Withings ScanWatch validation, is a validated use case. The key metric is detecting drops of 4% or more from baseline, a change large enough to be reliably detected even with the accuracy limitations of wrist-worn devices.
Post-exercise recovery: Tracking SpO2 recovery after high-altitude exercise or in individuals with known pulmonary disease. Here the trend matters more than the absolute values.
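For the sleep screening case above, the sketch below counts drops of 4% or more in a nocturnal SpO2 series. It is a minimal illustration, not a clinical scoring algorithm: the baseline is assumed to be the highest value in a preceding window of samples, and the function names, window length, and sample series are all hypothetical.

```python
def count_desaturations(spo2_series, drop_pct=4.0, baseline_window=120):
    """Count events where SpO2 falls at least drop_pct below the recent baseline.

    The baseline is taken as the maximum of the preceding baseline_window
    samples (an assumed rule). Consecutive low samples count as one event.
    """
    events = 0
    in_event = False
    for i, value in enumerate(spo2_series):
        history = spo2_series[max(0, i - baseline_window):i]
        if not history:
            continue  # the first sample has no baseline to compare against
        dropped = max(history) - value >= drop_pct
        if dropped and not in_event:
            events += 1
        in_event = dropped
    return events


def desaturation_index(events, hours):
    """Events per hour of recording."""
    return events / hours


# Hypothetical overnight excerpt containing two separate dips.
series = [96, 96, 95, 92, 91, 95, 96, 96, 91, 90, 96]
print(count_desaturations(series))
```

Because the rule depends on a relative drop rather than an absolute value, a constant calibration offset in the device largely cancels out, which is one reason dip detection tolerates wrist-level accuracy better than absolute thresholds do.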
What to Do With Your Wearable SpO2 Data
For most users with no known cardiac or pulmonary disease, SpO2 readings of 95–100% are normal, and normal readings from a consumer device are reassuring. However:
- A single low reading (for example, 93–94%) should be rechecked; poor contact, cold hands, or motion are likely explanations
- Consistently low readings (below 94%) warrant medical evaluation regardless of device type
- Do not use wrist wearable SpO2 to evaluate acute respiratory symptoms; a finger clip pulse oximeter or clinical oximetry is more appropriate
- If you have darker skin (Fitzpatrick types V–VI), be aware of the documented bias and maintain a lower threshold for clinical evaluation if you have symptoms
FAQ
How accurate is smartwatch SpO2 measurement? Consumer wearable SpO2 measurements typically have accuracy within 2–4% SpO2 under controlled conditions, compared to arterial blood gas co-oximetry. Clinical devices must achieve RMSE ≤3%. Real-world accuracy during sleep or activity is generally lower than controlled laboratory values.
Why does skin tone affect pulse oximetry accuracy? Darker skin tones contain more melanin, which absorbs red light. This can affect the ratio-of-ratios used to calculate SpO2, biasing readings upward (showing normal SpO2 when actual oxygen levels are lower). This bias is well documented in clinical devices and likely affects consumer wearables similarly.
Can a wristwatch detect carbon monoxide poisoning? No. Carbon monoxide binds hemoglobin to form carboxyhemoglobin, which absorbs red light similarly to oxyhemoglobin. Standard pulse oximeters, including consumer wearables, read falsely normal or high in CO poisoning. Clinical co-oximeters with multiple wavelengths are needed to detect HbCO.
What is a normal SpO2 reading? Normal arterial oxygen saturation is 95–100% at sea level. Readings of 95–100% from a wearable are generally reassuring. Readings consistently below 94% warrant medical evaluation. At altitude, lower values (90–94%) may be normal, depending on altitude and acclimatization.
Is the SpO2 feature on Apple Watch medically cleared? The Apple Watch Blood Oxygen app is positioned as a wellness feature and is not cleared by the FDA as a medical diagnostic device. The separate sleep apnea notification feature received FDA clearance as a screening tool; Apple describes it as based on accelerometer-detected breathing disturbances rather than SpO2. These are different regulatory categories.
How can I improve SpO2 accuracy from my wearable? Take measurements while at rest (not moving). Ensure a snug band fit. Keep your hand warm (cold causes vasoconstriction that reduces peripheral perfusion). Take multiple readings and average them. For clinical purposes, use a validated finger clip pulse oximeter rather than a wrist device.