ChatPPG Editorial

Wearable Heart Rate Accuracy and Skin Tone: What the Research Shows

Does skin tone affect wearable heart rate accuracy? Research on PPG sensors and melanin absorption shows accuracy gaps across Fitzpatrick skin types. Here is what we know.

ChatPPG Research Team

Skin tone can affect wearable heart rate accuracy because melanin absorbs the green LED light that most wrist sensors rely on. Some studies report mean errors 2-4 BPM higher for darker skin tones than for lighter ones, although not every study finds a significant difference. The effect is more pronounced during exercise and for SpO2 measurement. Newer multi-wavelength devices have partially addressed this, but the gap has not been eliminated.

The Physics Behind the Skin Tone Effect

Green light (wavelength ~530 nm) is absorbed by both hemoglobin (which is what PPG sensors target) and melanin (the pigment that determines skin color). In skin with higher melanin concentration, a greater fraction of the light emitted by the wrist sensor's green LED is absorbed before it even reaches the capillaries carrying blood.

This reduces the signal-to-noise ratio of the PPG measurement. Less light returns to the photodetector, the cardiac signal is weaker relative to background noise, and the algorithm has a harder time reliably detecting each heartbeat.
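To make the signal-to-noise point concrete, here is a minimal sketch. It assumes the noise floor stays fixed while absorption scales down the returned cardiac amplitude. The function names and all numbers are illustrative assumptions, not values from any device or study.

```python
import math


def snr_db(cardiac_amplitude: float, noise_rms: float) -> float:
    """Amplitude signal-to-noise ratio in decibels."""
    return 20.0 * math.log10(cardiac_amplitude / noise_rms)


def snr_with_absorption(cardiac_amplitude: float, noise_rms: float,
                        returned_fraction: float) -> float:
    """SNR when only `returned_fraction` of the pulsatile light comes back.

    Assumption: the noise floor does not shrink along with the signal.
    """
    return snr_db(cardiac_amplitude * returned_fraction, noise_rms)


# Hypothetical values in arbitrary units.
baseline_db = snr_with_absorption(1.0, 0.1, returned_fraction=1.0)
halved_db = snr_with_absorption(1.0, 0.1, returned_fraction=0.5)
loss_db = baseline_db - halved_db  # a halved amplitude costs about 6 dB

print(f"baseline {baseline_db:.1f} dB, halved {halved_db:.1f} dB, "
      f"loss {loss_db:.1f} dB")
```

Under these assumptions, every halving of the returned pulsatile light costs about 6 dB of SNR, whatever the starting level.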

The problem is not that the PPG principle fails for darker skin. It is that the signal quality degrades, and weaker signals are more susceptible to noise from other sources including motion, ambient light, and variation in pressure between sensor and skin.

This is the same fundamental issue behind skin tone bias in SpO2 measurement, but it affects heart rate monitoring too, even though heart rate accuracy has been studied less extensively.

What Independent Research Shows

Several important studies have examined this directly:

Bent et al. (2020): Published in npj Digital Medicine, this Duke University study tested six wearable devices on 53 participants spanning the range of Fitzpatrick skin types. It did not find a statistically significant difference in heart rate accuracy across skin tones. Device type and activity mattered more, with errors during activity roughly 30% higher on average than at rest.

Icenhower et al. (2021): This study specifically looked at whether PPG-based heart rate measurements were affected by skin tone. They found that baseline heart rate accuracy was largely similar across skin tones but accuracy declined more rapidly during rapid activity changes for darker skin tones compared to lighter ones.

Stanford Wearable Bias Studies: Work from the Stanford Center for Digital Health has documented that accuracy differences across skin tones exist across multiple consumer device brands. The overall finding is consistent: green LED sensors show measurable performance drops at higher Fitzpatrick scale values.

Systematic review of consumer wearables (2024, indexed in PMC): The review found that most published wearable accuracy studies did not adequately sample diverse skin tones. Most enrolled predominantly Fitzpatrick I-III participants, so manufacturers' accuracy claims apply most reliably to lighter-skinned users.

How the Effect Varies by Exercise Intensity

The skin tone accuracy gap is not static. It interacts with exercise intensity:

At rest: The skin tone effect is measurable but relatively small. Signal strength is the key variable; at rest, even a weaker signal from darker skin can be processed accurately if there is no motion noise to contend with.

During moderate exercise: The gap widens. A device that achieves ±3 BPM on lighter skin at jogging pace may show ±6-7 BPM on darker skin at the same intensity, because the weaker baseline signal is more easily swamped by motion artifact.

During vigorous exercise: The gap is largest. Exercise accuracy is already poor on lighter skin, and the combination of heavy motion artifact and a weak signal makes it substantially worse on darker skin.
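One way to picture the widening gap is as shrinking headroom above a detection threshold. The sketch below is a toy model, not a fitted one. It assumes sensor noise and motion artifact add in quadrature, and every number in it (amplitudes, noise levels, the 6 dB threshold) is hypothetical.

```python
import math

SENSOR_NOISE = 0.05           # hypothetical fixed noise floor (arbitrary units)
DETECTION_THRESHOLD_DB = 6.0  # hypothetical SNR needed to detect beats reliably

# Hypothetical motion-artifact levels for each activity.
MOTION_NOISE = {"rest": 0.0, "moderate": 0.2, "vigorous": 0.6}

# Hypothetical cardiac amplitudes after absorption in the skin.
AMPLITUDE = {"stronger signal": 1.0, "weaker signal": 0.5}


def margin_db(amplitude: float, motion_noise: float) -> float:
    """Headroom (dB) between the SNR and the detection threshold."""
    total_noise = math.hypot(SENSOR_NOISE, motion_noise)  # quadrature sum
    snr = 20.0 * math.log10(amplitude / total_noise)
    return snr - DETECTION_THRESHOLD_DB


margins = {
    (label, activity): margin_db(amplitude, noise)
    for label, amplitude in AMPLITUDE.items()
    for activity, noise in MOTION_NOISE.items()
}

for (label, activity), margin in margins.items():
    print(f"{label:>15} | {activity:>8} | margin {margin:+.1f} dB")
```

In this toy model the two signals always differ by the same 6 dB. What changes with intensity is how much headroom is left: both have plenty at rest, while under vigorous motion both fall below the threshold and the weaker signal falls much further.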

The Multi-Wavelength Solution

Device manufacturers have responded to documented skin tone bias by adding longer wavelengths to their sensors:

Red light (~660 nm): Less absorbed by melanin than green. Penetrates deeper into tissue. Adding red light alongside green allows algorithms to compare signals at different penetration depths and partially compensate for melanin interference.

Near-infrared (~850-940 nm): Least affected by melanin of common optical wavelengths. Used primarily for SpO2 but the improved skin penetration also helps heart rate sensing on darker skin.

Multi-channel approaches: Oura Ring 4, Samsung Galaxy Watch 4 (BioActive Sensor), and Apple Watch Series 6+ all incorporate multiple LED wavelengths. Red and infrared channels are used chiefly for SpO2, but because they are less absorbed by melanin they can also help performance across skin tones.

Independent testing after these hardware updates shows improvement for darker skin tones, but not complete elimination of the bias. The melanin absorption effect is real and cannot be entirely negated with longer wavelengths without fundamentally redesigning the sensor geometry.
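For a rough sense of why longer wavelengths help, the sketch below uses a widely cited empirical fit for melanosome absorption attributed to Jacques: mu_a is approximately 6.6e11 x wavelength^-3.33 per cm, with the wavelength in nm. That fit is brought in here as an assumption and is not taken from the studies discussed above. The calculation ignores scattering, hemoglobin, and sensor geometry.

```python
def melanosome_absorption_per_cm(wavelength_nm: float) -> float:
    """Empirical power-law fit for melanosome absorption (assumed model)."""
    return 6.6e11 * wavelength_nm ** -3.33


# Representative wavelengths mentioned in the text.
WAVELENGTHS_NM = {"green": 530.0, "red": 660.0, "near-infrared": 940.0}

green_absorption = melanosome_absorption_per_cm(WAVELENGTHS_NM["green"])

# Melanin absorption at each wavelength relative to green.
relative_to_green = {
    name: melanosome_absorption_per_cm(nm) / green_absorption
    for name, nm in WAVELENGTHS_NM.items()
}

for name, ratio in relative_to_green.items():
    print(f"{name:>13}: {ratio:.2f} x the melanin absorption of green")
```

Under this fit, melanin absorbs roughly half as much at 660 nm as at 530 nm, and roughly one seventh as much at 940 nm. That matches the direction of the hardware changes described above. It also shows why the effect shrinks rather than disappears.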

Implications for Apple Watch Specifically

Apple Watch is the most studied wearable for skin tone effects. The trajectory from Series 1-3 (green LEDs for active measurement, with infrared used for background readings) to Series 6+ (red LEDs added alongside green and infrared) shows meaningful improvement:

  • Apple Watch Series 1-3: Documented 3-5 BPM higher errors for Fitzpatrick 5-6 vs 1-2
  • Apple Watch Series 4-5: Slight improvement with sensor updates
  • Apple Watch Series 6+: Reduction in skin tone gap, though not elimination

A 2022 study specifically comparing Apple Watch performance by skin tone found that while Series 6 narrowed the gap, Fitzpatrick 5-6 skin still showed ~2 BPM higher average error than Fitzpatrick 1-2 during exercise. This is better than older devices but still represents systematic bias.

For SpO2, the picture is considerably worse. See wearable SpO2 accuracy comparison for the SpO2-specific evidence.
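Part of the reason is that SpO2 is computed from a ratio of pulsatile signals at two wavelengths, so an error in either channel shifts the result. The sketch below uses the textbook linear approximation SpO2 = 110 - 25R. This is an assumption for illustration only, since real devices use proprietary empirical calibration curves. The signal values are hypothetical, and the example does not model how melanin actually produces bias.

```python
def ratio_of_ratios(ac_red: float, dc_red: float,
                    ac_ir: float, dc_ir: float) -> float:
    """R = (AC/DC at red) / (AC/DC at infrared)."""
    return (ac_red / dc_red) / (ac_ir / dc_ir)


def spo2_estimate(r: float) -> float:
    """Textbook linear approximation (assumed, not a device calibration)."""
    return 110.0 - 25.0 * r


# Hypothetical pulsatile (AC) and steady (DC) signal levels.
true_r = ratio_of_ratios(ac_red=0.010, dc_red=1.0, ac_ir=0.020, dc_ir=1.0)

# Same signals, but the red pulsatile amplitude is misread by 10%.
biased_r = ratio_of_ratios(ac_red=0.009, dc_red=1.0, ac_ir=0.020, dc_ir=1.0)

print(f"true estimate:   {spo2_estimate(true_r):.2f}%")
print(f"biased estimate: {spo2_estimate(biased_r):.2f}%")
```

With these made-up numbers, a 10% error in one channel moves the estimate from 97.5% to 98.75%. A heart rate algorithm would be unaffected by the same error, because it only needs the timing of the peaks.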

Tattoo Effects: Related but Different

Wrist tattoos create a related but distinct accuracy problem. Tattoo ink absorbs optical wavelengths in ways that depend on the ink color and density. Green ink and black ink are particularly problematic for green LED sensors. Blue and red inks affect different wavelengths.

Tattoos over the wrist optical sensor area are explicitly listed by most manufacturers as causing potential inaccuracy. This is distinct from melanin-related skin tone bias (melanin distributes throughout skin; tattoo ink is in a local patch) but the optical principle is similar: ink absorbs light that should be reaching the blood vessels.

The practical implication: if you have a large wrist tattoo and have found your heart rate monitor unreliable, this is likely the cause. Wearing the watch on the other wrist may not help if that one also has tattooing over the sensor site.

What You Can Do as a Consumer

Current limitations mean users with darker skin tones or wrist tattoos should:

Choose newer multi-wavelength devices: If accuracy matters to you, prioritize devices that explicitly include red and infrared LEDs, such as Apple Watch Series 6+, Samsung Galaxy Watch 4+, or Oura Ring 3/4. These are not perfect but are measurably better than older green-only devices.

Use a chest strap for exercise: If accurate exercise heart rate is important and your wrist device gives suspicious readings, an ECG chest strap (for example, a Polar strap) avoids the skin tone problem. It senses the heart's electrical signal rather than reflected light, so melanin is not a factor.

Cross-check readings: When you want to verify accuracy, spot-check your wearable's reading against a manual pulse count (finger on the carotid or radial pulse, count for 30 seconds, and double it).

Calibrate your expectations: Understand that your absolute numbers may be 2-5 BPM off from ground truth more often than they would be for lighter skin. Use your own trend data rather than comparing absolute values to published population norms.
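The cross-check and trend advice above can be sketched as a small calculation. The readings below are made-up examples and the function names are hypothetical. The sketch only shows how a 30-second manual count converts to BPM and how the average error and average offset of a wearable are computed from paired readings.

```python
from statistics import mean


def manual_bpm(beats_counted: int, seconds: float = 30.0) -> float:
    """Convert a timed manual pulse count to beats per minute."""
    return beats_counted * 60.0 / seconds


def mean_absolute_error(wearable, reference) -> float:
    """Average size of the error, ignoring its direction."""
    return mean(abs(w - r) for w, r in zip(wearable, reference))


def mean_offset(wearable, reference) -> float:
    """Average signed error: positive means the wearable reads high."""
    return mean(w - r for w, r in zip(wearable, reference))


# Hypothetical spot checks: beats counted in 30 s, and the wearable's
# reading taken at the same moment.
manual_counts = [31, 34, 36, 40]
wearable_bpm = [64, 67, 75, 84]

reference_bpm = [manual_bpm(count) for count in manual_counts]
mae = mean_absolute_error(wearable_bpm, reference_bpm)
offset = mean_offset(wearable_bpm, reference_bpm)

print(f"reference BPM: {reference_bpm}")
print(f"mean absolute error: {mae:.1f} BPM, mean offset: {offset:+.1f} BPM")
```

In this example the wearable averages 2.5 BPM of error and reads 2 BPM high. Four spot checks is a very small sample, so treat a result like this as a rough indication rather than a calibration.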

The Industry and Regulatory Response

The FDA has focused increasingly on skin tone bias in optical medical devices since the COVID-19 pandemic, when research showed that pulse oximeters were more likely to miss low blood oxygen in Black patients, producing falsely reassuring SpO2 readings.

The FDA issued a safety communication on pulse oximeter accuracy and limitations in 2021 and convened an advisory panel on the issue in 2022, pointing toward accuracy testing and reporting across a range of skin tones. While this work primarily concerns clinical devices, it signals increasing regulatory interest in the consumer wearable space.

Standards work is moving in the same direction, with the revision of the ISO 80601-2-61 pulse oximeter standard expected to call for more diverse test populations. Advocacy from researchers at MIT, Stanford, and UCSF has pushed both manufacturers and regulators to address this systematically.

The PPG clinical grade vs consumer wearable accuracy comparison discusses where these regulatory and technical discussions are heading.

References

  1. Bent B, et al. "Investigating sources of inaccuracy in wearable optical heart rate sensors." npj Digital Medicine 3:18 (2020). doi:10.1038/s41746-020-0226-6

  2. Sjoding MW, et al. "Racial bias in pulse oximetry measurement." New England Journal of Medicine 383:2477-2478 (2020). doi:10.1056/NEJMc2029240

  3. Mannheimer PD. "The light-tissue interaction of pulse oximetry." Anesthesia & Analgesia 105(6 Suppl):S10-17 (2007). doi:10.1213/01.ane.0000269522.84942.54

  4. Jubran A. "Pulse oximetry." Critical Care 19:272 (2015). doi:10.1186/s13054-015-0984-8

  5. Hellmann S, et al. "Melanin concentration and its impact on photoplethysmography signal quality across diverse skin tones." Sensors 23(8):4012 (2023). doi:10.3390/s23084012

Frequently Asked Questions

Does skin tone affect wearable heart rate accuracy?
Yes. Multiple studies have found measurable differences in optical PPG accuracy across skin tones. Darker skin absorbs more green light, reducing the signal-to-noise ratio. Some studies have found mean absolute errors 2-4 BPM higher for darker skin tones (Fitzpatrick 5-6) compared to lighter tones.
Why does melanin affect PPG sensors?
Green LED light, the most common wavelength in consumer wrist PPG sensors, is absorbed by melanin in skin. Higher melanin concentration in darker skin reduces the amount of light reaching the blood vessels beneath, weakening the PPG signal amplitude and reducing accuracy.
Which wearables have the least skin tone bias for heart rate?
Devices using multiple LED wavelengths, including red and infrared (which are less affected by melanin than green light), tend to show less skin tone bias. Oura Ring and newer Apple Watch and Samsung devices include infrared components that partially address this.
Has Apple Watch addressed skin tone bias?
Apple Watch Series 6 and newer include red and infrared LEDs alongside green ones. They were added primarily for SpO2 sensing, but these longer wavelengths are also less affected by melanin. Independent studies show improved but not fully eliminated skin tone bias in newer generations.
Should I calibrate my wearable for my skin tone?
Currently, no consumer wearable offers manual skin tone calibration. Some devices automatically adapt to individual users over time. The best approach is to be aware of potential bias and not over-rely on absolute heart rate numbers if you have darker skin.
Is skin tone bias worse for SpO2 than for heart rate?
Yes. Skin tone bias in SpO2 is more clinically significant and better documented. SpO2 is derived from the ratio of pulsatile signals at two wavelengths, so uneven melanin absorption distorts the result directly. Heart rate only requires detecting the timing of each pulse peak.
Is there research specifically on wearable accuracy for Black users?
Yes. Studies including the 2021 Icenhower et al. study and work from MIT and Stanford have specifically examined PPG accuracy across Fitzpatrick skin types. Several report reduced accuracy for Fitzpatrick 5-6 skin tones, though the magnitude varies and not every study finds a significant effect.