PPG for Emotion Recognition: Affective Computing with Photoplethysmography

Technical review of emotion recognition from PPG signals covering autonomic correlates, feature extraction, deep learning models, and affective computing applications.

ChatPPG Research Team

Emotions are not purely cognitive events -- they are embodied physiological states that leave measurable traces in cardiovascular signals. Fear accelerates the heart and constricts blood vessels. Relaxation slows the heart and opens peripheral circulation. These autonomic signatures, captured by photoplethysmography through the skin, enable a growing field of affective computing that seeks to recognize emotional states from physiological data. While the accuracy of PPG-based emotion recognition remains below that of facial or vocal analysis, its unobtrusiveness and resistance to voluntary masking make it uniquely valuable for continuous, honest emotional assessment.

This article reviews the physiological basis linking emotions to PPG signals, the feature extraction and classification methods used in the research literature, benchmark datasets and reported accuracies, and the practical applications and ethical considerations of physiological emotion sensing. For foundational context on PPG signal acquisition, see our introduction to PPG technology.

Theoretical Foundations: Emotions and the Autonomic Nervous System

The relationship between emotions and cardiovascular physiology has been studied for over a century, beginning with William James's somatic theory of emotion (1884), which proposed that bodily changes are not consequences but rather constituents of emotional experience. Modern psychophysiology has established that different emotional states produce partially distinct patterns of autonomic nervous system (ANS) activation, though the specificity and consistency of these patterns remain debated.

The Circumplex Model of Affect

Most PPG-based emotion recognition research adopts Russell's (1980) circumplex model, which represents emotions along two continuous dimensions: valence (positive to negative) and arousal (calm to activated). This model maps discrete emotions onto a two-dimensional space: happiness is high valence, moderate-to-high arousal; sadness is low valence, low arousal; anger is low valence, high arousal; and relaxation is high valence, low arousal.
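The quadrant mapping can be sketched in a few lines. The rating scale, midpoint split, and quadrant labels below are illustrative assumptions (DEAP, for example, uses 1-9 self-assessment scales):

```python
def circumplex_quadrant(valence, arousal, midpoint=5.0):
    """Map a (valence, arousal) rating pair to a circumplex quadrant.

    Assumes ratings on a 1-9 self-assessment scale split at the
    midpoint; the quadrant labels are illustrative, not canonical.
    """
    high_v = valence >= midpoint
    high_a = arousal >= midpoint
    if high_v and high_a:
        return "happy/excited"      # high valence, high arousal
    if high_v and not high_a:
        return "relaxed/content"    # high valence, low arousal
    if not high_v and high_a:
        return "angry/afraid"       # low valence, high arousal
    return "sad/depressed"          # low valence, low arousal
```

Many studies binarize exactly this way, which is why results are often reported as separate "binary valence" and "binary arousal" accuracies rather than four-class accuracy.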

The circumplex model is preferred over discrete emotion categories for physiological sensing because the autonomic nervous system primarily encodes arousal (through sympathetic-parasympathetic balance) rather than the specific cognitive content of emotions. Valence has weaker and less consistent autonomic correlates, which is why arousal classification from PPG typically outperforms valence classification by 5-15 percentage points.

Autonomic Specificity of Emotions

Ekman et al. (1983; DOI: 10.1126/science.6612338) provided early evidence for emotion-specific autonomic patterns, showing that anger and fear produced larger heart rate increases than happiness, and that anger was associated with greater finger temperature increase (peripheral vasodilation) than fear (which produced vasoconstriction). Subsequent meta-analyses have confirmed that different emotions produce statistically distinguishable cardiovascular patterns, though with substantial overlap and individual variability.

Kreibig (2010; DOI: 10.1016/j.biopsycho.2010.03.010) conducted a comprehensive review of autonomic correlates of discrete emotions across 134 studies. Key findings relevant to PPG-based detection include: fear and anger produce the largest heart rate increases (8-15 BPM above baseline); sadness produces variable heart rate responses depending on whether it involves active crying (increase) or passive withdrawal (decrease); happiness produces moderate heart rate increase (3-7 BPM) and peripheral vasodilation; and disgust is uniquely associated with heart rate deceleration. These differential patterns, while statistically significant at the group level, overlap considerably within individuals, limiting classification accuracy.

PPG Feature Extraction for Emotion Recognition

Extracting emotion-relevant features from PPG signals involves the same categories used for cognitive load detection -- HRV, amplitude, and morphology -- but with emphasis on features that differentiate emotional valence in addition to arousal. For background on HRV analysis methods, see our HRV chart guide.

Time-Domain HRV Features

Inter-beat intervals (IBIs) derived from PPG peak detection provide the foundation for HRV analysis. Key features for emotion recognition include mean IBI (inversely proportional to heart rate), SDNN (overall HRV reflecting total autonomic modulation), RMSSD (short-term HRV reflecting vagal parasympathetic activity), and pNN50 (the percentage of successive IBI differences exceeding 50 ms). Under high-arousal emotions (anger, fear, excitement), RMSSD and pNN50 decrease as vagal withdrawal occurs. Under low-arousal states (sadness, relaxation), parasympathetic activity may increase or normalize.
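These time-domain features can be computed directly from an IBI series. A minimal NumPy sketch, assuming IBIs in milliseconds and ignoring the ectopic-beat and artifact handling a production pipeline would need:

```python
import numpy as np

def time_domain_hrv(ibi_ms):
    """Standard time-domain HRV features from inter-beat intervals
    (milliseconds), e.g. derived from PPG peak detection."""
    ibi = np.asarray(ibi_ms, dtype=float)
    diffs = np.diff(ibi)                       # successive IBI differences
    return {
        "mean_ibi": ibi.mean(),                # inversely related to heart rate
        "sdnn": ibi.std(ddof=1),               # overall variability
        "rmssd": np.sqrt(np.mean(diffs**2)),   # short-term (vagal) variability
        "pnn50": 100.0 * np.mean(np.abs(diffs) > 50.0),  # % of diffs > 50 ms
    }
```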

Valenza et al. (2014; DOI: 10.1109/JBHI.2013.2271299) analyzed HRV features during emotional stimulation in 30 subjects watching emotion-eliciting film clips. They found that RMSSD and SDNN discriminated arousal levels with effect sizes (Cohen's d) of 0.82 and 0.71 respectively, while mean heart rate had a smaller effect size of 0.58. For valence discrimination, SDNN showed a modest effect (d = 0.34), while RMSSD showed no significant valence effect, confirming the arousal-bias of time-domain HRV features.

Frequency-Domain Features

Spectral analysis of IBI series separates sympathetic and parasympathetic contributions. HF power (0.15-0.40 Hz) reflects parasympathetic vagal modulation and decreases under high-arousal emotions. LF power (0.04-0.15 Hz) reflects a mixture of sympathetic and parasympathetic activity. The LF/HF ratio increases during sympathetically-dominant emotional states (anger, fear) and decreases during parasympathetically-dominant states (relaxation, certain sadness states).
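Because heartbeats are unevenly spaced in time, spectral HRV analysis first interpolates the IBI series onto a uniform grid. A rough NumPy sketch of LF/HF estimation, assuming a 4 Hz resampling rate and a plain periodogram rather than a validated HRV toolchain:

```python
import numpy as np

def lf_hf_ratio(ibi_ms, fs=4.0):
    """Estimate LF (0.04-0.15 Hz) and HF (0.15-0.40 Hz) power from an
    IBI series: interpolate to a uniform grid, then integrate a simple
    periodogram over each band. A sketch, not a clinical implementation."""
    ibi = np.asarray(ibi_ms, dtype=float)
    t = np.cumsum(ibi) / 1000.0                # beat times in seconds
    grid = np.arange(t[0], t[-1], 1.0 / fs)    # uniform resampling grid
    x = np.interp(grid, t, ibi) - np.interp(grid, t, ibi).mean()  # remove DC
    spec = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    lf = spec[(freqs >= 0.04) & (freqs < 0.15)].sum()
    hf = spec[(freqs >= 0.15) & (freqs < 0.40)].sum()
    return lf, hf, lf / hf if hf > 0 else np.inf
```

Feeding in a series with a 0.25 Hz oscillation (a typical respiratory sinus arrhythmia rate) should place most of the power in the HF band.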

Notably, some researchers have identified frequency-domain features that partially discriminate valence. Rainville et al. (2006; DOI: 10.1016/j.ijpsycho.2005.10.024) found that anger and fear (both negative valence, high arousal) could be partially distinguished by their spectral HRV profiles: anger showed greater LF power increase, while fear showed greater HF power suppression, consistent with differential sympathetic versus parasympathetic activation patterns.

Pulse Wave Morphology Features

PPG waveform shape carries emotion-relevant information beyond inter-beat timing. The dicrotic notch prominence, pulse rise time, pulse width at half maximum, and reflection index change with arterial tone and blood pressure, which are modulated by emotional arousal.

Lee et al. (2019) extracted 42 morphological features from PPG waveforms during emotion induction in 50 subjects and found that the combination of pulse rise time (which shortened during anger and fear) and dicrotic notch depth (which decreased during high-arousal states) improved four-class emotion classification accuracy by 6% compared to HRV features alone, reaching 62% accuracy.
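Two of the morphological features above, pulse rise time and width at half maximum, can be sketched for a single foot-to-foot pulse segment. The baseline handling and feature names here are simplifying assumptions; dicrotic notch detection, which requires derivative analysis, is omitted:

```python
import numpy as np

def pulse_morphology(pulse, fs):
    """Simple morphological features from one PPG pulse (a foot-to-foot
    segment sampled at fs Hz); an illustrative sketch."""
    pulse = np.asarray(pulse, dtype=float)
    pulse = pulse - pulse.min()                # baseline at the pulse foot
    peak = int(np.argmax(pulse))
    rise_time = peak / fs                      # foot-to-peak time (s)
    half = pulse.max() / 2.0
    above = np.nonzero(pulse >= half)[0]
    width_half_max = (above[-1] - above[0]) / fs  # width at half maximum (s)
    return {"rise_time": rise_time, "width_half_max": width_half_max}
```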

Non-Linear and Complexity Features

Emotional states may alter the complexity of cardiac dynamics in ways not captured by linear features. Approximate entropy, sample entropy, Lyapunov exponents, and correlation dimension of IBI series have all been explored for emotion classification. Valenza et al. (2012; DOI: 10.1109/TBME.2012.2190922) demonstrated that non-linear HRV features improved valence classification accuracy by 4-8% over linear features alone, suggesting that the complexity structure of cardiac regulation carries emotion-specific information. For more on PPG-derived algorithms, see our algorithms section.
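Sample entropy, one of the complexity measures above, can be sketched with a brute-force template-matching implementation. The m = 2 and r = 0.2 x SD defaults are common conventions, not values taken from the cited studies, and the quadratic loop is for clarity rather than speed:

```python
import numpy as np

def sample_entropy(x, m=2, r_frac=0.2):
    """Sample entropy of a series (e.g. an IBI sequence): negative log of
    the conditional probability that templates matching for m points also
    match for m+1 points. Brute-force O(n^2) sketch."""
    x = np.asarray(x, dtype=float)
    r = r_frac * x.std()                       # tolerance as fraction of SD

    def count_matches(length):
        templates = np.array([x[i:i + length] for i in range(len(x) - length)])
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance: all points within tolerance r
                if np.max(np.abs(templates[i] - templates[j])) <= r:
                    count += 1
        return count

    b, a = count_matches(m), count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf
```

A smooth, predictable series should score lower (more regular) than white noise, which is the qualitative behavior emotion studies exploit.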

Machine Learning and Deep Learning Approaches

Traditional Machine Learning

Support vector machines (SVMs) with radial basis function kernels have been the most widely used classifier for PPG-based emotion recognition. Handcrafted feature vectors combining 10-30 HRV, amplitude, and morphological features are classified using SVMs trained on labeled emotional episodes.
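A minimal version of this pipeline, assuming scikit-learn and synthetic feature vectors in place of real labeled emotional episodes; standardization matters because HRV, amplitude, and morphology features span very different scales (milliseconds versus dimensionless ratios), and the hyperparameters shown would normally be tuned per dataset:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_emotion_svm(features, labels):
    """Fit an RBF-kernel SVM on handcrafted PPG feature vectors.
    A sketch: real studies add feature selection and cross-validation."""
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    clf.fit(features, labels)
    return clf

# Hypothetical usage with synthetic low- vs. high-arousal feature clusters:
rng = np.random.default_rng(1)
calm = rng.normal(0.0, 0.3, size=(40, 5))      # stand-in "calm" features
aroused = rng.normal(1.5, 0.3, size=(40, 5))   # stand-in "aroused" features
X = np.vstack([calm, aroused])
y = [0] * 40 + [1] * 40
model = train_emotion_svm(X, y)
```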

Picard et al. (2001; DOI: 10.1109/5.940843) pioneered physiological emotion recognition, achieving 81% accuracy for eight emotional states using multiple physiological signals including PPG in a single-subject study. Multi-subject generalization proved far more challenging, with subsequent group-level studies typically achieving 55-70% accuracy for four emotion classes.

Schmidt et al. (2018; DOI: 10.1145/3242969.3242985) evaluated multiple classifiers on the WESAD dataset (wrist-worn PPG and accelerometer data from 15 subjects during stress, amusement, and baseline conditions). Using PPG features alone, random forest achieved the best three-class accuracy of 67.7%, while adding accelerometer features improved accuracy to 72.1%. With both wrist PPG and chest sensors, accuracy reached 84.7%.

Deep Learning Architectures

Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) applied to raw or minimally processed PPG signals have shown promise for emotion recognition without manual feature engineering.

Sarkar and Etemad (2020; DOI: 10.1109/JBHI.2020.2967200) proposed a self-supervised learning framework for PPG-based emotion recognition. They pre-trained a CNN on a large unlabeled PPG dataset using contrastive learning (predicting whether two PPG segments came from the same person), then fine-tuned on labeled emotion data. On the AMIGOS dataset (40 subjects watching emotion-eliciting videos), this approach achieved binary valence accuracy of 78.2% and arousal accuracy of 82.6% using PPG alone, outperforming supervised-only training by 6-9 percentage points.

Long short-term memory (LSTM) networks capture temporal dynamics in PPG-derived feature sequences. Siddharth et al. (2019) used a bidirectional LSTM operating on 10-second PPG feature windows to classify four quadrants of the valence-arousal space. Classification accuracy reached 72.3% on the DEAP dataset (32 subjects), with the temporal modeling providing 5% improvement over static feature classification.

Benchmark Datasets

Reproducible emotion recognition research depends on standardized datasets with physiological recordings and emotion labels. Several publicly available datasets include PPG data.

DEAP Dataset

The DEAP dataset (Koelstra et al., 2012; DOI: 10.1109/T-AFFC.2011.15) contains physiological recordings from 32 participants watching 40 one-minute music videos, rated on valence, arousal, dominance, and liking scales. While the primary physiological signals are EEG and peripheral channels, PPG is included among the peripheral recordings. DEAP is the most widely used benchmark for physiological emotion recognition, with over 1,500 citing papers.

Reported PPG-only results on DEAP include binary valence classification accuracies of 62-78% and binary arousal accuracies of 68-83%, depending on the feature extraction and classification approach. The relatively modest valence accuracies on DEAP have been attributed partly to the weakness of music video stimuli for inducing strong valence differences and partly to noisy self-report labels.

WESAD Dataset

The WESAD dataset (Schmidt et al., 2018) specifically targets wearable stress and affect detection, making it highly relevant for PPG-based applications. It includes wrist-worn PPG (Empatica E4) and chest-worn multi-sensor data from 15 subjects during baseline, stress (Trier Social Stress Test), amusement (funny video clips), and meditation conditions. The stress induction protocol produces strong, ecologically valid emotional responses with clear autonomic signatures.

PPG-only three-class classification (baseline, stress, amusement) on WESAD typically achieves 67-75% accuracy, while binary stress detection (stress vs. non-stress) reaches 80-87% accuracy. The dataset's limitation is its small size (15 subjects) and imbalanced class distribution.

AMIGOS Dataset

The AMIGOS dataset (Miranda-Correa et al., 2018; DOI: 10.1109/TAFFC.2018.2884461) recorded EEG, ECG, GSR, and PPG from 40 participants watching short and long emotion-eliciting videos in both individual and group settings. It includes social context annotations, making it suitable for studying how social context modulates physiological emotional responses.

Applications of PPG-Based Emotion Recognition

Mental Health Monitoring

Continuous emotion tracking through wearable PPG could support mental health management by detecting prolonged negative emotional states, tracking mood patterns over days and weeks, and providing objective data to complement self-report assessments. Sano et al. (2018; DOI: 10.2196/10101) demonstrated that wrist-worn sensor data including PPG-derived HRV could predict daily mood ratings with a correlation of r = 0.42 over a 30-day monitoring period in 201 college students.

Human-Computer Interaction

Emotion-aware interfaces could adapt their behavior based on the user's emotional state -- calming down an agitated user, energizing a bored user, or deferring complex tasks when the user appears stressed. This application connects closely with cognitive load detection, as emotional arousal and cognitive load share autonomic signatures.

Consumer Wellness

Smartwatches and fitness trackers are increasingly incorporating stress and emotion tracking features based on PPG-derived HRV. These features typically provide daily stress scores, relaxation prompts, and trend visualizations. While the accuracy of consumer implementations is lower than research-grade systems, the longitudinal data they collect may be valuable for identifying patterns and triggers.

Affective Gaming and Media

Interactive entertainment that responds to the player's emotional state represents a growing application area. Games that become more or less challenging based on detected stress, or narratives that branch based on emotional engagement, could create more immersive experiences. PPG-based sensing is well-suited for this application because gaming controllers and VR headsets can incorporate optical sensors with minimal additional hardware.

Ethical Considerations

PPG-based emotion recognition raises significant ethical concerns that distinguish it from other PPG applications. Unlike heart rate monitoring, which provides accepted health information, emotion detection involves inferring subjective internal states from physiological data, with potential for misuse.

Concerns include workplace surveillance (monitoring employee emotions without consent), manipulative personalization (using detected emotional vulnerability to push commercial decisions), and privacy implications of continuous emotional monitoring. The moderate accuracy of current systems adds the risk of false inferences -- labeling someone as stressed, angry, or disengaged based on ambiguous physiological signals.

Crawford et al. (2019) have argued that physiological emotion recognition systems should be subject to specific regulatory frameworks that address informed consent, data minimization, and the right to not have one's emotional state inferred. These considerations are particularly important as PPG-based emotion sensing moves from research laboratories into consumer products and workplace applications.

Limitations and Future Directions

The fundamental limitation of PPG-based emotion recognition is the many-to-one mapping between emotions and autonomic responses. Different emotions can produce similar cardiovascular patterns (anger and excitement both increase heart rate), and the same emotion can produce different patterns across individuals and contexts. This inherent physiological ambiguity places a ceiling on classification accuracy that no algorithm can fully overcome from PPG data alone.

Multi-modal fusion -- combining PPG with electrodermal activity (EDA), facial expression, voice prosody, and contextual information -- substantially improves accuracy but increases system complexity. Context-aware models that incorporate time of day, activity level, and social context as additional inputs could help disambiguate emotional from non-emotional autonomic changes.

Personalization through individual calibration and continual learning from user feedback may ultimately be the most impactful improvement strategy. A system that learns each user's unique physiological emotional fingerprint over weeks or months could achieve substantially higher accuracy than cross-subject models, bringing PPG-based emotion recognition from a research curiosity to a practically useful technology. For more on how PPG signals are processed and interpreted across clinical and consumer applications, explore our conditions database and algorithms reference.

Conclusion

PPG-based emotion recognition exploits the autonomic nervous system's role in emotional experience to infer affective states from cardiovascular signals. Current systems achieve 75-88% accuracy for arousal classification and 70-82% for valence classification in controlled settings, with specific discrete emotion recognition remaining more challenging at 45-65% accuracy. Deep learning approaches, particularly self-supervised pre-training and temporal modeling, have pushed performance boundaries beyond traditional feature-based classifiers. While PPG alone cannot match the accuracy of facial expression or multi-modal systems for emotion recognition, its unobtrusiveness, resistance to voluntary masking, and compatibility with existing wearable hardware make it a uniquely valuable modality for continuous affective computing. The transition from laboratory demonstrations to real-world applications will require addressing ethical concerns, improving cross-subject generalization, and developing robust methods for separating emotional arousal from physical activity and other confounding sources of autonomic activation.

Frequently Asked Questions

Can PPG sensors detect emotions accurately?
PPG sensors can detect broad emotional categories with moderate accuracy. Binary classification of emotional valence (positive vs. negative) achieves 70-82% accuracy in controlled laboratory settings, while arousal classification (calm vs. excited) reaches 75-88% accuracy because arousal has a stronger autonomic nervous system signature. Classifying specific discrete emotions (happiness, sadness, anger, fear) is substantially harder, with accuracies of 45-65% for four-class classification. Performance is limited by the inherent ambiguity of physiological responses to emotions and large inter-individual variability in autonomic reactivity.
What is the difference between valence and arousal in emotion recognition?
Valence and arousal are the two primary dimensions of Russell's circumplex model of emotion. Valence describes whether an emotion is positive (pleasant) or negative (unpleasant) -- for example, happiness has positive valence and sadness has negative valence. Arousal describes the intensity of physiological activation -- excitement and anger are high arousal, while calmness and sadness are low arousal. PPG-derived features like heart rate variability and pulse wave amplitude primarily reflect arousal through sympathetic-parasympathetic balance, making arousal classification generally more accurate than valence classification from PPG alone.
How does PPG-based emotion detection compare to facial expression analysis?
PPG-based emotion detection and facial expression analysis have complementary strengths and weaknesses. Facial expression analysis achieves higher accuracy for discrete emotion classification (typically 75-90% for six basic emotions) because facial muscle configurations map more directly onto specific emotion categories. However, facial expressions can be deliberately masked or faked, while PPG-measured autonomic responses are largely involuntary and harder to consciously control. PPG also works in darkness, during physical activity, and without cameras, making it more suitable for continuous unobtrusive monitoring. Many affective computing systems combine both modalities for improved robustness.
Is PPG-based stress detection reliable enough for consumer health apps?
Current PPG-based stress detection in consumer wearables provides useful but imperfect information. Binary stress/no-stress classification achieves 75-85% accuracy under controlled conditions, but real-world accuracy is lower due to physical activity confounds, posture changes, and caffeine effects that mimic stress-related autonomic activation. Consumer apps typically use HRV-based stress scores calibrated to individual baselines over days or weeks, improving personalized accuracy. These scores are best interpreted as trends rather than absolute measurements. They are clinically meaningful for tracking chronic stress patterns but should not be relied upon for acute stress diagnosis or medical decision-making.