Emotion Detection

The document discusses Speech Emotion Recognition (SER), highlighting its significance in enhancing human-computer interaction by enabling machines to understand and respond to human emotions. It reviews various deep learning methodologies and applications of SER across different industries, including healthcare, customer service, and entertainment. The challenges of recognizing emotional cues in speech due to their dynamic and subjective nature are also addressed, along with the potential benefits for diagnosing speech-related disorders and mental health conditions.

Uploaded by

abhiramrockz49

Emotion Recognition

N Sai Satwik Reddy∗, V Venkata Alluri Rohith∗, V Poorna Muni Sasidhar Reddy∗, Y Shashank Reddy∗, Jyothish Lal G∗
∗Amrita School of Artificial Intelligence, Coimbatore, Amrita Vishwa Vidyapeetham, India
{satwikreddy987@, vennaalluri1, vpoornareddy2004, ysrpersom}@[Link],
g jyothishlal@[Link]

Abstract—
Index Terms—Speech emotion recognition

I. INTRODUCTION
Speech signals, a fundamental aspect of human communication, carry an abundance of emotional data that enriches the meaning and significance of our conversations. Because they convey both linguistic content and the emotional state of the speaker, speech signals are extremely valuable. The recognition of these emotional cues has become extremely significant; thus, Speech Emotion Recognition (SER) has emerged as a research area with applications in many important aspects of our daily lives, including computer and robot interfaces, legally and socially acceptable applications, counselling, therapy, etc. In Human-Computer Interaction (HCI), speech emotion recognition plays an essential role in discriminating between the emotional nuances of human communication. With technology interaction becoming more and more conversational and personalized, incorporating emotional intelligence into these systems becomes indispensable. SER serves as a bridge for machines to comprehend and respond appropriately to human emotions, fostering a more natural and empathetic connection between humans and computers.
However, the problem of speech emotion recognition is inherently challenging due to the dynamic and subjective nature of emotions. Unlike other modalities such as hand gestures or facial expressions, speech emotions are often subtle and context-dependent, making their identification a complex task. Various modalities, including facial expressions, body language, physiological signals, and even textual analysis, contribute to a holistic understanding of emotions. Nevertheless, the prevalence of emotional information in audio waves makes speech a relevant modality for emotion recognition. The prevalence of the audio modality in emotion recognition can be attributed to its unique ability to capture the nuances of human expression, including prosody, intonation, and other acoustic features.
Speech emotion recognition holds promise for diagnosing and aiding patients with speech-related disorders such as dysarthria or stuttering. The subtle variations in speech patterns can provide valuable insights for medical professionals, aiding in the assessment and treatment of these disorders. In the realm of mental health, speech emotion recognition extends its utility to detecting conditions like depression, anxiety, and even suicidal thoughts. Analyzing acoustic features, such as pitch, intensity, and speech rate, can unveil emotional distress markers, enabling timely intervention and support for individuals in need. Some of the real-life applications of speech emotion recognition span diverse industries, from customer service and virtual assistants to education and entertainment. Advancements in these fields leverage SER to create more intuitive and responsive systems, ultimately enhancing user experiences. Industries benefit from improved customer satisfaction, personalized learning experiences, and emotionally engaging entertainment content, leading to a paradigm shift in how we interact with technology.

II. RELATED WORK
Numerous deep learning (DL) methodologies have been proposed for emotion recognition in recent years.
A lightweight dual-stream conformer fusion network is designed with convolution kernels of sizes (3×3), (1×11), and (11×1) to extract a diverse set of features from mel-spectrograms and Mel frequency cepstral coefficients (MFCCs) obtained from the audio signals [1]. The features extracted from these three different methods are then fed into the second part of the overall network for emotion classification. Constant-Q transform-based modulation spectrograms are extracted from the voice records from two well-known databases, EmoDB and RAVDESS, and fed into two different deep neural networks (DNNs) for classifying the emotions in [2]. The variant of DNN that used support vector machines (SVM), which took embeddings resulting from the DNN, outperformed the usual DNN. In [3], a combination of MFCCs and time-domain features is extracted and input into the convolutional neural network (CNN) for emotion recognition, and this approach also outperformed the standard machine learning (ML) approaches. Complex MFCCs are used as input to the sequential DNN in [4], and the metrics improved significantly when tested using gender-integrated differentiation in the RAVDESS dataset. Multiple acoustic features, including MFCCs, linear prediction cepstral coefficients (LPCCs), wavelet packet transform (WPT), and other time-domain features, are obtained from EmoDB and RAVDESS in [5], and a one-dimensional CNN is utilized for classification purposes. The architecture of the SER system proposed in [6] is designed for three tasks, which include the intensity estimation of the emotion, type of emotion, and gender identification. Time-domain and spectral-domain filters are applied to the mel-spectrograms extracted from the voice records and input into
the CNNs and long short-term memory (LSTM) for feature learning to perform the aforementioned tasks. In [7], the input into the VGG network is chaograms, which represent the 3-dimensional tensor obtained from the speech records in RGB color space. The gray wolf optimization method is used for fine-tuning the hyperparameters.
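
To make the three kernel sizes reported for [1] more concrete, the following minimal sketch applies (3×3), (1×11), and (11×1) kernels to a random array standing in for a mel-spectrogram. It is an illustration only, not the network from [1]: the helper `conv2d_valid`, the 64×100 input size, the averaging kernels (a real network would learn its weights), and the convention that rows are frequency bins and columns are time frames are all assumptions made here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_valid(x, kernel):
    """Valid-mode 2-D cross-correlation (no padding, stride 1)."""
    # windows has shape (H - kh + 1, W - kw + 1, kh, kw)
    windows = sliding_window_view(x, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


rng = np.random.default_rng(0)
# Hypothetical stand-in for a mel-spectrogram: 64 mel bins x 100 time frames.
spec = rng.standard_normal((64, 100))

# Kernel sizes quoted in the text; averaging weights are placeholders.
kernel_shapes = [(3, 3), (1, 11), (11, 1)]
feature_maps = {
    shape: conv2d_valid(spec, np.full(shape, 1.0 / (shape[0] * shape[1])))
    for shape in kernel_shapes
}

for shape, fmap in feature_maps.items():
    print(shape, "->", fmap.shape)
```

Under the assumed axis convention, the (1×11) kernel pools over 11 consecutive time frames within a single frequency bin, the (11×1) kernel pools over 11 neighbouring frequency bins within a single frame, and the (3×3) kernel captures local time-frequency patterns.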
The authors of [8] applied data augmentation by adding white Gaussian noise to the records and by generating pitch-shifted and time-stretched versions of the speech records. Subsequently, multiple time-domain and frequency-domain features such as zero-crossing rate (ZCR), MFCCs, chromagrams, etc., were extracted and fed into multiple DL models such as ensemble models, attention-based models, and transfer learning-based models for emotion recognition. A Raspberry Pi-based hardware implementation of the SER system is proposed in [9], utilizing a multi-layer perceptron neural network that uses MFCCs for classifying emotions. A blend of 2-dimensional CNN and LSTM networks with MFCC features as input is proposed in [10] and evaluated on a dataset comprising records from the RAVDESS, SAVEE, and TESS datasets to detect eight classes of emotions. In [11],
log-mel spectrograms are extracted from the audio signals
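
Two of the steps mentioned above for [8], adding white Gaussian noise and computing the zero-crossing rate, can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the pipeline used in [8]: the function names, the 10 dB signal-to-noise ratio, the 16 kHz sampling rate, and the synthetic 220 Hz tone used in place of a speech record are all hypothetical choices. Pitch shifting and time stretching are omitted because they are normally delegated to an audio library.

```python
import numpy as np


def add_white_noise(signal, snr_db, rng):
    """Return signal plus white Gaussian noise at roughly the requested SNR (dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return np.mean(signs[1:] != signs[:-1])


rng = np.random.default_rng(0)
sr = 16000                               # assumed sampling rate in Hz
t = np.arange(sr) / sr                   # one second of samples
clean = np.sin(2 * np.pi * 220 * t)      # synthetic tone, stand-in for speech

noisy = add_white_noise(clean, snr_db=10, rng=rng)

zcr_clean = zero_crossing_rate(clean)
zcr_noisy = zero_crossing_rate(noisy)
measured_snr = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))

print("ZCR clean:", zcr_clean)
print("ZCR noisy:", zcr_noisy)
print("Measured SNR (dB):", measured_snr)
```

The noisy copy shows a higher zero-crossing rate than the clean tone, which is one reason ZCR is usually combined with spectral features such as MFCCs rather than used on its own.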

III. METHODOLOGY
IV. RESULTS AND DISCUSSION
V. CONCLUSION
REFERENCES
[1] M. Tellai, L. Gao, and Q. Mao, “An efficient speech emotion recognition based on a dual-stream CNN-transformer fusion network,” International Journal of Speech Technology, vol. 26, no. 2, pp. 541–557, 2023.
[2] P. Singh, M. Sahidullah, and G. Saha, “Modulation spectral features for speech emotion recognition using deep neural networks,” Speech Communication, vol. 146, pp. 53–69, 2023.
[3] A. S. Alluhaidan, O. Saidani, R. Jahangir, M. A. Nauman, and O. S. Neffati, “Speech emotion recognition through hybrid features and convolutional neural network,” Applied Sciences, vol. 13, no. 8, p. 4750, 2023.
[4] S. Patnaik, “Speech emotion recognition by using complex MFCC and deep sequential model,” Multimedia Tools and Applications, vol. 82, no. 8, pp. 11897–11922, 2023.
[5] K. Bhangale and M. Kothandaraman, “Speech emotion recognition based on multiple acoustic features and deep convolutional neural network,” Electronics, vol. 12, no. 4, p. 839, 2023.
[6] Z.-T. Liu, M.-T. Han, B.-H. Wu, and A. Rehman, “Speech emotion recognition based on convolutional neural network with attention-based bidirectional long short-term memory network and multi-task learning,” Applied Acoustics, vol. 202, p. 109178, 2023.
[7] M. R. Falahzadeh, F. Farokhi, A. Harimi, and R. Sabbaghi-Nadooshan, “Deep convolutional neural network and gray wolf optimization algorithm for speech emotion recognition,” Circuits, Systems, and Signal Processing, vol. 42, no. 1, pp. 449–492, 2023.
[8] M. R. Ahmed, S. Islam, A. M. Islam, and S. Shatabda, “An ensemble 1D-CNN-LSTM-GRU model with data augmentation for speech emotion recognition,” Expert Systems with Applications, vol. 218, p. 119633, 2023.
[9] S. Kumar, M. A. Haq, A. Jain, C. A. Jason, N. R. Moparthi, N. Mittal, and Z. S. Alzamil, “Multilayer neural network based speech emotion recognition for smart assistance,” Computers, Materials & Continua, vol. 75, no. 1, 2023.
[10] J. Singh, L. B. Saheer, and O. Faust, “Speech emotion recognition using attention model,” International Journal of Environmental Research and Public Health, vol. 20, no. 6, p. 5140, 2023.
[11] Z. Chen, J. Li, H. Liu, X. Wang, H. Wang, and Q. Zheng, “Learning multi-scale features for speech emotion recognition with connection attention mechanism,” Expert Systems with Applications, vol. 214, p. 118943, 2023.
