A Comparative Study of Machine Learning and Deep Learning Algorithms for Speech Emotion Recognition

Document Type : Original Article

Authors

1 Computer Science Department, Faculty of Computers and Information, Menoufia University, Shebin Elkom 32511, Egypt, Rania_anwer@hotmail.com

2 Computer Science Department, Faculty of Computers and Information, Menoufia University, Shebin Elkom 32511, Egypt, fci_3mh@yahoo.com

3 Computer Science Department, Faculty of Computers and Information, Menoufia University, Shebin Elkom 32511, Egypt, arabikeshk@yahoo.com

Abstract

People have been "chatting" with machines for a long time. Speech signal processing is a long-standing research topic, and its applications in everyday life continue to evolve. One of these applications is speech emotion recognition (SER), the process of identifying the emotions expressed in a person's speech. This is a challenging task, as emotions are subjective and can be expressed differently by different people. Emotions are also difficult to categorize because there is no single, universally agreed-upon set of criteria for doing so. Despite these challenges, SER has the potential to be a valuable technique in many applications, such as customer service, healthcare, and education.

Overall, SER is a promising field of research with the potential to improve our understanding of human emotion and to develop new applications that can benefit society.

This paper compares four methods used to recognize emotion from speech. The comparison is carried out mainly on the IEMOCAP dataset, a large and well-annotated corpus of emotional speech.

Keywords