2014

@conference{170,
  title     = {Emotion classification of speech using modulation features},
  author    = {Theodora Chaspari and Dimitrios Dimitriadis and Petros Maragos},
  url       = {http://robotics.ntua.gr/wp-content/uploads/publications/ChaspariDimitriadisMaragos_EmotionRecognitionSpeech_EUSIPCO2014_cr.pdf},
  issn      = {2219-5491},
  year      = {2014},
  date      = {2014-01-01},
  booktitle = {European Signal Processing Conference},
  pages     = {1552--1556},
  abstract  = {Automatic classification of a speaker's affective state is one of the major challenges in the signal processing community, since it can improve human-computer interaction and give insights into the nature of emotions from a psychology perspective. The amplitude and frequency control of sound production strongly influences the affective voice content. In this paper, we take advantage of the inherent speech modulations and propose the use of instant amplitude- and frequency-derived features for efficient emotion recognition. Our results indicate that these features can further increase the performance of the widely used spectral-prosodic information, achieving improvements on two emotional databases, the Berlin Database of Emotional Speech and the recently collected Athens Emotional States Inventory.},
  keywords  = {},
  pubstate  = {published},
  tppubtype = {conference}
}
Copyright Notice:
Some of the material presented here is available for download to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by the authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
Work already published by the IEEE is under IEEE copyright. Personal use of such material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes, for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of the work in other works must be obtained from the IEEE.