2014
A. Katsamanis, I. Rodomagoulakis, G. Potamianos, P. Maragos, A. Tsiami
Robust far-field spoken command recognition for home automation combining adaptation and multichannel processing
Conference: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2014, ISSN: 15206149.

@conference{171,
  title = {Robust far-field spoken command recognition for home automation combining adaptation and multichannel processing},
  author = {A. Katsamanis and I. Rodomagoulakis and G. Potamianos and P. Maragos and A. Tsiami},
  url = {http://robotics.ntua.gr/wp-content/uploads/publications/KatsamanisEtAl_MultichannelASR_DIRHA_icassp2014.pdf},
  doi = {10.1109/ICASSP.2014.6854664},
  issn = {15206149},
  year = {2014},
  date = {2014-01-01},
  booktitle = {ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings},
  pages = {5547--5551},
  abstract = {The paper presents our approach to speech-controlled home automation. We are focusing on the detection and recognition of spoken commands preceded by a key-phrase as recorded in a voice-enabled apartment by a set of multiple microphones installed in the rooms. For both problems we investigate robust modeling, environmental adaptation and multichannel processing to cope with a) insufficient training data and b) the far-field effects and noise in the apartment. The proposed integrated scheme is evaluated in a challenging and highly realistic corpus of simulated audio recordings and achieves F-measure close to 0.70 for key-phrase spotting and word accuracy close to 98% for the command recognition task.},
  keywords = {},
  pubstate = {published},
  tppubtype = {conference}
}
Copyright Notice:
Some material presented is available for download to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
The work already published by the IEEE is under its copyright. Personal use of such material is permitted. However, permission to reprint/republish the material for advertising or promotional purposes, or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of the work in other works must be obtained from the IEEE.