2013
A Zlatintsi, P Maragos
Multiscale Fractal Analysis of Musical Instrument Signals with Application to Recognition
Journal Article, 21 (4), pp. 737–748, 2013.

@article{ZlMa13,
  title = {Multiscale Fractal Analysis of Musical Instrument Signals with Application to Recognition},
  author = {A Zlatintsi and P Maragos},
  url = {http://robotics.ntua.gr/wp-content/publications/ZlatintsiMaragos_MultiscaleFractalAnalMusicInstrumSignalsApplicRecogn_ieeetASLP2013.pdf},
  year = {2013},
  date = {2013-04-01},
  volume = {21},
  number = {4},
  pages = {737--748},
  abstract = {In this paper, we explore nonlinear methods, inspired by the fractal theory for the analysis of the structure of music signals at multiple time scales, which is of importance both for their modeling and for their automatic computer-based recognition. We propose the multiscale fractal dimension (MFD) profile as a short-time descriptor, useful to quantify the multiscale complexity and fragmentation of the different states of the music waveform. We have experimentally found that this descriptor can discriminate several aspects among different music instruments, which is verified by further analysis on synthesized sinusoidal signals. We compare the descriptiveness of our features against that of Mel frequency cepstral coefficients (MFCCs), using both static and dynamic classifiers such as Gaussian mixture models (GMMs) and hidden Markov models (HMMs). The method and features proposed in this paper appear to be promising for music signal analysis, due to their capability for multiscale analysis of the signals and their applicability in recognition, as they accomplish an error reduction of up to 32%. These results are quite interesting and render the descriptor of direct applicability in large-scale music classification tasks.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
Athanasia Zlatintsi, Petros Maragos
Multiscale fractal analysis of musical instrument signals with application to recognition
Journal Article, IEEE Transactions on Audio, Speech and Language Processing, 21 (4), pp. 737–748, 2013, ISSN: 15587916.

@article{140,
  title = {Multiscale fractal analysis of musical instrument signals with application to recognition},
  author = {Athanasia Zlatintsi and Petros Maragos},
  url = {http://robotics.ntua.gr/wp-content/uploads/publications/ZlatintsiMaragos_MultiscaleFractalAnalMusicInstrumSignalsApplicRecogn_ieeetASLP2013.pdf},
  doi = {10.1109/TASL.2012.2231073},
  issn = {15587916},
  year = {2013},
  date = {2013-01-01},
  journal = {IEEE Transactions on Audio, Speech and Language Processing},
  volume = {21},
  number = {4},
  pages = {737--748},
  abstract = {In this paper, we explore nonlinear methods, inspired by the fractal theory for the analysis of the structure of music signals at multiple time scales, which is of importance both for their modeling and for their automatic computer-based recognition. We propose the multiscale fractal dimension (MFD) profile as a short-time descriptor, useful to quantify the multiscale complexity and fragmentation of the different states of the music waveform. We have experimentally found that this descriptor can discriminate several aspects among different music instruments, which is verified by further analysis on synthesized sinusoidal signals. We compare the descriptiveness of our features against that of Mel frequency cepstral coefficients (MFCCs), using both static and dynamic classifiers such as Gaussian mixture models (GMMs) and hidden Markov models (HMMs). The method and features proposed in this paper appear to be promising for music signal analysis, due to their capability for multiscale analysis of the signals and their applicability in recognition, as they accomplish an error reduction of up to 32%. These results are quite interesting and render the descriptor of direct applicability in large-scale music classification tasks.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
2012
A Zlatintsi, P Maragos
AM-FM Modulation Features for Music Instrument Signal Analysis and Recognition
Conference, Proc. European Signal Processing Conference, Bucharest, Romania, 2012.

@conference{ZlMa12,
  title = {AM-FM Modulation Features for Music Instrument Signal Analysis and Recognition},
  author = {A Zlatintsi and P Maragos},
  url = {http://robotics.ntua.gr/wp-content/publications/ZlatintsiMaragos_MusicalInstrumentsAMFM_EUSIPCO2012.pdf},
  year = {2012},
  date = {2012-08-01},
  booktitle = {Proc. European Signal Processing Conference},
  address = {Bucharest, Romania},
  abstract = {In this paper, we explore a nonlinear AM-FM model to extract alternative features for music instrument recognition tasks. Amplitude and frequency micro-modulations are measured in musical signals and are employed to model the existing information. The features used are the multiband mean instantaneous amplitude (mean-IAM) and mean instantaneous frequency (mean-IFM) modulation. The instantaneous features are estimated using the multiband Gabor Energy Separation Algorithm (Gabor-ESA). An alternative method, the iterative-ESA is also explored; and initial experimentation shows that it could be used to estimate the harmonic content of a tone. The Gabor-ESA is evaluated against and in combination with Mel frequency cepstrum coefficients (MFCCs) using both static and dynamic classifiers. The method used in this paper has proven to be able to extract the fine-structured modulations of music signals; further, it has shown to be promising for recognition tasks accomplishing an error rate reduction up to 60% for the best recognition case combined with MFCCs.},
  keywords = {},
  pubstate = {published},
  tppubtype = {conference}
}
Copyright Notice:
Some material presented is available for download to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
The work already published by the IEEE is under its copyright. Personal use of such material is permitted. However, permission to reprint/republish the material for advertising or promotional purposes, or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of the work in other works must be obtained from the IEEE.