Hi all,

I'm working on a singer recognition project that identifies the singer in a music recording via Multifeature Statistical Singer Modeling.

At this stage, I am trying to apply vocal/non-vocal segmentation using an SVM classifier (Matlab - Sptoolbox).

I extracted features for each frame (spectral centroid, spectral flux, zero crossings, and low energy) and tried to train the binary SVM classifier in 2-dimensional space, for example zero crossing rate vs. spectral centroid, but the error rate was too high (around 35%).

I also extracted MFCC coefficients from vocal/non-vocal regions, as suggested in the article I am basing my work on ("Hybrid Singer Identifier", John Shepherd), but I don’t understand how I can train the 2-dimensional binary SVM classifier using a 14-dimensional feature vector, or whether it is even possible to perform the classification in 2D space.

Here are some of the articles I used for reference, but they didn't give an answer, or perhaps I didn't understand the point.

• Mel frequency cepstral coefficients for music modeling
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.11.9216

• LOCATING SINGING VOICE SEGMENTS WITHIN MUSIC SIGNALS
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.3067

• Separation of vocals from polyphonic audio recordings
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.5580

I would appreciate it if someone could suggest any other articles on this issue, or briefly explain how to train the binary SVM classifier with MFCC coefficients in order to apply vocal/non-vocal segmentation.

Thanks in advance,
Rodion
vocal/non vocal segmentation using SVM classifier
Started by ●October 7, 2008
Reply by ●October 7, 2008
Sounds like a suitable question for comp.speech.research.
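On the 14-dimensional question itself: a binary SVM is not tied to two dimensions. It takes one feature vector per frame of any length and learns a separating hyperplane in that space; a 2D feature pair is mainly convenient for plotting. The sketch below is only an illustration of that point, not the method from any of the cited papers and not the Sptoolbox API. It trains a small linear soft-margin SVM by stochastic subgradient descent on the hinge loss, written in plain Python. The data are synthetic Gaussian vectors standing in for MFCC frames (an assumption, not real audio features), and the names `make_frames`, `train_linear_svm`, and `accuracy` are hypothetical helpers.

```python
import random

def make_frames(n, dim, offset, rng):
    """Synthetic stand-ins for per-frame feature vectors (NOT real MFCCs)."""
    X, y = [], []
    for _ in range(n):
        label = rng.choice((1, -1))  # +1 = vocal, -1 = non-vocal (assumed coding)
        X.append([rng.gauss(label * offset, 1.0) for _ in range(dim)])
        y.append(label)
    return X, y

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=20, seed=0):
    """Linear soft-margin SVM via SGD on the L2-regularised hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0  # one weight per feature dimension
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * (dot(w, X[i]) + b) < 1.0
            # Shrink the weights (gradient of the L2 penalty).
            w = [wj * (1.0 - eta * lam) for wj in w]
            if violated:
                # Hinge-loss subgradient step for frames inside the margin.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def accuracy(w, b, X, y):
    hits = sum(1 for xi, yi in zip(X, y)
               if (1 if dot(w, xi) + b >= 0 else -1) == yi)
    return hits / len(X)

rng = random.Random(42)
X_train, y_train = make_frames(400, 14, 0.5, rng)
X_test, y_test = make_frames(400, 14, 0.5, rng)

# Train on all 14 dimensions at once.
w14, b14 = train_linear_svm(X_train, y_train)
acc14 = accuracy(w14, b14, X_test, y_test)

# Train on only the first two dimensions of the same frames.
w2, b2 = train_linear_svm([x[:2] for x in X_train], y_train)
acc2 = accuracy(w2, b2, [x[:2] for x in X_test], y_test)

print("14-D test accuracy:", acc14)
print(" 2-D test accuracy:", acc2)
```

With real data, each row of `X_train` would be the MFCC vector of one labelled frame. Any gap between the two accuracies here reflects how the synthetic data were generated and says nothing about real recordings; the only point is that the training call is the same for 2 and for 14 dimensions.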