Hi,

I'm working on a term-project in a Speech Communications course. The project's goal is to increase the quality of speech by encoding the high band (above 4kHz) with parametric information, and then (for the synthesis stage) just filter white noise and add the output to the low-passed speech signal (i.e., the speech signal sampled at 8kHz, with spectral contents limited to below 4kHz).

What I'm looking for is speech audio samples at CD-quality (i.e., sampled at 44.1kHz, 16 bits), so that I can use them to evaluate/demonstrate the effectiveness of the method. Actually, it wouldn't have to be 44.1kHz -- audio sampled at 32kHz, 22.05kHz, or even 16kHz should be ok, as long as it is recorded preserving as much of the actual signal spectrum as the sampling rate allows.

(yes, I know, I can record my own voice -- and that's what I've done so far; but I'd like to find some additional samples to test the method with a variety of voices)

Thanks,

Carlos
--
Where can I find speech samples (audio)?
Started by ●April 5, 2004
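The scheme described in the opening post could be sketched roughly as follows. This is only an illustration under stated assumptions, not Carlos's actual implementation: the "parametric information" is assumed to be one RMS gain per 20 ms frame of the band above 4kHz, the band split uses a 63-tap windowed-sinc FIR, and the low band is kept at the wideband rate instead of being decimated to 8kHz. All function names are hypothetical.

```python
import math
import random


def lowpass_taps(cutoff_hz, fs, num_taps=63):
    """Hamming-windowed sinc low-pass FIR (odd length, unity gain at DC)."""
    mid = (num_taps - 1) // 2
    fc = cutoff_hz / fs  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir_same(x, taps):
    """Filter x and compensate the (len(taps) - 1) / 2 sample group delay."""
    mid = (len(taps) - 1) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for j, t in enumerate(taps):
            i = n + mid - j
            if 0 <= i < len(x):
                acc += t * x[i]
        out.append(acc)
    return out


def split_bands(x, fs, cutoff_hz=4000.0):
    """Return (low, high), where high is the complement x - low."""
    low = fir_same(x, lowpass_taps(cutoff_hz, fs))
    high = [a - b for a, b in zip(x, low)]
    return low, high


def frame_rms(x, frame_len):
    """RMS value of each consecutive frame (the last frame may be shorter)."""
    frames = [x[i:i + frame_len] for i in range(0, len(x), frame_len)]
    return [math.sqrt(sum(v * v for v in f) / len(f)) for f in frames]


def encode(x, fs, frame_len=320):
    """Analysis: keep the low band, describe the high band by per-frame RMS."""
    low, high = split_bands(x, fs)
    return low, frame_rms(high, frame_len)


def decode(low, gains, fs, frame_len=320, seed=0):
    """Synthesis: high-pass white noise, scale each frame to its stored RMS,
    and add the result to the low band."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(len(low))]
    _, shaped = split_bands(noise, fs)
    out = []
    for k, gain in enumerate(gains):
        start = k * frame_len
        frame = shaped[start:start + frame_len]
        rms = math.sqrt(sum(v * v for v in frame) / len(frame)) if frame else 0.0
        scale = gain / rms if rms > 0.0 else 0.0
        out.extend(lo + scale * v for lo, v in zip(low[start:start + frame_len], frame))
    return out


if __name__ == "__main__":
    fs = 16000
    # Stand-in test signal: a 1 kHz tone plus a weaker 6 kHz tone.
    x = [math.sin(2 * math.pi * 1000 * n / fs) + 0.5 * math.sin(2 * math.pi * 6000 * n / fs)
         for n in range(1600)]
    low, gains = encode(x, fs)
    y = decode(low, gains, fs)
    print(len(y), [round(g, 3) for g in gains])
```

A single gain per frame captures only the energy of the high band, not its spectral shape; a fuller version would presumably also transmit a coarse spectral envelope and use it to shape the noise.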
Reply by ●April 5, 2004
Carlos Moreno wrote:
> Hi,
>
> I'm working on a term-project in a Speech Communications
> course. The project's goal is to increase the quality of
> speech by encoding the high band (above 4kHz) with
> parametric information, and then (for the synthesis stage)
> just filter white noise and add the output to the low-passed
> speech signal (i.e., the speech signal sampled at 8kHz, with
> spectral contents limited to below 4kHz)
>
> What I'm looking for is speech audio samples at CD-quality
> (i.e., sampled at 44.1kHz, 16 bits), so that I can use them
> to evaluate/demonstrate the effectiveness of the method.
> Actually, it wouldn't have to be 44.1kHz -- audio sampled
> at 32kHz, 22.05kHz, or even 16kHz should be ok, as long as
> it is recorded preserving as much from the actual signal
> spectrum as the sampling rate allows.
>
> (yes, I know, I can record my own voice -- and that's what
> I've done so far; but I'd like to find some additional
> samples to test the method with a variety of voices)
>
> Thanks,
>
> Carlos
> --

Two divergent suggestions:

1. If you need a sample that has some history of being used as a formal research sample, post your question to comp.speech.research .

2. I had a similar requirement for some initial experimentation. I already had a copy of Alexander Scourby reading the King James Bible. It has a large sample of high quality speech recorded by a single speaker under studio quality conditions. You might check your local public library to see what "audio books" they have on CD. I assume the use of a few minutes of the recording would fall under the intent of "fair use". If you want to use more, write the publisher with a description of what you hope to accomplish and how it would benefit them. You might get a couple of free CDs or even some serious research support.
Reply by ●April 5, 2004
Richard Owlett wrote:

Thanks Richard for your suggestions. A couple of comments:

> 1. If you need a sample that has some history of being used as a formal
> research sample, post your question to comp.speech.research .

I wasn't even familiar with this newsgroup -- I'll keep it in mind.

> speaker under studio quality conditions. You might check your local
> public library to see what "audio books" they have on CD. I assume the
> use of a few minutes of the recording would fall under the intent of
> "fair use".

I had actually thought about it -- or even simpler, any movie or TV series on DVD (for instance, I have several seasons of X-Files, where we enjoy the most spectacular voice ever -- that of Gillian Anderson, of course :-)).

But then I'm worried that these recordings have already been passed through non-linear processing (e.g., compression) that might introduce artifacts that, even if not obvious to the ear, might make them poor choices to evaluate a speech processing system.

I guess I can try. DVD audio has to be brutally superior to telephone-quality audio. Also, maybe "audio books" have clean, CD-quality audio? (it makes sense, being that they're CDs :-))

Thanks!

Carlos
--
Reply by ●April 6, 2004
Hi.

Have a look at the SQAM discs. Some nice excerpts (including speech) can be found at:

http://www.tnt.uni-hannover.de/project/mpeg/audio/sqam/

--
/Mads (http://kom.aau.dk/~mgc)
Reply by ●April 6, 2004
christensen@nospam.ieee.org (Mads G. Christensen) wrote in message news:<wky3c7h76gj.fsf@zil.kom.auc.dk>...
> Hi.
>
> Have a look at the SQAM discs. Some nice excerpts (including speech)
> can be found at:
>
> http://www.tnt.uni-hannover.de/project/mpeg/audio/sqam/

NTT (Nippon Telegraph and Telephone) made a CD some years ago in several languages for research. I do not have it handy, but an inquiry on Google might pull up something.

Maurice Givens
Reply by ●April 6, 2004
Another idea: many radio stations have audio from past broadcasts for download or streaming. For example, try http://espnradio.com for a plethora of sports-related programming. (I think most of these are streaming media, and I don't know for sure what is involved in converting to wave files.)

"Maurice Givens" <maurice.givens@ieee.org> wrote in message news:eb93cce8.0404061359.69360513@posting.google.com...
> christensen@nospam.ieee.org (Mads G. Christensen) wrote in message
> news:<wky3c7h76gj.fsf@zil.kom.auc.dk>...
> > Hi.
> >
> > Have a look at the SQAM discs. Some nice excerpts (including speech)
> > can be found at:
> >
> > http://www.tnt.uni-hannover.de/project/mpeg/audio/sqam/
>
> NTT (Nippon Telegraph and Telephone) made a CD some years ago in
> several languages for research. I do not have it handy, but an
> inquiry on Google might pull up something.
>
> Maurice Givens
Reply by ●April 6, 2004
Jon Harris wrote:
> Another idea, many radio stations have audio from past broadcasts for download
> or streaming.
> For example, try http://espnradio.com for a plethora of sports-related
> programming.
> (I think most of these are streaming media, and I don't know for sure what is
> involved in converting to wave files.)

What would worry me most is that these waveforms would have already been subject to who-knows-what processing; maybe they would be good for my application, but maybe not, for some obscure reason related to the way the bitstream is compressed. (that's also the reason why I was reluctant to use voice samples from DVDs -- any movie or TV series where there are dialogs without any background music or sound effects)

Thinking about it, I think what I'm trying to do is simple enough (after all, it's just a term-project for a course, and not a PhD thesis :-)); but you never know, so I'm better off with samples of "pure", untreated, uncompressed audio.

Thanks,

Carlos
--
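One rough way to screen a candidate recording for the concern raised above is to check how far up the spectrum it actually carries energy, since a file with a high sampling rate may still have been band-limited upstream. The sketch below is an assumption-laden illustration, not a tool anyone in this thread used: it reads a 16-bit mono PCM WAV file, applies a Hann window, probes the spectrum every 500 Hz with the Goertzel algorithm, and reports the highest probe within 60 dB of the strongest one. The function names, probe spacing, and threshold are hypothetical choices.

```python
import math
import struct
import wave


def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file; return (samples in [-1, 1), fs)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM")
        fs = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [s / 32768.0 for s in samples], fs


def goertzel_power(x, fs, freq):
    """Squared magnitude of the spectrum of x at a single frequency."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / fs)
    s1 = s2 = 0.0
    for v in x:
        s0 = v + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def highest_active_frequency(x, fs, step_hz=500.0, floor_db=-60.0):
    """Highest probe frequency whose power is within floor_db of the peak.

    Returns 0.0 for silent or too-short input.
    """
    n = len(x)
    if n < 2:
        return 0.0
    # Hann window to keep leakage from strong tones below the threshold.
    windowed = [v * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, v in enumerate(x)]
    freqs = []
    f = step_hz
    while f < fs / 2.0:
        freqs.append(f)
        f += step_hz
    powers = [goertzel_power(windowed, fs, f) for f in freqs]
    peak = max(powers)
    if peak <= 0.0:
        return 0.0
    threshold = peak * 10.0 ** (floor_db / 10.0)
    return max(f for f, p in zip(freqs, powers) if p >= threshold)


if __name__ == "__main__":
    import sys
    samples, fs = read_wav_mono16(sys.argv[1])
    # Analyse only the first second to keep the pure-Python loop quick.
    print(highest_active_frequency(samples[:fs], fs))
```

A result well below half the sampling rate suggests the material was low-passed somewhere along the way. This check says nothing about subtler artifacts from lossy coding or dynamic-range compression, so it can rule a recording out but cannot certify one as "untreated".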