Hi there,
First, be gentle! I know very little about this stuff :-)

I have been asked to develop a module to emulate whatever SoundForge does to downsample a speech-only wav file from 44.1 kHz to 11.025 kHz and 8 kHz when it has the "apply anti-alias filter" option set. This has been done for years and has become the "gold standard" here, so it's my "benchmark" for what I have to achieve. The above is set in concrete; there is no need for anything else to be allowed for in the prog :-)

Now, from what I have read, I have to:
1) Filter the wav file to remove all frequencies above Nyquist
2) Decimate the samples

I have got quite a bit of code already and have hacked it into a working module that actually does both filter and decimate (that's a wow as far as I am concerned :-) ), but my issue at this stage is really to see whether I am doing the right things, as there seem to be so many different methods I could use, and how to get the best quality of output given the design constraints.

I found a resampling prog that I "learnt" on, and I think it is a FIR filter with a sinc function, with 17k coefficients (for very high quality) that you skip through 128 at a time when doing the calcs. It can decimate in one go regardless of non-integer decimation, since it can offset and interpolate the coefficients to produce a good result. But I noticed that for integer decimation (44 > 11) it does nothing, i.e. I can get the same result by just dropping samples. It only seems to matter for non-integer decimation, where it is very accurate at interpolating the right value.

I also found a bunch of code for biquad filters, though when I used it to prefilter the wav file I must say the impact was not particularly great, i.e. the before and after weren't that different, though I was a bit confused by some of the params.

I also found a filter design tool from Systolix which allows me to design specific filters, so I could design, say, a FIR Kaiser-window lowpass filter with F1 at 3.5 kHz and an 80 dB stop band for the 8 kHz output, and then, I assume, prefilter the wav file before moving on to resampling.

So here are the questions on how to achieve the best quality :-))
1) Am I right that I need to do this in 2 phases: filter, then decimate?
2) What would the best type of filter be for phase 1, to remove the frequencies above Nyquist?
3) Am I right that integer resampling requires nothing other than dropping samples, or is there something "smarter" that I have missed?
4) Is the FIR sinc filter the best way to go for the interpolation/resampling phase with non-integer decimation?
5) Given the above two scenarios, what would be a reasonable number of coefficients in the resampling sinc filter to achieve good/very good output, especially considering I am ending up at 8 or 11 kHz?
6) If resampling from 44 > 11 just drops bits, is there any point in recording at a higher bit rate in the first place? I.e., why don't we just record at 11k to start with and then do one resample to 8k? Is there something gained by sampling at a higher rate and then downsampling?

My apologies in advance for use of any incorrect terms or other such gaffes. My knowledge of audio is about 4 days' worth, and my field is more general than specific to audio.

Thanks a lot!!!
Rob
downsampling
Started by ●August 11, 2003
Reply by ●August 11, 2003
Rob Edgar wrote:
> So here are the questions on how to achieve the best quality :-))
> 1) Am I right that I need to do this in 2 phases: filter, then decimate?

The general process is: interpolate (upsample) -> filter -> decimate (downsample).

> 2) What would the best type of filter be for phase 1, to remove the
> frequencies above Nyquist?

I like the polyphase approach using a windowed, truncated cardinal sine (a.k.a. sinc) function. The web site dspGuru has good information on this.

> 3) Am I right that integer resampling requires nothing other than
> dropping samples, or is there something "smarter" that I have missed?

I don't think so. You need to low-pass filter your signal.

> 4) Is the FIR sinc filter the best way to go for the
> interpolation/resampling phase with non-integer decimation?

I like it.

> 5) Given the above two scenarios, what would be a reasonable number of
> coefficients in the resampling sinc filter to achieve good/very good
> output, especially considering I am ending up at 8 or 11 kHz?

Depends on the requirements of your specific application. How good is good enough?

> 6) If resampling from 44 > 11 just drops bits, is there any point in
> recording at a higher bit rate in the first place? I.e., why don't we
> just record at 11k to start with and then do one resample to 8k? Is
> there something gained by sampling at a higher rate and then
> downsampling?

The 44 kHz sampling rate is required for high-quality audio. If your application doesn't require it, then go ahead and sample at the lower rate. Audio sampled at 8 kHz will have no high-frequency content. It's good enough to understand voice, but that's about all.

> My apologies in advance for use of any incorrect terms or other such
> gaffes. My knowledge of audio is about 4 days' worth, and my field is
> more general than specific to audio.

I'm no expert. This is Usenet, where you are free to ask any type of question you please.

Good luck,
OUP
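To make the windowed-sinc suggestion concrete, here is a minimal Python sketch of one common way to design such a lowpass filter. The function name, the Hamming window choice, and the parameter values are illustrative (not from any post in this thread); the 4 kHz cutoff matches Rob's 8 kHz output target.

```python
import math

def windowed_sinc_lowpass(cutoff_hz, fs_hz, num_taps):
    """Design a lowpass FIR by windowing the ideal (infinite) sinc response.

    cutoff_hz: desired cutoff, e.g. the new Nyquist (4000 Hz for 8 kHz output)
    fs_hz:     sampling rate of the input signal, e.g. 44100
    num_taps:  filter length; more taps -> sharper transition, longer delay
    """
    fc = cutoff_hz / fs_hz          # normalized cutoff (cycles per sample)
    m = num_taps - 1                # filter order
    taps = []
    for n in range(num_taps):
        k = n - m / 2.0             # center the sinc on the middle tap
        if k == 0:
            h = 2.0 * fc            # limit of sin(2*pi*fc*k)/(pi*k) at k = 0
        else:
            h = math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        # Hamming window tames the ripple caused by truncating the sinc
        w = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / m)
        taps.append(h * w)
    # normalize for unity gain at DC so speech levels are preserved
    s = sum(taps)
    return [t / s for t in taps]
```

A filter designed this way is symmetric (linear phase), which matters for speech since it avoids smearing transients unevenly.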
Reply by ●August 11, 2003
Rob Edgar wrote:
> Hi there,
> First, be gentle! I know very little about this stuff :-)
> I have been asked to develop a module to emulate whatever SoundForge
> does to downsample a speech-only wav file from 44.1 kHz to 11.025 kHz
> and 8 kHz when it has the "apply anti-alias filter" set. This has been
> done for years and has become the "gold standard" here, so it's my
> "benchmark" for what I have to achieve. The above is set in concrete;
> there is no need for anything else to be allowed for in the prog :-)

You have fallen in with a gentle, helpful crowd. Doing someone's homework problem doesn't help him, so it gets an appropriate response.

> Now, from what I have read, I have to:
> 1) Filter the wav file to remove all frequencies above Nyquist
> 2) Decimate the samples

By "above Nyquist", I assume above what will be the new Nyquist after decimating. Then yes.

> I have got quite a bit of code already and have hacked it into a
> working module that actually does both filter and decimate (that's a
> wow as far as I am concerned :-) ), but my issue at this stage is
> really to see whether I am doing the right things, as there seem to be
> so many different methods I could use, and how to get the best quality
> of output given the design constraints.

An efficiency comes from recognizing that since you will keep only one of every four samples produced by a simple-minded anti-alias (low-pass) filter, there is no need to calculate the others.

> I found a resampling prog that I "learnt" on, and I think it is a FIR
> filter with a sinc function, with 17k coefficients (for very high
> quality) that you skip through 128 at a time when doing the calcs. It
> can decimate in one go regardless of non-integer decimation, since it
> can offset and interpolate the coefficients to produce a good result.
> But I noticed that for integer decimation (44 > 11) it does nothing,
> i.e. I can get the same result by just dropping samples. It only seems
> to matter for non-integer decimation, where it is very accurate at
> interpolating the right value.

Something's very wrong, then. (Your understanding?) If samples are dropped before filtering, there will be aliasing if the original signal had components above ~5 kHz. Those need to be removed before decimating.

> I also found a bunch of code for biquad filters, though when I used it
> to prefilter the wav file I must say the impact was not particularly
> great, i.e. the before and after weren't that different, though I was
> a bit confused by some of the params.

Recursive filters -- biquads and other IIRs -- need old output samples to compute new ones. While generally more computationally efficient than transversal filters -- the usual FIRs -- they foreclose the refinement of computing only those output samples actually needed.

> I also found a filter design tool from Systolix which allows me to
> design specific filters, so I could design, say, a FIR Kaiser-window
> lowpass filter with F1 at 3.5 kHz and an 80 dB stop band for the 8 kHz
> output, and then, I assume, prefilter the wav file before moving on to
> resampling.

In general, the filtering should be part of the decimation process, and I believe that a class of filters called "Nyquist filters" are especially good. ScopeFIR from http://www.iowegian.com (free trial available) can design these, and http://documents.wolfram.com/applications/digitalimage/UsersGuide/9.5.html describes their use.

> So here are the questions on how to achieve the best quality :-))
> 1) Am I right that I need to do this in 2 phases: filter, then decimate?

Best done as one combined operation, as I tried bumblingly to explain.

> 2) What would the best type of filter be for phase 1, to remove the
> frequencies above Nyquist?

See above.

> 3) Am I right that integer resampling requires nothing other than
> dropping samples, or is there something "smarter" that I have missed?

Dropping samples is decimating. You need to filter before doing that.

> 4) Is the FIR sinc filter the best way to go for the
> interpolation/resampling phase with non-integer decimation?

Windowed sinc is a way to design an effective filter with relatively little computing. Parks-McClellan gives better (by most criteria) results, but it requires more number crunching. On a modern computer, the computation happens in a flash anyway.

> 5) Given the above two scenarios, what would be a reasonable number of
> coefficients in the resampling sinc filter to achieve good/very good
> output, especially considering I am ending up at 8 or 11 kHz?

I guess that a very good filter will need 20-30 taps, but fewer can be OK. Since only every fourth output needs to be calculated, that's not a lot of calculation.

> 6) If resampling from 44 > 11 just drops bits, is there any point in
> recording at a higher bit rate in the first place? I.e., why don't we
> just record at 11k to start with and then do one resample to 8k? Is
> there something gained by sampling at a higher rate and then
> downsampling?

It doesn't just drop bits if the input to the sampler exceeded 8 kHz. If it were known that it didn't, then 44.1 would be needless 4x oversampling.

> My apologies in advance for use of any incorrect terms or other such
> gaffes. My knowledge of audio is about 4 days' worth, and my field is
> more general than specific to audio.
>
> Thanks a lot!!!
> Rob

Check back frequently. The real experts will probably want to correct or expand what I wrote.

Jerry
--
Engineering is the art of making what you want from things you can get.
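Jerry's efficiency point -- combine the filter and the decimation in one pass, computing only the outputs you actually keep -- can be sketched like this. The function is illustrative Python, not from any post in the thread; it skips edge handling for brevity.

```python
def fir_decimate(x, taps, factor):
    """Filter and decimate in one pass, computing only the kept outputs.

    Instead of filtering every input sample and then discarding 3 of
    every 4, the filter window slides along in steps of `factor`, so
    only 1/factor of the convolution work is done.
    """
    out = []
    n = len(taps)
    # step the start of each filter window by `factor` samples
    for start in range(0, len(x) - n + 1, factor):
        acc = 0.0
        for k in range(n):
            acc += taps[k] * x[start + k]
        out.append(acc)
    return out
```

With a trivial one-tap filter this degenerates to plain sample-dropping, which is exactly why dropping alone is only safe on a signal that is already band-limited.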
Reply by ●August 11, 2003
Jerry Avins <jya@ieee.org> wrote in message news:<3F37A56B.5419CA6F@ieee.org>...
> By "above Nyquist", I assume above what will be the new Nyquist after
> decimating. Then yes.

Yes, that was what I meant :-)

> Something's very wrong, then. (Your understanding?) If samples are
> dropped before filtering, there will be aliasing if the original
> signal had components above ~5 kHz. Those need to be removed before
> decimating.

Not what I meant; sorry, maybe I didn't explain it too well. I meant that after I had filtered the wav to remove all frequencies above the final Nyquist level (let's say everything over 4 kHz), and had then moved on separately to the >>decimation<< phase, the filter I was using >>solely<< for the decimation stage produces an identical output to what I could achieve without it by simply dropping samples in this >>second<< phase. Of course, this is only true for simple decimation (1/4, 1/2, etc.); when it's 8/44, the interpolation algorithm that's wrapped around the filter kicks in and provides a good prediction of the output.

I guess what I am getting at is: is there >>any<< difference in the final result between
1) record at 44 / filter to remove freqs > 4 / downsample to 8
and
2) record at 8 / filter to remove freqs > 4?
Because the obvious difference is that in 1) I have to lug a whopping big wav file all over the place long before it gets to my lovely resampling prog, and if this serves no purpose, then I can at least optimise other things external to this discussion by, say, compromising on an 11k file. Yes, I would still filter it, but a smaller file beforehand would be good IF I lost nothing.

> In general, the filtering should be part of the decimation process,
> and I believe that a class of filters called "Nyquist filters" are
> especially good. ScopeFIR from http://www.iowegian.com (free trial
> available) can design these, and
> http://documents.wolfram.com/applications/digitalimage/UsersGuide/9.5.html
> describes their use.

OK, will check.

> > 3) Am I right that integer resampling requires nothing other than
> > dropping samples, or is there something "smarter" that I have missed?
>
> Dropping samples is decimating. You need to filter before doing that.

Yes, understood, but as above, I wasn't clear that I was talking about filtering that had already been done before decimation.

> > 4) Is the FIR sinc filter the best way to go for the
> > interpolation/resampling phase with non-integer decimation?
>
> Windowed sinc is a way to design an effective filter with relatively
> little computing. Parks-McClellan gives better (by most criteria)
> results, but it requires more number crunching. On a modern computer,
> the computation happens in a flash anyway.

OK, will look at that.

> > 5) Given the above two scenarios, what would be a reasonable number of
> > coefficients in the resampling sinc filter to achieve good/very good
> > output, especially considering I am ending up at 8 or 11 kHz?
>
> I guess that a very good filter will need 20-30 taps, but fewer can be
> OK. Since only every fourth output needs to be calculated, that's not
> a lot of calculation.

Mm, OK -- this is a lot fewer taps than in the examples I have, or for that matter than the Systolix program generates. E.g., as a test I tried a bandpass filter on 44 kHz material to cut above 3.5 kHz and below 300 Hz, and it generated 2k taps.

> It doesn't just drop bits if the input to the sampler exceeded 8 kHz.
> If it were known that it didn't, then 44.1 would be needless 4x
> oversampling.

Well, apart from removing the above-Nyquist frequencies, I still don't see that anything is done when it is simple 1/2 or 1/4 decimation, which goes to my question above about why not just record at the lower sample rate.

Thanks for the quick response.
Rob
Reply by ●August 11, 2003
One Usenet Poster <me@my.computer.org> wrote in message news:<vjf69jqmfr9127@corp.supernews.com>...
> > 3) Am I right that integer resampling requires nothing other than
> > dropping samples, or is there something "smarter" that I have missed?
>
> I don't think so. You need to low-pass filter your signal.

Yeah, what I meant was... assuming that I had already done a lowpass filter. I think what I am driving at is this: if this were a JPEG image, changing the quality of a JPEG will resample the image, and it can use the bits surrounding a point to derive a sort of average value so that when viewed later it doesn't look as clunky as it would have (if you get what I mean; I know this is not exactly what it does :-) ). Due to my lack of knowledge of audio, I was wondering whether a similar sort of thing happened here, i.e. when you lose those samples, is there some "smart" thing that happens so the info is not just chucked away? It's probably a bad analogy, brought on by my lack of knowledge, that only means something to me :-) lol. From the responses, I think the answer to the real question I am asking is that there is >>no<< difference between record at 44 >> downsample to 8 kHz, and record at 8 kHz and filter out frequencies above 4 kHz.

> > 6) If resampling from 44 > 11 just drops bits, is there any point in
> > recording at a higher bit rate in the first place? I.e., why don't we
> > just record at 11k to start with and then do one resample to 8k? Is
> > there something gained by sampling at a higher rate and then
> > downsampling?
>
> The 44 kHz sampling rate is required for high-quality audio. If your
> application doesn't require it, then go ahead and sample at the lower
> rate. Audio sampled at 8 kHz will have no high-frequency content. It's
> good enough to understand voice, but that's about all.

Yeah, but for this application voice is the only thing that will ever be recorded, which is why we are using such a low sample rate (actually, I will say the quality is pretty good).

Thanks for the advice.
Rob
Reply by ●August 11, 2003
Rob Edgar wrote:
> Hi there,
> First, be gentle! I know very little about this stuff :-)
> I have been asked to develop a module to emulate whatever SoundForge
> does to downsample a speech-only wav file from 44.1 kHz to 11.025 kHz
> and 8 kHz when it has the "apply anti-alias filter" set.

Well, this is a wheel that you may not need to re-invent. Have a look at Secret Rabbit Code:

http://mega-nerd.com/SRC/

It works on *nix, Mac and Win32 and is released under the GNU GPL. A license for use with closed-source software can also be obtained at a reasonable rate.

Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
Linux: Because rebooting is for adding new hardware
Reply by ●August 11, 2003
Rob Edgar wrote:
> One Usenet Poster <me@my.computer.org> wrote in message
> news:<vjf69jqmfr9127@corp.supernews.com>...
...
> > The 44 kHz sampling rate is required for high-quality audio. If your
> > application doesn't require it, then go ahead and sample at the
> > lower rate. Audio sampled at 8 kHz will have no high-frequency
> > content. It's good enough to understand voice, but that's about all.
>
> Yeah, but for this application voice is the only thing that will ever
> be recorded, which is why we are using such a low sample rate
> (actually, I will say the quality is pretty good).

It doesn't matter that you're only recording voice. There are voice components above 5 kHz that need to be removed before sampling at 11.025 kHz or before decimating a 44.1 kHz signal by four. Not caring about them isn't good enough. You must not _have_ them.

Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by ●August 11, 2003
Rob Edgar wrote:
> Jerry Avins <jya@ieee.org> wrote in message news:<3F37A56B.5419CA6F@ieee.org>...
> > By "above Nyquist", I assume above what will be the new Nyquist
> > after decimating. Then yes.
>
> Yes, that was what I meant :-)
>
> > Something's very wrong, then. (Your understanding?) If samples are
> > dropped before filtering, there will be aliasing if the original
> > signal had components above ~5 kHz. Those need to be removed before
> > decimating.
>
> Not what I meant; sorry, maybe I didn't explain it too well. I meant
> that after I had filtered the wav to remove all frequencies above the
> final Nyquist level (let's say everything over 4 kHz), and had then
> moved on separately to the >>decimation<< phase, the filter I was
> using >>solely<< for the decimation stage produces an identical output
> to what I could achieve without it by simply dropping samples in this
> >>second<< phase. Of course, this is only true for simple decimation
> (1/4, 1/2, etc.); when it's 8/44, the interpolation algorithm that's
> wrapped around the filter kicks in and provides a good prediction of
> the output.
>
> I guess what I am getting at is: is there >>any<< difference in the
> final result between
> 1) record at 44 / filter to remove freqs > 4 / downsample to 8
> and
> 2) record at 8 / filter to remove freqs > 4?

If the spectrum of your signal is suitable for 11 kHz sampling and you collect 4 samples where one would do, it's OK to keep only every 4th one, that being the precise equivalent of sampling at the lower rate to start with. If the spectrum of your signal is such that the higher rate is needed, then you need to filter before discarding samples, but you don't need to compute the samples you're about to discard. So when reasonably done, there will be either a decimation alone, or a combined stage that is fed with all the samples and calculates the filtered and decimated end result.

...

Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by ●August 12, 2003
Jerry,
Thanks...

> > I guess what I am getting at is: is there >>any<< difference in the
> > final result between
> > 1) record at 44 / filter to remove freqs > 4 / downsample to 8
> > and
> > 2) record at 8 / filter to remove freqs > 4?
>
> If the spectrum of your signal is suitable for 11 kHz sampling and you
> collect 4 samples where one would do, it's OK to keep only every 4th
> one, that being the precise equivalent of sampling at the lower rate
> to start with. If the spectrum of your signal is such that the higher
> rate is needed, then you need to filter before discarding samples, but
> you don't need to compute the samples you're about to discard. So when
> reasonably done, there will be either a decimation alone, or a
> combined stage that is fed with all the samples and calculates the
> filtered and decimated end result.

I think I am still miscommunicating, or misunderstanding your reply, or maybe what I am asking is so obvious it sounds too dumb... or maybe I am off on another question :-)

Anyway, I am interested in knowing whether there is any point in recording someone speaking at 44.1 kHz when I intend only to downsample to 8 kHz, taking as a given that I will at some point have filtered out the >4 kHz frequencies; i.e., does the first procedure produce a better end product?

OK, to throw in my thoughts so maybe you can correct me and understand better what I am getting at: I think the answer is YES, because if your speech, say, contained frequencies up to 6 kHz when originally recorded, then it would be impossible to accurately filter them out of an 8 kHz file but easy to filter them out of a 44 kHz file. Due to the Nyquist theory, the echo (if that's the right word) of 0-6 kHz would run 8 > 2 kHz in an 8 kHz file, so it would already be intermingled with the original signal and impossible to accurately remove, whereas in the 44k file it runs 38-44 kHz and so can be removed precisely.

So if I am right, the question then becomes: what is a good limit, i.e. 11 kHz / 22 kHz / 44 kHz, for recording speech -- regular, average, business-conference-call-type voices (if there is such a thing :-) )? My guess would be that 11 kHz is still too tight, 22 kHz is probably OK, and 44 kHz is unnecessary.

Thanks,
Rob

PS: I have looked at ScopeFIR -- very nice, detailed. Still working my way through it to try to understand what I need, but thanks for the pointer.
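Rob's folding argument can be checked numerically: once sampled at 8 kHz, a 6 kHz cosine produces exactly the same samples as a 2 kHz cosine, so no filter applied afterwards can tell them apart. A small Python demonstration (the frequencies are just the ones from Rob's example):

```python
import math

fs = 8000              # sampling rate of the lower-rate file (Hz)
f_real = 6000.0        # a component above the 4 kHz Nyquist limit
f_alias = fs - f_real  # 2000 Hz: where it folds to, as Rob reasons

tone_6k = [math.cos(2 * math.pi * f_real * n / fs) for n in range(32)]
tone_2k = [math.cos(2 * math.pi * f_alias * n / fs) for n in range(32)]

# The two sample sequences agree to floating-point precision: the 6 kHz
# energy is now indistinguishable from genuine 2 kHz energy, which is
# why the >4 kHz content must be removed BEFORE the rate is reduced.
max_diff = max(abs(a - b) for a, b in zip(tone_6k, tone_2k))
```

This is exactly the "intermingling" Rob describes: in the 44.1 kHz file the unwanted band sits by itself and can be filtered off cleanly; at 8 kHz it has already landed on top of the wanted speech band.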
Reply by ●August 12, 2003
"Rob Edgar" <robedgar@hkstar.com> wrote in message news:3560d674.0308112323.1887a0f9@posting.google.com...

other stuff snipped.

> Anyway, I am interested in knowing whether there is any point in
> recording someone speaking at 44.1 kHz when I intend only to
> downsample to 8 kHz, taking as a given that I will at some point have
> filtered out the >4 kHz frequencies; i.e., does the first procedure
> produce a better end product?

Hello Rob,

Sometimes sound cards don't appropriately scale their input anti-alias filter with changes in the sampling rate. I once had that problem with a studio making voice prompts. (I didn't get to choose the studio -- a person there was related to a person I worked for at the time.) I had them record the prompts at a 44.1 kHz sampling rate and 16 bits per sample. I then did the scaling and sample-rate conversion to make them work for a telephone voice-mail system. It was also a good opportunity to write sample-rate conversion code. In going from 44100 to 8000, I ended up using a cascade of 4 interpolator-decimators. Once the C structure for one was worked out, it was a simple matter to have four of them.

Clay
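Such a cascade works because 8000/44100 reduces to 80/441, which can be split into small rational stages, each interpolating by its numerator and decimating by its denominator. Clay's exact stage ratios aren't given in the post; the factors below are just one workable decomposition, sketched in Python to show the arithmetic:

```python
from fractions import Fraction

# One possible split of 8000/44100 = 80/441 into four small stages
# (80 = 2*2*4*5 and 441 = 3*3*7*7); these are illustrative, not Clay's.
stages = [Fraction(2, 3), Fraction(2, 3), Fraction(4, 7), Fraction(5, 7)]

rate = Fraction(44100)
intermediate = []
for s in stages:
    rate *= s           # each stage: interpolate by numerator, decimate by denominator
    intermediate.append(rate)

# 44100 -> 29400 -> 19600 -> 11200 -> 8000: every intermediate rate is a
# whole number, so each stage is an ordinary interpolate-by-L /
# decimate-by-M block with its own modest anti-alias filter.
print(intermediate)
```

Keeping the per-stage ratios small keeps each stage's filter short, which is the usual reason for cascading rather than doing the full 80/441 conversion in one step.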