DSPRelated.com
Forums

Resampling (by non integer factors) algorithm - can someone tell me why this works?

Started by David Reid June 30, 2004
Hi all,

i've been asking question about how to do proper resampling in c++ on wave
audio data.  After all the excellent and intelligent replies, my manager
(who has no dsp background) suggested the following algoritm, which works
perfectly, as i've tested it out viewing the output on a spectrum analyzer.

Here's the concept:

Assume the following:
-wave data loaded and stored in a buffer X bytes long.
-audio data is 16bits wide, stored as 44.1kHz
- for the purposes of the explanation the data will be a 1kHz sine wave

if i want to change the frequency by 2, simply drop every second sample.
This in effect halves the sample rate of the data (i mean to say that i'm
playing 22050 samples of my original data, twice per second).  To implement
this i keep a floating point variable which keeps track of the offset into
my data.  Each time i take a sample, i increment the variable by my
multiplcation factor.  If my factor is 1.3, for example, starting at offset
0, i will read a sample then increment my offset count.  For offset counts
that are not whole numbers, i would interpolate a value between the floor of
my current offset and the ceiling of my offset.  Drawing a straight line
between the two points, i would pick the point that lies at the distance
equal to the fractional part of my current offset count.  So if my factor is
1.3, my offset for my second sample would be at offset 1.3 (the first being
offset 0).  i would take the value at offset 1 (the floor of 1.3) in my data
buffer, the value at offset 2.0, and then take their differences and
multiply that by 0.3, then add the result to the value at offset 1.  I would
then incremnt my offset count by 1.3 (bringing me to offset 2.6).  This
continues for as long as there is data, and loops if i want to keep
repeating the sound.  in code it looks something like this:

for(int sample = 0; sample < buffsize; sample ++){
fInt = floor(fOffset);
fPartial = fOffset - fInt;
/* if offset equals 32.4, fInt = 32, and fPartial = 0.4 */

      if(fPartial == 0.0){
       output[sample] = Data[fInt]));
       fOffset += fFreqFactor;
      }
      else{
       output[sample] = (Data[fInt] + (fPartial*(Data[fInt+1] -
uData[fInt])));
       fOffset += fFreqFactor;
      }
}

i've removed the type-casting code for readability, as well as checking for
when my offset grows beyond my data size.  The odd thnig about this is that
when you increase the factor, you're in fact performing decimation, and when
you're decreasing the factor you're interpolating.  so when i multiply the
1kHz sine wave by 2, i see a 2 kHz sine wave on my scope.  Can someone tell
me why this works?

I've also tested it with other sound files, whose contents have frequencies
from 0 to 22.05 kHz.  if more explanation is required, i'd be happy to give
more details.  I just want to understand why no filtering is needed, why no
aliasing occur, etc....

David


On Wed, 30 Jun 2004 17:54:54 -0400, "David Reid"
<dreid_nospam@remove_no_spam_mechtronix.ca> wrote:

>>my manager >>(who has no dsp background) suggested the following algoritm, which works >>perfectly,
It works "perfectly" because your data are band-limited.
>>Assume the following: >>-wave data loaded and stored in a buffer X bytes long. >>-audio data is 16bits wide, stored as 44.1kHz >>- for the purposes of the explanation the data will be a 1kHz sine wave >> >>if i want to change the frequency by 2, simply drop every second sample.
Try this with a 20.050 kHz sine wave. It will also give you a 2 kHz sine wave output -- probably not what you expected.
>>For offset counts >>that are not whole numbers, i would interpolate a value between the floor of >>my current offset and the ceiling of my offset.
This is equivalent to convolving with a triangle window, which actually IS a form of lowpass filtering. Unfortunately, you're filtering AFTER decimation, instead of BEFORE.
>>I just want to understand why no filtering is needed, why no >>aliasing occur, etc....
Filtering IS needed, and aliasing IS occurring. You just haven't used the kinds of signals that will reveal them. Greg Berchin

David Reid wrote:
> > Hi all, > > i've been asking question about how to do proper resampling in c++ on wave > audio data. After all the excellent and intelligent replies, my manager > (who has no dsp background) suggested the following algoritm, which works > perfectly, as i've tested it out viewing the output on a spectrum analyzer. > > Here's the concept: > > Assume the following: > -wave data loaded and stored in a buffer X bytes long. > -audio data is 16bits wide, stored as 44.1kHz > - for the purposes of the explanation the data will be a 1kHz sine wave > > if i want to change the frequency by 2, simply drop every second sample. > This in effect halves the sample rate of the data (i mean to say that i'm > playing 22050 samples of my original data, twice per second). To implement > this i keep a floating point variable which keeps track of the offset into > my data. Each time i take a sample, i increment the variable by my > multiplcation factor. If my factor is 1.3, for example, starting at offset > 0, i will read a sample then increment my offset count. For offset counts > that are not whole numbers, i would interpolate a value between the floor of > my current offset and the ceiling of my offset. Drawing a straight line > between the two points, i would pick the point that lies at the distance > equal to the fractional part of my current offset count. So if my factor is > 1.3, my offset for my second sample would be at offset 1.3 (the first being > offset 0). i would take the value at offset 1 (the floor of 1.3) in my data > buffer, the value at offset 2.0, and then take their differences and > multiply that by 0.3, then add the result to the value at offset 1. I would > then incremnt my offset count by 1.3 (bringing me to offset 2.6). This > continues for as long as there is data, and loops if i want to keep > repeating the sound. in code it looks something like this: > > for(int sample = 0; sample < buffsize; sample ++){ > fInt = floor(fOffset); > fPartial = fOffset - fInt; > /* if offset equals 32.4, fInt = 32, and fPartial = 0.4 */ > > if(fPartial == 0.0){ > output[sample] = Data[fInt])); > fOffset += fFreqFactor; > } > else{ > output[sample] = (Data[fInt] + (fPartial*(Data[fInt+1] - > uData[fInt]))); > fOffset += fFreqFactor; > } > } > > i've removed the type-casting code for readability, as well as checking for > when my offset grows beyond my data size. The odd thnig about this is that > when you increase the factor, you're in fact performing decimation, and when > you're decreasing the factor you're interpolating. so when i multiply the > 1kHz sine wave by 2, i see a 2 kHz sine wave on my scope. Can someone tell > me why this works? > > I've also tested it with other sound files, whose contents have frequencies > from 0 to 22.05 kHz. if more explanation is required, i'd be happy to give > more details. I just want to understand why no filtering is needed, why no > aliasing occur, etc....
Its called linear interpolation. Hey, that's one step up from nearest neighbor resampling. Your happy, the boss is happy. Break out the champagne and have a good 4th :} -jim -----= Posted via Newsfeeds.Com, Uncensored Usenet News =----- http://www.newsfeeds.com - The #1 Newsgroup Service in the World! -----== Over 100,000 Newsgroups - 19 Different Servers! =-----
This is just basic linear or first-order SRC.  It works, but not all that well
compared with other methods.  There is significant aliasing that occurs.  And
you are in effect doing filtering--the linear interpolator is a low-pass filter
(with pretty lousy frequency response).

I know at first glance this seems like a totally different method than that used
in for example http://www.mega-nerd.com/SRC/.  However, you can do math and show
that linear interpolation reduces to a polyphase implementation of standard
interpolation, using a very simple low pass filter (the linear interpolator).
When this finally clicked for me, it was an "aha" moment!

As for the quality of this method, for many things, it may be acceptable.  But
it is certainly not ideal or "perfect" as you state.  If you want to hear the
limitations of this method, create a slowly swept tone from 10kHz to 20kHz
(assuming 44.1kHz sampling).  Next, do a sample rate conversion by some factor
near one, such as 1.1, using this method.  Then play back the results.  As the
tone frequency goes higher, you will here an alias tone (or tones) moving in the
opposite direction, lower in frequency.  This is aliasing and it should be very
obvious.

"David Reid" <dreid_nospam@remove_no_spam_mechtronix.ca> wrote in message
news:DAGEc.185343$207.1308399@news20.bellglobal.com...
> Hi all, > > i've been asking question about how to do proper resampling in c++ on wave > audio data. After all the excellent and intelligent replies, my manager > (who has no dsp background) suggested the following algoritm, which works > perfectly, as i've tested it out viewing the output on a spectrum analyzer. > > Here's the concept: > > Assume the following: > -wave data loaded and stored in a buffer X bytes long. > -audio data is 16bits wide, stored as 44.1kHz > - for the purposes of the explanation the data will be a 1kHz sine wave > > if i want to change the frequency by 2, simply drop every second sample. > This in effect halves the sample rate of the data (i mean to say that i'm > playing 22050 samples of my original data, twice per second). To implement > this i keep a floating point variable which keeps track of the offset into > my data. Each time i take a sample, i increment the variable by my > multiplcation factor. If my factor is 1.3, for example, starting at offset > 0, i will read a sample then increment my offset count. For offset counts > that are not whole numbers, i would interpolate a value between the floor of > my current offset and the ceiling of my offset. Drawing a straight line > between the two points, i would pick the point that lies at the distance > equal to the fractional part of my current offset count. So if my factor is > 1.3, my offset for my second sample would be at offset 1.3 (the first being > offset 0). i would take the value at offset 1 (the floor of 1.3) in my data > buffer, the value at offset 2.0, and then take their differences and > multiply that by 0.3, then add the result to the value at offset 1. I would > then incremnt my offset count by 1.3 (bringing me to offset 2.6). This > continues for as long as there is data, and loops if i want to keep > repeating the sound. in code it looks something like this: > > for(int sample = 0; sample < buffsize; sample ++){ > fInt = floor(fOffset); > fPartial = fOffset - fInt; > /* if offset equals 32.4, fInt = 32, and fPartial = 0.4 */ > > if(fPartial == 0.0){ > output[sample] = Data[fInt])); > fOffset += fFreqFactor; > } > else{ > output[sample] = (Data[fInt] + (fPartial*(Data[fInt+1] - > uData[fInt]))); > fOffset += fFreqFactor; > } > } > > i've removed the type-casting code for readability, as well as checking for > when my offset grows beyond my data size. The odd thnig about this is that > when you increase the factor, you're in fact performing decimation, and when > you're decreasing the factor you're interpolating. so when i multiply the > 1kHz sine wave by 2, i see a 2 kHz sine wave on my scope. Can someone tell > me why this works? > > I've also tested it with other sound files, whose contents have frequencies > from 0 to 22.05 kHz. if more explanation is required, i'd be happy to give > more details. I just want to understand why no filtering is needed, why no > aliasing occur, etc.... > > David >
On Wed, 30 Jun 2004 22:35:59 GMT, Greg Berchin
<Bank-to-Turn@mchsi.com> wrote:

>>This is equivalent to convolving with a triangle window, which >>actually IS a form of lowpass filtering. Unfortunately, you're >>filtering AFTER decimation, instead of BEFORE.
I looked again at your technique, and you are actually interpolating before decimating. In that case, your interpolater is serving as your lowpass filter. Unfortunately, the triangle window is not the best choice for a SRC lowpass filter. Greg Berchin
Thanks for all the replies, it has shed some light is actually happening.

I take back my statement of it being "perfect", i meant perfect for what we
want to do, so far anyway.   Greg, you're right, the aliasing probably just
hasn't shown up yet.  Thanks to you and the rest for explaining the logic.
It's pretty funny that someone with no DSP background (my manager) came up
with an algorithm like this.  When I started insisting that some form of
filtering needed to be done, he cut me off and said, "..not really sure why
you need that...first just get it to run this way, and later if we see
problems, we'll add all sorts of filters and bells and whistles..."

Hopefully, i'll be at another job _when_ that happens.

David
"David Reid" <dreid_nospam@remove_no_spam_mechtronix.ca> wrote in message
news:DAGEc.185343$207.1308399@news20.bellglobal.com...
> Hi all, > > i've been asking question about how to do proper resampling in c++ on wave > audio data. After all the excellent and intelligent replies, my manager > (who has no dsp background) suggested the following algoritm, which works > perfectly, as i've tested it out viewing the output on a spectrum
analyzer.
> > Here's the concept: > > Assume the following: > -wave data loaded and stored in a buffer X bytes long. > -audio data is 16bits wide, stored as 44.1kHz > - for the purposes of the explanation the data will be a 1kHz sine wave > > if i want to change the frequency by 2, simply drop every second sample. > This in effect halves the sample rate of the data (i mean to say that i'm > playing 22050 samples of my original data, twice per second). To
implement
> this i keep a floating point variable which keeps track of the offset into > my data. Each time i take a sample, i increment the variable by my > multiplcation factor. If my factor is 1.3, for example, starting at
offset
> 0, i will read a sample then increment my offset count. For offset counts > that are not whole numbers, i would interpolate a value between the floor
of
> my current offset and the ceiling of my offset. Drawing a straight line > between the two points, i would pick the point that lies at the distance > equal to the fractional part of my current offset count. So if my factor
is
> 1.3, my offset for my second sample would be at offset 1.3 (the first
being
> offset 0). i would take the value at offset 1 (the floor of 1.3) in my
data
> buffer, the value at offset 2.0, and then take their differences and > multiply that by 0.3, then add the result to the value at offset 1. I
would
> then incremnt my offset count by 1.3 (bringing me to offset 2.6). This > continues for as long as there is data, and loops if i want to keep > repeating the sound. in code it looks something like this: > > for(int sample = 0; sample < buffsize; sample ++){ > fInt = floor(fOffset); > fPartial = fOffset - fInt; > /* if offset equals 32.4, fInt = 32, and fPartial = 0.4 */ > > if(fPartial == 0.0){ > output[sample] = Data[fInt])); > fOffset += fFreqFactor; > } > else{ > output[sample] = (Data[fInt] + (fPartial*(Data[fInt+1] - > uData[fInt]))); > fOffset += fFreqFactor; > } > } > > i've removed the type-casting code for readability, as well as checking
for
> when my offset grows beyond my data size. The odd thnig about this is
that
> when you increase the factor, you're in fact performing decimation, and
when
> you're decreasing the factor you're interpolating. so when i multiply the > 1kHz sine wave by 2, i see a 2 kHz sine wave on my scope. Can someone
tell
> me why this works? > > I've also tested it with other sound files, whose contents have
frequencies
> from 0 to 22.05 kHz. if more explanation is required, i'd be happy to
give
> more details. I just want to understand why no filtering is needed, why
no
> aliasing occur, etc.... > > David > >
"Greg Berchin" <Bank-to-Turn@mchsi.com> wrote in message
news:t1h6e09p2i8qf8qitcjugu0ud141elmvm7@4ax.com...
> On Wed, 30 Jun 2004 22:35:59 GMT, Greg Berchin > <Bank-to-Turn@mchsi.com> wrote: > > >>This is equivalent to convolving with a triangle window, which > >>actually IS a form of lowpass filtering. Unfortunately, you're > >>filtering AFTER decimation, instead of BEFORE. > > I looked again at your technique, and you are actually > interpolating before decimating. In that case, your interpolater > is serving as your lowpass filter. Unfortunately, the triangle > window is not the best choice for a SRC lowpass filter.
Right. It only has a paltry 10dB stop band rejection at the Nyquist ratio, and the first sidelobe is at -26dB. Not to mention the poor pass-band response. With a good FIR interpolator, you can achieve >100dB stop-band attenuation and flat response to 20kHz.
"David Reid" <darkdefenderX@Xvideotron.ca> wrote in message
news:MPHEc.163361$rt5.1844252@wagner.videotron.net...
> Thanks for all the replies, it has shed some light is actually happening. > > I take back my statement of it being "perfect", i meant perfect for what we > want to do, so far anyway. Greg, you're right, the aliasing probably just > hasn't shown up yet. Thanks to you and the rest for explaining the logic. > It's pretty funny that someone with no DSP background (my manager) came up > with an algorithm like this. When I started insisting that some form of > filtering needed to be done, he cut me off and said, "..not really sure why > you need that...first just get it to run this way, and later if we see > problems, we'll add all sorts of filters and bells and whistles..."
Well, linear interpolation is pretty intuitive. That's how I did my first SRC too, before I "knew better". :-) As I mentioned in a previous post, linear may be perfectly adequate for certain tasks, possibly yours. And as for filtering, the linear interpolator really is a filter that looks at 2 values, even if intuitively people don't tend to think of it that way.
> Hopefully, i'll be at another job _when_ that happens.
:-)