comp.dsp | Higher upsampling with minimum phase downsampling produces more aliasing

Hi,

I'm programming a sound generator based on wavetables. I have an 8192-point
wavetable. I create several band-limited "subwavetables" by taking the DFT,
zeroing the high octave(s), and taking the IDFT. To generate a particular
pitch I choose a wavetable that has all harmonics up to 20 kHz. Sounds good
so far, way better than just upsampling the original non-band-limited wavetable.

The harmonics that exceed 22k in 44kHz sampling rate are aliasing back, so
I'm oversampling it. But since I want zero latency, I'm using a minimum
phase 120dB/oct low-pass (butterworth filters, so yeah I know, huge phase
shift, but I'm ok with that). 

PROBLEM:

I'm trying a sawtooth wave, which due to its harmonic complexity aliases a
lot.

- If I oversample 2x (to 88k), the aliasing is reduced.

- If I oversample 4x, the aliasing is reduced, but less than with the 2x
oversampling!

If I use linear-phase downsampling however, then it behaves correctly and
the higher the oversampling rate, the more the aliasing is reduced. 

Why is this happening?	 

_____________________________		
Posted through www.DSPRelated.com

Reply by ●July 4, 2014

Are you using real hardware or Matlab?

Check that your Butterworth filters are not clipping at an internal node (a common problem for high-order IIR filters; you need to order the stages correctly).

Bob

Reply by jungledmnc ●July 4, 2014

>Are you using real hardware or Matlab?
>
>Check that your Butterworth filters are not clipping at an internal node
>(common problem for high-order IIR filters, you need to order the stages
>correctly)
>
>Bob
>

It's a C++ app. Anyway, I tried decreasing the generator volume a lot, with
no change. I was thinking maybe it's just that the FIR I'm using for LP
downsampling is much steeper? But still, the minimum-phase filter is
120dB/oct! Anyway, the problem is probably simply that the Butterworth
filter is effectively less steep at higher sampling rates (or rather, its
response is compressed toward Nyquist at the lower sampling rate).
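One way to check that last hypothesis numerically is sketched below. It assumes the low-pass is a digital Butterworth designed with the bilinear transform (the post does not say how the filter was designed), takes order 20 as a stand-in for 120 dB/oct, and uses a hypothetical 20 kHz cutoff. The probe frequency of 24.1 kHz is chosen because it folds to 20 kHz after decimation to 44.1 kHz.

```python
import math

def butterworth_attenuation_db(f_hz, cutoff_hz, fs_hz, order):
    """Attenuation in dB of a bilinear-transform Butterworth low-pass.

    Assumption: bilinear design, which maps frequency f to tan(pi*f/fs),
    so the response is squeezed toward Nyquist and steepens near fs/2.
    """
    warped = math.tan(math.pi * f_hz / fs_hz) / math.tan(math.pi * cutoff_hz / fs_hz)
    return 10.0 * math.log10(1.0 + warped ** (2 * order))

ORDER = 20          # roughly 120 dB/oct at 6 dB/oct per pole
CUTOFF = 20000.0    # hypothetical cutoff in Hz
PROBE = 24100.0     # folds to 20 kHz when decimated to 44.1 kHz

att_2x = butterworth_attenuation_db(PROBE, CUTOFF, 2 * 44100.0, ORDER)
att_4x = butterworth_attenuation_db(PROBE, CUTOFF, 4 * 44100.0, ORDER)
print(f"2x oversampling: {att_2x:.1f} dB, 4x oversampling: {att_4x:.1f} dB")
```

Under these assumptions the same filter order attenuates the probe frequency noticeably less at 4x than at 2x, which is consistent with the guess above. It does not rule out other causes.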

	 


Reply by robert bristow-johnson ●July 4, 2014

On 7/4/14 4:48 AM, jungledmnc wrote:
>
> I'm programming a sound generator, based on wavetables. I have 8192 point
> wavetable. I create several band-limited "subwavetables" by taking DFT,
> zeroing high octave(s)

you mean zeroing bins?  how many?

> and IDFT.

so now you have a 8192 sample wavetable that has only energy in the 
lower harmonics, right?

> For generating particular pitch I choose a
> wavetable, which has all harmonics until 20k.

where those harmonics end up depends on the "stride" that you're 
stepping through the wavetable (and the sample rate).

> Sound good so far, way better
> than just upsampling the original non-band-limited wavetable.

whatever that means.

> The harmonics that exceed 22k in 44kHz sampling rate are aliasing back, so
> I'm oversampling it. But since I want zero latency, I'm using a minimum
> phase 120dB/oct low-pass (butterworth filters, so yeah I know, huge phase
> shift, but I'm ok with that).
>
> PROBLEM:
>
> I'm trying a sawtooth wave, which due to its harmonic complexity aliases a
> lot.
>
> - If I oversample 2x (to 88k), the aliasing is reduced.
>
> - If I oversample 4x, the aliasing is reduced, but less than with the 2x
> oversampling!
>
> If I use linear-phase downsampling however, then it behaves correctly and
> the higher the oversampling rate, the more the aliasing is reduced.
>
> Why is this happening?	

i think you need to be more clear about what you're doing.  when you 
have a wavetable that is much larger than the number of samples per 
cycle of the waveform (and 8192 sounds like it's a helluva lot bigger), 
then you're oversampled.  there is no downsampling other than the 
striding through the wavetable as you play back the waveform. what 
remains are the number of harmonics.  that number stays the same no 
matter what the stride is.  the highest harmonics might fold back.

i think you need to be careful with the concepts and quantities.  if 
you're careful enough that you can explain it so that we can understand 
what the heck you're talking about, you'll probably solve your problem.

FYI, some aliasing won't kill you, as long as the aliased harmonics do
not fold back too far.  if your sample rate is 44.1 kHz and you have 2
wavetables per octave of keyboard, you can keep all of the aliased 
harmonics (or missing harmonics) all above 19 kHz.  but you have to be 
able to line up the wavetables (not hard) and crossfade between them.

also, 8192 sample wavetable is 4 or 8 times bigger than you need, me 
thinks.  even with a 1024 or 2048 point wavetable, you should be able to 
linearly-interpolate between points and it will come out fine.
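For reference, striding through a table with linear interpolation can be sketched as below. This is a minimal illustration, not anyone's production oscillator: the sine table stands in for any single-cycle wavetable, and the names `make_sine_table` and `render` are made up for the example.

```python
import math

def make_sine_table(size):
    """One cycle of a sine; stands in for any single-cycle wavetable."""
    return [math.sin(2.0 * math.pi * n / size) for n in range(size)]

def render(table, freq_hz, fs_hz, num_samples):
    """Step through the table with a fractional stride, linearly interpolating."""
    size = len(table)
    stride = freq_hz * size / fs_hz   # table points advanced per output sample
    phase = 0.0
    out = []
    for _ in range(num_samples):
        i = int(phase)
        frac = phase - i
        a = table[i]
        b = table[(i + 1) % size]     # wrap around at the end of the cycle
        out.append(a + frac * (b - a))
        phase += stride
        if phase >= size:
            phase -= size
    return out

table = make_sine_table(2048)
samples = render(table, 440.0, 44100.0, 1000)
```

With a 2048-point table the interpolated sine stays within a few parts per million of the exact value, which fits the remark that 1024 or 2048 points should be enough.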

-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

Reply by jungledmnc ●July 4, 2014

Thank you r-b-j

You generally got all the points I think. After some thinking I came to
the conclusion that this is inevitable: it just cannot be completely
alias-free without using extremely steep filters.

Basically I have just one wavetable per octave. I create these
band-limited wavetables by taking the DFT, zeroing the bins that correspond
to the unwanted frequencies, and taking the IDFT. Since I have an 8192-point
FFT, let's say I'm playing a tone at 1000 Hz and I want to remove everything above:

2 * (20000 / 1000) = 40

20000 Hz is the maximum frequency I'm allowing, 1000 Hz is the minimum, and
2 is simply because of the fact that I'm using real DFT, so every bin has 2
values (re and im).
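The zero-the-bins step can be sketched as follows. This is an illustration under assumptions, not the actual C++ code: it uses a naive O(n^2) DFT and a 256-point table so it runs quickly without an FFT library, and the function names are hypothetical. For a single-cycle table, harmonic k lives in complex bin k (and its mirror, bin n-k), so 20 harmonics means complex bins 1 to 20, or index 40 in an interleaved re/im array.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) DFT; fine for a small demo table."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

def idft_real(spectrum):
    """Inverse DFT, returning the real part (input is conjugate-symmetric)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n)
                for k in range(n)).real / n
            for m in range(n)]

def band_limit(table, max_harmonic):
    """Zero every bin above max_harmonic and transform back."""
    spectrum = dft(table)
    n = len(spectrum)
    for k in range(n):
        harmonic = min(k, n - k)   # bins k and n-k carry the same harmonic
        if harmonic > max_harmonic:
            spectrum[k] = 0j
    return idft_real(spectrum)

SIZE = 256                                # small stand-in for the 8192-point table
naive_saw = [2.0 * n / SIZE - 1.0 for n in range(SIZE)]
pitch_hz = 1000.0                         # the 1000 Hz example tone
max_harmonic = int(20000.0 // pitch_hz)   # 20 harmonics reach 20 kHz
limited_saw = band_limit(naive_saw, max_harmonic)
```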

Now I'm thinking, since in the example above I'd be using only 40 bins from
the FFT, would it be enough to have only 40 bins in the wavetable as well?
Assuming just linear interpolation. I don't suppose so, because it would
just be too inaccurate, on the other hand maybe you know some cool theorem,
that could make this work :). Linear interpolation is basically a low-pass
right? So this could mess up the highest frequencies...

Btw. you said interpolate between multiple band-limited wavetables - why?
This would gradually remove the higher octave, so it would lower the
aliasing, but also remove the highest frequencies from where I don't really
want this yet, I think.	 


Reply by Tim Wescott ●July 4, 2014

On Fri, 04 Jul 2014 03:48:33 -0500, jungledmnc wrote:

> Hi,
> 
> I'm programming a sound generator, based on wavetables. I have 8192
> point wavetable. I create several band-limited "subwavetables" by taking
> DFT, zeroing high octave(s) and IDFT. For generating particular pitch I
> choose a wavetable, which has all harmonics until 20k. Sound good so
> far, way better than just upsampling the original non-band-limited
> wavetable.
> 
> The harmonics that exceed 22k in 44kHz sampling rate are aliasing back,
> so I'm oversampling it. But since I want zero latency, I'm using a
> minimum phase 120dB/oct low-pass (butterworth filters, so yeah I know,
> huge phase shift, but I'm ok with that).
> 
> PROBLEM:
> 
> I'm trying a sawtooth wave, which due to its harmonic complexity aliases
> a lot.
> 
> - If I oversample 2x (to 88k), the aliasing is reduced.
> 
> - If I oversample 4x, the aliasing is reduced, but less than with the 2x
> oversampling!
> 
> If I use linear-phase downsampling however, then it behaves correctly
> and the higher the oversampling rate, the more the aliasing is reduced.
> 
> Why is this happening?

Sawtooth waves are so very rich in harmonics, it's likely that for the 
note you've chosen, when you oversample at 2x the aliases are thrown out 
of the audible band.

It may behoove you to experiment (on paper) with the spectrum of the 
sawtooth wave, and see where the aliases land as you vary the sampling 
rate.
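The paper experiment can also be done in a few lines of code. The sketch below is only a bookkeeping aid: it folds each harmonic of a tone into the range 0 to Fs/2 and lists the ones that land below an audibility limit, assumed here to be 20 kHz. It ignores harmonic amplitudes and any decimation filter.

```python
def alias_frequency(f_hz, fs_hz):
    """Frequency in [0, fs/2] where a component at f_hz lands after sampling."""
    f = f_hz % fs_hz
    return fs_hz - f if f > fs_hz / 2.0 else f

def audible_aliases(f0_hz, fs_hz, num_harmonics, audible_hz=20000.0):
    """List (harmonic, alias_hz) for harmonics above Nyquist that fold below audible_hz."""
    hits = []
    for k in range(1, num_harmonics + 1):
        f = k * f0_hz
        if f > fs_hz / 2.0:
            a = alias_frequency(f, fs_hz)
            if a < audible_hz:
                hits.append((k, a))
    return hits

# A 1 kHz tone with 40 harmonics, at 44.1 kHz and at 2x that rate.
at_1x = audible_aliases(1000.0, 44100.0, 40)
at_2x = audible_aliases(1000.0, 88200.0, 40)
print(at_1x[:3], at_2x)
```

For this particular tone every harmonic up to the 40th sits below Nyquist at 88.2 kHz, so the second list comes back empty, which matches the suggestion that 2x can throw the aliases out of the audible band for some notes.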

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Reply by robert bristow-johnson ●July 4, 2014

On 7/4/14 2:34 PM, jungledmnc wrote:
> Thank you r-b-j
>
> You generally got all the points I think. After some thinking I came up to
> the conclusion that this is inevitable, it just cannot be completely
> alias-free without using extremely steep filters.
>

well, it certainly is not inevitable unless you're counting our hearing 
as an extremely steep filter.

> Basically I'm having just one wavetable per octave. I create these
> band-limited wavetables by DFT, then zeroing bins that correspond to these
> frequencies and the IDFT back. Since I have 8192 point FFT, let's say I'm
> playing tone at 1000Hz and I want to remove everything above :
>
> 2 * (20000 / 1000) = 40
>
> 20000 Hz is the maximum frequency I'm allowing, 1000 Hz is the minimum, and
> 2 is simply because of the fact that I'm using real DFT, so every bin has 2
> values (re and im).
>
> Now I'm thinking, since in the example above I'd be using only 40 bins from
> the FFT, would it be enough to have only 40 bins in the wavetable as well?
> Assuming just linear interpolation. I don't suppose so, because it would
> just be too inaccurate, on the other hand maybe you know some cool theorem,
> that could make this work :).

Duane Wise and i wrote a paper in the 90s about how much oversampling is 
needed given a form of polynomial interpolation to accomplish a given 
S/N ratio.  it assumes that all images eventually fold back and become 
error (or noise), which is a worst case.  then we looked at drop-sample, 
linear, 3rd-order Lagrange, Hermite, and B-spline.  and i came up with a 
rationale for why B-spline was the "best" (in that it attenuated the 
images the most).  but, if your oversampling ratio is high enough, 
linear interpolation is fine.  if i remember, oversampling by a ratio of 
512 will get you 120 dB S/N with linear interpolation.  (now, with 8192, 
that gets you only up to the 8th harmonic, but i don't think the 
*images* of the higher harmonics will be that loud.  but if you like 
8192 and have the memory for it, because you may have a lotta wavetables 
loaded up, then go for it.  it doesn't cost computational resources, 
just memory resources.)

if you want a copy of that paper, send me an email with your legit email 
address and i'll send it.  it's a good 2 decades old, i think.

> Linear interpolation is basically a low-pass right?

it has a low-pass property to it.  but there are other properties of 
linear interpolation.  one *good* property is that the sinc^2(f/fs) 
(where fs is the *oversampled* sample rate) frequency response of linear 
interpolation is that it puts a zero right in the middle of every image 
of the original spectrum, other than the image (centered at DC) that you 
wanna keep.

> So this could mess up the highest frequencies...

but if you're greatly oversampled, f<<fs and the sinc^2 function is 
still pretty close to 1.  so not much LPFing of the frequencies you're 
worried about, *if* you oversample sufficiently.  and the oversampling i 
am referring to, is the size of the wavetable divided by twice the 
highest harmonic number.  (i.e. you are *sufficiently* sampled, not 
oversampled, if the number of wavetable points is just over twice the 
highest harmonic number.  so a 128-point wavetable can represent the 
63rd harmonic perfectly, both magnitude and phase.)
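To put numbers on the sinc^2 argument, here is a small sketch. It assumes the table holds exactly one cycle, so harmonic k sits at f/fs = k/P for a P-point table and its first image sits at (P - k)/P. The function names are made up for the example.

```python
import math

def linear_interp_gain_db(f_over_fs):
    """Gain in dB of linear interpolation, sinc^2(f/fs), fs being the oversampled rate."""
    x = math.pi * f_over_fs
    if x == 0.0:
        return 0.0
    s = math.sin(x) / x
    return 20.0 * math.log10(s * s)

def harmonic_droop_db(table_size, harmonic):
    """Passband loss that linear interpolation imposes on a given harmonic."""
    return linear_interp_gain_db(harmonic / table_size)

def first_image_gain_db(table_size, harmonic):
    """Gain applied to the first image of a given harmonic."""
    return linear_interp_gain_db((table_size - harmonic) / table_size)

print(harmonic_droop_db(8192, 20), first_image_gain_db(8192, 20))
print(harmonic_droop_db(128, 63))
```

With 8192 points and 20 harmonics the droop is negligible and the first image is pushed down by more than 100 dB, while a critically sampled 128-point table loses several dB at the 63rd harmonic. That is the case for oversampling the table before interpolating linearly.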

>
> Btw. you said interpolate between multiple band-limited wavetables - why?

to avoid clicks in a note that may glissando or portamento from the 
transition of one wavetable to another because the pitch moved from one 
wavetable range to another.

> This would gradually remove the higher octave, so it would lower the
> aliasing, but also remove the highest frequencies from where I don't really
> want this yet, I think.

what if all of these missing harmonics or aliased harmonics are happening
above your defined bandlimit (which you want as 20 kHz)?  do you care then?

lemme show you how to think of this mathematically.

let M be the number of wavetables per octave (it does not have to be an 
integer).  and let f0 be the fundamental frequency of a note in the 
bottom of the pitch range of some wavetable.  the fundamental frequency 
of a note at the top of that range is:

      2^(1/M) * f0  =  r * f0.

i think M=2 (or r=sqrt(2)) is a good value, but you seem to like M=1 (or 
r=2).  it doesn't have to be either.  you could have a new wavetable 
every 7 semitones or 8 or 9.  it's just easier for me to think about 
having an integer number of wavetables per octave.

now let B be your bandlimit; the top frequency that you care about.  if 
you wanted to, you can put *one* brick-wall filter (for *all* of the 
voices, you don't need a separate filter for each voice) on the output 
of the whole thing to kill everything above B, so you don't care if 
there are harmonics above B or missing harmonics above B or even aliased 
harmonics above B.  you don't give a rat's ass about it.

so the oversampling ratio of the system (this is *not* the same as the 
oversampling ratio implied by the wavetable size which affects the 
linear interpolation between points, that's a different issue) is

     Nyquist/B  =  (Fs/2)/B .

now let N be the harmonic index of the highest harmonic in the waveform 
defined by the wavetable.  so if you want harmonics up to the bandlimit 
B but not over it, then

     N*f0 <= B < (N+1)*f0

so
     N = floor(B/f0) .

i think you got that right.  for the note middle C and for B=20 kHz, 
then N=76.  (i think that's more harmonics than you need, but i just 
want to illustrate the arithmetic.)
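The harmonic count is easy to verify. The sketch below assumes middle C is 261.63 Hz (A4 = 440 Hz, equal temperament); `highest_harmonic` is just a name for the floor expression above.

```python
import math

def highest_harmonic(f0_hz, band_hz=20000.0):
    """N = floor(B / f0): the highest harmonic that stays at or below the bandlimit."""
    return math.floor(band_hz / f0_hz)

MIDDLE_C_HZ = 261.63   # assumed tuning: A4 = 440 Hz, equal temperament

n = highest_harmonic(MIDDLE_C_HZ)
print(n, n * MIDDLE_C_HZ, (n + 1) * MIDDLE_C_HZ)
```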

now, given the *same* wavetable, with the *same* number of harmonics 
(which is N), if you go to the highest note in the pitch range for that 
wavetable, the fundamental frequency is r*f0 and the highest harmonic 
would be N*r*f0.  the lowest frequency alias of that highest harmonic is

     Fs - N*r*f0

now, in this worst case, you do not want the alias of that highest
harmonic to be below your bandlimit, B.  you don't care about it folding
over as long as it's above B, because either we don't hear it, or you're
gonna kill it with a brick-wall LPF.  so that says:

     B  <  Fs - r*N*f0

this can be accomplished with a little bit of conservative rounding:

     B  <  Fs - r*B  <=  Fs - r*N*f0

this gives you a relationship that tells you something about how many 
wavetables per octave you need:

     B + r*B  =  (1+r)*B  <  Fs

     1 + r  =  1 + 2^(1/M)  <  Fs/B

so

     M  >  1 / log2(Fs/B - 1)

then for Fs = 48 kHz and B = 19.882 kHz (pretty damn close to your 20 
kHz), you need 2 wavetables per octave of synthesizer pitch.  this 
means, above 19.882 kHz, you might have harmonics, missing harmonics, or 
even aliased harmonics.  if you think you can hear it, brick-wall the 
sonofabitch at B, then none of us will care.

you get to check this out with numbers you like (like M=1 or Fs=88.2 kHz 
or whatever).
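For anyone who wants to plug in their own numbers, the two relations above fit in a short sketch. The function names are hypothetical, and middle C at 261.63 Hz is an assumed tuning. The first function returns the bound 1 / log2(Fs/B - 1), so M has to be strictly greater than the returned value. The second returns the lowest alias Fs - N*r*f0 at the top of a table's pitch range.

```python
import math

def tables_per_octave_bound(fs_hz, band_hz):
    """Return 1 / log2(Fs/B - 1); M must exceed this. Needs Fs > 2*B."""
    ratio = fs_hz / band_hz - 1.0
    if ratio <= 1.0:
        raise ValueError("Fs must exceed 2*B for any number of tables to work")
    return 1.0 / math.log2(ratio)

def worst_case_alias_hz(fs_hz, band_hz, f0_hz, tables_per_octave):
    """Lowest alias of the highest harmonic at the top of the table's range."""
    n = math.floor(band_hz / f0_hz)           # N = floor(B / f0)
    r = 2.0 ** (1.0 / tables_per_octave)      # r = 2^(1/M)
    return fs_hz - n * r * f0_hz

for fs, band in [(48000.0, 19882.0), (44100.0, 20000.0), (88200.0, 20000.0)]:
    print(fs, band, tables_per_octave_bound(fs, band))
```

With Fs = 48 kHz and B = 19.882 kHz the bound comes out just under 2, as stated. At 44.1 kHz with B = 20 kHz it is about 3.7, so four tables per octave would be needed, and at 88.2 kHz one table per octave clears the bound.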

-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

Reply by robert bristow-johnson ●July 7, 2014

On 7/4/14 5:39 PM, robert bristow-johnson wrote:
> ...
> [a bunch of stuff]

so jungle, did you figure it out?  how big are your wavetables gonna be 
and how many are there (for a single voice)?


-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

Reply by jungledmnc ●July 9, 2014

Thank you r-b-j! Exhaustive and helpful as usual! I'm still getting my head
around it though :). One thing is certain - 512x oversampling is waaaay too
much. Anyway I need to read the bandlimit arithmetic theory of yours again,
seems that one pass wasn't enough :D.	 


Reply by robert bristow-johnson ●July 9, 2014

On 7/9/14 12:10 PM, jungledmnc wrote:
> Thank you r-b-j! Exhaustive and helpful as usual! I'm still getting my head
> around it though :). One thing is certain - 512x oversampling is waaaay too
> much.

i would agree, but only because the higher harmonics, that *do* fold 
back into your baseband, are too weak in energy to amount to much. 
personally i think an 8192-point wavetable is more than you need.  but i 
have seen another person go to 4096.  i have, myself, limited the 
wavetable size to 1024 for "expanded" wavetables (those that are 
"loaded" at program-change time and are ready to rock-n-roll).  for 
wavetable *storage* (like in flash ROM or hard disk or some other long-term
storage), i have them usually at 128 points (which can represent *every* 
harmonic up to the 63rd).  then when the small wavetable is loaded into 
RAM, it is expanded by high-quality interpolation into a wavetable 8 or 
16 or 32 times larger (so the oversampling ratio would be 8 or 16 or 32).
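The post does not say which high-quality interpolation is used for the expansion. One possibility, assumed here purely for illustration, is Fourier (trigonometric) interpolation: take the DFT of the stored cycle and re-evaluate the same harmonics on a finer grid. The sketch uses a naive DFT and a band-limited sawtooth as hypothetical test data.

```python
import cmath
import math

def bandlimited_saw(size, num_harmonics):
    """One cycle of a sawtooth built from its first num_harmonics partials."""
    return [sum(math.sin(2.0 * math.pi * k * n / size) / k
                for k in range(1, num_harmonics + 1))
            for n in range(size)]

def expand_table(small, factor):
    """Expand one cycle by an integer factor using Fourier interpolation.

    Assumes the Nyquist bin of the stored table is empty, as it is when
    a 128-point table holds harmonics 1 to 63.
    """
    n = len(small)
    half = n // 2
    coeffs = [sum(small[m] * cmath.exp(-2j * math.pi * k * m / n)
                  for m in range(n)) / n
              for k in range(half)]
    big_n = n * factor
    big = []
    for m in range(big_n):
        value = coeffs[0].real                      # DC term
        for k in range(1, half):
            # each harmonic appears once, doubled to cover its mirror bin
            value += 2.0 * (coeffs[k] * cmath.exp(2j * math.pi * k * m / big_n)).real
        big.append(value)
    return big

stored = bandlimited_saw(128, 63)       # compact storage form
expanded = expand_table(stored, 8)      # 1024 points, oversampling ratio 8
```

Because the stored table is band-limited, this expansion reproduces the underlying waveform exactly (up to rounding), and every eighth point of the expanded table equals the corresponding stored point.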

but if you wanted really high-quality resampling to an arbitrary ratio, 
if you want 120 dB S/N in the resampling, first you need to upsample by 
512x and then linearly interpolate.  you *don't* need to compute *every* 
point in the upsampled signal, just the two samples that you're using to 
linearly interpolate.  but with wavetable, they are all precomputed 
anyway.  so it's just linear interpolation, but make sure the wavetable 
is large enough and the number of non-zero harmonics is sufficiently 
limited.  then linear interpolation is good enough.

> Anyway I need to read the bandlimit arithmetic theory of yours again,
> seems that one pass wasn't enough :D.	

again, send me an email (with your email address), if you want that
paper from Duane and me.

essentially, if you're using drop-sample interpolation, you gain 6 dB 
S/N ratio for every time you double the oversampling ratio (plus some 
constant around 5 dB, i think).  for linear interpolation, it's 12 dB 
for every time you double the oversampling ratio (plus some constant 
around 11 dB, i think).
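Those rules of thumb are easy to tabulate. The sketch below takes the slopes and the approximate constants (5 dB and 11 dB) straight from the paragraph above, so the results are only as accurate as those recalled figures. The oversampling ratio for a wavetable is taken as table size divided by twice the highest harmonic number, as defined earlier in the thread.

```python
import math

def estimated_snr_db(oversampling_ratio, method):
    """Rule-of-thumb S/N: slope per doubling of the ratio plus a rough constant."""
    doublings = math.log2(oversampling_ratio)
    if method == "drop-sample":
        return 6.0 * doublings + 5.0     # constant is approximate
    if method == "linear":
        return 12.0 * doublings + 11.0   # constant is approximate
    raise ValueError("method must be 'drop-sample' or 'linear'")

def table_oversampling_ratio(table_size, highest_harmonic):
    """Table size divided by twice the highest harmonic number."""
    return table_size / (2.0 * highest_harmonic)

print(estimated_snr_db(512, "linear"))
print(estimated_snr_db(table_oversampling_ratio(8192, 20), "linear"))
```

At a ratio of 512 the linear estimate lands at 119 dB, in line with the 120 dB figure quoted earlier. An 8192-point table holding 20 harmonics has a ratio of about 205, which the same rule puts a little over 100 dB.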

also, are you hooked up to the music-dsp mailing list?  a lot of this 
would be a good discussion there.  i would recommend it.

-- 

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."
