Higher upsampling with minimum phase downsampling produces more aliasing

Started by jungledmnc July 4, 2014
Hi,

I'm programming a sound generator based on wavetables. I have an 8192-point
wavetable. I create several band-limited "subwavetables" by taking the DFT,
zeroing the high octave(s) and taking the IDFT. To generate a particular pitch
I choose the wavetable that has all harmonics up to 20k. Sounds good so far,
way better than just upsampling the original non-band-limited wavetable.

The harmonics that exceed 22k at a 44kHz sampling rate alias back, so
I'm oversampling. But since I want zero latency, I'm using a minimum-phase
120dB/oct low-pass (Butterworth filters, so yeah I know, huge phase
shift, but I'm OK with that).

PROBLEM:

I'm trying a sawtooth wave, which due to its harmonic complexity aliases a
lot.

- If I oversample 2x (to 88k), the aliasing is reduced.

- If I oversample 4x, the aliasing is reduced, but less than with the 2x
oversampling!

If I use linear-phase downsampling however, then it behaves correctly and
the higher the oversampling rate, the more the aliasing is reduced. 

Why is this happening?	 

_____________________________		
Posted through www.DSPRelated.com
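
The band-limiting step described in the post (DFT, zero the high bins, IDFT) might look roughly like the sketch below. It is only an illustration under stated assumptions: the 64-point table, the naive sawtooth, the harmonic limit of 8 and the function names are all hypothetical, and a naive O(N^2) DFT stands in for a real FFT so that the example needs only the standard library.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT; fine for a small demonstration table."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT matching dft() above."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def band_limit(table, max_harmonic):
    """Zero every bin above max_harmonic, including its mirrored negative-frequency bin."""
    n = len(table)
    X = dft(table)
    for k in range(n):
        harmonic = min(k, n - k)  # bins k and n-k carry the same harmonic
        if harmonic > max_harmonic:
            X[k] = 0
    # The zeroing is symmetric, so the result is real up to rounding error.
    return [v.real for v in idft(X)]

N = 64  # hypothetical small table; the post uses 8192 points
saw = [2.0 * t / N - 1.0 for t in range(N)]   # naive, non-band-limited sawtooth
limited = band_limit(saw, max_harmonic=8)
spectrum = [abs(v) for v in dft(limited)]      # bins 9..N-9 are now (numerically) zero
```

One such table would be built per pitch range, each with its own harmonic limit.
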
Are you using real hardware or Matlab?

Check that your Butterworth filters are not clipping at an internal node (common problem for high-order IIR filters, you need to order the stages correctly)

Bob
> Are you using real hardware or Matlab?
>
> Check that your Butterworth filters are not clipping at an internal node
> (common problem for high-order IIR filters, you need to order the stages correctly)
>
> Bob
It's a C++ app. Anyway, I tried decreasing the generator volume a lot, and there was no change.

I was thinking maybe it's just that the FIR I'm using for LP downsampling is much steeper? But still, the minimum-phase filter is 120dB/oct!

Anyway, the problem is probably simply that the Butterworth filter is effectively less steep at higher sampling rates (or rather, that its response converges toward Nyquist at the lower sampling rate).
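
The hypothesis in the last paragraph can be checked on paper. The sketch below is not the poster's filter. It assumes a 20th-order Butterworth (6 dB/oct per pole gives 120 dB/oct), a 20 kHz cutoff, and a bilinear-transform design, none of which are stated in the thread, and it compares the attenuation at 24.1 kHz (the lowest frequency that folds down to 20 kHz at a 44.1 kHz output rate) for 2x and 4x oversampling.

```python
import math

def butterworth_atten_db(f, fc, fs, order):
    """Attenuation in dB at frequency f for a Butterworth low-pass with cutoff fc,
    assuming a bilinear-transform design running at sample rate fs."""
    warped = math.tan(math.pi * f / fs) / math.tan(math.pi * fc / fs)
    return 10.0 * math.log10(1.0 + warped ** (2 * order))

ORDER = 20         # assumed: 6 dB/oct per pole -> 120 dB/oct
CUTOFF = 20000.0   # assumed cutoff frequency in Hz
PROBE = 24100.0    # folds to 44100 - 24100 = 20000 Hz after decimation

atten_2x = butterworth_atten_db(PROBE, CUTOFF, 2 * 44100.0, ORDER)
atten_4x = butterworth_atten_db(PROBE, CUTOFF, 4 * 44100.0, ORDER)
print(f"2x: {atten_2x:.1f} dB, 4x: {atten_4x:.1f} dB")
```

Under these assumptions the 2x filter attenuates the probe frequency more than the 4x filter does, because the bilinear transform compresses the response toward the lower Nyquist frequency. A different design method or cutoff would change the numbers.
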
On 7/4/14 4:48 AM, jungledmnc wrote:
>
> I'm programming a sound generator, based on wavetables. I have 8192 point
> wavetable. I create several band-limited "subwavetables" by taking DFT,
> zeroing high octave(s)

you mean zeroing bins? how many?

> and IDFT.

so now you have a 8192 sample wavetable that has only energy in the lower harmonics, right?

> For generating particular pitch a choose a
> wavetable, which has all harmonics until 20k.

where those harmonics end up depends on the "stride" that you're stepping through the wavetable (and the sample rate).

> Sound good so far, way better
> than just upsampling the original non-band-limited wavetable.

whatever that means.

> The harmonics that exceed 22k in 44kHz sampling rate are aliasing back, so
> I'm oversampling it. But since I want zero latency, I'm using a minimum
> phase 120dB/oct low-pass (butterworth filters, so yeah I know, huge phase
> shift, but I'm ok with that).
>
> PROBLEM:
>
> I'm trying a sawtooth wave, which due to its harmonic complexity aliases a
> lot.
>
> - If I oversample 2x (to 88k), the aliasing is reduced.
>
> - If I oversample 4x, the aliasing is reduced, but less than with the 2x
> oversampling!
>
> If I use linear-phase downsampling however, then it behaves correctly and
> the higher the oversampling rate, the more the aliasing is reduced.
>
> Why is this happening?
i think you need to be more clear about what you're doing. when you have a wavetable that is much larger than the number of samples per cycle of the waveform (and 8192 sounds like it's a helluva lot bigger), then you're oversampled. there is no downsampling other than the striding through the wavetable as you play back the waveform. what remains are the number of harmonics. that number stays the same no matter what the stride is. the highest harmonics might fold back.

i think you need to be careful with the concepts and quantities. if you're careful enough that you can explain it so that we can understand what the heck you're talking about, you'll probably solve your problem.

FYI, some aliasing won't kill you, as long as the aliased harmonics do not fold back too far. if your sample rate is 44.1 kHz and you have 2 wavetables per octave of keyboard, you can keep all of the aliased harmonics (or missing harmonics) above 19 kHz. but you have to be able to line up the wavetables (not hard) and crossfade between them.

also, 8192 sample wavetable is 4 or 8 times bigger than you need, methinks. even with a 1024 or 2048 point wavetable, you should be able to linearly-interpolate between points and it will come out fine.

--
r b-j   rbj@audioimagination.com
"Imagination is more important than knowledge."
Thank you r-b-j

You generally got all the points, I think. After some thinking I came to
the conclusion that this is inevitable: it just cannot be completely
alias-free without using extremely steep filters.

Basically I have just one wavetable per octave. I create these
band-limited wavetables by taking the DFT, zeroing the bins that correspond
to the unwanted frequencies, and taking the IDFT. Since I have an 8192-point
FFT, let's say I'm playing a tone at 1000Hz and I want to remove everything above:

2 * (20000 / 1000) = 40

20000 Hz is the maximum frequency I'm allowing, 1000 Hz is the minimum, and
2 is simply because of the fact that I'm using real DFT, so every bin has 2
values (re and im).

Now I'm thinking, since in the example above I'd be using only 40 bins from
the FFT, would it be enough to have only 40 bins in the wavetable as well?
Assuming just linear interpolation. I don't suppose so, because it would
just be too inaccurate, on the other hand maybe you know some cool theorem,
that could make this work :). Linear interpolation is basically a low-pass
right? So this could mess up the highest frequencies...

Btw. you said interpolate between multiple band-limited wavetables - why?
This would gradually remove the higher octave, so it would lower the
aliasing, but also remove the highest frequencies from where I don't really
want this yet, I think.	 

On Fri, 04 Jul 2014 03:48:33 -0500, jungledmnc wrote:

> Hi,
>
> I'm programming a sound generator, based on wavetables. I have 8192
> point wavetable. I create several band-limited "subwavetables" by taking
> DFT, zeroing high octave(s) and IDFT. For generating particular pitch a
> choose a wavetable, which has all harmonics until 20k. Sound good so
> far, way better than just upsampling the original non-band-limited
> wavetable.
>
> The harmonics that exceed 22k in 44kHz sampling rate are aliasing back,
> so I'm oversampling it. But since I want zero latency, I'm using a
> minimum phase 120dB/oct low-pass (butterworth filters, so yeah I know,
> huge phase shift, but I'm ok with that).
>
> PROBLEM:
>
> I'm trying a sawtooth wave, which due to its harmonic complexity aliases
> a lot.
>
> - If I oversample 2x (to 88k), the aliasing is reduced.
>
> - If I oversample 4x, the aliasing is reduced, but less than with the 2x
> oversamping!
>
> If I use linear-phase downsampling however, then it behaves correctly
> and the higher the oversampling rate, the more the aliasing is reduced.
>
> Why is this happening?

Sawtooth waves are so very rich in harmonics, it's likely that for the note you've chosen, when you oversample at 2x the aliases are thrown out of the audible band.

It may behoove you to experiment (on paper) with the spectrum of the sawtooth wave, and see where the aliases land as you vary the sampling rate.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
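
The paper experiment suggested above can also be run as a few lines of code. This is a minimal sketch with hypothetical names and a hypothetical test note (a 1 kHz sawtooth with 40 harmonics); it lists the harmonics above Nyquist and where they fold to at a given sampling rate.

```python
def folded_frequency(f, fs):
    """Frequency in [0, fs/2] at which a component at f appears after sampling at fs."""
    f = f % fs
    return fs - f if f > fs / 2 else f

def audible_aliases(f0, fs, n_harmonics, audible_limit=20000.0):
    """Return (harmonic number, folded frequency) for every harmonic above
    Nyquist whose alias lands below audible_limit."""
    hits = []
    for k in range(1, n_harmonics + 1):
        f = k * f0
        if f > fs / 2:
            alias = folded_frequency(f, fs)
            if alias < audible_limit:
                hits.append((k, alias))
    return hits

# Hypothetical note: 1 kHz sawtooth with 40 harmonics (content up to 40 kHz).
at_1x = audible_aliases(1000.0, 44100.0, 40)
at_2x = audible_aliases(1000.0, 88200.0, 40)
```

For this particular note the 40 kHz top harmonic stays below the 2x Nyquist frequency of 44.1 kHz, so nothing folds at the generation stage. What happens on the way back down to 44.1 kHz then depends on the decimation filter.
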
On 7/4/14 2:34 PM, jungledmnc wrote:
> Thank you r-b-j
>
> You generally got all the points I think. After some thinking I came up to
> the conclusion that this is inevitable, it just cannot be completely
> alias-free without using extremely steep filters.
>

well, it certainly is not inevitable unless you're counting our hearing as an extremely steep filter.

> Basically I'm having just one wavetable per octave. I create these
> band-limited wavetables by DFT, then zeroing bins that correspond to these
> frequencies and the IDFT back. Since I have 8192 point FFT, let's say I'm
> playing tone at 1000Hz and I want to remove everything above :
>
> 2 * (20000 / 1000) = 40
>
> 20000 Hz is the maximum frequency I'm allowing, 1000 Hz is the minimum, and
> 2 is simply because of the fact that I'm using real DFT, so every bin has 2
> values (re and im).
>
> Now I'm thinking, since in the example above I'd be using only 40 bins from
> the FFT, would it be enough to have only 40 bins in the wavetable as well?
> Assuming just linear interpolation. I don't suppose so, because it would
> just be too inaccurate, on the other hand maybe you know some cool theorem,
> that could make this work :).

Duane Wise and i wrote a paper in the 90s about how much oversampling is needed given a form of polynomial interpolation to accomplish a given S/N ratio. it assumes that all images eventually fold back and become error (or noise), which is a worst case. then we looked at drop-sample, linear, 3rd-order Lagrange, Hermite, and B-spline. and i came up with a rationale for why B-spline was the "best" (in that it attenuated the images the most).

but, if your oversampling ratio is high enough, linear interpolation is fine. if i remember, oversampling by a ratio of 512 will get you 120 dB S/N with linear interpolation. (now, with 8192, that gets you only up to the 8th harmonic, but i don't think the *images* of the higher harmonics will be that loud. but if you like 8192 and have the memory for it, because you may have a lotta wavetables loaded up, then go for it. it doesn't cost computational resources, just memory resources.)

if you want a copy of that paper, send me an email with your legit email address and i'll send it. it's a good 2 decades old, i think.
> Linear interpolation is basically a low-pass right?
it has a low-pass property to it. but there are other properties of linear interpolation. one *good* property is that its frequency response, sinc^2(f/fs) (where fs is the *oversampled* sample rate), puts a zero right in the middle of every image of the original spectrum, other than the image (centered at DC) that you wanna keep.
> So this could mess up the highest frequencies...
but if you're greatly oversampled, f<<fs and the sinc^2 function is still pretty close to 1. so not much LPFing of the frequencies you're worried about, *if* you oversample sufficiently. and the oversampling i am referring to, is the size of the wavetable divided by twice the highest harmonic number. (i.e. you are *sufficiently* sampled, not oversampled, if the number of wavetable points is just over twice the highest harmonic number. so a 128-point wavetable can represent the 63rd harmonic perfectly, both magnitude and phase.)
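
The sinc^2 behaviour and the "sufficiently sampled" rule described above can be checked numerically. This is a minimal sketch with hypothetical function names. Frequencies are expressed in harmonic numbers, so fs corresponds to the table size (one cycle spans the whole table).

```python
import math

def linear_interp_gain(f, fs):
    """sinc^2(f/fs): magnitude response of linear interpolation between
    points spaced 1/fs apart."""
    x = f / fs
    if x == 0:
        return 1.0
    s = math.sin(math.pi * x) / (math.pi * x)
    return s * s

def highest_harmonic(table_size):
    """Highest harmonic a table can hold in both magnitude and phase
    (table size must be just over twice the harmonic number)."""
    return (table_size - 1) // 2

def oversampling_ratio(table_size, top_harmonic):
    """Table size divided by twice the highest harmonic number."""
    return table_size / (2 * top_harmonic)

gain_h8 = linear_interp_gain(8, 8192)        # 8th harmonic in an 8192-point table
gain_image = linear_interp_gain(8192, 8192)  # centre of the first image: a zero
print(highest_harmonic(128), oversampling_ratio(8192, 8), gain_h8, gain_image)
```
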
> Btw. you said interpolate between multiple band-limited wavetables - why?

to avoid clicks in a note that may glissando or portamento from the transition of one wavetable to another because the pitch moved from one wavetable range to another.

> This would gradually remove the higher octave, so it would lower the
> aliasing, but also remove the highest frequencies from where I don't really
> want this yet, I think.
what if all these missing harmonics or aliased harmonics are happening above your defined bandlimit (which you want as 20 kHz)? do you care then?

lemme show you how to think of this mathematically. let M be the number of wavetables per octave (it does not have to be an integer). and let f0 be the fundamental frequency of a note in the bottom of the pitch range of some wavetable. the fundamental frequency of a note at the top of that range is:

    2^(1/M) * f0 = r * f0

i think M=2 (or r=sqrt(2)) is a good value, but you seem to like M=1 (or r=2). it doesn't have to be either. you could have a new wavetable every 7 semitones or 8 or 9. it's just easier for me to think about having an integer number of wavetables per octave.

now let B be your bandlimit; the top frequency that you care about. if you wanted to, you can put *one* brick-wall filter (for *all* of the voices, you don't need a separate filter for each voice) on the output of the whole thing to kill everything above B, so you don't care if there are harmonics above B or missing harmonics above B or even aliased harmonics above B. you don't give a rat's ass about it.

so the oversampling ratio of the system (this is *not* the same as the oversampling ratio implied by the wavetable size which affects the linear interpolation between points, that's a different issue) is

    Nyquist/B = (Fs/2)/B

now let N be the harmonic index of the highest harmonic in the waveform defined by the wavetable. so if you want harmonics up to the bandlimit B but not over it, then

    N*f0 <= B < (N+1)*f0

so

    N = floor(B/f0)

i think you got that right. for the note middle C and for B=20 kHz, then N=76. (i think that's more harmonics than you need, but i just want to illustrate the arithmetic.)

now, given the *same* wavetable, with the *same* number of harmonics (which is N), if you go to the highest note in the pitch range for that wavetable, the fundamental frequency is r*f0 and the highest harmonic would be N*r*f0.

the lowest frequency alias of that highest harmonic is

    Fs - N*r*f0

now, in this worst case, you do not want the alias of that highest harmonic to land below your bandlimit, B. you don't care about it folding over as long as it's above B, because either we don't hear it, or you're gonna kill it with a brick-wall LPF. so that says:

    B < Fs - r*N*f0

this can be accomplished with a little bit of conservative rounding (since N*f0 <= B):

    B < Fs - r*B <= Fs - r*N*f0

this gives you a relationship that tells you something about how many wavetables per octave you need:

    B + r*B = (1+r)*B < Fs

    1 + r = 1 + 2^(1/M) < Fs/B

so

    M > 1 / log2(Fs/B - 1)

then for Fs = 48 kHz and B = 19.882 kHz (pretty damn close to your 20 kHz), you need 2 wavetables per octave of synthesizer pitch. this means, above 19.882 kHz, you might have harmonics, missing harmonics, or even aliased harmonics. if you think you can hear it, brick-wall the sonofabitch at B, then none of us will care.

you get to check this out with numbers you like (like M=1 or Fs=88.2 kHz or whatever).

--
r b-j   rbj@audioimagination.com
"Imagination is more important than knowledge."
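
The bandlimit arithmetic in the post above is easy to try with other numbers. The sketch below just evaluates the three relationships from the post. The function names are hypothetical, and the middle C frequency of 261.6256 Hz (equal temperament, A = 440 Hz) is an assumption, since the post does not give a value.

```python
import math

def highest_harmonic_index(bandlimit, f0):
    """N = floor(B / f0): highest harmonic at or below the bandlimit."""
    return math.floor(bandlimit / f0)

def min_tables_per_octave(fs, bandlimit):
    """Lower bound on M from M > 1 / log2(Fs/B - 1); requires B < Fs/2."""
    return 1.0 / math.log2(fs / bandlimit - 1.0)

def worst_case_alias(fs, bandlimit, tables_per_octave):
    """Conservative lowest alias frequency, Fs - r*B, with r = 2^(1/M)."""
    r = 2.0 ** (1.0 / tables_per_octave)
    return fs - r * bandlimit

n_middle_c = highest_harmonic_index(20000.0, 261.6256)  # assumed middle C frequency
m_bound = min_tables_per_octave(48000.0, 19882.0)       # just under 2, so M = 2 works
alias_m2 = worst_case_alias(48000.0, 19882.0, 2)        # stays above B
alias_m1 = worst_case_alias(48000.0, 19882.0, 1)        # falls well below B
print(n_middle_c, m_bound, alias_m2, alias_m1)
```

With M = 1 the same inequality gives B < Fs/3, which is 14.7 kHz at Fs = 44.1 kHz.
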
On 7/4/14 5:39 PM, robert bristow-johnson wrote:
> ...
> [a bunch of stuff]

so jungle, did you figure it out? how big are your wavetables gonna be and how many are there (for a single voice)?

--
r b-j   rbj@audioimagination.com
"Imagination is more important than knowledge."
Thank you r-b-j! Exhaustive and helpful as usual! I'm still getting my head
around it though :). One thing is certain - 512x oversampling is waaaay too
much. Anyway I need to read the bandlimit arithmetic theory of yours again,
seems that one pass wasn't enough :D.	 

On 7/9/14 12:10 PM, jungledmnc wrote:
> Thank you r-b-j! Exhaustive and helpful as usual! I'm still getting my head
> around it though :). One thing is certain - 512x oversampling is waaaay too
> much.

i would agree, but only because the higher harmonics, that *do* fold back into your baseband, are too weak in energy to amount to much.

personally i think an 8192-point wavetable is more than you need. but i have seen another person go to 4096. i have, myself, limited the wavetable size to 1024 for "expanded" wavetables (those that are "loaded" at program-change time and are ready to rock-n-roll). for wavetable *storage* (like in flash ROM or hard disk or some other long-term storage), i have them usually at 128 points (which can represent *every* harmonic up to the 63rd). then when the small wavetable is loaded into RAM, it is expanded by high-quality interpolation into a wavetable 8 or 16 or 32 times larger (so the oversampling ratio would be 8 or 16 or 32).

but if you wanted really high-quality resampling to an arbitrary ratio, if you want 120 dB S/N in the resampling, first you need to upsample by 512x and then linearly interpolate. you *don't* need to compute *every* point in the upsampled signal, just the two samples that you're using to linearly interpolate. but with wavetable, they are all precomputed anyway. so it's just linear interpolation, but make sure the wavetable is large enough and the number of non-zero harmonics is sufficiently limited. then linear interpolation is good enough.

> Anyway I need to read the bandlimit arithmetic theory of yours again,
> seems that one pass wasn't enough :D.

again, send me an email (with an email address), if you want that paper from Duane and me.

essentially, if you're using drop-sample interpolation, you gain 6 dB S/N ratio for every time you double the oversampling ratio (plus some constant around 5 dB, i think). for linear interpolation, it's 12 dB for every time you double the oversampling ratio (plus some constant around 11 dB, i think).

also, are you hooked up to the music-dsp mailing list? a lot of this would be a good discussion there. i would recommend it.

--
r b-j   rbj@audioimagination.com
"Imagination is more important than knowledge."
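
The rule of thumb quoted above (so many dB per doubling of the oversampling ratio, plus a constant) can be written as a one-line formula. The sketch below only restates the approximate figures from the post, which are given there from memory ("i think"), so the results are rough estimates and the function name is hypothetical.

```python
import math

def approx_snr_db(oversampling_ratio, db_per_doubling, offset_db):
    """S/N estimate: db_per_doubling * log2(ratio) + offset_db."""
    return db_per_doubling * math.log2(oversampling_ratio) + offset_db

# Approximate constants as quoted in the post: 6 dB/doubling + ~5 dB for
# drop-sample, 12 dB/doubling + ~11 dB for linear interpolation.
drop_sample_512 = approx_snr_db(512, 6.0, 5.0)
linear_512 = approx_snr_db(512, 12.0, 11.0)
linear_8 = approx_snr_db(8, 12.0, 11.0)
print(drop_sample_512, linear_512, linear_8)
```

At a ratio of 512 the linear-interpolation estimate comes out at 119 dB, in line with the "120 dB S/N" figure mentioned earlier in the thread.
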