Polyphase & wavetable playback?

Started by R Jones July 25, 2005
Hi.

I am looking to add "wavetable" playback of sampled music/instruments to an
existing project.  The design already uses an FPGA & has an audio CODEC
which supports playback at 48 kHz.

I want to implement pitch-shifting and mimic higher/lower notes by altering
the playback rate of the samples.  The pitch-shift is in the range of -2x
to 2x (-2*sampling_rate to +2*sampling_rate) controlled by a pot, probably
discretized to a signed 8-bit quantity.

I am having trouble conceptualizing what I need to implement the
pitch-shifting.

Do I just create a polyphase filtering structure which allows me to
up-sample/down-sample the raw audio and match it with the CODEC's 48 kHz
rate?

I do want to maintain as much audio quality as possible.

My initial guess was something along the lines of a DDS/phase accumulator
that controls the rate at which samples are plucked from the ROM, however
that does not address the fact the CODEC still expects data @ 48 kHz.

Your suggestions are appreciated.
Richard.
"R Jones" <donot@emailme.here> wrote in message 
news:QKeFe.3794$vf.3770@tornado.socal.rr.com...
> Hi.
>
> I am looking to add "wavetable" playback of sampled music/instruments to an
> existing project. The design already uses an FPGA & has an audio CODEC
> which supports playback at 48 kHz.
>
> I want to implement pitch-shifting and mimic higher/lower notes by altering
> the playback rate of the samples. The pitch-shift is in the range of -2x
> to 2x (-2*sampling_rate to +2*sampling_rate) controlled by a pot, probably
> discretized to a signed 8-bit quantity.
>
> I am having trouble conceptualizing what I need to implement the
> pitch-shifting.
>
> Do I just create a polyphase filtering structure which allows me to
> up-sample/down-sample the raw audio and match it with the CODEC's 48 kHz
> rate?

I'm assuming your original wavetable is all sampled at 48kHz. Correct me if I'm wrong.

> I do want to maintain as much audio quality as possible.
>
> My initial guess was something along the lines of a DDS/phase accumulator
> that controls the rate at which samples are plucked from the ROM, however
> that does not address the fact the CODEC still expects data @ 48 kHz.

You are on the right track. You output a sample at 48kHz, but move through the ROM table at a faster or slower rate depending on if you want to shift the pitch up or down. When between samples, you interpolate. Higher quality interpolation schemes will give better audio quality. This usually means looking at more samples around the interpolation point.
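
As a rough illustration of "looking at more samples around the interpolation point", here is a minimal Python sketch comparing 2-point linear interpolation with 4-point cubic (Catmull-Rom) interpolation. The function names, the clamping at the table edges, and the 16-samples-per-cycle sine test table are all assumptions made for the example; nothing here comes from the original poster's design.

```python
import math


def interp_linear(table, pos):
    """2-point linear interpolation at fractional index pos (pos >= 0)."""
    i = int(pos)                 # sample to the left of the read point
    frac = pos - i               # how far we are towards the next sample
    a = table[i]
    b = table[min(i + 1, len(table) - 1)]   # clamp at the end of the table
    return a + frac * (b - a)


def interp_cubic(table, pos):
    """4-point Catmull-Rom (cubic) interpolation at fractional index pos."""
    n = len(table)
    i = int(pos)
    t = pos - i
    # Two samples on each side of the read point, clamped at the table edges
    y0 = table[max(i - 1, 0)]
    y1 = table[i]
    y2 = table[min(i + 1, n - 1)]
    y3 = table[min(i + 2, n - 1)]
    c0 = y1
    c1 = 0.5 * (y2 - y0)
    c2 = y0 - 2.5 * y1 + 2.0 * y2 - 0.5 * y3
    c3 = 0.5 * (y3 - y0) + 1.5 * (y1 - y2)
    return ((c3 * t + c2) * t + c1) * t + c0


# Hypothetical test signal: a sine with 16 samples per cycle
table = [math.sin(2.0 * math.pi * k / 16.0) for k in range(64)]
pos = 10.5
exact = math.sin(2.0 * math.pi * pos / 16.0)
err_linear = abs(interp_linear(table, pos) - exact)
err_cubic = abs(interp_cubic(table, pos) - exact)
print(err_linear, err_cubic)
```

On this test signal the 4-point version lands closer to the true sine value than the 2-point version, at the cost of two extra ROM reads and a few more multiplies per output sample.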
Hi Jon,

> I'm assuming your original wavetable is all sampled at 48kHz. Correct me
> if I'm wrong.

Correct.

>> My initial guess was something along the lines of a DDS/phase accumulator
>> that controls the rate at which samples are plucked from the ROM, however
>> that does not address the fact the CODEC still expects data @ 48 kHz.
>
> You are on the right track. You output a sample at 48kHz, but move
> through the ROM table at a faster or slower rate depending on if you want
> to shift the pitch up or down. When between samples, you interpolate.
> Higher quality interpolation schemes will give better audio quality.
> This usually means looking at more samples around the interpolation point.

Ok, so the DDS attached to the ROM "scanning" is good.

Any suggestions on the interpolation itself? What techniques should I look
into? Polyphase is the technique that I've "heard" is overall the best.

Would this interpolation filter be clocked by the CODEC clock and receive
samples whenever the ROM presents it?

Thanks.
Richard.
Jon Harris wrote:
> "R Jones" <donot@emailme.here> wrote in message
> news:QKeFe.3794$vf.3770@tornado.socal.rr.com...
> > I am looking to add "wavetable" playback of sampled music/instruments to an
> > existing project. The design already uses an FPGA & has an audio CODEC
> > which supports playback at 48 kHz.
> >
> > I want to implement pitch-shifting and mimic higher/lower notes by altering
> > the playback rate of the samples. The pitch-shift is in the range of -2x
> > to 2x (-2*sampling_rate to +2*sampling_rate) controlled by a pot, probably
> > discretized to a signed 8-bit quantity.
> >
> > I am having trouble conceptualizing what I need to implement the
> > pitch-shifting.
> >
> > Do I just create a polyphase filtering structure which allows me to
> > up-sample/down-sample the raw audio and match it with the CODEC's 48 kHz
> > rate?
> >
> > I do want to maintain as much audio quality as possible.
> >
> > My initial guess was something along the lines of a DDS/phase accumulator
> > that controls the rate at which samples are plucked from the ROM, however
> > that does not address the fact the CODEC still expects data @ 48 kHz.
>
> You are on the right track. You output a sample at 48kHz, but move through the
> ROM table at a faster or slower rate depending on if you want to shift the pitch
> up or down. When between samples, you interpolate. Higher quality
> interpolation schemes will give better audio quality. This usually means
> looking at more samples around the interpolation point.

... and figuring out how you're gonna combine them for a particular interpolated value.

the other issue is whether or not you want your musical note to get shorter when upshifted and to get longer when downshifted. if yes, then what you're really referring to is Sampling "synthesis" or really PCM sample playback. if no, you might want to take a look at my

http://www.musicdsp.org/files/Wavetable-101.pdf

to get an idea of what Wavetable synthesis was originally defined to be and it's not quite the same.

r b-j
"R Jones" <donot@emailme.here> wrote in message 
news:VEjFe.31455$aA5.5258@tornado.socal.rr.com...
> Hi Jon,
>
>> I'm assuming your original wavetable is all sampled at 48kHz. Correct me
>> if I'm wrong.
>
> Correct.
>
>>> My initial guess was something along the lines of a DDS/phase accumulator
>>> that controls the rate at which samples are plucked from the ROM, however
>>> that does not address the fact the CODEC still expects data @ 48 kHz.
>>
>> You are on the right track. You output a sample at 48kHz, but move
>> through the ROM table at a faster or slower rate depending on if you want
>> to shift the pitch up or down. When between samples, you interpolate.
>> Higher quality interpolation schemes will give better audio quality.
>> This usually means looking at more samples around the interpolation point.
>
> Ok, so the DDS attached to the ROM "scanning" is good.
>
> Any suggestions on the interpolation itself? What techniques should I look
> into? Polyphase is the technique that I've "heard" is overall the best.

All interpolation really just comes down to low-pass filtering. Polyphase is one implementation of a low-pass filter that works well for interpolation. On a DSP, it is the most efficient method in terms of execution time when memory is plentiful. I'm not sure how well that translates to an FPGA, which I think you are using.

I would start by reading the application notes from Analog Devices on their hardware sample rate converters. Here are a few links:

http://www.analog.com/UploadedFiles/Application_Notes/14452667AN394.pdf
http://www.analog.com/en/content/0,2886,765%255F807%255F9690,00.html
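
To make the "interpolation is low-pass filtering" point concrete, here is a minimal Python sketch of a polyphase interpolator: a Hann-windowed sinc low-pass is sampled at several fractional offsets, giving a bank of short FIR sub-filters, and the fractional part of the read position picks which sub-filter to apply. The choice of 32 phases, 8 taps, a Hann window, per-phase gain normalisation, and edge clamping are all assumptions for illustration and are not taken from the application notes linked above. The cutoff here sits at the input Nyquist frequency, so when shifting pitch upward the cutoff would likely need to be lowered to limit aliasing.

```python
import math


def build_polyphase_bank(num_phases=32, taps=8):
    """Return num_phases FIR sub-filters of `taps` coefficients each."""
    half = taps // 2
    bank = []
    for p in range(num_phases):
        frac = p / num_phases
        coeffs = []
        for j in range(taps):
            # Distance (in input samples) from the output point to input tap j
            t = frac + half - 1 - j
            sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
            window = 0.5 * (1.0 + math.cos(2.0 * math.pi * t / taps))  # Hann
            coeffs.append(sinc * window)
        gain = sum(coeffs)
        bank.append([c / gain for c in coeffs])   # unity DC gain for every phase
    return bank


def polyphase_interp(table, pos, bank):
    """Evaluate the table at fractional index pos using the sub-filter bank."""
    taps = len(bank[0])
    half = taps // 2
    i = int(pos)
    phase = int((pos - i) * len(bank))   # nearest lower phase, no blending between phases
    first = i - half + 1                 # index of the first input sample used
    acc = 0.0
    for j, c in enumerate(bank[phase]):
        k = min(max(first + j, 0), len(table) - 1)   # clamp at the table edges
        acc += c * table[k]
    return acc


# Hypothetical test signal: a sine with 16 samples per cycle
bank = build_polyphase_bank()
table = [math.sin(2.0 * math.pi * k / 16.0) for k in range(64)]
value = polyphase_interp(table, 20.5, bank)
print(value, math.sin(2.0 * math.pi * 20.5 / 16.0))
```

In hardware terms the bank is a coefficient ROM addressed by the top bits of the fractional phase, and each output sample costs `taps` multiply-accumulates, which is the trade-off between memory and arithmetic mentioned above.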

> Would this interpolation filter be clocked by the CODEC clock and receive
> samples whenever the ROM presents it?

The output would be clocked by the CODEC clock, and the input would need to receive samples from the ROM at that same rate. The variable part is the rate at which your ROM "read pointer" advances through the ROM. At 1x speed, it advances one location per output sample. At 1.5x, it advances 1.5 locations per output sample. Obviously this needs to be a "fractional pointer", i.e. it keeps track of both the ROM address and how far you are between samples. I think this is the same idea you had in mind when you talked about a DDS/phase accumulator, which works the same way.

Also, as Robert mentioned in another post, this method not only shifts the pitch of the samples but also changes their length in inverse proportion (a note shifted up an octave plays in half the time). This is a pretty common method, but it doesn't sound natural if you shift the pitch by too much. High quality systems try to use as many pre-recorded notes as possible so that the amount of shifting can be minimized.
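
The fractional read pointer can be sketched as a fixed-point phase accumulator. The Python below is a minimal illustration under assumed details: a 16.16 fixed-point format, linear interpolation between neighbouring samples, positive rates only (reverse playback for negative rates is not handled), and a hypothetical function name `play`. Each loop iteration stands for one CODEC sample period.

```python
FRAC_BITS = 16                 # assumed 16.16 fixed-point phase format
ONE = 1 << FRAC_BITS


def play(table, rate):
    """Read `table` at playback `rate` (> 0), one output per CODEC sample."""
    increment = int(round(rate * ONE))   # phase step per output sample
    if increment <= 0:
        raise ValueError("this sketch only handles positive rates")
    phase = 0                  # upper bits: ROM address, lower bits: fraction
    last = len(table) - 1
    out = []
    while (phase >> FRAC_BITS) < last:
        addr = phase >> FRAC_BITS
        frac = (phase & (ONE - 1)) / ONE
        # Linear interpolation between the two neighbouring ROM samples
        out.append(table[addr] + frac * (table[addr + 1] - table[addr]))
        phase += increment
    return out


ramp = [float(k) for k in range(9)]
print(play(ramp, 1.5))         # read pointer advances 1.5 locations per output
print(len(play(ramp, 0.5)), len(play(ramp, 1.0)), len(play(ramp, 2.0)))
```

The output lengths show the side effect described above: the same table produces fewer output samples at higher rates and more at lower rates, so the note gets shorter or longer as the pitch moves.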