comp.dsp | Async. sample rate conversion for audio - various methods vs fractional delay filters (Farrow)

Hi,
I am currently investigating the use of fractional delay filters for the
asynchronous sample rate conversion of audio signals. I'm getting quite
confused with the various methods and hopefully some of you guys could
help.

First of all, let's make it clear that this is for 'asynchronous'
conversion. Nothing is known about the input to output ratio, except
that it's within a 'normal audio range - 8x up or down-' and it varies
very slowly. Let's also assume that we have a means to obtain a high
precision (20 bits) output time for every output sample -> we know where
between two input samples we want to get the new sample. Basically,
let's focus on the filtering part of the problem.

Various methods that I think I understand:
1) Interpolate input signal by 4, 8 or 16, followed by some kind of
polynomial interpolator (linear, spline, lagrange...). This works well
if fsout>fsin. If not, we need to add a decimating filter. The bandwidth
of that filter would need to be adjusted depending on the output rate
which I guess is do-able since there are usually only a few fixed audio
frequencies. I guess the interpolating filter could also perform the
downsampling but it would then become a harder filter to design and
implement. This all seems like a hack to me? There must be a filter
structure that would do it all (would this be the fractional delay
filter?).

2) Use a quite large FIR filter to generate something like 2^16
intermediate input sample points and simply pick the closest one we
need. A polyphase approach could be taken so we only calculate say a
64-tap filter when an output sample is requested. This would require
quite a lot of memory to store all the coefficients though. However, the
'decimating' case can be handled nicely by that filter with some
scaling. I believe this is the approach taken by most of the commercial
ICs available (AD1896, CS8420).

3) I'm sure there are other ways to approach the problem, some mixture
of the above two etc...

4) Can this problem be solved more efficiently using fractional delay
filters (Farrow)? I am VERY confused about when those
filters come into play. The only reference I could find about audio is
here (hopefully some of you can read it):
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=870683
but it seems to be designed for only one set of fractional delay
elements (44.1 to 48 or vice-versa). I'm not sure if the exact same
approach would work with a varying delay element - Help! Also, I'm not
quite sure how this differs from method (1) described above. Can a
fractional delay filter also take care of the downsampling part of the
problem without added complexity?

Whouah that's longer than I thought! I know this is quite a lot of
questions, but any opinions would be appreciated! Also, please correct me
if I made any false statements.

Thank you very much,

gretz

Reply by robert bristow-johnson ● June 24, 2008

On Jun 23, 9:22 pm, "gretzteam" <gretzt...@yahoo.com> wrote:
>
> I am currently investigating the use of fractional delay filters for the
> asynchronous sample rate conversion of audio signals. I'm getting quite
> confused with the various methods and hopefully some of you guys could
> help.
>
> First of all, let's make it clear that this is for 'asynchronous'
> conversion. Nothing is known about the input to output ratio, except
> that it's within a 'normal audio range - 8x up or down-' and it varies
> very slowly.

okay, the asynchronous spec means that the SRC ratio is adjusted (by
a servo-control system that attempts to keep the output pointer of a
buffer, a fixed distance in time behind the input pointer).  async
does mean that you need a high precision fractional part to that
output pointer...

> Let's also assume that we have a mean to obtain a high
> precision (20 bits) output time for every output sample -> we know where
> between two input samples we want to get the new sample.

... dunno if that fractional delay needs to be 20 bits, but let's
assume it's continuous.

> Basically, let's focus on the filtering part of the problem.

you mean the "interpolation part of the problem", right?  this is the
part that async SRC ("ASRC") has in common with synchronous SRC (where
the SRC ratio is given and fixed).

> Various methods that I think I understand:
> 1) Interpolate input signal by 4, 8 or 16, followed by some kind of
> polynomial interpolator (linear, spline, lagrange...). This works well
> if fsout>fsin. If not, we need to add a decimating filter. The bandwidth
> of that filter would need to be adjusted depending on the output rate
> which I guess is do-able since there are usually only a few fixed audio
> frequency. I guess the interpolating filter could also perform the
> downsampling but it would now becomes a harder filter to design and
> implement. This all seems like a hack to me? There must be a filter
> structure that would do it all (would this be the fractional delay
> filter?).

when downsampling, the same FIR coef table can be used, but you stride
through it at Fs_out/Fs_in times the stride used when upsampling, which
stretches the impulse response and lowers the cutoff.  unlike
downsampling (where that additional LPFing is needed for
anti-aliasing), different SRC ratios for upsampling do not change the
stride in the coef table.
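
A minimal sketch of the stride idea, assuming a Hann-windowed sinc table with 512 entries per zero crossing and a half-width of 16 zero crossings. The table size, the window, and the names `resample_point`, `PHASES` and `HALF_WIDTH` are illustrative choices, not anything specified in this thread. For simplicity it picks the nearest table entry rather than interpolating between entries.

```python
import numpy as np

PHASES = 512      # table entries per zero crossing (assumed value)
HALF_WIDTH = 16   # kernel half-width in zero crossings (assumed value)
CENTER = HALF_WIDTH * PHASES

# Hann-windowed sinc, sampled PHASES times per input-sample interval.
_x = np.arange(-CENTER, CENTER + 1) / PHASES
TABLE = np.sinc(_x) * (0.5 + 0.5 * np.cos(np.pi * _x / HALF_WIDTH))

def resample_point(signal, t, ratio):
    """Return the signal value at fractional input index t.

    ratio is Fs_out / Fs_in. For ratio >= 1 the table stride per input
    sample is PHASES; for ratio < 1 it shrinks to ratio * PHASES, which
    stretches the kernel and lowers its cutoff to about Fs_out / 2.
    """
    scale = min(1.0, ratio)
    stride = scale * PHASES                    # table steps per input sample
    reach = int(np.ceil(HALF_WIDTH / scale))   # input samples on each side
    k0 = int(np.floor(t))
    acc = 0.0
    for k in range(k0 - reach, k0 + reach + 1):
        if not 0 <= k < len(signal):
            continue                           # treat out-of-range input as zero
        idx = CENTER + int(round((t - k) * stride))  # nearest table entry
        if 0 <= idx < len(TABLE):
            acc += signal[k] * TABLE[idx]
    return acc * scale                         # gain correction keeps DC gain near 1
```

Note that the number of input samples touched per output grows as the ratio drops, so the per-output cost rises when downsampling even though the table itself is unchanged.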

> 2) Use a quite large FIR filter to generate something like 2^16
> intermediate input sample points and simply pick the closest one we
> need. A polyphase approach could be taken so we only calculate say a
> 64-tap filter when an output sample is requested. This would require
> quite a lot of memory to store all the coefficients though.

you can linearly interpolate between adjacent phases.  you don't need
more than 512 phases (or equally spaced fractional delays) if you
linearly interpolate (which doubles the FIR costs).  at least not for
audio apps (130 dB S/N)
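
As a rough check on the 512-phase idea, the sketch below builds a 512-phase table and measures how far a linearly interpolated coefficient lands from the directly computed kernel value. The Hann-windowed sinc with a half-width of 16 and the names `kernel` and `coef_lerp` are assumptions made for illustration. It measures per-coefficient error only, which is not the same thing as the end-to-end S/N figure quoted above.

```python
import numpy as np

PHASES = 512      # phases per input-sample interval
HALF_WIDTH = 16   # kernel half-width in zero crossings (assumed value)

def kernel(x):
    """Hann-windowed sinc, zero outside |x| < HALF_WIDTH."""
    x = np.asarray(x, dtype=float)
    w = np.where(np.abs(x) < HALF_WIDTH,
                 0.5 + 0.5 * np.cos(np.pi * x / HALF_WIDTH), 0.0)
    return np.sinc(x) * w

# Table on a grid of 1/PHASES, with one guard entry for the i + 1 lookup.
GRID = np.arange(-HALF_WIDTH * PHASES, HALF_WIDTH * PHASES + 2) / PHASES
TABLE = kernel(GRID)

def coef_lerp(x):
    """Coefficient at offset x in [-HALF_WIDTH, HALF_WIDTH), linearly
    interpolated between the two adjacent table phases."""
    pos = (x + HALF_WIDTH) * PHASES
    i = int(np.floor(pos))
    frac = pos - i
    # Two table reads and one extra multiply per tap: this is the
    # "doubles the FIR costs" part.
    return TABLE[i] + frac * (TABLE[i + 1] - TABLE[i])

rng = np.random.default_rng(0)
xs = rng.uniform(-HALF_WIDTH, HALF_WIDTH, 20000)
approx = np.array([coef_lerp(x) for x in xs])
max_err = np.max(np.abs(approx - kernel(xs)))
print("max coefficient error:", max_err)
```

With a 20-bit fractional time, one possible split is 9 bits to select one of the 512 phases and the remaining 11 bits as the interpolation fraction `frac`.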

> However, the
> 'decimating' case can be handled nicely by that filter with some
> scaling. I believe this is the approach taken by most of the commercial
> ICs available (AD1896, CS8420).

that's what i meant by changing the "stride".

> 3) I'm sure there are other ways to approach the problem, some mixture
> of the above two etc...
>
> 4) Can this problem be solved more efficiently using fractional delay
> filters (farrow) more efficiently? I am VERY confused about when those
> filters come into play. The only reference I could find about audio is
> here (hopefully some of you can read it):http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=870683
> but it seems to be designed for only one set of fractional delay
> elements (44.1 to 48 or vice-versa). I'm not sure if the exact same
> approach would work with a varrying delay element - Help! Also, I'm not
> quite sure how this differs from method (1) described above. Can
> fractional delay filter also take care of the downsampling part of the
> problem without added complexity?
>
> Whouah that's longer than I thought! I know this is quite a lot of
> questions, but any opinions would be appreciated! Also, please correct me
> if I made any false statements.

i forgot what the trick is in the Farrow SRC filters.  can someone
spell out what the salient thing that the Farrow design does over
"conventional"?

r b-j

Reply by Ron N ● June 24, 2008


On Jun 23, 10:04 pm, robert bristow-johnson
<r...@audioimagination.com> wrote:
> On Jun 23, 9:22 pm, "gretzteam" <gretzt...@yahoo.com> wrote:
>
>
>
> > I am currently investigating the use of fractional delay filters for the
> > asynchronous sample rate conversion of audio signals. I'm getting quite
> > confused with the various methods and hopefully some of you guys could
> > help.
>
> > First of all, let's make it clear that this is for 'asynchronous'
> > conversion. Nothing is known about the input to output ratio, except
> > that it's within a 'normal audio range - 8x up or down-' and it varies
> > very slowly.

Each output sample from an "asynchronous" resampler is just a
bandlimited interpolation.  You can treat each point individually
without any reference to the rates and such, as long as you
bandlimit below half the lower of the two sample rates.

> okay, the asynchrounous spec means that the SRC ratio is adjusted (by
> a servo-control system that attempts to keep the output pointer of a
> buffer, a fixed distance in time behind the input pointer).  async
> does mean that you need a high precision fractional part to that
> output pointer...
>
> > Let's also assume that we have a mean to obtain a high
> > precision (20 bits) output time for every output sample -> we know where
> > between two input samples we want to get the new sample.
>
> ... dunno if that fractional delay needs to be 20 bits, but let's
> assume it's continuous.
>
> > Basically, let's focus on the filtering part of the problem.
>
> you mean the "interpolation part of the problem", right?  this is the
> part that async SRC ("ASRC") has in common with synchronous SRC (where
> the SRC ratio is given and fixed).
>
> > Various methods that I think I understand:
> > 1) Interpolate input signal by 4, 8 or 16, followed by some kind of
> > polynomial interpolator (linear, spline, lagrange...). This works well
> > if fsout>fsin. If not, we need to add a decimating filter. The bandwidth
> > of that filter would need to be adjusted depending on the output rate
> > which I guess is do-able since there are usually only a few fixed audio
> > frequency. I guess the interpolating filter could also perform the
> > downsampling but it would now becomes a harder filter to design and
> > implement. This all seems like a hack to me? There must be a filter
> > structure that would do it all (would this be the fractional delay
> > filter?).
>
> when downsampling, the same FIR coef table can be used, but you stride
> through it at a rate of Fs_out/Fs_in compared to the stride when
> upsamples.  unlike for downsampling (when additional LPFing is needed
> for anti-aliasing), different SRC ratios for upsampling do not change
> that stride in the coef table.
>
> > 2) Use a quite large FIR filter to generate something like 2^16
> > intermediate input sample points and simply pick the closest one we
> > need. A polyphase approach could be taken so we only calculate say a
> > 64-tap filter when an output sample is requested. This would require
> > quite a lot of memory to store all the coefficients though.
>
> you can linearly interpolate between adjacent phases.  you don't need
> more than 512 phases (or equally spaced fractional delays) if you
> linearly interpolate (which doubles the FIR costs).  at least not for
> audio apps (130 dB S/N)

You don't need to use a finite number of taps if you can calculate
your filter kernel on the fly (say with a simply windowed Sinc,
or Farrow approximation).

A fast PC can recalculate each tap of a von Hann windowed Sinc
for each new sample fast enough to keep up with several channels
of real time audio.  No table needed.
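
A minimal sketch of the table-free approach, recomputing every tap of a von Hann windowed sinc for each requested output time. The half-width of 16 input samples and the name `hann_sinc_interp` are assumptions for illustration. As written it only covers the case where no extra bandlimiting is needed (output rate at or above the input rate).

```python
import math

def hann_sinc_interp(signal, t, half_width=16):
    """Bandlimited interpolation of signal at fractional index t.

    Every coefficient is computed on the fly, so no table is needed
    and t can have arbitrary precision.
    """
    k0 = math.floor(t)
    acc = 0.0
    for k in range(k0 - half_width + 1, k0 + half_width + 1):
        if not 0 <= k < len(signal):
            continue                      # treat out-of-range input as zero
        x = t - k                         # distance to this input sample
        if x == 0.0:
            s = 1.0                       # sinc(0)
        else:
            s = math.sin(math.pi * x) / (math.pi * x)
        w = 0.5 + 0.5 * math.cos(math.pi * x / half_width)  # von Hann window
        acc += signal[k] * s * w
    return acc
```

The cost is one sine, one cosine and a divide per tap, which is the trade being described: more arithmetic in exchange for no coefficient memory.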

If not, the "phases" are just an interpolation table, and there's
a lot of old literature on how to optimize interpolation tables for
function approximation (finite differences, multi-resolution, and
such...)

> > However, the
> > 'decimating' case can be handled nicely by that filter with some
> > scaling. I believe this is the approach taken by most of the commercial
> > ICs available (AD1896, CS8420).
>
> that's what i meant by changing the "stride".
>
>
>
> > 3) I'm sure there are other ways to approach the problem, some mixture
> > of the above two etc...
>
> > 4) Can this problem be solved more efficiently using fractional delay
> > filters (farrow) more efficiently? I am VERY confused about when those
> > filters come into play. The only reference I could find about audio is
> > here (hopefully some of you can read it):http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=870683
> > but it seems to be designed for only one set of fractional delay
> > elements (44.1 to 48 or vice-versa). I'm not sure if the exact same
> > approach would work with a varrying delay element - Help! Also, I'm not
> > quite sure how this differs from method (1) described above. Can
> > fractional delay filter also take care of the downsampling part of the
> > problem without added complexity?
>
> > Whouah that's longer than I thought! I know this is quite a lot of
> > questions, but any opinions would be appreciated! Also, please correct me
> > if I made any false statements.
>
> i forgot what the trick is in the Farrow SRC filters.  can someone
> spell out what the salient thing that the Farrow design does over
> "conventional"?

A Farrow filter uses polynomial interpolators for each "lobe"
or half lobe of a windowed sinc (or other filter kernel).  Useful
in certain FPGAs and other hardware where pipeline-able MACs
are cheaper than table look-ups.


IMHO. YMMV.
--
rhn A.T nicholson d.0.t C-o-M
 http://www.nicholson.com/rhn/dsp.html

Reply by PFC ● June 24, 2008

> 3) I'm sure there are other ways to approach the problem, some mixture
> of the above two etc...

	Yep.
	The new ESS Sabre DAC seems to use a novel approach. This is a DAC which  
runs on its own clock (low jitter) and it accepts digital audio data in  
SPDIF or I2S with an asynchronous clock (high jitter).
	It is a multibit sigma-delta DAC so the input is highly oversampled  
(don't remember how much) using polyphase filters, then the asynchronous  
resampling is done on the highly oversampled data which allows use of a  
very simple interpolation algorithm.
	This is very clever (and patented).

	It is a good example of lateral thought. Instead of solving a hard  
problem (asynchronous resampling at frequencies close to the Nyquist  
limit) it turns it into an easy problem (asynchronous resampling with a  
sample frequency way above the Nyquist limit).

	The chip incorporates lots of other extremely clever tricks.
	It has been reported as potentially the best sounding chip ever by most  
of those who tried to implement it.

Reply by Robert Adams ● June 24, 2008

On Jun 24, 7:09 am, PFC <li...@peufeu.com> wrote:
> > 3) I'm sure there are other ways to approach the problem, some mixture
> > of the above two etc...
>
>         Yep.
>         The new ESS Sabre DAC seems to use a novel approach. This is a DAC which
> runs on its own clock (low jitter) and it accepts digital audio data in
> SPDIF or I2S with an asynchronous clock (high jitter).
>         It is a multibit sigma-delta DAC so the input is highly oversampled
> (don't remember how much) using polyphase filters, then the asynchronous
> resampling is done on the highly oversampled data which allows use of a
> very simple interpolation algorithm.
>         This is very clever (and patented).
>
>         It is a good example of lateral thought. Instead of solving a hard
> problem (asynchronous resampling at frequencies close to the Nyquist
> limit) it turns it into an easy problem (asynchronous resampling with a
> sample frequency way above the Nyquist limit).
>
>         The chip incorporates lots of other extremely clever tricks.
>         It has been reported as potentially the best sounding chip ever by most
> of those who tried to implement it.

One problem you will encounter in your design is the following.

When the output sample-rate falls below the input sample-rate, there
should be a bandlimiting filter that tracks the output rate so that
input signals that are above fs_out/2 get filtered out.

One reason the IC folks (of which I am one) use the "single fractional-
delay filter with coefficients calculated on-the-fly" approach is that
there are clever ways to stretch the impulse response of this filter
such that the cutoff frequency varies in an almost-continuous fashion.
There will also be a gradual increase in group-delay as this filter
scales down, as you might expect.
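
To put rough numbers on the stretching and the group-delay growth, here is a small sketch for a symmetric windowed-sinc kernel with a half-width of 16 zero crossings. The half-width, the example sample rates, and the name `stretched_kernel_stats` are assumptions, and the cutoff shown is the nominal one (real designs would place it somewhat below fs_out/2 to leave room for the transition band).

```python
import math

def stretched_kernel_stats(fs_in, fs_out, half_width=16):
    """Nominal cutoff (Hz), approximate tap count, and group delay (s)
    of a symmetric kernel stretched by fs_in / fs_out when downsampling."""
    scale = min(1.0, fs_out / fs_in)          # 1.0 when upsampling
    cutoff_hz = 0.5 * fs_in * scale           # tracks fs_out / 2 when downsampling
    taps = 2 * math.ceil(half_width / scale)  # input samples spanned by the kernel
    delay_s = (half_width / scale) / fs_in    # half the kernel length, in seconds
    return cutoff_hz, taps, delay_s

for fs_out in (48000, 24000, 12000):
    cutoff, taps, delay = stretched_kernel_stats(48000, fs_out)
    print(fs_out, cutoff, taps, round(delay * 1e3, 3), "ms")
```

Because the scale factor is a continuous quantity, the cutoff and delay move smoothly with the measured ratio instead of jumping between a few precomputed coefficient sets.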

In many other approaches this can present quite a difficult problem,
and often people end up with multiple sets of coefficients that are
switched in for specific ranges of sample-rates. This may be
acceptable for many consumer applications, or applications where the
sample-rate ratios fall into specific narrow ranges, in which case the
overhead of needing multiple sets of coefficients is not so high.

Bob Adams

Reply by gretzteam ● June 24, 2008

>On Jun 23, 9:22 pm, "gretzteam" <gretzt...@yahoo.com> wrote:
>>
>> I am currently investigating the use of fractional delay filters for the
>> asynchronous sample rate conversion of audio signals. I'm getting quite
>> confused with the various methods and hopefully some of you guys could
>> help.
>>
>> First of all, let's make it clear that this is for 'asynchronous'
>> conversion. Nothing is known about the input to output ratio, except
>> that it's within a 'normal audio range - 8x up or down-' and it varies
>> very slowly.
>
>okay, the asynchrounous spec means that the SRC ratio is adjusted (by
>a servo-control system that attempts to keep the output pointer of a
>buffer, a fixed distance in time behind the input pointer).  async
>does mean that you need a high precision fractional part to that
>output pointer...
>
>> Let's also assume that we have a mean to obtain a high
>> precision (20 bits) output time for every output sample -> we know where
>> between two input samples we want to get the new sample.
>
>... dunno if that fractional delay needs to be 20 bits, but let's
>assume it's continuous.
>
>> Basically, let's focus on the filtering part of the problem.
>
>you mean the "interpolation part of the problem", right?  this is the
>part that async SRC ("ASRC") has in common with synchronous SRC (where
>the SRC ratio is given and fixed).

so far we are on the same page.

>> Various methods that I think I understand:
>> 1) Interpolate input signal by 4, 8 or 16, followed by some kind of
>> polynomial interpolator (linear, spline, lagrange...). This works well
>> if fsout>fsin. If not, we need to add a decimating filter. The bandwidth
>> of that filter would need to be adjusted depending on the output rate
>> which I guess is do-able since there are usually only a few fixed audio
>> frequency. I guess the interpolating filter could also perform the
>> downsampling but it would now becomes a harder filter to design and
>> implement. This all seems like a hack to me? There must be a filter
>> structure that would do it all (would this be the fractional delay
>> filter?).
>
>when downsampling, the same FIR coef table can be used, but you stride
>through it at a rate of Fs_out/Fs_in compared to the stride when
>upsamples.  unlike for downsampling (when additional LPFing is needed
>for anti-aliasing), different SRC ratios for upsampling do not change
>that stride in the coef table.
>
>> 2) Use a quite large FIR filter to generate something like 2^16
>> intermediate input sample points and simply pick the closest one we
>> need. A polyphase approach could be taken so we only calculate say a
>> 64-tap filter when an output sample is requested. This would require
>> quite a lot of memory to store all the coefficients though.
>
>you can linearly interpolate between adjacent phases.  you don't need
>more than 512 phases (or equally spaced fractional delays) if you
>linearly interpolate (which doubles the FIR costs).  at least not for
>audio apps (130 dB S/N)


ok so isn't this saying that methods 1 and 2 are the same? Either you do a
small interpolation (say 8x), followed by a good polynomial
interpolation (3rd order spline or Lagrange). Or you perform a better
interpolation upfront (512x) followed by a simpler polynomial (linear). Or
you do an even better interpolation (2^16), followed by a pretty poor
interpolator (sample and hold). I guess depending on hardware, power and
performance targets, there is a sweet spot in this solution space?


>> However, the
>> 'decimating' case can be handled nicely by that filter with some
>> scaling. I believe this is the approach taken by most of the commercial
>> ICs available (AD1896, CS8420).
>
>that's what i meant by changing the "stride".
>
>> 3) I'm sure there are other ways to approach the problem, some mixture
>> of the above two etc...
>>
>> 4) Can this problem be solved more efficiently using fractional delay
>> filters (farrow) more efficiently? I am VERY confused about when those
>> filters come into play. The only reference I could find about audio is
>> here (hopefully some of you can read it): http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=870683
>> but it seems to be designed for only one set of fractional delay
>> elements (44.1 to 48 or vice-versa). I'm not sure if the exact same
>> approach would work with a varrying delay element - Help! Also, I'm not
>> quite sure how this differs from method (1) described above. Can
>> fractional delay filter also take care of the downsampling part of the
>> problem without added complexity?
>>
>> Whouah that's longer than I thought! I know this is quite a lot of
>> questions, but any opinions would be appreciated! Also, please correct me
>> if I made any false statements.
>
>i forgot what the trick is in the Farrow SRC filters.  can someone
>spell out what the salient thing that the Farrow design does over
>"conventional"?
>
>r b-j
>

Reply by Vladimir Vassilevsky ● June 24, 2008

robert bristow-johnson wrote:

> i forgot what the trick is in the Farrow SRC filters.  can someone
> spell out what the salient thing that the Farrow design does over
> "conventional"?

Farrow filter is the polynomial interpolation by means of the Tailor 
series. What is good about that: all of the derivatives can be computed 
in parallel in the hardware, and then the polynomial can be calculated 
using Horner rule. I.e. it is a good architecture for the hardware 
implementation.
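
A minimal sketch of that structure, assuming a cubic Lagrange interpolator as the underlying polynomial (one common textbook choice, not necessarily what any particular chip or paper uses). The four rows of `FARROW` are fixed FIR sub-filters whose outputs play the role of the polynomial coefficients; the fractional delay `mu` only enters in the Horner evaluation. The names `FARROW` and `farrow_interp` are illustrative.

```python
import numpy as np

# Row m holds the FIR sub-filter that produces the coefficient of mu**m.
# Columns apply to x[n-1], x[n], x[n+1], x[n+2] (cubic Lagrange basis).
FARROW = np.array([
    [ 0.0,  1.0,  0.0,  0.0],
    [-1/3, -1/2,  1.0, -1/6],
    [ 1/2, -1.0,  1/2,  0.0],
    [-1/6,  1/2, -1/2,  1/6],
])

def farrow_interp(signal, n, mu):
    """Interpolate signal at index n + mu, with 0 <= mu < 1.

    Requires 1 <= n <= len(signal) - 3 so that all four taps exist.
    """
    taps = np.asarray(signal[n - 1:n + 3], dtype=float)
    v = FARROW @ taps          # four sub-filter outputs, independent of mu
    y = v[3]
    for m in (2, 1, 0):        # Horner's rule: ((v3*mu + v2)*mu + v1)*mu + v0
        y = y * mu + v[m]
    return y
```

The four dot products do not depend on `mu`, which is why they can run in parallel in hardware, and changing the delay from one output sample to the next costs only the three multiply-adds of the Horner step.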

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com

Reply by Muzaffer Kal ● June 24, 2008

On Tue, 24 Jun 2008 09:06:01 -0500, Vladimir Vassilevsky
<antispam_bogus@hotmail.com> wrote:
>Farrow filter is the polynomial interpolation by means of the Tailor 
>series. 

Do you want a jacket with that sir?

;-)

Reply by Randy Yates ● June 24, 2008

Vladimir Vassilevsky <antispam_bogus@hotmail.com> writes:
> [...]
> Tailor

A tailor is one who mends your clothes. Brook Taylor was an English
mathematician that pioneered the use of series in analysis.
-- 
%  Randy Yates                  % "She has an IQ of 1001, she has a jumpsuit
%% Fuquay-Varina, NC            %            on, and she's also a telephone."
%%% 919-577-9882                % 
%%%% <yates@ieee.org>           %        'Yours Truly, 2095', *Time*, ELO   
http://www.digitalsignallabs.com

Reply by gretzteam ● June 24, 2008

>
>
>robert bristow-johnson wrote:
>
>
>> i forgot what the trick is in the Farrow SRC filters.  can someone
>> spell out what the salient thing that the Farrow design does over
>> "conventional"?
>
>
>Farrow filter is the polynomial interpolation by means of the Tailor 
>series. What is good about that: all of the derivatives can be computed 
>in parallel in the hardware, and then the polynomial can be calculated 
>using Horner rule. I.e. it is a good architecture for the hardware 
>implementation.
>

Ok so if I understand right, this could lead to huge computation savings
compared to method (2) of the original post. Basically, say the output rate
is about 8 times faster than the input rate, we know we will need to
calculate about 8 different output samples for a given set of input
samples. With this Farrow structure, you could do the filtering only once
on the input data, and then only need to do a few multiplies for the 8
different 'delay' values. Is this right?
Also, would there be any advantages in the case where the output rate is
lower than the input rate?
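
A back-of-the-envelope count of multiply-accumulates for the 8x case, under assumptions that are purely illustrative: 64 taps per polyphase branch, a third-order Farrow polynomial, and Farrow sub-filters as long as a polyphase branch. It says nothing about whether the two designs would reach the same quality.

```python
def macs_per_input_polyphase(taps, outputs_per_input):
    """Every output sample runs one full polyphase branch."""
    return taps * outputs_per_input

def macs_per_input_farrow(taps, order, outputs_per_input):
    """Sub-filters run once per input sample; each output then needs
    only the Horner evaluation (order multiply-adds)."""
    return (order + 1) * taps + order * outputs_per_input

print(macs_per_input_polyphase(64, 8))   # 64 * 8
print(macs_per_input_farrow(64, 3, 8))   # 4 * 64 + 3 * 8
```

The sharing only works while several outputs fall between the same pair of input samples. When the output rate is lower than the input rate, each output generally uses a different set of input samples, so the sub-filter work cannot be reused this way, and the plain structure's fixed bandwidth does not provide anti-aliasing on its own.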

gretz.

>
>Vladimir Vassilevsky
>DSP and Mixed Signal Design Consultant
>http://www.abvolt.com
>
