Arbitrary asynchronous (plesiochronous?) resampling in "real time"

Started by snappy April 25, 2005
"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message
news:d4n1ku$5dp$1@blue.rahul.net...
> In article <116q9b3ilte9hfe@corp.supernews.com>,
> Jon Harris <jon_harrisTIGER@hotmail.com> wrote:
> >> >I'd play the sequences "as is" and either skip one sample in the
> >> >longer stream or repeat one sample in the shorter stream, every
> >> >second.
> >> >
> >> >But then, you might not have exact knowledge of how much out
> >> >of sync your streams are at any one time, in which case it could
> >> >be awkward to find synchronization points...
> >>
> >> The problem with the skip/repeat method is that it is very audible. I do
> >> not want the resampling to be noticed at all.
> >>
> >> I have seen some posts about linear interpolation filters that seem to
> >> filter the entire stream to the 'correct' rate. It seems like it is this
> >> continuous method that is the only way not to introduce clicks or pops.
> >
> >Yes, you will need to filter the entire stream so as not to introduce clicks
> >and pops. Linear interpolation is one way (in fact the simplest way) to do this,
>
> Linear interpolation will sound awful. Use windowed-sinc interpolation
> with a precalculated sin/cos table. At 8 kHz you can use up to 15 taps
> and still have a delay of under a millisecond (plus the sound card FIFO
> delay).
Well, it will sound better than drop-sample interpolation! But I agree there are better methods (as I mentioned and you snipped). I have had cases where linear interpolation sounded quite acceptable, but now that I think about it, that was converting from something like 48 kHz to 44.1 kHz. In that case, most of the really bad aliasing is up in the less audible high-frequency range above 10 kHz. With an 8 kHz sample rate, the aliasing is going to be smack dab in the middle of the speech band.
in article 116vinooj0a0ua5@corp.supernews.com, Jon Harris at
jon_harrisTIGER@hotmail.com wrote on 04/27/2005 13:26:

> "robert bristow-johnson" <rbj@audioimagination.com> wrote in message
> news:BE952FF5.6A5B%rbj@audioimagination.com...
>> in article d4n1ku$5dp$1@blue.rahul.net, Ronald H. Nicholson Jr. at
>> rhn@mauve.rahul.net wrote on 04/26/2005 23:41:
>>
>>> Linear interpolation will sound awful. Use windowed-sinc interpolation
>>> with a precalculated sin/cos table. At 8 kHz you can use up to 15 taps
>>> and still have a delay of under a millisecond (plus the sound card FIFO
>>> delay).
>>
>> true, but the OP has another problem which is the asynchronous nature.
>> without getting *both* input and output clocks (as well as the input
>> samples), i can't see how he can do this thing. somehow, he has to derive
>> the sample rate ratio (which can drift a little in an async application).
>> how's he gonna do that without the clocks?
>
> Assuming you have some sort of an input FIFO/buffer, I'm thinking you could
> monitor the buffer empty/full status and adjust the output sample rate to
> maintain a nearly constant level, slowing down if it's nearly empty and
> speeding up if it's nearly full. It will probably end up looking like some
> sort of digital PLL algorithm. It's not trivial, but should be possible
> unless I'm missing something.
no, you're not missing anything in principle. this is, for the most part, how an ASRC chip like the AD1890 does it.

i would say that if "you", or whatever computer is monitoring the buffer empty/full status, can see such a status, it is seeing the clocks. but suppose that, each time an output sample is required, the CPU just sees how full the buffer is, to the nearest integer sample index, and maintains the SRC ratio so that the buffer is always roughly half-full.

now what happens when the two sample rates are 7999 and 8000 Hz? i think that was the problem the OP had. so the SRC ratio is 1 until it finds that it is one sample off (from a 1/2 full buffer). then it speeds up a little or slows down a little until it's correct. now it's back to SRC ratio = 1. but then the next second, it's off by one little sample and it has to do it again. do you want this speed change every second? or if you LPF the SRC ratio heavily to get it to exactly 7999/8000, then do you want your response to a change to be so slow?

i had this same problem but it was 44100 and 44101 or similar. two very nearly identical sample clocks but independent of each other. it was a SHArC doing this, and the SHArC has a clock hooked up to whatever MHz clock is driving the chip. i made use of it.

--
r b-j                  rbj@audioimagination.com
"Imagination is more important than knowledge."
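To put numbers on the 7999 Hz / 8000 Hz case described above, here is a minimal Python sketch (not from the thread). It assumes ideal clocks, input sample k arriving at time k / f_in, one sample read per output tick, and the SRC ratio stuck at exactly 1. The function name `fill_level_steps` and its parameters are made up for illustration.

```python
def fill_level_steps(f_in, f_out, seconds, start_fill):
    """List (output_index, fill) each time the integer FIFO fill changes.

    Assumed model: input sample k arrives at time k / f_in, one sample is
    read per output tick, and the resampling ratio is held at exactly 1.
    """
    steps = []
    last = start_fill
    for n in range(seconds * f_out + 1):
        # Inputs that arrived strictly before output tick n, i.e.
        # ceil(n * f_in / f_out), done in integers to avoid rounding error.
        arrived = (n * f_in + f_out - 1) // f_out
        fill = start_fill + arrived - n
        if fill != last:
            steps.append((n, fill))
            last = fill
    return steps


steps = fill_level_steps(7999, 8000, seconds=3, start_fill=64)
for n, fill in steps:
    print(f"t = {n / 8000:.3f} s  fill = {fill}")
```

Under these assumptions the integer fill level moves by exactly one sample per second and shows nothing in between, so a loop that only sees whole-sample occupancy gets one coarse correction event per second. That is the once-a-second speed change the post asks about.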
"robert bristow-johnson" <rbj@audioimagination.com> wrote in message 
news:BE95BAD0.6ACB%rbj@audioimagination.com...
> in article 116vinooj0a0ua5@corp.supernews.com, Jon Harris at
> jon_harrisTIGER@hotmail.com wrote on 04/27/2005 13:26:
>
>> "robert bristow-johnson" <rbj@audioimagination.com> wrote in message
>> news:BE952FF5.6A5B%rbj@audioimagination.com...
>>> in article d4n1ku$5dp$1@blue.rahul.net, Ronald H. Nicholson Jr. at
>>> rhn@mauve.rahul.net wrote on 04/26/2005 23:41:
>>>
>>>> Linear interpolation will sound awful. Use windowed-sinc interpolation
>>>> with a precalculated sin/cos table. At 8 kHz you can use up to 15 taps
>>>> and still have a delay of under a millisecond (plus the sound card FIFO
>>>> delay).
>>>
>>> true, but the OP has another problem which is the asynchronous nature.
>>> without getting *both* input and output clocks (as well as the input
>>> samples), i can't see how he can do this thing. somehow, he has to derive
>>> the sample rate ratio (which can drift a little in an async application).
>>> how's he gonna do that without the clocks?
>>
>> Assuming you have some sort of an input FIFO/buffer, I'm thinking you could
>> monitor the buffer empty/full status and adjust the output sample rate to
>> maintain a nearly constant level, slowing down if it's nearly empty and
>> speeding up if it's nearly full. It will probably end up looking like some
>> sort of digital PLL algorithm. It's not trivial, but should be possible
>> unless I'm missing something.
>
> no, you're not missing anything in principle. this is, for the most part,
> how such an ASRC chip like the AD1890 does it.
>
> i would say that if "you" or whatever computer that is monitoring the buffer
> empty/full status, if that thing can see such a status, it is seeing the
> clocks. but suppose, each time an output sample is required, the CPU just
> sees how full the buffer is, to the nearest integer sample index, and
> maintains the SRC ratio so that the buffer always is roughly half-full.
>
> now what happens when the two sample rates are 7999 and 8000 Hz? i think
> that was the problem the OP had. so the SRC ratio is 1 until it finds that
> it is one sample off (from 1/2 full buffer). then it speeds up a little or
> slows down a little until it's correct. now it's back to SRC ratio = 1.
> but then the next second, it's off by one little sample and it has to do it
> again. do you want this speed change every second? or if you LPF the SRC
> ratio heavily to get it to exactly 7999/8000, then do you want your response
> to adapt to a change to be so slow?
>
> i had this same problem but it was 44100 and 44101 or similar. two very
> nearly identical sample clocks but independent of each other. it was a
> SHArC doing this and the SHArC has a clock hooked up to the whatever MHz
> clock driving the chip. i made use of it.
I've heard that various SRC chips don't do well with ratios near 1:1, and never really understood why, until now. Thanks, rb-j, for illustrating some of these difficulties.

But if you know quite a bit about the characteristics of the 2 sample rates, you can make good use of that in your PLL filtering algorithm. If you know the 2 signals are "plesiochronous", you can use a very heavily filtered response, knowing you won't have to worry about tracking large variations. A bigger buffer also lets you move more slowly without danger of over/under-running.

If you don't know _anything_ about the characteristics of the 2 sample rates, like the designers of the 1890 family, I can see why this could be a very difficult problem to solve. Maybe some kind of a multi-speed non-linear approach coupled with a big FIFO?
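As a rough illustration of the "very heavily filtered" loop suggested above, here is a minimal Python sketch of a proportional-integral (PI) controller that steers the SRC ratio from the FIFO fill error. It is only a sketch under loud assumptions: the fill error is available with fractional-sample resolution (which the integer-occupancy case discussed earlier does not give you), the true rate ratio is constant, and the name `track_ratio` and the gains `kp` and `ki` are hypothetical values picked for the example, not anything from the AD1890 or from the posts.

```python
def track_ratio(true_ratio, n_ticks, kp=2e-3, ki=1e-6):
    """Run a PI loop on the FIFO fill error.

    Returns (final SRC ratio, peak |fill error| in samples).
    Assumption: the fill error is observable with fractional resolution.
    """
    err = 0.0      # fill level minus target, in samples
    integ = 0.0    # running sum of the error (integral term)
    ratio = 1.0    # input samples consumed per output tick
    peak = 0.0
    for _ in range(n_ticks):
        integ += err
        # Consume faster when the FIFO is too full, slower when too empty.
        ratio = 1.0 + kp * err + ki * integ
        # Per output tick: true_ratio samples arrive, ratio samples leave.
        err += true_ratio - ratio
        peak = max(peak, abs(err))
    return ratio, peak


ratio, peak = track_ratio(7999 / 8000, n_ticks=80000)
print(f"settled ratio = {ratio:.9f}, peak fill error = {peak:.4f} samples")
```

With gains this small the loop poles sit near 0.999, so it takes on the order of a thousand output ticks to settle. Smaller gains give a smoother ratio but slower tracking and a larger buffer excursion, which is the trade-off between heavy filtering and buffer size described in the post.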
>..
>i would say that if "you" or whatever computer that is monitoring the buffer
>empty/full status, if that thing can see such a status, it is seeing the
>clocks. but suppose, each time an output sample is required, the CPU just
>sees how full the buffer is, to the nearest integer sample index, and
>maintains the SRC ratio so that the buffer always is roughly half-full.
>
>now what happens when the two sample rates are 7999 and 8000 Hz? i think
>that was the problem the OP had. so the SRC ratio is 1 until it finds that
>it is one sample off (from 1/2 full buffer). then it speeds up a little or
>slows down a little until it's correct. now it's back to SRC ratio = 1.
>but then the next second, it's off by one little sample and it has to do it
>again. do you want this speed change every second? or if you LPF the SRC
>ratio heavily to get it to exactly 7999/8000, then do you want your response
>to adapt to a change to be so slow?
>
>i had this same problem but it was 44100 and 44101 or similar. two very
>nearly identical sample clocks but independent of each other. it was a
>SHArC doing this and the SHArC has a clock hooked up to the whatever MHz
>clock driving the chip. i made use of it.
Thanks for all your comments, guys.

As you all point out, the problem isn't really the arbitrary resampling, but estimating the true sample rate difference between the devices. If I had direct access to the clocks, the problem would be solved, but this is not the case. I might at a later stage have access to the buffers, but at the current stage of development a higher-level approach is required. Since it will be a PC-based software solution, I am not 100% sure how the buffers are handled; this seems very OS/driver specific to me?

Anyhow, considering the higher-level case: I have some information that clears things up a bit. The setup is this. The signal x(n) is D/A converted (by device A), sent over a channel, possibly distorted by an approximately linear filter, and then A/D converted (by device B) into y(n). Device B has access to x(n) in its original form! This is what must be exploited.

By looking at the sample streams graphically, you can see the drift when measuring the distance between transients, comparing the delay at the beginning of the recording with the delay N samples later. This cannot be hard to resolve? Since the sample drift resulting from the clock difference is about 1 sample out of 8000, I don't know if a method based on cross-correlation would do it (unless the data is upsampled)? The drift needs to be estimated accurately for good resampling results, since phase information must be preserved between x(n) and y(n).

Any ideas? All pointers appreciated.

This message was sent using the Comp.DSP web interface on www.DSPRelated.com
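One way the idea above could be tried, sketched in Python purely as an illustration: measure the delay of y(n) relative to x(n) in a short window near the start and again in a window N samples later, refine each cross-correlation peak to a fraction of a sample by fitting a parabola through the three points around it (instead of upsampling), and divide the change in delay by N. Everything here is an assumption for the example: the function names, the window and lag sizes, the synthetic Gaussian test pulses, and in particular the premise that the channel adds the same delay in both windows so that it cancels in the difference.

```python
import math


def xcorr_delay(x, y, max_lag):
    """Delay of y relative to x, in samples, with sub-sample resolution."""
    def corr(lag):
        return sum(x[n] * y[n + lag]
                   for n in range(len(x)) if 0 <= n + lag < len(y))

    lags = list(range(-max_lag, max_lag + 1))
    c = [corr(lag) for lag in lags]
    k = max(range(len(c)), key=c.__getitem__)      # integer-lag peak
    if k == 0 or k == len(c) - 1:
        return float(lags[k])                      # peak at edge: no refinement
    left, mid, right = c[k - 1], c[k], c[k + 1]
    denom = left - 2.0 * mid + right
    # Vertex of the parabola through the three points around the peak.
    frac = 0.5 * (left - right) / denom if denom != 0 else 0.0
    return lags[k] + frac


def drift_per_sample(x, y, start, end, win, max_lag):
    """Change in delay between two windows, per sample of separation."""
    d0 = xcorr_delay(x[start:start + win], y[start:start + win], max_lag)
    d1 = xcorr_delay(x[end:end + win], y[end:end + win], max_lag)
    return (d1 - d0) / (end - start)


# Synthetic check: two transients, y delayed by 2.3 and then 3.3 samples,
# i.e. one sample of drift over 8000 samples.
def pulse(n, centre, sigma=8.0):
    return math.exp(-0.5 * ((n - centre) / sigma) ** 2)

N = 8200
x = [pulse(n, 64) + pulse(n, 8064) for n in range(N)]
y = [pulse(n, 66.3) + pulse(n, 8067.3) for n in range(N)]
drift = drift_per_sample(x, y, start=0, end=8000, win=128, max_lag=8)
print(f"estimated drift: {drift * 8000:.4f} samples per 8000")
```

The parabolic fit has a small bias that depends on the pulse shape, and real speech plus channel filtering will behave less nicely than these smooth pulses, so averaging over many window pairs, or over a longer baseline N, would likely be needed in practice.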
I have a question about this kind of PLESIOCHRONOUS re-sampling being
discussed here.

It was mentioned that simply deleting or repeating a sample now and
then is a poor solution because it adds clicks.

It was mentioned that linear interpolation is not ideal.  Why?  Because
it adds noise? or distortion?

I presume the "ideal" digital solution is to use a higher order
interpolation algorithm.  Is this correct?

And here is my real question.  I understand that re-sampling by
converting the signal back to analog, reconstruction filtering, then
re-sampling in an A/D is considered kludgey.   But assuming we didn't
care that it was kludgey and the A/D and D/A are good, in terms of
audio quality, how does the D/A > A/D method compare to the other
methods mentioned above?

Is the D/A >A/D method theoretically equivalent to a very high order
digital interpolation?  It seems we need to compare the operation of
the digital interpolation filter vs the operation of the analog
reconstruction filter.

In practice, how complex an interpolation would be needed to get
audio performance that is as good as or better than what can be obtained
using a typical D/A > A/D?

thanks

And to the O/P.  I suggest reading this Analog Devices 1895 data sheet:

http://www.analog.com/UploadedFiles/Data_Sheets/326447608AD1895_b.pdf


Mark

>It was mentioned that simply deleting or repeating a sample now and
>then is a poor solution because it adds clicks.
Correct.
>It was mentioned that linear interpolation is not ideal. Why? Because
>it adds noise? or distortion?
Yes, distortion. Linear interpolation's frequency response is not good enough for high-quality interpolation. It smooths the signal (it acts as a low pass on the highest frequencies of the passband) and it does not attenuate the spectral images above the Nyquist frequency enough to avoid imaging (similar to aliasing, a kind of distortion).
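To give a feel for the size of both effects, here is a small Python sketch. It relies on the standard result that linear interpolation is equivalent to convolving the sample impulses with a triangular kernel, whose magnitude response is sinc^2(f/fs). The 8 kHz rate comes from the thread; the 3.4 kHz test tone and the function name are assumptions for the example.

```python
import math


def linear_interp_gain_db(f_hz, fs_hz):
    """Gain in dB of the triangular (linear interpolation) kernel: sinc^2(f/fs)."""
    x = f_hz / fs_hz
    if x == 0:
        return 0.0
    s = math.sin(math.pi * x) / (math.pi * x)
    mag = s * s
    return 20.0 * math.log10(mag) if mag > 0 else float("-inf")


fs = 8000.0
tone = 3400.0                 # near the top of the telephone speech band
image = fs - tone             # first spectral image of that tone
droop = linear_interp_gain_db(tone, fs)
leak = linear_interp_gain_db(image, fs)
print(f"tone  at {tone:.0f} Hz: {droop:.2f} dB")
print(f"image at {image:.0f} Hz: {leak:.2f} dB")
```

For these numbers the wanted tone is attenuated by roughly 5.5 dB while its image at 4.6 kHz is only down by roughly 10.8 dB, so the image ends up only about 5 dB below the tone it came from. When the output rate is close to the input rate, that image folds back into the audio band as aliasing.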
>I presume the "ideal" digital solution is to use a higher order
>interpolation algorithm. Is this correct?
Yes. The ideal lowpass filter is a rect function from -0.5 to 0.5 (normalized frequency). Its inverse Fourier transform is the sinc pulse. But since the sinc pulse is infinitely long, you have to end it somewhere, preferably by windowing so that it tapers off smoothly.
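A minimal Python sketch of that idea, for illustration only: evaluate the signal at a fractional sample index by summing nearby samples weighted by a sinc that is tapered with a Hann window. The Hann window, the half width of 7 samples (14 taps, in the neighbourhood of the "up to 15 taps" mentioned earlier in the thread), the 500 Hz test tone, and the function names are all assumptions; a real-time version would read the kernel from a precomputed table rather than calling sin and cos per tap.

```python
import math


def windowed_sinc_interp(samples, t, half_width=7):
    """Estimate the signal at fractional index t with a Hann-windowed sinc."""
    centre = math.floor(t)
    acc = 0.0
    for k in range(centre - half_width + 1, centre + half_width + 1):
        if 0 <= k < len(samples):
            d = t - k                                   # distance to tap k
            sinc = 1.0 if d == 0 else math.sin(math.pi * d) / (math.pi * d)
            hann = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
            acc += samples[k] * sinc * hann
    return acc


def linear_interp(samples, t):
    """Plain linear interpolation, for comparison."""
    i = math.floor(t)
    frac = t - i
    return (1.0 - frac) * samples[i] + frac * samples[i + 1]


fs = 8000.0
tone = [math.sin(2 * math.pi * 500.0 * n / fs) for n in range(64)]
t = 20.5                                                # halfway between samples
exact = math.sin(2 * math.pi * 500.0 * t / fs)
err_sinc = abs(windowed_sinc_interp(tone, t) - exact)
err_lin = abs(linear_interp(tone, t) - exact)
print(f"windowed sinc error: {err_sinc:.5f}")
print(f"linear error:        {err_lin:.5f}")
```

Even for this low 500 Hz tone the windowed sinc lands much closer to the true value than the straight line does, and the gap widens toward the top of the band. This kernel has its cutoff at the input Nyquist frequency, so it suits ratios near 1:1; converting down to a noticeably lower rate would need a lower cutoff.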
>And here is my real question. I understand that re-sampling by
>converting the signal back to analog, reconstruction filtering, then
>re-sampling in an A/D is considered kludgey. But assuming we didn't
>care that it was kludgey and the A/D and D/A are good, in terms of
>audio quality, how does the D/A > A/D method compare to the other
>methods mentioned above.
Each AD/DA pass introduces some quantization noise. The interpolation will also introduce artifacts such as imaging, but I am not sure how the two kinds of distortion compare.
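For a rough sense of scale on the quantization side, here is a small Python sketch using the textbook approximation that an ideal N-bit quantizer gives about 6.02 N + 1.76 dB of SNR for a full-scale sine, and assuming that noise from successive stages is uncorrelated so that noise powers add. Real converters fall short of the ideal figure, and the function names are made up for the example.

```python
import math


def ideal_snr_db(bits):
    """Textbook SNR of an ideal N-bit quantizer with a full-scale sine input."""
    return 6.02 * bits + 1.76


def cascade_snr_db(stage_snrs_db):
    """Combined SNR when uncorrelated noise from each stage adds in power."""
    noise_power = sum(10.0 ** (-snr / 10.0) for snr in stage_snrs_db)
    return -10.0 * math.log10(noise_power)


one_pass = ideal_snr_db(16)
two_stages = cascade_snr_db([one_pass, one_pass])
print(f"one ideal 16-bit stage:  {one_pass:.2f} dB")
print(f"two equal noise sources: {two_stages:.2f} dB")
```

Under these assumptions a second, equally noisy stage costs about 3 dB. Whether that is better or worse than a digital interpolator depends on how far down the interpolator keeps its images, which is set by the filter length and window.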
Why are new terms added to industry "discussions" without
properly defining them?  I've been looking for a definition
of isochronous for the last few years and have never found
one.  Now someone has added plesiochronous to this group's
discussions.

It's been quite a number of years since streaming tape
drives were introduced (and have since died?).  There never
was a definition of what streaming meant, but in that case
one could eventually determine that streaming meant not
start-stop, but it would have been nice if someone had taken the
time to define what was meant.
http://www.google.com/search?hl=en&q=plesiochronous+definition

"Everett M. Greene" <mojaveg@mojaveg.iwvisp.com> wrote in message
news:20050613.7944A18.F196@mojaveg.iwvisp.com...
> Why are new terms added to industry "discussions" without
> properly defining them? I've been looking for a definition
> of isochronous for the last few years and have never found
> one. Now someone has added plesiochronous to this group's
> discussions.
>
> It's been quite a number of years since streaming tape
> drives were introduced (and have since died?). There never
> was a definition of what streaming meant, but in that case
> one could eventually determine that streaming meant not
> start-stop, but it would have been nice if someone had taken the
> time to define what was meant.
P.S. Forgot this one for isochronous:
http://whatis.techtarget.com/definition/0,,sid9_gci212403,00.html

And this:
http://searchnetworking.techtarget.com/sDefinition/0,,sid7_gci214373,00.html

"Jon Harris" <jon_harrisTIGER@hotmail.com> wrote in message
news:11ascmcidaeouc3@corp.supernews.com...
> http://www.google.com/search?hl=en&q=plesiochronous+definition
>
> "Everett M. Greene" <mojaveg@mojaveg.iwvisp.com> wrote in message
> news:20050613.7944A18.F196@mojaveg.iwvisp.com...
> > Why are new terms added to industry "discussions" without
> > properly defining them? I've been looking for a definition
> > of isochronous for the last few years and have never found
> > one. Now someone has added plesiochronous to this group's
> > discussions.
> >
> > It's been quite a number of years since streaming tape
> > drives were introduced (and have since died?). There never
> > was a definition of what streaming meant, but in that case
> > one could eventually determine that streaming meant not
> > start-stop, but it would have been nice if someone had taken the
> > time to define what was meant.