DSPRelated.com
Forums

Help for sampling rate converter

Started by hsquared January 24, 2008
On Jan 27, 6:48 pm, Michel Rouzic <Michel0...@yahoo.fr> wrote:
> Erratum :
>
> Michel Rouzic wrote:
> > it makes interpolation by a non-integer ratio much *SOUND* harder than it is.
> >
> > weighted_value = original_signal[indexX] * (0.42 - 0.5*cos(2*pi * (indexX-indexA) / sincsize) + 0.08*cos(4*pi * (indexX-indexA) / sincsize)) * sin(2*pi*0.5 * (indexX-indexA)) / (indexX-indexA)
>
> More like :
>
> weighted_value = original_signal[indexX] * (0.42 - 0.5*cos(2*pi * ((indexX-indexA) / sincsize + 0.5)) + 0.08*cos(4*pi * ((indexX-indexA) / sincsize + 0.5))) * sin(2*pi*0.5 * (indexX-indexA)) / (indexX-indexA)
>
> I always forget that the Blackman function doesn't peak at x = 0.0 but at half its size.
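A minimal C sketch of the corrected weight above (the names indexX, indexA and sincsize follow the thread; the function name is only illustrative, and the sinc here is normalized by pi so the kernel has unit gain, whereas the pseudo-code's sin(pi*x)/x carries an extra constant factor of pi):

#include <math.h>

#define PI 3.14159265358979323846

/* Blackman-windowed, unit-gain sinc weight for the original sample at
   integer position indexX, relative to the fractional target position
   indexA.  sincsize is the window length in input samples.  Illustrative
   helper, not code from the thread. */
double windowed_sinc_weight(int indexX, double indexA, int sincsize)
{
    double x = (double)indexX - indexA;      /* offset from the window centre */
    if (fabs(x) > 0.5 * sincsize)
        return 0.0;                          /* outside the window            */
    /* shift by half the window size so the Blackman function peaks at x = 0 */
    double u = x / sincsize + 0.5;
    double w = 0.42 - 0.5 * cos(2.0 * PI * u) + 0.08 * cos(4.0 * PI * u);
    /* normalized sinc; the thread's sin(pi*x)/x is pi times this value */
    double s = (x == 0.0) ? 1.0 : sin(PI * x) / (PI * x);
    return w * s;
}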
Hi all, thank you very much. The advice helped me work through the problem, and I hope to finish by the deadline.
Michel Rouzic wrote:
> Rick Lyons wrote:
> > you can upsample by 320, lowpass filter, and then downsample by 147.
>
> Wow, excuse me, but why do you suggest that?
Because it is The Correct Way (tm)? :-)
> That's horrible to > encourage such practices, and it makes interpolation by a non-integer > ratio much harder than it is. Here's what I would suggest instead :
Michel, two points:

1. What you describe below is _exactly_ the same procedure that Rick described above (see the posts that Ron and I exchanged).

2. Irrational sampling rate changes are impossible with computers. The best you can do are rational approximations.

Regards, Andor
> First, define how precise a filtering you want by choosing how large a windowed sinc function you would use (in numbers of bins). We'll call this value sincsize (sorry about the poor naming scheme). Go through each sample of your new, interpolated signal you're trying to create. Now, for each of these new samples, calculate the position it matches to in the original signal (for example, if you're at sample index 100 in the new signal, you'll get the value 100 * 11025 / 24000 = 45.9375, but let's call this value indexA).
>
> Now, go through each original sample between indexes calculated by indexA - (sincsize/2) (rounded up to the next integer number) and indexA + (sincsize/2) (rounded down, or if you prefer truncated). Weight the value at these indexes (which we'll call indexX) using a windowed sinc centered on indexA. The formula I'd give for that would be, using the Blackman function as a window, in C-ish pseudo-code:
>
> weighted_value = original_signal[indexX] * (0.42 - 0.5*cos(2*pi * (indexX-indexA) / sincsize) + 0.08*cos(4*pi * (indexX-indexA) / sincsize)) * sin(2*pi*0.5 * (indexX-indexA)) / (indexX-indexA)
>
> You would sum all of these weighted values into the value in the new signal at indexA, and do that for every iteration of indexA. Thus, you would simply and efficiently interpolate by a non-integer resampling ratio using a variant of time-domain convolution with a windowed sinc FIR. I only hope I didn't lose anyone during the course of my explanation.
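A minimal C sketch of that procedure, using the windowed_sinc_weight() helper from the sketch after the erratum above; 11025 and 24000 are the rates from this thread, and the function names and edge handling are illustrative only:

#include <math.h>

double windowed_sinc_weight(int indexX, double indexA, int sincsize);  /* from the earlier sketch */

/* Resample in[0..in_len-1] to out[0..out_len-1] by convolving each output
   position with a windowed sinc centred on its fractional source index,
   as described above.  Sketch only; edge handling is simply "skip". */
void resample(const double *in, int in_len, double *out, int out_len,
              double in_rate, double out_rate, int sincsize)
{
    for (int n = 0; n < out_len; n++) {
        double indexA = n * in_rate / out_rate;          /* e.g. 100 * 11025 / 24000 = 45.9375 */
        int first = (int)ceil(indexA - 0.5 * sincsize);  /* rounded up                         */
        int last  = (int)floor(indexA + 0.5 * sincsize); /* truncated                          */
        double acc = 0.0;
        for (int indexX = first; indexX <= last; indexX++) {
            if (indexX < 0 || indexX >= in_len)
                continue;                                /* skip samples outside the input     */
            acc += in[indexX] * windowed_sinc_weight(indexX, indexA, sincsize);
        }
        out[n] = acc;
    }
}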

Andor wrote:
> Michel Rouzic wrote:
> > Rick Lyons wrote:
> > > you can upsample by 320, lowpass filter, and then downsample by 147.
> >
> > Wow, excuse me, but why do you suggest that?
>
> Because it is The Correct Way (tm)?
You're just kidding, right?
> > That's horrible to encourage such practices, and it makes interpolation by a non-integer ratio much harder than it is. Here's what I would suggest instead :
>
> Michel, two points:
>
> 1. What you describe below is _exactly_ the same procedure that Rick described above (see the posts that Ron and I exchanged).
I completely fail to see how this is similar. Surely it might give similar results, but in the meantime my method doesn't involve hogging up hundreds of times more memory than necessary.
> 2. Irrational sampling rate changes are impossible with computers. The > best you can do are rational approximations.
I absolutely and unequivocally fail to see how this is impossible. It seems to me that my suggestion shows it is possible. What difference in result would there be between my method and the ideal result?
Michel Rouzic wrote:
> Andor wrote:
> > Michel Rouzic wrote:
> > > Rick Lyons wrote:
> > > > you can upsample by 320, lowpass filter, and then downsample by 147.
> > >
> > > Wow, excuse me, but why do you suggest that?
> >
> > Because it is The Correct Way (tm)?
>
> You're just kidding, right?
No.
> > > That's horrible to encourage such practices, and it makes interpolation by a non-integer ratio much harder than it is. Here's what I would suggest instead :
> >
> > Michel, two points:
> >
> > 1. What you describe below is _exactly_ the same procedure that Rick described above (see the posts that Ron and I exchanged).
>
> I completely fail to see how this is similar. Surely it might give similar results, but in the meantime my method doesn't involve hogging up hundreds of times more memory than necessary.
Once you notice that most of the memory is filled with zeros, and implicitly assume the zeros without actually storing them, you arrive at the polyphase implementation of the upsampling-filter-downsampling scheme. It requires just as much (or as little) memory as your suggested idea.
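For 11025 Hz -> 24000 Hz the exact ratio is 320/147 (both rates divided by their gcd, 75). A C sketch of the polyphase form, to illustrate the equivalence described above: the zero-stuffed signal never exists, and each output sample only multiplies real input samples by the filter taps that line up with them. The lowpass h[] is assumed to be designed at 320 * 11025 Hz with a length that is a multiple of 320; all names are illustrative.

/* Polyphase implementation of "upsample by L = 320, lowpass filter,
   downsample by M = 147" for 11025 Hz -> 24000 Hz.  h[] is an
   anti-imaging lowpass designed at L * 11025 Hz, h_len a multiple of L.
   Sketch only: the stuffed zeros are never stored or multiplied. */
#define L 320
#define M 147

void resample_polyphase(const double *x, int x_len, double *y, int y_len,
                        const double *h, int h_len)
{
    int taps_per_phase = h_len / L;
    for (int n = 0; n < y_len; n++) {
        long m = (long)n * M;        /* position on the virtual upsampled grid */
        int base  = (int)(m / L);    /* newest input sample involved           */
        int phase = (int)(m % L);    /* which of the L coefficient subsets     */
        double acc = 0.0;
        for (int k = 0; k < taps_per_phase; k++) {
            int i = base - k;
            if (i >= 0 && i < x_len)
                acc += h[phase + k * L] * x[i];   /* only non-zero inputs */
        }
        y[n] = L * acc;   /* gain of L makes up for the inserted zeros */
    }
}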
> > 2. Irrational sampling rate changes are impossible with computers. The best you can do are rational approximations.
>
> I absolutely and unequivocally fail to see how this is impossible.
Because you cannot evaluate irrational numbers on a computer (at least not in a finite amount of time). You always use rational approximations.
> It seems to me that my suggestion shows it is possible. What difference > in result would there be between my method and the ideal result?
You can make the difference arbitrarily small by adjusting the rational approximation as you go along (this is called asynchronous sampling rate conversion), but you can never make it zero. I guess this also goes for rational sampling rate changes (because the sinc function is in fact irrational at most rational time instants). So admittedly, this is a weak objection. The idea you suggest is also described in that link I posted in my reply to Ron. There, the interpolation filter is continuously updated (by interpolating the kernel, just as you suggest) to reflect the changing positions of the interpolation locations. And I think Erik did just that in his open source sampling rate conversion software Secret Rabbit Code.

Regards, Andor
Ron N. wrote:

> On Jan 24, 12:29 pm, "hsquared" <hnin...@yahoo.com> wrote:
>
>> Hi,
>>
>> I have a quantized input audio signal file (.wav). I want to convert the sampling rate from 11025 Hz to 24000. I know that I have to upsample the signal, lowpass filter and downsample it.
>
> You don't have to. You could just interpolate any samples needed using various splines, or a Sinc or windowed-Sinc kernel.
As the specified sampling rates were rather low (11025 and 24000 Hz), why does everyone in this and Rick's "Multistage interpolation question" thread presume zero stuffing? I hacked around a little using Scilab's smooth() function to "upsample" by 11 (relatively prime to 11025 and 24000). Then one could do linear interpolation to get the desired sample. This should vastly reduce the filtering problem as there would be no large discontinuities. I suspect it could also be done in real time (even though the OP seemed to imply he was using stored data).
Richard Owlett wrote:
> Ron N. wrote:
> > On Jan 24, 12:29 pm, "hsquared" <hnin...@yahoo.com> wrote:
> >
> >> Hi,
> >>
> >> I have a quantized input audio signal file (.wav). I want to convert the sampling rate from 11025 Hz to 24000. I know that I have to upsample the signal, lowpass filter and downsample it.
> >
> > You don't have to. You could just interpolate any samples needed using various splines, or a Sinc or windowed-Sinc kernel.
>
> As the specified sampling rates were rather low (11025 and 24000 Hz), why does everyone in this and Rick's "Multistage interpolation question" thread presume zero stuffing?
Richard, I've been trying to bang into the heads of several people here that the upsampling-filtering-downsampling procedure for sampling rate changes is the same thing as kernel interpolation (for example, polynomial interpolation).
> I hacked around a little with using Scilab's smooth() function to > "upsample" by 11 (relative prime to 11025 and 24000). Then one could do > linear interpolation to get the desired sample. This should vastly > reduce the filtering problem as there would be no large discontinuities. > I suspect it could also be done in real time (even though OP seemed to > imply he was using stored data).
We discussed polynomial interpolation, specifically linear interpolation, the last time here: http://groups.google.ch/group/comp.dsp/browse_frm/thread/f1963ddc88272f0d/be4f31bb19580aed?#be4f31bb19580aed

The drawback of using polynomial interpolation is that for a given number of coefficients, you only have a bunch of polynomial types to choose from. If you use standard filter design algorithms (in the upsampling-filtering-downsampling procedure), you can usually do better at optimizing the filter to a given criterion. Not all filters can be understood as polynomial interpolators.

Regards, Andor
On Jan 28, 7:58 am, Andor <andor.bari...@gmail.com> wrote:
> Michel Rouzic wrote:
> > Andor wrote:
> > > Michel Rouzic wrote:
> > > > Rick Lyons wrote:
> > > > > you can upsample by 320, lowpass filter, and then downsample by 147.
> > > >
> > > > Wow, excuse me, but why do you suggest that?
> > >
> > > Because it is The Correct Way (tm)?
> >
> > You're just kidding, right?
>
> No.
i think, Andor, we should be careful here, when we speak of upsampling by 320 and downsampling by 147, that we do not actually compute each one of those 320 new samples per input sample, only then to throw away 146 out of 147 of them. i think this waste is what Michel is objecting to. also, when we "zero stuff", we do that conceptually and we actually don't waste the computational effort to multiply those stuffed zeros by sinc-like filter coefficients.
> > > > That's horrible to encourage such practices, and it makes interpolation by a non-integer ratio much harder than it is. Here's what I would suggest instead :
> > >
> > > Michel, two points:
> > >
> > > 1. What you describe below is _exactly_ the same procedure that Rick described above (see the posts that Ron and I exchanged).
> >
> > I completely fail to see how this is similar. Surely it might give similar results, but in the meantime my method doesn't involve hogging up hundreds of times more memory than necessary.
>
> Once you notice that most of the memory is filled with zeros, and implicitly assume the zeros without actually storing them, you arrive at the polyphase implementation of the upsampling-filter-downsampling scheme. It requires just as much (or as little) memory as your suggested idea.
okay, i guess you're making that point here. fine.
> > > 2. Irrational sampling rate changes are impossible with computers. The best you can do are rational approximations.
> >
> > I absolutely and unequivocally fail to see how this is impossible.
>
> Because you cannot evaluate irrational numbers on a computer (at least not in a finite amount of time). You always use rational approximations.
another thing, and you alluded to it regarding ASRC, even though the computers think in terms of rational numbers and rational SRC ratios, there may be SRC ratios of values that are not an integer divided by the upsampling ratio (320 in this case). your instantaneous value of time can land in between two of those "polyphases" (that's what i like to call them) and you can apply linear interpolation (or some other spline, if you are willing to pay for it) to do the continuous interpolation in between. actually linear interpolation (or drop-sample "interpolation") can be considered a 1st-order (or 0th-order) B-spline. also a 1st-order (or 0th-order) Lagrange polynomial (another family of interpolating curves) or 1st-order (or 0th-order) Hermite polynomials (these different methods start looking different from each other as you get to higher orders, like 3rd-order). anyway, since memory is reasonably cheap, i have never felt the need to go higher than linear interpolation. if we upsample by a factor of 512 and use linear interpolation to get in between those closely spaced little samples, then you can get 120 dB S/N. if you use drop sample (i.e. *no* interpolation), you need to upsample by a factor 512K to get the same S/N, which might not be so bad if you have lotsa bytes to burn (and the computations are half that of linear interpolation). i have *once*, with a test C program (that never found its way to a product), done the same for a 3rd-order B-spline (as suggested by another paper by Zoelzer), and even though i didn't need to upsample as much, i didn't think the memory saved was worth the extra computation per output sample. i think linear interpolation between the upsampled samples is a good compromize point. Michel (and/or whomever is the OP), Duane Wise and i wrote a little paper about this a decade ago and since Olli Niemitalo wrote a better paper that drew on the same concepts. there are some other SRC and interpolation papers, including the Julius Smith (and someone Gossett) paper. do you have them? lemme know if you need me to send you something. r b-j
On Jan 28, 10:57 am, robert bristow-johnson <r...@audioimagination.com> wrote:
(snip)
> anyway, since memory is reasonably cheap, i have never felt the need to go higher than linear interpolation. if we upsample by a factor of 512 and use linear interpolation to get in between those closely spaced little samples, then you can get 120 dB S/N. if you use drop sample (i.e. *no* interpolation), you need to upsample by a factor 512K to get the same S/N, which might not be so bad if you have lotsa bytes to burn (and the computations are half that of linear interpolation). i have *once*, with a test C program (that never found its way to a product), done the same for a 3rd-order B-spline (as suggested by another paper by Zoelzer), and even though i didn't need to upsample as much, i didn't think the memory saved was worth the extra computation per output sample.
This, of course, depends on details of the processor and system implementation, such as the data cache size and the cache miss penalty. I've worked with systems where main memory could be so far away in terms of CPU cycles that it was faster to compute a transcendental function than to do a large random table lookup. On a modern PC one can easily compute sin(x)/x many orders of magnitude faster than the audio sample rate. So sometimes, in non-battery powered DSP experiments, one might not have to bother with coding even a small filter table. See: http://www.nicholson.com/rhn/dsp.html for one example Q&D resampling method. I've run this type of code, still in audio real-time, in slow slow interpreted Basic on a GHz PC/Mac. IMHO. YMMV. -- rhn A.T nicholson d.0.t C-o-M
robert bristow-johnson wrote:

(someone wrote)

>>>>>> you can upsample by 320, lowpass filter, and then downsample by 147.
(snip)
> i think, Andor, we should be careful here, when we speak of upsampling > by 320 and downsampling by 147, that we do not actually compute each > one of those 320 new samples per input sample, only then to throw away > 146 out of 147 of them. i think this waste is what Michel is > objecting to. also, when we "zero stuff", we do that conceptually and > we actually don't waste the computational effort to multiply those > stuffed zeros by sinc-like filter coefficients.
I think this problem (actually the one from 44.1kHz to 48kHz) was the one that got me interested in DSP some years ago. Fourier transforms are taught in physics, but the rest of DSP is not. When DAT tapes first came out, the 48kHz sampling rate was supposedly chosen to make sample rate conversion from 44.1kHz CDs hard. I wanted to find out just how hard that would be.

It seems, though, that there are 'good enough' algorithms that are now implemented in one chip to convert from almost any sampling rate to another, commonly used on inputs to digital recorders. As I understand it, they interpolate to some high rate such as 1MHz and then either select the nearest sample or linearly interpolate from that. It works even if the two sources have different clocks, and so the conversion is not synchronous.

-- glen
On Jan 28, 3:24 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> It seems, though, that there are 'good enough' algorithms that are now implemented in one chip to convert from almost any sampling rate to another, commonly used on inputs to digital recorders. As I understand it, they interpolate to some high rate such as 1MHz and then either select the nearest sample or linearly interpolate from that. It works even if the two sources have different clocks, and so the conversion is not synchronous.
if you mean the sorta seminal chips that Bob Adams designed (i think the AD1890, et. al.), they don't actually upsample and then downsample, throwing away unused samples created in the upsampling process.

they have a circular buffer, an input pointer that increments at a rate of one buffer sample per input sample (so this pointer always has an integer value), and an output pointer that follows the input pointer, has an integer and fractional value, and increments at an adjustable rate that is more than one buffer sample per output sample (if you're downsampling) or less than one buffer sample per output sample (if you're upsampling).

that output pointer, with integer and fractional components to it, has the increment (which also has integer and fractional components) added to it every output sample. then the pointer is split into the integer and fractional components (simple masking of bits), and the integer component is what is used to point to the place in the circular buffer where you're getting your samples from. the fractional part is used to determine which polyphase filter coefficients to use for that particular fractional delay. the leftmost, say, 9 bits of the fractional part are used to pick one of 512 coefficient sets (actually two, for linear interpolation), and the bits of the fractional part that are right of the 9 bits used to choose the coefficients are used to interpolate between the two neighboring delays (out of 512), perhaps using linear interpolation.

let's say that you're downsampling. the output pointer advances less often per second than does the input pointer, but it takes larger steps; nominally the step size is the ratio of the input Fs to the output Fs. if the increment of the output pointer is a little high, it will catch up with the input pointer and buffer pointer wrapping will occur (bad). so the position of the output pointer is compared to half of the buffer size behind the input pointer. if the position of the output pointer is ahead of that half-way point in the buffer, the increment is reduced slightly to slow it down. likewise, when the position of the output pointer is too slow, the increment is increased slightly. it's a servo-mechanism control loop.

r b-j
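A much-simplified C sketch of that kind of servo-controlled pointer arrangement (fixed-point output pointer with integer and fractional parts, increment nudged to keep it near the half-buffer point). The constants, the Q32 format and the drop-sample "filter" stub are all placeholders; a real converter would use the fractional part to drive a polyphase filter like the one sketched earlier, and the actual chips do considerably more.

#include <stdint.h>

#define BUF_SIZE  1024     /* circular buffer length                          */
#define FRAC_BITS 32       /* output pointer kept as integer.fraction (Q32)   */

typedef struct {
    double   buf[BUF_SIZE];
    uint64_t in_ptr;       /* integer input pointer, one step per input sample */
    uint64_t out_ptr;      /* output pointer, integer and fractional parts     */
    uint64_t inc;          /* increment, nominally Fs_in/Fs_out in Q32         */
} asrc_t;

void asrc_put(asrc_t *a, double sample)
{
    a->buf[a->in_ptr % BUF_SIZE] = sample;
    a->in_ptr++;
}

double asrc_get(asrc_t *a)
{
    uint64_t idx  = a->out_ptr >> FRAC_BITS;       /* integer part    */
    uint64_t frac = a->out_ptr & 0xffffffffu;      /* fractional part */

    /* placeholder for the polyphase / fractional-delay filtering described
       above; drop-sample is used here only to keep the sketch short */
    double y = a->buf[idx % BUF_SIZE];
    (void)frac;

    /* servo: compare how far the output pointer trails the input pointer
       with half the buffer size, and nudge the increment accordingly */
    int64_t lag = (int64_t)(a->in_ptr - idx);
    if (lag < BUF_SIZE / 2)
        a->inc -= a->inc >> 20;    /* output pointer too far ahead: slow it   */
    else if (lag > BUF_SIZE / 2)
        a->inc += a->inc >> 20;    /* output pointer falling behind: speed up */

    a->out_ptr += a->inc;          /* advance by the integer.fraction step    */
    return y;
}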