Hi all,
I am in the final stages of a new release of Secret Rabbit Code
aka libsamplerate, my audio sample rate conversion library which
is released under the GNU GPL [0].
The big news about this release is that I found a way to
drastically improve the quality of the SRC_SINC_MEDIUM_QUALITY
and SRC_SINC_BEST_QUALITY converters.
For more info on that, see this blog entry:
http://www.mega-nerd.com/erikd/Blog/CodeHacking/SecretRabbitCode/progress.html
I'm aiming for the release proper to happen next weekend, so
have a bash at it.
Cheers,
Erik
[0] A very reasonable commercial use license is also available.
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"Even Napoleon had his Watergate" -- Michael Spautz
Secret Rabbit Code : New release imminent
Started by ●March 8, 2008
Reply by ●March 9, 2008
On Mar 8, 12:19 am, Erik de Castro Lopo <er...@mega-nerd.com> wrote:
> Hi all,
>
> I am in the final stages of a new release of Secret Rabbit Code
> aka libsamplerate, my audio sample rate conversion library which
> is released under the GNU GPL [0].
>
> The big news about this release is that I found a way to
> drastically improve the quality of the SRC_SINC_MEDIUM_QUALITY
> and SRC_SINC_BEST_QUALITY converters.
>
> For more info on that, see this blog entry:
>
> http://www.mega-nerd.com/erikd/Blog/CodeHacking/SecretRabbitCode/prog...

Erik,

i've been wondering, the mechanics of your SRC is the same as any
polyphase filtering, right? you have a pointer to where the output
sample is to be drawn from out of the input buffer, this pointer has
an integer and fractional part. the integer part tells you which N
adjacent samples to use in the interpolation and the fractional part
tells you which set of N coefficients to use to combine. is that what
you do? for a general SRC ratio, do you compute samples for adjacent
"phases" or "fractional delays" and linearly interpolate (or
interpolate with a higher order spline)?

anyway, other than how you're determining the number of adjacent
samples to include in the interpolation (the length of the FIR, should
be at least N=32), number of uniformly spaced fractional delays (your
"upsample ratio", i would recommend at least 512, if you're linearly
interpolating and 512K if you're drop-sample interpolating), how
you're doing the fractional arithmetic on the pointer (what we usually
call a "phase accumulator"), and how you're interpolating to a
precision beyond that of adjacent fractional delays (linear should be
good enough, if you have at least 512 for your upsampling ratio).

well, other than that sorta mechanical stuff, what else is there in
your improvements, other than the set of coefficients? and, i assume
that someone using SRC could substitute their own set of coefs, no?

anyway, i don't wanna subscribe to any more mailing lists. can we
discuss this here (or at music-dsp)?

r b-j
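The read pointer described in the post above can be sketched in a few lines. This is only an illustration of the integer/fractional split, not code from libsamplerate; the function name `read_positions` and the use of a floating-point accumulator are assumptions made for the example.

```python
import math

def read_positions(ratio, count):
    """Yield (integer part, fractional part) of the input read pointer
    for `count` successive output samples.

    ratio = output rate / input rate, so the pointer advances by
    1 / ratio input samples per output sample. (Hypothetical helper.)
    """
    step = 1.0 / ratio
    pos = 0.0                      # the "phase accumulator"
    for _ in range(count):
        base = math.floor(pos)     # selects the N adjacent input samples
        frac = pos - base          # selects (or interpolates) the coefficient set
        yield base, frac
        pos += step

# Example: 44.1 kHz -> 48 kHz, first few output samples.
for base, frac in read_positions(48000 / 44100, 4):
    print(base, round(frac, 4))
```

Repeatedly adding a floating-point step accumulates rounding error over long runs, which is one reason real converters tend to be careful about how the pointer arithmetic is done.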
Reply by ●March 9, 2008
robert bristow-johnson wrote:
> Erik,
>
> i've been wondering, the mechanics of your SRC is the same as any
> polyphase filtering, right?

Actually, it's just the Julius O. Smith algorithm:

http://ccrma-www.stanford.edu/~jos/resample/

which is not really a traditional polyphase.

> you have a pointer to where the output
> sample is to be drawn from out of the input buffer, this pointer has
> an integer and fractional part. the integer part tells you which N
> adjacent samples to use in the interpolation and the fractional part
> tells you which set of N coefficients to use to combine. is that what
> you do? for a general SRC ratio, do you compute samples for adjacent
> "phases" or "fractional delays" and linearly interpolate (or
> interpolate with a higher order spline)?

For each output sample, the algorithm generates the FIR filter
for that sample before moving to the next output sample. Each FIR
filter is generated by linearly interpolating from a vastly
oversampled prototype filter.

This all sounds like a lot of work (and it is), but allows for
time varying conversion ratios.

> well, other than that sorta mechanical stuff, what else is there in
> your improvements, other than the set of coefficients?

Nope, this is simply the Julius O. Smith algorithm with a specially
designed set of filter coefficients.

> and, i assume
> that someone using SRC could substitute their own set of coefs, no?

Sure.

> anyway, i don't wanna subscribe to any more mailing lists. can we
> discuss this here (or at music-dsp)?

Sure, I still read comp.dsp reasonably regularly (and as long as my
ISP's new server works).

Erik
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"In my opinion, shareware tends to combine the worst of commercial
software (no sources) with the worst of free software (no finishing
touches). I simply do not believe in the shareware market at all."
-- Linus Torvalds
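The per-output-sample filter generation described above might look roughly like the sketch below. The prototype here is a plain Hann-windowed sinc and the table sizes are small illustrative values; both are assumptions, since the actual libsamplerate coefficients and table layout are not given in this thread. All names (`prototype`, `lookup`, `taps_for`) are hypothetical.

```python
import math

NUM_PHASES = 64      # table points per input sample (illustrative only)
HALF_LEN = 8         # zero crossings on each side of the peak (illustrative only)

def prototype(t):
    """Hann-windowed sinc, t in input-sample units; zero outside +/-HALF_LEN."""
    if abs(t) >= HALF_LEN:
        return 0.0
    sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
    window = 0.5 + 0.5 * math.cos(math.pi * t / HALF_LEN)
    return sinc * window

# One-sided oversampled table, with a guard entry so TABLE[i + 1] is always valid.
TABLE = [prototype(i / NUM_PHASES) for i in range(HALF_LEN * NUM_PHASES + 2)]

def lookup(t):
    """Linearly interpolated prototype value at |t| (the filter is symmetric)."""
    x = abs(t) * NUM_PHASES
    i = int(x)
    if i >= len(TABLE) - 1:
        return 0.0
    mu = x - i
    return TABLE[i] + mu * (TABLE[i + 1] - TABLE[i])

def taps_for(frac):
    """FIR taps for an output sample lying `frac` past input sample 0.

    Tap k weights input sample k, for k = -HALF_LEN + 1 .. HALF_LEN.
    """
    return [lookup(k - frac) for k in range(-HALF_LEN + 1, HALF_LEN + 1)]

print(taps_for(0.25)[HALF_LEN - 1])   # weight applied to input sample 0
```

Because a fresh set of taps is computed for every output sample, nothing in this scheme requires the fractional offset to follow a fixed pattern, which is what makes time varying ratios possible.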
Reply by ●March 9, 2008
On Mar 9, 3:21 am, Erik de Castro Lopo <er...@mega-nerd.com> wrote:
> robert bristow-johnson wrote:
> > Erik,
> >
> > i've been wondering, the mechanics of your SRC is the same as any
> > polyphase filtering, right?
>
> Actually, it's just the Julius O. Smith algorithm:
>
> http://ccrma-www.stanford.edu/~jos/resample/
>
> which is not really a traditional polyphase.

gee, i thought it was the traditional polyphase (where you just don't
bother to calculate the phases you don't need) with (likely linear)
interpolation between samples.

> > you have a pointer to where the output
> > sample is to be drawn from out of the input buffer, this pointer has
> > an integer and fractional part. the integer part tells you which N
> > adjacent samples to use in the interpolation and the fractional part
> > tells you which set of N coefficients to use to combine. is that what
> > you do? for a general SRC ratio, do you compute samples for adjacent
> > "phases" or "fractional delays" and linearly interpolate (or
> > interpolate with a higher order spline)?
>
> For each output sample, the algorithm generates the FIR filter
> for that sample before moving to the next output sample. Each FIR
> filter is generated by linearly interpolating from a vastly
> oversampled prototype filter.

linearly interpolating the coefficients between adjacent phases or
fractional delays, right? if so, that turns out to be the same as
computing the two output samples from the neighboring fractional
sample delays and linearly interpolating between the two.

if you're downsampling and, to put in the necessary LPF sorta free,
you sample that sinc-like function (that maybe is some Parks-McClellan
optimal design instead of a windowed sinc) more densely so that you
LPF to half the final Fs (which is less than the initial Fs), *then*
you have to do it your way; you have to interpolate the coefficients
before using them.

but otherwise, where the coefficients depend only on the phase or
fractional delay, then it's easier, i thought, to calculate the two
adjacent output subsamples and interpolate between them.

> This all sounds like a lot of work (and it is), but allows for
> time varying conversion ratios.

i'm familiar with Smith/Gossett. and i understand that it does (and
why not for FIR filtering inside?) allow for time varying ratios. if
you have arbitrary ratios not expressible as M/N, sometimes we can put
the ratio somewhere between M/N and (M+1)/N and still do this as two
(if linear interpolation is used, 4 if cubic interpolation)
neighboring fractional delays out of N possible fractional delays.
then interpolate between the two.

> > well, other than that sorta mechanical stuff, what else is there in
> > your improvements, other than the set of coefficients?
>
> Nope, this is simply the Julius O. Smith algorithm

i dunno. i think this is pretty much the same.

> with a specially designed set of filter coefficients.

okay, so the improvements in the SRC is due to the coefs getting
redesigned? did you increase either the number of taps or the number
of phases? or is it a better designed LPF kernel? how much are you
willing to tell us about that?

best,

r b-j
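The equivalence claimed above (interpolating the coefficients first versus interpolating the two output subsamples) follows from the linearity of the dot product. A small check, with made-up sample and coefficient values and hypothetical function names, assuming both phases use the same N input samples:

```python
def dot(h, x):
    """Plain FIR dot product over equal-length lists."""
    return sum(a * b for a, b in zip(h, x))

def interpolate_coefs_then_filter(x, h0, h1, mu):
    # One dot product, but N coefficient interpolations per output sample.
    h = [(1.0 - mu) * a + mu * b for a, b in zip(h0, h1)]
    return dot(h, x)

def filter_then_interpolate(x, h0, h1, mu):
    # Two dot products (two accumulators), then a single interpolation.
    y0 = dot(h0, x)
    y1 = dot(h1, x)
    return (1.0 - mu) * y0 + mu * y1

# Arbitrary illustrative data: 4 input samples, two adjacent phases.
x = [0.3, -0.7, 0.2, 0.9]
h0 = [-0.1, 0.6, 0.6, -0.1]
h1 = [-0.05, 0.4, 0.75, -0.1]

a = interpolate_coefs_then_filter(x, h0, h1, 0.3)
b = filter_then_interpolate(x, h0, h1, 0.3)
print(abs(a - b))   # differs only by floating-point rounding
```

The two routes differ in cost rather than result: the first interpolates every coefficient, the second runs the filter twice. When downsampling stretches the filter so that the coefficients no longer depend on the phase alone, only the first route applies directly.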
Reply by ●March 10, 2008
robert bristow-johnson wrote:
> linearly interpolating the coefficients between adjacent phases or
> fractional delays, right? if so, that turns out to be the same as
> computing the two output samples from the neighboring fractional
> sample delays and linearly interpolating between the two.

Yes, for the upsampling case I agree. It's not quite as simple for
the downsampling case.

> if you're
> downsampling and, to put in the necessary LPF sorta free, you sample
> that sinc-like function (that maybe is some Parks-McClellan optimal
> design instead of a windowed sinc) more densely so that you LPF to
> half the final Fs (which is less than the initial Fs) , *then* you
> have to do it your way; you have to interpolate the coefficients
> before using them.

Yes.

> but otherwise, where the coefficients depend only
> on the phase or fractional delay, then it's easier, i thought, to
> calculate the two adjacent output subsamples and interpolate between
> them.

Equivalent, but :

- It's nice to have the same algorithm work for the up and down
  sample case.
- It's nice to be able to do time varying conversions with the same
  algorithm.
- Your suggested method requires two accumulators while mine only
  needs one.

> okay, so the improvements in the SRC is due to the coefs getting
> redesigned?

Correct.

> did you increase either the number of taps or the number
> of phases?

Yes, I did. The mid quality filter uses 491 phases and the high
quality one uses 2381 phases. I found that a prime number of
phases performed better than say a power of two.

> or is it a better designed LPF kernel?

That as well.

> how much are you willing to tell us about that?

That's where I'm going to keep quiet :-).

Erik
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"To iterate is human, to recurse divine." -- L. Peter Deutsch
Reply by ●March 10, 2008
On Mar 10, 5:54 am, Erik de Castro Lopo <er...@mega-nerd.com> wrote:
> robert bristow-johnson wrote:
> > linearly interpolating the coefficients between adjacent phases or
> > fractional delays, right? if so, that turns out to be the same as
> > computing the two output samples from the neighboring fractional
> > sample delays and linearly interpolating between the two.
>
> Yes, for the upsampling case I agree. It's not quite as simple for
> the downsampling case.
>
> > if you're
> > downsampling and, to put in the necessary LPF sorta free, you sample
> > that sinc-like function (that maybe is some Parks-McClellan optimal
> > design instead of a windowed sinc) more densely so that you LPF to
> > half the final Fs (which is less than the initial Fs) , *then* you
> > have to do it your way; you have to interpolate the coefficients
> > before using them.
>
> Yes.
>
> > but otherwise, where the coefficients depend only
> > on the phase or fractional delay, then it's easier, i thought, to
> > calculate the two adjacent output subsamples and interpolate between
> > them.
>
> Equivalent, but :
>
> - It's nice to have the same algorithm work for the up and down
>   sample case.
> - It's nice to be able to do time varying conversions with the same
>   algorithm.
> - Your suggested method requires two accumulators while mine only
>   needs one.

you can use the same accumulator but you just have to hang on to the
output subsample for the two adjacent phases. in your case, you have
to do linear interpolation for each coefficient before you use it in
the polyphase filter. that would be necessary for downsampling, but
not for upsampling and, in that case, i would think the code would be
simpler and faster if you computed the two subsamples first, and then
linearly interpolate between them. maybe i should dig out some old C
code (and prune it for posting) so we both know we are referring to
the same things.

> > okay, so the improvements in the SRC is due to the coefs getting
> > redesigned?
>
> Correct.
>
> > did you increase either the number of taps or the number
> > of phases?
>
> Yes, I did. The mid quality filter uses 491 phases and the high
> quality one uses 2381 phases. I found that a prime number of
> phases performed better than say a power of two.
>
> > or is it a better designed LPF kernel?
>
> That as well.
>
> > how much are you willing to tell us about that?
>
> That's where I'm going to keep quiet :-).

okay, we'll stay away from trade secrets. can you tell me how many
taps the polyphase FIR filter is (in the upsampling case, since in
down sampling it can get longer)? something around 32?

now the number of phases is intriguing. why not have 512 phases
(uniformly spaced fractional delays) instead of 491? (that power of 2
can be useful when separating the fractional delay into the two parts;
which discrete phase, the 9 upper bits, and the linear interpolation
coef would be in the bottom bits of fractional delay.) and if you're
linearly interpolating between these phases, your SRC S/N should be as
good as 120 dB, no? your "high quality" mode (with 2381 phases), is
that with linear interp between phases also? seems like more table
than you would need (and i would make that a power of two also, like
2048 or maybe 4096).

i suppose you can have weird values for that upsample ratio (491 or
2381). first you increment the output sample pointer (that has
integer and fractional parts that you separate). then take the
fractional part and multiply by 491. the integer part of that scaled
up result (from 0 to 490) would be the index used to look up the
filter coefs and the fractional part of that (from 0 to 0.99999) would
be used in the linear interpolation. so i guess it's not necessary to
have the number of phases = to a power of 2, but i had normally
thought that doing so would make your life easier somewhere.

L8r,

r b-j
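The two ways of splitting the fractional delay described above can be put side by side. This sketch assumes the fraction is held as an unsigned 32-bit fixed-point value (0 up to just under 2^32 standing for 0 up to just under 1); that width and the function names are assumptions for illustration, not details confirmed for libsamplerate.

```python
FRAC_BITS = 32   # assumed width of the fractional part of the pointer

def split_pow2(frac_fixed, phase_bits=9):
    """Power-of-two case (2**9 = 512 phases): the top bits are the phase,
    the remaining low bits are the linear interpolation weight."""
    shift = FRAC_BITS - phase_bits
    phase = frac_fixed >> shift
    mu = (frac_fixed & ((1 << shift) - 1)) / (1 << shift)
    return phase, mu

def split_general(frac_fixed, num_phases):
    """Any phase count (e.g. 491): scale by num_phases, then the integer
    part is the phase and the fractional part is the weight."""
    scaled = frac_fixed * num_phases
    phase = scaled >> FRAC_BITS
    mu = (scaled & ((1 << FRAC_BITS) - 1)) / (1 << FRAC_BITS)
    return phase, mu

half = 1 << 31                      # fractional delay of exactly 0.5
print(split_pow2(half))             # phase and weight with 512 phases
print(split_general(half, 491))     # phase and weight with 491 phases
```

The general version costs one multiply where the power-of-two version needs only a shift and a mask, and in both cases the phase index stays in the range 0 to num_phases - 1.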
Reply by ●March 12, 2008
robert bristow-johnson wrote:
> okay, we'll stay away from trade secrets. can you tell me how many
> taps the polyphase FIR filter is (in the upsampling case, since in
> down sampling it can get longer)? something around 32?

About 40 for the SRC_SINC_MEDIUM_QUALITY converter and about 140 for
the SRC_SINC_BEST_QUALITY (which has a much steeper transition band).

> why not have 512 phases
> (uniformly spaced fractional delays) instead of 491?

I found experimentally that for the code I already had, I got
better results for non-power-of-two values. Multiples of three
were not much better so I went for prime numbers.

> (that power of 2
> can be useful when separating the fractional delay into the two parts;

I use a 32 bit integer as a fixed point value.

> which discrete phase, the 9 upper bits, and the linear interpolation
> coef would be in the bottom bits of fractional delay.) and if you're
> linearly interpolating between these phases, your SRC S/N should be as
> good as 120 dB, no?

With enough phases, yes.

> your "high quality" mode (with 2381 phases), is
> that with linear interp between phases also?

Yes.

> seems like more table
> than you would need (and i would make that a power of two also, like
> 2048 or maybe 4096).
>
> i suppose you can have weird values for that upsample ratio (491 or
> 2381). first you increment the output sample pointer (that has
> integer and fractional parts that you separate). then take the
> fractional part and multiply by 491. the integer part of that scaled
> up result (from 0 to 490) would be the index used to look up the
> filter coefs and the fractional part of that (from 0 to 0.99999) would
> be used in the linear interpolation. so i guess it's not necessary to
> have the number of phases = to a power of 2,

Exactly.

> but i had normally
> thought that doing so would make your life easier somewhere.

I was using powers of 2 for a long time and found that primes were
better. Primes are also no more or less trouble.

Erik
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"You don't have make it your sole purpose in life, but could you at
least sacrifice a rubber chicken upon the altar of literacy?"
-- TackHead on Slashdot
Reply by ●March 12, 2008
On Mar 12, 6:11 am, Erik de Castro Lopo <er...@mega-nerd.com> wrote:
> robert bristow-johnson wrote:
> > okay, we'll stay away from trade secrets. can you tell me how many
> > taps the polyphase FIR filter is (in the upsampling case, since in
> > down sampling it can get longer)? something around 32?
>
> About 40 for the SRC_SINC_MEDIUM_QUALITY converter and about 140 for
> the SRC_SINC_BEST_QUALITY (which has a much steeper transition band).

140 tap FIR?? that must be one friggin' brick wall!

> > why not have 512 phases
> > (uniformly spaced fractional delays) instead of 491?
>
> I found experimentally that for the code I already had, I got
> better results for non-power-of-two values. Multiples of three
> were not much better so I went for prime numbers.

hmmmm, i'm trying to get my head around this. you're saying that if
you increase the number of phases from 491 to 512 (MORE phases, not
less), that for an arbitrary sample rate ratio, it sounds *better*
using the slightly fewer phases. and it sounds better because the
number of phases is a prime number? somebody's gonna need to explain
to me how that works. i don't know why the primeness or compositeness
of the number of fractional delays per input sampling period, why it's
better for that number to be prime or bad for it to be highly
composite (like a power of 2). :-\

> > i suppose you can have weird values for that upsample ratio (491 or
> > 2381). first you increment the output sample pointer (that has
> > integer and fractional parts that you separate). then take the
> > fractional part and multiply by 491. the integer part of that scaled
> > up result (from 0 to 490) would be the index used to look up the
> > filter coefs and the fractional part of that (from 0 to 0.99999) would
> > be used in the linear interpolation. so i guess it's not necessary to
> > have the number of phases = to a power of 2,
>
> Exactly.
>
> > but i had normally
> > thought that doing so would make your life easier somewhere.
>
> I was using powers of 2 for a long time and found that primes were
> better. Primes are also no more or less trouble.

well, multiplication is cheap nowadays. it's multiplying by 491 vs.
shifting left 9 bits (for 512) which is nowadays about the same amount
of trouble.

can you give me a hint, either regarding the spectrum or some hand-
waving psycho-acoustic argument for why upsampling by a factor that is
a prime number is better than some other integer of about the same
size?

r b-j
Reply by ●March 12, 2008
On Mar 12, 12:11 pm, robert bristow-johnson <r...@audioimagination.com> wrote:
> ...
> can you give me a hint, either regarding the spectrum or some hand-
> waving psycho-acoustic argument for why upsampling by a factor that is
> a prime number is better than some other integer of about the same
> size?
>
> r b-j

Erik made no statement about the difference being psycho-acoustic.
Until he cares to clarify, I'll wave my hands somewhere else.

Watch my hands...

If the processing is evaluated by an FFT analysis based on a 2^n size
transform, resampling with a prime number of banks may decorrelate the
quantization noise from the analysis bins.

Hands back to armrests...

Dale B. Dalrymple
Reply by ●March 15, 2008
Sorry for the delay in responding. Busy, busy, busy.

robert bristow-johnson wrote:
> On Mar 12, 6:11 am, Erik de Castro Lopo <er...@mega-nerd.com> wrote:
>> I found experimentally that for the code I already had, I got
>> better results for non-power-of-two values. Multiples of three
>> were not much better so I went for prime numbers.
>
> hmmmm, i'm trying to get my head around this. you're saying that if
> you increase the number of phases from 491 to 512 (MORE phases, not
> less), that for an arbitrary sample rate ratio, it sounds *better*
> using the slightly fewer phases.

To my tin ear they sound pretty much identical, but the 491 phase
converter has a measurably lower noise floor than the 512 phase
converter.

> and it sounds better because the
> number of phases is a prime number? somebody's gonna need to explain
> to me how that works. i don't know why the primeness or compositeness
> of the number of fractional delays per input sampling period, why it's
> better for that number to be prime or bad for it to be highly
> composite (like a power of 2).

In my implementation I use a double precision float to calculate the
next sample position and a fixed point number (12 fractional bits and
the rest integer) for indexing into the FIR coefficient table. The
indexing is done with an offset and increment (both fixed point
numbers) so that the Nth index is calculated using:

    index = offset + N * increment

> well, multiplication is cheap nowadays. it's multiplying by 491 vs.
> shifting left 9 bits (for 512) which is nowadays about the same amount
> of trouble.

Exactly.

> can you give me a hint, either regarding the spectrum or some hand-
> waving psycho-acoustic argument for why upsampling by a factor that is
> a prime number is better than some other integer of about the same
> size?

I didn't pursue it, but I suspect that errors caused by the inaccuracy
of the fixed point arithmetic are more correlated when there are 2^N
phases as opposed to a prime number.

Erik
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"Within C++, there is a much smaller and cleaner language struggling
to get out." -- Bjarne Stroustrup
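The fixed-point indexing in the last post (index = offset + N * increment, with 12 fractional bits) can be sketched as follows. The helper names, the round-to-nearest conversion and the example increment values are assumptions for illustration; the thread does not show the library's actual code.

```python
SHIFT_BITS = 12                 # 12 fractional bits, as stated in the post
FP_ONE = 1 << SHIFT_BITS

def to_fixed(x):
    """Convert a float to fixed point (round to nearest; an assumption)."""
    return int(round(x * FP_ONE))

def table_positions(offset, increment, count):
    """For n = 0 .. count - 1, compute index = offset + n * increment and
    split it into (integer table index, fraction for linear interpolation)."""
    out = []
    for n in range(count):
        index = offset + n * increment
        out.append((index >> SHIFT_BITS, (index & (FP_ONE - 1)) / FP_ONE))
    return out

def drift(increment_float, n):
    """Table-position error after n steps caused by quantising the increment."""
    exact = n * increment_float
    fixed = n * to_fixed(increment_float) / FP_ONE
    return fixed - exact

# Upsampling with 491 phases: taps are 491 table entries apart.
print(table_positions(to_fixed(122.75), to_fixed(491.0), 3))

# A non-integer increment (e.g. a stretched filter when downsampling)
# cannot be held exactly in 12 fractional bits, so the error grows with n.
print(drift(441.9, 70))
```

The second printout illustrates the kind of fixed-point inaccuracy mentioned above: the quantisation error of the increment is multiplied by N, so its pattern across taps depends on how the increment and the table spacing relate. Whether that is what actually favours prime phase counts is left open in the thread.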






