Hello everyone. Have an interesting problem. I have an FFT of an audio signal and I want to interpolate the bins into another frequency set... instead of a linear set of frequencies, I'd like the frequencies I get (in the end) to be based on an exponential/logarithmic formula, based on the twelfth root of 2... What I'd like to do is take the FFT results and re-interpolate (or re-scale, I'm not sure what the proper term is for this) to my new (more useful) scale and use those results. At the low end of the spectrum, my resulting bins may be only fractions of FFT bins, whereas at the top of the spectrum they are combinations of many FFT bins (as well as fractions of neighboring bins).. for frequency bins f, for sampling rate f0: FFT scale: f(n) = f0/N * n : N = total bins (4096 in my case), n = whole number (< N/2) My scale: f(n) = 20 < 440 * pow((12th root of 2), n) < f0/2 : n = �whole number The FFT scale ends up with 2048 bins, and my scale should be 120 bins. Is this best calculated via a set of tables, or is there a calculation (in C, hopefully) that would solve this better? Do I need to concern myself with the "bandwidths" of each bin in order to calculate the interpolation properly? Either way, it's over my head right off the bat - I'm not sure how to approach this! Thanks for any help/advice! Ev Technical Knowledge Officer Head Programmer/Designer Audiofile Engineering http://www.audiofile-engineering.com/
FFT re-scaling or interpolation
Started by ●June 13, 2005
Reply by ●June 13, 20052005-06-13
"Evan Olcott" <eolcott@triplo.com> wrote in message news:2005061310050716807%eolcott@triplocom...> Hello everyone. > > Have an interesting problem. I have an FFT of an audio > signal and I want to interpolate the bins into another > frequency set... instead of a linear set of frequencies, > I'd like the frequencies I get (in the end) to be based on > an exponential/logarithmic formula, based on the twelfth > root of 2... > > What I'd like to do is take the FFT results and > re-interpolate (or re-scale, I'm not sure what the proper > term is for this) to my new (more useful) scale and use > those results.[snip]> - I'm not sure how to approach this! >Look into the Goertzel Algorithm. If this doesn't do exactly what you want, there may be alternatives. Here is one web reference. I don't know how useful it might be to you. http://citeseer.ist.psu.edu/651574.html Here is another reference, but it's probably useless since I very much doubt that the book is available today: Gold, Bernard and Rader, Charles M., Digital Processing of Signals, (c) 1969, McGraw Hill, New York, Page 171 and following.
Reply by ●June 14, 20052005-06-14
"Evan Olcott" <eolcott@triplo.com> wrote in message news:2005061310050716807%eolcott@triplocom...> Hello everyone. > > Have an interesting problem. I have an FFT of an audio signal and I want > to interpolate the bins into another frequency set... instead of a linear > set of frequencies, I'd like the frequencies I get (in the end) to be > based on an exponential/logarithmic formula, based on the twelfth root of > 2...Since you're asking for semitones, I'm guessing that you want to track musical notes. I'm going to skip ahead of the part where everyone asks what kind of window you're using, and tells you about the trade-offs between time and frequency resolution and why you can't effectively do what you think you want to do. Instead, if you actually do want to track musical notes, then I suggest you look at the paper called "A high resolution fundamental frequency determination based on phase changes in the Fourier transform", which describes what I believe is the standard way to accomplish this. You can get it from Judith Brown's home page here: (there's also a link to the djvu plugin which you will have to install to see it). http://www.wellesley.edu/Physics/brown/jbrown.html -- Matt
Reply by ●June 14, 20052005-06-14
Evan Olcott wrote:> Hello everyone. > > Have an interesting problem. I have an FFT of an audio signal and I > want to interpolate the bins into another frequency set... instead of a > linear set of frequencies, I'd like the frequencies I get (in the end) > to be based on an exponential/logarithmic formula, based on the twelfth > root of 2... > > What I'd like to do is take the FFT results and re-interpolate (or > re-scale, I'm not sure what the proper term is for this) to my new > (more useful) scale and use those results. At the low end of the > spectrum, my resulting bins may be only fractions of FFT bins, whereas > at the top of the spectrum they are combinations of many FFT bins (as > well as fractions of neighboring bins)..You may have a look into this function. It converts bins scaled in linear frequency into bins scaled in logarithmic frequency. The bins are summed up to represent the energy, also the maximum peak is stored. It can handle all cases (less than one input bin = 1 output bin, one inoput bin = 1 output bin, many input bins = 1 output bin). I reduced the code to its bare minimum, so this version has not yet been tested. Also this is not DSP code, just plain C++. The original code is covered by the GPL, but since I'm the author and since this is just standard stuff you may take this version for free: data: the input bins insize: number of input bins fMax: highed frequency of input bins avout: array to store the summed up bins pkout: array to store the peak value bins outsize: number of bins in avout and pkout f1: start of desired frequency range f2: end of desired frequency range void processLogXScale(const double* data,unsigned int insize, double fMax, double* avout, double* pkout, unsigned int outsize double f1, double f2) { const double fin_inc = fMax/(double)(insize - 1); double fout = f1; const double fout_inc = pow(f2/f1, 1.0 / (outsize - 1.0)); const double flim_inc = pow(f2/f1, 0.5 / (outsize - 1.0)); for (unsigned int out = 0; out < outsize; ++out, fout *= fout_inc) { // determine frequency span of this output frequency: double fs1 = fout / flim_inc; double fs2 = fout * flim_inc; if (fs1 < 0) fs1 = 0; if (fs2 > fMax) fs2 = fMax; // determine index span (float) of this output frequency: double fi1 = fs1 / fin_inc; double fi2 = fs2 / fin_inc; // determine first and last index: int bi1 = (int)(fi1 + 0.5); int bi2 = (int)(fi2 + 0.5); if (bi1 < 0) bi1 = 0; if (bi2 >= (int)insize) bi1 = insize - 1; // determine proportional parts of first and last index: double pi1; // proportional part of bi1 double pi2; // proportional part of bi2 if (bi1 == bi2) { pi1 = 0.5 * fabs(fi2 - fi1); pi2 = pi1; } else { pi1 = (double)bi1 + 0.5 - fi1; pi2 = fabs(fi2 - (double)bi2 + 0.5); } // sum up over all indices: double max = data[bi1] > data[bi2] ? data[bi1] : data[bi2]; double sum = pi1 * data[bi1] + pi2 * data[bi2]; double cnt = pi1 + pi2; for (int i = bi1+1; i < bi2; ++i) { sum += data[i]; if (data[i] > max) max = data[i]; ++cnt; } // I do not really understand why we must divide the sum // by sqrt(cnt), but if we don't, then a pink noise spectrum // rises by 10 dB per octave. if (avout) avout[out] = sum / sqrt(cnt); if (pkout) pkout[out] = max; } } // end of processLogXScale() Bye Andreas -- Andreas H�nnebeck | email: ah@despammed.com ----- privat ---- | www : http://www.huennebeck-online.de Fax/Anrufbeantworter: 0721/151-284301 GPG-Key: http://www.huennebeck-online.de/public_keys/andreas.asc
Reply by ●June 15, 20052005-06-15
In article <igsre.5859$yU.287575@news20.bellglobal.com>, Matt Timmermans <mt0000@sympatico.nospam-remove.ca> wrote:> >"Evan Olcott" <eolcott@triplo.com> wrote in message >news:2005061310050716807%eolcott@triplocom... >> Hello everyone. >> >> Have an interesting problem. I have an FFT of an audio signal and I want >> to interpolate the bins into another frequency set... instead of a linear >> set of frequencies, I'd like the frequencies I get (in the end) to be >> based on an exponential/logarithmic formula, based on the twelfth root of >> 2... > >Since you're asking for semitones, I'm guessing that you want to track >musical notes. I'm going to skip ahead of the part where everyone asks what >kind of window you're using, and tells you about the trade-offs between time >and frequency resolution and why you can't effectively do what you think you >want to do. > >Instead, if you actually do want to track musical notes, then I suggest you >look at the paper called "A high resolution fundamental frequency >determination based on phase changes in the Fourier transform", which >describes what I believe is the standard way to accomplish this. > >You can get it from Judith Brown's home page here: (there's also a link to >the djvu plugin which you will have to install to see it). > >http://www.wellesley.edu/Physics/brown/jbrown.htmlNever heard of djvu, and it only seems to be supported on a tiny number of types of browsers and OS platforms. Is the above phase change frequency determination method similar to the analysis front-end of a phase vocoder using successive FFT frames? In my experience, phase vocoder techniques seem to work well even in the presence of some noise, as long as that noise doesn't include tones closely spaced to the fundamental of interest. Note that the strongest frequency present in an FFT (or vocoder analysis) may not be the musical note being played if the tone consists of a diminished or even missing fundamental frequency content compared to the harmonic energy. IMHO. YMMV. -- Ron Nicholson rhn AT nicholson DOT com http://www.nicholson.com/rhn/ #include <canonical.disclaimer> // only my own opinions, etc.
Reply by ●June 15, 20052005-06-15
Ronald H. Nicholson Jr. wrote:> In article <igsre.5859$yU.287575@news20.bellglobal.com>, > Matt Timmermans <mt0000@sympatico.nospam-remove.ca> wrote: >> >>You can get it from Judith Brown's home page here: (there's also a link >>to the djvu plugin which you will have to install to see it). >> >>http://www.wellesley.edu/Physics/brown/jbrown.html > > Never heard of djvu,Neither did I.> and it only seems to be supported on a tiny number > of types of browsers and OS platforms.I don't think so. Binaries exist for Windows and MacOS, and source code is available for UNIX style OS. The binary plugin for RedHat 7.3 works fine with Netscape, Mozilla and Konqueror. bye Andreas -- Andreas H�nnebeck | email: ah@despammed.com ----- privat ---- | www : http://www.huennebeck-online.de Fax/Anrufbeantworter: 0721/151-284301 GPG-Key: http://www.huennebeck-online.de/public_keys/andreas.asc
Reply by ●June 16, 20052005-06-16
Matt Timmermans wrote: ...> Instead, if you actually do want to track musical notes, then I suggest you > look at the paper called "A high resolution fundamental frequency > determination based on phase changes in the Fourier transform", which > describes what I believe is the standard way to accomplish this. > > You can get it from Judith Brown's home page here: (there's also a link to > the djvu plugin which you will have to install to see it). > > http://www.wellesley.edu/Physics/brown/jbrown.htmlMatt, thanks for the link! I wasn't aware of Judith Brown's paper "Calculation of a constant Q spectral transform". In it she describes a transform similar to the DFT, albeit at logarithmically spaced frequencies. This transform consists of taking the inner product of the signal vector with frequency vectors, but instead as in the DFT where each frequency vector is equal in length (and the k-th vector contains k cycles of the frequency it represents), the logarithmically spaced vectors are adjusted in length to contain a constant multiple of cycles of the frequency they represent. This is also the reason why it is not, in contrast to the DFT, invertible as not all input values are used to compute an output value, making it unsuited for frequency domain processing. I'm wondering, is this equivalent to applying Goertzel filters with the k's logarithmically spaced, where each filter's N is actually a function of k, N(k+1) = c^k N(k) (and c < 1)?. My second question is: in this paper she mentions the Mellin transform, another constant Q transform, which has an inverse and therefore is suited for frequency domain processing. Without having to dig out the reference, can anybody comment on the essence of this transform? Regards, Andor
Reply by ●June 16, 20052005-06-16
Hi Andor, "Andor" <an2or@mailcircuit.com> wrote in message news:1118904008.586422.316930@z14g2000cwz.googlegroups.com...> [...] "Calculation of a constant Q spectral transform". [...] > I'm wondering, is this equivalent to applying Goertzel filters with the > k's logarithmically spaced, where each filter's N is actually a > function of k, N(k+1) = c^k N(k) (and c < 1)?.You really want to use a good window. All the magic is in the windows and varying window size. It's easier to understand if you think of the windows as low-pass filters, which are modulated by the center frequency to make Hilbert-transform pairs of bandpass filters at the frequency of interest. Changing the window size changes the bandwidth of the low-pass prototype. It is this that makes the filter bank constant-Q.> My second question is: in this paper she mentions the Mellin transform, > another constant Q transform, which has an inverse and therefore is > suited for frequency domain processing. Without having to dig out the > reference, can anybody comment on the essence of this transform?The Mellin transform is like a constant Q transform with infinite support. It gives you logarithmic frequency spacing, but doesn't use a window. -- Matt
Reply by ●June 16, 20052005-06-16
Djvu is worth having. It goves much better access to the online version of The New Century* Dictionary than does HTML. The NCD is next-best to the OED, and sometimes better, Illustrated. Jerry _______________________________________ * That's 1900, but then my copy of the OED is prety old too.
Reply by ●June 17, 20052005-06-17
"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message news:d8od70$6rb$1@blue.rahul.net...> Is the above phase change frequency determination method similar to > the analysis front-end of a phase vocoder using successive FFT frames?The first part of their process is to calculate a constant Q transform with semitone or quartertone spacing, which they cleverly accellerate using an FFT. Then they do phase analysis for finer resolution. This part isn't completely dissimilar to the method you mention, but the way they do it is better.> Note that the strongest frequency present in an FFT (or vocoder analysis) > may not be the musical note being played if the tone consists of a > diminished or even missing fundamental frequency content compared to > the harmonic energy.They add a good deal of cleverness here, too. Since they use a constant Q transform, they get log-frequency output. In log-frequency, a sum of harmonic frequencies (f, 2f, 3f, 4f, etc.) shows up as the same pattern, modulo a shift, regardless of f. They cross-correlate the log-frequency spectrum with an "ideal" harmonic pattern to pick the fundamental. -- Matt