Reply by Rick Lyons June 19, 20052005-06-19
On Mon, 13 Jun 2005 23:35:59 -0400, "Matt Timmermans"
<mt0000@sympatico.nospam-remove.ca> wrote:

>
(snipped)
> >You can get it from Judith Brown's home page here: (there's also a link to >the djvu plugin which you will have to install to see it). > >http://www.wellesley.edu/Physics/brown/jbrown.html > >-- >Matt
Hi Matt, Please ignore my last post about djvu. I just saw the "djvu plugin" info on Prof. Brown's web page. Thanks, [-Rick-]
Reply by Rick Lyons June 19, 20052005-06-19
On Mon, 13 Jun 2005 23:35:59 -0400, "Matt Timmermans"
<mt0000@sympatico.nospam-remove.ca> wrote:

  (snipped)
> >You can get it from Judith Brown's home page here: (there's also a link to >the djvu plugin which you will have to install to see it). > >http://www.wellesley.edu/Physics/brown/jbrown.html > >-- >Matt
Hi Matt, can you tell me what software I need to be able to read some of Prof. Brown's papers? I can't "open" those djvu files. Thanks, [-Rick-]
Reply by Matt Timmermans June 17, 20052005-06-17
"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message 
news:d8od70$6rb$1@blue.rahul.net...
> Is the above phase change frequency determination method similar to > the analysis front-end of a phase vocoder using successive FFT frames?
The first part of their process is to calculate a constant Q transform with semitone or quartertone spacing, which they cleverly accellerate using an FFT. Then they do phase analysis for finer resolution. This part isn't completely dissimilar to the method you mention, but the way they do it is better.
> Note that the strongest frequency present in an FFT (or vocoder analysis) > may not be the musical note being played if the tone consists of a > diminished or even missing fundamental frequency content compared to > the harmonic energy.
They add a good deal of cleverness here, too. Since they use a constant Q transform, they get log-frequency output. In log-frequency, a sum of harmonic frequencies (f, 2f, 3f, 4f, etc.) shows up as the same pattern, modulo a shift, regardless of f. They cross-correlate the log-frequency spectrum with an "ideal" harmonic pattern to pick the fundamental. -- Matt
Reply by June 16, 20052005-06-16
Djvu is worth having. It goves much better access to the online version
of The New Century* Dictionary than does HTML. The NCD is next-best to
the OED, and sometimes better, Illustrated.

Jerry
_______________________________________
* That's 1900, but then my copy of the OED is prety old too.

Reply by Matt Timmermans June 16, 20052005-06-16
Hi Andor,

"Andor" <an2or@mailcircuit.com> wrote in message 
news:1118904008.586422.316930@z14g2000cwz.googlegroups.com...
> [...] "Calculation of a constant Q spectral transform". [...] > I'm wondering, is this equivalent to applying Goertzel filters with the > k's logarithmically spaced, where each filter's N is actually a > function of k, N(k+1) = c^k N(k) (and c < 1)?.
You really want to use a good window. All the magic is in the windows and varying window size. It's easier to understand if you think of the windows as low-pass filters, which are modulated by the center frequency to make Hilbert-transform pairs of bandpass filters at the frequency of interest. Changing the window size changes the bandwidth of the low-pass prototype. It is this that makes the filter bank constant-Q.
> My second question is: in this paper she mentions the Mellin transform, > another constant Q transform, which has an inverse and therefore is > suited for frequency domain processing. Without having to dig out the > reference, can anybody comment on the essence of this transform?
The Mellin transform is like a constant Q transform with infinite support. It gives you logarithmic frequency spacing, but doesn't use a window. -- Matt
Reply by Andor June 16, 20052005-06-16
Matt Timmermans wrote:
...
> Instead, if you actually do want to track musical notes, then I suggest you > look at the paper called "A high resolution fundamental frequency > determination based on phase changes in the Fourier transform", which > describes what I believe is the standard way to accomplish this. > > You can get it from Judith Brown's home page here: (there's also a link to > the djvu plugin which you will have to install to see it). > > http://www.wellesley.edu/Physics/brown/jbrown.html
Matt, thanks for the link! I wasn't aware of Judith Brown's paper "Calculation of a constant Q spectral transform". In it she describes a transform similar to the DFT, albeit at logarithmically spaced frequencies. This transform consists of taking the inner product of the signal vector with frequency vectors, but instead as in the DFT where each frequency vector is equal in length (and the k-th vector contains k cycles of the frequency it represents), the logarithmically spaced vectors are adjusted in length to contain a constant multiple of cycles of the frequency they represent. This is also the reason why it is not, in contrast to the DFT, invertible as not all input values are used to compute an output value, making it unsuited for frequency domain processing. I'm wondering, is this equivalent to applying Goertzel filters with the k's logarithmically spaced, where each filter's N is actually a function of k, N(k+1) = c^k N(k) (and c < 1)?. My second question is: in this paper she mentions the Mellin transform, another constant Q transform, which has an inverse and therefore is suited for frequency domain processing. Without having to dig out the reference, can anybody comment on the essence of this transform? Regards, Andor
Reply by Andreas Huennebeck June 15, 20052005-06-15
Ronald H. Nicholson Jr. wrote:

> In article <igsre.5859$yU.287575@news20.bellglobal.com>, > Matt Timmermans <mt0000@sympatico.nospam-remove.ca> wrote: >> >>You can get it from Judith Brown's home page here: (there's also a link >>to the djvu plugin which you will have to install to see it). >> >>http://www.wellesley.edu/Physics/brown/jbrown.html > > Never heard of djvu,
Neither did I.
> and it only seems to be supported on a tiny number > of types of browsers and OS platforms.
I don't think so. Binaries exist for Windows and MacOS, and source code is available for UNIX style OS. The binary plugin for RedHat 7.3 works fine with Netscape, Mozilla and Konqueror. bye Andreas -- Andreas H&#4294967295;nnebeck | email: ah@despammed.com ----- privat ---- | www : http://www.huennebeck-online.de Fax/Anrufbeantworter: 0721/151-284301 GPG-Key: http://www.huennebeck-online.de/public_keys/andreas.asc
Reply by Ronald H. Nicholson Jr. June 15, 20052005-06-15
In article <igsre.5859$yU.287575@news20.bellglobal.com>,
Matt Timmermans <mt0000@sympatico.nospam-remove.ca> wrote:
> >"Evan Olcott" <eolcott@triplo.com> wrote in message >news:2005061310050716807%eolcott@triplocom... >> Hello everyone. >> >> Have an interesting problem. I have an FFT of an audio signal and I want >> to interpolate the bins into another frequency set... instead of a linear >> set of frequencies, I'd like the frequencies I get (in the end) to be >> based on an exponential/logarithmic formula, based on the twelfth root of >> 2... > >Since you're asking for semitones, I'm guessing that you want to track >musical notes. I'm going to skip ahead of the part where everyone asks what >kind of window you're using, and tells you about the trade-offs between time >and frequency resolution and why you can't effectively do what you think you >want to do. > >Instead, if you actually do want to track musical notes, then I suggest you >look at the paper called "A high resolution fundamental frequency >determination based on phase changes in the Fourier transform", which >describes what I believe is the standard way to accomplish this. > >You can get it from Judith Brown's home page here: (there's also a link to >the djvu plugin which you will have to install to see it). > >http://www.wellesley.edu/Physics/brown/jbrown.html
Never heard of djvu, and it only seems to be supported on a tiny number of types of browsers and OS platforms. Is the above phase change frequency determination method similar to the analysis front-end of a phase vocoder using successive FFT frames? In my experience, phase vocoder techniques seem to work well even in the presence of some noise, as long as that noise doesn't include tones closely spaced to the fundamental of interest. Note that the strongest frequency present in an FFT (or vocoder analysis) may not be the musical note being played if the tone consists of a diminished or even missing fundamental frequency content compared to the harmonic energy. IMHO. YMMV. -- Ron Nicholson rhn AT nicholson DOT com http://www.nicholson.com/rhn/ #include <canonical.disclaimer> // only my own opinions, etc.
Reply by Andreas Huennebeck June 14, 20052005-06-14
Evan Olcott wrote:

> Hello everyone. > > Have an interesting problem. I have an FFT of an audio signal and I > want to interpolate the bins into another frequency set... instead of a > linear set of frequencies, I'd like the frequencies I get (in the end) > to be based on an exponential/logarithmic formula, based on the twelfth > root of 2... > > What I'd like to do is take the FFT results and re-interpolate (or > re-scale, I'm not sure what the proper term is for this) to my new > (more useful) scale and use those results. At the low end of the > spectrum, my resulting bins may be only fractions of FFT bins, whereas > at the top of the spectrum they are combinations of many FFT bins (as > well as fractions of neighboring bins)..
You may have a look into this function. It converts bins scaled in linear frequency into bins scaled in logarithmic frequency. The bins are summed up to represent the energy, also the maximum peak is stored. It can handle all cases (less than one input bin = 1 output bin, one inoput bin = 1 output bin, many input bins = 1 output bin). I reduced the code to its bare minimum, so this version has not yet been tested. Also this is not DSP code, just plain C++. The original code is covered by the GPL, but since I'm the author and since this is just standard stuff you may take this version for free: data: the input bins insize: number of input bins fMax: highed frequency of input bins avout: array to store the summed up bins pkout: array to store the peak value bins outsize: number of bins in avout and pkout f1: start of desired frequency range f2: end of desired frequency range void processLogXScale(const double* data,unsigned int insize, double fMax, double* avout, double* pkout, unsigned int outsize double f1, double f2) { const double fin_inc = fMax/(double)(insize - 1); double fout = f1; const double fout_inc = pow(f2/f1, 1.0 / (outsize - 1.0)); const double flim_inc = pow(f2/f1, 0.5 / (outsize - 1.0)); for (unsigned int out = 0; out < outsize; ++out, fout *= fout_inc) { // determine frequency span of this output frequency: double fs1 = fout / flim_inc; double fs2 = fout * flim_inc; if (fs1 < 0) fs1 = 0; if (fs2 > fMax) fs2 = fMax; // determine index span (float) of this output frequency: double fi1 = fs1 / fin_inc; double fi2 = fs2 / fin_inc; // determine first and last index: int bi1 = (int)(fi1 + 0.5); int bi2 = (int)(fi2 + 0.5); if (bi1 < 0) bi1 = 0; if (bi2 >= (int)insize) bi1 = insize - 1; // determine proportional parts of first and last index: double pi1; // proportional part of bi1 double pi2; // proportional part of bi2 if (bi1 == bi2) { pi1 = 0.5 * fabs(fi2 - fi1); pi2 = pi1; } else { pi1 = (double)bi1 + 0.5 - fi1; pi2 = fabs(fi2 - (double)bi2 + 0.5); } // sum up over all indices: double max = data[bi1] > data[bi2] ? data[bi1] : data[bi2]; double sum = pi1 * data[bi1] + pi2 * data[bi2]; double cnt = pi1 + pi2; for (int i = bi1+1; i < bi2; ++i) { sum += data[i]; if (data[i] > max) max = data[i]; ++cnt; } // I do not really understand why we must divide the sum // by sqrt(cnt), but if we don't, then a pink noise spectrum // rises by 10 dB per octave. if (avout) avout[out] = sum / sqrt(cnt); if (pkout) pkout[out] = max; } } // end of processLogXScale() Bye Andreas -- Andreas H&#4294967295;nnebeck | email: ah@despammed.com ----- privat ---- | www : http://www.huennebeck-online.de Fax/Anrufbeantworter: 0721/151-284301 GPG-Key: http://www.huennebeck-online.de/public_keys/andreas.asc
Reply by Matt Timmermans June 14, 20052005-06-14
"Evan Olcott" <eolcott@triplo.com> wrote in message 
news:2005061310050716807%eolcott@triplocom...
> Hello everyone. > > Have an interesting problem. I have an FFT of an audio signal and I want > to interpolate the bins into another frequency set... instead of a linear > set of frequencies, I'd like the frequencies I get (in the end) to be > based on an exponential/logarithmic formula, based on the twelfth root of > 2...
Since you're asking for semitones, I'm guessing that you want to track musical notes. I'm going to skip ahead of the part where everyone asks what kind of window you're using, and tells you about the trade-offs between time and frequency resolution and why you can't effectively do what you think you want to do. Instead, if you actually do want to track musical notes, then I suggest you look at the paper called "A high resolution fundamental frequency determination based on phase changes in the Fourier transform", which describes what I believe is the standard way to accomplish this. You can get it from Judith Brown's home page here: (there's also a link to the djvu plugin which you will have to install to see it). http://www.wellesley.edu/Physics/brown/jbrown.html -- Matt