"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message 
news:d8od70$6rb$1@blue.rahul.net...
> Is the above phase change frequency determination method similar to
> the analysis front-end of a phase vocoder using successive FFT frames?

The first part of their process is to calculate a constant Q transform with 
semitone or quartertone spacing, which they cleverly accellerate using an 
FFT.

Then they do phase analysis for finer resolution.  This part isn't 
completely dissimilar to the method you mention, but the way they do it is 
better.

> Note that the strongest frequency present in an FFT (or vocoder analysis)
> may not be the musical note being played if the tone consists of a
> diminished or even missing fundamental frequency content compared to
> the harmonic energy.

They add a good deal of cleverness here, too.  Since they use a constant Q 
transform, they get log-frequency output.  In log-frequency, a sum of 
harmonic frequencies (f, 2f, 3f, 4f, etc.) shows up as the same pattern, 
modulo a shift, regardless of f.  They cross-correlate the log-frequency 
spectrum with an "ideal" harmonic pattern to pick the fundamental.

--
Matt

Djvu is worth having. It goves much better access to the online version
of The New Century* Dictionary than does HTML. The NCD is next-best to
the OED, and sometimes better, Illustrated.

Jerry
_______________________________________
* That's 1900, but then my copy of the OED is prety old too.

Hi Andor,

"Andor" <an2or@mailcircuit.com> wrote in message 
news:1118904008.586422.316930@z14g2000cwz.googlegroups.com...
> [...] "Calculation of a constant Q spectral transform". [...]
> I'm wondering, is this equivalent to applying Goertzel filters with the
> k's logarithmically spaced, where each filter's N is actually a
> function of k, N(k+1) = c^k N(k) (and c < 1)?.

You really want to use a good window.  All the magic is in the windows and 
varying window size.  It's easier to understand if you think of the windows 
as low-pass filters, which are modulated by the center frequency to make 
Hilbert-transform pairs of bandpass filters at the frequency of interest. 
Changing the window size changes the bandwidth of the low-pass prototype. 
It is this that makes the filter bank constant-Q.

> My second question is: in this paper she mentions the Mellin transform,
> another constant Q transform, which has an inverse and therefore is
> suited for frequency domain processing. Without having to dig out the
> reference, can anybody comment on the essence of this transform?

The Mellin transform is like a constant Q transform with infinite support. 
It gives you logarithmic frequency spacing, but doesn't use a window.

--
Matt

Matt Timmermans wrote:
...
> Instead, if you actually do want to track musical notes, then I suggest you
> look at the paper called "A high resolution fundamental frequency
> determination based on phase changes in the Fourier transform", which
> describes what I believe is the standard way to accomplish this.
>
> You can get it from Judith Brown's home page here:  (there's also a link to
> the djvu plugin which you will have to install to see it).
>
> http://www.wellesley.edu/Physics/brown/jbrown.html

Matt,

thanks for the link! I wasn't aware of Judith Brown's paper
"Calculation of a constant Q spectral transform". In it she describes a
transform similar to the DFT, albeit at logarithmically spaced
frequencies. This transform consists of taking the inner product of the
signal vector with frequency vectors, but instead as in the DFT where
each frequency vector is equal in length (and the k-th vector contains
k cycles of the frequency it represents), the logarithmically spaced
vectors are adjusted in length to contain a constant multiple of cycles
of the frequency they represent.  This is also the reason why it is
not, in contrast to the DFT, invertible as not all input values are
used to compute an output value, making it unsuited for frequency
domain processing.

I'm wondering, is this equivalent to applying Goertzel filters with the
k's logarithmically spaced, where each filter's N is actually a
function of k, N(k+1) = c^k N(k) (and c < 1)?.

My second question is: in this paper she mentions the Mellin transform,
another constant Q transform, which has an inverse and therefore is
suited for frequency domain processing. Without having to dig out the
reference, can anybody comment on the essence of this transform?

Regards,
Andor

Ronald H. Nicholson Jr. wrote:

> In article <igsre.5859$yU.287575@news20.bellglobal.com>,
> Matt Timmermans <mt0000@sympatico.nospam-remove.ca> wrote:
>>
>>You can get it from Judith Brown's home page here:  (there's also a link
>>to the djvu plugin which you will have to install to see it).
>>
>>http://www.wellesley.edu/Physics/brown/jbrown.html
> 
> Never heard of djvu, 

Neither did I.

> and it only seems to be supported on a tiny number
> of types of browsers and OS platforms.

I don't think so. Binaries exist for Windows and MacOS, and
source code is available for UNIX style OS. The binary plugin
for RedHat 7.3 works fine with Netscape, Mozilla and Konqueror.

bye
Andreas
-- 
Andreas H&#4294967295;nnebeck | email: ah@despammed.com
----- privat ---- | www  : http://www.huennebeck-online.de
Fax/Anrufbeantworter: 0721/151-284301
GPG-Key: http://www.huennebeck-online.de/public_keys/andreas.asc

In article <igsre.5859$yU.287575@news20.bellglobal.com>,
Matt Timmermans <mt0000@sympatico.nospam-remove.ca> wrote:
>
>"Evan Olcott" <eolcott@triplo.com> wrote in message 
>news:2005061310050716807%eolcott@triplocom...
>> Hello everyone.
>>
>> Have an interesting problem. I have an FFT of an audio signal and I want 
>> to interpolate the bins into another frequency set... instead of a linear 
>> set of frequencies, I'd like the frequencies I get (in the end) to be 
>> based on an exponential/logarithmic formula, based on the twelfth root of 
>> 2...
>
>Since you're asking for semitones, I'm guessing that you want to track 
>musical notes.  I'm going to skip ahead of the part where everyone asks what 
>kind of window you're using, and tells you about the trade-offs between time 
>and frequency resolution and why you can't effectively do what you think you 
>want to do.
>
>Instead, if you actually do want to track musical notes, then I suggest you 
>look at the paper called "A high resolution fundamental frequency 
>determination based on phase changes in the Fourier transform", which 
>describes what I believe is the standard way to accomplish this.
>
>You can get it from Judith Brown's home page here:  (there's also a link to 
>the djvu plugin which you will have to install to see it).
>
>http://www.wellesley.edu/Physics/brown/jbrown.html

Never heard of djvu, and it only seems to be supported on a tiny number
of types of browsers and OS platforms.

Is the above phase change frequency determination method similar to
the analysis front-end of a phase vocoder using successive FFT frames?

In my experience, phase vocoder techniques seem to work well even
in the presence of some noise, as long as that noise doesn't include
tones closely spaced to the fundamental of interest.

Note that the strongest frequency present in an FFT (or vocoder analysis)
may not be the musical note being played if the tone consists of a
diminished or even missing fundamental frequency content compared to
the harmonic energy.

IMHO. YMMV.
-- 
Ron Nicholson   rhn AT nicholson DOT com   http://www.nicholson.com/rhn/ 
#include <canonical.disclaimer>        // only my own opinions, etc.

Evan Olcott wrote:

> Hello everyone.
> 
> Have an interesting problem. I have an FFT of an audio signal and I
> want to interpolate the bins into another frequency set... instead of a
> linear set of frequencies, I'd like the frequencies I get (in the end)
> to be based on an exponential/logarithmic formula, based on the twelfth
> root of 2...
> 
> What I'd like to do is take the FFT results and re-interpolate (or
> re-scale, I'm not sure what the proper term is for this) to my new
> (more useful) scale and use those results. At the low end of the
> spectrum, my resulting bins may be only fractions of FFT bins, whereas
> at the top of the spectrum they are combinations of many FFT bins (as
> well as fractions of neighboring bins)..

You may have a look into this function. It converts bins scaled
in linear frequency into bins scaled in logarithmic frequency. The
bins are summed up to represent the energy, also the maximum peak is
stored. It can handle all cases (less than one input bin = 1 output bin,
one inoput bin = 1 output bin, many input bins = 1 output bin).

I reduced the code to its bare minimum, so this version has not yet
been tested. Also this is not DSP code, just plain C++. The original
code is covered by the GPL, but since I'm the author and since
this is just standard stuff you may take this version for free:

data: the input bins
insize: number of input bins
fMax: highed frequency of input bins
avout: array to store the summed up bins
pkout: array to store the peak value bins
outsize: number of bins in avout and pkout
f1: start of desired frequency range
f2: end of desired frequency range

void processLogXScale(const double* data,unsigned int insize, double fMax,
			   double* avout, double* pkout, unsigned int outsize
			   double f1, double f2)
{
    const double fin_inc = fMax/(double)(insize - 1);
    double fout = f1;
    const double fout_inc = pow(f2/f1, 1.0 / (outsize - 1.0));
    const double flim_inc = pow(f2/f1, 0.5 / (outsize - 1.0));

    for (unsigned int out = 0; out < outsize; ++out, fout *= fout_inc)
    {
	// determine frequency span of this output frequency:
	double fs1 = fout / flim_inc;
	double fs2 = fout * flim_inc;
	if (fs1 < 0) fs1 = 0;
	if (fs2 > fMax) fs2 = fMax;

	// determine index span (float) of this output frequency:
	double fi1 = fs1 / fin_inc;
	double fi2 = fs2 / fin_inc;

	// determine first and last index:
	int bi1 = (int)(fi1 + 0.5);
	int bi2 = (int)(fi2 + 0.5);
	if (bi1 < 0) bi1 = 0;
	if (bi2 >= (int)insize) bi1 = insize - 1;

	// determine proportional parts of first and last index:
	double pi1; // proportional part of bi1
	double pi2; // proportional part of bi2
	if (bi1 == bi2)
	{
	    pi1 = 0.5 * fabs(fi2 - fi1);
	    pi2 = pi1;
	}
	else
	{
	    pi1 = (double)bi1 + 0.5 - fi1;
	    pi2 = fabs(fi2 - (double)bi2 + 0.5);
	}

	// sum up over all indices:
	double max = data[bi1] > data[bi2] ? data[bi1] : data[bi2];
	double sum = pi1 * data[bi1] + pi2 * data[bi2];
	double cnt = pi1 + pi2;
	for (int i = bi1+1; i < bi2; ++i)
	{
	    sum += data[i];
	    if (data[i] > max) max = data[i];
	    ++cnt;
	}
	// I do not really understand why we must divide the sum
	// by sqrt(cnt), but if we don't, then a pink noise spectrum
	// rises by 10 dB per octave.
	if (avout) avout[out] = sum / sqrt(cnt);
	if (pkout) pkout[out] = max;
    }

} // end of processLogXScale()

Bye
Andreas
-- 
Andreas H&#4294967295;nnebeck | email: ah@despammed.com
----- privat ---- | www  : http://www.huennebeck-online.de
Fax/Anrufbeantworter: 0721/151-284301
GPG-Key: http://www.huennebeck-online.de/public_keys/andreas.asc

"Evan Olcott" <eolcott@triplo.com> wrote in message 
news:2005061310050716807%eolcott@triplocom...
> Hello everyone.
>
> Have an interesting problem. I have an FFT of an audio signal and I want 
> to interpolate the bins into another frequency set... instead of a linear 
> set of frequencies, I'd like the frequencies I get (in the end) to be 
> based on an exponential/logarithmic formula, based on the twelfth root of 
> 2...

Since you're asking for semitones, I'm guessing that you want to track 
musical notes.  I'm going to skip ahead of the part where everyone asks what 
kind of window you're using, and tells you about the trade-offs between time 
and frequency resolution and why you can't effectively do what you think you 
want to do.

Instead, if you actually do want to track musical notes, then I suggest you 
look at the paper called "A high resolution fundamental frequency 
determination based on phase changes in the Fourier transform", which 
describes what I believe is the standard way to accomplish this.

You can get it from Judith Brown's home page here:  (there's also a link to 
the djvu plugin which you will have to install to see it).

http://www.wellesley.edu/Physics/brown/jbrown.html

--
Matt