DSPRelated.com
Forums

Sound analysis

Started by Shimon M September 2, 2009
Hello, Everyone,

I am designing software for a course, and it is supposed to do the
following:
1. Get a wave file.
2. Analyze it - into notes and the instruments that play them.
3. Enable replacing one instrument with another.

Yes, I know these are all things that are under research, and that there
is no magic solution.

Yet, since this is for a course, it doesn't have to be robust - if I can
get it to work only on a small group of samples, that would be OK. So I
chose GuitarPro (which I hope you know). It basically allows one to write
notes for different instruments and play them. After I write the music, I
export it as a MIDI file and then I use a MIDI to WAVE converter. This is
how I get my samples.

Now, I am using an FT (Fourier transform) algorithm that appears at:
http://jvalentino2.tripod.com/dft/index.html (it uses a plain DFT, not
even an FFT).
The major problem I am having is that when I create just a simple "C" note
played by a flute or a guitar -> then export it as MIDI -> convert it to WAVE
-> use the code above on it: I don't get 216.626 Hz as the dominant
frequency!!! Sometimes, 216.626 even gets a very small amplitude.

The code is fairly basic and uses the theoretical FT pretty much as it is,
so I don't think that that's the problem.

So Question 1: --------------------------------------------------
What am I doing wrong? I expected that the "C" frequency would be one of
the leading ones, but the results I am getting seem almost random.
Is there any problem, do you think, with the MIDI export or the WAVE
conversion? Or maybe it is the code I am using after all?

Maybe there is nothing wrong - and this is the way it is supposed to be;
if so, could you give me any advice on how to somehow still identify
the note played?
-----------------------------------------------------------------
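One thing worth checking before blaming the MIDI export or the WAVE conversion: whether the transform output is being read by raw bin index instead of in Hz. Bin k of an N-point transform corresponds to k * sampleRate / N Hz, so 261.626 Hz will usually fall between two bins. A minimal sketch, assuming the WAVE data has already been decoded into a mono double[] (the `samples` and `sampleRate` names here are illustrative) and using the plain O(N^2) DFT for clarity:

    // Find the dominant frequency of a block of decoded mono samples with a
    // plain O(N^2) DFT. Bin k corresponds to k * sampleRate / N Hz, so a
    // 261.626 Hz tone rarely lands exactly on one bin.
    static double dominantFrequency(double[] samples, double sampleRate) {
        int n = samples.length;
        int bestBin = 0;
        double bestMag = 0.0;
        for (int k = 1; k < n / 2; k++) {            // skip DC, use first half only
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            double mag = Math.hypot(re, im);
            if (mag > bestMag) { bestMag = mag; bestBin = k; }
        }
        return bestBin * sampleRate / n;             // bin index -> Hz
    }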

If you could just answer question number 1, I'd be very grateful, but if
someone can continue reading, that would be fantastic:

I thought I could overcome the problem above - I assumed each instrument
has its own fingerprint - that is, when a guitar plays C it may also produce
E and G at different energies - and the list of energies at each frequency is
some sort of fingerprint of the guitar.

So I designed an algorithm that:
1. Divides a song (wave file) into segments, each of less than 0.2 seconds.
I want to analyze each such segment.
2. In a certain segment, let's say A, some instruments could be playing
some notes.
3. On segment A, the algorithm uses an FFT and finds the significant
Fourier coefficients. Let's call this set of notes: notes(A).
4. I assumed that at least some of these Fourier coefficients are "real",
that is, they represent actual notes played.
5. Now, I also need to support only 3 instruments (guitar, piano,
flute)!!! So, I retrieve from a bank of samples how each of the 3
instruments plays each of notes(A):
this forms a matrix:

(let's assume notes(A) is {C,E}, so the matrix looks like this:

     Guitar_on_C | Guitar_on_E | Flute_on_C | Flute_on_E | Piano_on_C | Piano_on_E
C    from bank     from bank     from bank    from bank    from bank    from bank
E    from bank     from bank     from bank    from bank    from bank    from bank

that is, the cell [Guitar_on_C, C] contains the bank's energy at the C
frequency when C is played by a guitar; the cell [Guitar_on_C, E] contains
the energy at the E frequency when the guitar plays C, and so on.)

Then, I add a few more rows for, let's say, notes B, A#, F, G that were not
found significant - this is to make the matrix a square matrix (6 columns,
6 rows).

And finally, let's call the matrix M, and the sampled values at the
frequencies <C,E,B,A#,F,G> we shall call Y, so:
I look for the solution x of the linear equation system Mx = Y.

If all my assumptions were correct, I could deduce from the solution x the
contribution of each instrument & note to the segment I am analyzing.
I could discard those that have a very small contribution - calling it a
numerical error - and those that have a big contribution are probably really
playing.
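For what it's worth, the Mx = Y step described above is a standard 6x6 linear solve. A rough sketch under the post's own assumptions (M's columns are the bank spectra Guitar_on_C, Guitar_on_E, ..., Y is the measured spectrum of the segment; both are hypothetical placeholders here), using plain Gaussian elimination:

    // Solve M x = y for the contribution vector x, using Gaussian elimination
    // with partial pivoting. m is the 6x6 bank matrix described above and y
    // holds the segment's measured values at the 6 chosen frequencies.
    static double[] solve(double[][] m, double[] y) {
        int n = y.length;
        double[][] a = new double[n][n + 1];          // augmented matrix [M | y]
        for (int i = 0; i < n; i++) {
            System.arraycopy(m[i], 0, a[i], 0, n);
            a[i][n] = y[i];
        }
        for (int col = 0; col < n; col++) {
            int pivot = col;                          // partial pivoting
            for (int r = col + 1; r < n; r++)
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            double[] tmp = a[col]; a[col] = a[pivot]; a[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = a[r][col] / a[col][col];
                for (int c = col; c <= n; c++) a[r][c] -= f * a[col][c];
            }
        }
        double[] x = new double[n];                   // back-substitution
        for (int i = n - 1; i >= 0; i--) {
            double s = a[i][n];
            for (int c = i + 1; c < n; c++) s -= a[i][c] * x[c];
            x[i] = s / a[i][i];
        }
        return x;
    }

In practice an overdetermined system (more measured frequencies than unknowns, solved by least squares) would be more forgiving of noise than forcing the matrix to be exactly square.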

So, Question 2: ------------------------------------------
What do you think of the division made at (1) into segments of 0.2
seconds? This was an idea of a professor of mine (he isn't in the field of
DSP), in order to avoid needing to identify where each note begins and ends.
-------------------------------------------------------
Question 3:
what do you think of the algorithm - could it work?
-------------------------------------------------------
Question 4:
The algorithm is based on the idea that if I play a C note at volume 16, and
at the same time an E note at volume 17, then the Fourier transform of
this is just like the sum of 2 Fourier transforms:

1. C note on guitar at volume 16.
2. E note on guitar at volume 17.

Is this a right assumption?
-------------------------------------------------------

Thank you for all your help!
I will be extremely grateful to anyone who responds, even to only some of the
questions.


On 2 Sep, 13:46, "Shimon M" <shimon.ma...@gmail.com> wrote:
> Hello, Everyone, > > I am designing a software for a course, and it is suppose to do the > following: > 1. get a wave file > 2. analyze it - to notes and instruments that play them > 3. enable changing an instrument with another. > > Yes, I know these are all things that are under research, and that there > is no magic solution. > > Yet, since this is for a course, it doesn't have to be robust - if I can > get it to work only on a small group of samples, that would be OK.
When something is 'subject of current research', even this limited goal might be too optimistic.
> So I > chose GuitarPro (which I hope you know). It basically allows one to write > notes for different instruments and play them. After I write the music, I > export it as a MIDI file and then I use a MIDI to WAVE converter. This is > how I get my samples. > > Now, I am using an FT (fourier transform) algorithm that appears on: http://jvalentino2.tripod.com/dft/index.html (it uses a regular FT, not > even FFT)
Use the FFT. The FFT is a standard tool whose workings everybody understands.
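For reference, a minimal recursive radix-2 FFT along these lines (input length must be a power of two; the re/im arrays are overwritten with the spectrum). In practice a well-tested library is the safer choice, as suggested later in the thread, but a small version like this is enough to check results against the plain DFT:

    // Recursive radix-2 Cooley-Tukey FFT. re and im hold the real and
    // imaginary parts of the input (length a power of two) and are replaced
    // in place by the transform.
    static void fft(double[] re, double[] im) {
        int n = re.length;
        if (n == 1) return;
        double[] evenRe = new double[n / 2], evenIm = new double[n / 2];
        double[] oddRe  = new double[n / 2], oddIm  = new double[n / 2];
        for (int i = 0; i < n / 2; i++) {
            evenRe[i] = re[2 * i];     evenIm[i] = im[2 * i];
            oddRe[i]  = re[2 * i + 1]; oddIm[i]  = im[2 * i + 1];
        }
        fft(evenRe, evenIm);                           // transform the two halves
        fft(oddRe, oddIm);
        for (int k = 0; k < n / 2; k++) {
            double ang = -2.0 * Math.PI * k / n;       // twiddle factor W_n^k
            double wr = Math.cos(ang), wi = Math.sin(ang);
            double tr = wr * oddRe[k] - wi * oddIm[k];
            double ti = wr * oddIm[k] + wi * oddRe[k];
            re[k]         = evenRe[k] + tr;  im[k]         = evenIm[k] + ti;
            re[k + n / 2] = evenRe[k] - tr;  im[k + n / 2] = evenIm[k] - ti;
        }
    }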
> The major problem I am having is that when I create just a simple "C" note > played by a flute or a guitar-> then export it as MIDI -> convet it to WAVE > -> use the code above on it: I don't get 216.626 Hz as the dominant > frequency!!! Sometimes, 216.626 gets even a very small amplitude.
This can have any number of explanations. If you had used the FFT, you would have one less dubious factor to worry about. As is, it could be a problem with the FT algorithm you chose.
> The code is fairly basic and uses the theoratical FT pretty much as it is, > so i don't think that that's the problem.
Don't 'think' it's not the problem. Make sure. Use the FFT.
> So Question 1:-------------------------------------------------- > What am I doing wrong? - i expected that the "C" frequency would be one of > the leading ones, but the results I am getting seem almost random. > Is there any problem, you think, with the MIDI export or the WAVE > coversion. Maybe it is the code i am using after all?
The FT? Could well be. Or it could be that MIDI uses some format that is tailored to the human auditory system. If you had used the FFT from the start, you would have had one less uncertain factor on your list.
> Maybe there is nothing wrong - and this is the way it is supposed to be, > and if so could you give me and advice on how to somehow still understand > the note played. > -----------------------------------------------------------------
If you mean 'how the human auditory system works,' that's anybody's guess.
> If you could just answer question number 1, I'd be very greatful, but if > some can continue reading, that would be fantastic: > > I thought I could overcome the problem above - I assumed each instrument > as its own fingerprint - that is when a guitar plays C it may also play E, > G at different energies - and the list of energies for each frequency is > some sort of fingerprint of a guitar.
Again an *assumption* on your part. Don't assume. Make sure where you can; verify what remains.
> [...the algorithm and matrix as described above...] > So, Question 2: What do you think of the division made at (1) into segments of 0.2 seconds. This was an idea of a professor of mine (he isn't in the field of DSP), in order to avoid needing to identify where the note begins and ends.
It's an arbitrary choice. What's wrong with 0.25? Or 0.15?
> Question 3: > what do you think of the algorithm - could it work? > -------------------------------------------------------
Depends on what your objective is. It would probably work if the objective is to demonstrate that the spectra of different instruments that play the same notes in isolation are different. If you want to extract the contributions from each instrument to a song, you might be out of luck.
> Question 4: > The algorithm is based on the idea that if I play C note at volume 16, and > at the same time E note at volume 17 - then the fourier transform over for > this - Is just like: > > the sum of 2 fourier transfroms: > 1. C note on guitar at volume 16. > 2. E note on huitar at volume 17. > > Is this a right assumption? > -------------------------------------------------------
Sure. The DFT is linear. Rune
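A quick numerical check of that linearity, for anyone who wants to convince themselves: for any bin k, the DFT of (16*c + 17*e) equals 16*DFT(c) + 17*DFT(e). The c and e arrays here are hypothetical equal-length sample blocks of the two notes:

    // Verify DFT linearity at one bin k:
    // DFT(16c + 17e)[k] == 16*DFT(c)[k] + 17*DFT(e)[k] up to rounding.
    static double[] dftBin(double[] x, int k) {
        double re = 0.0, im = 0.0;
        for (int t = 0; t < x.length; t++) {
            double angle = 2.0 * Math.PI * k * t / x.length;
            re += x[t] * Math.cos(angle);
            im -= x[t] * Math.sin(angle);
        }
        return new double[] { re, im };
    }

    static boolean linearityHolds(double[] c, double[] e, int k) {
        double[] mix = new double[c.length];
        for (int t = 0; t < c.length; t++) mix[t] = 16.0 * c[t] + 17.0 * e[t];
        double[] a = dftBin(c, k), b = dftBin(e, k), m = dftBin(mix, k);
        double tol = 1e-6 * (1.0 + Math.abs(m[0]) + Math.abs(m[1]));   // relative tolerance
        return Math.abs(m[0] - (16.0 * a[0] + 17.0 * b[0])) < tol
            && Math.abs(m[1] - (16.0 * a[1] + 17.0 * b[1])) < tol;
    }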

Shimon M wrote:
> Hello, Everyone, > > I am designing a software for a course, and it is suppose to do the > following: > 1. get a wave file > 2. analyze it - to notes and instruments that play them > 3. enable changing an instrument with another. >
<gulp> What is the pedagogical purpose of this? What academic level? (2) (and (3) from it) is a formidable problem for polyphonic sources (see MPEG-7, Blind Source Separation etc)
>.. > The major problem I am having is that when I create just a simple "C" note > played by a flute or a guitar-> then export it as MIDI -> convet it to WAVE > -> use the code above on it: I don't get 216.626 Hz as the dominant > frequency!!! Sometimes, 216.626 gets even a very small amplitude. >
Hope that's a typo - it should be 261.626. Taking the lowest partial of a tone as the fundamental pitch works a lot of the time, but not all the time. Many oboe notes, for example, have a 2nd harmonic stronger than the fundamental.

No time to answer everything else just now; but I wonder why you feel the need to design all this yourself. How much time do you have? The new academic year has virtually started already. For example, does this software not do almost all of what you need (and it is GPL too)? http://www.sonicvisualiser.org

Or even the new Melodyne, which claims polyphonic note separation among other things (I don't have it and haven't tested it).

Richard Dobson
Thank you both for your comments.

>Rune said: >Use the FFT. The FFT is a standard tool that everybody >understand how works. >
Thank you. I just used the FFT on www.fftcalculator.com and I got similar results :(
However, the way I produce data from a wave file is AudioInputStream and SourceDataLine - they produce an array of int(s)!!! So, the data I performed the FFT on looks like this: 10503 45207 15203 310752 ... Is this the way it is supposed to be? Maybe this is part of the problem?

>If you mean 'how the human auditory system works,' that's >anybody's guess. >

No, I meant: how do I deduce the note played from the FFT results, when the FFT as I see it doesn't give me any real information - because the frequency I expected to have a high value has only an average value.
>Again an *assumption* on your part. Don't assume. Make sure >where you can; verify what remains. >
You are correct. I meant the word "assumption" as a general "what should be" in terms of theory. But as we all know - theory != reality. So in theory, a guitar's C note differs from a piano's C note because of its other by-product notes.
>It's an arbitrary choise. What's wrong with 0.25? Or 0.15?
There is nothing wrong - I am supposed to find an optimal size via trial and error, which I will do as soon as the above parts work. :)
>Sure. The DFT is linear.
:) Yes, but is music linear? I assumed it was, because music is a wave, and two waves with the same phase and the same period add in amplitude.

Thank you very much for your comments, Rune.
-------------------------------
>Richard Dobson wrote: >What is the pedagogical purpose of this? What academic level? (2) (and >(3) from it) is a formidable problem for polyphonic sources (see MPEG-7,
>Blind Source Separation etc)
I am studying for a BSc in CS, and this is my final project to submit. It is due in a few weeks. I can use some library classes, but I need to come up with algorithms myself and implement them myself.
>Hope that's a typo - should be 261.626. Taking the lowest partial of a >tone as the fundamental pitch works a lot of the time, but not all the >tiume. Many oboe notes for example have a 2nd harmonic stronger than the
>fundamental. >
Yes, typo. Thank you. Other than that, the results I am getting through the FFT don't indicate that 261.626 has any importance! Some frequencies get amplitude 1500 and 261.626 sometimes gets only about 500! So it is not the first and not even the second!

Thank you for your comments, Mr. Dobson.
---------------
Thank you both.
Shimon M wrote:
> Thank you both for your comments. > >> Rune said: >> Use the FFT. The FFT is a standard tool that everybody >> understand how works. >> > > Thank you. I just used the FFT on www.fftcalculator.com > I got similar results :(
Such things are worse than useless for serious work of the kind you are contemplating. That work requires a deep understanding of how the FFT works, and in particular the significance of (a) the sample rate of the audio being analysed and (b) the minimum FFT length (window size) needed to resolve low fundamentals and close frequency components.

Go here first: http://www.dspdimension.com/admin/dft-a-pied

Then go here and read as much as you can: http://ccrma.stanford.edu/~jos/sasp
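To put numbers on point (b): the bin spacing of an N-point FFT is sampleRate / N, so separating C4 (261.63 Hz) from its neighbour C#4 (277.18 Hz), about 15.5 Hz apart, at 44100 Hz needs N of at least roughly 2840, i.e. 4096 as the next power of two, and a window in practice pushes that higher still. A back-of-envelope sketch:

    // Smallest power-of-two FFT length whose bin spacing (sampleRate / N)
    // is finer than the frequency difference we need to resolve. A window's
    // main-lobe width means the practical length is larger again.
    static int minFftLength(double sampleRate, double deltaHz) {
        int n = 1;
        while (sampleRate / n > deltaHz) n <<= 1;
        return n;
    }

    // minFftLength(44100.0, 277.18 - 261.63) -> 4096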
> However, the way I produce data from a wave file is AudioInputStream and > SourceDataLine - they produce an array of int(s)!!!
Unfortunate it is you appear to be dependent on using Java. <flamebait> Don't use Java. </flamebait> There is a huge amount of free high-quality audio processing code on the net, none of it written in Java. It is all C or C++ (mostly C).

The site you cited with a Java FFT is at the very least suspect, as it talks variously about "an array of bytes" but then also discusses endianness (not very well!) and casually mentions that audio data comes in floats (well, it may do in Java, but real audio files come in all sorts of formats). The most usual cause of pain in processing audio is mixing up bytes, ints, floats and so forth. ...
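Since the byte/int/float confusion is the likeliest source of garbage in, here is a sketch of that decoding step using the same Java classes mentioned earlier in the thread, for the common case of 16-bit signed little-endian mono PCM (check the AudioFormat before trusting these assumptions; readAllBytes() needs Java 9+):

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;
    import java.io.File;

    // Decode a 16-bit signed little-endian mono WAVE file into doubles in
    // [-1, 1]. This normalized array - not the raw bytes or unscaled ints -
    // is what the DFT/FFT should be fed.
    static double[] readMono16(File wav) throws Exception {
        AudioInputStream in = AudioSystem.getAudioInputStream(wav);
        AudioFormat fmt = in.getFormat();
        if (fmt.getSampleSizeInBits() != 16 || fmt.getChannels() != 1 || fmt.isBigEndian())
            throw new IllegalArgumentException("expected 16-bit little-endian mono PCM");
        byte[] raw = in.readAllBytes();
        double[] samples = new double[raw.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = raw[2 * i] & 0xFF;              // low byte, treated as unsigned
            int hi = raw[2 * i + 1];                 // high byte keeps the sign
            samples[i] = ((hi << 8) | lo) / 32768.0;
        }
        in.close();
        return samples;
    }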
>> Richard Dobson wrote: >> What is the pedagogical purpose of this? What academic level? (2) (and >> (3) from it) is a formidable problem for polyphonic sources (see MPEG-7, > >> Blind Source Separation etc) > > I am studying for a BSC in CS, and this is my final project to submit. it > is due in a few weeks. I can use some library classes, but I need to come > up with algorithms myself and inmplement them myself. >
In that case, you are up against it. Abandon the idea of separating polyphonic sources in one soundfile. One useful thing you just ~might~ be able to do in the time is stick to analyses of single instrument tones, and characterize them in terms of their ~time-varying~ spectra (which of course requires your FFT to work). You need much finer resolution than frames every 0.2 seconds. Try every 25 msecs.

Real musical instruments do not have a static spectrum that you can simply characterize in terms of strengths of harmonics. The guitar is the most obvious example - a harmonically complex attack (almost chaotic indeed), with a very rapidly decaying envelope, with high frequencies decaying faster than low ones (as is the usual case). This is therefore an elementary exercise in ~partial tracking~ (see the Julius Smith ref above). The C++ CLAM library and tools support this and much else besides: http://clam-project.org

Once you have an FFT working (use a good existing library for that!), you might consider creating a sonogram display (spectrum/time). Try the one in Audacity (www.audacity.org) to see what this is. The different characters of guitar v piano v flute etc. will show up very well. And it will help you empirically understand the relationship between FFT size, sample rate, and frequency resolution. And the use of windowing.

Also look at Csound (www.csounds.com) - which has tools for partial tracking and all manner of spectral analysis, including basic FFTs.

But I fear that if the FFT is as opaque to you as your comments suggest, you may have approached this project far too late in the day. One basic issue is that different FFT implementations scale amplitudes in different ways - you need to find out what those ways are for the FFT you are using.

Richard Dobson
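A sketch of the framing step being suggested, hopping roughly every 25 ms and applying a Hann window to each frame before analysis (the frame and hop sizes here are illustrative, not prescribed values):

    // Split decoded mono samples into overlapping Hann-windowed frames,
    // hopping about 25 ms at a time. Each returned frame can then be fed to
    // an FFT to build a time-varying spectrum (sonogram).
    static double[][] hannFrames(double[] samples, double sampleRate) {
        int frameLen = 2048;                                // ~46 ms at 44.1 kHz
        int hop = (int) Math.round(0.025 * sampleRate);     // ~25 ms hop
        int count = Math.max(0, (samples.length - frameLen) / hop + 1);
        double[][] frames = new double[count][frameLen];
        for (int f = 0; f < count; f++) {
            for (int i = 0; i < frameLen; i++) {
                double hann = 0.5 - 0.5 * Math.cos(2.0 * Math.PI * i / (frameLen - 1));
                frames[f][i] = samples[f * hop + i] * hann;
            }
        }
        return frames;
    }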
Thank you for the resources. Also:
The algorithm we had to implement for the project appears on page 52
(pseudo code on page 54):
http://www.math.ias.edu/~akavia/AkaviaPhDThesis.pdf

If you could just glance at it and tell me if it is good. It is called an
SFT (short time Fourier transform).

>Unfortunate it is you appear to be dependent on using Java.
I guess I could try to incorporate C code with Java.
>The site you cited with a java FFT is at the very least suspect, as it >talks variously about "an array of bytes" but then also discusses >Endianness (not very well!) and casually mentions that audio data come >is floats (well it may do in java, but real audio files come in all >sorts of formats). The most usual cause of pain in processing audio is >mixing up bytes, ints, floats and so forth.
>resolution than frames every 0.2 seconds. Try every 25msecs.
thank you.
> >Real musical instruments do not have a static spectrum that you can >simply characterize in terms of strengths of harmonics. the guitar is >the most obvious example - harmonically complex attack (almost chaotic >indeed), with a very rapidy decaying envelope, with high frequencies >decaying faster than low ones (as it the usual case). > >This is therefore an elementary exercise in ~partial tracking~ (see the >Julius Smith ref above) . The C++ CLAM library and tools support this >and much else besides: >
I understand that a guitar has a dynamic spectrum - but if I perform an FFT (or SFT) on a basic sample of a guitar playing C, will it not somehow resemble the FFT I will later run on a file that plays the same guitar C? So, I understand it is not perfect - but will it be approximately the best match out of, let's say, 2 other instruments and their C "fingerprints"? That is all I need.
Shimon M wrote:



> I understand that a guitar has a dynamic spectrum - but if I preform FFT > )or SFT) on a basic sample of a guitar playing C, will it not somehow > resemble to the FFT I will later use on a file that plays the same guitar > C? So, I understand it is not perfect - but will it approximatley be most > matching out of, let's say, 2 other instruments and their C "fingerprints"? > That is all I need.
What kind of guitar? Acoustic will differ markedly from electric, especially with the kinds of amplifier treatment common in rock. A sustained tone won't resemble the plucked sound at all. You don't need an FFT to know that. Your ears can tell you. Listen before forming a theory.

Jerry
--
Engineering is the art of making what you want from things you can get.
Shimon M wrote:
> Thank you for the resources - Also, > The algorithm we had to implement for the project appears in page 52 > (psudeo code at page 54) > http://www.math.ias.edu/~akavia/AkaviaPhDThesis.pdf > > If you could just glance at it and tell me if it is good. It is called a > SFT (Short time fourier transfrom). >
That is usually abbreviated to STFT (and it is very relevant to your project, e.g. to make spectrograms/sonograms). But that isn't what the paper describes. They call it a "Significant Fourier Transform", in effect a faster, data-reduced version of the FFT in the context of data encryption. Now, there may be an advanced research project in there, to see if their "SFT" has any special application for audio (I have no idea), but that probably needs rather more than two weeks. Just use the "bog-standard" FFT.

If (from what you say) you already have a working implementation of their SFT, then you will have to demonstrate that it meets the requirements of the project, and justify that choice other than on the basis that it was what you had lying around at the time.
..
> I understand that a guitar has a dynamic spectrum - but if I preform FFT > )or SFT) on a basic sample of a guitar playing C, will it not somehow > resemble to the FFT I will later use on a file that plays the same guitar > C? So, I understand it is not perfect - but will it approximatley be most > matching out of, let's say, 2 other instruments and their C "fingerprints"? > That is all I need.
Instrument notes can be quite long (and electric guitar ones especially so!). Are you proposing to analyse a brief moment from the whole (50 msecs?), or take a single large FFT of the whole file?

There are standard techniques for describing static spectra - such as the spectral centroid (q.v.), a weighted average of the frequencies (where the amplitudes are the weights) - which gives a general sense of the brightness or otherwise of a sound. In essence: if you nailed the spectrum loosely to a wall at its midpoint, would it topple to the left or to the right, or just balance exactly?

If your sounds are sufficiently timbrally distinct, you should be able to, well, distinguish them. But that is not really much of a project. You are testing a technique dependent on having strongly dissimilar sources, to demonstrate that you can distinguish dissimilar sources! You are looking at a basic level of timbre classification. Given the constraints above it will probably work. Not sure how many brownie points you could reasonably get for it though.

What would be ~interesting~ would be to be able to tell what guitar string was being used to generate a given note "C" (or whichever). Middle C is a relatively high note for a guitar - by my reckoning at least five strings can be used to play it - they will be similar but different. Assuming all your FFTs are of the same length, you could extract the (static) spectral envelope (as used to identify vocal formants, for example), and simply find their differences, and then determine how useful that is in distinguishing one from another.

Music researchers use the idea of a "timbre space" (usually illustrated on 3 axes, but I think higher dimensions are used too; MPEG-7 defines 17 primary sound descriptors); if you can classify your test sounds in terms of such a timbre space, then you have the basis for a useful project.

Richard Dobson
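The spectral centroid mentioned above is a one-liner once magnitudes are available. A sketch, assuming `mags` holds one frame's FFT bin magnitudes up to Nyquist and binHz is sampleRate / fftLength:

    // Magnitude-weighted average of bin frequencies: a rough measure of
    // spectral "brightness". Returns 0 for a silent frame.
    static double spectralCentroid(double[] mags, double binHz) {
        double weighted = 0.0, total = 0.0;
        for (int k = 0; k < mags.length; k++) {
            weighted += k * binHz * mags[k];
            total += mags[k];
        }
        return total > 0.0 ? weighted / total : 0.0;
    }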
> Richard Dobson wrote: > Once you have an FFT working (use a good existing library for that!), > you might consider creating a sonogram display (spectrum/time). Try the
> one in Audacity (www.audacity.org) to see what this is. The different > characters of guitar v piano v flute etc will show up very well. And it
> will help you empirically understand about the relationship between FFT
> size, sample rate, and freqency resolution. And the use of Windowing.
> Also look at Csound (www.csounds.com) - which has tools for partial > tracking and all manner of spectral analysis, including basic FFTs.
I read your "FFT in a day" and also tried the above links - but I was still baffled. I tried Audacity's program on a wave file produced like this: a recurring C4 (261.626 Hz), and then tried Analyze > Plot spectrum and got: 255.706787 -54.295128 258.398438 -48.651611 261.090088 -40.968891 (no apparent importance) 263.781738 -39.088898 266.473389 -44.641678 269.165039 -54.939251 but then - I tried changing the axis to "log frequency" and got: apperant peaks at 130 and 261 - so I am stasified. So, I conclude the problem is this "log axis" - I shall now research, and try to incorporate it with the SFT or FFT, implement it myself and proceed with my other tasks in the project. Thank you all. You've been a great help.
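For comparing values like those above, it helps to remember that Plot Spectrum shows decibels, where less negative means more energy. A rough sketch of the conversion from linear FFT magnitudes (the reference level and normalization here are arbitrary, so the absolute numbers will not match Audacity's exactly):

    // Convert linear magnitudes to decibels relative to a chosen reference.
    // Differences of a few dB (e.g. -39.1 vs -41.0) are small; real peaks
    // stand out by tens of dB.
    static double[] toDecibels(double[] mags, double reference) {
        double[] db = new double[mags.length];
        for (int k = 0; k < mags.length; k++) {
            double m = Math.max(mags[k], 1e-12);     // avoid log10(0)
            db[k] = 20.0 * Math.log10(m / reference);
        }
        return db;
    }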