
Frequency Analysis parameters

Started by Matti Lamprhey June 18, 2004
What are the parameters which are required to completely specify the
frequency analysis of an audio stream?

I'm assuming they are roughly as follows:
1. incoming sampling rate
2. incoming sampling bitlength
3. floor frequency
4. ceiling frequency
5. number of bins
6. no of samples per frame
7. sample displacement between frames (overlap)

Thanks,

Matti
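To make that parameter list concrete, here is a minimal sketch of how it might be gathered into a configuration structure, together with the quantities that follow from those choices rather than being independent. The struct and function names are purely illustrative, not from any particular library:

// Hypothetical container for the parameters listed above.
struct AnalysisConfig {
    double sampleRate;   // 1. incoming sampling rate, Hz
    int    wordLength;   // 2. incoming sample word length, bits
    double floorHz;      // 3. lowest frequency of interest
    double ceilingHz;    // 4. highest frequency of interest
    int    numBins;      // 5. number of frequency bins
    int    frameSize;    // 6. samples per analysis frame
    int    hopSize;      // 7. sample displacement between frames
};

// Derived quantities -- note they are not free choices once the others are
// fixed.  For a plain DFT of a real signal, numBins would typically be
// frameSize / 2 + 1 rather than an independent parameter.
double binSpacingHz(const AnalysisConfig& c)    { return c.sampleRate / c.frameSize; }
double frameLengthSec(const AnalysisConfig& c)  { return c.frameSize / c.sampleRate; }
double overlapFraction(const AnalysisConfig& c) { return 1.0 - double(c.hopSize) / c.frameSize; }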


How about whether the detection for each band is peak- or RMS-based?  (And the
RMS response/ballistics.)  I don't know if that's something common or not, but
it came to mind.

"Matti Lamprhey" <matti-nospam@totally-official.com> wrote in message
news:2jh91uF10sdmpU1@uni-berlin.de...
> What are the parameters which are required to completely specify the
> frequency analysis of an audio stream?
>
> I'm assuming they are roughly as follows:
> 1. incoming sampling rate
> 2. incoming sampling bitlength
> 3. floor frequency
> 4. ceiling frequency
> 5. number of bins
> 6. no of samples per frame
> 7. sample displacement between frames (overlap)
>
> Thanks,
>
> Matti
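Regarding the peak- versus RMS-based detection mentioned above, a minimal sketch of the two detector types; the smoothing ("ballistics") constants and all names are made up purely for illustration:

#include <algorithm>
#include <cmath>

// Per-band level detector: an instantaneous-peak estimate with an
// exponential release, alongside a smoothed mean-square (RMS) estimate.
struct BandDetector {
    double peak = 0.0;
    double meanSquare = 0.0;
    double release = 0.999;   // peak decay factor per sample (illustrative)
    double alpha   = 0.01;    // RMS smoothing coefficient per sample (illustrative)

    void process(double x) {
        peak = std::max(std::abs(x), peak * release);   // peak ballistics
        meanSquare += alpha * (x * x - meanSquare);     // RMS ballistics
    }
    double peakLevel() const { return peak; }
    double rmsLevel()  const { return std::sqrt(meanSquare); }
};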
What do you mean by "completely specify"? Are you talking about the
Fourier transform (you mention "bins"), asking for the bin spacing to
guarantee perfect reconstruction on an inverse Fourier transform?

If that's the case, my DFT à Pied article might be of help. You can
find it at http://www.dspdimension.com.

--smb

"Matti Lamprhey" <matti-nospam@totally-official.com> wrote in message news:<2jh91uF10sdmpU1@uni-berlin.de>...
> What are the parameters which are required to completely specify the
> frequency analysis of an audio stream?
>
> I'm assuming they are roughly as follows:
> 1. incoming sampling rate
> 2. incoming sampling bitlength
> 3. floor frequency
> 4. ceiling frequency
> 5. number of bins
> 6. no of samples per frame
> 7. sample displacement between frames (overlap)
>
> Thanks,
>
> Matti
"Matti Lamprhey" <matti-nospam@totally-official.com> wrote in message
news:2jh91uF10sdmpU1@uni-berlin.de...
> What are the parameters which are required to completely specify the
> frequency analysis of an audio stream?
>
> I'm assuming they are roughly as follows:
> 1. incoming sampling rate
> 2. incoming sampling bitlength
> 3. floor frequency
> 4. ceiling frequency
> 5. number of bins
> 6. no of samples per frame
> 7. sample displacement between frames (overlap)
Maybe I'm being picky, but "the frequency analysis of an audio stream" is
whatever you define the analysis to be. Only after you define the analysis
you have in mind can you specify what parameters are needed. So maybe you
have one kind of analysis in mind and I have another...

For example, you have "number of bins" on the list and you have "samples
per frame", but you don't have "frame length in time or samples", which is
directly related to the number of bins (in frequency).

Bits have a length of 1; I think you mean word length. If that's important
then you might also want to mention the type of representation: fixed
point, floating point, IEEE, integer, etc.

It would help if you told us what you're trying to accomplish. Are you
writing a specification for something?

Fred
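A quick worked example of the relationship Fred points out, with illustrative numbers (44.1 kHz and 1024 samples are just common choices, not anything specified in the thread):

// At 44100 Hz, a 1024-sample frame lasts 1024 / 44100 ~= 23.2 ms, and the
// DFT bins are spaced 44100 / 1024 ~= 43.1 Hz apart -- i.e. the frequency
// resolution is the reciprocal of the frame length in time.
const double sampleRate = 44100.0;                    // Hz
const int    frameSize  = 1024;                       // samples per frame
const double frameSec   = frameSize / sampleRate;     // ~0.0232 s
const double binSpacing = sampleRate / frameSize;     // ~43.07 Hz == 1.0 / frameSec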
"Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote...
> "Matti Lamprhey" <matti-nospam@totally-official.com> wrote... > > What are the parameters which are required to completely specify the > > frequency analysis of an audio stream? > > > > I'm assuming they are roughly as follows: > > 1. incoming sampling rate > > 2. incoming sampling bitlength > > 3. floor frequency > > 4. ceiling frequency > > 5. number of bins > > 6. no of samples per frame > > 7. sample displacement between frames (overlap) > > > > Maybe I'm being picky but .... "the frequency analysis of an audio > stream" is whatever you define the analysis to be. Only after you > define the analysis you have in mind, then can you specify what > parameters are needed. So, maybe you have one kind of analysis > in mind and I have another...... > > For example, you have "number of bins" on the list and you have > "samples per frame" but you don't have "frame length in time or > samples" which is directly related to number of bins (in frequency). > > Bits have a length of 1. I think you mean word length. If that's > important then you might also want to mention the type of > representation: fixed point, floating point, IEEE, integer, etc. > > It would help if you told us what you're trying to accomplish. Are > you writing a specification for something?
Thanks to you and the others for your responses -- I've been trying to
educate myself starting from the link provided by Stephan.

Does "frame length in time or samples" have a special meaning? I would have
thought the length in samples was my "samples per frame", and length in time
was available using the sample rate. But one of my problems here is
unfamiliarity with the terminology.

I need to take an audio stream and manipulate it in real time. This will
involve analysing it into frequency subranges, manipulating the amplitude
values in a particular way, then resynthesizing. The manipulated stream must
be indistinguishable from the original and minimally delayed. I'm trying to
understand all the configurable parameters involved in this process, and the
nature of the data I will need to manipulate between the analysis and the
resynthesis stages; for example, will I be presented with a phase value as
well as an amplitude value? I'm a programmer, but with no experience of DSP
and trying to learn fast!

Matti
Hi Matti,

"Matti Lamprhey" wrote:
> Thanks to you and the others for your responses -- I've been trying to
> educate myself starting from the link provided by Stephan.
>
> Does "frame length in time or samples" have a special meaning? I would
> have thought the length in samples was my "samples per frame", and
> length in time was available using the sample rate. But one of my
> problems here is unfamiliarity with the terminology.
In a nutshell: under realtime constraints you don't have your entire signal
available in advance, so you use something called the "short time Fourier
transform" (STFT) for the purpose you've mentioned. (Be aware that there are
other transforms whose output is also related to the concept of frequency,
but let's assume you want the Fourier transform for now.)

You chop the incoming signal into handy (overlapping) "analysis frames",
spaced at a certain "stride". That stride defines how much they overlap, and
hence influences the I/O latency (provided the target machine is fast enough
to do the transform in realtime). The analysis frame size influences the
frequency resolution and, reciprocally, the time localization: larger
analysis frames give better frequency resolution, but over a longer time
period. The latter is explained on my web site at the link I've provided, so
I won't repeat it here.

Usually you apply a "windowing function" to the data in each chunk to
prevent the discontinuities at the borders from "cluttering" your data, at
the expense of some resolution.

Each STFT frame then gives you a complex-valued output for each bin, which
you can easily convert into instantaneous magnitude and phase through the
Cartesian -> polar conversion that I also explain on my web site. You can
mess with these values as you wish, then do the reverse to get from your
Fourier transform back to the time domain again. Finally, you string the
individual chunks together (there are different recipes for that, called
overlap-add and overlap-save) and there you go with your modified data.

There's much more to it than what I've just said, but that should help to
get you started. Take a look at my smbPitchShift() function at
http://www.dspdimension.com/src/smbPitchShift.cpp -- it does all of this and
more (it also messes with the magnitudes and phases), so you should be able
to use it for your purpose with little modification.

--smb
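To tie the steps of that recipe together, here is a deliberately naive, self-contained sketch of one analysis frame going through the window -> transform -> Cartesian-to-polar -> modify -> polar-to-Cartesian -> inverse transform -> overlap-add chain. The O(N^2) DFT and the Hann window are placeholders chosen only so the snippet stands alone; a realtime implementation would use an FFT, as smbPitchShift.cpp does, and all function names here are illustrative:

#include <vector>
#include <complex>
#include <cmath>

static const double PI = 3.14159265358979323846;

// Naive forward DFT of a real frame (clarity only; use an FFT in practice).
std::vector<std::complex<double>> dft(const std::vector<double>& x) {
    const size_t N = x.size();
    std::vector<std::complex<double>> X(N);
    for (size_t k = 0; k < N; ++k)
        for (size_t n = 0; n < N; ++n)
            X[k] += x[n] * std::polar(1.0, -2.0 * PI * double(k * n) / double(N));
    return X;
}

// Naive inverse DFT, returning the real part of the reconstruction.
std::vector<double> idft(const std::vector<std::complex<double>>& X) {
    const size_t N = X.size();
    std::vector<double> x(N, 0.0);
    for (size_t n = 0; n < N; ++n) {
        std::complex<double> acc(0.0, 0.0);
        for (size_t k = 0; k < N; ++k)
            acc += X[k] * std::polar(1.0, 2.0 * PI * double(k * n) / double(N));
        x[n] = acc.real() / double(N);
    }
    return x;
}

// Process one analysis frame that starts at sample 'pos' and overlap-add the
// result into 'out'.  The caller advances 'pos' by the hop size (the
// "stride") between calls; with a Hann window and 50% overlap the
// overlapping windows sum to a constant, so the frames splice back together.
void processFrame(const std::vector<double>& in, std::vector<double>& out,
                  size_t pos, size_t frameSize) {
    // Hann analysis window suppresses the edge discontinuities.
    std::vector<double> frame(frameSize);
    for (size_t n = 0; n < frameSize; ++n)
        frame[n] = in[pos + n] * 0.5 * (1.0 - std::cos(2.0 * PI * double(n) / double(frameSize)));

    std::vector<std::complex<double>> bins = dft(frame);

    // Cartesian -> polar: per-bin magnitude and phase, free to be modified.
    for (std::complex<double>& b : bins) {
        double mag   = std::abs(b);
        double phase = std::arg(b);
        // ... modify mag (and possibly phase) here ...
        b = std::polar(mag, phase);      // polar -> Cartesian again
    }

    // Back to the time domain, then overlap-add into the output stream.
    std::vector<double> synth = idft(bins);
    for (size_t n = 0; n < frameSize; ++n)
        out[pos + n] += synth[n];
}

Note that the I/O latency of such a scheme is governed by the frame size and the hop: a given output sample cannot be produced until every frame that contributes to it has been received and processed.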
"Matti Lamprhey" <matti-nospam@totally-official.com> wrote in message
news:2jl1kuF12qbhuU1@uni-berlin.de...
> Thanks to you and the others for your responses -- I've been trying to
> educate myself starting from the link provided by Stephan.
>
> Does "frame length in time or samples" have a special meaning? I would
> have thought the length in samples was my "samples per frame", and
> length in time was available using the sample rate. But one of my
> problems here is unfamiliarity with the terminology.
>
> I need to take an audio stream and manipulate it in real time. This
> will involve analysing it into frequency subranges, manipulating the
> amplitude values in a particular way, then resynthesizing. The
> manipulated stream must be indistinguishable from the original and
> minimally delayed. I'm trying to understand all the configurable
> parameters involved in this process, and the nature of the data I will
> need to manipulate between the analysis and the resynthesis stages;
> for example, will I be presented with a phase value as well as an
> amplitude value? I'm a programmer, but with no experience of DSP and
> trying to learn fast!
Matti,

Stephan has provided a good answer with suggestions about where to look for
guidance.

The frame length in time determines the spectral resolution -- so it's an
important parameter. Yes, you could calculate it from the sample rate and
the number of samples, but that might obscure this fundamental identity;
thus my comment, to help you understand what you're dealing with. The
spectral resolution is the reciprocal of the frame length in time, just as
the temporal resolution (the sample interval) is the reciprocal of the
sample rate.

Good that you gave us more information on your objective. Now you have
folks who understand that particular area better.

As Stephan mentioned, you will end up with complex values when you do the
Fourier Transform / Discrete Fourier Transform / Fast (Discrete) Fourier
Transform / FFT. Complex values can be converted to the polar version:
amplitude and phase. If for some reason you are going to deal with phase
directly, then you may have to deal with phase wrap-around, because phase
is only calculable as an angle modulo 2*pi. Going back from polar to
rectangular representation often forces some kind of effort to remove the
discontinuities. And, in order to inverse FFT / IFFT, I believe you have to
have the rectangular / complex representation. That is, I don't know of an
algorithm that will do the IFFT directly from amplitude and phase numbers.

"Indistinguishable" in what way?

Fred
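To illustrate Fred's last two points, a minimal sketch assuming std::complex bins and whatever inverse-FFT routine ends up being used: magnitude/phase pairs are converted back to rectangular form before the IFFT, and any phase arithmetic (e.g. differencing phases across frames) has to be wrapped back into the principal range. All names here are illustrative:

#include <vector>
#include <complex>

static const double TWO_PI = 6.28318530717958647692;

// Convert per-bin magnitude/phase back to the rectangular (complex) form
// an inverse FFT routine expects.  std::polar does exactly
// re = mag*cos(phase), im = mag*sin(phase).
std::vector<std::complex<double>>
toRectangular(const std::vector<double>& mag, const std::vector<double>& phase) {
    std::vector<std::complex<double>> bins(mag.size());
    for (size_t k = 0; k < mag.size(); ++k)
        bins[k] = std::polar(mag[k], phase[k]);
    return bins;   // feed this to the inverse FFT
}

// Phase is only known modulo 2*pi; wrap a phase (or phase difference)
// back into the range (-pi, pi].
double wrapPhase(double p) {
    while (p >  TWO_PI / 2.0) p -= TWO_PI;
    while (p < -TWO_PI / 2.0) p += TWO_PI;
    return p;
}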
"Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote...
> "Matti Lamprhey" <matti-nospam@totally-official.com> wrote... > > [...] > > I need to take an audio stream and manipulate it in real time. This > > will involve analysing it into frequency subranges, manipulating the > > amplitude values in a particular way, then resynthesizing. The > > manipulated stream must be indistinguishable from the original and > > minimally delayed.
> > "Indistinguishable" in what way?
They should sound the same to the human ear/brain.

Matti
"Stephan M. Bernsee" <stephan.bernsee@web.de> wrote...
> Hi Matti,
>
> [snip v. helpful chunk]
>
> There's much more to it than what I've just said, but that should help
> to get you started. Take a look at my smbPitchShift() function at
> http://www.dspdimension.com/src/smbPitchShift.cpp -- it does all of this
> and more (it also messes with the magnitudes and phases), so you should
> be able to use it for your purpose with little modification.
Thank you very much, Stephan -- I'll do so immediately and avidly!

Matti