DSPRelated.com
Forums

Audio FFT problem - PLEASE HELP

Started by cppt...@yahoo.com April 21, 2008
Could some DSP guru please help me a bit? I am creating an FFT-based
application that uses an audio input. The sampling details are as
follows:
encoding : PCM
sampling frequency : 16000 Hz
resolution : 16 bits
channel : mono
signed : true
endianness : little (as this runs on an Intel processor)

The raw bytes get collected in a byte array, and I take two bytes at a
time (resolution is 16 bits) and correct for wrap-around in the first
byte and get the corresponding floating point number which is the
sampled value. The buffer is 2048 bytes long.

When I examine the raw numbers, I find a long sub-sequence of 0s at
the start, before I start seeing the sinusoidal pattern in the sampled
numbers.

I use a time shifted Gaussian window to filter the data before I
compute the FFT. When I examine the array containing the FFT
magnitudes, I find some extra peaks at the end of the array, i.e., the
spectrum does not look symmetric as it should.

I am not sure where the extra peaks are coming from. I am fairly
confident that my FFT routine is working, as I have tested it out with
hand-generated sample data, and the output spectrum has a very
symmetric structure in the peaks.
Any hints or suggestions would be immensely helpful. Thanks in advance
for your help.
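
[Editorial sketch, not the poster's code.] For a real-valued input, the DFT satisfies X[N-k] = conj(X[k]), so the magnitude array must mirror around N/2. Peaks near the end of the array are expected, but only as mirror images of peaks near the start. The minimal Java sketch below checks that property with a naive DFT. The class and method names (`DftSymmetryCheck`, `magnitudes`, `maxAsymmetry`) are made up for illustration, and the naive O(N^2) transform merely stands in for whatever FFT routine is actually used.

```java
// Naive O(N^2) DFT, for illustration only; this is not the poster's FFT routine.
final class DftSymmetryCheck {

    /** Returns |X[k]| for k = 0..N-1 of a real-valued input frame. */
    static double[] magnitudes(double[] x) {
        int n = x.length;
        double[] mag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = -2.0 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im += x[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    /** Largest difference between |X[k]| and |X[N-k]| over k = 1..N-1. */
    static double maxAsymmetry(double[] mag) {
        int n = mag.length;
        double worst = 0.0;
        for (int k = 1; k < n; k++) {
            worst = Math.max(worst, Math.abs(mag[k] - mag[n - k]));
        }
        return worst;
    }

    public static void main(String[] args) {
        int n = 64;
        double[] x = new double[n];
        for (int t = 0; t < n; t++) {
            x[t] = Math.sin(2.0 * Math.PI * 5 * t / n); // exactly 5 cycles per frame
        }
        double[] mag = magnitudes(x);
        System.out.println("|X[5]| = " + mag[5] + ", |X[59]| = " + mag[59]);
        System.out.println("max asymmetry = " + maxAsymmetry(mag));
    }
}
```

Running the same `maxAsymmetry` check on the magnitudes from the real capture would show whether the mirror property really fails. If it does, the data fed to the transform or the indexing of its output is the place to look, since purely real input cannot produce an asymmetric magnitude spectrum.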
cpptutor2000@yahoo.com wrote:
> Could some DSP guru please help me a bit? I am creating an FFT-based
> application that uses an audio input. The sampling details are as
> follows:
> encoding : PCM
> sampling frequency : 16000 Hz
> resolution : 16 bits
> channel : mono
> signed : true
> endianness : little (as this runs on an Intel processor)
>
> The raw bytes get collected in a byte array, and I take two bytes at a
> time (resolution is 16 bits) and correct for wrap-around in the first
> byte and get the corresponding floating point number which is the
> sampled value. The buffer is 2048 bytes long.
Wraparound? How so? Where do the bytes come from? Does the converter give you half a sample at a time?
> When I examine the raw numbers, I find a long sub-sequence of 0s at
> the start, before I start seeing the sinusoidal pattern in the sampled
> numbers.
How did the zeros get there? Again, where do the bytes come from?
> I use a time shifted Gaussian window to filter the data before I
> compute the FFT.
Why?
> When I examine the array containing the FFT
> magnitudes, I find some extra peaks at the end of the array, i.e., the
> spectrum does not look symmetric as it should.
>
> I am not sure where the extra peaks are coming from. I am fairly
> confident that my FFT routine is working, as I have tested it out with
> hand-generated sample data, and the output spectrum has a very
> symmetric structure in the peaks.
> Any hints or suggestions would be immensely helpful. Thanks in advance
> for your help.
Some of what you leave unsaid may hold the key.

Jerry
--
Engineering is the art of making what you want from things you can get.
On Apr 21, 10:30 am, Jerry Avins <j...@ieee.org> wrote:

> Wraparound? How so? Where do the bytes come from? Does the converter
> give you half a sample at a time?
Since I am using 16-bit resolution in the A-D conversion, I read 2 bytes at a time from the array of raw bytes and convert them to a signed short. I am using little-endian format, as that is what my computer allows. So, if the low-order byte is less than 0, I add 256 to it. Then each sampled value is:

    low_order_byte | high_order_byte << 8
> > When I examine the raw numbers, I find a long sub-sequence of 0s at
> > the start, before I start seeing the sinusoidal pattern in the sampled
> > numbers.
>
> How did the zeros get there? Again, where do the bytes come from?
That is the problem. The bytes are whatever are being given by the A-D converter.
> > I use a time shifted Gaussian window to filter the data before I
> > compute the FFT.
>
> Why?
The window is used to remove discontinuities at the start and end of the array of sampled numbers. As far as I recall, this is a very common method. Also, most windows, in one way or another, attempt to replicate a Gaussian centered at the origin (maximum at 0). However, if we take the zero of time to be the instant the sampling starts, then the window has to be time shifted so that its filtering action is correct, i.e., the start and end of the window values correspond to the start and end of the sampled values.
> Some of what you leave unsaid may hold the key.
These are all the details that I can think of.
cpptutor2000@yahoo.com wrote:
> On Apr 21, 10:30 am, Jerry Avins <j...@ieee.org> wrote:
>
>> Wraparound? How so? Where do the bytes come from? Does the converter
>> give you half a sample at a time?
> Since I am using 16 bits resolution in the A-D conversion, I read in
> 2 bytes at a time from the array of raw bytes and convert these two
> bytes to a signed short number.
Why not read one 16-bit word at a time? *Where does the array of bytes come from?*
> Also, I am using little-endian format, as that is what my computer
> will allow. So, if the low order byte is less than 0, I add 256 to it.
> Then, each sampled value is:
> low_order_byte | high_order_byte << 8
Since you are dealing with the bytes (strictly, octets) one at a time, why not treat them as data and take them in the order you want? Adding 128 to a byte toggles the MSB. Adding 128 only if the MSB is 1 is the same as ANDing every byte with 0x7F, thus avoiding a test and branch.
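
[Editorial sketch.] A minimal Java version of the byte assembly under discussion, assuming signed 16-bit little-endian PCM as described at the top of the thread. In Java, `byte` is signed, so the "add 256 if negative" step is equivalent to masking the low byte with `0xFF`. Only the low byte gets this treatment; the high byte keeps its sign because it carries the sample's sign bit. (A `0x7F` mask on the low byte would clear its top bit, which is a data bit in this layout.) `PcmDecoder` and its method names are hypothetical; `ByteBuffer`, `ByteOrder.LITTLE_ENDIAN` and `asShortBuffer()` are standard `java.nio` calls.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PcmDecoder {

    /** Assembles signed 16-bit little-endian samples by hand. */
    static short[] decodeManual(byte[] raw) {
        short[] out = new short[raw.length / 2];
        for (int i = 0; i < out.length; i++) {
            int lo = raw[2 * i] & 0xFF; // undo Java's sign extension of the low byte
            int hi = raw[2 * i + 1];    // keep the sign: this byte holds the sign bit
            out[i] = (short) ((hi << 8) | lo);
        }
        return out;
    }

    /** Same result, letting the standard library handle the byte order. */
    static short[] decodeWithByteBuffer(byte[] raw) {
        short[] out = new short[raw.length / 2];
        ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(out);
        return out;
    }

    /** Converts samples to doubles in [-1, 1), ready for windowing and the FFT. */
    static double[] toDoubles(short[] samples) {
        double[] out = new double[samples.length];
        for (int i = 0; i < samples.length; i++) {
            out[i] = samples[i] / 32768.0;
        }
        return out;
    }
}
```

With either decoder, the 2048-byte buffer mentioned earlier yields 1024 samples, and no bit produced by the converter is altered along the way.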
>>> When I examine the raw numbers, I find a long sub-sequence of 0s at
>>> the start, before I start seeing the sinusoidal pattern in the sampled
>>> numbers.
>> How did the zeros get there? Again, where do the bytes come from?
> That is the problem. The bytes are whatever are being given by the
> A-D converter.
So you are using a real converter, not a simulation? Why don't you read it all at once? Depending on how it's connected, it can have any endianness you want.
>>> I use a time shifted Gaussian window to filter the data before I
>>> compute the FFT.
>> Why?
> The window is used to remove discontinuities at the start and end
> of the array of sampled numbers. As far as I recall, this is a very
> common method.
All right, so why not a more usual von Hann or an even better-performance Blackman window? Why Gaussian?
> Also, most windows, in one way or another, attempt to replicate a
> Gaussian centered at the origin (maximum at 0). However, if we take
> the zero of time to be the instant the sampling starts, then the
> window has to be time shifted so that its filtering action is
> correct, i.e., the start and end of the window values correspond to
> the start and end of the sampled values.
The window you actually use has to go to zero at the ends. A Gaussian stretches to infinity in both directions. Is the idea of a Gaussian window something you thought up yourself? (The first digital filters I used were Gaussian -- binomial, actually -- that I stumbled upon by thinking.) The more of a window's derivatives go to zero at its ends, the better the side-lobe suppression you can expect.
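
[Editorial sketch.] The point about window ends can be checked numerically. The Java sketch below builds a Hann (von Hann) window and a Gaussian centred on the middle of the frame, then prints their end values. The class name `WindowEnds` and the width parameter `sigma = 0.4` (standard deviation as a fraction of half the frame) are illustrative assumptions; the thread does not say what width the original poster used.

```java
final class WindowEnds {

    /** Hann (von Hann) window: w[i] = 0.5 * (1 - cos(2*pi*i / (N-1))). */
    static double[] hann(int n) {
        requireAtLeastTwo(n);
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (n - 1)));
        }
        return w;
    }

    /**
     * Gaussian centred on the middle of the frame (the "time shifted" form).
     * sigma is the standard deviation as a fraction of half the frame length.
     */
    static double[] gaussian(int n, double sigma) {
        requireAtLeastTwo(n);
        double half = (n - 1) / 2.0;
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            double z = (i - half) / (sigma * half);
            w[i] = Math.exp(-0.5 * z * z);
        }
        return w;
    }

    private static void requireAtLeastTwo(int n) {
        if (n < 2) {
            throw new IllegalArgumentException("window length must be at least 2");
        }
    }

    public static void main(String[] args) {
        int n = 1024; // 2048 bytes of 16-bit samples
        double[] h = hann(n);
        double[] g = gaussian(n, 0.4);
        System.out.println("Hann ends:     " + h[0] + ", " + h[n - 1]);
        System.out.println("Gaussian ends: " + g[0] + ", " + g[n - 1]);
    }
}
```

With this choice of `sigma` the Gaussian is still at roughly 0.04 at both ends of the frame, whereas the Hann window reaches zero there. A narrower Gaussian gets closer to zero at the ends but never reaches it.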
>> Some of what you leave unsaid may hold the key.
>>
> These are all the details that I can think of.
I still don't know how the numbers end up in an array, only to be moved somewhere else, considering that you fetch them from an ADC. Is that how the strange zeros sneak in?

Jerry
On Apr 21, 1:49 pm, Jerry Avins <j...@ieee.org> wrote:

> Why not read one 16-bit word at a time? *Where does the array of bytes
> come from?*
I am using a Java interface that writes the sampled bytes to a ByteArrayInputStream, that gets converted to a byte array using 'toByteArray()'.
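
[Editorial sketch.] One possible, unconfirmed, source of unexpected zeros in a pipeline like this is appending a whole fixed-size buffer to the output after each read, rather than only the bytes the read call reported. The thread does not show the capture code, so this is a guess at a common pitfall, not a diagnosis. The sketch assumes the bytes are accumulated in a `ByteArrayOutputStream` (the class that actually provides `toByteArray()`); `CaptureCopy.readAtMost` is a hypothetical helper, and any `InputStream` stands in for the audio source.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class CaptureCopy {

    /** Copies up to maxBytes from the stream, keeping only bytes actually read. */
    static byte[] readAtMost(InputStream in, int maxBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        try {
            while (out.size() < maxBytes) {
                int want = Math.min(chunk.length, maxBytes - out.size());
                int n = in.read(chunk, 0, want);
                if (n < 0) {
                    break; // end of stream
                }
                // Write only the n bytes that were read. Writing the whole chunk
                // would append stale or zero bytes whenever a read comes up short.
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

If the returned array is shorter than the requested 2048 bytes, the frame should be treated as incomplete rather than padded silently.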
> Since you are dealing with the bytes (strictly, octets) one at a time,
> why not treat them as data and take them in the order you want?
I am afraid that doing so would prevent me from doing the windowing operation, since the window values are all double values. I am not sure how I could represent a number such as 0.005382 as a byte.
> So you are using a real converter, not a simulation? Why don't you read
> it all at once? Depending on how it's connected, it can have any
> endianness you want.
Yes, it is a real converter, in that it is the one that comes with my PC. I am not hand-generating the input bytes. These are all that the PC and the Java interface provide me.
> All right, so why not a more usual von Hann or an even
> better-performance Blackman window? Why Gaussian?
These are definitely options, but I am trying to get the thing to work before I refine it.
cpptutor2000@yahoo.com wrote:
> On Apr 21, 1:49 pm, Jerry Avins <j...@ieee.org> wrote:
>
>> Why not read one 16-bit word at a time? *Where does the array of bytes
>> come from?*
> I am using a Java interface that writes the sampled bytes to a
> ByteArrayInputStream, that gets converted to a byte array using
> 'toByteArray()'.
>> Since you are dealing with the bytes (strictly, octets) one at a time,
>> why not treat them as data and take them in the order you want?
> I am afraid that doing so would prevent me from doing the windowing
> operation, since the window values are all double values. I am not
> sure how I could represent a number such as 0.005382 as a byte.
A 16-bit word is not really adequate for your chain of calculations. Multiplying by a window coefficient causes rounding, and FFTs drop a bit of significance for every two stages. You would do better to cast your 16-bit words to floats as you begin and convert back to integer when you're all done. You can find the answer to your actual question at http://www.digitalsignallabs.com/fp.pdf. Fixed-point is scaled integer.
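
[Editorial sketch.] "Fixed-point is scaled integer" can be made concrete with the Q15 convention, in which a 16-bit integer q stands for the value q / 32768. The Java sketch below shows how a coefficient such as 0.005382 would be stored and applied that way. It only illustrates the representation asked about; it is not a recommendation over the float approach suggested above. The class `Q15` and its methods are hypothetical names, and Q15 itself is an assumed choice of scaling.

```java
final class Q15 {

    static final int ONE = 1 << 15; // the scale factor: 32768 represents 1.0

    /** Rounds a value in [-1, 1) to the nearest Q15 integer, saturating at the limits. */
    static short fromDouble(double v) {
        long r = Math.round(v * ONE);
        if (r > Short.MAX_VALUE) {
            r = Short.MAX_VALUE;
        }
        if (r < Short.MIN_VALUE) {
            r = Short.MIN_VALUE;
        }
        return (short) r;
    }

    /** Converts a Q15 integer back to the value it represents. */
    static double toDouble(short q) {
        return q / (double) ONE;
    }

    /** Multiplies two Q15 numbers with rounding; the result is again Q15. */
    static short multiply(short a, short b) {
        int product = a * b;                       // Q30 result held in 32 bits
        int rounded = (product + (1 << 14)) >> 15; // round and rescale to Q15
        if (rounded > Short.MAX_VALUE) {
            rounded = Short.MAX_VALUE;             // only -1.0 * -1.0 can overflow
        }
        return (short) rounded;
    }

    public static void main(String[] args) {
        short w = fromDouble(0.005382);
        short sample = 10000;
        System.out.println("coefficient as integer: " + w);
        System.out.println("value it represents:    " + toDouble(w));
        System.out.println("windowed sample:        " + multiply(sample, w));
    }
}
```

Here 0.005382 is stored as the integer 176, which represents about 0.005371, so the coefficient already carries a rounding error of roughly 0.2% before any multiplication takes place. That loss is the reason for converting the samples to floating point first.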
>> So you are using a real converter, not a simulation? Why don't you read
>> it all at once? Depending on how it's connected, it can have any
>> endianness you want.
> Yes it is a real converter in that it is the same converter that
> comes with my PC. I am not hand-generating the input bytes. These are
> all that the PC and the Java interface provides me.
>
>> All right, so why not a more usual von Hann or an even
>> better-performance Blackman window? Why Gaussian?
> These are definitely options, but I am trying to get the thing to
> work, before I refine my thing.
If you assemble the bytes out of the converter in the proper order, there should be no need to zero the sign bit. You want the bits you work with to be the ones the ADC put out. If you have a need to modify them, something is amiss.

Jerry