comp.dsp | Audio FFT problem

Could some DSP guru please help me a bit? I am creating an FFT-based
application that uses an audio input. The sampling details are as
follows:
encoding : PCM
sampling frequency : 16000 Hz
resolution : 16 bits
channel : mono
signed : true
endianness : little (as this runs on an Intel processor)

The raw bytes get collected in a byte array, and I take two bytes at a
time (resolution is 16 bits) and correct for wrap-around in the first
byte and get the corresponding floating point number which is the
sampled value. The buffer is 2048 bytes long.

When I examine the raw numbers, I find a long sub-sequence of 0s at
the start, before I start seeing the sinusoidal pattern in the sampled
numbers.

I use a time shifted Gaussian window to filter the data before I
compute the FFT. When I examine the array containing the FFT
magnitudes, I find some extra peaks at the end of the array, i.e., the
spectrum does not look symmetric as it should.

I am not sure where the extra peaks are coming from. I am fairly
confident that my FFT routine is working, as I have tested it out with
hand-generated sample data, and the output spectrum has a very
symmetric structure in the peaks.
Any hints, suggestions would be immensely helpful. Thanks in advance
for your help.
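
For reference, the symmetry expected here comes from the input being real-valued: then X[N-k] is the complex conjugate of X[k], so |X[k]| = |X[N-k]| for k = 1..N-1. The following is only a minimal sketch of a check for that property, using a naive DFT on a synthetic sine; the class and method names are hypothetical and are not the poster's code.

```java
final class SpectrumSymmetry {
    /** Naive O(N^2) DFT magnitude of a real signal; slow, but fine as a sanity check. */
    static double[] dftMagnitude(double[] x) {
        int n = x.length;
        double[] mag = new double[n];
        for (int k = 0; k < n; k++) {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = -2.0 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im += x[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    /** Largest difference between |X[k]| and |X[N-k]| over k = 1..N-1. */
    static double maxAsymmetry(double[] mag) {
        int n = mag.length;
        double worst = 0.0;
        for (int k = 1; k < n; k++) {
            worst = Math.max(worst, Math.abs(mag[k] - mag[n - k]));
        }
        return worst;
    }

    public static void main(String[] args) {
        int n = 64;
        double[] x = new double[n];
        for (int t = 0; t < n; t++) {
            x[t] = Math.sin(2.0 * Math.PI * 5 * t / n); // 5 cycles per frame
        }
        System.out.println(maxAsymmetry(dftMagnitude(x)));
    }
}
```

A 2048-byte buffer of 16-bit samples holds 1024 samples, so the magnitudes should mirror about bin 512. If real input still gives a lopsided spectrum, a non-zero imaginary input array or the output indexing might be worth checking; that is a guess, not something established in this thread.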

Reply by Jerry Avins ● April 21, 2008

cpptutor2000@yahoo.com wrote:
> Could some DSP guru please help me a bit? I am creating an FFT-based
> application that uses an audio input. The sampling details are as
> follows:
> encoding : PCM
> sampling frequency : 16000 Hz
> resolution : 16 bits
> channel : mono
> signed : true
> endianness : little (as this runs on an Intel processor)
> 
> The raw bytes get collected in a byte array, and I take two bytes at a
> time (resolution is 16 bits) and correct for wrap-around in the first
> byte and get the corresponding floating point number which is the
> sampled value. The buffer is 2048 bytes long.

Wraparound? How so? Where do the bytes come from? Does the converter 
give you half a sample at a time?

> When I examine the raw numbers, I find a long sub-sequence of 0s at
> the start, before I start seeing the sinusoidal pattern in the sampled
> numbers.

How did the zeros get there?  Again, where do the bytes come from?

> I use a time shifted Gaussian window to filter the data before I
> compute the FFT.

Why?

> When I examine the array containing the FFT
> magnitudes, I find some extra peaks at the end of the array, i.e., the
> spectrum does not look symmetric as it should.
> 
> I am not sure where the extra peaks are coming from. I am fairly
> confident that my FFT routine is working, as I have tested it out with
> hand-generated sample data, and the output spectrum has a very
> symmetric structure in the peaks.
> Any hints, suggestions would be immensely helpful. Thanks in advance
> for your help.

Some of what you leave unsaid may hold the key.

Jerry
-- 
Engineering is the art of making what you want from things you can get.

Reply by cppt...@yahoo.com ● April 21, 2008

On Apr 21, 10:30 am, Jerry Avins <j...@ieee.org> wrote:

> Wraparound? How so? Where do the bytes come from? Does the converter
> give you half a sample at a time?
  Since I am using 16 bits resolution in the A-D conversion, I read in
  2 bytes at a time from the array of raw bytes and convert these two
  bytes to a signed short number. Also, I am using little-endian format,
  as that is what my computer will allow. So, if the low order byte is
  less than 0, I add 256 to it. Then, each sampled value is:
  low_order_byte | high_order_byte << 8
>
> > When I examine the raw numbers, I find a long sub-sequence of 0s at
> > the start, before I start seeing the sinusoidal pattern in the sampled
> > numbers.
>
> How did the zeros get there?  Again, where do the bytes come from?
   That is the problem. The bytes are whatever are being given by the
   A-D converter.
>
> > I use a time shifted Gaussian window to filter the data before I
> > compute the FFT.
>
> Why?
   The window is used to remove discontinuities at the start and end of
   the array of sampled numbers. As far as I recall, this is a very
   common method. Also, most windows, in one way or another, attempt to
   replicate a Gaussian centered at the origin (maximum at 0). However,
   if we take the zero of time to be the instant the sampling starts,
   then the window has to be time shifted so that its filtering action
   is correct, i.e., the start and end of the window values correspond
   to the start and end of the sampled values.
>
> Some of what you leave unsaid may hold the key.
>
   These are all the details that I can think of.
> Jerry
> --
> Engineering is the art of making what you want from things you can get.

Reply by Jerry Avins ● April 21, 2008

cpptutor2000@yahoo.com wrote:
> On Apr 21, 10:30 am, Jerry Avins <j...@ieee.org> wrote:
> 
>> Wraparound? How so? Where do the bytes come from? Does the converter
>> give you half a sample at a time?
>   Since I am using 16 bits resolution in the A-D conversion, I read in
> 2 bytes at a time
>   from the array of raw bytes and convert these two bytes to a signed
> short number.

Why not read one 16-bit word at a time? *Where does the array of bytes 
come from?*
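
One way to read whole 16-bit words from a Java byte array is `java.nio.ByteBuffer` with an explicit byte order. The sketch below assumes signed 16-bit little-endian PCM as described in the original post; the class and method names are hypothetical. It also shows the by-hand assembly, where masking the low byte with 0xFF has the same effect as "add 256 if negative".

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PcmDecode {
    /** Decodes signed 16-bit little-endian PCM; a trailing odd byte is ignored. */
    static short[] decode(byte[] raw) {
        short[] samples = new short[raw.length / 2];
        ByteBuffer.wrap(raw)
                  .order(ByteOrder.LITTLE_ENDIAN)
                  .asShortBuffer()
                  .get(samples);
        return samples;
    }

    /**
     * Assembles one sample by hand. Java bytes are signed, so the low byte
     * is masked to 0..255 before it is OR-ed in; the high byte keeps its sign.
     */
    static short assemble(byte lo, byte hi) {
        return (short) ((hi << 8) | (lo & 0xFF));
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x34, (byte) 0x12, (byte) 0x80, (byte) 0x00};
        short[] s = decode(raw);
        System.out.println(s[0] + " " + s[1]);
        System.out.println(assemble(raw[2], raw[3]));
    }
}
```

Without the mask, a low byte of 0x80 or above would sign-extend and overwrite the high byte, which corrupts the sample value.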

>   Also, I am using little-endian format, as that is what my computer
> will allow. So, if the low
>   order byte is less than 0, I add 256 to it. Then, each sampled value
> is:
>   low_order_byte | high_order_byte << 8

Since you are dealing with the bytes (strictly, octets) one at a time, 
why not treat them as data and take them in the order you want?

Adding 128 to a byte toggles the MSB. Adding 128 only if the MSB is 1 is 
the same as ANDing every byte with 0x7F, thus avoiding a test and branch.

>>> When I examine the raw numbers, I find a long sub-sequence of 0s at
>>> the start, before I start seeing the sinusoidal pattern in the sampled
>>> numbers.
>> How did the zeros get there?  Again, where do the bytes come from?
>    That is the problem. The bytes are whatever are being given by the
> A-D
>    converter.

So you are using a real converter, not a simulation? Why don't you read 
it all at once? Depending on how it's connected, it can have any 
endianness you want.

>>> I use a time shifted Gaussian window to filter the data before I
>>> compute the FFT.
>> Why?
>    The window is used to remove discontinuities at the start and end
> of the array of
>    sampled numbers. As far as I recall, this is a very common method.

All right, so why not a more usual von Hann or an even 
better-performance Blackman window? Why Gaussian?

> Also, most
>   windows, in one way or the other attempt to replicate a Gaussian
> centered at the
>   origin (maximum at 0). However, if we take the zero of time to
> correspond to be
>   that instant the sampling starts, then the window has to be time
> shifted, so that
>   its filtering action is correct - i.e., start and end of the window
> values correspond
>   to the start and end of the sampled values.

The window you actually use has to go to zero at the ends. A Gaussian 
stretches to infinity in both directions. Is the idea of a Gaussian 
window something you thought up yourself? (The first digital filters I 
used were Gaussian -- binomial, actually -- that I stumbled upon by 
thinking.) The more of a window's derivatives go to zero at its ends, 
the better the side-lobe suppression you can expect.
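
To make the comparison concrete, here is a minimal sketch of the symmetric von Hann and Blackman windows next to a truncated Gaussian. The class name and the `sigma` parameter (standard deviation as a fraction of half the window length) are assumptions for illustration, not the poster's actual window; `n` is assumed to be at least 2.

```java
final class Windows {
    /** Symmetric von Hann window; zero at both ends. */
    static double[] hann(int n) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            w[i] = 0.5 - 0.5 * Math.cos(2.0 * Math.PI * i / (n - 1));
        }
        return w;
    }

    /** Symmetric Blackman window (0.42, 0.5, 0.08 coefficients); zero at both ends. */
    static double[] blackman(int n) {
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            double p = 2.0 * Math.PI * i / (n - 1);
            w[i] = 0.42 - 0.5 * Math.cos(p) + 0.08 * Math.cos(2.0 * p);
        }
        return w;
    }

    /** Gaussian centred on the middle of the frame and truncated at the ends. */
    static double[] gaussian(int n, double sigma) {
        double[] w = new double[n];
        double half = (n - 1) / 2.0;
        for (int i = 0; i < n; i++) {
            double z = (i - half) / (sigma * half);
            w[i] = Math.exp(-0.5 * z * z);
        }
        return w;
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println("hann end:     " + hann(n)[0]);
        System.out.println("blackman end: " + blackman(n)[0]);
        System.out.println("gaussian end: " + gaussian(n, 0.4)[0]);
    }
}
```

With `sigma = 0.4` the Gaussian's end value is exp(-3.125), about 0.044, rather than zero, so a small step remains at the frame edges unless the Gaussian is made narrower.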

>> Some of what you leave unsaid may hold the key.
>>
>    These are all the details that I can think of.

I still don't know how the numbers end up in an array, only to be moved 
somewhere else, considering that you fetch them from an ADC. Is that how 
the strange zeros sneak in?

Jerry
-- 
Engineering is the art of making what you want from things you can get.

Reply by cppt...@yahoo.com ● April 21, 2008

On Apr 21, 1:49 pm, Jerry Avins <j...@ieee.org> wrote:

> Why not read one 16-bit word at a time? *Where does the array of bytes
> come from?*
I am using a Java interface that writes the sampled bytes to a
ByteArrayInputStream, that gets converted to a byte array using
'toByteArray()'.
> Since you are dealing with the bytes (strictly, octets) one at a time,
> why not treat them as data and take them in the order you want?
  I am afraid that doing so would prevent me from doing the windowing
  operation, since the window values are all double values. I am not
  sure how I could represent a number such as 0.005382 as a byte.

> So you are using a real converter, not a simulation? Why don't you read
> it all at once? Depending on how it's connected, it can have any
> endianness you want.
  Yes, it is a real converter in that it is the same converter that
  comes with my PC. I am not hand-generating the input bytes. These are
  all that the PC and the Java interface provide me.

> All right, so why not a more usual von Hann or an even
> better-performance Blackman window? Why Gaussian?
  These are definitely options, but I am trying to get the thing to
  work before I refine it.


> Jerry
> --
> Engineering is the art of making what you want from things you can get.

Reply by Jerry Avins ● April 22, 2008

cpptutor2000@yahoo.com wrote:
> On Apr 21, 1:49 pm, Jerry Avins <j...@ieee.org> wrote:
> 
>> Why not read one 16-bit word at a time? *Where does the array of bytes
>> come from?*
> I am using a Java interface that writes the sampled bytes to a
> ByteArrayInputStream, that gets converted to a byte array using
> 'toByteArray()'.
>> Since you are dealing with the bytes (strictly, octets) one at a time,
>> why not treat them as data and take them in the order you want?
>   I am afraid that doing so would prevent me from doing the windowing
>   operation, since the window values are all double values. I am not
> sure
>   how I could represent a number as 0.005382 as a byte.

A 16-bit word is not really adequate for your chain of calculations. 
Multiplying by a window coefficient causes rounding, and FFTs drop a bit 
of significance for every two stages. You would do better to cast your 
16-bit words to floats as you begin and convert back to integer when 
you're all done. You can find the answer to your actual question at 
http://www.digitalsignallabs.com/fp.pdf. Fixed-point is scaled integer.
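
As an illustration of "fixed-point is scaled integer", a window value such as 0.005382 can be stored as an integer count of 1/32768 steps (often called Q15). The sketch below is an assumption-laden example with hypothetical names, not the method from the linked paper; converting to floating point first, as suggested above, remains the simpler path.

```java
final class Q15 {
    static final int ONE = 1 << 15; // 32768 steps represent 1.0

    /** Rounds a value in roughly [-1, 1) to the nearest Q15 step, saturating at the limits. */
    static short fromDouble(double v) {
        long r = Math.round(v * ONE);
        return (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, r));
    }

    /** Converts a Q15 value back to a double. */
    static double toDouble(short q) {
        return q / (double) ONE;
    }

    /** Multiplies two Q15 values: the 32-bit product is Q30, rounded back to Q15. */
    static short multiply(short a, short b) {
        int product = a * b;                    // Q30
        int rounded = (product + (1 << 14)) >> 15;
        return (short) Math.min(Short.MAX_VALUE, rounded); // only -1 * -1 overflows
    }

    public static void main(String[] args) {
        short w = fromDouble(0.005382);
        System.out.println(w + " -> " + toDouble(w));
    }
}
```

The rounding step in `multiply` is the kind of precision loss mentioned above: every windowed sample is rounded back to 16 bits before the FFT even starts.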

>> So you are using a real converter, not a simulation? Why don't you read
>> it all at once? Depending on how it's connected, it can have any
>> endianness you want.
>   Yes it is a real converter in that it is the same converter that
> comes with
>   my PC. I am not hand-generating the input bytes. These are all that
> the
>   PC and the Java interface provides me.
> 
>> All right, so why not a more usual von Hann or an even
>> better-performance Blackman window? Why Gaussian?

>   These are definitely options, but I am trying to get the thing to
> work,
>   before I refine my thing.

If you assemble the bytes out of the converter in the proper order, 
there should be no need to zero the sign bit. You want the bits you work 
with to be the ones the ADC put out. If you have a need to modify them, 
something is amiss.

Jerry
