Nyquist, quantization and windowing gotcha's

Started by Richard Owlett September 4, 2008
I've been experimenting with a 3D version of spectrograms [amplitude vs 
frequency vs time]. Instead of plotting the spectrum of each time slice 
(cf waterfall displays), I plot contours of equal amplitude across time. 
Borrowing from traditional spectrograms, each contour's color also 
indicates amplitude allowing adjacent contours to be distinguished when 
close together.

The observed artifacts, although catching me by surprise, are easily 
explained. They may be interesting visualizations when giving elementary 
explanations of sampling related phenomena.

My environment is Scilab-4.1.2 under WinXP. The test sample is a 16 
second 44kHz 16 bit pcm recording of an arctic loon (ORIGINALLY recorded 
at 22 kHz and saved as mp3).

1. To locate a time of interest I plotted the complete 16 seconds. It
    was too slow. I arbitrarily chose to display every 100th sample to
    speed up the display. When I displayed only the first 1 second, it
    was a really weird "waveform" (effectively sampled at less than 1/2
    Nyquist ;)

2. As part of my routine to choose a threshold for displaying data,
    I did a traditional 2D spectrogram with intensity displayed on a 32
    color scale. The display was extremely blocky. These were quantization
    artifacts (16 time slices and 32 amplitudes).

3. I then decided to investigate the effect of windowing the data prior
    to performing FFT. From Scilab's choices I had originally chosen a
    Kaiser window with alpha=8.6 ("Kaiser" and "alpha" are Scilab's
    terms). For contrast I tried no windowing (aka boxcar window). The effect,
    probably emphasized by thresholding of which data to plot, was
    *DRAMATIC*. With windowing I got a bar extending on the time axis.
    Without windowing, I got a "T" with the top on the time axis as
    before, with a prominent leg on the frequency axis.
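
The effect in item 1 can be reproduced on a synthetic tone. What follows is a minimal Python/NumPy sketch, not the Scilab code used for the plots. The exact rate of 44100 Hz, the 1 kHz test tone, and the helper name `apparent_frequency` are assumptions made for illustration. Keeping every 100th sample without low-pass filtering first gives an effective rate of 441 Hz, so any component above 220.5 Hz folds back into the plotted band.

```python
import numpy as np

def apparent_frequency(f_hz, fs_hz):
    """Frequency at which a tone of f_hz appears after sampling at fs_hz."""
    f = f_hz % fs_hz
    return min(f, fs_hz - f)

fs = 44100            # assumed exact rate (the post says "44kHz")
step = 100            # keep every 100th sample, no low-pass filter first
fs_plot = fs / step   # effective rate of the plotted data: 441 Hz

# One second of a 1 kHz tone (hypothetical test signal), far above the
# 220.5 Hz Nyquist limit of the plotted data
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000.0 * t)
plotted = tone[::step]

# Locate the strongest component of the decimated signal
spectrum = np.abs(np.fft.rfft(plotted))
peak_hz = np.argmax(spectrum) * fs_plot / len(plotted)

print(fs_plot, apparent_frequency(1000.0, fs_plot), peak_hz)
```

For a quick-look display, plotting the minimum and maximum of each block of 100 samples, rather than a single sample from each block, is one common way to keep the envelope recognizable.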


*QUESTION*     Am I facing other "gotcha's"?

Current defaults are 50 mSec windows offset by 10 mSec.

Are these choices appropriate/reasonable?
Are there trade offs other than processing times?
Are there other questions I should be asking?
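
For reference, here is a rough Python/NumPy sketch of what these defaults imply, together with the boxcar-versus-Kaiser difference from item 3. It is an illustration under stated assumptions, not the Scilab routine itself: the 44100 Hz rate, the 1010 Hz test tone, the 3 kHz probe frequency, and the helper name `leakage_db` are all hypothetical. Passing 8.6 as NumPy's `beta` is also an assumption, since Scilab's "alpha" may be defined differently.

```python
import numpy as np

fs = 44100                      # assumed exact rate (the post says "44kHz")
win_len = round(0.050 * fs)     # 50 ms window -> 2205 samples
hop = round(0.010 * fs)         # 10 ms offset -> 441 samples

bin_spacing_hz = fs / win_len   # spacing of FFT bins without zero padding
overlap = 1 - hop / win_len     # fraction of each frame shared with the next
n_frames = 1 + (16 * fs - win_len) // hop   # frames in a 16 s recording

def leakage_db(window, tone_hz, probe_hz):
    """Level at probe_hz relative to the spectral peak, in dB."""
    n = np.arange(win_len)
    frame = np.sin(2 * np.pi * tone_hz * n / fs) * window
    mag = np.abs(np.fft.rfft(frame))
    probe_bin = int(round(probe_hz / bin_spacing_hz))
    return 20 * np.log10(mag[probe_bin] / mag.max())

boxcar = np.ones(win_len)
kaiser = np.kaiser(win_len, 8.6)   # beta=8.6 is an assumption, see above

# A tone halfway between two bins is the worst case for boxcar leakage
leak_boxcar = leakage_db(boxcar, 1010.0, 3000.0)
leak_kaiser = leakage_db(kaiser, 1010.0, 3000.0)

print(win_len, hop, bin_spacing_hz, overlap, n_frames)
print(leak_boxcar, leak_kaiser)
```

The trade-off beyond processing time is resolution: a longer window narrows the bin spacing but smears events in time, and a tapered window such as Kaiser suppresses far-off leakage at the cost of a wider main lobe.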




On Thu, 04 Sep 2008 11:34:52 -0500, Richard Owlett wrote:
> ... recording of an arctic loon (ORIGINALLY recorded
> at 22 kHz and saved as mp3).
> [...]
> *QUESTION* Am I facing other "gotcha's"?

mp3 is a lossy compression based on human psychoacoustics, so the signal is distorted. Saving as mp3 isn't a good idea for later doing signal analysis (unless studying the effects of mp3, of course ...).

IMHO. YMMV.
Martin

mblume wrote:
> On Thu, 04 Sep 2008 11:34:52 -0500, Richard Owlett wrote:
>
>>... recording of an arctic loon (ORIGINALLY recorded
>>at 22 kHz and saved as mp3).
>>[...]
>>*QUESTION* Am I facing other "gotcha's"?
>
> mp3 is a lossy compression based on human psychoacoustics, so
> the signal is distorted. Saving as mp3 isn't a good idea for
> later doing signal analysis (unless studying the effects of
> mp3, of course ...).
>
> IMHO. YMMV.
> Martin

!!! *LOL* !!!
Should have mentioned that Rune had pointed that out in a previous thread.
That's why I mentioned the "prime" source was mp3 ;)
ALSO that's why I noted I was working with a 44k sample/sec copy when the original was 22k ;{

Other than lossy compression effects, what are other GOTCHA's ????

On Sep 4, 1:57 pm, Richard Owlett <rowl...@atlascomm.net> wrote:
> mblume wrote:
> > On Thu, 04 Sep 2008 11:34:52 -0500, Richard Owlett wrote:
>
> >>... recording of an arctic loon (ORIGINALLY recorded
> >>at 22 kHz and saved as mp3).
> >>[...]
> >>*QUESTION* Am I facing other "gotcha's"?
>
> > mp3 is a lossy compression based on human psychoacoustics, so
> > the signal is distorted. Saving as mp3 isn't a good idea for
> > later doing signal analysis (unless studying the effects of
> > mp3, of course ...).
>
> > IMHO. YMMV.
> > Martin
>
> !!! *LOL* !!!
> Should have mentioned that Rune had pointed that out in previous thread.
> That's why I mentioned "prime" source was mp3 ;)
> ALSO that's why I noted I was working with a 44k sample/sec copy when
> original was 22k ;{
>
> Other than lossy compression effects,
> what are other GOTCHA's ????

Hello Richard,

One possible place for error is when the original contains frequencies outside of normal human hearing or near its limits. I think the .mp3 approach uses a filter bank and if you are off of the ends of the bank, you may lose some data.

I do note that your source is limited to a max frequency of only 11kHz and practically it is maybe only 9 or 10kHz. I don't know if bird calls normally contain frequencies of any significant strength above 10kHz.

Cats can easily hear up to 60 kHz (kittens hear over 100kHz) which is quite effective in localization of prey. Part of the high frequency requirement for cats comes from their heads being small and hence their ears close together. Birds (most) of course have even smaller heads. Do they localize well with hearing or do they have to rely on their amazing vision to locate things? Some birds like owls seem to have quite sensitive ears and cat sized heads and they too hunt rodents. I don't know much about loons. What do they hunt and does hearing play a large part in that process?

Clay

clay@claysturner.com wrote:
> On Sep 4, 1:57 pm, Richard Owlett <rowl...@atlascomm.net> wrote:
>
>>mblume wrote:
>>
>>>On Thu, 04 Sep 2008 11:34:52 -0500, Richard Owlett wrote:
>>
>>>>... recording of an arctic loon (ORIGINALLY recorded
>>>>at 22 kHz and saved as mp3).
>>>>[...]
>>>>*QUESTION* Am I facing other "gotcha's"?
>>
>>>mp3 is a lossy compression based on human psychoacoustics, so
>>>the signal is distorted. Saving as mp3 isn't a good idea for
>>>later doing signal analysis (unless studying the effects of
>>>mp3, of course ...).
>>
>>>IMHO. YMMV.
>>>Martin
>>
>>!!! *LOL* !!!
>>Should have mentioned that Rune had pointed that out in previous thread.
>>That's why I mentioned "prime" source was mp3 ;)
>>ALSO that's why I noted I was working with a 44k sample/sec copy when
>>original was 22k ;{
>>
>>Other than lossy compression effects,
>>what are other GOTCHA's ????
>
> Hello Richard,
>
> One possible place for error is when the original contains frequencies
> outside of normal human hearing or near its limits. I think the .mp3
> approach uses a filter bank and if you are off of the ends of the
> bank, you may lose some data.

Hmmm. Gives me an idea. An alto and a tenor have agreed to record glissandos for me in the next week or so. My original purpose was to see if I could identify resonances in the vocal tract which create formants. Now I might experiment to see if I could create an effective visual presentation of the distortion introduced by various lossy compression schemes.

> I do note that your source is limited to a max frequency of only 11kHz
> and practically it is maybe only 9 or 10kHz. I don't know if bird
> calls normally contain frequencies of any significant strength above
> 10kHz.

Just took a closer look, adjusting the display parameters. There are significant components up to 10kHz. No idea if they are "real" or "artifacts". Demonstrates why I need original, uncompressed recordings at a high sample rate.

> Cats can easily hear up to 60 kHz (kittens hear over 100kHz)
> which is quite effective in localization of prey. Part of the high
> frequency requirement for cats comes from their heads being small and
> hence their ears close together. Birds (most) of course have even
> smaller heads. Do they localize well with hearing or do they have to
> rely on their amazing vision to locate things? Some birds like owls
> seem to have quite sensitive ears and cat sized heads and they too
> hunt rodents. I don't know much about loons. What do they hunt and
> does hearing play a large part in that process?

No idea. Chose them because Rune had commented on using spectrograms of them to demonstrate some audio concepts.
> Clay