DSPRelated.com

Sharpening oversampled STFT-based spectrograms

Started by Michel Rouzic December 28, 2008
If you made a spectrogram by windowing in the time domain using an
infinitely long window (let's go for a Gaussian function) at every
sample in the signal (that is, centre your window on each sample and
DFT each time), you would obtain a huge and very blurry spectrogram.
But this huge blurry image wouldn't be aliased, so the thought
occurred to me: since it's oversampled and not aliased, and we know
the frequency response (or should I say "point spread function"?) of
the image in question (in the case of infinite Gaussian windowing that
would be a 2D Gaussian function, which is very convenient for it has
no zero crossings in the space or frequency domain), what prevents us
from sharpening/deconvolving this huge image to reveal its finer
details?

I'm sure we could sharpen it a bit and enhance some details, but how
far could we take the sharpening before we'd get bogus/noisy results?
To me it sounds like the limit would be in the precision of
calculations/quantisation, but is there anything else? What would be
the limit, assuming no such precision issues? Would there be any
limit, would we bring out any artifacts, and if so why?
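
For readers who want to experiment, the construction described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `gaussian_stft`, the 8 kHz sample rate, the 1 kHz test tone, `sigma=32` samples and `n_fft=256` are all made up for the example, the Gaussian is necessarily truncated to `n_fft` samples, and samples outside the signal are taken as zero.

```python
import numpy as np

def gaussian_stft(x, sigma, n_fft=256, hop=1):
    """STFT with a (truncated) Gaussian window centred on every `hop`-th sample."""
    half = n_fft // 2
    n = np.arange(-half, half)
    window = np.exp(-0.5 * (n / sigma) ** 2)    # Gaussian, cut off at +/- n_fft/2
    padded = np.pad(x, (half, half))            # anything out of bounds is zero
    centres = np.arange(0, len(x), hop)         # hop=1: one frame per sample
    frames = np.stack([padded[c:c + n_fft] * window for c in centres])
    return np.fft.rfft(frames, axis=1)          # shape: (n_frames, n_fft // 2 + 1)

fs = 8000                                       # assumed sample rate in Hz
t = np.arange(2048) / fs
x = np.sin(2 * np.pi * 1000 * t)                # assumed test signal: 1 kHz tone
S = gaussian_stft(x, sigma=32.0)
spectrogram = np.abs(S) ** 2                    # the "huge and very blurry" image
```

With `hop=1` the image has as many columns as the signal has samples, which is the heavily oversampled case the question is about.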
On 28 Des, 14:38, Michel Rouzic <Michel0...@yahoo.fr> wrote:
> If you made a spectrogram by windowing in the time domain using an
> infinitely long window

First of all, I don't like these kinds of 'infinitely long' discussions, since the devil (and insights) are in the practical details.

Playing along: If you use a spectrogram with an 'infinitely long' window, you would obtain repeated spectra of the whole signal, only scaled by the window function. By doing this you lose the signal dynamics information the spectrogram was designed to extract in the first place.

Rune
On Dec 28, 6:23 pm, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 28 Des, 14:38, Michel Rouzic <Michel0...@yahoo.fr> wrote:
>
> > If you made a spectrogram by windowing in the time domain using an
> > infinitely long window
>
> First of all, I don't like these kinds of 'infinitely long'
> discussions, since the devil (and insights) are in the practical
> details.
>
> Playing along: If you use a spectrogram with an 'infinitely long'
> window, you would obtain repeated spectra of the whole signal,
> only scaled by the window function. By doing this you lose
> the signal dynamics information the spectrogram was designed
> to extract in the first place.
>
> Rune
Oh sorry I misworded it. By infinitely long, I meant, make the Gaussian window as wide as normal, except without "cutting" the infinite "tails". As for what I mean by infinitely long I mean it from a non-implementational point of view. From an implementational point of view we would of course limit ourselves to the points within the signal.
On Dec 28, 5:38 am, Michel Rouzic <Michel0...@yahoo.fr> wrote:

> If you made a spectrogram by windowing in the time domain using an
> infinitely long window (let's go for a Gaussian function) every one
> sample in the signal (that is centre your window on each sample and
> DFT each time), you would obtain a huge and very blurry spectrogram.
'Blurry' is a function of the signal characteristics and the second moment of your window. (Gaussian window can be arbitrarily long for a given second moment.) A stationary tone will have a frequency extent inversely proportional to the second moment of the window. The constant response in the time direction is an accurate measure of its nature.
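
For reference, the Gaussian case can be written out explicitly. This is the standard Fourier pair; the symbols $\sigma_t$ (the window's standard deviation in time, i.e. the square root of its second central moment) and $\sigma_f$ are notation introduced here, not taken from the post.

```latex
w(t) = e^{-t^2/(2\sigma_t^2)}
\quad\Longleftrightarrow\quad
W(f) = \sqrt{2\pi}\,\sigma_t\, e^{-2\pi^2 \sigma_t^2 f^2}
     = \sqrt{2\pi}\,\sigma_t\, e^{-f^2/(2\sigma_f^2)},
\qquad
\sigma_f = \frac{1}{2\pi\sigma_t}.
```

So a stationary tone seen through this window has a Gaussian profile in frequency whose width $\sigma_f$ shrinks as $\sigma_t$ grows, with the product $\sigma_t\,\sigma_f = 1/(2\pi)$ fixed.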
> But this huge blurry image wouldn't be aliased, so the thought
> occurred to me, since it's oversampled and not aliased, and we know
Are you saying the signal was bandlimited before sampling or is this part of the sentence compulsive fluff?
> frequency response (or should I say "point spread function"?) (in the
> case of infinite Gaussian windowing that would be a 2D Gaussian
> function,
Spectrograms are usually time frequency plots calculated from a 1D signal. Only 1D of the response is determined by the Gaussianity of the window, the other 1D comes from signal non-stationarity.
> which is very convenient for it has no zero crossings in the
> space or frequency domain) of the image in question,
Convenient for what application? Many applications are not sensitive to the zero-crossing structure. If there is a real application, that is a calculation that will be performed on real data, the window will be finite and the response of the truncated Gaussian will have zero crossings. Some finite windows don't suffer this but why bother unless there is an application?
> what prevents us
> for sharpening/deconvolving this huge image into revealing the finer
> details of the image?
Control of the second moment of the window allows us to select the scale of details we wish to reveal already. Why process poorly and worry about how to clean up afterwards?
> I'm sure we could sharpen it a bit and enhance some details, but how
> far could we take the sharpening before we'd get bogus/noisy results?
Signal non-stationarity puts a limit on the useful range of processing parameters.
> To me it sounds like the limit would be in the precision of
> calculations/quantisation, but is there anything else? What would be
> the limit, assuming no such precision issues? Would would there be any
> limit, would we bring out any artifacts, and if so why?

Why worry about real effects on an unreal computation? Making the computation a real one has enough requirements and effects that you aren't aware of.

Dale B. Dalrymple
dbd wrote:
> On Dec 28, 5:38 am, Michel Rouzic <Michel0...@yahoo.fr> wrote:
>
> > If you made a spectrogram by windowing in the time domain using an
> > infinitely long window (let's go for a Gaussian function) every one
> > sample in the signal (that is centre your window on each sample and
> > DFT each time), you would obtain a huge and very blurry spectrogram.
>
> 'Blurry' is a function of the signal characteristics and the second
> moment of your window. (Gaussian window can be arbitrarily long for a
> given second moment.) A stationary tone will have a frequency extent
> inversely proportional to the second moment of the window. The
> constant response in the time direction is an accurate measure of its
> nature.
Sorry but I'm afraid I'm not sure what you meant by any of that. As for the "frequency extent" bit well I understand that, it's correct, but my point is that the frequency response of a stationary tone being the Fourier transform of the window function, if you window with a Gaussian function then a stationary tone will appear as a Gaussian function in the frequency domain. And because I believe you can deconvolve a Gaussian function into a Dirac delta function, you could therefore tremendously increase both the time and frequency resolution of a spectrogram at the same time.
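
The first half of this claim (a Gaussian-windowed tone has a Gaussian spectral peak) is easy to check numerically. The sketch below is only an illustration: the FFT size, the window width `sigma` and the tone bin `k0` are arbitrary assumed values, and the tone is placed exactly on a bin to keep the comparison simple. It says nothing about whether the deconvolution step is achievable.

```python
import numpy as np

n_fft = 1024
sigma = 64.0                                    # window std in samples (assumed)
n = np.arange(n_fft) - n_fft // 2
window = np.exp(-0.5 * (n / sigma) ** 2)

k0 = 200                                        # tone sits exactly on bin 200 (assumed)
tone = np.cos(2 * np.pi * k0 * n / n_fft)
mag = np.abs(np.fft.rfft(tone * window))

# Width predicted by the Fourier pair, expressed in bins:
# sigma_f = 1 / (2*pi*sigma) cycles/sample  ->  n_fft / (2*pi*sigma) bins
sigma_bins = n_fft / (2 * np.pi * sigma)
k = np.arange(k0 - 5, k0 + 6)
predicted = mag[k0] * np.exp(-0.5 * ((k - k0) / sigma_bins) ** 2)

# Largest deviation from the Gaussian shape, relative to the peak height
max_rel_error = np.max(np.abs(mag[k] - predicted)) / mag[k0]
```

Around the peak, the measured magnitudes should follow the predicted Gaussian closely as long as the window's tails are negligible at the edges of the frame.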
> > But this huge blurry image wouldn't be aliased, so the thought
> > occurred to me, since it's oversampled and not aliased, and we know
>
> Are you saying the signal was bandlimited before sampling or is this
> part of the sentence compulsive fluff?

Oh I'm not talking about the 1D signal not being aliased but its spectrogram not being aliased. Because when we get a spectrogram using an STFT we "decimate" (by taking chunks spaced by N samples), we get aliased spectrograms.
> > frequency response (or should I say "point spread function"?) (in the
> > case of infinite Gaussian windowing that would be a 2D Gaussian
> > function,
>
> Spectrograms are usually time frequency plots calculated from a 1D
> signal. Only 1D of the response is determined by the Gaussianity of
> the window, the other 1D comes from signal non-stationarity.

Right, but if you look at the 2D image as a whole, you'll find that a vertical line (a Dirac delta in the analysed signal) will be spread horizontally as a function that is the same as the windowing function, while a stationary tone will be spread vertically as a function matching the Fourier transform of the windowing function. From that you can infer the 2D point spread function of the whole 2D image, which in this case is a 2D Gaussian function.
> > which is very convenient for it has no zero crossings in the
> > space or frequency domain) of the image in question,
>
> Convenient for what application? Many applications are not sensitive
> to the zero-crossing structure. If there is a real application, that
> is a calculation that will be performed on real data, the window will
> be finite and the response of the truncated Gaussian will have zero
> crossings. Some finite windows don't suffer this but why bother unless
> there is an application?
Convenient for deconvolution? If I'm not mistaken it's a bit problematic to recover the value of x from the result of x * 0. As for the finiteness of things, it's no problem as you can just consider that anything out of bounds is equal to zero.
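
To make the "x * 0" concern concrete, here is a small 1D sketch of naive inverse filtering with a Gaussian kernel. Everything in it is an assumption chosen for illustration (signal length, spike positions, `sigma = 1.5`, noise level `1e-3`, circular convolution). The Gaussian's transfer function has no exact zeros, but it becomes very small at high frequencies, so dividing by it multiplies any noise there by a very large factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
x = np.zeros(n)
x[100], x[110] = 1.0, 0.5                       # two nearby spikes (assumed test signal)

sigma = 1.5                                     # blur width in samples (assumed)
k = np.arange(n)
d = np.minimum(k, n - k)                        # circular distance from index 0
h = np.exp(-0.5 * (d / sigma) ** 2)
h /= h.sum()                                    # unit-gain Gaussian kernel
H = np.fft.fft(h)                               # never exactly zero, but tiny near Nyquist

blurred = np.real(np.fft.ifft(np.fft.fft(x) * H))
noisy = blurred + 1e-3 * rng.standard_normal(n) # small additive noise (assumed level)

# Naive deconvolution: divide the spectrum by H
recovered_clean = np.real(np.fft.ifft(np.fft.fft(blurred) / H))
recovered_noisy = np.real(np.fft.ifft(np.fft.fft(noisy) / H))

err_clean = np.max(np.abs(recovered_clean - x))
err_noisy = np.max(np.abs(recovered_noisy - x))
max_gain = 1.0 / np.min(np.abs(H))              # worst-case noise amplification
```

In this setup the noise-free case is recovered almost exactly, while the noisy case is dominated by amplified high-frequency noise. Widening the blur makes `max_gain` grow very quickly, until even floating-point rounding acts as the "noise".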
> > what prevents us
> > for sharpening/deconvolving this huge image into revealing the finer
> > details of the image?
> >
>
> Control of the second moment of the window allows us to select the
> scale of details we wish to reveal already. Why process poorly and
> worry about how to clean up afterwards?
I'm afraid you missed the whole point of the idea, but I won't hold it against you, I'm notoriously bad at explaining my ideas. By "controlling the second moment of the window" to "select the scale of details" you're making a choice between time resolution and frequency resolution. What I'm trying to talk about is a way not to have to make that choice and get all the detail, by catching it all at once and sharpening it, if you will. I'm just trying to figure out what would be the limitations of this. I have no idea what you're referring to by "process poorly".
> > I'm sure we could sharpen it a bit and enhance some details, but how
> > far could we take the sharpening before we'd get bogus/noisy results?
>
> Signal non-stationarity puts a limit on the useful range of processing
> parameters.
I'm afraid I didn't catch what you mean there either.
> > To me it sounds like the limit would be in the precision of
> > calculations/quantisation, but is there anything else? What would be
> > the limit, assuming no such precision issues? Would would there be any
> > limit, would we bring out any artifacts, and if so why?
>
> Why worry about real effects on an unreal computation? Making the
> computation a real one has enough requirements and effects that you
> aren't aware of.
I'm talking about theory here, not implementation. But I won't hold it against you either, it seems like it's a custom around here to ponder practicality and feasibility before establishing the theoretical possibility of something.
"Michel Rouzic" <Michel0528@yahoo.fr> wrote in message
news:bfd2dcfb-20dc-4df1-895e-560d4dd9fb52@z28g2000prd.googlegroups.com...

> but my point is that the frequency response of a stationary tone being
> the Fourier transform of the window function, if you window with a
> Gaussian function then a stationary tone will appear as a Gaussian
> function in the frequency domain.
What about two or three tones?
> And because I believe you can
> deconvolve a Gaussian function into a Dirac delta function, you could
> therefore tremendously increase both the time and frequency resolution
> of a spectrogram at the same time.

Nope. If you know a priori that the input is a piece of a sinewave, then you can deconvolve. But if you don't know what the input is, you can't deconvolve. More generally, if the input can be represented as a parametric model, then the best you can do is estimate the parameters of the model.

Vladimir Vassilevsky
DSP and Mixed Signal Consultant
www.abvolt.com
Michel

I think it will be clearer to respond in an altered order, beginning
with your conclusion:

On Dec 28, 12:37 pm, Michel Rouzic <Michel0...@yahoo.fr> wrote:

> I'm talking about theory here, not implementation. But I won't hold it
> against you either, it seems like it's a custom around here to ponder
> practicality and feasibility before establishing the theoretical
> possibility of something.

We discuss things here that fall into at least four categories:
Group 1
Theoretical limits of infinite and continuous extent
Group 2
Theoretical limits of finite and sampled extent
Group 3
Limits of
  signal environment, noise, interference
  robustness
  algorithm choice
Group 4
  coefficient generation method complexity
  is there enough RAM
  is the CPU fast enough
  etc.

The inexperienced here often combine group2 with group3 and group4 as 'implementation details'. However both group2 and group1 are theoretical. Group 3 might be 'feasibility' and Group 4 might be 'practicality'.

It's a custom around here to consider the theoretical requirements of group 2 even with those unaccustomed to applying them, whether the unaccustomed are aware of their nature or not.

> dbd wrote:
> > On Dec 28, 5:38 am, Michel Rouzic <Michel0...@yahoo.fr> wrote:
...
> Oh I'm not talking about the 1D signal not being aliased but its
> spectrogram not being aliased. Because when we get a spectrogram using
> a STFT we "decimate" (by taking chunks spaced by N samples) therefore
> we get aliased spectrograms.

'Spacing by N samples' is an incorrect processing choice (poor processing). Group2 considerations allow strides of any integer value. Dynamic signal analyzers provide selectable overlaps to deal with this correctly.

...
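
The point about strides can be illustrated with a small framing helper. This is a sketch only: `frame_signal` is a hypothetical name, and the signal length, window length and hop values are arbitrary. The hop (stride) is an independent parameter, so the overlap between successive frames can be anything from none (hop equal to the window length) to all but one sample (hop of 1).

```python
import numpy as np

def frame_signal(x, n_win, hop):
    """Split x into frames of n_win samples whose starts are `hop` samples apart."""
    n_frames = 1 + (len(x) - n_win) // hop
    starts = hop * np.arange(n_frames)
    idx = starts[:, None] + np.arange(n_win)[None, :]
    return x[idx]                               # shape: (n_frames, n_win)

x = np.arange(1024, dtype=float)                # ramp, so frame contents are easy to read
n_win = 256
shapes = {}
for hop in (256, 64, 1):                        # no overlap, 75 % overlap, hop of one sample
    frames = frame_signal(x, n_win, hop)
    shapes[hop] = frames.shape
    overlap = 1 - hop / n_win
    print(f"hop={hop:3d}  frames={frames.shape[0]:3d}  overlap={overlap:.1%}")
```

Each frame would then be windowed and transformed as usual; only the spacing of the columns in the time direction changes.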
> From that
> you can infer the 2D point spread function of the whole 2D image,
> which in this case is a 2D Gaussian function.
The usefulness of this function is limited by group3 realities.
> > > which is very convenient for it has no zero crossings in the
> > > space or frequency domain) of the image in question,
> >
> ...
> > If there is a real application, that
> > is a calculation that will be performed on real data, the window will
> > be finite and the response of the truncated Gaussian will have zero
> > crossings.

These are group2 concerns with your approach.

...
> Convenient for deconvolution? If I'm not mistaken it's a bit
> problematic to recover the value of x from the result of x * 0. As for
> the finiteness of things, it's no problem as you can just consider
> that anything out of bounds is equal to zero.

Group2 concerns bring in zeros. You need an approach that is robust to the presence of zeros in the finite sampled domains.

...
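
One common way to get that kind of robustness, shown here purely as an illustration and not as something proposed in this thread, is a regularised (Tikhonov-style) inverse filter: instead of dividing by H, multiply by conj(H) / (|H|^2 + eps), which stays finite where H is zero. The helper name `regularised_deconvolve`, the 8-tap moving-average blur (chosen because its DFT has exact zeros) and `eps = 1e-6` are all assumptions.

```python
import numpy as np

def regularised_deconvolve(y, h, eps):
    """Invert a circular convolution with h, damping frequencies where |H| is tiny."""
    H = np.fft.fft(h, len(y))
    G = np.conj(H) / (np.abs(H) ** 2 + eps)     # ~1/H where |H| is large, ~0 where H ~ 0
    return np.real(np.fft.ifft(np.fft.fft(y) * G))

n = 256
x = np.zeros(n)
x[100] = 1.0                                    # single spike (assumed test signal)

h = np.zeros(n)
h[:8] = 1.0 / 8                                 # 8-tap moving average: zeros at bins 32, 64, ...
H = np.fft.fft(h)
y = np.real(np.fft.ifft(np.fft.fft(x) * H))     # blurred observation

smallest_gain = np.min(np.abs(H))               # effectively zero, so 1/H is unusable
x_hat = regularised_deconvolve(y, h, eps=1e-6)
```

The estimate stays finite and puts the spike back where it was, but whatever the blur removed at its zero frequencies is not recovered; `eps` trades that loss against noise amplification.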
> By
> "controlling the second moment of the window" to "select the scale of
> details" you're making a choice between time resolution and frequency
> resolution. What I'm trying to talk about is a way not to have to make
> that choice and get all the detail, by catching it all at once and
> sharpening it, if you will. I'm just trying to figure out what would
> be the limitations of this.

People have been giving you examples from groups 2 and 3.
> I have no idea what you're referring to by
> "process poorly".

See the 'chunks spaced by N' remarks and inability to deal with zeros.

...

This part discussed at top of message:

> I'm talking about theory here, not implementation. But I won't hold it
> against you either, it seems like it's a custom around here to ponder
> practicality and feasibility before establishing the theoretical
> possibility of something.
Dale B. Dalrymple
On 29 Des, 06:52, dbd <d...@ieee.org> wrote:
> Michel
>
> I think it will be clearer to respond in a altered order beginning
> with you conclusion:
>
> On Dec 28, 12:37 pm, Michel Rouzic <Michel0...@yahoo.fr> wrote:
>
> > I'm talking about theory here, not implementation. But I won't hold it
> > against you either, it seems like it's a custom around here to ponder
> > practicality and feasibility before establishing the theoretical
> > possibility of something.
>
> We discuss things here that fall into at least four categories:
> Group 1
> Theoretical limits of infinite and continuous extent
> Group 2
> Theoretical limits of finite and sampled extent
> Group 3
> Limits of
>   signal environment, noise, interference
>   robustness
>   algorithm choice
> Group 4
>   coefficient generation method complexity
>   is there enough RAM
>   is the CPU fast enough
>   etc.
>
> The inexperienced here often combine group2 with group3 and group4 as
> 'implementation details'. However both group2 and group1 are
> theoretical. Group 3 might be 'feasibility' and Group 4 might be
> 'practicality'.
>
> It's a custom around here to consider the theoretical requirements of
> group 2 even with those unaccustomed to applying them, whether the
> unaccustomed are aware of their nature or not.

Very good summary. If the comp.dsp FAQ still exists, this ought to go straight in there.

Rune
On 29 Dez., 16:46, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 29 Des, 06:52, dbd <d...@ieee.org> wrote:
>
> > Michel
> >
> > I think it will be clearer to respond in a altered order beginning
> > with you conclusion:
> >
> > On Dec 28, 12:37 pm, Michel Rouzic <Michel0...@yahoo.fr> wrote:
> >
> > > I'm talking about theory here, not implementation. But I won't hold it
> > > against you either, it seems like it's a custom around here to ponder
> > > practicality and feasibility before establishing the theoretical
> > > possibility of something.
> >
> > We discuss things here that fall into at least four categories:
> > Group 1
> > Theoretical limits of infinite and continuous extent
> > Group 2
> > Theoretical limits of finite and sampled extent
> > Group 3
> > Limits of
> >   signal environment, noise, interference
> >   robustness
> >   algorithm choice
> > Group 4
> >   coefficient generation method complexity
> >   is there enough RAM
> >   is the CPU fast enough
> >   etc.
> >
> > The inexperienced here often combine group2 with group3 and group4 as
> > 'implementation details'. However both group2 and group1 are
> > theoretical. Group 3 might be 'feasibility' and Group 4 might be
> > 'practicality'.
> >
> > It's a custom around here to consider the theoretical requirements of
> > group 2 even with those unaccustomed to applying them, whether the
> > unaccustomed are aware of their nature or not.
>
> Very good summary. If the comp.dsp FAQ still exists, this ought to
> go straight in there.

In fact, it does exist (http://www.bdti.com/faq/) but it is hopelessly out of date (probably because it is maintained by one single person who has other duties with higher priorities).

I have been thinking if we shouldn't construct the FAQ as a wiki, so that (registered) editors (regulars) can update the FAQ at their leisure. This again had me thinking that many of the points in the FAQ (books, software, online resources, etc.) and more DSP resources have already been compiled and are kept up-to-date at another location (dsprelated.com). Briefly, I wondered whether we should ask Stephane to host the comp.dsp FAQ as a wiki and point those out-of-date sections on books, software, etc. to his maintained links. This has brought up the doubt of whether an open group FAQ should be hosted on a private page (as it is now, as well), and whether it is ok for Stephane to make money with it. Also, I thought of r b-j and Wikipedia. This has led me down a moral blind alley and thus I postponed a request to the group in this matter for later. I guess now is later :-).

Comments?

Regards,
Andor
Andor <andor.bariska@gmail.com> writes:
> [...]
> In fact, it does exist (http://www.bdti.com/faq/) but it is hopelessly
> out of date (probably because it is maintained by one single person
> who has other duties with higher priorities).
>
> I have been thinking if we shouldn't construct the FAQ as a wiki, so
> that (registered) editors (regulars) can update the FAQ at their
> leasure. This again had me thinking that many of the points in the FAQ
> (books, software, online resources, etc. ) and more DSP resources have
> already been compiled and are kept up-to-date at another location
> (dsprelated.com). Briefly, I wondered whether we should ask Stephane
> to host the comp.dsp FAQ as a wiki and referencing those out-of-date
> sections on books, software, etc. to his maintained links. This has
> brought up the doubt of whether an open group FAQ should be hosted on
> a private page (as it is now, as well), and whether it is ok for
> Stephane to make money with it. Also, I thought of r b-j and
> Wikipedia. This has led me down a moral blind alley and thus I
> postponed a request to the group in this matter for later. I guess now
> is later :-).
>
> Comments?

Hi Andor,

I applaud your incentive to gather things together in one spot and update the material. But as you've already noted, putting it on dsprelated would be following the same pattern we've already seen twice now.

I propose that we get a domain independent of any business or individual; in essence it would be the comp.dsp community at large. Place this sort of material there. The money that is made by the site would be "comp.dsp" money. It could be used primarily to keep the site running (pay ISP costs, maintenance costs, etc.) and if much more is left over, perhaps to fund comp.dsp conferences. If even more is left over, perhaps we could start a scholarship fund.

Just a few ideas from me, I and myself.
--
%  Randy Yates                  % "The dreamer, the unwoken fool -
%% Fuquay-Varina, NC            %  in dreams, no pain will kiss the brow..."
%%% 919-577-9882                %
%%%% <yates@ieee.org>           % 'Eldorado Overture', *Eldorado*, ELO
http://www.digitalsignallabs.com