comp.dsp | Sorry about the many posts| page 3

Reply by Cedron ● March 4, 2017

>
>Please don't tell me that it is incredibly complicated. Now I have to
>solve it!
>
>Btw, the solution to the cubic equation is one of the most fascinating
>stories of Mathematics. The men, the methods and the drama. It took
>place in Italy almost 500 years ago.
>
>Michael

Michael,

I took a stab at this again and I've got some good news and some bad
news.


First the good news:

There is a K (and any scalar multiple of it) which reduces the quadratic
of cos(alpha) in the numerator to being linear.  More on this below.

Of the three factors in the denominator, one will be common across three
bins of W, so it might be able to be factored out leaving a quadratic
instead of a cubic equation to be solved.  The case of a five value K will
have three factors in common, so it too, may be solvable as a quadratic.

I have not tried separating the equations into their real and imaginary
parts, so there may be some hope there.


Now the bad news:

The K that kills the quadratic term does not meet Dale's specification of
"pick a cosine sum window already used".

I still don't think it is readily solvable, and if it is, I have to say
again that it is incredibly complicated.


Here is the killer K:

Let H = e^( -i * Pi / N )

Let K = ( -1/H  1/H + H  -H ) * 1/4

For large N, H ~=~ 1 so K becomes ( -1  2  -1 ) * 1/4 which matches Dale's
given set.

The window function corresponding to Dale's set is

( 1 - cos( 2Pi/N * n ) ) * 1/2 = sin^2( Pi/N * n )

Von Hann's function, according to Wikipedia, is:

( 1 - cos( 2Pi/(N-1) * n ) ) * 1/2 = sin^2( Pi/(N-1) * n )

Window functions centered at N/2, like the first one, are known as
"periodic".  Window functions centered at (N-1)/2 are known as
"symmetric".  Again, this is according to Wikipedia which is the ultimate
standard of truth on the internet.  (snark)

The window function of the killer K is

[ cos( Pi/N ) - cos( Pi/N * (2n+1) ) ] * 1/2
    = sin( Pi/N * (n+1) ) * sin( Pi/N * n )

Which by definition is a symmetric window function.

You might also note that 4 * H * K is the same vector that appears in the
three bin formula for the exact frequency in my blog article.
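
The two kernel/window pairs above can be checked numerically. What follows is only a sketch: it assumes the DFT convention Z_k = sum of x[n] * e^( -i * 2Pi * k * n / N ) and the pairing W_k = ( Z_{k-1} Z_k Z_{k+1} ) dot K, under which a kernel K corresponds to the time-domain window k_{-1} * e^( i * 2Pi * n / N ) + k_0 + k_1 * e^( -i * 2Pi * n / N ). The helper name window_from_kernel and the choice N = 16 are made up for illustration.

```python
import cmath
import math


def window_from_kernel(kernel, N):
    """Time-domain window implied by a 3-value frequency-domain kernel.

    Assumes W_k = k[-1]*Z_{k-1} + k[0]*Z_k + k[+1]*Z_{k+1} with
    Z_k = sum_n x[n] * exp(-2j*pi*k*n/N).
    """
    k_m1, k_0, k_p1 = kernel
    return [
        k_m1 * cmath.exp(2j * math.pi * n / N)
        + k_0
        + k_p1 * cmath.exp(-2j * math.pi * n / N)
        for n in range(N)
    ]


N = 16  # arbitrary frame length for the check
H = cmath.exp(-1j * math.pi / N)

hann_kernel = (-0.25, 0.5, -0.25)
killer_kernel = (-0.25 / H, 0.25 * (1 / H + H), -0.25 * H)

# Closed forms quoted in the post.
hann_closed = [math.sin(math.pi * n / N) ** 2 for n in range(N)]
killer_closed = [
    math.sin(math.pi * (n + 1) / N) * math.sin(math.pi * n / N) for n in range(N)
]

hann_err = max(
    abs(w - c) for w, c in zip(window_from_kernel(hann_kernel, N), hann_closed)
)
killer_err = max(
    abs(w - c) for w, c in zip(window_from_kernel(killer_kernel, N), killer_closed)
)

# Symmetry about (N-1)/2: w[n] should equal w[N-1-n].
killer_sym_err = max(
    abs(killer_closed[n] - killer_closed[N - 1 - n]) for n in range(N)
)

print(hann_err, killer_err, killer_sym_err)
```

All three printed values should be at round-off level if the convention assumed here matches the one used in the post.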

I hope you are having fun with this.  It's quite a challenge.

Ced



---------------------------------------
Posted through http://www.DSPRelated.com

Reply by Michael Plet ● March 4, 2017

On Sat, 04 Mar 2017 15:22:58 -0600, "Cedron" <103185@DSPRelated>
wrote:

>[...snip...]


Cedron,

I haven't started working on this and with you providing all the
conclusions I don't see why I should do so.

Michael

Reply by Michael Plet ● March 4, 2017

On Fri, 03 Mar 2017 19:32:21 -0500, robert bristow-johnson
<rbj@audioimagination.com> wrote:

>On 3/3/17 5:31 PM, Cedron wrote:
>>>
>>> Since you like music, math and puzzles, let me pose a puzzle for you to
>>> consider.
>>>
>> [...snip...]
>>>
>>> My puzzle for you is to find an exact estimator for frequency in a
>>> windowed transform. That's still a bit of a wide topic, so let me give
>>> more background and narrow it down some.
>>>
>>> There is a family of windows called "cosine sum" windows that have the
>>> characteristic that they can be applied in the time domain, but also in
>>> the frequency domain where they are applied by convolving the Fourier
>>> coefficients with a small kernel. For example, the classic von Hann
>>> window has the kernel coefficients: -1/4 +1/2 -1/4.
>>>
>> [...snip...]
>>
>>
>> It's nice of you to challenge Michael, but I can tell you that I have run
>> down this path and it is either incredibly complicated or not possible.
>> Furthermore, it is not necessary.
>>
>> First, the reason it is not possible, or extremely difficult, is due to
>> the nature of the bin value formula for a pure tone.  The best form of
>> this equation for this purpose is the one I give as Equation 25 in my blog
>> article "DFT Bin Value Formulas for Pure Real Tones" which can be found
>> here:
>>
>> https://www.dsprelated.com/showarticle/771.php
>>
>> The bugaboo is the cos( beta_k ) term in the denominator.  Consider
>> generalizing to any window function that can be implemented by
>> "convolving the Fourier coefficients with a small kernel."
>>
>> Let K = ( k_{-1} k_0 k_1 ) be such a kernel.  An example would be
>> the Von Hann coefficients you provided of ( -1/4 +1/2 -1/4 ).
>>
>> Let's call the bins of an unwindowed DFT Z_k, and the bins of the windowed
>> one W_k.
>>
>> W_k = ( Z_{k-1} Z_k Z_{k+1} ) dot K
>>
>> Deriving an exact formula for W_k would lead to an expression that has a
>> cubic equation of cos( alpha ) in the denominator and a quadratic equation
>> of cos( alpha ) in the numerator which would include mixed values of U and
>> V (which both also contain alpha).
>>
>> Solving for an exact frequency equation means manipulating the equations
>> for W_{k-1}, W_k, and W_{k+1} (for a 3 bin equation) in order to eliminate
>> the M, U, and V unknowns.  If this is doable, and I'm not sure it is, it
>> leaves you with a cubic equation of cos( alpha ).  There is a generalized
>> way to solve cubic equations, but it is much more complicated than the
>> quadratic formula.
>>
>> Once you have solved for cos( alpha ), choosing the correct root, then the
>> value of alpha will yield your frequency.
>>
>> If you step up to a five value kernel, you will similarly get a fifth
>> degree equation for cos( alpha ) for which there is no general analytic
>> solution available.
>>
>> Second, it is not necessary.  The primary reason for employing a window
>> function is to reduce the size of the side lobes (aka "spectral leakage")
>> for all the tones in the signal.  The conventional thinking[1] is the side
>> lobes are undesirable because they can interfere with, and even mask,
>> other tones that are in the signal.  There is a better way to deal with
>> this problem that I am working up to in my series of blog articles.  It
>> involves building a list of the tones that are present and iteratively
>> calculating their parameters.  It converges rapidly to very accurate
>> values.  In the case of a noiseless signal consisting of steady tones
>> through the analysis frame, it converges to an exact answer.
>>
>> In the case of analyzing music, the latter assumption is generally not
>> true.  I have some other tricks that make it true on a piecemeal basis
>> which I am currently working on coding (among other things).  Yes, this
>> also involves the concept of tone trajectories mentioned in the PhD thesis
>> you referenced.
>>
>
>well, i'm gonna toss in something i did nearly 2 decades ago:
>
>  http://ieeexplore.ieee.org/document/969581/
>
>you can get a free copy at researchgate:
>
> 
>https://www.researchgate.net/publication/3927319_Intraframe_time-scaling_of_nonstationary_sinusoids_within_the_phase_vocoder 
>
>
>if you use a Gaussian window, compute the FFT, and perform the complex 
>logarithm on those results, you can estimate not just the frequency of 
>the sinusoid, but also the linear sweep rate of the frequency and the 
>ramp rate of the amplitude.  just by fitting a line to a set of points.

This is very good input, rbj. The Gaussian is unique in that sense and
will reduce sidelobes.

Michael

Reply by Cedron ● March 4, 2017

>
>
>Cedron,
>
>I haven't started working on this and with you providing all the
>conclusions I don't see why I should do so.
>
>Michael

Michael,

I'm sorry, these weren't meant to be spoilers of any kind, and I don't think
they are.  I was thinking about what I said yesterday, did a little
scratch work, and found out what I said wasn't entirely correct.

I am a little less pessimistic about it being solvable than I was. 
Solving it for a general K is still the ultimate goal.  If an exact
solution is not possible, there may still be a very good approximation
lurking in there.  I don't think I reached a point of any conclusions at
all.

Whether it is worth pursuing or not is up to you.  Dale stated that it
would be valuable for him and for others.  I don't agree that it is that
valuable, but it is still an interesting problem.

I'm not going to work on it anymore.  It's all yours, or Dale's, or
anybody else that may want to tackle it.

Ced

Reply by Michael Plet ● March 5, 2017

On Sat, 04 Mar 2017 19:23:10 -0600, "Cedron" <103185@DSPRelated>
wrote:

>[...snip...]


I wasn't being sarcastic. Also in view of what rbj said about the
Gaussian I'm not working on it.
So please continue. If anyone can do it, it's you Cedron.

Michael

Reply by robert bristow-johnson ● March 5, 2017

On Saturday, March 4, 2017 at 5:10:08 PM UTC-5, Michael Plet wrote:
> [...snip...]
> 
> This is a very good input rbj. The Gaussian is unique in that sense and
> will reduce sidelobes.
> 

well, the Gaussian goes on forever, but it decays fast, so we can truncate it where it is very close to zero.  that truncation causes sidelobes (if it went on forever there would be none, since the Fourier transform of a Gaussian is a Gaussian), but the sidelobes aren't big if you let the window get real close to zero before truncating.

but the main reason for the Gaussian window is because it is the same mathematical form as a linearly-swept sinusoid:

     x(t) =  e^(-a/2 t^2) e^(j b/2 t^2) e^(j w0 t)

and the instantaneous frequency is

     w = w0  +  b t

so "a" is the measure for how narrow the window is.  "b" is the sweep rate for the angular frequency, w, and in the middle of the window, t=0 and w=w0, which is likely the frequency you are looking for.
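
A small time-domain check of the signal model above, offered only as a sketch. It is not the method from the paper, which works on the complex logarithm of the FFT bins. It only illustrates that, for this x(t), the phase advance between x(t-dt) and x(t+dt) divided by 2*dt equals w0 + b*t, so a least-squares line through those points returns b as the slope and w0 as the intercept. The parameter values and the helper names chirp and fit_line are assumptions chosen for illustration.

```python
import cmath


def chirp(t, a, b, w0):
    """x(t) = e^(-a/2 t^2) e^(j b/2 t^2) e^(j w0 t), as written in the post."""
    return cmath.exp(-0.5 * a * t * t) * cmath.exp(1j * (0.5 * b * t * t + w0 * t))


def fit_line(xs, ys):
    """Ordinary least-squares line fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Arbitrary illustrative values: window narrowness, sweep rate, center frequency.
a, b, w0 = 0.8, 0.5, 2.0
dt = 0.01
ts = [k * dt for k in range(-300, 301)]

# Phase advance across a symmetric step, divided by the step length.
# For a quadratic phase this central difference equals w0 + b*t exactly,
# provided the advance stays inside (-pi, pi), which it does for these values.
inst_freq = [
    cmath.phase(chirp(t + dt, a, b, w0) * chirp(t - dt, a, b, w0).conjugate())
    / (2 * dt)
    for t in ts
]

b_est, w0_est = fit_line(ts, inst_freq)
print(b_est, w0_est)
```

The real Gaussian factor only scales the magnitude, so it drops out of the phase; that is why "a" does not appear in the recovered line.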

r b-j

Reply by dbd ● March 6, 2017

On Friday, March 3, 2017 at 2:32:02 PM UTC-8, Cedron wrote:
> ...
>  The primary reason for employing a window
> function is to reduce the size of the side lobes (aka "spectral leakage")
> for all the tones in the signal.  The conventional thinking[1] is the side
> lobes are undesirable because they can interfere with, and even mask,
> other tones that are in the signal.

What I told Michael was:
"discrete Fourier family transform music/voice schemes for both compression for storage and transmission and for analysis/synthesis for generation and analysis require the use of (non-rectangular) windowed overlapped transforms"
This is true but the -requirement- is motivated by reconstruction, independent of controlling spectral leakage, although it works for that too.

>  There is a better way to deal with
> this problem that I am working up to in my series of blog articles.  It
> involves building a list of the tones that are present and iteratively
> calculating their parameters.  It converges rapidly to very accurate
> values.  In the case of a noiseless signal consisting of steady tones
> through the analysis frame, it converges to an exact answer.

This has been in the dsp literature for 50 years. You don't find it applied much because it is not adequate for the representation of music or voice.

> 
> In the case of analyzing music, the latter assumption is generally not
> true.  I have some other tricks that make it true on a piecemeal basis
> which I am currently working on coding (among other things).  Yes, this
> also involves the concept of tone trajectories mentioned in the PhD thesis
> you referenced.

In the cases of music and speech there are vital components that are stochastic in nature (and some others) that are not usefully represented by periodic basis functions. In the Serra thesis these are called residuals. These are not small errors that would disappear if we let the algorithm converge longer. They are components of significant energy. Serra plots examples of decompositions into coherent and residual parts. The important part of the algorithm is the scheme which resolves what energy is part of each.

> 
> Ced
> ...

Dale B. Dalrymple

Reply by Cedron ● March 6, 2017

>>
>>Whether it is worth pursuing or not is up to you.  Dale stated that it
>>would be valuable for him and for others.  I don't agree that it is that
>>valuable, but it is still an interesting problem.
>>
>>I'm not going to work on it anymore.  It's all yours, or Dale's, or
>>anybody else that may want to tackle it.
>>
>>Ced
>>---------------------------------------
>>Posted through http://www.DSPRelated.com
>
>
>I wasn't being sarcastic. Also in view of what rbj said about the
>Gaussian I'm not working on it.
>So please continue. If anyone can do it, it's you Cedron.
>
>Michael

I didn't think you were being sarcastic.  Personally, I don't think it is
worth pursuing.  If it is doable, the end result is going to be a formula
that is significantly more complicated than the ones we already have.

The nice thing about the periodic Hann Window is that it can be
implemented with the ( -1/4 1/2 -1/4 ) convolution on the raw DFT.  (I was
unable to solve for a similar kernel for the symmetric, N-1 based, Hann
function and I don't think it is possible to do so.)

All I did in my scratch work was derive the bin value formula for a
general K.  What I found out was that there was only one K which
eliminated the quadratic term of cos( alpha ) and took a significant chunk
out of the linear term.  The effect of this should be reduced ripples in
the side lobes, and a more consistent fall off shape across the frequency
range.  

This K turned out to be the same vector that is part of my exact frequency
formula for the same reason.  It is the only vector that is orthogonal to
( 1 1 1 ) and ( R_1  1  1/R_1 ) where R_1 is exp( -i * 2Pi / N ), the first
root of unity in the DFT calculation.  It can be found by taking the cross
product of those two vectors.  I then rescaled the resulting vector to
make it exponentially symmetric.  H is the primary square root of R_1, that
is, the point halfway to R_1 along the unit circle.
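
The cross-product construction can be sketched as below. Two things here are assumptions and not statements from the post: "orthogonal" is read as the plain (non-conjugated) dot product being zero, and the rescaling factor 1 / ( 4 * ( H - 1/H ) ) was worked out for this illustration. The helper names cross and dot and the choice N = 16 are likewise made up.

```python
import cmath
import math


def cross(a, b):
    """Ordinary 3-vector cross product; works for complex entries too."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Plain bilinear dot product (no complex conjugation)."""
    return sum(x * y for x, y in zip(a, b))


N = 16  # arbitrary frame length for the check
H = cmath.exp(-1j * math.pi / N)
R1 = H * H  # exp(-i * 2Pi / N)

ones = (1, 1, 1)
roots = (R1, 1, 1 / R1)

raw = cross(ones, roots)

# Assumed rescaling: algebraically raw = (H - 1/H) * (-1/H, 1/H + H, -H).
scale = 1 / (4 * (H - 1 / H))
K = tuple(scale * v for v in raw)

expected = (-0.25 / H, 0.25 * (1 / H + H), -0.25 * H)
k_err = max(abs(u - v) for u, v in zip(K, expected))

print(k_err, abs(dot(K, ones)), abs(dot(K, roots)))
```

After rescaling, the outer entries are complex conjugates of each other and the middle entry is real, which appears to be what "exponentially symmetric" refers to.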

The Gaussian approach is very interesting in its own right, particularly
with its frequency sweep capability.  Unfortunately, exactness is lost due
to the tails being chopped off.  Still, it did second best in the low
noise range in Figure 4 of Julien's comparison paper.

I think I have set up how to solve for the frequency equation pretty well.
If anybody is interested I will post the bin value formula I derived, but
it is not that difficult to find it yourself.

Ced

Reply by Michael Plet ● March 6, 2017

On Sun, 5 Mar 2017 17:16:47 -0800 (PST), robert bristow-johnson
<rbj@audioimagination.com> wrote:

>[...snip...]


Thank you for the details. I'm going to write some code to try this
out on a few signals.

Michael

Reply by Cedron ● March 6, 2017

>On Friday, March 3, 2017 at 2:32:02 PM UTC-8, Cedron wrote:
>> ...
>>  The primary reason for employing a window
>> function is to reduce the size of the side lobes (aka "spectral leakage")
>> for all the tones in the signal.  The conventional thinking[1] is the side
>> lobes are undesirable because they can interfere with, and even mask,
>> other tones that are in the signal.
>
>What I told Michael was:
>"discrete Fourier family transform music/voice schemes for both
>compression for storage and transmission and for analysis/synthesis for
>generation and analysis require the use of (non-rectangular) windowed
>overlapped transforms"
>This is true but the -requirement- is motivated by reconstruction,
>independent of controlling spectral leakage, although it works for that
>too.
>

The fact that a periodic Hann has tapers in the time domain that fit
together perfectly when frames are 50% overlapped is another one of those
neat mathematical "tricks" that seem to permeate the DFT.  However,
all this does is make the recombining of frames implicit.  There are
other ways too, so it isn't really a requirement.  It is an acceptable
constraint in terms of solving for a frequency equation, though it does
complicate the process.
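
The 50% overlap property is easy to check. In the sketch below the function names and the frame length N = 16 are illustrative assumptions. The periodic Hann sin^2( Pi/N * n ) added to a copy of itself shifted by N/2 gives sin^2 + cos^2 = 1, while the symmetric, N-1 based version leaves a visible ripple.

```python
import math


def periodic_hann(N):
    """Periodic Hann: sin^2( Pi/N * n ), n = 0..N-1."""
    return [math.sin(math.pi * n / N) ** 2 for n in range(N)]


def symmetric_hann(N):
    """Symmetric Hann: sin^2( Pi/(N-1) * n ), n = 0..N-1."""
    return [math.sin(math.pi * n / (N - 1)) ** 2 for n in range(N)]


def overlap_sum(window):
    """Sum of the window and a copy shifted by half a frame (50% overlap)."""
    half = len(window) // 2
    return [window[n] + window[n + half] for n in range(half)]


N = 16  # arbitrary even frame length

periodic_sum = overlap_sum(periodic_hann(N))
symmetric_sum = overlap_sum(symmetric_hann(N))

# Ripple = spread between the largest and smallest overlapped sum.
periodic_ripple = max(periodic_sum) - min(periodic_sum)
symmetric_ripple = max(symmetric_sum) - min(symmetric_sum)

print(periodic_ripple, symmetric_ripple)
```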

>>  There is a better way to deal with
>> this problem that I am working up to in my series of blog articles.  It
>> involves building a list of the tones that are present and iteratively
>> calculating their parameters.  It converges rapidly to very accurate
>> values.  In the case of a noiseless signal consisting of steady tones
>> through the analysis frame, it converges to an exact answer.
>
>This has been in the dsp literature for 50 years. You don't find it
>applied much because it is not adequate for the representation of music
>or voice.
>

Without having bin value equations, this algorithm has to be implemented
in the time domain and another DFT taken.  It can be done in a much more
computationally efficient manner in the frequency domain.  The goal is not
necessarily to represent the signal completely; the goal is to find the
best fit set of tones.

>> 
>> In the case of analyzing music, the latter assumption is generally not
>> true.  I have some other tricks that make it true on a piecemeal basis
>> which I am currently working on coding (among other things).  Yes, this
>> also involves the concept of tone trajectories mentioned in the PhD thesis
>> you referenced.
>
>In the cases of music and speech there are vital components that are
>stochastic in nature (and some others) that are not usefully represented
>by periodic basis functions. In the Serra thesis these are called
>residuals. These are not small errors that would disappear if we let the
>algorithm converge longer. They are components of significant energy.
>Serra plots examples of decompositions into coherent and residual parts.
>The important part of the algorithm is the scheme which resolves what
>energy is part of each.
>

My goal has been to accurately reproduce the signal, not necessarily as a
platform for implementing sound effects.

I have always called this split the tonal and atonal components.  I got
into this when I decided to try to devise a better sound compression
algorithm than mp3.  I won't get into details, but the goal is to reduce
the signal to sets of parameters of basis functions (not all of them
sinusoidal).  The reconstruction in my scheme does not involve inverse
DFTs, but rather the evaluation of various functions based on the
parameters.

The blog article I did on exponential smoothing is a stepping stone in the
process of separating out the tonal from the atonal.  The "trajectory
tracking" is crucial for navigating around frames in which there are large
atonal components present, such as drum hits.

>> 
>> Ced
>> ...
>
>Dale B. Dalrymple

My advice, as I mentioned in my last response to Michael, would be to
stick to taking a raw DFT for your frequency analysis, and calculate your
Hann DFT from the raw one.  In harris's paper that I referenced earlier, he
makes the observation: "Thus a Hanning window applied to a real transform
of length N can be performed as N real multiplies on the time sequence or
as 2N real adds and 2N binary shifts on the spectral data."
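
As a sketch of that equivalence, the following compares the two routes on a made-up test tone. It assumes the unnormalized DFT Z_k = sum of x[n] * e^( -i * 2Pi * k * n / N ) with bin indices wrapping around modulo N, and uses a direct O(N^2) DFT to stay self-contained. The signal parameters and the helper name dft are arbitrary choices for illustration.

```python
import cmath
import math


def dft(x):
    """Direct O(N^2) DFT: Z_k = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


N = 16
# Arbitrary real test tone that does not land on a bin center.
x = [math.cos(2 * math.pi * 3.3 * n / N + 0.7) for n in range(N)]

# Route 1: raw DFT, then the ( -1/4 1/2 -1/4 ) kernel across neighboring bins.
Z = dft(x)
W_freq = [
    -0.25 * Z[(k - 1) % N] + 0.5 * Z[k] - 0.25 * Z[(k + 1) % N] for k in range(N)
]

# Route 2: multiply by the periodic Hann in the time domain, then take the DFT.
hann = [math.sin(math.pi * n / N) ** 2 for n in range(N)]
W_time = dft([h * v for h, v in zip(hann, x)])

max_err = max(abs(u - v) for u, v in zip(W_freq, W_time))
print(max_err)
```

The printed difference should be at round-off level, which is why the raw DFT can be kept for frequency analysis and the Hann-windowed bins derived from it afterwards.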

I am still puzzled about why Wikipedia's entry on the Hann function only
mentions the N-1 symmetric version.  Meanwhile harris uses the N periodic
version, which is also the version I have always used (it does make for a
better spectrograph display than the raw DFT).

Ced
