Exact Near Instantaneous Frequency Formulas Best at Peaks (Part 1)

Cedron DawgMay 12, 2017

Introduction

This is an article that is a another digression from trying to give a better understanding of the Discrete Fourier Transform (DFT). Although it is not as far off as the last blog article.

A new family of formulas for calculating the frequency of a single pure tone in a short interval in the time domain is presented. They are a generalization of Equation (1) from Rick Lyons' recent blog article titled "Sinusoidal Frequency Estimation Based on Time-Domain Samples"[1]. This same formula in a different context for a different purpose appears in one of my blog articles as Equation (29) in "DFT Bin Value Formulas for Pure Real Tones"[2]. This formula is exact. However, exactness is not an indicator of robustness. In the presence of noise, this formula performs rather poorly. Its performance is also worse on lower frequencies.

The generalization of this formula produces a set of exact formulas that have varying levels of computation along with corresponding varying levels of robustness. No approximations were taken in the derivation and they will give the true answer in a noiseless case. Remarkably, the formulas work on both real valued tones and complex valued tones.

In the real tone case, they should be evaluated centered at a peak of the signal in the time domain. An evaluation centered at a crossing point will theoretically be indeterminate and in practice extremely noise sensitive, as in useless. This placement consideration does not apply in the complex tone case.

Comparison to a DFT Based Approach

Time domain frequency formulas are applied on a small section of the signal. Frequency domain frequency formulas are applied to a small section of the frequency spectrum. In order to build the frequency spectrum a Discrete Fourier Transform (DFT) has to be applied to a larger section of the signal called a frame. One that encompasses at least a few cycles of the tone. If the frequency is changing a little bit across the frame, the DFT formula will return an estimation of an average frequency. For real tones, the formulas in this article can measure the frequency twice per wavelength, at the peak and at the trough.

The DFT approach will be much more calculation intensive. For a single tone of known approximate frequency, only two or three DFT bins need to be calculated. This is still way more calculations than the formulas in this article require. The DFT is expected to do much better in the presence of noise.

Pure Real Tone Signal Definition

This is the equation used in all my real tone blog articles to describe a discrete single pure tone: $$ S_n = M \cdot \cos( \alpha n + \phi ) \tag {1} $$ Since it represents a sequence of values, and not the continuous underlying function, the designation is $ S_n $, and not $ S(n) $. You will often find the signal represented by $x(n)$. A pure tone signal, whether real or complex, can be thought of as being the result of a dot traveling around the circumference of a unit circle centered at the origin of the complex plane. The value of $\phi$ specifies the initial position of the dot on the circle measured along the circumference from unity (The point (1,0) on the plane). The value of $\alpha$ specifies how far the dot moves along the circumference for each step of the sequence. The distance around the circumference is measured in radians, therefore the unit for $\phi$ is in radians and the unit for $\alpha$ is in radians per sample.

The cosine function gives the projection of the traveling dot onto the real axis, so it travels back and forth from 1 to -1. Finally, $M$ multiplies the projection value so the range of the signal is from $M$ to $-M$. The peak possible value of the signal will be $M$ which is also known as the amplitude.

One of the Family

Here is a member of the family of formulas: $$ \begin{aligned} P_{n,1} &= S_{n+2} + S_{n-2} \\ P_{n,2} &= S_{n+4} + S_{n-4} \\ P_{n,3} &= S_{n+6} + S_{n-6} \\ P_{n,4} &= S_{n+8} + S_{n-8} \end{aligned} \tag {2} $$ $$ \alpha = \frac{1}{2} \cos^{-1}\left( \frac{ 6 S_n + 4 P_{n,2} + P_{n,4} }{ 6 P_{n,1} + 2 P_{n,3} } \right) \tag {3} $$ This is the $d=2,k=4$ case. The $P$ values are called neighbor pairs. The $d$ specifies the spacing of the neighbor pairs to sum. The $k$ specifies the exponential order of the numerator of the fraction inside the inverse cosine function. It is also the number of neighbor pairs that need to be summed.

The $P$ variables can be unfurled into signal values and the formula can be seen as the inverse cosine of one weighted sum of a section of signal values divided by another. This member takes 7 sums, 5 multiplies, 1 divide, and 1 one inverse cosine.

See Appendix A for a list of the formulas up to $k=9$.

What is the Frequency?

The frequency term in the signal equation is $\alpha$, therefore solving for $\alpha$ means solving for the frequency in radians per sample. However, the frequency can be converted to other units which are more appropriate to the context of whatever application is involved.

The first conversion changes the measure from how fast the dot is traveling on the circumference to how many times it is going around the circle. The distance around the circumference is $2\pi$ radians. The units of the values in the equation are shown off to the right in the bracketed expressions. $$ f_{cpn} = \frac{\alpha}{2\pi} \qquad \left[\frac{cycles}{sample} = \frac{\frac{radians}{sample}}{\frac{radians}{cycle}}\right] \tag {4} $$ Frequencies that are expressed in units of cycles per sample are also called "normalized frequencies". The reciprocal value is the number of samples it takes to hold one wavelength.

When the signal is being processed a frame at a time, the natural unit for frequency is cycles per frame. This is particularly true when using a DFT as the bin index then corresponds to the frequency of that bin. $$ \begin{aligned} f_{cpf} &= f_{cpn} \cdot N \qquad \left[\frac{cycles}{frame} = \frac{cycles}{sample} \cdot \frac{samples}{frame} \right] \\ &= \alpha \cdot \frac{N}{2\pi} \qquad \left[= \frac{radians}{sample} \cdot \frac{\frac{samples}{frame}}{\frac{radians}{cycle}}\right] \end{aligned} \tag {5} $$ The measuring unit of frequency in real life contexts is generally Hertz, aka cycles per second. In this case, the sampling rate (samples per second) needs to be known in order to convert $\alpha$ into a Hz value. $$ \begin{aligned} f_{Hz} &= f_{cpn} \cdot f_s \qquad \left[\frac{cycles}{second} = \frac{cycles}{sample} \cdot \frac{samples}{second} \right] \\ &= \alpha \cdot \frac{f_s}{2\pi} \qquad \left[= \frac{radians}{sample} \cdot \frac{\frac{samples}{second}}{\frac{radians}{cycle}}\right] \end{aligned} \tag {6} $$

What is an Instantaneous Frequency?

Doing a little research shows that there are varying meanings to the term "Instantaneous Frequency". The differences in meanings depend on how many samples are taken and is it a time domain or frequency domain calculation. In this article, "instantaneous" means whatever is as close as possible to the continuous case of "now". In real time DSP applications, there is always going to be a slight lag from the arrival of a data point and what the "instantaneous frequency", whatever formula is being used, is at that point.

For a single pure tone, a true real valued sinusoidal, the concept of instantaneous frequency is easily understood. Since the frequency is constant, by definition, the frequency at every point is the same as the overall frequency. This is the assumed situation in the derivation of the formulas. Under these assumptions, the formulas are exact.

For a single tone that is changing in amplitude, but the zero crossing remain in the same place, should have the same frequency. Likewise, the concept of an instantaneous frequency is not hard to grasp for a single tone with a slightly varying frequency. In these cases, since the tones are not pure sinusoids the formulas will not be exact, but good estimates can be expected.

If two, or more, pure tones are added together, the concept of what an instantaneous frequency is becomes a little bit fuzzy.

Neighbor Pairs of Signal Values

To be as instantaneous as possible, the signal points that are used should be as close together as possible. The sample points used in the family of formulas consist of a center point and sets of pairs of neighbors. The number of pairs is one of the parameters that specify the member formula in the family.

A neighbor pair consists of one sample some interval in the future of the center point and one sample the same interval in the past. The future neighbor signal value can be found from the signal definition by adding the interval size to the subscript. $$ \begin{aligned} S_{n+m} &= M \cdot \cos[ \alpha (n+m) + \phi ] \\ &= M \cdot \cos[ (\alpha n + \phi) + (\alpha m) ] \\ &= M \cdot \left[ \cos( \alpha n + \phi ) \cos( \alpha m ) - \sin( \alpha n + \phi ) \sin( \alpha m ) \right] \end{aligned} \tag {7} $$ The neighbor signal value in the past can be defined in the same way. $$ S_{n-m} = M \cdot \left[ \cos( \alpha n + \phi ) \cos( \alpha m ) + \sin( \alpha n + \phi ) \sin( \alpha m ) \right] \tag {8} $$ The two neighbors are then added together to form a neighbor pair sum. $$ \begin{aligned} P_{n,m} &= S_{n+m} + S_{n-m} \\ &= 2M \cos( \alpha n + \phi ) \cos( \alpha m ) \\ &= 2 S_n \cos( \alpha m ) \end{aligned} \tag {9} $$ The sine terms cancel and the definition of the signal reappears. The equation can be rearranged a little bit to show that the cosines of multiples of $\alpha$ can be calculated from the value of the signal and neighboring pairs. $$ S_n \cos( \alpha m ) = \frac{P_{n,m}}{2} \tag {10} $$

Powers of Cosine

The cosine function has the special property that its powers can be expressed as a linear combination of cosines of the angle multiples. This exponential like behavior is due to its exponential like definition combined with the pattern of binomial expansion. $$ \cos^0(\theta) = 1 \tag {11} $$ $$ \cos^1(\theta) = \frac{ e^{i\theta} + e^{-i\theta} }{2} = \frac{1}{2} ( e^{i\theta} + e^{-i\theta} ) \tag {12} $$ $$ \begin{aligned} \cos^2(\theta) &= \frac{1}{2^2} ( e^{i2\theta} + 2 + e^{-i2\theta} ) \\ &= \frac{1}{2^2} [ 2 + 2 cos(2\theta) ] \end{aligned} \tag {13} $$ $$ \begin{aligned} \cos^3(\theta) &= \frac{1}{2^3} ( e^{i3\theta} + 3 e^{i\theta} + 3 e^{-i\theta} + e^{-i3\theta} ) \\ &= \frac{1}{2^3} [ 6 cos(\theta) + 2 cos(3\theta) ] \end{aligned} \tag {14} $$ $$ \begin{aligned} \cos^4(\theta) &= \frac{1}{2^4} ( e^{i4\theta} + 4 e^{i2\theta} + 6 + 4 e^{-i2\theta} + e^{-i4\theta} ) \\ &= \frac{1}{2^4} [ 6 + 8 cos(2\theta) + 2 cos(4\theta) ] \end{aligned} \tag {15} $$ The coefficients of Pascal's triangle are clearly visible in each equation. Notice that they double up when two terms are combined back into the cosine value.

This pattern can be continued to whatever power level is desired.

Signal Value Calculations

The terms in cosine power series are all cosines of multiples of the same angle. By multiplying the series by the signal value, all the cosine terms take on the form of (10). All the constant terms become coefficients to the signal value. $$ V_{n,k} = S_n \cos^k(\alpha) \tag {16} $$ When applied to all the series equations the results are this: $$ V_{n,0} = S_n \tag {17} $$ $$ V_{n,1} = S_n \cos(\alpha) = \frac{1}{2} P_{n,1} \tag {18} $$ $$ \begin{aligned} V_{n,2} &= \frac{1}{2^2} [ 2 S_n + 2 S_n cos(2\alpha) ] \\ &= \frac{1}{2^2} [ 2 S_n + P_{n,2} ] \end{aligned} \tag {19} $$ $$ \begin{aligned} V_{n,3} &= \frac{1}{2^3} [ 6 S_n cos(\alpha) + 2 S_n cos(3\alpha) ] \\ &= \frac{1}{2^3} [ 3 P_{n,1} + P_{n,3} ] \end{aligned} \tag {20} $$ $$ V_{n,4} = \frac{1}{2^4} [ 6 S_n + 4 P_{n,2} + P_{n,4} ] \tag {21} $$ Notice that when the neighbor pair value gets substituted in, the coefficient gets divided by two. This reverses the doubling up that occurred in the cosine power series leaving the original Pascal triangle coefficient. This makes it easy to construct any desired signal value equation straight from Pascal's triangle.

Each $V_{n,k}$ is a linear combination of the signal value and it neighbor pair sums. The sums can be unfurled and the equation can also be seen as a linear combination of signal values having Pascal triangle values as coefficients.

Frequency Calculation

The frequency calculation, that is solving for $\alpha$, is very straightforward. Simply dividing $V_{n,k}$ by $V_{n,k-1}$, the ratio of two signal calculations, gives $\cos(\alpha)$ as a result. $$ r = \frac{ V_{n,k} }{ V_{n,k-1} } = \frac{ S_n \cos^{k}(\alpha) }{ S_n \cos^{k-1}(\alpha) } = \cos(\alpha) \tag {22} $$ From there, taking the inverse cosine gets the actual angle value which is the parameter that represents the frequency. The frequency can then be converted to any units desired depending on the application. $$ \alpha_n = \cos^{-1}(r) = \cos^{-1}\left(\frac{ V_{n,k} }{ V_{n,k-1} }\right) \tag {23} $$ Just like in the DFT based real tone frequency formulas, an inverse cosine is taken to get the frequency value. The inverse cosine function can return many values. The principal value is the one that is generally sought. The other values represent alias frequencies.

Base Member

The simplest version of the formula is when $k=1$ and $d=1$. $$ r = \frac{ V_{n,1} }{ V_{n,0} } = \frac{\frac{1}{2} P_{n,1}}{S_n} = \frac{ P_{n,1} }{2 S_n} \tag {24} $$ $$ \alpha_n = \cos^{-1}(r) = \cos^{-1}\left( \frac{ P_{n,1} }{2 S_n} \right) = \cos^{-1}\left( \frac{ S_{n+1}+S_{n-1} }{2 S_n} \right) \tag {25} $$ This is the formula that is in Lyons' article[1].

Spacing the Neighbors

If being as instantaneous as possible isn't an important criteria, the results in noisy signals can generally be improved by spacing the samples used farther apart. This gives the formula a little wider stance on the signal. In the noiseless case, this doesn't matter.

Spacing the neighbor pair samples is equivalent to having the same signal with a lower sampling rate. The parameter used for the spacing step is called $d$. Folding this into (9) gives this equation: $$ P_{n,m} = S_{n+md} + S_{n-md} = 2 S_n \cos[ (\alpha d) m ] \tag {26} $$ All this effectively does is makes the quantity $(\alpha d)$ play the role of plain $\alpha$. This now changes (22) to this: $$ r = \cos(\alpha d) \tag {27} $$ Finally, (23) gets adjusted to this: $$ \alpha_n = \frac{1}{d} \cos^{-1}(r) = \frac{1}{d} \cos^{-1}\left(\frac{ V_{n,k} }{ V_{n,k-1} }\right) \tag {28} $$ Assuming the noise is the same for all the sample points, the probability distribution coming out of the inverse cosine function is going to be essentially independent of the spacing parameter. Therefore the impact of noise on the result is close to inversely proportional to the spacing step.

When the formula's stance on the signal exceeds the wavelength, the output of the inverse cosine function may need to have a multiple of $2\pi$ added before dividing by $d$ to get the correct result.

Selecting the Best Spacing Parameter

The most robust results are expected from the formula when the numerator is as close to zero as possible and the denominator is as large as possible. For the base member, this is easy to work out. The sample should be at the peak of a cycle and the spacing should be such that the neigbor pair's sum size is minimized. This means being near the adjacent zero crossings. If the neighbor pair sum is zero, any noise at the center sample, even the center sample value itself, is irrelevant. In this situation, the formula is acting just like a zero crossing calculation. The closer the sum is to zero, the less the noise at the center sample matters. Having the center sample at a peak makes the denominator as large as possible. The larger the denominator value is, the less the noise in the denominator matters, and the less the noise from the numerator matters.

The optimal spacing for the higher degree formula members has yet to be worked out. However, all members should be evaluated at a peak if possible. In the noiseless real tone case, the formula will give exact results at every point along the cycle except at a zero crossing point. When noise is present, the formula will be most robust at a peak and very noise sensitive near zero crossing points.

Better Signal Value

As a side effect of the way the calculations are done, it is possible to get a calculated value of what the signal value is at the sampling center. This is done by solving for the signal value $S_n$ in (16). $$ G_n = \frac{ V_{n,k} }{ \cos^k(\alpha d) } = \frac{ V_{n,k} }{ r^k } \tag {29} $$ The result gets renamed to $G_n$ to distinguish it from the sampled value. Technically it is a guess. In the noiseless situation, the two values will be the same. In a noisy implementation, the $G_n$ is more likely to be closer to the true signal value than the sampled $S_n$.

Numerical Example

Here are the results of one calculation. The easy to follow source code is listed in Appendix B.

Since it is a noiseless case, the results are theoretically exact and numerically accurate to the precision of the calculations.

    alpha =  0.0626894
   S[140] =  2.4132449
   S[142] =  2.5617878
   S[144] =  2.6701126
   S[146] =  2.7365186
   S[148] =  2.7599633
   S[150] =  2.7400787
   S[152] =  2.6771768
   S[154] =  2.5722454
   S[156] =  2.4269316
      Pn1 =  5.4765972
      Pn2 =  5.3472894
      Pn3 =  5.1340332
      Pn4 =  4.8401764
      num =  42.789114
      den =  43.127650
        r =  0.9921504
  alpha_n =  0.0626894
       Gn =  2.7599633

Complex Signals

The family of formulas has been derived under the assumption that the signal is a real valued sinusoid. Remarkably, the formulas also work on complex tones.

Suppose the signal is a pure complex tone: $$ S_n = M \cdot e^{i(\alpha n + \phi )} \tag {30} $$ Plugging this signal definition into (26), the spacing version of (9), yields the same results as the real pure tone. $$ \begin{aligned} P_{n,m} &= S_{n+md} + S_{n-md} \\ &= M \cdot e^{i(\alpha(n+md) + \phi )} + M \cdot e^{i(\alpha(n-md) + \phi )} \\ &= M \cdot e^{i(\alpha n + \phi )} e^{i(\alpha md)} + M \cdot e^{i(\alpha n + \phi )}e^{-i(\alpha md)} \\ &= M \cdot e^{i(\alpha n + \phi )} \left[ e^{i(\alpha md)} + e^{-i(\alpha md)} \right] \\ &= S_n \left[ 2 cos( \alpha md ) \right] \\ &= 2 S_n \cos[ (\alpha d) m ] \end{aligned} \tag {31} $$ Since the whole derivation is based on this relationship between neighbor pair sums and the cosine of the corresponding frequency term, it just as applicable to the complex signal as it is to the real signal.

Because the linear combination coefficients, in both the numerator and denominator, are symmetric about $ S_n $, the linear combination is going to be in the same direction of $ S_n $ on the complex plane. The numerator and denominator being in the same direction means that their quotient is going to be real valued.

That's for the noiseless ideal case. In the presence of noise there will likely be a small imaginary component to the quotient. This should be discarded when taking the inverse cosine rather than trying to use the complex version of the inverse cosine.

Conclusion

A new (?) family of time domain frequency formulas for pure tones has been developed. The members are all exact generalizations of an simple known exact formula. The family members are specified by two parameters: $d$ the spacing step for samples to use and $k$ which is the degree of the equations as well as the number of neighbor pairs needed. Increasing the degree, which increases the number of samples used, reduced the noise level in the results at the expense of extra calculations. Fortunately, the calculations are not intensive.

The formulas have integer coefficients so they are fixed point implementation friendly.

A slightly different, but very similar, family of formulas will be introduced in (Part 2). At the cost of just a few more calculations, much more robust results can be obtained using the same sample points.

References

[1] Lyons, R., Sinusoidal Frequency Estimation Based on Time-Domain Samples

[2] Dawg, Cedron, DFT Bin Value Formulas for Pure Real Tones

Appendix A. Formula Fractions

These are the first nine simplified versions of $ \frac{ V_{n,k} }{ V_{n,k-1} } $ for use in (28). An example of the $k=4$ can be found in (3). $$ 1: \frac{ P_{n,1} }{ 2 S_n } \tag {32} $$ $$ 2: \frac{ 2 S_n + P_{n,2} }{ 2 P_{n,1} } \tag {33} $$ $$ 3: \frac{ 3 P_{n,1} + P_{n,3} }{ 4 S_n + 2 P_{n,2} } \tag {34} $$ $$ 4: \frac{ 6 S_n + 4 P_{n,2} + P_{n,4} }{ 6 P_{n,1} + 2 P_{n,3} } \tag {35} $$ $$ 5: \frac{ 10 P_{n,1} + 5 P_{n,3} + P_{n,5} }{ 12 S_n + 8 P_{n,2} + 2 P_{n,4} } \tag {36} $$ $$ 6: \frac{ 20 S_n + 15 P_{n,2} + 6 P_{n,4} + P_{n,6} }{ 20 P_{n,1} + 10 P_{n,3} + 2 P_{n,5} } \tag {37} $$ $$ 7: \frac{ 35 P_{n,1} + 21 P_{n,3} + 7 P_{n,5} + P_{n,7} }{ 40 S_n + 30 P_{n,2} + 12 P_{n,4} + 2 P_{n,6} } \tag {38} $$ $$ 8: \frac{ 70 S_n + 56 P_{n,2} + 28 P_{n,4} + 8 P_{n,6} + P_{n,8} }{ 70 P_{n,1} + 42 P_{n,3} + 14 P_{n,5} + 2 P_{n,7} } \tag {39} $$ $$ 9: \frac{ 126 P_{n,1} + 84 P_{n,3} + 36 P_{n,5} + 9 P_{n,7} + P_{n,9} }{ 140 S_n + 112 P_{n,2} + 56 P_{n,4} + 16 P_{n,6} + 2 P_{n,8} } \tag {40} $$

Appendix B. Source Code

#include <math.h>
#include <stdio.h>
//====================================================
int main( int argc, char *argv[] )
{
//--- Set the Parameters

        double f_Hz = 440;
        double f_s  = 44100;
        int N = 441;
        int d = 2;

//--- Build a Signal Sequence

        double* S = new double[N];
        double M     = 2.76; 
        double alpha = 2.0 * M_PI * ( f_Hz / f_s );
        double phi   = -3;
        int    n_max = 0;
        double S_max = 0.0;
        for( int n = 0; n < N; n++ )
        {
            S[n] = M * cos( alpha * n + phi );
            if( S_max < S[n] )
            {
                S_max = S[n];
                n_max = n;
            }
        }            

//--- Calculate Neighbor Pairs

        double Pn1 = S[n_max+  d] + S[n_max-  d];
        double Pn2 = S[n_max+2*d] + S[n_max-2*d];
        double Pn3 = S[n_max+3*d] + S[n_max-3*d];
        double Pn4 = S[n_max+4*d] + S[n_max-4*d];

//--- Calculate The Numerator and Denominator

        double num = 6 * S[n_max] + 4 * Pn2 + Pn4;
        double den = 6 * Pn1      + 2 * Pn3;

//--- Calculate alpha

        double r = num / den;
        double alpha_n = acos( r ) / (double) d;

//--- Calculate S_n guess

        double Gn = num / pow( 2. * r, 4. );

//--- Print the Results

        printf( "\n" );
        printf( "    alpha = %10.7f\n", alpha );
        printf( "\n" );
        for( int n = n_max-4*d; n <= n_max+4*d; n += d )
        {
            printf( "   S[%d] = %10.7f\n", n, S[n] );
        }
        printf( "\n" );
        printf( "      Pn1 = %10.7f\n", Pn1 );
        printf( "      Pn2 = %10.7f\n", Pn2 );
        printf( "      Pn3 = %10.7f\n", Pn3 );
        printf( "      Pn4 = %10.7f\n", Pn4 );
        printf( "\n" );
        printf( "      num = %10.6f\n", num );
        printf( "      den = %10.6f\n", den );
        printf( "\n" );
        printf( "        r = %10.7f\n", r );
        printf( "\n" );
        printf( "  alpha_n = %10.7f\n", alpha_n );
        printf( "       Gn = %10.7f\n", Gn );
        printf( "\n" );

//--- Exit 

        delete S;
        return 0;
}
//====================================================

Previous post by Cedron Dawg:
   A Recipe for a Common Logarithm Table
Next post by Cedron Dawg:
   Exact Near Instantaneous Frequency Formulas Best at Peaks (Part 2)

Comments:

To post reply to a comment, click on the 'reply' button attached to each comment. To post a new comment (not a reply to a comment) check out the 'Write a Comment' tab at the top of the comments.

Registering will allow you to participate to the forums on ALL the related sites and give you access to all pdf downloads.

Sign up
or Sign in