DFT Bin Value Formulas for Pure Real Tones
Introduction
This is an article to hopefully give a better understanding to the Discrete Fourier Transform (DFT) by deriving an analytical formula for the DFT of pure real tones. The formula is used to explain the well known properties of the DFT. A sample program is included, with its output, to numerically demonstrate the veracity of the formula. This article builds on the ideas developed in my previous two blog articles:
- The Exponential Nature of the Complex Unit Circle
- DFT Graphical Interpretation: Centroids of Weighted Roots of Unity
Travel on the Complex Unit Circle
Any point on the unit circle in the complex plane can be located knowing the distance along the circumference by using Euler's equation. $$ e^{ i\theta } = cos( \theta ) + i \cdot sin( \theta ) \tag 1 $$ The conjugate of the point can be located by the same distance in the opposite direction along the circumference. $$ e^{ -i\theta } = cos( \theta ) - i \cdot sin( \theta ) \tag 2 $$ The projection of each of these points onto the real axis can be accomplished by finding their midpoint. $$ cos( \theta ) = { e^{ i\theta } + e^{ -i\theta } \over 2 } \tag 3 $$ As the point travels along the circumference of the circle at a steady speed, its projection on the real axis travels in a sinusoidal manner.
A Pure Real Tone Sequence
A pure real tone is a sinusiodal defined by three parameters. They are amplitude ($ M $), frequency ($ f $), and phase ($ \phi $). A discrete pure real tone is a set of signal values within a frame. A frame consists of a defined number of points, $ N $, and the signal values are indexed from $ 0 $ to $ N-1 $. The signal can be extended beyond the frame by using out of range subscripts. Here is a equation defining a pure real tone: $$ S_n = M cos( f \cdot { 2\pi \over N } n + \phi ) \tag 4 $$ The frequency is expressed in units of cycles per frame. This value needs to be rescaled to a distance along the circumference. For practical purposes the frequency should be constrained to be greater than zero and less than $ N/2 $. Frequencies outside this range will appear to be frequencies within the range. This phenomenon is known as aliasing.
Defining an angle variable makes the definition simpler. $$ \alpha = f \cdot { 2\pi \over N } \tag 5 $$ The signal can be reformulated using exponential functions rather than a trigonometric one. $$ S_n = M \cdot { e^{ i(\alpha n + \phi) } + e^{ -i(\alpha n + \phi) } \over 2 } \tag 6 $$
The Discrete Fourier Transform
By a common convention, the DFT is defined mathematically as: $$ Z_k = { 1 \over N } \sum_{n=0}^{N-1} { S_n e^{ -i2 \pi { k \over N } n } } \tag 7 $$ The summation can be made more comprehensible by making the following substitution: $$ \beta_k = k \cdot { 2\pi \over N } \tag 8 $$ Just like $ \alpha $ variable, the $ \beta_n $ variables represent distances along the circumference. In this case the $ \beta_k $s represent the circumference segmented into $ N $ pieces. The frequency ($ f $) is how many of these segments are travelled from one sample to the next. Therefore $ f $ and $ k $ are on the same scale, and similarly $ \alpha $ and the $ \beta_k $s are on a circumference distance (or radian angle) scale.
Making the substitution results in the following version of the DFT calculation: $$ Z_k = { 1 \over N } \sum_{n=0}^{N-1} { S_n e^{ -i\beta_k n } } \tag 9 $$
Roots of Unity
The $ \beta_k $s mark the locations along the circumference of a set of N Roots of Unity. $$ R_k = e^{ -i \beta_k } \tag {10} $$ The $ R_k $s are the complex number values at the Root of Unity locations. Because of the negative sign in the exponent, the $ \beta_k $s are indexed in a counter-clockwise manner starting with index zero at 1 (also known as Unity), and the Roots of Unity are indexed clockwise, also with index zero at 1.
Showing the each $ R_k $ is an Nth Root of Unity is straightforward: $$ R_k^N = (e^{ -i \beta_k })^N = (e^{ -i (k \cdot { 2\pi \over N }) })^N = (e^{ -i 2\pi })^k = 1^k = 1 \tag {11} $$
The DFT of a Pure Real Tone
The definition of the pure real tone sequence can now be plugged into the DFT definition to obtain a formula for the DFT values of a pure real tone. $$ Z_k = { 1 \over N } \sum_{n=0}^{N-1} { M \cdot { e^{ i(\alpha n + \phi) } + e^{ -i(\alpha n + \phi) } \over 2 } \cdot e^{ -i\beta_k n } } \tag {12} $$ This form isn't that useful. It can be made a little better by factoring out the constants. $$ Z_k = { M \over { 2N } } \sum_{n=0}^{N-1} { ( e^{ i(\alpha n - \beta_k n + \phi) } + e^{ -i(\alpha n + \beta_k n + \phi) } ) } \tag {13} $$
Separation into Geometric Series Forms
The summation of the sum is the sum of the summations. Furthermore, $ \phi $ is independent of $ n $ so it's exponentials can also be factored out of the summations. $$ Z_k = { M \over { 2N } } \left[ e^{ i \phi } \sum_{n=0}^{N-1} { e^{ i(\alpha - \beta_k )n } } + e^{ -i \phi } \sum_{n=0}^{N-1} { e^{ -i(\alpha + \beta_k )n } } \right] \tag {14} $$ The purpose of separating the summation and factoring out all possible constants is that it leaves the remaining summations as geometric series. The geometric series summation formula can be applied. $$ \sum_{n=0}^{N-1} {x^n} = { { 1 - x^N } \over { 1 - x } } \tag {15} $$ It is useful to note that $ \beta_k $ times $ N $ is a whole number multiple of $ 2 \pi $. $$ \beta_k N = k \cdot 2\pi \tag {16} $$ This allows the $ \beta_k $s to be removed from the numerator expressions when the geometric series summation formula is applied. The result is this: $$ Z_k = { M \over { 2N } } \left[ e^{ i \phi } \cdot { 1 - e^{ i\alpha N } \over 1 - e^{ i(\alpha - \beta_k ) } } + e^{ -i \phi } \cdot { 1 - e^{ -i\alpha N } \over 1 - e^{ -i(\alpha + \beta_k ) } } \right] \tag {17} $$ At this point the summation has been eliminated and a closed form analytic formula remains.
Handling the Integer Frequency Case
When f is an integer frequency the value $ \alpha N $ becomes a whole number multiple of $ 2 \pi $. This means the exponents in the numerator of (17) become 1 resulting in both numerators equalling zero. $$ \alpha N = f \cdot 2\pi \tag {18} $$ For all cases of $ k $, except when $ k=f $ or $ (N-f) $, the result is that the bin value is zero. When $ k=f $ the right term is still zero, but the left term becomes 0 divided by 0 which is an indeterminate form. This case can then be evaluated by going back to (14). The right summation is zero. The left summation becomes a sum of ones. There are $ N $ of them so the sum is $ N $. This $ N $ will cancel out the $ N $ in the denominator in the outside factor leaving a very simple equation: $$ Z_k = { M \over 2 } e^{ i \phi } \, \text{ when } \, k=f \tag {19} $$ When $ k $ equals $ N - f $, the reverse argument is true yielding the conjugate.
Non-Integer Frequencies
When f is a non-integer, the denominators in (17) can never be zero, so any indeterminate forms are not possible. In this case, it becomes a matter of plug and chug to pound the formula into a more useful form. $$ Z_k = { M \over { 2N } } \left[ { e^{ i \phi } - e^{ i(\alpha N + \phi) } - e^{ i( -\alpha - \beta_k + \phi ) } \\ + e^{ i(\alpha N -\alpha - \beta_k + \phi ) } + e^{ -i \phi } - e^{ -i(\alpha N + \phi) } \\ - e^{ -i( -\alpha + \beta_k + \phi ) } + e^{ -i(\alpha N -\alpha + \beta_k + \phi ) } \over 1 - e^{ i(\alpha - \beta_k ) } - e^{ -i(\alpha + \beta_k ) } + e^{ -i 2 \beta_k } } \right] \tag {20} $$ Many of the exponential expressions can be simplified by using (3). The only exponential expressions that remain are some that include $ \beta_k $. $$ Z_k = { M \over { 2N } } \left[ { 2cos( \phi ) - 2cos(\alpha N + \phi) \\ - 2cos( -\alpha + \phi )e^{ -i\beta_k } \\ + 2cos( \alpha N -\alpha + \phi )e^{ -i\beta_k } \over 2cos( \beta_k )e^{ -i\beta_k } - 2cos( \alpha )e^{ -i\beta_k } } \right] \tag {21} $$ A factor of $ -2e^{ -i\beta_k } $ can be pulled out of the numerator and denominator. $$ Z_k = { M \over { 2N } } \left[ { \left[ cos( \alpha N + \phi ) - cos(\phi) \right]e^{ i\beta_k } \\ - \left[ cos( \alpha N -\alpha + \phi ) - cos( -\alpha + \phi ) \right] \over cos( \alpha ) - cos( \beta_k ) } \right] \tag {22} $$
Simplification Substitutions
In order to make (22) more manageable, a couple of obvious substitutions can be made. $$ U = cos( \alpha N + \phi ) - cos(\phi) \tag {23} $$ $$ V = cos( \alpha N -\alpha + \phi ) - cos( -\alpha + \phi ) \tag {24} $$ Both these values reflect the signal parameters and are independent of the bin number. The resulting equation then reveals much more clearly the relationship between the bin number and the bin value. $$ Z_k = { M \over { 2N } } \left[ { Ue^{ i\beta_k } - V \over cos( \alpha ) - cos( \beta_k )} \right] \tag {25} $$ This is the simplest form of this equation from a computational perspective.
Qualitative Analysis
The behavior of the bin values can now be understood from the equation. First off, when the frequency is an integer value, both $ U $ and $ V $ are zero. Therefore all the bin values will be zero except for when $ k = f $ or $ (N-f) $ and the equation becomes indeterminate. The answer is then (19), or its conjugate, which is well known and well understood. The bin value in this case is simply half the amplitude rotated by the phase value.
When the frequency is close to an integer value, $ U $ and $ V $ will be small. So all the bin values will be small except the case of the closest $ k $ to $ f $ or $ (N-f) $. In this case the denominator will also be small and the quotient will be larger. As the $ k $ gets further from $ f $ or $ (N-f) $, the denominator gets larger and the bin values taper off. When $ f $ is near $ N/4 $ and $ N $ is fairly large, the $ cos( \beta_k ) $ values will be nearly linear, so the tapering approximates a hyperbola.
When the $ k $ value goes from below $ f $ to above $ f $, the sign on the denominator changes and the numerator stays roughly the same. This explains why the bin values on either side of $ f $ are nearly opposite each other. This also happens around $ (N-f) $.
When the frequency is further from being an integer value, $ U $ and $ V $ become relatively larger so all the bin values are also larger. Since the denominator doesn't get nearly as small compared to the near integer case, the peak value is much smaller compared to the neighboring bins. In the traditional interpretation, this is called "significant leakage", as if the peak lost some of its "energy" to its neighbors.
Both $ U $ and $ V $ are real valued. So is the denominator. Therefore the only source of the imaginary part of the bin is from the $ e^{i \beta_k } $ in the numerator. Because $ e^{i \beta_{N-k} } $ is the conjugate of $ e^{i \beta_k } $ this means that $ Z_{N-k} $ is the conjugate of $ Z_k $. This explains why the top half of the bin set is symmetric to the bottom half.
The value of bin zero, the DC bin, will be real because $ e^0 = 1 $, making the imaginary part zero. Similarly, if $ N $ is even, $ e^{ \beta_{N/2} } = -1 $ so the halfway bin, also known as the Nyquist bin, will also have an imaginary part equal to zero.
The amplitude is in the numerator of the outside factor so the bin values are directly proportional to it. This follows from the DFT being a linear operation.
Quantitative Example
Here is a test program to demonstrate that (25) is correct. It is designed purposefully to show the equations, not for optimal performance. Not even a code optimizer can fix this program for efficient performance. That doesn't matter. What matters is the results it shows and how easy it is to understand.
#include <math.h> #include <stdio.h> //============================================================================ double DFT_real( double beta_k, int N, double M, double phi, double alpha ) { double sum = 0.0; int n = 0; for( n = 0; n < N; n++ ) { double s_n = M * cos( alpha * (double) n + phi ); sum += cos( beta_k * (double) n ) * s_n; } return sum / (double) N; } //============================================================================ double DFT_imag( double beta_k, int N, double M, double phi, double alpha ) { double sum = 0.0; int n = 0; for( n = 0; n < N; n++ ) { double s_n = M * cos( alpha * (double) n + phi ); sum -= sin( beta_k * (double) n ) * s_n; } return sum / (double) N; } //============================================================================ void main( int argc, char *argv[] ) { double freq = 10.4; double phi = 0.6; double M = 1.0; int N = 32; double alpha = freq * 2.0 * M_PI / (double) N; double U = cos( alpha * N + phi ) - cos( phi ); double V = cos( alpha * N - alpha + phi ) - cos( -alpha + phi ); int k = 0; for( k = 0; k < N; k++ ) { double beta_k = (double) k * 2.0 * M_PI / (double) N; double factor = M / ( 2 * N * ( cos( alpha ) - cos( beta_k ) ) ); double Zn_real = factor * ( U * cos( beta_k ) - V ); double Zn_imag = factor * ( U * sin( beta_k ) ); printf( "%2d %14.11f %14.11f %14.11f %14.11f\n", k, Zn_real, Zn_imag, DFT_real( beta_k, N, M, phi, alpha ), DFT_imag( beta_k, N, M, phi, alpha ) ); } } //============================================================================This is the output:
0 0.02337925966 0.00000000000 0.02337925966 0.00000000000 1 0.02331048640 0.00387720744 0.02331048640 0.00387720744 2 0.02309555740 0.00791951808 0.02309555740 0.00791951808 3 0.02270598975 0.01232388713 0.02270598975 0.01232388713 4 0.02208384054 0.01736535806 0.02208384054 0.01736535806 5 0.02111857347 0.02348449793 0.02111857347 0.02348449793 6 0.01959028862 0.03148664998 0.01959028862 0.03148664998 7 0.01701104105 0.04308662834 0.01701104105 0.04308662834 8 0.01206769125 0.06280881267 0.01206769125 0.06280881267 9 -0.00032563186 0.10802118551 -0.00032563186 0.10802118551 10 -0.07619790924 0.36944527683 -0.07619790924 0.36944527683 11 0.10202082457 -0.23340312262 0.10202082457 -0.23340312262 12 0.05801386132 -0.07965852656 0.05801386132 -0.07965852656 13 0.04829514841 -0.04196752827 0.04829514841 -0.04196752827 14 0.04440504164 -0.02322264592 0.04440504164 -0.02322264592 15 0.04268851510 -0.01055994389 0.04268851510 -0.01055994389 16 0.04218971842 -0.00000000000 0.04218971842 0.00000000000 17 0.04268851510 0.01055994389 0.04268851510 0.01055994389 18 0.04440504164 0.02322264592 0.04440504164 0.02322264592 19 0.04829514841 0.04196752827 0.04829514841 0.04196752827 20 0.05801386132 0.07965852656 0.05801386132 0.07965852656 21 0.10202082457 0.23340312262 0.10202082457 0.23340312262 22 -0.07619790924 -0.36944527683 -0.07619790924 -0.36944527683 23 -0.00032563186 -0.10802118551 -0.00032563186 -0.10802118551 24 0.01206769125 -0.06280881267 0.01206769125 -0.06280881267 25 0.01701104105 -0.04308662834 0.01701104105 -0.04308662834 26 0.01959028862 -0.03148664998 0.01959028862 -0.03148664998 27 0.02111857347 -0.02348449793 0.02111857347 -0.02348449793 28 0.02208384054 -0.01736535806 0.02208384054 -0.01736535806 29 0.02270598975 -0.01232388713 0.02270598975 -0.01232388713 30 0.02309555740 -0.00791951808 0.02309555740 -0.00791951808 31 0.02331048640 -0.00387720744 0.02331048640 -0.00387720744The code can easily be copied and pasted into a local file for testing.
Signal and Roots of Unity Formulation
As a matter of being interesting, rather than any practicality, the bin value formula can be expressed exclusively in terms of some signal values and the Roots of Unity.
First is expressing $ cos( \alpha ) $ as a combination of signal values. Using the angle addition formula for cosines separates out the $ cos( \alpha ) $. $$ S_{-1} = cos( -\alpha + \phi ) = cos( \alpha )cos( \phi ) + sin( \alpha )sin( \phi ) \tag {26} $$ $$ S_{1} = cos( \alpha + \phi ) = cos( \alpha )cos( \phi ) - sin( \alpha )sin( \phi ) \tag {27} $$ Adding these two gets rid of the sine portions. Dividing the result by $ 2 cos( \phi ) $ does the rest. Unfortunately, this becomes problematic when this value is zero. $$ S_0 = cos( \phi ) \tag {28} $$ $$ cos( \alpha ) = { S_{-1} + S_{1} \over 2 S_0 } \tag {29} $$ Next is an expression for $ cos( \beta_k ) $. This comes exclusively from the Roots of Unity. $$ cos( \beta_k ) = { e^{ i\beta_k } + e^{ -i\beta_k } \over 2 } = { 1 / R_k + R_k \over 2 } \tag {30} $$ Last is recognizing that the two previous simplifying substitutions (23), (24) are actually signal values without the amplitude. $$ U = ( S_N - S_0 ) / M \tag {31} $$ $$ V = ( S_{N-1} - S_{-1} ) / M \tag {32} $$ Putting it all together, a new formulation of (25) is obtained: $$ Z_k = { 1 \over N } \left[ { ( S_N - S_0 ) / R_k - ( S_{N-1} - S_{-1} ) \over ( S_{-1} + S_{1} ) / S_0 - ( R_k + 1/R_k ) } \right] \tag {33} $$ Not only does this take more calculations, but it has the additional problem when $ S_0 = 0 $. However, from a theoretical perspective, knowing that the bin value formula for a pure real tone is a real valued non-linear transform of the Roots of Unity is significant.
Conclusion
The derivation of a formula for the DFT bin values of a pure real tone is somewhat complicated, but it doesn't end up being that bad. Being trigonometric, the formula can also take different forms. Although the formula is useful for a qualitative understanding of the behavior of the DFT on tones, it also has practical applications as well. These will be the subject of the next few blog articles.
[Edit 2015-06-01: Swap n and k in subscripts]
- Comments
- Write a Comment Select to add a comment
fr Rectangular window if data length is 1K then FFT Lenght Should be 1K FFT only because ENBW of rectangualr window is 1.0 but for Minimum 4-Sanple BlackmannHarris window is 2.0. so for improving accuracy in terms of either frequency or amplitude which method is preferrded in below 1. for 1K Data i will append 1K zeros and Caluclate 2K FFT or 2. instead of 1K data take 2K data and calclate 2K FFT
or else any other method please calrify ..
To post reply to a comment, click on the 'reply' button attached to each comment. To post a new comment (not a reply to a comment) check out the 'Write a Comment' tab at the top of the comments.
Please login (on the right) if you already have an account on this platform.
Otherwise, please use this form to register (free) an join one of the largest online community for Electrical/Embedded/DSP/FPGA/ML engineers: