# Approximating the area of a chirp by fitting a polynomial

Once in a while we need to estimate the area of a dataset in which we are interested. This area could give us, for example, force (mass vs acceleration) or electric power (electric current vs charge).

One way to do that is fitting a curve on our data, and let's face it: this is not that easy. In this post we will work on this issue using Python and its packages. If you do not have Python installed on your system, check here how to proceed.

Specifically, we will need:

- Numpy: the numerical basis for Python scientific software.
- Scipy: tons of scientific functions for different applications.
- Matplotlib: a nice plot platform.

We can import these packages on Python using the code below.

import numpy as np import scipy as sp import matplotlib.pyplot as plt

We used the form import **package** as **nickname** to avoid typing the entire package name. Note that we got only pyplot from matplotlib, which provides a MATLAB-like plotting framework. Then, we use np, sp and plt instead of numpy, scipy and matplotlib.pyplot respectively.

On our example, let's calculate the area of a chirp. Scipy's function *signal.chirp* is an easy function to generate a chirp varying in time. This is its general form:

sp.signal.chirp(t, f0, t1, f1, method='linear', phi=0, vertex_zero=True)

Where:

*t*is an array containing the time points of interest.*f0*is the frequency at time equals to zero.*t1*is the time at which*f1*is specified.*f1*is the frequency of the waveform at*t1*.*method*is the kind of frequency sweep. We can use 'linear', 'quadratic', 'logarithmic', and 'hyperbolic'. When we do not provide a value, the default value is 'linear'.*phi*is the phase offset in degrees. The default value is zero.*vertex_zero*is used only when*method='quadratic**'*, and determines whether the vertex of the generated parabola is at t=0 or t=t1.

Check more information about *signal.chirp* on the scipy online documentation. You can also type sp.signal.chirp? on your preferred Python prompt.

Let's generate a chirp. First, we generate a time interval using the numpy's function *linspace*. Its general form follows:

np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)

Where:

*start*is the starting value of the sequence.*stop*depends on endpoint. It is the end value of the sequence.*num*is an natural number, meaning the number of samples to generate. The default number is 50.*endpoint*decides if stop is within the sequence (*endpoint=True*) or not. The default value is True.*retstep*, when true, returns the number of samples and the space between them. The default value is False.*dtype*specifies the data type of the output array. If it is not given, linspace will infer the data type from the other input arguments.

More information about linspace is given at the scipy online documentation. You can also type np.linspace? on your Python prompt.

On our example, the time interval begins on 0 and ends on 10. It has 100 samples. The following code generates this interval and stores it on the *time_interval* variable.

time_interval = np.linspace(0, 10, num=100)

Now, using time_interval, let's generate our chirp. Its values of *f0*, *t1* and *f1* are 0, 1 and 1, respectively. We will generate a quadratic chirp, using *method='quadratic'*. This code generates this chirp and stores its points on *example_chirp*.

example_chirp = sp.signal.chirp(time_interval, 0, 1, 1, method='quadratic')

Let's visualize this chirp. In order to do that we use matplotlib.pyplot's *stem* function, which plots a discrete sequence of data using vertical lines and symbols. Its general form follows:

plt.stem(y, linefmt='b-', markerfmt='bo', basefmt='r-')<span></span>

Where:

*y*are the data points.*linefmt*modifies the color and appearance of the vertical lines. The default value is*'b-'*, which plots lines blue-colored (*'b'*) and continuous (*'-'*).*markerfmt*modifies the marker positioned at the value*(x, y)*. The default value is*'bo'*, which plots blue (*'b'*) circles (*'o'*).*basefmt*modifies the horizontal line plotted at 0. The default value is 'r-', which plots a line red (*'r'*) and continuous (*'-'*).

Check also for *stem* at the matplotlib documentation, or type plt.stem? on your Python prompt.

Let's use plot the chirp using stem's default values. We use the following code:

plt.stem(time_interval, example_chirp)

The figure representing our discrete chirp is given on Figure 1.
**Figure 1.** Our example chirp, generated from *sp.signal.chirp* and plotted using *plt.stem*.

Now we will fit a curve on this chirp. On this example we use Chebyshev's polynomials, that can be fitted on a dataset using numpy's *polyomial.Chebyshev.fit* function. There are many alternatives, such as Legendre, Laguerre and Hermite. Check these possibilities at the scipy online documentation.

The simplest form of polynomial.Chebyshev.fit follows:

np.polynomial.Chebyshev.fit(x, y, deg)

Where:

*x*are the x-coordinates of the sample points.*y*are the y-coordinates of the sample points.*deg*is the degree of the polynomial to be fitted.

For more information and arguments to this function, please check the scipy documentation. As an alternative, type np.polynomial.Chebyshev.fit? on your Python prompt.

To show the behavior of this function, we will use three different arguments for *deg*: 20, 45 and 115. The following code will generate these different polynomials based on our data and store them on the variables *cheby_chirp20*, *cheby_chirp45* and *cheby_chirp115*.

cheby_chirp20 = np.polynomial.Chebyshev.fit(time_interval, example_chirp, deg=20) cheby_chirp45 = np.polynomial.Chebyshev.fit(time_interval, example_chirp, deg=45) cheby_chirp115 = np.polynomial.Chebyshev.fit(time_interval, example_chirp, deg=115)

Let's plot these results. Note that the results of polynomial.Chebyshev.fit are functions, not points; we need to provide the plot interval as an argument. We plot these three polynomials against the original data as subplots. They will be represented as green lines ('g') with width equals to 2 (lw=2). We also add a label to each of them, which will appear on the legend of each plot. Check the matplotlib examples for more ideas on subplots.

f, ax_array = plt.subplots(3, sharex=True) ax_array[0].stem(time_interval, example_chirp) ax_array[0].plot(time_interval, cheby_chirp20(time_interval), 'g', lw=2, label='cheby deg 20') ax_array[0].legend() ax_array[1].stem(time_interval, example_chirp) ax_array[1].plot(time_interval, cheby_chirp45(time_interval), 'g', lw=2, label='cheby deg 45') ax_array[1].legend() ax_array[2].stem(time_interval, example_chirp) ax_array[2].plot(time_interval, cheby_chirp115(time_interval), 'g', lw=2, label='cheby deg 115') ax_array[2].legend()

The three generated polynomials can be seen on Figure 2.

**Figure 2.**: Three polynomials representing our data, obtained using *np.polynomial.Chebyshev.fit*.

Now, let's use scipy's *integrate.quad* function to calculate the area of these polynomials. This function uses a technique from QUADPACK, a Fortran library. The simplest form of integrate.quad is given below:

scipy.integrate.quad(func, a, b)

Where:

*func*is the function to be integrated.*a*is the lower limit of integration.*b*is the upper limit of integration.

This function returns the integral value and the absolute error in the result. For more information on integrate.quad, its arguments and usage examples, check the scipy documentation. You can also type sp.integrate.quad? on your Python prompt.

Let's define the integration interval between zero and pi. Then, we integrate the three polynomials. Check the code below:

integ_int = [0, np.pi] cheby20_area, cheby20_err = sp.integrate.quad(cheby_chirp20, *integ_int) cheby45_area, cheby45_err = sp.integrate.quad(cheby_chirp45, *integ_int) cheby115_area, cheby115_err = sp.integrate.quad(cheby_chirp115, *integ_int)

We define *integ_int* and pass it to integrate.quad using the star (*****), which provides the arguments all at once to the function. For example, on the *cheby_chirp20* case, this is an alternative to:

cheby20_area, cheby25_err = sp.integrate.quad(cheby_chirp20, 0, np.pi)

We need to present area and error now. This can be done using print, which provides a nice output:

print('The area of cheby20 in this interval is {0}. The absolute error is {1}'.format(cheby20_area, cheby20_err)) print('The area of cheby45 in this interval is {0}. The absolute error is {1}'.format(cheby45_area, cheby45_err)) print('The area of cheby115 in this interval is {0}. The absolute error is {1}'.format(cheby115_area, cheby115_err))

The function print presents a message. Using format we can also present the variable values on *{0}*, *{1}*, *{2}*, and so on. More on the print function can be seen on the Python documentation. As always, you can type print? on your Python prompt.

Let's check the results:

The area of cheby20 in this interval is 0.4637693754972824. The absolute error is 1.5334233932461724e-13 The area of cheby45 in this interval is -8.792912670367262. The absolute error is 2.9585315540800773e-12 The area of cheby115 in this interval is 151230239254.8722. The absolute error is 50.06303513050079

The two first polynomials wielded acceptable results, but there is something wrong about *cheby_chirp115*. We can confirm that checking its absolute error (around 50). Let's see what is going on, plotting the integration areas of these polynomials.

We define the integration interval with 50 points using linspace to plot the results. This interval will be attributed to *integ_full*. Again, we will use plt.subplots:

integ_full = np.linspace(*integ_int) f, ax_array = plt.subplots(3, sharex=True) ax_array[0].stem(time_interval, example_chirp) ax_array[0].plot(time_interval, cheby_chirp20(time_interval), 'g', lw=2, label='cheby deg 20') ax_array[0].fill_between(integ_full, cheby_chirp20(integ_full), color='orchid', label=r'$\int_{0}^{\pi}\,$cheby_chirp20(x)') ax_array[0].legend() ax_array[1].stem(time_interval, example_chirp) ax_array[1].plot(time_interval, cheby_chirp45(time_interval), 'g', lw=2, label='cheby deg 45') ax_array[1].fill_between(integ_full, cheby_chirp45(integ_full), color='orchid', label=r'$\int_{0}^{\pi}\,$cheby_chirp45(x)') ax_array[1].legend() ax_array[2].stem(time_interval, example_chirp) ax_array[2].plot(time_interval, cheby_chirp115(time_interval), 'g', lw=2, label='cheby deg 115') ax_array[2].fill_between(integ_full, cheby_chirp115(integ_full), color='orchid', label=r'$\int_{0}^{\pi}\,$cheby_chirp115(x)') ax_array[2].legend()

We employed *plt.fill_between* to plot the integration areas. For comparison, we plotted also the original points and the polynomials. Note that we added LaTeX code to the label; it is necessary to add an *r* before the string. The areas of integration are shown on Figure 3.

**Figure 3.** The integration areas highlighted on the fitted polynomials.

Check how the integration interval of *cheby_chirp115* is messy! *cheby_chirp45* is not good too. *cheby_chirp20* is satisfactory from 0 to 1 approximately, but we need more than that on the rest of our data.

**Lesson learned:** be careful choosing the degree of the fitted polynomial! Lower degrees may not represent your data nicely; however, higher degrees may lead to a poor conditioned fit. The solution could be to search for a good balance.

All right, this is it! Hope you enjoyed this post. Are there anything wrong? Do you have any ideas to improve it? Please share them with us! Also, you can download the codes on GitHub.

Thank you for joining us! See you next time!

**Additional notes:**

**jms_nh**proposed another tools for fitting the chirp:*"The waveforms you've shown really aren't very polynomial-like and would be better handled by curve-fitting using cubic splines or Savitsky-Golay filtering"*. He has a point: according to the SciPy documentation,*"Fits using Chebyshev series are usually better conditioned than fits using power series, but much can depend on the distribution of the sample points and the smoothness of the data. If the quality of the fit is inadequate splines may be a good alternative"*.

## Comments:

- Comments
- Write a Comment Select to add a comment

Also, if you do have a Chebyshev polynomial P(x), the way to integrate it is not numerically using something like scipy.integrate.quad, but rather algebraically by manipulating the components to get the Chebyshev coefficients of the indefinite integral of P(x), and evaluating the integral directly.

1. sure thing. Actually, the SciPy documentation says that too:

"Fits using Chebyshev series are usually better conditioned than fits using power series, but much can depend on the distribution of the sample points and the smoothness of the data. If the quality of the fit is inadequate splines may be a good alternative."

I will write this consideration as an Additional Note to the post, ok?

Again, keep in mind that this post is only a thought on the balance of a fit and its consequences.

2. sure thing². I could also use the Simpson's rule, composite trapezoidal, ... but this is the subject of a next post ;)"

Thanks for your thoughts! Have a nice one!

Keep in mind that I used the discrete chirp points as an example, you can use any discrete data points you want. The purpose here is to show how to act on a discrete sample; sorry if this is not clear on the post. I'll think about it and try to improve the motivation, ok? Thanks!

Linus

I really appreciate it! Feel free to ask a question or point a doubt if you'd like.

Thanks again!

thanks!

I don't know how/if one could do it. I know there exists py_compile [1] and compileall [2], however. I just don't know how to use them.

Check it out if you'd like:

[1] https://docs.python.org/3.5/library/py_compile.html#module-py_compile

[2] https://docs.python.org/3.5/library/compileall.html#module-compileall

Thanks! Have a nice one!

To post reply to a comment, click on the 'reply' button attached to each comment. To post a new comment (not a reply to a comment) check out the 'Write a Comment' tab at the top of the comments.

Registering will allow you to participate to the forums on ALL the related sites and give you access to all pdf downloads.