How to time an algorithm on the c6713?

Started by m_st...@hotmail.com March 15, 2010
Hello all,

I’m using the c6713 DSK to perform a real-time FFT on some data sampled at 8 kHz. I’ve hooked up an oscilloscope to the line out of the DSP and everything is working well.

I’d like to be able to record the time taken to perform one FFT. I tried using the C time() function to record the time before and after a large number of FFTs. On a PC, time() returns the number of seconds elapsed since 1/1/1970 and can be used to measure the time difference between two points in some code. The results returned using this method (via printf) on the DSP seem to be fairly meaningless, and I think that the method is pretty flawed.

I was wondering what the conventional way to time algorithms is on the c6x? I’ve read about RTDX and understand that this is the process used to communicate between the host and DSP, but how can one record the time taken for an algorithm to run? Is this a case of simply looking at a graph if RTDX is used somehow?

If anybody could point me in the general direction, it would be much appreciated.

Cheers,

Mike

_____________________________________
There are two ways that I've done timing calculations on my code. One is
that I use one of the timer resources in the DSP and set up my own code to
calculate how much time has passed.

The more reliable approach is to toggle a pin that is visible from the
outside world, watch it with an oscilloscope, and measure the timing
completely outside the software I'm working with. The external timing device
is not subject to things I may not catch, such as interrupts being disabled,
other timing latencies, or the debugger halting the processor.

_____________________________________
You can read the number of cycles consumed between your run and breakpoint under CCS.

_____________________________________
Thanks for the help, Christophe and William. I'm going to call output_sample(something big) before and after the algorithm I'm interested in and then use the scope; it seems like the most reliable and simple method for me.

Thanks again.

_____________________________________
Mike-

> Thanks for the help, Christophe and William. I'm going to
> output_sample(something big) before and after the algorithm
> I'm interested and then use the scope, seems like the most
> reliable and simple method for me.

I assume you mean toggle a GPIO pin of some type. That's a reliable method and we use it frequently. Also we often
set an onchip timer to free-run and read the count register (if you do that, be sure to stay away from lo-res and
hi-res timers used by DSP/BIOS).
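
A minimal sketch of the free-running-counter approach, under stated assumptions: read_count_fn is a hypothetical hook that, on the target, would return the current value of the timer's count register, and the sim_* functions are host-side stand-ins so the sketch can run without the board. It reads the counter before and after the code of interest and subtracts the cost of the reads themselves.

```c
#include <stdint.h>

/* Hypothetical hook: on the DSP this would return the timer count register. */
typedef uint32_t (*read_count_fn)(void);
typedef void (*work_fn)(void);

/* Ticks consumed by two back-to-back reads, i.e. the measurement overhead. */
static uint32_t read_overhead(read_count_fn read_count)
{
    uint32_t t0 = read_count();
    uint32_t t1 = read_count();
    return t1 - t0;               /* unsigned subtraction survives one wrap */
}

/* Ticks spent inside work(), with the read overhead removed. */
static uint32_t time_region(read_count_fn read_count, work_fn work)
{
    uint32_t overhead = read_overhead(read_count);
    uint32_t t0 = read_count();
    work();
    uint32_t t1 = read_count();
    uint32_t raw = t1 - t0;
    return raw > overhead ? raw - overhead : 0;
}

/* Host-side stand-ins (assumptions for illustration only): every read
   costs 3 ticks and the simulated workload costs 1000 ticks. */
static uint32_t sim_count;
static uint32_t sim_read_count(void) { sim_count += 3; return sim_count; }
static void sim_work(void) { sim_count += 1000; }
```

On the board, sim_work would be replaced by a wrapper around the FFT calls and sim_read_count by a read of whichever timer DSP/BIOS is not using.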

-Jeff

_____________________________________
512 data points are being collected at a rate of 8 kHz before the FFT is called. The fact that it works in real time means that the FFT function works in under 64 ms. I'd like to make an accurate time measurement of just the FFT process, excluding the input/output process, so that I can compare the run-time of different FFT algorithms.
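
The 64 ms figure is just the frame length divided by the sample rate. A small sketch of that arithmetic follows; the 225 MHz CPU clock used in the usage note is an assumption about the DSK, not something stated in this thread.

```c
/* Time needed to collect one frame of samples, in milliseconds. */
static double frame_period_ms(unsigned n_samples, double sample_rate_hz)
{
    return 1000.0 * n_samples / sample_rate_hz;
}

/* CPU cycles that elapse while one frame is collected.
   cpu_hz is supplied by the caller; check the value for your board. */
static double frame_cycle_budget(unsigned n_samples, double sample_rate_hz,
                                 double cpu_hz)
{
    return cpu_hz * n_samples / sample_rate_hz;
}
```

frame_period_ms(512, 8000.0) gives 64 ms, and with an assumed 225 MHz clock frame_cycle_budget(512, 8000.0, 225e6) gives 14.4 million cycles per frame.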

Here's the code I'm using. I'm really interested in timing this part of the code:

cfftr2_dit(x, W, N);                 //TI floating-pt complex FFT
digitrev_index(iData, N, RADIX);     //produces index for bitrev() X
bitrev(x, iData, N);                 //freq scrambled->bit-reverse x
for (i = 0; i < N; i++)
{
    Xmag[i] = sqrt(x[i].re*x[i].re+x[i].im*x[i].im)/32; //magnitude of X
}

I could output something before and after this section of the code, and measure the time difference on a 'scope. Is it feasible to use a timer without delving into the world of BIOS, which I know nothing about? And is this code missing anything that I'd need in order to use the profiler?

I apologise if these questions relate to a very simple matter; I'm just trying to get a job done for my experiment - I'm not involved with DSPs at all.

//FFTr2.c FFT using TI's optimized FFT function and real-time input

#include "dsk6713_aic23.h"
Uint32 fs=DSK6713_AIC23_FREQ_8KHZ;      //set sampling rate
#include <math.h>                       //cos, sin, sqrt
#define N 512                           //number of FFT points
#define RADIX 2                         //radix or base
#define PI 3.14159265358979
#define DELTA ((2*PI)/N)                //argument for sine/cosine
short i = 0;
short iTwid[N/2];                       //index for twiddle constants W
short iData[N];                         //index for bitrev X
float Xmag[N];                          //magnitude spectrum of x
typedef struct Complex_tag {float re,im;}Complex;
Complex W[N/RADIX];                     //array for twiddle constants
Complex x[N];                           //N complex data values
#pragma DATA_ALIGN(W,sizeof(Complex))   //align W on boundary
#pragma DATA_ALIGN(x,sizeof(Complex))   //align input x on boundary

void main()
{
    for (i = 0; i < N/RADIX; i++)
    {
        W[i].re = cos(DELTA*i);         //real component of W
        W[i].im = sin(DELTA*i);         //neg imag component
    }                                   //see cfftr2_dit
    digitrev_index(iTwid, N/RADIX, RADIX); //produces index for bitrev() W
    bitrev(W, iTwid, N/RADIX);          //bit reverse W

    comm_poll();                        //init DSK,codec,McBSP
    for (i = 0; i < N; i++)
        Xmag[i] = 0;                    //init output magnitude
    while (1)                           //infinite loop
    {
        output_sample(32000);           //negative spike for reference
        for (i = 0; i < N; i++)
        {
            x[i].re = (float)((short)input_sample()); //external input
            x[i].im = 0.0;              //zero imaginary part
            if (i > 0) output_sample((short)Xmag[i]); //output magnitude
        }

        cfftr2_dit(x, W, N);            //TI floating-pt complex FFT
        digitrev_index(iData, N, RADIX); //produces index for bitrev() X
        bitrev(x, iData, N);            //freq scrambled->bit-reverse x
        for (i = 0; i < N; i++)
        {
            Xmag[i] = sqrt(x[i].re*x[i].re+x[i].im*x[i].im)/32; //magnitude of X
        }
    }
}

_____________________________________
M_stanhope,

How much data is being collected before an FFT is performed?

Since the data continues to be collected, you know that a single FFT takes less than the time needed to collect the data.

Another possibility is to start a timer, call the FFT function, and stop the timer upon return.
The C6713 timer counters count up, so subtracting the start count from the end count and multiplying by the timer's tick period gives the elapsed time.
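
A hedged sketch of that subtraction and scaling. It assumes a free-running 32-bit counter that counts up, which I believe is how the C6713 timers behave; for a down-counter, swap the operands. The clock figures are also assumptions (a 225 MHz CPU clock and a timer input of CPU clock / 4) and should be checked against your board and device documentation.

```c
#include <stdint.h>

#define ASSUMED_CPU_HZ     225000000.0  /* assumption: DSK CPU clock        */
#define ASSUMED_TIMER_DIV  4.0          /* assumption: timer = CPU clock/4  */

/* Ticks between two readings of an up-counting 32-bit counter.
   Unsigned arithmetic gives the right answer across a single wrap. */
static uint32_t elapsed_ticks(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert timer ticks to microseconds using the assumed clock settings. */
static double ticks_to_us(uint32_t ticks)
{
    double tick_period_s = ASSUMED_TIMER_DIV / ASSUMED_CPU_HZ;
    return ticks * tick_period_s * 1e6;
}
```

With these assumed clocks, one tick is about 17.8 ns, so 56250 ticks correspond to 1000 microseconds.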

Another possibility is to use the profiler to measure the time from the call to the FFT function to the next instruction in the code after the call.

R. Williams


_____________________________________
Mike,

The three suggested methods [timer ticks, GPIO pin, CCS cycles] are all good
to use. Some situations and your style of debug will tend to favor one over
the other but, as you found out, NEVER, NEVER use host-serviced C library
calls [printf, time, etc.] for timing on an embedded system unless it is
running an OS that services them itself. On the DSK, time and printf rely on
the host PC for completion, which kills performance.

mikedunn

--
www.dsprelated.com/blogs-1/nf/Mike_Dunn.php

_____________________________________
M_Stanhope,

If I were doing the measuring, I would directly manipulate a GPIO pin.

Say, set it high at the beginning of the indicated code sequence and set it low at the end.

This may require a bit of code in your initialization to set up the GPIO pin for output.
The actual output would then be a simple write to one of two registers:
one register write has a '1' in the appropriate bit position to set the GPIO pin;
another register write has a '1' in the appropriate bit position to clear the GPIO pin.

Using the BIOS may require a bit of modification to the BIOS config file to initialize the desired GPIO pin for output, followed by appropriate calls into the CSL to change the voltage level of the desired GPIO pin.
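
A rough sketch of the bit manipulation involved. Note the swap: rather than separate set and clear registers, it does a read-modify-write on a single value register, because I am not certain which register layout the C6713 GPIO block provides. The register is passed in as a pointer so that no address is invented here; pin_high and pin_low are hypothetical helper names, and the real register address and pin number must come from the device documentation.

```c
#include <stdint.h>

/* Drive one pin high by setting its bit in a GPIO value register. */
static void pin_high(volatile uint32_t *value_reg, unsigned pin)
{
    *value_reg |= (uint32_t)1 << pin;
}

/* Drive one pin low by clearing its bit; other pins are left untouched. */
static void pin_low(volatile uint32_t *value_reg, unsigned pin)
{
    *value_reg &= ~((uint32_t)1 << pin);
}
```

In use, pin_high would be called just before cfftr2_dit() and pin_low just after the magnitude loop, so the pulse width on the scope is the run time of that section plus the small cost of the two writes.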

R. Williams

_____________________________________
Andrew,

Thanks for the advice, I'll sort out the magnitude loop - I'll do a few iterations of the Newton-Raphson method to find the square root. I'll get rid of the /32, I have no idea how it ended up in there.

I've never used CCS debug clocks or a free-running timer (this is the first time I've used a DSP) so I'll go research them.

Cheers,

Mike

> Date: Wed, 17 Mar 2010 01:40:36 -0800
> From: a...@techemail.com
> To: c...
> CC: m...@hotmail.com
> Subject: Re: How to time an algorithm on the c6713?
> Hi Mike,
>
> A few small remarks: you don't have to call digitrev_index() function
> inside the data processing loop while(1), the indices wouldn't change,
> so it is sufficient to precalculate them once outside the loop.
>
> Place data (if possible) into internal RAM. If that's not an option,
> use cache. Place code into internal RAM.
>
> What is really bad, and a performance killer, is the magnitude loop.
> The call to sqrt and the division by 32 (by the way, why 32? sqrt(512)
> is 16*sqrt(2)?) alone would take half of the whole FFT cycles.
>
> I would replace the division by 32 with multiplication by 0.03125 and
> program Newton-Raphson iterations instead of calling sqrt().
>
> Of the various timing methods I prefer either CCS debug clocks
> or free-running timer. I don't like scope-based measurements just
> because I assume a CPU is self-sufficient to measure cycles :)
>
> Cheers,
>
> Andrew
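
One way the Newton-Raphson suggestion could look, as a sketch rather than a tuned implementation. It iterates on the reciprocal square root so that no division is needed, then multiplies by the input. The starting guess uses a well-known integer bit trick and assumes IEEE-754 single-precision floats; on the C67x a hardware estimate could seed the iteration instead. nr_sqrt and magnitude_scaled are hypothetical names, and two iterations is an arbitrary choice that trades accuracy for speed.

```c
#include <stdint.h>
#include <string.h>

/* Approximate sqrt(a) using Newton-Raphson on 1/sqrt(a).
   Assumes IEEE-754 single precision for the initial guess. */
static float nr_sqrt(float a)
{
    uint32_t bits;
    float r;

    if (a <= 0.0f)
        return 0.0f;                       /* guard: avoids 0 * inf */

    memcpy(&bits, &a, sizeof bits);        /* reinterpret float as integer */
    bits = 0x5f3759dfu - (bits >> 1);      /* rough guess at 1/sqrt(a) */
    memcpy(&r, &bits, sizeof r);

    r = r * (1.5f - 0.5f * a * r * r);     /* Newton-Raphson step 1 */
    r = r * (1.5f - 0.5f * a * r * r);     /* Newton-Raphson step 2 */

    return a * r;                          /* a * (1/sqrt(a)) = sqrt(a) */
}

/* Magnitude of one FFT bin, scaled by 1/32 with a multiply, not a divide. */
static float magnitude_scaled(float re, float im)
{
    return nr_sqrt(re * re + im * im) * 0.03125f;
}
```

Whether this beats the library sqrt() on a given build is something to measure with one of the timing methods discussed in this thread.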

_____________________________________