DSPRelated.com
Forums

ARM CMSIS: arm_cfft_q15() roundoff errors

Started by Unknown August 18, 2015
Hi,
Sanity check here. Using ARM arm_cfft_q15() with 512 length. Seeing noticeable roundoff errors. For a simulated input pure sine wave (first harmonic) the magnitudes of the first nine FFT harmonics look as follows:

Input H1 10 bits:
11.3    5.7    0.0    5.7    5.7    5.7    2.0    4.0    5.7

Input H1 1000 bits:
992.0    5.7    2.0    5.7    7.2    5.7    5.7    5.7    5.7

Input H1 10000 bits:
9992.0    5.7    5.7    5.7    5.7    5.7    5.7    5.7    4.5

Computed as:
  arm_cfft_q15(cfft, Data0, 0, 1);
  fi = Data0[2*i];
  fq = Data0[2*i+1];
  mag = sqrtf(fi*fi + fq*fq);

Does this look OK, or are the errors too high? I recall getting lower roundoff errors with a custom FFT256 routine in the past. I would appreciate it if somebody could provide any insight.
Regards,
Sergei
<sergei.sharonov@gmail.com> wrote:

> Sanity check here. Using ARM arm_cfft_q15() with 512 length. Seeing
> noticeable roundoff errors. For a simulated input pure sine wave (first
> harmonic) the magnitudes of the first nine FFT harmonics look as
> follows:
(snip)
Can you clarify what is meant by "10 bits", "1000 bits", and "10000 bits" in the above? I have never seen a 10000-bit number.

Thanks

Steve
On Tuesday, August 18, 2015 at 5:44:40 PM UTC-5, Steve Pope wrote:

> ....I have never seen a 10000-bit number.
Me neither ;-)

Sorry for the confusion. 10000 bits meant that in this case the sine wave array was computed as:

  ph = n * M_TWOPI * i / FFTLEN;
  Data0[2*i] = roundf(1000 * cosf(ph));
  Data0[2*i+1] = roundf(1000 * sinf(ph));

Regards,
Sergei
On Tue, 18 Aug 2015 16:29:17 -0700, sergei.sharonov wrote:

> On Tuesday, August 18, 2015 at 5:44:40 PM UTC-5, Steve Pope wrote:
>> ....I have never seen a 10000-bit number.
> Me neither ;-)
> Sorry for the confusion. 10000 bits meant that in this case the sine
> wave array was computed as:
>   ph = n * M_TWOPI * i / FFTLEN;
>   Data0[2*i] = roundf(1000 * cosf(ph));
>   Data0[2*i+1] = roundf(1000 * sinf(ph));
Somehow I think you mean 1000 bits? And for "10000 bits" you'd say roundf(10000 * cosf(ph)); ?

I'm thinking that perhaps you're not a native English speaker and you're translating something like "10000 pieces" into "10000 bits", then having trouble because in English-computerese we use "bits" to mean that little thing that can be 0 or 1.

In English you'd probably say "round to 1 decimal place" (your "10 bits"), "round to 3 decimal places" (your "1000 bits"), etc.

I don't mean to criticize here: rather I'm just providing a bit of native guidance to our linguistic swamp.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
<sergei.sharonov@gmail.com> wrote:

>On Tuesday, August 18, 2015 at 5:44:40 PM UTC-5, Steve Pope wrote:
>> ....I have never seen a 10000-bit number.
>Me neither ;-)
>Sorry for the confusion. 10000 bits meant that in this case the sine
>wave array was computed as:
>   ph = n * M_TWOPI * i / FFTLEN;
>   Data0[2*i] = roundf(1000 * cosf(ph));
>   Data0[2*i+1] = roundf(1000 * sinf(ph));
Thanks. So, first I would try removing the scale factor, that is use

  Data[2*i] = roundf( F * cosf(ph) ) / F;

where F corresponds to your values of 10.0, 1000.0, and 10000.0.

Secondly, I'd make these powers of two instead, something like 256.0, 4096.0, 16384.0, so that the quantizing corresponds to 8-bit, 12-bit, and 16-bit data respectively.

Thirdly, I would not use roundf() as it is single-precision. I usually use rint(). But maybe here you specifically want single precision.

I'm guessing that you will not see much difference between 12-bit and 16-bit quantizing.

Hope this helps

Steve
On Tuesday, August 18, 2015 at 6:44:30 PM UTC-5, Steve Pope wrote:
> ...Thanks. So, first I would try removing the scale factor, that
> is use
>
>   Data[2*i] = roundf( F * cosf(ph) ) / F;
>
> Where F corresponds to your values of 10.0, 1000.0, and 10000.0 .
Not sure where you are going with that. Data[] is a Q15 array, as required by the arm_cfft_q15() API. Dividing by F will result in Data taking only the (integer) values -1, 0, and 1, with some automatic truncation effects. In any case, I checked the simulated signal quality with an external tool (cli-fft) and it is clean. The scaling factor (F) is there to see if we get the same value on the output as we simulated on the input. I am just trying to make sure that I use the CMSIS library correctly and to understand what sort of errors it adds to the ideal signal.
> Secondly, I'd make these powers of two instead, something like
>
>   256.0, 4096.0, 16384.0
>
> so that the quantizing corresponds to 8-bit, 12-bit, and 16-bit data
> respectively.
OK, interesting thought, but I will not get these nice 2^N numbers later when this is fed from the ADC. The point of the exercise is to estimate the errors due to the ARM DSP library. And it seems that no matter what the input is, I get 2-7 counts of noise floor from the FFT. That was the original question: is that normal for this lib, or am I doing something wrong with it?
> Thirdly I would not use roundf() as it is single-precision.
> I usually use rint(). But maybe here you specifically want
> single precision.
Yes, my MCU has only a single-precision FPU, so I figured I would save a few cycles. In any case, we are looking at such large errors that float or double should not make any difference.
> I'm guessing that you will not see much difference between
> 12-bit and 16-bit quantizing.
>
> Hope this helps
Thanks, I appreciate you taking a look at it.

Regards,
Sergei
<sergei.sharonov@gmail.com> wrote:

>On Tuesday, August 18, 2015 at 6:44:30 PM UTC-5, Steve Pope wrote:
>> ...Thanks. So, first I would try removing the scale factor, that
>> is use
>>   Data[2*i] = roundf( F * cosf(ph) ) / F;
>> Where F corresponds to your values of 10.0, 1000.0, and 10000.0 .
>Not sure where you are going with that. Data[] is Q15 array as required
>by arm_cfft_q15() API. Dividing by F will result in Data taking only
>(integer) values of -1, 0 and 1 with some automatic truncation effects.
I do not much use the antiquated Q-notation (others here use it much more than me), but a Q15 number has 15 fractional bits and 1 integer bit. Such a number is in the range [-1,1).

cosf() has values within the range [-1,1]. 1000.0 * cosf(), as well as roundf(1000.0 * cosf()), have values within the range [-1000.0, 1000.0].

You cannot (unless there's something I'm missing) assign the above to a Q15 number without dividing it by 1000 first. It does not fit in the range.
>In any case, I checked simulated signal quality with external tool
>(cli-fft) and it is clean.
I'm not sure exactly how you checked the signal quality, but the way I would check the signal quality of your ARM fixed-point FFT is as follows:

1) Create a floating-point input in [-1,1] such as x = cosf(whatever).

2) Quantize it to Q15 as follows: y = roundf(32768.0 * x) / 32768.0

3) Using y as input, compute the FFT result using both arm_cfft_q15() and cli-fft. Call these outputs z and zref.

4) Assuming these are scaled in the same fashion (or if not, after correcting for this) compute the following:

  rms(z - zref) / rms(zref)

where "rms" is the root-mean-square taken over the length-512 arrays.

Actually, at this point you can take 20 * log10 of the above result. This gives you the RMS error in dBc. Whether it's good enough depends on your application, but you'd like it to be something not much more than -6 * 15 dBc or maybe -6 * 14 dBc; that is, you have not lost more than a bit of precision within the library function.
>That scaling factor (F) is there to see if we get the same value on the
>output as we simulated on the input. Just trying to make sure that I use
>the CMSIS library correctly and trying to understand what sort of errors
>it adds to the ideal signal.
>> Secondly, I'd make these powers of two instead, something like
>>
>>   256.0, 4096.0, 16384.0
>>
>> so that the quantizing corresponds to 8-bit, 12-bit, and 16-bit data
>> respectively.
>
>OK, interesting thought but I will not get these nice 2^N numbers later
>when this is fed from ADC. The point of exercise is to estimate the
>errors due to ARM DSP library. And it seems that no matter what the
>input is I get 2-7 counts of noise floor from FFT. That was the original
>question. Is that normal for this lib or am I doing something wrong with
>it?
It may be expected for this ARM library, but if so I'd call it an underdesign (that is, an over-compromised design) within the library. Library (and coprocessor) designers should strive to obtain less than 1 LSB of roundoff due to internal precisions.
>> Thirdly I would not use roundf() as it is single-precision.
>> I usually use rint(). But maybe here you specifically want
>> single precision.
>Yes, my MCU has only single precision FPU so I figured I would save a
>few cycles. In any case, we are looking at such large errors that float
>or double should not make any difference.
Right.

Steve
Steve Pope <spope33@speedymail.org> wrote:
> <sergei.sharonov@gmail.com> wrote:
(snip)
>>Not sure where you are going with that. Data[] is Q15 array as required
>>by arm_cfft_q15() API. Dividing by F will result in Data taking only
>>(integer) values of -1, 0 and 1 with some automatic truncation effects.
> I do not much use the antiquated Q-notation (others here use it
> much more than me), but a Q15 number has 15 fractional bits and
> 1 integer bit. Such a number is in the range [-1,1).
I never liked the Q notation, either.

I was doing PL/I programming 40 years ago, where PL/I has attributes like FIXED BINARY(15,15). The first number is the total number of bits, not including the sign bit; the second is the number after the radix point. (PL/I also has FIXED DECIMAL.)

For extra fun, PL/I can also do COMPLEX FIXED BIN(15,15), and even use that for trigonometric functions (though in most cases after conversion to floating point).

Does Q notation allow for complex numbers?

-- glen
glen herrmannsfeldt  <gah@ugcs.caltech.edu> wrote:

>Steve Pope <spope33@speedymail.org> wrote:
>> I do not much use the antiquated Q-notation (others here use it
>> much more than me), but a Q15 number has 15 fractional bits and
>> 1 integer bit. Such a number is in the range [-1,1).
>I never liked the Q notation, either.
I like it even less now that I have looked at its Wikipedia page. Probably, my previous post was incorrect. According to the Wikipedia editors:

Float to Q

To convert a number from floating point to Qm.n format:
  Multiply the floating point number by 2^n
  Round to the nearest integer

Q to float

To convert a number from Qm.n format to floating point:
  Convert the number to floating point as if it were an integer
  Multiply by 2^(-n)

That is, you cannot simply assign between Q's and floats, or Q's and Q's, and get the right answer. You have to re-scale each time you do this.

I wish people would just start using IEEE 1666.

Steve
Steve Pope <spope33@speedymail.org> wrote:

(snip, I wrote)
>>I never liked the Q notation, either.
> I like it even less now that I have looked at its Wikipedia
> page.

> Probably, my previous post was incorrect.
> According to the Wikipedia editors:
> Float to Q
> To convert a number from floating point to Qm.n format:
>   Multiply the floating point number by 2^n
>   Round to the nearest integer
Computer tradition is to truncate. If you want rounding, correct the value before converting.
> Q to float
> To convert a number from Qm.n format to floating point:
>   Convert the number to floating point as if it were an integer
>   Multiply by 2^(-n)
> That is, you cannot simply assign between Q's and floats, or
> Q's and Q's, and get the right answer. You have to re-scale
> each time you do this.
But we learned how to do that in 2nd or 3rd grade. The process is the same in decimal and binary, except with decimal digits in one and binary bits in the other.
> I wish people would just start using IEEE 1666.
Hmm. I don't know that one.

-- glen