Forums

CMSIS CFFT bit reversal table -how to generate?

Started by mjbcswitzerland 2 years ago6 replieslatest reply 7 months ago427 views

Hi All


The ARM CMSIS  FIXED POINT FFTs uses a bit reversal table at the end of the FFT calculation to transpose the output bins to their linear ordering. For example:

const uint16_t armBitRevIndexTable_fixed_16[ARMBITREVINDEXTABLE_FIXED___16_TABLE_LENGTH] =
{
   //radix 4, size 12
   8,64, 16,32, 24,96, 40,80, 56,112, 88,104
};

Using a standard bit reversal algorithm I can generate matching tables for 16, 32, .., 4096 lengths.


However, the ARM CMSIS floating point bit reverse tables are different. Eg. this is the 16 point one.

const uint16_t armBitRevIndexTable16[ARMBITREVINDEXTABLE__16_TABLE_LENGTH] =
{
   //8x2, size 20
   8,64, 24,72, 16,64, 40,80, 32,64, 56,88, 48,72, 88,104, 72,96, 104,112
};

As you can see it has more entries.



In fact the corresponding tables for 4096 points have the same length (both 4032) but different content as shown by just the start of the (large) tables:


const uint16_t armBitRevIndexTable_fixed_4096[ARMBITREVINDEXTABLE_FIXED_4096_TABLE_LENGTH] =
{
    //radix 4, size 4032
    8,16384, 16,8192, 24,24576, 32,4096, 40,20480, 48,12288, 56,28672, 64,2048,
    72,18432, 80,10240, 88,26624, 96,6144, 104,22528, 112,14336, 120,30720,
    128,1024, 136,17408, 144,9216, 152,25600, 160,5120, 168,21504, 176,13312,
    184,29696, 192,3072, 200,19456, 208,11264, 216,27648, 224,7168, 232,23552,
...

const uint16_t armBitRevIndexTable4096[ARMBITREVINDEXTABLE4096_TABLE_LENGTH] =
{
   //radix 8, size 4032
   8,4096, 16,8192, 24,12288, 32,16384, 40,20480, 48,24576, 56,28672, 64,512,
   72,4608, 80,8704, 88,12800, 96,16896, 104,20992, 112,25088, 120,29184,
   128,1024, 136,5120, 144,9216, 152,13312, 160,17408, 168,21504, 176,25600,
   184,29696, 192,1536, 200,5632, 208,9728, 216,13824, 224,17920, 232,22016,
...


I suppose the clue is the radix 4 against radix 8 but I haven't been able to find out any specific details.

Does anyone know how the radix 8 table is generated?

Thanks

Mark

P.S. My intention is to generate a 8192 point set, which is not available anywhere at the moment...
[ - ]
Reply by Y(J)SSeptember 16, 2018

I don't know what CMSIS is, but for a radix-4 FFT you need to reverse pairs of bits instead of bits. In other words, if for radix 2 abcd changed to dcba, for radix-4 abcd changes to cdab. 

I am sure you will be able to infer how to generalize this to radix-2^k.

Y(J)S

[ - ]
Reply by mjbcswitzerlandSeptember 17, 2018

Hi

CMSIS is ARM's "Cortex Microcontroller Software Interface Standard"

https://developer.arm.com/embedded/cmsis

I tried with different bit-reversal codes:

static int reverse_bits(size_t x, int n)
{
    int result = 0;
    int i;
    for (i = 0; i < n; i++, x >>= 1) {
        result = (result << 1) | (x & 1);
    }
    return result;
}

static int reverse_bits_2(size_t x, int n)
{
    int result = 0;
    int i;
    for (i = 0; i < n/2; i++, x >>= 2) {
        result = (result << 2) | (x & 3);
    }
    return result;
}

static int reverse_bits_3(size_t x, int n)
{
    int result = 0;
    int i;
    for (i = 0; i < n / 3; i++, x >>= 3) {
        result = (result << 3) | (x & 7);
    }
    return result;
}

and indeed using reverse_bits_3() instead of reverse_bits() I then managed to generate a matching set for the 4096 case!

Unfortunately I couldn't find any combination to generate the 2048 point table (8x2), but could generate the 512 point table (also marked as radix-8).

I see the following:

64, 512 and 4096 point FFTS are performed with a routine called arm_radix8_butterfly_f32() and their tables marked as radix 8) - these I can now generate.

32, 256 and 2048 point FFTs use arm_cfft_radix8by4_f32() and their tables are marked with (8x2). I haven't been able to generate these yet.

16, 128 and 1024 point FFTs use arm_cfft_radix8by2_f32() and their tables are marked with (8x2) and I haven't been able to generate these yet.

8192 point presumably uses arm_cfft_radix8by2_f32() (based on the way that the other lengths cycle through the use of the three types) but the question is now what 8x2 means with respect to the bit reversal required?

As I quick reference - if I try to generate the 1024 point table with the radix 8 method (which works for 4096) I get these as the first lines.

8,512, 16,1024, 24,1536, 32,2048, 40,2560, 48,3072, 56,3584, 72,576,
80,1088, 88,1600, 96,2112, 104,2624, 112,3136, 120,3648, 136,640, 144,1152,
...

but the reference in this case is 

8,4096, 16,512, 24,4608, 32,1024, 40,5120, 48,1536, 56,5632, 64,2048,
72,6144, 80,2560, 88,6656, 96,3072, 104,7168, 112,3584, 120,7680, 128,2048,
...

The one that I generate is 488 in length and the reference 1800.

Any ideas of what may be needed?

Regards

Mark

[ - ]
Reply by Y(J)SSeptember 17, 2018

OK, I see. These are mixed-radix algorithms. Note that the powers of 8 (64, 512, 4K) are pure radix 8, while the powers of 2 that are not powers of 8 are mixed radix. I would expect the powers of 4 (16, 64, 256, 1024) to be handled in one way, and the powers of 2 which are not powers of 4 or 8 (32, 128, 2048) differently. This doesn't match what you wrote - are you sure?

What happens with mixed-radix algorithms is that you do all the radix 8 that you can, after that a radix 4 if one is left over, and finally a radix 2 if one is left.

Y(J)S

[ - ]
Reply by mjbcswitzerlandSeptember 17, 2018

The CMSIS code selects the FFT to be used based on the length as follows:

    switch (L)
    {
    case 16:
    case 128:
    case 1024:
        arm_cfft_radix8by2_f32((arm_cfft_instance_f32 *)S, p1);
        break;

    case 32:
    case 256:
    case 2048:
        arm_cfft_radix8by4_f32((arm_cfft_instance_f32 *)S, p1);
        break;

    case 64:
    case 512:
    case 4096:
        arm_radix8_butterfly_f32(p1, L, (float32_t *)S->pTwiddle, 1);
        break;
    }

I am assuming that 8192 would use arm_cfft_radix8by2_f32() since it "cycles" through the three types.


There follows then the bit reversal:

arm_bitreversal_32((uint32_t*)p1,S->bitRevLength,S->pBitRevTable);

which needs the correct table to be passed and, as we have seen, the table needs also be to generated accordingly.

Regards

Mark


[ - ]
Reply by mjbcswitzerlandSeptember 17, 2018

A couple of notes:

1) There is an ARM /mbed forum but there are virtually no responses there (especially not from ARM)

2) I have seen other people trying to do the same (extend to 8192 points) but never found anyone that managed it (in one case that looked promising they gave up and decomposed the FFT into smaller ones)

3) At the CMSIS repository on GitHub I see others asking for it and they say they have plans but it is taking months (presumably also due to other priorities)

Regards

Mark

[ - ]
Reply by krzysiekApril 9, 2020

Hi all,


I implemented a script that generates bits reversal tables for all CMSIS algorithms (Radix-4, 4x2, Radix-8, 8x4, 8x2). It is available here:

https://gist.github.com/rosek86/d0d709852fddf36193071d7f61987bae


It is based on the guide written by christophe0606 on CMSIS_5 github page (https://github.com/ARM-software/CMSIS_5/issues/858).


Kind regards,

Krzysztof