Hello,

I finally have some time to return to the problem of multichannel decimation on the PSoC 5LP. The situation is as follows: there are 8 channels of 12 bits @ 100 kHz each and a single digital quadrature mixer running at 310 kHz, also 12 bits. The hardware is an 80 MHz ARM Cortex-M3 equipped with a coprocessor called the DFB, running at the same speed, with a single-cycle 24x24->48-bit MAC and 256 24-bit memory cells.

The final processing will be handled by the ARM, but since all the input data streams are heavily oversampled (by a factor of ~100), I'd like to do as much preprocessing as I can on the DFB so as not to swamp the ARM with a massive amount of redundant data. Since the 310 kHz I/Q data stream is in fact composed of two independent 155 kHz streams with exactly the same filtering requirements, and 155 kHz is, by pure accident, close to the rate of the remaining 100 kHz streams, it effectively boils down to a 10-channel decimation by as much as possible using the same structure. There are enough MIPS, but the RAM capacity is effectively 256/10 ≈ 25 cells per stream. This immediately wipes out all the polyphase FIR techniques you told me about previously. The only option on the table is a cascade of decimating IIR filters.

Question #1: what would be appropriate here? It can be the CIC, but it leaves the MAC unit idle -- maybe it could be used somehow? But if not, then what should the CIC structure look like? Each order-K filter decimating by M requires 2K cells for the delay storage + one cell for the decimation counter. The input data stream is 12 bits wide and the word length is 24 bits, which does not allow a high K; 4 or 5 is the absolute maximum. The required headroom for the integrators grows only logarithmically with M, so a high value of M is tempting. OTOH, the higher M is, the closer I slide to the left on the main CIC lobe, decreasing the antialiasing attenuation, so it seems pointless to push M very high.
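To make the structure in question concrete, here is a minimal Python model of one order-K CIC decimator with 2K + 1 state cells. It is only a sketch, not DFB code: the names `wrap` and `CicDecimator` are illustrative, and it assumes the integrators are allowed to wrap around in 24-bit two's-complement arithmetic rather than saturate.

```python
WORD_BITS = 24                       # DFB word length quoted in the post
MASK = (1 << WORD_BITS) - 1


def wrap(x):
    """Reduce x to a signed 24-bit two's-complement value (wraparound, no saturation)."""
    x &= MASK
    return x - (1 << WORD_BITS) if x & (1 << (WORD_BITS - 1)) else x


class CicDecimator:
    """Order-K CIC decimating by M: K integrators, K comb delays, one counter."""

    def __init__(self, order, factor):
        self.order = order
        self.factor = factor
        self.integ = [0] * order     # K integrator cells (run at the input rate)
        self.comb = [0] * order      # K comb delay cells (run at the output rate)
        self.count = 0               # 1 decimation counter cell

    def cells(self):
        """Number of state cells: 2K + 1."""
        return len(self.integ) + len(self.comb) + 1

    def push(self, sample):
        """Feed one input sample; return an output every `factor` inputs, else None."""
        acc = sample
        for i in range(self.order):
            self.integ[i] = wrap(self.integ[i] + acc)
            acc = self.integ[i]
        self.count += 1
        if self.count < self.factor:
            return None
        self.count = 0
        for i in range(self.order):
            prev = self.comb[i]
            self.comb[i] = acc
            acc = wrap(acc - prev)
        return acc


# Usage: K=4, M=4, constant full-scale positive 12-bit input.
cic = CicDecimator(order=4, factor=4)
outputs = [y for y in (cic.push(2047) for _ in range(400)) if y is not None]
print(cic.cells(), outputs[-1])
```

In this model the integrators overflow many times during the run, yet the settled output still equals the input times the DC gain M^K, because that final value fits in the 24-bit word.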
It also makes no sense to require the stopband attenuation to be higher than the max. height of the second lobe, and since the main and the second lobe's attenuations are equal at about 1/5th of the normalized frequency, the max. useful decimation factor per stage is also about 5.

Question #2: is this reasoning acceptably correct? If yes, then a quick calculation shows that the max. available M is 20 for K=3 (23.97 bits), 9 for K=4 (23.68 bits) and 6 for K=5 (23.93 bits). OTOH, the SNR grows by 1 bit for every 4-fold decrease in the sampling rate, so with such low integrator headroom it looks wise to use that bit to its full capacity, i.e. to make the decimation factor equal to 4^N. This + all the above means that M should be 4.

Question #3: is this reasoning acceptably correct? If yes, then a cascade of 3 such CICs should do the job, providing a decimation factor of 64. For K=4 they would require (2*4+1)*3=27 storage cells per channel, which doesn't fit in the chip. But if I somehow combine all the decimation counters into one cell, it makes 2*4*3+1=25, which barely fits. The second saving may come from the fact that the last comb section runs at exactly the final frequency and so can be calculated by the ARM. It removes 4 cells per channel, so the total memory footprint would be 21. Then I could make the last stage 5th order, i.e. use 22 cells.

Sincere thanks to anyone who managed to read up to this point; I wanted to make my reasoning explicit in order to make it easy for the experts to spot and correct the mistakes in my understanding. I am also extremely curious whether there are better, MAC-based factor ~60 decimators which would fit within 25 cells.

Best regards,
Piotr
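P.S. For anyone who wants to check the arithmetic, a short Python sketch follows. It assumes the register width is estimated as 11 + K*log2(M) bits, which is an assumption that reproduces the maximum M values quoted above (the usual Hogenauer estimate adds the full input width instead, which would give one bit more). The function names are illustrative only.

```python
import math

WORD_BITS = 24      # DFB word length
INPUT_BITS = 11     # assumption: input term that reproduces the quoted max M values


def bits_needed(order, factor):
    """Estimated integrator width for an order-K CIC decimating by M."""
    return INPUT_BITS + order * math.log2(factor)


def max_factor(order):
    """Largest M whose estimated width still fits in the 24-bit word."""
    m = 1
    while bits_needed(order, m + 1) <= WORD_BITS:
        m += 1
    return m


def cells(orders, shared_counter=False, last_comb_on_arm=False):
    """Storage cells per channel for a cascade of CIC stages of the given orders."""
    total = sum(2 * k for k in orders)               # K integrators + K combs per stage
    total += 1 if shared_counter else len(orders)    # decimation counter(s)
    if last_comb_on_arm:
        total -= orders[-1]                          # last comb section moved to the ARM
    return total


for k in (3, 4, 5):
    print(k, max_factor(k))

print(cells([4, 4, 4]))                                              # separate counters
print(cells([4, 4, 4], shared_counter=True))                         # one shared counter
print(cells([4, 4, 4], shared_counter=True, last_comb_on_arm=True))  # last comb on the ARM
print(cells([4, 4, 5], shared_counter=True, last_comb_on_arm=True))  # 5th-order last stage
```

The four `cells` calls correspond to the 27, 25, 21 and 22 cell budgets discussed above, against 256 // 10 = 25 cells available per stream.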
Low memory footprint decimation
Started by ●September 7, 2017