I want to convolve a long FIR filter (at least 800 samples) with real-time analogue stereo audio data on an STM32F769I-Discovery board. I have posted questions about this on the STM Community forum without getting a solution, and I have written and tried the Overlap/Save technique, but I get no output from the card. The card itself is functioning correctly: if I remove the Overlap/Save code, I get correct output.
Is there a way of performing real-time convolution on such a board that is extremely fast in execution?
Seems like you really need to understand why your overlap/save code isn't running on the STM32 properly. Add some diagnostics to the audio processing code - toggle GPIO pins as you get through different stages of the algorithm so you can track progress. Learn to use your debugger to check if it's hanging up in an exception handler.
Once you know why there's no output, you'll have a better idea of whether you need to alter your algorithm.
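As a rough sketch of the kind of diagnostics meant here, the block processing can be split into stages and a marker updated before each one. Everything named below (`mark_stage`, `run_block`, `dbg_stage`, the three stage functions) is hypothetical and only for illustration. On the board, `mark_stage` is where a GPIO pin could be toggled as well; this host-side version just records the stage number.

```c
#include <stdint.h>

typedef void (*stage_fn)(void);

volatile uint32_t dbg_stage = 0;        /* stage currently running, 0 = never started */
volatile uint32_t dbg_blocks_done = 0;  /* number of blocks that ran to completion */

/* Hypothetical hook: on the target, a GPIO write could be added here so the
   progress is visible on a scope or logic analyser. */
void mark_stage(uint32_t stage)
{
    dbg_stage = stage;
}

/* Placeholder stages standing in for the real overlap/save steps. */
static void stage_copy_in(void)  { /* gather input samples */ }
static void stage_filter(void)   { /* FFT, multiply, inverse FFT */ }
static void stage_copy_out(void) { /* write output samples */ }

static const stage_fn STAGES[] = { stage_copy_in, stage_filter, stage_copy_out };

/* Run every stage in order, recording which one is active. */
void run_block(const stage_fn *stages, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        mark_stage(i + 1);      /* 1-based stage number */
        stages[i]();
    }
    mark_stage(count + 1);      /* one past the last stage means "block finished" */
    dbg_blocks_done = dbg_blocks_done + 1;
}
```

If the code hangs or ends up in a fault handler, halting in the debugger and reading `dbg_stage` shows which stage was running, and `dbg_blocks_done` shows whether any block ever completed.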
Are you doing time-domain or frequency-domain block processing? Frequency domain is much faster for long FIR lengths.
Also, it depends on how much latency you can afford: partitioned block processing is slightly slower but has lower latency, while single-block processing is the fastest.
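For reference, here is a minimal sketch of single-block frequency-domain overlap/save, written so it can be checked on a PC before it goes anywhere near the board. The sizes are deliberately tiny and are assumptions for illustration only (a 16-point FFT and a 5-tap filter). For an 800-sample filter, a 2048-point FFT would give 2048 - 800 + 1 = 1249 new samples per block. The names `fft16`, `ols_init`, `ols_process` and `ols_state` are made up here. On the target you would most likely replace `fft16` with a library FFT such as CMSIS-DSP's `arm_rfft_fast_f32`.

```c
#include <string.h>

#define FFT_N   16                    /* FFT size, power of two (illustrative) */
#define FIR_M   5                     /* filter length (illustrative) */
#define BLOCK_L (FFT_N - FIR_M + 1)   /* new samples consumed per block */

/* cos(2*pi*k/16) for k = 0..15; a lookup table avoids run-time trig calls. */
static const float COS16[FFT_N] = {
     1.0f,  0.92387953f,  0.70710678f,  0.38268343f,
     0.0f, -0.38268343f, -0.70710678f, -0.92387953f,
    -1.0f, -0.92387953f, -0.70710678f, -0.38268343f,
     0.0f,  0.38268343f,  0.70710678f,  0.92387953f
};

/* In-place radix-2 FFT on split real/imaginary arrays of length FFT_N.
   inverse != 0 selects the inverse transform, including the 1/N scaling. */
void fft16(float *re, float *im, int inverse)
{
    for (int i = 1, j = 0; i < FFT_N; i++) {        /* bit-reversal permutation */
        int bit = FFT_N >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) {
            float tmp = re[i]; re[i] = re[j]; re[j] = tmp;
            tmp = im[i]; im[i] = im[j]; im[j] = tmp;
        }
    }
    for (int len = 2; len <= FFT_N; len <<= 1) {    /* butterfly stages */
        int step = FFT_N / len;
        for (int i = 0; i < FFT_N; i += len) {
            for (int k = 0; k < len / 2; k++) {
                int tw = k * step;                              /* twiddle index */
                float wr = COS16[tw];
                float sn = COS16[(tw + 12) & (FFT_N - 1)];      /* sin(2*pi*tw/16) */
                float wi = inverse ? sn : -sn;
                int a = i + k, b = a + len / 2;
                float vr = re[b] * wr - im[b] * wi;
                float vi = re[b] * wi + im[b] * wr;
                re[b] = re[a] - vr;  im[b] = im[a] - vi;
                re[a] += vr;         im[a] += vi;
            }
        }
    }
    if (inverse)
        for (int i = 0; i < FFT_N; i++) { re[i] /= FFT_N; im[i] /= FFT_N; }
}

typedef struct {
    float Hre[FFT_N], Him[FFT_N];   /* spectrum of the zero-padded impulse response */
    float history[FIR_M - 1];       /* last FIR_M-1 inputs of the previous block */
} ols_state;

/* Precompute the filter spectrum once and clear the input history. */
void ols_init(ols_state *s, const float *h)
{
    memset(s, 0, sizeof *s);
    memcpy(s->Hre, h, FIR_M * sizeof(float));
    fft16(s->Hre, s->Him, 0);
}

/* Filter BLOCK_L new input samples into BLOCK_L output samples. */
void ols_process(ols_state *s, const float *in, float *out)
{
    float re[FFT_N], im[FFT_N] = {0};
    memcpy(re, s->history, sizeof s->history);             /* old tail first */
    memcpy(re + FIR_M - 1, in, BLOCK_L * sizeof(float));   /* then the new samples */
    memcpy(s->history, re + BLOCK_L, sizeof s->history);   /* save tail for next call */
    fft16(re, im, 0);
    for (int i = 0; i < FFT_N; i++) {                      /* multiply the spectra */
        float r = re[i] * s->Hre[i] - im[i] * s->Him[i];
        im[i]   = re[i] * s->Him[i] + im[i] * s->Hre[i];
        re[i]   = r;
    }
    fft16(re, im, 1);
    memcpy(out, re + FIR_M - 1, BLOCK_L * sizeof(float));  /* drop the aliased samples */
}
```

Comparing the output of `ols_process` against a plain time-domain convolution on the PC is a quick way to rule out an algorithm bug before looking for board-specific problems. For stereo, one `ols_state` per channel would be needed.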
The CMSIS library is hard to beat unless you know what you’re doing (i.e. have good knowledge of both the target application and DSP theory). Here is a link to a reference example:
To do it in real time, you would want a buffer or double buffer that fills up to the block size used in their example, then output the data when it’s ready. Best of luck.
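A double buffer along those lines could look roughly like the sketch below. It assumes the audio arrives through a circular transfer that signals when each half of the buffer is full. The two `on_..._complete` functions are hypothetical hooks that would be called from the half-transfer and transfer-complete interrupts on the board, `process_block` is a pass-through placeholder for the real filter, and the block size is tiny for illustration.

```c
#include <stdint.h>
#include <stddef.h>

#define BLOCK 4   /* samples per half-buffer (illustrative; real blocks are much larger) */

static int16_t rx[2 * BLOCK];          /* input: two halves filled alternately */
static int16_t tx[2 * BLOCK];          /* output: two halves drained alternately */
static volatile int ready_half = -1;   /* -1 = nothing pending, 0 or 1 = half to process */

/* Hypothetical hooks, meant to be called from the transfer interrupts. */
void on_half_complete(void) { ready_half = 0; }
void on_full_complete(void) { ready_half = 1; }

/* Placeholder for the real filter: copies input to output unchanged. */
static void process_block(const int16_t *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) out[i] = in[i];
}

/* Call from the main loop. Returns 1 if a block was processed, 0 otherwise. */
int audio_poll(void)
{
    int half = ready_half;
    if (half < 0) return 0;
    ready_half = -1;
    process_block(&rx[half * BLOCK], &tx[half * BLOCK], BLOCK);
    return 1;
}
```

While one half is being processed, the other half is being filled, so `process_block` has to finish within one block period. If it does not, blocks are silently overwritten, which is worth checking for with a counter or a GPIO toggle.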