Hi there. I'm going to implement circular convolution with zero latency, however here I'll simplify it to say 1024 points latency, because I need to understand where is the problem. So let's say we have an extremely long kernel, like 1000 * 1024 points. Now this is the input: i1 i2 i3 .. i1000 (each 1024 points long). And this is the kernel: k1 k2 k3 .. k1000 (each 1024 points long). To perform convolution we can zero pad each input and kernel block to 2048 samples and: out1 = idft( dft(i1 zero padded) * dft(k1 zero padded) ) And this is the result of convolution for input i1000: finalout1000 = out1 + out2 + .. + out1000 + leakagefromfinalout999 We can obviously precalculate dft's for k1 .. k1000. But we can also calculate input dft's just once, when processing realtime, because what is now i1000, will soon be i999, and then i998 etc., right? And here's my dilemma - I think we can also calculate output idft once. Why summing out1+out2.., when we can sum it in frequency domain? So, is this true: ? finalout1000 = idft( dft(i1 zero padded) * dft(k1 zero padded) + dft(i2 zero padded) * dft(k2 zero padded) + .. dft(i1000 zero padded) * dft(k1000 zero padded)) + leakagefromfinalout999 If that would be true, the convolution could be done too quickly I believe, since FFT is the most time-consuming task here. But why it is wrong? Thank in advance.

# Circular convolution optimization

Started by ●May 17, 2010

Reply by ●May 17, 20102010-05-17

On 17 Mai, 15:58, "jungledmnc" <jungledmnc@n_o_s_p_a_m.gmail.com> wrote:> Hi there. I'm going to implement circular convolution with zero latency, > however here I'll simplify it to say 1024 points latency, because I need =to> understand where is the problem. > > So let's say we have an extremely long kernel, like 1000 * 1024 points. N=ow> this is the input: > i1 i2 i3 .. i1000 (each 1024 points long). > > And this is the kernel: > k1 k2 k3 .. k1000 (each 1024 points long). > > To perform convolution we can zero pad each input and kernel block to 204=8> samples and: > out1 =3D idft( dft(i1 zero padded) * dft(k1 zero padded) ) > > And this is the result of convolution for input i1000: > finalout1000 =3D out1 + out2 + .. + out1000 + leakagefromfinalout999 > > We can obviously precalculate dft's for k1 .. k1000. But we can also > calculate input dft's just once, when processing realtime, > because what is now i1000, will soon be i999, and then i998 etc., right? > > And here's my dilemma - I think we can also calculate output idft once. W=hy> summing out1+out2.., when we can sum it in frequency domain? So, is this > true: ? > > finalout1000 =3D idft( > =A0 dft(i1 zero padded) * dft(k1 zero padded) + > =A0 dft(i2 zero padded) * dft(k2 zero padded) + > =A0 .. > =A0 dft(i1000 zero padded) * dft(k1000 zero padded)) + > =A0 leakagefromfinalout999 > > If that would be true, the convolution could be done too quickly I believ=e,> since FFT is the most time-consuming task here. But why it is wrong? > > Thank in advance.I'm not quite sure what your question is, but you might find answers here: http://pcfarina.eng.unipr.it/Public/AES-113/ Regards, Andor

Reply by ●May 17, 20102010-05-17