DSPRelated.com
Forums

partitioned/fast convolution

Started by peterworth December 19, 2007
I'm trying to implement a real-time partitioned/fast convolution and I'm
having real trouble getting my head around which partitions to multiply
with which, and what to cumulatively sum and output, etc. Can anyone offer
or link to some pseudocode explaining it step by step, or a diagram? Thanks.


peterworth wrote:
> I'm trying to implement a real-time partitioned/fast convolution and I'm
> having real trouble getting my head around which partitions to multiply
> with which, and what to cumulatively sum and output, etc. Can anyone offer
> or link to some pseudocode explaining it step by step, or a diagram?
> Thanks.
Perhaps this old thread helps:

http://groups.google.com/group/comp.dsp/browse_frm/thread/8494c515785251b3/1e8a4f9c5b70bb3d?#1e8a4f9c5b70bb3d

Regards,
Andor
> Perhaps this old thread helps:
>
> http://groups.google.com/group/comp.dsp/browse_frm/thread/8494c515785251b3/1e8a4f9c5b70bb3d?#1e8a4f9c5b70bb3d
Thanks, I had found that already actually, but was still a bit confused.
Looking at it a second time it makes more sense; it's just a case of turning
it into something procedural and working out how to do it with a real-time
input rather than two pre-defined buffers. Can you explain why it is x(n-k)
rather than x(k) when partitioned as below? Thanks for your help.

y(n) = h(n) * x(n) = sum_{k=0}^{N-1} h(k) x(n-k)

     = sum_{k=0}^{N1-1} h(k) x(n-k) + sum_{k=N1}^{N-1} h(k) x(n-k)
     = sum_{k=0}^{N1-1} h1(k) x(n-k) + sum_{k=0}^{N-N1-1} h2(k) x(n-k-N1)
     = h1(n) * x(n) + h2(n) * x(n-N1)
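A minimal numpy sketch of this split (illustrative only; the sizes and
variable names below are made up) checks that the two partial convolutions,
with the second one delayed by N1 samples, add up to the full convolution:

import numpy as np

# Split h into h1 (first N1 taps) and h2 (remaining taps); the h2 branch
# is delayed by N1 samples before being added in.
rng = np.random.default_rng(0)
N, N1 = 8, 3
h = rng.standard_normal(N)
x = rng.standard_normal(32)

h1, h2 = h[:N1], h[N1:]

y_full = np.convolve(x, h)                # y = h * x, length len(x) + N - 1
y_split = np.convolve(x, h1)              # h1 * x
y_split = np.pad(y_split, (0, len(y_full) - len(y_split)))
part2 = np.convolve(x, h2)                # h2 * x ...
y_split[N1:N1 + len(part2)] += part2      # ... added in N1 samples late

print(np.allclose(y_full, y_split))       # True: the partitions sum to h * x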
> Can you explain why it is x(n-k) rather than x(k) when partitioned as
> below? Thanks for your help.
>
> y(n) = h(n) * x(n) = sum_{k=0}^{N-1} h(k) x(n-k)
>
>      = sum_{k=0}^{N1-1} h(k) x(n-k) + sum_{k=N1}^{N-1} h(k) x(n-k)
>      = sum_{k=0}^{N1-1} h1(k) x(n-k) + sum_{k=0}^{N-N1-1} h2(k) x(n-k-N1)
>      = h1(n) * x(n) + h2(n) * x(n-N1)
Is it that you're essentially flipping one of them around, so that h(0) is
multiplied with x(n), h(1) with x(n-1), and so on? I can see why you would do
this in the time domain, but is it the same in the frequency domain? I.e.
does n here refer to the bin number, so that the highest frequency bin of the
IR is multiplied with the lowest of the input, etc., and then all summed?
peterworth wrote:
> Can you explain why it is x(n-k) rather than x(k) when partitioned as
> below?
>
> y(n) = h(n) * x(n) = sum_{k=0}^{N-1} h(k) x(n-k)
>
>      = sum_{k=0}^{N1-1} h(k) x(n-k) + sum_{k=N1}^{N-1} h(k) x(n-k)
The second term on the right above? It's just splitting the sum up into two
sums. For example

x(0) + x(1) + x(2) = (x(0) + x(1)) + x(2).
>      = sum_{k=0}^{N1-1} h1(k) x(n-k) + sum_{k=0}^{N-N1-1} h2(k) x(n-k-N1)
>      = h1(n) * x(n) + h2(n) * x(n-N1)
And this pops out. So you can split one large N-point convolution into two
smaller convolutions (and recursively into any number of smaller
convolutions). Usually, 2*N1 is taken to be the size of the FFT that you
process the input signal with (giving a hop size of N1). Then you can reuse
each FFT input buffer as it traverses through your impulse response buffers.

The concept is called frequency-domain delay line convolution. You'll find
more here:

http://pcfarina.eng.unipr.it/Public/AES-113/

The Garcia preprint has some diagrams that might help.

Regards,
Andor
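To make the frequency-domain delay line concrete, here is a rough numpy
sketch of the uniformly partitioned (overlap-save) form described above: FFT
size 2*N1, hop size N1, and one stored input spectrum per partition of h. The
function name and buffer handling are illustrative assumptions, not code from
the thread or from the Garcia preprint.

import numpy as np

def fdl_convolve(x, h, N1):
    # Uniformly partitioned overlap-save convolution with a frequency-domain
    # delay line (sketch only). Block size N1, FFT size 2*N1.
    P = int(np.ceil(len(h) / N1))                        # number of partitions
    H = np.array([np.fft.rfft(h[p*N1:(p+1)*N1], 2*N1)    # spectrum of each
                  for p in range(P)])                    # zero-padded partition

    fdl = np.zeros((P, N1 + 1), dtype=complex)   # delay line of input spectra
    prev = np.zeros(N1)                          # previous input block (overlap)
    n_blocks = int(np.ceil(len(x) / N1))
    x = np.pad(x, (0, n_blocks * N1 - len(x)))
    y = np.zeros(n_blocks * N1)

    for b in range(n_blocks):
        cur = x[b*N1:(b+1)*N1]
        fdl = np.roll(fdl, 1, axis=0)            # age every stored spectrum
        fdl[0] = np.fft.rfft(np.concatenate([prev, cur]))  # newest 2*N1 samples
        Y = (fdl * H).sum(axis=0)                # multiply-accumulate, one term
                                                 # per partition of h
        y[b*N1:(b+1)*N1] = np.fft.irfft(Y)[N1:]  # keep the valid (last) half
        prev = cur
    return y                                     # the convolution tail is
                                                 # dropped; feed zero blocks
                                                 # to flush it if needed

The appeal is that each incoming block costs one forward FFT, one complex
multiply-accumulate per partition, and one inverse FFT, while the input/output
latency stays at N1 samples no matter how long the impulse response is.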
> The concept is called frequency-domain delay line convolution. You'll
> find more here:
>
> http://pcfarina.eng.unipr.it/Public/AES-113/
>
> The Garcia preprint has some diagrams that might help.
Thanks for that - the diagram on page 2 of the Garcia paper really
demonstrates the part I'm having trouble with at the moment: I'm trying to
find out if I can implement this in Max/MSP, and since I don't want to
constrain the size of the IR, there is no telling how many such delay lines I
will need (likely hundreds), so this aspect of the implementation seems very
tricky.

What I was wondering above was why in the original sum it is h(k) and x(n-k)
rather than h(k) and x(k). Is the highest frequency bin of the IR multiplied
with the lowest of the input (and so on)? Sorry to ask lots of questions.
On 19 Dec., 17:35, "peterworth" <peterwo...@gmail.com> wrote:
> Thanks for that - the diagram on page 2 of the Garcia paper really
> demonstrates the part I'm having trouble with at the moment: I'm trying to
> find out if I can implement this in Max/MSP, and since I don't want to
> constrain the size of the IR, there is no telling how many such delay
> lines I will need (likely hundreds), so this aspect of the implementation
> seems very tricky.
Yeah, double partitioned convolution using OLA/OLS is a real pain. Don't
forget the optimal partitioning sequence, which requires parallel convolution
processes.
> What I was wondering above was why in the original sum it is h(k) and
> x(n-k) rather than h(k) and x(k). Is the highest frequency bin of the IR
> multiplied with the lowest of the input (and so on)?
No, k is the time index. Flipping one sequence around in time before the
multiply-add is just the definition of the convolution.

Regards,
Andor
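For concreteness, a deliberately slow, literal numpy rendering of that sum
(the helper name is arbitrary), with k running over the filter taps in time:

import numpy as np

def conv_direct(x, h):
    # y(n) = sum_k h(k) x(n-k): k indexes the taps of h in time (not
    # frequency bins), and the input is read backwards as x(n-k); that
    # time-flip is simply the definition of convolution.
    y = np.zeros(len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                y[n] += h[k] * x[n - k]
    return y      # matches np.convolve(x, h)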
On Dec 19, 11:35 am, "peterworth" <peterwo...@gmail.com> wrote:
> > y(n) = h(n) * x(n) = sum_{k=0}^{N-1} h(k) x(n-k)
> >
> >      = sum_{k=0}^{N1-1} h(k) x(n-k) + sum_{k=N1}^{N-1} h(k) x(n-k)
i'm not sure i quite get this. i am trying to relate it to OLA...
> > sorry to ask lots of questions..
it's fine. now you're doing overlap-ADD fast convolution, right? or is it
overlap-SAVE?

r b-j
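For reference, a minimal overlap-add fast-convolution sketch in numpy (the
block size B and the function name are arbitrary choices): each length-B
input block is convolved with the whole impulse response via the FFT, and the
overlapping output tails are added together. Overlap-save instead slides
overlapping input blocks and keeps only the valid output samples, as in the
delay-line sketch further up.

import numpy as np

def overlap_add(x, h, B):
    # Plain overlap-add fast convolution (h not partitioned), sketch only.
    L = len(h)
    Nfft = 1
    while Nfft < B + L - 1:          # FFT long enough for linear convolution
        Nfft *= 2
    H = np.fft.rfft(h, Nfft)
    y = np.zeros(len(x) + L - 1)
    for start in range(0, len(x), B):
        blk = x[start:start + B]
        yb = np.fft.irfft(np.fft.rfft(blk, Nfft) * H, Nfft)[:len(blk) + L - 1]
        y[start:start + len(yb)] += yb   # overlapping tails add up
    return y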
Try this reference: 

http://www.dspguide.com/ch18.pdf
Sorry for the bad link -- here is the correct one:

http://www.dspguide.com/CH18.PDF