Technical discussions related to Audio Signal Processing (digital effects, acoustics, noise reduction, musical signal processing, etc).
Hi Alex, got the posting only today, since I'm currently only reading the digest. if it's still of interest, here are some hints... > � �Date: Thu, 16 Feb 2006 15:35:56 +0100 > � �From: Alex Young <alex.young@alex...> > Subject: in-place interleaving > > Hi all, > > has anyone come up with a solution to the following problem... > Given a buffer containing 2 sub blocks of left and right samples: > L(0), L(1), L(2), L(3), ... L(n-1), R(0), R(1), R(2), R(3), ...R(n-1) > Perform in-place interleaving: > L(0), R(0), L(1), R(1), L(2), R(2), L(3), R(3), ... L(n-1), R(n-1) > > The reason for in-place processing is that we do not have any extra memory > available. > > One method appears to work but is rather long: > Taking just 4 elements shows that just the inner terms need to be swapped > L0 L1 R0 R1 input > L0 R0 L1 R1 output > > Taking 8 elements shows that by swapping the inner half of the terms > reduces the problem to 2*4 element problems > L0 L1 L2 L3 � R0 R1 R2 R3 input > L0 L1 R0 R1 � L2 L3 R2 R3 swap inner half > L0 R0 L1 R1 � L2 R2 L3 R3 output > > So the solution may look in structure a bit like an FFT given 2^x > elements, if anyone knows a more efficient method, please let me know! > > Regards, > > Alex Young > DSP software Engineer I don't think that your assumption and your idea is correct and useful. Here's why: If you consider your example with 8 elements, you'll see that R0 has to be moved twice until it is at the right location. If you extend your example to more elements, this will imply more moves for R0. The optimal solution would certainly be to move every element only once. Ideally, you would save one element to an external location, then take the element which will finally belong to the now empty location, place it and from now on don't touch it again. The next element to pick would be that which is targetted to the now empty location, etc. Provided that this is feasible, it would guarantee the minimal number of memory accesses. however, you'd have to take into account the additional calculations or index accesses to find the source and target locations... Mathematically spoken, your task is to find the transpose of a matrix. If you're familiar with Matlab, it's this: before = [ L0, L1, L2, ... ; R0, R1, R2, ... ]; after = transpose(before); after is now: [ L0, R0; L1, R1; ... ]; Maybe you find optimal solutions if you investigate the matrix scene... Depending on the processor/DSP which you're doing this on, you might employ special addressing capabilities like autoincrements. Demonstration with Pseudo-code: pL = adr(L1); pR = adr(R0); loop while not last sample: *(pL++) = *(pR) *(pR++)=*(pL) loop end. assuming that *() would retrieve the content, and that (pX++) increments the pointer after evaluating it's content. this would be an effective solution which doesn't touch the elements more often than necessary - therefore it should be pretty fast. Bernhard
On Wednesday 22 February 2006 17:12, D K wrote: > Hi , > The seems to be correct but i would like to comment that this is just > a kind of pointer manipulation which can be anytime while reading or > writing from the buffer and hence can be easily tackled and you may not > even need any extra memory . > rgds, > DK > Hi DK, you're right. While investigating the capabilities of the internal DMA in a Sharc DSP, I played with chained DMA, and was astonished at this feature. If the processor which Alex is going to use has similar capabilities, it might even be possible to delegate the task to a DMA process, so that (after some initialization) it actually costs no single step in the DSP process. Bernhard > On 2/20/06, Bernhard Holzmayer <Holzmayer.Bernhard@Holz...> wrote: > > Hi Alex, > > > > got the posting only today, since I'm currently only reading the digest. > > if it's still of interest, here are some hints... > > > > > � �Date: Thu, 16 Feb 2006 15:35:56 +0100 > > > � �From: Alex Young <alex.young@alex...> > > > Subject: in-place interleaving > > > > > > Hi all, > > > > > > has anyone come up with a solution to the following problem... > > > Given a buffer containing 2 sub blocks of left and right samples: > > > L(0), L(1), L(2), L(3), ... L(n-1), R(0), R(1), R(2), R(3), ...R(n-1) > > > Perform in-place interleaving: > > > L(0), R(0), L(1), R(1), L(2), R(2), L(3), R(3), ... L(n-1), R(n-1) > > > > > > The reason for in-place processing is that we do not have any extra > > > > memory > > > > > available. > > > > > > One method appears to work but is rather long: > > > Taking just 4 elements shows that just the inner terms need to be > > > > swapped > > > > > L0 L1 R0 R1 input > > > L0 R0 L1 R1 output > > > > > > Taking 8 elements shows that by swapping the inner half of the terms > > > reduces the problem to 2*4 element problems > > > L0 L1 L2 L3 � R0 R1 R2 R3 input > > > L0 L1 R0 R1 � L2 L3 R2 R3 swap inner half > > > L0 R0 L1 R1 � L2 R2 L3 R3 output > > > > > > So the solution may look in structure a bit like an FFT given 2^x > > > elements, if anyone knows a more efficient method, please let me know! > > > > > > Regards, > > > > > > Alex Young > > > DSP software Engineer > > > > I don't think that your assumption and your idea is correct and useful. > > Here's why: > > > > If you consider your example with 8 elements, you'll see that R0 has to > > be moved twice until it is at the right location. > > If you extend your example to more elements, this will imply more moves > > for > > R0. The optimal solution would certainly be to move every element only > > once. > > Ideally, you would save one element to an external location, > > then take the element which will finally belong to the now empty > > location, place it and from now on don't touch it again. > > The next element to pick would be that which is targetted to the now > > empty location, etc. > > Provided that this is feasible, it would guarantee the minimal number of > > memory accesses. > > > > however, you'd have to take into account the additional calculations or > > index > > accesses to find the source and target locations... > > > > Mathematically spoken, your task is to find the transpose of a matrix. > > If you're familiar with Matlab, it's this: > > before == [ L0, L1, L2, ... ; R0, R1, R2, ... ]; > > after == transpose(before); > > after is now: [ L0, R0; L1, R1; ... ]; > > Maybe you find optimal solutions if you investigate the matrix scene... > > > > Depending on the processor/DSP which you're doing this on, you might > > employ > > special addressing capabilities like autoincrements. Demonstration with > > Pseudo-code: > > > > pL == adr(L1); > > pR == adr(R0); > > > > loop while not last sample: > > *(pL++) == *(pR) > > *(pR++)==*(pL) > > loop end. > > > > assuming that *() would retrieve the content, and that (pX++) increments > > the > > pointer after evaluating it's content. > > this would be an effective solution which doesn't touch the elements more > > often than necessary - therefore it should be pretty fast. > > > > Bernhard > >