TI 54x FIRS Not Compatible with Circular Buffering?

Started by Unknown September 10, 2004
Randy Yates wrote:

> I don't mean to put you off, but go read the responses I had to Jerry
> Avins and understand what this instruction is doing. A key point is
> that xmem and ymem are pointing to symmetric points about a center
> point in the input data buffer, so every time you move a notch (a
> coefficient) you must increment one and decrement the other.

[ Blushing - brushing off old 54xx manual to read more about FIRS (*cough*) ]

OK - here is a somewhat more thought-over shot at it. If we consider an 8-tap symmetric FIR filter, and we denote x[n] as [0], x[n-1] as [1] etc., we want the following equation to be realized using the FIRS instruction:

y[n]=c[0]([0]+[7])+c[1]([1]+[6])+c[2]([2]+[5])+c[3]([3]+[4])

The key is to get the data aligned in x/y memory so that the FIRS instruction can form the sums inside the () with the same pointer update for both X and Y. This can be done by arranging the data the following way (8 consecutive samples shown, the rest is a repetition). Note that each iteration replaces [7] with a new sample, which then becomes the newest ([0]), while the other samples get "older".

        *                              *
n=0: x [0][2][4][6] | n=1: x [1][3][5][7]
     y [7][5][3][1] |      y [0][6][4][2]
        *                     *
                 *                  *
n=2: x [2][4][6][0] | n=3: x [3][5][7][1]
     y [1][7][5][3] |      y [2][0][6][4]
           *                     *
              *                  *
n=4: x [4][6][0][2] | n=5: x [5][7][1][3]
     y [3][1][7][5] |      y [4][2][0][6]
              *                     *
           *                  *
n=6: x [6][0][2][4] | n=7: x [7][1][3][5]
     y [5][3][1][7] |      y [6][4][2][0]
                 *                     *

If we start the pointers at the stars, a circular increment by 1 will implement the desired functionality, provided that the coefficient vector has the form [ c[0] c[2] c[3] c[1] ] for even samples and [ c[0] c[1] c[3] c[2] ] for odd samples. This means that we need to store M coefficients (even though M/2 describe the filter fully).

Maintaining the pointers and the data input is a little tricky, but can be done with two circular pointers (modulo 4 in this case). If we denote these two index pointers px and py, and index the coefficients by pc (a linear index pointer), the following pseudocode implements our filter (how to map this to the TI is left as an exercise). The loop processes two samples at a time.

px=py=0   # Actually - if you are not doing block processing with a
          # block size that is a multiple of M, these should be
          # stored/restored for each sample block. The FIRS approach
          # here requires at least 2 samples per block.

do (an_even_number_of_samples){
    # Insert a sample in the delay line
    xmem[px]=new_sample_even
    B=0
    pc=0
    do (M/2){
        B+=coeffs[pc++]*(xmem[px++]+ymem[py++])   # the FIRS instruction
    }
    # Output B to the appropriate place here
    somemem[outputptr++]=B
    # Adjust the X pointer according to the scheme above
    px--
    # Insert another sample in the delay line
    ymem[py]=new_sample_odd
    # Note that we do NOT reset pc here since a different coeff order
    # is needed for the odd samples
    B=0
    do (M/2){
        B+=coeffs[pc++]*(xmem[px++]+ymem[py++])   # the FIRS instruction
                                                  # strikes again
    }
    # Output one more B
    somemem[outputptr++]=B
    # Adjust the Y pointer according to the scheme above
    py++
}

I hope this is a little more helpful than my first post. Generalization to an odd number of taps is left as an exercise.

DISCLAIMER: None of this is actually tested in code, but I think the idea is correct.

--
Brian Dam Pedersen
M.Sc.EE.
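The pseudocode above can be checked off-target with a small simulation. The following is only a sketch under stated assumptions: the function names `firs_two_bank` and `direct_fir` are made up for illustration, the delay line starts out as all zeros, arithmetic is plain Python integers (no Q15 scaling or saturation), and the inner loop merely stands in for the repeated FIRS instruction. It compares the two-bank scheme for the 8-tap case against a straightforward convolution.

```python
def firs_two_bank(samples, c):
    """Two-bank scheme from the post for 8 taps; c = [c0, c1, c2, c3]."""
    half = 4
    # Coefficient table in the order given in the post:
    # first the even-sample order, then the odd-sample order.
    coeffs = [c[0], c[2], c[3], c[1], c[0], c[1], c[3], c[2]]
    xmem, ymem = [0] * half, [0] * half   # assumption: zero initial history
    px = py = 0
    out = []
    # Two samples per pass; a trailing unpaired sample is ignored.
    for k in range(0, len(samples) - 1, 2):
        xmem[px] = samples[k]             # even sample goes into the X bank
        pc, acc = 0, 0
        for _ in range(half):             # stands in for the repeated FIRS
            acc += coeffs[pc] * (xmem[px] + ymem[py])
            pc, px, py = pc + 1, (px + 1) % half, (py + 1) % half
        out.append(acc)
        px = (px - 1) % half              # circular px--
        ymem[py] = samples[k + 1]         # odd sample goes into the Y bank
        acc = 0                           # pc is deliberately not reset
        for _ in range(half):
            acc += coeffs[pc] * (xmem[px] + ymem[py])
            pc, px, py = pc + 1, (px + 1) % half, (py + 1) % half
        out.append(acc)
        py = (py + 1) % half              # circular py++
    return out


def direct_fir(samples, c):
    """Reference: plain 8-tap convolution with taps c0..c3,c3..c0."""
    taps = list(c) + list(c)[::-1]
    return [sum(t * samples[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(samples))]
```

Feeding both functions the same even-length integer sequence and comparing the outputs is a quick way to see whether the pointer bookkeeping and the two coefficient orders agree with the intended filter.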
Randy Yates wrote:

> Jerry Avins <jya@ieee.org> writes:
>
>> Randy Yates wrote:
>>
>>    ...
>>
>>> Hi Jerry,
>>> xmem and ymem both point to data - pmad points to the
>>> coefficients. (pmad stands for "program memory address"). This
>>> instruction computes the following, in C meta code
>>>
>>>    B += A * *(pmad+n);
>>>    A = *(xmem++) + *(ymem--);
>>>
>>> where n is incremented by one when you repeat the instruction and
>>> I'm assuming the corresponding addressing form for xmem and ymem
>>> as shown in the code above (i.e., *ARx+, *AR7-).
>>>
>>> You see the idea? The data on both sides of the symmetric FIR are
>>> added first, then multiplied by the one coefficient. This saves
>>> MIPS (since this instruction can be done in 1 cycle) AND memory
>>> since you only have to store ((M - 1) / 2) + 1 coefficients. Note
>>> that you must precompute the first coefficient's multiplication and
>>> first data sum before entering the repeat loop with FIRS.
>>
>> Gotcha, Randy; thanks. I see where it saves space, but not time if an
>> addition takes as long as a MAC.
>
> But in this case it doesn't. The addition and the MAC are ALL done in
> 1 cycle. Pretty slick, eh? Them folks at TI sure are smart. Except that
> they forgot how to make it work with circular addressing...

Can you set the index stride? If so, you can index backward by setting it to [size - 1].

Jerry
--
Engineering is the art of making what you want from things you can get.
Jerry Avins <jya@ieee.org> writes:

> Can you set the index stride?

Not only can you, you must. The only circular addressing mode available for this instruction is the one which increments by the amount in AR0 circularly, so both operands must stride the same way. I'm sure you could see this in an instant if you picked up the mnemonic assembly language book from TI and looked at the instruction - it's document SPRU172.

> If so, you can index backward by setting
> it to [size - 1].

You can, but then both operands would index backward.
--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
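For readers following along without the manual, the repeated-FIRS behaviour described earlier in the thread (multiply the previous sum by the next coefficient, then add the next mirrored pair of samples) can be modelled roughly as below. This is a sketch, not TI's definition of the instruction: `firs_repeat` is a hypothetical name, the 16-bit shift of the accumulator and all fixed-point effects are ignored, and an odd-length filter with a lone centre tap is assumed, with A pre-loaded from the centre sample before the loop.

```python
def firs_repeat(buf, center, half_coeffs):
    """Model of a repeated FIRS on a linear buffer.

    buf         -- delay line (linear, not circular)
    center      -- index of the centre sample of the filter window
    half_coeffs -- [h_centre, h_centre+1, ...], one entry per tap pair
    """
    a = buf[center]                      # pre-loaded A: the centre sample
    xp, yp = center - 1, center + 1      # one pointer walks down, one walks up
    b = 0
    last = len(half_coeffs) - 1
    for n, coeff in enumerate(half_coeffs):
        b += a * coeff                   # B += A * coefficient
        if n < last:                     # the final sum would be discarded
            a = buf[xp] + buf[yp]        # A = sum of the next mirrored pair
            xp -= 1                      # opposite directions: this is what
            yp += 1                      # a single shared stride cannot do
    return b
```

With `half_coeffs = [4, 3, 2, 1]` the modelled filter has taps 1, 2, 3, 4, 3, 2, 1, so four multiplies cover seven taps. The two pointer updates move in opposite directions, which is exactly the part that a single circular stride in AR0 cannot express.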
Randy Yates wrote:

   ...

> I'm sure you could see this in an instant if you picked up the mnemonic
> assembly language book from TI and looked at the instruction - it's
> document SPRU172.

Speculation is foolish. To one who scorns voting to decide what time it is, your admonition to RTFM is particularly apt.

Jerry
--
Engineering is the art of making what you want from things you can get.
Hi Brian,

Thanks for this detailed response. You may have something here, but
I'm having a helluva time trying to understand your notation. I also
don't understand how you're shifting new data into your arrays. 

Would you mind re-expressing using actual memory indexes and
explaining exactly how new data is shifted into memory each
sample?

--Randy



--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
Randy Yates wrote:
> Hi Brian,
>
> Thanks for this detailed response. You may have something here, but
> I'm having a helluva time trying to understand your notation. I also
> don't understand how you're shifting new data into your arrays.
>
> Would you mind re-expressing using actual memory indexes and
> explaining exactly how new data is shifted into memory each
> sample?

I can try. In the example below I still use an 8-tap filter. If we consider a memory block in x/y memory at address 0, it will look like this for time instances t=0 and 1 (seeing this will hopefully let you correlate it with my notation):

          t=0            !        t=1
       X       Y         !     X       Y
@0   x[n]    x[n-7]      !   x[n-1]  x[n]
@1   x[n-2]  x[n-5]      !   x[n-3]  x[n-6]
@2   x[n-4]  x[n-3]      !   x[n-5]  x[n-4]
@3   x[n-6]  x[n-1]      !   x[n-7]  x[n-2]

or, to take absolute sample numbers (first eight samples):

          t=0            !        t=1
       X       Y         !     X       Y
@0   x[0]    x[-7]       !   x[0]    x[1]
@1   x[-2]   x[-5]       !   x[-2]   x[-5]
@2   x[-4]   x[-3]       !   x[-4]   x[-3]
@3   x[-6]   x[-1]       !   x[-6]   x[-1]

          t=2            !        t=3
       X       Y         !     X       Y
@0   x[0]    x[1]        !   x[0]    x[1]
@1   x[-2]   x[-5]       !   x[-2]   x[3]
@2   x[-4]   x[-3]       !   x[-4]   x[-3]
@3   x[2]    x[-1]       !   x[2]    x[-1]

          t=4            !        t=5
       X       Y         !     X       Y
@0   x[0]    x[1]        !   x[0]    x[1]
@1   x[-2]   x[3]        !   x[-2]   x[3]
@2   x[4]    x[-3]       !   x[4]    x[5]
@3   x[2]    x[-1]       !   x[2]    x[-1]

          t=6            !        t=7
       X       Y         !     X       Y
@0   x[0]    x[1]        !   x[0]    x[1]
@1   x[6]    x[3]        !   x[6]    x[3]
@2   x[4]    x[5]        !   x[4]    x[5]
@3   x[2]    x[-1]       !   x[2]    x[7]

So x[n] is always the newest sample (it must be inserted into the memory block prior to filtering, of course) and is denoted [0] in my notation. [1] is then x[n-1], [2] is x[n-2] and so forth. I hope you can see the memory layout now at the 8 time instances. (In the bracket notation, the lowest address is leftmost.)

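The bank contents at each time instance can also be generated mechanically, which may be easier to follow than the tables. The sketch below is only an illustration: `bank_snapshots` is a made-up name, each memory cell stores the absolute sample number k of x[k] rather than a sample value, and the pointer adjustments (X pointer stepped back after an even sample, Y pointer stepped forward after an odd sample, both circular) are taken from the pseudocode earlier in the thread.

```python
def bank_snapshots(steps=8, half=4):
    """Return [(xmem, ymem), ...] after the insertion at t = 0 .. steps-1.

    Each cell holds the absolute sample number k of x[k].
    """
    # History before t=0: X holds the even lags, Y the odd lags in reverse.
    xmem = [-2 * i for i in range(half)]                   # x[0], x[-2], x[-4], x[-6]
    ymem = [-(2 * half - 1) + 2 * i for i in range(half)]  # x[-7], x[-5], x[-3], x[-1]
    px = py = 0
    snaps = []
    for t in range(steps):
        if t % 2 == 0:
            xmem[px] = t                 # even sample overwrites the oldest X cell
            px = (px - 1) % half         # X pointer moves back afterwards
        else:
            ymem[py] = t                 # odd sample overwrites the oldest Y cell
            py = (py + 1) % half         # Y pointer moves forward afterwards
        snaps.append((list(xmem), list(ymem)))
    return snaps


for t, (x, y) in enumerate(bank_snapshots()):
    print(f"t={t}: X={x} Y={y}")
```

Each new sample lands on the cell that held the oldest sample, so nothing is ever shifted; only the two pointers move.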
>> The key is to get data aligned in x/y memory so that we can form the
>> sums inside the () by the FIRS instruction (the same pointer update
>> for both X and Y). This can be done by forming data the following way
>> (8 consecutive samples shown, the rest is a repetition). Note that
>> each iteration replaces [7] with a new sample, that then becomes the
>> newest ([0]), while the other samples get "older".
>>
>>         *                              *
>> n=0: x [0][2][4][6] | n=1: x [1][3][5][7]
>>      y [7][5][3][1] |      y [0][6][4][2]
>>         *                     *
>>                  *                  *
>> n=2: x [2][4][6][0] | n=3: x [3][5][7][1]
>>      y [1][7][5][3] |      y [2][0][6][4]
>>            *                     *
>>               *                  *
>> n=4: x [4][6][0][2] | n=5: x [5][7][1][3]
>>      y [3][1][7][5] |      y [4][2][0][6]
>>               *                     *
>>            *                  *
>> n=6: x [6][0][2][4] | n=7: x [7][1][3][5]
>>      y [5][3][1][7] |      y [6][4][2][0]
>>                  *                     *
>>
>> If we start the pointers at the stars, a circular increment by 1 will
>> implement the desired functionality, provided that the coefficient
>> vector has the form [ c[0] c[2] c[3] c[1] ] for even samples and
>> [ c[0] c[1] c[3] c[2] ] for odd samples, meaning that we need M
>> coefficients (even though M/2 describes the filter fully). Maintaining
>> the pointers and data input is a little tricky, but can be done with
>> two circular pointers (modulo 4 in this case). If we denote these two
>> index pointers px and py, and index the coefficients by pc (a linear
>> index pointer) the following pseudocode implements our filter (how to
>> map this to the TI is left as an exercise). The loop processes two
>> samples at a time

So in order to get the filter to run correctly, the pointers into X and Y memory must start at the locations that are marked by stars above prior to startup of the FIRS sequence, but after putting a new sample into the block. If we use AR1 for X indexing and AR2 for Y indexing, that means that they should start according to the following table at t=0..7 for an 8-tap filter (again assuming that the memory block starts at addr 0). Also, the new samples should be injected at the positions in the I column prior to running the filter:

t   AR1  AR2  I
0    0    0   X0
1    3    0   Y0
2    3    1   X3
3    2    1   Y1
4    2    2   X2
5    1    2   Y2
6    1    3   X1
7    0    3   Y3

As you can see, AR1 should be decremented every other sample (after executing FIRS), and AR2 should be incremented every other sample - which is why I wrote the pseudocode to process two samples at a time. Also, the AR registers can be used one at a time to inject new samples (prior to executing FIRS), even samples by AR1 and odd samples by AR2, as you can see from the indices where new samples should be injected in each bank. This pattern extends to any odd-order FIRS (even number of taps), so the pseudocode I have below actually is valid for all even M.

I use px for denoting a pointer to X memory - that would be AR1 in the above table. Similarly, py would be AR2 in the above table. The ++ is circular, and so is the --.

The coefficients are tricky, but you will be able to see why they need to be different for even and odd samples by tracking the pointers through the patterns above (or just the tables I made in this new post). You will note that they always start out at the newest and oldest sample and proceed forward in memory. When the FIRS instructions are executed for even samples, they compute in this order:

c[0](x[n-0]+x[n-7])+
c[2](x[n-2]+x[n-5])+
c[3](x[n-3]+x[n-4])+
c[1](x[n-1]+x[n-6])

and for odd samples:

c[0](x[n-0]+x[n-7])+
c[1](x[n-1]+x[n-6])+
c[3](x[n-3]+x[n-4])+
c[2](x[n-2]+x[n-5])

[Hmmm.... Looking at this again, it looks like you can do with the [0 2 3 1] version of the coeffs if you set the stride in AR0 to -1 for odd samples when executing the FIRS. It shouldn't matter which way you do it, since the pointers start and end the same place both going forward and backward ... Maybe ... It is late here in Europe... #-]

Looking again at the pseudocode - the new comments refer to the first time the loop is executed:

>> px=py=0   # Actually - if you are not doing block processing with a
>>           # block size modulo M, these should be stored/restored for
>>           # each sample block. The FIRS approach in here requires at
>>           # least 2 samples per block.
>>
>> do (an_even_number_of_samples){
>>     # Insert a sample in the delay line

px is 0, and should be used according to my table to insert x[0] into memory.

>>     xmem[px]=new_sample_even
>>     B=0
>>     pc=0
>>     do (M/2){
>>         B+=coeffs[pc++]*(xmem[px++]+ymem[py++])   # the FIRS instruction
>>     }
>>     # Output B to the appropriate place here
>>     somemem[outputptr++]=B
>>     # Adjust the X pointer according to the scheme above
>>     px--

px is now 3 (assuming t=0), which is what it should be for t=1, which is started below. x[1] should be inserted at Y0, which is what is happening below:

>>     # Insert another sample in the delay line
>>     ymem[py]=new_sample_odd
>>     # Note that we do NOT reset pc here since a different coeff order
>>     # is needed for the odd samples
>>     B=0

Now here is where I think you could do px--/py-- in order to use the same coefficients as for even samples.

>>     do (M/2){
>>         B+=coeffs[pc++]*(xmem[px++]+ymem[py++])   # the FIRS instruction
>>                                                   # strikes again
>>     }
>>     # Output one more B
>>     somemem[outputptr++]=B
>>     # Adjust the Y pointer according to the scheme above

Again considering the first round, we increment py so that it points to 1. Now px,py is 3,1, which is what it should be in order to process x[2] according to the table. px is used at the beginning of the next round to insert the sample into X3 (again correct according to the table), while py is used to insert x[3] into Y1 in the next round.

>>     py++
>> }

I hope this made it a little more clear - otherwise post again; it would be a waste of both your time and mine if we gave up now ;)

--
Brian
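The hunch above, that a single [0 2 3 1] coefficient table suffices if the odd-sample pass walks both pointers backward, can be tried out in simulation. What follows is a sketch under several assumptions, not verified 54x behaviour: `firs_one_table` and `direct_fir` are made-up names, the stride sign is simply flipped per sample instead of modelling AR0, the history starts as zeros, and arithmetic is plain integers. The coefficient order is computed as min(2i, M-1-2i), which reproduces [0 2 3 1] for 8 taps and is assumed to carry over to other even M.

```python
def firs_one_table(samples, c):
    """Two-bank scheme with one coefficient table; c holds M/2 coefficients."""
    half = len(c)
    m = 2 * half
    # Even-sample coefficient order from the post, e.g. [0, 2, 3, 1] for M=8.
    order = [min(2 * i, m - 1 - 2 * i) for i in range(half)]
    coeffs = [c[j] for j in order]
    xmem, ymem = [0] * half, [0] * half   # assumption: zero initial history
    px = py = 0
    out = []
    for t, s in enumerate(samples):
        step = 1 if t % 2 == 0 else -1    # forward on even, backward on odd
        if step == 1:
            xmem[px] = s                  # even samples go into the X bank
        else:
            ymem[py] = s                  # odd samples go into the Y bank
        acc = 0
        for coeff in coeffs:              # stands in for the repeated FIRS
            acc += coeff * (xmem[px] + ymem[py])
            px = (px + step) % half
            py = (py + step) % half
        out.append(acc)
        if step == 1:
            px = (px - 1) % half          # circular px-- after an even sample
        else:
            py = (py + 1) % half          # circular py++ after an odd sample
    return out


def direct_fir(samples, c):
    """Reference: plain convolution with the full symmetric tap vector."""
    taps = list(c) + list(c)[::-1]
    return [sum(t * samples[n - k] for k, t in enumerate(taps) if n - k >= 0)
            for n in range(len(samples))]
```

In this model the two functions agree for the 8-tap case and for a 6-tap case, which supports the idea that halving the coefficient storage is possible when the stride can be negated on alternate samples. Whether that is cheap to arrange on the actual device is a separate question.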
Brian Dam Pedersen <brian.pedersen@mail.danbbs.dk> writes:

> Randy Yates wrote:
>> Hi Brian,
>> Thanks for this detailed response. You may have something here, but
>> I'm having a helluva time trying to understand your notation. I also
>> don't understand how you're shifting new data into your
>> arrays. Would you mind re-expressing using actual memory indexes and
>> explaining exactly how new data is shifted into memory each
>> sample?
>
> I can try. In the example below I still use an 8 tap filter. If we
> consider a memory block in x/y memory at address 0, it will look like
> this for time instances t=0 and 1 (seeing this will hopefully make you
> able to correlate this to my notation):
>
>           t=0            !        t=1
>        X       Y         !     X       Y
> @0   x[n]    x[n-7]      !   x[n-1]  x[n]
> @1   x[n-2]  x[n-5]      !   x[n-3]  x[n-6]
> @2   x[n-4]  x[n-3]      !   x[n-5]  x[n-4]
> @3   x[n-6]  x[n-1]      !   x[n-7]  x[n-2]
>
> or to take absolute sample numbers (first eight samples):
>
>           t=0            !        t=1
>        X       Y         !     X       Y
> @0   x[0]    x[-7]       !   x[0]    x[1]
> @1   x[-2]   x[-5]       !   x[-2]   x[-5]
> @2   x[-4]   x[-3]       !   x[-4]   x[-3]
> @3   x[-6]   x[-1]       !   x[-6]   x[-1]
>
>           t=2            !        t=3
>        X       Y         !     X       Y
> @0   x[0]    x[1]        !   x[0]    x[1]
> @1   x[-2]   x[-5]       !   x[-2]   x[3]
> @2   x[-4]   x[-3]       !   x[-4]   x[-3]
> @3   x[2]    x[-1]       !   x[2]    x[-1]
>
>           t=4            !        t=5
>        X       Y         !     X       Y
> @0   x[0]    x[1]        !   x[0]    x[1]
> @1   x[-2]   x[3]        !   x[-2]   x[3]
> @2   x[4]    x[-3]       !   x[4]    x[5]
> @3   x[2]    x[-1]       !   x[2]    x[-1]
>
>           t=6            !        t=7
>        X       Y         !     X       Y
> @0   x[0]    x[1]        !   x[0]    x[1]
> @1   x[6]    x[3]        !   x[6]    x[3]
> @2   x[4]    x[5]        !   x[4]    x[5]
> @3   x[2]    x[-1]       !   x[2]    x[7]

Still utterly confused. What are the "@x", x = 0, 1, 2, 3, at the left? Is that the coefficient index? Is a new sample shifted in each time into x[0], and the remaining samples shifted down? Specifically, for each new sample do we do the following:

x[-7] = x[-6]
x[-6] = x[-5]
x[-5] = x[-4]
x[-4] = x[-3]
x[-3] = x[-2]
x[-2] = x[-1]
x[-1] = x[0]
x[0] = a

? If so, why wouldn't the pattern be constant? I.e., why do we need these two columns? Is t time?

Perhaps it would be easier just to write the code (in 54x assembly) and post it?

Here is the solution I settled on. Essentially I created a buffer of length N+L-1, where L is the length of the filter and N is the block length, copy the new data in, and then do a straight, non-circularly-indexed convolution within it. In my design, L = 161 and N = 160. The buffer is arranged so that the first word of each new block of 160 words is located at relative buffer address pcmBuffer+

;
; Get the input data buffer address into AR1
;
        STM     #pcmBuffer+80, AR1                ; [2]
;
; Set arithmetic modes required for this convolution
;
        RSBX    FRCT                              ; [1] no left shift by one on multiply
        RSBX    CMPT                              ; [1] no MAR compatibility mode
        SSBX    SXM                               ; set SXM for FIRS
;
; Perform the FIR filtering:
;
FIRFilter:
;
; b. setup block repeat count
;
        STM     #PCM_BLOCK_SIZE-1, BRC            ; [2]
;
; d. convolve
;
        RPTB    BbConvolveEnd-1                   ; [4]
;
; e. Make initial computation for FIRS and setup data pointers AR2 and AR3
;
        MVMM    AR1, AR2                          ; [1*]
        MVMM    AR1, AR3                          ; [1*]
        LD      *AR2-, 16, A                      ; [1*]
        MAR     *AR3+                             ; [1*]
;
        RPTZ    B, #((BB_FILTER_SIZE-1)/2)+1-1    ; [2*]
        FIRS    *AR2-, *AR3+, #filterCoefficients ; [(FILTER_SIZE-1)/2*]
;
        SFTA    B, 16-15                          ; [1*]
        SAT     B                                 ; [1*]
        STH     B, *AR7+                          ; [1*]
        LD      *AR1+, A                          ; [1*] bogus read - just get AR1 modified
                                                  ; [91*160]
;
BbConvolveEnd:
;
; Move the new data into the old data:
;
        STM     #pcmBuffer+PCM_BLOCK_SIZE, AR2    ; [2]
        STM     #pcmBuffer, AR3                   ; [2]
        RPT     #PCM_BLOCK_SIZE-1                 ; [1]
        MVDD    *AR2+, *AR3+                      ; [PCM_BLOCK_SIZE]
;
        RET                                       ; [5]
                                                  ; 182 + 91*160 = 14742
--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
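The buffer management in this solution (a linear buffer of N+L-1 words, new block copied in behind the history, plain convolution, then the tail moved to the front) can be modelled in a few lines. The sketch below is an assumption-laden illustration rather than a translation of the assembly: `make_block_fir` is a hypothetical helper, the convolution is written as an ordinary causal sum instead of the centre-out FIRS form, scaling and saturation are left out, and the history length is taken as L-1 in general (which equals N in the L = 161, N = 160 design).

```python
def make_block_fir(taps, block_len):
    """Return a function that filters one block at a time.

    State is a linear buffer of len(taps) - 1 + block_len words:
    [ history (L-1 words) | current block (N words) ].
    """
    hist = len(taps) - 1
    buf = [0] * (hist + block_len)        # assumption: zero initial history

    def process(block):
        assert len(block) == block_len
        buf[hist:] = block                # copy the new data in behind the history
        out = [sum(taps[k] * buf[hist + i - k] for k in range(len(taps)))
               for i in range(block_len)] # straight, non-circular convolution
        buf[:hist] = buf[block_len:]      # move the newest L-1 words to the front
        return out

    return process
```

The price of avoiding circular addressing is the block copy at the end of every call, which corresponds to the `RPT`/`MVDD` pair in the listing; with long blocks that cost is small next to the convolution itself.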
Randy Yates <randy.yates@sonyericsson.com> writes:

> block of 160 words is located at relative buffer address pcmBuffer+
160. :)
--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124