DSPRelated.com
Forums

TI 54x FIRS Not Compatible with Circular Buffering?

Started by Unknown September 10, 2004
The FIRS instruction,

  FIRS  xmem, ymem, pmad

doesn't seem to be compatible with circular buffering! The problem
is the limited addressing modes available with dual-memory operands,
namely, just

  *ARx
  *ARx-     ; post decrement
  *ARx+     ; post increment
  *ARx+0%   ; post increment circularly by AR0 amount

The problem is that, for a symmetric FIR filter operating on
data in a circular buffer, xmem must increment circularly
while ymem decrements circularly (or vice-versa), AND YOU
CAN'T DO BOTH IN THIS INSTRUCTION. 

Unless I'm missing something. Can someone please confirm or
set me straight?

I acknowledge that one can arrange the data in the buffer
so that circular addressing isn't required, but this eats
cycles. The relative percentage of degradation of such spurious
copying depends on the filter size. 

I just find it strange that, since great pains were taken in 
the architecture to support circular addressing, it can't
be used with such an important instruction. 
-- 
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
Randy Yates wrote:

> The problem is that, for a symmetric FIR filter operating on
> data in a circular buffer, xmem must increment circularly
> while ymem decrements circularly (or vice-versa), AND YOU
> CAN'T DO BOTH IN THIS INSTRUCTION.

I'm guessing that the data are in xmem and coefficients in ymem (or the
other way around). I don't know why the coefficient addressing needs to
be circular. Can you explain?

Jerry
-- 
Engineering is the art of making what you want from things you can get.
Jerry Avins <jya@ieee.org> writes:

> I'm guessing that the data are in xmem and coefficients in ymem (or the
> other way around). I don't know why the coefficient addressing needs to
> be circular. Can you explain?

Hi Jerry,

xmem and ymem both point to data; pmad points to the coefficients. (pmad
stands for "program memory address".)

This instruction computes the following, in C meta code:

  B += A * *(pmad+n);
  A = *(xmem++) + *(ymem--);

where n is incremented by one each time you repeat the instruction, and
I'm assuming the corresponding addressing forms for xmem and ymem
as shown in the code above (i.e., *ARx+, *AR7-).

You see the idea? The data on both sides of the symmetric FIR are
added first, then multiplied by the one coefficient. This saves
MIPS (since this instruction can be done in 1 cycle) AND memory,
since you only have to store ((M - 1) / 2) + 1 coefficients for an
M-tap filter. Note that you must precompute the first coefficient's
multiplication and first data sum before entering the repeat loop
with FIRS.
-- 
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
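The arithmetic described above can be modelled in plain C. What follows is only a sketch under stated assumptions, not TI's implementation: the accumulators are ordinary 64-bit integers (the real A and B are 40-bit, and FIRS works on the high half of A), the filter length n is even, the data sit in a linear buffer, and A is preloaded with the first pair's sum before the loop (the preload is revisited later in the thread). The names acc_t, firs_step, fir_symmetric and fir_direct are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified accumulator pair (assumption: plain 64-bit integers). */
typedef struct {
    int64_t a;  /* holds the current pair sum */
    int64_t b;  /* running filter output */
} acc_t;

/* One FIRS step as described in the post: multiply-accumulate with the
   sum already in A, then reload A with the next pair of samples. */
static void firs_step(acc_t *acc, int16_t x, int16_t y, int16_t coef)
{
    acc->b += acc->a * coef;
    acc->a = (int32_t)x + (int32_t)y;
}

/* Symmetric FIR over a linear buffer d[0..n-1], n even, using only the
   n/2 unique coefficients h[0..n/2-1]. */
static int64_t fir_symmetric(const int16_t *d, const int16_t *h, int n)
{
    assert(n >= 2 && n % 2 == 0);
    int xi = 0;      /* walks up from the first sample  */
    int yi = n - 1;  /* walks down from the last sample */
    acc_t acc = { (int32_t)d[xi] + (int32_t)d[yi], 0 };  /* preload A */
    for (int k = 0; k < n / 2; k++) {
        xi++;
        yi--;
        /* The pair fetched on the last pass is loaded but never used. */
        firs_step(&acc, d[xi], d[yi], h[k]);
    }
    return acc.b;
}

/* Reference: one multiply per tap, mirroring h for the upper half. */
static int64_t fir_direct(const int16_t *d, const int16_t *h, int n)
{
    int64_t sum = 0;
    for (int i = 0; i < n; i++) {
        int k = (i < n / 2) ? i : n - 1 - i;
        sum += (int64_t)d[i] * h[k];
    }
    return sum;
}
```

For d = {1, 2, 3, 4, 5, 6} and h = {1, 2, 3}, both functions return 42, but fir_symmetric needs only three multiplies where fir_direct needs six.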
Randy Yates wrote:

   ...

> You see the idea? The data on both sides of the symmetric FIR are
> added first, then multiplied by the one coefficient. This saves
> MIPS (since this instruction can be done in 1 cycle) AND memory
> since you only have to store ((M - 1) / 2) + 1 coefficients.

Gotcha, Randy; thanks. I see where it saves space, but not time if an
addition takes as long as a MAC. Is it a memory access thing?

Jerry
-- 
Engineering is the art of making what you want from things you can get.
Jerry Avins <jya@ieee.org> writes:

> Gotcha, Randy; thanks. I see where it saves space, but not time if an
> addition takes as long as a MAC.

But in this case it doesn't. The addition and the MAC are ALL done in
1 cycle. Pretty slick, eh? Them folks at TI sure are smart. Except that
they forgot how to make it work with circular addressing...
-- 
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
Randy Yates wrote:

> But in this case it doesn't. The addition and the MAC are ALL done in
> 1 cycle. Pretty slick, eh? Them folks at TI sure are smart. Except that
> they forgot how to make it work with circular addressing...

Have you asked the application engineer for the product?

Jerry
-- 
Engineering is the art of making what you want from things you can get.
Randy Yates <randy.yates@sonyericsson.com> writes:

> Note that you must precompute the first coefficient's multiplication and
> first data sum before entering the repeat loop with FIRS.

That's wrong. All you have to do is preload A with your first
data value.
-- 
%  Randy Yates                  % "...the answer lies within your soul
%% Fuquay-Varina, NC            %       'cause no one knows which side
%%% 919-577-9882                %                   the coin will fall."
%%%% <yates@ieee.org>           %  'Big Wheels', *Out of the Blue*, ELO
http://home.earthlink.net/~yatescr
Randy Yates wrote:

> That's wrong. All you have to do is preload A with your first
> data value.

Thanks for clearing it up. It's a side issue I hadn't intended to ask about.

Jerry
-- 
Engineering is the art of making what you want from things you can get.
Randy Yates wrote:

> The problem is that, for a symmetric FIR filter operating on
> data in a circular buffer, xmem must increment circularly
> while ymem decrements circularly (or vice-versa), AND YOU
> CAN'T DO BOTH IN THIS INSTRUCTION.

Not knowing the 54xx very well, I might be shooting in the wild here,
but wouldn't it help to simply reverse the ymem buffer in memory, so
that both buffers will require a circular increment?

-- Brian
Brian Dam Pedersen <brian.pedersen@mail.danbbs.dk> writes:

> Not knowing the 54xx very well, I might, be shooting in the wild here,
> but wouldn't it help to simply reverse the ymem buffer in memory, so
> that both buffers will require a circular increment ?

Hi Brian,

That would work, but you'd have to spend (N-1)/2 cycles for every
output data point rearranging the data buffer, and then you've lost
the computational advantage. It would also require an extra (N-1)/2
words of memory.

I don't mean to put you off, but go read the responses I had to Jerry
Avins and understand what this instruction is doing. A key point is
that xmem and ymem are pointing to symmetric points about a center
point in the input data buffer, so every time you move a notch (a
coefficient) you must increment one and decrement the other.
-- 
%  Randy Yates                  % "So now it's getting late,
%% Fuquay-Varina, NC            %    and those who hesitate
%%% 919-577-9882                %    got no one..."
%%%% <yates@ieee.org>           %  'Waterfall', *Face The Music*, ELO
http://home.earthlink.net/~yatescr
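To make the two alternatives concrete, here is a rough C sketch of both. The first function is the idealized walk, with one index incrementing circularly and the other decrementing circularly; it models what the filter needs, not what FIRS can address. The second models Brian's suggestion by first copying the newer half of the data in reverse order, so that only increments remain. The function names, the oldest index convention, the rev scratch buffer and the even length n are all assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Idealized symmetric FIR over a circular buffer of n samples (n even).
   'oldest' is the index of the oldest sample; the newest sits just
   before it, modulo n. h holds the n/2 unique coefficients. */
static int64_t fir_sym_circular(const int16_t *buf, int n, int oldest,
                                const int16_t *h)
{
    assert(n >= 2 && n % 2 == 0 && oldest >= 0 && oldest < n);
    int xi = oldest;                /* steps forward, wrapping at n  */
    int yi = (oldest + n - 1) % n;  /* steps backward, wrapping at 0 */
    int64_t sum = 0;
    for (int k = 0; k < n / 2; k++) {
        sum += ((int32_t)buf[xi] + (int32_t)buf[yi]) * (int64_t)h[k];
        xi = (xi + 1) % n;          /* circular increment */
        yi = (yi + n - 1) % n;      /* circular decrement */
    }
    return sum;
}

/* Reversed-copy variant: rev must have room for n/2 samples. */
static int64_t fir_sym_reversed(const int16_t *buf, int n, int oldest,
                                const int16_t *h, int16_t *rev)
{
    assert(n >= 2 && n % 2 == 0 && oldest >= 0 && oldest < n);
    /* Extra pass per output: newer half, newest first, into rev. */
    for (int k = 0; k < n / 2; k++)
        rev[k] = buf[(oldest + n - 1 - k) % n];
    int xi = oldest;
    int64_t sum = 0;
    for (int k = 0; k < n / 2; k++) {
        sum += ((int32_t)buf[xi] + (int32_t)rev[k]) * (int64_t)h[k];
        xi = (xi + 1) % n;          /* only increments are needed now */
    }
    return sum;
}
```

Both functions give the same result, but the second runs an extra loop of n/2 iterations and needs n/2 extra words for every output sample. Taking the figures in the post at face value, roughly (N-1)/2 cycles of copying plus roughly (N-1)/2 cycles of FIRS comes to about N-1, which is close to the N cycles that a plain one-MAC-per-tap loop would take.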