DSPRelated.com
Forums

optimizing image processing - control statement in loop prevents sw pipelining

Started by maikeru February 8, 2005
hi,

I have a problem optimizing a loop on the TMS320C6414 DSP.
It is an image processing algorithm and the loop contains a
control statement (if), so software pipelining is not possible.

The if statement checks the lower and upper bounds for the
intensity of each pixel. Here is a pseudocode snippet:

for(i = 0; i < scr_mfs->m_nRoiHeight; i++)
{
  [...]
  for(j = 0; j < scr_mfs->m_nRoiWidth; j++, pOut += depth)
  {
    [...] // calculate the value of buf
    if(buf < 0)
      buf = 0;
    else if(buf > MAX_GRAY_SCALE)
      buf = MAX_GRAY_SCALE;

    pOut[0] = buf; //output
  }
}

The data type is short and the algo is working on 12bit images.
Is there a way to do that differently?
Are there any instructions to do the check that won't flush the
pipeline? Any help is appreciated. 

Thank you,

Mike

You may have to do some assembly language programming. I'm not
familiar with the '64xx series, so this is fairly general.

First check to see if there are conditionally executable
instructions: the ADSP21xx series will let you do a load depending
on a flag, for instance. This would solve your problem right quick.

If that fails, and if your buffer value calculation fits into the
MAC paradigm, then scale things so that MAX_GRAY_SCALE maps to
0xffffffff (or 0xffff) and 0 maps to zero. Then you should be able
to do an accumulator saturate to hit 0 and 0xffffffff and shift
things down to MAX_GRAY_SCALE once you're done.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Thx Tim,

I've already done quite a bit of assembly programming in that project,
but so far I am not aware of any conditionally executable instructions
on the C64xx.

I did not quite get the part about the MAC paradigm. I understand what
you are suggesting with the saturation and shifting back. I was
actually looking for an approach like this, but I don't know exactly
how to do it.

Thank you again for the quick answer,

Mike

Basically, if you arrive at the value of 'buf' through a series of
multiply-accumulates, like:

  buf = sum from {n=0} to {N-1} {inputData_n * coefficient_n}

That'll be executed as a vector dot product, and can be done entirely
in a MAC instruction. All you would have to do in that case would be
to adjust the values of the coefficients to do what I was suggesting.

At worst you may be able to do a single MAC instruction, so

* Get your value for buf
* clear the accumulator
* multiply by 0xffffffff / MAX_PIXEL_VALUE
* saturate
* shift (or multiply) down to MAX_PIXEL_VALUE

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

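The scale, saturate, and shift-down steps can be sketched in portable C. This is only a model under stated assumptions: `saturate_u16` is a hypothetical stand-in for whatever saturating instruction the target offers, a 16-bit unsigned accumulator range is assumed, and the scale factor is taken as a left shift by 4 (so the 12-bit full scale 0xFFF lands at 0xFFF0, just under 0xFFFF) rather than an exact multiply by 0xffff / MAX_PIXEL_VALUE.

```c
#include <stdint.h>

#define MAX_PIXEL_VALUE 4095  /* assumed: 12-bit full scale */
#define SCALE_SHIFT     4     /* assumed: 4095 << 4 = 0xFFF0 */

/* Hypothetical model of a hardware saturate to the range [0, 0xFFFF].
 * On a real DSP this would be one saturating instruction, not C code. */
static uint16_t saturate_u16(int32_t acc)
{
    if (acc < 0)      return 0;
    if (acc > 0xFFFF) return 0xFFFF;
    return (uint16_t)acc;
}

/* Clamp buf to [0, MAX_PIXEL_VALUE] by scaling up, saturating,
 * and shifting back down. */
static int16_t clamp_by_saturation(int16_t buf)
{
    int32_t  acc = (int32_t)buf * (1 << SCALE_SHIFT);  /* scale up */
    uint16_t sat = saturate_u16(acc);                  /* saturate */
    return (int16_t)(sat >> SCALE_SHIFT);              /* shift back down */
}
```

With these assumptions, clamp_by_saturation(5000) yields 4095 and clamp_by_saturation(-7) yields 0. In-range values pass through unchanged, because shifting a 12-bit value up and back down by 4 bits loses nothing.
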
You can do conditional execution on the C6x. Every instruction can
execute conditionally based on A1, A2, B0, B1, or B2. Read the manual
(e.g. page 3.13 of the C62x instruction set).
So do a CMP, put the result in one of the above registers, and then use
that register as the condition for a conditionally executed instruction.
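
To illustrate the same idea at the C level, here is a minimal sketch. It assumes a compiler that turns simple conditional assignments into compare plus predicated instructions instead of branches; whether a given compiler version and option set actually does so is not guaranteed, so check the generated assembly. The names `clamp_pixel` and `clamp_row`, the `in` buffer of precomputed values, and the 12-bit value of MAX_GRAY_SCALE are assumptions for the example.

```c
#include <stdint.h>

#define MAX_GRAY_SCALE 4095  /* assumed: 12-bit images */

/* Each line is a compare followed by a conditional move, which maps
 * naturally onto CMP plus a conditionally executed instruction. */
static int16_t clamp_pixel(int32_t buf)
{
    buf = (buf < 0) ? 0 : buf;
    buf = (buf > MAX_GRAY_SCALE) ? MAX_GRAY_SCALE : buf;
    return (int16_t)buf;
}

/* Hypothetical inner loop: clamp one row of precomputed values and
 * store them with a stride of depth, as in the original snippet. */
static void clamp_row(const int32_t *in, int16_t *pOut, int width, int depth)
{
    int j;
    for (j = 0; j < width; j++, pOut += depth)
        pOut[0] = clamp_pixel(in[j]);
}
```

The loop body has no control flow of its own, only data-dependent selects, which is the shape a software pipeliner generally needs.
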

This is a much better answer than mine, unless you're already doing
MAC's and are poised to just saturate anyway -- perhaps even then.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Hi guys,

thank you for your answers. I was thinking them over
(because I'm working on packed 16-bit values) and reading
the manuals again when I stumbled over the
MIN2 and MAX2 instructions (spru189, 5-118 & 5-124)
for the 64xx. With their help I don't need to worry about
anything: MAX2 checks against the lower boundary, MIN2 against
the upper boundary, and they return the right results automatically.
I implemented it and it works just fine.

Thx again for your help,

Cheers,

Mike

http://mmatter.gmxhome.de/
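
A sketch of the packed clamp, for readers without the hardware at hand. `max2_emul` and `min2_emul` are hypothetical portable stand-ins that mimic what MAX2 and MIN2 do on two signed 16-bit values packed into a 32-bit word; on the C64x the TI compiler exposes the real instructions as the `_max2` and `_min2` intrinsics, which is presumably what one would use instead. The helper names and MAX_GRAY_SCALE = 4095 (from the 12-bit image depth) are assumptions, not code from the thread.

```c
#include <stdint.h>

#define MAX_GRAY_SCALE 4095  /* assumed: 12-bit images */

/* Extract one signed 16-bit lane (shift = 0 or 16) from a packed word. */
static int32_t lane(uint32_t packed, int shift)
{
    int32_t v = (int32_t)((packed >> shift) & 0xFFFFu);
    return (v >= 0x8000) ? v - 0x10000 : v;  /* sign-extend */
}

/* Pack two signed 16-bit values into one 32-bit word. */
static uint32_t pack2(int32_t hi, int32_t lo)
{
    return (((uint32_t)hi & 0xFFFFu) << 16) | ((uint32_t)lo & 0xFFFFu);
}

/* Portable stand-in for MAX2: per-lane signed maximum. */
static uint32_t max2_emul(uint32_t a, uint32_t b)
{
    int32_t hi = lane(a, 16) > lane(b, 16) ? lane(a, 16) : lane(b, 16);
    int32_t lo = lane(a, 0)  > lane(b, 0)  ? lane(a, 0)  : lane(b, 0);
    return pack2(hi, lo);
}

/* Portable stand-in for MIN2: per-lane signed minimum. */
static uint32_t min2_emul(uint32_t a, uint32_t b)
{
    int32_t hi = lane(a, 16) < lane(b, 16) ? lane(a, 16) : lane(b, 16);
    int32_t lo = lane(a, 0)  < lane(b, 0)  ? lane(a, 0)  : lane(b, 0);
    return pack2(hi, lo);
}

/* Clamp two packed pixels to [0, MAX_GRAY_SCALE] at once:
 * MAX2 against 0 enforces the lower bound, MIN2 the upper bound. */
static uint32_t clamp2(uint32_t packed)
{
    uint32_t lower = pack2(0, 0);
    uint32_t upper = pack2(MAX_GRAY_SCALE, MAX_GRAY_SCALE);
    return min2_emul(max2_emul(packed, lower), upper);
}
```

For instance, clamp2(pack2(-5, 5000)) gives pack2(0, 4095): both pixels are clamped in two operations and without any branch.
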