comp.dsp | optimizing image processing - control statement in loop prevents sw pipelining

hi,

I have a problem optimizing a loop on the TMS320C6414 DSP.
It is an image processing algorithm and the loop contains a
control statement (if), so software pipelining is not possible.

The if statement checks the lower and upper limits for the
intensity of each pixel. Here is a pseudo code snippet:

for(i = 0; i < scr_mfs->m_nRoiHeight; i++)
{
  [...]
  for(j = 0; j < scr_mfs->m_nRoiWidth; j++, pOut += depth)
  {
    [...] // calculate the value of buf
    if(buf < 0)
      buf = 0;
    else if(buf > MAX_GRAY_SCALE)
      buf = MAX_GRAY_SCALE;

    pOut[0] = buf; //output
  }
}

The data type is short and the algorithm works on 12-bit images.
Is there a way to do this differently?
Are there any instructions that do the check without flushing the
pipeline? Any help is appreciated.

Thank you,

Mike

Reply by Tim Wescott ● February 8, 2005

You may have to do some assembly language programming.

I'm not familiar with the '64xx series, so this is fairly general. 
First check to see if there are conditionally executable instructions: 
the ADSP21xx series will let you do a load depending on a flag, for 
instance.  This would solve your problem right quick.

If that fails, and if your buffer value calculation fits into the MAC 
paradigm then scale things so that MAX_GRAY_SCALE maps to 0xffffffff (or 
0xffff) and 0 maps to zero.  Then you should be able to do an 
accumulator saturate to hit 0 and 0xffffffff and shift things down to 
MAX_GRAY_SCALE once you're done.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Reply by maikeru ● February 8, 2005

Thx Tim,

I've already done quite a bit of assembly programming in this project,
but so far I am not aware of any conditionally executable instructions
on the C64xx.

I did not quite get the part about the MAC paradigm. I understand the
idea of saturating and shifting back, and I was actually looking for an
approach like this, but I don't know exactly how to do it.

Thank you again for the quick answer,

Mike

Reply by Tim Wescott ● February 8, 2005

Basically if you arrive at the value of 'buf' through a series of 
multiply-accumulates, like:

buf = sum from {n=0} to {N-1} {inputData_n * coefficient_n}

That'll be executed as a vector dot product, and can be done entirely in 
a MAC instruction.  All you would have to do in that case would be to 
adjust the values of the coefficients to do what I was suggesting.

At worst you may be able to do it with a single MAC instruction:

* Get your value for buf
* clear the accumulator
* multiply by 0xffffffff / MAX_GRAY_SCALE
* saturate
* shift (or multiply) down to MAX_GRAY_SCALE
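
The steps above can be sketched in portable C. This is only an illustration under stated assumptions: MAX_GRAY_SCALE is taken to be 4095 (12-bit pixels), the scale factor is rounded to a power of two (2^20) so that the final step is a plain shift, and sat_u32() is a hypothetical stand-in for a hardware unsigned accumulator saturate. On a real DSP that saturate would be a single instruction, not the comparisons used here to emulate it.

```c
#include <stdint.h>

#define MAX_GRAY_SCALE 4095   /* assumption: 12-bit pixels */
#define SCALE_SHIFT    20     /* assumption: 32 - 12, so 4095 maps to 0xFFF00000 */

/* Hypothetical helper: emulates an unsigned 32-bit accumulator saturate. */
static uint32_t sat_u32(int64_t acc)
{
    if (acc < 0)
        return 0u;
    if (acc > (int64_t)UINT32_MAX)
        return UINT32_MAX;
    return (uint32_t)acc;
}

/* Scale up, saturate at full scale, then shift back down to pixel range. */
static int16_t clamp_by_saturation(int32_t buf)
{
    int64_t scaled = (int64_t)buf * ((int64_t)1 << SCALE_SHIFT);
    uint32_t sat = sat_u32(scaled);
    return (int16_t)(sat >> SCALE_SHIFT);
}
```

With these assumptions, any negative buf ends up at 0 and anything above 4095 ends up at 4095, while in-range values pass through unchanged.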

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Reply by dan ● February 8, 2005

You can do conditional execution on the C6x.  Every instruction can be
executed conditionally based on A1, A2, B0, B1 or B2.  Read the manual
(see page 3.13 of the C62x instruction set).
So do a CMP, put the result in one of the above registers, and then use
that register as the condition on the instruction you want to execute.
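
At the C level, one way to give the compiler a chance to use conditional execution is to write the clamp as two selects instead of an if/else chain. The following is a minimal sketch, not the original poster's code: clamp_pixel(), clamp_row() and the input array are hypothetical names, and MAX_GRAY_SCALE is assumed to be 4095. Whether a particular compiler really emits predicated instructions for this is an assumption that should be checked in the generated assembly.

```c
#include <stdint.h>

#define MAX_GRAY_SCALE 4095   /* assumption: 12-bit pixels */

/* Two independent selects; no else-branch that has to be jumped over. */
static int32_t clamp_pixel(int32_t buf)
{
    buf = (buf < 0) ? 0 : buf;
    buf = (buf > MAX_GRAY_SCALE) ? MAX_GRAY_SCALE : buf;
    return buf;
}

/* Hypothetical inner loop: clamps one row, stepping the output by depth. */
void clamp_row(const int32_t *in, int16_t *pOut, int width, int depth)
{
    int j;
    for (j = 0; j < width; j++, pOut += depth) {
        pOut[0] = (int16_t)clamp_pixel(in[j]);
    }
}
```

The loop body now has no control flow of its own, which is the property software pipelining needs.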


Reply by Tim Wescott ● February 8, 2005

This is a much better answer than mine, unless you're already doing
MACs and are poised to just saturate anyway -- perhaps even then.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Reply by maikeru ● February 9, 2005

Hi guys,

thank you for your answers. I was thinking them over
(because I'm working on packed 16-bit values) and reading
the manuals again when I stumbled over the
MIN2 and MAX2 instructions (spru189, 5-118 & 5-124)
for the 64xx. With their help I don't need to worry about
anything: they check against a lower and upper
boundary and return the right results automatically.
I implemented it and it works just fine.
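
A rough sketch of the idea in portable C, for two signed 16-bit pixels packed into one 32-bit word. The helpers max2_emul() and min2_emul() are hypothetical stand-ins that emulate a packed 16-bit maximum and minimum; on the TI compiler the corresponding operations are, as far as I know, available as the _max2 and _min2 intrinsics, but that mapping and MAX_GRAY_SCALE = 4095 are assumptions here, not the code actually used.

```c
#include <stdint.h>

#define MAX_GRAY_SCALE 4095   /* assumption: 12-bit pixels */

/* Extract one 16-bit lane and sign-extend it without branching. */
static int32_t lane_s16(uint32_t x, int shift)
{
    return ((int32_t)((x >> shift) & 0xFFFFu) ^ 0x8000) - 0x8000;
}

/* Pack two signed 16-bit values into one 32-bit word. */
static uint32_t pack2(int32_t hi, int32_t lo)
{
    return (((uint32_t)hi & 0xFFFFu) << 16) | ((uint32_t)lo & 0xFFFFu);
}

/* Hypothetical emulation of a packed signed 16-bit maximum. */
static uint32_t max2_emul(uint32_t a, uint32_t b)
{
    int32_t hi = lane_s16(a, 16) > lane_s16(b, 16) ? lane_s16(a, 16) : lane_s16(b, 16);
    int32_t lo = lane_s16(a, 0) > lane_s16(b, 0) ? lane_s16(a, 0) : lane_s16(b, 0);
    return pack2(hi, lo);
}

/* Hypothetical emulation of a packed signed 16-bit minimum. */
static uint32_t min2_emul(uint32_t a, uint32_t b)
{
    int32_t hi = lane_s16(a, 16) < lane_s16(b, 16) ? lane_s16(a, 16) : lane_s16(b, 16);
    int32_t lo = lane_s16(a, 0) < lane_s16(b, 0) ? lane_s16(a, 0) : lane_s16(b, 0);
    return pack2(hi, lo);
}

/* Clamp both pixels to [0, MAX_GRAY_SCALE]: max against 0, then min against the top. */
static uint32_t clamp2(uint32_t packed)
{
    const uint32_t lo_bound = pack2(0, 0);
    const uint32_t hi_bound = pack2(MAX_GRAY_SCALE, MAX_GRAY_SCALE);
    return min2_emul(max2_emul(packed, lo_bound), hi_bound);
}
```

Each call handles two pixels and contains no loop-level control flow, so it should not get in the way of software pipelining.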

Thx again for your help,

Cheers,

Mike

http://mmatter.gmxhome.de/
