comp.dsp | sqrt on C6414 DSP| page 2

Reply by Michael Schoeberl ●July 1, 20052005-07-01

>> If TI uses the same nomenclature as Analog Devices, the mantissa is 
>> what's left after you shift the number up by the bits counted by the 
>> NORM instruction.

uuuh - I shouldn't have used this word. I do not calculate the mantissa 
explicitly - I just take the "exponent" as an output from NORM and start 
shifting my input.



> To code an accurate square root, compute 1/sqrt(x) iteratively; that can 
> be done with shifts and multiplies. When done, multiply by x.

this takes quite some instructions!
I tried this as well but I needed at least 2 iterations for Newton 
raphson and did not find a really good starting approximation.

I'm down to 16 cycles which can get pipelines quite well.


> For sqrt(x^2+y^2), there's a quicker way than summing the squares and 
> rooting. 

I found this here as well - they take max(x, y) + 0.5*min(x,y)
but I needed to calculate something else


bye,
Michael

Reply by Michael Schoeberl ●July 1, 20052005-07-01

> Even faster on a DSP is to use a 3 to 6 element Taylor's series.  I'd 
> take the "mantissa", scaled to be between 1/4 <= x_m < 1.  Then I'd execute
> 
> y = a_0 + a_1 * x_m + a_2 * x_m^2 ...

 > ... multivariate least-squares fit ...

can anyone tell me how accurate this can get with 3 or 6 coefficients?


bye,
Michael

Reply by Michael Schoeberl ●July 1, 20052005-07-01

> I understand now. It's very clever. I would explain it a bit 
> differently, but that's just a mater of viewpoint. 

thanks - could you give me a hint how you would explain this? (just the 
basic idea)

I came across this by playing around and I still have no idea why I 
shift the sqrt(2) and why I need a constant sqrt(sqrt(2)) to correct 
this in the end!


> Would you like to write it up as a "Trick" for dspguru.com? 
> http://www.dspguru.com/comp.dsp/tricks/call.htm

ok - I'll do this next week ...


bye,
Michael

Reply by Tim Wescott ●July 1, 20052005-07-01

Michael Schoeberl wrote:

>> Even faster on a DSP is to use a 3 to 6 element Taylor's series.  I'd 
>> take the "mantissa", scaled to be between 1/4 <= x_m < 1.  Then I'd 
>> execute
>>
>> y = a_0 + a_1 * x_m + a_2 * x_m^2 ...
> 
> 
>  > ... multivariate least-squares fit ...
> 
> can anyone tell me how accurate this can get with 3 or 6 coefficients?
> 
> 
> bye,
> Michael

With 6 it's good to 15 bits -- it takes about 5 minutes to code it up 
and try it with Matlab, MathCad, Octave, etc.

-- 
-------------------------------------------
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Reply by Michael Schoeberl ●July 1, 20052005-07-01

> With 6 it's good to 15 bits -- it takes about 5 minutes to code it up 
> and try it with Matlab, MathCad, Octave, etc.

thanks - unfortunately this would take me much longer (I'm just an 
engineer without numerical experience) and here on the east coast it's 
time to go home now! ;-)

more on this on Tuesday - have a nice weekend!


bye,
Michael

Reply by john ●July 2, 20052005-07-02


Tim Wescott wrote:
> Jerry Avins wrote:
>
> > Michael Schoeberl wrote:
> >
> >>  >> [NORM function]
> >>
> >>> What makes a bit "redundant"?
> >>
> >>
> >>
> >> okay - this need more details ...
> >>
> >> the DSP has a one cycle instruction (called NORM) that counts
> >> the bits - starting from the left to the first change.
> >>
> >> I pass a 32 bit integer and get get back the number of
> >> leading zeros ... (works for ned number in 2 complement as well so TI
> >> calls the bits from the left 'redundant')
> >>
> >>
> >> the output of that instruction is similar to
> >> 31 - floor(log2(x))
> >
> >
> > That's an interesting instruction. What does it do with negative numbers?
> >
> >>> What mantissa? Didn't you write "fixed point"?   I see the line
> >>> "exp = floor(log2(x)) - 6;" below. How do you take a fixed-point log?
> >>
> >>
> >>
> >> with the help of this special assembly instruction ...
> >
> >
> > Thanks. I understand it all but the mantissa part. What is a fixed-point
> > mantissa?
> >
> If TI uses the same nomenclature as Analog Devices, the mantissa is
> what's left after you shift the number up by the bits counted by the
> NORM instruction.  Analog Devices has designed their normalization
> instruction (which may or may not be NORM; I can't remember) so that it
> can be vectorized -- this lets you do block floating point calculations
> which makes oodles of sense in a vector processing environment.
>
> --
> -------------------------------------------
> Tim Wescott
> Wescott Design Services
> http://www.wescottdesign.com

Yes, and I wish TI had done the same. Just to explain, on an ADI
processor one can write a one cycle per input loop that automatically
keeps track of minimum shift value for all the inputs. On a '54XX, one
has to find the minimum manually. 

John

Reply by Tim Wescott ●July 2, 20052005-07-02

john wrote:

> 
> Tim Wescott wrote:
> 
>>Jerry Avins wrote:
>>
>>
>>>Michael Schoeberl wrote:
>>>
>>>
>>>> >> [NORM function]
>>>>
>>>>
>>>>>What makes a bit "redundant"?
>>>>
>>>>
>>>>
>>>>okay - this need more details ...
>>>>
>>>>the DSP has a one cycle instruction (called NORM) that counts
>>>>the bits - starting from the left to the first change.
>>>>
>>>>I pass a 32 bit integer and get get back the number of
>>>>leading zeros ... (works for ned number in 2 complement as well so TI
>>>>calls the bits from the left 'redundant')
>>>>
>>>>
>>>>the output of that instruction is similar to
>>>>31 - floor(log2(x))
>>>
>>>
>>>That's an interesting instruction. What does it do with negative numbers?
>>>
>>>
>>>>>What mantissa? Didn't you write "fixed point"?   I see the line
>>>>>"exp = floor(log2(x)) - 6;" below. How do you take a fixed-point log?
>>>>
>>>>
>>>>
>>>>with the help of this special assembly instruction ...
>>>
>>>
>>>Thanks. I understand it all but the mantissa part. What is a fixed-point
>>>mantissa?
>>>
>>
>>If TI uses the same nomenclature as Analog Devices, the mantissa is
>>what's left after you shift the number up by the bits counted by the
>>NORM instruction.  Analog Devices has designed their normalization
>>instruction (which may or may not be NORM; I can't remember) so that it
>>can be vectorized -- this lets you do block floating point calculations
>>which makes oodles of sense in a vector processing environment.
>>
>>--
>>-------------------------------------------
>>Tim Wescott
>>Wescott Design Services
>>http://www.wescottdesign.com
> 
> 
> Yes, and I wish TI had done the same. Just to explain, on an ADI
> processor one can write a one cycle per input loop that automatically
> keeps track of minimum shift value for all the inputs. On a '54XX, one
> has to find the minimum manually. 
> 
> John
> 
Ew ick.  The Analog Devices 21xx DSPs are much better processors for 
making screaming-fast algorithms for a wide array of problems. 
Unfortunately they absolutely suck if you want an RTOS -- in fact IIRC 
they have some hardware stacks who's state you can't query and restore, 
so they're more or less RTOS proof.

The TI 2812 DSP is just as good as the ADSP-21xx for "ordinary" DSP 
stuff like MACs, but can't go as far as the ADI parts.  But saving the 
entire processor state is direct and clear, so fitting an RTOS onto the 
thing is easy as pie.

I have no idea how the Blackfin compares, or other TI fixed-bitters, or 
anyone's floating point ones...

-- 
-------------------------------------------
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Previous 12Next

sqrt on C6414 DSP

Sign in

Search forums

Free PDF Downloads

Blogs - Hall of Fame

Discussion Groups

Quick Links

About DSPRelated.com

Social Networks

The Related Media Group