comp.dsp | fixed point atan2

Hi,

first off I'd like to apologize in case this question is/sounds 
extremely stupid, but I'm really stuck.

I need an efficient atan2 that doesn't take up as much real estate on an 
FPGA as a CORDIC would. As such, I've been looking at the 'Trick' 
mentioned at dspguru[1].

So according to that trick, given:
x = 0.1838
y = -0.1818

I would now do the following, since it is in the IV quadrant:
r = (x-y)/(x+y)

But here is the problem: the result of that division is around 177. If I 
now continue on and find the angle by doing:
theta = pi/4 - pi/4*r (or rather pi/4*r - pi/4)

it will obviously be horribly wrong. What am I missing? I'm testing all 
this in Matlab, but I also tried using the fixed point toolbox, and the 
result with the same number of bits and fraction length is simply '1'. 
How is this supposed to work? Where am I wrong?

Thank you in advance,
Alex Hornung


[1]: http://www.dspguru.com/dsp/tricks/fixed-point-atan2-with-self-normalization

Reply by Greg Heath ● November 30, 2010

On Nov 30, 2:47 am, Alex Hornung <ahorn...@gmail.com> wrote:
> Hi,
>
> first off I'd like to apologize in case this question is/sounds
> extremely stupid, but I'm really stuck.
>
> I need an efficient atan2 that doesn't take up as much real estate on an
> FPGA as a CORDIC would. As such, I've been looking at the 'Trick'
> mentioned at dspguru[1].
>
> So according to that trick, given:
> x = 0.1838
> y = -0.1818
>
> I would now do the following, since it is in the IV quadrant:
> r = (x-y)/(x+y)
>
> But here is the problem: the result of that division is around 177. If I
> now continue on and find the angle by doing:
> theta = pi/4 - pi/4*r (or rather pi/4*r - pi/4)
>
> it will obviously be horribly wrong. What am I missing? I'm testing all
> this in Matlab, but I also tried using the fixed point toolbox, and the
> result with the same number of bits and fraction length is simply '1'.
> How is this supposed to work? Where am I wrong?
>
> Thank you in advance,
> Alex Hornung
>
> [1]:http://www.dspguru.com/dsp/tricks/fixed-point-atan2-with-self-normali...

You are missing the use of abs(y). See the accompanying code.

Hope this helps.

Greg
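
For reference, here is a minimal Python sketch of the self-normalizing approximation with abs(y) in place. It follows the general shape of the dspguru trick, but the function name `approx_atan2` and the handling of the origin are assumptions made for illustration, and it uses floating point rather than a fixed-point format.

```python
import math

def approx_atan2(y, x):
    """Approximate atan2(y, x) in radians with a linear fit per half-plane."""
    if x == 0 and y == 0:
        return 0.0  # angle is undefined at the origin; 0 is an arbitrary choice here
    abs_y = abs(y)  # work in the upper half-plane, restore the sign at the end
    if x >= 0:
        # Quadrant I (after folding): r stays within [-1, 1]
        r = (x - abs_y) / (x + abs_y)
        angle = math.pi / 4 - (math.pi / 4) * r
    else:
        # Quadrant II (after folding): r stays within [-1, 1]
        r = (x + abs_y) / (abs_y - x)
        angle = 3 * math.pi / 4 - (math.pi / 4) * r
    # Negative y mirrors the result into quadrants III and IV
    return -angle if y < 0 else angle
```

With x = 0.1838 and y = -0.1818, r comes out at about 0.0055 instead of a value in the hundreds, and the result is about -0.781 rad against a true value of about -0.780 rad. Because r is bounded by 1 in magnitude, it also fits a fractional fixed-point format; the linear fit itself is off by at most roughly 0.07 rad.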

Reply by Alex Hornung ● November 30, 2010

On 30/11/2010 10:58, Greg Heath wrote:
> On Nov 30, 2:47 am, Alex Hornung<ahorn...@gmail.com>  wrote:
>> Hi,
>>
>> first off I'd like to apologize in case this question is/sounds
>> extremely stupid, but I'm really stuck.
>>
>> I need an efficient atan2 that doesn't take up as much real estate on an
>> FPGA as a CORDIC would. As such, I've been looking at the 'Trick'
>> mentioned at dspguru[1].
>>
>> So according to that trick, given:
>> x = 0.1838
>> y = -0.1818
>>
>> I would now do the following, since it is in the IV quadrant:
>> r = (x-y)/(x+y)
>>
>> But here is the problem: the result of that division is around 177. If I
>> now continue on and find the angle by doing:
>> theta = pi/4 - pi/4*r (or rather pi/4*r - pi/4)
>>
>> it will obviously be horribly wrong. What am I missing? I'm testing all
>> this in Matlab, but I also tried using the fixed point toolbox, and the
>> result with the same number of bits and fraction length is simply '1'.
>> How is this supposed to work? Where am I wrong?
>>
>> Thank you in advance,
>> Alex Hornung
>>
>> [1]:http://www.dspguru.com/dsp/tricks/fixed-point-atan2-with-self-normali...
>
> You are missing the use of abs(y). See the accompanying code.
>
> Hope this helps.
>
> Greg

It sure does!

Thank you very much,
Alex

Reply by cfelton ● November 30, 2010

>I need an efficient atan2 that doesn't take up as much real estate on an 
>FPGA as a CORDIC would. As such, I've been looking at the 'Trick' 
>mentioned at dspguru[1].
>

Why do you believe the CORDIC uses more FPGA "real estate" than this other
approach?

Reply by Alex Hornung ● November 30, 2010

On 30/11/2010 13:28, cfelton wrote:
>> I need an efficient atan2 that doesn't take up as much real estate on an
>> FPGA as a CORDIC would. As such, I've been looking at the 'Trick'
>> mentioned at dspguru[1].
>>
>
> Why do you believe the CORDIC uses more FPGA "real estate" than this other
> approach?

From the data provided by Xilinx, a CORDIC takes up anywhere between 
1300 and 4000 LUT-FF pairs. Any multiplier I'd use would take up at most 
4 XtremeDSP slices and any full adders shouldn't take up much either. As 
far as I can tell the divider would be the biggest block with this 
approach, and according to the Xilinx IP datasheet, it can be anywhere 
between 80 LUT-FF pairs and 500 for my purposes. This still seems quite 
a bit lower than what a CORDIC would require.

In terms of latency it should be almost the same as a CORDIC, mainly due 
to the divider, again.

I would of course welcome any solution that would allow me to make this 
even simpler (for example by removing the divider somehow). Considering 
that I don't require much accuracy, there might be even more efficient 
solutions that I don't know anything about.

As you might have guessed from my first post, I'm quite new to this 
(both FPGAs and DSP) and I'd greatly appreciate any further insight.

Kind Regards,
Alex Hornung

Reply by Steve Pope ● November 30, 2010

Alex Hornung  <ahornung@gmail.com> wrote:

>I would of course welcome any solution that would allow me to make this 
>even simpler (for example by removing the divider somehow). Considering 
>that I don't require much accuracy, there might be even more efficient 
>solutions that I don't know anything about.

Have you considered a one-octant LUT plus mirroring?  How much accuracy
do you need?  These things are usually small.

Steve

Reply by Alex Hornung ● November 30, 2010

On 30/11/2010 15:18, Steve Pope wrote:
> Alex Hornung<ahornung@gmail.com>  wrote:
>
>> I would of course welcome any solution that would allow me to make this
>> even simpler (for example by removing the divider somehow). Considering
>> that I don't require much accuracy, there might be even more efficient
>> solutions that I don't know anything about.
>
> Have you considered a one-octant LUT plus mirroring?  How much accuracy
> do you need?  These things are usually small.
>
>
> Steve

No, and I have no idea on how that would work, to get everything from 
that one octant. Remember I'm really new to all of this :) Do you happen 
to have any paper/website/etc about it?

The accuracy I need is somewhere around 0.1 to 0.2 radians. Maybe I 
could even use some lookup for all the values with this kind of 
accuracy? After all it would just be around 60 possibilities for the 
whole circle.

Cheers,
Alex

Reply by Steve Pope ● November 30, 2010

Alex Hornung  <ahornung@gmail.com> wrote:

>On 30/11/2010 15:18, Steve Pope wrote:

>> Alex Hornung<ahornung@gmail.com>  wrote:

>>> I would of course welcome any solution that would allow me to make this
>>> even simpler (for example by removing the divider somehow). Considering
>>> that I don't require much accuracy, there might be even more efficient
>>> solutions that I don't know anything about.

>> Have you considered a one-octant LUT plus mirroring?  How much accuracy
>> do you need?  These things are usually small.

>No, and I have no idea on how that would work, to get everything from 
>that one octant. Remember I'm really new to all of this :) Do you happen 
>to have any paper/website/etc about it?

>The accuracy I need is somewhere around 0.1 to 0.2 radians. Maybe I 
>could even use some lookup for all the values with this kind of 
>accuracy? 

Yes, probably.

Suppose the input to your four quadrant arctan is two 5-bit signed values
representing a complex number.  That's 1024 possible arctan values,
which is a pretty large lookup table, but by manipulating these so
that the input of the table lies always in the first octant, you are
now down to 9 * 9 = 81 values.  I think you will find the accuracy
is better than 0.1 radians using such an approach.

I know of no paper, you just have to design it and try it out.

Steve
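
One way the octant folding could look is sketched below in Python. This is only an illustration of the mirroring idea, not Steve's design: the names `TABLE`, `MAX_MAG` and `lut_atan2` are hypothetical, and the table here is indexed by the full input magnitudes (0 to 16 for 5-bit signed inputs), so it holds 153 entries rather than the 81 mentioned above.

```python
import math

MAX_MAG = 16  # largest magnitude of a 5-bit two's-complement input (-16..15)

# First-octant table: TABLE[ax][ay] holds the angle for 0 <= ay <= ax.
# In hardware these entries would be precomputed constants in a ROM.
TABLE = [[math.atan2(ay, ax) for ay in range(ax + 1)]
         for ax in range(MAX_MAG + 1)]

def lut_atan2(y, x):
    """Four-quadrant arctangent for small signed integers via one octant."""
    ax, ay = abs(x), abs(y)          # fold into the first quadrant
    swapped = ay > ax
    if swapped:                      # fold into the first octant
        ax, ay = ay, ax
    angle = TABLE[ax][ay]            # angle in [0, pi/4]
    if swapped:
        angle = math.pi / 2 - angle  # undo the octant fold
    if x < 0:
        angle = math.pi - angle      # mirror into the left half-plane
    if y < 0:
        angle = -angle               # mirror into the lower half-plane
    return angle
```

No divider is involved because the table is addressed directly by the two folded inputs. To trade accuracy for size, the low bits of `ax` and `ay` could be dropped before the lookup, which shrinks the table further at the cost of coarser angles.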

Reply by cfelton ● November 30, 2010

> From the data provided by Xilinx, a CORDIC takes up anywhere between 
>1300 and 4000 LUT-FF pairs. Any multiplier I'd use would take up at most 
>4 xtremeDSP slices and any full adders shouldn't take up much either. As 
>far as I can tell the divider would be the biggest block with this 
>approach, and according to the Xilinx IP datasheet, it can be anywhere 
>between 80 LUT-FF pairs and 500 for my purposes. This still seems quite 
>a bit lower than what a CORDIC would require.
>

That is fairly large for the Xilinx "core".  A CORDIC "core" will have more
modes than you require.  If you only implement what you need it will be
much smaller.  I would implement the CORDIC algorithm or use the look up
table as mentioned.  The CORDIC will only require shifts and adds and a
small state-machine.
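
To make the shift-and-add structure concrete, here is a rough Python sketch of a vectoring-mode CORDIC that returns only the angle. The iteration count, the angle scaling (`ANGLE_SCALE`) and all names are assumptions chosen for illustration; the loop body uses only integer shifts, adds and a small table of constants, which is what a hand-written FPGA version would need.

```python
import math

ITERATIONS = 12        # assumed; each iteration adds roughly one bit of angle
ANGLE_SCALE = 1 << 16  # assumed fixed-point scale: angle = result / ANGLE_SCALE rad
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ANGLE_SCALE) for i in range(ITERATIONS)]
HALF_PI = round(math.pi / 2 * ANGLE_SCALE)

def cordic_atan2(y, x):
    """Angle of the integer vector (x, y), in units of 1/ANGLE_SCALE radians."""
    if x == 0 and y == 0:
        return 0  # undefined at the origin; 0 is an arbitrary choice here
    # Pre-rotate by +/-90 degrees so the vector lies in the right half-plane.
    if x < 0:
        if y >= 0:
            x, y, z = y, -x, HALF_PI
        else:
            x, y, z = -y, x, -HALF_PI
    else:
        z = 0
    # Drive y towards zero, accumulating the applied rotation in z.
    for i in range(ITERATIONS):
        if y > 0:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
        else:
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
    return z
```

The inputs should be scaled so that they use most of the available word length; for example, `cordic_atan2(round(-0.1818 * 2**14), round(0.1838 * 2**14)) / ANGLE_SCALE` gives the angle in radians. The CORDIC gain on x is ignored because only the angle is needed.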

Reply by Tim Wescott ● November 30, 2010

On 11/30/2010 07:18 AM, Steve Pope wrote:
> Alex Hornung<ahornung@gmail.com>  wrote:
>
>> I would of course welcome any solution that would allow me to make this
>> even simpler (for example by removing the divider somehow). Considering
>> that I don't require much accuracy, there might be even more efficient
>> solutions that I don't know anything about.
>
> Have you considered a one-octant LUT plus mirroring?  How much accuracy
> do you need?  These things are usually small.

Unless the incoming data is normalized he'd still have to do the divide.

But he should have a LUT on his list of things to try.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
