Implementation of Arctan(x)

Started by karthikw October 1, 2007
Hi,
     I am  implementing arctan(x) in hardware can any one suggest me any
of the algorithms other than cordic because it takes lots of iterations to
reach the final result. 

Thanks and Regards 
Karthik W


karthikw wrote:
> Hi, > I am implementing arctan(x) in hardware can any one suggest me any > of the algorithms other than cordic because it takes lots of iterations to > reach the final result. >
Some sort of a table lookup + interpolation will probably be the best as far as speed/size. No matter what you do, you'll have some interesting effects as the slope goes to infinity -- you'll either be taking reciprocals (not speedy), or _really_ juggling cost/speed/precision issues. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com Do you need to implement control loops in software? "Applied Control Theory for Embedded Systems" gives you just what it says. See details at http://www.wescottdesign.com/actfes/actfes.html
"karthikw" <karthikwali@gmail.com> writes:

> Hi, > I am implementing arctan(x) in hardware can any one suggest me any > of the algorithms other than cordic because it takes lots of iterations to > reach the final result. > > Thanks and Regards > Karthik W
Hi Karthik, Try: Efficient approximations for the arctangent function Rajan, S.; Sichun Wang; Inkol, R.; Joyal, A.; Signal Processing Magazine, IEEE Volume 23, Issue 3, May 2006 Page(s):108 - 111 Digital Object Identifier 10.1109/MSP.2006.1628884 Summary: This paper provides several efficient approximations for the arctangent function using Lagrange interpolation and minimax optimization techniques. These approximations are particularly useful when processing power, memory, and power consumption are i..... -- % Randy Yates % "With time with what you've learned, %% Fuquay-Varina, NC % they'll kiss the ground you walk %%% 919-577-9882 % upon." %%%% <yates@ieee.org> % '21st Century Man', *Time*, ELO http://www.digitalsignallabs.com
karthikw wrote:
> Hi, > I am implementing arctan(x) in hardware can any one suggest me any > of the algorithms other than cordic because it takes lots of iterations to > reach the final result.
What accuracy do you need? Over what range? Jerry -- Engineering is the art of making what you want from things you can get. &macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;&macr;
On Oct 2, 5:16 am, "karthikw" <karthikw...@gmail.com> wrote:
> Hi, > I am implementing arctan(x) in hardware can any one suggest me any > of the algorithms other than cordic because it takes lots of iterations to > reach the final result. > > Thanks and Regards > Karthik W
If you are demodulating FM then forget it. You don't need arctan(x). Hardy
Tim Wescott wrote:
> karthikw wrote: > > Hi, > > I am implementing arctan(x) in hardware can any one suggest me any > > of the algorithms other than cordic because it takes lots of iterations
to
> > reach the final result. > > Some sort of a table lookup + interpolation will probably be the best as > far as speed/size. No matter what you do, you'll have some interesting > effects as the slope goes to infinity
The slope of the arctan goes to zero, not infinity. Perhaps this is interesting: http://dspguru.com/comp.dsp/tricks/alg/fxdatan2.htm Regards, Andor
Andor wrote:
> Tim Wescott wrote: >> karthikw wrote: >>> Hi, >>> I am implementing arctan(x) in hardware can any one suggest me
any
>>> of the algorithms other than cordic because it takes lots of iterations
to
>>> reach the final result. >> Some sort of a table lookup + interpolation will probably be the best as >> far as speed/size. No matter what you do, you'll have some interesting >> effects as the slope goes to infinity > > The slope of the arctan goes to zero, not infinity. Perhaps this is > interesting: >
arctan(x) = sin(x)/cos(x). As x goes to pi/2 the arctan (and it's slope) goes to infinity. Granted, for pi/4 < x < pi/2 you can use 1/(arctan(pi/2-x)) -- but then you're calculating a reciprocal, of a number, so you're back to some computationally intensive calculation. Not that I'm an expert. I've always sliced and diced the ordinate down to 0 <= x <= pi/4, then negated and inverted as necessary -- but I've never needed the absolute fastest speed, either. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com Do you need to implement control loops in software? "Applied Control Theory for Embedded Systems" gives you just what it says. See details at http://www.wescottdesign.com/actfes/actfes.html
Tim Wescott <t...@seemywebsite.com> wrote:
> Andor wrote: > > Tim Wescott wrote: > >> karthikw wrote: > >>> Hi, > >>> I am implementing arctan(x) in hardware can any one suggest
me any
> >>> of the algorithms other than cordic because it takes lots of
iterations to
> >>> reach the final result. > >> Some sort of a table lookup + interpolation will probably be the best
as
> >> far as speed/size. No matter what you do, you'll have some
interesting
> >> effects as the slope goes to infinity > > > The slope of the arctan goes to zero, not infinity. Perhaps this is > > interesting: > > arctan(x) = sin(x)/cos(x). As x goes to pi/2 the arctan (and it's > slope) goes to infinity.
You are mixing up tan with arctan. Regards, andor
Andor wrote:
> Tim Wescott <t...@seemywebsite.com> wrote: >> Andor wrote: >>> Tim Wescott wrote: >>>> karthikw wrote: >>>>> Hi, >>>>> I am implementing arctan(x) in hardware can any one
suggest me any
>>>>> of the algorithms other than cordic because it takes lots of
iterations to
>>>>> reach the final result. >>>> Some sort of a table lookup + interpolation will probably be the
best as
>>>> far as speed/size. No matter what you do, you'll have some
interesting
>>>> effects as the slope goes to infinity >>> The slope of the arctan goes to zero, not infinity. Perhaps this is >>> interesting: >> arctan(x) = sin(x)/cos(x). As x goes to pi/2 the arctan (and it's >> slope) goes to infinity. > > You are mixing up tan with arctan. > > Regards, > andor >
Argh. So I am. In which case a look-up table would work nice, except for the minor problem of the infinite ordinate -- then one may want a lookup table with ever-increasing intervals, which actually wouldn't be too bad to implement. -- Tim Wescott Wescott Design Services http://www.wescottdesign.com Do you need to implement control loops in software? "Applied Control Theory for Embedded Systems" gives you just what it says. See details at http://www.wescottdesign.com/actfes/actfes.html
On Oct 1, 5:43 pm, Tim Wescott <t...@seemywebsite.com> wrote:
>
...
> > In which case a look-up table would work nice, except for the minor > problem of the infinite ordinate -- then one may want a lookup table > with ever-increasing intervals, which actually wouldn't be too bad to > implement.
so how would one determine the index into the table without some repeated search and compare operations? also, i think something like arctan(1/x) = pi/2 - arctan(x) can be used for the the nearly infinite ordinates, no? (a division is required.) dunno how this would work for hardware, but if a single division is tolerable and for -1 <= x <= 1, a very accurate approximation is: arctan(x) ~= x/f(x^2) where f(u) = 1.0 + 0.33288950512027 * u + -0.08467922817644 * u^2 + 0.03252232640125 * u^3 + -0.00749305860992 * u^4 maybe too many terms, but it's not the finite power series that's bad, but the division by it that's costly in many different contexts. r b-j
On Oct 23, 11:46 pm, robert bristow-johnson
<r...@audioimagination.com> wrote:
> we're gonna have to keep track of the fact that there are two > different Karthiks hanging around here. > > r b-j
Sigh... unfortunately so :) We do have different initials/last names depending on how you look at it. K.
On Oct 23, 1:18 am, Karthik <karthik301...@gmail.com> wrote:
> > Hi, > > Thanks for the suggestion.Do you know any papers which, gives the > > above concept of mixing both lut and cordic, and also good papers on fast > > hardware implementation of cordic. I wanted some information on radix4 > > cordic implementation, can you tell me a good paper on radix4 cordic > > implementation. > > > Thanks and Regards > > Karthik W > > 1) G O O G L E. Ray's writeup(s) on CORDIC is/are probably the most > read and most cited. > > 2) Spend more time thinking than googling. > > Karthik S.
we're gonna have to keep track of the fact that there are two different Karthiks hanging around here. r b-j
> Hi, > Thanks for the suggestion.Do you know any papers which, gives the > above concept of mixing both lut and cordic, and also good papers on fast > hardware implementation of cordic. I wanted some information on radix4 > cordic implementation, can you tell me a good paper on radix4 cordic > implementation. > > Thanks and Regards > Karthik W
1) G O O G L E. Ray's writeup(s) on CORDIC is/are probably the most read and most cited. 2) Spend more time thinking than googling. Karthik S.
>karthikw wrote: > >>>Hi, >>> I am implementing arctan(x) in hardware can any one suggest me
any
>>>of the algorithms other than cordic because it takes lots of
iterations
>> >> to >> >>>reach the final result. >>> >>>Thanks and Regards >>>Karthik W >>> >>> >> >> Hi, >> Thanks for the response. But actually I am implementing an
rectangular
>> to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x).
So
>> I need arctan(y/x) not arctan(x) because a divider costs a lot with >> respect to area. I intend a accuracy of 4 decimal place which is q15
fixed
>> point format.If go to lookup table, I need a huge memory of 2^16. >> Interpolation technique is good but I think for hardware
implementation
>> its quite complex ,I some cases I may need a divider or a multiplier.
I
>> also tried looking into the variants (formulas) of arctan but all need
a
>> divider in some or the other way .So if I can have methood which costs >> less on area and with faster speed than cordic then it will be good. >> >> Thanks and Regards >> Karthik W >> >> > > >Unfortunately, most of the algorithms for arctan involve division by a >variable, so I don't think you are going to find any that give much of a
>net savings in latency over CORDIC for similar performance. The best >performance will be from a look-up table, but you are obviously limited >by to a relatively small number of angles by the size of the memory. >Your best bet might be to use a cross between CORDIC and a look-up where
>instead of having a binary rotation decision at each step, you have a >small finite number of rotation angles. That result is then passed to >subsequent stages that have finer angular resolutions. I think in the >end though, you'll wind up with about the same latency as the CORDIC. >The other advantage to using CORDIC is that you obtain not only the >arctan, but you also get the magnitude, and you don't need to compute a >square root for it. The square root and the divide are both hardware >intensive and don't lend themselves well to parallelizing because the >intermediate results depend on the previous intermediate results. Same >is true for CORDIC, only CORDIC gives you both functions at once. > >Perhaps you can look at ways to speed up the CORDIC, maybe using a >multiplied clock or not pipelining at every stage. >
Hi, Thanks for the suggestion.Do you know any papers which, gives the above concept of mixing both lut and cordic, and also good papers on fast hardware implementation of cordic. I wanted some information on radix4 cordic implementation, can you tell me a good paper on radix4 cordic implementation. Thanks and Regards Karthik W
karthikw wrote:

>>Hi, >> I am implementing arctan(x) in hardware can any one suggest me any >>of the algorithms other than cordic because it takes lots of iterations > > to > >>reach the final result. >> >>Thanks and Regards >>Karthik W >> >> > > Hi, > Thanks for the response. But actually I am implementing an rectangular > to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So > I need arctan(y/x) not arctan(x) because a divider costs a lot with > respect to area. I intend a accuracy of 4 decimal place which is q15 fixed > point format.If go to lookup table, I need a huge memory of 2^16. > Interpolation technique is good but I think for hardware implementation > its quite complex ,I some cases I may need a divider or a multiplier. I > also tried looking into the variants (formulas) of arctan but all need a > divider in some or the other way .So if I can have methood which costs > less on area and with faster speed than cordic then it will be good. > > Thanks and Regards > Karthik W > >
Unfortunately, most of the algorithms for arctan involve division by a variable, so I don't think you are going to find any that give much of a net savings in latency over CORDIC for similar performance. The best performance will be from a look-up table, but you are obviously limited by to a relatively small number of angles by the size of the memory. Your best bet might be to use a cross between CORDIC and a look-up where instead of having a binary rotation decision at each step, you have a small finite number of rotation angles. That result is then passed to subsequent stages that have finer angular resolutions. I think in the end though, you'll wind up with about the same latency as the CORDIC. The other advantage to using CORDIC is that you obtain not only the arctan, but you also get the magnitude, and you don't need to compute a square root for it. The square root and the divide are both hardware intensive and don't lend themselves well to parallelizing because the intermediate results depend on the previous intermediate results. Same is true for CORDIC, only CORDIC gives you both functions at once. Perhaps you can look at ways to speed up the CORDIC, maybe using a multiplied clock or not pipelining at every stage.
On Tue, 02 Oct 2007 01:35:10 -0500, "karthikw"
<karthikwali@gmail.com>
wrote:

>>Hi, >> I am implementing arctan(x) in hardware can any one suggest me any >>of the algorithms other than cordic because it takes lots of iterations >to >>reach the final result. >> >>Thanks and Regards >>Karthik W >> >> >Hi, > Thanks for the response. But actually I am implementing an rectangular >to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So >I need arctan(y/x) not arctan(x) because a divider costs a lot with >respect to area. I intend a accuracy of 4 decimal place which is q15 fixed >point format.If go to lookup table, I need a huge memory of 2^16. >Interpolation technique is good but I think for hardware implementation >its quite complex ,I some cases I may need a divider or a multiplier. I >also tried looking into the variants (formulas) of arctan but all need a >divider in some or the other way .So if I can have methood which costs >less on area and with faster speed than cordic then it will be good. > >Thanks and Regards >Karthik W
Hi Karthik, ya' might take a look at "Another Contender in The Arctangent Race", by R. Lyons, IEEE Signal Processing Magazine, DSP Tips & Tricks column, Vol. 21, No. 1, Jan. 2004, page 109. The algorithm there yields an arctan accuracy of roughly one quarter of a degree, and does not use a lookup table. The algo is fairly efficient except, darn it, it requires a divide operation. I don't know of a fast, accurate, table-free, divide-free arctan algorithm. If you find one, Karthik, you'll become famous. Good Luck, [-Rick-]
>On Oct 2, 2:35 am, "karthikw" <karthikw...@gmail.com> wrote: >... >> > I am implementing arctan(x) in hardware can any one suggest me
any
>> >of the algorithms other than cordic because it takes lots of
iterations to
>> >reach the final result. >> >... >> ... I am implementing an rectangular >> to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x).
So
>> I need arctan(y/x) not arctan(x) because a divider costs a lot with >> respect to area. I intend a accuracy of 4 decimal place which is q15
fixed
>> point format.If go to lookup table, I need a huge memory of 2^16. >> Interpolation technique is good but I think for hardware
implementation
>> its quite complex ,I some cases I may need a divider or a multiplier.
I
>> also tried looking into the variants (formulas) of arctan but all need
a
>> divider in some or the other way .So if I can have methood which costs >> less on area and with faster speed than cordic then it will be good. > >i don't think you'll avoid division. also you will have to break up >your (x,y) coordinate pair into 4 quadrants (maybe 5, if you want your >answer to alway be the Principal Angle which has magnitude less than >pi) as is done in the atan2(x,y) function. is this for FM >demodulation? ultimately, is what you want the *difference* in angles >from the previous (complex) sample to the current: > > arg{ (x[n] + j*y[n]) } - arg{ x[n-1] + j*y[n-1] } ? > >is that what you want, in the end? we've had discussions about this >previously on comp.dsp, but i do not know the links. > >is your gmail address valid and an account that you check your mail? >is that where i can email you? > >r b-j >
Hi, Thanks for the reply. My mail id is karthikwali@gmail.com. I am implementing arctan(x,y) for FFT. The general formula for atan(x,y) = i log((x + iy)/sqrt(x^2 + y^2)). And you can see how complex it is to implement in hardware.So I thought for going to cordic, but it is requiring 13 cycle latency if I pipeline it.But if I want to calculate it at different times always there will be a latency of 13 cycles. Thanks and Regards Karthik W
On Oct 2, 2:35 am, "karthikw" <karthikw...@gmail.com> wrote:
...
> > I am implementing arctan(x) in hardware can any one suggest me any > >of the algorithms other than cordic because it takes lots of iterations to > >reach the final result. >
...
> ... I am implementing an rectangular > to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So > I need arctan(y/x) not arctan(x) because a divider costs a lot with > respect to area. I intend a accuracy of 4 decimal place which is q15 fixed > point format.If go to lookup table, I need a huge memory of 2^16. > Interpolation technique is good but I think for hardware implementation > its quite complex ,I some cases I may need a divider or a multiplier. I > also tried looking into the variants (formulas) of arctan but all need a > divider in some or the other way .So if I can have methood which costs > less on area and with faster speed than cordic then it will be good.
i don't think you'll avoid division. also you will have to break up your (x,y) coordinate pair into 4 quadrants (maybe 5, if you want your answer to alway be the Principal Angle which has magnitude less than pi) as is done in the atan2(x,y) function. is this for FM demodulation? ultimately, is what you want the *difference* in angles from the previous (complex) sample to the current: arg{ (x[n] + j*y[n]) } - arg{ x[n-1] + j*y[n-1] } ? is that what you want, in the end? we've had discussions about this previously on comp.dsp, but i do not know the links. is your gmail address valid and an account that you check your mail? is that where i can email you? r b-j
>Hi, > I am implementing arctan(x) in hardware can any one suggest me any >of the algorithms other than cordic because it takes lots of iterations
to
>reach the final result. > >Thanks and Regards >Karthik W > >
Hi, Thanks for the response. But actually I am implementing an rectangular to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So I need arctan(y/x) not arctan(x) because a divider costs a lot with respect to area. I intend a accuracy of 4 decimal place which is q15 fixed point format.If go to lookup table, I need a huge memory of 2^16. Interpolation technique is good but I think for hardware implementation its quite complex ,I some cases I may need a divider or a multiplier. I also tried looking into the variants (formulas) of arctan but all need a divider in some or the other way .So if I can have methood which costs less on area and with faster speed than cordic then it will be good. Thanks and Regards Karthik W
robert bristow-johnson wrote:
> On Oct 1, 5:43 pm, Tim Wescott <t...@seemywebsite.com> wrote: > ... >> In which case a look-up table would work nice, except for the minor >> problem of the infinite ordinate -- then one may want a lookup table >> with ever-increasing intervals, which actually wouldn't be too bad to >> implement. > > so how would one determine the index into the table without some > repeated search and compare operations?
For floating point, look at the mantissa and do it sorta logarithmically. For fixed-point, count the number of zeros (preferably with fast, pipelined logic).
> > also, i think something like > > arctan(1/x) = pi/2 - arctan(x) > > can be used for the the nearly infinite ordinates, no? (a division is > required.) > > dunno how this would work for hardware, but if a single division is > tolerable and > for -1 <= x <= 1, a very accurate approximation is: > > arctan(x) ~= x/f(x^2) > > where > > f(u) = 1.0 > + 0.33288950512027 * u > + -0.08467922817644 * u^2 > + 0.03252232640125 * u^3 > + -0.00749305860992 * u^4 > > maybe too many terms, but it's not the finite power series that's bad, > but the division by it that's costly in many different contexts. > > r b-j > >
-- Tim Wescott Wescott Design Services http://www.wescottdesign.com Do you need to implement control loops in software? "Applied Control Theory for Embedded Systems" gives you just what it says. See details at http://www.wescottdesign.com/actfes/actfes.html