On Oct 23, 1:18 am, Karthik <karthik301...@gmail.com> wrote:
> > Hi,
> >     Thanks for the suggestion.Do you know any papers which, gives the
> > above concept of mixing both lut and cordic, and also good papers on fast
> > hardware implementation of cordic. I wanted some information on radix4
> > cordic implementation, can you tell me a good paper on radix4 cordic
> > implementation.
>
> > Thanks and Regards
> > Karthik W
>
> 1) G O O G L E. Ray's writeup(s) on CORDIC is/are probably the most
> read and most cited.
>
> 2) Spend more time thinking than googling.
>
> Karthik S.


we're gonna have to keep track of the fact that there are two
different Karthiks hanging around here.

r b-j

> Hi,
>     Thanks for the suggestion.Do you know any papers which, gives the
> above concept of mixing both lut and cordic, and also good papers on fast
> hardware implementation of cordic. I wanted some information on radix4
> cordic implementation, can you tell me a good paper on radix4 cordic
> implementation.
>
> Thanks and Regards
> Karthik W

1) G O O G L E. Ray's writeup(s) on CORDIC is/are probably the most
read and most cited.

2) Spend more time thinking than googling.

Karthik S.

>karthikw wrote:
>
>>>Hi,
>>>    I am  implementing arctan(x) in hardware can any one suggest me any
>>>of the algorithms other than cordic because it takes lots of iterations to
>>>reach the final result. 
>>>
>>>Thanks and Regards 
>>>Karthik W
>>>
>>>
>> 
>> Hi,
>>     Thanks for the response. But actually I am implementing an rectangular
>> to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So
>> I need arctan(y/x) not arctan(x) because a divider costs a lot with
>> respect to area. I intend a accuracy of 4 decimal place which is q15 fixed
>> point format.If go to lookup table, I need a huge memory of 2^16.
>> Interpolation technique is good but I think for hardware implementation
>> its quite complex ,I some cases I may need a divider or a multiplier. I
>> also tried looking into the variants (formulas) of arctan but all need a
>> divider in some or the other way .So if I can have methood which costs
>> less on area and with faster speed than cordic then it will be good.
>>  
>> Thanks and Regards 
>> Karthik W     
>>     
>> 
>
>
>Unfortunately, most of the algorithms for arctan involve division by a 
>variable, so I don't think you are going to find any that give much of a
>net savings in latency over CORDIC for similar performance.  The best 
>performance will be from a look-up table, but you are obviously limited 
>by to a relatively small number of angles by the size of the memory. 
>Your best bet might be to use a cross between CORDIC and a look-up where
>instead of having a binary rotation decision at each step, you have a 
>small finite number of rotation angles.  That result is then passed to 
>subsequent stages that have finer angular resolutions.  I think in the 
>end though, you'll wind up with about the same latency as the CORDIC. 
>The other advantage to using CORDIC is that you obtain not only the 
>arctan, but you also get the magnitude, and you don't need to compute a 
>square root for it.  The square root and the divide are both hardware 
>intensive and don't lend themselves well to parallelizing because the 
>intermediate results depend on the previous intermediate results.  Same 
>is true for CORDIC, only CORDIC gives you both functions at once.
>
>Perhaps you can look at ways to speed up the CORDIC, maybe using a 
>multiplied clock or not pipelining at every stage.
>

Hi,
    Thanks for the suggestion. Do you know of any papers that describe the
above idea of mixing a LUT with CORDIC, or any good papers on fast
hardware implementations of CORDIC? I would also like some information on
radix-4 CORDIC; can you point me to a good paper on a radix-4 CORDIC
implementation?

Thanks and Regards
Karthik W

karthikw wrote:

>>Hi,
>>    I am  implementing arctan(x) in hardware can any one suggest me any
>>of the algorithms other than cordic because it takes lots of iterations to
>>reach the final result. 
>>
>>Thanks and Regards 
>>Karthik W
>>
>>
> 
> Hi,
>     Thanks for the response. But actually I am implementing an rectangular
> to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So
> I need arctan(y/x) not arctan(x) because a divider costs a lot with
> respect to area. I intend a accuracy of 4 decimal place which is q15 fixed
> point format.If go to lookup table, I need a huge memory of 2^16.
> Interpolation technique is good but I think for hardware implementation
> its quite complex ,I some cases I may need a divider or a multiplier. I
> also tried looking into the variants (formulas) of arctan but all need a
> divider in some or the other way .So if I can have methood which costs
> less on area and with faster speed than cordic then it will be good.
>  
> Thanks and Regards 
> Karthik W     
>     
> 

Unfortunately, most of the algorithms for arctan involve division by a 
variable, so I don't think you are going to find any that give much of a 
net savings in latency over CORDIC for similar performance.  The best 
performance will be from a look-up table, but you are obviously limited 
to a relatively small number of angles by the size of the memory. 
Your best bet might be to use a cross between CORDIC and a look-up where 
instead of having a binary rotation decision at each step, you have a 
small finite number of rotation angles.  That result is then passed to 
subsequent stages that have finer angular resolutions.  I think in the 
end though, you'll wind up with about the same latency as the CORDIC. 
The other advantage to using CORDIC is that you obtain not only the 
arctan, but you also get the magnitude, and you don't need to compute a 
square root for it.  The square root and the divide are both hardware 
intensive and don't lend themselves well to parallelizing because the 
intermediate results depend on the previous intermediate results.  Same 
is true for CORDIC, only CORDIC gives you both functions at once.

Perhaps you can look at ways to speed up the CORDIC, maybe using a 
multiplied clock or not pipelining at every stage.
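
For readers who want to see the "magnitude and angle at once" property, here is a minimal floating-point sketch of CORDIC in vectoring mode. It is only a software model, not a hardware design: the iteration count N_ITER and the function name cordic_vectoring are assumptions made for illustration, and the multiplications by 2**-i stand in for the right shifts that hardware would use.

```python
import math

N_ITER = 16  # assumed iteration count; each iteration adds roughly one bit of angle

# Elementary rotation angles atan(2^-i) and the constant gain of the iteration.
ANGLES = [math.atan(2.0 ** -i) for i in range(N_ITER)]
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N_ITER))

def cordic_vectoring(x, y):
    """Return (magnitude, angle) of x + iy by driving y toward zero."""
    angle = 0.0
    # Pre-rotate by +/-90 degrees so the vector starts in the right half plane.
    if x < 0:
        if y >= 0:
            x, y, angle = y, -x, math.pi / 2
        else:
            x, y, angle = -y, x, -math.pi / 2
    for i in range(N_ITER):
        shift = 2.0 ** -i              # a right shift by i bits in hardware
        if y >= 0:                     # rotate clockwise, accumulate +atan(2^-i)
            x, y = x + y * shift, y - x * shift
            angle += ANGLES[i]
        else:                          # rotate counter-clockwise
            x, y = x - y * shift, y + x * shift
            angle -= ANGLES[i]
    return x / GAIN, angle             # x has grown by GAIN; undo it once at the end

mag, ang = cordic_vectoring(-0.3, 0.8)
print(mag, ang)
```

The only non-shift operation is the final gain correction, which is a multiplication by a constant and can often be folded into a later stage.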

On Tue, 02 Oct 2007 01:35:10 -0500, "karthikw" <karthikwali@gmail.com>
wrote:

>>Hi,
>>     I am  implementing arctan(x) in hardware can any one suggest me any
>>of the algorithms other than cordic because it takes lots of iterations to
>>reach the final result. 
>>
>>Thanks and Regards 
>>Karthik W
>>
>>
>Hi,
>    Thanks for the response. But actually I am implementing an rectangular
>to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So
>I need arctan(y/x) not arctan(x) because a divider costs a lot with
>respect to area. I intend a accuracy of 4 decimal place which is q15 fixed
>point format.If go to lookup table, I need a huge memory of 2^16.
>Interpolation technique is good but I think for hardware implementation
>its quite complex ,I some cases I may need a divider or a multiplier. I
>also tried looking into the variants (formulas) of arctan but all need a
>divider in some or the other way .So if I can have methood which costs
>less on area and with faster speed than cordic then it will be good.
> 
>Thanks and Regards 
>Karthik W     

Hi Karthik,
  ya' might take a look at 
"Another Contender in The Arctangent Race", 
by R. Lyons, IEEE Signal Processing Magazine, 
DSP Tips & Tricks column, Vol. 21, No. 1, 
Jan. 2004, page 109.

The algorithm there yields an arctan accuracy 
of roughly one quarter of a degree, and does 
not use a lookup table.  The algo is fairly 
efficient except, darn it, it requires a 
divide operation.

I don't know of a fast, accurate, table-free, 
divide-free arctan algorithm.  If you find 
one, Karthik, you'll become famous.

Good Luck,
[-Rick-]
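
As a rough illustration of this kind of table-free approximation, the sketch below uses the form commonly quoted from that column, x / (1 + 0.28125*x^2) for -1 <= x <= 1. The coefficient is given from memory and should be checked against the article; the function name atan_lyons is hypothetical. Note that the division is still there.

```python
import math

def atan_lyons(x):
    """Approximate atan(x) in radians for -1 <= x <= 1 (one multiply-add, one divide)."""
    # 0.28125 = 1/4 + 1/32, so the scaling can be done with two shifts and an add.
    return x / (1.0 + 0.28125 * x * x)

# Scan the valid input range and report the worst-case error in degrees.
worst = max(abs(math.degrees(atan_lyons(k / 1000.0) - math.atan(k / 1000.0)))
            for k in range(-1000, 1001))
print(worst)
```

For a ratio y/x the same expression can be rearranged as x*y / (x^2 + 0.28125*y^2), which keeps the single divide but avoids forming y/x first.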

>On Oct 2, 2:35 am, "karthikw" <karthikw...@gmail.com> wrote:
>...
>> >     I am  implementing arctan(x) in hardware can any one suggest me any
>> >of the algorithms other than cordic because it takes lots of iterations to
>> >reach the final result.
>>
>...
>> ... I am implementing an rectangular
>> to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So
>> I need arctan(y/x) not arctan(x) because a divider costs a lot with
>> respect to area. I intend a accuracy of 4 decimal place which is q15 fixed
>> point format.If go to lookup table, I need a huge memory of 2^16.
>> Interpolation technique is good but I think for hardware implementation
>> its quite complex ,I some cases I may need a divider or a multiplier. I
>> also tried looking into the variants (formulas) of arctan but all need a
>> divider in some or the other way .So if I can have methood which costs
>> less on area and with faster speed than cordic then it will be good.
>
>i don't think you'll avoid division.  also you will have to break up
>your (x,y) coordinate pair into 4 quadrants (maybe 5, if you want your
>answer to alway be the Principal Angle which has magnitude less than
>pi) as is done in the atan2(x,y) function.  is this for FM
>demodulation?  ultimately, is what you want the *difference* in angles
>from the previous (complex) sample to the current:
>
>     arg{ (x[n] + j*y[n]) }  -  arg{ x[n-1] + j*y[n-1] }   ?
>
>is that what you want, in the end?  we've had discussions about this
>previously on comp.dsp, but i do not know the links.
>
>is your gmail address valid and an account that you check your mail?
>is that where i can email you?
>
>r b-j
>

Hi,
    Thanks for the reply. My mail id is karthikwali@gmail.com. I am
implementing arctan(x,y) for an FFT. The general formula is

             atan(x,y) = -i log((x + iy)/sqrt(x^2 + y^2)).

You can see how complex that is to implement in hardware. So
I thought of going with CORDIC, but it has a 13-cycle latency if I
pipeline it. And whenever I need to calculate it, there
will always be a latency of 13 cycles.

Thanks and Regards
Karthik W

On Oct 2, 2:35 am, "karthikw" <karthikw...@gmail.com> wrote:
...
> >     I am  implementing arctan(x) in hardware can any one suggest me any
> >of the algorithms other than cordic because it takes lots of iterations to
> >reach the final result.
>
...
> ... I am implementing an rectangular
> to polar cordinate conversion ie x + i y to sqrt(x^2 + y^2)arctan(y/x). So
> I need arctan(y/x) not arctan(x) because a divider costs a lot with
> respect to area. I intend a accuracy of 4 decimal place which is q15 fixed
> point format.If go to lookup table, I need a huge memory of 2^16.
> Interpolation technique is good but I think for hardware implementation
> its quite complex ,I some cases I may need a divider or a multiplier. I
> also tried looking into the variants (formulas) of arctan but all need a
> divider in some or the other way .So if I can have methood which costs
> less on area and with faster speed than cordic then it will be good.

i don't think you'll avoid division.  also you will have to break up
your (x,y) coordinate pair into 4 quadrants (maybe 5, if you want your
answer to always be the Principal Angle which has magnitude less than
pi) as is done in the atan2(x,y) function.  is this for FM
demodulation?  ultimately, is what you want the *difference* in angles
from the previous (complex) sample to the current:

     arg{ (x[n] + j*y[n]) }  -  arg{ x[n-1] + j*y[n-1] }   ?

is that what you want, in the end?  we've had discussions about this
previously on comp.dsp, but i do not know the links.

is your gmail address valid, and is it an account where you check your mail?
is that where i can email you?

r b-j
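
If the angle difference between consecutive samples is what is wanted, one common trick is to take the angle of the product of the current sample with the conjugate of the previous one, so only one arctangent is needed and the result is already wrapped. The sketch below is a software illustration only; the function name phase_step is hypothetical.

```python
import cmath
import math

def phase_step(prev, curr):
    """Angle of curr relative to prev, wrapped to the principal range."""
    # arg(curr) - arg(prev) equals arg(curr * conj(prev)) modulo 2*pi.
    return cmath.phase(curr * prev.conjugate())

prev = cmath.rect(1.0, 3.0)    # magnitude 1, angle  3.0 rad
curr = cmath.rect(2.0, -3.0)   # magnitude 2, angle -3.0 rad
step = phase_step(prev, curr)
print(step)                    # the raw difference of -6.0 rad wraps to 2*pi - 6.0
```

The complex multiply costs four real multiplications, so whether this helps in hardware depends on what is cheaper there: multipliers or a second arctangent plus unwrapping logic.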

>Hi,
>     I am  implementing arctan(x) in hardware can any one suggest me any
>of the algorithms other than cordic because it takes lots of iterations to
>reach the final result. 
>
>Thanks and Regards 
>Karthik W
>
>
Hi,
    Thanks for the response. But actually I am implementing a rectangular
to polar coordinate conversion, i.e. x + i y to (sqrt(x^2 + y^2), arctan(y/x)).
So I need arctan(y/x) computed directly from x and y, not arctan(x), because
a divider costs a lot in area. I am aiming for an accuracy of 4 decimal
places, which is Q15 fixed point format. If I go with a lookup table, I need
a huge memory of 2^16. Interpolation is a good technique, but I think it is
quite complex for a hardware implementation, and in some cases I may need a
divider or a multiplier. I also tried looking into the variants (formulas)
of arctan, but all of them need a divider in one way or another. So a method
that costs less area and is faster than CORDIC would be good.
 
Thanks and Regards 
Karthik W

robert bristow-johnson wrote:
> On Oct 1, 5:43 pm, Tim Wescott <t...@seemywebsite.com> wrote:
> ...
>> In which case a look-up table would work nice, except for the minor
>> problem of the infinite ordinate -- then one may want a lookup table
>> with ever-increasing intervals, which actually wouldn't be too bad to
>> implement.
> 
> so how would one determine the index into the table without some
> repeated search and compare operations?

For floating point, look at the mantissa and do it sorta 
logarithmically.  For fixed-point, count the number of zeros (preferably 
with fast, pipelined logic).
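
To make the fixed-point case concrete, here is a small sketch of a table whose cells double in width with every octave of the input, indexed from the position of the leading one plus a few bits below it. The formats are assumptions picked for illustration (FRAC_BITS, SUB_BITS, IN_BITS and all function names are hypothetical), and int.bit_length() stands in for a hardware leading-zero counter.

```python
import math

FRAC_BITS = 8    # assumed input format: x_q = round(x * 2**FRAC_BITS)
SUB_BITS = 4     # assumed number of bits kept below the leading one
IN_BITS = 16     # assumed input width

def lut_index(x_q):
    """Return (shift, sub) for a non-negative integer, with no search loop."""
    msb = x_q.bit_length()              # stand-in for a leading-zero counter
    if msb <= SUB_BITS + 1:
        return 0, x_q                   # small inputs are indexed one by one
    shift = msb - (SUB_BITS + 1)
    return shift, (x_q >> shift) & ((1 << SUB_BITS) - 1)

def build_table():
    """Fill each cell with atan evaluated at the cell's midpoint."""
    table = {(0, v): math.atan(v / 2.0 ** FRAC_BITS)
             for v in range(1 << (SUB_BITS + 1))}
    for shift in range(1, IN_BITS - SUB_BITS):
        for sub in range(1 << SUB_BITS):
            mid = ((1 << SUB_BITS) + sub + 0.5) * 2.0 ** shift
            table[(shift, sub)] = math.atan(mid / 2.0 ** FRAC_BITS)
    return table

TABLE = build_table()

def atan_lut(x_q):
    """Look up atan(x_q / 2**FRAC_BITS) for 0 <= x_q < 2**IN_BITS."""
    return TABLE[lut_index(x_q)]

print(len(TABLE), atan_lut(256), math.atan(1.0))
```

With these assumed settings the table has 208 entries for a 16-bit input and the lookup error stays under about 0.02 rad; each extra bit in SUB_BITS roughly halves the error and doubles the table.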
> 
> also, i think something like
> 
>     arctan(1/x) = pi/2 - arctan(x)
> 
> can be used for the nearly infinite ordinates, no?  (a division is
> required.)
> 
> dunno how this would work for hardware, but if a single division is
> tolerable and
> for -1 <= x <= 1, a very accurate approximation is:
> 
>     arctan(x) ~= x/f(x^2)
> 
> where
> 
>     f(u)  =     1.0
>              +  0.33288950512027 * u
>              + -0.08467922817644 * u^2
>              +  0.03252232640125 * u^3
>              + -0.00749305860992 * u^4
> 
> maybe too many terms, but it's not the finite power series that's bad,
> but the division by it that's costly in many different contexts.
> 
> r b-j
> 
> 
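
A quick numerical check of the approximation quoted above, as a sketch: the coefficients are copied from the post, the polynomial is evaluated with Horner's rule, and inputs outside [-1, 1] are folded back using a sign-aware form of the reciprocal identity. The names f, atan_rational and COEFFS are only for illustration, and the folding branch costs a second division here.

```python
import math

# Coefficients of f(u) from the post, lowest order first.
COEFFS = (1.0, 0.33288950512027, -0.08467922817644,
          0.03252232640125, -0.00749305860992)

def f(u):
    """Evaluate the quartic in u with Horner's rule (four multiply-adds)."""
    acc = 0.0
    for c in reversed(COEFFS):
        acc = acc * u + c
    return acc

def atan_rational(x):
    """arctan(x) ~= x / f(x^2) on [-1, 1], folded for |x| > 1."""
    if abs(x) <= 1.0:
        return x / f(x * x)
    r = 1.0 / x                         # extra division for the folding step
    # arctan(x) = sign(x) * pi/2 - arctan(1/x) for x != 0
    return math.copysign(math.pi / 2, x) - r / f(r * r)

worst = max(abs(atan_rational(k / 100.0) - math.atan(k / 100.0))
            for k in range(-5000, 5001))
print(worst)
```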


-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
