Reply by Steve Pope November 30, 2010
HardySpicer  <gyansorova@gmail.com> wrote:

>I don't know the application, but if it's software radio then you
>don't need atan at all for FM at least.

That would depend upon your choice of demodulator algorithm, it seems to me. Which is going to depend upon other factors, some of which might add up to "need".

Steve
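For what it's worth, one demodulator choice that does sidestep atan is the cross-product (quadrature) discriminator. The sketch below is only an illustration under my own assumptions: complex baseband samples, a roughly constant envelope, and phase steps small enough that sin(dphi) is close to dphi. The name `fm_discriminator` is made up for this example.

```python
import cmath
import math

def fm_discriminator(samples):
    """Approximate the phase step between consecutive complex samples.

    Uses the cross product I[n-1]*Q[n] - Q[n-1]*I[n], which equals
    |z[n-1]| * |z[n]| * sin(dphi). Dividing by |z[n-1]|^2 leaves sin(dphi)
    when the envelope is constant (an assumption of this sketch).
    """
    out = []
    for prev, cur in zip(samples, samples[1:]):
        cross = prev.real * cur.imag - prev.imag * cur.real
        power = prev.real * prev.real + prev.imag * prev.imag
        out.append(cross / power if power else 0.0)
    return out

# A tone with a constant phase step of 0.05 rad per sample.
tone = [cmath.exp(1j * 0.05 * n) for n in range(10)]
steps = fm_discriminator(tone)
```

Whether the remaining divide (or a limiter/AGC in front that makes it unnecessary) is cheaper than an atan lookup is exactly the kind of factor that decides the matter.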
Reply by HardySpicer November 30, 2010
On Nov 30, 8:47 pm, Alex Hornung <ahorn...@gmail.com> wrote:
> Hi,
>
> first off I'd like to apologize in case this question is/sounds
> extremely stupid, but I'm really stuck.
>
> I need an efficient atan2 that doesn't take up as much real estate on an
> FPGA as a CORDIC would. As such, I've been looking at the 'Trick'
> mentioned at dspguru[1].
>
> So according to that trick, given:
> x = 0.1838
> y = -0.1818
>
> I would now do the following, since it is in the IV quadrant:
> r = (x-y)/(x+y)
>
> But here is the problem: the result of that division is around 177. If I
> now continue on and find the angle by doing:
> theta = pi/4 - pi/4*r (or rather pi/4*r - pi/4)
>
> it will obviously be horribly wrong. What am I missing? I'm testing all
> this in Matlab, but I also tried using the fixed point toolbox, and the
> result with the same number of bits and fraction length is simply '1'.
> How is this supposed to work? Where am I wrong?
>
> Thank you in advance,
> Alex Hornung
>
> [1]:http://www.dspguru.com/dsp/tricks/fixed-point-atan2-with-self-normali...

I don't know the application, but if it's software radio then you don't need atan at all for FM at least.

Hardy
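On the quoted question itself: as I read the dspguru trick, the ratio is formed with |y| rather than y, and the sign of y is only applied to the final angle. That keeps r inside [-1, 1]; with the numbers above, r = (0.1838 - 0.1818)/(0.1838 + 0.1818), which is about 0.0055, not 177. Below is a minimal Python sketch of that reading, not a copy of the original code. The function name and the small epsilon guarding 0/0 are my own choices.

```python
import math

def atan2_self_normalizing(y, x):
    """Linear self-normalizing atan2 approximation (worst case roughly 0.07 rad)."""
    abs_y = abs(y) + 1e-10          # assumption: tiny offset so x = y = 0 does not divide by zero
    if x >= 0:
        # Right half plane: r runs from +1 (on the x axis) to -1 (on the y axis).
        r = (x - abs_y) / (x + abs_y)
        angle = math.pi / 4 - math.pi / 4 * r
    else:
        # Left half plane: same idea, offset by 3*pi/4.
        r = (x + abs_y) / (abs_y - x)
        angle = 3 * math.pi / 4 - math.pi / 4 * r
    # The sign of y is restored only at the very end.
    return -angle if y < 0 else angle

theta = atan2_self_normalizing(-0.1818, 0.1838)
```

For the posted x and y this lands within about a milliradian of the exact atan2, which is comfortably inside a 0.1 radian budget.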
Reply by Rob Gaddi November 30, 2010
On 11/30/2010 10:13 AM, Alex Hornung wrote:
> On 30/11/2010 17:51, Rob Gaddi wrote:
>> On 11/30/2010 7:42 AM, Steve Pope wrote:
>>> Alex Hornung<ahornung@gmail.com> wrote:
>>>
>>>> On 30/11/2010 15:18, Steve Pope wrote:
>>>
>>>>> Alex Hornung<ahornung@gmail.com> wrote:
>>>
>>>>>> I would of course welcome any solution that would allow me to make this
>>>>>> even simpler (for example by removing the divider somehow). Considering
>>>>>> that I don't require much accuracy, there might be even more efficient
>>>>>> solutions that I don't know anything about.
>>>
>>>>> Have you considered a one-octant LUT plus mirroring? How much accuracy
>>>>> do you need? These things are usually small.
>>>
>>>> No, and I have no idea on how that would work, to get everything from
>>>> that one octant. Remember I'm really new to all of this :) Do you happen
>>>> to have any paper/website/etc about it?
>>>
>>>> The accuracy I need is somewhere around 0.1 to 0.2 radians. Maybe I
>>>> could even use some lookup for all the values with this kind of
>>>> accuracy?
>>>
>>> Yes, probably.
>>>
>>> Suppose the input to your four quadrant arctan is two 5-bit signed values
>>> representing a complex number. That's 1024 possible arctan values,
>>> which is a pretty large lookup table, but by manipulating these so
>>> that the input of the table lies always in the first octant, you are
>>> now down to 9 * 9 = 81 values. I think you will find the accuracy
>>> is better than 0.1 radians using such an approach.
>>>
>>> I know of no paper, you just have to design it and try it out.
>>>
>>> Steve
>>
>> It's early in the morning, and I grant I'm under-caffeinated, but a 1024
>> element lookup table just doesn't strike me as a deal breaker. A single
>> Xilinx BRAM or two Altera M9Ks gives you an 18-bit output for that
>> 10-bit input, assuming you don't do any octant folding. If you weren't
>> using that RAM for anything yet then that's absolutely free, no fabric
>> required, and gives you a single cycle ATAN function.
>>
>> Folding costs you a few cycles (though it can be pipelined if
>> throughput's an issue) and a small amount of fabric for a pretty hefty
>> resolution improvement, or you can just brute force it by throwing more
>> RAMs at the problem.
>>
>> Never underestimate the ability of a (pseudo-)ROM to implement arbitrary
>> functions.
>
> That actually sounds pretty perfect. Didn't even think of using the BRAM
> blocks for this, and I can definitely spare 1 out of 192. I'll implement
> it with the Xilinx Block Memory Generator, but I was wondering, just out
> of curiosity, if there is some way of using them directly from VHDL?
>
> Regards,
> Alex
You're looking to implement a single-port, synchronous read ROM. There's a code template in some piece of documentation (xst.pdf I think) that discusses the VHDL you have to write in order to implement one of them. Closely following their example code will yield the best results, and you'll want to look through the XST output log in order to make sure that it did in fact infer a ROM. You'll know if it didn't; synthesis will take forever as it tries to build logic trees out of it instead.

You can either declare all the table values inline in the VHDL or use std.textio to read them in from an external file. I prefer the external file from an aesthetic standpoint, but it does make things a bit trickier.

--
Rob Gaddi, Highland Technology
Email address is currently out of order
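If you go the external-file route, the table contents have to come from somewhere. Here is a rough Python sketch of how they might be generated. Everything about the layout is an assumption for illustration: the address is the 5-bit two's complement x concatenated with the 5-bit y, the data word is an 18-bit two's complement angle with pi mapped to 2^17, and the text format is one hex word per line. The VHDL textio reader would have to match whatever format you actually pick.

```python
import math

AXIS_BITS = 5    # two 5-bit signed inputs -> 10-bit address (assumed layout: x high, y low)
DATA_BITS = 18   # word width of the ROM output

def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's complement number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def build_atan2_rom():
    """Return 1024 angle words, indexed by the concatenated {x, y} address."""
    scale = 1 << (DATA_BITS - 1)          # pi maps to 2**17
    mask = (1 << DATA_BITS) - 1
    rom = []
    for addr in range(1 << (2 * AXIS_BITS)):
        x = to_signed(addr >> AXIS_BITS, AXIS_BITS)
        y = to_signed(addr & ((1 << AXIS_BITS) - 1), AXIS_BITS)
        angle = round(math.atan2(y, x) / math.pi * scale)
        rom.append(angle & mask)          # masking wraps +pi onto -pi, the same angle
    return rom

def rom_to_text(rom):
    """One 5-digit hex word per line (hypothetical file format)."""
    return "\n".join(f"{word:05X}" for word in rom)

rom = build_atan2_rom()
```

Writing `rom_to_text(rom)` to a file gives something a textio loop can read line by line during elaboration.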
Reply by Alex Hornung November 30, 2010
On 30/11/2010 17:51, Rob Gaddi wrote:
> On 11/30/2010 7:42 AM, Steve Pope wrote:
>> Alex Hornung<ahornung@gmail.com> wrote:
>>
>>> On 30/11/2010 15:18, Steve Pope wrote:
>>
>>>> Alex Hornung<ahornung@gmail.com> wrote:
>>
>>>>> I would of course welcome any solution that would allow me to make this
>>>>> even simpler (for example by removing the divider somehow). Considering
>>>>> that I don't require much accuracy, there might be even more efficient
>>>>> solutions that I don't know anything about.
>>
>>>> Have you considered a one-octant LUT plus mirroring? How much accuracy
>>>> do you need? These things are usually small.
>>
>>> No, and I have no idea on how that would work, to get everything from
>>> that one octant. Remember I'm really new to all of this :) Do you happen
>>> to have any paper/website/etc about it?
>>
>>> The accuracy I need is somewhere around 0.1 to 0.2 radians. Maybe I
>>> could even use some lookup for all the values with this kind of
>>> accuracy?
>>
>> Yes, probably.
>>
>> Suppose the input to your four quadrant arctan is two 5-bit signed values
>> representing a complex number. That's 1024 possible arctan values,
>> which is a pretty large lookup table, but by manipulating these so
>> that the input of the table lies always in the first octant, you are
>> now down to 9 * 9 = 81 values. I think you will find the accuracy
>> is better than 0.1 radians using such an approach.
>>
>> I know of no paper, you just have to design it and try it out.
>>
>> Steve
>
> It's early in the morning, and I grant I'm under-caffeinated, but a 1024
> element lookup table just doesn't strike me as a deal breaker. A single
> Xilinx BRAM or two Altera M9Ks gives you an 18-bit output for that
> 10-bit input, assuming you don't do any octant folding. If you weren't
> using that RAM for anything yet then that's absolutely free, no fabric
> required, and gives you a single cycle ATAN function.
>
> Folding costs you a few cycles (though it can be pipelined if
> throughput's an issue) and a small amount of fabric for a pretty hefty
> resolution improvement, or you can just brute force it by throwing more
> RAMs at the problem.
>
> Never underestimate the ability of a (pseudo-)ROM to implement arbitrary
> functions.
That actually sounds pretty perfect. Didn't even think of using the BRAM blocks for this, and I can definitely spare 1 out of 192. I'll implement it with the Xilinx Block Memory Generator, but I was wondering, just out of curiosity, if there is some way of using them directly from VHDL?

Regards,
Alex
Reply by Rob Gaddi November 30, 2010
On 11/30/2010 7:42 AM, Steve Pope wrote:
> Alex Hornung<ahornung@gmail.com> wrote:
>
>> On 30/11/2010 15:18, Steve Pope wrote:
>
>>> Alex Hornung<ahornung@gmail.com> wrote:
>
>>>> I would of course welcome any solution that would allow me to make this
>>>> even simpler (for example by removing the divider somehow). Considering
>>>> that I don't require much accuracy, there might be even more efficient
>>>> solutions that I don't know anything about.
>
>>> Have you considered a one-octant LUT plus mirroring? How much accuracy
>>> do you need? These things are usually small.
>
>> No, and I have no idea on how that would work, to get everything from
>> that one octant. Remember I'm really new to all of this :) Do you happen
>> to have any paper/website/etc about it?
>
>> The accuracy I need is somewhere around 0.1 to 0.2 radians. Maybe I
>> could even use some lookup for all the values with this kind of
>> accuracy?
>
> Yes, probably.
>
> Suppose the input to your four quadrant arctan is two 5-bit signed values
> representing a complex number. That's 1024 possible arctan values,
> which is a pretty large lookup table, but by manipulating these so
> that the input of the table lies always in the first octant, you are
> now down to 9 * 9 = 81 values. I think you will find the accuracy
> is better than 0.1 radians using such an approach.
>
> I know of no paper, you just have to design it and try it out.
>
> Steve
It's early in the morning, and I grant I'm under-caffeinated, but a 1024 element lookup table just doesn't strike me as a deal breaker. A single Xilinx BRAM or two Altera M9Ks gives you an 18-bit output for that 10-bit input, assuming you don't do any octant folding. If you weren't using that RAM for anything yet then that's absolutely free, no fabric required, and gives you a single cycle ATAN function.

Folding costs you a few cycles (though it can be pipelined if throughput's an issue) and a small amount of fabric for a pretty hefty resolution improvement, or you can just brute force it by throwing more RAMs at the problem.

Never underestimate the ability of a (pseudo-)ROM to implement arbitrary functions.

--
Rob Gaddi, Highland Technology
Email address is currently out of order
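For anyone wondering what the octant folding looks like in practice, here is a small Python model of the idea, offered as a sketch rather than a reference design. It assumes input magnitudes of 0 to 8, which is what yields the 9 * 9 table mentioned in the quote. The names `OCTANT_LUT` and `atan2_folded` are invented here, and a real ROM would hold quantized angles instead of floats.

```python
import math

MAX_MAG = 8  # magnitudes 0..8 give 9 * 9 = 81 table entries

# First-octant table: OCTANT_LUT[hi][lo] = atan(lo / hi); only lo <= hi is ever addressed.
OCTANT_LUT = [[math.atan2(lo, hi) for lo in range(MAX_MAG + 1)]
              for hi in range(MAX_MAG + 1)]

def atan2_folded(y, x):
    """Four-quadrant arctangent from a first-octant lookup plus mirroring."""
    ax, ay = abs(x), abs(y)
    swapped = ay > ax
    hi, lo = (ay, ax) if swapped else (ax, ay)
    angle = OCTANT_LUT[hi][lo]          # 0 .. pi/4
    if swapped:
        angle = math.pi / 2 - angle     # mirror about the 45 degree line
    if x < 0:
        angle = math.pi - angle         # mirror about the y axis
    if y < 0:
        angle = -angle                  # mirror about the x axis
    return angle
```

The three mirroring steps are the "small amount of fabric": two absolute values, one compare-and-swap, and a few conditional subtractions on the table output.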
Reply by Steve Pope November 30, 2010
Alex Hornung  <ahornung@gmail.com> wrote:

>On 30/11/2010 17:23, Steve Pope wrote:
>> If the incoming data comprises too many fixed point bits, then he
>> has to massage it, either by normalizing (generally involving shifting)
>> or by dividing. I went on to give an example where there is 10 bits
>> total of incoming data. If there are too many more than this, then
>> yes something will have to be done in this regard. Usually it would
>> not be a divide though.

>Actually I'm seeing that as little as 2 bits of precision are enough. So
>erring a bit on the safe side, 3 bits, would give me a 2^6 lookup table
>(64 entries) which should be very reasonable?
Well, I came up with 81 entries (9 * 9, rather than 8 * 8) but that assumes it's convenient to include both end-points of the octant. Due to the necessary mirroring that can be convenient.
>Regarding normalization; I basically read 14-bit complex samples off an
>ADC (14 bit real, 14 bit imag), so I was thinking I can just shift them
>so all the non-zero bits are behind the decimal point, which should be
>normalized (?).

>Am I making some wrong assumption here? Is there anything wrong with
>just normalizing by shifting the hell out of it?

No, that's fine. You need to count leading zeros and ones, not just leading zeros, if the data is two's complement. That's about it.

It's pretty straightforward that for a given number of bits in the result, normalizing by a factor of two is no more than one bit worse than doing a full divide.

Steve
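To make the leading-zeros-and-ones point concrete, here is a rough Python model of the shift normalization. It assumes 14-bit two's complement inputs as in the question and applies the same shift to both components so their ratio (and hence the angle) is preserved. The function names and the `keep_bits` parameter are my own.

```python
WIDTH = 14  # ADC word length taken from the question

def redundant_sign_bits(value, width=WIDTH):
    """Count the leading bits that merely repeat the sign bit."""
    # Inverting a negative value turns its leading ones into leading zeros,
    # so both signs can be handled with the same bit-length measurement.
    magnitude = ~value if value < 0 else value
    return (width - 1) - magnitude.bit_length()

def normalize_pair(x, y, keep_bits, width=WIDTH):
    """Shift x and y left by a common amount, then keep the top keep_bits bits."""
    shift = min(redundant_sign_bits(x, width), redundant_sign_bits(y, width))
    drop = width - keep_bits
    # Python's >> on ints is an arithmetic shift, so the sign is kept.
    return (x << shift) >> drop, (y << shift) >> drop

pair = normalize_pair(100, -37, keep_bits=5)
```

The truncated pair can then be used as the lookup address. The common shift is limited by whichever component is larger, so the smaller one may still carry redundant sign bits, which is fine since only the ratio matters.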
Reply by Alex Hornung November 30, 2010
On 30/11/2010 17:23, Steve Pope wrote:
> Tim Wescott<tim@seemywebsite.com> wrote:
>
>> On 11/30/2010 07:18 AM, Steve Pope wrote:
>
>>> Have you considered a one-octant LUT plus mirroring? How much accuracy
>>> do you need? These things are usually small.
>
>> Unless the incoming data is normalized he'd still have to do the divide.
>
> Well, that's a little sweeping.
>
> If the incoming data comprises too many fixed point bits, then he
> has to massage it, either by normalizing (generally involving shifting)
> or by dividing. I went on to give an example where there is 10 bits
> total of incoming data. If there are too many more than this, then
> yes something will have to be done in this regard. Usually it would
> not be a divide though.
>
> Steve
Actually I'm seeing that as little as 2 bits of precision are enough. So erring a bit on the safe side, 3 bits, would give me a 2^6 lookup table (64 entries) which should be very reasonable?

Regarding normalization; I basically read 14-bit complex samples off an ADC (14 bit real, 14 bit imag), so I was thinking I can just shift them so all the non-zero bits are behind the decimal point, which should be normalized (?).

Am I making some wrong assumption here? Is there anything wrong with just normalizing by shifting the hell out of it?

Regards,
Alex
Reply by Vladimir Vassilevsky November 30, 2010

Alex Hornung wrote:


> The accuracy I need is somewhere around 0.1 to 0.2 radians. Maybe I
> could even use some lookup for all the values with this kind of
> accuracy? After all it would just be around 60 possibilities for the
> whole circle.
If the accuracy is as coarse as 0.1 radian, then atan2(x,y) ~ x/y within an octant of the circle. Fold the angle to a proper octant and set the signs accordingly. Normalize x and y before division, so you can get by with some small additions and corrections without any division at all.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
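How good "angle ~ ratio" is depends on which octant you fold into. The quick Python check below is just a numerical illustration and not part of the post. It compares the worst-case error of atan(r) ~ r over a 0 to 45 degree octant with an octant centered on the axis (plus or minus 22.5 degrees). The helper name is made up.

```python
import math

def max_error_of_ratio(r_max, steps=1000):
    """Worst |atan(r) - r| for r sampled uniformly on [0, r_max]."""
    return max(abs(math.atan(r) - r)
               for r in (r_max * k / steps for k in range(steps + 1)))

# Octant spanning 0..45 degrees: the ratio runs from 0 to 1.
err_first_octant = max_error_of_ratio(1.0)
# Octant centered on the axis: the ratio stays below tan(22.5 degrees).
err_centered = max_error_of_ratio(math.tan(math.pi / 8))
```

The first figure works out to 1 - pi/4, a bit over 0.2 rad, so over a full 0 to 45 degree octant the bare ratio needs the "corrections" to get under 0.1 rad. With axis-centered octants the error stays around 0.02 rad.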
Reply by Steve Pope November 30, 2010
Tim Wescott  <tim@seemywebsite.com> wrote:

>On 11/30/2010 07:18 AM, Steve Pope wrote:
>> Have you considered a one-octant LUT plus mirroring? How much accuracy
>> do you need? These things are usually small.

>Unless the incoming data is normalized he'd still have to do the divide.

Well, that's a little sweeping.

If the incoming data comprises too many fixed point bits, then he has to massage it, either by normalizing (generally involving shifting) or by dividing. I went on to give an example where there is 10 bits total of incoming data. If there are too many more than this, then yes something will have to be done in this regard. Usually it would not be a divide though.

Steve
Reply by robert bristow-johnson November 30, 2010
On Nov 30, 10:30 am, Alex Hornung <ahorn...@gmail.com> wrote:
>
> The accuracy I need is somewhere around 0.1 to 0.2 radians.
that's not so strict. an error of 5 degrees is okay?
> Maybe I could even use some lookup for all the values with this kind of
> accuracy? After all it would just be around 60 possibilities for the
> whole circle.
if it's atan2(x,y) you have 2 independent variables for your LUT. you might want to do this with a finite power series (it wouldn't have to be very high order for just 5 degrees). in fact, maybe that approximation in Rick's 2nd edition would be good enough.

r b-j
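The post does not say which approximation is meant, so take the following as an assumption on my part. One widely quoted low-order formula is atan(r) ~ r / (1 + 0.28125 r^2) for |r| <= 1, whose worst-case error is a little under 0.005 rad. Written out for atan2 it needs one divide and no table. This Python sketch (with an invented function name) shows that form.

```python
import math

def atan2_rational(y, x):
    """atan2 via r / (1 + 0.28125 * r**2), applied to whichever ratio is <= 1."""
    if x == 0 and y == 0:
        return 0.0                      # arbitrary choice for the undefined case
    if abs(y) <= abs(x):
        # Near the x axis: r = y / x, folded into a single division.
        angle = x * y / (x * x + 0.28125 * y * y)
        if x < 0:
            angle += math.pi if y >= 0 else -math.pi
    else:
        # Near the y axis: use r = x / y and measure from +/- pi/2.
        angle = math.copysign(math.pi / 2, y) - x * y / (y * y + 0.28125 * x * x)
    return angle
```

The constant 0.28125 is 9/32, so the multiply can be done with shifts and adds. That error is far below the 0.1 rad asked for, so an even cruder fit would also do.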