Forums

Sine Lookup Table with Linear Interpolation

Started by rickman March 16, 2013
On 3/18/2013 2:10 PM, Jerry Avins wrote:
> On 3/17/2013 6:35 PM, rickman wrote: >> On 3/17/2013 6:23 PM, langwadt@fonz.dk wrote: >>> On Mar 17, 10:56 pm, rickman<gnu...@gmail.com> wrote: >>>> On 3/16/2013 8:02 PM, langw...@fonz.dk wrote: >>>> >>>>> why not skip the lut and just do a full sine approximation >>>> >>>>> something like this, page 53-54 >>>> >>>>> http://www.emt.uni-linz.ac.at/education/Inhalte/pr_mderf/downloads/AD... >>>>> >>>>> >>>> >>>> I'm not sure this would be easier. The LUT an interpolation require a >>>> reasonable amount of logic. This requires raising X to the 5th power >>>> and five constant multiplies. My FPGA doesn't have multipliers and this >>>> may be too much for the poor little chip. I suppose I could do some of >>>> this with a LUT and linear interpolation... lol >>>> >>> >>> I see, I assumed you had an MCU (or fpga) with multipliers >>> >>> considered CORDIC? >> >> I should learn that. I know it has been used to good effect in FPGAs. >> Right now I am happy with the LUT, but I would like to learn more about >> CORDIC. Any useful referrences? > > http://www.andraka.com/files/crdcsrvy.pdf should be a good start. Do you > remember Ray Andraka?
Yes, of course. I should have thought of that. I don't go to his site often, but his stuff is always excellent. I may have even downloaded this paper before. I took a look at the CORDIC once a few years ago and didn't get too far. I'm happy with my current approach for the moment, but I will dig into this more another time. I appreciate the link. Thanks Jerry. -- Rick
rickman wrote:

> On 3/18/2013 1:44 AM, Robert Baer wrote: > >> glen herrmannsfeldt wrote: >> >>> In comp.dsp rickman<gnuarm@gmail.com> wrote: >>> >>>> I've been studying an approach to implementing a lookup table (LUT) to >>>> implement a sine function. The two msbs of the phase define the >>>> quadrant. I have decided that an 8 bit address for a single quadrant is >>>> sufficient with an 18 bit output. Another 11 bits of phase will give me >>>> sufficient resolution to interpolate the sin() to 18 bits. >>> >>> >>>> If you assume a straight line between the two endpoints the midpoint of >>>> each interpolated segment will have an error of >>>> ((Sin(high)-sin(low))/2)-sin(mid) >>> >>> >>> See: >>> >>> http://ia601506.us.archive.org/8/items/bitsavers_nationaldaMOSIntegratedCircuits_20716690/1972_National_MOS_Integrated_Circuits.pdf >>> >>> >>> >>> The description starts on page 273. >>> >>> -- glen >> >> Problem with that..could see only front and back cover. >> Different source perhaps? > > > It is a lot of document for just two pages. Try this link to see the > short version... > > arius.com/foobar/1972_National_MOS_Integrated_Circuits_Pages_273-274.pdf >
isn't it "fubar" ? Jamie
On 3/17/2013 6:02 PM, glen herrmannsfeldt wrote:
> In comp.dsp rickman<gnuarm@gmail.com> wrote: >> I've been studying an approach to implementing a lookup table (LUT) to >> implement a sine function. > > (snip) > >> If you assume a straight line between the two endpoints the midpoint of >> each interpolated segment will have an error of >> ((Sin(high)-sin(low))/2)-sin(mid) > > Seems what they instead do is implement > > sin(M)cos(L)+cos(M)sin(L) where M and L are the more and less > significant bits of theta. Also, cos(L) tends to be almost 1, > so they just say it is 1.
I did a little work with this tonight and guess what, this is nearly the same as the linear interpolation. cos(M) is as you say, nearly one. So close in fact, that using the 10 msbs for M and an 18 bit output word, there value calculated for the 1023rd entry is 0x40000. That leaves sin(L). Surprise! This turns out to be a straight line! So the formula is FAPP the same as the linear interpolation with the added fudge factor of multiplying sin(L) by cos(M) rather than just multiplying L directly with the difference sin(M+1)-sin(M). The sin*cos method does require a second table to be stored. In fact, the paper you reference uses a bit of slight of hand to include both the cos(M) * sin(L) at a lesser precision both in the input and output of this table. I'm not sure how well this would work with higher precision needs, but I suppose it is worth a bit further look. It would be nice to eliminate the multiplier which is used in the linear interpolation. So I think this is just different strokes for different folks in the end. The sin*cos method might work out a little better with the RMS noise depending on how the second table is handled. But I think it may end up... well, in the noise anyway. I think the real problem is that as the resolution of the input increases, the second table gets harder to keep small. But their trick of essentially tossing the middle bits seems to work fairly well.
>> The other issue is how to calculate the values for the table to give the >> best advantage to the linear interpolation. Rather than using the exact >> match to the end points stored in the table, an adjustment could be done >> to minimize the deviation over each interpolated segment. Without this, >> the errors are always in the same direction. With an adjustment the >> errors become bipolar and so will reduce the magnitude by half (approx). >> Is this commonly done? It will require a bit of computation to get >> these values, but even a rough approximation should improve the max >> error by a factor of two to around 2-3 ppm. > > Seems like that is one of the suggestions, but not done in the ROMs > they were selling. Then the interpolation has to add or subtract, > which is slightly (in TTL) harder.
Yes, I see this note. This seems to be exactly what I was thinking. But I don't think the data has to be subtracted. The unadjusted error is zero at the end points of each interpolated line and consistently increases in toward the middle from each end point. Adjusting the data in the table results in an error that instead goes positive and negative, but the table data is still always added, although they don't deal with the issues of implementing more than 90 degrees. If you go around the clock you then need to consider sign bits (as opposed to sine bits) on the inputs to the tables.
> The interpolated sine was done in 1970 with 128x8 ROMs. With larger > ROMs, like usual today, you shouldn't need it unless you want really > high resolution.
I am working in a small FPGA with rather limited block RAM capabilities. It is already on the board, so I can't swap it out. I also need two generator circuits which pushes the limits a bit more. At least I don't have any shortage of adder/subtractors or all the other logic this will need. Thanks for the pointer and the advice. -- Rick
On 3/18/2013 10:26 AM, John Larkin wrote:
> On Mon, 18 Mar 2013 10:12:45 -0400, Phil Hobbs > <pcdhSpamMeSenseless@electrooptical.net> wrote: > >> On 03/17/2013 08:27 PM, John Larkin wrote: >>> On Sun, 17 Mar 2013 19:31:38 -0400, Phil Hobbs >>> <pcdhSpamMeSenseless@electrooptical.net> wrote: >>> >>>> On 3/17/2013 6:44 PM, rickman wrote: >>>>> On 3/17/2013 6:20 PM, John Larkin wrote: >>>>>> >>>>>> It would be interesting to distort the table entries to minimize THD. The >>>>>> problem there is how to measure very low THD to close the loop. >>>>> >>>>> It could also be a difficult problem to solve without some sort of >>>>> exhaustive search. Each point you fudge affects two curve segments and >>>>> each curve segment is affected by two points. So there is some degree >>>>> of interaction. >>>>> >>>>> Anyone know if there is a method to solve such a problem easily? I am >>>>> sure that my current approach gets each segment end point to within &#2013266097;1 >>>>> lsb and I suppose once you measure THD in some way it would not be an >>>>> excessive amount of work to tune the 256 end points in a few passes >>>>> through. This sounds like a tree search with pruning. >>>>> >>>>> I'm assuming there would be no practical closed form solution. But >>>>> couldn't the THD be calculated in a simulation rather than having to be >>>>> measured on the bench? >>>>> >>>> >>>> The simple method would be to compute a zillion-point FFT. Since the >>>> function is really truly periodic, the FFT gives you the right answer >>>> for the continuous-time Fourier transform, provided you have enough >>>> points that you can see all the relevant harmonics. >>>> >>> >>> All you need now is a digitizer or spectrum analyzer that has PPM distortion! >>> >>> We've bumped into this problem, trying to characterize some of our ARBs. The >>> best thing is to use a passive lowpass or notch filter (with very good Ls and >>> Cs!) to remove the fundamental and analyze what's left. >>> >>> >> >> I have an HP 339A you can borrow. ;) >> >> Cheers >> >> Phil Hobbs > > That's for audio. Who cares about distortion for audio? And it gets nowhere near > PPM turf. > >
You're no fun anymore. :-) It does have a 0.01% FS range, so you'd expect to get a lot better than -80 dB. But for fast stuff, I'd probably do the same as you--filter the daylights out of it and amplify the residuals. Cheers Phil Hobbs -- Dr Philip C D Hobbs Principal Consultant ElectroOptical Innovations LLC Optics, Electro-optics, Photonics, Analog Electronics 160 North State Road #203 Briarcliff Manor NY 10510 USA +1 845 480 2058 hobbs at electrooptical dot net http://electrooptical.net
In comp.dsp rickman <gnuarm@gmail.com> wrote:
> On 3/18/2013 2:10 PM, Jerry Avins wrote:
(snip, ending on CORDIC)
>> http://www.andraka.com/files/crdcsrvy.pdf should be a good start. Do you >> remember Ray Andraka?
> Yes, of course. I should have thought of that. I don't go to his site > often, but his stuff is always excellent. I may have even downloaded > this paper before. I took a look at the CORDIC once a few years ago and > didn't get too far. I'm happy with my current approach for the moment, > but I will dig into this more another time.
I once thought I pretty well understood binary CORDIC, but it is also used in decimal on many pocket calculators. I never got far enough to figure out how they did that.
> I appreciate the link. Thanks Jerry.
-- glen
In comp.dsp rickman <gnuarm@gmail.com> wrote:

(snip, I wrote)
>> Seems what they instead do is implement
>> sin(M)cos(L)+cos(M)sin(L) where M and L are the more and less >> significant bits of theta. Also, cos(L) tends to be almost 1, >> so they just say it is 1.
> I did a little work with this tonight and guess what, this is nearly the > same as the linear interpolation. cos(M) is as you say, nearly one. So > close in fact, that using the 10 msbs for M and an 18 bit output word, > there value calculated for the 1023rd entry is 0x40000.
> That leaves sin(L). Surprise! This turns out to be a straight line! So > the formula is FAPP the same as the linear interpolation with the added > fudge factor of multiplying sin(L) by cos(M) rather than just > multiplying L directly with the difference sin(M+1)-sin(M). The sin*cos > method does require a second table to be stored. In fact, the paper you > reference uses a bit of slight of hand to include both the cos(M) * > sin(L) at a lesser precision both in the input and output of this table. > I'm not sure how well this would work with higher precision needs, but > I suppose it is worth a bit further look. It would be nice to eliminate > the multiplier which is used in the linear interpolation.
It has been a while since I used a table with actual interpoltation, but as I remember it they give you the spacing somewhere on the table. Now, you could have one table to look up the spacing and another to calculate the interpolation value (that is, a multiply table). So, they divide the problem up into 16 different slopes, but the slopes are equally distributed in theta.
> So I think this is just different strokes for different folks in the > end. The sin*cos method might work out a little better with the RMS > noise depending on how the second table is handled. But I think it may > end up... well, in the noise anyway.
Well, it is also convenient for the size of ROMs they had available. (Or the other way around. A convenient use for the size of ROMs they had to sell.) The tradeoffs might be different for more, smaller ROMs. I believe the manual I had (and didn't find) has the actual ROM tables listed. At least I thought it did. It could also be pipelined, especially with more levels of ROM.
> I think the real problem is that as the resolution of the input > increases, the second table gets harder to keep small. But their trick > of essentially tossing the middle bits seems to work fairly well.
(snip, I wrote)
>> Seems like that is one of the suggestions, but not done in the ROMs >> they were selling. Then the interpolation has to add or subtract, >> which is slightly (in TTL) harder.
> Yes, I see this note. This seems to be exactly what I was thinking. But > I don't think the data has to be subtracted. The unadjusted error is > zero at the end points of each interpolated line and consistently > increases in toward the middle from each end point. Adjusting the data > in the table results in an error that instead goes positive and > negative, but the table data is still always added, although they don't > deal with the issues of implementing more than 90 degrees. If you go > around the clock you then need to consider sign bits (as opposed to sine > bits) on the inputs to the tables.
Oh, I think I see what you mean. But it might have bigger slope discontinuities that way.
>> The interpolated sine was done in 1970 with 128x8 ROMs. With larger >> ROMs, like usual today, you shouldn't need it unless you want really >> high resolution.
> I am working in a small FPGA with rather limited block RAM capabilities. > It is already on the board, so I can't swap it out. I also need two > generator circuits which pushes the limits a bit more. At least I don't > have any shortage of adder/subtractors or all the other logic this will > need.
If you have more, smaller ROMs I think the idea can be repeated. That is, have High, Mid, and Low bits, and two levels of interpolation.
> Thanks for the pointer and the advice.
-- glen
On 3/17/2013 10:14 PM, robert bristow-johnson wrote:
> On 3/17/13 5:49 PM, rickman wrote: >> On 3/16/2013 7:13 PM, robert bristow-johnson wrote: >>> >>> this *should* be a relatively simple issue, but i am confused >>> >>> On 3/16/13 6:30 PM, rickman wrote: >>>> I've been studying an approach to implementing a lookup table (LUT) to >>>> implement a sine function. The two msbs of the phase define the >>>> quadrant. I have decided that an 8 bit address for a single quadrant is >>>> sufficient with an 18 bit output. >>> >>> 10 bits or 1024 points. since you're doing linear interpolation, add one >>> more, copy the zeroth point x[0] to the last x[1024] so you don't have >>> to do any modulo (by ANDing with (1023-1) on the address of the second >>> point. (probably not necessary for hardware implementation.) >>> >>> >>> x[n] = sin( (pi/512)*n ) for 0 <= n <= 1024 >> >> So you are suggesting a table with 2^n+1 entries? Not such a great idea >> in some apps, like hardware. > > i think about hardware as a technology, not so much an app. i think that > apps can have either hardware or software realizations.
Ok, semantics. What word should I use instead of "app"?
>> What is the advantage? > > in *software* when using a LUT *and* linear interpolation for sinusoid > generation, you are interpolating between x[n] and x[n+1] where n is > from the upper bits of the word that comes outa the phase accumulator: > > #define PHASE_BITS 10 > #define PHASE_MASK 0x001FFFFF // 2^21 - 1 > #define FRAC_BITS 11 > #define FRAC_MASK 0x000007FF // 2^11 - 1 > #define ROUNDING_OFFSET 0x000400 > > long phase, phase_increment, int_part, frac_part; > > long y, x[1025]; > > ... > > phase = phase + phase_increment; > phase = phase & PHASE_MASK; > > int_part = phase >> FRAC_BITS; > frac_part = phase & FRAC_MASK; > > y = x[int_part]; > y = y + (((x[int_part+1]-y)*frac_part + ROUNDING_OFFSET)>>FRAC_BITS); > > > now if the lookup table is 1025 long with the last point a copy of the > first (that is x[1024] = x[0]), then you need not mask int_part+1. > x[int_part+1] always exists. > > >> Also, why 10 bit >> address for a 1024 element table? > > uh, because 2^10 = 1024? > > that came from your numbers: 8-bit address for a single quadrant and > there are 4 quadrants, 2 bits for the quadrant.
No, that didn't come from my numbers, well, not exactly. You took my numbers and turned it into a full 360 degree table. I was asking where you got 10 bits, not why a 1024 table needs 10 bits. Actually, in your case it is a 1025 table needing 11 bits.
>>> so a 21 bit total index. your frequency resolution would be 2^(-21) in >>> cycles per sampling period or 2^(-21) * Fs. those 2 million values would >>> be the only frequencies you can meaningful >> >> No, that is the phase sent to the LUT. The total phase accumulator can >> be larger as the need requires. > > why wouldn't you use those extra bits in the linear interpolation if > they are part of the phase word? not doing so makes no sense to me at > all. if you have more bits of precision in the fractional part of the phase
Because that is extra logic and extra work. Not only that, it is beyond the capabilities of the DAC I am using. As it is, the linear interp will generate bits that extend below what is implied by the resolution of the inputs. I may use them or I may lop them off.
>>>> Without considering rounding, this reaches a maximum at the last >>>> segment >>>> before the 90 degree point. >>> >>> at both the 90 and 270 degree points. (or just before and after those >>> points.) >> >> I'm talking about the LUT. The LUT only considers the first quadrant. > > not in software. i realize that in hardware and you're worried about > real estate, you might want only one quadrant in ROM, but in software > you would have an entire cycle of the waveform (plus one repeated point > at the end for the linear interpolation). > > and if you *don't* have the size constraint in your hardware target and > you can implement a 1024 point LUT as easily as a 256 point LUT, then > why not? so you don't have to fiddle with reflecting the single quadrant > around a couple of different ways.
OK, thanks for the insight.
>>> do you mean biasing by 1/2 of a point? then your max error will be *at* >>> the 90 and 270 degree points and it will be slightly more than what you >>> had before. >> >> No, not quite right. There is a LUT with points spaced at 90/255 degrees >> apart starting at just above 0 degrees. The values between points in the >> table are interpolated with a maximum deviation near the center of the >> interpolation. Next to 90 degrees the interpolation is using the maximum >> interpolation factor which will result in a value as close as you can >> get to the correct value if the end points are used to construct the >> interpolation line. 90 degrees itself won't actually be represented, but >> rather points on either side, 90&#2013266097;delta where delta is 360&#2013266096; / 2^(n+1) >> with n being the number of bits in the input to the sin function. >> > > i read this over a couple times and cannot grok it quite. need to be > more explicit (mathematically, or with C or pseudocode) with what your > operations are.
It is simple, the LUT gives values for each point in the table, precise to the resolution of the output. The sin function between these points varies from close to linear to close to quadratic. If you just interpolate between the "exact" points, you have an error that is *always* negative anywhere except the end points of the segments. So it is better to bias the end points to center the linear approximation somewhere in the middle of the real curve. Of course, this has to be tempered by the resolution of your LUT output. If you really want to see my work it is all in a spreadsheet at the moment. I can send it to you or I can copy some of the formulas here. My "optimization" is not nearly optimal. But it is an improvement over doing nothing I believe and is not so hard. I may drop it when I go to VHDL as it is iterative and may be a PITA to code in a function.
>>> if you assume an approximate quadratic behavior over that short segment, >>> you can compute the straight line where the error in the middle is equal >>> in magnitude (and opposite in sign) to the error at the end points. >>> that's a closed form solution, i think. >> >> Yes, it is a little tricky because at this point we are working with >> integer math (or technically fixed point I suppose). > > not with defining the points of the table. you do that in MATLAB or in C > or something. your hardware gets its LUT spec from something you create > with a math program.
I don't think I'll do that. I'll most likely code it in VHDL. Why use yet another tool to generate the FPGA code? BTW, VHDL can do pretty much anything other tools can do. I simulated analog components in may last design... which is what I am doing this for. My sig gen won't work on the right range and so I'm making a sig gen out of a module I build. BTW, to do the calculation you describe above, it has to take into account that the end points have to be truncated to finite resolution. Without that it is just minimizing noise to a certain level only to have it randomly upset. It may turn out that converting from real to finite resolution by rounding is not optimal for such a function.
>> Rounding errors is >> what this is all about. I've done some spreadsheet simulations and I >> have some pretty good results. I updated it a bit to generalize it to >> the LUT size and I keep getting the same max error counts (adjusted to >> work with integers rather than fractions) &#2013266097;3 no matter what the size of >> the interpolation factor. I don't expect this and I think I have >> something wrong in the calculations. I'll need to resolve this. >> >> >>> dunno if that is what you actually want for a sinusoidal waveform >>> generator. i might think you want to minimize the mean square error. >> >> We are talking about the lsbs of a 20+ bit word. Do you think there will >> be much of a difference in result? > > i'm just thinking that if you were using LUT to generate a sinusoid that > is a *signal* (not some parameter that you use to calculate other > stuff), then i think you want to minimize the mean square error (which > is the power of the noise relative to the power of the sinusoid). so the > LUT values might not be exactly the same as the sine function evaluated > at those points.
I don't get your point really. The LUT values will deviate from the sin function because of one thing, the resolution of the output. The sin value calculated will deviate because of three things, input resolution, output resolution and the accuracy of the sin generation. I get what you are saying about the mean square error. How would that impact the LUT values? Are you saying to bias them to minimize the error over the entire signal? Yes, that is what I am talking about, but I don't want to have to calculate the mean square error over the full sine wave. Curently that's two million points. I've upped my design parameters to a 512 entry table and and 10 bit interpolation.
>> I need to actually be able to do the >> calculations and get this done rather than continue to work on the >> process. Also, each end point affects two lines, > > what are "lines"? not quite following you.
Segments. I picture the linear approximations between end points as lines. When you improve one line segment by moving one end point you also impact the adjacent line segment. I have graphed these line segments in the spread sheet and looked at a lot of them. I think the "optimizations" make some improvement, but I'm not sure it is terribly significant. It turns out once you have a 512 entry table, the sin function between the end points is pretty close to linear or the deviation from linear is in the "noise" for an 18 bit output. For example, between LUT(511) and LUT(512) is just a step of 1. How quadratic can that be?
>> so there are tradeoffs, >> make one better and the other worse? It seems to get complicated very >> quickly. > > if you want to define your LUT to minimize the mean square error, > assuming linear interpolation between LUT entries, that can be a little > complicated, but i don't think harder than what i remember doing in grad > school. want me to show you? you get a big system of linear equations to > get the points. if you want to minimize the maximum error (again > assuming linear interpolation), then it's that fitting a straight line > to that little snippet of quadratic curve bit that i mentioned.
God! Thanks but no thanks. Is this really practical to solve for the entire curve? Remember that all points affect other points, possibly more than just the adjacent points. If point A changes point A+1, then it may also affect point A+2, etc. What would be more useful would be to have an idea of just how much *more* improvement can be found.
>>> >>> sin(t + pi/2) = cos(t) >> >> How does that imply a quadratic curve at 90 degrees? > > sin(t) = t + ... small terms when t is small > > cos(t) = 1 - (t^2)/2 + ... small terms when t is small > >> At least I think like the greats! > > but they beat you by a few hundred years.
Blame my parents! -- Rick
glen herrmannsfeldt wrote:
> In comp.dsp Robert Baer<robertbaer@localnet.com> wrote: > > (snip, I wrote) >>> See: > >>> http://ia601506.us.archive.org/8/items/bitsavers_nationaldaMOSIntegratedCircuits_20716690/1972_National_MOS_Integrated_Circuits.pdf > >>> The description starts on page 273. > > I just tried it, copy and paste from the above link, and it worked. > > You might be sure to use Adobe reader if that matters. > > Also, it is page 273 in the document, page 282 in the PDF. > > (snip) > >> Problem with that..could see only front and back cover. >> Different source perhaps? >
I had used my "standard" 4.0 Acrobat; fails in that manner very reliably. ver 7.0 works 100 percent; thanks.
glen herrmannsfeldt wrote:
> In comp.dsp Robert Baer<robertbaer@localnet.com> wrote: > > (snip, I wrote) >>> See: > >>> http://ia601506.us.archive.org/8/items/bitsavers_nationaldaMOSIntegratedCircuits_20716690/1972_National_MOS_Integrated_Circuits.pdf > >>> The description starts on page 273. > > I just tried it, copy and paste from the above link, and it worked. > > You might be sure to use Adobe reader if that matters. > > Also, it is page 273 in the document, page 282 in the PDF. > > (snip) > >> Problem with that..could see only front and back cover. >> Different source perhaps? >
One of many reasons i hate Adobe Reader 7.0: along near the top is a number of cons, one is the so-called hand tool and it defaults as "selected" - BUT you see the "hand" only while moving the cursor and when you stop that changes to the "select tool" so that it is IMPOSSIBLE to move the page around. When moving the hand, the mouse will highlight text if push left button; CANNOT grab!!!!!!!!!!!!!!! That always works in 4.0 .. Newer versions are worse.
glen herrmannsfeldt wrote:
> In comp.dsp Robert Baer<robertbaer@localnet.com> wrote: > > (snip, I wrote) >>> See: > >>> http://ia601506.us.archive.org/8/items/bitsavers_nationaldaMOSIntegratedCircuits_20716690/1972_National_MOS_Integrated_Circuits.pdf > >>> The description starts on page 273. > > I just tried it, copy and paste from the above link, and it worked. > > You might be sure to use Adobe reader if that matters. > > Also, it is page 273 in the document, page 282 in the PDF. > > (snip) > >> Problem with that..could see only front and back cover. >> Different source perhaps? >
The basic approach given there is very good; use of puny 4 or 5 digit tables is a major weakness: sqrt(2)/2 as 0.7071 is junk; CRC in the squares, cubes and roots section gives sqrt(2) as 1.414214 (6 digits) and sqrt(50) as 7.07168 (also 6 digits). And if one needs more, use the Taylor series (for reasonably small angles) as it converges fast (Nth odd power and Nth odd factorial). --or, use the FPU in the computer to get SIN and COS at the same time! Reduction from 360 degrees helps: "Split" 180 degrees: sin(-x)=-sin(x) "Split" 90 degrees: sin(180-x)=sin(x) "Split" 45 degrees: sin(90-x)=cos(x) "Split" 22.5 degrees: (sqrt(0.5)*(cos(x)-sin(x)) Small angles requires less iterations for a given accuracy.