DSPRelated.com
Forums

DDS LUT Size Calculation

Started by rickman February 7, 2017
On 2/8/2017 5:14 PM, Cedron wrote:
>> I'm not sure there *is* much of a difference and I'm pretty sure your
>> method is not necessarily optimal.
>>
>> You are assuming that using the slope of the chord is a better fit to
>> the curve than the tangent. It would clearly be a better fit if the end
>> points are adjusted so the chord was crossing the circle. In other
>> words, use a chord of a circle slightly larger than the unit circle.
>> Without this correction I expect the difference between the two is very,
>> very small, with the winner depending on how you measure it (area under
>> the curve, least squares, etc.).
>>
>> I will also point out that over short distances the linear approximation
>> is *very* good regardless of the choice of the two. At low angles the
>> sine function is very linear, while at high angles the distance for the
>> interpolation is very small, diminishing the effect of any errors.
>>
>> --
>>
>> Rick C
>
> I agree with everything you said, sort of.
>
> I never said "optimal", I said "better".
>
> It is more accurate to consider the tangent and chord on a sine function
> graph, rather than the unit circle.
>
> Which of these two alternatives would be better under any reasonable
> metric on any part of the graph depends on the domain of your fractional
> portion.
>
> If you are taking the floor of your value and the fraction portion goes
> from zero to one, the chord method (my suggested one) would be better.
> Particularly in the .5 to .999999.... range.
You lost me here. I think you are saying the values in the LUT table are truncated rather than rounded. In that case the points will be below the curve and the point to point "chord" would be even *more* below the line.
> If you are taking the nearest integer to your value and the fraction
> portion ranges from -.5 to .5, then using the tangent line is better.
I think you have this backwards. If rounding the table entries, the "chord" interpolation has a better chance of being more accurate when the values are rounded up.
> Whether the differences in the two approaches even impact the least
> significant bit in your implementation could be determined with a simple
> test program. Good luck with your project no matter how you decide to
> proceed.
It's not a project... at least not now. It's just a discussion.

--

Rick C
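Cedron's "simple test program" is easy to sketch. The following Python snippet (the table size and sampling density are arbitrary assumptions, not values from the thread) measures the worst-case error of the chord and tangent interpolations between adjacent LUT entries, with the fraction running from 0 to 1:

import numpy as np

TABLE_SIZE = 256                    # assumed LUT length, one full cycle
step = 2 * np.pi / TABLE_SIZE       # angle spanned by one table step
frac = np.linspace(0.0, 1.0, 1001)  # fractional position within a step

worst_chord = 0.0
worst_tangent = 0.0
for k in range(TABLE_SIZE):
    theta = k * step
    exact = np.sin(theta + frac * step)
    # chord: slope from this table entry to the next one
    chord = np.sin(theta) + frac * (np.sin(theta + step) - np.sin(theta))
    # tangent: slope from the derivative cos(theta) at the table entry
    tangent = np.sin(theta) + frac * step * np.cos(theta)
    worst_chord = max(worst_chord, float(np.max(np.abs(exact - chord))))
    worst_tangent = max(worst_tangent, float(np.max(np.abs(exact - tangent))))

print("max |error|, chord:  ", worst_chord)
print("max |error|, tangent:", worst_tangent)

With this setup the chord's worst case comes out near step**2/8 and the tangent's near step**2/2, about a factor of four apart, consistent with Cedron's claim for the 0-to-1 fraction convention. Restricting the fraction to [-0.5, 0.5) shrinks the tangent's worst case to roughly the chord's figure.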
In article <o7dvhe$1io$2@dont-email.me>, rickman  <gnuarm@gmail.com> wrote:

>On 2/7/2017 3:42 PM, Steve Pope wrote:
>> dbd <d.dalrymple@sbcglobal.net> wrote:
>>> There must be further unstated requirements or assumptions because, in
>>> general, the size of the lookup table addresses and the value of the lsb
>>> of the looked up data are independent. You might get different results
>>> from 16 bit fixed point and 64 bit float data.
>> I agree, I am curious as to what exactly is being observed here.
>What part don't you understand?
Your reply to DBD clarified to me some of what you're doing. Thanks.

Steve
> You lost me here. I think you are saying the values in the LUT table
> are truncated rather than rounded. In that case the points will be
> below the curve and the point to point "chord" would be even *more*
> below the line.
No, I'm sorry if I wasn't clear, I'm talking about the domain of your function, not the range.

You have some value that you want to find the sine of. You are going to convert that value to an index into your lookup table. The index has to be an integer. When you convert your value to the index scale, chances are you don't hit an exact integer.

Say I have a lookup table for every degree and my angle turns out to be 24.7. Do you consider this to be index 24 + .7 or index 25 - .3?

If the former, the chord method will be better; if the latter, the tangent method will be better.

I somehow got the idea that you were using the floor function, not the nearest integer, from your bit discussion, since you talked about a + b and it seemed both were positive.

Ced

---------------------------------------
Posted through http://www.DSPRelated.com
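For concreteness, the two conventions Cedron describes are just two ways of splitting the same real-valued index; a hypothetical illustration in Python:

import math

def split_floor(x):
    """Floor convention: integer index, fraction in [0, 1)."""
    i = math.floor(x)
    return i, x - i

def split_round(x):
    """Nearest-integer convention: integer index, fraction in [-0.5, 0.5)."""
    i = math.floor(x + 0.5)
    return i, x - i

print(split_floor(24.7))   # index 24, fraction ~0.7: interpolate forward
print(split_round(24.7))   # index 25, fraction ~-0.3: interpolate backward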
On Tue, 07 Feb 2017 21:01:28 -0500, rickman wrote:

> On 2/7/2017 2:34 PM, Tim Wescott wrote:
>> On Tue, 07 Feb 2017 12:53:27 -0600, Tim Wescott wrote:
>>
>>> On Tue, 07 Feb 2017 07:09:24 -0500, rickman wrote:
>>>
>>>> I just wrote a program to calculate the size of the LUT required to
>>>> generate the coarse sin/cos values (a in the following equation) to
>>>> use the trig identity sin(a+b) = sin(a)cos(b) + cos(a)sin(b), meaning
>>>> the max value of b is sufficiently small that cos(b) is 1.0 to within
>>>> one lsb of the output sine word.
>>>>
>>>> The size of a and b (meaning the number of bits) always turns out to
>>>> be half of the total. Obviously there is something guiding this.
>>>> But I don't know of any trig rules that would make it so.
>>>
>>> So you're saying that the size of a and b is always equal?
>>>
>>> Interesting. No, I have no clue of why that is, but it's probably
>>> buried in the trig someplace.
>>
>> Not enough information. What criteria are you using to choose the size
>> of a? Because your criteria on b just sets the weight of your overall
>> LSB -- without any other rules, you could choose the dividing line
>> between a and b arbitrarily.
>
> I gave the criteria above. "cos(b) is 1.0 to within one lsb of the
> output sine word."
That only tells you the necessary weight of the LSB of b, and thus the total number of bits in a and b combined. It doesn't say where the dividing line needs to be -- you could put it all in b (and let a = 0), or all in a (and let b = 0), or just about anything.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!
On Wed, 8 Feb 2017 17:09:07 -0500, rickman <gnuarm@gmail.com> wrote:

> On 2/8/2017 4:59 PM, eric.jacobsen@ieee.org wrote:
>> On Wed, 8 Feb 2017 00:43:27 -0500, rickman <gnuarm@gmail.com> wrote:
>>
>>> On 2/7/2017 11:33 PM, robert bristow-johnson wrote:
>>>> [...]
>>>>
>>>> i understand what an LUT is.
>>>>
>>>> i understand quantization error in forming the index into an LUT.
>>>>
>>>> i understand how simply rounding to the nearest entry (or nearest
>>>> integer index) is the same as convolving the values of the LUT with
>>>> a rectangular pulse function.
>>>>
>>>> i understand how linear interpolation is the same as convolving the
>>>> values of the LUT with a triangular pulse function.
>>>>
>>>> i understand from either how to compute the maximum error.
>>>>
>>>> i understand that so far, i have said nothing about the number of
>>>> bits of either the input or output word (which is another source of
>>>> quantization error).
>>>>
>>>> but this is about figuring out your error as a function of the
>>>> number of entries in the LUT, is it not?
>>>>
>>>> i don't understand what your issues are and i will not read the
>>>> FORTH program.
>>>
>>> Yeah, I don't expect anyone to read the program. It's there in case
>>> someone was interested in the language. This is a good example that
>>> shows how useful and easy to use the language is. BTW, reading the
>>> Forth program in this case is no harder than looking at the keystrokes
>>> used to perform the calculation on an RPN calculator with some function
>>> keys defined.
>>>
>>> After reading the list of things you understand, I don't understand
>>> what you don't understand. I think you are making this more complex
>>> than it needs to be. I'll say you should read my other post in this
>>> thread where I draw a diagram of the hardware doing the calculation.
>>>
>>> The calculation is using the trig identity sin(a+b) = sin(a)cos(b) +
>>> cos(a)sin(b). a and b represent the msbs and lsbs, respectively, of
>>> the phase word used to calculate the sine value. The crux of the issue
>>> is to determine the minimum size of a that meets an accuracy
>>> requirement set by the output word size. The size of a in turn
>>> determines the size of the LUT, which is important in determining the
>>> practicality for a given application.
>>>
>>> The trig function is reduced by assuming cos(b) is 1.0. To meet the
>>> accuracy requirement this restricts the max size of b. That is the
>>> calculation I was doing: how many bits can be allocated to b, with the
>>> remaining phase word bits being a, for a given error?
>>>
>>> Turns out the problem simplifies because in the region of 0 phase,
>>> cos(b) is essentially quadratic. Approximately half the phase angle
>>> bits need to be used to address the LUT (a) and half used for the
>>> remaining term (b). This gives just a bit less than the required
>>> accuracy in the result from the LUT. An extra bit is gained, however,
>>> because the output from the LUT is unsigned and the end result is
>>> signed. The final error contribution from the LUT is halved.
>>>
>>> This got started because someone in another group seems to think only
>>> the CORDIC can give "exact" results in a "small" amount of logic. The
>>> CORDIC is claimed to be "small" because it has no multiplies. I
>>> pointed out the CORDIC algorithm is essentially the same complexity as
>>> a multiply, order (N^2). Given that multipliers are available in most
>>> FPGAs (or a multiply instruction in CPUs) I couldn't understand why
>>> CORDIC was still used. I found that person is producing a 32 bit
>>> result, which is not so easy with a LUT-multiply approach. So I wanted
>>> to quantify it.
>>
>> I've been puzzled sometimes when people cling to CORDIC algorithms
>> when multipliers are available, although there are some niche
>> applications where it seems to make sense.
>>
>>> Once I got the result of "half the word width" for the a and b
>>> quantities, I realized I went down this road some five or six years
>>> ago and arrived at the same result. I'm not sure I realized then the
>>> reason for the result, the quadratic nature of sine near 90 degrees.
>>
>> I seem to recall this result from back in the days when making a good
>> DDS for comm synchronization was a fresh enough thing that it was
>> worth cooking up your own stuff to improve performance. These days
>> most of the tricks are known or it's easy enough to get good
>> performance just by throwing word-width complexity at it.
>
> Most DDS resolutions are limited by the resolution of the analog
> converters, either input or output. The person I was discussing this
> with indicated they use 32 bit sines but did not give the use case. At
> 32 bits you are making some large LUTs and CORDIC may well be the better
> solution unless you have off chip memory that can keep up with your data
> rates.
Many DDS applications don't have an associated DAC, and the output is used to feed a digital mixer or PLL or some other processing element. It is still application dependent how many bits are actually *needed* at any point in the system, and as complexity has gotten cheaper it's gotten easier to meet requirements without too many tricks. Nevertheless, it's still good to know some of the tricks. ;)
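For readers following along, the LUT-multiply scheme rickman describes in the quoted text is short enough to model in floating point. This is a sketch only: the 16-bit phase word and the 8/8 split below are assumed example numbers, not figures from the thread, and cos(b) is taken as exactly 1.0 while sin(b) is taken as b in radians.

import math

PHASE_BITS = 16                          # assumed phase accumulator width
A_BITS = 8                               # high bits 'a' -> LUT address
B_BITS = PHASE_BITS - A_BITS             # low bits 'b' -> correction term

SIN_LUT = [math.sin(2 * math.pi * k / 2**A_BITS) for k in range(2**A_BITS)]
COS_LUT = [math.cos(2 * math.pi * k / 2**A_BITS) for k in range(2**A_BITS)]

def dds_sin(phase):
    """phase: integer in [0, 2**PHASE_BITS), spanning one full cycle."""
    a = phase >> B_BITS                      # table index from the msbs
    b = phase & (2**B_BITS - 1)              # residual phase from the lsbs
    b_rad = 2 * math.pi * b / 2**PHASE_BITS  # residual angle in radians
    # sin(a+b) = sin(a)cos(b) + cos(a)sin(b), with cos(b) ~ 1 and sin(b) ~ b
    return SIN_LUT[a] + b_rad * COS_LUT[a]

# worst-case error over the whole cycle
err = max(abs(dds_sin(p) - math.sin(2 * math.pi * p / 2**PHASE_BITS))
          for p in range(2**PHASE_BITS))
print("max |error| = %.3e (~%.1f bits)" % (err, -math.log2(err)))

With 8 bits of a this lands near 12 good output bits, in line with the "half the bits plus a couple" behavior rickman describes.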
On Thu, 09 Feb 2017 00:25:57 +0000, eric.jacobsen wrote:

> On Wed, 8 Feb 2017 17:09:07 -0500, rickman <gnuarm@gmail.com> wrote:
>> [...]
>
> Many DDS applications don't have an associated DAC, and the output is
> used to feed a digital mixer or PLL or some other processing element.
> It is still application dependent how many bits are actually *needed*
> at any point in the system, and as complexity has gotten cheaper it's
> gotten easier to meet requirements without too many tricks.
> Nevertheless, it's still good to know some of the tricks. ;)
That's because as complexity becomes cheaper, more applications come over the horizon to be implemented. So there's always going to be SOME applications that call on you to pull out all the stops.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

I'm looking for work -- see my website!
On 2/8/2017 6:14 PM, Cedron wrote:
>> [...]
>
> No, I'm sorry if I wasn't clear, I'm talking about the domain of your
> function, not the range.
>
> [...]
>
> Say I have a lookup table for every degree and my angle turns out to be
> 24.7. Do you consider this to be index 24 + .7 or index 25 - .3?
>
> If the former, the chord method will be better, if the latter then the
> tangent method will be better.
>
> I somehow got the idea that you were using the floor function, not the
> nearest integer, from your bit discussion since you talked about a + b
> and it seemed both were positive.
Arbitrary distinction. I can put any values into the table I wish. If I floor the index I can use sin(index + 0.5) as my table entry, or any other value I choose to optimize the resulting calculations. Regardless, there will be a curve that is approximated by a straight line. That straight line in the range is the important part.

--

Rick C
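rickman's point can be checked directly: with a floored index, storing the sine of each step's midpoint rather than its left edge halves the worst-case error of a plain, non-interpolated lookup. A small Python check, with an arbitrary table size:

import math

TABLE_SIZE = 256                       # arbitrary example size
step = 2 * math.pi / TABLE_SIZE

edge = [math.sin(k * step) for k in range(TABLE_SIZE)]         # left-edge entries
mid = [math.sin((k + 0.5) * step) for k in range(TABLE_SIZE)]  # midpoint entries

def worst_error(table, samples=100000):
    w = 0.0
    for i in range(samples):
        theta = 2 * math.pi * i / samples
        k = int(theta / step) % TABLE_SIZE   # floored index, no interpolation
        w = max(w, abs(table[k] - math.sin(theta)))
    return w

print("left-edge entries:", worst_error(edge))  # ~ step   (~2.5e-2)
print("midpoint entries: ", worst_error(mid))   # ~ step/2 (~1.2e-2)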
On 2/8/2017 7:25 PM, eric.jacobsen@ieee.org wrote:
> On Wed, 8 Feb 2017 17:09:07 -0500, rickman <gnuarm@gmail.com> wrote:
>> [...]
>
> Many DDS applications don't have an associated DAC, and the output is
> used to feed a digital mixer or PLL or some other processing element.
> It is still application dependent how many bits are actually *needed*
> at any point in the system, and as complexity has gotten cheaper it's
> gotten easier to meet requirements without too many tricks.
> Nevertheless, it's still good to know some of the tricks. ;)
The signal from the DDS may not be converted to analog, nor any digital signal derived from it (think transmitter). But if not that, then the DDS signal is likely to be used with a signal that *came from* an analog signal (think receiver). There are few systems using a DDS that aren't limited in required resolution by an analog signal somewhere.

--

Rick C
On 2/8/2017 6:51 PM, Tim Wescott wrote:
> On Tue, 07 Feb 2017 21:01:28 -0500, rickman wrote:
>> [...]
>>
>> I gave the criteria above. "cos(b) is 1.0 to within one lsb of the
>> output sine word."
>
> That only tells you the necessary weight of the LSB of b, and thus the
> total number of bits in a and b combined. It doesn't say where the
> dividing line needs to be -- you could put it all in b (and let a = 0),
> or all in a (and let b = 0), or just about anything.
You aren't understanding. The value of cos(b) is ~1.0 at all values of (b); I said nothing about the lsb of b. This is determined by how many bits are in (a), which determines the max value of (b). Look at the diagram in my post of 2/7/2017 at 9:46 PM EST.

--

Rick C
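rickman's sizing rule is easy to verify numerically. Near zero, 1 - cos(b) is approximately b**2/2, so requiring cos(b) to equal 1.0 to within one lsb of an M-bit output forces the LUT address width to grow by one bit for every two bits of output: the "half the bits" result. A sketch, assuming the phase word spans a full cycle so that b_max = 2*pi * 2**-A:

import math

def a_bits_needed(output_bits):
    """Smallest address width A such that 1 - cos(b_max) stays below one
    lsb of the output word, with b_max = 2*pi * 2**-A."""
    for a in range(1, 64):
        b_max = 2 * math.pi * 2.0 ** -a
        if 1.0 - math.cos(b_max) < 2.0 ** -output_bits:
            return a
    return None

for m in (8, 12, 16, 20, 24):
    print("%2d-bit output -> a needs %d bits" % (m, a_bits_needed(m)))

Each extra two bits of output cost one more address bit (the run above gives 7, 9, 11, 13, 15), i.e. roughly half the bits plus a small constant from the 2*pi scaling; the extra sign bit rickman mentions would shave the requirement slightly.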
On Wed, 08 Feb 2017 18:30:11 -0600, Tim Wescott
<seemywebsite@myfooter.really> wrote:

> On Thu, 09 Feb 2017 00:25:57 +0000, eric.jacobsen wrote:
>> [...]
>>
>> Many DDS applications don't have an associated DAC, and the output is
>> used to feed a digital mixer or PLL or some other processing element.
>> It is still application dependent how many bits are actually *needed*
>> at any point in the system, and as complexity has gotten cheaper it's
>> gotten easier to meet requirements without too many tricks.
>> Nevertheless, it's still good to know some of the tricks. ;)
>
> That's because as complexity becomes cheaper, more applications come over
> the horizon to be implemented. So there's always going to be SOME
> applications that call on you to pull out all the stops.
Absolutely, but they seem to be fewer and fewer and perhaps nichier and nichier, so the number of people who know or need various tricks seems to get smaller. This seems to be true of DSP in general, where many systems get built by synthesizing something that was a result of somebody graphically connecting blocks together in a CAD system until it "works". ;)

I like that IOT applications seem to be reversing the trend a little bit, because many (if not most) IOT devices are small and either power or resource constrained. It seems to maybe be making designers work harder again.