Hello Everybody, I am looking for the source code (in assembler if possible) of a 64 bits integer division for TMS320VC33 or a library with this function. Does anyone know if it exists and where I can find it? Thank you very much Best Regards Nicolas |
|
looking for 64 bits integer division
Started by ●December 11, 2003
Reply by ●December 12, 20032003-12-12
Hi Nicolas I have a 96 bit float library (with divide) that in theory gives 64b (~19 decimal digits) accurate results from two 64b inputs. This will also work for integer division, but the results will need rounding. This is because the algorithm uses Newton raphson iterations where an error feedback is derived from multiplying the original input with a 'guess' of the inverse. (1/Xguess)' = (1/Xguess)*(2.0 - Xin*(1/Xguess)) What this says is that the final accuracy is dependent on the multiplication accuracy and what kind of rounding is performed on the last iteration. At 64b I did not worry to much about that! Non rational divisions like 10/3=3.33333333333333333 will obviously need rounding, but you will also find rational divisions like 10240/1024 = 10.0 yielding 9.999999999999999999 or 10.00000000000000001. I sent out a message awhile back to this newsgroup stating that this is a preliminary and therefor untested library. But you are welcome to it with the understanding that any substantial support or other burden will fall to you. Another teeny little problem is that I dont recall getting around to creating routines that would convert to and from a 64b integer type. Conversion to and from 32b integer however is pretty simple and this would lead to the 64b convert routines. If interested, contact me directly. Best regards, Keith Larson ----------------------------------- At 04:29 PM 12/11/03 +0100, you wrote: >Hello Everybody, >I am looking for the source code (in assembler if possible) of a 64 bits integer division for TMS320VC33 or a library with this function. >Does anyone know if it exists and where I can find it? > >Thank you very much > >Best Regards > >Nicolas +-----------+ |Keith Larson | |Member Group Technical Staff | |Texas Instruments Incorporated | | | | 281-274-3288 | | | | www.micro.ti.com/~klarson | |-----------+ | TMS320C3x/C4x/VC33 Applications | | | | TMS320VC33 | | The lowest cost and lowest power 500 w/Mflop | | floating point DSP on the planet! | +-----------+ |
Reply by ●December 15, 20032003-12-15
Hi Nicolas An interesting question. The tfloat lib is about as speed optimized as I could make it without at the same time killing myself with the details. As you might imagine building a 32b-exponent 64b-mantissa library is done using math primitives with much lower precision. Division is interesting since the precision does not need to be all that high until the last few stages. Basically a good initial guess is needed. The following is taken from tfloat3x.c //-------- // trcp() returns 32b/64b float reciprocal //-------- tfloat trcp(tfloat t0) { int i; tfloat t1, t2; // 96b triple precision float t1 = t0; // t1.exp = 0; // t1 = trcpx(t1); // reciprical of mantissa is 24b accurate t1.exp -= t0.exp; // invert the exponent for(i=0;i<2;i++) // 2 more Newton Raphson.. 48b, then 64b { t2 = tmpy(t1,t0); t2 = tsub(T_TWO,t2); t1 = tmpy(t1,t2); } return t1; } Is this the best? Well maybe not. Having lately been tinkering with the 40b 'long double' routines (that are supported by the compiler) I realize now that trcpx() function could return a 32b accurate initial guess rather than 24b. The advantage of this is that *maybe* 1 loop of tmpy() and tsub() functions can be avoided. However, Im not sure at this time if this will loose an LSB or two. Ill zip this up and send it (but not to the newsgroup), but I also want to give this new approach try, so you might get a second zip. It should not take very long. BTW, another thing that may be of interest is Log Differential Compression. Normally this is best applied to signals like audio/video as a compression method but it can also in some cases be used to reduce the sensitivity to accuracy truncation noise. Basically LDC is nothing more than the differentiation an incoming audio/video data stream, followed with mantissa truncation. At the back-end, you simply need to integrate the signal back. More importantly LDC is a linear process, so you can still process the data. If on the other hand you do NOT truncate the mantissa, you will find that large amplitude low frequencies are highly attenuated allowing higher frequency data to make its way through the DSP math with less truncation by the CPU ALU. Whether or not this works depends on the application, but it could lead you to an improved method. Best regards, Keith Larson -------------- At 09:16 AM 12/15/03 +0100, you wrote: >Hi Keith, >Your library interest me. >I have another question about its divide, do you think it is speed optimised >too? >Thank you for your help. > >Nicolas > >----- Original Message ----- >From: "Keith E. Larson" <> >To: "Nicolas BOURRIOT" <>; <> >Sent: Friday, December 12, 2003 4:36 PM >Subject: Re: [c3x] looking for 64 bits integer division >Hi Nicolas > >I have a 96 bit float library (with divide) that in theory gives 64b (~19 >decimal digits) accurate results from two 64b inputs. This will also work >for integer division, but the results will need rounding. This is because >the algorithm uses Newton raphson iterations where an error feedback is >derived from multiplying the original input with a 'guess' of the inverse. > >(1/Xguess)' = (1/Xguess)*(2.0 - Xin*(1/Xguess)) > >What this says is that the final accuracy is dependent on the multiplication >accuracy and what kind of rounding is performed on the last iteration. At >64b I did not worry to much about that! Non rational divisions like >10/3=3.33333333333333333 will obviously need rounding, but you will also >find rational divisions like 10240/1024 = 10.0 yielding 9.999999999999999999 >or 10.00000000000000001. > >I sent out a message awhile back to this newsgroup stating that this is a >preliminary and therefor untested library. But you are welcome to it with >the understanding that any substantial support or other burden will fall to >you. > >Another teeny little problem is that I dont recall getting around to >creating routines that would convert to and from a 64b integer type. >Conversion to and from 32b integer however is pretty simple and this would >lead to the 64b convert routines. > >If interested, contact me directly. > >Best regards, >Keith Larson >----------------------------------- >At 04:29 PM 12/11/03 +0100, you wrote: >>Hello Everybody, >>I am looking for the source code (in assembler if possible) of a 64 bits >integer division for TMS320VC33 or a library with this function. >>Does anyone know if it exists and where I can find it? >> >>Thank you very much >> >>Best Regards >> >>Nicolas +-----------+ |Keith Larson | |Member Group Technical Staff | |Texas Instruments Incorporated | | | | 281-274-3288 | | | | www.micro.ti.com/~klarson | |-----------+ | TMS320C3x/C4x/VC33 Applications | | | | TMS320VC33 | | The lowest cost and lowest power 500 w/Mflop | | floating point DSP on the planet! | +-----------+ |