Reply by Keith E. Larson December 15, 20032003-12-15
Hi Nicolas

An interesting question. The tfloat lib is about as speed optimized as I
could make it without at the same time killing myself with the details. As
you might imagine building a 32b-exponent 64b-mantissa library is done using
math primitives with much lower precision.

Division is interesting since the precision does not need to be all that
high until the last few stages. Basically a good initial guess is needed.
The following is taken from tfloat3x.c

//--------
// trcp() returns 32b/64b float reciprocal
//--------
tfloat trcp(tfloat t0)
{
int i;
tfloat t1, t2; // 96b triple precision float
t1 = t0; //
t1.exp = 0; //
t1 = trcpx(t1); // reciprical of mantissa is 24b accurate
t1.exp -= t0.exp; // invert the exponent
for(i=0;i<2;i++) // 2 more Newton Raphson.. 48b, then 64b
{ t2 = tmpy(t1,t0);
t2 = tsub(T_TWO,t2);
t1 = tmpy(t1,t2);
}
return t1;
}

Is this the best? Well maybe not. Having lately been tinkering with the
40b 'long double' routines (that are supported by the compiler) I realize
now that trcpx() function could return a 32b accurate initial guess rather
than 24b. The advantage of this is that *maybe* 1 loop of tmpy() and tsub()
functions can be avoided. However, Im not sure at this time if this will
loose an LSB or two.

Ill zip this up and send it (but not to the newsgroup), but I also want to
give this new approach try, so you might get a second zip. It should not
take very long.

BTW, another thing that may be of interest is Log Differential Compression.
Normally this is best applied to signals like audio/video as a compression
method but it can also in some cases be used to reduce the sensitivity to
accuracy truncation noise.

Basically LDC is nothing more than the differentiation an incoming
audio/video data stream, followed with mantissa truncation. At the
back-end, you simply need to integrate the signal back. More importantly
LDC is a linear process, so you can still process the data.

If on the other hand you do NOT truncate the mantissa, you will find that
large amplitude low frequencies are highly attenuated allowing higher
frequency data to make its way through the DSP math with less truncation by
the CPU ALU. Whether or not this works depends on the application, but it
could lead you to an improved method.

Best regards,
Keith Larson
--------------
At 09:16 AM 12/15/03 +0100, you wrote:
>Hi Keith,
>Your library interest me.
>I have another question about its divide, do you think it is speed optimised
>too?
>Thank you for your help.
>
>Nicolas
>
>----- Original Message -----
>From: "Keith E. Larson" <>
>To: "Nicolas BOURRIOT" <>; <>
>Sent: Friday, December 12, 2003 4:36 PM
>Subject: Re: [c3x] looking for 64 bits integer division >Hi Nicolas
>
>I have a 96 bit float library (with divide) that in theory gives 64b (~19
>decimal digits) accurate results from two 64b inputs. This will also work
>for integer division, but the results will need rounding. This is because
>the algorithm uses Newton raphson iterations where an error feedback is
>derived from multiplying the original input with a 'guess' of the inverse.
>
>(1/Xguess)' = (1/Xguess)*(2.0 - Xin*(1/Xguess))
>
>What this says is that the final accuracy is dependent on the multiplication
>accuracy and what kind of rounding is performed on the last iteration. At
>64b I did not worry to much about that! Non rational divisions like
>10/3=3.33333333333333333 will obviously need rounding, but you will also
>find rational divisions like 10240/1024 = 10.0 yielding 9.999999999999999999
>or 10.00000000000000001.
>
>I sent out a message awhile back to this newsgroup stating that this is a
>preliminary and therefor untested library. But you are welcome to it with
>the understanding that any substantial support or other burden will fall to
>you.
>
>Another teeny little problem is that I dont recall getting around to
>creating routines that would convert to and from a 64b integer type.
>Conversion to and from 32b integer however is pretty simple and this would
>lead to the 64b convert routines.
>
>If interested, contact me directly.
>
>Best regards,
>Keith Larson
>-----------------------------------
>At 04:29 PM 12/11/03 +0100, you wrote:
>>Hello Everybody,
>>I am looking for the source code (in assembler if possible) of a 64 bits
>integer division for TMS320VC33 or a library with this function.
>>Does anyone know if it exists and where I can find it?
>>
>>Thank you very much
>>
>>Best Regards
>>
>>Nicolas


+-----------+
|Keith Larson |
|Member Group Technical Staff |
|Texas Instruments Incorporated |
| |
| 281-274-3288 |
| |
| www.micro.ti.com/~klarson |
|-----------+
| TMS320C3x/C4x/VC33 Applications |
| |
| TMS320VC33 |
| The lowest cost and lowest power 500 w/Mflop |
| floating point DSP on the planet! |
+-----------+


Reply by Keith E. Larson December 12, 20032003-12-12
Hi Nicolas

I have a 96 bit float library (with divide) that in theory gives 64b (~19
decimal digits) accurate results from two 64b inputs. This will also work
for integer division, but the results will need rounding. This is because
the algorithm uses Newton raphson iterations where an error feedback is
derived from multiplying the original input with a 'guess' of the inverse.

(1/Xguess)' = (1/Xguess)*(2.0 - Xin*(1/Xguess))

What this says is that the final accuracy is dependent on the multiplication
accuracy and what kind of rounding is performed on the last iteration. At
64b I did not worry to much about that! Non rational divisions like
10/3=3.33333333333333333 will obviously need rounding, but you will also
find rational divisions like 10240/1024 = 10.0 yielding 9.999999999999999999
or 10.00000000000000001.

I sent out a message awhile back to this newsgroup stating that this is a
preliminary and therefor untested library. But you are welcome to it with
the understanding that any substantial support or other burden will fall to you.

Another teeny little problem is that I dont recall getting around to
creating routines that would convert to and from a 64b integer type.
Conversion to and from 32b integer however is pretty simple and this would
lead to the 64b convert routines.

If interested, contact me directly.

Best regards,
Keith Larson
-----------------------------------
At 04:29 PM 12/11/03 +0100, you wrote:
>Hello Everybody,
>I am looking for the source code (in assembler if possible) of a 64 bits
integer division for TMS320VC33 or a library with this function.
>Does anyone know if it exists and where I can find it?
>
>Thank you very much
>
>Best Regards
>
>Nicolas

+-----------+
|Keith Larson |
|Member Group Technical Staff |
|Texas Instruments Incorporated |
| |
| 281-274-3288 |
| |
| www.micro.ti.com/~klarson |
|-----------+
| TMS320C3x/C4x/VC33 Applications |
| |
| TMS320VC33 |
| The lowest cost and lowest power 500 w/Mflop |
| floating point DSP on the planet! |
+-----------+


Reply by Nicolas BOURRIOT December 11, 20032003-12-11
Hello Everybody,
I am looking for the source code (in assembler if possible) of a 64 bits
integer division for TMS320VC33 or a library with this function.
Does anyone know if it exists and where I can find it?

Thank you very much

Best Regards

Nicolas