Floating-point to fixed-point samples in C/C++

Started by vutruong October 10, 2005
Hi,

I have to use C/C++ to convert floating point to fixed point with a 16-bit
word length. I am looking for an algorithm and sample code in C/C++ that
show how to solve this. I read Randy Yates' paper at
http://home.earthlink.net/~yatescr/fp.pdf, but it is not easy for me to
implement.

Any help is appreciated.

Thanks,
Vu


		
This message was sent using the Comp.DSP web interface on
www.DSPRelated.com
Yeah, that article seems 12 times longer than it should be, and when you
start seeing symbols that are impossible to reproduce with pencil and
paper, or that need an equation editor in Word, you know you are in trouble
(well, at least for me anyway) :)

If you could post an algorithm, I'd be happy to walk you through it.

Hi vu,

Sorry I don't have any code with me as samples, but the following are
some (very vague/generalised) pointers:

1. Decide on one of two approaches, top down or bottom up, for converting
your code from floating point to fixed point.

2. Divide your code (if it isn't already divided) into logical parts
(blocks), so that you can convert it one block at a time without having
to change the rest of the code (the other blocks).

3. If you are using a top-down approach, for example, you should find
the range of the input values, and possibly of the intermediate values,
so that you can decide on the fixed-point format. With a fixed word
length, a very large range forces a format with lower precision, and a
high required precision leaves you a smaller range; it's a trade-off in
the choice of Q format. Yates's paper (if I remember properly) might help
you calculate the range and precision of the different possible
formats.

You might want to convert only the intermediate computations to fixed
point first, leaving both input and output of that block in floating
point. Move on one block at a time, before converting the intermediate
results to fixed point too. This way it's easier to debug.

4. Ensure that you have enough guard bits, especially when doing
fixed-point multiplication, and sometimes also with additions. This is
easy to miss, especially when there are a lot of additions and
subtractions in one step: your final result might just be in range,
but your intermediate results (during addition) could go out of range,
causing problems.
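
As a rough illustration of the guard-bit point, here is a minimal sketch in C. It assumes Q15 data (16-bit words with 15 fractional bits), and the name `q15_dot` and the use of a 64-bit accumulator are choices made for this example, not something taken from the thread or from Yates's paper. Each Q15 x Q15 product is a Q30 value, and two worst-case products already exceed a 32-bit word, so the running sum is kept in a wider type and only the final result is rounded and saturated back to 16 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a wide value into the 16-bit range instead of letting it wrap. */
static int16_t saturate16(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Sum of products of two Q15 vectors, returned as Q15.
   The 64-bit accumulator supplies the guard bits: intermediate sums may
   leave the Q15 range (and even the 32-bit range) without being corrupted. */
int16_t q15_dot(const int16_t *a, const int16_t *b, size_t n)
{
    assert(a != NULL && b != NULL);

    int64_t acc = 0;                            /* Q30 with guard bits */
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];   /* Q15 * Q15 = Q30 */
    }

    acc += (int64_t)1 << 14;                    /* add half an LSB to round */

    /* Assumption: >> on a negative value is an arithmetic shift. The C
       standard leaves this implementation-defined, but common compilers
       behave this way. */
    return saturate16(acc >> 15);               /* Q30 -> Q15 */
}
```

With three terms of 0.5 * 0.75 where the last product is negated, the result is 0.375, which is 12288 in Q15. With six worst-case terms the intermediate sum passes 3 * 2^30, which no longer fits in 32 bits, yet the final result is still correct because nothing was clipped along the way.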


I'm no expert, and I too am learning this stuff, but I do hope this
helps.

I'm sure the rest would let me (and you) know if I was wrong anywhere
:) ;)

"vutruong" <vuttruong@yahoo.com> wrote in message
news:V92dnc56EbwP_tfeRVn-pg@giganews.com...
> Hi,
>
> I have to use C/C++ to convert the floating-point to fixed-point with 16
> bit-length. I am looking for algorithm and samples in C/C++ which show how
> to solve this issue. I read Randy Yates' paper at
> http://home.earthlink.net/~yatescr/fp.pdf. But it is not easy for me to
> implement.
>
> Your helps are appreciated.
>
> Thanks,
> Vu

I'm not sure I understand your problem based on your post. If I interpret it
in a simple fashion, I'd say that the following snippet should do what you
want...

float in_data;
int fix_data_16;

fix_data_16 = ((int) in_data) & 0xFFFF;

You can get a little more fancy and add some rounding to this if you
want...but I'd say this converts "floating-point to fixed-point with 16
bit-length".

If you have an existing algorithm implemented in floating point and you want
to convert it to work on fixed-point data with 16-bit precision, that's
going to take a bunch of work, depending on the algorithm that you have to
convert.

Or...are you looking for C/C++ code that takes as input C/C++ code for a
floating-point algorithm and outputs code for the same algorithm in fixed
point??

You won't find any C/C++ code for either of the above 2 problems.

Cheers
Bhaskar
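
For comparison, a conversion with the rounding mentioned above might look like the sketch below. Everything in it is an assumption made for illustration: the function names, the `frac_bits` parameter (the number of fractional bits in the 16-bit word), round-half-away-from-zero, saturation at the ends of the range, and mapping NaN to 0. Unlike the cast-and-mask snippet, it scales by 2^frac_bits before converting, so the fractional part is kept, and it clamps out-of-range values instead of wrapping them.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a float to a signed 16-bit fixed-point word with `frac_bits`
   fractional bits. Rounds to nearest and saturates on overflow. */
int16_t float_to_fixed16(float x, int frac_bits)
{
    assert(frac_bits >= 0 && frac_bits <= 15);

    if (x != x) {
        return 0;   /* NaN: returning 0 is an arbitrary choice for this sketch */
    }

    /* Scaling by a power of two is exact in double precision. */
    double scaled = (double)x * (double)(1L << frac_bits);

    if (scaled >= 32767.0)  return INT16_MAX;   /* saturate high */
    if (scaled <= -32768.0) return INT16_MIN;   /* saturate low  */

    /* Round half away from zero, then truncate with the cast. */
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;
    return (int16_t)scaled;
}

/* Inverse mapping, useful for checking the quantisation error. */
float fixed16_to_float(int16_t q, int frac_bits)
{
    assert(frac_bits >= 0 && frac_bits <= 15);
    return (float)q / (float)(1L << frac_bits);
}
```

With 15 fractional bits, 0.5 becomes 16384 and 1.0 saturates to 32767, because +1.0 itself is not representable in that format. With 8 fractional bits, 3.25 becomes 832.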
Bhaskar Thiagarajan wrote:

   ...

> I'm not sure I understand your problem based on your post. If I interpret it
> in a simple fashion I'd say that this following snippet should do what you
> want...
>
> float in_data;
> int fix_data_16;
>
> fix_data_16 = ((int) in_data) & 0xFFFF;
>
> You can get a little more fancy and add some rounding to this if you
> want...but I'd say this converts "floating-point to fixed-point with 16
> bit-length".

   ...

What is wanted is conversion to fixed point, not integer. Converting a
floating-point algorithm to fixed point needs a lot more thought than
just converting the numbers.

Jerry
--
Engineering is the art of making what you want from things you can get.

> What is wanted is conversion to fixed point, not integer. Converting a
> floating-point algorithm to fixed point needs a lot more thought than
> just converting the numbers.
>
> Jerry

C'mon Jerry,

Integer is a kind of fixed point! In this case, 16.0.

Luiz Carlos
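
The "16.0" remark can be made concrete with a small sketch: the same 16-bit word stands for different real values depending on how many of its bits are taken as fractional. The function names below are hypothetical, and the sketch assumes a signed two's-complement word with the sign bit counted in the integer field (as "16.0" does). Zero fractional bits gives the plain integer case; 15 fractional bits gives the fractional format often called Q15.

```c
#include <assert.h>
#include <stdint.h>

/* Real value represented by `raw` when `frac_bits` of its 16 bits are
   fractional: value = raw / 2^frac_bits. */
double q16_to_real(int16_t raw, int frac_bits)
{
    assert(frac_bits >= 0 && frac_bits <= 15);
    return (double)raw / (double)(1L << frac_bits);
}

/* Smallest step between two representable values (the precision). */
double q16_resolution(int frac_bits)
{
    assert(frac_bits >= 0 && frac_bits <= 15);
    return 1.0 / (double)(1L << frac_bits);
}

/* Largest and smallest representable values (the range). */
double q16_max(int frac_bits)
{
    return q16_to_real(INT16_MAX, frac_bits);
}

double q16_min(int frac_bits)
{
    return q16_to_real(INT16_MIN, frac_bits);
}
```

For example, the raw word 16384 reads as 16384 with no fractional bits, as 64.0 with 8 fractional bits, and as 0.5 with 15. Each fractional bit added halves both the step size and the range, which is the range/precision trade-off discussed earlier in the thread.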
oen_no_spam@yahoo.com.br wrote:
>>What is wanted is conversion to fixed point, not integer. Converting a
>>floating-point algorithm to fixed point needs a lot more thought than
>>just converting the numbers.
>>
>>Jerry
>
>
> C'mon Jerry,
>
> Integer is a kind of fixed point! In this case, 16.0.

I will give you good odds that it is not the kind of fixed point that the
OP wants. What will you bet? What's more, the conversion of numbers is not
the difficulty that prompted his question. (Another bet?)

Jerry
--
Engineering is the art of making what you want from things you can get.
"Jerry Avins" <jya@ieee.org> wrote in message
news:JJSdnaaFOfOINtfenZ2dnUVZ_s2dnZ2d@rcn.net...
> Bhaskar Thiagarajan wrote:
>
>    ...
>
> > I'm not sure I understand your problem based on your post. If I interpret it
> > in a simple fashion I'd say that this following snippet should do what you
> > want...
> >
> > float in_data;
> > int fix_data_16;
> >
> > fix_data_16 = ((int) in_data) & 0xFFFF;
> >
> > You can get a little more fancy and add some rounding to this if you
> > want...but I'd say this converts "floating-point to fixed-point with 16
> > bit-length".
>
>    ...
>
> What is wanted is conversion to fixed point, not integer. Converting a
> floating-point algorithm to fixed point needs a lot more thought than
> just converting the numbers.

Which is what I was trying to illustrate...that firstly the question needs
to be more precise (even though we can *guess* what he/she wants), and that
it is a lot more involved than trying to get cookbook answers/code.
Hi,

It seems that I didn't describe my issue clearly.

My work is to remove the floating point from my project, which is written
in C. It uses some simple operations: addition, subtraction,
multiplication, division and exponent. I think I will make a fixed-point
library with these simple operations. I have read some papers about fixed
point, but I find it difficult to implement the operations for my library
from them. After the floating point is removed, my project will use 16 bits
to represent a number.

Thanks for your help,
Vu
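
A library of the kind described might start out like the sketch below. It is only a starting point under stated assumptions: the type name `fx16`, the `fx_` function names, the choice of 8 fractional bits (`FX_FRAC_BITS`), and saturation on overflow are all made up for illustration, and the right format depends on the ranges in the actual project. The "exponent" operation is left out because the post does not say whether it means e^x or raising to a power.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed format: 16-bit signed word with 8 fractional bits.
   Range is -128.0 to about +127.996, resolution is 1/256. */
#define FX_FRAC_BITS 8

typedef int16_t fx16;

/* Clamp a 32-bit intermediate back into 16 bits instead of wrapping. */
static fx16 fx_sat(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (fx16)v;
}

/* Addition and subtraction need no rescaling, only overflow handling. */
fx16 fx_add(fx16 a, fx16 b) { return fx_sat((int32_t)a + (int32_t)b); }
fx16 fx_sub(fx16 a, fx16 b) { return fx_sat((int32_t)a - (int32_t)b); }

/* The product of two words has 2 * FX_FRAC_BITS fractional bits, so it is
   formed in 32 bits, rounded, and shifted back down.
   Assumption: >> on a negative value is an arithmetic shift. */
fx16 fx_mul(fx16 a, fx16 b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    p += 1 << (FX_FRAC_BITS - 1);        /* add half an LSB to round */
    return fx_sat(p >> FX_FRAC_BITS);
}

/* The dividend is pre-scaled so the quotient keeps FX_FRAC_BITS fractional
   bits. The result truncates toward zero. The caller must not pass b == 0. */
fx16 fx_div(fx16 a, fx16 b)
{
    assert(b != 0);
    int32_t n = (int32_t)a * (1 << FX_FRAC_BITS);
    return fx_sat(n / b);
}
```

In this format 1.5 is the word 384 and 2.25 is 576, so `fx_mul(384, 576)` gives 864, which is 3.375, and `fx_div(576, 384)` gives 384, which is 1.5. Results that do not fit, such as 100.0 * 2.0, are clamped to the largest representable word.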
		
I know Jerry.
I understood your point, but I just couldn't resist. :)

Luiz Carlos