OK guys, I'm looking at some computation done in fixed-point, and there is one that I don't quite understand. Here it is:

#define mul19dot12by1dot14(a,b) (((((a)>>16)*(b))<<2) + (TINT32) (((TUINT32)(((TUINT16)(a))*(TUINT16)(b)))>>14))

What is being done here is multiplying a 32-bit number by a 16-bit one, so theoretically the result is up to 48 bits. Therefore, to do it in ANSI C, we'd have to use 64-bit arithmetic. But this is slow, especially on embedded processors. Hence, the computation is divided into two 16-bit multiplications and one addition, so 32-bit arithmetic can be used. Btw, I believe the result is also in 19.12.

I more or less understand the intent, which is to split the 19.12 number into two 16-bit numbers, multiply each of them by the 1.14 number, which gives us two 32-bit numbers, and add those to get the result. But I don't quite understand all the subtleties of the computation. Here are some specific questions:

1) The first part of the macro: (((a)>>16)*(b))<<2). Shifting a by 16 to get its MSBs and multiplying them, OK, but why the <<2 ?!?!

2) Same thing with the second part. Why the >> 14? How does the recombination of the two intermediate results work?

Thanks in advance,
Alex
19.12 x 1.14
Started by ●February 16, 2009
Reply by ●February 16, 2009
vectorizor <vectorizor@googlemail.com> wrote:

>I don't quite understand all the subtleties of the computation. Here
>are some specific questions:
>
>1) The first part of the macro: (((a)>>16)*(b))<<2). Shifting a by
>16 to get its MSBs and multiplying them, OK, but why the <<2 ?!?!
>
>2) Same thing with the second part. Why the >> 14? How does the
>recombination of the two intermediate results work?

It seems pretty clear that the values of 2, 14, and 16 satisfy the relation 2 + 14 = 16 and this is used to align the two partial products.

I would not use code like this in an implementation. I might conceivably use it to try to bit-match some hardware implementation. In general I would rebel against anything like this, and use something like SystemC fixed-point types, if I could.

Steve
Reply by ●February 16, 2009
Many thanks for the prompt answer.

On Feb 16, 6:51 pm, spop...@speedymail.org (Steve Pope) wrote:

> It seems pretty clear that the values of 2, 14, and 16 satisfy the relation 2 + 14 = 16 and this is used to align the
> two partial products.

All right, but I can't work out the whys and hows of this alignment. For instance, for the first part of the multiplication, taking the 16 MSBs of the 19.12 number creates an integer that needs to be multiplied by 8 to get its real value in the original 19.12 number. Hence I'd shift it by 3, not 2?! Does that make sense?

> I would not use code like this in an implementation. I might
> conceivably use it to try to bit-match some hardware implementation.
> In general I would rebel against anything like this, and use
> something like System C fixed-point types, if I could.

Thanks for the suggestion. I was not aware of SystemC, and I shall take a look.

Thanks again,
Alex
Reply by ●February 16, 2009
vectorizor <vectorizor@googlemail.com> wrote:

>All right, but I can't work out the whys and hows of this alignment.
>For instance, for the first part of the multiplication, taking the 16
>MSBs of the 19.12 number creates an integer that needs to be multiplied
>by 8 to get its real value in the original 19.12 number. Hence I'd
>shift it by 3, not 2?! Does that make sense?

The calculation looks okay to me. a can be 32 bits and b can be 14 bits.

The first term takes the 16 MSBs of a, multiplies them by b, and places the 30-bit result in bits 31 through 2.

The second term takes the 16 LSBs of a, multiplies them by b, and places the 16 MSBs of the 30-bit result in bits 15 through 0.

This is the correct justification because for the first term, the MSB of the result ends up in bit 31, and for the second term, the MSB of the result ends up in bit 15. That is correct given that for the first term, you had shifted the input a by 16.

Hope this helps.

Steve
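A minimal sketch of the two terms described above, for anyone who wants to check the alignment numerically. It assumes that TINT32, TUINT32 and TUINT16 are the usual fixed-width types (the thread never shows their definitions), that int is 32 bits, and that a is non-negative. mul_ref64 and mul_split are hypothetical helper names, not part of the original code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed typedefs: the original post does not show how these are defined. */
typedef int32_t  TINT32;
typedef uint32_t TUINT32;
typedef uint16_t TUINT16;

/* The macro as posted (only the line break is new). */
#define mul19dot12by1dot14(a,b) (((((a)>>16)*(b))<<2) + (TINT32) \
    (((TUINT32)(((TUINT16)(a))*(TUINT16)(b)))>>14))

/* Reference: full-width product, then drop the 14 fractional bits of b. */
static TINT32 mul_ref64(TINT32 a, TUINT16 b)
{
    assert(a >= 0);                 /* sketch covers non-negative a only */
    return (TINT32)(((int64_t)a * b) >> 14);
}

/* The split version: high half shifted up by 2, low half shifted down by 14. */
static TINT32 mul_split(TINT32 a, TUINT16 b)
{
    assert(a >= 0 && b <= 0x7FFF);  /* 1.14 with no sign bit, as assumed above */
    return mul19dot12by1dot14(a, b);
}
```

With a = 0x12345 and b = 0x2000 (0.5 in 1.14), the high term contributes 0x8000 and the low term 0x11A2, and their sum 0x91A2 is a/2, which is also what mul_ref64 returns.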
Reply by ●February 16, 2009
vectorizor wrote:

> Ok guys, I'm looking at some computation done in fixed-point, and
> there is one that I don't quite understand. Here it is:
>
> #define mul19dot12by1dot14(a,b) (((((a)>>16)*(b))<<2) + (TINT32)
> (((TUINT32)(((TUINT16)(a))*(TUINT16)(b)))>>14))

[...]

> 1) The first part of the macro: (((a)>>16)*(b))<<2). Shifting a by
> 16 to get its MSBs and multiplying them, OK, but why the <<2 ?!?!

Simple rules: '>>' decreases the fractional digit count, '<<' increases it, and '*' adds the counts of its two operands. If a is 19.12 and b is 1.14, this yields 12 (from a) - 16 (right shift) + 14 (from multiplication by b) + 2 (left shift) = 12 fractional digits.

> 2) Same thing with the second part. Why the >> 14?

Same thing. 12 + 14 - 14 = 12 fractional bits.

> How does the recombination of the two intermediate results work?

They are compatible because both have 12 fractional bits. So you now only have to check that you've used all input bits. If b has 16 bits total, it is always used completely, and the 32 bits of a are split evenly, so that works, too.

Stefan
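The same bookkeeping can be written out as a short derivation. This is a sketch for non-negative a and b, ignoring overflow; the symbols a_h and a_l are names introduced here for the upper and lower 16 bits of a and do not appear in the original macro.

```latex
\begin{align*}
a &= 2^{16} a_h + a_l, \qquad 0 \le a_l < 2^{16} \\
\left\lfloor \frac{a\,b}{2^{14}} \right\rfloor
  &= \left\lfloor \frac{2^{16} a_h b + a_l b}{2^{14}} \right\rfloor
   = \left\lfloor 2^{2} a_h b + \frac{a_l b}{2^{14}} \right\rfloor \\
  &= \underbrace{(a_h b) \cdot 2^{2}}_{\texttt{((a>>16)*b)<<2}}
   + \underbrace{\left\lfloor \frac{a_l b}{2^{14}} \right\rfloor}_{\texttt{(lo(a)*b)>>14}}
\end{align*}
```

The last step holds because the first term is an integer, so it can be moved outside the floor. Dividing the 12 + 14 = 26 fractional bits of the full product by 2^14 leaves 12, which is why the result is again in 19.12.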
Reply by ●February 16, 2009
Stefan Reuther <stefan.news@arcor.de> wrote:

>They are compatible because both have 12 fractional bits. So you now
>only have to check that you've used all input bits. If b has 16 bits
>total, it is always used completely, and the 32 bits of a are split
>evenly, so that works, too.

I think if b is more than 14 bits, and a is 32 bits, you end up overflowing some MSBs and it fails. I could be wrong though. I'd have to stare at it longer and I think I've stared at it long enough. :-)

Steve
Reply by ●February 16, 2009
On Mon, 16 Feb 2009 20:33:19 +0000 (UTC), spope33@speedymail.org (Steve Pope) wrote:

>Stefan Reuther <stefan.news@arcor.de> wrote:
>
>>They are compatible because both have 12 fractional bits. So you now
>>only have to check that you've used all input bits. If b has 16 bits
>>total, it is always used completely, and the 32 bits of a are split
>>evenly, so that works, too.
>
>I think if b is more than 14 bits, and a is 32 bits, you end
>up overflowing some MSB's and it fails. I could be wrong though.
>I'd have to stare at it longer and I think I've stared at
>it long enough. :-)
>
>Steve

One of the things that bothers me about the macro is implicit conversions.

For example, I take it that the first parameter is assumed to be a variable or value that the C compiler will evaluate as a 32-bit unsigned word and that the second parameter is assumed to be a variable or value that the C compiler will evaluate as a 16-bit unsigned word.

Taking a working part of the macro as an example, shifting (a) by 16 bits downward, (a) >> 16, would yield a result that the compiler would maintain as a 32-bit unsigned word. Multiplying it by (b) would cause 'b' to be promoted first to a 32-bit signed value, then to a 32-bit unsigned value (I think it goes in order, like that) before the multiplication is coded. So the compiler would choose a 32x32 multiplication algorithm that tosses away the upper 32 bits of the 64-bit result (without special back-end processing, anyway.) No problem there, really, as the product fits.

But the other part of the macro casts (a) and (b) to 16-bit unsigned values, I gather. In doing so, the compiler may choose a 16x16 multiplication that tosses away the upper 16 bits of the result, since the C compiler expects the result doesn't necessarily have to be any larger than the two values being multiplied.
Since all 16 bits of (a) would be valid [4.12 format with the upper bits tossed away] and since (b) has 15 valid bits [1.14 format], the resulting product (without more analysis) could have 31 bits of useful information. Shifting that downwards by 14 places, as the macro does, might leave 1 valid bit in the upper 16-bit part of the result. And since the C compiler is permitted to toss it, it might get lost and not included in the sum.

So I think this is worth a little more thought to be sure it will always work out.

To understand the macro though, assuming away the above issues for a moment, we have the following:

((a) >> 16) is this:

    0000 0000 0000 0000|0xxx xxxx xxxx xxxx|0000^

where I placed a caret (^) at the location of the implied radix point and where a vertical bar (|) is placed at 16-bit word boundaries. It's only the first two words. I just tacked on the last 0's to allow placement of the implied radix point.

(b) is this:

    0y^yy yyyy yyyy yyyy

assuming that there is no sign present and that the 1.14 designation is accurate.

The macro generates two products. Let's look at the details. I'll put hidden parts (those parts the compiler doesn't "see" but which help us keep track of the radix) in parentheses.

    0000 0000 0000 0000 | 0xxx xxxx xxxx xxxx (|0000^0000)
  x 0000 0000 0000 0000 | 0y^yy yyyy yyyy yyyy
  -------------------------------------------   multiplication
    00zz zzzz zzzz zzzz | zzzz zz^zz zzzz zzzz  <--- yields
  ===========================================   shifted up by 2
    zzzz zzzz zzzz zzzz | zzzz^ zzzz zzzz zz00  <--- yields

Note that this retains all of the integer bits of precision and retains only 10 bits of precision below the radix point. The least significant two bits are lost. Which may, or may not, be okay.

The upshot is that the result of this is, so far, in [20.12] format.

For now, against my earlier argument about word sizes that I think the compiler might use, let's assume that the other product is also done as a 32x32 multiplication.
It then looks like this:

    0000 0000 0000 0000 | xxxx^ xxxx xxxx xxxx
  x 0000 0000 0000 0000 | 0y^yy yyyy yyyy yyyy
  --------------------------------------------   multiplication
    0zzz zz^zz zzzz zzzz | zzzz zzzz zzzz zzzz   <--- yields
  ============================================   shifted down by 14
    0000 0000 0000 000z | zzzz^ zzzz zzzz zzzz   <--- yields

Now, note two things. One is that the implied radix point is located exactly in the same place. This is good. The addition is properly aligned, then, and the result will be sensible.

The other thing to note is that there is a valid bit in the upper word. Which brings me back to my earlier point. When the C compiler gets done with the 16x16 multiplication and takes only the lower 16 bits of the result, it is then faced with the addition step. That requires an implicit conversion to take place if the explicit one didn't already force the issue, promoting the 16-bit result of the second product to a 32-bit value before addition. But in doing so, I am thinking that the upper bit shown above may get lost.

Jon
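The concern about the low-half product can be illustrated with a small sketch. As far as I can tell, the (TUINT32) cast in the macro is applied to the product, not to the operands, so it cannot restore bits that were already dropped if the multiplication itself was carried out in 16 bits. The code below does not reproduce a real 16-bit compiler; it only emulates that truncation with an explicit cast. low_term_32 and low_term_16 are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Low-half term with the product kept at 32 bits before the shift. */
static uint32_t low_term_32(uint32_t a, uint16_t b)
{
    assert(b <= 0x7FFFu);   /* 1.14 with no sign bit set */
    return ((uint32_t)(uint16_t)a * (uint32_t)b) >> 14;
}

/* Same term, but the product is cut to 16 bits first. This emulates what
 * a target with 16-bit int would do; it is not the macro itself. */
static uint32_t low_term_16(uint32_t a, uint16_t b)
{
    assert(b <= 0x7FFFu);
    uint16_t p = (uint16_t)((uint32_t)(uint16_t)a * (uint32_t)b);
    return (uint32_t)(p >> 14);
}
```

For a low half of 0xFFFF and b = 0x4000 (1.0 in 1.14), the 32-bit version gives 0xFFFF while the truncated version gives 3. With b = 0x7FFF the 32-bit version gives 0x1FFFA, a 17-bit value, which is the extra bit in the upper word shown in the diagram above.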
Reply by ●February 16, 2009
Jon Kirwan <jonk@infinitefactors.org> wrote:

>One of the things that bothers me about the macro is implicit
>conversions.
>
>For example, I take it that the first parameter is assumed to be a
>variable or value that the c compiler will evaluate as a 32-bit
>unsigned word and that the second parameter is assumed to be a
>variable or value that the c compiler will evaluate as a 16-bit
>unsigned word.
>
>Taking a working part of the macro as an example, shifting (a) by 16
>bits downward, (a) >> 16, would yield a result that the compiler would
>maintain as a 32-bit unsigned word. Multiplying it by (b) would cause
>'b' to be promoted first to a 32-bit signed value, then to a 32-bit
>unsigned value (I think it goes in order, like that) before the
>multiplication is coded. So the compiler would choose a 32x32
>multiplication algorithm that tosses away the upper 32 bits of the 64
>bit result (without special back-end processing, anyway.) No problem
>there, really, as the product fits.
>
>But the other part of the macro casts (a) and (b) to 16-bit unsigned
>values, I gather. In doing so, the compiler may choose a 16x16
>multiplication that tosses away the upper 16 bits of the result, since
>the c compiler expects the result doesn't necessarily have to be any
>larger than the two values being multiplied.

Not in K&R C. Is this one of the things they changed in ANSI C? In K&R, C Reference Manual, section 6.6 on evaluating expressions, it states "First any operands of type char or short are converted to int..." The upcasting occurs before the multiply.

In my previous replies to this thread I assumed the result of any C multiply operator is 32 bits as would be, or once was, standard. Of course, a C compiler for an embedded target may choose to do things differently, regardless of any standard.

Steve
Reply by ●February 16, 2009
In article <gncuke$vvf$1@blue.rahul.net>, Steve Pope says...

> Jon Kirwan <jonk@infinitefactors.org> wrote:
> >But the other part of the macro casts (a) and (b) to 16-bit unsigned
> >values, I gather. In doing so, the compiler may choose a 16x16
> >multiplication that tosses away the upper 16 bits of the result, since
> >the c compiler expects the result doesn't necessarily have to be any
> >larger than the two values being multiplied.
>
> Not in K&R C. Is this one of the things they changed
> in ANSI C? In K&R, C Reference Manual, section 6.6 on
> evaluating expressions, it states "First any operands of
> type char or short are converted to int..." The upcasting
> occurs before the multiply.

That does assume ints are 32 bits. That assumption does not appear in the original question as far as I can see. And ints are not required to be 32 bits by any of the C standards, AFAIK.

Robert
Reply by ●February 16, 2009
Robert Adsett <sub2@aeolusdevelopment.com> wrote:

>In article <gncuke$vvf$1@blue.rahul.net>, Steve Pope says...

>> Jon Kirwan <jonk@infinitefactors.org> wrote:

>> >But the other part of the macro casts (a) and (b) to 16-bit unsigned
>> >values, I gather. In doing so, the compiler may choose a 16x16
>> >multiplication that tosses away the upper 16 bits of the result, since
>> >the c compiler expects the result doesn't necessarily have to be any
>> >larger than the two values being multiplied.

>> Not in K&R C. Is this one of the things they changed
>> in ANSI C? In K&R, C Reference Manual, section 6.6 on
>> evaluating expressions, it states "First any operands of
>> type char or short are converted to int..." The upcasting
>> occurs before the multiply.

>That does assume ints are 32 bits. That assumption does not appear in
>the original question as far as I can see.

The macro contains the expression

((((a)>>16)*(b))<<2)

If ints are not larger than 16 bits, such an expression will never work. Granted, they could be 24 bits or something. Probably though they are 32.

Being a macro, with no surrounding code, compiler version, or target stated, we don't really know, but I'm pretty certain the above expression is not going to work as expected with 16-bit ints.

>And, ints are not required
>to be 32 bits by any of the C standards AFAIK.

Certainly true. We're forced to either guess at the context here, or probe the OP for more info. I chose to guess. ;-)

Steve
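One way to stop guessing about the width of int is to widen both operands explicitly before each multiplication. The following is a sketch under stated assumptions, not the original author's code: it handles only non-negative values, it assumes the true result fits in 32 bits, and the function name mul_q19_12_by_q1_14 is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Split multiply with every operand widened to 32 bits up front, so the
 * result does not depend on whether int is 16 or 32 bits wide. */
static uint32_t mul_q19_12_by_q1_14(uint32_t a, uint16_t b)
{
    assert(b <= 0x7FFFu);        /* 1.14 with no sign bit set */

    uint32_t hi = a >> 16;       /* upper 16 bits of a */
    uint32_t lo = a & 0xFFFFu;   /* lower 16 bits of a */
    uint32_t bw = b;             /* widen b before multiplying */

    /* High term moves up by 2, low term moves down by 14; both then
     * carry 12 fractional bits and can be added directly. */
    return ((hi * bw) << 2) + ((lo * bw) >> 14);
}
```

For example, mul_q19_12_by_q1_14(0x7FFFFFFF, 0x4000) multiplies by 1.0 and returns 0x7FFFFFFF unchanged. If the high term does not fit in 32 bits the unsigned arithmetic wraps silently, so the caller still has to bound the inputs.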






