Reply by March 9, 2005
"Shawn Steenhagen" <shawn.NSsteenhagen@NSappliedsignalprocessing.com> writes:

> Randy
>
> From our recent phone conversation and e-mails regarding this post, I
> understand why you were offended. My intent was never to insult or offend
> you. My intent was to simply share information. I respect you and all who
> post on the comp.dsp user group and I sincerely apologize for offending you.

Thank you, Shawn. Any young man who can humble himself and make a
public apology as you have just done has regained every ounce of
my respect.
--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
Reply by Shawn Steenhagen March 9, 2005
Randy



From our recent phone conversation and e-mails regarding this post, I
understand why you were offended.  My intent was never to insult or offend
you.  My intent was to simply share information.   I respect you and all who
post on the comp.dsp user group and I sincerely apologize for offending you.



Sincerely



Shawn K. Steenhagen

Applied Signal Processing, Inc.

www.appliedsignalprocessing.com


"Randy Yates" <yates@ieee.org> wrote in message
news:7jkmoyuf.fsf@ieee.org...
> "Shawn Steenhagen" <shawn.NSsteenhagen@NSappliedsignalprocessing.com> writes:
>
> > Randy,
> >
> > Careful about that extra sign bit.
>
> Shawn, your warning is ludicrous. My paper on scaling in FIR
> fixed-point arithmetic - a reference for many who have visited this
> group looking for information on the topic - discusses this point. I
> wrote it years before I ever heard of your name on this group. I am
> also the author of the comp.dsp FAQ item on fixed-point arithmetic
> and as such am well-versed on the issue.
>
> I noticed your tendency to assume I was ignorant at the comp.dsp
> conference. Perhaps you should reexamine your assumptions to save
> yourself further embarrassment.
>
> I'm not going to honor your point with a response except to say that
> you have given me no new information and I still stand by everything
> I have asserted in this thread.
> --
> %    Randy Yates        % "She's sweet on Wagner-I think she'd die for Beethoven.
> %%   Fuquay-Varina, NC  %  She love the way Puccini lays down a tune, and
> %%%  919-577-9882       %  Verdi's always creepin' from her room."
> %%%% <yates@ieee.org>   % "Rockaria", *A New World Record*, ELO
> http://home.earthlink.net/~yatescr
Reply by March 9, 2005
Tim Wescott <tim@wescottnospamdesign.com> writes:

> Randy Yates wrote: > > - snip - > > Do you think that I specially-formulated the posted code so that it > > > would compile efficiently? I did not. In fact the code you have in > > that post is the very first thing I tried. Tim Wescott had asserted > > that vector multiplication was inefficient in C on a DSP. Now I'm not > > sure what sort of strange variations on vector multiplies you want to > > present as exceptions, but it seems to me that the operation is pretty > > clearly defined. > > > > I think that for the most part we're in violent agreement - but you > seem to have (a) have worked mostly with DSP chips with very good tool > support, and (b) to have missed out on the learning curve that many > software folks have when doing DSP for the first time.
Tim,

Perhaps the problem is an ambiguity in what the question is. I thought
the question was, "Is it feasible syntactically to perform fixed-point
arithmetic in C using standard C integer data types?" To that my answer
was and still is yes.

If instead the question was, "Are you reasonably assured you won't get a
significant speed penalty over assembly when performing fixed-point
arithmetic (and other types of operations) on a DSP in C across various
platforms?", then I'd agree the answer is no, but I didn't hear anyone
ask that question.

> Your speed ratios are contrary to my experience with CISC and RISC
> processors, and with older DSP toolsets. I see much closer to 1:1
> ratios with the '2812, but even there you can slow things down if
> you're not careful, and the optimizer will sometimes take your code
> and fling it on the trash pile without telling you (which, I assume,
> is why they call it Code Composter). I even tried to get Code
> Composter for the TMS320F2812 to cough up a MAC instruction -- it
> wouldn't do it for love or money. I'm tempted to try your code
> snippet on one the next time I'm at that particular customer site,
> just to see what happens.
>
> I'm curious though -- since you cite that code snippet why haven't you
> commented on the non-ANSI translation done by the '5xxx "C" compiler?

I just did in an alternate part of this thread.
--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
Reply by March 9, 2005
I omitted a point in my previous response. Here it is:

Randy Yates <randy.yates@sonyericsson.com> writes:

> Tim,
>
> I respond below to your points:
>
> Tim Wescott <tim@wescottnospamdesign.com> writes:
> [...]
> > (B) As usual TI is playing fast and loose with the ANSI standard, and
> > that isn't even close to ANSI-compatible C. If it were the x[n] *
> > y[n] operation would be truncated to 16 bits before being added to
> > acc, and the result would be meaningless. Compile that up on machine
> > that supports 16-bit and 32-bit integers, print out the results, and
> > see what I mean.

You cannot really even begin to state this unless you know how the
types I've custom-defined (e.g., INT16_T, INT32_T) are defined, and
what an "int" represents on the TI 54x platform. That is because
integer types narrower than "int" (i.e., anything that is not an
"int", a "long", or a "long long") are promoted to "int" before the
arithmetic is done.

It so happens that an INT16_T is a "short int" and an "int" is 16
bits, so your statement is correct, but I'm not sure you went through
the proper integer promotion rules to get to your conclusion.
--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
Reply by March 9, 2005
Tim,

I respond below to your points:

Tim Wescott <tim@wescottnospamdesign.com> writes:

> Randy Yates wrote: > > > Tim Wescott <tim@wescottnospamdesign.com> writes: > > > > >>Randy Yates wrote: > >> > >> > >>>Tim Wescott <tim@wescottnospamdesign.com> writes: > >>> > >> > >>>>[...] > >>>>Generally if you stick to pure C you are stuck with integer math. > >>>>DSP's are designed to do fixed-radix math pretty quickly, ... > >>> > >>>Tim, I think most of your points are helpful, but this one is > >>>off-the-mark > >> > >>>in my judgement. The typical fixed-point DSP operates much the same as > >>>the C integer operations, performing integer math. Whether the > >>>integers are reinterpreted to be fractional, fixed-point, or integer > >>>is all in the interpretation and has little or nothing to do with the > >>>implementation of the basic arithmetic operations (add, subtract, > >>>multiply). > >>>Of course there are differences between fixed-point DSP ALUs and the > >> > >>>"ALU" of a C compiler, the biggest of which are probably the wide > >>>accumulators and the saturation options when performing various > >>>operations. There is also the typical "left shift by 1" that a > >>>fractional DSP does after a multiply to make the result fractional, > >>>but that is certainly doable in C as well, albeit manually. > >> > >>The difference in clock ticks between implementing a fixed-point > >>arbitrary-radix vector dot-product in assembly on a DSP and trying to > >>do the same thing to the same precision in C on the same processor is > >>on the order of 100:1. > > Who said anything about a vector operation? Your statement was > > > Generally if you stick to pure C you are stuck with integer math. > > > DSP's are designed to do fixed-radix math pretty quickly, ... > > The term "math" does not mean "vector math" in my interpretation. 
> > > > > >>Even on a MAC-less processor when you are in assembly and multiply two > >>signed numbers N-bit numbers you can choose to take the lower N bits > >>of the 2N-1-bit result as C does, or you can take the upper N-1 bits > >>and do a shift, with way fewer clock cycles (10 or 20:1) than you > >>could implement the same functionality in C. I should know -- I've > >>done it in C a couple of times and in assembly on three or four > >>different processors. > > Apparently they did not include the TI TMS320C54x, arguably one of > > > the most popular DSPs around, and on that processor, the following > > code > > #include "dsptypes.h" > > > /* definitions */ > > > #define VECTOR_LENGTH 64 > > > /* local variables */ > > > /* local function prototypes */ > > > /* function definitions */ > > > int main(int margc, char **margv) { > > > UINT16_T n; > > INT16_T x[VECTOR_LENGTH]; > > INT16_T y[VECTOR_LENGTH]; > > INT32_T acc; > > INT16_T result; > > acc = 0; > > > for (n = 0; n < VECTOR_LENGTH; n++) > > { > > x[n] = n; > > y[n] = VECTOR_LENGTH - n - 1; > > } > > acc = 0; > > > for (n = 0; n < VECTOR_LENGTH; n++) > > { > > acc += x[n] * y[n]; > > } > > result = (INT16_T)(acc >> 16); > > > return result; > > > } > > produces the following assembly language > > > 0000:0108 main > > > 0000:0108 4A11 PSHM 11h > > 0000:0109 4A17 PSHM 17h > > 0000:010A EE80 FRAME -128 > > 0000:010B E781 MVMM SP,AR1 > > 0000:010C 6DE9 MAR *+AR1(64) > > 0000:010E E787 MVMM SP,AR7 > > 0000:010F E782 MVMM SP,AR2 > > 0000:0110 E800 LD #0h,A > > 0000:0111 771A STM 3fh,1ah > > 0000:0113 F072 RPTB 11ah > > 0000:0115 L1 > > 0000:0115 8092 STL A,*AR2+ > > 0000:0116 E93F LD #3fh,B > > 0000:0117 F520 SUB A,0,B > > 0000:0118 8191 STL B,*AR1+ > > 0000:0119 F000 ADD #1h,0,A,A > > 0000:011B L2 > > 0000:011B E782 MVMM SP,AR2 > > 0000:011C 6DEA MAR *+AR2(64) > > 0000:011E E783 MVMM SP,AR3 > > 0000:011F E800 LD #0h,A > > 0000:0120 EC3F RPT #3fh > > 0000:0121 L3 > > 0000:0121 B089 MAC *AR2+,*AR3+,A,A > > 0000:0122 
L4 > > 0000:0122 F0E0 SFTL A,0,A > > 0000:0123 F0F0 SFTL A,-16,A > > 0000:0124 6BF8 ADDM 80h,*(18h) > > 0000:0127 F495 NOP 0000:0128 F495 NOP 0000:0129 8A17 > > POPM 17h > > > 0000:012A 8A11 POPM 11h > > 0000:012B F4E4 FRET Both the vector multiply and the end shift > > look pretty damn efficient > > > to me, Tim. Thus even if we agree to interpret your point > > differently, it's still > > > inaccurate for one of the most popular DSPs in the world. > > (A) That is the _only_ case that I know of for sure that the compiler > can figure out it needs to use a MAC and shift -- the version of Code > Composter that comes with the '2812 certainly doesn't do this, or I > couldn't find the magic finger-ring combination.
By "case" do you mean "case of compiler version and architecture"?
Perhaps you are right. I didn't really "sneak" this combination on you -
I work with the 54x a LOT and it was the easiest thing for me to try. And
the vector multiply sample seems straightforward.

CEVA's Teak compiler produced 22 lines of assembly for the
multiply-accumulate step, so it is clearly inefficient. Due to licensing
and confidentiality issues, I'm not posting the result here.

However, the TI 5510 C compiler also produced efficient code:

;*******************************************************************************
;* TMS320C55x C/C++ Codegen Unix Version 2.40 *
;* Date/Time created: Wed Mar 9 09:07:28 2005 *
;*******************************************************************************
        .mmregs
        .cpl_on
        .arms_on
        .c54cm_off
        .asg AR6, FP
        .asg XAR6, XFP
        .asg DPH, MDP
        .model call=c55_std
        .model mem=large
        .noremark 5549 ; code avoids SE CPU_28
        .noremark 5558 ; code avoids SE CPU_33
        .noremark 5570 ; code avoids SE CPU_40
        .noremark 5571 ; code avoids SE CPU_41
        .noremark 5573 ; code avoids SE CPU_43
        .noremark 5584 ; code avoids SE CPU_47
        .noremark 5599 ; code avoids SE CPU_55
        .noremark 5503 ; code avoids SE CPU_84 MMR write
        .noremark 5505 ; code avoids SE CPU_84 MMR read
        .noremark 5002 ; code respects overwrite rules
;*******************************************************************************
;* GLOBAL FILE PARAMETERS *
;* *
;* Architecture : TMS320C55x *
;* Optimization : Always Choose Smaller Code Size *
;* Memory : Large Model (23-Bit Data Pointers) *
;* Calls : Normal Library ASM calls *
;* Debug Info : Standard TI Debug Information *
;*******************************************************************************
        .file "/home/unix/us057845/projects/simmac/ti/simmac.c"
; opt55 -m -O3 /var/tmp/aaaa0057b /var/tmp/daaa0057b -w /home/unix/us057845/projects/simmac/ti/dsp-5510/
        .sect ".text"
        .align 4
        .global _main
        .sym _main,_main, 36, 2, 0
        .func 13
;----------------------------------------------------------------------
; 13 | int main(int margc, char **margv)
;----------------------------------------------------------------------
;*******************************************************************************
;* FUNCTION NAME: _main *
;* *
;* Function Uses Regs : AC0,AC0,AC1,AC1,T0,AR1,AR2,XAR2,AR3,XAR3,AR4,FP,XFP, *
;* SP,BRC0,CARRY,M40,SATA,SATD,RDM,FRCT,SMUL *
;* Save On Entry Regs : FP *
;* Stack Frame : Full (Frame Pointer in AR6, w/ debug) *
;* Total Frame Size : 131 words *
;* (2 return address/alignment) *
;* (1 frame pointer) *
;* (128 local values) *
;*******************************************************************************
;*******************************************************************************
;* *
;* Using -g (debug) with optimization (-o3) may disable key optimizations! *
;* *
;*******************************************************************************
_main:
        .line 2
;----------------------------------------------------------------------
; 15 | UINT16_T n;
; 16 | INT16_T x[VECTOR_LENGTH];
; 17 | INT16_T y[VECTOR_LENGTH];
; 18 | INT32_T acc;
; 19 | INT16_T result;
; 21 | acc = 0;
;----------------------------------------------------------------------
;* T0 assigned to _margc
        .sym _margc,12, 4, 17, 16
;* AR0 assigned to _margv
        .sym _margv,17, 82, 17, 23
;* BRC0 assigned to L$1
;* BRC0 assigned to L$2
;* AR1 assigned to L$2
;* AR1 assigned to L$1
;* AR3 assigned to U$12
;* AR3 assigned to U$12
;* AR2 assigned to U$5
;* AR2 assigned to U$5
;* AR1 assigned to _n
        .sym _n,18, 13, 4, 16
;* AC0 assigned to _acc
        .sym _acc,0, 5, 4, 32
        .sym _x,0, 51, 1, 1024,, 64
        .sym _y,64, 51, 1, 1024,, 64
        PSHBOTH XFP
        ADD #-128, mmap(SP)
        MOV XSP, XAR3
        MOV XSP, XAR2
        AMAR *+AR3(#64)
        MOV XSP, XFP
        .line 10
;----------------------------------------------------------------------
; 22 | for (n = 0; n < VECTOR_LENGTH; n++)
;----------------------------------------------------------------------
        MOV #63, BRC0
        RPTBLOCAL L2-1
||      MOV #0, AR1
                                        ; loop starts
L1:
        .line 12
;----------------------------------------------------------------------
; 24 | x[n] = n;
;----------------------------------------------------------------------
        MOV AR1, *AR2+ ; |24|
        .line 13
;----------------------------------------------------------------------
; 25 | y[n] = VECTOR_LENGTH - n - 1;
;----------------------------------------------------------------------
        MOV #63, AR4 ; |25|
        SUB AR1, AR4 ; |25|
        MOV AR4, *AR3+ ; |25|
        .line 14
        ADD #1, AR1 ; |26|
                                        ; loop ends ; |26|
L2:
        MOV XSP, XAR3
        MOV XSP, XAR2
        AMAR *+AR3(#64)
        .line 16
;----------------------------------------------------------------------
; 28 | acc = 0;
; 29 | for (n = 0; n < VECTOR_LENGTH; n++)
;----------------------------------------------------------------------
        MOV #63, BRC0
        RPTBLOCAL L4-1
||      MOV #0, AC0 ; |28|
                                        ; loop starts
L3:
        .line 19
;----------------------------------------------------------------------
; 31 | acc += x[n] * y[n];
;----------------------------------------------------------------------
        MPYM *AR3+, *AR2+, AC1 ; |31|
        MOV mmap(AC1L), AC1 ; |31|
        ADD AC1, AC0 ; |31|
        .line 20
;----------------------------------------------------------------------
; 34 | result = (INT16_T)(acc >> 16);
;----------------------------------------------------------------------
                                        ; loop ends ; |32|
L4:
        .line 24
;----------------------------------------------------------------------
; 36 | return result;
;----------------------------------------------------------------------
        MOV HI(AC0), T0
        .line 25
        ADD #128, mmap(SP) ; |36|
        POPBOTH XFP
        RET ; |36|
                                        ; return occurs ; |36|
        .endfunc 37,000000080h,129
;*******************************************************************************
;* TYPE INFORMATION *
;*******************************************************************************
        .sym _INT16_T, 0, 3, 13, 16
        .sym _UINT16_T, 0, 13, 13, 16
        .sym _INT32_T, 0, 5, 13, 32
> (B) As usual TI is playing fast and loose with the ANSI standard, and
> that isn't even close to ANSI-compatible C. If it were the x[n] *
> y[n] operation would be truncated to 16 bits before being added to
> acc, and the result would be meaningless. Compile that up on machine
> that supports 16-bit and 32-bit integers, print out the results, and
> see what I mean.

According to the C99 standard, ISO/IEC 9899:1999(E), the compiler is not
in violation. In a nutshell, when the product overflows a 16-bit "int"
the result of the operation is undefined, and so performing the operation
in extended precision is acceptable. See paragraph 2 of section 3.4.3 on
"undefined behavior."

However, you're right in that what I wrote was a bit sloppy and
non-portable. The fix is very simple; change the line in question to:

  acc += (INT32_T)x[n] * y[n];
> Furthermore, you are actually making my point: by starting with an
> awareness of the one thing that sets a DSP apart from the rest and
> warping your code to fit that one thing you can make the operation
> very fast. But in production code you will have to be constantly on
> guard to make sure that the C code isn't "improved" in such a way that
> makes the compiler implement it as a bunch of "traditional" integer
> operations, thereby making it take 10-100 times slower, and likely
> reintroducing the truncation (more like 10, TI is good at making fast
> processors).

Well I'm not sure what your point is anymore. First you started out by
stating that one is stuck with integer math when using pure C. Then you
backed off and stated that your point was that vector multiplies are
inefficient. Now it seems like you're saying "they're inefficient
sometimes on some platforms." At least that seems to be the case.

The key point I wanted to make to the OP and to you is that it is
feasible to perform fixed-point arithmetic in C (whether on a DSP or
not), and that it even runs reasonably well on some platforms, i.e.
within 1000 percent of hand-tweaked assembly. (This definition of
reasonable is certainly debatable depending on the application's
requirements.)

A side issue that developed (i.e., via Jerry) seems to be whether or not
the fractional "left shift by one" is significant in the determination
to perform fixed-point processing in C. I say it is not for the
following reasons:

  1. The "left shift by one" issue is not really just for "fractional"
     processing but rather for any fixed-point multiply. The question is
     generally whether the total dynamic range needs to be preserved or
     the SNR is more important.

  2. A requirement to left shift by one can be brought out of the inner
     loop in MAC type of operations and thus be made negligible.

  3. It is not always desirable to left shift by one and take the high
     16 bits after multiplying two fixed-point 16-bit values into a
     resulting 32-bit value. The N bits you take from M bits that result
     from a fixed-point operation when N < M can be all over the map
     depending on several application issues.
--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
Reply by Shawn Steenhagen March 8, 2005
Daniel:

Check out Tables 8-3 and 8-4 of section 8.5.2 of the C6000 Compilers user
guide:

http://focus.ti.com/lit/ug/spru187l/spru187l.pdf

This section lists the intrinsics that the 64x compiler recognizes.  These
intrinsics cover operations common in Q15 arithmetic. For example, _smpy( )
will multiply and shift left by 1 with saturation protection.

I also see that the "int" data type on the 64x is 32 bits, a "short" is
16 bits, and a "long" is 40 bits (Table 7-1), so keep that in mind as well.

-Shawn


"Daniel" <d.lohausen@freenet.de> wrote in message
news:e83ccc31.0503080137.2f641499@posting.google.com...
> d.lohausen@freenet.de (Daniel) wrote in message
news:<e83ccc31.0503030745.68a487cc@posting.google.com>...
> > Hello Everybody, > > > > for my diploma thesis, I have to implement a Least-Mean-Square > > Algorithm on a fixed-point DSP (TI 6416). The LMS was implemented on a > > floating-point processor(TI 6713) earlier, so I just to the code and > > copied it. Of course, there are a lot of float variables in the code. > > When I ran the program, it workes for small FIR orders (6), but the > > larger the order of the filter, the worse the result. > > Is this because the 6416 cannot work with floating-point numbers > > accuretly? > > What can I do? When I convert all the float variables to integers I > > get overflow problems. > > > > Thanks a lot > > Daniel > > Hello Everybody, > > thanks for all you help. I found out that the algorithm was too slow > in my application. For higher filter orders, the LMS was still > calculating when new samples from the codec had arrived. A software > interrupt was called that started the LMS from the beginning, so it > didn`t end the previous cycle. > Now I removed all the codec stuff and I works for higher filter > orders, but, of course, far too slow. Now I have to adapt the > algorithm to fixed-point. > > I read all the posts and I have a new question: I understand what the > Q.15 data type is but how can I use in in C? Is there a special data > type or what? > > Thanks again > Daniel
Reply by March 8, 2005
d.lohausen@freenet.de (Daniel) writes:

> d.lohausen@freenet.de (Daniel) wrote in message news:<e83ccc31.0503030745.68a487cc@posting.google.com>... > > Hello Everybody, > > > > for my diploma thesis, I have to implement a Least-Mean-Square > > Algorithm on a fixed-point DSP (TI 6416). The LMS was implemented on a > > floating-point processor(TI 6713) earlier, so I just to the code and > > copied it. Of course, there are a lot of float variables in the code. > > When I ran the program, it workes for small FIR orders (6), but the > > larger the order of the filter, the worse the result. > > Is this because the 6416 cannot work with floating-point numbers > > accuretly? > > What can I do? When I convert all the float variables to integers I > > get overflow problems. > > > > Thanks a lot > > Daniel > > Hello Everybody, > > thanks for all you help. I found out that the algorithm was too slow > in my application. For higher filter orders, the LMS was still > calculating when new samples from the codec had arrived. A software > interrupt was called that started the LMS from the beginning, so it > didn`t end the previous cycle. > Now I removed all the codec stuff and I works for higher filter > orders, but, of course, far too slow. Now I have to adapt the > algorithm to fixed-point. > > I read all the posts and I have a new question: I understand what the > Q.15 data type is but how can I use in in C? Is there a special data > type or what?
No. Simply use a 16-bit integer on your platform and then follow
the rules for fixed-point arithmetic, which I give in my paper:

  http://home.earthlink.net/~yatescr/fp.pdf

You should also be careful, as Tim Wescott has correctly pointed out,
to ensure intermediate results are properly computed by the C compiler.
E.g., to ensure the two 16-bit values are multiplied using a 32-bit
result, do the following explicit cast:

  INT32_T y;   /* this is scaled A(1, 30) */
  INT16_T x1;  /* this is scaled A(0, 15) */
  INT16_T x2;  /* this is scaled A(0, 15) */

  y = (INT32_T)x1 * x2;

Similarly, if you want to add two A(0, 15) values, you will still want
to explicitly cast one operand to 32 bits to ensure you avoid overflows.

You're going to have to become an expert on C type conversions to
be sure things work the way you expect.
--
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124
Reply by steve March 8, 2005
Daniel wrote:
> d.lohausen@freenet.de (Daniel) wrote in message
news:<e83ccc31.0503030745.68a487cc@posting.google.com>...
> > Hello Everybody, > > > > for my diploma thesis, I have to implement a Least-Mean-Square > > Algorithm on a fixed-point DSP (TI 6416). The LMS was implemented
on a
> > floating-point processor(TI 6713) earlier, so I just to the code
and
> > copied it. Of course, there are a lot of float variables in the
code.
> > When I ran the program, it workes for small FIR orders (6), but the > > larger the order of the filter, the worse the result. > > Is this because the 6416 cannot work with floating-point numbers > > accuretly? > > What can I do? When I convert all the float variables to integers I > > get overflow problems. > > > > Thanks a lot > > Daniel > > Hello Everybody, > > thanks for all you help. I found out that the algorithm was too slow > in my application. For higher filter orders, the LMS was still > calculating when new samples from the codec had arrived. A software > interrupt was called that started the LMS from the beginning, so it > didn`t end the previous cycle. > Now I removed all the codec stuff and I works for higher filter > orders, but, of course, far too slow. Now I have to adapt the > algorithm to fixed-point. > > I read all the posts and I have a new question: I understand what the > Q.15 data type is but how can I use in in C? Is there a special data > type or what? > > Thanks again > Daniel
Makes sense. Just use an integer type, play around with a few multiplies,
and look at the results. Everything will make sense if you try a small
algorithm.
Reply by john March 8, 2005
Daniel wrote:
> d.lohausen@freenet.de (Daniel) wrote in message
news:<e83ccc31.0503030745.68a487cc@posting.google.com>...
> > Hello Everybody, > > > > for my diploma thesis, I have to implement a Least-Mean-Square > > Algorithm on a fixed-point DSP (TI 6416). The LMS was implemented
on a
> > floating-point processor(TI 6713) earlier, so I just to the code
and
> > copied it. Of course, there are a lot of float variables in the
code.
> > When I ran the program, it workes for small FIR orders (6), but the > > larger the order of the filter, the worse the result. > > Is this because the 6416 cannot work with floating-point numbers > > accuretly? > > What can I do? When I convert all the float variables to integers I > > get overflow problems. > > > > Thanks a lot > > Daniel > > Hello Everybody, > > thanks for all you help. I found out that the algorithm was too slow > in my application. For higher filter orders, the LMS was still > calculating when new samples from the codec had arrived. A software > interrupt was called that started the LMS from the beginning, so it > didn`t end the previous cycle. > Now I removed all the codec stuff and I works for higher filter > orders, but, of course, far too slow. Now I have to adapt the > algorithm to fixed-point. > > I read all the posts and I have a new question: I understand what the > Q.15 data type is but how can I use in in C? Is there a special data > type or what? > > Thanks again > Daniel
I think this has been discussed in this thread already.

John
Reply by Daniel March 8, 2005
d.lohausen@freenet.de (Daniel) wrote in message news:<e83ccc31.0503030745.68a487cc@posting.google.com>...
> Hello Everybody, > > for my diploma thesis, I have to implement a Least-Mean-Square > Algorithm on a fixed-point DSP (TI 6416). The LMS was implemented on a > floating-point processor(TI 6713) earlier, so I just to the code and > copied it. Of course, there are a lot of float variables in the code. > When I ran the program, it workes for small FIR orders (6), but the > larger the order of the filter, the worse the result. > Is this because the 6416 cannot work with floating-point numbers > accuretly? > What can I do? When I convert all the float variables to integers I > get overflow problems. > > Thanks a lot > Daniel
Hello Everybody,

thanks for all your help. I found out that the algorithm was too slow
in my application. For higher filter orders, the LMS was still
calculating when new samples from the codec had arrived. A software
interrupt was called that started the LMS from the beginning, so it
didn't finish the previous cycle.

Now I have removed all the codec stuff and it works for higher filter
orders, but, of course, far too slowly. Now I have to adapt the
algorithm to fixed-point.

I read all the posts and I have a new question: I understand what the
Q.15 data type is, but how can I use it in C? Is there a special data
type or what?

Thanks again
Daniel