Hello everyone! I'm trying to implement Goertzel's algorithm on a TI
C6711 board using fixed-point arithmetic. The problem is that I always end up
with overflow. I'm using Q15 format for the filter coefficients, and I
know that the input samples are also coded in Q15. The problem seems to
come from the additions that need to be done at each step, which make the final
result overflow. I use short (16-bit) integers to store the coefficients and the
input, and 32-bit integers for the results at each step. Pardon my ignorance,
but I really can't understand how I can avoid this. Moreover, since the
sum of two Q15 numbers is also Q15, it's totally incomprehensible to me
what, for example, an 18-bit Q15 number means. I suppose I either have to use
saturation or scale down my input (which comes from the built-in microphones on
the PCM 3003 codec), but I don't know how to do it. I'd be grateful
if anyone could give a little help.
Andrew Milias
student @ E.E. department
University of Patras
Greece
DTMF decoding using fixed-point arithmetic
Started by ●April 26, 2006
Reply by ●April 27, 2006
The C6711 is a floating-point processor, so you don't need to spend time on
Q-format conversions.
You can directly use the SP (single-precision) and DP (double-precision) intrinsics or assembly instructions for your algorithm;
this avoids the overflow issues you are seeing.
See "Help" in CCS for writing filter code; there are lots of sample codes available.
thanks,
radha.
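For readers who take the floating-point route suggested above, a minimal single-precision Goertzel sketch in plain C might look like the following. The function name, the parameter names, and the choice to pass in a precomputed coefficient 2*cos(2*pi*f/fs) are assumptions of this sketch, not part of any TI library or CCS example.

```c
#include <stddef.h>

/* Goertzel power of one frequency bin in single-precision floating point.
 * coeff is 2*cos(2*pi*f/fs), precomputed by the caller.
 * Names are illustrative only, not a TI library API. */
float goertzel_power_f32(const float *x, size_t n, float coeff)
{
    float s1 = 0.0f;   /* state s[k-1] */
    float s2 = 0.0f;   /* state s[k-2] */
    size_t i;

    for (i = 0; i < n; i++) {
        /* s[k] = x[k] + coeff * s[k-1] - s[k-2] */
        float s0 = x[i] + coeff * s1 - s2;
        s2 = s1;
        s1 = s0;
    }
    /* |X(f)|^2 = s1^2 + s2^2 - coeff * s1 * s2 */
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}
```

For a 770 Hz tone sampled at 8 kHz (both values assumed here for illustration), coeff would be 2*cos(2*pi*770/8000), roughly 1.645. A detector would typically call this once per DTMF frequency on the same block and compare the returned powers.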
Reply by ●April 27, 2006
Hi,
Your assumption
>since the sum of two Q15 numbers is also Q15
is not always true. The sum may need Q1.14 (e.g. 0.99 + 0.99 = 1.98). If 0.99 is in Q15, then 1.98 will not fit in
16-bit Q15; it should be represented in Q1.14.
So your further calculations should be done depending
on which format the latest result is in, and
the final result should be displayed in the appropriate Q
format.
-Vishvanath
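To make the 0.99 + 0.99 case concrete, here is a small C sketch; the helper names are made up for illustration. It shows the same sum kept in a 32-bit integer, where it still carries Q15 scaling, and forced back into 16 bits, where the top bit is lost and the value wraps.

```c
#include <stdint.h>

/* A Q15 value q stands for the real number q / 32768. */
double q15_to_double(int32_t q)
{
    return (double)q / 32768.0;
}

/* Sum kept in 32 bits: same Q15 scaling, with headroom above 16 bits. */
int32_t q15_add_wide(int16_t a, int16_t b)
{
    return (int32_t)a + (int32_t)b;
}

/* Sum squeezed back into 16 bits. Unsigned arithmetic wraps modulo 65536;
 * the conversion back to int16_t is implementation-defined in C, but it
 * wraps on the usual two's-complement compilers. */
int16_t q15_add_wrap(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}
```

With a = b = 32440 (0.99 * 32768, truncated), the wide sum is 64880, which reads as about 1.98 under Q15 scaling, while the 16-bit sum wraps to -656, which reads as about -0.02.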
Reply by ●April 28, 2006
Jeff,
I said we should view the result of the addition in Q1.14
format, else the last bit will be lost.
E.g.: could you tell me how you would represent 1.98 in
Q15 format (with a 16-bit register)?
Your statement is absolutely correct: overflow
happens and some bits are lost. But if you maintain
the same Q format (Q15) throughout your additions, then
your final result will be wrong.
When you look at it in binary, it is just a matter of placing the binary
point in the proper position; I mean it is just how we view
the number.
Regards,
-Vishvanath
Reply by ●April 28, 2006
Vishvanath-
> Your assumption
>>since the sum of two Q15 numbers is also Q15
> is not always true. The sum may need Q1.14 (e.g. 0.99 + 0.99 = 1.98). If 0.99 is in Q15, then 1.98 will not fit in
> 16-bit Q15; it should be represented in Q1.14.
This will confuse the OP. Adding a series of Q15 numbers *always results*
in a Q15 number, although the result may have extra bits to the left of the
binary point. If the register is not long enough, or there is no
guard band in the accumulator, then the result will overflow and some bits
will be lost.
-Jeff
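The point above can be sketched in C, assuming a 32-bit integer stands in for an accumulator with guard bits; the helper names are hypothetical. The running sum keeps Q15 scaling the whole time, and saturation is applied only if the result has to go back into 16 bits. Under this reading, an "18-bit Q15 number" is simply a Q15-scaled value that needs 18 bits, for example the sum of up to four 16-bit Q15 samples.

```c
#include <stdint.h>
#include <stddef.h>

/* Sum n Q15 samples in a 32-bit accumulator. The result keeps Q15 scaling
 * (value = acc / 32768); the upper bits act as guard bits. A 32-bit
 * accumulator cannot overflow here as long as n <= 65536. */
int32_t q15_sum(const int16_t *x, size_t n)
{
    int32_t acc = 0;
    size_t i;

    for (i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Clamp a wide Q15 value to the 16-bit range instead of letting it wrap. */
int16_t q15_saturate(int32_t acc)
{
    if (acc > INT16_MAX) return INT16_MAX;
    if (acc < INT16_MIN) return INT16_MIN;
    return (int16_t)acc;
}
```

Saturation does not recover the lost range; it only replaces a wrapped, sign-flipped result with the nearest representable value.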
Reply by ●April 28, 2006
Vishvanath-
> I said we should view the result of the addition in Q1.14
> format, else the last bit will be lost.
> E.g.: could you tell me how you would represent 1.98 in
> Q15 format (with a 16-bit register)?
After adding 0.99 and 0.99, both in Q15 format, with a 16-bit accumulator,
the msb is already lost and the result no longer makes sense. Your
suggestion to use Q14 format to change how we 'view' the result will not
save the bit.
But I know what you mean; it just has to be said differently. It would make
sense to apply Q14 format to the values *before* adding (shift each one right by one bit), in which case a
result of 1.98 can fit in the 16-bit accumulator.
> Your statement is absolutely correct: overflow
> happens and some bits are lost. But if you maintain
> the same Q format (Q15) throughout your additions, then
> your final result will be wrong.
If the accumulator has a guard band, as many do, including all DSPs that
I've used, then you can continue to add Q15 numbers. For example, if you
have to average 4 Q15 numbers, and you have an 8-bit guard band, then you
can add them, shift the final sum right by 2 (divide by 4), and you have a
Q15 answer. The "Q view" never changes.
-Jeff
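The averaging example can be written out in C as a short sketch. It assumes a 32-bit integer plays the role of the accumulator with guard bits, and the function name is made up for illustration.

```c
#include <stdint.h>

/* Average four Q15 values: add in a wider accumulator, then shift right
 * by 2. The sum needs at most 18 bits; the result fits in 16 bits again.
 * Right-shifting a negative value is implementation-defined in C, but it
 * is an arithmetic shift on the usual DSP and desktop compilers. */
int16_t q15_average4(int16_t a, int16_t b, int16_t c, int16_t d)
{
    int32_t acc = (int32_t)a + b + c + d;   /* Q15 scaling, 2 guard bits used */
    return (int16_t)(acc >> 2);             /* divide by 4, still Q15 */
}
```

The inputs and the output are all Q15; only the intermediate sum needs the extra bits.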
Reply by ●April 28, 2006
Thank you all very much for the help. I agree that using fixed point arithmetic
on a floating point processor is a little weird, but this is what I've been
assigned to do! I'm currently analyzing the new things I've learned
through your posts, and I'll come back soon with more questions!
Thank you very much again.
Andrew Milias
Reply by ●April 28, 2006
Hi
The data format needs careful selection; it is not straightforward. I would suggest you first find the maximum and minimum values of the delayed outputs and of the current output, using worst-case input data, for the loop size (block length N) you are using to compute the magnitude.
Then choose the formats accordingly. It is quite tricky to do. First implement it in floating point in MATLAB, then implement it in fixed point based on the
ranges of y(n), y(n-1), x(n-1) and x(n).
The most popular algorithm for DTMF detection is the Goertzel algorithm. It is in fact a single-point DFT realised as an IIR filter (first order with a complex coefficient, usually implemented as a second-order section with real coefficients). So the important fact is that the Q format depends on the loop length N, on the input data format (which needs scaling), and on the coefficient format.
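As an illustration of how these choices can fit together, below is a hedged fixed-point Goertzel sketch in C. Everything specific in it is an assumption of the sketch rather than something stated in this thread: the function name, the Q14 coefficient format (chosen because 2*cos(w) can approach 2 and does not fit in Q15), the optional input shift, 32-bit state variables, and a compiler that provides 64-bit integers for the intermediate products. A rough worst-case bound on the state is N * max|x| / |sin(w)|; with, say, N = 205 and full-scale Q15 input at the DTMF frequencies for 8 kHz sampling, that is on the order of 1.3e7, which fits comfortably in 32 bits.

```c
#include <stdint.h>
#include <stddef.h>

/* Fixed-point Goertzel sketch (names and formats are assumptions).
 * x         : Q15 input samples
 * n         : block length
 * coeff_q14 : 2*cos(2*pi*f/fs) in Q14
 * in_shift  : right shift applied to each sample for extra headroom
 * Returns the squared magnitude in Q30 units of the (shifted) input.
 * Right shifts of negative values are assumed to be arithmetic. */
int64_t goertzel_power_q15(const int16_t *x, size_t n,
                           int16_t coeff_q14, int in_shift)
{
    int32_t s1 = 0;   /* state s[k-1], Q15 scaling with integer headroom */
    int32_t s2 = 0;   /* state s[k-2] */
    int64_t cross;
    size_t i;

    for (i = 0; i < n; i++) {
        int32_t xs = x[i] >> in_shift;
        /* Q14 * Q15 -> Q29; shift by 14 to return to Q15 scaling. */
        int32_t prod = (int32_t)(((int64_t)coeff_q14 * s1) >> 14);
        int32_t s0 = xs + prod - s2;
        s2 = s1;
        s1 = s0;
    }
    /* |X|^2 = s1^2 + s2^2 - coeff * s1 * s2, all in 64 bits. */
    cross = (((int64_t)coeff_q14 * s1) >> 14) * s2;
    return (int64_t)s1 * s1 + (int64_t)s2 * s2 - cross;
}
```

If the state had to live in 16 bits instead, the same bound suggests a much larger input shift would be needed, at the cost of input resolution; checking the ranges against a floating-point reference first, as suggested above, is one way to pick that shift.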
Hope this helps.
best regards
Amaresh
Reply by ●April 28, 2006
Jeff,
I agree with most of your points. I still have one doubt, or
rather you can correct my understanding if it's wrong.
A Q15 format has:
0 integer bits (I), 1 sign bit (S), 15 fractional bits (F)
So Q15 + Q15 would result in
0 I's, 2 S's, 30 F's, of which the last 16 F's will be
automatically lost (since it is a 16-bit register).
So we are left with
0 I's, 2 S's, 14 F's.
So now you left-shift it by 1 to get rid of the extra sign bit:
0 I's, 1 S, 15 F's (the lsb fraction bit being 0).
In case of overflow, the overflow bit will be sitting
in place of the sign bit, so can we use that as the
integer bit of Q14 (which is 1 I, 1 S, 14 F) and display
the result?
Please correct me if I am wrong.
Thanks,
-Vishvanath
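For reference, the layout described above (two sign bits followed by 30 fractional bits) is what a Q15 by Q15 multiplication leaves in a 32-bit register, rather than what an addition produces. A short C sketch of that case follows; the helper names are invented for illustration, and an arithmetic right shift of negative values is assumed.

```c
#include <stdint.h>

/* Q15 x Q15 product held in 32 bits: Q30, with two sign bits on top. */
int32_t q15_mul_q30(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Same product brought back to Q15 by dropping 15 fractional bits
 * (truncation). The only case that does not fit is (-1) * (-1), which
 * would give +1.0, so it is saturated to the largest Q15 value. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;   /* Q30 */
    p >>= 15;                               /* Q15 scaling */
    if (p > INT16_MAX)
        p = INT16_MAX;
    return (int16_t)p;
}
```

For example, 0.5 * 0.5 is 16384 * 16384 = 268435456, which is 0.25 in Q30, and shifting right by 15 gives 8192, which is 0.25 in Q15.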
I agree to the most of your points. Still one doubt or
rather you can correct my understanding if its wrong,
A Q15 format :
0 Integer bits(I),1 Sign Bit(S),15 fractional bit(F)
So Q15 + Q 15 would result,
0 I's, 2 S's, 30 F's of which last 16 F's will be
automatically lost (since it is 16 bit register)
So we are left with,
0 I's 2 S's 14 F's
So now you left shift it by 1 to get rid of sign,
0 I's 1 S 15 F (lsb fraction bit being 0)
In case of overeflow the overflow bit will be sitting
in place of sign bit so can we use that as one of the
integer bit of Q14 (- which is 1I 1S 14 F) and display
the result ????
Please correct me when i am wrong,
Thanks,
-Vishvanath
--- Jeff Brower wrote:
> Vishvanath-
>
> > I said we should view the result of addition in Q1.14
> > format, else the last bit will be lost.
> > Eg : Could you tell me how do you represent 1.99 in
> > Q15 format (16 bit is the register width) ?
>
> After adding 0.99 and 0.99 both in Q15 format with a 16-bit accumulator,
> the msb is already lost and the result no longer makes sense. Your
> suggestion to use Q14 format to change how we 'view' the result will not
> save the bit.
>
> But I know what you mean, just has to be said differently. It would make
> sense to apply Q14 format to values *before* adding, in which case a
> result of 1.99 can fit in the 16-bit accumulator result.
>
> > Your statement is absolutely correct - that overflow
> > happens and some bits are lost - But if you maintain
> > the same Q format (Q15) throughout your additions then
> > your final result will be wrong.
>
> If the accumulator has a guard-band, as many do including all DSPs that
> I've used, then you can continue to add Q15 numbers. For example, if you
> have to average 4 Q15 numbers, and you have an 8-bit guard-band, then you
> can add them, shift the final sum right by 2 (divide by 4) and you have a
> Q15 answer. The "Q view" never changes.
>
> -Jeff
>
> > --- Jeff Brower wrote:
> >
> >> Vishvanath-
> >>
> >> > Your assumption
> >> >>since the sum of two Q15 numbers is also Q15
> >> > is not always true. It can be Q1.14 (Eg: .99+.99 =
> >> > 1.99 ) If 0.99 is in Q15 then 1.99 will not fit in
> >> > Q15, It should be represented in Q1.14
> >>
> >> This will confuse the OP. Adding a series of Q15 numbers *always results*
> >> in a Q15 number, although the result may have extra bits to left of
> >> decimal point. If the register is not long enough, or there is not a
> >> guard-band in the accumulator, then the result will overflow and some bits
> >> are lost.
> >>
> >> -Jeff
> >>
> >> > --- a...@upnet.gr wrote:
> >> >
> >> >> Hello everyone! I'm trying to implement Goertzel's
> >> >> algorithm on TI C6711 board using fixed-point
> >> >> arithmetic. The problem is that I always end up with
> >> >> overflow...I'm using Q15 format for the filter
> >> >> coefficients, and I know that the samples of the
> >> >> input are also coded in Q15. The problem seems to
> >> >> come from the additions that need to be done at each
> >> >> step, which make the final result overflow. I use
> >> >> short (16-bit) integers to store the coefficients
> >> >> and the input, and 32-bit integers for the results
> >> >> at each step. Pardon my ignorance, but I really
> >> >> can't understand how I can avoid this...Moreover,
> >> >> since the sum of two Q15 numbers is also Q15, it's
> >> >> totally incomprehensible to me what, for example,
> >> >> an 18-bit Q15 number means. I suppose I either have to
> >> >> use saturation or scale down my input (which comes
> >> >> from the built-in microphones on the PCM 3003
> >> >> codec), but I don't know how to do it...I'd be
> >> >> grateful if anyone could give a little help.
> >> >>
> >> >> Andrew Milias
> >> >> student @ E.E. department
> >> >> University of Patras
> >> >> Greece
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> >
Reply by ●May 1, 2006
Vishvanath-
> I agree with most of your points. Still one doubt, or
> rather you can correct my understanding if it's wrong,
>
> A Q15 format :
> 0 Integer bits(I),1 Sign Bit(S),15 fractional bit(F)
>
> So Q15 + Q 15 would result,
> 0 I's, 2 S's, 30 F's of which last 16 F's will be
> automatically lost (since it is 16 bit register)
This is true for a multiply of two Q15 numbers, not for addition. After adding,
the position of the binary point does not change and there is still only 1 sign bit.
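
To make the addition case concrete, here is a minimal sketch in Python, used only to model register widths (it is not C6711 code). The helpers `to_q15` and `wrap` are hypothetical names introduced for illustration. It adds two Q15 values near 0.99 and reads the sum back with the same 15 fractional bits, once from a 16-bit register and once from a 32-bit one.

```python
def to_q15(x):
    """Convert a float in [-1, 1) to a Q15 integer (hypothetical helper)."""
    return int(round(x * 32768))


def wrap(value, bits):
    """Model a two's-complement register of the given width (hypothetical helper)."""
    value &= (1 << bits) - 1
    # Reinterpret the top bit of the register as the sign bit.
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


a = to_q15(0.99)   # 32440
b = to_q15(0.99)

sum16 = wrap(a + b, 16)   # 16-bit accumulator: the carry lands in the sign bit
sum32 = wrap(a + b, 32)   # 32-bit accumulator: upper bits act as integer bits

# Both results are read with 15 fractional bits; the binary point has not moved.
print(sum16 / 32768.0)    # about -0.02, the sum wrapped around
print(sum32 / 32768.0)    # about 1.98
```

Only the register width differs between the two results. The 32-bit sum is still a Q15 value, it simply has room for an integer bit to the left of the binary point.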
> So we are left with,
> 0 I's 2 S's 14 F's
> So now you left shift it by 1 to get rid of sign,
> 0 I's 1 S 15 F (lsb fraction bit being 0)
True for multiply!
> In case of overflow the overflow bit will be sitting
> in place of sign bit so can we use that as one of the
> integer bit of Q14 (- which is 1I 1S 14 F) and display
> the result ????
If you use the upper 16 bits of the Q15xQ15 multiply result without the left
shift, then yes, it must be treated as Q14. It's of interest that many processors
and DSPs include an option for an automatic 1-bit left-shift after multiply.
-Jeff
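
For readers following the multiply case, a minimal sketch in Python (chosen only to show the bit bookkeeping, not as DSP code; `to_q15` is a hypothetical helper). It multiplies two Q15 values and reads the upper 16 bits of the 32-bit product with and without the 1-bit left shift.

```python
def to_q15(x):
    """Convert a float in [-1, 1) to a Q15 integer (hypothetical helper)."""
    return int(round(x * 32768))


a = to_q15(0.5)     # 16384
b = to_q15(-0.25)   # -8192

product = a * b                       # 32-bit product: 2 sign bits, 30 fractional bits
high_no_shift = product >> 16         # upper 16 bits as-is: 14 fractional bits (Q14)
high_shifted = (product << 1) >> 16   # shift left by 1 first: 15 fractional bits (Q15)

print(product / 2**30)         # -0.125
print(high_no_shift / 2**14)   # -0.125
print(high_shifted / 2**15)    # -0.125
```

All three readings give the same value; what changes is how many fractional bits the 16-bit result carries. One corner case worth keeping in mind: -1.0 times -1.0 gives +1.0, which does not fit in Q15 after the shift, so that product has to be saturated or avoided.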
> --- Jeff Brower wrote:
>
>> Vishvanath-
>>
>> > I said we should view the result of addition in Q1.14
>> > format, else the last bit will be lost.
>> > Eg : Could you tell me how do you represent 1.99 in
>> > Q15 format (16 bit is the register width) ?
>>
>> After adding 0.99 and 0.99 both in Q15 format with a 16-bit accumulator,
>> the msb is already lost and the result no longer makes sense. Your
>> suggestion to use Q14 format to change how we 'view' the result will not
>> save the bit.
>>
>> But I know what you mean, just has to be said differently. It would make
>> sense to apply Q14 format to values *before* adding, in which case a
>> result of 1.99 can fit in the 16-bit accumulator result.
>>
>> > Your statement is absolutely correct - that overflow
>> > happens and some bits are lost - But if you maintain
>> > the same Q format (Q15) throughout your additions then
>> > your final result will be wrong.
>>
>> If the accumulator has a guard-band, as many do including all DSPs that
>> I've used, then you can continue to add Q15 numbers. For example, if you
>> have to average 4 Q15 numbers, and you have an 8-bit guard-band, then you
>> can add them, shift the final sum right by 2 (divide by 4) and you have a
>> Q15 answer. The "Q view" never changes.
>>
>> -Jeff
>>
>> > --- Jeff Brower wrote:
>> >
>> >> Vishvanath-
>> >>
>> >> > Your assumption
>> >> >>since the sum of two Q15 numbers is also Q15
>> >> > is not always true. It can be Q1.14 (Eg: .99+.99 =
>> >> > 1.99 ) If 0.99 is in Q15 then 1.99 will not fit in
>> >> > Q15, It should be represented in Q1.14
>> >>
>> >> This will confuse the OP. Adding a series of Q15 numbers *always results*
>> >> in a Q15 number, although the result may have extra bits to left of
>> >> decimal point. If the register is not long enough, or there is not a
>> >> guard-band in the accumulator, then the result will overflow and some bits
>> >> are lost.
>> >>
>> >> -Jeff
>> >>
>> >> > --- a...@upnet.gr wrote:
>> >> >
>> >> >> Hello everyone! I'm trying to implement Goertzel's
>> >> >> algorithm on TI C6711 board using fixed-point
>> >> >> arithmetic. The problem is that I always end up with
>> >> >> overflow...I'm using Q15 format for the filter
>> >> >> coefficients, and I know that the samples of the
>> >> >> input are also coded in Q15. The problem seems to
>> >> >> come from the additions that need to be done at each
>> >> >> step, which make the final result overflow. I use
>> >> >> short (16-bit) integers to store the coefficients
>> >> >> and the input, and 32-bit integers for the results
>> >> >> at each step. Pardon my ignorance, but I really
>> >> >> can't understand how I can avoid this...Moreover,
>> >> >> since the sum of two Q15 numbers is also Q15, it's
>> >> >> totally incomprehensible to me what, for example,
>> >> >> an 18-bit Q15 number means. I suppose I either have to
>> >> >> use saturation or scale down my input (which comes
>> >> >> from the built-in microphones on the PCM 3003
>> >> >> codec), but I don't know how to do it...I'd be
>> >> >> grateful if anyone could give a little help.
>> >> >>
>> >> >> Andrew Milias
>> >> >> student @ E.E. department
>> >> >> University of Patras
>> >> >> Greece
>> >> >>
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >> >
>> >>






