Reply by Adrian Hey June 27, 2003
Celeb Lover wrote:

> I am not following the "headroom" discussion below (last paragraph). I
> don't see that any scaling of the input downwards is necessary other than
> a possible slight adjustment of the inputs that will completely avoid
> overflows. So I do not see the necessity for the scaling you mention and
> the consequential raising of the noise floor.
If we assume sinusoids define the upper bound of useable dynamic range, then
both methods have essentially the same dynamic range (12 bits for a 256 point
16 bit FFT). (The scale by 1/sqrt(2) method is actually slightly lower, but
let's ignore that for the present.) The difference between the two is how
white(ish) and sinusoidal signals scale:

The scale by 1/2 on each pass method
====================================
Sinusoids scale by a factor of 1 (in the appropriate bin).
White noise scales by a factor of 1/sqrt(N) (= 1/16 for N=256).

So the 12 bits you must use at the FFT input are bits 15..4. This means that
full scale sinusoids show up as full scale outputs (in the appropriate bin),
and the quantisation noise floor of the input (at or around bit 4) is scaled
to about the same level as the quantisation noise floor of the FFT (at or
around bit 0).

Incidentally, the reason I say the input "quantisation noise" floor is at or
around bit 4, despite our input being a full 16 bits, is that this is the
effective noise floor of the system as a whole if we want to keep the real
noise floor slightly above the quantisation noise floor at all stages of our
processing (as we generally do).

The scale by 1/sqrt(2) on each pass method
==========================================
Sinusoids scale by a factor of sqrt(N) (in the appropriate bin).
White noise scales by a factor of 1.

So the 12 bits you must use at the FFT input are bits 11..0. Again, this
means that full scale sinusoids (at or around bit 11) show up as full scale
outputs (in the appropriate bin), and the quantisation noise floor of the
input (at or around bit 0) is scaled to about the same level as the
quantisation noise floor of the FFT (at or around bit 0).

----------------

So far there doesn't seem to be much to choose between them. But suppose I
know my system isn't going to have to deal with sinusoids. The only signals
of high amplitude are going to be things like impulsive spikes, high level
noise, or wideband chirps, say.
With such signals the scale by 1/sqrt(2) on each pass method still
potentially gives me the full 16 bits of input dynamic range. It's only
sinusoids that will cause overflow if they exceed 12 bits, not all signals.
The system might well be able to cope perfectly well with 15 bit spikes, say
(at or around bit 14). But with the scale by 1/2 on each pass method the
corresponding spike would be at or around bit 18. Oops, no it can't be. If I
wanted to deal with such spikes I'd have to scale the input by 1/8 so the
spike was at or around bit 15. But doing this would also scale my real noise
floor to be at or around bit 1. So at the FFT output the real noise floor
would be at or around bit -3 (i.e. well below the FFT quantisation noise
floor).

So although we might think of both arrangements as giving 12 bit dynamic
range, with the scale by 1/2 on each pass method this limit applies to all
signals, whereas it only applies to sinusoids for the scale by 1/sqrt(2) on
each pass method. With the hypothetical spike example I've bought myself
another 18 dB of dynamic range by using the 1/sqrt(2) method. But using my
earlier half baked theory, maybe that should really be only
18 - 10.LOG10(3P/4) = 10.2 dB for P=8.

Regards
--
Adrian Hey
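[The scaling claims in the two sections above can be checked numerically.
This is an editorial sketch in floating point, using numpy's FFT as a
stand-in for a staged radix-2 fixed-point pipeline; the bin index and the
random seed are arbitrary choices, not from the thread.]

```python
import numpy as np

N = 256                      # FFT length (P = 8 passes)
n = np.arange(N)

# Full-scale complex sinusoid landing exactly in bin 16
sine = np.exp(2j * np.pi * 16 * n / N)

# Unit-RMS complex white noise
rng = np.random.default_rng(0)
noise = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

# An unscaled FFT grows a sinusoid by N and noise RMS by sqrt(N).
# Scaling by 1/2 each pass = overall 1/N: sinusoid -> 1, noise -> 1/sqrt(N).
half_each_pass = lambda x: np.fft.fft(x) / N
# Scaling by 1/sqrt(2) each pass = overall 1/sqrt(N):
# sinusoid -> sqrt(N), noise RMS -> 1.
root2_each_pass = lambda x: np.fft.fft(x) / np.sqrt(N)

print(np.abs(half_each_pass(sine)).max())                    # ~1 (bin 16)
print(np.sqrt(np.mean(np.abs(half_each_pass(noise))**2)))    # ~1/16
print(np.abs(root2_each_pass(sine)).max())                   # ~16 = sqrt(256)
print(np.sqrt(np.mean(np.abs(root2_each_pass(noise))**2)))   # ~1
```

The four printed values are exactly the "factor of 1 / 1/sqrt(N)" and
"factor of sqrt(N) / 1" behaviours Adrian describes.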
Reply by Dirk Bell June 26, 2003
Damn, I hate it when I don't check the posting name.

"Celeb Lover" <cl@erols.com> wrote in message
news:bdfl4m$omh$1@bob.news.rcn.net...
> > "Adrian Hey" <ahey@NoSpicedHam.iee.org> wrote in message > news:bdebg5$27p$1$8300dec7@news.demon.co.uk... > > Dirk Bell wrote: > > > > > One of the advantages of FFT processing is that you get processing
gain.
> > > If I had a few bits in, the dynamic range of the output may be > > > significantly increased. > > > > No kidding :-) > > Adrian, > > Not kidding at all. In fact the processing gain is greater for narrowband > signals and becomes negligible as the signal approaches broadband. Thus
the
> case you have described below (broadband) works because there is less > processing gain. There is nothing magical about scaling by 1/sqrt(2) > everytime. However, if the signal turns out not to be broadband, then you > risk overflowing and blowing out your spectral estimate. If you had a > broadband signal, and you knew it ahead of time, then putting 12 of the 16 > bits in the 12 lsb's and doing a 1/sqrt(2) scaling every time or 1/2 > scaling every other time would not take advantage of your knowledge of the > signal, since the absolute scaling is not different than putting the full
16
> bits in the input and scaling down by 1/2 every time, it would just
generate
> more quantization noise ( NOTE in the extreme narrowband case both of
these
> would allow overflow to occur). > > As for scaling to make overflows impossible being overly pessimistic, if
you
> know that you are not going to get narrowband signals I would completely > agree with you. The dilemma is exactly what you do know, how much scaling > do you take out, what is the negative impact in the application if you > overflow, can you accept the negative impact of overflow, and what is any > negative impact from loss of accuracy or underflow from scaling that > prevents any possible overflows? The tradeoffs are part of the engineering > decisions you must make. > > I am not following the "headroom" discussion below (last paragraph). I
don't
> see that any scaling of the input downwards is necessary other than a > possible slight adjustment of the inputs that will completely avoid > overflows. So I do not see the necessity for the scaling you mention and
the
> consequential raising of the noise floor. > > Dirk > > Dirk A. Bell > DSP Consultant > > > > > > If I have 12 bits in the lsbs and I requantize to > > > 12 bits in the lsbs at each stage I will be adding more quantization > noise > > > than if I had 16 > > > bits in and requantized to 16 bits at each stage. You can get away
with
> > > the every other stage divide by 2 because you have more coarsely > quantized > > > the input to make up for not doing the scaling at each stage. You are > > > still doing roughly (maybe exactly) the same scaling but handicapping > > > yourself by adding an additional 4 bits of quantization noise to the > input > > > (for 16 vs 12 bits). Actually 3 bits if you scale the input to 15
bits
> > > to avoid overflow. Then you add more noise at every stage by cutting > > > back to 12 bits. > > > > I think if do some sums you'll find the situation using 1/sqrt(2)
scaling
> > on each pass (or 1/2 scaling on alternate passes) is not as bad as you > > suggest. It also has advantages for processing high level wideband
signals
> > IMO. > > > > Here's my (latest) take on this. Looking at this in a little more detail
I
> > think my original guesstimate of the difference in quantisation noise > > powers (1.7 dB for 256 point FFT) for the two techiques was wrong. > > > > For a N point, P pass FFT (N=2^P) the scaling by 1/2 each pass method > > gives quantization noise power independent of P if P is large, whereas > > the scale by 1/sqrt(2) each pass method gives quantisation noise power > > proportional to P. In fact I think the ratio of the two works out to > > be something like 3P/4 for large P, all other factors being equal. > > > > So using the scale by 1/sqrt(2) method (or scale by 1/2 on alternate > > passes), to get the same SNR performance as the scale by 1/2 > > method we need to prescale the FFT input by sqrt(3P/4) > > > > So the bit growth with the scale by 1/sqrt(2) method is in fact.. > > LOG2(sqrt(N.(3/4).P)) > > = (P + LOG2(3/4) + LOG2(P))/2 > > = (P + LOG2(P) - 0.4)/2 (bits) > > > > For N=256 (P=8) this gives 5.3 bits (vs. 4 bits). > > > > So it would seem at first sight that you're correct. (Puting > > the signal in the top end of the FFT input and scaling by > > 1/2 on each pass is best). > > > > But (even assuming this analysis is correct:-), as usual real > > life isn't so simple. We've assumed that the upper limit > > of the useable dynamic range is determined by the requirement > > that sinusoids at this level don't overflow. For many applications > > I think this will be unduly pessimistic. For wide band signals > > the useable dynamic range is greater than this. > > > > If we put the signal (albeit scaled up by sqrt(3P/4)) in the > > bottom end of our FFT input (with 1/sqrt(2) scaling each pass) > > we still have significant headroom without affecting the > > quantisation noise performance. > > > > The only way of giving ourselves the necessary headroom if we're > > using the top end (with 1/2 scaling each pass) is to pre-scale the > > FFT input downwards (in effect raising the quantisation noise floor > > of the FFT by a corresponding figure). 
> > > > Regards > > -- > > Adrian Hey > > > > > > > >
Reply by Celeb Lover June 26, 2003
"Adrian Hey" <ahey@NoSpicedHam.iee.org> wrote in message
news:bdebg5$27p$1$8300dec7@news.demon.co.uk...
> Dirk Bell wrote:
>
> > One of the advantages of FFT processing is that you get processing gain.
> > If I had a few bits in, the dynamic range of the output may be
> > significantly increased.
>
> No kidding :-)
Adrian,

Not kidding at all. In fact the processing gain is greater for narrowband
signals and becomes negligible as the signal approaches broadband. Thus the
case you have described below (broadband) works because there is less
processing gain. There is nothing magical about scaling by 1/sqrt(2) every
time. However, if the signal turns out not to be broadband, then you risk
overflowing and blowing out your spectral estimate. If you had a broadband
signal, and you knew it ahead of time, then putting 12 of the 16 bits in the
12 lsb's and doing a 1/sqrt(2) scaling every time, or 1/2 scaling every
other time, would not take advantage of your knowledge of the signal, since
the absolute scaling is no different than putting the full 16 bits in the
input and scaling down by 1/2 every time; it would just generate more
quantization noise. (NOTE: in the extreme narrowband case both of these
would allow overflow to occur.)

As for scaling to make overflows impossible being overly pessimistic: if you
know that you are not going to get narrowband signals I would completely
agree with you. The dilemma is exactly what you do know, how much scaling do
you take out, what is the negative impact in the application if you
overflow, can you accept the negative impact of overflow, and what is any
negative impact from loss of accuracy or underflow from scaling that
prevents any possible overflows? The tradeoffs are part of the engineering
decisions you must make.

I am not following the "headroom" discussion below (last paragraph). I don't
see that any scaling of the input downwards is necessary other than a
possible slight adjustment of the inputs that will completely avoid
overflows. So I do not see the necessity for the scaling you mention and the
consequential raising of the noise floor.

Dirk

Dirk A. Bell
DSP Consultant
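[Dirk's processing-gain point can be put in numbers. An editorial sketch:
a sinusoid buried in white noise gains roughly 10*LOG10(N) dB of SNR in its
own bin, while a signal spread over all bins gains nothing. The bin index
and seed are arbitrary; by Parseval's theorem the measured gain is exact
regardless of the noise realisation.]

```python
import numpy as np

N = 256
n = np.arange(N)
rng = np.random.default_rng(1)

sig = np.exp(2j * np.pi * 32 * n / N)                 # narrowband: one bin
noise = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Time-domain SNR (per-sample powers)
snr_in = np.mean(np.abs(sig)**2) / np.mean(np.abs(noise)**2)

# Frequency domain: signal power in its bin vs. average noise power per bin
X_sig = np.fft.fft(sig)
X_noise = np.fft.fft(noise)
snr_out = np.abs(X_sig[32])**2 / np.mean(np.abs(X_noise)**2)

gain_db = 10 * np.log10(snr_out / snr_in)
print(round(gain_db, 1))     # 10*log10(256) ~ 24.1 dB of processing gain
```

A broadband signal occupies every bin, so its per-bin power and the per-bin
noise power scale identically and the corresponding gain is 0 dB, which is
exactly why Adrian's broadband case "works" with less headroom.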
> <snipped>
Reply by Martin Thompson June 26, 2003
> > "Dirk Bell" <dirkman@erols.com> wrote in message > > > What DSP are you using? > stenasc@yahoo.com (Bob) writes: > Writing it for an ASIC in VHDL
May be a daft question, but in that case, can't you just use wider words -
or are there things outside your control which preclude that?

--
martin.j.thompson@trw.com
TRW Conekt, Solihull, UK
http://www.trw.com/conekt
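[For reference on the "wider words" option, my own arithmetic rather than
anything from the post: a radix-2 butterfly can grow the output magnitude by
up to a factor of 2 per pass, so an overflow-proof unscaled pipeline needs
roughly one extra bit per pass; a further guard bit is sometimes added for
the component-wise worst case.]

```python
import math

input_bits = 16
N = 256
passes = int(math.log2(N))      # 8 passes for a 256-point radix-2 FFT

# Worst-case magnitude growth is a factor of 2 per butterfly pass,
# i.e. one extra integer bit per pass if no scaling is done.
width_needed = input_bits + passes
print(width_needed)             # 24-bit words instead of 16
```

In an ASIC, as Bob is building, that is often a perfectly reasonable trade
against the noise-floor bookkeeping discussed elsewhere in this thread.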
Reply by Adrian Hey June 26, 2003
Dirk Bell wrote:

> One of the advantages of FFT processing is that you get processing gain.
> If I had a few bits in, the dynamic range of the output may be
> significantly increased.
No kidding :-)
> If I have 12 bits in the lsbs and I requantize to 12 bits in the lsbs at
> each stage I will be adding more quantization noise than if I had 16 bits
> in and requantized to 16 bits at each stage. You can get away with the
> every other stage divide by 2 because you have more coarsely quantized
> the input to make up for not doing the scaling at each stage. You are
> still doing roughly (maybe exactly) the same scaling but handicapping
> yourself by adding an additional 4 bits of quantization noise to the input
> (for 16 vs 12 bits). Actually 3 bits if you scale the input to 15 bits
> to avoid overflow. Then you add more noise at every stage by cutting
> back to 12 bits.
I think if you do some sums you'll find the situation using 1/sqrt(2)
scaling on each pass (or 1/2 scaling on alternate passes) is not as bad as
you suggest. It also has advantages for processing high level wideband
signals IMO.

Here's my (latest) take on this. Looking at this in a little more detail I
think my original guesstimate of the difference in quantisation noise powers
(1.7 dB for 256 point FFT) for the two techniques was wrong.

For an N point, P pass FFT (N=2^P) the scaling by 1/2 each pass method gives
quantisation noise power independent of P if P is large, whereas the scale
by 1/sqrt(2) each pass method gives quantisation noise power proportional to
P. In fact I think the ratio of the two works out to be something like 3P/4
for large P, all other factors being equal.

So using the scale by 1/sqrt(2) method (or scale by 1/2 on alternate
passes), to get the same SNR performance as the scale by 1/2 method we need
to prescale the FFT input by sqrt(3P/4).

So the bit growth with the scale by 1/sqrt(2) method is in fact..

  LOG2(sqrt(N.(3/4).P))
  = (P + LOG2(3/4) + LOG2(P))/2
  = (P + LOG2(P) - 0.4)/2 (bits)

For N=256 (P=8) this gives 5.3 bits (vs. 4 bits).

So it would seem at first sight that you're correct. (Putting the signal in
the top end of the FFT input and scaling by 1/2 on each pass is best.)

But (even assuming this analysis is correct :-), as usual real life isn't so
simple. We've assumed that the upper limit of the useable dynamic range is
determined by the requirement that sinusoids at this level don't overflow.
For many applications I think this will be unduly pessimistic. For wide band
signals the useable dynamic range is greater than this.

If we put the signal (albeit scaled up by sqrt(3P/4)) in the bottom end of
our FFT input (with 1/sqrt(2) scaling each pass) we still have significant
headroom without affecting the quantisation noise performance.
The only way of giving ourselves the necessary headroom if we're using the
top end (with 1/2 scaling each pass) is to pre-scale the FFT input downwards
(in effect raising the quantisation noise floor of the FFT by a
corresponding figure).

Regards
--
Adrian Hey
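[Adrian's bit-growth arithmetic above is easy to check directly. A sketch;
note the 3P/4 noise-ratio model is his own estimate, not an exact result.]

```python
import math

P = 8                       # passes
N = 2 ** P                  # 256-point FFT

# Bit growth for the 1/sqrt(2)-per-pass method, including the sqrt(3P/4)
# prescale needed to match the SNR of the 1/2-per-pass method:
growth = math.log2(math.sqrt(N * (3 / 4) * P))
print(round(growth, 1))     # 5.3 bits, vs. 4 bits (= P/2) without prescale
```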
Reply by Bob June 25, 2003
Hi Dirk, 


> Was: Bob, why 4? Divide inputs by 2 for <0.5 FS.
That is why I thought of dividing by 4.
> Fix: Bob, why 4? Divide inputs by 2 for <=0.5 FS.
As you say...divide by 2 here.
> > What DSP are you using?
Writing it for an ASIC in VHDL
> > If you multiply 16 bits by 16 bits and getting a 32 bit result are you
> > converting it to a 16 bit output by taking the upper 16 of 32 bits?
That's correct. I will give this a try and divide by 2 each stage. I'll let
you know how it goes.

Thank you to everyone who responded. I greatly appreciate it.

Bob

"Dirk Bell" <dirkman@erols.com> wrote in message
news:<bdc875$sih$1@bob.news.rcn.net>...
> <snipped>
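[Bob's final plan, a radix-2 FFT dividing by 2 after every stage, with each
16x16 product reduced by keeping the high half, can be modelled in ordinary
Python integers. This is an editorial sketch of the datapath, not VHDL; the
function name and the Q15 twiddle format are my own choices. Python's `>>`
on negative ints is an arithmetic shift, which matches the truncating
hardware behaviour reasonably well.]

```python
import math

def fft_fixed(re, im, twid_bits=15):
    """Radix-2 DIT FFT on integer sample lists, dividing by 2 after every
    stage. Twiddles are Q15 integers; each product keeps only the high
    half, as when taking the upper 16 of a 16x16-bit multiply."""
    n = len(re)
    p = n.bit_length() - 1
    # --- in-place bit-reverse reorder ---
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            re[i], re[j] = re[j], re[i]
            im[i], im[j] = im[j], im[i]
    one = (1 << twid_bits) - 1          # Q15 "1.0" = 32767
    for s in range(p):                  # p = log2(n) stages
        m = 2 << s                      # butterfly span at this stage
        for k in range(0, n, m):
            for t in range(m // 2):
                ang = -2 * math.pi * t / m
                wr = round(math.cos(ang) * one)
                wi = round(math.sin(ang) * one)
                a, b = k + t, k + t + m // 2
                # (re[b] + j*im[b]) * (wr + j*wi), keeping the high half
                tr = (re[b] * wr - im[b] * wi) >> twid_bits
                ti = (re[b] * wi + im[b] * wr) >> twid_bits
                # butterfly, then scale the stage outputs by 1/2
                re[b] = (re[a] - tr) >> 1
                im[b] = (im[a] - ti) >> 1
                re[a] = (re[a] + tr) >> 1
                im[a] = (im[a] + ti) >> 1
    return re, im

re, im = fft_fixed([1000] * 8, [0] * 8)
print(re)   # DC energy collects in bin 0 at ~997: the input value 1000
            # minus a little truncation loss per pass; other bins ~0
```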
Reply by Dirk Bell June 25, 2003
CORRECTION:
Was: Bob, why 4? Divide inputs by 2 for <0.5 FS.
Fix: Bob, why 4? Divide inputs by 2 for <=0.5 FS.

Dirk

"Dirk Bell" <dirkman@erols.com> wrote in message
news:bdc76e$qjp$1@bob.news.rcn.net...
> > "Bob" <stenasc@yahoo.com> wrote in message > news:20540d3a.0306250006.4dd8149a@posting.google.com... > > Reading your answers folks....thank you all very much. > > > > Dirk > > > > > Try scaling your input components (max of combined real and imaginary > parts) > > > to be < 0.5 FS > > > If you have 8 stages for a 256-pt FFT change your stage output scaling > to > > > divide by 2. > > > > ...This means dividing the real and imaginary inputs by 4....OK ? > > Bob, why 4? Divide inputs by 2 for <0.5 FS. > > The largest magntiude input for 16 bit range would be REAL=0x8000 > (= -1.0FS), IMAG=0x8000 (= -1.0FS) for resulting mag =sqrt(2)*FS. > Dividing by 2 gives REAL'=0xC000 (= -0.5FS), IMAG'=0xC000 (= -0.5FS) for > resulting mag =sqrt(2)/2*FS<FS, what is acceptable. > > Some questions: > > What DSP are you using? > Are you doing integer multiplies rather than fractional multiplies?
Integer
> multiplies would mean there is an extra sign bit and scaling by 1/2 if the > result is to be interpreted as fractional and taken from the upper 16 of
32
> bits. > If you multiply 16 bits by 16 bits and getting a 32 bit result are you > converting it to a 16 bit output by taking the upper 16 of 32 bits? If
not,
> how? > How big is your accumulator? For example, a 16 bit X 16 bit multiply might > be put into a 40 bit accumulator. > > > > > Is it possible to scale the twiddle factors by dividing their values > > also by as well. If I do this what will the effect be ? > > > > You could, but I would just work on the data. A shift should be almost
free.
> > > I would still plan to divide each output by 2 as well. > > Not both. Then you tend towards underflow. > > Dirk > > <snipped> > >
Reply by Dirk Bell June 25, 2003
One of the advantages of FFT processing is that you get processing gain. If
I had a few bits in, the dynamic range of the output may be significantly
increased. If I have 12 bits in the lsbs and I requantize to 12 bits in the
lsbs at each stage I will be adding more quantization noise than if I had 16
bits in and requantized to 16 bits at each stage.  You can get away with the
every other stage divide by 2 because you have more coarsely quantized the
input to make up for not doing the scaling at each stage. You are still
doing roughly (maybe exactly) the same scaling but handicapping yourself by
adding an additional 4 bits of quantization noise to the input (for 16 vs 12
bits).  Actually 3 bits  if you scale the input to 15 bits to avoid
overflow.  Then you add more noise at every stage by cutting back to 12
bits.
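[Dirk's requantisation point above can be illustrated numerically. An
editorial sketch: quantising a full-scale signal to 12 bits instead of 16
costs roughly 6 dB of SNR per dropped bit, i.e. about 24 dB here; the test
signal and seed are arbitrary.]

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 100_000)          # full-scale test signal

def quantise(x, bits):
    """Round x onto a (bits)-bit uniform grid spanning [-1, 1)."""
    scale = 2.0 ** (bits - 1)
    return np.round(x * scale) / scale

snr = {}
for bits in (12, 16):
    err = x - quantise(x, bits)
    snr[bits] = 10 * np.log10(np.mean(x**2) / np.mean(err**2))
    print(bits, "bits:", round(snr[bits], 1), "dB")

# The gap is ~6 dB per bit, so ~24 dB (4 bits of noise) for 16 vs 12 bits,
# which is the handicap Dirk describes for putting only 12 bits in.
```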

Dirk

"Adrian Hey" <ahey@NoSpicedHam.iee.org> wrote in message
news:bdbnlb$h0t$1$8300dec7@news.demon.co.uk...
> Dirk Bell wrote:
>
> > Scaling by 1/sqrt(2) would allow overflow, which would require continual
> > checking and saturation on an integer machine (a major complication) as
> > well as distortion in the output. Scaling by 1/2 on every other pass
> > would do the same. Not a good idea.
>
> Only if the 12 bit signal was in the top end of the 16 bits, not the
> bottom end (as I suggested).
>
> > Putting the signal in the bottom 12 bits is not a good idea because of
> > excessive rounding error. Normally want the maximum modulus of the input
> > to be <16 bits to avoid overflow with 1/2 scaling at each stage. Shifting
> > 16 bit real and imag inputs right 1 bit prior to use would guarantee this
> > for arbitrary inputs.
>
> Actually, I don't think rounding error is particularly significant.
> But if the shift rights (or multiplies by 1/2) can be achieved at
> acceptable cost I agree it's probably best to minimise rounding errors
> by doing as you suggest. (Use the top 12 bits and shift right on each
> pass.)
>
> Now digging out my copy of OS to try and justify my assertion that
> rounding error isn't particularly significant using the shift on alternate
> passes technique, I realise that this isn't a case they analyse.
>
> They give an analysis for unscaled FFT such that overflow is avoided
> (effectively constraining our signal to the bottom 8 bits). They also
> give an analysis for the divide by 2 on each pass. But they don't
> cover the divide by 2 on alternate passes method. What a pity :-)
>
> Still, I have a hunch (half baked theory) that difference in quantisation
> noise between the two will only amount to 1.7 dB or so (< 1/3 bits worth).
>
> Regards
> --
> Adrian Hey
Reply by Dirk Bell June 25, 2003
"Bob" <stenasc@yahoo.com> wrote in message
news:20540d3a.0306250006.4dd8149a@posting.google.com...
> Reading your answers folks....thank you all very much.
>
> Dirk
>
> > Try scaling your input components (max of combined real and imaginary
> > parts) to be < 0.5 FS
> > If you have 8 stages for a 256-pt FFT change your stage output scaling
> > to divide by 2.
>
> ...This means dividing the real and imaginary inputs by 4....OK ?
Bob, why 4? Divide inputs by 2 for <0.5 FS.

The largest magnitude input for 16 bit range would be REAL=0x8000
(= -1.0FS), IMAG=0x8000 (= -1.0FS), for resulting mag = sqrt(2)*FS.
Dividing by 2 gives REAL'=0xC000 (= -0.5FS), IMAG'=0xC000 (= -0.5FS), for
resulting mag = sqrt(2)/2*FS < FS, which is acceptable.

Some questions:

What DSP are you using?

Are you doing integer multiplies rather than fractional multiplies? Integer
multiplies would mean there is an extra sign bit and scaling by 1/2 if the
result is to be interpreted as fractional and taken from the upper 16 of 32
bits.

If you multiply 16 bits by 16 bits and get a 32 bit result, are you
converting it to a 16 bit output by taking the upper 16 of 32 bits? If not,
how?

How big is your accumulator? For example, a 16 bit X 16 bit multiply might
be put into a 40 bit accumulator.
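[Dirk's worst-case magnitude arithmetic above checks out; an editorial
sketch, interpreting the 16-bit hex words as fractions of full scale:]

```python
import math

FS = 1 << 15                        # 16-bit fractional full scale

def frac(x16):
    """Interpret a 16-bit two's-complement word as a fraction of FS."""
    return (x16 - (1 << 16) if x16 & 0x8000 else x16) / FS

re, im = 0x8000, 0x8000             # both -1.0 FS: the worst case
print(frac(re))                     # -1.0
print(math.hypot(frac(re), frac(im)))    # sqrt(2) ~ 1.414: exceeds FS

re2, im2 = 0xC000, 0xC000           # after an arithmetic shift right by 1
print(frac(re2))                    # -0.5
print(math.hypot(frac(re2), frac(im2)))  # sqrt(2)/2 ~ 0.707: safe
```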
> Is it possible to scale the twiddle factors by dividing their values
> also by as well. If I do this what will the effect be ?
You could, but I would just work on the data. A shift should be almost free.
> I would still plan to divide each output by 2 as well.
Not both. Then you tend towards underflow.

Dirk

<snipped>
Reply by Adrian Hey June 25, 2003
Bob wrote:


> Adrian/Robert
> It is a radix-2 implementation, but I wanted to make sure I avoided
> any possibilities of overflow. What do you think of scaling the
> Twiddle Factors ?
Not an easy solution, IMHO.

If T is the twiddle factor (|T|=1), DIT butterflies are like this..

  X = A + (B.T)
  Y = A - (B.T)

and DIF butterflies are like this..

  X = (A+B)
  Y = (A-B).T

(A and B are complex numbers from the previous pass.)

Scaling the twiddle factor alone will give you the wrong answer. If you're
going to do this (scaling the twiddle factors by 1/2, say), your DIT
butterflies must become..

  X = A.(1/2) + (B.(T/2))
  Y = A.(1/2) - (B.(T/2))

Likewise for DIF.

How difficult it is to do this efficiently probably depends on the processor
you're using and how cleverly you use the registers & instruction set at
your disposal. With most processors it's probably easier to post-scale the
butterfly results (somehow). But you should take a look at application notes
for your processor. FFTs are a common benchmark, so it's likely that
considerable effort has been made to win brownie points for the fastest FFT
possible. Especially for the standard radix-2 algorithms.

Regards
--
Adrian Hey
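[Adrian's point that the twiddle factor cannot be scaled in isolation is
easy to verify numerically. A toy editorial check in complex floating
point; the values of A and B are arbitrary:]

```python
import cmath

A, B = 0.3 + 0.1j, -0.2 + 0.4j
T = cmath.exp(-2j * cmath.pi / 8)        # a unit twiddle, |T| = 1

X, Y = A + B * T, A - B * T              # the true DIT butterfly

# Halving only the twiddle does NOT halve the outputs...
X_bad, Y_bad = A + B * (T / 2), A - B * (T / 2)
assert abs(X_bad - X / 2) > 1e-6         # off by A/2

# ...but halving A as well gives exactly the 1/2-scaled butterfly:
X_ok, Y_ok = A / 2 + B * (T / 2), A / 2 - B * (T / 2)
assert abs(X_ok - X / 2) < 1e-12 and abs(Y_ok - Y / 2) < 1e-12
```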