DSPRelated.com
Forums

channeliser of Polyphase filter followed by FFT

Started by kaz 2 months ago22 replieslatest reply 2 weeks ago242 views

Hi,

I have never done a channeliser. I can understand the classic method of "shift each channel to dc and filter/decimate" or use bandpass filter). I am trying to understand the alternative design using polyphase filter (PF) followed by FFT. 

I was happy to see this document https://cdrdv2-public.intel.com/650298/versatile-c... 

from Intel but upon some thought and simple modelling I feel it doesn't make sense. I am focused on page 2 only. In particular this below statement:

screenshot 2024-12-12 153347_36295.png


How can phase be so rotated by polyphases as to cancel out except for the baseband.

Can someone please also shed some light on this PF => FFT architecture. 

Thanks


[ - ]
Reply by napiermDecember 12, 2024

There are quite a few fred harris presentations.  But I'll admit I didn't really understand how they worked until I had to implement one myself.  But yes, on its face it sure looks like witchcraft.

Do you have access to Matlab?  Fred has this animated matlab script showing the "arms" of the phases walking around as an input tone moves from one channel to the next.

The short answer is that all the added phases cancel out except for the channel the signal is in.

Another way to think about it is the prototype baseband filter.  In channel 0 (DC) the filter has not moved.  So you filter 1st and then decimate.  The filter design is important as all of the frequency bands will fold back on top.  So your stop band should continue to roll off as 1/f rather than equ-ripple.

In channel 1 the filter has been rotated using a complex exponential to the center of channel 1.  The signal is filtered with the rotated response and then decimated.  It therefore aliases into the decimated base band of that channel.  Or you can consider that the input signal has been rotated 1st, filtered and then decimated.

FWIW,

Mark Napier


[ - ]
Reply by kazDecember 12, 2024

Thanks Mark

I normally avoid reading external code as it drags me sideways. I am more interested in worded description of concepts. 

I generated 5 complex tones at (0, .01, 0.02, ,03, .04) of Fs of 1. Applied 5 polyphases and then added up the results. There is some attenuation of tones away from dc. But nothing to fit the term "Cancel out".


screenshot 2024-12-12 161244_58764.png


Or just imagine a sine wave, decimate to get samples 0, 1, 2, 3, 4 on the 5 filtered versions. Add up then it can only cancel out if it is rotated 180 degrees (for sine waveform).

[ - ]
Reply by kazDecember 12, 2024

Hi Mark,


I don't get it really when you say:

"Another way to think about it is the prototype baseband filter.  In channel 0 (DC) the filter has not moved.  So you filter 1st and then decimate.  The filter design is important as all of the frequency bands will fold back on top.  So your stop band should continue to roll off as 1/f rather than equ-ripple.

In channel 1 the filter has been rotated using a complex exponential to the center of channel 1.  The signal is filtered with the rotated response and then decimated.  It therefore aliases into the decimated base band of that channel.  Or you can consider that the input signal has been rotated 1st, filtered and then decimated."

Unfortunately, the Intel statement talks about adding the polyphase outputs. These are just polyphases of prototype filter and no rotation is implied at this filtering stage. It just decimates the input into a number of streams.

[ - ]
Reply by napiermDecember 12, 2024

The FFT supplies the rotation.  Each channel is rotated to baseband, filtered, and then decimated.  On a high level that is what happens.

The amazing clever part is that decimation happens 1st. Then the filter is run on the multiple decimated legs.  The output of those phases are input to the FFT.  Then the FFT does the sum of all the poly-phases and rotates each channel to base-band.  Pure Magic.

Mark Napier


[ - ]
Reply by kazDecember 12, 2024

I am sure it can't be magic.

My question is about the statement of Intel per se, before any rotation by mixing or FFT. It just refers to the outputs of polyphase filters ahead of any further processing.

The statement says "If the outputs of PF are summed up ...etc"



[ - ]
Reply by napiermDecember 12, 2024

Yes, it is magic of the Noble variety.

From a mechanics point of view, you need a long tap delay and a few multipliers followed by an FFT. (or IFFT, depending on how you arrange the data)

Lets say, for instance, you want a 128 phase channelizer.  For simplicity say it also decimates by 128.  This is known as a Maximally Decimated Filter Bank.

Design a prototype FIR low pass filter.  The length of the filter should be a multiple of 128.  If I use 4 multipliers then the filter would need to be 512 taps.  The approximate cut off of the filter will be 1/2 the channel width.  The two-sided width will be the channel width.

The tap delay (memory) is accessed in a spiral fashion.  The first input into the FFT is the current sample + the 128_th + the 256_th + the 384_th all times their respective tap weights.

X0 = x(n)*w0 + x(n-128)*w128 + x(n-256)*w256 + x(n-384)*w384

X1 = x(n-1)*w1 + x(n-129)*w129 + x(n-257)*w257 + x(n-385)*w385

....

X127 = x(n-127)*w127 + x(n-255)*w255 + x(n-383)*w383 + x(n-511)*w511

x = input samples (could be complex, probably so for most systems)

X = inputs to the FFT/IFFT

w = tap weights of the FIR.


After the FFT computation move the input shift register by 128 samples.  So one FFT every 128 input samples.

Each FFT output bin is one sample of that channel.  Each successive FFT provides one more sample.  It is doing this for all 128 channels at the same time.  Magic.


Try it yourself.

Mark Napier

PS.  Fred Harris says that the filter should be odd length, so 511 taps rather than 512 in this case.  Say make the trailing tap w511 = 0;  For my application I couldn't tell the difference.'

[ - ]
Reply by kazDecember 13, 2024

Thanks Mark,

I have this Octave code for 16 channels, each a single complex tone, 16 polyphases, each 8 taps. (I also tried your figures of 128 channels but are excessive for my platform).

I focus on the summed filtered output for now before checking FFT.

In particular I focus on the dc channel as it does not need any rotation. I can't see a clean channel 0 at summed filter output.

Can you please advise what is missing.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

clear; pkg load signal

n = 16*500;    %total input samples

f = -.001*8;   %normalised first tone

for i = 1:16  

  ch(i,:) = exp(j*2*pi*(0:n-1)*f);  

  f = f + 0.001;

end

x = sum(ch);  %input signal

%16 polyphases, each 8 taps

h = fir1(126,2*0.008);

for i = 1:16  

  p(i,:) = filter(h(i:16:end),1,x(i:16:end));

end

figure; plot(abs(fft(x)));          %input channels

figure; plot(abs(fft(sum(p))));     %polyphase sum

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%






[ - ]
Reply by napiermDecember 13, 2024

Hello kaz,

I think you need a nested loop.  Increment through your input array like j2 = 1:16:n; something like that.

Then compute your phases.  Each one has eight taps.  Take the FFT of your 16 phases to produce 16 outputs and append them your channel output array.  Keep going.  You will wind up with a size(16,500) array with each row being an output channel.

Also, it is hard for me to see what the frequences are.  I always define a sample rate Fs.  Then design using the actual frequencies so I can verify where the cut offs are.

For instance, say Fs = 16e6; % 16 MHz

Fs/2 -> 8Mhz

16 channels are distributed around the unit circle.  One is centered at DC.  One is centered on the Nyquist frequency, Fs/2.

The channel width is

Fw = Fs/16; % 1 MHz

Fw is also the channel spacing center to center.


The low pass cut off frequency is 1/2 the channel width.

fc = Fw/2;

Your pass band edge will depend on your filter length and your desired attenuation.  For the sake of an example:

fb = fc * (2/3);

Design your FIR for length 8*16;

Hope this helps,

Mark Napier


[ - ]
Reply by kazDecember 13, 2024

Thanks Mark,

I see my above code working after modifying with your division of Nyquist to sub-bands.

Thus, the Intel document is very misleading as it doesn't specify such critical condition. Instead, I assumed I can use any arbitrary sub-bands as I wished within any part of Nyquist.

[ - ]
Reply by Murali2017December 13, 2024

Hello 

please refer to the tutorial paper written  which was published in IEEE Signal Processing Magazine Jan 2016. The link is given below

https://ieeexplore.ieee.org/abstract/document/7366712

Polyphase Channelizer Demystified [Lecture Notes]

in this lecture note we have illustrated the effect of the polyphase paths which give a different phase shift to each arm and the FFT which combines the outputs from each polyphase arm by applying the appropriate phase rotation. the outputs from the FFT bins will cancel depending on the phase shifts and respective rotations

Murali

[ - ]
Reply by napiermDecember 13, 2024
[ - ]
Reply by cangianeDecember 14, 2024

Hi Kaz + all,

The attached pdf is my favorite (most intuitive) way to visualize the polyphase channelizer.  The development starts with the traditional polyphase decimation filter and ends with the polyphase channelizer (familiarity with polyphase decimation filters is assumed).  I hope it is helpful.


This is the first time I've attempted to upload a pdf to this site, so hope it works!

Pete


ChzVisualization.pdf
[ - ]
Reply by kazDecember 15, 2024

Hi Mark + all,

According to Intel document, the input switching and FFT/iFFT are as below:

screenshot 2024-12-15 104546_63437.png

According to Harris video of your link:

screenshot 2024-12-15 105556_76610.png


according to a website referring to Harris:

screenshot 2024-12-15 104745_26139.png

They are different but it could be valid issues to do with channel mapping.

Similar difference is there in synthesis channelisers.

Any thoughts.


[ - ]
Reply by napiermDecember 15, 2024

I noticed that too. Looking at my example it follows the picture in the Harris presentation.  FWIW it does work.  If you really want to go deeper there is a dissertation you might be interested in.  It is a compendium on the subject.

Xiaofei Chen  “Non-Maximally Decimated Filter Bank and Its Applications in Wideband Signal Processing,” Dissertation with Fred Harris on committee.

https://digitalcollections.sdsu.edu/search?search_...

https://digitalcollections.sdsu.edu/do/fe3cacb9-b7...


[ - ]
Reply by kazDecember 15, 2024

Thanks Mark,

I have now an Octave model for analysis and synthesis that shows some good points. whether the commutator goes one way or the other and whether we use FFT or iFFT it affects "only" channel mapping as far as I notice.

In summary I learned:

1) The bandwidth must be split up equally across all digital domain.

2) Though we use FFT/iFFT it is really neither in the true sense.

We are not moving between time/frequency domain but exploiting the mixing addition of these engines instead of using explicit NCOs and adders. We are "fooling" the FFT. A channel is not FFTed into its frequency bin but all the channels are arranged across one FFT frame and so each is multiplied by the negative frequency generated internally by the engine then added. If FFT is direct i.e. not butterfly architecture then this approach does not seem to offer any benefit to save resource.

3) My model fails to recover random input as opposed to well defined sub bands. It could be aliasing or overlapping issues.

[ - ]
Reply by napiermDecember 15, 2024

In practice the channelizer allows you to break up a wide band signal and operate on it at a lower sample rate.  The design of the filter will determine how it operates.  In my example the stop band starts right at the channel edge.  For the reconstruction application the analysis filter has the 3dB point at the channel edge to allow for overlap.  Also it is Non-Maximally Decimated by a factor of 2 in the examples I saw.

In my application the channelizer is used to serialize a band of frequency-hopping transmitters.  It is extremely efficient.

FWIW,

Mark Napier


[ - ]
Reply by bitbybitspJanuary 21, 2025

Mark, I really like that you're using filters that are 3dB down at the overlap point.  That's my approach also.  I recently listened to a fred harris presentation from a few years ago, and if I understood it correctly he didn't know how to make good filters that were 3dB down to make that work.

Instead, I believe he went with 6dB down, and then reconstructed with a filter that was 0dB down in the passband.

So I think you have one over on him.

Regards,

Ross

[ - ]
Reply by napiermJanuary 22, 2025

Hey Ross,

I would never say such a thing! :)

If I accomplish anything it is by standing on the shoulders of giants.

(Even that is stolen)

Best regards,

Mark Napier


[ - ]
Reply by bitbybitspJanuary 25, 2025

I tried posting this earlier, but somehow it didn't take.  Here's a link to my presentation on how Polyphase Filter Banks work.  As I see it, it's not magic, it's symmetry.

https://bxbsp.com/Tutorials/Tutorial1/Page_001.htm...


[ - ]
Reply by kazJanuary 25, 2025

Hi Ross,

Thanks for the link.

Regarding your BxBFFT notes copied below:

--------------------------------------

1. Power savings: Half the power of other FFTs (in Xilinx).
2. Resource savings:  Half the FPGA LUTs (in Xilinx).
3. High Throughput:  Highly parallel processing at the highest clock rates. Also the lowest latency.
4. Large FFT Sizes:  Supports 128k and greater FFT sizes with efficiency.
5. Feature Support:    Non-power-of-2 FFTs, real-to-complex FFTs, many included options to meet design goals.
...etc.
--------------------------------------------------------------------------------
Sounds way too good. 
Assuming 2k FFT
1) Can you break down the resource usage to "luts, registers, multipliers and memory"
2) What maximum clock speed can it operate in xilinx (Fmax)
3) is it a unique architecture rather than butterfly
4) do you have the datasheet 

Thanks
Kaz

[ - ]
Reply by bitbybitspJanuary 25, 2025

Hi Kaz,

You've left off some details regarding what you're looking for.  To give you some numbers I'll assume that you want a complex-to-complex FFT, to operate at 4 complex points per clock (SSR=4), and you're thinking of implementing in an Ultrascale+ (xczu28dr-ffvg1517-2-e) with 18-bit data precision, with an AXI-S interface, with natural input and output order.

For a 2K FFT, I got these numbers from Vivado 2019.2 during out-of-context testing a few years ago (Watts are dynamic power only):

FFTFmaxLUTsREGsDSPsBRAM36Watts
BxBFFT746.34160104634213.51.577
Xilinx SSR691.1946114207454.02.837
Spiral646.0824089687214.02.105
Astron367.511604901515944.05.731

How good the numbers are depends on the FFT size, parallelism, etc.  There are performance curves on my web site, on the page specific to Ultrascale/Ultrascale+.

As far as architecture, the BxBFFT supports radix stages of 2,3,4,5,7,8,and 16, in serial or parallel versions.  So not just the ordinary radix-2 butterfly, no.  For this 2048-point FFT processing 4 input points in parallel every clock, the BxBFFT typically uses four radix-4 stages, one radix-2 stage, and a final radix-4 stage.  Although I can override the stages used if there's an advantage in doing so.  (Sometimes there is.  Every combination has slightly different resource usage, with tradeoffs.)

Yes, there is a datasheet.  Although it's more of a 33-page user guide.  This is a length necessary to cover all the possible configuration options (most of which ordinary users won't need to read about).  There's also a 9-page quick start if you're using Xilinx's IP Integrator.  And there are a number of README files, and examples, and included tests that customers can run for themselves to verify operation.

I just rebuilt a BxBFFT now in Vivado 2024.2.  Now the BxBFFT uses these resources:

FFTFmaxLUTsREGsDSPsBRAM36Watts
BxBFFT728.94202103784213.51.894

So it's about the same as a few years ago. I imagine the power estimate has gone up simply because Xilinx has upgraded their power models, so all of the power estimates above would probably go up likewise.  (But I'm too lazy to check.)

Regards,

Ross

[ - ]
Reply by bitbybitspJanuary 21, 2025

I wrote a presentation on how Polyphase Filter Banks work some time ago.  You can find a link to it on my channelizer page, https://bxbchan.com

I think you might find it better than most other presentations out there.  I tend to find them very confusing.

An example of the confusion is the concept of "aliases canceling".  As I see it, aliases are filtered out, they don't "cancel".  This is because effectively the PFB filter is not 4 or 6 or 7 taps long.  Its full effective size is the number of taps times the FFT length.  This very long filter is enough to filter out aliases from the other bins as the sampling rate is reduced.

I also don't think of the PFB as magic, although in some sense it is a valid perspective.  Instead of magic, I think of it as symmetry.  The whole arrangement of bins equally spaced in frequencies that are a multiple of a common base frequency and fundamentally related to the sampling rate sets up a lot of symmetry.  This symmetry ends up meaning that as you rearrange operations, in many cases the same operation is just being repeated multiple times.  Combining operations saves immense work.

The work isn't just saved because there is an FFT instead of a DFT.  Just getting to the DFT, there's immense savings in the filter coefficient multiplies and in the mixer multiplies.

I hope you find my presentation interesting/useful.

Regards,
Ross