A Two-Level Reconfigurable Cell Array for Digital Signal Processing
Reconfigurable hardware has become an attractive option for implementing digital signal processing, especially in systems that require both high performance and flexibility. This thesis presents a novel two-level reconfigurable architecture targeted toward systems with these requirements. The architecture supports a large orthogonal design space whereby designers can customize the word length, amount of parallelism, number of functional units, and functional unit connectivity to meet the needs of the application. On the upper level, algorithms are mapped onto an array of 4-bit cells and a hierarchical interconnection fabric. The interconnection structure contains a mesh of 4-bit busses for local data transfer, as well as an H-tree for communicating results between functional units. On the lower level, each cell contains a small matrix of elements that collectively implement all necessary operations. The matrix of elements has only two configurations: one optimized for mathematical functions such as multiply-accumulates, and the other optimized for memory operations. The system also contains pipeline latches to maximize clock rate and throughput. Circuit simulations indicate that the architecture achieves a clock frequency of 200 MHz in a modest 0.25-μm CMOS technology. An initial prototype of the reconfigurable cell has been fabricated in 0.5-μm CMOS and tested for functionality. The estimated execution time for a 16-bit, 256-point Fast Fourier Transform shows a speedup ranging from 1.6 to 14 compared to contemporary digital signal processors.
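To make the idea of customizable word length concrete, here is a minimal Python sketch of how several 4-bit cells could be chained to handle a wider word. It is an illustration only: the 4-bit cell width comes from the abstract, but the ripple-carry scheme and the names `cells_for_word` and `add_with_cells` are assumptions, not the thesis's actual circuit or terminology.

```python
CELL_BITS = 4  # cell width stated in the abstract


def cells_for_word(word_length: int) -> int:
    """Number of 4-bit cells needed to hold one word (ceiling division)."""
    return -(-word_length // CELL_BITS)


def add_with_cells(a: int, b: int, word_length: int) -> int:
    """Add two unsigned words one 4-bit slice at a time.

    Assumption: the carry ripples from each cell to the next; the real
    architecture may propagate carries differently.
    """
    mask = (1 << CELL_BITS) - 1
    carry = 0
    result = 0
    for i in range(cells_for_word(word_length)):
        shift = i * CELL_BITS
        # Each iteration models one cell adding its slice plus the incoming carry.
        slice_sum = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (slice_sum & mask) << shift
        carry = slice_sum >> CELL_BITS
    # Discard any carry out of the top cell, as a fixed-width adder would.
    return result & ((1 << word_length) - 1)
```

Under these assumptions a 16-bit word occupies four cells, and widening the word only means chaining more cells, which is the kind of trade-off the design space is meant to expose.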
Summary
This master's thesis describes a two-level reconfigurable architecture that maps DSP algorithms onto an array of 4-bit cells with a hierarchical interconnect (local mesh of 4-bit busses plus an H-tree). It explains how designers can tune word length, parallelism, number of functional units, and connectivity to balance performance and flexibility for applications such as real-time DSP, communications, audio, and radar processing.
Key Takeaways
- Map algorithms to a 4-bit cell array and hierarchical interconnect to exploit fine-grained reconfigurability.
- Configure word length, degree of parallelism, and functional-unit count to trade off performance, area, and power.
- Exploit the mesh of local 4-bit busses and H-tree global fabric to structure dataflow for low-latency operations.
- Evaluate architectural trade-offs (connectivity vs. throughput) when targeting real-time DSP and communication workloads.
- Estimate resource usage and mapping strategies for common DSP kernels (FFT, filters, adaptive routines) on the proposed fabric.
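As a rough illustration of the last takeaway, the sketch below counts the arithmetic in a radix-2 FFT and turns it into an idealized cycle count. The radix-2 decomposition, the function names, and the "one butterfly per unit per cycle" model are all assumptions made for illustration; they are not taken from the thesis and do not reproduce its reported timing estimates.

```python
import math


def radix2_fft_counts(n_points: int) -> dict:
    """Operation counts for a radix-2 FFT (assumed decomposition).

    Counts every butterfly as one complex multiply plus two complex adds,
    so trivial twiddle factors are included and the totals are upper bounds.
    """
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("n_points must be a power of two >= 2")
    stages = n_points.bit_length() - 1          # log2(n_points)
    butterflies = (n_points // 2) * stages
    return {
        "stages": stages,
        "butterflies": butterflies,
        "real_multiplies": 4 * butterflies,     # 4 real multiplies per complex multiply
        "real_adds": 6 * butterflies,           # 2 from the multiply, 4 from the two complex adds
    }


def ideal_cycles(butterflies: int, parallel_units: int) -> int:
    """Cycles needed if each hypothetical unit finishes one butterfly per cycle."""
    return math.ceil(butterflies / parallel_units)
```

For a 256-point transform this gives 8 stages and 1024 butterflies. Dividing the resulting cycle count by a clock rate gives a first-order execution time, ignoring data movement across the interconnect and pipeline fill.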
Who Should Read This
Advanced DSP/FPGA/ASIC engineers, researchers, or graduate students designing reconfigurable hardware who need guidance on mapping DSP algorithms to fine-grained, hierarchical fabrics for high-performance, flexible systems.
Still Relevant, Advanced
Related Documents
- A New Approach to Linear Filtering and Prediction Problems (Timeless, Advanced)
- A Quadrature Signals Tutorial: Complex, But Not Complicated (Timeless, Intermediate)
- An Introduction To Compressive Sampling (Timeless, Intermediate)
- Lecture Notes on Elliptic Filter Design (Timeless, Advanced)
- Computing FFT Twiddle Factors (Timeless, Advanced)







