Algorithms and tools for automatic generation of DSP hardware structures
The increased complexity of Digital Signal Processing (DSP) algorithms demands for the development of more complex and more efficient hardware structures. The work presented herein describes the core components for the development of a tool capable of automatic generation of efficient hardware structures, therefore facilitating developers work. It comprises algorithms and techniques for i) balancing the paths in a graph, ii) scheduling of operations to functional units, iii) allocating registers and iv) generating the VHDL code. Results show that the developed techniques are capable of generating the hardware structure of typical DSP algorithms represented in data-flow graphs with over 2,000 nodes in around 200 ms, scaling to 80,000 nodes in about 214 s. Within the developed techniques, solving the scheduling problem is one of the most complex tasks: it is a NP-complete problem and directly influences the number of functional units and registers required. Therefore, experimental analysis was made on scheduling algorithms for time-constrained problems. Results show that simple list-based algorithms are more efficient in large problems than more complex algorithms: they run faster and tend to require less functional units.
Summary
This 2006 master's thesis presents core algorithms and an end-to-end toolchain for automatic generation of efficient DSP hardware from data-flow graphs. Readers will learn methods for path balancing, operation scheduling, register allocation, and synthesizable VHDL generation, plus performance and scalability results for large DSP graphs.
Key Takeaways
- Balance data-flow graphs to equalize path delays and enable high-throughput pipelining.
- Schedule operations onto functional units under resource and timing constraints to meet latency and throughput targets.
- Allocate registers and manage lifetimes to minimize storage while preserving correctness.
- Generate synthesizable VHDL from scheduled graphs to accelerate hardware implementation.
- Estimate tool performance and scalability for large DSP graphs (e.g., ~2,000 nodes in ~200 ms; ~80,000 nodes in ~214 s).
Who Should Read This
DSP and hardware engineers, FPGA/ASIC implementers, and tool developers who need practical techniques for mapping data-flow DSP algorithms into efficient, synthesizable hardware.
Still RelevantAdvanced
Related Documents
- A Quadrature Signals Tutorial: Complex, But Not Complicated TimelessIntermediate
- Lecture Notes on Elliptic Filter Design TimelessAdvanced
- Computing FFT Twiddle Factors TimelessAdvanced
- Digital Envelope Detection: The Good, the Bad, and the Ugly TimelessIntermediate
- The World's Most Interesting FIR Filter Equation: Why FIR Filters Can Be Line... TimelessAdvanced







