DSPRelated.com
Automatic Parallel Memory Address Generation for Parallel DSP Computing


Jiehua Dai

The concept of Parallel Vector (scratch pad) Memories (PVM) was introduced as one solution for parallel computing in DSP; it provides parallel memory addressing efficiently and with minimal latency. Parallel programming becomes more efficient when the parallel addressing generator for the PVM proposed in this thesis is used. However, without a cache to hide memory complexity, the cost of programming is high. To minimize this cost, automatic parallel memory address generation is needed to hide the complexities of memory access. This thesis investigates methods for implementing conflict-free vector addressing algorithms on a parallel hardware structure. In particular, vector addressing requirements extracted from the behavioural model are matched to a prepared parallel memory addressing template, in order to supply data in parallel from the main memory to the on-chip vector memory. According to the template and the usage of the main memory and the on-chip parallel vector memory, models for data pre-allocation and permutation in the scratch pad memories of an ASIP can be decided and configured. By exposing the parallel memory accesses in the source code, a memory access flow graph (MFG) is generated. The MFG is then combined with hardware information to match templates in a template library. When a template is matched, the suitable permutation equation is obtained, and a permutation table containing the target addresses for data pre-allocation and permutation is created. It is thus possible to automatically generate memory addresses for parallel memory accesses. A tool, Permutator, was created to achieve this goal; it is implemented in C++ combined with XML. Once a memory access coding template is selected, the permutation formulas are specified, and the PVM address table can be generated to perform the data pre-allocation, so that efficient parallel memory access is possible.
The results show that the memory access complexity is hidden by using Permutator, so that the programming cost is reduced. The tool works well when each algorithm, together with its related hardware information, corresponds to a template case, so that extra memory cost is eliminated.


Summary

This master's thesis presents methods for automatic parallel memory address generation targeted at Parallel Vector (scratchpad) Memories (PVM) to accelerate parallel DSP computing. It shows how to extract vector addressing requirements from behavioral models and implement conflict-free address-generation strategies to reduce programming cost and latency for real-time DSP kernels such as FFTs and filters.

Key Takeaways

  • Understand the PVM (parallel vector memory) architecture and its advantages over cache-based approaches for low-latency DSP
  • Implement conflict-free vector addressing algorithms that map behavioral access patterns to parallel memory banks
  • Apply automatic address-generation techniques to accelerate common DSP kernels (e.g., FFT, block filters)
  • Evaluate trade-offs between programming complexity, latency, and hardware resources when using scratchpad memories

Who Should Read This

Advanced DSP engineers, hardware architects, and researchers working on parallel/real-time DSP implementations who want techniques to automate memory-access scheduling and reduce memory-access conflicts.


Topics

Real-Time DSP, FFT/Spectral Analysis, Multirate Systems
