Algorithm Adaptation and Optimization of a Novel DSP Vector Co-processor

By Andréas Karlsson

Abstract:

The Division of Computer Engineering at Linköping University is currently investigating the possibility of creating a highly parallel DSP platform that can keep up with the computational needs of upcoming standards for various applications, at low cost and low power consumption. The architecture is called ePUMA, and it combines a general RISC DSP master processor with eight SIMD co-processors on a single chip. The master processor acts as the main processor for general tasks and execution control, while the co-processors accelerate computationally intensive and parallel DSP kernels.

This thesis investigates the performance potential of the co-processors by implementing matrix algebra kernels for QR decomposition, LU decomposition, matrix determinant and matrix inverse, each running on a single co-processor. The kernels are then evaluated to identify possible problems with the co-processors' microarchitecture and to suggest solutions to them. The evaluation shows that the performance potential is very good, but a few problems have been identified that cause significant overhead in the kernels. Pipeline mismatches, which occur because different instructions have different pipeline lengths, cause pipeline hazards, and the current solution to this does not allow effective use of the pipeline. In some cases the single-port memories cause bottlenecks, but the thesis suggests that the situation could be greatly improved by using buffered memory write-back. In addition, the lack of register forwarding makes kernels with many data dependencies run unnecessarily slowly.
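For readers unfamiliar with the kernels named in the abstract, the following is a minimal scalar sketch of LU decomposition with partial pivoting and a determinant computed from it. It is not the thesis's implementation, which targets a SIMD co-processor; the function names `lu_decompose` and `determinant` are hypothetical and chosen only for illustration. It does show the general shape of the computation: the innermost row update is independent across columns, while each elimination step depends on the result of the previous one.

```python
def lu_decompose(a):
    """In-place style LU factorisation with partial pivoting (illustrative only).

    Returns (lu, perm, sign): lu holds L below the diagonal (unit diagonal
    implied) and U on and above it; perm is the row permutation; sign is
    +1 or -1 depending on the number of row swaps.
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]  # work on a copy
    perm = list(range(n))
    sign = 1
    for k in range(n):
        # Choose the row with the largest magnitude in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
            sign = -sign
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]  # multiplier, stored in the L part
            factor = lu[i][k]
            # Each column j is updated independently of the others,
            # so this loop is the data-parallel part of the kernel.
            for j in range(k + 1, n):
                lu[i][j] -= factor * lu[k][j]
    return lu, perm, sign


def determinant(a):
    """Determinant as the signed product of the diagonal of U."""
    lu, _, sign = lu_decompose(a)
    det = float(sign)
    for k in range(len(lu)):
        det *= lu[k][k]
    return det
```

In this sketch, step `k + 1` cannot start until step `k` has finished updating the trailing rows, which is the kind of data dependency the abstract refers to.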

(This item is protected by original copyright)

Rate this document:
0
Rating: 0 | Votes: 0


Comments


No comments yet for this document


Add a Comment
You need to login before you can post a comment (best way to prevent spam). ( Not a member? )