DSPRelated.com

GPGPU DSP

Shehrzad, January 16, 2010

Greetings, dear readers, and welcome to my inaugural blog posting!  I'm new to this blogging thing, so I hope there is a grace period while I get acclimated.  Before I jump into the meat of this posting, allow me to introduce myself and briefly discuss where I intend to go with this blog.

Until quite recently I was Director of Software Engineering at a medical device startup, before resigning to strike out on my own.  Beyond medical devices, I have worked in the defense, life sciences, and telecommunications industries.  If there is a common theme to my career thus far, it's that I am always looking to push the envelope.

I want to start this blog by discussing signal processing with GPGPU (General-Purpose computation on Graphics Processing Units), specifically Nvidia's CUDA (Compute Unified Device Architecture), as that is what I am most familiar with.  I'm probably one of the few people around who has leveraged CUDA in an actual product, and I am quite excited about GPGPU techniques in general.  With the seeming "breakdown" in Moore's Law in recent years (CPU clock frequencies having topped out at just under 4 GHz), I really feel that large-scale parallelism is the wave of the future on these architectures.  I'd be interested in hearing your feedback and whether or not you agree with that statement.

For those of you who might not know much about GPGPU techniques, there are plenty of resources on the web.  To summarize the history: in the old days of GPGPU, the programmable shader components of a commodity graphics card were leveraged to perform general (floating-point) computations, really by misusing 3D graphics APIs like OpenGL or DirectX and co-opting them for the purpose of "fooling" the GPU.  Instead of performing millions of lighting calculations for thousands of mesh vertices (the raison d'être of a graphics card, after all), the graphics pipeline was used to efficiently perform many general computations in parallel.  Thus one could perform a 2D FFT of an image by texture mapping the pixels and then applying some very strange (at least to a computer graphics programmer) shaders to that image.  It's actually much the same today, except for the highly significant fact that the programmable aspect of the GPU is now abstracted behind a well-defined API like CUDA or OpenCL.

Some of the topics I intend to discuss in more detail in subsequent posts:

  1. CUDA or OpenCL?
  2. The significance of the dramatic change in direction of Intel's Larrabee initiative
  3. Fermi, Nvidia's supposed "FPGA killer"
  4. Walk-through and breakdown of a CUDA implementation of a signal-processing algorithm


Comment by awegener, January 18, 2010
Hi Shehrzad - welcome to the blogosphere! I will be following your posts. Rgds, Al
