
Which DSP DSK for 700 MFLOPS Audio

Started by Hubble January 13, 2006
Hi all,

I have an audio algorithm on Apple (Altivec), which requires about 1400
MFlops (Float32) for stereo, mostly multiplies (700 MFlops) and adds (700
MFlops). The demonstrator currently runs on a Powerbook G4, for now only
in mono. The next job would be to build a prototype device to convince
potential customers that I can build it in hardware, i.e. without
requiring a dedicated computer, and somewhat smaller than a Powerbook.

As a minimum I need
   >=700.000.000 Multiplications/Second, either 20 Bit integer or floating point
   >=700.000.000 Additions/Second
   >=64 KByte RAM
   44.1 kSamples/s Stereo input, medium to high quality
   44.1 kSamples/s Stereo output, medium to high quality
   max 100x160mm size preferable.
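
To put these figures in perspective, here is a rough per-sample budget. It is only a sketch: it assumes the 700 M multiplications/s cover both channels and are spread evenly over every sample at 44.1 kHz, and the function and constant names are made up for illustration.

```python
def ops_per_sample(ops_per_second, sample_rate_hz, channels):
    """Operations available per sample and per channel."""
    return ops_per_second / (sample_rate_hz * channels)


# Figures from the requirement list above; the even split is an assumption.
MULTS_PER_SECOND = 700_000_000
SAMPLE_RATE_HZ = 44_100
CHANNELS = 2

mults = ops_per_sample(MULTS_PER_SECOND, SAMPLE_RATE_HZ, CHANNELS)
print(f"about {mults:.0f} multiplications per sample per channel")
```

Under those assumptions the budget is a little under 8000 multiplications (and as many additions) per sample and channel.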

A DSP DSK could fulfil these requirements. The Analog Devices Blackfin
parts (e.g. the 533) have audio I/O, but they only have 16-bit multipliers
and are rather expensive. Also, AD's development software is
limited to a 90-day trial. The TMS32C64xx DSK has 4 audio jacks, but they
are all different and mono (mic, line i/o, speaker out), and it is also
rather expensive; on the other hand, it is floating point and the
development software is not a trial version.

What else can I use? My budget is 500 EUR/$, up to 1000 EUR/$ at most, but
preferably less, since this is currently a private project.

Hubble.

How about one of these:
http://www.xess.com/prod035.php3
plugged into one of these:
http://www.xess.com/prod037.php3 ?

This will easily meet your processing requirement. The development software is free. But you will be designing hardware, not software.

Regards,
Allan
>A DSP DSK could fulfil these requirements.
>How about one of these:
>http://www.xess.com/prod035.php3
>plugged into one of these:
>http://www.xess.com/prod037.php3 ?
>But you will be designing hardware, not software.
Thanks a lot, Allan. I am familiar with VHDL (but not with Verilog), so your suggestion certainly deserves consideration.

Hubble.

Our soon-to-be-released dspstak 21369zx2 will be close. This DSP currently runs at 333 MHz (SIMD), which is 666M MACs. In theory, it should be able to run at 400 MHz (ADI hasn't qualified the ADSP-21369 for this speed yet, as far as I know). If you could partition your design, we could certainly run this on two cards.

Our dspstak zx platform, based on an ADSP-21262 and a Cyclone FPGA, would be another possibility. You could partition your algorithm to run in the FPGA and the DSP.

700M MACs is a lot of processing for an audio application. Is this really necessary? Could you find a more efficient implementation? For example, if you are performing large FIRs, you might use FFT-based convolution as an alternative.

--
Al Clark
Danville Signal Processing, Inc.
--------------------------------------------------------------------
Purveyors of Fine DSP Hardware and other Cool Stuff
Available at http://www.danvillesignal.com
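
The FFT-based convolution mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not a real-time implementation: it assumes the expensive part of the algorithm really is a long FIR (the thread does not say so), it convolves one whole block at once rather than using overlap-add or overlap-save, and all function names are made up for this example.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def direct_convolve(x, h):
    """Time-domain FIR: len(x) * len(h) multiply-accumulates."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def fft_convolve(x, h):
    """Same result via zero-padded FFTs and a pointwise product."""
    n_out = len(x) + len(h) - 1
    n = 1
    while n < n_out:  # next power of two that avoids circular wrap-around
        n *= 2
    X = fft(list(x) + [0.0] * (n - len(x)))
    H = fft(list(h) + [0.0] * (n - len(h)))
    Y = fft([a * b for a, b in zip(X, H)], inverse=True)
    return [v.real / n for v in Y[:n_out]]  # divide by n to normalize


x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.25, 0.125]
slow = direct_convolve(x, h)
fast = fft_convolve(x, h)
print(all(abs(a - b) < 1e-9 for a, b in zip(slow, fast)))
```

For short filters the direct form is cheaper; the frequency-domain version only pays off when the filter is long, because its cost grows roughly with n log n instead of with the product of the two lengths.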
Hubble wrote:
> Hi all,
>
> I have an audio algorithm on Apple (Altivec), which requires about 1400
> MFlops (Float32) for Stereo, mostly multiply (700 MFLops) and add (700
> MFlops).
The question is, do you need dot products in your algorithm, i.e. can you use a fast MAC engine? Or even better, do you need correlation or convolution (these are also constructed out of "mostly multiply and adds")?

Regards,
Andor
>The question is, do you need dot products in your algorithm, ie. can
>you use a fast MAC engine?
Yes, large parts of it are in fact dot products. I have no doubt that I can squeeze (nearly) the maximum out of a device, if only I can find one.

Hubble.
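
As a small illustration of why dot products suit a MAC engine, the sketch below counts the work in one dot product. It assumes hardware that performs a multiply and an add as a single multiply-accumulate (MAC); the function name is made up for this example.

```python
def dot_product_with_counts(a, b):
    """Return (result, MAC count, flop count) for the dot product of a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    acc = 0.0
    macs = 0
    for x, y in zip(a, b):
        acc += x * y  # one multiply plus one add: a single MAC on MAC hardware
        macs += 1
    return acc, macs, 2 * macs  # counted separately, each MAC is two flops


result, macs, flops = dot_product_with_counts([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(result, macs, flops)
```

Counted this way, an algorithm made of N multiplies and N adds in dot-product form needs N MACs rather than 2N separate operations.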
Hubble wrote:
> >The question is, do you need dot products in your algorithm, ie. can
> >you use a fast MAC engine?
>
> Yes, great parts are in fact dot products. I have no doubt that I can
> squeeze out (near) the maximum of out of a device if only I find one.
Ok, if you can reduce the 1400 MFLOPS to 700 MMACS, then you can use either Al's DSP Stack (which gives you close to 700 MMACS), or you can use ADI's 21369-EZKIT:

http://www.analog.com/en/epHSProd/0,,21369-HARDWARE,00.html

which gives you 800 32-bit floating-point MMACS, or, to be on the safe side, the TS201 kit:

http://www.analog.com/en/prod/0%2C2877%2CTS201%25252DHARDWARE%2C00.html

which gives you 2400 32-bit floating-point MMACS.

If you are with a university, you can get a substantial discount. I think the test drive of the development system becomes crippled after 90 days (only a limited amount of code space), but it can still be used for small to medium projects (especially if you program in assembler). Look on their webpage for details.

Note that on any DSP you can do very fast FFTs. If you need the dot products for convolution or correlation, you should consider frequency-domain methods.
Regards, Andor
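
The comparison above can be written down as a quick headroom check. This is only a sketch: the peak figures are the ones quoted in this thread, the 700 MMACS requirement assumes every multiply pairs with an add, and the names are made up for illustration. Peak rates say nothing about the sustained throughput a real algorithm would reach.

```python
REQUIRED_MMACS = 700  # 1400 MFLOPS counted as multiply-add pairs (assumption)

# Peak figures as quoted in the thread, not taken from data sheets.
PEAK_MMACS = {
    "dspstak 21369zx2 at 333 MHz": 666,
    "21369-EZKIT (figure quoted by Andor)": 800,
    "TS201 kit": 2400,
}


def headroom(peak_mmacs, required_mmacs=REQUIRED_MMACS):
    """Ratio of peak MAC rate to the required MAC rate."""
    return peak_mmacs / required_mmacs


for name, peak in PEAK_MMACS.items():
    ratio = headroom(peak)
    verdict = "meets" if ratio >= 1.0 else "falls short of"
    print(f"{name}: {ratio:.2f}x, {verdict} the requirement on paper")
```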
"Andor" <andor.bariska@gmail.com> wrote in news:1137186628.469268.105440
@z14g2000cwz.googlegroups.com:

> Hubble wrote:
>> >The question is, do you need dot products in your algorithm, ie. can
>> >you use a fast MAC engine?
>>
>> Yes, great parts are in fact dot products. I have no doubt that I can
>> squeeze out (near) the maximum of out of a device if only I find one.
>
> Ok, if you can reduce the 1400 MFLOPS to 700 MMACS, then you can use
> either Al's DSP Stack (which gives you close to 700 MMACS), or you can
> use ADI's 21369-EZKIT:
>
> http://www.analog.com/en/epHSProd/0,,21369-HARDWARE,00.html
>
> which gives you 800 32bit floating-point MMACS, or, to be on the save
> side, the TS201-kit:
The EZ-Kit 21369 board is currently only rated to 333 MHz. They have the same issue we have. As far as we can tell, it's not the layout.

The TigerSHARC is much faster. The peripherals are much better on the SHARC.

--
Al Clark
Danville Signal Processing, Inc.
Al Clark wrote:
...
> The TigerSHARC is much faster. The peripherals are much better on the
> SHARC.
I always wondered about that. It's so silly to make a DSP with no serial ports. Conversely, none of the third-generation SHARCs are equipped for multiprocessing. It seems like ADI wants to draw a clear line between the two families: SHARCs for single-processor systems in consumer electronics, TS for multiprocessor systems. That is still no reason not to include a serial port.
Hi all,

Could you give us a hint about what your algorithm is, or compare it with
something like an MP3 codec or post-processing (in a wide sense: SBR,
equalizer...)? Ordinarily an audio decoder consumes about 7-20
MIPS per channel at 48 kHz on a DSP, and an encoder needs 2-4 times more
computational power. All this is rather conventional, but 700 MIPS
seems far too much to me for audio. I'm not sure you really need such a
powerful DSP for stereo processing.

Dmitry