DSPRelated.com
Forums

Small, fast, resource-rich processor

Started by Tim Wescott September 11, 2013
I'm working on a project that needs to have a pretty hefty amount of 
digital signal processing done in more or less real time ("soft" real 
time, if you must split hairs).

For a variety of reasons I think this algorithm would work best on a 
small single-board computer (my customer disagrees -- but getting it shoe-
horned into the chips I was considering is going to take WORK, and I 
think it'll be cheaper for them to go with more expensive hardware).

So I'm looking for suggestions.  I mostly build custom boards or I make 
algorithms for other people's hardware -- I've never specified a single-
board computer that's gone into production.

I was thinking PC-104, but I've never actually used a PC-104 computer, 
and I have no idea, beyond trade-show displays, how the market has 
evolved.

So, here's what I think I need.  Anyone who wants to look through this 
and point me to the current crop of solutions for all this is welcome to 
do so -- I'll be grateful.

Small: PC-104 form factor, or some other solution that's less than about 
20 square inches of board and less than an inch tall.

Fast: Something that supports native double-precision floating point, and 
has a clock rate of 500MHz or better.  This algorithm runs about 5x 
faster than real time as a Linux application on a Dell Dimension 8300.  
That's a 2.8GHz Pentium 4, so if it's running alone it should do more 
with less.

Resource-rich: The algorithm runs, albeit way slow, on an STM32F407, using 
less than 128kB of memory.  So at least that much memory plus whatever is 
necessary for any OS (see below).

Ports: Comes with serial ports.  I don't need Ethernet or that stuff.  
Depending on the processor (see below), having a JTAG debug port would be 
nice.

Extensible: I need something onto which I can easily slap an ADC board, 
or something that talks USB, and suggestions for matching ADC modules 
that talk USB.  My preference is something that has an easy parallel I/O 
implementation, an SPI controller that I can hook to an ADC, and/or some 
generic general-purpose I/O pins that I can bit-bang.

Long legs: I need something that'll be on the market for at least five 
years, preferably a decade.  Better yet would be something that comes 
from some sort of a standard (that's not on its last legs) so that if 
today's choice goes out of production tomorrow, we can choose another 
that's form-fit-function compatible.

Processor: My preference is for ARM or Intel, but I'm open to anything 
for which there's a good port of the GNU tools.

OS: Depends somewhat on how the "extensible" happens.  If I have to talk 
to an ADC using USB, then I want the board to be running Linux (or 
Windows in a pinch).  Otherwise, I'm happy with putting my own little RTOS 
on there.

Thanks in advance.

-- 

Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

On Wed, 11 Sep 2013 11:11:04 -0500, Tim Wescott
<tim@seemywebsite.really> wrote:

> [original post quoted in full; snipped]
AFAIK, PC/104 or PC/104+ processors are all x86 architectures (or close relatives), so any fancy ARM chip is probably out.

We've used boards from Versalogic http://www.versalogic.com/ in a few products (low volume, high cost, long product lifetimes (shipboard)) to pretty good effect. They're quite good about putting out board rev notices and handling migrations when they do EOL a product line.

Add-on PC-104 boards are from EMAC Inc. http://www.emacinc.com/sbc_pc_addons.htm. Also use a couple of their 8051-ish SBCs for various purposes.
On Wed, 11 Sep 2013 12:45:41 -0400, Rich Webb wrote:

> AFAIK, PC/104 or PC/104+ processors are all x86 architectures (or close
> relatives), so any fancy ARM chip is probably out.
There are some ARM boards with PC-104 interfaces. See, for example: http://www.embeddedarm.com/products/single-board-computers.php#pc/104-bus-embedded-computers
In comp.dsp Tim Wescott <tim@seemywebsite.really> wrote:
> I'm working on a project that needs to have a pretty hefty amount of
> digital signal processing done in more or less real time ("soft" real
> time, if you must split hairs).
Just wondering, have you thought about FPGA based systolic arrays?

Since you mention floating point, I will guess that it isn't the best choice, but you don't say all that much about the computation.

How many floating point add/subtract, multiply, and divide per second are needed?

-- glen
On Wed, 11 Sep 2013 18:26:46 +0000, glen herrmannsfeldt wrote:

> In comp.dsp Tim Wescott <tim@seemywebsite.really> wrote:
>> I'm working on a project that needs to have a pretty hefty amount of
>> digital signal processing done in more or less real time ("soft" real
>> time, if you must split hairs).
>
> Just wondering, have you thought about FPGA based systolic arrays?
This is a high-zoot, low production volume task. And I have things working just dandy on a PC. So I'd like to take that "works dandy" and translate it -- with as little effort as possible -- to something that'll work inside of a box.

Trying to translate this algorithm (it's a Kalman filter) into an FPGA-based system would be a nightmare and a time-sink. I'm trying to avoid the time sink.
> Since you mention floating point, I will guess that it isn't the best
> choice, but you don't say all that much about the computation.
>
> How many floating point add/subtract, multiply, and divide per second
> are needed?
It needs to be able to sustain about 500k double-precision FLOPS. 1M would be nice.

So it's not a huge challenge for a PC-class processor.

-- 
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
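A back-of-envelope way to relate a FLOPS figure like the one above to filter size is to count the dense matrix operations in one predict/update cycle. The sketch below is only an estimate under stated assumptions: the thread never gives the state or measurement dimensions, so `n` and `m` are hypothetical, the filter is assumed to be a plain linear Kalman filter with dense matrices, and no symmetry or sparsity is exploited.

```python
def matmul_flops(rows, inner, cols):
    """Approximate FLOPs for a dense (rows x inner) @ (inner x cols) product."""
    return 2 * rows * inner * cols


def kalman_flops_per_update(n, m):
    """Rough FLOP count for one predict + update cycle of a dense linear
    Kalman filter with n states and m measurements (hypothetical sizes)."""
    predict = (
        matmul_flops(n, n, 1)        # x = F x
        + matmul_flops(n, n, n)      # F P
        + matmul_flops(n, n, n)      # (F P) F^T
        + n * n                      # + Q
    )
    update = (
        matmul_flops(m, n, 1) + m    # y = z - H x
        + matmul_flops(m, n, n)      # H P
        + matmul_flops(m, n, m)      # (H P) H^T
        + m * m                      # + R
        + matmul_flops(n, n, m)      # P H^T
        + 2 * m ** 3                 # invert S (order-of-magnitude figure)
        + matmul_flops(n, m, m)      # K = (P H^T) S^-1
        + matmul_flops(n, m, 1) + n  # x = x + K y
        + matmul_flops(n, m, n)      # K H
        + n * n                      # I - K H
        + matmul_flops(n, n, n)      # P = (I - K H) P
    )
    return predict + update


def max_update_rate(flops_budget, n, m):
    """Filter cycles per second that a sustained FLOPS budget could cover."""
    return flops_budget / kalman_flops_per_update(n, m)


if __name__ == "__main__":
    n, m = 6, 3  # hypothetical dimensions, not taken from the thread
    print(kalman_flops_per_update(n, m), "FLOPs per cycle")
    print(round(max_update_rate(500_000, n, m), 1), "cycles/s at 500k FLOPS")
```

With the hypothetical n = 6, m = 3 this counts 2448 FLOPs per cycle, so a 500k FLOPS budget would cover roughly 200 cycles per second. The cost is dominated by the n-cubed covariance terms, so the number moves quickly with state dimension.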
Hi Tim,

On 9/11/2013 9:11 AM, Tim Wescott wrote:
> I'm working on a project that needs to have a pretty hefty amount of
> digital signal processing done in more or less real time ("soft" real
> time, if you must split hairs).
Consider a "real" DSP? Or, do you have a fair amount of "conventional" coding that might make such a choice "tedious"?
> For a variety of reasons I think this algorithm would work best on a
> small single-board computer (my customer disagrees -- but getting it
> shoe-horned into the chips I was considering is going to take WORK, and I
> think it'll be cheaper for them to go with more expensive hardware).
Quantities? Target cost? Power? Environmental?
> So I'm looking for suggestions. I mostly build custom boards or I make
> algorithms for other people's hardware -- I've never specified a
> single-board computer that's gone into production.
>
> I was thinking PC-104, but I've never actually used a PC-104 computer,
> and I have no idea, beyond trade-show displays, how the market has
> evolved.
<frown> PC-104 tends to carry some higher costs than roll-your-own (assuming the right quantities, of course).

I.e., do you really need the "expandability" that PC-104 brings to the table? Will you be buying daughter cards -- or, rolling your own to add interfaces not present on the first board?
> So, here's what I think I need. Anyone who wants to look through this
> and point me to the current crop of solutions for all this is welcome to
> do so -- I'll be grateful.
>
> Small: PC-104 form factor, or some other solution that's less than about
> 20 square inches of board and less than an inch tall.
>
> Fast: Something that supports native double-precision floating point, and
> has a clock rate of 500MHz or better. This algorithm runs about 5x
> faster than real time as a Linux application on a Dell Dimension 8300.
> That's a 2.8GHz Pentium 4, so if it's running alone it should do more
> with less.
>
> Resource-rich: The algorithm runs, albeit way slow, on an STM32F407, using
> less than 128kB of memory. So at least that much memory plus whatever is
> necessary for any OS (see below).
Is it realistic to trade memory for performance? (unroll loops, table lookups, etc.)

Is that 128KB TEXT+DATA, TEXT, DATA, etc.? Any requirement for persistent memory?
> Ports: Comes with serial ports. I don't need Ethernet or that stuff.
> Depending on the processor (see below), having a JTAG debug port would be
> nice.
>
> Extensible: I need something onto which I can easily slap an ADC board,
> or something that talks USB, and suggestions for matching ADC modules
> that talk USB. My preference is something that has an easy parallel I/O
> implementation, an SPI controller that I can hook to an ADC, and/or some
> generic general-purpose I/O pins that I can bit-bang.
Your ADC has a low bandwidth?
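The bandwidth question matters because a bit-banged SPI read drives the clock line in software, one edge at a time. The sketch below shows the general shape, assuming SPI mode 0, MSB-first, a 16-bit word and an active-low chip select. The pin callbacks and the `FakeAdc` class are hypothetical stand-ins so that it runs on a PC; no real ADC part or GPIO API is implied.

```python
def spi_read_word(set_cs, set_sclk, read_miso, bits=16):
    """Clock in one word MSB-first, SPI mode 0 (sample on the rising edge)."""
    value = 0
    set_sclk(0)                # idle clock low
    set_cs(0)                  # assert chip select (active low)
    for _ in range(bits):
        set_sclk(1)            # rising edge: slave data is valid
        value = (value << 1) | (read_miso() & 1)
        set_sclk(0)            # falling edge: slave presents the next bit
    set_cs(1)                  # release chip select
    return value


class FakeAdc:
    """Hypothetical stand-in for hardware: shifts out a fixed sample."""

    def __init__(self, sample, bits=16):
        self.sample = sample
        self.bits = bits
        self.bit_index = bits - 1
        self.sclk = 0
        self.pin_writes = 0    # counts GPIO writes issued by the master

    def set_cs(self, level):
        self.pin_writes += 1
        if level == 0:
            self.bit_index = self.bits - 1   # start of a new transfer

    def set_sclk(self, level):
        self.pin_writes += 1
        if self.sclk == 1 and level == 0:    # falling edge: advance one bit
            self.bit_index -= 1
        self.sclk = level

    def read_miso(self):
        return (self.sample >> self.bit_index) & 1


if __name__ == "__main__":
    adc = FakeAdc(0xA5C3)
    word = spi_read_word(adc.set_cs, adc.set_sclk, adc.read_miso)
    print(hex(word), adc.pin_writes)
```

Each bit costs two pin writes and one pin read, so the achievable sample rate is bounded by how fast the processor can toggle GPIO, divided by the word length. That is fine for a slow ADC and a poor fit for a fast one, where a hardware SPI controller would be the safer choice.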
> Long legs: I need something that'll be on the market for at least five
> years, preferably a decade. Better yet would be something that comes
> from some sort of a standard (that's not on its last legs) so that if
> today's choice goes out of production tomorrow, we can choose another
> that's form-fit-function compatible.
This, IMO, is where you will spend most of your selection effort (esp if you aren't rolling your own). There are a *lot* of "no-name" offerings that will fit your bill. But, no real guarantees that any of these people will be producing the same (or compatible) product *3* years from now.

Here, PC104 may help. Vendors *seem* like they try to keep their products around a bit longer than most -- perhaps acknowledging the fact that their devices are used in applications where the customer (i.e., you) wouldn't want to upgrade "every year", etc.

Presumably, your code base is portable (not written in ASM) so your main concern is mechanical and hardware features?
> Processor: My preference is for ARM or Intel, but I'm open to anything
> for which there's a good port of the GNU tools.
x86 is big in the PC104 world. Think "PC" :>
> OS: Depends somewhat on how the "extensible" happens. If I have to talk
> to an ADC using USB, then I want the board to be running Linux (or
> Windows in a pinch). Otherwise, I'm happy with putting my own little RTOS
> on there.
... because you don't have a USB stack?
> Thanks in advance.
Why not a hybrid approach? Design something that handles <some-specific-aspect-of-your-problem> and *buy* something that handles the rest? (the trick, of course, is finding an efficient means of communicating between the two; if you are passing gobs of data back AND forth, this will be a loser!)

I.e., think of it like designing a "smart peripheral" instead of designing a "system".
Consider TMS64xx SOMs made by LogicPD and others.


VLV


Vladimir Vassilevsky <nospam@nowhere.com> writes:

> Consider TMS64xx SOMs made by LogicPD and others.
E.g., Orsys. There are some FPGA daughtercards you can use with them as well. Not PC-104.

-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
Randy Yates <yates@digitalsignallabs.com> writes:

> Vladimir Vassilevsky <nospam@nowhere.com> writes:
>
>> Consider TMS64xx SOMs made by LogicPD and others.
>
> E.g., Orsys. There are some FPGA daughtercards you can use with them as
> well. Not PC-104.
Correction: The Orsys is just a board, not a SOM (although not sure where you'd draw the line).

-- 
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
On 11/09/13 20:37, Tim Wescott wrote:
> On Wed, 11 Sep 2013 18:26:46 +0000, glen herrmannsfeldt wrote:
>
>> In comp.dsp Tim Wescott <tim@seemywebsite.really> wrote:
>>> I'm working on a project that needs to have a pretty hefty amount of
>>> digital signal processing done in more or less real time ("soft" real
>>> time, if you must split hairs).
>>
>> Just wondering, have you thought about FPGA based systolic arrays?
>
> This is a high-zoot, low production volume task. And I have things
> working just dandy on a PC. So I'd like to take that "works dandy" and
> translate it -- with as little effort as possible -- to something that'll
> work inside of a box.
>
> Trying to translate this algorithm (it's a Kalman filter) into an
> FPGA-based system would be a nightmare and a time-sink. I'm trying to
> avoid the time sink.
>
>> Since you mention floating point, I will guess that it isn't the best
>> choice, but you don't say all that much about the computation.
>>
>> How many floating point add/subtract, multiply, and divide per second
>> are needed?
>
> It needs to be able to sustain about 500k double-precision FLOPS. 1M
> would be nice.
>
> So it's not a huge challenge for a PC-class processor.
I don't think there are many "microcontrollers" that have double precision floating point. There are, of course, plenty of DSPs that can do this. There are also some of the Freescale MPC microcontrollers that have double precision floating point, though most have just single precision.

One thing to be careful of is that there can be a very big difference between the theoretical MIPS and floating point performance, and the real-world performance.

Once you move to processors that are running at 400MHz+ with external memory, it's a lot easier - there are many with double precision floating point, and many that could handle this in software floating point if needed.

If you are looking for low volumes, it should not be hard to find ready-made cpu cards with Cortex A processors that will handle this easily. You can get them in SO-DIMM boards (i.e., the same size and shape as memory for laptops). A popular choice these days is the Freescale i.MX6 - it is cropping up in lots of tiny Linux and Android systems. But you can happily use them bare-bones if you prefer.
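One way to see the gap between datasheet and real-world numbers is to time a known amount of floating point work and compute the sustained rate. The sketch below only shows the shape of such a measurement. Run under CPython it mostly measures interpreter overhead, so the figure it prints says nothing about any candidate board; the same loop would need to be rewritten in C and run on the actual target. The function name and iteration count are hypothetical.

```python
import time


def measure_sustained_flops(iterations=200_000):
    """Time a dependent multiply-add chain and return (FLOPS, final value)."""
    acc = 0.0
    x = 1.000001
    start = time.perf_counter()
    for _ in range(iterations):
        acc = acc * x + 0.5    # one multiply and one add per pass
    elapsed = time.perf_counter() - start
    # Two floating point operations per pass; acc is returned so the
    # work has a visible result.
    return 2 * iterations / elapsed, acc


if __name__ == "__main__":
    flops, _ = measure_sustained_flops()
    print(f"sustained: {flops:,.0f} double-precision FLOPS (interpreter-bound)")
```

On real hardware the measured rate depends heavily on memory placement, cache behaviour and whether the FPU handles doubles natively, which is the point being made above.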