On Wednesday, December 9, 2015 at 9:28:23 AM UTC+13, Steve Pope wrote:
> <gyansorova@gmail.com> wrote:
>
> >> On 12/8/2015 12:53 AM, gyansorova@gmail.com wrote:
>
> >>> I was wondering how many of you when sampling signals use
> >>> two concurrent loops. First loop to do the sampling and the
> >>> second for the DSP. You send the data between the two with
> >>> a FIFO. I assume this is way faster than a straight through
> >>> single loop. Of course FPGAs are ok for this but unaware if
> >>> anybody does this on say two processors.
>
> >Two parallel while loops which run independently of each other (not on
> >the same processor)
>
> You can certainly do this. A major complication would be if you
> had to send data in both directions. If it's one direction only
> (from the first processor to the second processor) then the
> second processor can simply poll the first processor.
>
> If things get too hairy you would either need to write a synchronization
> kernel, or buy (or I suppose, write from scratch) an RTOS. Depending on
> your total project flow, using an RTOS right off the bat might
> be wise.
>
>
> Steve
I was on a course by National Instruments, and it appears they have solved all these issues on their CompactRIO. It has an FPGA and an RTOS that communicate through FIFOs, and multiple software loops can also communicate with each other. Some of it is quite easy, but the more exotic side appears quite challenging to understand. They can monitor the jitter in real time and adjust speeds.
Very neat stuff, but not at all cheap. One of them is apparently controlling the Large Hadron Collider.
It wouldn't be much use for some applications, of course: too expensive.
Reply by Bob Masta●December 9, 2015
On Mon, 7 Dec 2015 21:53:47 -0800 (PST),
gyansorova@gmail.com wrote:
>I was wondering how many of you when sampling signals use two concurrent
>loops. First loop to do the sampling and the second for the DSP. You send the
>data between the two with a FIFO. I assume this is way faster than a
>straight through single loop. Of course FPGAs are ok for this but unaware if
>anybody does this on say two processors.
I suspect that these days most everyone is using some form
of "two concurrent loops" in the sense that the sampling is
treated as a separate process that fills a buffer, and the
DSP is handled by a separate process that reads the buffer.
Whether these run on separate processors, or separate tasks
on a single processor, isn't important to the basic concept.
Either way you need some handshaking to tell when a buffer
is full and needs servicing, whether that is passed as an
interrupt, DMA request, or software message. The choice
will probably depend on "hard" time constraints versus
memory use. You can ride out long intervals between buffer
services with a large-enough queue of buffers, for example.
(Or one really large circular buffer with pointer access.)
Or am I misunderstanding the question?
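A minimal sketch of the "one large circular buffer with pointer access" idea, assuming a single sampling side that calls push() and a single DSP side that calls pop_block(). The class and method names are made up for illustration, and the code is not thread-safe as written; a real system would add the handshaking (interrupt, DMA flag, or lock) described above.

```python
class RingBuffer:
    """Fixed-size circular buffer with separate read and write indices."""

    def __init__(self, capacity):
        self._data = [0.0] * capacity
        self._capacity = capacity
        self._write = 0   # next slot the sampling side fills
        self._read = 0    # next slot the DSP side consumes
        self._count = 0   # number of unread samples

    def push(self, sample):
        """Store one sample; return False (sample dropped) if the buffer is full."""
        if self._count == self._capacity:
            return False
        self._data[self._write] = sample
        self._write = (self._write + 1) % self._capacity
        self._count += 1
        return True

    def pop_block(self, n):
        """Return the n oldest samples as a list, or None if fewer than n are waiting."""
        if self._count < n:
            return None
        block = [self._data[(self._read + i) % self._capacity] for i in range(n)]
        self._read = (self._read + n) % self._capacity
        self._count -= n
        return block
```

The larger the capacity relative to the block size, the longer the DSP side can be late before push() starts returning False.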
Best regards,
Bob Masta
DAQARTA v8.00
Data AcQuisition And Real-Time Analysis
www.daqarta.com
Scope, Spectrum, Spectrogram, Sound Level Meter
Frequency Counter, Pitch Track, Pitch-to-MIDI
FREE 8-channel Signal Generator, DaqMusiq generator
Science with your sound card!
Reply by glen herrmannsfeldt●December 9, 2015
rickman <gnuarm@gmail.com> wrote:
(snip on two processor serial pipelines for DSP)
(then I wrote)
>> I believe that more usual is to do operations in parallel on
>> the two machines. That is, not in pipeline form.
(snip)
>> I suspect one could divide up the FFT, again so that the two
>> processors are doing the same thing on two different parts of the data.
> The FFT does not split cleanly like this. All output data depends on
> all input data and the intermediate calculations are shared. You could
> split the data into two parallel paths with a final step to combine them
> and perform the final pass.
I didn't try too hard, but that is about what I was thinking about.
> This would not scale well with more
> processors requiring more passes to be done at the end for every
> multiple of 2 processors used (or other numbers depending on the radix
> of the butterflies used).
But the OP only asked about two, and didn't give details about
the algorithm in question.
> Rather the FFT is best decomposed by passes with the data flowing
> between each pass like a pipeline. So this is not a good example to
> demonstrate your point.
I agree it doesn't scale so well to a larger number of processors.
For a serial pipeline, you need an efficient way to pass data.
If you don't have that, the overhead will be worse than the speedup.
Though you should be able to partition a 2D FFT, by rows and then
columns, without much trouble.
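A rough sketch of partitioning a 2D transform by rows and then columns, with two workers each taking half of the 1D transforms in each pass. Assumptions: a direct O(N^2) DFT stands in for a real FFT routine, all function names are invented for the example, and the two pool threads only stand in for two processors (CPython threads will not actually speed up this arithmetic).

```python
import cmath
from concurrent.futures import ThreadPoolExecutor

def dft(x):
    """Direct O(N^2) DFT of a 1-D sequence (stand-in for a real FFT)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def transform_rows(rows):
    """Apply the 1-D transform to every row in the given slice."""
    return [dft(r) for r in rows]

def split_transform(rows, pool):
    """Give each of two workers half of the rows, then rejoin in order."""
    half = len(rows) // 2
    first = pool.submit(transform_rows, rows[:half])
    second = pool.submit(transform_rows, rows[half:])
    return first.result() + second.result()

def transpose(m):
    return [list(col) for col in zip(*m)]

def dft2(matrix):
    """2-D DFT: transform all rows, then all columns."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        rows_done = split_transform(matrix, pool)                # pass 1: rows
        cols_done = split_transform(transpose(rows_done), pool)  # pass 2: columns
    return transpose(cols_done)
```

The only synchronization point is between the two passes: every row must be finished before any column transform starts.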
In case anyone was interested, the paper describing the algorithm
used for the Lytro camera starts out with a 4D FFT. That is as far
as I remember it, though. Fortunately N isn't so big.
-- glen
Reply by Steve Pope●December 8, 2015
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
>Steve Pope <spope33@speedymail.org> wrote:
>> <gyansorova@gmail.com> wrote:
>>>> On 12/8/2015 12:53 AM, gyansorova@gmail.com wrote:
>>>>> I was wondering how many of you when sampling signals use
>>>>> two concurrent loops. First loop to do the sampling and the
>>>>> second for the DSP. You send the data between the two with
>>>>> a FIFO. I assume this is way faster than a straight through
>>>>> single loop. Of course FPGAs are ok for this but unaware if
>>>>> anybody does this on say two processors.
>>>Two parallel while loops which run independently of each other (not on
>>>the same processor)
>> You can certainly do this. A major complication would be if you
>> had to send data in both directions. If it's one direction only
>> (from the first processor to the second processor) then the
>> second processor can simply poll the first processor.
>I believe that more usual is to do operations in parallel on
>the two machines. That is, not in pipeline form.
Just to clarify, I did not intend to exclude parallel processing
in my description above.
It sounds like the OP wishes to do some front-end processing
in the first processor and some more extensive DSP in the
second processor. These can occur simultaneously.
The stated use of "while loops" suggests polling, that is, whenever
the second processor has run out of data, it polls the first
processor to see if more data is ready. This is less efficient than
having the appropriate I/O primitives built into the OS (something like a
select() system call is useful) and supported by the hardware, but
it still allows parallelism.
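For comparison with a busy polling loop, here is a hedged sketch of the consumer sleeping in select() until data is ready. Assumptions: a local socket pair stands in for whatever link joins the two processors, blocks are fixed-length arrays of 32-bit integers, the "DSP" is just a block sum, and all function names are invented for the example.

```python
import select
import socket
import struct
import threading

def producer(sock, blocks):
    """Front-end side: send each block as packed 32-bit ints, then close."""
    for block in blocks:
        sock.sendall(struct.pack("<%di" % len(block), *block))
    sock.close()

def recv_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closed first."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf

def consumer(sock, block_len, timeout=1.0):
    """DSP side: sleep in select() until data is ready, then process it."""
    results = []
    while True:
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            continue                  # timeout: nothing ready yet, wait again
        raw = recv_exact(sock, 4 * block_len)
        if raw is None:
            break                     # producer closed the link
        block = struct.unpack("<%di" % block_len, raw)
        results.append(sum(block))    # placeholder DSP: block sum
    return results

def run(blocks):
    """Wire a producer thread to the consumer through a socket pair."""
    a, b = socket.socketpair()
    t = threading.Thread(target=producer, args=(a, blocks))
    t.start()
    out = consumer(b, len(blocks[0]))
    t.join()
    b.close()
    return out
```

While the consumer is blocked in select() it uses no CPU, which is the efficiency difference relative to spinning in a while loop.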
Steve
Reply by rickman●December 8, 2015
On 12/8/2015 6:38 PM, glen herrmannsfeldt wrote:
> Steve Pope <spope33@speedymail.org> wrote:
>> <gyansorova@gmail.com> wrote:
>
>>>> On 12/8/2015 12:53 AM, gyansorova@gmail.com wrote:
>
>>>>> I was wondering how many of you when sampling signals use
>>>>> two concurrent loops. First loop to do the sampling and the
>>>>> second for the DSP. You send the data between the two with
>>>>> a FIFO. I assume this is way faster than a straight through
>>>>> single loop. Of course FPGAs are ok for this but unaware if
>>>>> anybody does this on say two processors.
>
>>> Two parallel while loops which run independently of each other (not on
>>> the same processor)
>
>> You can certainly do this. A major complication would be if you
>> had to send data in both directions. If it's one direction only
>> (from the first processor to the second processor) then the
>> second processor can simply poll the first processor.
>
> I believe that more usual is to do operations in parallel on
> the two machines. That is, not in pipeline form.
>
>> If things get too hairy you would either need to write a synchronization
>> kernel, or buy (or I suppose, write from scratch) an RTOS. Depending on
>> your total project flow, using an RTOS right off the bat might
>> be wise.
>
> One might do a matrix multiply with one processor doing one half of
> the matrix, and the other doing the other half. In that case, there
> is little synchronization to get right.
>
> I suspect one could divide up the FFT, again so that the two
> processors are doing the same thing on two different parts of the data.
The FFT does not split cleanly like this. All output data depends on
all input data and the intermediate calculations are shared. You could
split the data into two parallel paths with a final step to combine them
and perform the final pass. This would not scale well with more
processors requiring more passes to be done at the end for every
multiple of 2 processors used (or other numbers depending on the radix
of the butterflies used).
Rather the FFT is best decomposed by passes with the data flowing
between each pass like a pipeline. So this is not a good example to
demonstrate your point.
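The "two parallel paths with a final step to combine them" can be sketched as a single decimation-in-time split: the even-indexed and odd-indexed samples are transformed independently, and one butterfly pass that needs both results produces the full spectrum. Assumptions: a direct O(N^2) DFT stands in for each half-size FFT, and the function names are invented for the example.

```python
import cmath

def dft(x):
    """Direct O(N^2) DFT of a 1-D sequence (stand-in for a half-size FFT)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def split_fft(x):
    """N-point DFT from two independent N/2-point DFTs plus one combining pass."""
    n = len(x)
    if n % 2 != 0:
        raise ValueError("length must be even")
    even = dft(x[0::2])   # independent work for processor A
    odd = dft(x[1::2])    # independent work for processor B
    half = n // 2
    out = [0j] * n
    for m in range(half):
        # Final butterfly pass: every output needs results from both halves.
        t = cmath.exp(-2j * cmath.pi * m / n) * odd[m]
        out[m] = even[m] + t
        out[m + half] = even[m] - t
    return out
```

Splitting across four processors would need two such combining passes, which is the scaling cost described above.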
--
Rick
Reply by glen herrmannsfeldt●December 8, 2015
Steve Pope <spope33@speedymail.org> wrote:
> <gyansorova@gmail.com> wrote:
>>> On 12/8/2015 12:53 AM, gyansorova@gmail.com wrote:
>>>> I was wondering how many of you when sampling signals use
>>>> two concurrent loops. First loop to do the sampling and the
>>>> second for the DSP. You send the data between the two with
>>>> a FIFO. I assume this is way faster than a straight through
>>>> single loop. Of course FPGAs are ok for this but unaware if
>>>> anybody does this on say two processors.
>>Two parallel while loops which run independently of each other (not on
>>the same processor)
> You can certainly do this. A major complication would be if you
> had to send data in both directions. If it's one direction only
> (from the first processor to the second processor) then the
> second processor can simply poll the first processor.
I believe that more usual is to do operations in parallel on
the two machines. That is, not in pipeline form.
> If things get too hairy you would either need to write a synchronization
> kernel, or buy (or I suppose, write from scratch) an RTOS. Depending on
> your total project flow, using an RTOS right off the bat might
> be wise.
One might do a matrix multiply with one processor doing one half of
the matrix, and the other doing the other half. In that case, there
is little synchronization to get right.
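As a small illustration of the half-and-half matrix multiply, assuming the split is by rows of the left-hand matrix and that two pool threads stand in for the two processors (function names are invented, and CPython threads will not actually run this arithmetic in parallel):

```python
from concurrent.futures import ThreadPoolExecutor

def multiply_rows(a_rows, b):
    """Multiply a slice of A's rows by the whole of B."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a_rows]

def split_matmul(a, b):
    """Compute A*B with one worker per half of A's rows."""
    half = len(a) // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        top = pool.submit(multiply_rows, a[:half], b)
        bottom = pool.submit(multiply_rows, a[half:], b)
        # The only synchronization is waiting for both halves to finish.
        return top.result() + bottom.result()
```

Each worker reads all of B but writes a disjoint set of output rows, so neither needs to coordinate with the other until the end.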
I suspect one could divide up the FFT, again so that the two
processors are doing the same thing on two different parts of the data.
For an FIR, if you wait until enough data is in the input buffer,
you can have one compute the even output samples, and the other odd.
Doing it this way easily generalizes to more processors.
Not so easy in a serial pipeline form.
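The even/odd output split for an FIR can be sketched as below, assuming the whole input block is already buffered and samples before the start of the block are treated as zero. The function names are invented, and the per-worker calls are made one after another here; each call is independent, so each could run on its own processor.

```python
def fir_outputs(x, h, indices):
    """Compute y[n] = sum_k h[k] * x[n - k] for the requested output indices only."""
    out = {}
    for n in indices:
        acc = 0.0
        for k, coeff in enumerate(h):
            if 0 <= n - k < len(x):
                acc += coeff * x[n - k]
        out[n] = acc
    return out

def split_fir(x, h, workers=2):
    """Worker w computes outputs w, w + workers, w + 2*workers, ..."""
    n_out = len(x)
    parts = [fir_outputs(x, h, range(w, n_out, workers)) for w in range(workers)]
    y = [0.0] * n_out
    for part in parts:            # interleave the partial results
        for n, value in part.items():
            y[n] = value
    return y
```

With workers=2 this is the even/odd split; raising workers gives the generalization to more processors with no change to the per-worker code.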
-- glen
Reply by Eric Jacobsen●December 8, 2015
On Tue, 8 Dec 2015 08:26:05 -0800 (PST), gyansorova@gmail.com wrote:
>On Tuesday, December 8, 2015 at 8:54:04 PM UTC+13, rickman wrote:
>> On 12/8/2015 12:53 AM, gyansorova@gmail.com wrote:
>> > I was wondering how many of you when sampling signals use two concurrent
>> > loops. First loop to do the sampling and the second for the DSP. You send
>> > the data between the two with a FIFO. I assume this is way faster than a
>> > straight through single loop. Of course FPGAs are ok for this but unaware if
>> > anybody does this on say two processors.
>>
>> I guess I'm not clear on what you mean by one or two loops. Do you mean
>> processes?
>>
>> --
>>
>> Rick
>
>Two parallel while loops which run independently of each other (not on the
>same processor)
I'm actually in the middle of doing this sort of thing right now on a
vanilla Linux platform where the real-time application is subject to
the whims of the OS.
Using threads (e.g., Posix threads) helps a lot, even if there's only
one CPU, but if there is more than one CPU it allows the system to
distribute the various processes (aka threads) across the available
CPUs.
Even with just one CPU it can prevent hardware drivers or interface
routines from blocking (and sitting on their hands) while time is
wasting away that you need to do the processing in the other loop (or
other thread).
In my current case it's similar to what you're describing: the
routine that collects the samples from the input (ADC, AFE, whatever),
gets its own time management and thread, which periodically hands
stuff off to another thread that is running compute-intensive
processing. It is essentially virtual parallel while() loops, even
on one CPU, as the OS allows resources to switch between them rather
than allowing one to block the other. If multiple CPUs are
available, then the threads can be assigned to CPU resources based on
availability.
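A minimal sketch of that arrangement, using Python's threading and queue modules in place of Posix threads. Assumptions: read_sample is a hypothetical callable standing in for the ADC/AFE driver, the block size and FIFO depth are arbitrary, and the "compute-intensive processing" is just a block mean.

```python
import queue
import threading

BLOCK = 4      # samples per hand-off (assumed value)
DONE = None    # sentinel telling the processing thread to stop

def sampler(read_sample, n_samples, fifo):
    """Collection loop: gather samples and hand off full blocks."""
    block = []
    for i in range(n_samples):
        block.append(read_sample(i))
        if len(block) == BLOCK:
            fifo.put(block)      # waits only if the FIFO is full
            block = []
    fifo.put(DONE)               # any partial final block is discarded

def processor(fifo, results):
    """Processing loop: sleep until a block arrives, then work on it."""
    while True:
        block = fifo.get()
        if block is DONE:
            break
        results.append(sum(block) / len(block))   # placeholder DSP: block mean

def run(read_sample, n_samples):
    fifo = queue.Queue(maxsize=8)
    results = []
    t_in = threading.Thread(target=sampler, args=(read_sample, n_samples, fifo))
    t_dsp = threading.Thread(target=processor, args=(fifo, results))
    t_in.start()
    t_dsp.start()
    t_in.join()
    t_dsp.join()
    return results
```

Neither loop spins: the sampler waits in put() when the FIFO is full and the processor waits in get() when it is empty, so the OS is free to run the other thread.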
Eric Jacobsen
Anchor Hill Communications
http://www.anchorhill.com
Reply by Steve Pope●December 8, 2015
<gyansorova@gmail.com> wrote:
>> On 12/8/2015 12:53 AM, gyansorova@gmail.com wrote:
>>> I was wondering how many of you when sampling signals use
>>> two concurrent loops. First loop to do the sampling and the
>>> second for the DSP. You send the data between the two with
>>> a FIFO. I assume this is way faster than a straight through
>>> single loop. Of course FPGAs are ok for this but unaware if
>>> anybody does this on say two processors.
>Two parallel while loops which run independently of each other (not on
>the same processor)
You can certainly do this. A major complication would be if you
had to send data in both directions. If it's one direction only
(from the first processor to the second processor) then the
second processor can simply poll the first processor.
If things get too hairy you would either need to write a synchronization
kernel, or buy (or I suppose, write from scratch) an RTOS. Depending on
your total project flow, using an RTOS right off the bat might
be wise.
Steve
Reply by Tim Wescott●December 8, 2015
On Mon, 07 Dec 2015 21:53:47 -0800, gyansorova wrote:
> I was wondering how many of you when sampling signals use two concurrent
> loops. First loop to do the sampling and the second for the DSP. You
> send the data between the two with a FIFO. I assume this is way faster
> than a straight through single loop. Of course FPGAs are ok for this but
> unaware if anybody does this on say two processors.
Given the wide range of answers you're getting, and the puzzlement that
accompanies them -- perhaps you could tell us more, or even what your
real question is?
--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com
Reply by Les Cargill●December 8, 2015
gyansorova@gmail.com wrote:
> I was wondering how many of you when sampling signals use two
> concurrent loops. First loop to do the sampling and the second for
> the DSP. You send the data between the two with a FIFO. I assume this
> is way faster than a straight through single loop. Of course FPGAs
> are ok for this but unaware if anybody does this on say two
> processors.
>
Results will vary. There is also nothing to keep you from interleaving
the two "loops" in one, say, "while (1) ..." loop for a single
processor. "Two processors" is slightly ambiguous; could be a dual-core
in which case some things are still shared that you'll have to
accommodate.
You kinda want the sampling part to be something like DMA. Much
also depends on the interface to the sampling hardware. A 'loop'
could be behind a device driver facade.
The way to know what works is to instrument the code with at least
counters that are incremented to show when something is not right.
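One way the counter idea might look for the interleaved single-loop case, as a simulation rather than real driver code. Assumptions: samples_per_tick is a made-up list of how many samples arrive on each pass through the loop, the FIFO depth and drain rate are parameters, and the counter names are invented for the example.

```python
from collections import deque

def run_loop(samples_per_tick, fifo_depth, drain_per_tick):
    """Interleave 'sampling' and 'DSP' in one loop and count trouble."""
    fifo = deque()
    counters = {"overruns": 0, "underruns": 0, "processed": 0}
    for arriving in samples_per_tick:
        # Sampling half of the loop: accept whatever arrived this tick.
        for _ in range(arriving):
            if len(fifo) >= fifo_depth:
                counters["overruns"] += 1    # sample lost: FIFO was full
            else:
                fifo.append(0)               # placeholder sample value
        # DSP half of the loop: consume up to drain_per_tick samples.
        for _ in range(drain_per_tick):
            if fifo:
                fifo.popleft()
                counters["processed"] += 1
            else:
                counters["underruns"] += 1   # DSP slot with nothing to do
    return counters
```

A nonzero overrun count means data was lost, so the FIFO is too shallow or the DSP half too slow; a steadily growing underrun count only means the DSP half has headroom.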
--
Les Cargill