# Fundamental Sampling Theory and Real-time Constraints

Started by October 28, 2012
```Randy Yates <yates@digitalsignallabs.com> wrote:

(snip, I wrote)

>> I like to design one directional systolic arrays, such that the output
>> depends on the previous N samples.

> Huh? The method I described does depend on the previous N - 1 samples
> plus the current sample.

>> If you aren't clocking so fast, though, you should be able to do other
>> ways of multiplying and adding.

>> I think a bidirectional systolic array will do what you say.

> I've always heard the term "systolic array" but never understood
> what it meant.

They are described here: http://en.wikipedia.org/wiki/Systolic_array

but maybe not so well.

Most common are one dimensional arrays.  The name comes from
an analogy to blood flow, moving each heart beat.

Start with a shift register wide enough to hold one sample, and N
stages long. That gives you the values of the last N samples.
Feed those to multipliers (or look-up tables with multiples
of the constant coefficients) then adders. Now, another
series of registers holds the partial sums.

Actually, there are various combinations of registers, multipliers,
and adders that you can use, but the systolic idea is that data
moves down the pipeline (usually) one stage per clock cycle,
and partial results also move along.

-- glen
```
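A minimal Python sketch of one of the "various combinations" glen mentions, assuming the transposed-form FIR arrangement: the input is broadcast to every multiplier and only the partial sums move through registers, one stage per clock tick. The function name `systolic_fir` and the loop-per-clock model are illustrative assumptions, not something taken from the thread.

```python
def systolic_fir(samples, coeffs):
    """Simulate a transposed-form FIR, one loop iteration per clock tick."""
    # One partial-sum register sits between each pair of adjacent stages.
    regs = [0.0] * (len(coeffs) - 1)
    out = []
    for x in samples:
        # Every stage multiplies the current sample by its own coefficient.
        products = [c * x for c in coeffs]
        # The output stage adds the partial sum handed over on the last tick.
        y = products[0] + (regs[0] if regs else 0.0)
        # Each register takes its stage's product plus the partial sum from
        # the stage behind it; the last stage has nothing behind it.
        regs = [
            products[k + 1] + (regs[k + 1] if k + 1 < len(regs) else 0.0)
            for k in range(len(regs))
        ]
        out.append(y)
    return out


# An impulse reads the coefficients back out, one per clock tick.
print(systolic_fir([1, 0, 0, 0], [1, 2, 3]))
```

In this arrangement the first coefficient reaches the output on the same tick as the sample that produced it, while the later taps arrive through the partial-sum registers.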
```>> So I suppose my question is, if I were to massively oversample my
>> signal, would I be able to avoid this latency cost or am I ultimately
>> bounded by the bandwidth of the signal of interest?

Hi,

this may sound painfully obvious, but maybe it isn't:
A system that has to perform anti-alias filtering will be slower than one
that doesn't.

In other words: If you increase the sampling rate and adjust the (at least
partly analog!) anti-aliasing filter accordingly, the group delay
decreases.

This doesn't solve all your problems, though. If the algorithm demands a
processing interval of x seconds, the lower or higher sampling rate doesn't
change a thing. You'll need more or fewer samples, but the absolute length
is the same. The key word here is, more often than not: "time-bandwidth
product" (of some part in your algorithm).

-markus

PS: Don't buy a FIR filter that imposes one sample delay. It's broken :-)
Done correctly, the first coefficient appears at the output -immediately-
(through combinatorial logic on an FPGA).
At kHz frequencies, this should be unproblematic, as I wouldn't expect
timing constraints to force a register at the input.

```
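The arithmetic behind markus's "absolute length is the same" remark can be sketched as follows. The helper names and the 8 kHz rate are hypothetical, chosen only for illustration; the 1024 samples at 1 kHz come from the original question.

```python
def window_latency_s(n_samples, fs_hz):
    """Time needed to collect n_samples at a sampling rate of fs_hz."""
    return n_samples / fs_hz


def samples_for_window(duration_s, fs_hz):
    """Number of samples spanning a fixed observation time at fs_hz."""
    return round(duration_s * fs_hz)


# 1024 samples at 1 kHz take just over a second to collect.
t = window_latency_s(1024, 1000)
# If the algorithm needs that much signal time, sampling at an assumed 8 kHz
# only raises the sample count; the wait stays the same.
n = samples_for_window(t, 8000)
print(t, n, window_latency_s(n, 8000))
```

Raising the sampling rate shortens the wait only when the algorithm is specified in samples rather than in seconds of signal.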
```"mnentwig" <24789@dsprelated> writes:

>>> So I suppose my question is, if I were to massively oversample my
>>> signal, would I be able to avoid this latency cost or am I ultimately
>>> bounded by the bandwidth of the signal of interest?
>
> Hi,
>
> this may sound painfully obvious, but maybe it isn't:
> A system that has to perform anti-alias filtering will be slower than one
> that doesn't.
>
> In other words: If you increase the sampling rate and adjust the (at least
> partly analog!) anti-aliasing filter accordingly, the group delay
> decreases.

Good point.

> [...]

> PS: Don't buy a FIR filter that imposes one sample delay. It's broken :-)
> Done correctly, the first coefficient appears at the output -immediately-
> (through combinatorial logic on an FPGA).

As the frat guys in Animal House coughed under their breaths, "bullshit."

There is always a delay, even on an FPGA. It may be femtoseconds, but
it's still a delay. I took the high road and considered it an output at
n+1.
--
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
```
```>> bullshit
well, I think we all understand what's meant :-)

Of course, zero delay isn't possible. But all hardware propagation delays
can be combined into one chunk of combinatorial logic, to the point where
we hit timing constraints (which isn't my main concern at kHz rates).

- If the delay remains unchanged, everything is fine.
- If it increases, something is broken (conceptually and / or
implementation, i.e. unnecessary register in Verilog code).

```
```"mnentwig" <24789@dsprelated> writes:

>>> bullshit
> well, I think we all understand what's meant :-)
>
> Of course, zero delay isn't possible. But all hardware propagation delays
> can be combined into one chunk of combinatorial logic, to the point where
> we hit timing constraints (which isn't my main concern at kHz rates).
>
> - If the delay remains unchanged, everything is fine.
> - If it increases, something is broken (conceptually and / or
> implementation, i.e. unnecessary register in Verilog code).

I see what you're saying. You don't have to wait until n+1 to process
the outputs of cascaded filters unless and until you hit a hardware
limit on the number of operations you can perform in one sample time.
Yes, I agree with this.

However, that is only one type of filter configuration.

Consider a trivially simple FIR: h[n] = 1. Place that filter in a
trivially-simple feedback loop:

y[n] = x[n] + y[n-1]

In this case, the one-sample delay y[n-1] is required.
--
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
```
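Randy's feedback example, y[n] = x[n] + y[n-1], written out as a minimal Python sketch. The function name `accumulate` and the zero initial state are assumptions made for illustration.

```python
def accumulate(samples):
    """Running sum y[n] = x[n] + y[n-1], assuming y[-1] = 0."""
    prev = 0.0  # the one-sample delay element holding y[n-1]
    out = []
    for x in samples:
        y = x + prev   # the current input reaches the output in the same step
        out.append(y)
        prev = y       # stored for use at the next sample instant
    return out


print(accumulate([1, 0, 0, 0]))
```

The stored value `prev` is the required unit delay: it sits in the feedback path, while the path from x[n] to y[n] itself has none.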
```to operate on the -previous- input / output sample, I need one unit delay.
I agree with that.
And to operate on the -current- input / output sample, I use one unit delay
less => none at all.

There isn't really a problem here. Run [1 0 0 0 0] through any correct FIR
filter routine, configured for [1] impulse response, and it gives [1 0 0 0
0] as output.
```
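The impulse test described above is easy to reproduce. This is a plain direct-form FIR sketch in Python; `fir` is a hypothetical helper, not any particular library's routine.

```python
def fir(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * x[n-k]."""
    history = [0.0] * len(coeffs)  # history[k] holds x[n-k]
    out = []
    for x in samples:
        # Shift the newest sample in at the front, drop the oldest.
        history = [x] + history[:-1]
        out.append(sum(c * h for c, h in zip(coeffs, history)))
    return out


impulse = [1, 0, 0, 0, 0]
print(fir(impulse, [1]))     # impulse response [1]: input passes straight through
print(fir(impulse, [0, 1]))  # a one-sample delay has to be asked for explicitly
```

A routine that returned the second result for the first call would be adding the extra sample of delay that markus calls broken.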
```Randy Yates <yates@digitalsignallabs.com> writes:

> "mnentwig" <24789@dsprelated> writes:
>
>>>> So I suppose my question is, if I were to massively oversample my
>>>> signal, would I be able to avoid this latency cost or am I ultimately
>>>> bounded by the bandwidth of the signal of interest?
>>
>> Hi,
>>
>> this may sound painfully obvious, but maybe it isn't:
>> A system that has to perform anti-alias filtering will be slower than one
>> that doesn't.
>>
>> In other words: If you increase the sampling rate and adjust the (at least
>> partly analog!) anti-aliasing filter accordingly, the group delay
>> decreases.
>
> Good point.
>
>> [...]
>
>> PS: Don't buy a FIR filter that imposes one sample delay. It's broken :-)
>> Done correctly, the first coefficient appears at the output -immediately-
>> (through combinatorial logic on an FPGA).
>
> As the frat guys in Animal House coughed under their breaths,
> "bullshit."

I apologize for this. If you're just talking about FIR filters,
which is the case you stated, then you are correct. I retract and
apologize for my "bullshit" statement.
--
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
```
```no need to apologize :-)
as I said, I think we all agree, "zero delay" isn't possible.

I'm just more picky than usual, as the original problem was to reduce
latency.

```
```On Sun, 28 Oct 2012 16:19:23 -0500, WaveRider wrote:

>>On Sun, 28 Oct 2012 10:59:02 -0500, WaveRider wrote:
>>
>>> Hi everyone,
>>>
>>> I'm new to the arena of DSP and have arrived at a bit of a
>>> sampling-based conundrum. I'm wondering if anyone here can help me out
>>> with possible solutions, or can simply set my understanding straight.
>>>
>>> I have an application where my signal of interest has a bandwidth of
>>> 500Hz.
>>> At Nyquist, I'd be sampling at 1 kHz. Of course, I could oversample
>>> this by some factor and filter it down in digital, but at the end of
>>> the day I have a bandwidth of 500 Hz.
>>>
>>> I have a constraint on my system: I must produce a result from my DSP
>>> within 300 ms of a change to the input. At worst case, then, I assume
>>> that it must be within 300ms of getting a new sample. However, if I
>>> sample and subsequently clock the DSP system at 1 kHz, my clock period
>>> is 1 ms. If I were to then, say, collect 1024 samples for an algorithm
>>> I've already taken over a second! Now, I could pipeline my samples
>>> through my DSP path. However, I still pay the price in latency. There
>>> will still be a delay from the change in signal to the final output
>>> that would be unacceptable.
>>>
>>> So I suppose my question is, if I were to massively oversample my
>>> signal, would I be able to avoid this latency cost or am I ultimately
>>> bounded by the bandwidth of the signal of interest? ie. Will my
>>> oversampled processing be "useless"? Could I sample at nyquist and run
>>> the actual processing at a faster rate?
>>>
>>> One thing to consider is that part of my DSP path is the discrete
>>> wavelet transform, which itself deals with rate conversion. Do my
>>> samples have to be at nyquist to take advantage of the wavelet
>>> transform? It would seem so.
>>>
>>> I'd appreciate any insight into this. Thanks!
>>
>>First, as SteveU pointed out, you're conflating sampling rate and
>>bandwidth.  They _are_ related, they _are not_ the same thing.
>>
>>Second, you're making a common error about Nyquist rates, which is going
>>to lead you to severe aliasing if you don't change your ways:
>>http://www.wescottdesign.com/articles/Sampling/sampling.pdf
>>
>>Third, a bandwidth of 500Hz suggests that all the interesting stuff can
>>be settled and done with in 2ms on a good day.  So what are you doing
>>that's going to take 300ms?
>>
>>Fourth, I would expect that your algorithm is going to take the sum of
>>whatever real-world time it was going to take anyway, plus one or two
>>sample times.  If you need 1000 samples to do something meaningful at a
>>sampling rate of 1kHz, that's because you need one second's worth of
>>data -- not because 1000 samples taken at a sampling rate of 1MHz would
>>be sufficient.
>>
>>I deduce that your problem is of detection and estimation.  Perhaps if
>>already tell you what it's going to boil down to: either your signal is
>>going to give you worthwhile information within 300ms of whatever event
>>you're trying to detect, or it won't.  In the first case you have a
>>chance of doing something useful.  In the second case, you're out of
>>luck.
>>
>>Since you're not letting us know what you're _actually_ trying to do,
>>there's not much help that I (or, I suspect, the rest of us) can give
>>you.
>>
>>--
>>Tim Wescott
>>Control system and signal processing consulting www.wescottdesign.com
>>
>>
> Hi Tim,
>
> I'm trying to do some simple myoelectric-based control.
>
> To restore the spectral content of the signal that has been attenuated
> by human skin, I'd like to implement a whitening filter based on Burg AR
> estimation. To converge to a better estimate of the AR process affecting
> my signal, this typically needs more samples to work on than less. This
> module will then update a pre-whitening FIR's coefficients with its
> estimates. If I do it in a periodic "burst" fashion I suspect I may be
> able to get around this timing problem of 1 second, especially since I
> don't expect the estimate to vary greatly.

I don't see why this has to be bursty, or why it -- by itself -- would
cause problems with delaying your signal.

Just run multiple threads: one process estimates the whitening filter,
another one applies the current whitening filter and does the estimation.

> On the other hand, I'll also be trying to detect features using the
> Discrete Wavelet Transform and 256-pt FFT followed by several estimator
> modules that work on windows of samples. I'm doing this all in hardware
> (Verilog) and so I can leverage true parallelism.

Something isn't matching up here.  I alluded to this earlier, but you're
not speaking to it at all.

You are measuring a real signal, and trying to pull out some real
feature.  But you speak of using a fixed-size FFT with an as-yet
undetermined sampling rate.  How can this be?  Shouldn't you be talking
about an FFT over some defined real time interval, with as many points as
it needs to have?

> So I get that my sampling rate can be much, much lower than my
> processing rate. However, that means anything working on the samples
> themselves will do no meaningful work until the next sample arrives,
> right? In that case I'll need to come up with an enabling scheme I
> suppose so that the various elements in the chain power-down once
> they've done their number crunching until the next valid sample arrives.

Possibly.  I'm not up on the most current technologies, but back when I
was paying attention it wasn't always obvious how to get low power from
an FPGA -- basically, if you clocked the thing at all, you lit up a whole
lot of logic that then consumed power.

If you're re-using hardware (which is a wise thing to do at such low
sampling rates), then the way to save power with an FPGA may well be to
just throttle the clock down until you're finishing with sample n-1 right
about the time that sample n comes in the door.

Better yet, you may want to use a processor to do this.

> The wavelet transform somewhat troubles me here, however. If I
> oversample this 500 Hz bandwidth signal by some factor, and then filter
> it down in digital, my initial detail coefficient levels will be junk,
> right? Unless the low-pass/high-pass filters for the first iteration
> aren't half-band but instead are designed to segment my real signal band
> within the much larger oversampled band in half. Am I getting this
> right? This is probably my most important question.

I'm not really up on all this new-fangled wavelet stuff (it doesn't help
much with control systems).  So your question is a bit jargon-rich.

I wouldn't call sampling faster than 1000Hz "oversampling" in any moral
sense, as in "you shouldn't be doing that" or "that's extravagant".  I'd

Yes, the faster that you sample the less that your higher frequency bins
in your FFT (or whatever passes for higher frequency wavelets) will tell
power (and assuming that you're using a windowed FFT and calling it
"wavelets"), I would be inclined to just do a longer FFT and discard the
higher bins.

If you absolutely positively feel that you must limit your sampling rate
to 1000Hz at some point, then sampling generously fast in the analog
domain and going through some sort of decimation process in the digital
domain is probably a good idea.  It's easy to make well-behaved filters
in the digital domain that are practically impossible to make in the
analog domain.  Were I doing this, I would start by investigating whether

One thing that concerns me, however, is that you seem to be falling into
the same trap with the wavelet transform that I see done with Kalman
filtering and fuzzy logic: namely that your statements make almost as
much sense with the word "magic" substituted for "wavelet".  If you
really know what you're doing with the wavelet transform, then many of
these questions should answer themselves.  If you're under the impression
that you're wielding a _magic_ transform, then progress is going to be
slow and -- at best -- random.

--
My liberal friends think I'm a conservative kook.
My conservative friends think I'm a liberal kook.
Why am I not happy that they have found common ground?

Tim Wescott, Communications, Control, Circuits & Software
http://www.wescottdesign.com
```
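Tim's point that the FFT should cover a defined real time interval, with as many points as that interval needs, can be put in numbers. This is a rough sketch: `fft_plan` is a hypothetical helper, and the 0.256 s window (256 points at 1 kHz) and the 4 kHz rate are assumed values for illustration only.

```python
def fft_plan(window_s, fs_hz, band_hz):
    """Return (FFT length, bin spacing in Hz, bins covering 0..band_hz)."""
    n_points = round(window_s * fs_hz)   # points needed to span the window
    bin_spacing_hz = fs_hz / n_points    # equals 1 / window_s
    bins_to_keep = int(band_hz / bin_spacing_hz) + 1  # bin 0 up to band_hz
    return n_points, bin_spacing_hz, bins_to_keep


# Same 0.256 s of signal and the same 500 Hz band of interest, two rates.
print(fft_plan(0.256, 1000, 500))
print(fft_plan(0.256, 4000, 500))
```

Under these assumptions the faster rate needs a longer FFT, but the bin spacing and the number of bins worth keeping are set by the window duration and the signal band, so the extra high-frequency bins can simply be discarded.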