DSPRelated.com
Forums

How to downsample this signal?

Started by Piotr Wyderski May 13, 2017
rickman <gnuarm@gmail.com> writes:

> On 5/16/2017 3:40 PM, Randy Yates wrote: >> Randy Yates <randyy@garnerundergroundinc.com> writes: >> >>> rickman <gnuarm@gmail.com> writes: >>> >>>> On 5/16/2017 1:11 PM, Randy Yates wrote: >>>>> rickman <gnuarm@gmail.com> writes: >>>>> >>>>>> On 5/15/2017 11:49 AM, Randy Yates wrote: >>>>>>> rickman <gnuarm@gmail.com> writes: >>>>>>> >>>>>>>> On 5/15/2017 1:21 AM, Piotr Wyderski wrote: >>>>>>>>> rickman wrote: >>>>>>>>> >>>>>>>>>> Are the 248 words for data or also program? >>>>>>>>> >>>>>>>>> Data only, there are dedicated code memories. The control part >>>>>>>>> is totally crazy: there are two separate code memories working >>>>>>>>> in parallel (they take both paths simultaneosly in order to >>>>>>>>> provide instructions with no delay in the case of a branch). >>>>>>>>> But there is no such thing as a program counter, the code >>>>>>>>> memories store instructions at consecutive addresses without >>>>>>>>> any deeper structure. A run of instructions is called a block >>>>>>>>> and is ended with the presence of a jump instruction. There >>>>>>>>> is also a third control memory which stores the information >>>>>>>>> where the i-th block begins and to which j-th state the FSM >>>>>>>>> should go in the case of a branch. In short, hardware basic >>>>>>>>> blocks. The data path is a VLIW with exposed pipelining, >>>>>>>>> which adds fun. >>>>>>>>> >>>>>>>>>> That should be plenty of room for coefficients. >>>>>>>>> >>>>>>>>> But doesn't the FIR require a lot of cells for >>>>>>>>> the past data values? >>>>>>>> >>>>>>>> Like I said, I wrote this a long time ago, so I am not the resource to >>>>>>>> be asking. I can't picture how it works, but I specifically remember >>>>>>>> NOT needing to store previous data inputs. I am thinking you store >>>>>>>> outputs which will be fewer because of the decimation. As the input >>>>>>>> data comes in the output samples can be built up and when one is >>>>>>>> complete it is outputted and replaced by a new one. >>>>>>>> >>>>>>>> But read a proper reference. If I wasn't busy today I'd dig this up >>>>>>>> for you. >>>>>>> >>>>>>> Yes you do need to store N-1 previous inputs (plus 1 current input) for >>>>>>> a length N polyphase FIR decimation by M. Just think first principles: >>>>>>> for any output y[k], you need N inputs to compute it, even if k = n*M. >>>>>> >>>>>> I am pretty sure that conclusion is not correct. When I was asked to >>>>>> implement a polyphase filter that was the part I had trouble getting >>>>>> my head around but finally understood how it worked. At the time this >>>>>> was being done on a very early DSP chip with very limited memory, so >>>>>> storing the N-1 previous samples was not an option. The guy >>>>>> presenting the problem to me was the classic engineer who could read >>>>>> and understand things, but couldn't explain anything to other people. >>>>>> So I had to take the paper he gave me and finally figure it out for >>>>>> myself. >>>>>> >>>>>> I will dig around when I get a chance and figure out exactly how this >>>>>> worked. But the basic idea is that instead of saving multiple inputs >>>>>> and calculating the outputs one at a time (iterating over the stored >>>>>> inputs) - as each input comes in it is iterated over the stored >>>>>> outputs and each output is ... well output, when it has been fully >>>>>> calculated. Since there are fewer outputs than inputs (by the >>>>>> decimation factor of M) you store fewer outputs than you would inputs. >>>>>> >>>>>> This is not hard to understand if you just stop thinking it *has* to >>>>>> be the way you are currently seeing it. (shown here with simplified >>>>>> notation since it is too messy to show the indexes with decimation) >>>>>> >>>>>> y(k) = a(0)x(i) + a(1)x(i-1) + a(2)x(i-2) + ... >>>>>> y(k+1) = a(0)x(i+1) + a(1)x(i) + a(2)x(i-1) + ... >>>>>> y(k+2) = a(0)x(i+2) + a(1)x(i+1) + a(2)x(i) + ... >>>>>> y(k+3) = a(0)x(i+3) + a(1)x(i+2) + a(2)x(i+1) + ... >>>>>> >>>>>> Instead of thinking you have to calculate all the terms of y(k) at one >>>>>> time (and so store the last N*M values of x), consider that when x(i) >>>>>> arrives, you can calculate the x(i) terms and add them into y(k) >>>>>> (which is then output) and y(k+1), y(k+2)... This allows the storage >>>>>> of fewer values by a factor of M. It does require more memory fetches >>>>>> than the standard MAC calculation since you have to fetch not only the >>>>>> coefficient and the input, but also the intermediate output value >>>>>> being calculated on each MAC operation. >>>>>> >>>>>> Maybe I won't have to dig it up. I'm pretty sure this is how it >>>>>> worked. It ties in nicely with the "polyphase" aspect to allow a >>>>>> minimum number of calculations on arrival of each input since not all >>>>>> the coefficients are used on each input. So the coefficients are >>>>>> divided into "phases" (N/M coefficients) with only one phase used for >>>>>> any given input sample. >>>>> >>>>> Counter-example: Consider M = 8, N = 8 (8 coefficients and decimating by >>>>> 8). Then each output y(k) depends on x(k*M + n), 0 < n < M - 1, and thus >>>>> the sets of input values for different outputs are disjoint. >>>> >>>> Yes, so what? You still don't need to store any inputs to calculate >>>> the output. >>> >>> Where would they come from if they were not stored? >> >> OK I see what you mean now. You can do a running MAC as they come in. >> That is certainly true. > > Ok, glad we got past that.
Yeah, thanks for hanging with me, Rick. For some reason I kept on overlooking the key concept, which you plainly stated: Instead of thinking you have to calculate all the terms of y(k) AT ONE TIME (and so store the last N*M values of x), consider that when x(i) ... Mea culpa. -- Randy Yates, DSP/Embedded Firmware Developer Digital Signal Labs http://www.digitalsignallabs.com