Reply by September 8, 2016
On Wednesday, August 31, 2016 at 8:24:33 PM UTC-7, Les Cargill wrote:

(snip)

> > r(k) = h(x(k) + n1(k)) + n2(k)
(snip)
> Is there a method involving deconvolution lurking under all that?
I hadn't thought of that at first, but now I think there might be. In any case, https://www.amazon.com/Deconvolution-Images-Spectra-Second-Engineering/dp/0486453251/ is a good reference on non-linear deconvolution. It is now a Dover paperback, and so very reasonably priced. (Or buy a used copy of the hardback.) I bought the hardback second edition soon after it came out, having previously used the first edition from the library.
Reply by Marcel Mueller September 6, 2016
On 31.08.16 19.39, Tim Wescott wrote:
> r(k) = h(x(k) + n1(k)) + n2(k)
>
> where r(k) is a known received signal, x(k) is a presumed input signal,
> and n1(k) and n2(k) are unknown noise. I'm taking a repeated signal for
> both x and r and averaging the snot out of them over the cycle, so the
> actual noise is presumed to be low. However, the system isn't exactly
> linear, so n2(k) has some "noise" that's actually nonlinear effects.
There is no exact solution for nonlinear parts.
> I want to put on my "easy math" blinders and assume that h is linear and
> time invariant (it is, sorta), and that the noise signals are white and
> Gaussian. Then I want to find the impulse response of h.
n1 and n2 are equivalent, since you can always find an n2' that includes the noise of n1 transformed by h. So you will not be able to distinguish them, and your equation reduces to

r(k) = h(x(k)) + n

Assuming that n is not correlated with x, you may simply repeat the measurement to separate h from n. In fact it is quite easy. Use a signal x(k) with the following properties:

amplitude(f) = f to the power of kappa, for f in [fmin, fmax]; 0 otherwise
phase(f) = 2 pi random()

The IFFT of the above returns a finite-length x(k). Choose kappa appropriate for your h, e.g. 0 for white noise and -1/2 for pink noise.

Now repeat x(k) for a long time, giving x(t), and *synchronously* sample r(t). Add every kmax-th sample together and divide by the number of repetitions (an averaging operation) so that you finally get r(k). Now

h(f) = FFT(r(k)) / FFT(x(k))

The above division is a complex, per-component vector division. To get the impulse response of h, just calculate

h(k) = IFFT(h(f))

The noise cancels in the averaging operation. Note that you do not need to make any assumptions about n here. Note also that you do not need a window function; in fact, you should not use one.

The theory behind this is that the cyclic x(t) makes its spectrum discrete, exactly matching the FFT bins of the following analysis. Assuming that n is uncorrelated means that its energy density in an infinitesimally small frequency interval is zero. So you need only measure long enough and n will almost vanish from your h result. The averaging operation on every kmax-th r(t) sample is mathematically equivalent to taking an FFT of the entire measurement r(t) and dropping every result that does not exactly fit a frequency in x(t) with non-zero energy; only the required computing power is significantly smaller. In fact a Raspberry Pi would easily manage real-time analysis at a 200 kHz sample rate and 64k FFT size.
Keep in mind that the information is kept in the /time axis/ (time is an invariant). The coherence needs to last for the entire measurement, so it is absolutely required that the *crystal oscillator* of the DAC for x(t) is *exactly the same* as the one for the ADC for r(t). As long as both are in the same component, this is likely. On the other hand, the oscillator is not required to be very stable; even cheap PC sound cards meet the needs. I have done high-precision loudspeaker measurements in the garden while trucks were passing, with no visible effect in the measured response.

In fact the above procedure theoretically works for any x(k) as long as it is repeated. The only drawback is that you only get reasonable results for frequencies that have reasonable energy in x(f). So your measured impulse response might be incorrect if x(f) is not well distributed.

Regarding non-linearities: a non-linear transformation will mix up frequencies (intermodulation). Terms appear at frequencies that are sums or differences of the frequencies used in x(f); second-order products have /two/ terms, third-order three, and so on. There is a quite simple way to get rid of the second-order terms: simply set the amplitudes in x(f) to 0 for all /even/ base frequencies. When all used frequencies in x(f) are odd, then any sum or difference of them is /even/ and no longer contributes to the result. Unfortunately there is no comparable trick for the third-order IM. Remember to double the FFT size (kmax) to compensate for the dropped frequencies.
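[A minimal numpy sketch of the procedure above: periodic excitation with prescribed magnitude and random phase, synchronous averaging, then per-bin spectral division. The period length, repetition count, simulated 4-tap system, and noise level are all invented for illustration.]

```python
import numpy as np

rng = np.random.default_rng(0)
kmax = 256    # period length = FFT size (made-up value)
n_rep = 200   # number of repetitions to average over

# x(k): unit magnitude (the kappa = 0 case) and random phase, via inverse FFT.
Xf = np.exp(2j * np.pi * rng.random(kmax // 2 + 1))
Xf[0] = Xf[-1] = 1.0          # DC and Nyquist bins of a real signal must be real
x = np.fft.irfft(Xf, n=kmax)

# Simulate the system: a short FIR h plus additive noise n.
h_true = np.array([1.0, 0.5, -0.25, 0.125])
r_t = np.convolve(np.tile(x, n_rep), h_true)[:kmax * n_rep]
r_t += 0.1 * rng.standard_normal(r_t.size)

# Synchronous averaging: fold r(t) back onto one period of kmax samples,
# dropping the first repetition, which contains the start-up transient.
r_avg = r_t.reshape(n_rep, kmax)[1:].mean(axis=0)

# h(f) = FFT(r) / FFT(x), then IFFT back to the impulse response h(k).
h_est = np.fft.irfft(np.fft.rfft(r_avg) / np.fft.rfft(x), n=kmax)
```

The odd-frequencies trick for second-order intermodulation would amount to zeroing the even bins of `Xf` before the inverse FFT (and skipping those bins in the division).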
> I'm currently doing this by absolute brute force. I'm making a matrix of
> delayed versions of x and making the matrix
>
>     [x1 0  0  ...]
>     [x2 x1 0  ...]
> X = [x3 x2 x1 ...]
>     [ ...        ]
?
> Then I'm calculating the least-mean-squared error solution to
>
> r = X * b
>
> Then I'm taking the vector b as the impulse response of h.
Although possible, you should not do this in the time domain.
> The way I'm doing it gives good results, but it seems exceedingly
> wasteful of computing resources.
Exactly.
> Is there a NAME for the FFT magic I'm trying to do, and is there a white
> paper (or a book chapter) on it that I can go read?
Maybe you can follow the method I mentioned above.

Marcel
Reply by Tim Wescott September 1, 2016
On Wed, 31 Aug 2016 22:52:05 -0700, SG wrote:

> On Wednesday, August 31, 2016 at 19:39:07 UTC+2, Tim Wescott wrote:
>>
>> The way I'm doing it gives good results, but it seems exceedingly
>> wasteful of computing resources. It not only seems that this should be
>> something that could be done handily using the FFT, but that I should
>> know how to do it. So far, I've failed. It ought to be some variation
>> of Wiener filtering -- but, as I've said, so far all I get when trying
>> to apply FFTs to the problem is absolute crap coming back.
>
> Did you try Welch's method for estimating the transfer function? (See
> Matlab's tfestimate; it's FFT-based.) It's fast and robust. But I
> suppose you might need more than 4096 samples if you expect the impulse
> response to span about 512 samples. You might need much more to let the
> method work on large blocks and have enough blocks for the averaging.
I will look that up. I was lacking a name to start searching on. The x(k) and r(k) are built up as averages around a number of detected pulses, so there's been some noise averaging done from the get-go.
>
> Another approach would be to window and zero-pad x and b so that the
> columns of X contain the same non-zero coefficients, only shifted. Then
> you could try to solve the normal equation system instead:
>
> A r = X^T b    with A = X^T X
>
> Due to the structure of X, A will be a symmetric Toeplitz matrix (and
> hopefully a positive definite one). For these kinds of matrices there
> are special O(n^2) solvers. Computing A can be done using an FFT since
> it contains the scaled auto-correlation of x. And X^T b is a
> cross-correlation between x and b, which can also be computed using the
> FFT.
>
> Though, numerically, this approach is worse as it basically involves
> squaring X. So, if the condition number of X is already bad, this is
> probably a bad idea. But it'll be faster. The condition number of A is
> basically close to the ratio of the highest to the lowest power spectral
> density of x. If your signal x is white or whitish then the condition
> number of A should be fine and you shouldn't have any numerical
> problems. Solving similar Toeplitz systems is what many speech coding
> people do for estimating the all-pole filter that turns a buzzing sound
> or noise into speech.
>
You have r and b backwards. I'm using

b = X \ r

in Scilab, which, when you hand it an over-determined system, finds the least-squares solution. I _think_ it's using singular value decomposition under the hood, but I'm not sure. Whether it's SVD or something else, I'm pretty sure it's not using the matrix-squared method.

--
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work! See my website if you're interested
http://www.wescottdesign.com
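[For reference, the brute-force fit Tim describes can be written in a few lines of numpy/scipy; `lstsq`, which is SVD-based, plays the role of Scilab's backslash. The white test signal, the 512-tap test filter, and the noise level are invented for the sketch.]

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(1)
N, M = 4096, 512                     # signal length and impulse-response length

x = rng.standard_normal(N)           # stand-in for the reconstructed input
h_true = np.exp(-0.05 * np.arange(M)) * np.cos(0.3 * np.arange(M))
r = np.convolve(x, h_true)[:N] + 0.01 * rng.standard_normal(N)

# Column j of X is x delayed by j samples, zero-filled at the top --
# the convolution matrix from the original post.
X = toeplitz(x, np.zeros(M))

# Least-squares solution of r = X * b, i.e. Scilab's  b = X \ r.
b, *_ = np.linalg.lstsq(X, r, rcond=None)
```

This is the O(N M^2) computation the thread is trying to avoid; the Toeplitz and FFT approaches discussed elsewhere in the thread get the same answer much more cheaply when x is well conditioned.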
Reply by Tim Wescott September 1, 2016
On Thu, 01 Sep 2016 12:50:07 +0000, Steve Pope wrote:

> On 09/01/2016 04:51 AM, Tim Wescott wrote:
>
>> I don't get to choose x(k). I've got a huge collection of raw signal
>> recordings from a customer, from which I'm trying to reconstruct
>> channel models. I suppose I should just do it by brute force and be
>> happy...
>
> Earlier you said the x(k) were "presumed", what does this mean? Are they
> known or unknown?
Deduced by demodulating a data stream, then reconstructing the transmitted pulse stream.
> Seems to me you need to know at least their second-order statistics to be
> able to decouple h from the characteristics already present in x(k).
To the extent that I correctly decode the data and get the bit timing right, I can reconstruct x(k). It's that damned "to the extent" that means that I can't entirely trust x(k).

--
Tim Wescott
Control systems, embedded software and circuit design
I'm looking for work! See my website if you're interested
http://www.wescottdesign.com
Reply by Steve Pope September 1, 2016
On 09/01/2016 04:51 AM, Tim Wescott wrote:

> I don't get to choose x(k). I've got a huge collection of raw signal
> recordings from a customer, from which I'm trying to reconstruct channel
> models. I suppose I should just do it by brute force and be happy...
Earlier you said the x(k) were "presumed" -- what does this mean? Are they known or unknown?

Seems to me you need to know at least their second-order statistics to be able to decouple h from the characteristics already present in x(k).

Steve
Reply by SG September 1, 2016
On Thursday, September 1, 2016 at 10:58:13 UTC+2, SG wrote:
> [...]
> In the spectral domain with k overlapping windows you should be able
> to treat all the frequencies in isolation. So, for some frequency you
> collect all the respective FFT bins into column vectors x, n1, n2, b.
> Then you have the following overdetermined equation system for each
> frequency bin:
>
>   (x + n1) r + n2 = b                              (1)
>
> where r is a complex scalar (unknown amplitude and phase response).
> These are k equations (due to k FFT blocks) with a single unknown r.
> Solving this in a least squares sense can be done by the corresponding
> normal equation:
>
>   (x + n1)^H (x + n1) r = (x + n1)^H (b - n2)      (2)
>
> Assuming n2 does not correlate with x nor with n1, we can ignore it:
>
>   (x + n1)^H (x + n1) r = (x + n1)^H b             (3)
>
> Assuming x does not correlate with n1, we can rewrite this into
>
>   (x^H x + n1^H n1) r = (x + n1)^H b               (4)
>
> This is a little nicer. n1^H n1 is basically proportional to the power
> spectral density of n1, which you might know or be able to estimate
> somehow. But this still leaves n1 on the right-hand side. I don't see
> an easy way to get rid of it -- unless you're fine with solving a
> nonlinear equation system that tries to recover n1 as well, constrained
> with n1^H n1 = some_power_level.
On second thought, this isn't nonlinear but underdetermined. You could select any n1 constrained to n1^H n1 = some_power_level, which gives you

         (x + n1)^H b
r = --------------------------
    (x^H x + some_power_level)

This tells you how well r is determined, since you don't know n1 but only its norm, sqrt(some_power_level). And restoring your use of variable names:

         (x + n1)^H r
h = --------------------------
    (x^H x + some_power_level)

Sorry for messing up r, h, b before. Your equation system confused me into thinking b was the received signal and r the impulse response.
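[In code, this per-bin regularized estimate looks roughly like the following numpy sketch. The block size, block count, noise level, and 3-tap test system are invented; `power` stands in for some_power_level, and since n1 itself is unknown, the numerator uses the presumed x alone while the noise power enters only the denominator, i.e. a Wiener/Tikhonov-style regularized spectral division.]

```python
import numpy as np

rng = np.random.default_rng(2)
nfft, nblocks = 256, 64

h_true = np.array([1.0, -0.4, 0.2])
x = rng.standard_normal(nfft * nblocks)
r = np.convolve(x, h_true)[:x.size] + 0.05 * rng.standard_normal(x.size)

# One FFT per block; column f of Xf / Rf collects bin f across the k blocks,
# i.e. the k-element vectors in the per-bin equation system above.
Xf = np.fft.rfft(x.reshape(nblocks, nfft), axis=1)
Rf = np.fft.rfft(r.reshape(nblocks, nfft), axis=1)

power = 1.0   # stand-in for some_power_level (n1^H n1, assumed known)

# Per-bin regularized least squares:  h(f) = x^H r / (x^H x + power)
H_est = (np.conj(Xf) * Rf).sum(axis=0) / ((np.abs(Xf) ** 2).sum(axis=0) + power)
```

With `power = 0` this reduces to the plain averaged cross-spectrum over power-spectrum estimate; the regularizer only matters in bins where x carries little energy.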
>
> Cheers!
> SG
Reply by SG September 1, 2016
On Thursday, September 1, 2016 at 7:52:22 AM UTC+2, SG wrote:
> [...]
> Another approach would be to window and zero-pad x and b so that the
> columns of X contain the same non-zero coefficients, only shifted. Then
> you could try to solve the normal equation system instead:
>
> A r = X^T b    with A = X^T X
>
> Due to the structure of X, A will be a symmetric Toeplitz matrix (and
> hopefully a positive definite one). For these kinds of matrices there
> are special O(n^2) solvers. Computing A can be done using an FFT since
> it contains the scaled auto-correlation of x. And X^T b is a
> cross-correlation between x and b, which can also be computed using the
> FFT.
>
> Though, numerically, this approach is worse as it basically involves
> squaring X. So, if the condition number of X is already bad, this is
> probably a bad idea. But it'll be faster. The condition number of A is
> basically close to the ratio of the highest to the lowest power
> spectral density of x. If your signal x is white or whitish then the
> condition number of A should be fine and you shouldn't have any
> numerical problems. Solving similar Toeplitz systems is what many
> speech coding people do for estimating the all-pole filter that turns
> a buzzing sound or noise into speech.
Warning: I might have gotten your b and r mixed up. I was assuming r to be your impulse response and b to be the convolved signal:

conv(x + n1, r) + n2 = b

And there might be a problem with not knowing n1. Your approach and the ones I was talking about basically ignore n1 completely. This may be a bad idea.

In the spectral domain with k overlapping windows you should be able to treat all the frequencies in isolation. So, for some frequency you collect all the respective FFT bins into column vectors x, n1, n2, b. Then you have the following overdetermined equation system for each frequency bin:

(x + n1) r + n2 = b                              (1)

where r is a complex scalar (unknown amplitude and phase response). These are k equations (due to k FFT blocks) with a single unknown r. Solving this in a least squares sense can be done by the corresponding normal equation:

(x + n1)^H (x + n1) r = (x + n1)^H (b - n2)      (2)

Assuming n2 does not correlate with x nor with n1, we can ignore it:

(x + n1)^H (x + n1) r = (x + n1)^H b             (3)

Assuming x does not correlate with n1, we can rewrite this into

(x^H x + n1^H n1) r = (x + n1)^H b               (4)

This is a little nicer. n1^H n1 is basically proportional to the power spectral density of n1, which you might know or be able to estimate somehow. But this still leaves n1 on the right-hand side. I don't see an easy way to get rid of it -- unless you're fine with solving a nonlinear equation system that tries to recover n1 as well, constrained with n1^H n1 = some_power_level.
> Cheers!
> SG
Reply by Steve Underwood September 1, 2016
On 09/01/2016 04:51 AM, Tim Wescott wrote:
> On Wed, 31 Aug 2016 18:07:45 +0000, Eric Jacobsen wrote:
>
>> On Wed, 31 Aug 2016 12:39:00 -0500, Tim Wescott
>> <seemywebsite@myfooter.really> wrote:
>>
>>> I've got a nice real-world problem that I've reduced down to the
>>> following homework-y sounding problem:
>>>
>>> Given a system h, I give it an excitation x in sampled time, and I get
>>> back
>>>
>>> r(k) = h(x(k) + n1(k)) + n2(k)
>>>
>>> where r(k) is a known received signal, x(k) is a presumed input signal,
>>> and n1(k) and n2(k) are unknown noise. I'm taking a repeated signal for
>>> both x and r and averaging the snot out of them over the cycle, so the
>>> actual noise is presumed to be low. However, the system isn't exactly
>>> linear, so n2(k) has some "noise" that's actually nonlinear effects.
>>>
>>> I want to put on my "easy math" blinders and assume that h is linear and
>>> time invariant (it is, sorta), and that the noise signals are white and
>>> Gaussian. Then I want to find the impulse response of h.
>>>
>>> I'm currently doing this by absolute brute force. I'm making a matrix
>>> of delayed versions of x and making the matrix
>>>
>>>     [x1 0  0  ...]
>>>     [x2 x1 0  ...]
>>> X = [x3 x2 x1 ...]
>>>     [ ...        ]
>>>
>>> Then I'm calculating the least-mean-squared error solution to
>>>
>>> r = X * b
>>>
>>> Then I'm taking the vector b as the impulse response of h.
>>>
>>> x and r are, at the moment, 4096 points long, and I'm getting 512 points
>>> for b. So I'm calculating the above equation for a 4096 x 512 point
>>> matrix.
>>>
>>> The way I'm doing it gives good results, but it seems exceedingly
>>> wasteful of computing resources. It not only seems that this should be
>>> something that could be done handily using the FFT, but that I should
>>> know how to do it. So far, I've failed. It ought to be some variation
>>> of Wiener filtering -- but, as I've said, so far all I get when trying
>>> to apply FFTs to the problem is absolute crap coming back.
>>>
>>> Is there a NAME for the FFT magic I'm trying to do, and is there a white
>>> paper (or a book chapter) on it that I can go read? I'm not quite at
>>> the point of buying a chain saw and attempting to make a living by
>>> carving tree trunks into cheezy-looking bears and selling them to
>>> tourists, but I'm getting close.
>>>
>>> --
>>> Tim Wescott
>>> Wescott Design Services
>>> http://www.wescottdesign.com
>>>
>>> I'm looking for work -- see my website!
>>
>> If you want a reasonable frequency response in place of the impulse
>> response, or to compute the impulse response, then if the system
>> cooperates you can sweep x(k) with a linear FM. If this is done
>> coherently so that the phase of x(k) is tracked across to r(k), then the
>> FT of r(k) will essentially be the impulse response.
>>
>> The idea is to do the function of a Vector Signal Analyzer, where x(k)
>> is the swept stimulus and r(k) is the measured output. This does
>> assume that n1 and n2 are not screwing things up too badly, and that
>> h(k) is reasonably LTI.
>
> I don't get to choose x(k). I've got a huge collection of raw signal
> recordings from a customer, from which I'm trying to reconstruct channel
> models. I suppose I should just do it by brute force and be happy...
How broadband are those stimuli, either individually or when you look at their sum? That's usually what determines how well you can characterise the system.

Regards,
Steve
Reply by SG September 1, 2016
On Wednesday, August 31, 2016 at 19:39:07 UTC+2, Tim Wescott wrote:
>
> The way I'm doing it gives good results, but it seems exceedingly
> wasteful of computing resources. It not only seems that this should be
> something that could be done handily using the FFT, but that I should
> know how to do it. So far, I've failed. It ought to be some variation
> of Wiener filtering -- but, as I've said, so far all I get when trying to
> apply FFTs to the problem is absolute crap coming back.
Did you try Welch's method for estimating the transfer function? (See Matlab's tfestimate; it's FFT-based.) It's fast and robust. But I suppose you might need more than 4096 samples if you expect the impulse response to span about 512 samples. You might need much more to let the method work on large blocks and have enough blocks for the averaging.

Another approach would be to window and zero-pad x and b so that the columns of X contain the same non-zero coefficients, only shifted. Then you could try to solve the normal equation system instead:

A r = X^T b    with A = X^T X

Due to the structure of X, A will be a symmetric Toeplitz matrix (and hopefully a positive definite one). For these kinds of matrices there are special O(n^2) solvers. Computing A can be done using an FFT, since it contains the scaled auto-correlation of x. And X^T b is a cross-correlation between x and b, which can also be computed using the FFT.

Though, numerically, this approach is worse, as it basically involves squaring X. So, if the condition number of X is already bad, this is probably a bad idea. But it'll be faster. The condition number of A is basically close to the ratio of the highest to the lowest power spectral density of x. If your signal x is white or whitish, then the condition number of A should be fine and you shouldn't have any numerical problems. Solving similar Toeplitz systems is what many speech coding people do for estimating the all-pole filter that turns a buzzing sound or noise into speech.

Cheers!
SG
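[The tfestimate approach is just the ratio of the Welch-averaged cross spectrum to the Welch-averaged input power spectrum (the "H1" estimator). A scipy equivalent, with an invented 3-tap test system and noise level, looks like this:]

```python
import numpy as np
from scipy.signal import csd, welch

rng = np.random.default_rng(3)

x = rng.standard_normal(1 << 16)                 # whitish excitation
h_true = np.array([0.8, 0.3, -0.1])              # invented test system
r = np.convolve(x, h_true)[:x.size] + 0.05 * rng.standard_normal(x.size)

# Welch-averaged cross spectrum and input power spectrum over many
# overlapping, windowed blocks; their ratio is the H1 transfer-function
# estimate (what Matlab's tfestimate computes by default).
nperseg = 1024
_, Pxr = csd(x, r, nperseg=nperseg)
_, Pxx = welch(x, nperseg=nperseg)

H_est = Pxr / Pxx
h_est = np.fft.irfft(H_est)      # impulse-response estimate, length nperseg
```

The density scaling and window factors are identical in numerator and denominator, so they cancel in the ratio; the averaging over blocks is what suppresses the uncorrelated noise.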
Reply by Les Cargill September 1, 2016
Tim Wescott wrote:
> I've got a nice real-world problem that I've reduced down to the
> following homework-y sounding problem:
>
> Given a system h, I give it an excitation x in sampled time, and I get
> back
>
> r(k) = h(x(k) + n1(k)) + n2(k)
>
> where r(k) is a known received signal, x(k) is a presumed input signal,
> and n1(k) and n2(k) are unknown noise. I'm taking a repeated signal for
> both x and r and averaging the snot out of them over the cycle, so the
> actual noise is presumed to be low. However, the system isn't exactly
> linear, so n2(k) has some "noise" that's actually nonlinear effects.
>
> I want to put on my "easy math" blinders and assume that h is linear and
> time invariant (it is, sorta), and that the noise signals are white and
> Gaussian. Then I want to find the impulse response of h.
>
> I'm currently doing this by absolute brute force. I'm making a matrix of
> delayed versions of x and making the matrix
>
>     [x1 0  0  ...]
>     [x2 x1 0  ...]
> X = [x3 x2 x1 ...]
>     [ ...        ]
>
> Then I'm calculating the least-mean-squared error solution to
>
> r = X * b
>
> Then I'm taking the vector b as the impulse response of h.
>
> x and r are, at the moment, 4096 points long, and I'm getting 512 points
> for b. So I'm calculating the above equation for a 4096 x 512 point
> matrix.
>
> The way I'm doing it gives good results, but it seems exceedingly
> wasteful of computing resources. It not only seems that this should be
> something that could be done handily using the FFT, but that I should
> know how to do it. So far, I've failed. It ought to be some variation
> of Wiener filtering -- but, as I've said, so far all I get when trying to
> apply FFTs to the problem is absolute crap coming back.
>
> Is there a NAME for the FFT magic I'm trying to do, and is there a white
> paper (or a book chapter) on it that I can go read?
> I'm not quite at the point of buying a chain saw and attempting to make
> a living by carving tree trunks into cheezy-looking bears and selling
> them to tourists, but I'm getting close.
Is there a method involving deconvolution lurking under all that?

--
Les Cargill