I had a few questions regarding matrix arithmetic in DSP, (specifically in regards to convolution and deconvolution), I was hoping someone could answer.
As a forewarning, I apologize for any incorrect notation / terminology I may use, as my understanding of matrix arithmetic in regards to DSP is practically non-existent.
Question 1: What is the most common way of representing an audio waveform as a matrix?
I assumed that an audio waveform would be represented by an N x M matrix, where N = sample data and M = (bit-depth * channel count).
Example: assuming we had a 16 bit mono audio signal (M = 16 * 1 = 16), our matrix would look something like the following:
[0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 1]
[0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0]
[0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 1]
[...and so on]
...yet it seems that MatLab represent an audio signal as an N x M matrix, where N = channel count and M = sample data.
Example: assuming we once again had a 16 bit mono audio signal, our matrix would look something like the following:
[0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 1 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 1 1 1 1 1...and so on]
Question 2: When performing matrix arithmetic with audio waveforms, is it more common to represent sample data in binary or decimal format?
Question 3: What are the equivalent DSP operations for the following matrix operations (if an equivalent DSP operation exists)?
a. Matrix addition
b. Matrix subtraction
c. Matrix multiplication
d. Matrix inversion
e. Matrix transposition
Question 4: As far as convolution is concerned, can I assume that using something like matrix multiplication to represent convolution won't work unless we alter one of our matrices?
For instance, let's say we have two mono audio signals.
With signal 1 (S1) equal to our IR, and signal 2 (S2) equal to the audio signal we wish to convolve.
S1 = IR = [1 2 3 4]
S2 = signal to convolve = [5 6 7 8]
Simply performing the matrix multiplication of S1 and S2 would result in:
[5 6 7 8] x  = [(5 * 1 = 5) + (6 * 2 = 12) + (7 * 3 = 21) + (8 * 4 = 32)] = 70
...which would not be equivalent to convolution.
Instead, in order to perform convolution, we'd need to take each sample from S2 and multiply it by each sample of S1 while also performing a shift.
So rather than represent our IR as:
...we'd need to represent our IR as:
[1 2 3 4 0 0 0]
[0 1 2 3 4 0 0]
[0 0 1 2 3 4 0]
[0 0 0 1 2 3 4]
Representing our IR in this form allows us to properly perform convolution via a simple matrix multiplication, resulting in:
[5 6 7 8] x [1 2 3 4 0 0 0] = [(5 * 1) (5 * 2) (5 * 3) (5 *4) (5 * 0) (5 * 0) (5 * 0)]
[0 1 2 3 4 0 0] [(6 * 0) (6 * 1) (6 * 2) (6 * 3) (6 * 4) (6 * 0) (6 * 0)]
[0 0 1 2 3 4 0] [(7 * 0) (7 * 0) (7 * 1) (7 * 2) (7 * 3) (7 * 4) (7 * 0)]
[0 0 0 1 2 3 4] [(8 * 0) (8 * 0) (8 * 0) (8 * 1) (8 * 2) (8 * 3) (8 * 4)]
[(5) (10 + 6 = 16) (15 + 12 + 7 = 34) (20 + 18 + 14 + 8 = 60) (24 + 21 + 16 = 61) (28 + 24 = 52) (32)]
Question 5: Given that expanding our IR in the manner detailed in question 4 will results in massive matrix sizes, are there any optimizations I could use to limit my IR matrix size (or at the very least, decrease my calculation times)?
Note: I've read about separable 2D convolution:
...but was wondering if there was anything else more efficient?
Question 7: As far as deconvolution is concerned, can I assume that we'd need to perform matrix division (also known as multiplying by the inverse of the IR)?
If not, how would I perform deconvolution using matrix arithmetic?
Audio is a 1D signal... a series of numbers.
Your approach is not correct... you can't convert a 1D signal into a 2D signal by using the bit patterns as a second dimension. A 2D signal is an image, where you have a grid of rows and columns of pixels, with each pixel being a number.
A good part of present day DSP was developed by those that like extensive mathematics, such as matrix theory. If you already know matrix math, understanding DSP in this context is natural and easy. But if you aren't already an expert in matrix techniques, forget about using it to understand DSP. There are far easier ways that work just as well.
Indeed Audio signal is 1D but an image is 2D array of pixels
1D signals can be conveniently processed in some tools as 2D array with each channel independent of others.
Matlab was named from the words Matrix-Lab but it is equally a tool for processing 1D and 2D arrays.
2D arrays look same as a 2D matrix but the concept and computations are quite different.
Thank you for your input.
Just to get another opinion on the matter, was I wrong to convert my IR to a 2D matrix / array (in order to replicate the effects of convolution / deconvolution)?
The reason I ask is this: I'm forced to use matrix multiplication in a certain project, and am trying to merge some DSP concepts into a completely different domain (which is why my questions seem a bit off / confusing).
That said, do you know of any way to perform convolution / deconvolution using a 1D signal and 1D IR through the use of matrix multiplication?
I only ask, as I can see representing an IR as a 2D matrix will greatly decrease my computational efficiency compared to performing the same operation (i.e. matrix multiplication) using a 1D IR.
convolution is dead easy. you get sum of products as you slide your audio signal "underneath" the fixed IR.Deconvolution is not that easy as we have lost the filtered part. I don't know if matrix computation helps here.
I always thought there was a filtration of frequency spectrum for both convolution and deconvolution.
Is that not the case?
In other words, if the signal I wish to convolve has a frequency range of 100 Hz to 18 KHz, while my IR has a frequency range of 150 Hz to 17.5 KHz, after convolving my signal by my IR, I'd have a resultant convolved signal with a frequency range of 150 Hz to 17.5 KHz.
In other words, when performing convolution using an IR, only the frequencies within the IR are boosted in the signal we wish to convolve.
All other frequencies (not in the IR) are lost in the resulting convolved signal.
It seems however that you're indicating that this filtration effect only happens for deconvolution, but not convolution.
you filter in time domain by applying convolution.
reversal is deconvolution??
you first move to this domain then you can filter in frequency domain by direct multiplication then reverse to time domain
deconvolution will be division in frequency domain??
I can show that (circular) convolution in time domain is equivalent to multiplication in frequency domain.
I haven't experience on deconvolution. All I can say is that you cannot reverse efficiently as input is filtered and the filtered area could have been anything...
Thank you as always for your input.
Though I agree with your point, what I was really trying to do was represent the effect of convolution / deconvolution through matrix multiplication.
In order to do that, I didn't see a way of being able to keep the IR as a 1D signal / array / matrix. As the effect of convolution wasn't maintained when a 1D signal was convolved by a 1D IR (instead I had to use a 1D signal convolved by a 2D IR).
The reason for using matrix multiplication is that it's a required operation in a project that I'm working on.
I suppose with my subsequent posts on this forum, that I should include a clearer forewarning that what I'm really trying to do is merge some aspects of DSP into a completely different domain.
Apologies if I was a little unclear / ambiguous.
Your thoughts on signal streams as matrix seems wrong. Signal streams are 2D arrays and this is not same as matrix. I have never used matrix computations in my work so can't help much with regard to your questions which seem wrong type of questions.
First and foremost, you seem stuck on representing the streams as binary. In amath-land, each channel of an audio stream would be represented as a vector of **real** numbers. In a DSP application you'd represent them as vectors of floating-point or fixed-point numbers, depending on what design decisions you'd made about your processor. It would be reasonable to represent a stereo signal as a 2xN matrix.
If you're doing it on a computer, you do it in the format that the computer understands -- certainly not decimal, unless it's a really OLD computer and you're doing the work in COBOL.
Question 3: The equivalent operations are:
a. Matrix addition
b. Matrix subtraction
c. Matrix multiplication
d. Matrix inversion
e. Matrix transposition
Somehow, I think you're lost.
Question 4 & part of Question 5:
Your representation of convolution as a matrix operation is correct. As you can see, it's not the most numerically efficient way of getting the job done. There are **lots and lots** of algorithms in the literature for performing convolution. The two most common are vector dot-products with sliding indexes and FFT methods (in fact, a fixed-point DSP chip isn't worth the name if it can't do a vector dot product at a rate of one tap per clock cycle).
The rest of Question 5:
You generally don't need to do 2D convolution on audio data.
Uh -- right. Counting is a thing.
Deconvolution, or at least the things that you'd do using deconvolution, is a large area of study in itself. **In general**, it's done by computing an inverse filter using one of a wide variety of methods (which may include matrix inversion), and then applying that inverse filter using convolution. **In general** it's not exact, because the data isn't complete -- you're trying to undo a filter, and filters remove information. Because filters remove information, part of deconvolution is figuring out how much you want to try to reconstruct, and how much you need to leave by the wayside.
Thanks for getting back to me.
As for being lost, yes and no (there's an important reason why I'm using matrix multiplication - more on that later).
That said, I was hoping I could confirm a few things with you.
1. With regards to my use of matrix multiplication to perform convolution; are you absolutely certain that my representation of convolution is correct?
The reason I ask is that during convolution (as well as deconvolution), there's a filtration of frequency information that occurs as a by-product of the convolution process (ex. in situations where frequency information in a signal to be convolved, does not exist in the IR which will be convolving that same signal).
In other words, can I assume that when performing convolution using matrix multiplication, that the resultant 1 dimensional matrix will be missing the appropriate frequency information (in the same way if I had used an FFT or vector dot product)?
2. Counting is overrated.
3. With regards to deconvolution, from your post it sounds like simply taking the inverse of a matrix (ex. matrix to the -1 power) and then performing a matrix multiplication is the way to go.
That said, you also mention that we aren't filtering frequency data (as a by-product of deconvolution). Can I assume that this is the case for both convolution and deconvolution (as your response to question 4 seemed to indicate that wasn't the case for convolution but was the case for deconvolution - when answering question 7)?
4. Assuming a hypothetical situation where you were forced to use matrix multiplication to perform deconvolution, how would you simulate the effect of frequency filtration?
5. Assuming that there isn't an easy way of simulating frequency filtration (for either convolution or deconvolution), would a simple workaround be to ensure that both the signal to be convolved and IR, had the same frequency information?
In other words, if we could choose both our IR and audio signal to convolve (ensuring that the frequency information for both matched) ahead of time, could we then perform convolution and deconvolution in a manner which exactly matches that of using an FFT to perform convolution and deconvolution, but by using matrix multiplication?
u = [u1, u2, u3, u4]
v = [v1, v2, v3]
u can be represented as a matrix
A = [u1, 0, 0
u2, u1, 0
u3, u2, u1
u4, u3, u2
0, u4, u3
0, 0, u4]
u * v = A v