Convolution
The most important operation in DSP
You know that an LTI system is completely described by its impulse response h[n]. You know that any input can be decomposed into shifted, scaled impulses. The final piece: how do you actually compute the output?
The answer is convolution. It is the single most important operation in digital signal processing. Every filter, every reverb effect, every smoothing algorithm, every matched detector: they all compute a convolution.
The formula is simple. The insight behind it is even simpler. Let's build it from two different directions and see why they meet at the same answer.
The Input-Side View
Think of it this way: each input sample is a tiny event, like dropping a pebble into water. When sample x[k] arrives at time k, it triggers the system's impulse response, but scaled by x[k] and shifted to start at time k.
So x[0] triggers x[0] · h[n]. Then x[1] triggers x[1] · h[n - 1] (same shape, shifted one step later). Then x[2] triggers x[2] · h[n - 2]. And so on.
The output is the sum of all these overlapping copies:
y[n] = ∑ₖ x[k] · h[n − k]
This is convolution. Each input sample creates a ripple (a scaled copy of h[n]). Where the ripples overlap, they add together. The output at any moment is the sum of all the ripples that are still ringing at that moment.
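The input-side view translates almost directly into code: loop over the input samples and accumulate each one's scaled, shifted copy of h. A minimal sketch in NumPy, using hypothetical example signals (not taken from the text):

```python
import numpy as np

# Hypothetical example signals: a short input and a decaying 3-tap
# impulse response.
x = np.array([1.0, 2.0, 0.5])
h = np.array([1.0, 0.5, 0.25])

# Input-side view: each input sample x[k] launches a scaled copy of h
# starting at time k; the output is the sum of all the ripples.
y = np.zeros(len(x) + len(h) - 1)
for k, xk in enumerate(x):
    y[k:k + len(h)] += xk * h   # the ripple launched by sample k

print(y)  # matches np.convolve(x, h)
```

Note that the loop never computes one output sample completely until the very end; every input sample spreads its contribution across several outputs at once.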
The Output-Side View: Flip and Slide
There's a completely different way to compute the same result. Instead of asking "what does each input sample contribute?", ask: "to get one output sample y[n], which input samples matter and how much does each contribute?"
The recipe:
- Flip the impulse response: reverse h[m] to get h[-m]
- Slide the flipped version to position n: h[n - m]
- Multiply point-by-point with x[m] wherever they overlap
- Sum the products. That sum is y[n].
Then slide to the next n and repeat. Each position gives you one output sample.
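The recipe above can be sketched directly, one output sample at a time. The function name `conv_at` and the example signals are hypothetical choices for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0, 0.5])
h = np.array([1.0, 0.5, 0.25])

def conv_at(n, x, h):
    """One output sample y[n]: flip h, slide it to position n,
    multiply with x where they overlap, and sum the products."""
    total = 0.0
    for m in range(len(x)):            # m indexes the input
        if 0 <= n - m < len(h):        # only where the signals overlap
            total += x[m] * h[n - m]   # flipped-and-shifted h
    return total

y = [conv_at(n, x, h) for n in range(len(x) + len(h) - 1)]
print(y)  # same values as np.convolve(x, h)
```

Unlike the input-side loop, this computes each y[n] completely before moving on, which is why it is the natural form for real-time filtering.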
Same Formula, Two Perspectives
The input-side view and the flip-and-slide view aren't two different operations. They're two ways to read the same sum:
y[n] = ∑ₖ x[k] · h[n − k]
- Input-side: Fix a k. As n varies, the product x[k] · h[n − k] traces out a scaled, shifted copy of h — the ripple launched by input sample k. The output is the sum of these ripples.
- Output-side: Fix an n. Across all k, the products x[k] · h[n − k] are the values where the flipped, slid impulse response overlaps the input. Summing them gives one output sample, y[n].
It's the same grid of numbers, just summed in a different order. Rows versus columns. The total is the same.
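The "same grid, two summation orders" idea can be made concrete by building the full table of products x[k] · h[n − k] and reading it both ways. The example signals are hypothetical:

```python
import numpy as np

x = np.array([1.0, 2.0, 0.5])
h = np.array([1.0, 0.5, 0.25])
N = len(x) + len(h) - 1

# grid[k, n] = x[k] * h[n - k], zero wherever h's index is out of range
grid = np.zeros((len(x), N))
for k in range(len(x)):
    grid[k, k:k + len(h)] = x[k] * h

# Input-side reading: each row k is a scaled, shifted copy of h.
# Output-side reading: each column n holds the overlapping products.
# Either way, collapsing over k gives the same output.
y = grid.sum(axis=0)
print(y)  # matches np.convolve(x, h)
```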
Properties Worth Knowing
- Output length: Convolving N samples with M samples produces N + M - 1 output samples. The "extra" M - 1 samples come from the impulse response ringing out after the last input sample.
- Commutativity: x * h = h * x. You get the same output regardless of which signal you call the "input" and which you call the "filter." This follows from a simple change of variable in the sum.
- Identity: x * δ = x. Convolving with an impulse leaves the signal unchanged. This is why h[n] is defined as the output when the input is δ[n].
- Associativity: (x * h1) * h2 = x * (h1 * h2). Cascading two systems is equivalent to convolving their impulse responses first, then filtering once.
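All four properties are easy to spot-check numerically with `np.convolve` and some hypothetical random signals:

```python
import numpy as np

rng = np.random.default_rng(0)
x  = rng.standard_normal(8)    # N = 8 input samples
h1 = rng.standard_normal(5)    # M = 5 filter taps
h2 = rng.standard_normal(3)

# Output length: N + M - 1
assert len(np.convolve(x, h1)) == len(x) + len(h1) - 1

# Commutativity: x * h = h * x
assert np.allclose(np.convolve(x, h1), np.convolve(h1, x))

# Identity: convolving with the unit impulse leaves x unchanged
delta = np.array([1.0])
assert np.allclose(np.convolve(x, delta), x)

# Associativity: cascade two filters vs. combine them first
lhs = np.convolve(np.convolve(x, h1), h2)
rhs = np.convolve(x, np.convolve(h1, h2))
assert np.allclose(lhs, rhs)
print("all properties check out")
```

Checks like these are no substitute for the change-of-variable proofs, but they are a quick sanity test when implementing convolution yourself.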
What Convolution Does
Different impulse responses produce different effects:
- h = [1/3, 1/3, 1/3] — 3-point moving average. Smooths the signal by averaging neighbors. A simple low-pass filter.
- h = [1, -1] — First difference. Enhances changes and suppresses constants. A simple high-pass filter.
- h = [1, 0, 0, 0.5] — Echo. Adds a delayed, quieter copy of the input to itself. Used in audio effects.
- h = δ[n] — Identity. The output equals the input. Convolution with an impulse leaves the signal unchanged.
The impulse response is the system's personality. Change h, and you change what the system does. The convolution formula stays the same.
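Applying the impulse responses listed above to a hypothetical test signal (a unit step) makes their personalities visible:

```python
import numpy as np

x = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 1.0])  # a step

smooth = np.convolve(x, [1/3, 1/3, 1/3])   # moving average: low-pass
diff   = np.convolve(x, [1, -1])           # first difference: high-pass
echo   = np.convolve(x, [1, 0, 0, 0.5])    # delayed, quieter copy added
same   = np.convolve(x, [1])               # delta: identity

print(smooth)  # the step's edge is rounded off
print(diff)    # nonzero only where the signal changes
```

The code never changes between effects; only the list of filter taps does. That is the practical payoff of "the impulse response is the system's personality."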
Looking Ahead
Convolution works, but it's expensive: convolving an N-sample input with an M-sample impulse response requires roughly N × M multiplications. For large signals and long filters, this adds up fast.
There is a remarkable shortcut. It turns out that convolution in the time domain corresponds to multiplication in the frequency domain. If you transform both signals to the frequency domain (using the DFT), multiply them point-by-point, and transform back, you get the same result as convolution, but much faster for large signals.
This is one of the key reasons the DFT matters, and it's what we'll explore in the next chapter.
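A preview of that shortcut, as a sketch with hypothetical random signals: zero-pad both signals to the full output length N + M − 1, multiply their DFTs point-by-point, and transform back.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(100)
h = rng.standard_normal(16)

L = len(x) + len(h) - 1                   # full convolution length
Y = np.fft.fft(x, L) * np.fft.fft(h, L)   # multiply in frequency domain
y_fast = np.fft.ifft(Y).real              # back to the time domain

y_direct = np.convolve(x, h)              # direct time-domain sum
print(np.allclose(y_fast, y_direct))
```

The zero-padding to length L matters: without it the DFT's inherent periodicity would wrap the tail of the output around to the beginning (circular convolution), a detail the next chapter takes up properly.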
Frequently Asked Questions
Is convolution the same as multiplication?
No. Convolution involves flipping, shifting, multiplying, and summing, which is much more involved than element-wise multiplication. However, there is a deep connection: convolution in the time domain is equivalent to multiplication in the frequency domain. This convolution theorem is one of the most important results in DSP and is the reason the FFT is so useful for filtering.
Why is the output longer than the input?
When the last input sample arrives, the impulse response has just been triggered. It takes M - 1 more time steps for that last copy to finish ringing out. So the output extends M - 1 samples beyond the last input, making the total length N + M - 1.
Does the order matter? Is x * h the same as h * x?
Yes, convolution is commutative: x[n] * h[n] = h[n] * x[n]. You get the same output regardless of which signal you treat as the input and which as the filter. This follows directly from the convolution formula by a change of summation variable.
Quick Check
Test your understanding of the key concepts from this lesson.