This may be a broad question, but I am trying to understand how to obtain the impulse response of a room. I am fairly familiar with DSP, FIR filters, and the underlying theory, but I need to write some code to measure a room impulse response.
If you could point me to any particular articles, that would be helpful.
Unfortunately, an impulse is an impractical signal for exciting a system (infinite amplitude, infinitesimally short duration, etc.).
Your best bet is to excite the room with a broadband signal (a chirp or white noise) and extract the transfer function from the recording. So, if your recorded signal is y[n] and the input is x[n], the transfer function is H(z)=Y(z)/X(z).
Notice that if your excitation signal has any nulls in the frequency domain (or even a point of smaller response), your estimate will be off at those frequencies, because Y(z) contains noise that the division amplifies. You should instead apply some kind of regularization. For example
where eps is a small number. The larger eps is, the more attenuation you'll get at those frequencies (compared to the "true" impulse response), but the better noise immunity you'll have.
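As a rough illustration of the regularized division described above, here is a minimal Python/NumPy sketch. It assumes the common form H = Y·conj(X) / (|X|² + eps), which may not be exactly the expression intended here; the function name, the toy filter `h_true`, and the noise level are made up for the example.

```python
import numpy as np

def estimate_ir_regularized(x, y, eps=1e-3, n_fft=None):
    """Estimate an impulse response from input x and recording y.

    Assumed form: H = Y * conj(X) / (|X|^2 + eps).
    """
    if n_fft is None:
        # Zero-pad so the circular (FFT) product behaves like a linear one.
        n_fft = len(x) + len(y)
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(y, n_fft)
    H = Y * np.conj(X) / (np.abs(X) ** 2 + eps)
    return np.fft.irfft(H, n_fft)

# Toy check: a short known "room" excited with white noise, plus sensor noise.
rng = np.random.default_rng(0)
h_true = np.array([1.0, 0.5, -0.25, 0.125])
x = rng.standard_normal(4096)
y = np.convolve(x, h_true)
y = y + 1e-3 * rng.standard_normal(len(y))

h_est = estimate_ir_regularized(x, y, eps=1e-3)[:len(h_true)]
print(np.round(h_est, 3))
```

With eps much smaller than |X|² the estimate is close to the plain division; raising eps trades accuracy at weak frequencies for lower noise gain.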
A good approach is to record y several times and average the responses; this should help reduce some of the noise. Alternatively, you could record a long excitation signal (e.g. a slowly rising chirp, or a long sequence of pseudo-random samples).
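To show why averaging helps, here is a small NumPy sketch. It assumes the repeated recordings are time-aligned and that the noise is independent from take to take; the "clean" signal, the noise level, and the number of takes `K` are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical noise-free response that every take shares.
clean = np.sin(2 * np.pi * np.arange(1000) / 50)

K = 16  # number of repeated, time-aligned recordings (assumed)
recordings = clean + 0.5 * rng.standard_normal((K, clean.size))

# Synchronous average across takes.
averaged = recordings.mean(axis=0)

err_single = np.std(recordings[0] - clean)
err_avg = np.std(averaged - clean)
print(err_single, err_avg)
```

For independent noise the residual should shrink roughly as 1/sqrt(K), so 16 takes buy about a factor of 4.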
For humor's sake.
In the EE department at Columbia University, there was a great black and white picture, in the hallway, of a faculty member and some students on the stage of an auditorium.
The excitation device used to make the impulse response measurement was a small cannon that was being primed for firing.
This blog post seems to cover some of the basics:
Or you might go straight to one of the Farina papers:
For measuring room acoustics in a controlled environment (where you get to choose the excitation), (multiple) logarithmically swept sines seem to be universally used due to the (usually) beneficial frequency distribution and the robustness against small non-linearities. Back when I was a student, MLS sequences were still a thing due to computational simplicity (avoiding multiplications in the FFT-like deconvolution).
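For anyone curious what the MLS approach looks like, here is a minimal NumPy sketch. It assumes a 10-bit shift register with the recurrence a[n] = a[n-10] XOR a[n-7], steady-state periodic excitation, and a noise-free toy response; it also uses ordinary FFT-based circular correlation rather than the multiplication-free transform alluded to above. All names are made up for the example.

```python
import numpy as np

def mls(n_bits=10, tap=7):
    """One period (2**n_bits - 1 samples) of a +/-1 maximal length sequence.

    Assumes the recurrence a[n] = a[n - n_bits] XOR a[n - tap] is primitive,
    which holds for the default (10, 7) pair.
    """
    period = 2 ** n_bits - 1
    bits = np.ones(period + n_bits, dtype=int)  # all-ones seed
    for n in range(n_bits, len(bits)):
        bits[n] = bits[n - n_bits] ^ bits[n - tap]
    return 1 - 2 * bits[n_bits:]  # map 0 -> +1, 1 -> -1

m = mls()
P = len(m)
M = np.fft.fft(m)

# Circular autocorrelation: P at lag 0 and -1 at every other lag.
autocorr = np.rint(np.fft.ifft(np.abs(M) ** 2).real).astype(int)

# Toy system, shorter than one MLS period, excited periodically.
h_true = np.zeros(P)
h_true[[0, 5, 20]] = [1.0, 0.5, -0.25]
y = np.fft.ifft(M * np.fft.fft(h_true)).real  # one steady-state period

# Circular cross-correlation of the recording with the MLS.
h_est = np.fft.ifft(np.conj(M) * np.fft.fft(y)).real / (P + 1)
```

The estimate carries a small constant offset of about sum(h)/(P+1), which is the price of the -1 off-peak autocorrelation.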
Probably the best description for this is "Linear System Identification," and you will find lots of solutions to this particular question.
Here's a frequency domain method using cross-correlation between a measured response and a known test signal, with MATLAB code for implementing it. Make sure your test signal has all the frequency content you care about and a narrow auto-correlation! A sine sweep is probably a good choice. Once you have your complex frequency response, you can take the inverse FFT to get back the impulse response.
You can also do the above process in the time domain using replica correlation and windowing.
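A minimal time-domain version of the replica-correlation idea might look like the sketch below (Python/NumPy rather than MATLAB). It assumes a white, random-sign test signal so that its auto-correlation is close to a single spike, and it "windows" simply by keeping the first `ir_len` non-negative lags; the function name and toy filter are made up.

```python
import numpy as np

def replica_correlate(x, y, ir_len):
    """Cross-correlate recording y with the known test signal x.

    Returns lags 0..ir_len-1 of r_xy[k] = sum_n x[n] * y[n + k],
    normalized by the energy of x.
    """
    n_fft = len(x) + len(y)  # long enough to avoid circular wrap-around
    X = np.fft.rfft(x, n_fft)
    Y = np.fft.rfft(y, n_fft)
    r_xy = np.fft.irfft(np.conj(X) * Y, n_fft)
    return r_xy[:ir_len] / np.dot(x, x)

rng = np.random.default_rng(2)
x = rng.choice([-1.0, 1.0], size=2 ** 15)  # white test signal (assumed)
h_true = np.array([1.0, 0.5, -0.25, 0.125])
y = np.convolve(x, h_true)

h_est = replica_correlate(x, y, ir_len=len(h_true))
print(np.round(h_est, 3))
```

The leftover error comes from the test signal's auto-correlation not being a perfect spike, which is why the narrow auto-correlation matters.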
For an adaptive estimate, you can use an LMS filter to estimate your unknown system. You can also experiment with putting other types of adaptive filters in place of the LMS filter.
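As a sketch of the adaptive route, here is a plain LMS system identifier in Python/NumPy. The step size `mu`, the tap count, and the toy "room" are assumptions picked so that the example converges; a real measurement would need these tuned, and a normalized variant would divide the update by the power in the buffer.

```python
import numpy as np

def lms_identify(x, d, n_taps, mu=0.01):
    """Adapt an FIR filter w so that w applied to x tracks d (plain LMS)."""
    w = np.zeros(n_taps)
    buf = np.zeros(n_taps)  # buf[0] holds the newest input sample
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        e = d[n] - w @ buf      # error between measured and modelled output
        w = w + mu * e * buf    # stochastic-gradient update
    return w

rng = np.random.default_rng(3)
h_true = np.array([0.8, -0.4, 0.2, 0.1, -0.05, 0.03, 0.0, 0.01])
x = rng.standard_normal(20000)          # unit-variance white excitation
d = np.convolve(x, h_true)[:len(x)]     # noise-free "recording"

w = lms_identify(x, d, n_taps=len(h_true))
print(np.round(w, 3))
```

The final weights `w` are the impulse response estimate; with measurement noise they would fluctuate around the true taps by an amount that grows with `mu`.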
In all of these cases you need to take into account that this will give you the impulse response of not just the room, but the whole system. If possible, you should try to isolate the other parts of your system that contribute to the impulse response, like the test source, microphone, and electronics.
Many, many years back I worked with Bill Farrow of Farrow interpolator fame. He worked out an incredibly clever scheme to get an LMS filter to converge extremely fast.
The details of his scheme were published by Jack Salz in the BSTJ. (There is a not-so-funny story about why Bill is not a co-author, but rather only mentioned in the paper.)
The basic idea was to have an LMS filter that is exactly the same length as the Maximal Length Sequence (MLS) (2^N-1 samples), which needs to be longer than the impulse response to be measured. The MLS (±1 values) is used to excite the DUT and is also sent as the input to the FIR of the LMS. However, the update to the LMS has 1 added to the excitation term, so that the updates are made with 0 and 2 instead of ±1. The filter will be completely converged (matrix inverted) after exactly two iterations of the MLS. Any residual error comes from other noise sources in the environment and from a sequence that is too short for the response being measured; otherwise the convergence is exact.
There are loads of tools out there already that you can use as a learning framework for IR measurement. These days I prefer to use Matlab to do IR measurement (since I tend to do further analysis in Matlab anyway). If you're a Matlab user, there are many options, for example:
- ITA Toolbox
- IoSR Toolbox
- MikIRAM utility (disclaimer - this is my own toolbox, freely available on github)
My advice would be to download one or more of these and dig into the underlying code. The AKTools and ITA toolboxes in particular have plenty of further documentation/references as well to follow up.
This is not meant to sour your endeavors or any advice given here (which is mostly excellent). It's unclear what your end purpose is for doing this, and that leaves out critical details that are needed for a solution. In my VERY limited experience on the subject, perhaps the biggest obstacle in achieving your desired goal is the instrumentation: namely speakers and microphones.
This is but one of the myriad details regarding recording and reproduction of audible sound. There are many that have appealing aesthetic value, but very little science to back up their claims. Years ago (when I worked at a HiFi store while in college in the '70s), one of the "leading" philosophies of good speaker sound was Panasonic's "Linear Phase" series. Their claim to fame was that their speakers reproduced the phases of the acoustics faithfully, as enjoyed by a live audience at the recording. Of course, this assumes that the recording was done faithfully too, which is arguable at best.
David Beatty in Kansas City years ago had a famous sound studio for his retail store. It was famous the world over. Like any good studio, it was devoid of objects that would interfere with "good" acoustics. That is also of primary importance in reproduction.
My point is: if the room's acoustic properties are ideal, and the speakers are "full fidelity", including their placement in perfect "harmony" with the room, AND the microphones, cabling, and audio amplification are all "perfect", you might come close to achieving your goal. But everything in the process has its limitations, which can have a significant impact on the resulting frequency response, mostly due to the inherent low-pass filtering in each step of the process. There are encyclopedias on the subject, which I have only touched on.
The best you can achieve is the trimmed response of one specific location in a room where the microphone is located, and it's very dependent on that microphone's fidelity.
But that's not to say it can't be meaningful, just not as generally applicable as one might hope.
Not directly answering your question in terms of algorithms, but in terms of a really good measurement tool and a fairly extensive writeup check out REW (Room Eq Wizard) https://www.roomeqwizard.com/ (it's free to use).
Thank you for your responses. By the way, I don't have MATLAB, but I have Octave.
I was wondering if your code will be compatible with Octave? Or do you know of any tools that are written for Octave?
I have REW, but I'll look into some of the literature from the REW forum.
We do that a lot, and we usually excite the room with a log sine sweep followed by a pause and a pink noise signal of some seconds, and record that with a microphone (obviously).
We use the log sine sweep to calculate the impulse response directly (I think it is called the Farina method). The advantage is that the influence of harmonic distortions in the measurement path is moved to negative time and can be cut off.
Still, sometimes the loudspeaker has trouble with high energy signals in the higher frequencies and attenuates them for protection, which causes all kinds of trouble, so if we are not sure, we use the pink noise signal to adapt a filter (NLMS). If the two results compare well, we probably got it right...
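A minimal sketch of the log-sweep deconvolution in Python/NumPy, for illustration only. It assumes the usual exponential sweep formula and an inverse filter built as the time-reversed sweep with a decaying exponential envelope; the sample rate, band edges, and the simulated two-tap "room" are arbitrary, and this is not necessarily how our own tooling does it.

```python
import numpy as np

def exp_sweep(f1, f2, duration, fs):
    """Exponential (log) sine sweep and a matching inverse filter."""
    t = np.arange(int(duration * fs)) / fs
    L = np.log(f2 / f1)
    sweep = np.sin(2 * np.pi * f1 * duration / L * (np.exp(t * L / duration) - 1.0))
    # Time-reversed sweep; the envelope compensates the sweep's 1/f energy.
    inverse = sweep[::-1] * np.exp(-t * L / duration)
    return sweep, inverse

def fft_convolve(a, b):
    """Linear convolution via zero-padded FFTs."""
    n = len(a) + len(b) - 1
    return np.fft.irfft(np.fft.rfft(a, n) * np.fft.rfft(b, n), n)

fs = 8000
sweep, inverse = exp_sweep(50.0, 3500.0, 2.0, fs)

# Simulated linear "room": direct sound at 10 samples, echo at 50 samples.
h_true = np.zeros(64)
h_true[10], h_true[50] = 1.0, 0.5
recorded = fft_convolve(sweep, h_true)

full = fft_convolve(recorded, inverse)
start = len(sweep) - 1  # zero-delay position of the linear response
# Samples before `start` are "negative time", where distortion products land.
ir = full[start:start + len(h_true)]
ir = ir / np.max(np.abs(fft_convolve(sweep, inverse)))  # normalize
```

The recovered `ir` is band-limited to the sweep range, so the peaks are slightly smeared rather than perfect single-sample spikes.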
Dudelsound gives an excellent answer. First, swept sine and MLS (pseudo-random white noise) techniques emerged over the years as widely accepted methods, and second (equally important), it's always a sound engineering approach to combine more than one method and take a consensus.
Since you've posed a broad question, I thought I might add a little historical perspective. Here is an interesting and not-too-dated read on one of the early successful swept sine methods:
Ariel's SYSId was another, very popular back in the day. People still send e-mails to Signalogic asking whether we have old Ariel DSP-16 boards.
And not to be forgotten, MLSSA was the first commercially useful MLS product, from DRA Labs, which I think is still operating today.
The original papers on these methods give insight into pros and cons of each. No method is a one-size-fits-all.