DSPRelated.com
Forums

Reading Large Files and Wavread

Started by afir...@gmail.com December 5, 2009
Hello, this is my first post on these forums, so please bear with me. I'm trying to read relatively large audio files (.wav format, ~1 GB in size) into MATLAB. However, I run out of memory long before the full file is read.

Right now it seems to me that the best solution would be reading in chunks of the signal, manually downsampling, then analyzing them separately, although this seems to be a rather inelegant solution.

The maximum frequency component of the signal I'd be interested in would be ~10 kHz, while wavread returns a signal sampled at ~50 kHz and at 16 bits per sample. I couldn't find any way to change the sampling frequency or number of bits per sample that wavread reads the signal at, so my question is ultimately two-fold:

A. Is there a way to change the sampling frequency or bits-per-sample parameters for wavread, or is this something that's specified by the audio file itself?

B. Is wavread the optimal solution for reading in this file? Or is there an alternate method that I haven't yet discovered?

Thanks,
Alrik
On Sat, Dec 5, 2009 at 12:32 AM, wrote:

> Hello, this is my first post on these forums, so please bear with me. I'm
> trying to read relatively large audio files (.wav format, ~1 GB in size) into
> MATLAB. However, I run out of memory long before the full file is read.
>
> Right now it seems to me that the best solution would be reading in chunks
> of the signal, manually downsampling, then analyzing them separately,
> although this seems to be a rather inelegant solution.
>
Actually, it is not inelegant. On the contrary, it might even speed up
your simulation. I presume you don't need the entire file in memory at once
for processing. If your working set fits into your machine's available RAM,
the simulation will be much faster.

> The maximum frequency component of the signal I'd be interested in would be
> ~10 kHz, while wavread returns a signal sampled at ~50 kHz and at 16 bits per
> sample. I couldn't find any way to change the sampling frequency or number
> of bits per sample that wavread reads the signal at, so my question is
> ultimately two-fold:
>
> A. Is there a way to change the sampling frequency or bits-per-sample parameters
> for wavread, or is this something that's specified by the audio file
> itself?
>
You can use any of the available sample rate converter tools to change the
sampling frequency of the file. Note that wavread has no notion of a default
sampling frequency: whatever sampling frequency is written into your wave
file is what wavread returns.
By the way, why is your wave file at 50 kHz if you are only interested in
frequencies up to 10 kHz? Can you get your inputs at, say, 25 kHz?

> B. Is wavread the optimal solution for reading in this file? Or is there an
> alternate method that I haven't yet discovered?
>
wavread doesn't do much by itself. I guess your concern is how to control
the number of samples read; I believe wavread lets you specify the
starting and ending sample numbers.
http://www.mathworks.com/access/helpdesk_r13/help/techdoc/ref/wavread.html
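To make the start/end sample option concrete, here is a minimal sketch of reading only part of a file. It assumes a newer MATLAB release, where audioinfo and audioread have taken over from wavread; with wavread as documented at the link above, the corresponding calls would be wavread(file, 'size') and wavread(file, [n1 n2]). The demo file name and the chosen sample range are made up for illustration.

```matlab
% Sketch: read a range of samples instead of the whole file.
% Assumption: audioinfo/audioread/audiowrite are available (newer releases).

% Write a small demo file so the snippet is self-contained (hypothetical name).
fs = 50000;                               % sample rate in Hz
t = (0:2*fs-1)' / fs;                     % two seconds of time stamps
demoFile = fullfile(tempdir, 'demo_tone.wav');
audiowrite(demoFile, 0.5*sin(2*pi*1000*t), fs, 'BitsPerSample', 16);

% Query the header only; no sample data is loaded here.
info = audioinfo(demoFile);
totalSamples = info.TotalSamples;         % file length in samples
fileFs = info.SampleRate;                 % rate stored in the header

% Read only the stretch from 0.5 s to 1.0 s.
firstSample = round(0.5*fileFs) + 1;
lastSample = round(1.0*fileFs);
[segment, segFs] = audioread(demoFile, [firstSample lastSample]);
```

Only the requested segment is held in memory, so the same pattern should scale to a file far larger than the available RAM.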

good luck

> Thanks,
> Alrik
>
>
Alrik-

> Hello, this is my first post on these forums, so please bear
> with me. I'm trying to read relatively large audio files
> (.wav format, ~1 GB in size) into MATLAB. However, I run out
> of memory long before the full file is read.
>
> Right now it seems to me that the best solution would be
> reading in chunks of the signal, manually downsampling, then
> analyzing them separately, although this seems to be a rather
> inelegant solution.

The elegant solution is to process by continuously reading frames of input data. This approach is eventually required
for any actual audio product that operates in real-time. Speech codecs such as GSM, EVRC, G729, audio codecs such as
MP3, noise cancellation, etc. all use frame-based algorithms. Although you're not doing video, the same holds true
there.

For an academic project or research where the end objective is not real-world implementation, continuous frame-based
processing is not required.
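As a rough illustration of frame-based processing, the sketch below walks through a file one frame at a time and keeps only a per-frame RMS value. It assumes a release with audioinfo/audioread (in the wavread era, wavread(file, [n1 n2]) plays the same role); the frame length of 4096 samples, the demo file name, and the choice of RMS as the analysis are arbitrary placeholders, not anything prescribed in this thread.

```matlab
% Sketch: frame-by-frame processing so only one frame is in memory at a time.
% Assumption: audioinfo/audioread/audiowrite are available (newer releases).

% Write a demo file so the snippet is self-contained (hypothetical name).
fs = 50000;                                   % sample rate in Hz
x = 0.5*sin(2*pi*1000*(0:3*fs-1)'/fs);        % three seconds of a 1 kHz tone
demoFile = fullfile(tempdir, 'demo_frames.wav');
audiowrite(demoFile, x, fs);

info = audioinfo(demoFile);
frameLen = 4096;                              % arbitrary frame length in samples
numFrames = ceil(info.TotalSamples / frameLen);
frameRms = zeros(numFrames, 1);               % one result per frame

for k = 1:numFrames
    firstSample = (k-1)*frameLen + 1;
    lastSample = min(k*frameLen, info.TotalSamples);  % last frame may be short
    frame = audioread(demoFile, [firstSample lastSample]);
    frameRms(k) = sqrt(mean(frame(:).^2));    % stand-in for the real analysis
end
```

Algorithms that need context across frame boundaries (filters, for example) would also have to carry state from one frame to the next, which this sketch leaves out.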

> The maximum frequency component of the signal I'd be
> interested in would be ~10 kHz, while wavread returns a signal
> sampled at ~50 kHz and at 16 bits per sample. I couldn't find
> any way to change the sampling frequency or number of bits
> per sample that wavread reads the signal at, so my
> question is ultimately two-fold:

wavread() goes by what's contained in the header of the .wav file -- it gets the same header info you would see using
WMP or another media player. If you want to change the sampling rate, you can use other MATLAB functions. I would
mention that if you're interested in getting a clear representation of 10 kHz frequencies, then a 48 kHz sampling rate
might be appropriate. Theoretically the sampling rate must stay above 20 kHz (twice the highest frequency of interest),
so you might try a typical audio rate of 22.05 kHz. But in that case, you would only have about 2 sample points
(22.05/10 = 2.2) to represent each period of a 10 kHz sinewave. That's something to think about... if you have any
higher frequencies, or even if your 10 kHz waveform isn't a sinewave (say it's a square wave, which contains harmonics
at 30 kHz and above), then 22.05 kHz is not enough.
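The arithmetic behind this can be checked in a few lines. The sketch below computes samples per period of a 10 kHz tone for a handful of candidate rates, and shows where the third harmonic of a 10 kHz square wave (30 kHz) would land if it were sampled with no anti-aliasing filter in front. The list of candidate rates and the no-filter assumption are mine, chosen for illustration.

```matlab
% Sketch: samples per period and aliasing for a 10 kHz component.
fMax = 10e3;                                % highest frequency of interest, Hz
rates = [22050 25000 48000 50000];          % candidate sample rates, Hz

samplesPerPeriod = rates / fMax;            % about 2.2, 2.5, 4.8 and 5 samples
aboveNyquist = rates > 2*fMax;              % true wherever the rate exceeds 20 kHz

% Third harmonic of a 10 kHz square wave. Assuming no anti-aliasing filter,
% it folds back to the distance from the nearest multiple of the sample rate.
thirdHarmonic = 3*fMax;
aliasHz = abs(thirdHarmonic - rates .* round(thirdHarmonic ./ rates));
```

Under these assumptions the 30 kHz harmonic folds to 7.95 kHz at a 22.05 kHz rate, which is inside the band of interest, while at 48 kHz or 50 kHz it folds to 18 kHz or 20 kHz, outside that band.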

> A. Is there a way to change the sampling frequency or
> bits-per-sample parameters for wavread, or is this something
> that's specified by the audio file itself?

wavread() doesn't do that. Try looking for a function like mfilt.firsrc().
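For a one-off conversion, a simpler route than mfilt.firsrc() may be resample(), which is a different function from the one suggested above. The sketch assumes the Signal Processing Toolbox is installed (in Octave, the signal package); the 1 kHz test tone and the 50 kHz to 25 kHz conversion are illustrative choices only.

```matlab
% Sketch: halve the sample rate with resample().
% Assumption: Signal Processing Toolbox (or Octave's signal package) is loaded.
fsIn = 50000;                        % original sample rate, Hz
t = (0:fsIn-1)' / fsIn;              % one second of time stamps
x = sin(2*pi*1000*t);                % 1 kHz test tone

p = 1;                               % upsampling factor
q = 2;                               % downsampling factor
y = resample(x, p, q);               % low-pass filters, then changes rate by p/q
fsOut = fsIn * p / q;                % new sample rate, Hz
```

resample() applies its own anti-aliasing filter, so content above the new Nyquist frequency (12.5 kHz here) is attenuated rather than folded back. When converting chunk by chunk, expect filter edge effects at each chunk boundary unless the chunks overlap.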

> B. Is wavread the optimal solution for reading in this
> file? Or is there an alternate method that I haven't yet
> discovered?

To get .wav file data into MATLAB, wavread() is what you want.

-Jeff
Thanks for the replies, they have cleared up some misconceptions I was
previously beleaguered by!

Alrik

On Mon, Dec 7, 2009 at 10:27 AM, Jeff Brower wrote:
