comp.dsp | sampling a wav file

Hi, I am new to this forum and to processing wav files.

I want to take a wav file as input and give an array which contains values of
samples of the audio signal as output according to the sampling frequency.
I am in a position where I cannot decipher any open-source software, a
simple program in c/c++ or python will help me.

Thanks in anticipation

Reply by Erik de Castro Lopo ● August 18, 2007

sharatechno wrote:

> Hi, I am new to this forum and to processing wav files.
> 
> I want to take a wav file as input

For reading WAV (and a bunch of other file types) you probably can't
get anything better than libsndfile:

    http://www.mega-nerd.com/libsndfile/

There are example programs included.
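
For readers who asked for Python, here is a minimal sketch of the "WAV file in, array of samples out" task. It does not use libsndfile; it swaps in Python's standard-library `wave` module, which only copes with plain PCM files. The function name `read_wav_samples` is made up for this illustration, and the sketch assumes 16-bit samples.

```python
import io
import struct
import wave


def read_wav_samples(source):
    """Return (sample_rate, channels, samples) for a 16-bit PCM WAV.

    `source` is a file name or a binary file-like object. `samples` is a
    flat list of integers, interleaved by channel if there are several.
    """
    with wave.open(source, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    # 16-bit WAV samples are signed, little-endian
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    return rate, channels, samples


# Self-check: write a tiny mono file in memory, then read it back.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(struct.pack("<4h", 0, 1000, -1000, 32767))
buf.seek(0)
rate, channels, samples = read_wav_samples(buf)
```

With a real file you would pass its path instead of the in-memory buffer.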

> and give an array which contains values of 
> samples of the audio signal as output according to the sampling frequency.

Sorry, the rest of that sentence made no sense to me.

Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"Education is an admirable thing, but it is well to remember from
time to time that nothing that is worth knowing can be taught."
--  Oscar Wilde

Reply by mnentwig ● August 20, 2007

Hi,

below is a quick-and-dirty program for "octave" (which is open source,
download "octave-forge" package from sourceforge.net).
I guess it should work also with Matlab.

It shows how to load a waveform, and plots the spectrum obtained via FFT.


The remainder is some averaging to keep the amount of plotted data
manageable.

This is a simple example, could be extended as needed (for example:
include windowing).

Cheers

Markus

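close all; clear all;
% wavread returns the samples (one column per channel), the sampling
% rate in Hz and the number of bits per sample
[y, fs, bits]=wavread('c:/audio_temp/a.wav');
leftChan=y(:, 1); % first channel only
n=length(leftChan);

spectrum=fft(leftChan);
% HERE is the spectrum, one sample per frequency

spectrum=abs(spectrum).^2; % power spectrum (squared magnitude)

% average and downsample to reduce data in plot

winlen=1000;
win=ones(1, winlen)/winlen;
spectrum=fftconv(spectrum, win); % averaging (moving mean over winlen bins)
spectrum=10*log10(spectrum); % convert to dB

step=max(1, floor(length(spectrum)/1000)); % never 0, even for short files
spectrum=spectrum(1:step:floor(length(spectrum)/2)); % roughly the positive-frequency half

figure(); 
plot(spectrum);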
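
The windowing extension mentioned in the post could look roughly like this. It is a sketch in plain Python rather than Octave, and the helper names `hann_window` and `apply_window` are invented for the illustration. The idea is to multiply the samples by a Hann window before taking the FFT, which reduces spectral leakage.

```python
import math


def hann_window(n):
    """Return n Hann coefficients (periodic form, as commonly used before an FFT)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * k / n) for k in range(n)]


def apply_window(samples, window):
    """Multiply each sample by the matching window coefficient."""
    return [s * w for s, w in zip(samples, window)]


# A constant frame makes the window shape visible: zero at the start,
# rising to full amplitude in the middle.
frame = [1.0] * 8
windowed = apply_window(frame, hann_window(len(frame)))
```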

Reply by Andre Lodwig ● August 20, 2007

If you know your sampling rate and resolution, just ignore the header 
(44 bytes if I remember correctly) and treat the rest as binary data...

Best regards,

Andre



Reply by Richard Dobson ● August 20, 2007

Andre Lodwig wrote:
> If you know your sampling rate and resolution, just ignore the header 
> (44 bytes if I remember correctly) and treat the rest as binary data...
> 
> Best regards,
> 
> Andre
> 
> 

This is not safe, as WAV files do not have (and have never had) a fixed-size
header. These days headers may actually be quite a bit larger, as
there may be various extra chunks before the audio data chunk itself,
and the newer WAVEFORMATEXTENSIBLE version of WAVE has a much larger
40-byte format chunk. Even the older WAVEFORMATEX chunk can be either 16
or 18 bytes long, so that with an otherwise minimal header the audio
data can start either 44 or 46 bytes in. There is only one reliable way to
read a WAVE file, and that is to parse it properly. Unfortunately, from
what I have been able to glean, Matlab has not updated its wavread.m
file for a long time, so there will be many legitimate files it will reject.
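
To make the 44-versus-46 point concrete, here is a small Python sketch that walks the chunk headers to find where the audio actually starts. `find_data_chunk` and `make_header` are hypothetical helpers written for this illustration, and the two headers are synthetic, built in memory rather than taken from real files.

```python
import struct


def find_data_chunk(buf):
    """Return (offset, size) of the audio bytes in a RIFF/WAVE byte string."""
    if len(buf) < 12 or buf[0:4] != b"RIFF" or buf[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(buf):
        chunk_id = buf[pos:pos + 4]
        (size,) = struct.unpack_from("<I", buf, pos + 4)
        if chunk_id == b"data":
            return pos + 8, size
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no data chunk found")


def make_header(fmt_size):
    """Build a minimal header (no audio) with a 16- or 18-byte fmt chunk."""
    fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
    if fmt_size == 18:
        fmt += struct.pack("<H", 0)  # count of extra bytes, here zero
    body = (b"WAVE" + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", 0))
    return b"RIFF" + struct.pack("<I", len(body)) + body


offset_16, _ = find_data_chunk(make_header(16))
offset_18, _ = find_data_chunk(make_header(18))
```

Any extra chunk placed before the data chunk moves the offset further still, which is why a fixed skip cannot be relied on.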

Richard Dobson

Reply by David Lee ● August 21, 2007

sharatechno wrote...
> Hi, I am new to this forum and to processing wav files.
>
> I want to take a wav file as input and give an array which contains values of
> samples of the audio signal as output according to the sampling frequency.
> I am in a position where I cannot decipher any open-source software, a
> simple program in c/c++ or python will help me.
>
> Thanks in anticipation

It's very easy to extract data from a PCM wav file, once you are familiar with the structure. You 
have to appreciate that whilst the overall structure is well standardized it is very flexible (and 
frequently misused) and the details can vary greatly between applications that create them.

Wav files are based on the Resource Interchange File Format (RIFF), which is a generalized tagged 
file format built up of labelled "chunks". A wave version of a RIFF file must contain three chunks of 
information (strictly speaking it's really one WAVE chunk which contains two additional 
sub-chunks, but the structure is similar and it shouldn't make any real difference to your 
understanding):

1) A "RIFF" chunk that defines the type of the file - "WAVE" in this case

2) A FORMAT sub-chunk (labelled "fmt ") that defines the form of encoding (I assume that you are 
interested in PCM) and parameters including number of channels, bytes per sample, sampling rate etc.

3) The DATA sub-chunk (labelled "data") that contains the actual samples. For a PCM file these 
are simply the amplitudes of the waveform at each sampling instant. No time information is saved in 
the file - you can re-create this from the sampling rate and the number of the sample, counted from 
the beginning of the data.
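
Re-creating the time information is a one-line calculation: sample number n falls at t = n / fs seconds. A minimal Python sketch follows, with the function names `sample_time` and `time_axis` chosen only for this illustration.

```python
def sample_time(index, sample_rate):
    """Time in seconds of sample number `index`, counted from 0."""
    return index / sample_rate


def time_axis(num_samples, sample_rate):
    """Time stamps in seconds for the first `num_samples` samples."""
    return [n / sample_rate for n in range(num_samples)]


one_second = sample_time(44100, 44100)  # sample 44100 at 44.1 kHz
axis = time_axis(3, 2)                  # three samples at 2 Hz
```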

The problem is that a file can also contain any number of additional application-specific labelled 
chunks (and sub-chunks) and the chunks can appear in any order. An intelligent programmer will write 
the standard chunks in the conventional order, as above, and add any proprietary chunks to the end 
of the file - unfortunately programmers are often not very intelligent so you must parse the entire 
file in order to be sure of correctly identifying all the chunks you need and ignoring the ones that 
you don't. A lot of applications don't do this and fail to read perfectly acceptable wav files - 
either rejecting them or else assuming that the file is RAW format and asking you to input sampling 
parameters (obviously any unidentified chunks are then interpreted as digital noise).
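
A sketch of the "read what you need, skip what you don't" approach in Python. The names `read_chunks` and `chunk` and the chunk ID "xtra" are invented for the example, and the test file is synthetic. Unknown chunks are stepped over by their declared length, including the pad byte that follows an odd-sized chunk.

```python
import io
import struct


def read_chunks(f, wanted=(b"fmt ", b"data")):
    """Scan a RIFF/WAVE stream and return {chunk_id: payload} for wanted chunks."""
    riff, _, form = struct.unpack("<4sI4s", f.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    found = {}
    while True:
        header = f.read(8)
        if len(header) < 8:
            break  # end of file
        chunk_id, size = struct.unpack("<4sI", header)
        if chunk_id in wanted:
            found[chunk_id] = f.read(size)
        else:
            f.seek(size, io.SEEK_CUR)  # skip the payload without interpreting it
        if size & 1:
            f.seek(1, io.SEEK_CUR)  # pad byte after an odd-sized chunk
    return found


def chunk(chunk_id, payload):
    """Serialise one chunk, adding the pad byte needed for odd sizes."""
    pad = b"\x00" if len(payload) & 1 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# Synthetic file: fmt, then an unknown 3-byte chunk, then data.
fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
body = (b"WAVE" + chunk(b"fmt ", fmt) + chunk(b"xtra", b"abc")
        + chunk(b"data", b"\x01\x00\x02\x00"))
wav = b"RIFF" + struct.pack("<I", len(body)) + body
chunks = read_chunks(io.BytesIO(wav))
```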

Another nasty little gotcha is that the FORMAT chunk of a RIFF file can legally contain several 
bytes of extra application-specific parameters tacked on the end of the fmt sub-chunk, preceded by a 
two-byte integer defining the number of extra bytes. Extra bytes are not supposed to be used in a WAVE 
file and many applications don't bother to check and so can't find the header of the DATA sub-chunk 
and therefore fail to read the file properly when it has been written by an application that does 
use non-standard extra parameters.

A couple of detailed descriptions of the WAV file format that helped me to write my own parsing 
software are to be found at:
http://ccrma.stanford.edu/CCRMA/Courses/422/projects/WaveFormat/
http://technology.niagarac.on.ca/courses/ctec1631/WavFileFormat.html

The best way to get to grips with the format is to explore the contents of various wav files using a 
hex editor and identify the various chunks and parameters (sampling rate etc). If you don't have one 
then you can download a good free hex editor "Frhed" from http://www.kibria.de

Finally I wrote up a summary of the structure for my own use whilst coding my application - whilst 
it was never intended for public consumption and I don't guarantee its accuracy I'll add it to the 
end of this message in case it may be of some use.

Hope this may have helped

Regards

David

****************************************************************
Wave File Format
******************************

Wave files (.wav) are based on the Resource Interchange File Format (RIFF), which is a generalized 
tagged file format built up of labelled chunks.

Each chunk has the format:

Bytes 0-3: ASCII "RIFF"
Bytes 4-7: Length of remainder of Chunk (from Byte 8)
Bytes 8-11: ASCII Chunk ID
Byte 12 to end Chunk data

Chunk data may be split into sub-chunks where the sub-chunk format is similar to a chunk:

Bytes 0-3: ASCII Sub-chunk ID
Bytes 4-7: Length of sub-chunk (from Byte 8)
Bytes 8 to end: Sub-chunk data

"Classic" wave files contain only a single "WAVE" chunk, but proprietary formats, such as BatSound, 
can contain additional chunks used by a particular application.  When reading such files these 
chunks should be skipped until the "WAVE" chunk is encountered.

The WAVE chunk must contain two sub-chunks, Format and Data, each prefixed with a 4-byte 
Sub-chunk ID and Sub-chunk size, as above.  Once again, proprietary wave formats may contain other 
application-specific sub-chunks that should be skipped until the Format and Data chunks are 
encountered.

Wave Chunk Format:

Bytes 0-3: ASCII "RIFF"
Bytes 4-7: Length of Wave Chunk (from Byte 8)
Bytes 8-11: ASCII "WAVE"

Format Sub-chunk

Bytes 0-3: ASCII "fmt "
Bytes 4-7: Length of Format sub-chunk (from Byte 8) - should be 0x10 (but may be larger - eg 0x12 in 
a BatSound wav file)
Bytes 8-9: Compression Format - 1 = uncompressed PCM file
Bytes 10-11: Number of channels - normally 1 or 2 but can be larger
Bytes 12-15: Sample rate (samples per second)
Bytes 16-19: Byte rate
Bytes 20-21: Bytes per sample (including all channels)
Bytes 22-23: Bits per sample (8, 16 etc)
Bytes 24-25: Number of bytes of extra parameters (should be absent for a wave file but present and 
set equal to zero for a BatSound wav File)
Bytes 26 to end: Space for extra parameters
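
The field layout above maps directly onto a `struct` format string in Python. This is a sketch only: `FormatInfo` and `parse_fmt` are names made up for the example, and the function takes the sub-chunk payload, that is the bytes after the 8-byte ID and length, so payload offset 0 corresponds to "Bytes 8-9" in the table.

```python
import struct
from collections import namedtuple

FormatInfo = namedtuple(
    "FormatInfo",
    "format_tag channels sample_rate byte_rate block_align bits_per_sample extra",
)


def parse_fmt(payload):
    """Decode the payload of a "fmt " sub-chunk (16 bytes or longer)."""
    if len(payload) < 16:
        raise ValueError("fmt chunk shorter than 16 bytes")
    # Two 16-bit, two 32-bit, two 16-bit fields, all little-endian
    fields = struct.unpack_from("<HHIIHH", payload, 0)
    extra = b""
    if len(payload) >= 18:
        # Optional 16-bit count followed by that many extra parameter bytes
        (extra_len,) = struct.unpack_from("<H", payload, 16)
        extra = payload[18:18 + extra_len]
    return FormatInfo(*fields, extra)


# Example: 16-bit stereo PCM at 44.1 kHz with an 18-byte fmt payload.
info = parse_fmt(struct.pack("<HHIIHHH", 1, 2, 44100, 176400, 4, 16, 0))
```

Here `block_align` is the "bytes per sample (including all channels)" field and `byte_rate` equals `sample_rate * block_align`.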

Data Sub-chunk:

Bytes 0-3: ASCII "data"
Bytes 4-7: Length of Data sub-chunk (from Byte 8) - number of bytes of data to read.
Bytes 8 to end: The actual sound data.  Channel data is interleaved, ie 1,2,1,2
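
Once the data bytes are in hand, un-interleaving them into one array per channel is short. The sketch below assumes uncompressed PCM, where 8-bit samples are unsigned and 16-bit samples are signed little-endian. `split_channels` is a hypothetical helper name, and shifting 8-bit values to be centred on zero is a choice made for this example.

```python
import struct


def split_channels(data, channels, bits_per_sample):
    """Turn interleaved PCM bytes into one list of integer samples per channel."""
    if bits_per_sample == 8:
        flat = [b - 128 for b in data]  # unsigned 0..255 shifted to -128..127
    elif bits_per_sample == 16:
        count = len(data) // 2
        flat = list(struct.unpack("<%dh" % count, data[:count * 2]))
    else:
        raise ValueError("only 8- and 16-bit PCM handled in this sketch")
    # Sample k of channel ch sits at flat index k * channels + ch
    return [flat[ch::channels] for ch in range(channels)]


# Three stereo frames: left = 1, 2, 3 and right = -1, -2, -3.
stereo = struct.pack("<6h", 1, -1, 2, -2, 3, -3)
left, right = split_channels(stereo, 2, 16)
```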

Reply by Richard Dobson ● August 21, 2007

David Lee wrote:
..
> 
> The problem is that a file can also contain any number of additional 
> application-specific labelled chunks (and sub-chunks) and the chunks can 
> appear in any order. 

They are not necessarily application-specific (there is a formally 
defined APPL chunk for this); many are chunks defined by Microsoft; 
others by standards bodies such as the EBU (for the Broadcast WAVE 
format etc).

> An intelligent programmer will write the standard
> chunks in the conventional order, as above, and add any proprietary 
> chunks to the end of the file

There is an important distinction between truly "proprietary" chunks, 
and auxiliary standard chunks that are defined mostly by Microsoft for 
use in all files. There are many such chunks (e.g. defining information 
for use in samplers, such as root frequency, looping points) that do 
need (if present) to be supplied before the data chunk. One practical 
reason is that WAVE files are designed to be streamable (e.g. through a 
unix-style pipe to another process or more generally over a network), and 
you definitely do not want to wait for the data chunk to complete before 
reading that information.

> 
> Another nasty little gotcha is that the FORMAT chunk of a RIFF file can 
> legally contain several bytes of extra application-specific parameters 
> tacked on the end of the fmt sub-chunk, preceded by a two-byte integer 
> defining the number of extra bytes. Extra bytes are not supposed to be used 
> in a WAVE file 

Not so. The 18-byte WAVEFORMATEX form is mandated by Microsoft for 
floating-point files (wFormatTag=3). For any compressed audio data, a 
FACT chunk must also follow the format chunk; Microsoft did concede that 
this was not required (but is still optional) for Type-3 floating-point. 
All parsing code must be prepared to find at least either of the 16- and 
18-byte format chunks; and beyond that be able to read the 
WAVEFORMATEXTENSIBLE format which is required for all high-resolution 
formats (> stereo, >16 bits). As I mentioned in another mail in this 
thread, that extends the format chunk to 40 bytes.
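
As a rough illustration of what handling the 40-byte form involves, here is a Python sketch. With WAVEFORMATEXTENSIBLE, wFormatTag is 0xFFFE and the real sample type lives in the SubFormat GUID that ends the chunk. For the standard Microsoft sub-types, the first two bytes of that GUID carry the familiar tag (1 for PCM, 3 for IEEE float). `effective_format_tag` is a name invented for this example, and the test payload is synthetic.

```python
import struct

WAVE_FORMAT_PCM = 0x0001
WAVE_FORMAT_IEEE_FLOAT = 0x0003
WAVE_FORMAT_EXTENSIBLE = 0xFFFE


def effective_format_tag(payload):
    """Return the sample-format tag for a 16-, 18- or 40-byte fmt payload."""
    if len(payload) < 16:
        raise ValueError("fmt chunk shorter than 16 bytes")
    (tag,) = struct.unpack_from("<H", payload, 0)
    if tag != WAVE_FORMAT_EXTENSIBLE:
        return tag
    if len(payload) < 40:
        raise ValueError("WAVEFORMATEXTENSIBLE needs a 40-byte fmt chunk")
    # Layout after the 16 base bytes: cbSize (2), valid bits (2),
    # channel mask (4), SubFormat GUID (16, starting at offset 24)
    (sub_tag,) = struct.unpack_from("<H", payload, 24)
    return sub_tag


# Synthetic 6-channel, 24-bit container holding IEEE float samples.
base = struct.pack("<HHIIHH", WAVE_FORMAT_EXTENSIBLE, 6, 48000, 48000 * 18, 18, 24)
ext = struct.pack("<HHI", 22, 24, 0x3F)
guid = struct.pack("<IHH", WAVE_FORMAT_IEEE_FLOAT, 0x0000, 0x0010) + bytes.fromhex("800000aa00389b71")
extensible = base + ext + guid
tag = effective_format_tag(extensible)
```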

I also contributed to the definition of the PEAK chunk, which gives the 
value and position of the largest sample in each channel; this is also 
mandated to precede the data chunk, not least as the information can be 
used to rescale any floating-point files whose amplitude exceeds +-1.0. 
We use this in Csound, for example. See 
http://music.calarts.edu/~tre/PeakChunk.html for the details.
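
The rescaling step itself is simple once the peak value is known. The Python sketch below assumes the peak has already been obtained, for instance from a PEAK chunk; reading the chunk is not shown. `rescale_to_unit` is a hypothetical helper, not Csound's code.

```python
def rescale_to_unit(samples, peak):
    """Scale float samples so that a known peak above 1.0 maps to 1.0."""
    if peak <= 1.0:
        return list(samples)  # already within +-1.0, leave untouched
    gain = 1.0 / peak
    return [s * gain for s in samples]


scaled = rescale_to_unit([0.5, -2.0, 1.0], peak=2.0)
```

Knowing the peak before reading the data chunk means the gain can be applied in a single pass, without scanning the audio first.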

> 
> A couple of detailed descriptions of the WAV file format that helped me 
> to write my own parsing software are to be found at:
> http://ccrma.stanford.edu/CCRMA/Courses/422/projects/WaveFormat/
> http://technology.niagarac.on.ca/courses/ctec1631/WavFileFormat.html
> 

These are unfortunately both far from complete or up-to-date descriptions. 
Part of the problem is that documentation on more recent elements is 
generally confined to Microsoft developer documentation, so many 3rd 
party developers are relying on what is almost obsolete information.

WAVEFORMATEXTENSIBLE is documented here:

http://www.microsoft.com/whdc/device/audio/multichaud.mspx

And I have placed a copy of the original RIFFMCI document here:

http://people.bath.ac.uk/masrwd/riffmcidoc.zip

Richard Dobson

Reply by Erik de Castro Lopo ● August 21, 2007

Richard Dobson wrote:

> WAVEFORMATEXTENSIBLE is documented here:
> 
> http://www.microsoft.com/whdc/device/audio/multichaud.mspx
> 
> And I have placed a copy of the original RIFFMCI document here:
> 
> http://people.bath.ac.uk/masrwd/riffmcidoc.zip

If people want to spend considerable amounts of time writing code
to read WAV files when something like:

    http://www.mega-nerd.com/libsndfile/

already exists, then that's fine by me, but please, please, 
PLEASE do not write code to create WAV files because you will
almost certainly get it wrong and if your software becomes
widely used I will have to add yet more workarounds to 
libsndfile so that it accepts the broken files you create.

Please note that libsndfile is released under the GNU Lesser
General Public License allowing it to be used in Open Source,
shareware and closed proprietary programs. The *only* minor
restrictions are:

  - You must link to libsndfile as a DLL (Windows), shared 
    library (Unix) or dylib (Mac OS X).
  - Provide a text document with your program that specifies
    your program uses libsndfile and that libsndfile is released
    under the GNU LGPL.
  - Provide a copy of the GNU LGPL.

Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
Moore's Law: hardware speed doubles every 18 months
Gates' Law: software speed halves every 18 months

Reply by Richard Dobson ● August 21, 2007

Erik de Castro Lopo wrote:
..
> 
> If people want to spend considerable amounts of time writing code
> to read WAV files when something like:
> 
>     http://www.mega-nerd.com/libsndfile/
> 
> already exists, then that's fine by me, but please, please, 
> PLEASE do not write code to create WAV files 
...

Absolutely; but I was writing with respect to the suggested 
octave/matlab script, where the long-established built-in command is 
"wavread" (no question of supporting anything else but WAVE, it seems).
Many (including me) would like not to have to change scripts to use a 
new function, but just have wavread work for more WAVE formats. There are 
a gazillion audio-oriented scripts out there that all use wavread.

Since you are here and it's reasonably on-topic -  your octave scripts 
as supplied with libsndfile (and inside the most recent OSX octave.app 
bundle),  seem to only read/write Matlab 5 MAT files using the "load" 
command, not soundfiles as such. Is there a cross-platform libsndfile 
wrapper for octave and Matlab that does read and write wave and aiff 
files (at least), with control over sample format? I could use that 
myself right now as it happens!

It has always seemed that the only way to read and write more file 
formats, and more correctly, has been and is to edit wavread.m 
accordingly. Or, put another way, for complete portability, all 
soundfile parsing has to be written in octave/matlab language, not C. In 
which case, those doing the editing do need to know what the issues and 
gotchas are!

Richard Dobson

Reply by Erik de Castro Lopo ● August 22, 2007

Richard Dobson wrote:

> Since you are here and it's reasonably on-topic -  your octave scripts
> as supplied with libsndfile (and inside the most recent OSX octave.app
> bundle),  seem to only read/write Matlab 5 MAT files using the "load"
> command, not soundfiles as such.

Yes, I would like to do something more, but it's hard to do it in a 
cross-platform way.

I get around this by using sndfile-convert (in the example directory)
to convert from anything else to mat5 format.

> Is there a cross-platform libsndfile 
> wrapper for octave and Matlab that does read and write wave and aiff
> files (at least), with control over sample format?

Not that I know of.

About a decade ago I suggested to John Eaton that libsndfile be used
to read audio files. Since libsndfile now does FLAC and Ogg/Vorbis (not
publicly released yet), it might be time to try this again.

> It has always seemed that the only way to read and write more file
> formats, and more correctly, has been and is to edit wavread.m
> accordingly.

Good luck adding FLAC and Ogg/Vorbis support in pure Matlab/Octave
code :-).

I really do think it's time to push for Octave to do audio file
reading via libsndfile wherever libsndfile is available.

Erik
-- 
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"I once worked for a company where as part of the BS5750 "Quality"
process I attended a meeting where I was informed that it was Company
Policy not to use free software. When I asked him for his written
authorisation for me to remove X Windows from our Sun workstations,
he backtracked."   -- Phil Hunt
