DSPRelated.com
Forums

sampling a wav file

Started by sharatechno August 18, 2007
Hi, I am new to this forum and to processing wav files.

I want to take a wav file as input and give an array which contains values of
samples of the audio signal as output according to the sampling frequency.
I am in a position where I cannot decipher any open-source software, so a
simple program in C/C++ or Python will help me.

Thanks in anticipation


sharatechno wrote:

> hi I am new to this forum and processing wav files.
>
> I want take a wav file as input

For reading WAV (and a bunch of other file types) you probably can't
get anything better than libsndfile:

    http://www.mega-nerd.com/libsndfile/

There are example programs included.

> and give an array which contains values of
> samples of the audio signal as output according to the sampling frequency.

Sorry, the rest of that sentence made no sense to me.

Erik
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"Education is an admirable thing, but it is well to remember from
time to time that nothing that is worth knowing can be taught."
-- Oscar Wilde

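For the Python half of the original request, a minimal sketch using the standard-library `wave` module (not libsndfile) might look like the following. It assumes plain 16-bit PCM, which `wave` handles; the names `read_wav_samples` and `make_demo_wav` and the demo values are made up for illustration, and the demo file is built in memory only so the example runs without an input file.

```python
import array
import io
import sys
import wave


def read_wav_samples(source):
    """Return (sample_rate, channels); channels is one list of ints per channel.

    `source` is a file name or a binary file object. Assumes 16-bit PCM.
    """
    with wave.open(source, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        n_channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    samples = array.array("h")      # signed 16-bit integers
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()          # WAV sample data is little-endian

    # De-interleave: frame k, channel c sits at index k * n_channels + c.
    channels = [list(samples[c::n_channels]) for c in range(n_channels)]
    return sample_rate, channels


def make_demo_wav():
    """Build a tiny stereo WAV in memory so the example needs no input file."""
    frames = array.array("h", [0, 100, 1000, -1000, 32767, -32768])
    if sys.byteorder == "big":
        frames.byteswap()
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(2)
        wf.setsampwidth(2)
        wf.setframerate(8000)
        wf.writeframes(frames.tobytes())
    buf.seek(0)
    return buf


rate, chans = read_wav_samples(make_demo_wav())
print(rate, chans)
```

With a real file, `read_wav_samples("a.wav")` would be the call; sample k of a channel then corresponds to time k / rate seconds.
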
Hi,

below is a quick-and-dirty program for "octave" (which is open source,
download "octave-forge" package from sourceforge.net).
I guess it should work also with Matlab.

It shows how to load a waveform, and plots the spectrum obtained via FFT.


The remainder is some averaging to keep the amount of plotted data
manageable.

This is a simple example, could be extended as needed (for example:
include windowing).

Cheers

Markus

close all; clear all;
[y, fs, bits]=wavread('c:/audio_temp/a.wav');
leftChan=y(:, 1);
n=length(leftChan);

spectrum=fft(leftChan);
% HERE is the spectrum, one sample per frequency


spectrum=abs(spectrum).^2; % power spectrum (squared magnitude)

% average and downsample to reduce data in plot

winlen=1000;
win=ones(1, winlen)/winlen;
spectrum=fftconv(spectrum, win); % averaging
spectrum=10*log10(spectrum); % convert to dB

step=floor(length(spectrum)/1000);
spectrum=spectrum(1:step:length(spectrum)/2);

figure(); 
plot(spectrum);

If you know your sampling rate and resolution, just ignore the header 
(44 bytes if I remember correctly) and treat the rest as binary data...

Best regards,

Andre


Andre Lodwig wrote:
> If you know your sampling rate and resolution, just ignore the header
> (44 bytes if I remember correctly) and treat the rest as binary data...
>
> Best regards,
>
> Andre

This is not safe, as WAV files do not have (and have never had) a fixed-size header. These days header sizes may actually be quite a bit larger, as there may be various extra chunks before the audio data chunk itself, and the newer format structure, WAVEFORMATEXTENSIBLE, has a much larger 40-byte format chunk. Even the older WAVEFORMATEX chunk can be either 16 or 18 bytes long, so that with an otherwise minimal header the audio data can start either 44 or 46 bytes in.

There is only one reliable way to read a WAVE file, and that is to parse it properly. Unfortunately, from what I have been able to glean, Matlab has not updated its wavread.m file for a long time, so there will be many legit files it will reject.

Richard Dobson

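To make the 44-versus-46 point concrete, here is a rough Python sketch that walks the chunk headers to locate the audio data instead of assuming a fixed offset. The helper names (`find_chunks`, `build_wav`, `data_offset`) are hypothetical, and `build_wav` only produces a minimal mono test file for the demonstration; it is not a general WAV writer.

```python
import io
import struct


def find_chunks(stream):
    """Yield (chunk_id, body_offset, size) for each chunk inside a RIFF/WAVE stream."""
    riff, _riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return                      # end of file
        chunk_id, size = struct.unpack("<4sI", header)
        body_offset = stream.tell()
        yield chunk_id, body_offset, size
        # Chunks are word-aligned: an odd-sized body is followed by a pad byte.
        stream.seek(body_offset + size + (size & 1))


def build_wav(fmt_size, payload=b"\x01\x00\x02\x00"):
    """Build a minimal 16-bit mono PCM file with a 16- or 18-byte fmt chunk."""
    fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
    if fmt_size == 18:
        fmt += struct.pack("<H", 0)     # extra-size field, set to zero
    body = b"WAVE"
    body += b"fmt " + struct.pack("<I", len(fmt)) + fmt
    body += b"data" + struct.pack("<I", len(payload)) + payload
    return b"RIFF" + struct.pack("<I", len(body)) + body


def data_offset(wav_bytes):
    """Return the byte offset at which the audio samples start."""
    for chunk_id, offset, _size in find_chunks(io.BytesIO(wav_bytes)):
        if chunk_id == b"data":
            return offset
    raise ValueError("no data chunk found")


print(data_offset(build_wav(16)), data_offset(build_wav(18)))  # prints: 44 46
```

Any unknown chunk that appears before "data" is simply skipped by its declared size, which is the part a hard-coded offset cannot do.
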
sharatechno wrote...
> hi I am new to this forum and processing wav files.
>
> I want take a wav file as input and give an array which contains values of
> samples of the audio signal as output according to the sampling frequency.
> I am in a position where I cannot decipher any open-source softwares, a
> simple program in c/c++ or python will help me.
>
> Thanks in anticiaption

It's very easy to extract data from a PCM wav file, once you are familiar with the structure. You have to appreciate that whilst the overall structure is well standardized it is very flexible (and frequently misused) and the details can vary greatly between the applications that create the files.

Wav files are based on the Resource Interchange File Format (RIFF), which is a generalized tagged file format built up of labelled "chunks". A wave version of a RIFF file must contain three chunks of information (actually, strictly speaking, it's really one WAVE chunk which contains two additional sub-chunks, but the structure is similar and it shouldn't make any real difference to your understanding):

1) A "RIFF" chunk that defines the type of the file - "WAVE" in this case.

2) A FORMAT sub-chunk (labelled "fmt ") that defines the form of encoding (I assume that you are interested in PCM) and parameters including number of channels, bytes per sample, sampling rate etc.

3) The DATA sub-chunk (labelled "data") that contains the actual samples. For a PCM file these are simply the amplitudes of the waveform at each sampling instant. No time information is saved in the file - you can re-create this from the sampling rate and the number of the sample, counted from the beginning of the data.

The problem is that a file can also contain any number of additional application-specific labelled chunks (and sub-chunks) and the chunks can appear in any order. An intelligent programmer will write the standard chunks in the conventional order, as above, and add any proprietary chunks to the end of the file - unfortunately programmers are often not very intelligent so you must parse the entire file in order to be sure of correctly identifying all the chunks you need and ignoring the ones that you don't.

A lot of applications don't do this and fail to read perfectly acceptable wav files - either rejecting them or else assuming that the file is RAW format and asking you to input sampling parameters (obviously any unidentified chunks are then interpreted as digital noise).

Another nasty little gotcha is that the FORMAT chunk of a RIFF file can legally contain several bytes of extra application-specific parameters tacked on the end of the fmt sub-chunk, preceded by a two-byte integer defining the number of extra bytes. Extra bytes are not supposed to be used in a WAVE file, so many applications don't bother to check for them; they then can't find the header of the DATA sub-chunk and fail to read the file properly when it has been written by an application that does use non-standard extra parameters.

A couple of detailed descriptions of the WAV file format that helped me to write my own parsing software are to be found at:

http://ccrma.stanford.edu/CCRMA/Courses/422/projects/WaveFormat/
http://technology.niagarac.on.ca/courses/ctec1631/WavFileFormat.html

The best way to get to grips with the format is to explore the contents of various wav files using a hex editor and identify the various chunks and parameters (sampling rate etc). If you don't have one then you can download a good free hex editor, "Frhed", from http://www.kibria.de

Finally, I wrote up a summary of the structure for my own use whilst coding my application - whilst it was never intended for public consumption and I don't guarantee its accuracy, I'll add it to the end of this message in case it may be of some use.

Hope this may have helped

Regards
David

****************************************************************

Wave File Format
******************************

Wave files (.wav) are based on the Resource Interchange File Format (RIFF), which is a generalized tagged file format built up of labelled chunks.

Each chunk has the format:

Bytes 0-3:       ASCII "RIFF"
Bytes 4-7:       Length of remainder of Chunk (from Byte 8)
Bytes 8-11:      ASCII Chunk ID
Byte 12 to end:  Chunk data

Chunk data may be split into sub-chunks where the sub-chunk format is similar to a chunk:

Bytes 0-3:       ASCII Sub-chunk ID
Bytes 4-7:       Length of sub-chunk (from Byte 8)
Bytes 8 to end:  Sub-chunk data

"Classic" wave files contain only a single "WAVE" chunk, but proprietary formats such as BatSound can contain additional chunks used by a particular application. When reading such files these chunks should be skipped until the "WAVE" chunk is encountered.

The WAVE chunk must contain two sub-chunks, Format and Data, each prefixed with a 4-byte Sub-chunk ID and Sub-chunk size, as above. Once again, proprietary wave formats may contain other application-specific sub-chunks that should be skipped until the Format and Data chunks are encountered.

Wave Chunk Format:

Bytes 0-3:       ASCII "RIFF"
Bytes 4-7:       Length of Wave Chunk (from Byte 8)
Bytes 8-11:      ASCII "WAVE"

Format Sub-chunk:

Bytes 0-3:       ASCII "fmt "
Bytes 4-7:       Length of Format sub-chunk (from Byte 8) - should be 0x10 (but may be larger, eg 0x12 in a BatSound wav file)
Bytes 8-9:       Compression Format - 1 = uncompressed PCM file
Bytes 10-11:     Number of channels - normally 1 or 2 but can be larger
Bytes 12-15:     Sample rate (samples per second)
Bytes 16-19:     Byte rate
Bytes 20-21:     Bytes per sample (including all channels)
Bytes 22-23:     Bits per sample (8, 16 etc)
Bytes 24-25:     Number of bytes of extra parameters (should be absent for a wave file but present and set equal to zero for a BatSound wav file)
Bytes 26 to end: Space for extra parameters

Data Sub-chunk:

Bytes 0-3:       ASCII "data"
Bytes 4-7:       Length of Data sub-chunk (from Byte 8) - number of bytes of data to read
Bytes 8 to end:  The actual sound data. Channel data interleaved ie 1,2

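As an illustration of the Format and Data layouts above, here is a small Python sketch that decodes the fixed 16-byte part of a Format sub-chunk body (bytes 8-23 in the table) and then turns an interleaved 16-bit data body into time-stamped frames. It assumes uncompressed 16-bit PCM and that the two sub-chunk bodies have already been located; `FormatInfo`, `parse_fmt` and `decode_pcm16` are names invented for this example.

```python
import struct
from collections import namedtuple

FormatInfo = namedtuple(
    "FormatInfo",
    "format_tag channels sample_rate byte_rate block_align bits_per_sample",
)


def parse_fmt(fmt_body):
    """Decode the first 16 bytes of a 'fmt ' sub-chunk body (little-endian)."""
    if len(fmt_body) < 16:
        raise ValueError("fmt sub-chunk is too short")
    return FormatInfo(*struct.unpack("<HHIIHH", fmt_body[:16]))


def decode_pcm16(data_body, info):
    """Return a list of (time_in_seconds, (ch1, ch2, ...)) tuples, one per frame."""
    if info.format_tag != 1 or info.bits_per_sample != 16:
        raise ValueError("this sketch only handles 16-bit uncompressed PCM")
    frame_format = "<%dh" % info.channels          # one signed short per channel
    n_frames = len(data_body) // info.block_align
    frames = []
    for n in range(n_frames):
        values = struct.unpack_from(frame_format, data_body, n * info.block_align)
        # No timing is stored in the file: frame n occurs at n / sample_rate.
        frames.append((n / info.sample_rate, values))
    return frames


# Demo with hand-made sub-chunk bodies: stereo, 8000 Hz, 16 bits, two frames.
fmt_body = struct.pack("<HHIIHH", 1, 2, 8000, 32000, 4, 16)
data_body = struct.pack("<4h", 10, -10, 20, -20)
info = parse_fmt(fmt_body)
frames = decode_pcm16(data_body, info)
print(info.sample_rate, frames)
```

The "bytes per sample (including all channels)" field is what the code calls `block_align`; stepping through the data by that amount is what keeps the channels in step.
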
David Lee wrote:
..
> The problem is that a file can also contain any number of additional
> application-specific labelled chunks (and sub-chunks) and the chunks can
> appear in any order.

They are not necessarily application-specific (there is a formally defined APPL chunk for this); many are chunks defined by Microsoft; others by standards bodies such as the EBU (for the Broadcast WAVE format etc).

> An intelligent programmer will write the standard
> chunks in the conventional order, as above, and add any proprietory
> chunks to the end of the file

There is an important distinction between truly "proprietary" chunks, and auxiliary standard chunks that are defined mostly by Microsoft for use in all files. There are many such chunks (e.g. defining information for use in samplers, such as root frequency, looping points) that do need (if present) to be supplied before the data chunk. One practical reason is that WAVE files are designed to be streamable (e.g. through a unix-style pipe to another process or more generally over a network), and you definitely do not want to wait for the data chunk to complete before reading that information.

> Another nasty little gotcha is that the FORMAT chunk of a RIFF file can
> legally contain several bytes of extra application-specific parameters
> tacked on the end of the fmt sub-chunk, preceded by a two-byte integer
> defining the number of extra bytes. Extra bytes are not supposed be used
> in a WAVE file

Not so. The 18-byte WAVEFORMATEX form is mandated by Microsoft for floating-point files (wFormatTag=3). For any compressed audio data, a FACT chunk must also follow the format chunk; Microsoft did concede that this was not required (but is still optional) for Type-3 floating-point.

All parsing code must be prepared to find at least either of the 16- and 18-byte format chunks, and beyond that be able to read the WAVEFORMATEXTENSIBLE format, which is required for all high-resolution formats (> stereo, > 16 bits). As I mentioned in another mail in this thread, that extends the format chunk to 40 bytes.

I also contributed to the definition of the PEAK chunk, which gives the value and position of the largest sample in each channel; this is also mandated to precede the data chunk, not least as the information can be used to rescale any floating-point files whose amplitude exceeds +-1.0. We use this in Csound, for example. See http://music.calarts.edu/~tre/PeakChunk.html for the details.

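A rough Python sketch of what "be prepared for 16, 18 and 40 bytes" can look like is given below. It assumes the usual WAVEFORMATEXTENSIBLE layout (a 22-byte extension holding valid bits, channel mask and a 16-byte SubFormat GUID whose first two bytes carry the real format code); the function name `effective_format` and the demo values are hypothetical, and the sketch does not validate the rest of the GUID.

```python
import struct

WAVE_FORMAT_PCM = 0x0001
WAVE_FORMAT_IEEE_FLOAT = 0x0003
WAVE_FORMAT_EXTENSIBLE = 0xFFFE


def effective_format(fmt_body):
    """Return (format_code, channels, sample_rate, bits) for a 16/18/40-byte fmt body."""
    if len(fmt_body) < 16:
        raise ValueError("fmt chunk is too short")
    tag, channels, rate, _byte_rate, _block_align, bits = struct.unpack_from(
        "<HHIIHH", fmt_body, 0
    )
    cb_size = 0
    if len(fmt_body) >= 18:
        # WAVEFORMATEX: a two-byte count of extension bytes follows the 16-byte core.
        (cb_size,) = struct.unpack_from("<H", fmt_body, 16)
    if tag == WAVE_FORMAT_EXTENSIBLE:
        if cb_size < 22 or len(fmt_body) < 40:
            raise ValueError("truncated WAVEFORMATEXTENSIBLE chunk")
        _valid_bits, _channel_mask = struct.unpack_from("<HI", fmt_body, 18)
        # Assumption: the SubFormat GUID starts with the actual format code.
        (tag,) = struct.unpack_from("<H", fmt_body, 24)
    return tag, channels, rate, bits


# Demo bodies: classic 16-byte PCM, 18-byte float, 40-byte extensible PCM.
pcm16 = struct.pack("<HHIIHH", 1, 2, 44100, 176400, 4, 16)
float18 = struct.pack("<HHIIHHH", 3, 1, 48000, 192000, 4, 32, 0)
pcm_guid = bytes.fromhex("0100000000001000800000aa00389b71")
ext40 = struct.pack("<HHIIHHHHI", 0xFFFE, 6, 48000, 864000, 18, 24, 22, 24, 0x3F) + pcm_guid

for body in (pcm16, float18, ext40):
    print(len(body), effective_format(body))
```

Returning the code taken from SubFormat means the caller can treat an extensible PCM file the same way as a classic one once the format chunk has been read.
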
> A couple of detailed descriptions of the WAV file format that helped me
> to write my own parsing software are to be found at:
> http://ccrma.stanford.edu/CCRMA/Courses/422/projects/WaveFormat/
> http://technology.niagarac.on.ca/courses/ctec1631/WavFileFormat.html

These are unfortunately both far from complete or up-to-date descriptions. Part of the problem is that documentation on more recent elements is generally confined to Microsoft developer documentation, so many 3rd party developers are relying on what is almost obsolete information.

WAVEFORMATEXTENSIBLE is documented here:

http://www.microsoft.com/whdc/device/audio/multichaud.mspx

And I have placed a copy of the original RIFFMCI document here:

http://people.bath.ac.uk/masrwd/riffmcidoc.zip

Richard Dobson

Richard Dobson wrote:

> WAVEFORMATEXTENSIBLE is documented here:
>
> http://www.microsoft.com/whdc/device/audio/multichaud.mspx
>
> And I have a placed a copy of the original RIFFMCI document here:
>
> http://people.bath.ac.uk/masrwd/riffmcidoc.zip

If people want to spend considerable amounts of time writing code
to read WAV files when something like:

    http://www.mega-nerd.com/libsndfile/

already exists, then that's fine by me, but please, please,
PLEASE do not write code to create WAV files because you will
almost certainly get it wrong and if your software becomes widely
used I will have to add yet more workarounds to libsndfile so
that it accepts the broken files you create.

Please note that libsndfile is released under the GNU Lesser General
Public License allowing it to be used in Open Source, shareware and
closed proprietary programs. The *only* minor restrictions are:

  - You must link to libsndfile as a DLL (Windows), shared library
    (unix) or dylib (Mac OSX).
  - Provide a text document with your program that specifies your
    program uses libsndfile and that libsndfile is released under
    the GNU LGPL.
  - Provide a copy of the GNU LGPL.

Erik
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
Moore's Law: hardware speed doubles every 18 months
Gates' Law: software speed halves every 18 months

Erik de Castro Lopo wrote:
..
> If people want to spend considerable amounts of time writing code
> to read WAV files when something like:
>
>     http://www.mega-nerd.com/libsndfile/
>
> already exists, then thats fine by me, but please, please,
> PLEASE do not write code to create WAV files

...

Absolutely; but I was writing with respect to the suggested octave/matlab script, where the long-established builtin command is "wavread" (no question of supporting anything else but WAVE, it seems). Many (including me) would like not to have to change scripts to use a new function, but just have wavread work for more WAVE formats. There are a gazillion audio-oriented scripts out there that all use wavread.

Since you are here and it's reasonably on-topic - your octave scripts as supplied with libsndfile (and inside the most recent OSX octave.app bundle) seem to only read/write Matlab 5 MAT files using the "load" command, not soundfiles as such. Is there a cross-platform libsndfile wrapper for octave and Matlab that does read and write wave and aiff files (at least), with control over sample format? I could use that myself right now as it happens!

It has always seemed that the only way to read and write more file formats, and more correctly, has been and is to edit wavread.m accordingly. Or, put another way, for complete portability, all soundfile parsing has to be written in octave/matlab language, not C. In which case, those doing the editing do need to know what the issues and gotchas are!

Richard Dobson

Richard Dobson wrote:

> Since you are here and it's reasonably on-topic - your octave scripts
> as supplied with libsndfile (and inside the most recent OSX octave.app
> bundle), seem to only read/write Matlab 5 MAT files using the "load"
> command, not soundfiles as such.

Yes, I would like to do something more, but it's hard to do it in a
cross-platform way. I get around this by using sndfile-convert (in the
example directory) to convert from anything else to mat5 format.

> Is there a cross-platform libsndfile
> wrapper for octave and Matlab that does read and write wave and aiff
> files (at least), with control over sample format?

Not that I know of. About a decade ago I suggested to John Eaton
that libsndfile be used to read audio files. Since libsndfile now
does FLAC and Ogg/Vorbis (not publicly released yet), it might be
time to try this again.

> It has always seemed that the only way to read and write more file
> formats, and more correctly, has been and is to edit wavread.m
> accordingly.

Good luck adding FLAC and Ogg/Vorbis support in pure Matlab/Octave
code :-).

I really do think it's time to push for Octave to do audio file reading
via libsndfile wherever libsndfile is available.

Erik
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"I once worked for a company where as part of the BS5750 "Quality"
process I attended a meeting where I was informed that it was Company
Policy not to use free software. When I asked him for his written
authorisation for me to remove X Windows from our Sun workstations, he
backtracked." -- Phil Hunt