DSPRelated.com
Forums

Can you calculate a 44.1k wav stream's volume by looking at the signed integers?

Started by SA Dev February 23, 2004
I'm trying to make multiple wav streams the same volume...

Can you calculate a 44.1k wav stream's volume by looking at the signed
integers?

I've noticed that the difference between successive integers affects the
volume the most.  For example, if the first integer is 4000 and the next
integer is -4000, then this is an 8000 difference.  This 8000 difference is
louder than, say, a 4000 difference.

At first I thought this difference was the volume, but it is more
complicated than that.  For example, an 8000 difference sustained for 1/100
second gives one particular volume, whereas the same 8000 difference lasting
only 1/1000 second gives much less volume.  My point is that even though I
can find differences in the wav file of, say, 20000, they only exist for very
small amounts of time and create much less volume than a 20000 difference
would if it lasted longer.

Is there an algorithm to look at this and estimate the volume?

Thanks,

SA Dev


"SA Dev" <nospam38925@forme.com> wrote in message
news:403a02fe@news.tulsaconnect.com...
> I'm trying to make multiple wav streams the same volume...
>
> Can you calculate a 44.1k wav streams volume by looking at the signed
> integers?
>
> I've noticed that the difference between successive integers affects the
> volume the most, for example, if the first integer is 4000 and then next
> integer is -4000, then this is an 8000 difference.  This 8000 difference is
> louder than say a 4000 difference.
>
> At first, I thought, ok this difference is the volume, but it is more
> complicated than this.  For example, if an 8000 for 1/100 second is one
> particular volume, where the 8000 for 1/1000 second is much less volume.  My
> point is that even though I can find differences in the wav file of say
> 20000, they only exist for very small amounts of time and create much less
> volume than the 20000 difference would if it were longer.
>
> Is there an algorithm to look at this and estimate the volume?

For starters, look into RMS calculation.
When done, take a look at the ReplayGain algorithm, which is a more
advanced version taking psychoacoustics into account.

--
GCP

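A minimal sketch of the RMS calculation suggested above, assuming the stream has already been decoded into a list of 16-bit signed integers. The function names `rms` and `rms_dbfs` and the 32768 full-scale value are illustrative assumptions, not something given in the thread.

```python
import math

def rms(samples):
    """Root-mean-square level of a sequence of signed integer samples."""
    if not samples:
        return 0.0
    # Mean of the squares first, then the square root.
    mean_square = sum(s * s for s in samples) / len(samples)
    return math.sqrt(mean_square)

def rms_dbfs(samples, full_scale=32768.0):
    """RMS level in dB relative to full scale (16-bit audio is assumed)."""
    level = rms(samples)
    if level == 0.0:
        return float("-inf")
    return 20.0 * math.log10(level / full_scale)

# The two cases from the question: alternating +/-4000 and alternating +/-8000.
quiet = [4000, -4000] * 100
loud = [8000, -8000] * 100
print(rms(quiet), rms(loud))
print(rms_dbfs(quiet), rms_dbfs(loud))
```

Because every sample contributes to the mean, a large swing that lasts only a few samples raises the result far less than the same swing sustained over the whole block, which matches the observation in the original question.
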
Hi GCP,

> For starters, look into RMS calculation.
> When done, take a look at the ReplayGain algorithm, which is a more
> advanced version taking psychoacoustics into account.

Yes, I looked at both of those, but I have one question:

Is it the amplitude by itself, or the difference between one and the next
amplitudes that creates volume?  If I send just a +4000, I won't get a sound
unless the next sample is different right?

Thanks,

SA Dev

SA Dev wrote:
> Hi GCP,
>
> > For starters, look into RMS calculation.
> > When done, take a look at the ReplayGain algorithm, which is a more
> > advanced version taking psychoacoustics into account.
>
> Yes, I looked at both of those, but I have one question:
>
> Is it the amplitude by itself, or the difference between one and the next
> amplitudes that creates volume?  If I send just a +4000, I won't get a sound
> unless the next sample is different right?

Taking the difference between adjacent samples is a crude high-pass
filter.  It blocks DC perfectly, but it attenuates other low frequencies
too.  Since humans can't hear much below 20Hz, we cannot hear DC (0
Hertz).

If you are concerned about having a lot of DC in your waveforms, you can
subtract out the average value.  After you do that, you can calculate
the RMS.  That will be very closely related to the volume.

--
Jim Thomas
Principal Applications Engineer
Bittware, Inc
jthomas@bittware.com
http://www.bittware.com
(703) 779-7770
Failure is always an option

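The two options described above can be compared with a short sketch. It assumes a 44.1 kHz sample rate as in the thread; the helper names (`first_difference`, `remove_mean`, `sine`) and the 100 Hz test tone are illustrative choices only.

```python
import math

RATE = 44100  # samples per second, as in the thread

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0

def first_difference(samples):
    """y[n] = x[n] - x[n-1]: the crude high-pass filter."""
    return [b - a for a, b in zip(samples, samples[1:])]

def remove_mean(samples):
    """Subtract the average value so that only the varying part remains."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

def sine(freq_hz, amplitude, offset=0.0, seconds=1.0):
    n = int(RATE * seconds)
    return [offset + amplitude * math.sin(2 * math.pi * freq_hz * i / RATE)
            for i in range(n)]

dc_only = [4000] * RATE                     # constant +4000, nothing to hear
low_tone = sine(100, 4000)                  # 100 Hz tone, no offset
offset_tone = sine(100, 4000, offset=4000)  # same tone riding on a DC offset

# Both methods reduce pure DC to zero.
print(rms(dc_only), rms(first_difference(dc_only)), rms(remove_mean(dc_only)))
# The first difference also shrinks the 100 Hz tone drastically.
print(rms(low_tone), rms(first_difference(low_tone)))
# Mean removal takes the offset out and leaves the tone's own level.
print(rms(offset_tone), rms(remove_mean(offset_tone)))
```

The gain of the first difference at frequency f is 2·sin(π·f/fs), which at 100 Hz and 44.1 kHz is about 0.014, so the tone is attenuated by roughly a factor of 70. Subtracting the average removes only the DC part, which is why it is the better starting point before taking the RMS.
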
"SA Dev" <nospam38925@forme.com> wrote in message
news:403a49a5$1@news.tulsaconnect.com...
> Hi GCP,
>
> > For starters, look into RMS calculation.
> > When done, take a look at the ReplayGain algorithm, which is a more
> > advanced version taking psychoacoustics into account.
>
> Yes, I looked at both of those, but I have one question:
>
> Is it the amplitude by itself, or the difference between one and the next
> amplitudes that creates volume?  If I send just a +4000, I won't get a sound
> unless the next sample is different right?

You are noticing the effect that the human ear isn't equally sensitive to
all frequencies.  In fact, it has no sensitivity to DC.

In order to properly calculate the perceived volume, the frequency
sensitivity of the human ear must also be included.  This is what the
ReplayGain algorithm attempts to do.  But this gets tricky as the
sensitivity changes considerably with listening volume.

So in many cases, a "good enough" answer can be obtained by using a very
crude approximation of the ear response, for example eliminating DC by
subtracting the average level as suggested by Jim.  However, most "real"
audio waveforms have almost no DC component, so even this can often be
ignored in many cases.

In summary, just taking the RMS level will work quite well for most "real"
audio signals, but may fail with synthetic waveforms such as the DC signal
you described.

Hi Jon,

> for example eliminating DC by subtracting the average level as
> suggested by Jim.  However, most "real" audio waveforms have almost no DC
> component, so even this can often be ignored in many cases.
> In summary, just taking the RMS level will work quite well for most "real"
> audio signals, but may fail with synthetic waveforms such as the DC signal
> you described.

This is what I don't understand.  How would I subtract the DC part?  I have
an algorithm that is currently doing this:

Take 50ms worth of sample data (+32767 to -32767), square each one, add them
all together, take the square root.  That is the value for that 50ms.

How would I change this to deal with strange DC's?

Thanks,

SA Dev

Jim,

> Taking the difference between adjacent samples is a crude high-pass
> filter.  It blocks DC perfectly, but it attenuates other low frequencies
> too.  Since humans can't hear much below 20Hz, we cannot hear DC (0
> Hertz).

I got you.

> If you are concerned about having a lot of DC in your waveforms, you can
> subtract out the average value.  After you do that, you can calculate
> the RMS.  That will be very closely related to the volume.

How would I do this?  I've tried the RMS approach and it works pretty well,
except with some strange waveforms I am also working with.  They come out
much less loud, and I suspect it is because they aren't traditional
waveforms.  They are generated by software that is trying to emulate the
sound electronic components would make.  I can post a copy of the waveform
to my ftp site and provide a link if that would help.

Thanks,

SA Dev

"SA Dev" <nospam38925@forme.com> wrote in message
news:403a7ae3$1@news.tulsaconnect.com...
> Hi Jon,
>
> > for example eliminating DC by subtracting the average level as
> > suggested by Jim.  However, most "real" audio waveforms have almost no DC
> > component, so even this can often be ignored in many cases.
> > In summary, just taking the RMS level will work quite well for most "real"
> > audio signals, but may fail with synthetic waveforms such as the DC signal
> > you described.
>
> This is what I don't understand.  How would I subtract the DC part?  I have
> an algorithm that is currently doing this:
>
> Take 50ms worth of sample data (+32767 to -32767), square each one, add them
> all together, take the square root.  That is the value for that 50ms.
>
> How would I change this to deal with strange DC's?

Are you working with a file of audio or "real time" streaming audio?

If it is a file, first go through the entire file and add up all the
samples.  Then divide this by the total number of samples in the file.
This is the average value of the file.  Now subtract this from every sample
to remove the DC component.

If your file is really long, you can probably safely break it into smaller
pieces (e.g. 1 second) and just remove the DC from each piece separately.
Maybe even doing this on your 50ms blocks would work OK for your
application?

If you are working with real time audio, you will need to implement a high
pass filter, which is a bit more complex.

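For the real-time case, one common choice of high-pass filter is a one-pole "DC blocker". The sketch below is an assumption about what such a filter could look like, not something specified in the thread: the class name `DCBlocker`, the recurrence y[n] = x[n] - x[n-1] + r*y[n-1], and the value r = 0.995 are all illustrative, and r would need tuning for a real application.

```python
import math

RATE = 44100  # samples per second, as in the thread

class DCBlocker:
    """One-pole DC-blocking high-pass: y[n] = x[n] - x[n-1] + r * y[n-1]."""

    def __init__(self, r=0.995):
        # r is an assumed pole radius; closer to 1.0 lowers the cutoff
        # frequency but makes the filter take longer to settle.
        self.r = r
        self.prev_in = 0.0
        self.prev_out = 0.0

    def process(self, x):
        """Filter one sample and remember state for the next call."""
        y = x - self.prev_in + self.r * self.prev_out
        self.prev_in = x
        self.prev_out = y
        return y

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0

# A constant +4000 offset decays towards zero once the filter settles.
blocker = DCBlocker()
dc_out = [blocker.process(4000) for _ in range(RATE)]

# A 1 kHz tone riding on the same offset comes through with the offset removed.
blocker = DCBlocker()
tone = [4000 + 4000 * math.sin(2 * math.pi * 1000 * n / RATE) for n in range(RATE)]
tone_out = [blocker.process(x) for x in tone]

print(abs(dc_out[-1]))
# Compare levels over the second half, after the start-up transient has died away.
print(rms(tone[RATE // 2:]), rms(tone_out[RATE // 2:]))
```

The filter keeps only two numbers of state, so it can run sample by sample on a stream with no need to know the average in advance. With r = 0.995 at 44.1 kHz the corner sits in the tens of hertz, so audible content passes almost unchanged while a constant offset is removed.
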
Hi Jon,

First, thanks for the help, I really appreciate it!

> Are you working with a file of audio or "real time" streaming audio?

A real time stream, unfortunately.

> If you are working with real time audio, you will need to implement a high
> pass filter, which is a bit more complex.

Can you tell me how a high pass filter works or show me some weblinks to it?

Thanks,

SA Dev

Try a google search.  But in your case, I'd start with just doing the
average on your 50ms block.

"SA Dev" <nospam38925@forme.com> wrote in message
news:403a930e$1@news.tulsaconnect.com...
> Hi Jon,
>
> First, thanks for the help, I really appreciate it!
>
> > Are you working with a file of audio or "real time" streaming audio?
>
> A real time stream, unfortunately.
>
> > If you are working with real time audio, you will need to implement a high
> > pass filter, which is a bit more complex.
>
> Can you tell me how a high pass filter works or show me some weblinks to it?
>
> Thanks,
>
> SA Dev
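
The suggestion to start with the average on each 50ms block can be sketched as follows. This assumes 44.1 kHz mono samples arriving as any Python iterable; the name `block_levels` and the choice to drop a trailing partial block are illustrative. It also divides the sum of squares by the block length before taking the square root, the "mean" step of RMS that the 50ms algorithm described earlier in the thread leaves out.

```python
import math

RATE = 44100                 # samples per second, as in the thread
BLOCK = RATE * 50 // 1000    # 2205 samples in a 50 ms block

def block_levels(stream, block_size=BLOCK):
    """Yield the DC-removed RMS of each full block taken from a sample stream."""
    block = []
    for sample in stream:
        block.append(sample)
        if len(block) == block_size:
            # Average of the block is the estimate of its DC offset.
            mean = sum(block) / block_size
            # RMS of what is left after subtracting that offset.
            mean_square = sum((s - mean) ** 2 for s in block) / block_size
            yield math.sqrt(mean_square)
            block = []
    # Any trailing partial block is ignored in this sketch.

# A constant +4000 stream reads as silence in every block.
silent = list(block_levels([4000] * (2 * BLOCK)))

# A square wave between 0 and 8000 has the same level as one between -4000 and +4000.
shifted = list(block_levels([8000, 0] * BLOCK))

print(silent)
print(shifted)
```

Working block by block adds at most 50 ms of delay, which may be acceptable for volume matching. Content slower than the block length is treated as part of the offset, so very low frequencies are partly removed along with the DC.
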