DSPRelated.com
Forums

counterphase detection in stereo audio

Started by kork February 4, 2010
> Then compare the relative strengths of the L+R and L-R channels -- normally L-R should be significantly smaller than L+R. In fact, this is why the 'wrong' way is a broadcast-killer -- the FM stereo broadcast protocol depends on this property, won't work without it, etc.

There is nothing fundamental in the FM stereo protocol that depends on L+R being greater than L-R. The "wrong" (incorrect phasing) way will work fine on a stereo receiver, but it will not work fine on a mono receiver. It's just that most times you also want to be heard well by folks listening in mono.

Mark
On Fri, 05 Feb 2010 10:08:21 -0500, Jerry Avins wrote:

>> kork wrote:
>> Hi folks,
>>
>> I'm going to develop a quality control application that inspects recently imported audio files for a number of checks. One of them is the detection of counterphase fragments in the file. By counterphase I mean a 180-degree (or pi rad, if you prefer) phase shift between the two audio channels in the (stereo) file. In a radio broadcast of the file this is a killer when it is listened to through a mono receiver.
>>
>> I was thinking of subtracting one channel from the other (or reversing a channel and adding it to the other), then flagging the audio fragments as counterphase when the resulting signal differs a lot from zero during a certain amount of time. But since it is likely that the 2 channels are anything but equal, I may never get to see a flatline.
>>
>> I thought maybe you DSP guys can give me some insights on this? Maybe there's a test in the frequency domain I can think of?
>>
>> Vladimir Vassilevsky wrote:
>> Compute (L+R) and (L-R), rectify, accumulate, compare. It is very obvious if the stereo channels are in phase or out of phase.
>>
>> Vladimir Vassilevsky
>> DSP and Mixed Signal Design Consultant http://www.abvolt.com
>>
>> On Thu, 04 Feb 2010 09:57:49 -0600, kork wrote:
>> Hi Vladimir,
>>
>> Thanks for your answer. Would you mind elaborating a bit on the "rectify" and "accumulate" suggestions? They're not so obvious terms for me in this domain. Thanks again.
>>
>> Tim Wescott wrote:
>> "Rectify": take the absolute value.
>>
>> "Accumulate": sum up a bunch of samples.
>>
>> Then compare the relative strengths of the L+R and L-R channels -- normally L-R should be significantly smaller than L+R. In fact, this is why the 'wrong' way is a broadcast-killer -- the FM stereo broadcast protocol depends on this property, won't work without it, etc.
>>
>> I'll charge you money for answers, too, but only if the question takes more than a few lines to answer.
>>
>> Jerry Avins wrote:
>> The accumulation should be lossy; i.e., include a "forgetting factor". Alternatively, you could dump the result after a suitable time and start over.
>>
>> kork wrote:
>> Thanks Tim and Jerry,
>>
>> I appreciate the jargon explanation. This sounds pretty straightforward to implement. I'll have a go at it.
>>
>> Jerry, your "forgetting factor" sounds logical. I was thinking of just testing separate successive chunks of samples, so I won't have any "memory-effect".
>
> That will require counting and branching. Forgetting is actually simpler. The convention is that x[n] is the input and y[n] is the output. Set y[n+1] = (1-a)*y[n] + a*x[n+1]. For stability, 0 < a < 1. Larger values forget faster. This is called an exponential averager.
Forgetting takes fewer lines of more direct code, but has some mathematical subtleties that can trip up a naive maintainer. For testing, I'd be tempted to accumulate-and-dump just to raise the probability that a non-DSP maintainer who comes after me would be able to understand the functioning of the code.

--
www.wescottdesign.com
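For readers who want to see the accumulate-and-dump variant spelled out, here is a minimal sketch in Python. The function name `block_phase_check`, the block size, and the test tone are illustrative assumptions, not anything posted in the thread; it only shows the rectify, accumulate, and dump steps on successive non-overlapping chunks.

```python
import math


def block_phase_check(left, right, block_size):
    """Rectify and accumulate (L+R) and (L-R) over non-overlapping blocks.

    Returns one (sum_abs_mid, sum_abs_side) pair per complete block.
    The name and signature are hypothetical, chosen for illustration.
    """
    results = []
    acc_mid = 0.0   # sum of |L+R| in the current block
    acc_side = 0.0  # sum of |L-R| in the current block
    count = 0
    for l, r in zip(left, right):
        acc_mid += abs(l + r)    # "rectify" = absolute value
        acc_side += abs(l - r)
        count += 1
        if count == block_size:  # "dump": report the block and start over
            results.append((acc_mid, acc_side))
            acc_mid = acc_side = 0.0
            count = 0
    return results


# Illustrative input: the same tone on both channels, then one channel inverted.
tone = [math.sin(2 * math.pi * 5 * n / 100) for n in range(200)]
in_phase = block_phase_check(tone, tone, 100)
out_of_phase = block_phase_check(tone, [-s for s in tone], 100)
```

With identical channels the (L-R) accumulator stays at zero; with one channel inverted the (L+R) accumulator stays at zero instead, which is the condition the check is meant to catch.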
*DT* defined - DIVERGING from 'topic'


Jerry Avins wrote:
(snip)
> That will require counting and branching. Forgetting is actually simpler. The convention is that x[n] is the input and y[n] is the output. Set y[n+1] = (1-a)*y[n] + a*x[n+1]. For stability, 0 < a < 1. Larger values forget faster. This is called an exponential averager.
>
> Jerry
Why is this called an "exponential averager"? I have heard of "boxcar" and "running" averages. What is/are the difference(s)? What other averagers exist? The OP apparently says he is looking at *NON*overlapping chunks of data. Is there not an *INTRINSIC* forgetting factor?
Richard Owlett wrote:
(snip)
> Why is this called an "exponential averager"?
With this averager, the past decays away exponentially. Seen as an IIR filter, it is a first-order low-pass with a time constant easily related to a.
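A small Python sketch may make the exponential decay concrete. It implements the recurrence y[n+1] = (1-a)*y[n] + a*x[n+1] quoted above and feeds it a unit impulse; the function name `exponential_average`, the value a = 0.25, and the starting state of zero are assumptions chosen for illustration.

```python
import math


def exponential_average(x, a, y0=0.0):
    """Apply y[n+1] = (1 - a) * y[n] + a * x[n+1] to a sequence.

    `y0` is the assumed initial state; the name is illustrative only.
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must satisfy 0 < a < 1")
    y = y0
    out = []
    for sample in x:
        y = (1.0 - a) * y + a * sample
        out.append(y)
    return out


a = 0.25
impulse = [1.0] + [0.0] * 5
response = exponential_average(impulse, a)

# The impulse response is a * (1 - a)**k, a geometric (exponential) decay.
expected = [a * (1.0 - a) ** k for k in range(len(impulse))]

# Writing (1 - a)**k as exp(-k / tau) gives the time constant in samples.
tau = -1.0 / math.log(1.0 - a)
```

Each past input is weighted by a power of (1-a), so its influence shrinks by the same factor at every step; that is the sense in which the averager "forgets" exponentially.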
> I have heard of "boxcar" and "running" averages. > What is/are the difference(s)?
A boxcar averager reports the average of the last N samples: N is your choice. A running average is the average of all past samples. Not quite trivial to implement.
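The two averagers described here can be sketched in a few lines of Python. Both function names are hypothetical, and two choices are assumptions rather than anything stated in the thread: the boxcar averages over however many samples it has seen until its window fills, and the running average uses an incremental update so that no history needs to be stored.

```python
from collections import deque


def boxcar_average(x, n):
    """Average of the last n samples (fewer while the window is filling)."""
    window = deque(maxlen=n)
    total = 0.0
    out = []
    for sample in x:
        if len(window) == n:
            total -= window[0]   # oldest sample is about to fall out
        window.append(sample)    # deque drops the oldest automatically
        total += sample
        out.append(total / len(window))
    return out


def running_average(x):
    """Average of all samples seen so far, updated incrementally."""
    mean = 0.0
    out = []
    for k, sample in enumerate(x, start=1):
        mean += (sample - mean) / k
        out.append(mean)
    return out


data = [1.0, 2.0, 3.0, 4.0, 5.0]
boxcar = boxcar_average(data, 2)
running = running_average(data)
```

The boxcar forgets a sample completely once it leaves the window, while the running average never forgets anything; the exponential averager sits between the two.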
> What other averagers exist?
As many more as you might have reason to invent.
> The OP apparently says he is looking at *NON*overlapping chunks of data. Is there not an *INTRINSIC* forgetting factor?

Past blocks are forgotten, but I call that a process. The forgetting *factor* in the exponential averager is 1-a, applied to each new result.

Jerry
--
Engineering is the art of making what you want from things you can get.
Mark <makolber@yahoo.com> wrote:
 
>> Then compare the relative strengths of the L+R and L-R channels -- normally L-R should be significantly smaller than L+R. In fact, this is why the 'wrong' way is a broadcast-killer -- the FM stereo broadcast protocol depends on this property, won't work without it, etc.

> There is nothing fundamental in the FM stereo protocol that depends on L+R being greater than L-R.

I think that is true, but it isn't easy to figure out. The spectrum of an FM signal is a very complicated function of the modulating signal. The all L-R case will have all the power in the subcarrier 38kHz +/-15kHz range. If in addition you had an unusual amount of power in the higher audio range it might be that you could get outside the FCC limits. (I don't know how the FCC specifies the limits for FM transmitters.)
> The "wrong" (incorrect phasing) way will work fine on a stereo > receiver, but it will not work fine on a mono receiver.
It is convenient that as (L-R) increases (L+R) decreases such that the sum doesn't get too large. (Assuming L and R stay in range.) -- glen
glen herrmannsfeldt wrote:
> Mark <makolber@yahoo.com> wrote:
(snip)
>> There is nothing fundamental in the FM stereo protocol that depends on L+R being greater than L-R.
>
> I think that is true, but it isn't easy to figure out. The spectrum of an FM signal is a very complicated function of the modulating signal. The all L-R case will have all the power in the subcarrier 38kHz +/-15kHz range. If in addition you had an unusual amount of power in the higher audio range it might be that you could get outside the FCC limits. (I don't know how the FCC specifies the limits for FM transmitters.)
I don't think that the RF spectrum is relevant. Compatible FM stereo consists of L+R in the main band where a mono detector will reproduce it, and L-R multiplexed in a way that can, for this discussion, remain mysterious. A stereo receiver combines (L+R) and (L-R) to produce L and R. The problem addressed here occurs at the transmitter. If one of the channels [L, R] is inverted before the modulator gets it, the main FM channel will consist of L-R. An announcer speaking into a single mic that feeds both channels might as well have stayed home. *All* of his voice will be in the (L-R) channel that a mono receiver doesn't see.
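The announcer example can be checked numerically with a short Python sketch. It assumes, purely for illustration, that a mono receiver reproduces 0.5*(L+R); the scaling factor, the helper name `mono_sum`, and the test tone standing in for a voice are not from the thread.

```python
import math


def mono_sum(left, right):
    """What a mono receiver would reproduce, assuming it outputs 0.5 * (L + R)."""
    return [0.5 * (l + r) for l, r in zip(left, right)]


# A single mic feeding both channels: L and R carry the same signal.
voice = [math.sin(2 * math.pi * 3 * n / 64) for n in range(64)]

correct = mono_sum(voice, voice)                 # both channels in phase
inverted = mono_sum(voice, [-s for s in voice])  # one channel inverted upstream
```

In the in-phase case the mono output equals the original voice; with one channel inverted it is zero at every sample, so the mono listener hears nothing.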
>> The "wrong" (incorrect phasing) way will work fine on a stereo >> receiver, but it will not work fine on a mono receiver. > > It is convenient that as (L-R) increases (L+R) decreases > such that the sum doesn't get too large. (Assuming L and R > stay in range.)
Jerry
--
Engineering is the art of making what you want from things you can get.
Jerry Avins <jya@ieee.org> wrote:
(snip)
 
> I don't think that the RF spectrum is relevant. Compatible FM stereo consists of L+R in the main band where a mono detector will reproduce it, and L-R multiplexed in a way that can, for this discussion, remain mysterious. A stereo receiver combines (L+R) and (L-R) to produce L and R. The problem addressed here occurs at the transmitter. If one of the channels [L, R] is inverted before the modulator gets it, the main FM channel will consist of L-R. An announcer speaking into a single mic that feeds both channels might as well have stayed home. *All* of his voice will be in the (L-R) channel that a mono receiver doesn't see.
Yes. But with all the power in the subcarrier, that is, above 19kHz, I think that changes the spectrum of the transmitted signal. With 75kHz deviation and a large amount of 38kHz signal, you might get a significant amount outside the 200kHz wide band.

As far as I understand it, assumptions were made that, on average, most of the power isn't that high. The only way to know would be to read the FCC rules in detail. With highly compressed rock music, the amplitude could be pretty high. If, in addition, the signal had a large component close to 15kHz (maybe there are FCC rules on that, too), that makes it even worse.
>>> The "wrong" (incorrect phasing) way will work fine on a stereo >>> receiver, but it will not work fine on a mono receiver.
>> It is convenient that as (L-R) increases (L+R) decreases >> such that the sum doesn't get too large. (Assuming L and R >> stay in range.)
-- glen
In article <hki7js$cav$9@naig.caltech.edu>, gah@ugcs.caltech.edu says...
(snip)
>Yes. But with all the power in the subcarrier, that is, above 19kHz, I think that changes the spectrum of the transmitted signal. With 75kHz deviation and a large amount of 38kHz signal, you might get a significant amount outside the 200kHz wide band.
>
>(snip)
>The only way to know would be to read the FCC rules in detail.
The FCC Rules limit the peak frequency deviation of the FM carrier to +/- 75 kHz. It turns out that the peak modulation of the FM carrier is the larger of the left or right input signals due to an interesting and slightly non-intuitive property of the FM stereo multiplex signal called "interleaving." You can prove it by proving the equivalence of (1) generating the multiplex signal by summing L+R and double sideband suppressed-carrier amplitude-modulated L-R and (2) alternately sampling the L and R signals at the stereo subcarrier rate (38 kHz). (The proof just requires some trig identities.)

Without the 19 kHz pilot tone, the peak modulation produced by a pure L+R signal is the same as that produced by a pure L-R signal -- flipping the polarity of one channel does not change the peak modulation at all.

The presence of the 19 kHz pilot tone, which is phase-locked to the 38 kHz subcarrier, slightly breaks the interleaving rule. It turns out that in the presence of a pilot tone at 9% modulation, a pure L+R signal modulates the FM carrier 2.7% higher than a pure L or pure R signal with the same content.
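The no-pilot part of this claim can be checked numerically. The sketch below assumes a composite of the form 0.5*(L+R) + 0.5*(L-R)*sin(2*pi*38000*t); that normalization, the helper name `composite_no_pilot`, the simulation rate, and the random test samples are all assumptions for illustration. Rewriting the expression as L*(1+s)/2 + R*(1-s)/2 with s = sin(...) shows it is a weighted average of L and R, so its magnitude should never exceed the larger of |L| and |R|. The pilot-tone figure of 2.7% is not modelled here.

```python
import math
import random

SUBCARRIER_HZ = 38_000.0


def composite_no_pilot(left, right, t):
    """Assumed multiplex value at time t: (L+R) plus DSB-SC modulated (L-R), no pilot."""
    s = math.sin(2.0 * math.pi * SUBCARRIER_HZ * t)
    return 0.5 * (left + right) + 0.5 * (left - right) * s


rng = random.Random(0)   # fixed seed so the run is repeatable
fs = 1_000_000.0         # hypothetical simulation rate in Hz

# Largest amount by which |composite| exceeds max(|L|, |R|) over the run.
worst_excess = 0.0
for n in range(10_000):
    l = rng.uniform(-1.0, 1.0)
    r = rng.uniform(-1.0, 1.0)
    m = composite_no_pilot(l, r, n / fs)
    worst_excess = max(worst_excess, abs(m) - max(abs(l), abs(r)))

# Peak for pure L+R (L = R = 1) and pure L-R (L = 1, R = -1),
# evaluated a quarter of a subcarrier cycle in, where the sine peaks.
t_quarter = 1.0 / (4.0 * SUBCARRIER_HZ)
peak_sum = abs(composite_no_pilot(1.0, 1.0, t_quarter))
peak_diff = abs(composite_no_pilot(1.0, -1.0, t_quarter))
```

Under these assumptions the composite never overshoots the larger channel, and the pure L+R and pure L-R cases reach the same peak, which matches the statement that flipping one channel's polarity leaves the peak modulation unchanged when there is no pilot.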
Robert Orban wrote:

   ...

That's an interesting detail that I didn't go into because it isn't related to the gist of this discussion. Briefly, a mono receiver is sensitive to the FM portion of the stereo signal, normally L + R. If that should by inadvertence be L - R, a mono signal fed to the transmitter will be lost.

Jerry
--
Engineering is the art of making what you want from things you can get.
(snip)
>"Rectify": take the absolute value.
>
>"Accumulate": sum up a bunch of samples.
>
>Then compare the relative strengths of the L+R and L-R channels -- normally L-R should be significantly smaller than L+R.
>
>--
>www.wescottdesign.com
An implementation of this has been running successfully for the past 15 months, so thanks a lot for the help.

One nice anecdote: when an early version of the application checked live recordings from churches or big halls (especially with organs, flutes or sopranos), it gave a lot of false hits. Apparently these recordings, made with a couple of microphones hanging high near the ceiling, show little difference between the accumulations of L+R and L-R. I checked for [L-R] > [L+R] when I should have checked for [L-R] > [L+R]*n, where 2 <= n <= 10.

Hope this helps others. Thanks again.

Rob.
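The effect of the factor n can be illustrated with a short Python sketch. It is not Rob's implementation: the function name `is_counterphase`, the default n = 2, and the use of two independent noise sequences as a rough stand-in for a diffuse hall recording are all assumptions.

```python
import random


def is_counterphase(left, right, n=2.0):
    """Flag a fragment when accumulated |L-R| exceeds n times accumulated |L+R|."""
    acc_sum = sum(abs(l + r) for l, r in zip(left, right))
    acc_diff = sum(abs(l - r) for l, r in zip(left, right))
    return acc_diff > n * acc_sum


rng = random.Random(1)  # fixed seed so the run is repeatable

# Assumed stand-in for a diffuse recording: two uncorrelated noise channels.
hall_left = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
hall_right = [rng.gauss(0.0, 1.0) for _ in range(20_000)]

ratio = (sum(abs(l - r) for l, r in zip(hall_left, hall_right))
         / sum(abs(l + r) for l, r in zip(hall_left, hall_right)))

hall_flagged = is_counterphase(hall_left, hall_right, n=2.0)
true_counterphase = is_counterphase(hall_left, [-s for s in hall_left], n=2.0)
```

For uncorrelated channels the two accumulations come out nearly equal, so a plain [L-R] > [L+R] test is close to a coin toss, while requiring a margin of n leaves such material unflagged and still catches a genuinely inverted channel.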