DSPRelated.com
Forums

Android phone sample rate DSP puzzle

Started by Robert Scott February 8, 2016
I have a DSP puzzle that has to do with Android phones.  I  developed
an Android app that I sell to piano tuners.  It samples the microphone
audio at 44100 samples per second and performs typical frequency
analysis with an FFT, which is actually graphed for the user to see,
to assist in piano tuning.

The problem is that on a very few devices (actually only one model
that I know for sure: the LG G3) my customers report the FFT graph
exhibits a curious feature.  When a pure tone is presented to the
microphone near 4000 Hz, the FFT graph shows two peaks - one at the
desired frequency and the other at the mirror image location on the
other side of 4000 Hz.  So if the tone was at 3900 Hz, the FFT graph
would show peaks at 3900 and 4100.

When I first heard of this from my customers with LG G3 phones, I
thought to myself "I know exactly what is going on.  The phone is
initially sampling the mic at 8000 Hz, and then up-sampling to 44100
to deliver the requested data rate to my app.  Of course up-sampling
cannot hide the Nyquist aliasing caused by the initial 8000 Hz
sampling, hence the double peaks."
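
The hypothesis above can be tried numerically. The following is a minimal sketch, not the phone's actual signal chain: it assumes a 3900 Hz tone captured at 8000 samples per second and then resampled to 44100 by plain linear interpolation (the interpolation method is an assumption chosen for simplicity). Strictly speaking, the second peak in this model is an image left behind by the imperfect interpolator at 8000 - 3900 = 4100 Hz, rather than an alias created at capture time.

```python
import cmath
import math

FS_LOW = 8000     # hypothesized internal sample rate (Hz); an assumption
FS_OUT = 44100    # rate delivered to the app (Hz)
TONE = 3900       # test tone (Hz)


def low_rate_sample(n):
    """Sample n of the test tone as captured at FS_LOW."""
    return math.cos(2 * math.pi * TONE * n / FS_LOW)


def upsample_linear(num_out):
    """Resample the FS_LOW stream to FS_OUT by linear interpolation."""
    out = []
    for m in range(num_out):
        pos = m * FS_LOW / FS_OUT      # fractional index into the low-rate stream
        n = int(pos)
        frac = pos - n
        out.append((1 - frac) * low_rate_sample(n) + frac * low_rate_sample(n + 1))
    return out


def tone_amplitude(signal, freq, fs):
    """Amplitude of the component at `freq` Hz, computed as a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return 2 * abs(acc) / len(signal)


signal = upsample_linear(FS_OUT)       # one second, so integer-Hz bins are exact
amp_true = tone_amplitude(signal, 3900, FS_OUT)
amp_image = tone_amplitude(signal, 4100, FS_OUT)
print(f"3900 Hz: {amp_true:.3f}   4100 Hz: {amp_image:.3f}")
```

In this model both components come out with comparable amplitude, the image being only slightly weaker, which is the "double peak" picture the hypothesis predicts.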

So I bought a cheap LG G3 off of eBay so I could test this for myself.
And sure enough, there are two peaks mirrored around 4000 Hz.  But
when I started examining the raw time series data, it was not what I
had expected.  The raw data did not have the expected repetitions of 5
or 6 copies of each sample.  Of course they might have been doing
linear interpolation, but it did not look like that either.  But the
most challenging fact I noticed is that when the tone was very close
to 4000 Hz, the two peaks in the FFT were not of the same amplitude.
The "true" peak was generally somewhat higher in amplitude than the
"mirror image peak" - sometimes 2 to 3 times as much amplitude.  This
held true even when the tone passed from below to above the 4000 Hz
threshold.  But tones more than 50 Hz away from 4000 Hz had mirrored
peaks of roughly the same amplitude.  This phenomenon contradicts my
assumption of what was causing this problem.  My understanding is that
if you sample a 3990 Hz tone at 8000 samples per second, it will be
indistinguishable from 4010 Hz.  Therefore there should have been no
way I could tell which peak was the "real" one.
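
The indistinguishability claim is easy to check. A minimal sketch, assuming zero-phase cosine tones and the hypothesized 8000 samples per second rate (the helper name `sampled_tone` is made up for this illustration):

```python
import math

FS = 8000  # hypothesized internal sample rate (Hz)


def sampled_tone(freq, num_samples, fs=FS):
    """Cosine tone of `freq` Hz sampled at `fs` samples per second."""
    return [math.cos(2 * math.pi * freq * n / fs) for n in range(num_samples)]


below = sampled_tone(3990, 800)
above = sampled_tone(4010, 800)

# The two sample streams agree to within floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(below, above))
print(f"largest sample difference: {max_diff:.2e}")
```

Because 4010 = 8000 - 3990, the two tones produce the same samples, so nothing downstream of an 8000 Hz sampler could tell them apart.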

Therefore I am questioning my whole premise.  Perhaps the problem is
not that the phone is sampling at 8000 Hz and then up-sampling.  But
then comes the DSP puzzle.  Is there any other signal processing
operation you can think of that the LG G3 system might be doing in the
course of satisfying my request for microphone audio sampled at 44100
samples per second?  The mirroring is perfect, as far as I can tell.
As I gradually approach 4000 Hz with a tone generator, the two peaks
move closer together, always perfectly centered around 4000 Hz.  And
this does not happen on other Android phones or tablets.

Robert Scott
Hopkins, MN

On Tue, 09 Feb 2016 03:42:53 +0000, Robert Scott wrote:

> [original post quoted in full; snipped]

Perhaps my google-fu is weak, but I was unable to work out the exact
CODEC chip used in that phone.  Here is a typical device designed for
phones, though:

http://download.yamaha.com/file/47688

I'm going to commit a logical fallacy and assume the one in the G3 is
the same.  It has 8 kHz sampling at some parts of the signal chain
(which would explain the mirroring you're seeing).  The ADC is
delta-sigma, and this usually implies a brickwall lowpass filter
somewhere, which might explain the observed differences in amplitude
close to Fs/2.

Regards,
Allan

Upon closer examination of the computed spectrum on this Android
phone, the problem is a little different from what I described before.
Here is what the spectrum actually does.

When a tone generator is below about 3700 Hz, the spectrum displayed
on the phone shows just one peak at the desired frequency.  As the
frequency of the tone increases toward 4000 Hz, a very tiny mirror
image peak begins to appear on the other side of 4000 Hz.  It
gradually gains in amplitude until by 3958 Hz, the amplitude of the
image peak is actually higher than the correct peak.  As the tone goes
above 4000 Hz, the image peak appears below 4000 Hz, and gradually
decreases in amplitude as the tone frequency increases.  I ran the
tone frequency up to 4698 Hz and saw a single peak at 4698 Hz in the
spectrum and no image peak.  This entirely destroys my supposition
that this phone is initially sampling at 8000 Hz and then up-sampling
to 44100, because if it were, there would be no way to show a single
peak at 4698 Hz with no image peak, right?  I mean, the information
that discriminates between 4698 and 3302 is totally destroyed if the
audio is initially sampled at 8000 Hz.
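
The folding arithmetic behind that argument can be written down directly. A small sketch (the function name `apparent_frequency` is invented here) that maps a real tone to the frequency it would appear at after sampling:

```python
def apparent_frequency(freq_hz, fs_hz):
    """Frequency in [0, fs/2] at which a real tone at `freq_hz` appears
    after being sampled at `fs_hz` samples per second."""
    folded = freq_hz % fs_hz
    return min(folded, fs_hz - folded)


for tone in (3302, 3990, 4010, 4698):
    print(tone, "->", apparent_frequency(tone, 8000))
```

At 8000 samples per second, 4698 Hz and 3302 Hz both land on 3302 Hz, so a lone peak at 4698 Hz could not survive such a stage.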

But something is going on in the phone's audio system that introduces
this image around 4000 Hz.  Could it be some sort of heterodyning?  I
know in single sideband radio there are ways to invert the audio
spectrum if the detection carrier is set on the wrong side of the
signal.  But why would things return to normal for tones well away
from 4000 Hz?
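
To make the heterodyning idea concrete, here is a hedged sketch of what mixing would produce. It assumes, purely for illustration, that the audio is weakly multiplied by an 8000 Hz component; the carrier frequency and the leak depth of 0.2 are invented values, not anything measured on the phone.

```python
import cmath
import math

FS = 44100        # output sample rate (Hz)
CARRIER = 8000    # hypothetical interfering frequency (Hz); an assumption
LEAK = 0.2        # hypothetical modulation depth; an assumption


def modulated_tone(freq, num_samples):
    """Tone at `freq` Hz multiplied by (1 + LEAK * carrier)."""
    out = []
    for n in range(num_samples):
        t = n / FS
        tone = math.cos(2 * math.pi * freq * t)
        out.append(tone * (1 + LEAK * math.cos(2 * math.pi * CARRIER * t)))
    return out


def tone_amplitude(signal, freq):
    """Amplitude of the component at `freq` Hz, computed as a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq * n / FS)
              for n, x in enumerate(signal))
    return 2 * abs(acc) / len(signal)


signal = modulated_tone(3900, FS)      # one second of samples
amp_tone = tone_amplitude(signal, 3900)
amp_mirror = tone_amplitude(signal, 4100)
print(f"3900 Hz: {amp_tone:.3f}   4100 Hz: {amp_mirror:.3f}")
```

The product term puts a component at 8000 - 3900 = 4100 Hz with amplitude LEAK / 2, mirrored around 4000 Hz. Note that this simple model would give the same relative image at every tone frequency, so on its own it does not account for the image fading away from 4000 Hz.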

By the way, I have examined my tone generator with a commercial
spectrum analyzer and verified that it is not generating any image
frequencies.  They are definitely being generated in the phone.

Robert Scott
Hopkins, MN

On Monday, February 8, 2016 at 10:42:57 PM UTC-5, Robert Scott wrote:

> [original post quoted in full; snipped]

Just curious, which microphone is used by your app?  The G3 has two
built-in mics, one on top and one at the bottom.  The actual voice
input is the result of some DSP processing between the two inputs to
achieve better noise cancellation.

I'm not familiar enough with the Android API, but I doubt it allows you
access to the raw sampled signals from each of those mics
separately... which means you are out of luck.

P.S. Why use an FFT for this?

Robert Scott <no-one@notreal.invalid> wrote:

> Upon closer examination of the computed spectrum on this Android
> phone, the problem is a little different from what I described before.
> Here is what the spectrum actually does.
>
> When a tone generator is below about 3700 Hz, the spectrum displayed
> on the phone shows just one peak at the desired frequency.  As the
> frequency of the tone increases toward 4000 Hz, a very tiny mirror
> image peak begins to appear on the other side of 4000 Hz.  It
> gradually gains in amplitude until by 3958 Hz, the amplitude of the
> image peak is actually higher than the correct peak.  As the tone goes
> above 4000 Hz, the image peak appears below 4000 Hz, and gradually
> decreases in amplitude as the tone frequency increases.  I ran the
> tone frequency up to 4698 Hz and saw a single peak at 4698 Hz in the
> spectrum and no image peak.  This entirely destroys my supposition
> that this phone is initially sampling at 8000 Hz and then up-sampling
> to 44100, because if it were, there would be no way to show a single
> peak at 4698 Hz with no image peak, right?

Right.

> I mean, the information
> that discriminates between 4698 and 3302 is totally destroyed if the
> audio is initially sampled at 8000 Hz.
>
> But something is going on in the phone's audio system that introduces
> this image around 4000 Hz.  Could it be some sort of heterodyning?

Exactly what I was thinking.

Steve

On Tue, 9 Feb 2016 08:56:18 -0800 (PST), angrydude
<simfidude@gmail.com> wrote:

> Just curious, which microphone is used by your app ? Because G3 has 2
> built-in mics - one on top and one at the bottom. The actual voice
> input is a result of some DSP processing between 2 inputs to achieve
> better noise cancellation.
> Not familiar enough with Android API, but I doubt it allows you access
> to raw sampled signals from each of those mics separately... which
> means you are out of luck

My app is a generic Android app for all sorts of phones and tablets.
It does nothing special in the way of microphone selection.  We get
whatever the OS gives us as a default audio input stream.  I may just
have to declare the LG G3 unsuitable for my app.

> P.S Why use FFT for this ???

It turns out to be quite useful, since there are times in piano tuning
when more than one tone is present at once.  In that case any
single-tone detection algorithm is going to get confused.  We also use
quadrature phase detection for single-tone fine tuning, like a strobe
tuner, so there are two kinds of displays on the screen at once.

Robert Scott
Hopkins, MN

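
For readers unfamiliar with quadrature phase detection, the following is a generic sketch of the idea, not the app's actual algorithm. It assumes a single clean tone, a 440 Hz reference, and two consecutive 0.1 second blocks; all names and parameter values are illustrative. The tone is correlated against a complex reference oscillator, and the phase drift between blocks gives the frequency offset, much as a strobe pattern drifts.

```python
import cmath
import math

FS = 44100  # sample rate (Hz)


def block_phase(samples, start, length, f_ref):
    """Phase of samples[start:start+length] relative to a reference
    oscillator at `f_ref` Hz."""
    acc = 0j
    for n in range(start, start + length):
        # Multiplying by exp(-j*w*n) forms the in-phase (cosine) and
        # quadrature (sine) products in a single complex operation.
        acc += samples[n] * cmath.exp(-2j * math.pi * f_ref * n / FS)
    return cmath.phase(acc)


def frequency_offset(samples, f_ref, block_len):
    """Estimate (tone frequency - f_ref) in Hz from the phase drift
    between two consecutive blocks."""
    phi0 = block_phase(samples, 0, block_len, f_ref)
    phi1 = block_phase(samples, block_len, block_len, f_ref)
    drift = (phi1 - phi0 + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    return drift * FS / (2 * math.pi * block_len)


# A tone sounding 0.5 Hz sharp of the 440 Hz reference.
tone = [math.cos(2 * math.pi * 440.5 * n / FS) for n in range(2 * 4410)]
offset = frequency_offset(tone, 440.0, 4410)
print(f"estimated offset: {offset:+.3f} Hz")
```

This resolves offsets far smaller than the FFT bin spacing, but it presumes one dominant tone near the reference, which is why an FFT display remains useful when several tones sound at once.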
to the OP..

could it be aliasing of harmonics?

does the effect happen at any other frequencies?

does the effect change, if you lower or raise the  level of the input tone?

M

On 2/9/2016 11:52 AM, Robert Scott wrote:
> [previous post quoted in full; snipped]

So by the time your tone reaches 4698 Hz, the alias tone is gone.  What
happens at higher frequencies?  Can you get tones into the phone all
the way up to 20 kHz?  Are there any other ranges that produce aliases,
such as near 8 kHz or 12 kHz?

--
Rick

On Tue, 9 Feb 2016 16:32:57 -0500, rickman <gnuarm@gmail.com> wrote:

> So by the time your tone reaches 4698 Hz, the alias tone is gone.  What
> happens at higher frequencies?  Can you get tones into the phone all the
> way up to 20 kHz?  Any other ranges that produce aliases, such as near 8
> kHz, or 12 kHz?

Good guess.  In fact there is aliasing around 8 kHz too.  (I couldn't
test 12 kHz.)  The mirror image aliases are somewhat lower in
amplitude though - never more than 20% of the main peak.  But I have
new evidence:

I have been able to verify that this phenomenon is not just in my
code.  I adjusted the tone generator for 3945 Hz, which gave a nice
alias at 4055 Hz in my app.  Then I closed my app and opened the
built-in Voice Recorder app and recorded that pure 3945 Hz tone.  When
I played it back it had that unmistakable nasty sound of two tones only
110 Hz apart.  So this problem is happening with the standard Voice
Recorder app too!

I guess there is nothing more I can do about it then.

-Robert Scott
Hopkins, MN

On 2/9/2016 6:23 PM, Robert Scott wrote:
> [previous post quoted in full; snipped]

I just realized that if your signal above 4 kHz gets into the result
that your program sees, the sample rate can never be 8 kHz at any
point.  If that happened, the tone would simply be in the 0 to 4 kHz
range, end of story.  The process of upsampling and filtering might
leave images above 4 kHz, but you would never see the dominant tone
there.  They would simply be aliases of the original tone and should be
reduced in amplitude by the filtering.

However... if the original sample rate were 48 kHz and the 44.1 kHz
sample rate your program is requesting were produced by sample rate
conversion, the artifacts might come from a poor design.  I would have
to think about that some more.  Usually sample rate conversion is done
by upconverting to a very high sample rate that is the least common
multiple of the two and then downconverting.  There are a lot of zeros
in the arithmetic, so the calculations are not done in the direct
manner, but rather with computationally more efficient equivalents.
If not done correctly there might be some aliasing going on, but this
is just a thought and a very incomplete one.

--
Rick
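
The least-common-multiple scheme described above reduces to simple integer arithmetic. A minimal sketch, assuming the hypothetical 48 kHz native rate mentioned in the post (the function name `resampling_plan` is invented for this example):

```python
import math


def resampling_plan(fs_in, fs_out):
    """Return (up, down, intermediate_rate) for rational sample rate
    conversion from `fs_in` to `fs_out` (both in Hz)."""
    g = math.gcd(fs_in, fs_out)
    up = fs_out // g       # zero-stuff: insert up - 1 zeros after each input sample
    down = fs_in // g      # after lowpass filtering, keep every down-th sample
    return up, down, fs_in * up


up, down, intermediate = resampling_plan(48000, 44100)
print(f"upsample by {up}, downsample by {down}, intermediate rate {intermediate} Hz")
```

For 48000 to 44100 this gives a ratio of 147/160 through an intermediate rate of 7,056,000 Hz. The lowpass filter between the two steps has to cut off below the lower of the two Nyquist frequencies (22.05 kHz here); if it is too short or badly placed, images and aliases leak through.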