Synchronizing sampling rates between a device and an audio output
Hi,
I am collecting IQ samples from a software-defined radio and decimating them to a rate of 192 kHz. I then pass the buffers to an audio component (part of a library whose source code I have no access to), which also has a sampling rate of 192 kHz. I can then use another spectrum scope that does demodulation to hear the resulting sound.
If I set the prefill buffer on the audio component to 5 seconds, for example, then I get 5 seconds of perfect sound after the buffer is filled. However, as time goes on, I start hearing pops and dropouts. So the sample rates are not the same. They may be off by just a bit, but it adds up.
What are some techniques to circumvent this? Thanks
What you are describing is a pretty common problem. How you go about handling it depends on the system.
First off, which SDR stack is it? GNU Radio? If so, you would be better off asking the question on that forum.
<snip>
Send Discuss-gnuradio mailing list submissions to
discuss-gnuradio@gnu.org
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
or, via email, send a message with subject or body 'help' to
discuss-gnuradio-request@gnu.org
<snip>
Usually there is only one block in the system with a fixed rate. The rest of the blocks will match rates to it. If both are truly fixed rate then there needs to be a rate matching block.
I've done pretty much the same thing in hardware a few times, where I didn't have control of either the source or sink rate and was dealing with a band-limited signal. The answer is a Farrow filter. The sourced samples are put into a tapped delay chain that feeds the interpolating filter. The source rate is also the enabled "domain" that drives the control logic. The key component is an NCO that is steered so that its rollover rate matches the sink rate. The best way to run the NCO is with a PLL guided by the sink enable pulses. Another way is to filter the number of audio samples in a FIFO and steer toward the center (half-full), but that method does tend toward limit cycles. The PLL is best. The NCO's rollover triggers a sample from the filter, which is put into a FIFO. The phase of the NCO at rollover is taken as ALPHA, which is used to interpolate between points. The output of the FIFO goes into the sink at a fixed rate.
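For readers who want to see the idea in software, here is a minimal sketch of a Farrow-style fractional resampler. It is not the hardware design described above: it assumes a cubic Lagrange interpolator (one common choice for a Farrow structure) and a fixed rate ratio, where a real system would steer the ratio from a PLL or FIFO level. The names `farrow_cubic` and `resample` are made up for illustration.

```python
def farrow_cubic(x0, x1, x2, x3, mu):
    """Cubic Lagrange interpolation between x1 and x2 for 0 <= mu < 1.

    The four polynomial coefficients depend only on the samples, and the
    fractional delay mu is applied with Horner's rule, which is the
    essence of the Farrow structure.
    """
    c0 = x1
    c1 = -x0 / 3.0 - x1 / 2.0 + x2 - x3 / 6.0
    c2 = (x0 + x2) / 2.0 - x1
    c3 = (x3 - x0) / 6.0 + (x1 - x2) / 2.0
    return ((c3 * mu + c2) * mu + c1) * mu + c0


def resample(samples, ratio):
    """Resample a list of floats by ratio = output_rate / input_rate.

    mu plays the role of the NCO phase: it advances by 1/ratio input
    samples per output sample, and each whole-number rollover moves the
    read index forward through the delay line.
    """
    step = 1.0 / ratio
    out = []
    i, mu = 1, 0.0  # start at samples[1] so that samples[i - 1] exists
    while i + 2 < len(samples):
        out.append(farrow_cubic(samples[i - 1], samples[i],
                                samples[i + 1], samples[i + 2], mu))
        mu += step
        whole = int(mu)  # number of rollovers in this step
        i += whole
        mu -= whole
    return out
```

For a clock mismatch of a couple of hundred ppm, `ratio` would be something like 1.0002 or 0.9998, and it would be updated slowly by whatever loop tracks the sink.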
Hope any of this helps.
Mark Napier
This is my software so it's up to me to code this...
The fact that you "get 5 seconds of perfect sound" from the buffered signal demonstrates precisely that the sample rates are well-matched. As your intuition suggested, what you hear after those 5 seconds could be "dropouts", i.e. the system can't handle real-time operation at that rate. The effect of different sampling rates would sound different, you would never get any periods of "perfect sound". You will probably need to run some profiling, optimize your signal chain somewhere. It's pretty clear that the producer (software radio) is not able to feed the consumer (the audio component) with a continuous stream at 192 kHz, or there's some other bottleneck in-between.
You can use an adjustable-rate interpolation function along with clock tracking (a PLL) to keep sync. As Mark mentioned, a Farrow filter can be used. Otherwise, you could use some sort of sample drop/add logic, but this will not be all that dissimilar to what you already have happening, although it will be more controlled and predictable. It is likely that with this method you will have more frequent, but smaller, disturbances in your signal than you have now. This may or may not be better, depending on the rest of your constraints.
You are correct. In fact, there are several seconds in addition to the 5 where the sound is perfect so I do believe that it is an underrun situation.
Fred Harris would advocate a poly-phase interpolating filter, but that too is steered by the NCO.
Who defines the rate at which samples are 'picked up' (the 192kHz) - is this defined by the audio interface? Who defines the input rate (the high rate)? Is it the software defined radio?
If so, you need to somehow match the rates as you suspect. A simple method would be to count the samples demanded by your audio routine and the samples provided by the radio and compare the numbers - after a while you will notice a difference - a ratio of say 0.9998 or something - this would tell you that you need to insert an extra sample every 5000 samples (5000*0.9998=4999) - you could choose to insert one sample twice for example. This is a simple method that can work well if your modulation is digital. In an analog environment this can be audible (for narrow band signals since the error tends to be wide-band).
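As a rough illustration of the counting approach above, the sketch below turns the two sample counts into a correction interval and then repeats (or drops) one sample at that interval. The function names are hypothetical, and repeating the previous sample is just the simplest possible choice.

```python
def correction_interval(supplied, demanded):
    """Input samples between one-sample corrections.

    Positive: repeat one sample every that many input samples.
    Negative: drop one sample every that many input samples.
    Zero: the counts match and no correction is needed.
    """
    diff = demanded - supplied
    if diff == 0:
        return 0
    return round(supplied / diff)


def stuff_or_drop(samples, interval):
    """Apply the correction to a block of samples."""
    out = []
    for n, s in enumerate(samples, start=1):
        if interval < 0 and n % -interval == 0:
            continue  # source is fast: drop this sample
        out.append(s)
        if interval > 0 and n % interval == 0:
            out.append(s)  # source is slow: repeat this sample
    return out
```

With 9998 samples supplied for every 10000 demanded (the 0.9998 ratio in the post), `correction_interval` gives 4999, so every 4999 input samples become 5000 output samples.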
The "perfect" method would be to perform a sample rate conversion with ratio 0.9998, but that's a lot of calculations.
The input rate is defined by the radio and the output by the Windows operating system. I think the complicated part here is to calculate how much the rate is off. It does not appear that the audio component can provide any info so I guess I would have to use the system clock.
I think Windows aligns itself to the audio interface clock... And no, you don't have to use a timer at all. Every time you receive N samples from the radio, you count the _N samples that have been picked up by Windows.
Then add N to an integrator (accumulator) accN and add _N to acc_N:
accN += N;
acc_N += _N;
After a few seconds, accN and acc_N will be different, and the ratio between the two will tell you the rate difference...
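The two accumulators can be wrapped up as in the sketch below. It assumes you have some way to learn how many samples the audio side has consumed; the `on_radio_block` and `on_audio_pull` hooks are hypothetical and not part of any particular audio library.

```python
class RateTracker:
    """Running totals of samples supplied by the radio and taken by the audio side."""

    def __init__(self):
        self.acc_in = 0   # accN in the post: samples received from the radio
        self.acc_out = 0  # acc_N in the post: samples picked up by the audio device

    def on_radio_block(self, n):
        self.acc_in += n

    def on_audio_pull(self, n):
        self.acc_out += n

    def ratio(self):
        """Audio rate divided by radio rate, or None before any data arrives."""
        if self.acc_in == 0:
            return None
        return self.acc_out / self.acc_in

    def offset_ppm(self):
        """Rate difference in parts per million (positive: audio side is faster)."""
        if self.acc_in == 0:
            return None
        return (self.acc_out - self.acc_in) / self.acc_in * 1e6
```

The estimate only becomes meaningful once the totals are large compared with a single buffer, since each count is quantized to whole blocks.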
Hi,
Yes that will probably work.
I did do some benchmarks with timers, and I found that on average the audio routine was two packets short over a 10-second period. Packets are 16384 bytes. That's like 20% off. So something is not right. But I do know that all of the data is being generated correctly, since even a 20-second prefill buffer gives 40 or more seconds of proper audio. What is happening is that every now and then a buffer doesn't make it in time. I think...
Well, it turns out that exactly the same number of packets leave the DSP code and arrive in the audio code. In fact if I send the audio to a wav file and play it back on another SDR software that can use wave files as input, the sound is perfect.
So what is happening, is that the sound packet appears to arrive just slightly delayed which causes the clicks etc. Something must be dropping the packet when this happens.
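One way to check the "arrives slightly late" theory is to log packet arrival times and compare each against its playback deadline. The sketch below uses a deliberately simplified model (playback starts one prefill interval after the first packet arrives, and every packet plays for the same duration); the function name and parameters are hypothetical.

```python
def find_late_packets(arrivals, packet_seconds, prefill_seconds):
    """Return indices of packets that arrived after they were needed.

    arrivals: arrival timestamps in seconds, one per packet, in packet order.
    Packet k is assumed to be needed at
    arrivals[0] + prefill_seconds + k * packet_seconds.
    """
    if not arrivals:
        return []
    start = arrivals[0] + prefill_seconds
    late = []
    for k, t in enumerate(arrivals):
        deadline = start + k * packet_seconds
        if t > deadline:
            late.append(k)
    return late
```

If the late packets are isolated rather than steadily more frequent, that points to scheduling jitter rather than a clock-rate mismatch.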
This synchronization problem occurs in so many places that it is in fact a very common problem. Some applications are modems, network radio receivers, telephone network synchronization, and speakerphones.
I apologize in advance, but here is an example from my experience from many years ago.
I worked at Bell Labs/Lucent/Agere. We built the chipset for Sirius Satellite Radio back then (before the merger with XM). The transmitter/encoder defines the sampling rate of the system, but the player has D/As that define the play-out speed. These D/As are usually driven by a local clock derived from some local crystal, and are therefore not synced with the Tx. The decoded audio from the satellite is deposited into a buffer in fairly large chunks, so the concept of a PLL becomes a little more difficult, because the phase comparisons only happen rather occasionally.
We used a high-performance Farrow interpolator good to 108 dB SNR average and 82 dB worst case. (Incidentally, I worked with Bill Farrow for several years while he was developing his ideas. They were originally targeted at matching the Rx clock to the Tx clock in the receivers of telephone line modems.)
For Sirius, we defined a high water and low water mark for the output buffer, and if we reached the high water mark, we sped the "output clock" up a little to consume the audio samples a little faster. If we reached the low water mark, we slowed the clock down. We tried to reach a point where we only had to make very occasional changes as the process banged up against the marks infrequently.
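The water-mark scheme can be sketched as a small controller plus a toy simulation. This is only an illustration, not the Sirius design: the step size, the marks, and the choice to nudge the rate by a fixed amount on every check at or beyond a mark are all assumptions.

```python
class WaterMarkController:
    """Nudge the consumption rate when the buffer fill hits a mark."""

    def __init__(self, low, high, step=1e-4):
        if not low < high:
            raise ValueError("low mark must be below high mark")
        self.low = low
        self.high = high
        self.step = step   # fractional rate change per nudge (assumed value)
        self.ratio = 1.0   # consumption rate relative to nominal

    def update(self, fill):
        if fill >= self.high:
            self.ratio += self.step  # buffer too full: consume a little faster
        elif fill <= self.low:
            self.ratio -= self.step  # buffer too empty: consume a little slower
        return self.ratio


def simulate(ticks, source_per_tick=1000.2, nominal_per_tick=1000.0):
    """Run a source that is 200 ppm fast into a controlled sink.

    Returns the lowest and highest buffer fill seen during the run.
    """
    ctl = WaterMarkController(low=2500.0, high=7500.0)
    fill = 5000.0
    lowest = highest = fill
    for _ in range(ticks):
        ratio = ctl.update(fill)
        fill += source_per_tick - nominal_per_tick * ratio
        lowest = min(lowest, fill)
        highest = max(highest, fill)
    return lowest, highest
```

Without the controller, the fill in this simulation would grow by 0.2 samples per tick without bound. With it, the fill wanders between the two marks, which is the limit-cycle behavior mentioned earlier in the thread.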
Sirius has changed the design now, and does its own silicon as far as I know.
In the telephone network there is a very similar synch problem between hops. Bell Labs had a very interesting patent that made sure that the corrections to the phase never just wandered back and forth between the equivalent of the marks discussed. In the patent, the corrections settled on keeping close to one of the marks, and just bounced off it a little now and again to keep it very close to the mark, but never slipping a clock. This process significantly limited phase wander.
I also worked on a "native audio" speakerphone on a Windows platform, where we had A/D and D/A converters on the modem plug-in card running at 8 ksps and the audio on the PC running at 22.05 ksps. Needless to say, they were not synchronized, and there was an offset of a few parts per million (that differed for every unit) in addition to the obvious 441/160 ratio. Since it was full duplex, we had to deal with the Farrow and fixed-ratio converters in both directions. The speakerphone was part of the Lucent/Agere/LSI modem chipset functionality for many years, and shipped with millions of laptop PCs.
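The 441/160 figure is just 22050/8000 reduced to lowest terms. The short sketch below checks that and shows how a small ppm offset would shift the effective ratio; the function name and the 10 ppm value in the usage note are illustrative only.

```python
from fractions import Fraction


def conversion_ratio(out_rate_hz, in_rate_hz, offset_ppm=0.0):
    """Return (nominal, effective) resampling ratios.

    nominal is the exact rational ratio of the two nominal rates;
    effective applies a crystal offset given in parts per million.
    """
    nominal = Fraction(out_rate_hz, in_rate_hz)
    effective = float(nominal) * (1.0 + offset_ppm * 1e-6)
    return nominal, effective
```

`conversion_ratio(22050, 8000)` gives a nominal ratio of 441/160 (2.75625). With an offset of, say, 10 ppm, the effective ratio moves to about 2.7562776, which is the small residual the Farrow stage has to absorb after the fixed-ratio converter.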
Oh, the good old days!
David Shaw