Forums

The ITU G.723.1 test vectors

Started by steveu June 5, 2011
I have been playing with the ITU reference code for the G.723.1 speech
codec. If I build and run the floating and fixed point versions of the
codec they seem to work OK, but the floating one doesn't pass the test
vectors that come with the code. It looks like the expected results of
tests are wrong. When VAD/CNG is enabled they seem to expect an
unrealistically high SNR to pass the tests. Can anyone shed light on bug
reports/fixes/etc in this area.

Steve


steveu wrote:

> I have been playing with the ITU reference code for the G.723.1 speech
> codec. If I build and run the floating and fixed point versions of the
> codec they seem to work OK, but the floating one doesn't pass the test
> vectors that come with the code. It looks like the expected results of
> tests are wrong. When VAD/CNG is enabled they seem to expect an
> unrealistically high SNR to pass the tests. Can anyone shed light on bug
> reports/fixes/etc in this area.
FWIW the floating point results don't have to be bit exact; there are
minimum accuracy requirements for this case.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
> FWIW the floating point results don't have to be bit exact; there are
> minimum accuracy requirements for this case.
They obviously can't be bit exact with a floating point codec. Simply
changing the compiler optimisation level changes them quite a bit. The ITU
tests work by checking the SNR against a specified minimum. The minimums
for the cases with VAD/CNG enabled look stupid.

Steve

steveu wrote:
>> FWIW the floating point results don't have to be bit exact; there are
>> minimum accuracy requirements for this case.
>
> They obviously can't be bit exact with a floating point codec. Simply
> changing the compiler optimisation level changes them quite a bit.
The floating point results can be bit exact as the codec inputs and outputs are quantized. The compiler should be in ANSI-compliant mode.
> The ITU tests work by checking the SNR against a specified minimum. The
> minimums for the cases with VAD/CNG enabled look stupid.
When I worked on codec compliance, I had to check the test vectors
through all steps of the algorithm. Very unobvious and minor changes can
show up at the output.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
> The floating point results can be bit exact as the codec inputs and
> outputs are quantized. The compiler should be in ANSI-compliant mode.
Because things are heavily quantised, you often get a bit match, but it's
really by luck. If some of the decisions are near the switching points
between two quantised values, you should expect some of those decisions to
hop around with the tiniest changes in your code.
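To illustrate the point, here is a toy example (not code from G.723.1): a value sitting exactly on a quantiser decision threshold flips to the other codeword when perturbed by a single ULP, the sort of change a different evaluation order can introduce. The name `quantise_bit` is made up for the illustration.

```c
#include <assert.h>
#include <math.h>

/* Toy hard decision: codeword 1 if x is at or above the 0.5 threshold. */
static int quantise_bit(double x)
{
    return x >= 0.5;
}

/* A value exactly on the threshold, and the same value one ULP lower,
 * quantise to different codewords. */
static int flips_at_threshold(void)
{
    double on_boundary = 0.5;
    double one_ulp_below = nextafter(0.5, 0.0);
    return quantise_bit(on_boundary) != quantise_bit(one_ulp_below);
}
```

A one-ULP change is well within what a different optimisation level or instruction selection can produce, so any decision landing near such a boundary is not guaranteed to be stable.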
>> The ITU tests work by checking the SNR against a specified minimum. The
>> minimums for the cases with VAD/CNG enabled look stupid.
>
> When I worked on codec compliance, I had to check the test vectors
> through all steps of the algorithm. Very unobvious and minor changes can
> show up at the output.
I've been there with other codecs. It is so simple to do this stuff with
fixed point, as *everything* bit matches down the processing chain.
Floating point is a PITA to work with, by comparison.

What is bothering me is the last group of tests. They expect a high SNR
(around 60dB) from the checksnr program, but I only get 30+ dB. Looking at
what is computed, I think 30+ dB is what I should expect.

Steve
Steve Underwood <coppice@n_o_s_p_a_m.coppice.org> wrote:
>> The floating point results can be bit exact as the codec inputs and
>> outputs are quantized. The compiler should be in ANSI-compliant mode.
>
> Because things are heavily quantised, you often get a bit match, but it's
> really by luck. If some of the decisions are near the switching points
> between two quantised values, you should expect some of those decisions
> to hop around with the tiniest changes in your code.
For add and subtract, I might expect it to be exact, and, with enough
bits, for multiply. With divide, it is real easy to get a different
rounding.

But this doesn't really seem like something that should be done in
floating point. Even if you have a really large dynamic range, maybe
100dB, the bits lost to the exponent don't really make up for the
increased dynamic range.

I could see u-law or A-law coding, which has some similarity to floating
point, but isn't floating point.

-- glen
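The u-law analogy can be made concrete with a toy codeword of sign bit, 3-bit "exponent" (segment) and 4-bit "mantissa", which is structurally a tiny floating point number. This is a simplified sketch for a 12-bit input range, not the real G.711 encoder, and the names `toy_encode`/`toy_decode` are made up.

```c
#include <stdint.h>

/* Encode a 12-bit PCM sample (-4096..4095) into a sign/exponent/mantissa
 * byte: 1 sign bit, 3 "exponent" bits, 4 "mantissa" bits. */
static uint8_t toy_encode(int16_t pcm12)
{
    uint8_t sign = (pcm12 < 0) ? 0x80 : 0x00;
    uint32_t mag = (pcm12 < 0) ? (uint32_t)-(int32_t)pcm12 : (uint32_t)pcm12;
    uint8_t seg = 0;

    if (mag > 4095)
        mag = 4095;
    while (mag >= 32) {          /* normalise: seg counts the shifts */
        mag >>= 1;
        seg++;
    }
    /* For seg > 0, mag is 16..31: drop the implicit leading bit.
     * For seg == 0, keep the top 4 of the 5 remaining bits. */
    return sign | (uint8_t)(seg << 4)
                | (uint8_t)(seg ? (mag & 0x0F) : (mag >> 1));
}

/* Reconstruct the (truncated) magnitude, restoring the implicit bit. */
static int16_t toy_decode(uint8_t code)
{
    uint8_t seg = (code >> 4) & 0x07;
    int32_t mag = seg ? ((int32_t)(16 + (code & 0x0F)) << seg)
                      : (int32_t)((code & 0x0F) << 1);
    return (code & 0x80) ? (int16_t)-mag : (int16_t)mag;
}
```

As with real u-law, the step size scales with the magnitude, so the relative error is roughly constant; but the rounding is fixed by the encode procedure, not by a compiler's floating point choices.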
> For add and subtract, I might expect it to be exact, and, with enough
> bits, for multiply. With divide, it is real easy to get a different
> rounding.
The precise sequence of the calculations is not defined by C. When you
turn on compiler optimisation the sequence can be radically different from
the way the maths is written. Even with optimisation turned off it varies
quite a bit. Tiny changes to the source code can massively reshuffle the
calculations.

If you are compiling for a Pentium or Athlon with an old compiler it
probably uses 8087 instructions with 80 bit intermediates, which get
truncated to 64 bit stored values in ways that change at the whim of the
compiler. A modern compiler, and all 64 bit compilers, use the SSE
instructions, which are always 64 bit. There are many variables in how
floating point results are calculated.
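The underlying reason the sequence matters is that floating point addition is not associative, so a compiler that reassociates a sum changes the result. A minimal demonstration, assuming IEEE 754 doubles (as with SSE arithmetic; x87 80-bit intermediates can behave differently):

```c
/* Floating point addition is not associative: the same three constants
 * summed in two different orders give two different doubles. */
static double sum_left_to_right(double a, double b, double c)
{
    return (a + b) + c;
}

static double sum_right_to_left(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 0.1, b = 0.2, c = 0.3 the two orders differ in the last bit: (0.1 + 0.2) + 0.3 gives 0.6000000000000001 while 0.1 + (0.2 + 0.3) gives 0.6. Any transformation that reorders a long dot product has the same effect on a larger scale.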
> But, this doesn't really seem like something that should be done in
> floating point. Even if you have a really large dynamic range, maybe
> 100dB, the bits lost to the exponent don't really make up for the
> increased dynamic range.
>
> I could see u-law or A-law coding, which has some similarity to floating
> point, but isn't floating point.
If you run codecs like G.723.1 on desktop machines and servers, the speed achievable with a floating point implementation is much greater than a fixed point implementation. The lack of proper saturated arithmetic really cripples the fixed point capabilities of these machines. There is some saturated support in the MMX instructions, but its aimed at video applications. The saturated operations needed to mimic a typical DSP chip efficiently are not there. Steve
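What a DSP chip does in a single saturating instruction has to be spelled out on a general-purpose CPU. A 16-bit saturating add in plain C looks like this (`sat_add16` is an illustrative helper, not a function from the ITU code):

```c
#include <stdint.h>

/* Saturating 16-bit add: clamp the widened sum to the int16_t range
 * instead of letting it wrap. A DSP does this in one instruction; in C
 * it costs a widening add and two compares per operation. */
static int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;   /* widen so the sum can't wrap */
    if (s > INT16_MAX)
        s = INT16_MAX;
    if (s < INT16_MIN)
        s = INT16_MIN;
    return (int16_t)s;
}
```

Multiplying that overhead by every add, multiply and shift in a bit-exact fixed point codec is what makes the fixed point version slow on these machines.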

steveu wrote:

> If you run codecs like G.723.1 on desktop machines and servers, the speed
> achievable with a floating point implementation is much greater than a
> fixed point implementation.
It has to do with the inefficiency of the ITU-T reference code. That
code was developed for simplicity and clarity; it could be made several
times faster.
> The lack of proper saturated arithmetic really cripples the fixed point
> capabilities of these machines.
This isn't the case. I profiled it; about 50% of the time is spent
calculating dot products.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
>> If you run codecs like G.723.1 on desktop machines and servers, the
>> speed achievable with a floating point implementation is much greater
>> than a fixed point implementation.
>
> It has to do with the inefficiency of the ITU-T reference code. That
> code was developed for simplicity and clarity; it could be made several
> times faster.
I hope so. :-)
>> The lack of proper saturated arithmetic really cripples the fixed point
>> capabilities of these machines.
>
> This isn't the case. I profiled it; about 50% of the time is spent
> calculating dot products.
Possibly. I haven't profiled anything yet. However, the bit exact code
saturates every step of those dot products.

Steve
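The per-step saturation in question is the ITU basic-operator style of multiply-accumulate, along the lines of the reference L_mult/L_mac pair. A simplified sketch of that style, assuming the usual Q15 conventions:

```c
#include <stdint.h>

/* Q15 multiply: double the 32-bit product. The only overflow case is
 * (-32768) * (-32768), which must saturate to INT32_MAX. */
static int32_t L_mult(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (p == 0x40000000) ? INT32_MAX : p * 2;
}

/* Multiply-accumulate with saturation on every step: this is why a
 * bit-exact fixed point dot product can't simply accumulate in a wide
 * register and clamp once at the end. */
static int32_t L_mac(int32_t acc, int16_t a, int16_t b)
{
    int64_t s = (int64_t)acc + (int64_t)L_mult(a, b);
    if (s > INT32_MAX)
        return INT32_MAX;
    if (s < INT32_MIN)
        return INT32_MIN;
    return (int32_t)s;
}
```

Every tap of every filter goes through this clamp, so the cost per sample on a CPU without saturating arithmetic is substantial.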

Steve Underwood wrote:

>> It has to do with the inefficiency of the ITU-T reference code. That
>> code was developed for simplicity and clarity; it could be made several
>> times faster.
>
> I hope so. :-)
AFAIK they don't know of cyclic buffers. They move data instead :-)
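The two approaches contrasted here can be sketched as follows; the names and the 256-sample history length are illustrative, not taken from the reference code.

```c
#include <stdint.h>
#include <string.h>

#define HIST_LEN 256

/* Reference-code style: physically slide the whole history buffer down
 * each frame, then append the new samples. Touches all HIST_LEN samples
 * regardless of frame size. */
static void history_shift(int16_t hist[HIST_LEN], const int16_t *in, int n)
{
    memmove(hist, hist + n, (size_t)(HIST_LEN - n) * sizeof(int16_t));
    memcpy(hist + HIST_LEN - n, in, (size_t)n * sizeof(int16_t));
}

/* Cyclic-buffer style: write new samples through a wrapping index and
 * leave the old data where it is. Only n samples are touched per frame;
 * readers index relative to head instead of from a fixed origin. */
typedef struct {
    int16_t data[HIST_LEN];
    int head;
} cyclic_t;

static void cyclic_push(cyclic_t *cb, const int16_t *in, int n)
{
    for (int i = 0; i < n; i++) {
        cb->data[cb->head] = in[i];
        cb->head = (cb->head + 1) % HIST_LEN;
    }
}
```

The trade is that the cyclic version complicates every read with modular indexing, which is presumably why the reference code keeps the simpler flat layout.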
>> This isn't the case. I profiled it; about 50% of the time is spent
>> calculating dot products.
>
> Possibly. I haven't profiled anything yet. However, the bit exact code
> saturates every step of those dot products.
AFAIK the ITU integer code is based on a library of primitive operations
implemented as ANSI C functions. Just unrolling this into straight code
gives a big boost in speed.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
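One common form of that optimisation is to replace the chain of per-step saturating basic-op calls in a dot product with a wide native accumulator and a single clamp at the end. Note the caveat: this is only bit-exact with the reference if no intermediate step would have saturated on the material being processed, which has to be verified against the test vectors. The function name here is illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Q15 dot product with one final saturation instead of a saturating
 * basic-op call per tap. Each term is the doubled product, matching the
 * reference L_mult convention; the 64-bit accumulator cannot overflow
 * for any realistic frame length. */
static int32_t dot_q15_fast(const int16_t *x, const int16_t *y, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += 2 * (int64_t)x[i] * (int64_t)y[i];
    if (acc > INT32_MAX)
        return INT32_MAX;
    if (acc < INT32_MIN)
        return INT32_MIN;
    return (int32_t)acc;
}
```

Beyond removing the function-call and clamping overhead, a loop in this form is also something the compiler can vectorise, which the per-call version is not.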