mid-tread v mid-rise quantization?

Started by Richard Dobson December 14, 2012
Hello all,

I am engaged in writing up various topics a propos sound and music 
computing in UK schools, in the context of the relatively new Computer 
Science curriculum. Included in the curriculum (14-16 GCSE, and 16-18 
A-Level) under the core topic of "data representation" is the sub-topic 
of audio sampling, including a basic understanding of the choice of 
sample resolution and the action of the ADC.  Needless to say, the 
resources (diagrams etc) currently provided by teachers (and seemingly 
most technical papers) illustrate the usual (?) mid-tread style of 
quantization, often without any indication there is an alternative.

However, interestingly, the sound examples of quantization provided by 
R.W. Stewart on his "DSPedia" CD-ROM course (vintage 1996) all use the 
mid-rise style (without discussion) - which means among other things he 
demonstrates one-bit quantization on music and speech tracks, which is 
really rather cool. This is what alerted me to mid-rise in the first 
place, and what suggests it must be at least comparable in importance to 
mid-tread. But even today conclusive comparative information on the net 
is proving difficult to find.

I would like to be able to add a bit of colour and interest to an 
otherwise rather subtle distinction which teachers might easily ignore 
altogether (though I will be publishing on a teacher's forum a simple 
Python program which can demo both forms, applied to a soundfile, so 
they will have no excuse for not at least trying it out).

All I really have is:  mid-tread gives you zero valued samples (and zaps 
any low-level samples which quantize to zero), mid-rise gives you 
one-bit quantization with no dropouts (and gives bipolar symmetry over 
all the available bits), but no zero samples. I suspect there must be 
more to the choice than that.

So my questions are: how important is the selection of one or the other 
in dsp, comms, ee, etc, and what (in general terms) determines the 
choice? Is, for example,  mid-tread very much the dominant form in the 
industry? Or, is the whole question something you merely note in 
passing, or disregard altogether?

Richard Dobson
If the adc is properly dithered, it doesn't matter which style you choose, other than a 1/2 LSB dc offset. 
Without dither, the mid-tread output drops to 0 for inputs less than 1 LSB p-p, and the mid-rise produces a constant square-wave output for inputs less than 2 LSB p-p, so it seems the mid-tread would be preferred.
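A minimal sketch of that undithered behaviour, assuming a step of 1 LSB, a test sine of 0.8 LSB p-p centred on zero, and the usual floor-based definitions of the two quantizers (the function names are illustrative, not from any library):

```python
import math

def mid_tread(x):
    # Round to the nearest integer level; zero is an output level.
    return math.floor(x + 0.5)

def mid_rise(x):
    # Levels sit at ..., -0.5, +0.5, ...; zero is a decision threshold.
    return math.floor(x) + 0.5

# One cycle of a sine with amplitude 0.4 LSB (0.8 LSB p-p), no dither.
# The half-sample phase offset keeps samples off the exact zero crossings.
n = 16
signal = [0.4 * math.sin(2 * math.pi * (k + 0.5) / n) for k in range(n)]

tread_out = [mid_tread(s) for s in signal]
rise_out = [mid_rise(s) for s in signal]

print(tread_out)  # all zeros: the signal has vanished
print(rise_out)   # only +0.5 and -0.5: a square wave at the signal frequency
```

Raising the amplitude above 0.5 LSB lets the mid-tread output start toggling between -1, 0 and +1, while the mid-rise output stays a two-level square wave until the amplitude reaches 1 LSB.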
The reason this is not so important anymore is that the vast majority of audio converters are of the sigma-delta type, and in these converters you almost always bring out more bits from the digital filter than can really be supported by the thermal noise of the analog circuits, and therefore the system is well dithered by good old Gaussian noise. If you choose to truncate some of those bits afterwards, then you should add dither first. If you are lazy and don't do this, then of course you can choose between mid-tread and mid-rise. But remember that most high resolution audio converters do not have good enough dc offset specs to make this a meaningful choice. The only exception to this would be the use of digital high pass filters which exist on many parts. 

Bob
On 12/14/12 5:42 PM, Richard Dobson wrote:
> All I really have is: mid-tread gives you zero valued samples (and zaps
> any low-level samples which quantize to zero), mid-rise gives you
> one-bit quantization with no dropouts (and gives bipolar symmetry over
> all the available bits), but no zero samples. I suspect there must be
> more to the choice than that.

(assume quantization step, delta, is 1)

mid-tread is this:

    Q{ x } = floor( x + 1/2 )

mid-rise is this:

    Q{ x } = floor( x ) + 1/2

there is no other definition to either. both have "no dropouts", there is no "dead zone" at zero as you would have with "round-toward-zero" quantization:

    Q{ x } = sgn(x) floor( |x| )

when DSPs (using 2's complement arithmetic) automatically round, they are doing mid-tread and there is one extra value at the most negative looking like 0x8000. there is no 2's complement negation of that value. with mid-rise, 0x0000 would be +1/2 bit value and 0xFFFF would be -1/2 bit value. mid-rise is just like mid-tread but with an additional bit extended and it is 1. mid-tread extends all bits beyond the LSB as all zeros.

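The three formulas above translate almost directly into Python. This is only a sketch: the function names and the optional `delta` argument (for a step other than 1) are assumptions added for illustration.

```python
import math

def mid_tread(x, delta=1.0):
    # Q{x} = delta * floor(x/delta + 1/2): zero is an output level.
    return delta * math.floor(x / delta + 0.5)

def mid_rise(x, delta=1.0):
    # Q{x} = delta * (floor(x/delta) + 1/2): zero is a decision threshold.
    return delta * (math.floor(x / delta) + 0.5)

def round_toward_zero(x, delta=1.0):
    # Q{x} = sgn(x) * delta * floor(|x|/delta): truncates the magnitude.
    return math.copysign(delta * math.floor(abs(x) / delta), x)

xs = (-1.2, -0.7, -0.2, 0.2, 0.7, 1.2)
for x in xs:
    print(x, mid_tread(x), mid_rise(x), round_toward_zero(x))
```

In the printed table, round-toward-zero sends everything in (-1, 1) to zero, a bin two steps wide. The mid-tread zero bin is one step wide like every other bin, and mid-rise has no zero bin at all.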
> So my questions are: how important is the selection of one or the other
> in dsp, comms, ee, etc, and what (in general terms) determines the
> choice?

pick mid-tread and assume that zeros are appended to the right of the LSB.

i have seen a trick (won't say where), in Mot 68K arithmetic where only the one's complement is used for negation (the NOT instruction), not the two's complement (the NEG instruction). the reason why was that they did not want to add instructions to test for 0x8000, so 0x8000 was the negation of 0x7FFF and 0xFFFF was the negation of 0x0000. there was a minus zero and a plus zero. so there was a sorta dead zone, but what it really was, was a small error in negation. the decision was to just live with this small error and have compact and fast code. i see the cleverness in that thinking but i don't know that i would make the same design decision.

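The 16-bit arithmetic behind that trick can be checked with a short sketch. Python integers stand in for 16-bit words here, and the helper names are illustrative assumptions rather than anything from the code being described.

```python
MASK = 0xFFFF

def neg_twos(w):
    # Two's-complement negation of a 16-bit word (the NEG behaviour).
    return (-w) & MASK

def neg_ones(w):
    # One's-complement negation: flip every bit (the NOT behaviour).
    return ~w & MASK

def as_signed(w):
    # Mid-tread reading: zeros are assumed to the right of the LSB.
    return w - 0x10000 if w & 0x8000 else w

def as_mid_rise(w):
    # Mid-rise reading: a 1 is assumed to the right of the LSB (+1/2 LSB).
    return as_signed(w) + 0.5

print(hex(neg_twos(0x8000)))  # 0x8000: the most negative word negates to itself
print(hex(neg_ones(0x7FFF)))  # 0x8000
print(hex(neg_ones(0x0000)))  # 0xffff
print(as_signed(neg_ones(5)), as_mid_rise(neg_ones(5)))  # -6 and -5.5
```

Under the mid-tread reading, NOT misses the true negation by exactly one LSB for every word. Under the mid-rise reading (0x0000 as +1/2, 0xFFFF as -1/2), the same bit flip is an exact negation, which is one way to see why the error could be tolerated.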
> Is, for example, mid-tread very much the dominant form in the
> industry?

yes. it's all mid-tread.

> Or, is the whole question something you merely note in
> passing, or disregard altogether?

i don't worry about it. i disregard the whole issue.

--

r b-j                  rbj@audioimagination.com

"Imagination is more important than knowledge."

Robert Adams <robert.adams@analog.com> wrote:
> If the adc is properly dithered it doesn't matter which style
> you choose, other than a 1/2 LSB dc offset. Without dither,
> the mid-tread output drops to 0 for inputs less than 1 LSB p-p,
> and the mid-rise produces a constant square-wave output for inputs
> less than 2 LSB p-p, so it seems the mid-tread would be preferred.
> The reason this is not so important anymore is that the vast majority
> of audio converters are of the sigma-delta type, and in these
> converters you almost always bring out more bits from the digital
> filter than can really be supported by the thermal noise of the
> analog circuits, and therefore the system is well dithered by good
> old Gaussian noise.

It seems that 24 bits isn't unusual these days. At least recorders claim to record 24 bits. I presume with a microphone input there is plenty of noise. Maybe with line input there is a little less.

> If you choose to truncate some of those bits afterwards,
> then you should add dither first. If you are lazy and don't do this,
> then of course you can choose between mid-tread and mid-rise.

Some time ago, I wrote a 24 bit to 16 bit converter. My first try didn't dither, but then later I added it. With the data at a low enough level (shift right enough times) and the volume control all the way up, I could hear the difference. But real, unshifted, signal probably had enough noise even for 16 bits.
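A minimal sketch of such a 24-bit to 16-bit reduction, with and without dither. The details are assumptions for illustration, not the converter described above: mid-tread rounding, triangular (TPDF) dither spanning roughly +/-1 output LSB, and a hypothetical `to_16_bit` helper.

```python
import random

STEP = 1 << 8  # one 16-bit LSB expressed in 24-bit units

def to_16_bit(sample24, rng=None):
    """Reduce a signed 24-bit sample to 16 bits with mid-tread rounding.

    If rng is given, triangular (TPDF) dither is added before rounding.
    """
    dither = 0
    if rng is not None:
        # Difference of two uniform integers has a triangular distribution.
        dither = rng.randrange(STEP) - rng.randrange(STEP)
    # Adding half a step and shifting right is floor(x/256 + 1/2);
    # Python's >> floors for negative integers as well.
    q = (sample24 + dither + STEP // 2) >> 8
    return max(-32768, min(32767, q))  # clip to the 16-bit range

rng = random.Random(0)  # fixed seed so the demo is repeatable
quiet = 64              # constant input of 1/4 of a 16-bit LSB
plain = [to_16_bit(quiet) for _ in range(20000)]
dithered = [to_16_bit(quiet, rng) for _ in range(20000)]

print(sum(plain) / len(plain))        # 0.0: the low-level input is lost
print(sum(dithered) / len(dithered))  # close to 0.25: preserved on average
```

The undithered path discards the low-level input entirely, while the dithered path keeps it as the average of a noisy output. That matches the audible difference reported only when the signal was shifted down and the volume turned up.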

> But remember that most high resolution audio converters do not
> have good enough dc offset specs to make this a meaningful choice.
> The only exception to this would be the use of digital high pass
> filters which exist on many parts.

I have computed the mean for recorded tracks from my DR-1. Many are in the hundreds (out of 24 bits). After converting to 16 bit, the mean is usually 1. For one recording session, all tracks were in the negative 800's or so, but most sessions were positive. Might be a different microphone, though.

-- glen