
The $10000 Hi-Fi

Started by Unknown May 3, 2015
robert bristow-johnson  <rbj@audioimagination.com> wrote:

>On 5/6/15 1:45 AM, Steve Pope wrote:
>> The real point is, when you quantize from analog and/or reduce precision,
>> dithering is wise.  If there is not already enough noise in the
>> source signal to auto-dither the thing, then add explicit dithering.
>> Don't waste brain cycles arguing "it might not matter", just do it,
>> unless the cost is prohibitive, which it almost never is, especially
>> in audio.
[...]
>wellllll....
>
>i do audio.  in audio algorithms i deal with, there are several,
>sometimes *many* points of quantization.  like, often, every gain block
>and more.
>
>dither isn't that cheap.  a good RNG costs instruction cycles.  cheap
>RNGs (like linear congruence) usually sound like crap.  building TPDF
>dither doubles the cost (unless you're building that high-pass TPDF
>which does not double the cost).
>
>i have found that undithered truncating *with* noise-shaping (like that
>fixed-point DC blocking trick at dspguru.com) which Randy calls
>"fraction saving" (a *very* good name for it) works *very* well for most
>of these internal nodes (at 24 bit).  it steers the quantization noise
>into the top octave (and away from the bottom 8 or 9 octaves) where i am
>deaf anyway.  and it solves the DC limit-cycle problem which is why it's
>useful in a DC-blocking filter or any other audio filter with high-Q
>poles (an annoyance is when absolute silence goes into your filter after
>some sound and the level comes down to -70 dB and gets stuck there).
>
>and, with floating-point, i most often don't do anything.  even with
>single-precision float.  i'm trying to understand how i can cheaply get
>those lost 8 bits in the SHArC when it goes from 40 internal bits to 32.
>i would think that rounding error can be used with noise shaping
>feedback in the same manner as with fixed-point error shaping.
>
>however, like when going from 24-bit to 16-bit (like in mastering),
>really good dither and noise-shaping (or using "colored dither" like
>UV22) is mandatory.  dunno what should be done if mastering to MP3.  i
>thought the MP3 coding alg figgered it out anyway.
>
>so, in my opinion, some brainy judgment is useful.
Thank you for the insight.  You are of course correct, and I was
over-generalizing above.

Tangent: it is possible, at least in a lot of situations I encounter, to
choose precisions such that the noise present (either as added dither
noise, or otherwise) at the input(s) to an algorithmic block has the
effect of dithering all of the precision-reduction points within that
block, so the only additional point where dithering might be desired is
at the output(s) of the block.  I suspect this is true fairly generally,
given enough freedom to choose precisions.  (Hence my phrasing above,
"if there isn't already enough noise in the signal to dither" the
precision-reduction point.)  But you might end up with some really long
word widths, and if you're designing to a processor word width or a
processor floating-point format, you likely won't have that freedom.

The dual of this tangent: if you're not applying sufficient noise to the
input of such a block, then some of the precision-reduction points in
the internal signal flow will be effectively undithered, and the signals
at those points quite often exhibit unexpected, non-intuitive, and/or
undesired behavior.  I have seen design projects get into such a state
on various systems I have worked on, and the time spent trying to
understand the implications of every instance of undithered behavior can
easily eat up hours that would arguably be better spent on weightier
problems.  This is where a blanket-dithering design methodology is quite
useful, and in very large systems almost mandatory, because the
alternative is so time-consuming.

Having said that, I am sure the trade-offs you describe above are
necessary in audio.

Steve
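For concreteness, here is a minimal sketch of what "add explicit dithering"
at a single precision-reduction point can look like, reducing 24-bit samples
to 16 bits.  This is an illustration only, not code from anyone in this
thread; the rand()-based noise source, the word sizes, and the saturation
limits are arbitrary choices, and a real implementation would want a better
RNG, as rbj notes above.

/* TPDF-dithered requantization, 24-bit -> 16-bit.
 * One 16-bit output LSB equals 256 input counts, so TPDF dither spanning
 * roughly +/- one output LSB is the sum of two uniforms in [-128, 127].
 * The "high-pass TPDF" variant rbj mentions reuses the previous uniform
 * draw, costing one RNG call per sample and tilting the dither spectrum
 * toward high frequencies.                                              */
#include <stdint.h>
#include <stdlib.h>

static int32_t rpdf_half_lsb(void)              /* uniform in [-128, 127]     */
{
    return (int32_t)(rand() & 0xFF) - 128;
}

static int16_t requant_24_to_16_tpdf(int32_t x24)
{
    int32_t d = rpdf_half_lsb() + rpdf_half_lsb();  /* TPDF, ~ +/-1 output LSB */
    int32_t y = (x24 + d + 128) >> 8;               /* dither, round, drop 8 bits */
    if (y >  32767) y =  32767;                     /* saturate to 16 bits    */
    if (y < -32768) y = -32768;
    return (int16_t)y;
}

static int16_t requant_24_to_16_hptpdf(int32_t x24)
{
    static int32_t u_prev = 0;
    int32_t u = rpdf_half_lsb();
    int32_t d = u - u_prev;                         /* d[n] = u[n] - u[n-1]   */
    u_prev = u;
    int32_t y = (x24 + d + 128) >> 8;
    if (y >  32767) y =  32767;
    if (y < -32768) y = -32768;
    return (int16_t)y;
}

(The right shift of a negative int32_t is assumed to be arithmetic, which
holds on essentially every platform used for audio work.)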
robert bristow-johnson  <rbj@audioimagination.com> wrote:

>On 5/6/15 9:23 AM, Steve Pope wrote:
>> Not sure Sony was the first, but they had an early 16-bit
>> quantizer (I believe marketed as a "PCM unit", although that's
>> a misnomer) that piggybacked onto a VCR used as a data recorder.
>> This was around 1978.
>i think it's the Sony F1. and they used betamax, not VHS.
Yes, it had to be Betamax, which I had completely forgotten existed.
>and this is
>the reason why 44.1 kHz (or, more precisely, 44.056 kHz with the F1)
>became the CD sample rate standard.  very icky.  too bad they didn't go
>with 48 kHz.
I had not realized the relationship there.  Thanks.

Steve
[...snip...]

>> Cedron <103185@dsprelated> wrote:
>>
>>> Do you apply software audio compression?  (not data compression).  Many
>>> years ago I derived the formulas for a compression curve that extended the
>>> straight line portion with a hyperbola
>
>something like  x - (1-1/r)*log(k + e^x)  ?
Strictly speaking, I don't think that is a hyperbola.  This was many
years ago, so off the top of my head:

   A / ( x - B ) + C

The inputs were the slope of the line and, optionally, the cutoff point.
The program would then solve for A, B, and C.  The curve would match the
end of the line, the first derivative would match the slope, and the
curve would hit (1,1).  If the cutoff point was not specified, an
"optimal" one was selected, but I can't remember what the criterion was
that made it optimal.

Anyway, if you plotted the input/output graph you got a straight line up
to a point where it started gradually bending until it hit the upper
right corner.  If you selected a slope less than one, the graph would
curve upward to hit the corner.

I believe it was invertible, so if you ran it through with a slope of m,
then ran that through with a slope of 1/m, you got your original back
(minus rounding errors).
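A sketch of the three-constraint solve described above, under the
assumption (mine, not stated above) that the straight-line portion passes
through the origin with slope m and ends at a cutoff x0.  Variable names
are mine; the closed form falls out of matching value and slope at x0 and
forcing the curve through (1,1).

/* Fit y = A/(x - B) + C so that it meets the line y = m*x at x = x0 with
 * matching value and slope, and passes through (1, 1).
 * With u = x0 - B, d = 1 - x0, k = (1 - m*x0)/m, the slope constraint
 * -A/(x0 - B)^2 = m plus the two value constraints give u = k*d/(d - k). */
#include <math.h>

typedef struct { double A, B, C; } hyper_knee;

static int fit_hyper_knee(double m, double x0, hyper_knee *h)
{
    double d = 1.0 - x0;
    double k = (1.0 - m * x0) / m;
    if (fabs(d - k) < 1e-12)
        return -1;                  /* m == 1: the line already hits (1,1) */
    double u = k * d / (d - k);
    h->A = -m * u * u;
    h->B = x0 - u;
    h->C = m * (x0 + u);
    return 0;
}

static double apply_knee(const hyper_knee *h, double m, double x0, double x)
{
    return (x <= x0) ? m * x : h->A / (x - h->B) + h->C;
}

For example, m = 2 and x0 = 0.25 gives A = -0.28125, B = -0.125, C = 1.25;
the curve passes through (0.25, 0.5) with slope 2 and through (1, 1).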
> >("r" is the compression ratio after the bend. "k" is the knee >softness.) > > >>> that matched the first derivative >>> so there was no knee effect. > >a bend without a knee? what does that mean? > >do you actually mean a "soft knee"? a knee with programmable softness? >
Yes, I suppose so.  An extremely soft knee, as in no discontinuity in
the first derivative, and a smoothly declining second derivative past
the small discontinuity at the juncture.

-----------

I am finding this discussion fascinating.  As I mentioned earlier, I am
rewriting my recording program in Linux using ALSA and XLib.  Most of
the work is in the display and recording controls.  I already have a
simple prototype that blindly records.

My intention is to stick with CD quality, i.e. 44100 Hz, 16-bit.  For
the recording portion I write out a WAV file header and then just copy
the data from the sound buffer to the file.

Would anybody recommend that I do it differently?  In particular, should
I add dithering?  Should I make 24-bit, 96 kHz a command-line option?

---------------------------------------
Posted through http://www.DSPRelated.com
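Since the question is concrete: the canonical 44-byte header for the kind
of plain PCM WAV file described above looks like the sketch below.  The
function names are mine.  One practical note: when recording live you
don't know the data length up front, so write placeholders for the two
size fields and seek back to patch them when recording stops.

/* Minimal canonical PCM WAV header (RIFF/WAVE, "fmt " + "data" chunks),
 * little-endian fields written byte by byte so it works on any host.   */
#include <stdint.h>
#include <stdio.h>

static void write_u32(FILE *f, uint32_t v)
{
    fputc(v & 0xFF, f); fputc((v >> 8) & 0xFF, f);
    fputc((v >> 16) & 0xFF, f); fputc((v >> 24) & 0xFF, f);
}
static void write_u16(FILE *f, uint16_t v)
{
    fputc(v & 0xFF, f); fputc((v >> 8) & 0xFF, f);
}

static void write_wav_header(FILE *f, uint32_t sample_rate,
                             uint16_t channels, uint16_t bits,
                             uint32_t data_bytes)
{
    uint16_t block_align = channels * (bits / 8);
    fwrite("RIFF", 1, 4, f);
    write_u32(f, 36 + data_bytes);           /* size of everything after this */
    fwrite("WAVE", 1, 4, f);
    fwrite("fmt ", 1, 4, f);
    write_u32(f, 16);                        /* PCM fmt chunk is 16 bytes     */
    write_u16(f, 1);                         /* format 1 = integer PCM        */
    write_u16(f, channels);
    write_u32(f, sample_rate);
    write_u32(f, sample_rate * block_align); /* byte rate                     */
    write_u16(f, block_align);
    write_u16(f, bits);
    fwrite("data", 1, 4, f);
    write_u32(f, data_bytes);                /* patch afterwards if unknown   */
}

A call like write_wav_header(f, 44100, 2, 16, nframes * 4) then leaves the
file positioned for the raw 16-bit little-endian sample data.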
On 5/6/2015 8:01 PM, Steve Pope wrote:
> robert bristow-johnson <rbj@audioimagination.com> wrote:
>
>> On 5/6/15 9:23 AM, Steve Pope wrote:
>
>>> Not sure Sony was the first, but they had an early 16-bit
>>> quantizer (I believe marketed as a "PCM unit", although that's
>>> a misnomer) that piggybacked onto a VCR used as a data recorder.
>>> This was around 1978.
>
>> i think it's the Sony F1.  and they used betamax, not VHS.
>
> Yes, it had to be Betamax, which I had completely forgotten
> existed.
>
>> and this is
>> the reason why 44.1 kHz (or, more precisely, 44.056 kHz with the F1)
>> became the CD sample rate standard.  very icky.  too bad they didn't go
>> with 48 kHz.
>
> I had not realized the relationship there.  Thanks.
I don't think the fact that it was Beta vs. VHS had anything to do with
it.  I think the sample rate is linked to the TV rates, which are the
same for both.  There were early digital recordings on VCRs and the CD
sample rate was set to be compatible.

--
Rick
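For reference, the arithmetic behind that point, using the usual published
figures for the PCM adaptors (the specific line counts are not stated in
this thread): the adaptors stored three 16-bit samples per usable video
line, so the sample rate had to divide evenly into the video field
structure.

#include <stdio.h>

int main(void)
{
    /* samples/s = fields/s * usable lines per field * samples per line */
    printf("NTSC mono : %.1f\n", 60.00 * 245 * 3);  /* 44100.0                  */
    printf("NTSC color: %.1f\n", 59.94 * 245 * 3);  /* 44055.9 ~ 44.056 kHz (F1) */
    printf("PAL       : %.1f\n", 50.00 * 294 * 3);  /* 44100.0                  */
    return 0;
}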
robert bristow-johnson <rbj@audioimagination.com> writes:
> [...]
> i would have a few little bones to pick with this paper.  first of all
> the author repeats the common mistake of confusing power spectrum
> ("white" vs. "colored") and p.d.f. (RPDF vs. TPDF vs. Gaussian).
> them's about different properties of a random process.  you most
> certainly **can** have TPDF *and* some kinds of colored noise.  my
> favorite
Robert, I was going to mention it too, but I grow tired of finding and
pointing out such faults in thinking...

--
Randy Yates
Digital Signal Labs
http://www.digitalsignallabs.com
On Wed, 06 May 2015 18:06:15 -0400, robert bristow-johnson
<rbj@audioimagination.com> wrote:

>i would have a few little bones to pick with this paper.  first of all
>the author repeats the common mistake of confusing power spectrum
>("white" vs. "colored") and p.d.f. (RPDF vs. TPDF vs. Gaussian).
Granted. But my point was that truncation introduces distortion while proper dither does not. I was most interested in finding a paper that included spectral plots demonstrating this.
On Wed, 06 May 2015 18:30:20 -0400, robert bristow-johnson
<rbj@audioimagination.com> wrote:

>whether it's with an old Freescale 56K or some other processor, a lot of
>internal arithmetic in audio algorithms is done at 24 bits.  32-bit
>floats have a 25-bit mantissa.
As I understand it, single-precision IEEE 754 floating point has 24-bit
precision, with 23 fraction bits explicitly stored.  Were you referring
to a different single-precision format?
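A quick way to check this on any machine with standard C; FLT_MANT_DIG and
the 2^24 integer test below are ordinary library facts, nothing specific to
this thread.

/* IEEE 754 binary32 stores 1 sign + 8 exponent + 23 fraction bits; with
 * the implicit leading 1, a normal number carries 24 significant bits.  */
#include <float.h>
#include <stdio.h>

int main(void)
{
    printf("FLT_MANT_DIG = %d\n", FLT_MANT_DIG);   /* 24 on IEEE systems */
    /* 2^24 is the last power of two up to which every integer is exact: */
    printf("16777216.0f + 1.0f == 16777216.0f ? %d\n",
           16777216.0f + 1.0f == 16777216.0f);     /* 1: 2^24 + 1 rounds away */
    return 0;
}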
robert bristow-johnson <rbj@audioimagination.com> wrote:

(snip, I wrote)
>> I use a pretty simple LFSR generator, based on CRC32, to generate
>> bits, shift and add.

> oooh, glen, that can't be good.  are you using the LFSR to just get
> random *bits* (like +1 and -1) that are white.  then it works good.  but
> if you're using the whole shift register for a random number, that's not
> so good.  that's because 50% of the time the following register value is
> related to the current by a factor of 2.  and the spectrum of this is
> *not* white but is low-pass.
It runs on the input data stream, so there are 24 shifts before the next
one is used.  Most likely there is plenty of noise in the low bits of the
input 24-bit data.

If the input is really quiet, it could have some zeros in the high bits,
but there should still be enough noise in the low bits.  These are
recorded with a live audience, but the amplifier noise is likely enough
to keep the lowest bits pretty random.  I could actually hear that noise
by turning my 100W amplifier all the way up and putting my ear right next
to the speaker.  If a normal signal came through, it might have destroyed
the speaker, so the noise is pretty far down.

But yes, there are probably better sources.  If I wanted to convert
16-bit down to 8-bit, I might worry more about it.

--
glen
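To make the objection and the answer concrete, here is a sketch (mine, not
glen's code) of a Galois LFSR using the bit-reversed CRC-32 polynomial.  If
you read the whole register every sample, consecutive outputs are largely
shifts of each other and the sequence is spectrally low-pass; collecting one
fresh bit per clock, or clocking the register many times between uses as
glen effectively does by running it over the input bits, avoids that.

#include <stdint.h>

static uint32_t lfsr_state = 0xDEADBEEFu;       /* any nonzero seed */

/* One Galois-LFSR clock with the (bit-reversed) CRC-32 polynomial;
 * returns a single pseudo-random bit.                               */
static inline uint32_t lfsr_clock(void)
{
    uint32_t bit = lfsr_state & 1u;
    lfsr_state = (lfsr_state >> 1) ^ (bit ? 0xEDB88320u : 0u);
    return bit;
}

/* Assemble an n-bit uniform (RPDF) dither word from n independent clocks,
 * so successive words are not simply related by a shift.                */
static uint32_t lfsr_word(int n)
{
    uint32_t v = 0;
    while (n-- > 0)
        v = (v << 1) | lfsr_clock();
    return v;
}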
rickman  <gnuarm@gmail.com> wrote:

>I am also a doubting Thomas, but I'm willing to leave some room for
>self-doubt (just not a lot).  I do know that truncation can do funny
>things as the frequencies in the signal beat with the truncation as well
>as each other, producing non-linear distortion.  But at -144 dBFS it is
>hard to imagine it would be in any way audible.  In 16 bits (-96 dBFS)
>I'm willing to acknowledge magic ears can hear it easily (my ears are
>far from magic).
>
>It is always possible that there was some flaw in the original design
>that got fixed when switching to rounding.  I just can't imagine anyone
>can hear the effects of 24 bit arithmetic.
Within a digital algorithm that includes feedback, switching from
undithered truncation to undithered rounding can make a large (generally
beneficial) difference in behavior, or even stability.  But viewed in
isolation, truncation vs. rounding, without dithering, both introduce
(often undesired) signal-correlated noise at similar levels.

Steve
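The "fraction saving" error feedback rbj described earlier addresses
exactly this class of problem inside a feedback loop.  A minimal sketch of
the technique as described above (not rbj's or Randy's actual code); the
24-bit output width is an arbitrary choice.

/* First-order error feedback ("fraction saving"): carry the bits discarded
 * by truncation into the next sample.  The output is
 *     y[n] = x[n] + e[n-1] - e[n],
 * so the quantization error is shaped by (z^-1 - 1): high-pass, with a
 * null at DC -- which is why it kills DC limit cycles.                   */
#include <stdint.h>

typedef struct {
    int64_t err;                    /* fraction saved from the previous sample */
} frac_save_t;

/* Reduce a 32-bit accumulator to 24 significant bits (zero the low 8 bits). */
static int32_t quant24_fraction_saving(frac_save_t *fs, int32_t acc)
{
    int64_t sum = (int64_t)acc + fs->err;  /* add back the saved fraction      */
    int64_t out = sum & ~(int64_t)0xFF;    /* truncate (toward -inf) by 8 bits */
    fs->err = sum - out;                   /* save the new fraction, 0..255    */
    return (int32_t)out;                   /* saturation omitted for brevity   */
}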
Greg Berchin  <gjberchin@chatter.net.invalid> wrote:

>On Wed, 06 May 2015 10:36:45 -0400, rickman <gnuarm@gmail.com> wrote:
>>When you say you "dithered" it at 14 bits, did you truncate/round the
>>data to 14 bits?  If not, you were just listening to added noise which
>>is very different.
>The bits below 14 (or whatever "N" was selected) were all 0.
>The code works in double precision floating point, scales as necessary
>to make 2^(N-1) represent full-scale, adds the dither (a fractional
>value), rounds [add 0.5 then floor()], and then re-scales to 16 bits.
That would be correct.

Tangentially, a vast range of fixed-point values and operations can be
cast to and from doubles without deviating from bit-exactness; that is,
the result will be the same as if you had tediously programmed it
entirely in fixed point.  Essentially all the projects I've worked on in
the past decade or so have been simulated in this manner.

Steve
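A sketch of the requantization step Greg describes, in the double-precision
style Steve mentions; the function names and the TPDF dither choice are
mine.  As long as the scaled values stay within the 53-bit exact-integer
range of a double, the result matches a pure fixed-point implementation bit
for bit.

#include <math.h>
#include <stdlib.h>

/* Uniform dither in [-0.5, 0.5) LSB. */
static double rpdf(void)
{
    return (double)rand() / ((double)RAND_MAX + 1.0) - 0.5;
}

/* Scale a +/-1.0 full-scale sample so 2^(nbits-1) represents full scale,
 * add dither, round by "add 0.5 then floor()", and clip to the legal range. */
static long quantize_to_bits(double x, int nbits)
{
    double fs = ldexp(1.0, nbits - 1);     /* 2^(nbits-1)        */
    double d  = rpdf() + rpdf();           /* TPDF, ~ +/-1 LSB   */
    double y  = floor(x * fs + d + 0.5);
    if (y >  fs - 1.0) y =  fs - 1.0;
    if (y < -fs)       y = -fs;
    return (long)y;
}

Greg's 14-bit experiment would then correspond to quantize_to_bits(x, 14)
followed by a left shift of two, so the value sits in a 16-bit container
with the bits below 14 all zero.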