How do I handle the mixing of multiple input sources without overflow? The number of inputs is not known until runtime, and can change at any point during the execution of the program. The data is just 16-bit PCM samples, so the mixing is essentially done by calculating the sum of the samples at each point. The calculations are done in 32 bits, but then they have to be converted back to 16 bits, and this is where I'm stuck.

So far I have tried the following solutions:

- Multiplying the input sources by a constant to make them quieter. However, when only one input is playing it is very quiet, and distortion is still audible if many inputs are playing at the same time.

- Dividing the output value by the number of active input channels at any given time. While this eliminates clipping, it creates a noticeable and uncomfortable change in volume when a new input is introduced.

Surely there is some sort of equation I can use that solves this problem, since software DAWs can pull this off without a problem.

I would appreciate any sort of help, even just pointing me in the direction of a book or website, as this problem has had me stuck for too long.

Thanks
Dealing with clipping/overflow
Started by ●July 14, 2008
Reply by ●July 14, 2008
myxit wrote:
> How do I handle the mixing of multiple input sources without overflow? The number of inputs is not known until runtime, and can change at any point during the execution of the program. The data is just 16-bit PCM samples, so the mixing is essentially done by calculating the sum of the samples at each point. The calculations are done in 32 bits, but then they have to be converted back to 16 bits, and this is where I'm stuck.
>
> So far I have tried the following solutions:
>
> - Multiplying the input sources by a constant to make them quieter. However, when only one input is playing it is very quiet, and distortion is still audible if many inputs are playing at the same time.
>
> - Dividing the output value by the number of active input channels at any given time. While this eliminates clipping, it creates a noticeable and uncomfortable change in volume when a new input is introduced.
>
> Surely there is some sort of equation I can use that solves this problem, since software DAWs can pull this off without a problem.
>
> I would appreciate any sort of help, even just pointing me in the direction of a book or website, as this problem has had me stuck for too long.

/Deja vu/ all over again!

The appropriate action depends on the nature of the channels to be mixed and the use the result will be put to. You need to clarify the issues.

Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by ●July 14, 2008
> /Deja vu/ all over again!
>
> The appropriate action depends on the nature of the channels to be mixed and the use the result will be put to. You need to clarify the issues.
>
> Jerry

Sorry if this has been asked before. I did check, but I can't find anything that matches exactly what I'm after.

Essentially the program is a sequencer. It takes input from a score file which tells it what notes to play at any given time. At any point in the mix there could be as few as 0 inputs and as many as 100+ (rare, but possible).

I have used Cubase, which does the same thing in its MIDI track editor, and no matter how many notes you throw at it, the sound is clear and undistorted. With my program, any more than about 5 notes and it sounds terrible.
Reply by ●July 14, 2008
"myxit" <isurgey@gmail.com> writes:
>> /Deja vu/ all over again!
>>
>> The appropriate action depends on the nature of the channels to be mixed and the use the result will be put to. You need to clarify the issues.
>>
>> Jerry
>
> Sorry if this has been asked before. I did check, but I can't find anything that matches exactly what I'm after.
>
> Essentially the program is a sequencer. It takes input from a score file which tells it what notes to play at any given time. At any point in the mix there could be as few as 0 inputs and as many as 100+ (rare, but possible).
>
> I have used Cubase, which does the same thing in its MIDI track editor, and no matter how many notes you throw at it, the sound is clear and undistorted. With my program, any more than about 5 notes and it sounds terrible.

There's no easy solution. If you MUST avoid overflow and MUST handle with NO KNOWLEDGE up to 100 inputs, then you MUST scale each one down by ceil(log2(100)) bits (7 bits).

However, I think that most MIDI sequencers are designed for N-note polyphony, which means that you only have to scale down by ceil(log2(N)) bits.

--
Randy Yates, Fuquay-Varina, NC, 919-577-9882, <yates@ieee.org>
"She tells me that she likes me very much, but when I try to touch, she makes it all too clear."
'Yours Truly, 2095', *Time*, ELO
http://www.digitalsignallabs.com
Reply by ●July 14, 2008
myxit wrote:
>> /Deja vu/ all over again!
>>
>> The appropriate action depends on the nature of the channels to be mixed and the use the result will be put to. You need to clarify the issues.
>>
>> Jerry

...

> Sorry if this has been asked before. I did check, but I can't find anything that matches exactly what I'm after.
>
> Essentially the program is a sequencer. It takes input from a score file which tells it what notes to play at any given time. At any point in the mix there could be as few as 0 inputs and as many as 100+ (rare, but possible).
>
> I have used Cubase, which does the same thing in its MIDI track editor, and no matter how many notes you throw at it, the sound is clear and undistorted. With my program, any more than about 5 notes and it sounds terrible.

A recent query concerned telephone conferencing. Among the points that I made was the observation that it matters only a little when many talk at once, because no one will be intelligible anyway. Music is another matter; that's what makes headroom important.

How does the peak output of a single note in Cubase compare with that of a full chord doubled in octaves? If it is nearly the same, then an adaptive gain is used (which would sound pretty poor for music). Otherwise, there is ample headroom.

If no single note uses more than 12 bits full scale and processing uses 24 bits, then there is 4096:1 (about 72 dB) of room for expansion. That's just an example. Since the output will have to be shortened with truncation (with or without fraction saving) and dithering, it is probably more headroom than the best amount.

Jerry
--
Engineering is the art of making what you want from things you can get.
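A quick check of the headroom arithmetic in the post above, as a minimal Python sketch. It assumes the 4096:1 figure is an amplitude ratio (12 spare bits between a 12-bit note and a 24-bit processing word); the function name `headroom_db` is made up for illustration.

```python
import math

def headroom_db(signal_bits: int, processing_bits: int) -> float:
    """Headroom in dB between a signal word and a wider processing word."""
    spare_bits = processing_bits - signal_bits
    amplitude_ratio = 2 ** spare_bits          # 12 spare bits -> 4096:1
    return 20.0 * math.log10(amplitude_ratio)  # roughly 6.02 dB per bit

print(round(headroom_db(12, 24), 1))  # 72.2
```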
Reply by ●July 14, 2008
> There's no easy solution. If you MUST avoid overflow and MUST handle with NO KNOWLEDGE up to 100 inputs, then you MUST scale each one down by ceil(log2(100)) bits (7 bits).
>
> However, I think that most MIDI sequencers are designed for N-note polyphony, which means that you only have to scale down by ceil(log2(N)).

I've done a bit of research and it seems that 32-note polyphony is about the highest for most MIDI sequencers. So I'll just go with that and adjust my program to accept no more than 32 notes at the same time. I presume this will give me enough headroom, as Jerry mentioned, with which to mix without overflow.

However, I'm not exactly sure what calculation must be made to scale down by 4 bits ( ceil(log2(32)) ), and should this calculation be done for each sample of the 16-bit input streams, or can it be done on the 32-bit output buffer once all additions have been made?
Reply by ●July 14, 2008
"myxit" <isurgey@gmail.com> writes:
>> There's no easy solution. If you MUST avoid overflow and MUST handle with NO KNOWLEDGE up to 100 inputs, then you MUST scale each one down by ceil(log2(100)) bits (7 bits).
>>
>> However, I think that most MIDI sequencers are designed for N-note polyphony, which means that you only have to scale down by ceil(log2(N)).
>
> I've done a bit of research and it seems that 32-note polyphony is about the highest for most MIDI sequencers. So I'll just go with that and adjust my program to accept no more than 32 notes at the same time. I presume this will give me enough headroom, as Jerry mentioned, with which to mix without overflow.
>
> However, I'm not exactly sure what calculation must be made to scale down by 4 bits ( ceil(log2(32)) ),

That would be 5 bits.

> and should this calculation be done for each sample of the 16-bit input streams, or can it be done on the 32-bit output buffer once all additions have been made?

If each of your 32 inputs is 16 bits wide, then ideally you would use an accumulator that is at least 21 bits wide. Of course a long int (32-bit int) would do just fine. The procedure would be to sum all N of the potential inputs and then take bits 5 through 20 (counting from bit 0), with rounding, dithering, or noise-shaping, as the final output. If you simply round you'll probably be fine. The other techniques would take you some time to grok.

An alternative that would preserve more of the low-level detail at the cost of potential clipping would be to saturate the final word at bit M, where M is less than 20, then take bits 5 - (20 - M) to M (that is, bits M - 15 to M) as the final output.

--
Randy Yates, Fuquay-Varina, NC, 919-577-9882, <yates@ieee.org>
"Midnight, on the water... I saw... the ocean's daughter."
'Can't Get It Out Of My Head', *El Dorado*, Electric Light Orchestra
http://www.digitalsignallabs.com
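The procedure described in this post can be sketched roughly as follows, in Python for brevity. This is only an illustration under stated assumptions: the names `headroom_bits` and `mix_voices` are hypothetical, each voice is a list of 16-bit signed integers of equal length, and Python's unbounded integers stand in for the 32-bit accumulator. The shift is ceil(log2(max_voices)), so 32 voices give a shift of 5; the result is rounded and then saturated to the 16-bit range as a safety net.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def headroom_bits(max_voices: int) -> int:
    """Return ceil(log2(max_voices)) using integer arithmetic only."""
    if max_voices < 1:
        raise ValueError("max_voices must be at least 1")
    return (max_voices - 1).bit_length()

def mix_voices(voices, shift):
    """Sum equal-length 16-bit voices per sample, then drop `shift` low bits."""
    rounding = (1 << shift) >> 1  # half an output LSB; 0 when shift is 0
    mixed = []
    for frame in zip(*voices):           # one sample from every voice
        acc = sum(frame)                 # wide accumulator
        y = (acc + rounding) >> shift    # arithmetic right shift with rounding
        mixed.append(max(INT16_MIN, min(INT16_MAX, y)))  # saturate to 16 bits
    return mixed
```

With `shift = headroom_bits(32)` the saturation step never triggers for up to 32 voices. Passing a smaller shift corresponds to the alternative of saturating at a lower bit M: quiet passages keep more low-level detail, at the cost of clipping when many loud voices coincide.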
Reply by ●July 14, 2008
> If each of your 32 inputs is 16 bits wide, then ideally you would use an accumulator that is at least 21 bits wide. Of course a long int (32-bit int) would do just fine. The procedure would be to sum all N of the potential inputs and then take bits 5 through 20 (counting from bit 0), with rounding, dithering, or noise-shaping, as the final output. If you simply round you'll probably be fine. The other techniques would take you some time to grok.
>
> An alternative that would preserve more of the low-level detail at the cost of potential clipping would be to saturate the final word at bit M, where M is less than 20, then take bits 5 - (20 - M) to M as the final output.

That's done it! I just did a bitwise shift right on the 32-bit accumulator and cast it down to a short. Sounds perfect and no distortion.

Thank you so much. You have no idea how long I've been scratching my head over this one!
Reply by ●July 14, 2008
myxit wrote:

...

> That's done it! I just did a bitwise shift right on the 32-bit accumulator and cast it down to a short. Sounds perfect and no distortion.
>
> Thank you so much. You have no idea how long I've been scratching my head over this one!

Great! More sophisticated would be adding fraction saving, then dithering. If you do that, there's no need to divide by a power of two if you need to squeeze out the last bit of headroom. Any divisor will do.

Jerry
--
Engineering is the art of making what you want from things you can get.
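As a rough sketch of what fraction saving with an arbitrary divisor might look like (in Python for brevity): the remainder lost by each integer division is carried into the next sample instead of being discarded, so the long-term average of the output matches the scaled input. The name `requantize_fraction_saving` and the list-based interface are assumptions for illustration, and dither is deliberately left out to keep the example short.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def requantize_fraction_saving(acc_samples, divisor):
    """Divide wide accumulator samples by `divisor`, carrying the remainder forward."""
    if divisor < 1:
        raise ValueError("divisor must be a positive integer")
    carry = 0   # fraction saved from the previous sample, 0 <= carry < divisor
    out = []
    for acc in acc_samples:
        # divmod floors the quotient, so the remainder is never negative
        quotient, carry = divmod(acc + carry, divisor)
        out.append(max(INT16_MIN, min(INT16_MAX, quotient)))  # saturate to 16 bits
    return out

# A constant input of 10 divided by 4 alternates between 2 and 3 (average 2.5).
print(requantize_fraction_saving([10, 10, 10, 10], 4))  # [2, 3, 2, 3]
```

The divisor does not have to be a power of two: for example, a divisor of 3 works the same way.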
Reply by ●July 14, 2008
myxit wrote:
> Essentially the program is a sequencer. It takes input from a score file which tells it what notes to play at any given time. At any point in the mix there could be as few as 0 inputs and as many as 100+ (rare, but possible).

An alternative to hard clipping would be soft clipping. That adds some harmonics, but for audio that may be okay.

The tanh function occurs naturally when transistor differential-pair circuits go into the non-linear region. That adds harmonics/distortion to the signal, but in a pleasant way, and musicians are used to the sound.

My first try would be a polynomial soft clipper based on tanh. Musicians will notice when they go into the red zone and either live with it or turn down their channel volume. They may even start to like a tad of grind. :-)

It's much more musical than hard clipping in any case.

Nils
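A minimal sketch of the soft-clipping idea, in Python for brevity. It applies `math.tanh` directly rather than the polynomial approximation mentioned in the post, and the name `soft_clip` and the `drive` parameter are assumptions for illustration. The input is the wide accumulator value; the output always fits in 16 bits.

```python
import math

INT16_MAX = 32767

def soft_clip(acc: int, drive: float = 1.0) -> int:
    """Map a wide accumulator value into the 16-bit range with a tanh curve."""
    x = drive * acc / INT16_MAX             # normalise so full scale is about 1.0
    return int(round(INT16_MAX * math.tanh(x)))  # |tanh| <= 1 bounds the output
```

Small values pass through almost unchanged (tanh is close to linear near zero), while large sums are squeezed smoothly toward full scale instead of being cut off flat. A polynomial fitted to this curve would trade some accuracy for speed.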






