Dear all,

Any shred of knowledge on this will help me out.

I am working on a module in which I have to mix two audio/speech files. It looks simple: add the corresponding samples of the two audio files and write the result to the mixed file.

The problem is that if I simply add the samples of the two files, the sum may overflow the sample range. So I decided to divide each sample by two, then add the data and write it to the file. The resulting mixed WAV file has low volume, which is expected: dividing each sample by two decreases the amplitude.

So I took another approach to mixing the audio files. Let the two signals be A and B, with a range of 0 to 255:

Y = A + B - A * B / 255

where Y is the resulting signal, which contains both A and B. Merging two audio streams into a single stream by this method solves the problem of overflow and information loss to an extent.

If the range of 8-bit sampling is -127 to 128:

If both A and B are negative: Y = A + B - (A * B / (-127))
Else: Y = A + B - A * B / 128

For an n-bit audio signal:

If both A and B are negative: Y = A + B - (A * B / (-2 pow(n-1) - 1))
Else: Y = A + B - (A * B / (2 pow(n-1)))

With this approach I get good sound quality when mixing two audio signals. But as I increase the number of files to be mixed, I hear some disturbance (noise), and the disturbance grows with the number of files. What is the reason for this? Is there some underlying hardware problem, or does the sound quality depend on the recording device? I would like to hear your views on this.
Personally, I think it may be due to the following factors:

1: Digital computation (quantization) error: http://www.filter-solutions.com/quant.html
2: An aggressive increase in the amplitude of the mixed file as the number of audio files increases. That is, the more files are mixed, the more the resulting sample values tend toward the ends of the range: toward 32767 for positive samples and toward -32768 when the samples are negative. (Here I am talking about 16-bit audio data sampled at 8 kHz.)

So is there any other approach by which I can assure myself that the mixed audio data is noise free? (At most I have to mix 10 audio files.)

One more query: what is the reason for the distortion when a recording is made at a low level and the same file is then played back? Is there any distortion in it? In my perception there is distortion between the recording and the playback of the same audio file. I state my views below (correct me wherever I am wrong).

Explanation 1: Even with a good A/D and D/A converter for recording and playing back the audio files, distortion enters the picture. We know that digital recording is extremely accurate due to its high signal-to-noise (S/N) ratio. At low levels, digital is actually better than analog, due to its 90 dB dynamic range. The best we can get from phonograph (a recording and playback device) records is around 60 dB; more precisely, around 40 dB. We can hear a range of 120-plus dB. This is why recordings use a lot of compression (a compressor is an electronic device that quickly turns the volume up when the music or speech is soft and quickly turns it down when it is loud). The compressor, and the word "quickly", imply some loss of digital data at both ends (high and low).
Since low-level surrounding detail is completely stripped by digitizing when we record at a low level, digitizing the low-level signal loses relevant information, which results in distortion.

Note: Sound cards use A/D and D/A converters, which involve a sampling frequency. It is not certain that the exact sampling frequency is the same on different sound cards; it may vary by a very small amount. This can also cause distortion at low levels.

Explanation 2: Suppose we record audio on one system (the recording device) with the volume control set all the way to its low level. If this recorded file is played back on another system with the volume control at the same low setting, and we do not vary the setting, it will play without distortion. If the volume settings at recording and at playback differ, the result will be some sort of distortion.

Note: Any variance between the recording and playback volume settings will also cause distortion. So a file recorded at a low level will show some distortion if we play it on another system at a very high level.

Explanation 3: Some software and hardware use the concept of normalisation in various algorithms. Some normalisers are basically "volume expanders" and some are "limiters". They stretch the dynamic range of the material: the quiet sounds in the original remain at their original level, while the loudest sounds are raised to the peak level permitted by the recording process, and whatever lies in between is raised proportionately (adaptive increase). This also distorts the original recorded sound.
Hence, to hear the soft (low-volume) parts of the audio file we have to increase the volume, so all of the enhanced signal is played as well, causing distortion.

Note: Sound recorded at a low level with normalisation can also cause distortion. Very loud music and speech are recorded with a compressor/expander algorithm that uses normalisation.

One more thing: what are the lower and upper limits for recording 16-bit data at an 8 kHz sampling frequency so that there is no noise in the recorded and played-back audio file?

Any shred of knowledge on this will help me out.

Thanks in advance.

Regards,
Ranjeet
What is the flaw in my understanding?
Started by ●December 14, 2004
Reply by ●December 14, 2004
"ranjeet" <ranjeet.gupta@gmail.com> wrote in message news:77c88a3b.0412141310.1397d0ff@posting.google.com...> > [Snip] > Let the two signal be A and B respectively, the range is between 0 and255.> > Y = A + B - A * B / 255 > > [Snip]I do not understand what you are trying to do here, I have not seen the approach before. But I can tell you that multiplication in time is equivalent to convolution in frequency so the spectra of signal Y(z) contains the spectra of A(z) and B(z) Plus the convolution of A(z)(*)B(z) which will add noise to the final result. The more of these signals you mix in this manner the more noise you are going to add. Scaling by 1/2 to avoid overflow will guarantee that no y(k) result will overflow, but at the cost of overall (on average) smaller signals. In making scaling decisions to prevent overflow, one approach is to think of the signals as random and look the pdfs. What is the probability that A + B will be greater than 255? Then make a tradeoff between nice large robust signals and the probabilty that every once in a while a signal may be clipped and choose a scale factor somewhere between 1 (highest probability of overflow) and 1/2 (no probability of overflow). Also, it helps to saturate on overflow (rather than wrap around) so that the overflow only appears as a slight distortion. (not a terribly wrong answer with the wrong sign) -Shawn Steenhagen
Reply by ●December 14, 2004
"Shawn Steenhagen" <shawn.NSsteenhagen@NSappliedsignalprocessing.com> wrote in message news:SmJvd.731$qQ4.531@fe03.lga...> > "ranjeet" <ranjeet.gupta@gmail.com> wrote in message > news:77c88a3b.0412141310.1397d0ff@posting.google.com... > > > > [Snip] > > Let the two signal be A and B respectively, the range is between 0 and > 255. > > > > Y = A + B - A * B / 255 > > > > [Snip] > > I do not understand what you are trying to do here, I have not seen the > approach before. But I can tell you that multiplication in time is > equivalent to convolution in frequency so the spectra of signal Y(z) > contains the spectra of A(z) and B(z) Plus the convolution of A(z)(*)B(z) > which will add noise to the final result. The more of these signals you mix > in this manner the more noise you are going to add. > > Scaling by 1/2 to avoid overflow will guarantee that no y(k) result will > overflow, but at the cost of overall (on average) smaller signals. In > making scaling decisions to prevent overflow, one approach is to think of > the signals as random and look the pdfs. What is the probability that A + B > will be greater than 255? Then make a tradeoff between nice large robust > signals and the probabilty that every once in a while a signal may be > clipped and choose a scale factor somewhere between 1 (highest probability > of overflow) and 1/2 (no probability of overflow). > > Also, it helps to saturate on overflow (rather than wrap around) so that the > overflow only appears as a slight distortion. (not a terribly wrong answer > with the wrong sign)Regarding the scaling, since you are working with files, you may be able to analyze the results after mixing and then apply an appropriate scaling factor to maximize peak value but avoid clipping. This is usually called normalization. One simple approach would be to use a conservative scaling factor to guarantee overflow will not occur and, as you are mixing the files, keep a running tab on the maximum value you ever encounter. 
When finished, find the scaling factor G = full_scale/max_value, where full_scale is the maximum number your wave format can handle (probably 2^15 - 1 for 16-bit signed). Then multiply the mixed result file by G. The result should be a file whose maximum output level is as large as possible without clipping.
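A minimal sketch of this two-pass normalization follows, under a few assumptions: samples are 16-bit signed integers in Python lists, and instead of a conservative pre-scale the sums are simply held in Python's unbounded integers (a wider accumulator plays the same role in C). The function name `mix_and_normalize` is hypothetical.

```python
FULL_SCALE = 32767  # 2**15 - 1, the largest positive 16-bit signed sample

def mix_and_normalize(tracks, full_scale=FULL_SCALE):
    """Sum several equal-length tracks, then rescale so the peak hits full scale."""
    mixed = []
    max_value = 0
    # Pass 1: plain sum per sample position, keeping a running peak.
    for samples in zip(*tracks):
        total = sum(samples)
        mixed.append(total)
        max_value = max(max_value, abs(total))

    if max_value == 0:
        return mixed  # silence: nothing to scale

    # Pass 2: G = full_scale / max_value, applied to every mixed sample.
    gain = full_scale / max_value
    return [round(total * gain) for total in mixed]

tracks = [
    [1000, -2000, 500],
    [3000, -2000, -500],
]
print(mix_and_normalize(tracks))
```

Because the gain is computed from the actual peak of the mix, it comes out below 1 when the raw sum would overflow and above 1 when the mix is quiet.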
Reply by ●December 14, 2004
"ranjeet" <ranjeet.gupta@gmail.com> wrote in message news:77c88a3b.0412141310.1397d0ff@posting.google.com...> Dear All !! > > **************************************************** > Any shed of the Kowledge on this will help my me out > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > I am working on the module in which i have to mix the two (audio/speech) > files > Its look simple to add the each samples of the two diffrent audio file > and > then write into the Mixed file. > > But here comes the problem That if i simply add the two diffrent audio > files > (Each samples) then there may be over flow of the range, so I decided to > divide the each sample by two and then add the data and write into the > file. > > what I observed that the resultant mixed wav file whcih I got has the low > volume, and this is obvious that as i am dividing the value of each > sample by > two. So it is decreasing the amplitude level.The objective is to add the two files together. So far, so good. You didn't say how the files themselves were scaled in the first place - but it appears that their volume is adequate. Is that right? If you add two uncorrelated files together for mixing purposes then it may well be similar to adding to noise records together. The resulting amplitude is an increase of sqrt(2) and not 2. So, perhaps you'd do better to divide each by sqrt(2). Some amount of clipping is likely but may be acceptable. Obviously what you do is dependent on your implementation and the tools that are available. Fred
Reply by ●December 15, 2004
Hi Ranjeet,

if you distort your signal you get distortion. It's as simple as that.

I'm not quite sure how I should read your formulas. For example, when you write "Y = A + B - (A * B / (-2 pow(n-1) - 1))", what is the * supposed to mean? Convolution? It can't be multiplication, because you write "-2 pow(n-1)", which contains an explicit multiplication that you don't write using '*'. And what is the last '1' standing for?

In general, if you use a nonlinear process for mixing your signals (which is how I *think* I can interpret your description), you are distorting the shape of their waveforms, which will add distortion noise. The more signals you mix in this manner (and the more non-linearly you scale them), the more noise will be introduced.

As others have already said, you need to scale the N signals by 1/N in the worst case, and if you start out with 8 bit signals you're losing a lot of information in the process. I would recommend you convert your signals to floating point first and do the mixing there. You can then scale the sum later as you see fit, or better yet, normalize so your output signal fits into the target wordlength.

--
Stephan M. Bernsee
http://www.dspdimension.com
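The information loss from pre-scaling small integer samples can be seen in a toy case. This sketch assumes ten streams that each contribute one small 8-bit sample at the same instant; the function names `mix_prescaled_int` and `mix_float` are hypothetical.

```python
def mix_prescaled_int(samples):
    """Scale each sample by 1/N in integer arithmetic, then add."""
    n = len(samples)
    # int() truncates toward zero, as integer division does in C.
    return sum(int(s / n) for s in samples)

def mix_float(samples):
    """Add in floating point, scale by 1/N once, round at the very end."""
    n = len(samples)
    return round(sum(float(s) for s in samples) / n)

# One sample from each of ten quiet 8-bit streams.
samples = [7, 9, 4, 8, 6, 5, 9, 7, 8, 6]

print(mix_prescaled_int(samples))  # every term truncates to 0
print(mix_float(samples))          # the average survives
```

Each small sample vanishes when divided by ten before the addition, whereas summing in floating point and rounding once keeps the mixed value.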
Reply by ●December 15, 2004
On 2004-12-15 07:47:21 +0100, Stephan M. Bernsee <spam@dspdimension.com> said:
> I'm not quite sure how I should read your formulas. [...] And what is
> the last '1' standing for?

Ah, looks like my news reader is ballsing up the formula. When I look at it through Google groups I see that there's a minus before the '1'. In my news reader there isn't, because you didn't use a minus but an Em dash...!

Never mind.

--
Stephan M. Bernsee
http://www.dspdimension.com
Reply by ●December 15, 2004
ranjeet wrote:
> I am working on the module in which i have to mix the two (audio/speech) files
> Its look simple to add the each samples of the two diffrent audio file and
> then write into the Mixed file.

> But here comes the problem That if i simply add the two diffrent audio files
> (Each samples) then there may be over flow of the range, so I decided to
> divide the each sample by two and then add the data and write into the file.

You should add them together with one extra bit available, and then divide by two. The difference is in rounding.

> what I observed that the resultant mixed wav file whcih I got has the low
> volume, and this is obvious that as i am dividing the value of each sample by
> two. So it is decreasing the amplitude level.
>
> So I took another Way to mixed the audio files.
>
> Let the two signal be A and B respectively, the range is between 0 and 255.
>
> Y = A + B - A * B / 255

(snip)

Don't do that. A*B is the equivalent of a modulator, with the right sign convention a balanced modulator, but not what you want when adding signals. This is the term that creates intermodulation distortion in audio signals.

-- glen
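The rounding difference between the two orders of operation mentioned earlier in this reply (halve then add, versus add with an extra bit then halve) shows up even with small values. This is a sketch with made-up function names; Python integers are unbounded, so the "one extra bit" for the intermediate sum is implicit here, whereas in C it would mean using a wider type for the sum.

```python
def halve_then_add(a, b):
    """Divide each sample by two first: each shift discards its own low bit."""
    return (a >> 1) + (b >> 1)

def add_then_halve(a, b):
    """Add at full precision (needs one extra bit), then divide once."""
    return (a + b) >> 1

print(halve_then_add(3, 5))      # 1 + 2 = 3
print(add_then_halve(3, 5))      # 8 >> 1 = 4, the exact average
print(add_then_halve(127, 127))  # intermediate sum 254 needs the extra bit
```

When both inputs are odd, halving first throws away two half-steps, so the result is one count lower than the true average.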