Jon, I have wondered whether the blips/pops were at the boundaries, but have not investigated enough to determine whether they are. I suspect that they are not, but if they are I may well have to implement something like you describe. I think Erik is just trying to help me understand how to use libsamplerate properly, and from looking at his sample source code, it looks like I was not using it properly at first. I have not had time yet to examine the code but will do so today. Thanks for everyone's help. Jeff
Different Resampling Rates for Segments of a Sound File
Started by ●November 18, 2005
Reply by ●November 20, 2005
Reply by ●November 21, 2005
A few more questions if you have the time... :)

I modified the warping sample to use my own warping ratios. There are quite a few different segments in my example...11631 to be exact, and each segment is very small. And yes, they are intended to be contiguous. My assumption from looking at the "warp" array is that the indices represent offsets in units of samples, as opposed to frames (my input file is 16-bit, 48000Hz stereo). I set up the warp array such that the warp[i].ratio value was the floating point number that would scale the segment from warp[i].index to warp[i+1].index to my desired size (1.0/23.976). These ratios averaged about 0.31.

To check my work, I looped over the warp array, applied all the ratios to the duration of each segment, and came up with a total expected output size of approximately 485 seconds (my source file is 1570 seconds). This is the correct output size for my particular application, so all seemed well. However, after running this through the algorithm, the actual output file played back too slowly. Instead of being 485 seconds, it was 654 seconds.

To further complicate things, I actually want to throw away a chunk of audio at the beginning of the file. In other words, my first warp[0].index value is not zero. To accommodate this, I changed the seek function that seeks the beginning of the input file to seek to warp[0].index instead of just zero.

That being said, the output does in fact sound good...it is totally artifact-free...no blips, pops, or anything like that. I am sure that I am doing something wrong. Do you have any idea what I might be missing?

Sincerely, Jeff D.
Reply by ●November 21, 2005
jeffdod wrote:

> A few more questions if you have the time... :)
>
> I modified the warping sample to use my own warping ratios. There are
> quite a few different segments in my example...11631 to be exact, and
> each segment is very small. And yes, they are intended to be
> contiguous. My assumption from looking at the "warp" array is that the
> indices represent offsets in units of samples, as opposed to frames (my
> input file is 16-bit, 48000Hz stereo).

No, it is frames, not samples.

> I set up the warp array such
> that the warp[i].ratio value was the floating point number that would
> scale the segment from warp[i].index to warp[i+1].index to my desired
> size (1.0/23.976). Most of these ratios averaged about 0.31.

I seem to remember that in your original email, you wrote the sample rate was about 4 times higher than you wanted. If that was the case I would expect the ratio to be about 0.25.

> To further complicate things, I actually want to throw away a chunk of
> audio at the beginning of the file. In other words, my first
> warp[0].index value is not zero. To accommodate this, I changed the seek
> function that seeks the beginning of the input file to seek to
> warp[0].index instead of just zero.

That should be OK.

> That being said, the output does in fact sound good...it is totally
> artifact-free...no blips, pops, or anything like that.

What a wonderful advertisement for Secret Rabbit Code :-). It also means that you've got things basically right.

> I am sure that I
> am doing something wrong. Do you have any idea what I might be missing?

Really not sure what's going on here. I suspect that there is something not quite right with your calculation of the ratios. Interestingly, your ratio of 0.31, divided by the length you got (654) times the length you think it should be (485)

    0.31 * 485/654.0 = 0.229

which is very close to 0.25.
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
The main confusion about C++ is that its practitioners think it is simultaneously a high and low level language when in reality it is good at neither.
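Since the indices turn out to be frames rather than individual samples, the duration check Jeff describes can be redone in frame units. The following is only a minimal sketch under stated assumptions: `WARP_ENTRY`, `samples_to_frames` and `expected_output_frames` are hypothetical names modelled on the `warp[i].index` / `warp[i].ratio` fields mentioned in the thread, not code taken from the libsamplerate example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout modelled on the thread's warp[i].index / warp[i].ratio. */
typedef struct
{   long   index ;   /* start of the segment, counted in frames */
    double ratio ;   /* output length / input length for this segment */
} WARP_ENTRY ;

/* Convert an offset counted in individual samples to frames.
** For stereo, one frame holds two samples. */
static long
samples_to_frames (long samples, int channels)
{   assert (channels > 0) ;
    return samples / channels ;
}

/* Sum the expected output length (in frames) over all segments.
** Segment k runs from warp [k].index to warp [k + 1].index; the last
** segment is assumed to run to total_frames. */
static double
expected_output_frames (const WARP_ENTRY *warp, size_t count, long total_frames)
{   double total = 0.0 ;

    for (size_t k = 0 ; k < count ; k++)
    {   long end = (k + 1 < count) ? warp [k + 1].index : total_frames ;
        total += (double) (end - warp [k].index) * warp [k].ratio ;
        } ;

    return total ;
}
```

Dividing the result by the sample rate (48000 here) gives the expected duration in seconds, which can then be compared with the 485 seconds Jeff was aiming for.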
Reply by ●November 21, 2005
> No, it is frames, not samples.

Ah! Okay, I will take that into account.

> I seem to remember that in your original email, you wrote the sample
> rate was about 4 times higher than you wanted. If that was the case
> I would expect the ratio to be about 0.25.

Sorry if I created confusion there...0.25 was a guess on my part. The original file is 1570 seconds long and the resulting file should be 485, so the real average ratio is closer to 0.31.

> > To further complicate things, I actually want to throw away a chunk of
> > audio at the beginning of the file. In other words, my first
> > warp[0].index value is not zero. To accommodate this, I changed the seek
> > function that seeks the beginning of the input file to seek to
> > warp[0].index instead of just zero.
>
> That should be OK.

I also commented out the code that sets the ratio to 1.0 if warp[0].index is > zero. I simply go ahead and use warp[0].ratio from the beginning and increment the index into the warp array.

> What a wonderful advertisement for Secret Rabbit Code :-).

It is an impressive piece of work.

> Really not sure what's going on here. I suspect that there is something
> not quite right with your calculation of the ratios. Interestingly,
> your ratio of 0.31, divided by the length you got (654) times the length
> you think it should be (485)
>
> 0.31 * 485/654.0 = 0.229
>
> which is very close to 0.25.

I am sure of my average ratio, but my actual ratios fluctuate up and down around this average (or it wouldn't be an average!). There must be something wrong. If I hard-code all the ratios to 1.0, I get a file that is the same duration as the original. If I hard-code all the ratios to 0.10 (which is smaller than any actual ratio that I use) I get a file that is exactly one-tenth of the original. So I know that my ratios are within the bounds of what libsamplerate accepts. I will keep looking at it to see where I went wrong.

Thanks. Jeff Dodson
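The durations quoted above can be turned into an overall ratio directly, which makes the mismatch easier to see. This is a small illustrative sketch; `effective_ratio` is a hypothetical helper name, not part of libsamplerate.

```c
#include <assert.h>

/* Overall ratio implied by an input and output duration (same units). */
static double
effective_ratio (double input_seconds, double output_seconds)
{   assert (input_seconds > 0.0) ;
    return output_seconds / input_seconds ;
}
```

With the figures from the thread, `effective_ratio (1570.0, 485.0)` is roughly 0.309, matching Jeff's intended average, while `effective_ratio (1570.0, 654.0)` is roughly 0.417. So the 654-second file behaves as if an average ratio of about 0.417 had been applied, rather than the intended 0.31.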
Reply by ●November 21, 2005
Erik, I have figured out what the problem is, although I won't know how to fix it until I gain a better understanding of the code. It appears that the problem is in the "timewarp_convert" routine near the top of the while(1) loop where it figures out if it is time to advance to the next element in the warp array:

    if (warp_index < ARRAY_LEN (warp) - 1 && input_count >= warp [warp_index].index)
    {   src_data.src_ratio = warp [warp_index].ratio ;
        warp_index ++ ;
        } ;

It looks like input_count is never >= warp[warp_index].index, which causes the routine to use the first ratio throughout the entire file, instead of applying a different one from segment to segment.

Jeff Dodson
Reply by ●November 21, 2005
Oops. My mistake. Strike that last comment. The input_count does exceed the warp[warp_index].index value. The problem is that I ruined your ARRAY_LEN macro by changing the warp array to a dynamically-allocated array instead of a static array. I corrected this by replacing the ARRAY_LEN call with the integer size that I allocated the array with, and it now advances properly through the different scaling factors for the segments.

The file is now the correct duration (485 seconds) but the strange audio artifacts that I saw with my method are back! Basically what it sounds like is that each segment plays properly until near the end, and then there is a speed up, and then it slows back down to the proper speed. Very strange.
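The ARRAY_LEN problem described here is a common C pitfall: a `sizeof`-based length macro only works on a true array, and on a pointer to heap memory it silently yields the wrong count. The sketch below assumes the usual `sizeof (x) / sizeof ((x) [0])` definition of the macro (the thread does not show it) and uses hypothetical names (`WARP_TABLE`, `warp_table_alloc`, `warp_advance`) to show one way of carrying the element count alongside a dynamically allocated table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed definition; correct only when x is a real array, not a pointer. */
#define ARRAY_LEN(x)    (sizeof (x) / sizeof ((x) [0]))

typedef struct
{   long   index ;   /* segment start, in frames */
    double ratio ;   /* conversion ratio for the segment */
} WARP_ENTRY ;

/* Keep the element count next to the heap pointer. */
typedef struct
{   WARP_ENTRY *entries ;
    size_t     count ;
} WARP_TABLE ;

/* Returns 1 on success, 0 if the allocation failed. */
static int
warp_table_alloc (WARP_TABLE *table, size_t count)
{   assert (count > 0) ;
    table->entries = calloc (count, sizeof (WARP_ENTRY)) ;
    table->count = (table->entries != NULL) ? count : 0 ;
    return table->entries != NULL ;
}

static void
warp_table_free (WARP_TABLE *table)
{   free (table->entries) ;
    table->entries = NULL ;
    table->count = 0 ;
}

/* Same test as the loop quoted in the thread, but bounded by the stored
** count instead of ARRAY_LEN. Returns the updated warp_index. */
static size_t
warp_advance (const WARP_TABLE *table, size_t warp_index, long input_count, double *ratio)
{   if (warp_index + 1 < table->count && input_count >= table->entries [warp_index].index)
    {   *ratio = table->entries [warp_index].ratio ;
        warp_index ++ ;
        } ;
    return warp_index ;
}
```

Like the quoted condition with `ARRAY_LEN (warp) - 1`, this never applies the ratio of the final entry, so the last element effectively acts as an end marker.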
Reply by ●November 21, 2005
Problem solved. I upped the BUFFER_LEN to be larger than my largest segment size (I set it to 32768) and the strange audio artifacts disappeared. Thank you so much for your help! Jeff D.
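Jeff's fix was to enlarge BUFFER_LEN. A different approach, not taken from the thread and not tested against libsamplerate, would be to shorten each block so that it ends exactly on the next segment boundary, so that a ratio change always coincides with the start of a block. The helper below is a hypothetical sketch of that arithmetic only; `frames_until_boundary` and its parameters are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct
{   long   index ;   /* segment start, in frames */
    double ratio ;   /* conversion ratio for the segment */
} WARP_ENTRY ;

/* Number of input frames to feed in the next block so that the block does
** not run past the start of segment `next`. `input_count` is the number of
** input frames consumed so far; `buffer_len` is the normal block size. */
static long
frames_until_boundary (const WARP_ENTRY *warp, size_t count, size_t next,
                       long input_count, long buffer_len)
{   assert (buffer_len > 0) ;

    if (next < count && warp [next].index > input_count)
    {   long remaining = warp [next].index - input_count ;
        if (remaining < buffer_len)
            return remaining ;
        } ;

    /* No boundary inside this block (or no segments left). */
    return buffer_len ;
}
```

For example, with segment starts at frames 0, 6000 and 13000 and a block size of 4096, the first block would be 4096 frames and the second 1904, so the block that follows begins exactly at frame 6000.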
Reply by ●November 21, 2005
jeffdod wrote:

> Problem solved. I upped the BUFFER_LEN to be larger than my largest
> segment size (I set it to 32768) and the strange audio artifacts
> disappeared. Thank you so much for your help!

Glad to hear that.

Cheers,
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
Saying Python is easier than C++ is like saying that turning a light switch on or off is easier than operating a nuclear reactor.