Reply by Erik de Castro Lopo November 21, 2005
jeffdod wrote:
> Problem solved. I upped the BUFFER_LEN to be larger than my largest
> segment size (I set it to 32768) and the strange audio artifacts
> disappeared. Thank you so much for your help!

Glad to hear that.

Cheers,
Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
Saying Python is easier than C++ is like saying that turning a
light switch on or off is easier than operating a nuclear reactor.

Reply by jeffdod November 21, 2005
Problem solved. I upped the BUFFER_LEN to be larger than my largest
segment size (I set it to 32768) and the strange audio artifacts
disappeared. Thank you so much for your help!

Jeff D.

Reply by jeffdod November 21, 2005
Oops. My mistake. Strike that last comment. The input_count does exceed
the warp[warp_index].index value. The problem is that I ruined your
ARRAY_LEN macro by changing the warp array to a dynamically-allocated
array instead of a static array. I corrected this by replacing the
ARRAY_LEN call with the integer size that I have allocated the array
for, and it advances properly through the different scaling factors for
the segments. The file is now the correct duration (485 seconds) but
the strange audio artifacts that I saw with my method are back!
Basically what it sounds like is that each segment plays properly until
near the end, and then there is a speed up, and then it slows back down
to the proper speed. Very strange.
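
A minimal sketch of why ARRAY_LEN stops working once the warp array is heap-allocated, and one way to carry the length explicitly instead. The macro definition shown is the common sizeof idiom and is assumed to match the one in the example program; WARP_POINT, WARP_TABLE, warp_table_alloc, and warp_advance are hypothetical names for illustration only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical mirror of the warp entries discussed in this thread. */
typedef struct
{   long   index ;   /* input frame at which the ratio takes effect */
    double ratio ;   /* conversion ratio for the segment */
} WARP_POINT ;

/* Common idiom (assumed here). It only works on true arrays: applied
 * to a pointer, sizeof (x) is the pointer size, not the array size. */
#define ARRAY_LEN(x)   (sizeof (x) / sizeof ((x) [0]))

/* A heap-allocated table has to remember its own element count. */
typedef struct
{   WARP_POINT *points ;
    size_t      count ;
} WARP_TABLE ;

/* Allocate a zeroed table; returns 1 on success, 0 on failure. */
static int warp_table_alloc (WARP_TABLE *table, size_t count)
{   table->points = calloc (count, sizeof (WARP_POINT)) ;
    table->count = (table->points != NULL) ? count : 0 ;
    return table->points != NULL ;
}

/* Same test as in timewarp_convert, but using the stored count.
 * Returns the (possibly advanced) warp index and updates *ratio
 * when a segment boundary has been reached. */
static size_t warp_advance (const WARP_TABLE *table, size_t warp_index,
                            long input_count, double *ratio)
{   if (warp_index + 1 < table->count
            && input_count >= table->points [warp_index].index)
    {   *ratio = table->points [warp_index].ratio ;
        warp_index ++ ;
    }
    return warp_index ;
}
```

Storing the count next to the pointer keeps the boundary test correct however the table is allocated; replacing the macro call with a hard-coded integer works too, but has to be kept in sync by hand.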

Reply by jeffdod November 21, 2005
Erik,

I have figured out what the problem is, although I won't know how to
fix it until I gain a better understanding of the code. It appears that
the problem is in the "timewarp_convert" routine near the top of the
while(1) loop where it figures out if it is time to advance to the next
element in the warp array:

    if (warp_index < ARRAY_LEN (warp) - 1 && input_count >= warp [warp_index].index)
    {
        src_data.src_ratio = warp [warp_index].ratio ;
        warp_index ++ ;
    } ;

It looks like input_count is never >= warp[warp_index].index, which
causes the routine to use the first ratio throughout the entire file,
instead of applying a different one from segment to segment.

Jeff Dodson

Reply by jeffdod November 21, 2005
> No, it is frames, not samples.

Ah! Okay, I will take that into account.

> I seem to remember that in your original email, you wrote the sample
> rate was about 4 times higher than you wanted. If that was the case
> I would expect the ratio to be about 0.25.

Sorry if I created confusion there...0.25 was a guess on my part. The
original file is 1570 seconds long and the resulting file should be 485,
so the real average ratio is closer to 0.31.

> > To further complicate things, I actually want to throw away a chunk of
> > audio at the beginning of the file. In other words, my first
> > warp[0].index value is not zero. To accommodate this, I changed the seek
> > function that seeks the beginning of the input file to seek to
> > warp[0].index instead of just zero.
>
> That should be OK.

I also commented out the code that sets the ratio to 1.0 if warp[0].index
is greater than zero. I simply go ahead and use warp[0].ratio from the
beginning and increment the index into the warp array.

> What a wonderful advertisement for Secret Rabbit Code :-).

It is an impressive piece of work.

> Really not sure what's going on here. I suspect that there is something
> not quite right with your calculation of the ratios. Interestingly,
> your ratio of 0.31, divided by the length you got (654) times the length
> you think it should be (485)
>
>     0.31 * 485/654.0 = 0.229
>
> which is very close to 0.25.

I am sure of my average ratio, but my actual ratios fluctuate up and down
around this average (or it wouldn't be an average!). There must be
something wrong. If I hard-code all the ratios to 1.0, I get a file that
is the same duration as the original. If I hard-code all the ratios to
0.10 (which is smaller than any actual ratio that I use) I get a file
that is exactly one-tenth of the original. So I know that my ratios are
within the bounds of what libsamplerate accepts. I will keep looking at
it to see where I went wrong.

Thanks.

Jeff Dodson

Reply by Erik de Castro Lopo November 21, 2005
jeffdod wrote:
> A few more questions if you have the time... :)
>
> I modified the warping sample to use my own warping ratios. There are
> quite a few different segments in my example...11631 to be exact, and
> each segment is very small. And yes, they are intended to be
> contiguous. My assumption from looking at the "warp" array is that the
> indices represent offsets in units of samples, as opposed to frames (my
> input file is 16-bit, 48000Hz stereo).

No, it is frames, not samples.

> I set up the warp array such
> that the warp[i].ratio value was the floating point number that would
> scale the segment from warp[i].index to warp[i+1].index to my desired
> size (1.0/23.976). Most of these ratios averaged about 0.31.

I seem to remember that in your original email, you wrote the sample
rate was about 4 times higher than you wanted. If that was the case
I would expect the ratio to be about 0.25.

> To further complicate things, I actually want to throw away a chunk of
> audio at the beginning of the file. In other words, my first
> warp[0].index value is not zero. To accomodate this, I changed the seek
> function that seeks the beginning of the input file to seek to
> warp[0].index instead of just zero.

That should be OK.

> That being said, the output does in fact sound good...it is totally
> artifact-free...no blips, pops, or anything like that.

What a wonderful advertisement for Secret Rabbit Code :-). It also
means that you've got things basically right.

> I am sure that I
> am doing something wrong. Do you have any idea what I might be missing?

Really not sure what's going on here. I suspect that there is something
not quite right with your calculation of the ratios. Interestingly,
your ratio of 0.31, divided by the length you got (654) times the length
you think it should be (485)

    0.31 * 485/654.0 = 0.229

which is very close to 0.25.

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
The main confusion about C++ is that its practitioners think
it is simultaneously a high and low level language when in
reality it is good at neither.

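
To make the "frames, not samples" point concrete, here is a small sketch of the unit conversion for interleaved audio. The helper names samples_to_frames and seconds_to_frames are hypothetical and are not part of libsamplerate.

```c
/* One frame holds one sample per channel, so an offset counted in
 * interleaved samples is `channels` times the same offset in frames. */
static long samples_to_frames (long sample_offset, int channels)
{   return sample_offset / channels ;
}

/* Convert a time in seconds to a frame offset, rounded to nearest.
 * The channel count does not enter here: a frame is one time step. */
static long seconds_to_frames (double seconds, int samplerate)
{   return (long) (seconds * samplerate + 0.5) ;
}
```

For a 48000 Hz stereo file, one second is 48000 frames but 96000 interleaved samples, so a table of indices built in samples would place every boundary at twice the intended frame position.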
Reply by jeffdod November 21, 2005
A few more questions if you have the time... :)

I modified the warping sample to use my own warping ratios. There are
quite a few different segments in my example...11631 to be exact, and
each segment is very small. And yes, they are intended to be
contiguous. My assumption from looking at the "warp" array is that the
indices represent offsets in units of samples, as opposed to frames (my
input file is 16-bit, 48000Hz stereo). I set up the warp array such
that the warp[i].ratio value was the floating point number that would
scale the segment from warp[i].index to warp[i+1].index to my desired
size (1.0/23.976). Most of these ratios averaged about 0.31. To check
my work, I looped over the warp array, applied all the ratios to the
duration of each segment, and came up with a total expected output size
of approximately 485 seconds (my source file is 1570 seconds). This is
the correct output size for my particular application, so all seemed
well.
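
The sanity check described above (apply each ratio to its segment and sum the results) can be sketched as follows. It assumes the convention stated in this post, namely that warp[i].ratio applies from warp[i].index up to warp[i+1].index, and that indices are counted in frames. WARP_POINT, expected_output_frames, and total_frames are illustrative names, not taken from the libsamplerate example.

```c
#include <stddef.h>

/* Hypothetical mirror of the warp entries discussed in this thread. */
typedef struct
{   long   index ;   /* first input frame of the segment */
    double ratio ;   /* output length / input length for the segment */
} WARP_POINT ;

/* Sum the expected output length, in frames, over all segments.
 * The last segment is assumed to run to total_frames (end of input).
 * Input before warp [0].index is skipped, so it contributes nothing. */
static double expected_output_frames (const WARP_POINT *warp, size_t count,
                                      long total_frames)
{   double total = 0.0 ;
    size_t k ;

    for (k = 0 ; k < count ; k ++)
    {   long end = (k + 1 < count) ? warp [k + 1].index : total_frames ;
        total += (double) (end - warp [k].index) * warp [k].ratio ;
    }
    return total ;
}
```

Dividing the result by the sample rate (48000 here) gives the expected duration in seconds, which can then be compared with the 485 second target.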

However, after running this through the algorithm, the actual output
file played back too slowly. Instead of being 485 seconds, it was 654
seconds.

To further complicate things, I actually want to throw away a chunk of
audio at the beginning of the file. In other words, my first
warp[0].index value is not zero. To accommodate this, I changed the seek
function that seeks the beginning of the input file to seek to
warp[0].index instead of just zero.

That being said, the output does in fact sound good...it is totally
artifact-free...no blips, pops, or anything like that. I am sure that I
am doing something wrong. Do you have any idea what I might be missing?

Sincerely,
Jeff D.

Reply by jeffdod November 20, 2005
Jon,

I have wondered whether the blips/pops were at the boundaries, but have
not investigated enough to determine this or not. I suspect that they
are not, but if they are I may well have to implement something like
you describe.

I think Erik is just trying to help me understand how to use
libsamplerate properly...which from looking at his sample source code,
it looks like I was not using it properly at first. I have not had time
yet to examine the code but will do so today.

Thanks for everyone's help.

Jeff

Reply by Jon Harris November 20, 2005
Are the blips and pops at the segment boundaries?  If so, that's to be expected 
unless you take special care to handle these cases.  I would test your algorithm 
on a large chunk (many seconds) of audio to make sure it is clean apart from 
segment boundaries.  If indeed the boundaries are the problem, you will need to 
use audio from the previous and/or next segments in calculating the samples near 
the boundary.  Maybe that is what Erik is working on for you?

-- 
Jon Harris
SPAM blocker in place:
Remove 99 (but leave 7) to reply

"jeffdod" <jeffdod@netzero.net> wrote in message 
news:1132348958.109763.97130@g43g2000cwa.googlegroups.com...
> Hello Erik,
>
> That is what I did, using both the simple and fuller API's. However,
> the data produced by doing this has strange artifacts in it--blips,
> pops, and repetition of tiny audio segments over the top of the
> "normal" sounding track. The data produced by the simple API was better
> than the full API. I discovered the reason for this was that the
> process calls for the full API would sometimes return fewer data
> samples than I had requested for the output buffer. This would leave
> gaps that sounded strange. The simple API would always return the
> correct number of output samples. For instance, my input buffer might
> have 9610 floats in it, and I would request 4004 floats as the output.
> However, it would frequently return 3893 floats or some other value
> instead.
>
> Can you think of any reason why this might be? Any help is *greatly*
> appreciated!
>
> Jeff D.

Reply by jeffdod November 19, 2005
Erik wrote:
>> It's highly likely that you were not using either correctly. <<

Well, you are probably right! I greatly appreciate you putting together
a demo program for me! I will check it out immediately.

Gratefully,
Jeff D.