comp.dsp | Different Resampling Rates for Segments of a Sound File | page 2

Reply by jeffdod ● November 20, 2005

Jon,

I have wondered whether the blips/pops were at the boundaries, but have
not investigated enough to determine whether they are. I suspect that
they are not, but if they are I may well have to implement something
like you describe.

I think Erik is just trying to help me understand how to use
libsamplerate properly. From a first look at his sample source code,
it seems I was not using it properly at first. I have not had time
yet to examine the code in detail but will do so today.

Thanks for everyone's help.

Jeff

Reply by jeffdod ● November 21, 2005

A few more questions if you have the time... :)

I modified the warping sample to use my own warping ratios. There are
quite a few different segments in my example...11631 to be exact, and
each segment is very small. And yes, they are intended to be
contiguous. My assumption from looking at the "warp" array is that the
indices represent offsets in units of samples, as opposed to frames (my
input file is 16-bit, 48000Hz stereo). I set up the warp array such
that the warp[i].ratio value was the floating point number that would
scale the segment from warp[i].index to warp[i+1].index to my desired
size (1.0/23.976). Most of these ratios averaged about 0.31. To check
my work, I looped over the warp array, applied all the ratios to the
duration of each segment, and came up with a total expected output size
of approximately 485 seconds (my source file is 1570 seconds). This is
the correct output size for my particular application, so all seemed
well.
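
The duration check described above can be sketched roughly as follows. This is only an illustration, not the code from the example program: the WARP_POINT type, the function name, and the total_frames parameter are hypothetical, and the indices are treated as frames (as clarified later in the thread).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the "warp" array. */
typedef struct
{   long   index ;   /* segment start, in frames */
    double ratio ;   /* output length / input length for this segment */
} WARP_POINT ;

/* Sum the scaled length of every segment and return it in seconds.
** Entry k covers frames [warp [k].index, warp [k + 1].index) ; the last
** entry is assumed to run to total_frames. */
double
expected_output_seconds (const WARP_POINT *warp, size_t count,
                         long total_frames, double samplerate)
{   double out_frames = 0.0 ;
    size_t k ;

    assert (samplerate > 0.0) ;

    for (k = 0 ; k < count ; k++)
    {   long end = (k + 1 < count) ? warp [k + 1].index : total_frames ;

        out_frames += (double) (end - warp [k].index) * warp [k].ratio ;
    }

    return out_frames / samplerate ;
}
```

If the total comes out right but the converted file does not, the table itself is probably fine and the problem lies in how the ratios are applied.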

However, after running this through the algorithm, the actual output
file played back too slowly. Instead of being 485 seconds, it was 654
seconds.

To further complicate things, I actually want to throw away a chunk of
audio at the beginning of the file. In other words, my first
warp[0].index value is not zero. To accommodate this, I changed the seek
function that seeks the beginning of the input file to seek to
warp[0].index instead of just zero.

That being said, the output does in fact sound good...it is totally
artifact-free...no blips, pops, or anything like that. I am sure that I
am doing something wrong. Do you have any idea what I might be missing?

Sincerely,
Jeff D.

Reply by Erik de Castro Lopo ● November 21, 2005

jeffdod wrote:
> 
> A few more questions if you have the time... :)
> 
> I modified the warping sample to use my own warping ratios. There are
> quite a few different segments in my example...11631 to be exact, and
> each segment is very small. And yes, they are intended to be
> contiguous. My assumption from looking at the "warp" array is that the
> indices represent offsets in units of samples, as opposed to frames (my
> input file is 16-bit, 48000Hz stereo).

No, it is frames, not samples.
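
To make the distinction concrete: a frame holds one sample per channel, so for a stereo file one second at 48000 Hz is 48000 frames but 96000 samples. A minimal sketch of the conversion, assuming interleaved audio (the function names are hypothetical and not part of any library):

```c
#include <assert.h>

/* A frame is one sample for each channel. For interleaved audio, a
** sample offset is therefore the frame offset times the channel count. */
long
samples_to_frames (long sample_offset, int channels)
{   assert (channels > 0) ;
    return sample_offset / channels ;
}

long
frames_to_samples (long frame_offset, int channels)
{   assert (channels > 0) ;
    return frame_offset * channels ;
}
```

Indices computed in samples for a stereo file would need to be halved before going into the warp array.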

> I set up the warp array such
> that the warp[i].ratio value was the floating point number that would
> scale the segment from warp[i].index to warp[i+1].index to my desired
> size (1.0/23.976). Most of these ratios averaged about 0.31.

I seem to remember that in your original email, you wrote the sample 
rate was about 4 times higher than you wanted. If that was the case
I would expect the ratio to be about 0.25.

> To further complicate things, I actually want to throw away a chunk of
> audio at the beginning of the file. In other words, my first
> warp[0].index value is not zero. To accommodate this, I changed the seek
> function that seeks the beginning of the input file to seek to
> warp[0].index instead of just zero.

That should be OK.

> That being said, the output does in fact sound good...it is totally
> artifact-free...no blips, pops, or anything like that.

What a wonderful advertisement for Secret Rabbit Code :-).

It also means that you've got things basically right.

> I am sure that I
> am doing something wrong. Do you have any idea what I might be missing?

Really not sure what's going on here. I suspect that there is something
not quite right with your calculation of the ratios. Interestingly,
your ratio of 0.31, divided by the length you got (654) and multiplied
by the length you think it should be (485), gives

     0.31 * 485/654.0 = 0.230

which is very close to 0.25.


Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
The main confusion about C++ is that its practitioners think
it is simultaneously a  high and low level language when in
reality it is good at neither.

Reply by jeffdod ● November 21, 2005

> No, it is frames, not samples.

Ah! Okay, I will take that into account.

> I seem to remember that in your original email, you wrote the sample
> rate was about 4 times higher than you wanted. If that was the case
> I would expect the ratio to be about 0.25.

Sorry if I created confusion there...0.25 was a guess on my part. The
original file is 1570 seconds long and the resulting file should be
485, so the real average ratio is closer to 0.31.

> > To further complicate things, I actually want to throw away a chunk of
> > audio at the beginning of the file. In other words, my first
> > warp[0].index value is not zero. To accommodate this, I changed the seek
> > function that seeks the beginning of the input file to seek to
> > warp[0].index instead of just zero.
>
> That should be OK.
>

I also commented out the code that sets the ratio to 1.0 if
warp[0].index is > zero. I simply go ahead and use warp[0].ratio from
the beginning and increment the index into the warp array.

> What a wonderful advertisement for Secret Rabbit Code :-).
>

It is an impressive piece of work.

> Really not sure what's going on here. I suspect that there is something
> not quite right with your calculation of the ratios. Interestingly,
> your ratio of 0.31, divided by the length you got (654) and multiplied
> by the length you think it should be (485), gives
>
>      0.31 * 485/654.0 = 0.230
>
> which is very close to 0.25.
>

I am sure of my average ratio, but my actual ratios fluctuate up and
down around this average (or it wouldn't be an average!). There must be
something wrong. If I hard-code all the ratios to 1.0, I get a file
that is the same duration as the original. If I hard-code all the
ratios to 0.10 (which is smaller than any actual ratio that I use) I
get a file that is exactly one-tenth of the original. So I know that my
ratios are within the bounds of what libsamplerate accepts. I will keep
looking at it to see where I went wrong.

Thanks.

Jeff Dodson

Reply by jeffdod ● November 21, 2005

Erik,

I have figured out what the problem is, although I won't know how to
fix it until I gain a better understanding of the code. It appears that
the problem is in the "timewarp_convert" routine near the top of the
while(1) loop where it figures out if it is time to advance to the next
element in the warp array:

    if (warp_index < ARRAY_LEN (warp) - 1 && input_count >= warp [warp_index].index)
    {
        src_data.src_ratio = warp [warp_index].ratio ;
        warp_index ++ ;
    } ;

It looks like input_count is never >= warp[warp_index].index, which
causes the routine to use the first ratio throughout the entire file,
instead of applying a different one from segment to segment.

Jeff Dodson

Reply by jeffdod ● November 21, 2005

Oops. My mistake. Strike that last comment. The input_count does exceed
the warp[warp_index].index value. The problem is that I ruined your
ARRAY_LEN macro by changing the warp array to a dynamically-allocated
array instead of a static array. I corrected this by replacing the
ARRAY_LEN call with the integer size that I have allocated the array
for, and it advances properly through the different scaling factors for
the segments. The file is now the correct duration (485 seconds) but
the strange audio artifacts that I saw with my method are back!
Basically what it sounds like is that each segment plays properly until
near the end, and then there is a speed up, and then it slows back down
to the proper speed. Very strange.
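
The ARRAY_LEN problem described above comes from the fact that a sizeof-based length macro only works on a true array; applied to a pointer it measures the pointer itself, not the allocation behind it. One way around this, sketched here with hypothetical names (WARP_TABLE, warp_table_alloc and should_advance are not from the example program), is to carry the element count alongside the pointer:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one entry of the "warp" array. */
typedef struct
{   long   index ;   /* segment start, in frames */
    double ratio ;
} WARP_POINT ;

/* A dynamically allocated table carries its own length, because
** sizeof on a pointer says nothing about how many elements it has. */
typedef struct
{   WARP_POINT *points ;
    int        count ;
} WARP_TABLE ;

WARP_TABLE
warp_table_alloc (int count)
{   WARP_TABLE table ;

    assert (count > 0) ;

    table.points = calloc ((size_t) count, sizeof (WARP_POINT)) ;
    table.count = (table.points != NULL) ? count : 0 ;

    return table ;
}

/* Same test as the one quoted from the conversion loop, with the
** explicit count standing in for ARRAY_LEN (warp). */
int
should_advance (const WARP_TABLE *table, int warp_index, long input_count)
{   return warp_index < table->count - 1
            && input_count >= table->points [warp_index].index ;
}
```

Replacing the macro with a hard-coded integer, as described above, has the same effect; keeping the count next to the pointer just avoids the two drifting apart.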

Reply by jeffdod ● November 21, 2005

Problem solved. I upped the BUFFER_LEN to be larger than my largest
segment size (I set it to 32768) and the strange audio artifacts
disappeared. Thank you so much for your help!
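
The thread does not say why the buffer size matters, so the following is only a sketch of how one might find the largest segment in order to choose a BUFFER_LEN that satisfies the condition described above. The WARP_POINT type, the function name, and the total_frames parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the "warp" array. */
typedef struct
{   long   index ;   /* segment start, in frames */
    double ratio ;
} WARP_POINT ;

/* Return the length in frames of the longest segment. Entry k covers
** frames [warp [k].index, warp [k + 1].index) ; the last entry is
** assumed to run to total_frames. */
long
max_segment_frames (const WARP_POINT *warp, size_t count, long total_frames)
{   long   longest = 0 ;
    size_t k ;

    assert (count > 0) ;

    for (k = 0 ; k < count ; k++)
    {   long end = (k + 1 < count) ? warp [k + 1].index : total_frames ;
        long len = end - warp [k].index ;

        if (len > longest)
            longest = len ;
    }

    return longest ;
}
```

A buffer length chosen this way would need to be strictly larger than the returned value to match the fix reported above.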

Jeff D.

Reply by Erik de Castro Lopo ● November 21, 2005

jeffdod wrote:
> 
> Problem solved. I upped the BUFFER_LEN to be larger than my largest
> segment size (I set it to 32768) and the strange audio artifacts
> disappeared. Thank you so much for your help!

Glad to hear that.

Cheers,
Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
Saying Python is easier than C++ is like saying that turning a 
light  switch on or off is easier than operating a nuclear reactor.
