comp.dsp | Fast (real-time) time stretch code

I am looking for code for slowing down music without altering pitch.

If anyone is interested, here are some programs that are designed for
doing this for music transcription. A couple examples here with free
trials:

Amazing Slow Downer:  http://www.ronimusic.com/ 
Transcribe!: http://www.seventhstring.com/
Slow Gold: http://www.worldwidewoodshed.com/products.htm

I'm looking to do something similar (for a non-commercial app) but
with different features.

I think that some software (Transcribe! above) does this via Fourier
transform. Is that the best way to achieve this without reentrant
glitches?

Reply by robert bristow-johnson ● May 29, 2008

first of all, usually for a "real-time" process we mean that the same
amount of time going in is what comes out.  time-scaling or resampled
(and pitch-shifted) sounds have a different number of samples going in
than coming out (with the same time assumed between samples).  if such
were running real-time, that means it could be started and running
real-time for an indefinitely long period of time.  in that case, you
would hit the end of a buffer on one side or the other.
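
To put a rough number on that buffer problem, here is a minimal Python sketch. It assumes input arrives at the sample rate while the time-scaler only consumes 1/stretch input samples per output sample; the function name and parameters are illustrative, not from any library.

```python
def seconds_until_buffer_limit(sample_rate, stretch, buffer_samples):
    """Time until a buffer of buffer_samples over- or under-runs.

    stretch is output duration / input duration (2.0 = half speed).
    Assumes input arrives and output plays at sample_rate (hypothetical setup).
    """
    # Input samples consumed per second of output playback.
    consumed_per_second = sample_rate / stretch
    # Net growth (positive) or shrinkage (negative) of the pending input.
    drift = sample_rate - consumed_per_second
    if drift == 0:
        return float("inf")  # no time scaling, so no drift
    return buffer_samples / abs(drift)


# Half speed at 44.1 kHz with a 10-second input buffer.
print(seconds_until_buffer_limit(44100, 2.0, 441000))  # prints 20.0
```

Any stretch factor other than 1 gives a finite answer, which is why a time-scaler fed from a live source cannot run indefinitely, while one fed from disk can simply read more slowly.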

On May 29, 5:01 am, R <R...@nospam.com> wrote:
> I am looking for code for slowing down music without altering pitch.

as you're slowing it down, are your samples coming from disk (or
whatever source) repeating over some regions?

>
> If anyone is interested, here are some programs that are designed for
> doing this for music transcription. A couple examples here with free
> trials:
>
> Amazing Slow Downer:  http://www.ronimusic.com/
> Transcribe!:http://www.seventhstring.com/
> Slow Gold:http://www.worldwidewoodshed.com/products.htm
>
> I'm looking to do something similar (for a non-commercial app) but
> with different features.
>
> I think that some softer (Transcribe! above) does this via Fourier
> transform. Is that the best way to achieve this without reentrant
> glitches?

it's about the only way to do it for broadbanded, full-mix audio.  if
the source was a monophonic instrument that plays single notes at a
time, then time-scaling can be accomplished by staying in the time-
domain and making judicious use of splicing.  but if your audio has
all sorts of harmonically unrelated frequency components, doing this
in the time domain might result in a splice where not all harmonic
components are spliced in phase.  when some splice has some frequency
component that is 180 degrees out of phase when spliced, that's gonna
sound pretty bad.
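
The out-of-phase splice can be illustrated with a small Python sketch. It assumes a linear crossfade between a sinusoid and a time-shifted copy of itself; `splice_rms` and its parameters are made up for this example. When the jump is half a period, the two copies are 180 degrees apart and cancel completely at the middle of the crossfade.

```python
import math


def splice_rms(freq_cycles_per_sample, offset_samples, fade_len=1000):
    """RMS of a linear crossfade between a sinusoid and a time-shifted copy."""
    total = 0.0
    for n in range(fade_len):
        w = 1.0 - n / (fade_len - 1)  # fade-out gain for the first segment
        a = math.sin(2 * math.pi * freq_cycles_per_sample * n)
        b = math.sin(2 * math.pi * freq_cycles_per_sample * (n + offset_samples))
        y = w * a + (1.0 - w) * b  # crossfaded output sample
        total += y * y
    return math.sqrt(total / fade_len)


period = 100  # samples per cycle
in_phase = splice_rms(1 / period, period)  # jump by one full period
out_of_phase = splice_rms(1 / period, period / 2)  # jump by half a period
print(in_phase, out_of_phase)
```

With a full mix, a single splice length cannot be a whole number of periods for every component, so some components always land closer to the second case.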

r b-j

Reply by R ● May 30, 2008

On Thu, 29 May 2008 14:35:49 -0700 (PDT), robert bristow-johnson
<rbj@audioimagination.com> wrote:

>
>first of all, usually for a "real-time" process we mean that the same
>amount of time going in is what comes out. 

I meant response time. I could probably pre-crunch an entire MP3 file,
but I was hoping to process designated segments on the fly.

>On May 29, 5:01 am, R <R...@nospam.com> wrote:
>> I am looking for code for slowing down music without altering pitch.
>
>as you're slowing it down, are your samples coming from disk (or
>whatever source) repeating over some regions?

Yes, samples coming from disk. Yes, an option to keep replaying the
segment, if that's what you meant.  Usually it would entail slow down
rather than speed up.

>> If anyone is interested, here are some programs that are designed for
>> doing this for music transcription. A couple examples here with free
>> trials:
>>
>> Amazing Slow Downer:  http://www.ronimusic.com/
>> Transcribe!:http://www.seventhstring.com/
>> Slow Gold:http://www.worldwidewoodshed.com/products.htm
>>
>> I'm looking to do something similar (for a non-commercial app) but
>> with different features.
>>
>> I think that some softer (Transcribe! above) does this via Fourier
>> transform. Is that the best way to achieve this without reentrant
>> glitches?
>
>it's about the only way to do it for broadbanded, full-mix audio.  if
>the source was a monophonic instrument that plays single notes at a
>time, then time-scaling can be accomplished by staying in the time-
>domain and making judicious use of splicing.

Yeah, I understand the splice problem. I've seen some older hardware
"Harmonizers" that try to make intelligent decisions on splice points,
but that's tough with more than one or two simultaneous pitches. And
as you point out, overtones won't necessarily align even with
monophonic sources. But even the old hardware was not that bad
sometimes. Guitarists and keyboard players used those for playing
root+5th or even full chords at times. You've probably heard the
result on recordings.

Still, if it can be done via FFT without having to crunch overnight,
that would be preferable. The sound doesn't have to be hifi, but best
that it's free of distracting pulsing or harsh artifacts.

So...any code available for doing this? Pref in C/C++ so I could get
it running on a Windows machine.

Reply by robert bristow-johnson ● May 30, 2008

On May 29, 11:48 pm, R <R...@nospam.com> wrote:
> On Thu, 29 May 2008 14:35:49 -0700 (PDT), robert bristow-johnson
>
> <r...@audioimagination.com> wrote:
>
> >first of all, usually for a "real-time" process we mean that the same
> >amount of time going in is what comes out.
>
> I meant response time.

i think you meant the computation is efficient enough that if you can
play back the audio file either faster or slower from the disk, and
the algorithm output doesn't fall behind more than a known and bounded
delay.

>
> >as you're slowing it down, are your samples coming from disk (or
> >whatever source) repeating over some regions?
>
> Yes, samples coming from disk. Yes, an option to keep replaying the
> segment, if that's what you meant.

yeah, if you're slowing it down, you would have to repeat segments in
some manner.  and if you were speeding it up, you would be omitting
some segments.  (discounting any cross-fading in the splices.)
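
A minimal sketch of that repeat-or-omit idea in Python, as plain windowed overlap-add. The function name, frame size and hop are assumptions for illustration. It makes no attempt to line up phase at the splices, so it shows the mechanism only and will have the artifacts discussed above on full-mix audio.

```python
import math


def ola_stretch(x, stretch, frame=1024, hop_out=256):
    """Naive overlap-add time stretch (stretch > 1 slows down).

    Frames are read every hop_out / stretch samples and written every
    hop_out samples, so input regions are repeated when slowing down and
    skipped when speeding up. Requires len(x) >= frame.
    """
    hop_in = hop_out / stretch
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame) for n in range(frame)]
    n_frames = int((len(x) - frame) / hop_in) + 1
    out = [0.0] * ((n_frames - 1) * hop_out + frame)
    norm = [0.0] * len(out)
    for k in range(n_frames):
        start_in = int(round(k * hop_in))
        start_out = k * hop_out
        for n in range(frame):
            out[start_out + n] += window[n] * x[start_in + n]
            norm[start_out + n] += window[n]
    # Divide by the summed window so the overall gain stays at 1.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]


tone = [math.sin(2 * math.pi * n / 64) for n in range(8192)]
slow = ola_stretch(tone, 2.0)
print(len(tone), len(slow))
```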

>
> >> I think that some softer (Transcribe! above) does this via Fourier
> >> transform. Is that the best way to achieve this without reentrant
> >> glitches?
>
> >it's about the only way to do it for broadbanded, full-mix audio.  if
> >the source was a monophonic instrument that plays single notes at a
> >time, then time-scaling can be accomplished by staying in the time-
> >domain and making judicious use of splicing.
>
> Yeah, I understand the splice problem. I've seen some older hardware
> "Harmonizers" that try to make intelligent decisions on splice points,
> but that's tough with more than one or two simultaneous pitches.

it depends on the relationship between pitches.  playing a heavy power
chord (fifth and major third) should not sound so bad.

> And
> as you point out, overtones won't necessarily align even with
> monophonic sources.

did i say that??  (i have to check.)  for *harmonic* monophonic
sources, you should nearly always be able to find a splice length that
makes all of the harmonic overtones happy.  if they slightly detune at
the weaker very high harmonics, those splices won't be particularly
noticeable.
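
One common way to look for such a splice length is to estimate the period from the autocorrelation. The Python sketch below is only an illustration under that assumption (it is not the method of any product mentioned in this thread); `estimate_period` and the lag bounds are hypothetical names and values.

```python
import math


def estimate_period(x, min_lag, max_lag):
    """Return the lag (in samples) of the first strong autocorrelation peak."""
    n = len(x) - max_lag  # same number of products for every lag
    energy = sum(v * v for v in x[:n])
    scores = []
    for lag in range(min_lag, max_lag + 1):
        c = sum(x[i] * x[i + lag] for i in range(n))
        scores.append(c / energy if energy > 0 else 0.0)
    best = max(scores)
    # Prefer the earliest local peak close to the best score, so that a
    # multiple of the true period (2x, 3x, ...) is not chosen instead.
    for i in range(1, len(scores) - 1):
        if (scores[i] >= 0.9 * best
                and scores[i] >= scores[i - 1]
                and scores[i] >= scores[i + 1]):
            return min_lag + i
    return min_lag + scores.index(best)


# Harmonic test tone: fundamental period 100 samples plus two overtones.
note = [math.sin(2 * math.pi * n / 100)
        + 0.5 * math.sin(2 * math.pi * 2 * n / 100)
        + 0.3 * math.sin(2 * math.pi * 3 * n / 100)
        for n in range(2000)]
print(estimate_period(note, 40, 400))
```

Splicing by a whole number of these periods keeps every harmonic of the note in phase across the splice, which is the "makes all of the harmonic overtones happy" case.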

> But even the old hardware was not that bad
> sometimes. Guitarists and keyboard players used those for playing
> root+5th or even full chords at times. You've probably heard the
> result on recordings.

yeah.  actually i was in on the pitch-shifting algs on one of the
Eventide Harmonizer models.  and my point above is even more true
(that some polyphonic input to a time-domain pitch shifter can come
out very good, depending on what the notes are) for chords that are
just fifths and no third.  those pitch-shift fine.  easy.  (think of
the tonic and its fifth as being the 2nd and 3rd harmonic of a common
fundamental that doesn't necessarily have any energy at the
fundamental.  then it's a periodic function.  sorta.)
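
The arithmetic behind that can be sketched in Python. This assumes just-intonation ratios (3/2 for the fifth, 5/4 for the major third); an equal-tempered fifth is 2^(7/12), about 1.4983, so the result is only approximately periodic, hence the "sorta". The function name and the 220 Hz tonic are illustrative.

```python
from fractions import Fraction
from math import lcm


def common_fundamental(tonic_hz, ratios):
    """Find f0 such that every note (tonic_hz * ratio) is a harmonic of f0.

    Returns (f0, harmonic_numbers). Ratios must be exact fractions.
    """
    ratios = [Fraction(r) for r in ratios]
    q = lcm(*(r.denominator for r in ratios))  # common denominator
    f0 = Fraction(tonic_hz) / q
    harmonics = [int(r * q) for r in ratios]
    return f0, harmonics


# Tonic plus fifth: harmonics 2 and 3 of a 110 Hz fundamental.
print(common_fundamental(220, [1, Fraction(3, 2)]))
# Tonic, major third and fifth: harmonics 4, 5 and 6 of 55 Hz.
print(common_fundamental(220, [1, Fraction(5, 4), Fraction(3, 2)]))
```

The lower the common fundamental, the longer the shared period, so a splice length that suits every note gets harder to find as intervals are added.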

> Still, if it can be done via FFT without having to crunch overnight,
> that would be preferable. The sound doesn't have to be hifi, but best
> that it's free of distracting pulsing or harsh artifacts.

there are a bunch of products.  SoundToys (or Wave Mechanics) SPEED,
Serato Pitch 'n Time.

> So...any code available for doing this? Pref in C/C++ so I could get
> it running on a Windows machine.

it's not too hard to write a simple phase vocoder.  i'm not gonna send
you any code nor tell you any tricks that make it sound better than
what you might get from a published alg (like Laroche or Puckette).
that's not too hard.  do you have your file and sound I/O worked out?
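
For orientation, a bare-bones textbook phase vocoder can be sketched in Python as below. It is an illustration only: it is not the Laroche or Puckette refinements and not anyone's product code, it has none of the tricks that make it sound better, and the names, the 256-point frame and the 64-sample hop are assumptions. A real implementation would use a proper FFT library instead of the small recursive FFT here.

```python
import cmath
import math


def fft(a, inverse=False):
    """Recursive radix-2 FFT; len(a) must be a power of two. Inverse is unscaled."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def phase_vocoder(x, stretch, n=256, hop_out=64):
    """Time-stretch x by `stretch` (2.0 = half speed). Requires len(x) >= n."""
    hop_in = max(1, round(hop_out / stretch))
    win = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    frames = (len(x) - n) // hop_in + 1
    out = [0.0] * ((frames - 1) * hop_out + n)
    norm = [0.0] * len(out)
    prev_phase = [0.0] * n
    synth_phase = [0.0] * n
    for k in range(frames):
        seg = x[k * hop_in:k * hop_in + n]
        spec = fft([w * s for w, s in zip(win, seg)])
        for j in range(n):
            phase = cmath.phase(spec[j])
            if k == 0:
                synth_phase[j] = phase
            else:
                omega = 2 * math.pi * j / n  # bin centre frequency
                dev = phase - prev_phase[j] - omega * hop_in
                dev -= 2 * math.pi * round(dev / (2 * math.pi))  # wrap to [-pi, pi]
                # Advance by the estimated true frequency over the output hop.
                synth_phase[j] += (omega + dev / hop_in) * hop_out
            prev_phase[j] = phase
            spec[j] = abs(spec[j]) * cmath.exp(1j * synth_phase[j])
        y = fft(spec, inverse=True)
        for i in range(n):
            out[k * hop_out + i] += win[i] * y[i].real / n
            norm[k * hop_out + i] += win[i] * win[i]
    return [o / w if w > 1e-6 else 0.0 for o, w in zip(out, norm)]


sine = [math.sin(2 * math.pi * i / 16 + 0.3) for i in range(4096)]
stretched = phase_vocoder(sine, 2.0)
print(len(sine), len(stretched))
```

The output is roughly twice as long while the tone keeps its 16-sample period. On real music this basic version tends to smear transients and sound "phasey", which is what the published refinements address.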

r b-j

Reply by R ● May 30, 2008

On Thu, 29 May 2008 22:02:29 -0700 (PDT), robert bristow-johnson
<rbj@audioimagination.com> wrote:

>On May 29, 11:48 pm, R <R...@nospam.com> wrote:
>> On Thu, 29 May 2008 14:35:49 -0700 (PDT), robert bristow-johnson
>>
>> <r...@audioimagination.com> wrote:
>>
>> >first of all, usually for a "real-time" process we mean that the same
>> >amount of time going in is what comes out.
>>
>> I meant response time.
>
>i think you meant the computation is efficient enough that if you can
>play back the audio file either faster or slower from the disk, and
>the algorithm output doesn't fall behind more than a known and bounded
>delay.

Ahem...

>> >as you're slowing it down, are your samples coming from disk (or
>> >whatever source) repeating over some regions?
>>
>> Yes, samples coming from disk. Yes, an option to keep replaying the
>> segment, if that's what you meant.
>
>yeah, if you're slowing it down, you would have to repeat segments in
>some manner.  and if you were speeding it up, you would be omitting
>some segments.  (discounting any cross-fading in the splices.)

OK--that's obvious. I thought maybe you were suggesting that caching a
preprocessed version of the audio file would be more efficient if it
were to be played multiple times. Which is a good thought, because it
probably would be looped.

>> And
>> as you point out, overtones won't necessarily align even with
>> monophonic sources.
>
>did i say that??  (i have to check.)

Oh, maybe you didn't. I had tried to find info via Google archives of
this group. Maybe that was from one of those posts.

>> But even the old hardware was not that bad
>> sometimes. Guitarists and keyboard players used those for playing
>> root+5th or even full chords at times. You've probably heard the
>> result on recordings.
>
>yeah.  actually i was in on the pitch-shifting algs on one of the
>Eventide Harmonizer models.

No kidding. That's what I had in mind when referring to splice algs,
etc. You probably worked on the newer versions, so you would have had
access to some serious DSP power. The older ones used bit-slice
processors. Some had options for a primitive secondary channel that
assisted in finding splice points. 

>> Still, if it can be done via FFT without having to crunch overnight,
>> that would be preferable. The sound doesn't have to be hifi, but best
>> that it's free of distracting pulsing or harsh artifacts.
>
>there a bunch of products.  SoundToys (or Wave Mechanics) SPEED,
>Serato Pitch 'n Time.

I wasn't looking for a pre-written program, but I've been wondering
whether I'd be further ahead learning how to host a VST plugin.

>> So...any code available for doing this? Pref in C/C++ so I could get
>> it running on a Windows machine.
>
>it's not too hard to write a simple phase vocoder.  i'm not gonna send
>you any code nor tell you any tricks that make it sound better than
>what you might get from a published alg (like Laroche or Puckette).
>that's not too hard.  do you have your file and sound I/O worked out?

No problem writing file or sound IO. Or the UI for that matter. I've
done enough of that. I was just looking for a start on the time
stretch code. The idea is mostly for a quick rehearsal/transcription
tool, so I didn't want to get too deep into piles of DSP books. I'll
look for the Laroche and Puckette algorithms (thanks for the lead).
Maybe if I'm lucky, someone has posted some working C or C++ code.

Reply by robert bristow-johnson ● May 30, 2008

On May 30, 2:42 am, R <R...@nospam.com> wrote:
> On Thu, 29 May 2008 22:02:29 -0700 (PDT), robert bristow-johnson
>
> <r...@audioimagination.com> wrote:
> >On May 29, 11:48 pm, R <R...@nospam.com> wrote:
> >> On Thu, 29 May 2008 14:35:49 -0700 (PDT), robert bristow-johnson
>
> >> <r...@audioimagination.com> wrote:
>
> >> >first of all, usually for a "real-time" process we mean that the same
> >> >amount of time going in is what comes out.
>
> >> I meant response time.
>
> >i think you meant the computation is efficient enough that if you can
> >play back the audio file either faster or slower from the disk, and
> >the algorithm output doesn't fall behind more than a known and bounded
> >delay.
>
> Ahem...

not sure what you mean here.

> >> >as you're slowing it down, are your samples coming from disk (or
> >> >whatever source) repeating over some regions?
>
> >> Yes, samples coming from disk. Yes, an option to keep replaying the
> >> segment, if that's what you meant.
>
> >yeah, if you're slowing it down, you would have to repeat segments in
> >some manner.  and if you were speeding it up, you would be omitting
> >some segments.  (discounting any cross-fading in the splices.)
>
> OK--that's obvious. I thought maybe you were suggesting that caching a
> preprocessed version of the audio file would be more efficient if it
> were to be played multiple times.

no, i meant that whatever the process is (at least in audio DSP), it
can keep processing input to output without falling farther and farther
behind.  that's all i mean
when i think of "real-time".  for other disciplines, there are
additional requirements, but not audio DSP.  oddly, even though i have
zero contribution to the comp.dsp FAQ, for some odd luck i got to
contribute to the comp.realtime FAQ to the point of contributing to
the definition.

http://www.faqs.org/faqs/realtime-computing/faq/

"In a real-time DSP process, the analyzed (input) and/or generated (output)
samples (whether they are grouped together in large segments or processed
individually) can be processed (or generated) continuously in the time it
takes to input and/or output the same set of samples independent of the
processing delay.

"Consider an audio DSP example: if a process requires 2.01 seconds to analyze
or process 2.00 seconds of sound, it is not real-time. If it takes 1.99
seconds, it is (or can be made into) a real-time DSP process.

"A common life example I like to make is standing in a line (or queue)
waiting for the checkout in a grocery store. If the line asymptotically grows
longer and longer without bound, the checkout process is not real-time. If
the length of the line is bounded, customers are being 'processed' and
outputted as rapidly, on average, as they are being inputted and that
process *is* real-time. The grocer might go out of business or must at least
lose business if he/she cannot make his/her checkout process real-time (so
it's fundamentally important that this process be real-time)."
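
The 2.01 versus 1.99 second example can be checked with a tiny queue simulation in Python. This is a sketch under simple assumptions (fixed-size blocks, one block processed at a time, constant processing time); the function name is made up for illustration.

```python
def last_block_latency(block_s, proc_s, n_blocks):
    """Delay between the last block arriving and its processing finishing."""
    finish = 0.0
    for k in range(n_blocks):
        arrival = (k + 1) * block_s  # block k is complete at this time
        start = max(arrival, finish)  # wait for the previous block if needed
        finish = start + proc_s
    return finish - n_blocks * block_s


# 1.99 s per 2.00 s block: latency stays at 1.99 s however long it runs.
print(last_block_latency(2.0, 1.99, 10), last_block_latency(2.0, 1.99, 1000))
# 2.01 s per 2.00 s block: latency grows by 0.01 s with every block.
print(last_block_latency(2.0, 2.01, 10), last_block_latency(2.0, 2.01, 1000))
```

The first case has a bounded delay, like the bounded checkout line; the second falls farther behind with every block.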

> >> But even the old hardware was not that bad
> >> sometimes. Guitarists and keyboard players used those for playing
> >> root+5th or even full chords at times. You've probably heard the
> >> result on recordings.
>
> >yeah.  actually i was in on the pitch-shifting algs on one of the
> >Eventide Harmonizer models.
>
> No kidding. That's what I had in mind when referring to splice algs,
> etc. You probably worked on the newer versions, so you would have had
> access to some serious DSP power. The older ones used bit-slice
> processors. Some had options for a primitive secondary channel that
> assisted in finding splice points.

i worked on the DSP4000 which had some later spins.  they told me that
some of my algs survived to the later modes.  i did not work on the
classic H3000 nor the SP2016 or similar.  i thought that the bit-slice
(AMD2900 series) was just the SP2016.  the H3000 was that old crappy
16-bit TI DSP (3 of 'em).

> >> Still, if it can be done via FFT without having to crunch overnight,
> >> that would be preferable. The sound doesn't have to be hifi, but best
> >> that it's free of distracting pulsing or harsh artifacts.
>
> >there a bunch of products.  SoundToys (or Wave Mechanics) SPEED,
> >Serato Pitch 'n Time.
>
> I wasn't looking for a pre-written program, but I've been wondering
> whether I'd be further ahead learning how to host a VST plugin.
>

maybe.  i've never done VST, but i think that i shoulda learnt how to.

> >> So...any code available for doing this? Pref in C/C++ so I could get
> >> it running on a Windows machine.
>
> >it's not too hard to write a simple phase vocoder.  i'm not gonna send
> >you any code nor tell you any tricks that make it sound better than
> >what you might get from a published alg (like Laroche or Puckette).
> >that's not too hard.  do you have your file and sound I/O worked out?
>
> No problem writing file or sound IO. Or the UI for that matter. I've
> done enough of that. I was just looking for a start on the time
> stretch code. The idea is mostly for a quick rehearsal/transcription
> tool, so I didn't want to get too deep into piles of DSP books. I'll
> look for the Laroche and Puckette algorithms (thanks for the lead).
> Maybe if I'm lucky, someone has posted some working C or C++ code.

check out the music-dsp archive.  maybe there.

R, send me a decent email address.  i'll see if i can find some old
program that might run on Matlab or Octave for you.  it's from a paper
i did in 2001 so it's just proof of concept, slow, and not ready for
prime-time in any product.  if you can translate that to C (maybe get
a decent FFT routine in C, perhaps FFTW), you'll have something to
start with.

r b-j
