DSPRelated.com
Forums

speed, pitch and tempo

Started by gaet...@yahoo.it September 12, 2006
Hello,
I'm looking for a tutorial or any web reference to learn more about
changing the speed, pitch or tempo of an audio signal.
Can anyone point me in the right direction?
Thanks,
gaetano

gaetanoortisi@yahoo.it wrote:
> Hello,
> I'm looking for a tutorial or any web reference to know more about
> changing
> speed, pitch or tempo of an audio signal.
> Can anyone point me in the right direction?

Start with http://www.dspdimension.com/

Jerry
--
Engineering is the art of making what you want from things you can get.
gaetanoortisi@yahoo.it wrote:
> Hello,
> I'm looking for a tutorial or any web reference to know more about
> changing
> speed, pitch or tempo of an audio signal.
> Can anyone point me in the right direction?

"Time-domain harmonic scaling" (TDHS) is a commonly used technique for this sort of transformation. A Google search ought to dig up some material.

C
Chetan Vinchhi wrote:

> "Time-domain harmonic scaling" is a commonly-used technique to
> achieve this sort of transformations. A google search ought to dig
> out some material.

So no interpolation technique is needed to change speed?

Thanks,
Gaetano
gaetanoortisi@yahoo.it wrote:
> Chetan Vinchhi wrote:
>
> > "Time-domain harmonic scaling" is a commonly-used technique to
> > achieve this sort of transformations. A google search ought to dig
> > out some material.
>
> No interpolation technique is needed to change speed?

Yes, you do need that. You can do 3 types of transformation:

1. Change tempo and pitch together: This is achieved by resampling the
   original signal appropriately.

2. Change tempo without changing pitch: You need an algorithm like
   time-domain harmonic scaling for this. The basic idea is that you
   determine whole pitch periods in a processing window and repeat pitch
   periods to lower the tempo or delete pitch periods to increase the
   tempo. You would of course have to take care of any discontinuities
   or other artifacts at the boundaries.

3. Change pitch without changing tempo: Conceptually, this can be
   achieved by first changing the *tempo* and then resampling. In
   practice, you may be able to achieve this in a single step as a
   combination of harmonic scaling and resampling.

Does this cover all the cases you had in mind?

C
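A rough sketch of the three cases may help make them concrete. It assumes NumPy, a pure 100 Hz test tone, and a pitch period that is already known and constant (a real TDHS implementation has to estimate the period and smooth the splice points). The function names `resample_linear`, `repeat_periods` and `dominant_freq` are made up for this illustration.

```python
import numpy as np

def resample_linear(x, speed):
    """Case 1: read x `speed` times faster, so tempo and pitch scale together."""
    n_out = int(len(x) / speed)
    read_pos = np.arange(n_out) * speed        # fractional read positions
    return np.interp(read_pos, np.arange(len(x)), x)

def repeat_periods(x, period):
    """Case 2 (half tempo): play every whole pitch period twice.

    Assumes the pitch period (in samples) is already known and constant.
    """
    n_periods = len(x) // period
    periods = x[:n_periods * period].reshape(n_periods, period)
    return np.repeat(periods, 2, axis=0).ravel()

def dominant_freq(x, sr):
    """Frequency (Hz) of the largest FFT bin."""
    return np.argmax(np.abs(np.fft.rfft(x))) * sr / len(x)

sr = 8000
n = np.arange(sr)                               # one second of audio
x = np.sin(2 * np.pi * 100 * n / sr)            # 100 Hz tone, period = 80 samples

faster = resample_linear(x, 2.0)                # case 1: tempo and pitch change
slower = repeat_periods(x, 80)                  # case 2: tempo only
shifted = resample_linear(slower, 2.0)          # case 3: stretch, then resample

for name, y in [("original", x), ("faster", faster),
                ("slower", slower), ("shifted", shifted)]:
    print(name, len(y), dominant_freq(y, sr))
```

Linear interpolation is used only to keep the sketch short; a band-limited resampler would be the usual choice for real audio.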
Chetan Vinchhi wrote:
> gaetanoortisi@yahoo.it wrote:
> > Chetan Vinchhi wrote:
> >
> > > "Time-domain harmonic scaling" is a commonly-used technique to
> > > achieve this sort of transformations. A google search ought to dig
> > > out some material.
> >
> > No interpolation technique is needed to change speed?
>
> Yes, you do need that. You can do 3 types of transformation:
>
> 1. Change tempo and pitch: This is achieved by resampling the
> original signal appropriately.
>
> 2. Change tempo without changing pitch: You need an algorithm
> like time domain harmonic scaling for this. The basic idea is that
> you determine whole pitch periods in a processing window and
> repeat pitch periods to lower the tempo or delete pitch periods to
> increase the tempo. You would of course have to take care of any
> discontinuities or other artifacts at the boundaries.
>
> 3. Change pitch without changing tempo: Conceptually, this can
> be achieved by first changing the *tempo* and then resampling.
> In practice, you may be able to achieve this in a single step as
> a combination of harmonic scaling and resampling.
not to discourage the time-domain splicing technique (sometimes it works very well) but, if the application is to polyphonic music, TDHS might leave glitches because the automated process will find *no* decently matched places to splice. this is because there would be no way to find a single splice displacement that would be an integer number of periods for all of these unrelated (nonharmonic) sinusoidal components. many might be spliced in-phase, but some could be 180 degrees out of phase when spliced.

then you might have to use a frequency-domain technique like the phase vocoder or sinusoidal modeling.

somewhere in between the time-domain and frequency-domain is to do multiband splitting of the audio and apply a separate TDHS to each band. if you split the audio into a hundred bands or so, applying a separate TDHS to each band, conceptually this is a lot like the frequency-domain techniques because it will be unlikely that more than one sinusoid would fit into each band.

r b-j
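The splice-matching problem can be checked with a little arithmetic. The sketch below assumes NumPy and stationary sinusoidal components; for each candidate splice displacement it measures how far each component is from an integer number of periods. The frequencies, the lag search range, and the function names are hypothetical values chosen for illustration.

```python
import numpy as np

def worst_phase_error(freqs_hz, lag, sr):
    """Largest splice phase error (in cycles, 0 to 0.5) over all components."""
    cycles = np.asarray(freqs_hz) * lag / sr    # periods spanned by the lag
    return np.max(np.abs(cycles - np.round(cycles)))

def best_splice_lag(freqs_hz, lags, sr):
    """Lag whose worst-case phase error is smallest, plus that error."""
    errors = [worst_phase_error(freqs_hz, lag, sr) for lag in lags]
    i = int(np.argmin(errors))
    return lags[i], errors[i]

sr = 8000
lags = range(20, 101)                           # hypothetical search range, in samples

harmonic = [200.0, 400.0, 600.0]                # one note and its overtones
unrelated = [200.0, 283.0]                      # two unrelated partials

lag_h, err_h = best_splice_lag(harmonic, lags, sr)
lag_u, err_u = best_splice_lag(unrelated, lags, sr)
print("harmonic: ", lag_h, "samples,", err_h * 360, "degrees")
print("unrelated:", lag_u, "samples,", err_u * 360, "degrees")
```

For the harmonic set, one lag lines up every component; for the unrelated pair, the best available lag still leaves a residual phase error at the splice, and the situation tends to get worse as more unrelated components are added.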
robert bristow-johnson wrote:

...
> not to discourage the time-domain splicing technique (sometimes it
> works very well) but, if the application is to polyphonic music, TDHS
> might leave glitches because the automated process will find *no*
> decently matched places to splice.

Robert,

what in your opinion would be an appropriate procedure for time scaling a measured room response?

Regards,
Andor
in article 1158167334.782667.122830@m73g2000cwd.googlegroups.com, Andor at
andor.bariska@gmail.com wrote on 09/13/2006 13:08:

> robert bristow-johnson wrote:
>
> ...
>> not to discourage the time-domain splicing technique (sometimes it
>> works very well) but, if the application is to polyphonic music, TDHS
>> might leave glitches because the automated process will find *no*
>> decently matched places to splice.
>
> Robert,
>
> what in your opinion would be an appropriate procedure for time scaling
> a measured room response?

you mean to make a little reverberant room sound like a big reverberant room (without reducing the bandwidth of the reverberations)? i never thunked of that before.

why would you want to do that? you wouldn't be getting a "true" reverb sound as you would if you went into some cathedral, measured the impulse response, and used that in some massive convolution machine. if you're trying to approximate the "character" of some reverb, i thought the common methods of a matrixed feedback with a bunch of delay lines and long all-pass filters (APFs) (this Schroeder thing, try Googling: Schroeder reverb) would be as good of a fictional room as anything else.

needless to say, i am not sure how i would time-scale a short reverb impulse response (IR) into a long one. i guess i would try both time and frequency domain techniques and listen to the result and try to decide which one sounds less phony.

--
r b-j
rbj@audioimagination.com
"Imagination is more important than knowledge."
robert bristow-johnson wrote:

> > robert bristow-johnson wrote:
> >
> > ...
> >> not to discourage the time-domain splicing technique (sometimes it
> >> works very well) but, if the application is to polyphonic music, TDHS
> >> might leave glitches because the automated process will find *no*
> >> decently matched places to splice.
> >
> > Robert,
> >
> > what in your opinion would be an appropriate procedure for time scaling
> > a measured room response?
>
> you mean to make a little reverberant room sound like a big reverberant room
> (without reducing the bandwidth of the reverberations)? i never thunked of
> that before.
>
> why would you want to do that? you wouldn't be getting a "true" reverb
> sound as you would if you went into some cathedral, measured the impulse
> response, and used that in some massive convolution machine.

But you could scale the cathedral down to a match box if you can time compress the impulse response. The relative distance between each reflection is decreased, resulting in the perception of reduced volume while maintaining the geometry of the space. This time compression (or expansion) could be part of a convolution machine.
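As a toy illustration of how the reflection spacing shrinks, the sketch below compresses a made-up impulse response by plain resampling (NumPy, linear interpolation). Note the caveat: plain resampling also shifts the frequency content of every reflection and, for compression, needs anti-alias filtering, so this shows only the scaling of arrival times and is not a proposal for how a real room response should be scaled. The name `time_scale_ir` and the reflection values are hypothetical.

```python
import numpy as np

def time_scale_ir(ir, factor):
    """Compress (factor > 1) or expand (factor < 1) an impulse response
    by reading it at `factor` times the original rate (linear interpolation)."""
    n_out = int(len(ir) / factor)
    read_pos = np.arange(n_out) * factor
    return np.interp(read_pos, np.arange(len(ir)), ir)

# Toy "room": direct sound plus two discrete reflections (made-up values).
ir = np.zeros(2000)
ir[0], ir[400], ir[1000] = 1.0, 0.6, 0.3

small = time_scale_ir(ir, 2.0)
print(np.flatnonzero(ir))      # arrival times (samples) in the original response
print(np.flatnonzero(small))   # arrival times after compression
```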
> if you're
> trying to approximate the "character" of some reverb, i thought the common
> methods of a matrixed feedback with a bunch of delay lines and long APFs
> (this Schroeder thing, try Googling: Schroeder reverb) would be as good of a
> fictional room as anything else.
>
> needless to say, i am not sure how i would time-scale a short reverb IR into
> a long one. i guess i would try both time and frequency domain techniques
> and listen to the result and try to decide which one sounds less phony.

I had a look at some of the impulse responses of the pure time and pure frequency methods at http://www.dspdimension.com. They produce multiple copies of the single main pulse, which would of course be catastrophic for a reverb.

It seems a more advanced approach is necessary to scale impulse responses.

Regards,
Andor
in article 1158180465.381809.175420@i3g2000cwc.googlegroups.com, Andor at
andor.bariska@gmail.com wrote on 09/13/2006 16:47:

> robert bristow-johnson wrote:
>
>>> robert bristow-johnson wrote:
>>>
>>> ...
>>>> not to discourage the time-domain splicing technique (sometimes it
>>>> works very well) but, if the application is to polyphonic music, TDHS
>>>> might leave glitches because the automated process will find *no*
>>>> decently matched places to splice.
>>>
>>> Robert,
>>>
>>> what in your opinion would be an appropriate procedure for time scaling
>>> a measured room response?
>>
>> you mean to make a little reverberant room sound like a big reverberant room
>> (without reducing the bandwidth of the reverberations)? i never thunked of
>> that before.
>>
>> why would you want to do that? you wouldn't be getting a "true" reverb
>> sound as you would if you went into some cathedral, measured the impulse
>> response, and used that in some massive convolution machine.
>
> But you could scale the cathedral down to a match box if you can time
> compress the impulse response.
but you lose information. i don't see how time-compressing a cathedral impulse response (assuming it's a "good" sounding space) would have any promise of creating a "good" sounding impulse response for a smaller room.
> I had a look at some of the impulse responses of the pure time and pure
> frequency methods at http://www.dspdimension.com. They produce multiple
> copies of the single main pulse, which would of course be catastrophic
> for a reverb.

any discrete reflection will be a copy of the main pulse. "good" sounding spaces have *some* discrete reflections, often the "early reflections", but the reflections of the reflections become more and more diffuse and eventually the impulse response looks like some kinda white noise with an exponentially decaying envelope.

--
r b-j
rbj@audioimagination.com
"Imagination is more important than knowledge."
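The late, diffuse part described above can be imitated in a few lines. This is a minimal sketch assuming NumPy and a single decay time for all frequencies (real rooms decay at different rates in different bands); `synthetic_tail` and the 0.5 s decay time are hypothetical choices for illustration.

```python
import numpy as np

def synthetic_tail(n, sr, rt60, seed=0):
    """White noise shaped by an envelope that falls 60 dB in `rt60` seconds."""
    t = np.arange(n) / sr
    envelope = 10.0 ** (-3.0 * t / rt60)        # -60 dB amplitude at t = rt60
    noise = np.random.default_rng(seed).standard_normal(n)
    return noise * envelope, envelope

sr = 8000
tail, envelope = synthetic_tail(n=sr, sr=sr, rt60=0.5)

early = np.sum(tail[: sr // 2] ** 2)            # energy in the first half second
late = np.sum(tail[sr // 2:] ** 2)              # energy in the second half second
print(10 * np.log10(early / late), "dB more energy in the first half")
```

Changing `rt60` stretches or shortens such a tail without touching its bandwidth, which is one reason the diffuse part is easier to rescale than the discrete early reflections.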