Timestretching with Granular Synthesis: Keeping the pitch constant?

Started by Joe Bloggs June 7, 2008
I have made a program that timestretches WAV sound files using granular 
synthesis. It has three variable input values: % stretch factor, grain 
length (in samples) and grains per second. All seems to be working fine 
except for one problem I'm having with the pitch of the stretched sound.

If, for example, I stretch a sound to 200% with a grain length of 1000 
samples and 41 grains per second, the pitch of the stretched sound changes 
from the pitch of the original sound. If I use the same settings again with 
a stretch factor of 300% then the pitch goes down further from the 
original. After some trial and error I can get the pitch to match the 
original by altering the grains per second to a higher value.

I can see that there is a relation between the stretch factor and the 
grains per second (the longer/higher the stretch factor, the higher the 
grains per second value must be) but I can't work out a formula for it. So 
my question: Is there a common algorithm I can use to calculate the grains 
per second value based on the % stretch factor and the grain length, in 
order to keep the pitch constant? I also need to ensure that the grains 
stay spaced apart evenly.

I hope I have made sense, thanks.
On Jun 7, 3:25 pm, Joe Bloggs <j...@bloggs.com> wrote:
> I have made a program that timestretches WAV sound files using granular
> synthesis. It has three variable input values: % stretch factor, grain
> length (in samples) and grains per second. All seems to be working fine
> except for one problem I'm having with the pitch of the stretched sound.
>
> If, for example, I stretch a sound to 200% with a grain length of 1000
> samples and 41 grains per second, the pitch of the stretched sound changes
> from the pitch of the original sound. If I use the same settings again with
> a stretch factor of 300% then the pitch goes down further from the
> original. After some trial and error I can get the pitch to match the
> original by altering the grains per second to a higher value.
>
> I can see that there is a relation between the stretch factor and the
> grains per second (the longer/higher the stretch factor, the higher the
> grains per second value must be) but I can't work out a formula for it. So
> my question: Is there a common algorithm I can use to calculate the grains
> per second value based on the % stretch factor and the grain length, in
> order to keep the pitch constant? I also need to ensure that the grains
> stay spaced apart evenly.
>
> I hope I have made sense, thanks.

we'll see if this makes sense or not. Granular Synthesis has essentially two flavors or modes of doing it. one is "pitch synchronous", which comes out pretty close to the same thing as PSOLA. it needs to have some kind of pitch detector, and the grains are derived from the input in a pitch-synchronous manner (neighboring grains will be aligned to similar parts of adjacent periods in the input). the other is pitch asynchronous: the alignment of the windows that lift off grains is pretty much clueless about which part of the input waveform they are aligned to. so which way is it, in your case?

r b-j

robert bristow-johnson <rbj@audioimagination.com> wrote in
news:f1126715-6edc-4487-8e94-ccb8129e7268@s50g2000hsb.googlegroups.com: 

> we'll see if this makes sense or not. Granular Synthesis has
> essentially two flavors or modes of doing it. one is "pitch
> synchronous" which comes out pretty close to the same thing as PSOLA.
> it needs to have some kind of pitch detector and the grains are
> derived from the input in a pitch-synchronous manner. (neighboring
> grains will be aligned to similar parts of adjacent periods in the
> input). the other is pitch asynchronous. the alignment of the
> windows to lift off grains is pretty much clueless to the part of the
> input waveform they are aligned to. so which way is it, in your case?
>
> r b-j
>

Thanks for replying.

My algorithm does not do any pitch detection when creating and aligning the grains (I don't know how to do that yet); they just get overlap-added in a synchronous fashion with a fixed 'hop size' between them, so I guess that would make it pitch asynchronous? Sorry if I seem a little confused, as this is all new to me.

I assume I will have to use the PSOLA method to get the result I am looking for, then?

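For reference, a fixed-hop (pitch-asynchronous) overlap-add stretch of the kind described here can be sketched as follows. This is a minimal illustration, not the poster's actual program: the function names `hann` and `ola_stretch`, the Hann window and the 50% overlap are all assumptions. Grains are copied without resampling, so the waveform inside each grain keeps its original pitch; the stretch comes only from reading the input more slowly than the output is written. In this arrangement the grain rate follows from the output hop (grains per second = sample rate / output hop), and the stretch factor changes only the input hop.

```python
import math

def hann(n):
    """Periodic Hann window of length n (assumed window shape)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]

def ola_stretch(x, stretch, grain_len=1000):
    """Pitch-asynchronous overlap-add time stretch (illustrative sketch).

    x         : list of float samples
    stretch   : 2.0 means roughly twice the duration
    grain_len : grain length in samples (even, so that 50% overlap is exact)
    """
    if len(x) < grain_len:
        raise ValueError("input is shorter than one grain")
    out_hop = grain_len // 2          # fixed, evenly spaced output grains
    in_hop = out_hop / stretch        # read position advances more slowly
    win = hann(grain_len)
    n_grains = int((len(x) - grain_len) / in_hop) + 1
    y = [0.0] * ((n_grains - 1) * out_hop + grain_len)
    for g in range(n_grains):
        src = int(round(g * in_hop))  # where the grain is lifted from
        dst = g * out_hop             # where the grain is added to the output
        for i in range(grain_len):
            # Samples are copied unchanged (no resampling), only windowed.
            y[dst + i] += win[i] * x[src + i]
    return y
```

Because the grain positions ignore the period of the waveform, neighbouring grains can overlap out of phase and partly cancel, which is the kind of glitch a pitch-synchronous method tries to avoid.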
On Jun 8, 5:38 pm, Joe Bloggs <j...@bloggs.com> wrote:
...
>
> My algorithm does not do any pitch detection when creating and aligning the
> grains (I don't know how to do that yet), they just get overlap-added in a
> synchronous fashion with a fixed 'hop size' between them - so I guess that
> would make it pitch asynchronous? Sorry if I seem a little confused as this
> is all new to me.
>
> I assume I will have to use the PSOLA method to get the result I am looking
> for then?

to make your time-stretching "glitch free" you would have to do something like PSOLA if you're doing it entirely in the time domain. it would be glitch free really only if the input is quasi-periodic (sometimes called "quasi-harmonic"). to stretch, you would effectively end up "splicing in" repeated periods of this quasi-periodic input. cross-fading (which is effectively what OLA windowing does) helps obscure any glitches that you might get because the input isn't perfectly periodic. an extra period would be spliced in often enough to accomplish the degree of time-stretching, or "% stretch factor", that you (or the user) specifies.

for the "pitch detection", which is really measuring the period length of a periodic or quasi-periodic input, conceptually the simplest method to look up is the Average Magnitude Difference Function (AMDF) (or maybe "magnitude-squared"). this would be a process that runs in parallel with your OLA splicing process and tells the OLA process how much extra audio to splice in each time a splice occurs.

r b-j

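As a rough illustration of the AMDF idea mentioned above, the sketch below averages |x[n] - x[n + lag]| over a short frame for each candidate lag and takes the lag with the smallest average as the period. The function name `amdf_period` and all of its parameters are hypothetical, chosen for this example only; it assumes the frame plus the largest lag fits inside the signal.

```python
import math

def amdf_period(x, start, frame_len, min_lag, max_lag):
    """Estimate the period (in samples) of x near `start` using AMDF.

    Assumes start + frame_len + max_lag <= len(x).
    """
    best_lag, best_val = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        total = 0.0
        for n in range(start, start + frame_len):
            # A periodic signal nearly repeats after one period,
            # so this difference is small when lag matches the period.
            total += abs(x[n] - x[n + lag])
        val = total / frame_len
        if val < best_val:
            best_lag, best_val = lag, val
    return best_lag

# Example input: a sine wave with a period of 100 samples.
signal = [math.sin(2.0 * math.pi * n / 100.0) for n in range(2000)]
period = amdf_period(signal, start=0, frame_len=400, min_lag=40, max_lag=160)
```

The AMDF also dips at integer multiples of the true period, so the lag range here is kept below twice the expected period; a practical detector would need some way of handling those octave errors.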
robert bristow-johnson <rbj@audioimagination.com> wrote in
news:e881ba44-f8a2-41fb-9825-d0d90283ea95@w7g2000hsa.googlegroups.com: 

> to make your time-stretching "glitch free" you would have to do
> something like PSOLA if you're doing it entirely in the time domain.
> it would be glitch free really only if the input is quasi-periodic
> (sometimes called "quasi-harmonic"). to stretch, you would
> effectively end up "splicing in" repeated periods of this
> quasi-periodic input. cross-fading (which is effectively what OLA windowing
> does) helps obscure any glitches that you might get because the input
> isn't perfectly periodic. an extra period would be spliced in often
> enough to accomplish the degree of time-stretching, or "% stretch
> factor", that you (or the user) specifies.
>
> conceptually the simplest way to do this "pitch detection", which is
> really measuring the period length of a periodic or quasi-periodic
> input, you should look up Average Magnitude Difference Function (AMDF)
> (or maybe "magnitude-squared"). this would be a process that runs in
> parallel with your OLA splicing process and informs the OLA
> process to how much extra audio to splice in each time a splice
> occurs.
>
> r b-j
>

Thanks for your help, it's much appreciated.