Reply by NS December 26, 2005
>>>> If you want to shift just the pitch of speech signals, while >>>> maintaining all the rest unchanged (i.e. time, speed, formants >>>> etc.), then you need to decompose speech into LPC & excitation, >>>> apply speech shifting to the excitation, and synthesize back to >>>> generate the new speech. >>>> >>> This applies to the speech only. >> >> >> >> Not really, it applies to any signal that exhibits periodicity (and >> spectral envelope), and provides the capability of time-varying >> tracking of the signal's characteristics. > > > What if there is more then one source of the periodicity with the > different periods? What if no clear periodicity can be derived? > LPC + pitch assumes the human speech structure. > >> >>> The more general method to make the pitch or the speed change is by >>> the use of a filterbank. The signal is extrapolated by repetition or >>> truncated separately in the each subband. >> >> >> >> That method is not "more general", rather than the versa. Unlike the >> proposed method of LPC+excitation decomposition, the suggested >> filter-bank method "discretizes" the spectrum into fixed bands and is >> not optimized to the time-varying characteristics of the input signal. > > > Once there is no clear periodicity,
If there's no periodicity, or at least pseudo-periodicity, then there's no "pitch", right? And the question was about pitch shifting, wasn't it?
> then the only way to make the signal
Don't be so sure about "the only way". There are several ways of doing that; it may very well be just the only way you happen to know...
> "longer" or "shorter" is to extrapolate it by continuation of the > subframe (in the each band) or truncate it. The subband processing > moderates the edge effects and prevents the spillage of the > interpolation artifacts into the different subbands.
Not really. One may take advantage of the temporal characteristics of the signal and the corresponding perceptual properties of the auditory system. Subband processing is just one way, and certainly not the best way, of doing that. NS
Reply by Ron N. December 26, 2005
Vladimir Vassilevsky wrote:
> >>> If you want to shift just the pitch of speech signals, while > >>> maintaining all the rest unchanged (i.e. time, speed, formants etc.), > >>> then you need to decompose speech into LPC & excitation, apply speech > >>> shifting to the excitation, and synthesize back to generate the new > >>> speech. > >>> > >> This applies to the speech only. > > > > Not really, it applies to any signal that exhibits periodicity (and > > spectral envelope), and provides the capability of time-varying tracking > > of the signal's characteristics. > > What if there is more then one source of the periodicity with the > different periods? What if no clear periodicity can be derived? > LPC + pitch assumes the human speech structure. > > >> The more general method to make the pitch or the speed change is by > >> the use of a filterbank. The signal is extrapolated by repetition or > >> truncated separately in the each subband. > > > > > > That method is not "more general", rather than the versa. Unlike the > > proposed method of LPC+excitation decomposition, the suggested > > filter-bank method "discretizes" the spectrum into fixed bands and is > > not optimized to the time-varying characteristics of the input signal. > > Once there is no clear periodicity, then the only way to make the signal > "longer" or "shorter" is to extrapolate it by continuation of the > subframe (in the each band) or truncate it. The subband processing > moderates the edge effects and prevents the spillage of the > interpolation artifacts into the different subbands.
However, if there were multiple exciter sources with differing periods and envelopes, would not mixing together those overtones, which just happen to be contained in the same subband but result from different exciters, result in some time domain spillage and artifacts instead? IMHO. YMMV. -- rhn A.T nicholson d.O.t C-o-M
Reply by Vladimir Vassilevsky December 26, 2005


>>> If you want to shift just the pitch of speech signals, while >>> maintaining all the rest unchanged (i.e. time, speed, formants etc.), >>> then you need to decompose speech into LPC & excitation, apply speech >>> shifting to the excitation, and synthesize back to generate the new >>> speech. >>> >> This applies to the speech only. > > > Not really, it applies to any signal that exhibits periodicity (and > spectral envelope), and provides the capability of time-varying tracking > of the signal's characteristics.
What if there is more than one source of periodicity, each with a different period? What if no clear periodicity can be derived? LPC + pitch assumes the structure of human speech.
> >> The more general method to make the pitch or the speed change is by >> the use of a filterbank. The signal is extrapolated by repetition or >> truncated separately in the each subband. > > > That method is not "more general", rather than the versa. Unlike the > proposed method of LPC+excitation decomposition, the suggested > filter-bank method "discretizes" the spectrum into fixed bands and is > not optimized to the time-varying characteristics of the input signal.
Once there is no clear periodicity, the only way to make the signal "longer" or "shorter" is to extrapolate it by continuing the subframe (in each band) or to truncate it. Subband processing moderates the edge effects and keeps the interpolation artifacts from spilling into other subbands. VLV
Reply by NS December 26, 2005
> NS wrote: > >> If you want to shift just the pitch of speech signals, while >> maintaining all the rest unchanged (i.e. time, speed, formants etc.), >> then you need to decompose speech into LPC & excitation, apply speech >> shifting to the excitation, and synthesize back to generate the new >> speech. >> > > This applies to the speech only.
Not really, it applies to any signal that exhibits periodicity (and spectral envelope), and provides the capability of time-varying tracking of the signal's characteristics.
> The more general method to make the > pitch or the speed change is by the use of a filterbank. The signal is > extrapolated by repetition or truncated separately in the each subband.
That method is not "more general"; rather the reverse. Unlike the proposed LPC + excitation decomposition, the suggested filter-bank method "discretizes" the spectrum into fixed bands and is not optimized for the time-varying characteristics of the input signal. Please refer to the suggested text to better understand the subject. Thanks, NS
Reply by Vladimir Vassilevsky December 26, 2005

NS wrote:

> If you want to shift just the pitch of speech signals, while maintaining > all the rest unchanged (i.e. time, speed, formants etc.), then you need > to decompose speech into LPC & excitation, apply speech shifting to the > excitation, and synthesize back to generate the new speech. >
This applies to speech only. The more general way to change the pitch or the speed is to use a filterbank. The signal is extrapolated by repetition, or truncated, separately in each subband. Vladimir Vassilevsky DSP and Mixed Signal Design Consultant http://www.abvolt.com
Reply by NS December 26, 2005
If you want to shift just the pitch of speech signals, while keeping
everything else unchanged (i.e. time, speed, formants, etc.), then you need
to decompose the speech into LPC & excitation, apply the pitch shift to the
excitation, and synthesize back to generate the new speech.
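
As a rough illustration of the decomposition step only, here is a minimal sketch in plain Python. It assumes the textbook autocorrelation method with a Levinson-Durbin recursion, which is one common way to get LPC coefficients and not necessarily what any particular codec does. The function names, the order of 4, and the synthetic test frame are all made up for the example, and windowing and frame overlap are left out. It shows that inverse filtering gives an excitation (residual) and that running the excitation back through the all-pole filter rebuilds the frame; the pitch modification itself would be applied to `excitation` in between.

```python
import math
import random


def autocorrelation(x, max_lag):
    """r[k] = sum_n x[n] * x[n - k] for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns a[0..order] with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[k] * r[i - k] for k in range(1, i))
        refl = -acc / err                      # reflection coefficient
        prev = a[:]
        for k in range(1, i):
            a[k] = prev[k] + refl * prev[i - k]
        a[i] = refl
        err *= 1.0 - refl * refl               # remaining prediction error
    return a


def lpc_residual(x, a):
    """Analysis (inverse) filter: e[n] = sum_k a[k] * x[n - k]."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]


def lpc_synthesize(e, a):
    """Synthesis (all-pole) filter: x[n] = e[n] - sum_{k>=1} a[k] * x[n - k]."""
    x = []
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * x[n - k]
        x.append(acc)
    return x


# Hypothetical test frame: a 400 Hz tone with a third harmonic plus a little noise.
rng = random.Random(0)
fs = 8000
frame = [math.sin(2 * math.pi * 400 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 1200 * n / fs)
         + 0.05 * rng.uniform(-1.0, 1.0)
         for n in range(240)]

coeffs = levinson_durbin(autocorrelation(frame, 4), 4)
excitation = lpc_residual(frame, coeffs)      # this is what would be pitch shifted
rebuilt = lpc_synthesize(excitation, coeffs)  # unmodified excitation gives the frame back
```

With the excitation left untouched, `rebuilt` matches `frame` up to rounding, which is the sanity check to run before modifying anything.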

For example, if you wish to double the pitch, you "squeeze" the
pitch periods (by downsampling, for example) and double their number.
If, on the other hand, you want to halve the pitch, you "stretch" each
period to twice its length (by upsampling, for example) and halve the
number of periods.
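
The squeeze/stretch bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions: the excitation is taken to be already cut into pitch periods (the `periods` list is invented test data, not real speech), the resampling is plain linear interpolation with no anti-alias filtering before downsampling, and the names `resample_period` and `shift_excitation` are hypothetical.

```python
def resample_period(period, new_len):
    """Linearly interpolate one pitch period to new_len samples, wrapping at the end."""
    old_len = len(period)
    out = []
    for i in range(new_len):
        pos = i * old_len / new_len
        i0 = int(pos)
        frac = pos - i0
        a = period[i0 % old_len]
        b = period[(i0 + 1) % old_len]
        out.append(a + frac * (b - a))
    return out


def shift_excitation(periods, factor):
    """Scale pitch by `factor` while keeping the total duration roughly constant."""
    n_out = round(len(periods) * factor)       # factor 2 -> twice as many periods
    shifted = []
    for j in range(n_out):
        src = periods[min(int(j / factor), len(periods) - 1)]
        new_len = max(1, round(len(src) / factor))  # factor 2 -> half as long
        shifted.append(resample_period(src, new_len))
    return shifted


# Invented data: four identical 8-sample "pitch periods".
periods = [[0, 1, 2, 3, 4, 5, 6, 7] for _ in range(4)]
doubled = shift_excitation(periods, 2)     # 8 periods of 4 samples each
halved = shift_excitation(periods, 0.5)    # 2 periods of 16 samples each
```

In both cases the total number of samples stays at 32, which is the point: the period length changes, the duration does not.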

These techniques are known as "time scaling" and "frequency scaling".

For more please refer to the excellent text:
"Speech coding and synthesis"
by W B Kleijn;  K K Paliwal
Publisher: Amsterdam ; New York : Elsevier, 1995.
ISBN: 0444821694



H wrote:
> Hi. > > I want to do some pitch shifting of some samples I have. I'm trying to > avoid the whole sample rate conversion problem. I can hack up the > hardware some. > > Q: Can I just alter the clock and rate that I send data to my DAC > *WITHOUT* changing the anti-aliasing low-pass and accomplish my goal? > > Thank you. > Henry.
Reply by Jerry Avins December 25, 2005
H wrote:
> Hi Jerry, > > In article <q6WdnVYqN_AOXDHeRVn-oA@rcn.net>, jya@ieee.org says... > >>H wrote: >> >>>Q: Can I just alter the clock and rate that I send data to my DAC >>>*WITHOUT* changing the anti-aliasing low-pass and accomplish my goal?
Changing the clock rate changes both the pitch and the signal duration. Changing one without also changing the other is considerably more complex. http://www.dspdimension.com/ has a tutorial and source code. Jerry -- Engineering is the art of making what you want from things you can get.
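
To put numbers on that, here is a small sketch of what happens when the same samples are simply clocked out at a different rate. The function name `playback` and the 440 Hz / 44.1 kHz figures are made up for the example.

```python
import math


def playback(num_samples, tone_hz, fs_recorded, fs_playback):
    """Return (heard pitch in Hz, duration in s, shift in semitones)."""
    ratio = fs_playback / fs_recorded
    return tone_hz * ratio, num_samples / fs_playback, 12 * math.log2(ratio)


# One second of a 440 Hz tone recorded at 44.1 kHz, played back at twice the clock rate.
pitch_hz, seconds, semitones = playback(44100, 440.0, 44100, 88200)
```

Doubling the clock raises the tone by an octave and halves the playing time; there is no way to get one without the other from the clock alone.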
Reply by John Herman December 24, 2005
In article <MPG.1e177525e14c8af79897b7@news-server.san.rr.com>, H <user@user.user> wrote:
>Hi Jerry, > >What I am unclear about is the anti-aliasing filter. The filter cut-off >is typically at fs/2, so now if the playback rate is altered, and it is >no longer at fs, won't that affect the quality of the pitch shift? > >Henry.
The analog anti-aliasing filter cutoff is typically below fs/2. The system designer normally designates a frequency at and below which signal components aliased by the sampling must be attenuated by some amount that depends on the requirements of the system. This determines the requirements for the anti-aliasing filter. Typically, that frequency is in the range fs/4 to fs/3, but higher frequencies are possible. With brickwall filters, frequencies up to 0.95 * fs/2 are available.
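
A quick way to see why the protected band edge matters: anything at or above fs minus that edge folds back into the protected band, so the filter has to reach its required attenuation by there. The sketch below just works out that stopband edge and the width of the transition band in octaves. The function name `antialias_spec` and the 48 kHz sample rate are assumptions for illustration.

```python
import math


def antialias_spec(fs, passband_edge):
    """Return (stopband edge in Hz, transition width in octaves).

    Components above fs - passband_edge alias into 0..passband_edge.
    """
    if not 0 < passband_edge < fs / 2:
        raise ValueError("passband edge must lie between 0 and fs/2")
    stopband_edge = fs - passband_edge
    return stopband_edge, math.log2(stopband_edge / passband_edge)


fs = 48000
relaxed = antialias_spec(fs, fs / 4)            # protect up to fs/4
moderate = antialias_spec(fs, fs / 3)           # protect up to fs/3
brickwall = antialias_spec(fs, 0.95 * fs / 2)   # protect up to 0.95 * fs/2
```

Protecting up to fs/3 leaves a full octave for the filter to roll off, while 0.95 * fs/2 leaves only about a seventh of an octave, which is why it takes a brickwall filter.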
Reply by Jerry Avins December 24, 2005
H wrote:
> Hi Jerry, > > In article <q6WdnVYqN_AOXDHeRVn-oA@rcn.net>, jya@ieee.org says... > >>H wrote: >> >>>Q: Can I just alter the clock and rate that I send data to my DAC >>>*WITHOUT* changing the anti-aliasing low-pass and accomplish my goal? >> >>A set of samples is very much like the wavy groove of a phonograph >>record. Increasing the turntable speed is like increasing the sample >>rate. Draw your own conclusions from that. > > > Thank you for the analogy, but I'm still left scratching my head. I > know that altering the sample rate will shift the pitch, that makes > sense. > > What I am unclear about is the anti-aliasing filter. The filter cut-off > is typically at fs/2, so now if the playback rate is altered, and it is > no longer at fs, won't that affect the quality of the pitch shift?
fs *is* the sample frequency. When you change it, it changes. Do you remember The Chipmunks? They made their recordings by singing slowly an octave lower than the score, then playing back at double speed. That restored both pitch and tempo, but doubled all the formant frequencies. A low-pass filter of 5 kHz in the microphone would have permitted 10 kHz in the released recording. Clear as mud now? Jerry -- Engineering is the art of making what you want from things you can get.
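
For the original question about leaving the output low-pass alone, the same scaling can be turned into a rough check. This is an idealized sketch: it treats the DAC output filter (strictly a reconstruction filter, not an anti-aliasing one) as a fixed brick wall, and the function name and all the numbers are invented for the example. Everything in the recording scales with the clock ratio, and the lowest image at the DAC output starts at the playback rate minus the scaled bandwidth.

```python
def fixed_filter_effects(recorded_bw_hz, fs_recorded, fs_playback, cutoff_hz):
    """Return (played bandwidth, content_cut, image_leaks) for a fixed output filter."""
    ratio = fs_playback / fs_recorded
    played_bw = recorded_bw_hz * ratio          # all frequencies scale with the clock
    lowest_image = fs_playback - played_bw      # bottom of the first spectral image
    return played_bw, played_bw > cutoff_hz, lowest_image < cutoff_hz


# Output filter fixed at 10 kHz, i.e. fs/2 for the original 20 kHz sample rate.
chipmunks = fixed_filter_effects(5000, 20000, 40000, 10000)   # 5 kHz content, double speed
too_fast = fixed_filter_effects(9000, 20000, 40000, 10000)    # 9 kHz content, double speed
too_slow = fixed_filter_effects(5000, 20000, 10000, 10000)    # 5 kHz content, half speed
```

So a fixed filter is harmless only within limits: speed up too far and it starts cutting the shifted content, slow down and the images drop below the cutoff and come through.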
Reply by H December 24, 2005
Hi Jerry,

In article <q6WdnVYqN_AOXDHeRVn-oA@rcn.net>, jya@ieee.org says...
> H wrote: > > Q: Can I just alter the clock and rate that I send data to my DAC > > *WITHOUT* changing the anti-aliasing low-pass and accomplish my goal? > > A set of samples is very much like the wavy groove of a phonograph > record. Increasing the turntable speed is like increasing the sample > rate. Draw your own conclusions from that.
Thank you for the analogy, but I'm still left scratching my head. I know that altering the sample rate will shift the pitch, that makes sense. What I am unclear about is the anti-aliasing filter. The filter cut-off is typically at fs/2, so now if the playback rate is altered, and it is no longer at fs, won't that affect the quality of the pitch shift? Henry.