> Sony or somebody made a speech playback tape player in the 60s that
> would speed talk a book without pitch shifting it. I think it looked
> for vowel sounds like 'eee' 'iii' 'ooo' which always seemed to be about
> 300ms long, and somehow shortened them to about 100ms (reed relay?)
> There weren't any microprocessors in the 60s... maybe some DTL or
> TTL.... How to find the vowel sounds? Bank of filters?
IIRC, it was a spinning-head machine. It could speed or slow what was
recorded depending on which way the head bank rotated. It was OK for
notes and dictation by trained people. It wasn't good enough for
"Talking Books".
Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by BobG ● February 22, 2005
Sony or somebody made a speech playback tape player in the 60s that
would speed talk a book without pitch shifting it. I think it looked
for vowel sounds like 'eee' 'iii' 'ooo' which always seemed to be about
300ms long, and somehow shortened them to about 100ms (reed relay?)
There weren't any microprocessors in the 60s... maybe some DTL or
TTL.... How to find the vowel sounds? Bank of filters?
Reply by Tam_heidthebaw ● February 22, 2005
ujamshy@yahoo.com wrote in message news:<1109007096.279904.78700@l41g2000cwc.googlegroups.com>...
> Can you recommend a package that manipulates audio speed
> without pitch change and is provided with source code (c or c++)?
>
> Thanks,
> Ury Jamshy.
You need a phase-vocoder.
Tam
Reply by ● February 22, 2005
ujamshy@yahoo.com writes:
> Can you recommend a package that manipulates audio speed
> without pitch change and is provided with source code (c or c++)?
>
Reply by robert bristow-johnson ● February 21, 2005
in article 1109007096.279904.78700@l41g2000cwc.googlegroups.com,
ujamshy@yahoo.com at ujamshy@yahoo.com wrote on 02/21/2005 12:31:
> Can you recommend a package that manipulates audio speed
> without pitch change and is provided with source code (c or c++)?
wow! you ain't asking for much!
there are programs on the market that do that. some work very well. all
that i know of have proprietary code and algorithms. we have talked about
the public domain algorithms (that you might find in the lit) on this
newsgroup and dspdimension.com talks about it more.
there are two general directions you can go (as far as i can see).
the first is something purely in the time-domain (jargon: "TDHS", "SOLA", "PSOLA",
"WSOLA"). these all do the same basic thing, which is to splice in extra copies of
audio snippets (if you're stretching it) or splice out small snippets (if
you're shrinking it). if the audio is a monophonic tone (a single human
voice) or a nice major chord (i call this a "quasi-periodic" waveform), then
you can splice in or out a piece of audio that has endpoints that are
similar to each other. you are effectively splicing in or out a single (or
maybe an integer multiple) cycle or period of the waveform and, if combined
with cross-fading, you can close up the gap left pretty seamlessly and it
sounds pretty good. if the audio is not quasi-periodic (a full bandwidth
orchestra with lots of different notes or a dissonant sound or a sound with
lots of percussive components), then you will not be able to find any really
good splice points and you will hear "glitches" when splices are made.
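To make the splicing idea concrete, here is a minimal sketch in plain Python. It is not taken from any of the packages mentioned; the function name `splice_out` and its parameters are made up for illustration, and it assumes the splice length (ideally one period of the waveform) is already known. It removes a snippet and closes the gap with a linear cross-fade.

```python
import math

def splice_out(x, start, length, fade):
    """Shorten x by `length` samples: drop x[start:start+length] and
    cross-fade over `fade` samples so the joint is smooth.
    (Hypothetical helper, for illustration only.)"""
    out = list(x[:start])
    for i in range(fade):
        g = i / fade  # linear cross-fade gain, ramps from 0 toward 1
        # fade out the audio before the cut, fade in the audio after it
        out.append((1.0 - g) * x[start + i] + g * x[start + length + i])
    out.extend(x[start + length + fade:])
    return out

# A quasi-periodic test tone with a period of 50 samples.
tone = [math.sin(2 * math.pi * t / 50) for t in range(400)]

# Splicing out exactly one period leaves the waveform intact.
good = splice_out(tone, 120, 50, 20)

# Splicing out half a period joins two pieces that are out of phase.
bad = splice_out(tone, 120, 25, 20)
```

When the splice length matches the period, the two sides of the cross-fade are nearly identical and the result is indistinguishable from a shorter copy of the tone. With the half-period splice the audio after the joint is inverted relative to what came before, which is the kind of mismatch heard as a glitch. Stretching is the mirror image: the same cross-fade, but jumping backward to repeat a period instead of forward to skip one.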
the other general approach is frequency-domain (jargon: "phase-vocoder",
"sinusoidal modeling", "FFT/iFFT") which is conceptually similar to the
time-domain approach, except that it applies this splicing separately and
independently to individual frequency components. (in the time-domain
approach, this splicing is applied to all frequency components together, and
if the sound is not harmonic enough, no single splice displacement will
satisfy all of the frequency components. some of them will get spliced out of
phase giving you a "glitch".) in the frequency domain every component is
spliced perfectly (no glitches!), but that allows the phases of the harmonics
of a single note to slip relative to each other, so they add up in a way that
does not match their original interrelationship. the waveshape is not
preserved. so these frequency-domain time-scalers sometimes sound worse than
the time-domain methods.
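As a rough illustration of the frequency-domain approach, here is a bare-bones phase vocoder in plain Python. It is a sketch under stated assumptions, not the algorithm of any particular product: all function names, the frame size of 64 and the analysis hop of 8 are arbitrary choices, a naive O(N^2) DFT stands in for a real FFT so the example has no dependencies, and there is no phase locking between bins (which is exactly why the waveshape is not preserved on real material).

```python
import cmath
import math

def hann(n):
    # periodic Hann window
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft(frame):
    # naive DFT of a real frame, bins 0..n/2 only (stand-in for an FFT)
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def idft_real(spec, n):
    # inverse of dft(), using conjugate symmetry to rebuild a real frame
    out = []
    for t in range(n):
        acc = spec[0].real + (spec[n // 2] * cmath.exp(1j * math.pi * t)).real
        for k in range(1, n // 2):
            acc += 2.0 * (spec[k] * cmath.exp(2j * math.pi * k * t / n)).real
        out.append(acc / n)
    return out

def princarg(p):
    # wrap a phase value into [-pi, pi)
    return (p + math.pi) % (2 * math.pi) - math.pi

def phase_vocoder(x, stretch, n=64, hop_a=8):
    hop_s = int(round(hop_a * stretch))  # synthesis hop sets the new speed
    win = hann(n)
    n_frames = (len(x) - n) // hop_a + 1
    out = [0.0] * ((n_frames - 1) * hop_s + n)
    norm = [0.0] * len(out)
    prev_phase, synth_phase = None, None
    for m in range(n_frames):
        spec = dft([x[m * hop_a + i] * win[i] for i in range(n)])
        phase = [cmath.phase(c) for c in spec]
        if prev_phase is None:
            synth_phase = phase[:]
        else:
            for k in range(len(spec)):
                omega = 2 * math.pi * k / n  # bin center frequency, rad/sample
                # deviation of the measured phase advance from the bin center
                dev = princarg(phase[k] - prev_phase[k] - omega * hop_a)
                # advance each bin's phase by its own estimated frequency
                synth_phase[k] += hop_s * (omega + dev / hop_a)
        prev_phase = phase
        frame = idft_real([cmath.rect(abs(c), p) for c, p in zip(spec, synth_phase)], n)
        for i in range(n):
            out[m * hop_s + i] += frame[i] * win[i]
            norm[m * hop_s + i] += win[i] * win[i]
    # divide out the summed squared windows from the overlap-add
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]
```

Calling `phase_vocoder(x, 2.0)` returns roughly twice as many samples at the same pitch. Each bin keeps its own running phase, so every sinusoidal component stays continuous across frames, but nothing ties the phases of different bins together.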
both methods require some computationally intensive stuff (AMDF or
auto-correlation for time-domain, FFT and iFFT for frequency-domain) but the
frequency-domain time-scalers cost more.
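For the time-domain side, the expensive part is the search for a good splice length. A minimal sketch of an AMDF (average magnitude difference function) search in plain Python follows; the function name `amdf_period` and its arguments are illustrative assumptions, not part of any package discussed here. It tries every candidate lag in a range and keeps the one where the signal best matches a shifted copy of itself.

```python
import math

def amdf_period(x, start, window, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] that minimizes the average
    magnitude difference between x[start:start+window] and the same
    window shifted by that lag. (Hypothetical helper, for illustration.)"""
    best_lag, best_score = min_lag, float("inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(abs(x[start + i] - x[start + i + lag])
                    for i in range(window)) / window
        if score < best_score:  # strict '<' keeps the shortest best lag
            best_lag, best_score = lag, score
    return best_lag

# Test tone: fundamental plus second harmonic, period of 40 samples.
tone = [math.sin(2 * math.pi * t / 40) + 0.5 * math.sin(4 * math.pi * t / 40 + 1.0)
        for t in range(400)]
period = amdf_period(tone, 100, 80, 20, 70)
```

The cost is roughly `window * (max_lag - min_lag)` operations per splice decision. The lag range matters: any integer multiple of the true period scores just as well, so the upper bound should stay below twice the longest period expected.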
--
r b-j rbj@audioimagination.com
"Imagination is more important than knowledge."
Reply by ● February 21, 2005
Can you recommend a package that manipulates audio speed
without pitch change and is provided with source code (c or c++)?
Thanks,
Ury Jamshy.