DSPRelated.com
Forums

How to change a voice's pitch in real time?

Started by Piotr Mancini July 13, 2020
I just learned how to convert an audioclip from a 33.3 rpm vinyl record to 78 rpm, here:

https://community.adobe.com/t5/audition/converting-33-recording-to-78-how/td-p/9448709?page=1 

What I need is similar but probably harder. I am developing a web application based on a segment from a 2013 TV program. Only the first seconds are relevant:

https://www.youtube.com/watch?v=8MF04X2aLBw 

The clone of that segment is an interactive application, under construction, seen below. Notice how the user has two degrees of freedom: the vector's magnitude and its angle.

http://www.dealey-plaza.org/this-government-as-promised/SBT-MBT-Tools/Haags-Measurement-Tool/ 

I already have the code that will receive those two variables as the user drags the mouse around. The effects will be:

 - Vector Magnitude changes: the voice volume increases/decreases. This is most likely the easy part.

 - Vector Angle changes: as the angle is modified, the pitch will vary (listen to the two extremes attached).
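
As a rough illustration of the two controls listed above, the vector could be mapped to a gain and a pitch ratio like this. This is only a sketch: the function name, `max_magnitude` (the drag distance that gives full volume) and `semitone_range` (how far the pitch moves at the extreme angles) are hypothetical parameters chosen for the example, not values taken from the application.

```python
import math

def controls_from_vector(dx, dy, max_magnitude=200.0, semitone_range=12.0):
    """Map a dragged vector (dx, dy) to (gain, pitch_ratio).

    All ranges here are assumptions made for illustration.
    """
    magnitude = math.hypot(dx, dy)
    angle = math.atan2(dy, dx)  # radians, in [-pi, pi]

    # Volume: linear in the vector length, clipped at full scale.
    gain = min(magnitude / max_magnitude, 1.0)

    # Pitch: angle mapped linearly to semitones, then to a frequency ratio.
    semitones = semitone_range * angle / math.pi
    pitch_ratio = 2.0 ** (semitones / 12.0)
    return gain, pitch_ratio
```

A ratio of 1.0 leaves the pitch alone, 2.0 is one octave up and 0.5 one octave down. The gain is simply multiplied into the samples; the pitch ratio is what the back-end library would need to accept.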

What I need is the back-end part (library, etc).

The question is actually more general than simple signal intensity and frequency. I need to "modulate" a signal based on the user's activity. Any recommendations are welcome.
 

TIA,

-Ramon F. Herrera
JFK Numbers
On Sunday, July 12, 2020 at 10:58:29 PM UTC-5, Piotr Mancini wrote:
(snip)
Wow! This used to be one of the few Usenet newsgroups that had survived the onslaught of the sons/daughters of bitches. In fact, the production and interchanges were remarkable.

The bastards killed it!

Is there ANY Usenet Newsgroup that is actually functional?

-Ramon
JFK Numbers
On Sunday, July 12, 2020 at 8:58:29 PM UTC-7, Piotr Mancini wrote:

(snip)

> - Vector Angle changes: as it is modified the pitch (listen to
> the two extremes attached) will vary.
More usual are ones that change speed, but not pitch. Since you can change the sampling rate, that is equivalent.

Before digital, there were analog tape players that did this using a moving head.

You can speed up voice by cutting out segments, long enough to determine pitch, but short enough not to determine phonemes.

With the popular 44.1kHz sampling rate divisible by 3 (3*14.7kHz), you could speed up by 1.5 by removing 0.1s out of every 0.3s, so remove 4410 samples, then leave 8820 samples. (Per second, that removes 14700 samples and leaves 29400.)

Now you want to resample. Since it is hard to describe a better way in a short note, double every other sample of each 8820 sample fragment. That stretches the fragment back to 13230 samples, so the original duration is restored and the pitch drops by the same factor of 1.5. You should probably low-pass the result, but maybe close enough.
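
The recipe above can be sketched in a few lines of plain Python. This is a crude illustration only, assuming 0.1 s and 0.2 s fragments (4410 and 8820 samples at 44.1 kHz); the function names are made up for the example, the hard cuts will click at the fragment boundaries, and repeating samples is a zero-order hold that should be followed by a low-pass filter.

```python
def drop_segments(samples, drop, keep):
    """Speed up by discarding `drop` samples, then keeping `keep`, repeatedly."""
    out = []
    pos = 0
    while pos < len(samples):
        pos += drop                           # skip the removed fragment
        out.extend(samples[pos:pos + keep])   # copy the kept fragment
        pos += keep
    return out


def double_every_other(samples):
    """Repeat every second sample, making the signal 1.5x longer."""
    out = []
    for i, s in enumerate(samples):
        out.append(s)
        if i % 2 == 1:
            out.append(s)                     # crude upsampling by 3/2
    return out


def lower_pitch_crudely(samples, drop=4410, keep=8820):
    """Drop 1/3 of the audio, then stretch the rest back to full length."""
    return double_every_other(drop_segments(samples, drop, keep))
```

Played back at the original sampling rate, the result has roughly the original duration with the pitch lowered by a factor of 1.5. Raising the pitch would need the opposite pair: repeat fragments to slow down, then discard every third sample.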
On Monday, July 13, 2020 at 8:53:31 PM UTC-5, ga...@u.washington.edu wrote:
(snip)
Thank you so much! Finally...

What I need pretty much is an OSS library to manipulate audio signals. The more high level (audio-specific?), the better.

Once I have that, I will either code the app myself, or (most likely) get a hired gun (Freelancer) to do the implementation you described for me.

Thanks again!!

-Ramon
JFK Numbers

On Sunday, July 12, 2020 at 10:58:29 PM UTC-5, JFK Numbers wrote:
(snip)
This needs further explanation. I will try to be as succinct as possible.

If you prefer technical issues only and don't care about history, politics and controversy please STOP reading now. Move on.

If you haven't please watch this videoclip and pay close attention. That is the most advanced study ever done of the shooting, it was paid with the unlimited expense credit card of the Koch brothers.

https://www.youtube.com/watch?v=8MF04X2aLBw

There have been 12 "scientific" studies of the Kennedy murder. All have produced pre-ordained results, some are LN ("It was Lee, alone, 3 shots") the rest are CT. What they have in common is that they are all fraudulent. Every single one of them. See them here:

https://archive.org/details/@the_12_fraudulent_studies?sort=titleSorter

This is also important. Below is the front end to the audio-distortion application, just click on "Continue".

http://www.dealey-plaza.org/this-government-as-promised/SBT-MBT-Tools/The-12-Fraudulent-Studies/

One of my fundamental beliefs is this, as I told a book author:

- Chances of a lawyer admitting to a counterpart: "You were right all these years, it was a conspiracy"? Hades will proverbially freeze over before that happens.

- Chances of a physician telling a dissenting colleague: "You were right on the autopsy X-rays, the fatal shot did not come from behind" Similar to the odds above.

- Chances of an engineer, physicist, 3D designer, etc.? Now we are talking. And I mean that in the literal sense: They ARE talking. Our colleagues are.

Notice these 2 images:

http://www.jfknumbers.org/~ramon/jfk/Two-Mikes-One-is-a-Liar.png
http://www.jfknumbers.org/~ramon/jfk/My-Dear-Colleague-and-Mentor-Mike-McCormick.jpg

That is enough intro. Later, I will be asking (make that: begging) for help on the audio aspects of "The Subject That Never Dies". The official version is a dead man walking, BTW.

It is up to us, numerically trained people (who have been away from the case, always controlled by liars, err, I mean: lawyers) to solve forever that tragic event. We cannot allow the Fake News, haters of academia, science, logic, MAHA hat wearing types to destroy history.

-Ramon
JFK Numbers
ramon at jfknumbers dot org
On Mon, 13 Jul 2020 12:57:15 -0700 (PDT), Piotr Mancini
<piotr.mancini@gmail.com> wrote:

(snip)

>Is there ANY Usenet Newsgroup that is actually functional?
Yes, there are a few, I think. But not many.

I remember this group in the early 1990s! People here helped me out when I needed another DSP56001 processor and someone sent me one! Still have it today.

Long live comp.dsp

Or something like that...

boB
Am 14.07.2020 um 05:00 schrieb Piotr Mancini:
(snip)

> What I need pretty much is an OSS library to manipulate audio signals. The more high level (audio-specific?), the better.
Not sure if it has exactly what you have been looking for, but have you had a look at the open source audio editor Audacity (https://www.audacityteam.org/)? If it can accomplish what you need, you might be able to extract the required functions from the libraries it builds on. Note that PortAudio only handles audio input/output; Audacity's Change Pitch and Change Tempo effects are built on the SoundTouch library.

Greetz,
Sebastian
On Thursday, July 16, 2020 at 12:21:25 PM UTC-7, Sebastian Doht wrote:

(snip)

> Not sure if it has exactly what you have been looking for, but have you
> had a look at the open source audio editor Audacity
> (https://www.audacityteam.org/)? If it can accomplish what you need you
> might be able to extract the required functions from its backend library
> portaudio.
Note that the OP's question can be answered with two operations. The first changes the speed without changing the pitch; the second resamples to get back to the original speed, now with the pitch changed.

The OP asked for 'real time', which I will interpret as minimal delay. It can't be done with zero delay, but maybe close enough.
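
A minimal sketch of those two operations, in plain Python with no audio library. It assumes a simple overlap-add time stretch with a Hann window (not any particular library's method) followed by linear-interpolation resampling; the function names and the `frame` size are choices made for this example. Plain overlap-add does not align waveforms between frames, so expect audible roughness; WSOLA or a phase vocoder would be the usual refinements.

```python
import math

def ola_stretch(x, factor, frame=1024):
    """Change duration by `factor` (>1 = longer) without resampling."""
    hop_out = frame // 2
    hop_in = max(1, round(hop_out / factor))
    # Periodic Hann window: copies at 50% overlap sum to 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame) for n in range(frame)]
    n_frames = max(1, (len(x) - frame) // hop_in + 1)
    out = [0.0] * ((n_frames - 1) * hop_out + frame)
    for k in range(n_frames):
        src, dst = k * hop_in, k * hop_out
        for n in range(frame):
            if src + n < len(x):
                out[dst + n] += x[src + n] * window[n]
    return out

def resample_linear(x, ratio):
    """Read `ratio` input samples per output sample (linear interpolation)."""
    n_out = int((len(x) - 1) / ratio) + 1
    out = []
    for i in range(n_out):
        pos = i * ratio
        j = int(pos)
        frac = pos - j
        nxt = x[min(j + 1, len(x) - 1)]
        out.append((1 - frac) * x[j] + frac * nxt)
    return out

def pitch_shift(x, ratio, frame=1024):
    """Multiply pitch by `ratio`, keeping the duration roughly unchanged."""
    stretched = ola_stretch(x, ratio, frame)  # step 1: longer, same pitch
    return resample_linear(stretched, ratio)  # step 2: original length, new pitch
```

For interactive use the same two steps would run block by block, and the delay is then on the order of one frame (1024 samples is about 23 ms at 44.1 kHz), which is the "minimal delay" trade-off mentioned above.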
On Sunday, July 12, 2020 at 8:58:29 PM UTC-7, Piotr Mancini wrote:

(snip)

> The question is actually more general than simple signal
> intensity and frequency. I need to "modulate" a signal based
> on user's activity. Any recommendations are welcome.
This reminds me of a seminar last year (that is, pre-Covid) where I found myself wondering about an accent remover. The seminar speaker had a strong accent, which made the talk hard to understand. (That was in CSE, too!)

At a later seminar, related to deep learning and neural nets, someone had a system that might be able to do it. It would use one voice as a training set, then convert others into that voice.

But note that this could also be done using a speech to text system, followed by a text to speech synthesizer. Not real time, but maybe close enough.