DSPRelated.com

Video-equivalent of "pitch-shifting."

Started by Radium August 21, 2007
On 8/23/2007, robert bristow-johnson posted this:
> On Aug 22, 1:37 am, isw <i...@witzend.com> wrote:
>>
>> The fact that there is essentially no relation between these two
>> entities -- i.e. the data stream is comprised of a sequence of
>> descriptions of a series of still images -- is the reason why what you
>> want to do is almost certainly impossible.
>>
>> If you really want to try, the first step will be to devise a method of
>> recording video that does not quantize the temporal axis; i.e. not using
>> a sequence of still images.
>
> can't we think of the intensity (and chroma components) of a
> particular point (x,y) of a still image as a sampled (at a rate of 30
> Hz) value of a continuous-time signal that represents intensity at
> that point? i.e. we have I(x,y,t) being sampled as I(x,y,n*T). and
> then use some kinda interpolation to hypothetically reconstruct the
> "still" images in between the sequence we are given? i imagine there
> would be some blurring, but if the resolution was very good to start
> with, would that not work. at least as a beginning point?
>
> r b-j

Yes. Radium included that idea in his first thread of the week, but it was not liked by the community[1] :-)

I thought it was perhaps his only cogent idea, but OTOH, I'm not sure where to take it.

[1] At least in the part of the thread that I read.

--
Gene E. Bloch (Gino)
letters617blochg3251 (replace the numbers by "at" and "dotcom")
In article <1187894499.374279.239240@q5g2000prf.googlegroups.com>,
 robert bristow-johnson <rbj@audioimagination.com> wrote:

> On Aug 22, 1:37 am, isw <i...@witzend.com> wrote:
> >
> > The fact that there is essentially no relation between these two
> > entities -- i.e. the data stream is comprised of a sequence of
> > descriptions of a series of still images -- is the reason why what you
> > want to do is almost certainly impossible.
> >
> > If you really want to try, the first step will be to devise a method of
> > recording video that does not quantize the temporal axis; i.e. not using
> > a sequence of still images.
>
> can't we think of the intensity (and chroma components) of a
> particular point (x,y) of a still image as a sampled (at a rate of 30
> Hz) value of a continuous-time signal that represents intensity at
> that point? i.e. we have I(x,y,t) being sampled as I(x,y,n*T). and
> then use some kinda interpolation to hypothetically reconstruct the
> "still" images in between the sequence we are given? i imagine there
> would be some blurring, but if the resolution was very good to start
> with, would that not work. at least as a beginning point?

That's not far from the way PAL (25 FPS)-to-NTSC (29.97 FPS) converters work.

Isaac
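The interpolation idea quoted above can be sketched in a few lines. This is only a minimal illustration of treating each pixel as a sampled signal I(x,y,n*T) and blending linearly between neighbouring frames; it is not a description of how any actual standards converter is built. The function name `interpolate_frames` and the flat-list frame layout are assumptions made for the example.

```python
from fractions import Fraction

def interpolate_frames(frames, src_fps, dst_fps):
    """Resample a list of frames from src_fps to dst_fps.

    Each frame is a flat list of pixel intensities. Every output pixel is a
    linear blend of the same pixel in the two nearest source frames, i.e.
    I(x, y, t) is estimated from I(x, y, n*T) and I(x, y, (n+1)*T).
    """
    step = Fraction(src_fps) / Fraction(dst_fps)  # source frames per output frame
    out = []
    t = Fraction(0)                               # position in source-frame units
    while t <= len(frames) - 1:
        n = int(t)                                # index of the earlier source frame
        w = float(t - n)                          # blend weight of the later frame
        later = frames[min(n + 1, len(frames) - 1)]
        out.append([(1 - w) * a + w * b for a, b in zip(frames[n], later)])
        t += step
    return out

# Three one-pixel frames, doubled in rate: in-between frames are blends.
doubled = interpolate_frames([[0.0], [10.0], [20.0]], 1, 2)

# One second of 25 FPS material resampled to 30000/1001 (about 29.97) FPS.
pal_second = [[float(n)] for n in range(25)]
ntsc = interpolate_frames(pal_second, 25, Fraction(30000, 1001))
```

Blending two frames is exactly where the blurring mentioned in the quoted post comes from: anything that moves between the two source frames shows up as a double image in the interpolated one.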
Ron N. wrote:
> On Aug 23, 10:02 am, isw <i...@witzend.com> wrote:
>> There is an interesting sort-of exception to frame rate conservation,
>> when film source is encoded at 24 FPS (actually about 23.98) and the
>> decoder performs 3-2 pulldown to deliver the NTSC-required 29.97 FPS,
>> but that's not germane to this discussion.
>
> Actually, it is very germane, since 3-2 pulldown is similar
> to how some primitive audio pitch/rate changing hardware worked,
> by duplicating small time domain frames of audio at a fixed
> proportion and rate. Some MPEG decoders do "special effects"
> by varying the frame duplicate/drop fractions to slow down
> or speed up playback using the same mechanism as for pulldown.

I think he means the change from 2:2 pull-down's natural rate of 30fps and TV's 29.97. I call that negligible.

Jerry
--
Engineering is the art of making what you want from things you can get.
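For readers unfamiliar with the 3-2 pulldown discussed in the quote, here is a minimal sketch of the frame-duplication cadence. It ignores field parity (top/bottom) and the 1000/1001 slowdown from 30 to 29.97 FPS; the function names and the `cadence` parameter are made up for this illustration.

```python
from itertools import cycle

def pulldown_fields(frames, cadence=(3, 2)):
    """Repeat each film frame for 3, 2, 3, 2, ... video fields."""
    fields = []
    for frame, repeats in zip(frames, cycle(cadence)):
        fields.extend([frame] * repeats)
    return fields

def pair_fields(fields):
    """Group consecutive fields into two-field (interlaced) video frames."""
    return [tuple(fields[i:i + 2]) for i in range(0, len(fields) - 1, 2)]

# Four film frames become ten fields, i.e. five video frames.
fields = pulldown_fields("ABCD")
video = pair_fields(fields)

# One second of film (24 frames) fills one second of 60-field video.
one_second = pair_fields(pulldown_fields(range(24)))
```

Changing the cadence changes the playback speed without touching the picture content, which is the "special effects" mechanism mentioned in the quote: a cadence of `(6, 4)`, for example, spreads the same 24 film frames over 120 fields, or half speed.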
robert bristow-johnson wrote:
> On Aug 23, 3:37 am, "Ron N." <rhnlo...@yahoo.com> wrote:
>> Some pitch-shifters or time-stretchers also duplicate and
>> blend preceding and following periods of waveforms or
>> spectral frame contents.
>
> when i first saw the thread title, that's what i first thought about.
> actually, not pitch-shifting but more time-scaling. it seems to me
> natural that if they were speeding up or slowing down the motion in
> the video (which means only for the temporal dimension, not either
> "x" or "y"), that would naturally correspond to the same speeding up
> or slowing down of tempo (without pitch change) of the audio. if you
> twist the knob that makes the actress talk faster (Ms. Motormouth), it
> shouldn't be upshifting her pitch to sound like Wendy or Bebe in South
> Park.

But he wants her to talk faster, say the same number of words, and finish in the same time! What's worse, I think he's serious.

Jerry
--
Engineering is the art of making what you want from things you can get.
On Aug 23, 11:31 am, robert bristow-johnson
<r...@audioimagination.com> wrote:

> when i first saw the thread title, that's what i first thought about.
> actually, not pitch-shifting but more time-scaling.
That's the opposite of what I'm looking for.
> it seems to me
> natural that if they were speeding up or slowing down the motion in
> the video (which means only for the temporal dimension, not either
> "x" or "y"), that would naturally correspond to the same speeding up
> or slowing down of tempo (without pitch change) of the audio. if you
> twist the knob that makes the actress talk faster (Ms. Motormouth), it
> shouldn't be upshifting her pitch to sound like Wendy or Bebe in South
> Park.
I want the actress to talk at the same speed, at a lower-pitch, and finish at the same-time without any low-pass filtering.
In article <7ZadnQ5t0t3gtE_bnZ2dnUVZ_hqdnZ2d@rcn.net>,
 Jerry Avins <jya@ieee.org> wrote:

> Ron N. wrote:
> > On Aug 23, 10:02 am, isw <i...@witzend.com> wrote:
> >> There is an interesting sort-of exception to frame rate conservation,
> >> when film source is encoded at 24 FPS (actually about 23.98) and the
> >> decoder performs 3-2 pulldown to deliver the NTSC-required 29.97 FPS,
> >> but that's not germane to this discussion.
> >
> > Actually, it is very germane, since 3-2 pulldown is similar
> > to how some primitive audio pitch/rate changing hardware worked,
> > by duplicating small time domain frames of audio at a fixed
> > proportion and rate. Some MPEG decoders do "special effects"
> > by varying the frame duplicate/drop fractions to slow down
> > or speed up playback using the same mechanism as for pulldown.
>
> I think he means the change from 2:2 pull-down's natural rate of 30fps

30 fps? What is that the "natural rate" of? I'm not aware of anything that runs "naturally" at 30 fps.

Isaac
On Aug 26, 10:23 pm, Radium <gluceg...@gmail.com> wrote:
> On Aug 23, 11:31 am, robert bristow-johnson
> <r...@audioimagination.com> wrote:
>
> > when i first saw the thread title, that's what i first thought about.
> > actually, not pitch-shifting but more time-scaling.
>
> That's the opposite of what I'm looking for.
>
> > it seems to me
> > natural that if they were speeding up or slowing down the motion in
> > the video (which means only for the temporal dimension, not either
> > "x" or "y"), that would naturally correspond to the same speeding up
> > or slowing down of tempo (without pitch change) of the audio. if you
> > twist the knob that makes the actress talk faster (Ms. Motormouth), it
> > shouldn't be upshifting her pitch to sound like Wendy or Bebe in South
> > Park.
>
> I want the actress to talk at the same speed, at a lower-pitch, and
> finish at the same-time without any low-pass filtering.

okay, so that is a real-time pitch shifter. you can buy those things and you can buy plug-ins that do it. so what is the "video-equivalent" to a real-time pitch shifter?

r b-j

(Jerry, i hope i didn't step in a puddle of dung, did i? were you trying to warn me away?)
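To make "same speed, lower pitch, same duration" concrete on the audio side, here is a deliberately crude sketch in the spirit of the grain-duplicating hardware mentioned earlier in the thread. It resamples the signal (which lowers the pitch but lengthens it) and then drops overlapping grains to get back to the original length. Plain overlap-add like this produces audible artifacts and is not claimed to be what any commercial pitch shifter or plug-in does; the function names and the `grain` size are assumptions for the example.

```python
import math

def resample(x, factor):
    """Stretch x to `factor` times its length by linear interpolation.

    Played back at the original sample rate, the result is longer and its
    pitch is scaled by 1/factor.
    """
    n_out = int(round(len(x) * factor))
    out = []
    for i in range(n_out):
        pos = min(i / factor, len(x) - 1)
        n = int(pos)
        w = pos - n
        nxt = x[min(n + 1, len(x) - 1)]
        out.append((1 - w) * x[n] + w * nxt)
    return out

def time_scale(x, factor, grain=512):
    """Change duration by `factor` without resampling (plain overlap-add).

    Hann-windowed grains are read every hop/factor samples and written every
    hop samples, so grains are skipped (factor < 1) or repeated (factor > 1).
    """
    hop = grain // 2
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / grain) for i in range(grain)]
    n_out = int(round(len(x) * factor))
    y = [0.0] * (n_out + grain)
    k = 0
    while k * hop < n_out:
        start = int(round(k * hop / factor))
        for i in range(grain):
            if start + i < len(x):
                y[k * hop + i] += window[i] * x[start + i]
        k += 1
    return y[:n_out]

def pitch_shift(x, ratio, grain=512):
    """Scale pitch by `ratio` (< 1 lowers it) while keeping the duration."""
    stretched = resample(x, 1 / ratio)        # lower pitch, longer signal
    y = time_scale(stretched, ratio, grain)   # drop grains to restore duration
    y = y[:len(x)]
    return y + [0.0] * (len(x) - len(y))      # pad if rounding left it short

# Half a second of a 440 Hz tone at 8 kHz, shifted down one octave.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(4000)]
lowered = pitch_shift(tone, 0.5)
```

The output has the same number of samples as the input, so the "actress" finishes at the same time; only the pitch has moved.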
On Aug 26, 8:43 pm, robert bristow-johnson <r...@audioimagination.com>
wrote:

> so what is the "video-equivalent" to a real-time pitch shifter?

I wish I knew. This is so interesting for me yet so difficult for me to answer.

What is the "video-equivalent" to a real-time pitch shifter if the video is B&W?

Since I've been giving wrong answers to my questions, I'll definitely need guidance.

I do know that video-frequency [in B&W video, not color] has two elements:

1. Temporal frequency
2. Spatial frequency

#1 only applies if the video consists of changing visual signals [such as a movie or show].

#2 applies to all video signals -- including still images.

In color video, there is a 3rd element [which is irrelevant to this discussion] and that relates to the wavelengths of light in the video.
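The distinction between the two elements listed above can be illustrated with a synthetic test pattern. The sketch below builds one scanline of a drifting sinusoidal grating in which spatial frequency `fx` and temporal frequency `ft` are independent parameters; it only shows that the two can be set separately and does not attempt to answer what a video pitch shifter would be. The function name and parameters are invented for this example.

```python
import math

def grating(width, n_frames, fx, ft):
    """One scanline of a drifting grating.

    I(x, t) = 0.5 + 0.5 * cos(2*pi*(fx*x - ft*t)), where fx is in cycles
    per pixel (spatial frequency) and ft is in cycles per frame (temporal
    frequency). Returns a list of frames, each a list of intensities.
    """
    return [[0.5 + 0.5 * math.cos(2 * math.pi * (fx * x - ft * t))
             for x in range(width)]
            for t in range(n_frames)]

# A still image: spatial frequency only, every frame is identical.
still = grating(16, 4, fx=0.125, ft=0.0)

# The same pattern drifting: each pixel now also oscillates in time,
# repeating every 1/ft = 4 frames.
moving = grating(16, 9, fx=0.125, ft=0.25)
```

With `ft=0.0` the frames never change, which matches the remark that temporal frequency only applies to changing images, while the spatial pattern is present in both cases.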
Radium wrote:

   ...

> I want the actress to talk at the same speed, at a lower-pitch, and
> finish at the same-time without any low-pass filtering.

That's audio pitch shifting. What has it to do with video? You wanted something else in the past.

Jerry
--
Engineering is the art of making what you want from things you can get.
On Aug 26, 9:47 pm, Jerry Avins <j...@ieee.org> wrote:

> Radium wrote:
> > I want the actress to talk at the same speed, at a lower-pitch, and
> > finish at the same-time without any low-pass filtering.
> That's audio pitch shifting. What has it to do with video?
I want the video-equivalent of that.