DSPRelated.com
Forums

Question about the way Smith defines frames of an input signal for the STFT

Started by maxplanck July 12, 2008
In the first equation on this page:

http://ccrma.stanford.edu/~jos/parshl/Short_Time_Fourier_Transform_STFT.html

The mth frame of the input signal is defined as: x(n-mR), where R is the
hop size and n is the sequence of input signal sample index numbers of the
first frame (m=0).  This makes sense to me, except for the fact that I
would expect the mth frame of the input signal to be defined as: x(n+mR). 


The minus sign only makes sense to me if we are using negative sample index numbers
(0,-1,-2,...), which I'm sure the author does not intend, or if we are
analyzing the signal backwards, which Dr. Smith does mention later in his
paper on PARSHL as being useful for partial tracking.  However, I think
that he would mention it up front if he intended to incorporate backward
signal analysis into the basic definitions used here.  Does anyone
understand why the author is using a - instead of a + in this definition?

Suppose that you want to shift the graph y = f(x) k units to the right, drawing a new function y = f(x'). Would you make the substitution x' = x+k or x' = x-k? If in doubt, try it both ways.

Jerry
--
Engineering is the art of making what you want from things you can get.

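Jerry's "try it both ways" can be checked numerically. A minimal Python sketch, assuming a made-up function f that is 1 only at x = 0 and a shift of k = 3 (both are illustrative choices, not taken from the thread):

```python
# Hypothetical test function: a single "bump" located at x = 0.
def f(x):
    return 1 if x == 0 else 0

k = 3               # shift amount, assumed positive for this illustration
xs = range(-5, 6)   # sample points at which both substitutions are evaluated

# Find where the bump ends up under each substitution.
peak_minus = [x for x in xs if f(x - k) == 1]  # graph of y = f(x - k)
peak_plus = [x for x in xs if f(x + k) == 1]   # graph of y = f(x + k)

print("f(x - k) has its bump at x =", peak_minus)
print("f(x + k) has its bump at x =", peak_plus)
```

Comparing the two printed positions with the original bump at x = 0 shows which substitution moves the graph to the right.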
If the signal is defined from n=0 to N, and if our window is defined from
n=0 to W where W is a positive number, wouldn't we want to shift the signal
to the left?

If we shift the signal to the right, then the second frame will contain
some undefined/zero samples of the signal, and after a few hops there will
be no signal inside the window.

Is one of the assumptions that I'm making here wrong?


One says move the graph right, one says move the signal left.

The original question was about numbers. Instead of arguing semantics, did you plug in the numbers Jerry suggested? What did you get?

Dale B. Dalrymple

This is how I plugged in the numbers:

Let's say the signal is x(n)=n, defined from n=0 to N. If we consider y=x(n-1), then y is undefined at n=0, since x(n) is undefined at n=-1, no?

Sorry, I'm sure this is simple and it's just having trouble passing through a thick part of my skull. I have trouble sometimes with simple stuff.

The only time you need to shift is when n > N and you are performing the second and subsequent frame FFTs. Why would you ever consider shifting at n = 0 except as a troll?

Dale B. Dalrymple

People often do STFT with overlapping frames, don't they? I was using a one sample hop size just to keep the example simple. If I've made mistakes it's not on purpose. I've tried very hard to figure this out on my own, and am stuck. I'm hoping that people will tell me specifically where and why I'm wrong, as well as what the correct way is and why.

My question boils down to this: In Smith's definition of the mth frame xm(n), the sample indices included in the current frame DECREASE as m increases. I.e. n > n-R > n-2R > n-3R since R is positive. We are essentially stepping backwards in terms of sample index as m increases. Illustrated here step by step:

Our first analysis frame = x(-Mh), x(-Mh+1), ..., x(-1), x(0), x(1), ..., x(Mh-1), x(Mh)

Our second analysis frame = x(-Mh-R), x(-Mh+1-R), ..., x(-1-R), x(0-R), x(1-R), ..., x(Mh-1-R), x(Mh-R)

Our third analysis frame = x(-Mh-2R), x(-Mh+1-2R), ..., x(-1-2R), x(0-2R), x(1-2R), ..., x(Mh-1-2R), x(Mh-2R)

If our first analysis frame is the end of the signal, then we are analyzing the signal from back to front. If our first analysis frame is the beginning of the signal, then we are stepping backwards after the first frame into undefined sample index territory (i.e. sample indices which have no signal level assigned to them).

Can someone please correct me?

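The frame listing above can be reproduced with a few lines of Python. This is only a sketch of the literal reading xm(n) = x(n - mR) with n held to the same range -Mh..Mh in every frame; the helper name frame_indices and the values R = 3, Mh = 1 are assumptions chosen for illustration:

```python
def frame_indices(m, R, Mh):
    """Indices of x read by frame m when xm(n) = x(n - m*R), n = -Mh..Mh."""
    return [n - m * R for n in range(-Mh, Mh + 1)]

R, Mh = 3, 1  # hypothetical hop size and window half-length

# List the indices of x touched by the first three frames.
for m in range(3):
    print("frame", m, "reads x at", frame_indices(m, R, Mh))
```

Under this reading, each successive frame reads a block of indices R samples lower than the previous one, which is the backward stepping described in the post.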
As I said already, the second frame is calculated for larger values of n than the first frame. The value of n in the n-mR frame of x(n) is increasing so that the indices in xsubm(n), the array used in the transform, are the same as for the first frame.

Note that it is not x(n) (where n increases through the frames) but xsubm(n) (n defined to be in the same range for all frames) that is transformed. By definition, "n" in xsubm() does not have the same range as "n" in x() for any frame but the first. This is the first equation in the document you cite.

Dale B. Dalrymple

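One way to read this reply can be sketched in Python: the index n used on x() advances by R with each frame, and subtracting mR maps it back to the same local range for every frame. The helper name frame_index_pairs and the values R = 3, Mh = 1 are hypothetical; this only illustrates the index bookkeeping described in the reply and is not a statement of Smith's definition:

```python
def frame_index_pairs(m, R, Mh):
    """Pairs (n, n - m*R): n is the position in x(), n - m*R the local index."""
    # Under this reading, frame m covers positions m*R - Mh .. m*R + Mh of x().
    positions = range(m * R - Mh, m * R + Mh + 1)
    return [(n, n - m * R) for n in positions]

R, Mh = 3, 1  # hypothetical hop size and window half-length

for m in range(3):
    pairs = frame_index_pairs(m, R, Mh)
    print("frame", m)
    print("  n in x():   ", [p[0] for p in pairs])
    print("  local index:", [p[1] for p in pairs])
```

The positions in x() grow from frame to frame while the local index always runs over -Mh..Mh, which is the distinction the reply draws between "n" in x() and "n" in xsubm().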
Thanks a lot, that makes more sense.  I think that the root of my problem
understanding this may have to do with that "equals sign with triangle
above it" symbol.  I thought that symbol only indicated that the author was
making a definition, but from what you said it seems that this symbol must
mean something other than that.  Can you please tell me what the name of
this symbol is, so that I can look up its meaning?  

I should be able to deduce the symbol's meaning from the verbal
description of what it does here that you posted, but I am just interested
in seeing a general definition.

Thanks again
On thinking about this a bit more, some things seem vague and could be
cleared up by answering this question:


When you say this: "The value of n in the n-mR frame of x(n) is
increasing so that the indices in xsubm(n), the array used in the
transform are the same as for the first frame."

Do you mean that if:

n=-1,0,1 when m=0

and if our hop size R = 3

then:

n=2,3,4 when m=1


and

xsub0(n)=xsub0(-1),xsub0(0),xsub0(1)

and

xsub1(n-3) = xsub1(2-3),xsub1(3-3),xsub1(4-3) =
xsub1(-1),xsub1(0),xsub1(1)

and 

xsub0(-1)=x(0)
xsub0(0)=x(1)
xsub0(1)=x(2)
xsub1(-1)=x(3)
xsub1(0)=x(4)
xsub1(1)=x(5)



