Hi,

I'm trying to obtain the REPS for some audio data, but it's proving quite difficult.

The link at
http://books.google.co.uk/books?id=7yJXMJUw03oC&pg=PA253&lpg=PA253&dq=reps+%22ripple+enhanced%22&source=web&ots=SKImt8khEd&sig=YuV15zt9gQQxE2iLxzJMdCB8kJ4&hl=en&sa=X&oi=book_result&resnum=1&ct=result#PPA253,M1

shows the only explanation I can find (I've also read the paper referenced in the link; the book author has just copied and pasted from the paper).

So this is my understanding of how to obtain the REPS:

1. Obtain the linear power spectrum of a part of the signal, and square it to make it a bit more useful. I think that's just the usual fft output, then squared, i.e.
X=fft(signal)
X=X.^2

2. Now get rid of some lower frequency components
X(freq<50)=0;
now take the ifft of X:
X=ifft(X)

3. Take the DFT of the result
X=fft(X);

My method clearly isn't giving me the results I need, and I don't see the point in doing an inverse DFT and then a DFT again straight afterwards.

I must be looking at this the wrong way.

Any help would be greatly appreciated
Adam
Ripple Enhanced Power Spectrum (REPS)
Started by ●January 22, 2009
Reply by ●January 22, 2009
adamchapman wrote:
> Hi,
>
> I'm trying to obtain the REPS for some audio data, but it's proving
> quite difficult.
>
> The link at
> http://books.google.co.uk/books?id=7yJXMJUw03oC&pg=PA253&lpg=PA253&dq=reps+%22ripple+enhanced%22&source=web&ots=SKImt8khEd&sig=YuV15zt9gQQxE2iLxzJMdCB8kJ4&hl=en&sa=X&oi=book_result&resnum=1&ct=result#PPA253,M1
>
> shows the only explanation I can get (I've also read the paper
> referenced in the link - the book author has just copied and pasted
> from the paper).
>
> So this is my understanding of how to obtain the REPS:

I needed to figure out from the subject line what REPS means. Tsk tsk.

> 1. Obtain the linear power spectrum of a part of the signal, and
> square it to make it a bit more useful. I think that's just the usual
> fft output, then squared, i.e.
> X=fft(signal)
> X=X.^2

Not "to make it a bit more useful" but to convert it to a power spectrum.
X=fft(signal)
Power spectrum = x^2

...

Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by ●January 22, 2009
On Jan 22, 6:12 pm, Jerry Avins <j...@ieee.org> wrote:
> adamchapman wrote:
> > So this is my understanding of how to obtain the REPS:
>
> I needed to figure out from the subject line what REPS means. Tsk tsk.
>
> > 1. Obtain the linear power spectrum of a part of the signal, and
> > square it to make it a bit more useful. I think that's just the usual
> > fft output, then squared, i.e.
> > X=fft(signal)
> > X=X.^2
>
> Not "to make it a bit more useful" but to convert it to a power spectrum.
> X=fft(signal)
> Power spectrum = x^2

Yes, I probably should have written it in a way that makes more sense. What really doesn't make sense to me, though, is why the method involves performing an inverse DFT and then taking a DFT of that!

Did it look the same to you from the book I linked?
Reply by ●January 22, 2009
On Jan 22, 6:18 pm, adamchapman <adamchapman1...@hotmail.co.uk> wrote:
> Yes I probably should have written it in a way that makes more sense.
> What really doesn't make sense to me though is why the method involves
> performing an inverse DFT, then taking a DFT of that!
>
> Did it look the same to you from the book I linked?

Oh, and I said "more useful" referring to the squaring of the power spectrum. The book mentions you normally get better results if you square the spectrum.
Reply by ●January 22, 2009
On Jan 22, 8:30 am, adamchapman <adamchapman1...@hotmail.co.uk> wrote:
> So this is my understanding of how to obtain the REPS:
>
> 1. Obtain the linear power spectrum of a part of the signal, and
> square it to make it a bit more useful. I think that's just the usual
> fft output, then squared, i.e.
> X=fft(signal)
> X=X.^2

That's X = |fft(signal)|^2

> 2. Now get rid of some lower frequency components
> X(freq<50)=0;
> now take the ifft of X:
> X=ifft(X)

You have reversed the order of these operations.

> 3. take DFT of the result
> X=fft(X);
>
> My method clearly isn't giving me the results I need,

Since you haven't told us the point of this, we can hardly speculate on what would help, except that you might try to correctly perform the calculation the book suggests.

> and I don't see
> the point in doing an inverse DFT then a DFT again straight
> afterwards.
>
> I must be looking at this the wrong way.
>
> Any help would be greatly appreciated
> Adam

Dale B. Dalrymple
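The difference between X.^2 and |fft(signal)|^2 can be seen on a toy signal. The following is only a sketch in Python rather than the MATLAB-style notation of the thread; the `dft` helper, the test sine and the variable names are illustrative assumptions, not anything taken from the book.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) DFT; stands in for fft(signal) in the thread."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

# Hypothetical test signal: a sine with a period of 8 samples, 32 samples long.
signal = [math.sin(2 * math.pi * t / 8) for t in range(32)]
X = dft(signal)

squared = [v * v for v in X]       # what X.^2 computes: still complex-valued
power = [abs(v) ** 2 for v in X]   # |X|^2: real and non-negative

# Strongest bin in the first half of the spectrum (positive frequencies).
peak_bin = max(range(len(power) // 2), key=lambda k: power[k])
print(peak_bin, squared[peak_bin], power[peak_bin])
```

Squaring the complex FFT output keeps a phase term, so the "power" at the sine's bin comes out negative here, whereas the squared magnitude is always real and non-negative.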
Reply by ●January 22, 2009
On Jan 22, 10:21 am, adamchapman <adamchapman1...@hotmail.co.uk> wrote:
> Oh and I said more useful referring to the squaring of the power
> spectrum. The book mentions you normally get better results if you
> square the spectrum

The book -calculates- the power spectrum, it does not -square- the power spectrum.

Dale B. Dalrymple
Reply by ●January 22, 2009
Thanks Dale

On Jan 22, 6:56 pm, dbd <d...@ieee.org> wrote:
> On Jan 22, 8:30 am, adamchapman <adamchapman1...@hotmail.co.uk> wrote:
> > 1. Obtain the linear power spectrum of a part of the signal, and
> > square it to make it a bit more useful. I think that's just the usual
> > fft output, then squared, i.e.
> > X=fft(signal)
> > X=X.^2
>
> That's X = |fft(signal)|^2
>
> > 2. Now get rid of some lower frequency components
> > X(freq<50)=0;
> > now take the ifft of X:
> > X=ifft(X)
>
> You have reversed the order of these operations.

I reversed them because it seemed weird to take out certain frequency components when no longer in the frequency domain; it looked like a typo in the book to me.

> > 3. take DFT of the result
> > X=fft(X);
> >
> > My method clearly isn't giving me the results I need,
>
> Since you haven't told us the point of this, we can hardly speculate
> on what would help, except that you might try to correctly perform the
> calculation the book suggests.

I'm trying to calculate the fundamental frequency (F0) of speech data, using equations 11.12 and 11.13 in the book. I've obtained what I believe is the power spectrum, but I don't see how it is "ripple enhanced" since it is obtained in the same way as any standard power spectrum.

My F0 candidates (denoted fn in the book) are 50, 51, 52, ..., 499, 500. This is the widely suggested search range for human voices. Something that does confuse me, though, is how to calculate R(l,k.fn) in eq. 11.12.

For each searched harmonic of the F0 candidates (k*fn):
   Should I calculate R as a sum of all the spectrum data within the range k*fn-fn/2 < f < k*fn+fn/2
   or use that range with a triangular weighting filter
   or just interpolate between the spectral data points to estimate the power at the multiples of fn?

Sorry if it seems I was not asking for help with the right part. I was looking for problems in the way I obtained the spectrum because it was the bit I was least confident about.
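On the question of why an inverse DFT would be followed straight away by a DFT: the round trip is only a no-op if nothing is changed in between. The sketch below takes one possible reading of Dale's remark about the order of operations (inverse DFT of the power spectrum first, then zeroing the low-index terms, then DFT again). That reading, the cutoff `LOW`, the test frame and all names are assumptions made for illustration, not the book's procedure.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive O(N^2) DFT (sign=-1) or unscaled inverse DFT (sign=+1)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT with the usual 1/N scaling."""
    n = len(X)
    return [v / n for v in dft(X, sign=+1)]

N = 64
# Hypothetical "voiced" frame: three harmonics of a 16-sample period,
# with amplitudes falling off as 1/k to mimic a spectral envelope.
frame = [sum(math.cos(2 * math.pi * k * t / 16) / k for k in (1, 2, 3))
         for t in range(N)]
power = [abs(v) ** 2 for v in dft(frame)]

# Step A: inverse DFT of the power spectrum (an autocorrelation-like sequence).
r = [v.real for v in idft(power)]

# Untouched round trip: DFT(IDFT(power)) simply returns the power spectrum.
round_trip = [v.real for v in dft(r)]

# Step B: zero the low-index terms and their mirror images,
# so that the sequence stays symmetric and its DFT stays real.
LOW = 4  # hypothetical cutoff, in samples
r_cut = list(r)
for i in range(LOW):
    r_cut[i] = 0.0
    if i:
        r_cut[N - i] = 0.0

# Step C: DFT again. The slowly varying part of the spectrum is now removed.
enhanced = [v.real for v in dft(r_cut)]
peak_bin = max(range(N // 2), key=lambda k: enhanced[k])
print(peak_bin)
```

With the low-index terms removed, the result no longer equals the plain power spectrum: its mean is zero and only the harmonic ripple stands out, which may be what "ripple enhanced" refers to.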
Reply by ●January 22, 2009
On Jan 22, 7:24 pm, adamchapman <adamchapman1...@hotmail.co.uk> wrote:
> Im trying to calculate the fundamental frequency (F0) of speech data,
> using equations 11.12 and 11.13 in the book.
>
> Something that does confuse me though is what to do to calculate
> R(l,k.fn) in eq. 11.12.
>
> For each searched harmonic of the f0 candidates (k*f0):
>    Should I calculate R as a sum of all the Spectrum data within a
>    range
>    or use that range with a triangular weighting filter
>    or just interpolate between the spectral data points to estimate
>    the power at the multiples of f0?

OK, I made a silly mistake. In equation 11.13 I was taking the average power of just the points related to harmonics, where it should be the average power over the whole spectrum.

And it looks like R(l,kfn) should be found by interpolating between existing spectrum points.

Sorry if I wasted your time.
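The two conclusions above (interpolate the spectrum at k*fn, and normalise by the average power of the whole spectrum) can be sketched as follows. Equations 11.12 and 11.13 are not reproduced in this thread, so `harmonic_score` is a hypothetical stand-in rather than the book's formula; the synthetic spectrum, the 10 Hz bin spacing and the five harmonics are likewise assumptions.

```python
import math

def interp_power(spectrum, bin_hz, freq_hz):
    """Linearly interpolate the spectrum at an arbitrary frequency in Hz."""
    pos = freq_hz / bin_hz
    lo = int(math.floor(pos))
    if lo >= len(spectrum) - 1:
        return spectrum[-1]
    frac = pos - lo
    return (1.0 - frac) * spectrum[lo] + frac * spectrum[lo + 1]

def harmonic_score(spectrum, bin_hz, f_cand, n_harmonics):
    """Mean interpolated power at k*f_cand, relative to the mean power
    of the whole spectrum (not just the harmonic points)."""
    mean_power = sum(spectrum) / len(spectrum)
    harmonic_mean = sum(interp_power(spectrum, bin_hz, k * f_cand)
                        for k in range(1, n_harmonics + 1)) / n_harmonics
    return harmonic_mean / mean_power

# Hypothetical power spectrum: 401 bins, 10 Hz apart (0 to 4000 Hz),
# with bumps at the first five harmonics of a 120 Hz voice.
BIN_HZ = 10.0
TRUE_F0 = 120.0
spectrum = [0.01 + sum(math.exp(-((b * BIN_HZ - h * TRUE_F0) / 15.0) ** 2)
                       for h in range(1, 6))
            for b in range(401)]

# Candidates 50, 51, ..., 500 Hz, as in the post above.
candidates = range(50, 501)
best = max(candidates, key=lambda f: harmonic_score(spectrum, BIN_HZ, f, 5))
print(best)
```

Interpolation matters here because most 1 Hz candidates put their harmonics between spectrum bins; summing over a band or applying a triangular weight would be alternative ways to read the spectrum at those points.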
Reply by ●January 22, 2009
On Jan 22, 10:30 am, adamchapman <adamchapman1...@hotmail.co.uk> wrote:
> So this is my understanding of how to obtain the REPS:
>
> 1. Obtain the linear power spectrum of a part of the signal, and
> square it to make it a bit more useful. I think that's just the usual
> fft output, then squared, i.e.
> X=fft(signal)
> X=X.^2
>
> 2. Now get rid of some lower frequency components
> X(freq<50)=0;
> now take the ifft of X:
> X=ifft(X)
>
> 3. take DFT of the result
> X=fft(X);
>
> My method clearly isn't giving me the results I need, and I don't see
> the point in doing an inverse DFT then a DFT again straight
> afterwards.

Adam,

You have indeed reversed the last part of the algorithm. To better understand why, let's go back to the cepstrum for a moment. Back in the '80s, in obtaining the pitch period of speech, often the log{FFT[x(n)]} (which, by the way, is the Fourier transform of the cepstrum) would be windowed such that frequency components below the pitch frequency are set to zero. The IFFT would then be taken.

The author is doing the same for his algorithm. Look at the title of the section. It is for voicing detection of the fundamental frequency, which is another way of saying pitch detection. He takes |FFT[x(n)]|^2, removes the lower frequency components (frequencies below the pitch frequency are a good start), then takes the IFFT. It is equivalent to the cepstrum process with the substitution of the squared magnitude instead of the log of the magnitude.

Maurice Givens
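Read literally, the sequence described in this post is: squared magnitude of the FFT, zero the low-frequency bins, inverse FFT, then look for a peak at the pitch period. A toy sketch of that sequence follows. The sample rate `FS`, the cutoff bin, the lag search range and the test frame are all hypothetical, and a naive DFT stands in for the FFT.

```python
import cmath
import math

def dft(x, sign=-1):
    """Naive O(N^2) DFT (sign=-1) or unscaled inverse DFT (sign=+1)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

N = 64
FS = 2000.0  # hypothetical sample rate in Hz

# Hypothetical frame: a DC offset plus three harmonics of a 16-sample period.
frame = [0.5 + sum(math.cos(2 * math.pi * k * t / 16) for k in (1, 2, 3))
         for t in range(N)]

# Squared magnitude of the spectrum.
power = [abs(v) ** 2 for v in dft(frame)]

# Remove the lowest bins (and their mirror images at the top of the array).
CUTOFF_BIN = 2  # hypothetical: bins below this count as "below the pitch"
for k in range(CUTOFF_BIN):
    power[k] = 0.0
    if k:
        power[N - k] = 0.0

# Inverse DFT: a sequence indexed by lag, playing the role the cepstrum
# plays when the log magnitude is used instead of the squared magnitude.
lagged = [v.real / N for v in dft(power, sign=+1)]

# The strongest value within a plausible range of lags marks the pitch period.
MIN_LAG, MAX_LAG = 4, 24  # 500 Hz down to about 83 Hz at this sample rate
pitch_lag = max(range(MIN_LAG, MAX_LAG + 1), key=lambda n: lagged[n])
f0_hz = FS / pitch_lag
print(pitch_lag, f0_hz)
```

The lag search range is restricted because a perfectly periodic frame produces equally strong peaks at every multiple of the pitch period.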
Reply by ●January 22, 2009
adamchapman wrote:

...

> I reversed them because it seemed weird to take out certain frequency
> components when no longer in the frequency domain, looked like a typo
> in the book to me.

...

> Sorry if it seems I was not asking for help with the right part, I was
> looking for problems in the way I obtained the spectrum because it was
> the bit I was least confident about

You followed the recipe, introducing your own substitutions in a few places. (That's reasonable.) When the glop was cooked, it tasted lousy. (That's not necessarily surprising.) You then assumed that it must be the recipe at fault, since replacing salt with cumin made sense to you. (That's not reasonable.)

...

Jerry
--
Engineering is the art of making what you want from things you can get.






