Reply by waywardgeek December 29, 2010
>On Dec 29, 5:55 am, "waywardgeek" <waywardgeek@n_o_s_p_a_m.gmail.com> wrote:
>>..
>
>> >I lean toward 'time-aliasing' as the process seems to resemble the
>> >effect of aliasing in the frequency domain.
>> >Dale B. Dalrymple
>>
>> I see. There are algorithms (WSOLA, PICOLA, TD-PSOLA) that are time domain
>> only algorithms, where steps 2, 3, and 4 don't exist.
>
>3) still exists but is something like resampling. If there weren't a
>modification, what would be the point? And if the resampling is simply
>a sample rate change it might be efficiently performed via 2), 3) and
>4).
>
>> OLA does refer to
>> generation of the output in those cases, where what I'm doing refers to
>> generating the input. "Time-aliasing" describes what MDCT algorithms do,
>
>In the case of MDCT algorithms, it is the entire system 1) to 5) with
>proper windowing in both 1) and 5)
>that works to achieve aliasing cancellation.
>
>> which is more like what I'm doing. Is this right? Shall I use the term
>> "Time-aliased Hann?"
>> ...
>
>That works for me. I think it would cause less confusion than
>overloading OLA.
>
>Dale B. Dalrymple
So, after reading a bit of that Ph.D. thesis, I checked the spectral noise of a Hann window with 2X width to compare against the 2X time-aliased version. Worst-case spectral leakage in the general case is nearly identical. However, in the pitch-synchronous case, the 2X Hann window has low values at every other frequency, as would be expected, since the window is double the fundamental pitch period. There seems to be a good reason to use two pitch periods with time aliasing, as this avoids the low values for odd harmonics, without losing effective time or frequency resolution.

It also seems that Hamming windows have too much spectral leakage for high dynamic range spectrograms, and that Hann or Blackman windows do better. I'll keep reading the thesis. I'm sure I'll learn a lot more there.

Bill
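A minimal NumPy sketch of the pitch-synchronous comparison described above. It assumes the window formula quoted elsewhere in this thread, w(n) = (1 - cos(pi*n/N))/2 over 2N samples. The pitch period N = 64 and the three-harmonic test signal are made up for illustration and are not Bill's test data. It checks one exact property: the N-point FFT of the time-aliased frame equals the even-numbered bins of the 2N-point FFT of the 2X Hann frame, and those even bins are the ones that land on the pitch harmonics.

```python
import numpy as np

N = 64                                   # assumed pitch period in samples
n = np.arange(2 * N)

# Hypothetical test signal: harmonics 1, 2, 3 of the pitch, two full periods.
x = sum(np.cos(2 * np.pi * h * n / N + 0.3 * h) / h for h in (1, 2, 3))

# 2X Hann window spanning both pitch periods (formula quoted in this thread).
w = (1.0 - np.cos(np.pi * n / N)) / 2.0
windowed = w * x

# 2X Hann: 2N-point FFT. Odd bins fall between harmonics, even bins on them.
spec_2x = np.fft.fft(windowed)

# Time-aliased Hann: fold the second half onto the first, then N-point FFT.
aliased = windowed[:N] + windowed[N:]
spec_aliased = np.fft.fft(aliased)

# Folding in time is the same as keeping every other bin of the longer FFT.
print(np.allclose(spec_aliased, spec_2x[::2]))
```

So the folded frame keeps the bins of the 2X analysis that sit on the harmonics and drops the in-between bins, at half the FFT size.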
Reply by dbd December 29, 2010
On Dec 29, 5:55 am, "waywardgeek" <waywardgeek@n_o_s_p_a_m.gmail.com> wrote:
> ..
> >I lean toward 'time-aliasing' as the process seems to resemble the
> >effect of aliasing in the frequency domain.
> >Dale B. Dalrymple
>
> I see. There are algorithms (WSOLA, PICOLA, TD-PSOLA) that are time domain
> only algorithms, where steps 2, 3, and 4 don't exist.
3) still exists but is something like resampling. If there weren't a modification, what would be the point? And if the resampling is simply a sample rate change it might be efficiently performed via 2), 3) and 4).
> OLA does refer to
> generation of the output in those cases, where what I'm doing refers to
> generating the input. "Time-aliasing" describes what MDCT algorithms do,
In the case of MDCT algorithms, it is the entire system 1) to 5) with proper windowing in both 1) and 5) that works to achieve aliasing cancellation.
> which is more like what I'm doing. Is this right? Shall I use the term
> "Time-aliased Hann?"
> ...
That works for me. I think it would cause less confusion than overloading OLA.

Dale B. Dalrymple
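The MDCT remark in this post (aliasing cancels only across the whole chain, with windowing at both ends) can be illustrated with a small NumPy sketch. Everything here is an assumption for illustration: the direct matrix form of the MDCT, the sine window, the 2/N synthesis scaling, and the function names `mdct_basis` and `mdct_round_trip` are hypothetical and not taken from any codec. Step 3 is left as the identity.

```python
import numpy as np

def mdct_basis(N):
    """Cosine basis mapping 2N time samples to N MDCT coefficients."""
    n = np.arange(2 * N)[:, None]
    k = np.arange(N)[None, :]
    return np.cos(np.pi / N * (n + 0.5 + N / 2) * (k + 0.5))  # shape (2N, N)

def mdct_round_trip(x, N):
    """Windowed MDCT, inverse MDCT, and overlap-add with hop N."""
    C = mdct_basis(N)
    # Sine window: w[n]**2 + w[n + N]**2 == 1 (Princen-Bradley condition).
    w = np.sin(np.pi * (np.arange(2 * N) + 0.5) / (2 * N))
    out = np.zeros(len(x))
    for start in range(0, len(x) - 2 * N + 1, N):
        frame = w * x[start:start + 2 * N]       # 1) windowed 2N-sample block
        coeffs = frame @ C                       # 2) analysis: N coefficients
        # 3) no modification in this sketch
        block = w * (C @ coeffs) * (2.0 / N)     # 4) synthesis, windowed again
        out[start:start + 2 * N] += block        # 5) overlap-add across blocks
    return out

rng = np.random.default_rng(0)
N = 16
x = rng.standard_normal(8 * N)
y = mdct_round_trip(x, N)

# Each single block is aliased in time; the sum of overlapping blocks is not.
print(np.allclose(y[N:-N], x[N:-N]))
```

Only samples covered by two blocks are compared, since the first and last N samples have no partner block to cancel their aliasing.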
Reply by waywardgeek December 29, 2010
>On Dec 28, 2:12 pm, "waywardgeek" <waywardgeek@n_o_s_p_a_m.gmail.com> wrote:
>> ...
>
>> I read the OLA term many times
>> in time domain algorithms in the PSOLA family (WSOLA, TD-SOLA, PICOLA,
>> MBROLA), and the overlap-add is more or less the same in all of them,
>> overlapping two sound samples and fading one in while the other is faded
>> out. That's why I though OLA-Hann was fairly descriptive, but I was
>> unaware of OLA as a name involved in FFT land.
>>
>> Is there a better label?
>>
>> Thanks,
>> Bill
>
>I think we are talking about algorithms performing all or part of the
>following generalized processing chain:
>
>1) Generate time sequential data blocks from continuing stream of time
>samples
>2) Analysis (transform) of block
>3) Modification/transmission of block
>4) Synthesis (transform or inverse transform) of block
>5) Regenerate continuing stream of time samples from time sequential
>blocks.
>
>Your approach is applied in 1). It involves the overlap and sum of
>time sample data segments within a single windowed block. The
>conventional use of 'OLA' refers to a process in 5) used to combine
>time sample data from 2 or more (perhaps windowed) data blocks. This
>is where 'OLA' appears in the PSOLA family. The difference is between
>an analysis function and a synthesis function.
>
>I lean toward 'time-aliasing' as the process seems to resemble the
>effect of aliasing in the frequency domain.
>
>Dale B. Dalrymple
I see. There are algorithms (WSOLA, PICOLA, TD-PSOLA) that are time domain only algorithms, where steps 2, 3, and 4 don't exist. OLA does refer to generation of the output in those cases, where what I'm doing refers to generating the input. "Time-aliasing" describes what MDCT algorithms do, which is more like what I'm doing. Is this right? Shall I use the term "Time-aliased Hann?"

Bill
Reply by dbd December 29, 2010
On Dec 28, 11:24 pm, Fred Marshall <fmarshall_xremove_the...@xacm.org> wrote:
> ...
> I don't think the first agrees with the second.....
>
> w(n) from n=0 to N-1 is only half a von Hann window... isn't it?
>
> At least that's what I get when I calculate it from this very expression.
>
> Fred
I think Bill meant a single (double length) Hann window over two blocks, which are added after each has been weighted by its half of the Hann window. That is what the equations show. That isn't what is usually meant by OLA; your interpretation is more in line with the usual usage of OLA, so we have discussed alternative terms for Bill's process. See the rest of the thread for how we have been proceeding with that.

Dale B. Dalrymple
Reply by Fred Marshall December 29, 2010
On 12/28/2010 7:33 PM, dbd wrote:
> On Dec 28, 5:23 pm, Fred Marshall<fmarshall_xremove_the...@xacm.org> wrote:
>> ...
>>
>> Aha! I've not had the time to ponder this as much as I'd like. And, I
>> didn't ever get an update to the incorrect expressions - or missed that
>> if posted.
>>
>> But, I get what this post is saying.
>> It appears the intent is to overlap the sequences such that the middle
>> of one coincides with the end of another? Is that correct?
>> So, if one starts with 2 length N sequences, the result would be a
>> length 3N/2 sequence that ramps up over the first N/2 samples and ramps
>> down over the last N/2 samples and has a fat middle of N/2 samples?
>>
>> Fred
>
> Fred
>
> Please read the OP's reference:
>
> http://vinux-project.org/ola-hann
>
> and see if that is what you are thinking of. I think the OP's
> terminology may have been misleading. And that is why I have suggested
> the use of different terms that others have already been using.
>
> Dale B. Dalrymple
Dale,

It says:
"Applying a OLA-Hann window is done by first applying a Hann window to two adjacent FFT input frames, and then adding the first frame to the second."

And then it says:

w(n) = (1 - cos(pi*n/N))/2, n = 0 .. 2*N - 1.
x'(n) = w(n)*x(n) + w(n + N)*x(n + N), n = 0 .. N - 1

I don't think the first agrees with the second.....
w(n) from n=0 to N-1 is only half a von Hann window... isn't it?
At least that's what I get when I calculate it from this very expression.

Fred
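A short NumPy sketch of the two quoted expressions may help with the question raised here. It uses the formulas exactly as quoted; the function name `time_aliased_hann` and the frame size N = 8 are hypothetical choices for illustration. It shows that w(n) for n = 0 .. N-1 is indeed only the rising half, that w(n + N) supplies the falling half, and that the two halves sum to one at every n.

```python
import numpy as np

def time_aliased_hann(x, N):
    """Fold a 2N-sample frame into N samples using the quoted formulas."""
    x = np.asarray(x, dtype=float)
    if x.shape[0] != 2 * N:
        raise ValueError("expected exactly 2*N samples")
    n = np.arange(2 * N)
    w = (1.0 - np.cos(np.pi * n / N)) / 2.0        # w(n), n = 0 .. 2N-1
    folded = w[:N] * x[:N] + w[N:] * x[N:]         # x'(n), n = 0 .. N-1
    return folded, w

N = 8
frame, w = time_aliased_hann(np.ones(2 * N), N)
rising, falling = w[:N], w[N:]

print(np.all(np.diff(rising) > 0))     # first N weights only ramp up
print(np.all(np.diff(falling) < 0))    # next N weights only ramp down
print(np.allclose(rising + falling, 1.0))
```

Read this way, the full Hann window spans both input frames (2N samples), and each frame is weighted by one half of it before the sum.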
Reply by dbd December 28, 2010
On Dec 28, 5:23 pm, Fred Marshall <fmarshall_xremove_the...@xacm.org> wrote:
> ...
>
> Aha! I've not had the time to ponder this as much as I'd like. And, I
> didn't ever get an update to the incorrect expressions - or missed that
> if posted.
>
> But, I get what this post is saying.
> It appears the intent is to overlap the sequences such that the middle
> of one coincides with the end of another? Is that correct?
> So, if one starts with 2 length N sequences, the result would be a
> length 3N/2 sequence that ramps up over the first N/2 samples and ramps
> down over the last N/2 samples and has a fat middle of N/2 samples?
>
> Fred
Fred

Please read the OP's reference:

http://vinux-project.org/ola-hann

and see if that is what you are thinking of. I think the OP's terminology may have been misleading. And that is why I have suggested the use of different terms that others have already been using.

Dale B. Dalrymple
Reply by Fred Marshall December 28, 2010
On 12/28/2010 2:12 PM, waywardgeek wrote:

> overlapping two sound samples and fading one in while the other is faded
> out.
>
> Thanks,
> Bill
Aha! I've not had the time to ponder this as much as I'd like. And, I didn't ever get an update to the incorrect expressions - or missed that if posted.

But, I get what this post is saying. It appears the intent is to overlap the sequences such that the middle of one coincides with the end of another? Is that correct? So, if one starts with 2 length N sequences, the result would be a length 3N/2 sequence that ramps up over the first N/2 samples and ramps down over the last N/2 samples and has a fat middle of N/2 samples?

Fred
Reply by dbd December 28, 2010
On Dec 28, 2:12 pm, "waywardgeek" <waywardgeek@n_o_s_p_a_m.gmail.com> wrote:
> ...
> I read the OLA term many times
> in time domain algorithms in the PSOLA family (WSOLA, TD-SOLA, PICOLA,
> MBROLA), and the overlap-add is more or less the same in all of them,
> overlapping two sound samples and fading one in while the other is faded
> out. That's why I though OLA-Hann was fairly descriptive, but I was
> unaware of OLA as a name involved in FFT land.
>
> Is there a better label?
>
> Thanks,
> Bill
I think we are talking about algorithms performing all or part of the following generalized processing chain:

1) Generate time sequential data blocks from continuing stream of time samples
2) Analysis (transform) of block
3) Modification/transmission of block
4) Synthesis (transform or inverse transform) of block
5) Regenerate continuing stream of time samples from time sequential blocks.

Your approach is applied in 1). It involves the overlap and sum of time sample data segments within a single windowed block. The conventional use of 'OLA' refers to a process in 5) used to combine time sample data from 2 or more (perhaps windowed) data blocks. This is where 'OLA' appears in the PSOLA family. The difference is between an analysis function and a synthesis function.

I lean toward 'time-aliasing' as the process seems to resemble the effect of aliasing in the frequency domain.

Dale B. Dalrymple
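The five-step chain above can be sketched in NumPy to show where conventional OLA sits. This is only an illustration under stated assumptions: a periodic Hann window, 50% overlap, an FFT as the transform, and an identity modification by default. The function name `analysis_modify_synthesis` and its parameters are hypothetical.

```python
import numpy as np

def analysis_modify_synthesis(x, block=64, modify=lambda spectrum: spectrum):
    """Run steps 1) to 5) with 50% overlapping Hann-windowed blocks."""
    hop = block // 2
    n = np.arange(block)
    window = 0.5 - 0.5 * np.cos(2 * np.pi * n / block)  # periodic Hann
    out = np.zeros(len(x))
    for start in range(0, len(x) - block + 1, hop):
        frame = window * x[start:start + block]      # 1) windowed data block
        spectrum = np.fft.rfft(frame)                # 2) analysis
        spectrum = modify(spectrum)                  # 3) modification
        frame_out = np.fft.irfft(spectrum, n=block)  # 4) synthesis
        out[start:start + block] += frame_out        # 5) overlap-add (OLA)
    return out

x = np.sin(2 * np.pi * np.arange(512) / 37.0)
y = analysis_modify_synthesis(x)

# Shifted Hann windows sum to one, so doubly covered samples are recovered.
print(np.allclose(y[32:-32], x[32:-32]))
```

The overlap and sum here combines samples from different blocks at the output, in step 5). The time-aliased window discussed in this thread instead sums two segments inside a single block before step 2).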
Reply by waywardgeek December 28, 2010
>On Fri, 24 Dec 2010 17:23:21 -0800 (PST), dbd <dbd@ieee.org> wrote:
>Hi Dale,
> Thanks for your detailed reply.
>
>Here are my two cents. The reason Bill Cox's OLA-Hann
>spectrum has such low sidelobe levels is because
>(1) he only adds two windowed sequences to obtain his 5th
>figure's 500-point sequence, and (2) his original
>1000-point sequence was low in frequency. Those two
>conditions mean that the first and last samples of his
>5th figure's 500-point sequence (the OLA_Hann sequence)
>will be very close in amplitude, which of course leads
>to low sidelobe levels.
Hi, Rick. I agree that side-lobe levels are reduced when the end of the waveform feeds smoothly into the beginning. I think that using exactly two frames, rather than three or more, allows this to be true for any waveform. The first half fades in while the second fades out. Regardless of the input waveform, the result will feed back on itself smoothly.
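A small NumPy check of the wraparound argument above, using the window formula quoted elsewhere in this thread. The test tone (3.37 cycles across the 2N-sample frame, so it does not fit the frame periodically) and N = 256 are arbitrary assumptions. The point it illustrates: the folded frame starts at x(N) and ends very near x(N-1), two neighbouring input samples, so its periodic extension is about as continuous as the input itself.

```python
import numpy as np

N = 256
n = np.arange(2 * N)

# Hypothetical input: a tone that is not periodic in the frame.
x = np.sin(2 * np.pi * 3.37 * n / (2 * N))

w = (1.0 - np.cos(np.pi * n / N)) / 2.0
folded = w[:N] * x[:N] + w[N:] * x[N:]

# Jump seen by the FFT when the frame wraps from its last sample to its first.
raw_jump = abs(x[0] - x[N - 1])               # plain N-sample frame, no window
wrap_jump = abs(folded[0] - folded[N - 1])    # time-aliased Hann frame
input_step = abs(x[N] - x[N - 1])             # ordinary step between neighbours

print(np.isclose(folded[0], x[N]))
print(wrap_jump < raw_jump)
```

With three or more frames folded together the first and last output samples would no longer be neighbouring input samples, which matches the point about using exactly two frames.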
>What would interest me is how Bill obtained the blue and
>green curves in his first figure. That green spec magnitude
>curve is definitely NOT the spec magnitudes of a 1000-point
>DFT of his 2nd figure's 1000-point time sequence.
>And the blue spec magnitude curve is definitely NOT the spec
>magnitudes of a 500-point DFT of his 5th figure's 500-point
>time sequence.
That graph was made with the Python programs linked in the web page. It's not related to the example waveforms in the second through fifth figures. It's a worst-case spectral noise plot of the FFT from 900 Hz to 1100 Hz when the input waveform is somewhere from 1000 to 1001 Hz, using a 10000 Hz sample rate, and 1% step size. The worst case for OLA-Hann (sorry! suggest another name!) occurs at 1000.75 Hz, which makes sense since it's halfway between harmonic and anti-harmonic. The Hamming window has its worst-case spectral noise at 1000.48 Hz.
>Dale, I took a quick look at that Ph.D thesis by Jason F Dahl.
>His Figure 1.1 (on page number 3) seems awfully strange.
>It looks to me that this figure contains significant
>conceptual and notational errors. Although I admit I
>should continue reading his thesis to see what I can learn
>from it.
>
>See Ya',
>[-Rick-]
I need to read it too! I've had it open on my laptop now for three days...

Bill
Reply by waywardgeek December 28, 2010
>On Dec 25, 6:14 am, "waywardgeek" <waywardgeek@n_o_s_p_a_m.gmail.com> wrote:
>>...
>> if we market this particular old idea under OLA-Window names, maybe it will
>> stick, but maybe I've simply read too many articles with OLA in the title.
>> I'd like to credit the original researchers if I can find them.
>>
>> Thanks,
>> Bill
>
>The problem with the "OLA" phrase is that most of the papers that use
>it are about synthesis of time series after IFFTs either alone or as
>part of an analysis-modify-synthesis structure. I was able to find
>Rick's article, but only because I didn't use OLA as a search term.
>So, good luck on the signal processing, but I hope your usage of an
>overloading "OLA" dies away.
>
>Dale B. Dalrymple
Hi, Dale. I'm happy to take your advice. I read the OLA term many times in time domain algorithms in the PSOLA family (WSOLA, TD-SOLA, PICOLA, MBROLA), and the overlap-add is more or less the same in all of them, overlapping two sound samples and fading one in while the other is faded out. That's why I thought OLA-Hann was fairly descriptive, but I was unaware of OLA as a name involved in FFT land.

Is there a better label?

Thanks,
Bill