DSPRelated.com
Forums

Filtering Stock Market Data

Started by Robert Krten April 16, 2007
Many thanks to all who helped with my attempts to filter stock market
data with an FIR filter.  I have come to some conclusions.

As was pointed out, (paraphrasing) "Applying a numeric method to a set
of data will result in numbers, but not necessarily meaningful numbers"
-- and, unfortunately, that's exactly what happened :-(

I tried several methods:
a) FIR -- problem is filter length; the longer the filter, the bigger
   the delay, i.e., the last filter output corresponds to the middle of
   the in-filter data sequence, not the last sample (which is what
   I (think I) need).

b) FFT -- I then tried running an FFT on the data.  It looked promising,
   in that I could chop out various bands and reconstruct them through
   an inverse FFT, but it was no help in "predicting" the data.  What
   turned out happening is that the FFT made the sample that I gave it
   be periodic (i.e., a loop).  So, the components all "lined up" to
   give a sharp dropoff at the end of the samples so that it would line
   back up with the beginning of the samples.  Zero padding the samples
   at the beginning and the end was even worse; the "prediction" the
   FFT made was that the stock starts at zero and goes to zero :-)

c) repetitive subtraction -- next was an ugly, brute force method, of
   subtracting sines out of the sample.  The outer loop was the
   amplitude, next inner was the phase, and the innermost was the
   frequency.  While this did come up with the top N sines required to
   reconstitute the original waveform (to within some degree of error),
   it too failed miserably to "predict" anything useful.  It's also
   REALLY slow (40 seconds on an AMD64/3500 for 256-element samples).
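
The delay described in (a) can be reproduced with a small experiment. What follows is only a sketch, assuming a symmetric (linear-phase) FIR with made-up triangular taps: an impulse fed into an N-tap symmetric filter comes out centred (N-1)/2 samples later. The names `fir_filter`, `taps` and `impulse_at` are illustrative and do not come from the original post.

```python
def fir_filter(x, h):
    """Causal FIR: y[n] = sum_k h[k] * x[n-k], taking x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

# Hypothetical symmetric taps (triangular window), normalised to unit gain.
raw = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0]
taps = [v / sum(raw) for v in raw]
N = len(taps)

# A single impulse at sample 20.
impulse_at = 20
x = [0.0] * 50
x[impulse_at] = 1.0

y = fir_filter(x, taps)

# The output pulse peaks (N-1)/2 samples after the input impulse.
peak_index = max(range(len(y)), key=lambda n: y[n])
delay = peak_index - impulse_at
print(delay, (N - 1) // 2)
```

With daily data and a 9-tap filter, the most recent smoothed value therefore describes the market four samples ago, which is the problem noted in (a).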

So, I'm at a crossroads.  I believe (because I can "see" it in the
charts) that there are periodicities to be extracted from stock market
data.  However, I'm mathematically inept enough to not be able to
proceed much further than what I've done.

If anyone is interested in helping with this, please email me -- I'm
not sure this is 100% on topic for comp.dsp any more (but there is no
comp.stockmarketanalysis newsgroup :-))

Cheers,
-RK

-- 
Robert Krten, Antique computer collector looking for PDP-8 and PDP-8/S
minicomputers; check out their "good home" at www.parse.com/~museum
Robert Krten wrote:

   ...

> b) FFT -- ... the "prediction" the
>    FFT made was that the stock starts at zero and goes to zero :-)
"Ashes to ashes, dust to dust." :-)
> c) repetetive subtraction -- next was an ugly, brute force method, of
>    subtracting sines out of the sample.
When done properly, this is a long, slow way to do a Fourier transform. ...
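One way to see the connection, sketched here under the assumption that the candidate frequencies are restricted to DFT bins: the least-squares best-fit sinusoid at bin k can be read directly from the DFT coefficient, with no search over amplitude and phase. The helper names and the two-tone test signal are made up for illustration.

```python
import cmath
import math

def dft_bin(x, k):
    """Single DFT coefficient X[k] = sum_n x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))

def best_sine_at_bin(x, k):
    """Amplitude and phase of the least-squares cosine at bin k (0 < k < N/2)."""
    N = len(x)
    X = dft_bin(x, k)
    return 2.0 * abs(X) / N, cmath.phase(X)

# Made-up test signal: two cosines that sit exactly on bins 5 and 12.
N = 64
x = [3.0 * math.cos(2 * math.pi * 5 * n / N + 0.7)
     + 1.5 * math.cos(2 * math.pi * 12 * n / N - 1.1)
     for n in range(N)]

# One DFT coefficient gives the amplitude and phase in a single pass.
amp, phase = best_sine_at_bin(x, 5)

# Subtracting that sinusoid removes the bin-5 component from the residual.
residual = [x[n] - amp * math.cos(2 * math.pi * 5 * n / N + phase)
            for n in range(N)]
leftover, _ = best_sine_at_bin(residual, 5)
print(amp, phase, leftover)
```

A brute-force search over amplitude, phase and frequency can also land on frequencies between bins, which the plain DFT does not resolve; that case is outside this sketch.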
> So, I'm at a crossroads. I believe (because I can "see" it in the
> charts) that there are periodicities to be extracted from stock market
> data. However, I'm mathematically inept enough to not be able to
> proceed much further than what I've done.
I believe that the periodicities aren't closely related to particular stocks. They have to do with the timing of outside events, such as tax and dividend dates, accounting deadlines, government announcements, etc. Any predictor of stock prices that ignores current events is bound to be poor. Stock prices reflect more-or-less well-informed guesses about the future. The past is largely irrelevant.
> If anyone is interested in helping with this, please email me -- I'm
> not sure this is 100% on topic for comp.dsp any more (but there is no
> comp.stockmarketanalysis newsgroup :-))
And I can guess why not!

Jerry
-- 
Engineering is the art of making what you want from things you can get.
On 16 Apr, 15:23, info2...@parse.com (Robert Krten) wrote:
> Many thanks to all who helped with my attempts to filter stock market
> data with an FIR filter. I have come to some conclusions.
>
> As was pointed out, (paraphrasing) "Applying a numeric method to a set
> of data will result in numbers, but not necessarily meaningful numbers"
> -- and, unfortunately, that's exactly what happened :-(
>
> I tried several methods:
> a) FIR -- problem is filter length; the longer the filter, the bigger
>    the delay, i.e, the last filter output corresponds to the middle of
>    the in-filter data sequence, not the last sample (which is what
>    I (think I) need).
Try to run your FIR filter on the time-reversed sequence (i.e. turn the data sequence "upside-down"). This ought to produce a zero-delay filter, but it does not make meaningless data meaningful...
> So, I'm at a crossroads. I believe (because I can "see" it in the
> charts) that there are periodicities to be extracted from stock market
> data. However, I'm mathematically inept enough to not be able to
> proceed much further than what I've done.
>
> If anyone is interested in helping with this, please email me -- I'm
> not sure this is 100% on topic for comp.dsp any more (but there is no
> comp.stockmarketanalysis newsgroup :-))
As I said before, Kalman filters seem to be the name of the game. A good intro to the subject is Durbin and Koopman: "Time Series Analysis by State Space Methods", Oxford University Press, 2001. One model which is used to develop the theory is

   observation = trend + seasonal + disturbance

and is introduced in equation 2.1. The Kalman filter does require a bit of mathematical skill to understand and implement, but the authors show an example (not stock market data, but still) where they separate the three components, early in chapter 9.

Rune
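To give a feel for what such a filter looks like in code, here is a minimal sketch of the simplest special case, usually called the local level model: the seasonal term is dropped and the trend is a random walk observed in noise. The function name, the variance parameters and the near-diffuse starting variance `p0` are assumptions for illustration, not taken from the book.

```python
def local_level_filter(y, sigma2_eps, sigma2_eta, a0=0.0, p0=1e7):
    """Scalar Kalman filter for the local level model

        y[t]    = mu[t] + eps[t],   eps ~ N(0, sigma2_eps)
        mu[t+1] = mu[t] + eta[t],   eta ~ N(0, sigma2_eta)

    Returns the filtered estimates of mu[t] given y[0..t].
    """
    a, p = a0, p0            # state estimate and its variance
    filtered = []
    for obs in y:
        # Measurement update: blend the prediction with the new observation.
        f = p + sigma2_eps   # variance of the prediction error
        k = p / f            # Kalman gain
        a = a + k * (obs - a)
        p = p * (1.0 - k)
        filtered.append(a)
        # Time update: a random-walk level only gains uncertainty.
        p = p + sigma2_eta
    return filtered

# With no level noise the filter reduces to a running mean of the data.
running = local_level_filter([1.0, 2.0, 3.0], sigma2_eps=1.0, sigma2_eta=0.0)
print(running[-1])

# With level noise it tracks changes, weighting recent samples more heavily.
tracked = local_level_filter([10.0] * 20 + [12.0] * 20,
                             sigma2_eps=1.0, sigma2_eta=0.1)
print(tracked[19], tracked[-1])
```

Unlike a linear-phase FIR, the filtered estimate at time t uses data up to and including time t. In practice the two variances have to be estimated from the data, which this sketch does not attempt.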
Rune wrote:
> On 16 Apr, 15:23, info2...@parse.com (Robert Krten) wrote:
>
> > Many thanks to all who helped with my attempts to filter stock market
> > data with an FIR filter. I have come to some conclusions.
>
> > As was pointed out, (paraphrasing) "Applying a numeric method to a set
> > of data will result in numbers, but not necessarily meaningful numbers"
> > -- and, unfortunately, that's exactly what happened :-(
>
> > I tried several methods:
> > a) FIR -- problem is filter length; the longer the filter, the bigger
> >    the delay, i.e, the last filter output corresponds to the middle of
> >    the in-filter data sequence, not the last sample (which is what
> >    I (think I) need).
>
> Try to run your FIR filter on the time-reversed sequence (i.e.
> turn the data sequence "upside-down"). This ought to produce
> a zero-delay filter,
I think Robert used a linear-phase FIR (from his description of the filter delay being in the middle of the filter taps). Running a linear-phase FIR backwards results in the exact same output as running the linear-phase FIR forwards. I think he really wants a minimum-phase filter, but, as you say,
> it does not make meaningless data
> meaningful...
Regards, Andor
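The point about running a linear-phase FIR backwards can be checked numerically. In this sketch (plain Python, made-up data and taps), filtering the time-reversed sequence and reversing the result matches ordinary forward filtering when the taps are symmetric, and differs when they are not.

```python
def convolve(x, h):
    """Full linear convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            y[i + k] += xi * hk
    return y

def filter_backwards(x, h):
    """Filter the time-reversed data, then flip the result back."""
    return convolve(x[::-1], h)[::-1]

x = [1.0, 4.0, 2.0, 7.0, 3.0, 5.0, 6.0]    # made-up data

# Symmetric taps (linear phase): both directions give the same output.
h_sym = [1.0, 2.0, 3.0, 2.0, 1.0]
forward = convolve(x, h_sym)
backward = filter_backwards(x, h_sym)
same = all(abs(a - b) < 1e-12 for a, b in zip(forward, backward))

# Asymmetric taps: the two directions no longer agree.
h_asym = [3.0, 2.0, 1.0, 0.5, 0.25]
differs = any(abs(a - b) > 1e-12
              for a, b in zip(convolve(x, h_asym), filter_backwards(x, h_asym)))

print(same, differs)
```

Filtering backwards is equivalent to filtering forwards with the taps reversed, so for symmetric taps nothing changes, including the (N-1)/2 sample delay.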
Andor wrote:
> Rune wrote:
>> On 16 Apr, 15:23, info2...@parse.com (Robert Krten) wrote:
>>
>>> Many thanks to all who helped with my attempts to filter stock market
>>> data with an FIR filter. I have come to some conclusions.
>>> As was pointed out, (paraphrasing) "Applying a numeric method to a set
>>> of data will result in numbers, but not necessarily meaningful numbers"
>>> -- and, unfortunately, that's exactly what happened :-(
>>> I tried several methods:
>>> a) FIR -- problem is filter length; the longer the filter, the bigger
>>>    the delay, i.e, the last filter output corresponds to the middle of
>>>    the in-filter data sequence, not the last sample (which is what
>>>    I (think I) need).
>> Try to run your FIR filter on the time-reversed sequence (i.e.
>> turn the data sequence "upside-down"). This ought to produce
>> a zero-delay filter,
>
> I think Robert used a linear-phase FIR (from his description of the
> filter delay being in the middle of the filter taps). Running a
> linear-phase FIR backwards results in the exact same output as running
> the linear-phase FIR forwards. I think he really wants a minimum-phase
> filter, but, as you say,
>
>> it does not make meaningless data
>> meaningful...
Rune suggested running the data backward. I don't understand the implication yet, though.

Jerry
-- 
Engineering is the art of making what you want from things you can get.
Robert Krten wrote:
> Many thanks to all who helped with my attempts to filter stock market
> data with an FIR filter. I have come to some conclusions.
>
> As was pointed out, (paraphrasing) "Applying a numeric method to a set
> of data will result in numbers, but not necessarily meaningful numbers"
> -- and, unfortunately, that's exactly what happened :-(
Robert, the meaning in the numbers is not due to the method used to get the numbers, but solely to your personal interpretation. Mixing this up has gotten comp.dsp into rather lengthy discussions in the past, which hopefully won't recur in this thread.
> I tried several methods:
> a) FIR -- problem is filter length; the longer the filter, the bigger
>    the delay, i.e, the last filter output corresponds to the middle of
>    the in-filter data sequence, not the last sample (which is what
>    I (think I) need).
See my reply to Rune.
> b) FFT -- I then tried running an FFT on the data. It looked promising,
>    in that I could chop out various bands and reconstruct them through
>    an inverse FFT, but it was no help in "predicting" the data. What
>    turned out happening is that the FFT made the sample that I gave it
>    be periodic (i.e., a loop). So, the components all "lined up" to
>    give a sharp dropoff at the end of the samples so that it would line
>    back up with the beginning of the samples. Zero padding the samples
>    at the beginning and the end was even worse; the "prediction" the
>    FFT made was that the stock starts at zero and goes to zero :-)
It's a trivial fact that DFT based extrapolation results in periodic repetition of the data. Perhaps you should study the methods you use more closely before going off on a wild goose chase.
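The periodic repetition is easy to confirm. In this sketch the "prices" are made-up numbers and `evaluate_idft` is a hypothetical helper that evaluates the inverse-DFT sum at any integer index, including indices past the end of the record.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) DFT, adequate for a short demonstration."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def evaluate_idft(X, n):
    """Evaluate the inverse-DFT sum at any integer n, inside or outside 0..N-1."""
    N = len(X)
    total = sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N))
    return (total / N).real

# Made-up data with an upward drift.
prices = [10.0, 10.5, 11.2, 11.0, 12.3, 13.1, 12.8, 14.0]
N = len(prices)
X = dft(prices)

# "Extrapolating" past the end of the record just replays the record.
extrapolated = [evaluate_idft(X, n) for n in range(N, 2 * N)]
print([round(v, 6) for v in extrapolated])
```

The first extrapolated value falls back to the first sample of the record, which is the sharp dropoff at the end described in (b).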
> c) repetetive subtraction -- next was an ugly, brute force method, of
>    subtracting sines out of the sample. The outer loop was the
>    amplitude, next inner was the phase, and the innermost was the
>    frequency. While this did come up with the top N sines required to
>    reconstitute the original waveform (to within some degree of error),
>    it too failed miserably to "predict" anything useful. It's also
>    REALLY slow (40 seconds on an AMD64/3500 for 256-element samples).
If you do this with completely random data, you get a nice decomposition into sines, just as with the DFT, with absolutely no predictive power. If it's not in there, you can't find it.
> So, I'm at a crossroads. I believe (because I can "see" it in the
> charts) that there are periodicities to be extracted from stock market
> data. However, I'm mathematically inept enough to not be able to
> proceed much further than what I've done.
There are special methods for prediction of data sequences. They depend on the fact that the data is correlated. However, a standard model for stock market data is a random walk with _independent_ increments (usually Gaussian distributed). Independent implies uncorrelated. This means your best guess at the next value in the sequence is in fact the last value you measured.

A more down-to-earth approach to stock market guessing is to find two linearly correlated (with time delay) sequences, and then predict the value of the time-advanced series from the values of the time-delayed series. Don't know if it has been applied with any success.

Regards, Andor
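The random-walk claim can be illustrated with a simulation. This is a sketch under stated assumptions: unit-variance Gaussian increments, a fixed seed, and a hypothetical competing predictor that extends the most recent slope. None of it uses real market data.

```python
import random

def simulate_random_walk(n, seed=0, start=100.0):
    """Random walk with independent N(0, 1) increments."""
    rng = random.Random(seed)
    x = [start]
    for _ in range(n - 1):
        x.append(x[-1] + rng.gauss(0.0, 1.0))
    return x

def mean_squared_error(predictions, actuals):
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)

walk = simulate_random_walk(20000)
actual = walk[2:]                      # the values x[t+1] to be predicted

# Predictor 1: the next value equals the last observed value, x[t].
last_value = walk[1:-1]

# Predictor 2: extend the most recent slope, x[t] + (x[t] - x[t-1]).
trend = [2.0 * walk[t] - walk[t - 1] for t in range(1, len(walk) - 1)]

mse_last = mean_squared_error(last_value, actual)
mse_trend = mean_squared_error(trend, actual)
print(mse_last, mse_trend)
```

For independent unit-variance increments the expected squared errors are 1 for the last-value predictor and 2 for the slope predictor, so the simulated values should land near those figures. Chasing the most recent move makes the prediction worse, not better.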
info2007@parse.com (Robert Krten) wrote in news:6eKdndP-
2sBZ5b7bnZ2dnUVZ_sCinZ2d@magma.ca:

> So, I'm at a crossroads. I believe (because I can "see" it in the
> charts) that there are periodicities to be extracted from stock market
> data.
You absolutely don't need to do filtering to pull periodicities out of data. For examples of the tools you might need, try googling for "random signals sunspots" or some such.

Also, you can just try looking at the maximum of a cross correlation with a sine of the frequency of interest to check for periodicities. Try one week, one month, one quarter, one year, and four years to start.

-- 
Scott
Reverse name to reply
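A rough sketch of that check: correlating the mean-removed data against a complex sinusoid gives the same peak value as maximising the cross correlation with a real sine over all phase shifts. The function name, the synthetic series, and the trading-day periods (5 for a week, 21 for a month) are assumptions made for this example.

```python
import cmath
import math

def periodicity_strength(x, period):
    """Amplitude estimate of a sinusoid with the given period (in samples).

    Correlates the mean-removed data with exp(-j*2*pi*n/period). The
    magnitude does not depend on the phase of the component being sought.
    """
    N = len(x)
    mean = sum(x) / N
    c = sum((x[n] - mean) * cmath.exp(-2j * math.pi * n / period)
            for n in range(N))
    return 2.0 * abs(c) / N

# Synthetic daily series: level 100 plus a 5-day cycle of amplitude 2.
N = 250
series = [100.0 + 2.0 * math.sin(2 * math.pi * n / 5 + 0.3) for n in range(N)]

weekly = periodicity_strength(series, 5)     # assumed trading-day week
monthly = periodicity_strength(series, 21)   # assumed trading-day month
print(weekly, monthly)
```

Real price series have a strong trend, which leaks into every period tested, so detrending or working with returns before running a check like this is probably a sensible first step.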
Scott Seidman wrote:
> info2007@parse.com (Robert Krten) wrote in news:6eKdndP-
> 2sBZ5b7bnZ2dnUVZ_sCinZ2d@magma.ca:
>
>> So, I'm at a crossroads. I believe (because I can "see" it in the
>> charts) that there are periodicities to be extracted from stock market
>> data.
>
> You absolutely don't need to do filtering to pull periodicities out of
> data. For examples of the tools you might need, try googling for "random
> signals sunspots" or some such.
>
> Also, you can just try looking at the maximum of a cross correlation with a
> sin of the frequency of interest to check for periodicities. Try one week,
> one month, one quarter, one year, and four years to start.
Of course there is periodic behaviour in the markets, and in the global economy in general. Downturns are fairly evenly spaced in time. Something nobody seems to understand makes the world economy pulse on an 11-12 year cycle. All the garbage about the fast pace of modern life is revealed to be garbage when you notice how little this cycle time has changed, despite huge changes in the response times of buying and selling.

However, the interesting things, from a beating-the-system point of view, are in the fine details, which are almost certainly submerged in noise. From a pragmatic point of view, this is not a problem. You can make just as much money from smoke and mirrors as from the market itself. :-)

Regards,
Steve
Steve Underwood <steveu@dis.org> wrote in
news:f0044q$hbk$1@nnews.pacific.net.hk: 

> Scott Seidman wrote:
>> info2007@parse.com (Robert Krten) wrote in news:6eKdndP-
>> 2sBZ5b7bnZ2dnUVZ_sCinZ2d@magma.ca:
>>
>>> So, I'm at a crossroads. I believe (because I can "see" it in the
>>> charts) that there are periodicities to be extracted from stock
>>> market data.
>>
>> You absolutely don't need to do filtering to pull periodicities out
>> of data. For examples of the tools you might need, try googling for
>> "random signals sunspots" or some such.
>>
>> Also, you can just try looking at the maximum of a cross correlation
>> with a sin of the frequency of interest to check for periodicities.
>> Try one week, one month, one quarter, one year, and four years to
>> start.
>>
> Of course there is periodic behaviour in the markets, and in the
> global economy in general. Downturns are fairly evenly spaced in time.
> Something nobody seems to understand makes the world economy pulse on
> an 11-12 year cycle. All the garbage about the fast pace of modern
> life is revealed to be garbage when you notice how little this cycle
> time has changed, despite huge changes in the response times of buying
> and selling. However, the interesting things, from a beating the
> system point of view, are in the fine details, which are almost
> certainly submerged in noise. From a pragmatic point of view, this is
> not a problem. You can make just as much money from smoke and mirrors,
> as from the market itself. :-)
>
> Regards,
> Steve
Whether any simple periodic model can help you make money is a very different question. I was just pointing out that filtering won't help.

My own assumption is that there are others who do this sort of stuff for a living, and if there were anything in it, they would be making money for their clients. It's a huge, very complicated, MIMO system with inputs that are squishy. Probably very tough to make sense of it.

-- 
Scott
Reverse name to reply
Scott Seidman wrote:
> info2007@parse.com (Robert Krten) wrote in news:6eKdndP-
> 2sBZ5b7bnZ2dnUVZ_sCinZ2d@magma.ca:
>
>> So, I'm at a crossroads. I believe (because I can "see" it in the
>> charts) that there are periodicities to be extracted from stock market
>> data.
>
> You absolutely don't need to do filtering to pull periodicities out of
> data. For examples of the tools you might need, try googling for "random
> signals sunspots" or some such.
>
> Also, you can just try looking at the maximum of a cross correlation with a
> sin of the frequency of interest to check for periodicities. Try one week,
> one month, one quarter, one year, and four years to start.
Don't forget the 11-year sunspot cycle. There's a correlation there too.

Jerry
-- 
Engineering is the art of making what you want from things you can get.