DSPRelated.com
Forums

Glitch/inconsistency detection in navigation data

Started by Rune Allnor June 26, 2006
Andor wrote:

> Have fun in your Waterworld :-)!
You're in Switzerland, right? If so, this conversation is rock'n roll... ;)

Nah, not really. Max wave height is somewhere around 1 m.

Rune
Martin Eisenberg wrote:
> Rune Allnor wrote:
>> Andor wrote:
>>> What do you look for in a manual check?
>>
>> Anything that can be said to be a "glitch" or an "irregular
>> feature". Some of the tracks seem to "jump" sideways on the order of
>> 2Dx to 3Dx, where Dx is the along-track distance between
>> measurements. Then they get back "on-track" within a variable number
>> of samples, say 5 to 20 samples.
>>
>> I have a couple of ideas about how to *detect* these things, for
>> manual correction. But then, if there already exists a wheel, why
>> re-invent it...
>
> This may not be of immediate help to you but there's a book titled
> "Detection of Abrupt Changes" online at
> http://www.irisa.fr/sisthem/kniga/ .
>
> Regarding correction, your problem sounds similar to that of click
> removal in audio which has been treated here before, I think, and is
> commonly attacked with LPC extrapolation.
>
> Martin
Thanks. I'll download the book and look... eh... listen into clicks.
> Quidquid latine scriptum sit, altum viditur.
Ehum... I am totally aware that I reveal my ignorance for all the world
to see, but what does that mean...?

The first four words are Latin, but "altum viditur" could be Icelandic
for "knows [it] all"...

Rune
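The LPC-extrapolation trick Martin mentions can be sketched roughly as follows (a hypothetical least-squares AR fit in Python; the function name, model order, and parameters are illustrative, not from the thread). The click-removal idea is to fit the model to the clean samples before a damaged span and predict across it:

```python
import numpy as np

def lpc_extrapolate(x, order=8, n_extra=5):
    """Fit an AR (LPC) model to x by least squares and extrapolate
    n_extra samples past the end of x.

    For click removal, x would be the clean samples just before a
    damaged span, and the extrapolation replaces the damaged samples.
    """
    x = np.asarray(x, dtype=float)
    # Linear system: x[n] ~ sum_k a[k] * x[n-1-k]
    rows = [x[n - order:n][::-1] for n in range(order, len(x))]
    A = np.vstack(rows)
    b = x[order:]
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Recursively predict forward from the last `order` samples
    buf = list(x)
    for _ in range(n_extra):
        past = np.array(buf[-order:][::-1])
        buf.append(float(a @ past))
    return np.array(buf[len(x):])
```

On a clean sinusoid the fitted recursion extrapolates almost exactly; on real navigation data the prediction degrades with the length of the gap, so short glitches are the realistic target.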
Rune Allnor wrote:
> Martin Eisenberg wrote:
>> Quidquid latine scriptum sit, altum viditur.
>
> Ehum... I am totally aware that I reveal my ignorance for all the
> world to see, but what does that mean...?
>
> The first four words are Latin, but "altum viditur" could be
> Icelandic for "knows [it] all"...
How weird... It means: "Whatever is written in Latin looks elevated"
(that is, deep ;).

Martin

--
I think perhaps the most important problem is that we are trying to
understand the fundamental workings of the universe via a language
devised for telling one another where the best fruit is.
--Terry Pratchett
Martin Eisenberg wrote:
> Rune Allnor wrote:
>> Martin Eisenberg wrote:
>>> Quidquid latine scriptum sit, altum viditur.
>>
>> Ehum... I am totally aware that I reveal my ignorance for all the
>> world to see, but what does that mean...?
>>
>> The first four words are Latin, but "altum viditur" could be
>> Icelandic for "knows [it] all"...
>
> How weird... It means: "Whatever is written in Latin looks elevated"
> (that is, deep ;).
Of course... "Rise, and you shall reach Rock Bottom." Sounds like
somebody who had a drink too many. Oh well.

Rune
Eliminating "stray" waypoints requires knowledge of the situation.

If you are travelling by foot and taking trackpoints every minute, then
a point that is more than 250 metres off the general direction of travel
is an exception (running at 15 km/h for one minute). If it is an Olympic
runner, then you need to increase that value.

If travelling on a bicycle, you need to know the conditions. If on a
main road, a trackpoint doesn't need to be that far off before it is an
exception. But if travelling up a switchback on a mountain pass road in
the Alps, then a lot of your points will be "off course" but still very
valid for mapping your track.

If travelling by car on the same road, your trackpoints may not be taken
at sufficiently close intervals to capture some of the switchbacks, and
will log just a mostly straight road up the mountain with minor
deviations between each point.

If you have 3 points, A, B and C, B can be considered an exception if
the distance between B and the axis A-C is greater than some amount.
(The Aviation Formulary has the equations to calculate the distance
between a point and an axis.)

Now, if you have 4 points, A B C and D with B and C being odd, the above
logic won't work because B will be within the limits of the A-C axis.
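The three-point test above can be sketched as follows (a planar approximation in Python; for long segments the Aviation Formulary's spherical cross-track formula would be the one to use, and the 250 m threshold is just the walking example from the post):

```python
import math

def cross_track_distance(a, b, c):
    """Perpendicular distance from point b to the line through a and c.

    Points are (x, y) tuples in a flat, local coordinate frame (metres).
    This planar form is adequate for short track segments; use the
    great-circle version for long distances on lat/lon data.
    """
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    acx, acy = cx - ax, cy - ay   # vector A->C
    abx, aby = bx - ax, by - ay   # vector A->B
    ac_len = math.hypot(acx, acy)
    if ac_len == 0.0:
        return math.hypot(abx, aby)  # A and C coincide
    # |cross product| / |AC| = perpendicular distance of B from line A-C
    return abs(acx * aby - acy * abx) / ac_len

def is_exception(a, b, c, threshold_m=250.0):
    """Flag B as an exception if it lies too far off the A-C axis."""
    return cross_track_distance(a, b, c) > threshold_m
```

As the post notes, this breaks down when two consecutive points are both bad, so in practice the test would need to be combined with a longer baseline or a robust statistic.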
Rune Allnor wrote:

> Hi folks.
>
> Assume you are given some 20,000 - 100,000 points of raw (x,y,z)
> navigation data as previously logged on some platform, and are asked
> to clean up the data.
>
> My first, naive idea is to use some Kalman filter. So far so good.
>
> However, the data might contain glitches (jumps) or other
> inconsistencies that need to be sorted out before feeding them to the
> Kalman filter.
>
> How would you go about detecting glitches and maybe even do a
> first-pass correction of such data? A manual check of 100,000 data
> points doesn't seem a very tempting prospect...
>
> Rune
Hey Rune:

One thing you could do would be to compare the data with the Kalman
filter output, and ignore it whenever there's too much difference. This
causes problems if your system truly moves outside of your 'known good'
band.

Another thing that works is to discount the data when it's glitchy, but
not completely. You could either have two state evolution rules, one
for when the data is 'bad' and another for when it's 'good'. This would
result in the filter being influenced by genuinely bad data, but it
would also result in the filter being able to find its way home if it
were way off track.

My understanding of Kalman filters is somewhat casual*, but I do know
that the classic construction assumes linear systems excited by
Gaussian noise, and quadratic cost functions. If your underlying system
isn't linear, if your noise isn't Gaussian or if your cost functions
aren't quadratic, then the optimal solution isn't linear anymore, so a
plain ol' Kalman filter is ruled out -- search around on the term
'extended Kalman' for insight in that case.

* I've got to get a good book on this.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Posting from Google? See http://cfaj.freeshell.org/google/

"Applied Control Theory for Embedded Systems" came out in April. See
details at http://www.wescottdesign.com/actfes/actfes.html
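Tim's first suggestion - reject measurements that disagree too much with the filter's prediction - might look like this minimal 1-D sketch (assuming a simple random-walk state model; the function name, noise variances, and the 3-sigma gate are illustrative choices, not from the post):

```python
def kalman_1d_with_gating(measurements, q=1e-3, r=1.0, gate=3.0):
    """Scalar Kalman filter (random-walk model) that skips the update
    step whenever the innovation exceeds `gate` standard deviations.

    q: process noise variance, r: measurement noise variance.
    Returns the list of filtered state estimates.
    """
    x, p = measurements[0], r   # initialise from the first sample
    out = []
    for z in measurements:
        # Predict: random-walk model, state carries over, variance grows
        p += q
        # Innovation and its variance
        y = z - x
        s = p + r
        if y * y <= (gate * gate) * s:
            # Normal Kalman update
            k = p / s
            x += k * y
            p *= (1.0 - k)
        # else: measurement rejected as a glitch; keep the prediction
        out.append(x)
    return out
```

As Tim warns, a hard gate like this can lock the filter out of genuinely new data; his softer two-rule variant would inflate r for suspect samples instead of dropping them outright.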
Rune,

this problem is a classical application for so-called "robust
statistical methods".

1) Decide on a window length of n data points. This window will be used
to scan through your data for outlier and glitch detection.

2) Get the first n points of data (for your three-dimensional problem,
treat each dimension on its own).

3) Compute the MEDIAN of the data (= the 50% percentile). If in doubt
how to compute the MEDIAN, check the MATLAB help. The MEDIAN is a very
good approximation to the mean value of your data, but it is completely
insensitive to outliers and glitches as long as the number of outliers
within your window is smaller than n/2. Note that you now have an
outlier-insensitive measure for the mean of your data: the MEDIAN is
the robust counterpart to the MEAN.

4) For all data within your window compute the absolute value of the
difference between the data and the median and put the results of this
operation into a buffer array.

5) Multiply all values in your buffer with a scaling constant 1/0.6745.

6) Compute the MEDIAN of the data in your buffer. The MEDIAN of the
buffer values is the so-called "MAD" (median absolute deviation) of the
original values within your window. The MAD is the outlier-insensitive,
robust counterpart to the normal standard deviation, which is highly
sensitive to outliers. The scaling factor of 1/0.6745 makes the MAD
directly comparable to the standard deviation for normally distributed
values.

7) A rule of thumb in classical statistics is: if a data point is
farther away from the mean than 5 times the standard deviation, it is
most probably an outlier. Now that you have robust values available,
apply this rule to them: if a data point is farther than 5 times the
MAD away from the MEDIAN, it is most probably an outlier.

8) If an outlier is detected, be sure to remove it from all dimensions.

9) Shift your window by one and go to step 3

10) Repeat for all data

It sounds as if a lot of computing power is necessary for this
algorithm, and yes, indeed it is. Note that computing the MEDIAN
involves ordering the data within your window in ascending order, and
you need to compute a MEDIAN twice per window shift. Use a fast sorting
algorithm!

I use the above method as my standard outlier detection routine
wherever I expect to be confronted with outliers, and it works very
well. Note that the length of the window is a compromise between
processing time and having enough values in the window to compute
meaningful statistics. Note also that this algorithm detects outliers
fairly well but is not well suited if your data contains
inconsistencies like sudden steps (offsets that stay).

Regards

Ulrich

"Rune Allnor" <allnor@tele.ntnu.no> schrieb im Newsbeitrag
news:1151317297.168309.116350@m73g2000cwd.googlegroups.com...
> Hi folks.
>
> Assume you are given some 20,000 - 100,000 points of raw (x,y,z)
> navigation data as previously logged on some platform, and are asked
> to clean up the data.
>
> My first, naive idea is to use some Kalman filter. So far so good.
>
> However, the data might contain glitches (jumps) or other
> inconsistencies that need to be sorted out before feeding them to the
> Kalman filter.
>
> How would you go about detecting glitches and maybe even do a
> first-pass correction of such data? A manual check of 100,000 data
> points doesn't seem a very tempting prospect...
>
> Rune
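Ulrich's median/MAD recipe above can be sketched per-dimension roughly as follows (the 1/0.6745 constant and 5-MAD threshold follow his steps; the function name and default window length are illustrative):

```python
import statistics

def mad_outliers(data, window=21, k=5.0):
    """Sliding-window robust outlier detection (median/MAD).

    Returns the indices of samples flagged as outliers. Scaling the MAD
    by 1/0.6745 makes it comparable to the standard deviation for
    normally distributed data (dividing the final median by 0.6745 is
    equivalent to pre-multiplying every deviation, as in step 5).
    """
    flagged = []
    half = window // 2
    for i in range(len(data)):
        lo = max(0, i - half)
        hi = min(len(data), i + half + 1)
        win = data[lo:hi]
        med = statistics.median(win)                 # robust "mean"
        mad = statistics.median(abs(v - med) for v in win) / 0.6745
        if mad > 0 and abs(data[i] - med) > k * mad:
            flagged.append(i)
    return flagged
```

For the (x,y,z) navigation data, you would run this on each coordinate and drop a sample from all three dimensions if any one of them is flagged, as in step 8.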
Tim Wescott wrote:
> Rune Allnor wrote:
>> Hi folks.
>>
>> Assume you are given some 20,000 - 100,000 points of raw (x,y,z)
>> navigation data as previously logged on some platform, and are asked
>> to clean up the data.
>>
>> My first, naive idea is to use some Kalman filter. So far so good.
>>
>> However, the data might contain glitches (jumps) or other
>> inconsistencies that need to be sorted out before feeding them to
>> the Kalman filter.
>>
>> How would you go about detecting glitches and maybe even do a
>> first-pass correction of such data? A manual check of 100,000 data
>> points doesn't seem a very tempting prospect...
>>
>> Rune
>
> Hey Rune:
>
> One thing you could do would be to compare the data with the Kalman
> filter output, and ignore it whenever there's too much difference.
> This causes problems if your system truly moves outside of your
> 'known good' band.
>
> Another thing that works is to discount the data when it's glitchy,
> but not completely. You could either have two state evolution rules,
> one for when the data is 'bad' and another for when it's 'good'. This
> would result in the filter being influenced by genuinely bad data,
> but it would also result in the filter being able to find its way
> home if it were way off track.
>
> My understanding of Kalman filters is somewhat casual*, but I do know
> that the classic construction assumes linear systems excited by
> Gaussian noise, and quadratic cost functions. If your underlying
> system isn't linear, if your noise isn't Gaussian or if your cost
> functions aren't quadratic, then the optimal solution isn't linear
> anymore, so a plain ol' Kalman filter is ruled out -- search around
> on the term 'extended Kalman' for insight in that case.
Right... things get messy when all the nice idealizations and assumptions turn out to be a bit too ideal... been there.
> * I've got to get a good book on this.
Let me know if you find one. All I have in my library (which I don't
have access to for another couple of weeks) is an early-90s edition of
Brown and Kwang(?).

Rune
Rune Allnor wrote:
> Hi folks.
>
> Assume you are given some 20,000 - 100,000 points of raw (x,y,z)
> navigation data as previously logged on some platform, and are asked
> to clean up the data.
>
> My first, naive idea is to use some Kalman filter. So far so good.
>
> However, the data might contain glitches (jumps) or other
> inconsistencies that need to be sorted out before feeding them to the
> Kalman filter.
>
> How would you go about detecting glitches and maybe even do a
> first-pass correction of such data? A manual check of 100,000 data
> points doesn't seem a very tempting prospect...
>
> Rune
You didn't say what kind of navigational system this is. I think that
might affect the nature of what you see quite a bit.

I'm no nav system expert. I've simply had cause to use the output of
nav systems for various purposes. Understanding the qualities of the
nav data has been a somewhat low priority compared to my other tasks.
That said...

My experience with taking data from inertial navigation systems is
that at first sight it looks really glitchy. However, if you apply
almost any simple LPF - say, a single-pole LPF - with a suitable
cut-off frequency, the data quietens down wonderfully. What appear at
first sight to be glitches actually turn out to be unbiased noise that
simply has a very spiky quality.

Steve
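The single-pole LPF Steve describes can be sketched in a few lines (the smoothing coefficient alpha sets the effective cut-off; the value here is illustrative, and the filter would be run on each coordinate separately):

```python
def one_pole_lpf(samples, alpha=0.1):
    """Single-pole low-pass filter:
        y[n] = y[n-1] + alpha * (x[n] - y[n-1])

    alpha in (0, 1] sets the cut-off; smaller alpha smooths harder.
    Initialised from the first sample to avoid a start-up transient.
    """
    y = samples[0]
    out = []
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out
```

A one-sample spike of height h is knocked down to alpha*h at the output, which matches Steve's observation that spiky but unbiased noise quietens down without any explicit glitch detection.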
JF Mezei wrote:
> Eliminating "stray" waypoints requires knowledge of the situation.
>
> If you are travelling by foot and taking trackpoints every minute,
> then a point that is more than 250 metres off the general direction
> of travel is an exception (running at 15 km/h for one minute). If it
> is an Olympic runner, then you need to increase that value.
>
> If travelling on a bicycle, you need to know the conditions. If on a
> main road, a trackpoint doesn't need to be that far off before it is
> an exception. But if travelling up a switchback on a mountain pass
> road in the Alps, then a lot of your points will be "off course" but
> still very valid for mapping your track.
>
> If travelling by car on the same road, your trackpoints may not be
> taken at sufficiently close intervals to capture some of the
> switchbacks, and will log just a mostly straight road up the mountain
> with minor deviations between each point.
>
> If you have 3 points, A, B and C, B can be considered an exception if
> the distance between B and the axis A-C is greater than some amount.
> (The Aviation Formulary has the equations to calculate the distance
> between a point and an axis.)
>
> Now, if you have 4 points, A, B, C and D with B and C being odd, the
> above logic won't work because B will be within the limits of the A-C
> axis.
I think this approach might be useful. Do you have any references
(books, articles) to how these things are done in aviation?

Rune