DSPRelated.com
Forums

Maintain ordering in a 2D matrix, what is an efficient algorithm for this?

Started by Luna Moon August 26, 2007
In a 2D array of double(floating point) numbers,

ideally it should have an ordering on two dimensions:

for left to right, the numbers should increase,

from bottom to up, the numbers should increase,

if in the middle of some row and some column, there are valleys and
peaks, which means that the ordering is violated,

I want to detect these violations and set the peaks or valleys to some
predefined constants, say -1.

my question is:

how to detect any such violation of ordering that efficiently in C/C+
+, also also Matlab?

Basically, the question is how to detect peaks and valleys which
violate the ordering and set them to an alert number, say -1, for post-
processing... This is
like "highlighting" ...

Thanks!


> For example,
> 90 91 93 94 95 96 97 98 99 100 > 10 12 13 14 15 16 17 18 11 19 > 80 82 83 84 85 86 87 88 81 89 > 70 72 73 74 75 76 77 78 71 79 > 60 62 63 64 65 66 67 68 61 69 > 50 52 53 54 55 56 57 58 51 59 > 40 42 43 44 45 46 47 48 41 49 > 30 32 33 34 35 36 37 38 31 39 > 20 22 23 24 25 26 27 28 21 29 > 0 1 2 3 4 5 6 7 8 9
The second row is definitely violating the ordering, judging from their corresponding neighbors above and below. And that 11 between 18 and 19 is another violation. This is exactly my question, how do I set these violation numbers to a predefined ALERT constant, so for post processing, I can filling the invalid numbers out by a digital filter using image processing technologies...
> If you examine the 8 by 8 lower left submatrix, the numbers > increase left to right and bottom to top. Examining column-wise, > in the second-last column, the numbers drop, but remain above > the starting column value. Is that second last column a valley, > or is the second through third-last a peak? Similarily if you > examine up the columns.
> Or consider,
> 50 51 52 53 54 55 > 30 32 31 34 33 35 > 40 42 41 44 43 45 > 10 12 11 14 13 15 > 20 22 21 24 23 25 > 0 2 1 4 3 5
Any violation of orders will be defined to be invalid numbers, such as the 1 in between 2 and 4, and more generally, 31 41 11 21 1 and 10 12 11 14 13 15 That's exactly my headache, how to decide these violations and set them to an invalid number and then fill in the holes using image processing techniques. I know how to do the filter part, but I don't know how to detect the violations... Thanks!
Luna Moon wrote:
> In a 2D array of double(floating point) numbers, > > ideally it should have an ordering on two dimensions: > > for left to right, the numbers should increase, > > from bottom to up, the numbers should increase, > > if in the middle of some row and some column, there are valleys and > peaks, which means that the ordering is violated, > > I want to detect these violations and set the peaks or valleys to some > predefined constants, say -1.
Yes, that was apparent when you posted the exact same question to comp.programming yesterday.
> my question is: > > how to detect any such violation of ordering that efficiently in C/C+ > +, also also Matlab?
The obvious algorithm would be to use a pair of nested 'for' loops and a comparison operator. If a number is smaller than both the one before it and after it, then it's a valley. If a number is larger than both the one before it and after it, then it's a peak. (If it's the same as the one before it, you have to do some special processing, but nothing too complex -- just keep a flag around that tells you whether the number before the repeating sequence was smaller or was larger.) You would do one pair of nested 'for' loops for processing horizontally (checking each row for horizontal "violations" of your rule that all numbers should increase), and then another pair of nested 'for' loops for processing vertically. Replacing with -1 rather than keeping a separate list of elements which are peaks and valleys will probably make things more complicated, but it's possible. Have you already considered this algorithm? Is it good enough for you purposes, or do you need something more efficient? I think the reason you didn't get much of an answer last time you asked the question is that you didn't explain whether you already considered this algorithm. This leaves people unsure what to suggest to you. Also, as long as we're on this subject, you may have some ambiguities to iron out. For example, if you have this pattern: 5 6 5 8 7 8 5 6 5 Then the element with value 7 is both a peak and a valley. Is that a problem? If not, good, but if so, your algorithm will have to take that into account. - Logan
Hi, 

I can give you some matlab commands to get started "efficiently" (meaning
with the smallest amount of work):

It subtracts a right-shifted and a down-shifted version of the matrix from
the original, and checks whether the change is positive or negative.
Actually I ran it on octave, but it should work on matlab too.

Cheers

Markus

m=[90 91 93 94 95 96 97 98 99 100;...
10 12 13 14 15 16 17 18 11 19;...
80 82 83 84 85 86 87 88 81 89;...
70 72 73 74 75 76 77 78 71 79;...
60 62 63 64 65 66 67 68 61 69;...
50 52 53 54 55 56 57 58 51 59;...
40 42 43 44 45 46 47 48 41 49;...
30 32 33 34 35 36 37 38 31 39;...
20 22 23 24 25 26 27 28 21 29;...
0 1 2 3 4 5 6 7 8 9];

[nRow, nCol]=size(m);
m(nRow+1, nCol+1)=0; % extend size by 1,1
m1=circshift(m, [1, 0]); % roll down
m2=circshift(m, [0, 1]); % roll right
r1=(m1-m < 0)
r2=(m2-m > 0)

output:

r1 =

   1   1   1   1   1   1   1   1   1   1   0
   0   0   0   0   0   0   0   0   0   0   0
   1   1   1   1   1   1   1   1   1   1   0
   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0   0   0   0

r2 =

   0   0   0   0   0   0   0   0   0   0   1
   0   0   0   0   0   0   0   0   1   0   1
   0   0   0   0   0   0   0   0   1   0   1
   0   0   0   0   0   0   0   0   1   0   1
   0   0   0   0   0   0   0   0   1   0   1
   0   0   0   0   0   0   0   0   1   0   1
   0   0   0   0   0   0   0   0   1   0   1
   0   0   0   0   0   0   0   0   1   0   1
   0   0   0   0   0   0   0   0   1   0   1
   0   0   0   0   0   0   0   0   0   0   1
   0   0   0   0   0   0   0   0   0   0   0
> Yes, that was apparent when you posted the exact same question to > comp.programming yesterday.
Yes, I realized that I need to post to a broadened range of appropriate groups...
> > > my question is: > > > how to detect any such violation of ordering that efficiently in C/C+ > > +, also also Matlab? > > The obvious algorithm would be to use a pair of nested 'for' loops > and a comparison operator. If a number is smaller than both the one > before it and after it, then it's a valley. If a number is larger > than both the one before it and after it, then it's a peak. (If > it's the same as the one before it, you have to do some special > processing, but nothing too complex -- just keep a flag around that > tells you whether the number before the repeating sequence was > smaller or was larger.) > > You would do one pair of nested 'for' loops for processing horizontally > (checking each row for horizontal "violations" of your rule that all > numbers should increase), and then another pair of nested 'for' loops > for processing vertically.
I just don't know how to handle variable length violations. Single- point violation is of course easy to detect. Just if (a[i-1]>a[i] && a[i+1]>a[i]) || (a[i-1]<a[i] && a[i+1]<a[i]) { decide a[i] is a violation; } First step, I'd like to find some algorithm that is working and reasonably efficient; Next step, hopefully there are some special techniques that can make it super fast...
> > Replacing with -1 rather than keeping a separate list of elements which > are peaks and valleys will probably make things more complicated, but > it's possible. > > Have you already considered this algorithm? Is it good enough for you > purposes, or do you need something more efficient? I think the reason > you didn't get much of an answer last time you asked the question is > that you didn't explain whether you already considered this algorithm. > This leaves people unsure what to suggest to you. > > Also, as long as we're on this subject, you may have some ambiguities to > iron out. For example, if you have this pattern: > > 5 6 5 > 8 7 8 > 5 6 5 > > Then the element with value 7 is both a peak and a valley. Is that > a problem? If not, good, but if so, your algorithm will have to take > that into account. >
7 is a violation for sure. That's exactly what I want to detect, and filter out later.
Luna Moon wrote:
> In a 2D array of double(floating point) numbers, > ideally it should have an ordering on two dimensions: > for left to right, the numbers should increase, > from bottom to up, the numbers should increase, > if in the middle of some row and some column, there are valleys and > peaks, which means that the ordering is violated, > > I want to detect these violations and set the peaks or valleys to some > predefined constants, say -1.
What do the numbers represent? Why do you want to detect these violations? If you explain your goal then perhaps people can give you more relevant advice. -- Regards, Martin Leese E-mail: please@see.Web.for.e-mail.INVALID Web: http://members.tripod.com/martin_leese/
In article <1188140618.263726.294090@o80g2000hse.googlegroups.com>,
Luna Moon  <lunamoonmoon@gmail.com> wrote:
>In a 2D array of double(floating point) numbers,
>ideally it should have an ordering on two dimensions: >for left to right, the numbers should increase, >from bottom to up, the numbers should increase,
>if in the middle of some row and some column, there are valleys and >peaks, which means that the ordering is violated,
>I want to detect these violations and set the peaks or valleys to some >predefined constants, say -1.
First you need to define precisely what a "peak" is. Suppose you have a perfectly ordered array, M. Now take any element on the "inside" of M, M(x,y), and replace that element with a value that is larger than M(x-1,y) and M(x,y+1) [assuming that the indices increase from the top-left to the bottom-right and that the first index is the row and the second index is the column -- Matlab and Fortran notation rather than C notation]. In the modified matrix, MM, is the location (x,y) a "peak"? With the (lack of) definition you gave, the answer is "NO, it is not". MM(x,y) obeys the rule that "for left to right, the numbers should increase" and obeys the rule that "from bottom to up, the numbers should increase". So MM(x,y) fits the rules, and it is the locations to the right and above it that violate the rules of increase. MM(x-1,y) and M(x,y+1) are thus in "valleys", one of which continues upwards until you find some MM(x-p,y) which is greater than MM(x,y), and the other of which continues rightward until you find some MM(x,y+q) which is greater than MM(x,y). *If* it turns out that MM(x,y) is greater than M(1,y) and greater than M(x,#columns) (i.e., at the outer edges of the matrix), then you might have a case for defining MM(x,y) as a peak, if you *define* the upper and right edges as being in the proper order. But the boundary edges might have obvious violations; you can adjust for this if you at least -define- the upper-right corner as being in the correct order. [If you do so define, then if you flip the perfectly ordered matrix M top to bottom and left to right, the result would be declared to be *all* peak except for the upper right corner...] There's another reason for being careful about -defining- the upper and right edges as being in the proper order. Take the perfectly ordered matrix, M, and take some upper-right subset of it and subtract some value from that subset, leaving a result similar to where a geological fault dropped an entire section. (You don't need to drop along a stright rectangular boundary for this discussion point to hold.) Now, is the region that was dropped a "valley", or is the region that did not drop a "plateau"? Before we can discuss efficient code or data structures for finding the peaks and valleys, you must give us an unambiguous definition of what a "peak" or "valley" is. How wide can a peak be? How wide can a valley be? What is the test to tell us whether the peak or valley has ended? Another test situation: take the perfectly ordered matrix M, and add a sloping ramp to it (you need a fixed width font for this diagram) 8 |7 | 6 6 | 5 | 3 2 The '|' are not datapoints; they just show the "fault-line". For this diagram, I would tend to say that 8, 7, 6 are all parts of a single peak. If, though, you were to -define- peaks as only in relationship to the immediately adjacent datapoints, then because the 7 is less than it's immediately adjacent 8 neighbour, 7 would be deemed to be part of a "valley", as would the first 6 and as would the 5. But if you look at the overall trend, the 5 is greater than the 3 that was before the peak, so arguably the 5 is already the restoration of the proper trend rather than being in a "valley" between the two 6's. Whatever definition you give us should be precise enough that we can -mechanically- decide such cases. Once you give us a mechanical definition of what is "peak" or "valley", then we can set about producing efficient code. I would suggest that you start with the one-dimensional case: if you have a generally-increasing vector, give us a specification to decide whether any particular point is a "peak" or "valley" within that vector. Once we have a one-dimensional test, we can apply it across the first (upper) row. Then as we examine the columns, we can look at the result of examining the first row; any datapoint that is "still there" in the first row acts as a reference point for the column, but for any column in which the first-row position was marked as outlier, we have to consider using the corresponding second row data point as the reference point.. providing that it wouldn't be zapped if you examine across the second row... (I can immediately see flaws in this approach too :( ) -- There are some ideas so wrong that only a very intelligent person could believe in them. -- George Orwell
On Aug 26, 4:17 pm, Martin Leese <ple...@see.Web.for.e-mail.INVALID>
wrote:
> Luna Moon wrote: > > In a 2D array of double(floating point) numbers, > > ideally it should have an ordering on two dimensions: > > for left to right, the numbers should increase, > > from bottom to up, the numbers should increase, > > if in the middle of some row and some column, there are valleys and > > peaks, which means that the ordering is violated, > > > I want to detect these violations and set the peaks or valleys to some > > predefined constants, say -1. > > What do the numbers represent? Why do you > want to detect these violations? If you > explain your goal then perhaps people can > give you more relevant advice. > > -- > Regards, > Martin Leese > E-mail: ple...@see.Web.for.e-mail.INVALID > Web:http://members.tripod.com/martin_leese/
These are results from numerical integrations and numerical ODEs. According to theory, there should be such an ordering. However, due to instability, we see often slight violation of orderings, much like noise added onto theoretical results. Since the surface is smooth, we thought it should be better to fix the violations by filtering/ smoothing/interpolating types of techniques.
On Aug 26, 4:35 pm, rober...@ibd.nrc-cnrc.gc.ca (Walter Roberson)
wrote:
> In article <1188140618.263726.294...@o80g2000hse.googlegroups.com>, > Luna Moon <lunamoonm...@gmail.com> wrote: > > >In a 2D array of double(floating point) numbers, > >ideally it should have an ordering on two dimensions: > >for left to right, the numbers should increase, > >from bottom to up, the numbers should increase, > >if in the middle of some row and some column, there are valleys and > >peaks, which means that the ordering is violated, > >I want to detect these violations and set the peaks or valleys to some > >predefined constants, say -1. > > First you need to define precisely what a "peak" is. > > Suppose you have a perfectly ordered array, M. Now take any element > on the "inside" of M, M(x,y), and replace that element with a value > that is larger than M(x-1,y) and M(x,y+1) [assuming that the > indices increase from the top-left to the bottom-right and that > the first index is the row and the second index is the column -- > Matlab and Fortran notation rather than C notation]. > > In the modified matrix, MM, is the location (x,y) a "peak"? With > the (lack of) definition you gave, the answer is "NO, it is not". > MM(x,y) obeys the rule that "for left to right, the numbers should > increase" and obeys the rule that "from bottom to up, the numbers > should increase". So MM(x,y) fits the rules, and it is > the locations to the right and above it that violate the rules > of increase. MM(x-1,y) and M(x,y+1) are thus in "valleys", > one of which continues upwards until you find some MM(x-p,y) > which is greater than MM(x,y), and the other of which continues > rightward until you find some MM(x,y+q) which is greater than MM(x,y). > > *If* it turns out that MM(x,y) is greater than M(1,y) and greater > than M(x,#columns) (i.e., at the outer edges of the matrix), then > you might have a case for defining MM(x,y) as a peak, if you > *define* the upper and right edges as being in the proper order. > But the boundary edges might have obvious violations; you can > adjust for this if you at least -define- the upper-right corner as > being in the correct order. [If you do so define, then if you flip > the perfectly ordered matrix M top to bottom and left to right, the > result would be declared to be *all* peak except for the upper > right corner...] > > There's another reason for being careful about -defining- the upper > and right edges as being in the proper order. Take the perfectly > ordered matrix, M, and take some upper-right subset of it and > subtract some value from that subset, leaving a result similar to > where a geological fault dropped an entire section. (You don't > need to drop along a stright rectangular boundary for this > discussion point to hold.) Now, is the region that was dropped > a "valley", or is the region that did not drop a "plateau"? > > Before we can discuss efficient code or data structures for > finding the peaks and valleys, you must give us an unambiguous > definition of what a "peak" or "valley" is. How wide can a peak > be? How wide can a valley be? What is the test to tell us whether > the peak or valley has ended? > > Another test situation: take the perfectly ordered matrix M, and > add a sloping ramp to it (you need a fixed width font for this diagram) > 8 > |7 > | 6 6 > | 5 > | > 3 > 2 > > The '|' are not datapoints; they just show the "fault-line". For > this diagram, I would tend to say that 8, 7, 6 are all parts of a > single peak. If, though, you were to -define- peaks as only in > relationship to the immediately adjacent datapoints, then because > the 7 is less than it's immediately adjacent 8 neighbour, 7 would > be deemed to be part of a "valley", as would the first 6 and as would > the 5. But if you look at the overall trend, the 5 is greater > than the 3 that was before the peak, so arguably the 5 is already > the restoration of the proper trend rather than being in a "valley" > between the two 6's. > > Whatever definition you give us should be precise enough that > we can -mechanically- decide such cases. Once you give us a > mechanical definition of what is "peak" or "valley", then we can > set about producing efficient code. > > I would suggest that you start with the one-dimensional case: if > you have a generally-increasing vector, give us a specification > to decide whether any particular point is a "peak" or "valley" > within that vector. > > Once we have a one-dimensional test, we can apply it across the > first (upper) row. Then as we examine the columns, we > can look at the result of examining the first row; any datapoint > that is "still there" in the first row acts as a reference point for > the column, but for any column in which the first-row position was > marked as outlier, we have to consider using the corresponding > second row data point as the reference point.. providing that > it wouldn't be zapped if you examine across the second row... > (I can immediately see flaws in this approach too :( ) > --
I thought the rules about the ordering is very clear? Probably we don't need to fuss about the concept of peaks or valleys... we just focus on the violations... I could not visualize what's wrong with my definition. Also no matter what font I used to display the faulty line, I could not see the pattern clearly... Could you please post more simple examples?

Luna Moon wrote:

> I thought the rules about the ordering is very clear? Probably we > don't need to fuss about the concept of peaks or valleys... we just > focus on the violations...
Yes but it isn't clear at all. How do you identify which number is a violator? In the sequence 1 3 8 7 9 Is the 7 or is the 8 the violator? What about 1 3 6 5 9 Is the violator the 6 or 5? And more important, in those sequences why would you choose to consider one a violator over the other? -jim
> > I could not visualize what's wrong with my definition.
You've only defined what no violation looks like.
> > Also no matter what font I used to display the faulty line, I could > not see the pattern clearly... > > Could you please post more simple examples?
----== Posted via Newsfeeds.Com - Unlimited-Unrestricted-Secure Usenet News==---- http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups ----= East and West-Coast Server Farms - Total Privacy via Encryption =----
On Aug 26, 7:24 pm, jim <"sjedgingN0sp"@m...@mwt.net> wrote:
> Luna Moon wrote: > > I thought the rules about the ordering is very clear? Probably we > > don't need to fuss about the concept of peaks or valleys... we just > > focus on the violations... > > Yes but it isn't clear at all. How do you identify which number is a > violator? > > In the sequence 1 3 8 7 9 Is the 7 or is the 8 the violator? >
Good questions! In this case, I have no idea. If I say both 8 and 7 are wrong, is that consistent now?
> What about 1 3 6 5 9 Is the violator the 6 or 5?
same as above...
> > You've only defined what no violation looks like. >
Because that's the ordering I want to maintain. Any kind of violation needs to be fixed...