The following sorting algorithm (written in Basic), which could be easily adapted to other kinds of data, is theoretically much faster than present algorithms. Marcel Luttgens 'Integers to sort: DATA 15,5,8,3,25,15,7,1,17,25,4,19,2,15,18,7,18 'Number of integers = 17 'Biggest integer = 25 DIM n(25) FOR i = 1 TO 17 READ a PRINT a; n(a) = n(a) + 1 NEXT i PRINT FOR i = 1 TO 25 FOR j = 1 TO n(i) IF n(i) <> 0 THEN PRINT i; NEXT j NEXT i
Fast sorting algorithm
Started by ●March 20, 2004
Reply by ●March 20, 20042004-03-20
On Sat, 20 Mar 2004 02:04:41 +0000, Marcel Luttgens wrote:> The following sorting algorithm (written in Basic), which could be > easily adapted to other kinds of data, is theoretically much faster than > present algorithms. > > Marcel Luttgens > > 'Integers to sort: > > DATA 15,5,8,3,25,15,7,1,17,25,4,19,2,15,18,7,18 > > 'Number of integers = 17 > 'Biggest integer = 25 > > DIM n(25) > > FOR i = 1 TO 17 > READ a > PRINT a; > n(a) = n(a) + 1 > NEXT i > PRINT > > FOR i = 1 TO 25 > FOR j = 1 TO n(i) > IF n(i) <> 0 THEN PRINT i; > NEXT j > NEXT iIf i see it correctly the cost of this algorithm doesn't really depend on the ammount you have to sort through, but rather on the largest Value pressent. If i were to store exponential data in the Array, the time cost would be exponential as well. If i store even over-exponential data, it gets even worse. If you can be ceratain about the nature of your data I think this can be used for optimization. Till -- Please add "Salt and Peper" to the subject line to bypass my spam filter
Reply by ●March 20, 20042004-03-20
> The following sorting algorithm (written in Basic), which could be easilyadapted> to other kinds of data, is theoretically much faster than presentalgorithms.> > Marcel Luttgens > > 'Integers to sort: > > DATA 15,5,8,3,25,15,7,1,17,25,4,19,2,15,18,7,18 > > 'Number of integers = 17 > 'Biggest integer = 25 > > DIM n(25) > > FOR i = 1 TO 17 > READ a > PRINT a; > n(a) = n(a) + 1 > NEXT i > PRINT > > FOR i = 1 TO 25 > FOR j = 1 TO n(i) > IF n(i) <> 0 THEN PRINT i; > NEXT j > NEXT iThis is a counter suitable for counting arrays of small integral values, not a sort, though it can be used as a sort for integral values where the number of integral values (m) is known. When used as such it is O (n * m). OK for sorting where and < n. Where m>n, as m and n become large, this rapidly becomes slower than the naive O(n*n) sorts such as bubble, insertion, selection and shell sorts and much slower than the O (nlog n) sorts such as quick heap and mergesorts. m rapidly becomes unmanageable for string data. It cannot be used for non integral data. Keep thinking, though. Cheers, Alf --- Outgoing mail is certified Virus Free. Checked by AVG anti-virus system (http://www.grisoft.com). Version: 6.0.601 / Virus Database: 382 - Release Date: 29/02/2004
Reply by ●March 20, 20042004-03-20
Marcel Luttgens wrote:> The following sorting algorithm (written in Basic), which could be easily adapted > to other kinds of data, is theoretically much faster than present algorithms.I don't know why you are posting this here (people around here don't take sorting very seriously, check a recent thread about sorting on DSPs :). You have hit on what is called pigeonhole search (the pigeonholes are your n(i)). It in fact runs in linear time. Its drawback is that you can only sort values taken from a discrete set. More here: http://en.wikipedia.org/wiki/Pigeonhole_sort Regards, Andor
Reply by ●March 20, 20042004-03-20
"Till Crueger" <TillFC@gmx.net> wrote in message news:<c3ha2o$vk2$1@f1node01.rhrz.uni-bonn.de>...> On Sat, 20 Mar 2004 02:04:41 +0000, Marcel Luttgens wrote: > > > The following sorting algorithm (written in Basic), which could be > > easily adapted to other kinds of data, is theoretically much faster than > > present algorithms. > > > > Marcel Luttgens > > > > 'Integers to sort: > > > > DATA 15,5,8,3,25,15,7,1,17,25,4,19,2,15,18,7,18 > > > > 'Number of integers = 17 > > 'Biggest integer = 25 > > > > DIM n(25) > > > > FOR i = 1 TO 17 > > READ a > > PRINT a; > > n(a) = n(a) + 1 > > NEXT i > > PRINT > > > > FOR i = 1 TO 25 > > FOR j = 1 TO n(i) > > IF n(i) <> 0 THEN PRINT i; > > NEXT j > > NEXT i > > > If i see it correctly the cost of this algorithm doesn't really depend on > the ammount you have to sort through, but rather on the largest Value > pressent. If i were to store exponential data in the Array, the time cost > would be exponential as well. If i store even over-exponential data, it > gets even worse. If you can be ceratain about the nature of your data I > think this can be used for optimization. > > TillYou are right, but this is a new approach, and the progam should take the nature of the data into account. It should be possible to emulate the data with some set of integers, sort them and then retrieve the original data. I only gave the principle. With this approach, the number of steps depends essentially on the number of data to be sorted, not on its square, which is the case of all known sorting algorithms (AFAIK). Marcel
Reply by ●March 20, 20042004-03-20
On Sat, 20 Mar 2004 06:59:13 +0000, Marcel Luttgens wrote:> You are right, but this is a new approach, and the progam should take > the nature of the data into account. > It should be possible to emulate the data with some set of integers, > sort them and then retrieve the original data.Hmm, The only way I can think of right now to emulate any data with integers is to sort it and number it. But for this you would already need another sorting algorithm. If you have any other way to represent for example strings throuch integers, so that the Alphabetic order is kept in the representation, that would really be a great way to sort through large sets of Data.> I only gave the principle. With this approach, the number of steps > depends essentially on the number of data to be sorted, not on its > square, which is the case of all known sorting algorithms (AFAIK).I think there are several sorting algorithms wich have a complexity x*log(x) but I might be mistaken there. TIll -- Please add "Salt and Peper" to the subject line to bypass my spam filter
Reply by ●March 20, 20042004-03-20
Till Crueger wrote:> > I think there are several sorting algorithms wich have a complexity > x*log(x) but I might be mistaken there. >Any data in a computer whose representation in bits has a fixed ordering of signifigance can be sorted in O(N) operations by a radix sort. Bob -- "Things should be described as simply as possible, but no simpler." A. Einstein
Reply by ●March 20, 20042004-03-20
On Sat, 20 Mar 2004 11:07:18 +0000, Bob Cain wrote:> Any data in a computer whose representation in bits has a fixed ordering > of signifigance can be sorted in O(N) operations by a radix sort.Thanks for pointing that one out. I didn't know radix sort yet. Do I see it correctly though, that the complexity is then also dependend on the length of the words, since you have to sort the array once for every digit? Till (OT added) -- Please add "Salt and Peper" to the subject line to bypass my spam filter
Reply by ●March 20, 20042004-03-20
Till Crueger wrote:> On Sat, 20 Mar 2004 11:07:18 +0000, Bob Cain wrote: > > >>Any data in a computer whose representation in bits has a fixed ordering >>of signifigance can be sorted in O(N) operations by a radix sort. > > > Thanks for pointing that one out. I didn't know radix sort yet. Do I see > it correctly though, that the complexity is then also dependend on the > length of the words, since you have to sort the array once for every > digit?Right. From which an argument can be made that it is still O(N*log(N)) but that doesn't change the fact that in real application it's O(N). The thing that's nice is that a comparison operation can be replaced by a bit test. All those with ones at the current bit position go to the top and all the zeros to the bottom. That's not as signifigant as it once was. What gets really interesting is to base a binary tree structure on this where each node of the tree represents a bit of the search argument to be tested. Luther Woodrum of IBM developed that data structure extensively in the '70s and '80s. A proper insertion algorithm always places a new entry on the backpath of a search done on the new argument and locates the place so that a scan always finds the leaves in lexicogographically sorted order. He called it a radix partitioned tree. Bob -- "Things should be described as simply as possible, but no simpler." A. Einstein
Reply by ●March 21, 20042004-03-21
Bob Cain wrote:> Till Crueger wrote: >> On Sat, 20 Mar 2004 11:07:18 +0000, Bob Cain wrote:>>> Any data in a computer whose representation in bits has a fixed ordering >>> of signifigance can be sorted in O(N) operations by a radix sort.>> Thanks for pointing that one out. I didn't know radix sort yet. Do I see >> it correctly though, that the complexity is then also dependend on the >> length of the words, since you have to sort the array once for every >> digit?> Right. From which an argument can be made that it is still O(N*log(N)) > but that doesn't change the fact that in real application it's O(N). > The thing that's nice is that a comparison operation can be replaced by > a bit test. All those with ones at the current bit position go to the > top and all the zeros to the bottom. That's not as signifigant as it > once was.It can also be done in bases other than two. IBM used to make card sorting machines that would read one column of a punched card and drop the card in the appropriate bin. The result is a decimal based radix sort. For alphanumeric data you need two passes per card column, one for the digits and one for the zones. -- glen






