sort

List overview All Threads
Download

newer

older

Strange?

Array.flatten

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

23 Apr 2003 23 Apr '03

2:10 p.m.

Pike's sort is not stable.

Show replies by date

Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) ＠ Pike (-) developers forum

23 Apr 23 Apr

2:15 p.m.

URL:http://pike.ida.liu.se/generated/manual/modref/ex/predef_3A_3A/sort.html

Note

The sorting algorithm used is not stable, ie elements that are equal may get reordered.

/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)

Previous text:

...

2003-04-23 16:09: Subject: sort

Pike's sort is not stable.

/ Johan Sundström (folkskådare)

Martin Baehr

2:18 p.m.

what effect could the reordering of equal elements have?

greetings, martin.

Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) ＠ Pike (-) developers forum

2:35 p.m.

Well, it could have the effect that elements which are equal are reordered.

/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)

Previous text:

...

2003-04-23 16:22: Subject: Re: sort

what effect could the reordering of equal elements have?

greetings, martin.

/ Brevbäraren

Martin Baehr

2:32 p.m.

yes, but how do you know they are being reordered, if they are equal? and what problems could that create?

greetings, martin.

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

2:40 p.m.

The shutdown procedure of your nuclear powerplant might be executed in an out-of-orderly fashion. All sorts of interesting things might end up happening or not happening. :-)

/ Johan Sundström (folkskådare)

Previous text:

...

2003-04-23 16:37: Subject: Re: sort

yes, but how do you know they are being reordered, if they are equal? and what problems could that create?

greetings, martin.

/ Brevbäraren

Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) ＠ Pike (-) developers forum

2:45 p.m.

If the procedure steps must be executed in any particular order, they shouldn't be equal. The whole point in sorting a procedure is of course to get a total ordering from a partial ordering describing the interdependenices between the steps, and if the partial order is incorrect then of course the total ordering will be incorrect as well.

/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)

Previous text:

...

2003-04-23 16:39: Subject: Re: sort

The shutdown procedure of your nuclear powerplant might be executed in an out-of-orderly fashion. All sorts of interesting things might end up happening or not happening. :-)

/ Johan Sundström (folkskådare)

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

2:45 p.m.

(It was an intentionally horrible example, for lack of a good one.) Applying some odd additional sorting criterion on a list of filenames might be a more likely activity where you would prefer sort stability.

/ Johan Sundström (folkskådare)

Previous text:

...

2003-04-23 16:41: Subject: Re: sort

If the procedure steps must be executed in any particular order, they shouldn't be equal. The whole point in sorting a procedure is of course to get a total ordering from a partial ordering describing the interdependenices between the steps, and if the partial order is incorrect then of course the total ordering will be incorrect as well.

/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)

Mirar ＠ Pike developers forum

3:25 p.m.

Think for instance of a user interface with a table, with a few columns of which the user could sort on. A traditional user interface uses stable sorts, so if the the user first sorts on column 1 and then on column 2, the result will be sorted on column 2, but if two elements are equal in that column the previous sort will make those element sort internally on column 1:

1 2 abc hej ghi hej def fii

...

1< 2

abc hej def fii {sort(column(table,1),table)} ghi hej

1 >2< abc hej ghi hej {sort(column(table,2),table)} def fii

and not

1 >2< ghi hej abc hej def fii

This can of course be passed by by remembering the previous sorts, and make a comparison function that uses those. But a stable sort would sometimes be much nicer.

/ Mirar

Previous text:

...

2003-04-23 16:37: Subject: Re: sort

yes, but how do you know they are being reordered, if they are equal? and what problems could that create?

greetings, martin.

/ Brevbäraren

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

2:40 p.m.

Equality is of course quite enough for basic types such as integers; the equality property being as strong as `==. When you are shuffling about an array of objects, reordering equal objects might render odd problems if you need stable sorting.

/ Johan Sundström (folkskådare)

Previous text:

...

2003-04-23 16:22: Subject: Re: sort

what effect could the reordering of equal elements have?

greetings, martin.

/ Brevbäraren

Martin Baehr

2:44 p.m.

how can two objects be different when they are equal? isn't == then comparing object references? and doesn't the equality of references mean that they are the same object?

greetings, martin.

Martin Nilsson (har bott i google) ＠ Pike (-) developers forum

3 p.m.

We can have several different statements that describes the same information: Statement("100:-") Statement("One hundred SEK") Statement("hundra spänn")

It would make sense to set these statements to be equal to each other. However, when we serialize these we don't know in which order they'll appear (if they are stored in an array that is sorted). While it doesn't matter in the application, we would get bigger diffs in the CVS where we store the serialization.

/ Martin Nilsson (har bott i google)

Previous text:

...

2003-04-23 16:48: Subject: Re: sort

how can two objects be different when they are equal? isn't == then comparing object references? and doesn't the equality of references mean that they are the same object?

greetings, martin.

/ Brevbäraren

Jonas Walld�n ＠ Pike developers forum

3 p.m.

Have you never ever written a program that sorts a list of names? Let's say you have a set of records like this:

({ ({ "Simpson", "Homer" }), ({ "Simpson", "Bart" }), ({ "Flanders", "Ned" }) })

If you want to sort on last name as the primary key and the first name as the secondary you may run two sorting passes where the first one only inspects the first name:

({ ({ "Simpson", "Bart" }), ({ "Simpson", "Homer" }), ({ "Flanders", "Ned" }) })

(since "B" <= "H" <= "N").

Finally, run a second sort pass, this time using the last name as the key. This will only return correct results if the sort algorithm is stable.

/ Jonas Walldén

Previous text:

...

2003-04-23 16:48: Subject: Re: sort

how can two objects be different when they are equal? isn't == then comparing object references? and doesn't the equality of references mean that they are the same object?

greetings, martin.

/ Brevbäraren

Martin Baehr

3 p.m.

duh, now i get it. it is not the order of the elements themselves, but the secondary order of dependent elements. thanks.

greetings, martin.

Jonas Walld�n ＠ Pike developers forum

3:15 p.m.

Well, ({ "Simpson", "Homer" }) _is_ one element so it is in fact the order of the elements since you don't want to mix first and last names for different people. On the other hand the sort comparison callback may (as in this case) look at sub-parts of the elements when deciding how to sort the elements.

/ Jonas Walldén

Previous text:

...

2003-04-23 17:05: Subject: Re: sort

duh, now i get it. it is not the order of the elements themselves, but the secondary order of dependent elements. thanks.

greetings, martin.

/ Brevbäraren

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

2:20 p.m.

I noticed too (post facto). Array.sort_array wasn't as well documented though, but shares the property.

/ Johan Sundström (folkskådare)

Previous text:

...

2003-04-23 16:12: Subject: sort

URL:http://pike.ida.liu.se/generated/manual/modref/ex/predef_3A_3A/sort.html

Note

The sorting algorithm used is not stable, ie elements that are equal may get reordered.

/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

2:20 p.m.

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

Previous text:

...

2003-04-23 16:15: Subject: sort

I noticed too (post facto). Array.sort_array wasn't as well documented though, but shares the property.

/ Johan Sundström (folkskådare)

Martin Stjernholm, Roxen IS ＠ Pike developers forum

6:35 p.m.

It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-23 16:18: Subject: sort

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

Mirar ＠ Pike developers forum

6:35 p.m.

Is there any situation where you want an optimized case of sorting long lists of many similar elements and doesn't care about being stable?

/ Mirar

Previous text:

...

2003-04-23 20:32: Subject: sort

It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

/ Martin Stjernholm, Roxen IS

Ludger Merkens

24 Apr 24 Apr

7:29 a.m.

On Wed, 23 Apr 2003, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:

...

It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

which algorithm are you actually using?

--- Ludger

...

/ Martin Stjernholm, Roxen IS

Previous text:

...
2003-04-23 16:18: Subject: sort

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

Martin Stjernholm, Roxen IS ＠ Pike developers forum

5:10 p.m.

I think it's quicksort, but knowing Hubbe there are probably a couple of extra quirks thrown into it. ;)

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-24 09:30: Subject: Re: sort

On Wed, 23 Apr 2003, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:

...
It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

which algorithm are you actually using?

--- Ludger

...
/ Martin Stjernholm, Roxen IS

Previous text:

...
2003-04-23 16:18: Subject: sort

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

/ Brevbäraren

Jonas Walld�n ＠ Pike developers forum

5:45 p.m.

I believe it's introsort which is an improvement of quicksort.

/ Jonas Walldén

Previous text:

...

2003-04-24 19:06: Subject: Re: sort

I think it's quicksort, but knowing Hubbe there are probably a couple of extra quirks thrown into it. ;)

/ Martin Stjernholm, Roxen IS

David Hedbor ＠ Pike developers forum

6:50 p.m.

It was quicksort until it was realized some years back that it had AWFUL performance if many entries were the same. :)

/ David Hedbor

Previous text:

...

2003-04-24 19:06: Subject: Re: sort

I think it's quicksort, but knowing Hubbe there are probably a couple of extra quirks thrown into it. ;)

/ Martin Stjernholm, Roxen IS

Fredrik (Naranek) Hubinette (Real Build Master) ＠ Pike (-) developers forum

25 Apr 25 Apr

1:10 a.m.

If I remember correctly, sort does not allocate any memory if you sort an array with only one reference.

Either way, I would be more conserned with how much extra time it takes than how much extra memory it takes.

/Hubbe

...

On Wed, 23 Apr 2003, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:

...
It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

which algorithm are you actually using?

--- Ludger

...
/ Martin Stjernholm, Roxen IS

Previous text:

...
2003-04-23 16:18: Subject: sort

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

/ Fredrik (Naranek) Hubinette (Real Build Master)

Previous text:

...

2003-04-24 09:30: Subject: Re: sort

On Wed, 23 Apr 2003, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:

...
It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

which algorithm are you actually using?

--- Ludger

...
/ Martin Stjernholm, Roxen IS

Previous text:

...
2003-04-23 16:18: Subject: sort

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

/ Brevbäraren

Ludger Merkens

7:55 a.m.

On Fri, 25 Apr 2003, Fredrik (Naranek) Hubinette (Real Build Master) @ Pike (-) developers forum wrote:

...

If I remember correctly, sort does not allocate any memory if you sort an array with only one reference.

That was the reason for my question. But quicksort doesn't take extra memory (at least in theory). On the other hand it is not stable, that votes pro quicksort beeing the current algorithm.

...

Either way, I would be more conserned with how much extra time it takes than how much extra memory it takes.

agreed. Yet there should exist a fast, stable, in situ algorithm suited as a general purpose sorting algorithm for pike. If that is not the case, it could be interesting to provide a second algorithm e.g. _sort that would be optimized for almost sorted arrays and stable.

--- Ludger

...

  /Hubbe
...
On Wed, 23 Apr 2003, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:

...
It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

which algorithm are you actually using?

--- Ludger

...
/ Martin Stjernholm, Roxen IS

Previous text:

...
2003-04-23 16:18: Subject: sort

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

/ Fredrik (Naranek) Hubinette (Real Build Master)

Previous text:

...
2003-04-24 09:30: Subject: Re: sort

On Wed, 23 Apr 2003, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:

...
It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

which algorithm are you actually using?

--- Ludger

...
/ Martin Stjernholm, Roxen IS

Previous text:

...
2003-04-23 16:18: Subject: sort

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

/ Brevbäraren

Martin Stjernholm, Roxen IS ＠ Pike developers forum

2:10 p.m.

Yes, you're almost right and I was mistaken. It only allocates extra memory when sort() is used with more than one argument. Number of references doesn't matter.

I believe the extra time would be small. An initialization pass through the position vector would be required, and swaps would have to swap in that one too. Comparisons in the position vector would only occur for elements that are equal.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-25 03:07: Subject: Re: sort

If I remember correctly, sort does not allocate any memory if you sort an array with only one reference.

Either way, I would be more conserned with how much extra time it takes than how much extra memory it takes.
 /Hubbe
...
On Wed, 23 Apr 2003, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:

...
It's very simple: Just remember the original position and sort on that when items are equal otherwise. That requires a little bit more memory so a separate stable version could be considered. But I don't think it's worth the trouble since sort() already temporarily allocates memory linear to the size of the sorted array.

which algorithm are you actually using?

--- Ludger

...
/ Martin Stjernholm, Roxen IS

Previous text:

...
2003-04-23 16:18: Subject: sort

Martin Stjernholm had somewhat interesting ideas regarding making any unstable sort algorithm stable with a more or less cheap wrapper. I'm hoping he might comment on the subject. :)

/ Johan Sundström (folkskådare)

/ Fredrik (Naranek) Hubinette (Real Build Master)

Fredrik (Naranek) Hubinette (Real Build Master) ＠ Pike (-) developers forum

3:45 p.m.

Actually, I don't think you'll need an extra vector, since sort already uses a [0..n[ vector to keep track of the sort order. (When not sorting destructively.)

So basically the overhead would only be: o extra compares when equal o possibly more swaps because of more defined order o destructive sorting when array has one ref is no longer possible (fairly rare anyways)

All that would be required is a minor fix to: array.c:internal_cmpfun and anoter minor one to builtin.c:f_sort

/ Fredrik (Naranek) Hubinette (Real Build Master)

Previous text:

...

2003-04-25 16:06: Subject: Re: sort

Yes, you're almost right and I was mistaken. It only allocates extra memory when sort() is used with more than one argument. Number of references doesn't matter.

I believe the extra time would be small. An initialization pass through the position vector would be required, and swaps would have to swap in that one too. Comparisons in the position vector would only occur for elements that are equal.

/ Martin Stjernholm, Roxen IS

Martin Stjernholm, Roxen IS ＠ Pike developers forum

26 Apr 26 Apr

12:15 a.m.

Hmm, I dug into the sorting stuff a bit and found that the sort rule for arrays is somewhat odd: It sorts recursively on the first element (only). That's a bit peculiar since it doesn't go on to compare the second if the first is equal etc, so it's still not a full recursive sort. Another thing is that it doesn't have any cyclic check, so the following causes sort() to hang:

array a = ({({0}), 0}); a[0][0] = a; a[1] = a; sort (a);

Questions: Why is sort() semi-recursive in this way? Should the hang above be fixed? (I think so.) If so, how?

Mappings and multisets are sorted on the memory address. If sort() is made stable I think it's better to change it so that they aren't reordered at all. Well, multisets could perhaps be sorted recursively on the first index in the same way as arrays.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-25 17:40: Subject: Re: sort

Actually, I don't think you'll need an extra vector, since sort already uses a [0..n[ vector to keep track of the sort order. (When not sorting destructively.)

So basically the overhead would only be: o extra compares when equal o possibly more swaps because of more defined order o destructive sorting when array has one ref is no longer possible (fairly rare anyways)

All that would be required is a minor fix to: array.c:internal_cmpfun and anoter minor one to builtin.c:f_sort

/ Fredrik (Naranek) Hubinette (Real Build Master)

Fredrik (Naranek) Hubinette (Real Build Master) ＠ Pike (-) developers forum

27 Apr 27 Apr

5:10 a.m.

Sorting on *all* elements of the arrays would probably be a lot more useful, and fixing the hang would obviously be a good thing.

Why it works that way? Because I thought it was a good idea at the time.

/ Fredrik (Naranek) Hubinette (Real Build Master)

Previous text:

...

2003-04-26 02:12: Subject: Re: sort

Hmm, I dug into the sorting stuff a bit and found that the sort rule for arrays is somewhat odd: It sorts recursively on the first element (only). That's a bit peculiar since it doesn't go on to compare the second if the first is equal etc, so it's still not a full recursive sort. Another thing is that it doesn't have any cyclic check, so the following causes sort() to hang:

array a = ({({0}), 0}); a[0][0] = a; a[1] = a; sort (a);

Questions: Why is sort() semi-recursive in this way? Should the hang above be fixed? (I think so.) If so, how?

Mappings and multisets are sorted on the memory address. If sort() is made stable I think it's better to change it so that they aren't reordered at all. Well, multisets could perhaps be sorted recursively on the first index in the same way as arrays.

/ Martin Stjernholm, Roxen IS

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

28 Apr 28 Apr

12:30 p.m.

Cyclic arrays cause Array.flatten() to hang as well. It's an intricate problem, if addressed.

/ Johan Sundström (folkskådare)

Previous text:

...

2003-04-26 02:12: Subject: Re: sort

Hmm, I dug into the sorting stuff a bit and found that the sort rule for arrays is somewhat odd: It sorts recursively on the first element (only). That's a bit peculiar since it doesn't go on to compare the second if the first is equal etc, so it's still not a full recursive sort. Another thing is that it doesn't have any cyclic check, so the following causes sort() to hang:

array a = ({({0}), 0}); a[0][0] = a; a[1] = a; sort (a);

Questions: Why is sort() semi-recursive in this way? Should the hang above be fixed? (I think so.) If so, how?

Mappings and multisets are sorted on the memory address. If sort() is made stable I think it's better to change it so that they aren't reordered at all. Well, multisets could perhaps be sorted recursively on the first index in the same way as arrays.

/ Martin Stjernholm, Roxen IS

Martin Stjernholm, Roxen IS ＠ Pike developers forum

26 Apr 26 Apr

1:50 p.m.

Yes, stable sorting comes practically for free when get_order is used.

There's more overhead in the most common case when it uses sort_array_destructively. If get_alpha_order/order_array is used instead it takes about twice the time. If a position vector is added to sort_array_destructively it gets about 15-20% slower. When objects are sorted this overhead is lost completely compared to calling lfuns.

So what do you all think? Is it worth it to make sort() stable? I think it is since not very much of the time in an average Pike program is spent sorting, so the overall impact would be very low. A third option is to make sort() stable and introduce an unstable_sort() for those occasions where it really matters.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-25 17:40: Subject: Re: sort

Actually, I don't think you'll need an extra vector, since sort already uses a [0..n[ vector to keep track of the sort order. (When not sorting destructively.)

So basically the overhead would only be: o extra compares when equal o possibly more swaps because of more defined order o destructive sorting when array has one ref is no longer possible (fairly rare anyways)

All that would be required is a minor fix to: array.c:internal_cmpfun and anoter minor one to builtin.c:f_sort

/ Fredrik (Naranek) Hubinette (Real Build Master)

Jonas Walld�n ＠ Pike developers forum

2:05 p.m.

And how about making it locale-independent at the same time?

/ Jonas Walldén

Previous text:

...

2003-04-26 15:48: Subject: Re: sort

Yes, stable sorting comes practically for free when get_order is used.

There's more overhead in the most common case when it uses sort_array_destructively. If get_alpha_order/order_array is used instead it takes about twice the time. If a position vector is added to sort_array_destructively it gets about 15-20% slower. When objects are sorted this overhead is lost completely compared to calling lfuns.

So what do you all think? Is it worth it to make sort() stable? I think it is since not very much of the time in an average Pike program is spent sorting, so the overall impact would be very low. A third option is to make sort() stable and introduce an unstable_sort() for those occasions where it really matters.

/ Martin Stjernholm, Roxen IS

Martin Nilsson (har bott i google) ＠ Pike (-) developers forum

2:05 p.m.

It is. It is not platformindependent though.

/ Martin Nilsson (har bott i google)

Previous text:

...

2003-04-26 16:01: Subject: Re: sort

And how about making it locale-independent at the same time?

/ Jonas Walldén

Jonas Walld�n ＠ Pike developers forum

2:10 p.m.

Starting in which version?

...

megalon:jonasw $ LC_COLLATE=sv_SE ./pike-sås/7.2/bin/pike Pike v7.2 release 364 running Hilfe v2.0 (Incremental Pike Frontend)

...
sort( ({ "å", "ä", "ö" }) );

Result: ({ /* 3 elements */ "\345", "\344", "\366" }) Terminal closed. megalon:jonasw $ LC_COLLATE= ./pike-sås/7.2/bin/pike Pike v7.2 release 364 running Hilfe v2.0 (Incremental Pike Frontend)

...
sort( ({ "å", "ä", "ö" }) );

Result: ({ /* 3 elements */ "\344", "\345", "\366" })

/ Jonas Walldén

Previous text:

...

2003-04-26 16:02: Subject: Re: sort

It is. It is not platformindependent though.

/ Martin Nilsson (har bott i google)

Martin Nilsson (har bott i google) ＠ Pike (-) developers forum

2:10 p.m.

Pike 7.5

/ Martin Nilsson (har bott i google)

Previous text:

...

2003-04-26 16:06: Subject: Re: sort

Starting in which version?

...
megalon:jonasw $ LC_COLLATE=sv_SE ./pike-sås/7.2/bin/pike Pike v7.2 release 364 running Hilfe v2.0 (Incremental Pike Frontend)

...
sort( ({ "å", "ä", "ö" }) );

Result: ({ /* 3 elements */ "\345", "\344", "\366" }) Terminal closed. megalon:jonasw $ LC_COLLATE= ./pike-sås/7.2/bin/pike Pike v7.2 release 364 running Hilfe v2.0 (Incremental Pike Frontend)

...
sort( ({ "å", "ä", "ö" }) );

Result: ({ /* 3 elements */ "\344", "\345", "\366" })

/ Jonas Walldén

Jonas Walld�n ＠ Pike developers forum

2:10 p.m.

Ok, any plans for back-porting this to stable releases?

/ Jonas Walldén

Previous text:

...

2003-04-26 16:08: Subject: Re: sort

Pike 7.5

/ Martin Nilsson (har bott i google)

Martin Stjernholm, Roxen IS ＠ Pike developers forum

2:15 p.m.

Well, although it's what people typically want it's clearly not compatible.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-26 16:09: Subject: Re: sort

Ok, any plans for back-porting this to stable releases?

/ Jonas Walldén

Peter Bortas ＠ Pike developers forum

2:20 p.m.

I consider it a bug, but perhaps one that people rely on...

/ Peter Bortas

Previous text:

...

2003-04-26 16:11: Subject: Re: sort

Well, although it's what people typically want it's clearly not compatible.

/ Martin Stjernholm, Roxen IS

Martin Nilsson (har bott i google) ＠ Pike (-) developers forum

2:20 p.m.

No. Unless we decide that it was a bug that sort was environment dependent, there needs to be some compatibility fixed. Locale.SWE.sort or something.

Besides, it's not really a great improvement to go from locale dependent to platform dependent (although a bit faster I could guess).

/ Martin Nilsson (har bott i google)

Previous text:

...

2003-04-26 16:09: Subject: Re: sort

Ok, any plans for back-porting this to stable releases?

/ Jonas Walldén

Martin Stjernholm, Roxen IS ＠ Pike developers forum

2:20 p.m.

But the latter is just a bug that's fairly easy to fix.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-26 16:15: Subject: Re: sort

No. Unless we decide that it was a bug that sort was environment dependent, there needs to be some compatibility fixed. Locale.SWE.sort or something.

Besides, it's not really a great improvement to go from locale dependent to platform dependent (although a bit faster I could guess).

/ Martin Nilsson (har bott i google)

Martin Stjernholm, Roxen IS ＠ Pike developers forum

2:10 p.m.

That has something to do with MEMCMP and signedness, right?

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-26 16:02: Subject: Re: sort

It is. It is not platformindependent though.

/ Martin Nilsson (har bott i google)

Martin Nilsson (har bott i google) ＠ Pike (-) developers forum

2:10 p.m.

Yep.

/ Martin Nilsson (har bott i google)

Previous text:

...

2003-04-26 16:06: Subject: Re: sort

That has something to do with MEMCMP and signedness, right?

/ Martin Stjernholm, Roxen IS

Martin Stjernholm, Roxen IS ＠ Pike developers forum

28 Apr 28 Apr

6:45 p.m.

I took a look in the configure script and port.c and there seems to be adequate measures to ensure that MEMCMP compares unsigned. On which platforms doesn't it work?

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-26 16:02: Subject: Re: sort

It is. It is not platformindependent though.

/ Martin Nilsson (har bott i google)

Martin Nilsson (har bott i google) ＠ Pike (-) developers forum

6:45 p.m.

Add a sort test to the testsuite and see...

/ Martin Nilsson (har bott i google)

Previous text:

...

2003-04-28 20:40: Subject: Re: sort

I took a look in the configure script and port.c and there seems to be adequate measures to ensure that MEMCMP compares unsigned. On which platforms doesn't it work?

/ Martin Stjernholm, Roxen IS

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

26 Apr 26 Apr

2:05 p.m.

+1.

/ Johan Sundström (folkskådare)

Previous text:

...

2003-04-26 16:01: Subject: Re: sort

And how about making it locale-independent at the same time?

/ Jonas Walldén

Peter Bortas ＠ Pike developers forum

2:10 p.m.

I'm always for killing locale-dependencies, but I didn't think it was.

/ Peter Bortas

Previous text:

...

2003-04-26 16:01: Subject: Re: sort

And how about making it locale-independent at the same time?

/ Jonas Walldén

Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum

2:10 p.m.

I find both suggestions good; either would be an improvement.

/ Johan Sundström (folkskådare)

Previous text:

...

2003-04-26 15:48: Subject: Re: sort

Yes, stable sorting comes practically for free when get_order is used.

There's more overhead in the most common case when it uses sort_array_destructively. If get_alpha_order/order_array is used instead it takes about twice the time. If a position vector is added to sort_array_destructively it gets about 15-20% slower. When objects are sorted this overhead is lost completely compared to calling lfuns.

So what do you all think? Is it worth it to make sort() stable? I think it is since not very much of the time in an average Pike program is spent sorting, so the overall impact would be very low. A third option is to make sort() stable and introduce an unstable_sort() for those occasions where it really matters.

/ Martin Stjernholm, Roxen IS

David Hedbor ＠ Pike developers forum

8 p.m.

Have you tried this with various "types" of arrays too, say arrays that have very many elements of the same type etc? Personally I don't see why we wouldn't want sort and stable_sort (or some other naming). I have _never_ missed a stable sort and thus don't think 15-20% performance hit is worth it. It gives me a hit for something _I_ don't want.

/ David Hedbor

Previous text:

...

2003-04-26 15:48: Subject: Re: sort

Yes, stable sorting comes practically for free when get_order is used.

There's more overhead in the most common case when it uses sort_array_destructively. If get_alpha_order/order_array is used instead it takes about twice the time. If a position vector is added to sort_array_destructively it gets about 15-20% slower. When objects are sorted this overhead is lost completely compared to calling lfuns.

So what do you all think? Is it worth it to make sort() stable? I think it is since not very much of the time in an average Pike program is spent sorting, so the overall impact would be very low. A third option is to make sort() stable and introduce an unstable_sort() for those occasions where it really matters.

/ Martin Stjernholm, Roxen IS

Martin Stjernholm, Roxen IS ＠ Pike developers forum

11:10 p.m.

I did try it with arrays with many elements of the same type. You can see the tests for yourself; they're in the benchmark module. Considering how the compare function treats different types I don't think there's a measurable difference for arrays containing different types. I used integers since they are among the fastest types to compare, which means the 15-20% figure is as bad as possible.

A reason to make the normal sort function stable is that it's less surprising if you aren't aware of the issue, so it can avoid bugs. There have been several occasions when unexpected or annoying behavior has surfaced later on because sort() is unstable.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-26 21:58: Subject: Re: sort

Have you tried this with various "types" of arrays too, say arrays that have very many elements of the same type etc? Personally I don't see why we wouldn't want sort and stable_sort (or some other naming). I have _never_ missed a stable sort and thus don't think 15-20% performance hit is worth it. It gives me a hit for something _I_ don't want.

/ David Hedbor

Fredrik (Naranek) Hubinette (Real Build Master) ＠ Pike (-) developers forum

27 Apr 27 Apr

6:15 a.m.

Possible optimization: Check the type field of the array and sort destructively if it doesn't contain any container types.

/ Fredrik (Naranek) Hubinette (Real Build Master)

Previous text:

...

2003-04-26 15:48: Subject: Re: sort

Yes, stable sorting comes practically for free when get_order is used.

There's more overhead in the most common case when it uses sort_array_destructively. If get_alpha_order/order_array is used instead it takes about twice the time. If a position vector is added to sort_array_destructively it gets about 15-20% slower. When objects are sorted this overhead is lost completely compared to calling lfuns.

So what do you all think? Is it worth it to make sort() stable? I think it is since not very much of the time in an average Pike program is spent sorting, so the overall impact would be very low. A third option is to make sort() stable and introduce an unstable_sort() for those occasions where it really matters.

/ Martin Stjernholm, Roxen IS

Martin Stjernholm, Roxen IS ＠ Pike developers forum

noon

Good suggestion. Then it won't be slower in most common cases, providing the type field is accurate. I wonder how often it is, though. I found that allocate(X, Y) didn't set it, for instance.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-27 08:14: Subject: Re: sort

Possible optimization: Check the type field of the array and sort destructively if it doesn't contain any container types.

/ Fredrik (Naranek) Hubinette (Real Build Master)

Fredrik (Naranek) Hubinette (Real Build Master) ＠ Pike (-) developers forum

28 Apr 28 Apr

4:20 a.m.

It's fairly random weather the type field is accurate or not. Improving the type field accuracy is always a good thing as it gives many minor speed improvements, so please go ahead and fix allocate()

/ Fredrik (Naranek) Hubinette (Real Build Master)

Previous text:

...

2003-04-27 13:56: Subject: Re: sort

Good suggestion. Then it won't be slower in most common cases, providing the type field is accurate. I wonder how often it is, though. I found that allocate(X, Y) didn't set it, for instance.

/ Martin Stjernholm, Roxen IS

Martin Nilsson (har bott i google) ＠ Pike (-) developers forum

10:20 a.m.

And that he did, with added bonus. Great work!

/ Martin Nilsson (har bott i google)

Previous text:

...

2003-04-28 06:17: Subject: Re: sort

It's fairly random weather the type field is accurate or not. Improving the type field accuracy is always a good thing as it gives many minor speed improvements, so please go ahead and fix allocate()

/ Fredrik (Naranek) Hubinette (Real Build Master)

Martin Stjernholm, Roxen IS ＠ Pike developers forum

5:45 p.m.

Yes, I went through all "allocate_array" in the core and in most modules so now it should be accurate most of the time. There are still places in src/modules/Perl/permod.c and src/modules/system/nt.c that I haven't bothered with.

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-28 12:17: Subject: Re: sort

And that he did, with added bonus. Great work!

/ Martin Nilsson (har bott i google)

Fredrik (Naranek) Hubinette (Real Build Master) ＠ Pike (-) developers forum

9:25 p.m.

I assume you take all the places that modify ITEM(array) directly into account? If the type field is missing types, it leads to some very strange errors...

/ Fredrik (Naranek) Hubinette (Real Build Master)

Previous text:

...

2003-04-28 19:43: Subject: Re: sort

Yes, I went through all "allocate_array" in the core and in most modules so now it should be accurate most of the time. There are still places in src/modules/Perl/permod.c and src/modules/system/nt.c that I haven't bothered with.

/ Martin Stjernholm, Roxen IS

Martin Stjernholm, Roxen IS ＠ Pike developers forum

10:05 p.m.

I'm aware of that. If I've missed such places they do it in "publicly stored" arrays which wouldn't safe anyway (or at least very precarious).

/ Martin Stjernholm, Roxen IS

Previous text:

...

2003-04-28 23:22: Subject: Re: sort

I assume you take all the places that modify ITEM(array) directly into account? If the type field is missing types, it leads to some very strange errors...

/ Fredrik (Naranek) Hubinette (Real Build Master)

8136

Age (days ago)

8141

Last active (days ago)

pike-devel@lists.lysator.liu.se

55 comments

11 participants

tags (0)

participants (11)

David Hedbor ＠ Pike developers forum
Fredrik (Naranek) Hubinette (Real Build Master) ＠ Pike (-) developers forum
Johan Sundstr�m (folksk�dare) ＠ Pike (-) developers forum
Jonas Walld�n ＠ Pike developers forum
Ludger Merkens
Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) ＠ Pike (-) developers forum
Martin Baehr
Martin Nilsson (har bott i google) ＠ Pike (-) developers forum
Martin Stjernholm, Roxen IS ＠ Pike developers forum
Mirar ＠ Pike developers forum
Peter Bortas ＠ Pike developers forum