I think that implies that the [] operation would return the index as the value if it exists. I.e. foo[0] would return 0, making it necessary to use zero_type. No, please regard the elements as indices and all values as 1. That makes [] and []= work with a minimal amount of special cases (only []= needs one), and besides it's compatible with the multiset data type.
Well I might, but just because you ask so politely ;-). Serious, you're right, I've been looking at java, but that is the Pike way.
Good point, but then it should perhaps throw an error instead? The insert isn't successful, afterall. I don't think it's much use being error tolerant in the insert operation since it's typically only used in ADTs that actually allows duplicate indices, otherwise []= works just as well. Perhaps the insert function shouldn't be defined at all in ADTs that don't allow duplicate indices.
No I think I keep it as it is, then you have the possibility to insert and replace if exist with '[]= and insert but do not replace with insert. And I don't see it as an error if a key already exist, and by returning the value belonging to the key you can make a choice if you want to replace the value or not.
(Besides, global functions for setting and getting the `< function is necessary in the new multisets, so these will probably be candidates for lfuns too.)
What is the name and syntax you have planned for them?
Then the iterator can/should "follow" its index to its new position and then continue stepping from there in a well defined way. The only case where the iterator could reasonably loose track is if the element it's "standing on" is removed. (An ADT could solve this either by refusing the remove, by doing a copy-on-write, or by documenting that the iterator more or less loose track in that case. The new multisets take the last route.)
With "if the there hasn't been any changes in the collection", I did not mean that the iterator would not work, I mean that the iterator will not traverse the elements in the same order.
Then a superclass to Collection would be necessary, and even then it probably would not work out well if Collection isn't designed with a few thoughts in that direction. Most of these issues are solved simply by making functions optional. (That's not really a cost, since a runtime check that a function exists before calling it is unavoidable anyway in Pike.)
The main goal I had in mind when creating the collection interfaces was that you could write an algorithm that take a collection and then later reuse it with any other collection. By declaring a lot of methods optional this will spoil the abstractness of the whole thing because then you has to know the implementation of the algorithm before you know if your collection will work anyway.
/ Peta, jo det är jag
Previous text:
2002-11-12 04:19: Subject: More about ADTs
There is nothing that prevents that _indices() should be added to all types that need it. But I did not made it a demand for being a collection since there is after all types that does not have indices that could be a collection.
You're right, e.g. a queue would probably have no indexing operation and hence no indexes. It should be optional, like _values().
About sets, I regard the content as both indices and values.
I think that implies that the [] operation would return the index as the value if it exists. I.e. foo[0] would return 0, making it necessary to use zero_type. No, please regard the elements as indices and all values as 1. That makes [] and []= work with a minimal amount of special cases (only []= needs one), and besides it's compatible with the multiset data type.
Are elements compared with equal() or `==? To be consistent it should be the first, but there clearly can be situations where one would rather do the latter.
/.../
Shouldn't it be `==, wouldn't it be hard to handle basic types as elements otherwise?
That's not a problem; equal() works like == on integers, floats and strings (which I presume you mean with "basic types"). Or did I misunderstand you?
Isn't it more useful if it replaces the value in the second case? Then a Map that doesn't allow multiple indices can simply let `[]= and insert be the same.
Don't think that an insert should do a replace.
Good point, but then it should perhaps throw an error instead? The insert isn't successful, afterall. I don't think it's much use being error tolerant in the insert operation since it's typically only used in ADTs that actually allows duplicate indices, otherwise []= works just as well. Perhaps the insert function shouldn't be defined at all in ADTs that don't allow duplicate indices.
How is the range defined? Is it possible to index from the end with negative integers? If so, how will that work together with unlimited Sequences?
If the end could not be found it should probably also throw an error. Another solution is not to allow negative integers.
My vote is then to throw an error; negative indices are useful as a convenience, although they do add slightly more work for the ADT implementation.
I had a get_comparator method but realised that it will just give another interface and not be necessary in a minimal base.
Is it really a good goal to be minimal? Then you should remove a lot of other functions too, even _get_iterator since there might be collections that have no use for iterators (e.g. queues are perfectly useful without such a thing).
I think it's much better to define these things centrally and optionally, so that if they exist they have a known name and behavior. Otherwise there's the risk that the function to set the comparison function in one ADT is named e.g. set_less and takes a `<, while it in another is called set_comparator and takes a three-way comparison function returning -1, 0 or 1.
(Besides, global functions for setting and getting the `< function is necessary in the new multisets, so these will probably be candidates for lfuns too.)
That won't work well with how the == operator is used. /.../
Yes it should be _equal()
Excellent solution, actually. Didn't think of that.
If there is a well defined order, I don't think the exception "if the there hasn't been any changes in the collection" should be necessary; it's possible to do better than that.
There could be collections where you can change the order of the indices.
Then the iterator can/should "follow" its index to its new position and then continue stepping from there in a well defined way. The only case where the iterator could reasonably loose track is if the element it's "standing on" is removed. (An ADT could solve this either by refusing the remove, by doing a copy-on-write, or by documenting that the iterator more or less loose track in that case. The new multisets take the last route.)
Question do you think I should keep the '[] at all since a for example a linked list will be O(n) and there will probably be others. After all the only reason there is '[] is to resemble the existing types delete, insert and replace will work just as well.
Any other name for it would be confusing in the cases the operation does exist, since the idiomatic name is `[] and nothing else. With this reasoning, a linked list would simply not have any `[].
But drawing lines to Lisp, that would perhaps be unfortunate if one really wants to use such things in vector-like ways. Maybe it's better to soften that efficiency argument to: "If an `[] exists, it should be the fastest available lookup method." That'd still be enough to have some sort of definition to distinguish between the index and value sets, which is the key point here.
Finally a few words about unlimited/undefined sizes. They are undoubtedly cause a lot of problems and since one of the goals with the structure is to make it possible to use the collections and its sub-concepts in algorithms and most algorithms are quite useless on unlimited and undefined sizes. Therefore I think it's best to not let collections be of a unlimited/undefined size.
Then a superclass to Collection would be necessary, and even then it probably would not work out well if Collection isn't designed with a few thoughts in that direction. Most of these issues are solved simply by making functions optional. (That's not really a cost, since a runtime check that a function exists before calling it is unavoidable anyway in Pike.)
Also, at least collections of unknown size aren't that esoteric, just think of a queue that works between processes as well as threads. Such a thing might use a pipe with some system dependent buffer size. It can also reasonably have other properties like not being able to be iterated over, there's no `[]=, etc.
/ Martin Stjernholm, Roxen IS