I've made som changes and here is a new suggestion for a common interface of the new ADTs.
Common API for new ADTs in Pike
1. Introduction This document describes a common platform for new ADTs in pike to stand on. The purpose of the platform is to make the new ADTs as flexible as possible and possible to use together with generic algorithms. The base of the platform is the Collection concept, being the base concepts of all the ADTs, and the Iterator concept making it possible to iterate over collections without knowing the underlying structure. Besides the two base concepts this document also describes a few sub-concepts that are related to the two base concepts according to the two figures below.
Figur 1 Collection concepts
Collection |-----------> Ordered Collection | | | |--------> Sequence | |--------> Ordered Set <------| | |--------> Ordered Map <--| | |----> Map --------------------------------------| | |----> Set ------------------------------------------|
Figur 2 Interface concepts
Iterator -------> Ordered Iterator
2. Collection concepts
2.1. Collection Collection is the base concept of all the ADTs. The reason it is called collection is that all the ADTs in the library is in some way a collection of items. To implement a collection the following methods must be implemented
iterator _get_iterator() Return an iterator for this ADT positioned at the first item in the ADT.
int _sizeof() Return the current size of the ADT. Return -1 if the size could not be decided or is truly unlimited.
array _values() Return an array with the values in the ADT.
int _has_value(mixed value) Returns 1 if value is in the value domain of collection, or 0 (zero) if not found
int _equal(mixed coll) Decide if the collection coll and this collection contains the same elements, an if it is important, in the same order.
string _sprintf(int conversion_type, mapping(string:int)|void params) Present the collection in a human readable way. For more information see _sprintf in the pike manual.
int is_empty() Return 0 if the ADT do has elements.
int max_size() Return the maximal size of the ADT if there is any, otherwise -1.
void clear() Clear the ADT.
iterator first() Return an iterator for this ADT positioned at the first item in the ADT, same as _get_iterator().
iterator last() Return an iterator for this ADT positioned at the last item in the ADT.
2.2. Set Set is a collection containing any number of items but only one of each item. It is a sub-concept of Collection and must thereby besides implementing the methods below also implements the methods in Collections.
int '[](mixed item) Return 1 if the item could be found in the set, 0 if it couldn't
int '[]=(mixed item, int exist) Add the item to the set if exist is true remove the item if exist is false.
iterator get_iterator(mixed item) Get an iterator for this ADT positioned at the item or null if the item could not be found
int insert(mixed item) Add the item to the set if it isn't a part of the set. Return true if the item was added, else false.
mixed delete(mixed item) Remove the item from the set. If the item was removed it will be returned otherwise a 0 zero_type is returned.
multiset cast("multiset") The set should at least be able to cast to a multiset, additional castings are optional.
2.3. Map A map is a data structure that allows you to associate a key with one or more values and then quickly retrieve those values using the key. To implement a Map the ADT must have all the methods in this concept and in the Collection concept.
mixed '[](mixed key) Return the item associated with the key if it could be found, 0 if it couldn't
mixed '[]=(mixed key, mixed item) If there is an item with the key key then the associated item will be overwritten with the supplied item. If there is no key in the map a new key-item pair will be added.
iterator get_iterator(mixed key) Get an iterator for this ADT positioned at the key or null if the key could not be found.
mixed insert(mixed key, mixed item) If the key doesn't exist or duplicates are allowed, associate the value with the key and return 0, otherwise don't modify the map and return the current value associated with the key.
mixed delete(mixed key) Remove the pair with the key key. Return the keys value if it was removed, if the key could not be removed return zero_type.
mapping cast("mapping") The map should at leas be able to cast to a mapping, additional castings are optional.
2.4. OrderedCollection An OrderedCollection is a collection that has indices that could be used to find the values and the indices are strictly ordered is some way. Any type could be used as an index if it satisfies a few rules. See Index for more information. There is also nothing that prevents the use of the values as indices as long as the value satisfies the rules for an Index.
int _has_index(int index) Returns 1 if index is in the index domain of collection, or 0 (zero) if not found
array _indices() Return an array with the indices of the collection.
2.5. Sequence A sequence is a sub-concept of an Ordered Collection. It is a linear collection that allows index-based access and automatically expands to accommodate new elements. A vector is a good example.
mixed '[](int ind) Return the item at index ind in the sequence. Throw an error if ind is out of range.
mixed '[]=(int ind, mixed item) Set the index ind to be item. Throw an error if ind is out of range
int _search(mixed item, void|int start) Return the index of the first item in the sequence with the index start or higher that match the item item. -1 if no object was found. Throw an error if start is out of range.
void insert(index ind, mixed item) Insert an item in the sequence at the index ind. The size of the sequence is increased by one and the items at index and above have their indices increases by one. Throw an error if ind is out of range.
int(0..1) delete(mixed item) Remove the first value that are equal to item from the sequence. Return 1 if a item was removed, otherwise 0;
void add(mixed item) Add the object item as fast as possible to the sequence.
int delete_at(int ind) Remove the item at the index ind. The size of the sequence is decreased by one and the items above ind have their indices decreases by one. Return 0 if the removal failed. Throw an error if ind is out of range.
iterator get_iterator(int ind) Get an iterator for this ADT positioned at the index ind. Throw an error if ind is out of range.
array cast("array") The sequence should at least be able to cast to an array, additional castings are optional.
2.6. OrderedMap An OrderedMap is a map where the keys are orderable and the iterator belonging to an OrderedMap will traverse the values in that order.
2.7. OrderedSet An OrderedSet is a set were the values are orderable and the iterator belonging to an OrderedSet will traverse the values in that order.
3. Iterator concepts
3.1. Iterator The iterator concept does not differ much from the present day iterator. Only a few new methods are added to make it possible to step backwards and compare iterators.
int(0 1) '!() Should return 0 (zero) when not at end of collection, and 1 at end of collection
void '+=(int n) Should advance the specified number of steps
iterator '-=(int n) Move the iterator n steps backwards
int '==(iterator iter) Return true if the two iterators is pointing to the same object. Throw an error if the two iterators belong to different collections.
mixed index() Should return the current index.
mixed value() Should return the current value
int has_previous() Return true if the iterator can move one step backwards.
int has_next() Return true if the iterator can move one step forwards. It have the same use as '!().
collection get_collection() Return the collection this iterator iterates over.
Optional void set_value(mixed value) Set the value of this iterator. This method should only be available if it is possible to set the value via an iterator.
Optional void add(mixed value) Add a new item to the collection after the item the iterator currently is pointing at. This method should only be available if it is possible to add entries via the iterator
Optional mixed remove() Remove the item the iterator currently is pointing at and move to the next item. This method should only be available if it is possible to remove entries via the iterator
3.2. OrderedIterator An OrderedIterator is an iterator working on an OrderedCollection. It has two characteristics, first it traverse the collection in the same order each time it is traversed, if the there hasn't been any changes in the collection, beginning with the item with the lowest index and ending with the item with the highest index. Secondly it is possibly to say that an iterator is less then or greater then another iterator.
int(0..1) '<(OrderedIterator iter) Return 1 if iter is positioned after this iterator, otherwise 0. Throw an error if the two iterators belong to different collections.
Int(0..1) '>(OrderedIterator iter) Return 1 if iter is positioned before this iterator otherwise 0. Throw an error if the two iterators belong to different collections.
4. Index If a type could be used as an index in an OrderedCollection then it must be possible to use the following operations on it.
'== If the two indexes a and b is equal it should return 1 otherwise 0. The following rules must also be satisfied. a == a must be true a == b implies b == a !(a == b) implies a != b a == b and b == c implies a == c
'< If the index a is ordered lower than b it should return 1 otherwise 0. The following rules must also be satisfied. a < a must be false a < b implies !(b < a) a < b implies (b > a) a < b and b < c implies a < c
'> If the index a is ordered higher than b it should return 1 otherwise 0. The following rules must also be satisfied. a > a must be false a > b implies !(b > a) a > b implies b < a a > b and b > c implies a > c
Requiring a cast("array") for Sets is probably also sane, especially where a cast to multiset is more expensive due to the need of imposing an order.
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Previous text:
2002-11-07 13:08: Subject: More about ADTs
I've made som changes and here is a new suggestion for a common interface of the new ADTs.
Common API for new ADTs in Pike
- Introduction
This document describes a common platform for new ADTs in pike to stand on. The purpose of the platform is to make the new ADTs as flexible as possible and possible to use together with generic algorithms. The base of the platform is the Collection concept, being the base concepts of all the ADTs, and the Iterator concept making it possible to iterate over collections without knowing the underlying structure. Besides the two base concepts this document also describes a few sub-concepts that are related to the two base concepts according to the two figures below.
Figur 1 Collection concepts
Collection |-----------> Ordered Collection | | | |--------> Sequence | |--------> Ordered Set <------| | |--------> Ordered Map <--| | |----> Map --------------------------------------| | |----> Set ------------------------------------------|
Figur 2 Interface concepts
Iterator -------> Ordered Iterator
- Collection concepts
2.1. Collection Collection is the base concept of all the ADTs. The reason it is called collection is that all the ADTs in the library is in some way a collection of items. To implement a collection the following methods must be implemented
iterator _get_iterator() Return an iterator for this ADT positioned at the first item in the ADT.
int _sizeof() Return the current size of the ADT. Return -1 if the size could not be decided or is truly unlimited.
array _values() Return an array with the values in the ADT.
int _has_value(mixed value) Returns 1 if value is in the value domain of collection, or 0 (zero) if not found
int _equal(mixed coll) Decide if the collection coll and this collection contains the same elements, an if it is important, in the same order.
string _sprintf(int conversion_type, mapping(string:int)|void params) Present the collection in a human readable way. For more information see _sprintf in the pike manual.
int is_empty() Return 0 if the ADT do has elements.
int max_size() Return the maximal size of the ADT if there is any, otherwise -1.
void clear() Clear the ADT.
iterator first() Return an iterator for this ADT positioned at the first item in the ADT, same as _get_iterator().
iterator last() Return an iterator for this ADT positioned at the last item in the ADT.
2.2. Set Set is a collection containing any number of items but only one of each item. It is a sub-concept of Collection and must thereby besides implementing the methods below also implements the methods in Collections.
int '[](mixed item) Return 1 if the item could be found in the set, 0 if it couldn't
int '[]=(mixed item, int exist) Add the item to the set if exist is true remove the item if exist is false.
iterator get_iterator(mixed item) Get an iterator for this ADT positioned at the item or null if the item could not be found
int insert(mixed item) Add the item to the set if it isn't a part of the set. Return true if the item was added, else false.
mixed delete(mixed item) Remove the item from the set. If the item was removed it will be returned otherwise a 0 zero_type is returned.
multiset cast("multiset") The set should at least be able to cast to a multiset, additional castings are optional.
2.3. Map A map is a data structure that allows you to associate a key with one or more values and then quickly retrieve those values using the key. To implement a Map the ADT must have all the methods in this concept and in the Collection concept.
mixed '[](mixed key) Return the item associated with the key if it could be found, 0 if it couldn't
mixed '[]=(mixed key, mixed item) If there is an item with the key key then the associated item will be overwritten with the supplied item. If there is no key in the map a new key-item pair will be added.
iterator get_iterator(mixed key) Get an iterator for this ADT positioned at the key or null if the key could not be found.
mixed insert(mixed key, mixed item) If the key doesn't exist or duplicates are allowed, associate the value with the key and return 0, otherwise don't modify the map and return the current value associated with the key.
mixed delete(mixed key) Remove the pair with the key key. Return the keys value if it was removed, if the key could not be removed return zero_type.
mapping cast("mapping") The map should at leas be able to cast to a mapping, additional castings are optional.
2.4. OrderedCollection An OrderedCollection is a collection that has indices that could be used to find the values and the indices are strictly ordered is some way. Any type could be used as an index if it satisfies a few rules. See Index for more information. There is also nothing that prevents the use of the values as indices as long as the value satisfies the rules for an Index.
int _has_index(int index) Returns 1 if index is in the index domain of collection, or 0 (zero) if not found
array _indices() Return an array with the indices of the collection.
2.5. Sequence A sequence is a sub-concept of an Ordered Collection. It is a linear collection that allows index-based access and automatically expands to accommodate new elements. A vector is a good example.
mixed '[](int ind) Return the item at index ind in the sequence. Throw an error if ind is out of range.
mixed '[]=(int ind, mixed item) Set the index ind to be item. Throw an error if ind is out of range
int _search(mixed item, void|int start) Return the index of the first item in the sequence with the index start or higher that match the item item. -1 if no object was found. Throw an error if start is out of range.
void insert(index ind, mixed item) Insert an item in the sequence at the index ind. The size of the sequence is increased by one and the items at index and above have their indices increases by one. Throw an error if ind is out of range.
int(0..1) delete(mixed item) Remove the first value that are equal to item from the sequence. Return 1 if a item was removed, otherwise 0;
void add(mixed item) Add the object item as fast as possible to the sequence.
int delete_at(int ind) Remove the item at the index ind. The size of the sequence is decreased by one and the items above ind have their indices decreases by one. Return 0 if the removal failed. Throw an error if ind is out of range.
iterator get_iterator(int ind) Get an iterator for this ADT positioned at the index ind. Throw an error if ind is out of range.
array cast("array") The sequence should at least be able to cast to an array, additional castings are optional.
2.6. OrderedMap An OrderedMap is a map where the keys are orderable and the iterator belonging to an OrderedMap will traverse the values in that order.
2.7. OrderedSet An OrderedSet is a set were the values are orderable and the iterator belonging to an OrderedSet will traverse the values in that order.
- Iterator concepts
3.1. Iterator The iterator concept does not differ much from the present day iterator. Only a few new methods are added to make it possible to step backwards and compare iterators.
int(0 1) '!() Should return 0 (zero) when not at end of collection, and 1 at end of collection
void '+=(int n) Should advance the specified number of steps
iterator '-=(int n) Move the iterator n steps backwards
int '==(iterator iter) Return true if the two iterators is pointing to the same object. Throw an error if the two iterators belong to different collections.
mixed index() Should return the current index.
mixed value() Should return the current value
int has_previous() Return true if the iterator can move one step backwards.
int has_next() Return true if the iterator can move one step forwards. It have the same use as '!().
collection get_collection() Return the collection this iterator iterates over.
Optional void set_value(mixed value) Set the value of this iterator. This method should only be available if it is possible to set the value via an iterator.
Optional void add(mixed value) Add a new item to the collection after the item the iterator currently is pointing at. This method should only be available if it is possible to add entries via the iterator
Optional mixed remove() Remove the item the iterator currently is pointing at and move to the next item. This method should only be available if it is possible to remove entries via the iterator
3.2. OrderedIterator An OrderedIterator is an iterator working on an OrderedCollection. It has two characteristics, first it traverse the collection in the same order each time it is traversed, if the there hasn't been any changes in the collection, beginning with the item with the lowest index and ending with the item with the highest index. Secondly it is possibly to say that an iterator is less then or greater then another iterator.
int(0..1) '<(OrderedIterator iter) Return 1 if iter is positioned after this iterator, otherwise 0. Throw an error if the two iterators belong to different collections.
Int(0..1) '>(OrderedIterator iter) Return 1 if iter is positioned before this iterator otherwise 0. Throw an error if the two iterators belong to different collections.
- Index
If a type could be used as an index in an OrderedCollection then it must be possible to use the following operations on it.
'== If the two indexes a and b is equal it should return 1 otherwise 0. The following rules must also be satisfied. a == a must be true a == b implies b == a !(a == b) implies a != b a == b and b == c implies a == c
'< If the index a is ordered lower than b it should return 1 otherwise 0. The following rules must also be satisfied. a < a must be false a < b implies !(b < a) a < b implies (b > a) a < b and b < c implies a < c
'> If the index a is ordered higher than b it should return 1 otherwise 0. The following rules must also be satisfied. a > a must be false a > b implies !(b > a) a > b implies b < a a > b and b > c implies a > c
/ Peta, jo det är jag
int _sizeof() Return the current size of the ADT. Return -1 if the size could not be decided or is truly unlimited.
Is there any way to tell the last two cases apart? If not, do you consider it useless to be able to do that?
array _values() Return an array with the values in the ADT.
I take the presence of this in Collection but not _indices() as a sign that e.g. the Set class would return its contents with this function. That doesn't work very well with the current multiset implementation. If it's possible to do lookup on a value with `[], then it should (by definition) be part of the index set and not the value set. (The same applies to _has_value.)
Also, this function has to be optional, or else it's not possible to have unlimited collections.
int _equal(mixed coll) Decide if the collection coll and this collection contains the same elements, an if it is important, in the same order.
Are elements compared with equal() or `==? To be consistent it should be the first, but there clearly can be situations where one would rather do the latter.
This function should also be optional. It might be possible to define for some unlimited collections, but far from all of them.
iterator last() Return an iterator for this ADT positioned at the last item in the ADT.
Note that this function can't be implemented for unlimited collections, so it must be optional.
2.2. Set
/.../
iterator get_iterator(mixed item) Get an iterator for this ADT positioned at the item or null if the item could not be found
I assume you mean _get_iterator. The argument must be optional to make the interface compatible with Collection.
int insert(mixed item)
/.../
mixed delete(mixed item)
/.../
These should probably be lfuns, since they need to be implemented for the new multisets too. Therefore it would probably be better if they followed the convention with an initial "_" as the other lfuns.
Actually most, if not all, functions in Collection and inheriting classes will likely become lfuns sooner or later so this remark applies in general.
Perhaps some explanation of the lfun design might be in order: They are used by various global functions and builtin language constructs to let objects behave like builtin data types. Wherever they are used, it's only tested whether an identifier with that name exists in the object, and if it does then it's used under the assumption that it actually carries out the intended operation; no check is done that the object inherits some interface that declares the function.
In other words, the existence of the identifier itself is interpreted as if the object adheres to an interface. This use of identifiers is something that makes object orientation in Pike in general radically different from e.g. C++ and Java. The advantage is that it's very convenient to implement only those parts of an interface that are necessary to get something to work; to accomplish the same design in e.g. C++ you'd have to have an abstract class for almost every function and then inherit a bunch of classes for the functions you want to implement. The disadvantage is that the identifier namespace in classes is essentially global in this regard.
I think that the advantage outweights the disadvantage when it comes to core interfaces like collections. The namespace issue is then solved either by the use of the backquote operator syntax, or by the "_" prefix. (The notable exception is "cast", which I consider an oversight. "create" and "destroy" might also be considered exceptions, but they are otoh part of the interface to objects themselves and not how they are used to emulate other data types.)
Note that this argumentation does not apply to iterators since they are explicitly requested and therefore can't be confused with anything else, as opposed to e.g. equal() which can be fed (and must work on) any object.
As for your reasons for not having "_" in front of insert and delete:
- There is currently no lfun insert and delete.
That's only an argument insofar that there's not any existing interface to adhere to (the same applies to _search and _has_value, for that matter). I.e. it's not an argument against using "_", only one that it doesn't have to be used.
- I belive that it should be possible to remove and insert without using and lfun
Of course. All lfuns can be called as ordinary functions too.
- If there will be a insert and delete lfun then it should not be difficult to use the current insert and delete methods.
Well, the inconsistent behavior of delete() in Sequence (see below) currently makes that a bit hard, but I hope you'll change that. I don't see how this is an argument either for or against the underline convention.
My argument for using "_" is that if these functions are made into lfuns (which very likely will happen) then they would collide with existing functions that doesn't adhere to this interface (just do a grep through lib/modules to see a couple of examples of that).
multiset cast("multiset") The set should at least be able to cast to a multiset, /.../
Is there a particular reason for that? That means that an unlimited collection can't be a Set. Besides, I can't see how this operation is a necessary part of something being a set, abstractly speaking. I suggest you make it an optional but recommended operation instead.
2.3. Map A map is a data structure that allows you to associate a key with one or more values /.../
Afaics this means that the property unique vs duplicate indices is connected with the property of whether values exist or not. Why?
mixed '[](mixed key) Return the item associated with the key if it could be found, 0 if it couldn't
0 ought to be UNDEFINED instead (i.e. a 0 for which zero_type returns 1).
mixed '[]=(mixed key, mixed item) If there is an item with the key key then the associated item will be overwritten with the supplied item. If there is no key in the map a new key-item pair will be added.
The return value is item?
iterator get_iterator(mixed key) Get an iterator for this ADT positioned at the key or null if the key could not be found.
This doesn't mention the case when a Map contains duplicate indices. (Also applies to the other functions that uses a key here, for that matter.)
mixed insert(mixed key, mixed item) If the key doesn't exist or duplicates are allowed, associate the value with the key and return 0, otherwise don't modify the map and return the current value associated with the key.
Isn't it more useful if it replaces the value in the second case? Then a Map that doesn't allow multiple indices can simply let `[]= and insert be the same.
mixed delete(mixed key) Remove the pair with the key key. Return the keys value if it was removed, if the key could not be removed return zero_type.
You probably mean UNDEFINED lastly. (Yes, I'm being picky but in my experience interface docs better be correct down to the letter.)
2.4. OrderedCollection An OrderedCollection is a collection that has indices that could be used to find the values and the indices are strictly ordered is some way. /.../ int _has_index(int index)
/.../
array _indices()
/.../
This does not make sense to me. For me, and for the primitive data types in Pike, an index is something that can be looked up (in an operation that is at least faster than O(n), otherwise I'd call it "searched for"). That has nothing to do with order between indices.
2.5. Sequence A sequence is a sub-concept of an Ordered Collection. It is a linear collection that allows index-based access and automatically expands to accommodate new elements. A vector is a good example.
Odd definition. Isn't the key concept that it has a well defined order and that the order is _arbitrary_, i.e. that each element has an order wrt to every other element and that order is specified only by how the sequence is created and not on any sort of comparison function?
mixed '[](int ind) Return the item at index ind in the sequence. Throw an error if ind is out of range.
How is the range defined? Is it possible to index from the end with negative integers? If so, how will that work together with unlimited Sequences?
int _search(mixed item, void|int start) Return the index of the first item in the sequence with the index start or higher that match the item item.
Why is this not already in Collection? Note that a (well defined) order isn't necessary to be able to do this; see how search() works on mappings, for example. (Ideally one would always use an iterator to do a search through the values, but then it could not be used with the current search() function, so I agree that an lfun like this one is necessary.)
[Return] -1 if no object was found.
This doesn't work if this function is extended to Maps (or OrderedMaps). UNDEFINED is a better return value to get a consistent interface with those.
int(0..1) delete(mixed item) Remove the first value that are equal to item from the sequence. Return 1 if a item was removed, otherwise 0;
This is inconsistent with delete() in Set and Map, which takes an index. Now it takes a value all of a sudden. I don't think this operation is necessary at all; then there could just as well be a variant of insert() that searches for a value and inserts another one before/after it.
int delete_at(int ind) Remove the item at the index ind. /.../
This is the one that ought to be named delete(), I think.
Return 0 if the removal failed.
Why would it fail, given that the index is in range?
2.6. OrderedMap
/.../
2.7. OrderedSet
/.../
What about being able to set the order in these?
I miss functions to get an iterator based on inequalities, i.e. functions that returns an iterator pointing to the closest element less/greater (or equal) to a given key.
I also miss _search, but otoh I think that one should be present already in Collection.
3.1. Iterator
/.../
int '==(iterator iter) /.../ Throw an error if the two iterators belong to different collections.
That won't work well with how the == operator is used. It can be called between arbitrary things e.g. in the master, inside mappings etc. It must be able to cope with any argument; it's not even limited to objects.
This is actually a generic problem with the current use of `==, `< and `>. One would like to have some sort of domain restriction of them, but there's currently no such thing.
I miss operations to be able to search in either direction, e.g. a search_eq_forward which steps the iterator forward until a value that is equal with the argument according to `==. A more general variant might be a function that takes a predicate function to do the matching. It could also be useful with iterator variants of map() and filter(), i.e. map_forward, filter_backward etc.
3.2. OrderedIterator
An OrderedIterator is an iterator working on an OrderedCollection. It has two characteristics, first it traverse the collection in the same order each time it is traversed, if the there hasn't been any changes in the collection, beginning with the item with the lowest index and ending with the item with the highest index. Secondly it is possibly to say that an iterator is less then or greater then another iterator.
With this definition a mapping would be an ordered collection, and also almost every collection implementation I can think of that can be iterated over at all.
It's more interesting to differentiate between collections with and without a well defined order. (Actually, most of the time when I've said "ordered" in this conference, I really ment "with well defined order".) I can only see two ways to create a well defined order: Either through keeping the arbitrary order created by the user operations (i.e. like in Sequence), or by having a comparison function.
If there is a well defined order, I don't think the exception "if the there hasn't been any changes in the collection" should be necessary; it's possible to do better than that.
Also note that there are several occasions where an iterator over a collection without a well defined order can have meaningful `< and `> operations, but the exception above then becomes essential. See the discussion starting in 9265430.
int(0..1) '<(OrderedIterator iter) Return 1 if iter is positioned after this iterator, otherwise 0. Throw an error if the two iterators belong to different collections.
This exception handling might be a problem in the same way as in `==, but it's not equally severe here. The best way to solve it would probably be to make a typed exception that the caller can intercept and discard when appropriate.
- Index
/.../
a < b implies !(b < a)
But !(b < a) does not imply a < b, I hope, since that'd rule out partial orders. Another relation I hope doesn't hold is:
!(a < b) && !(b < a) implies a == b
Or alternatively:
a != b implies a < b || b < a
Well, that was quite a lot of comments. Generally speaking, I think you should especially reconsider your designation of the index and value sets; see for instance how has_index() and has_value() operates on arrays and multisets. I think a good definition for them is built around the indexing operation:
o If an `[] operator exists, it's able to check if the given key exists in the collection and, in that case, find the corresponding element. That operation must have a time complexity strictly less than O(the size of the collection).
o The key given to `[] is matched within the set of _indices_ of the elements in the collection.
o The result returned by `[] comes from the set of _values_ of the elements in the collection.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-07 13:08: Subject: More about ADTs
I've made som changes and here is a new suggestion for a common interface of the new ADTs.
Common API for new ADTs in Pike
- Introduction
This document describes a common platform for new ADTs in pike to stand on. The purpose of the platform is to make the new ADTs as flexible as possible and possible to use together with generic algorithms. The base of the platform is the Collection concept, being the base concepts of all the ADTs, and the Iterator concept making it possible to iterate over collections without knowing the underlying structure. Besides the two base concepts this document also describes a few sub-concepts that are related to the two base concepts according to the two figures below.
Figur 1 Collection concepts
Collection |-----------> Ordered Collection | | | |--------> Sequence | |--------> Ordered Set <------| | |--------> Ordered Map <--| | |----> Map --------------------------------------| | |----> Set ------------------------------------------|
Figur 2 Interface concepts
Iterator -------> Ordered Iterator
- Collection concepts
2.1. Collection Collection is the base concept of all the ADTs. The reason it is called collection is that all the ADTs in the library is in some way a collection of items. To implement a collection the following methods must be implemented
iterator _get_iterator() Return an iterator for this ADT positioned at the first item in the ADT.
int _sizeof() Return the current size of the ADT. Return -1 if the size could not be decided or is truly unlimited.
array _values() Return an array with the values in the ADT.
int _has_value(mixed value) Returns 1 if value is in the value domain of collection, or 0 (zero) if not found
int _equal(mixed coll) Decide if the collection coll and this collection contains the same elements, an if it is important, in the same order.
string _sprintf(int conversion_type, mapping(string:int)|void params) Present the collection in a human readable way. For more information see _sprintf in the pike manual.
int is_empty() Return 0 if the ADT do has elements.
int max_size() Return the maximal size of the ADT if there is any, otherwise -1.
void clear() Clear the ADT.
iterator first() Return an iterator for this ADT positioned at the first item in the ADT, same as _get_iterator().
iterator last() Return an iterator for this ADT positioned at the last item in the ADT.
2.2. Set Set is a collection containing any number of items but only one of each item. It is a sub-concept of Collection and must thereby besides implementing the methods below also implements the methods in Collections.
int '[](mixed item) Return 1 if the item could be found in the set, 0 if it couldn't
int '[]=(mixed item, int exist) Add the item to the set if exist is true remove the item if exist is false.
iterator get_iterator(mixed item) Get an iterator for this ADT positioned at the item or null if the item could not be found
int insert(mixed item) Add the item to the set if it isn't a part of the set. Return true if the item was added, else false.
mixed delete(mixed item) Remove the item from the set. If the item was removed it will be returned otherwise a 0 zero_type is returned.
multiset cast("multiset") The set should at least be able to cast to a multiset, additional castings are optional.
2.3. Map A map is a data structure that allows you to associate a key with one or more values and then quickly retrieve those values using the key. To implement a Map the ADT must have all the methods in this concept and in the Collection concept.
mixed '[](mixed key) Return the item associated with the key if it could be found, 0 if it couldn't
mixed '[]=(mixed key, mixed item) If there is an item with the key key then the associated item will be overwritten with the supplied item. If there is no key in the map a new key-item pair will be added.
iterator get_iterator(mixed key) Get an iterator for this ADT positioned at the key or null if the key could not be found.
mixed insert(mixed key, mixed item) If the key doesn't exist or duplicates are allowed, associate the value with the key and return 0, otherwise don't modify the map and return the current value associated with the key.
mixed delete(mixed key) Remove the pair with the key key. Return the keys value if it was removed, if the key could not be removed return zero_type.
mapping cast("mapping") The map should at leas be able to cast to a mapping, additional castings are optional.
2.4. OrderedCollection An OrderedCollection is a collection that has indices that could be used to find the values and the indices are strictly ordered is some way. Any type could be used as an index if it satisfies a few rules. See Index for more information. There is also nothing that prevents the use of the values as indices as long as the value satisfies the rules for an Index.
int _has_index(int index) Returns 1 if index is in the index domain of collection, or 0 (zero) if not found
array _indices() Return an array with the indices of the collection.
2.5. Sequence A sequence is a sub-concept of an Ordered Collection. It is a linear collection that allows index-based access and automatically expands to accommodate new elements. A vector is a good example.
mixed '[](int ind) Return the item at index ind in the sequence. Throw an error if ind is out of range.
mixed '[]=(int ind, mixed item) Set the index ind to be item. Throw an error if ind is out of range
int _search(mixed item, void|int start) Return the index of the first item in the sequence with the index start or higher that match the item item. -1 if no object was found. Throw an error if start is out of range.
void insert(index ind, mixed item) Insert an item in the sequence at the index ind. The size of the sequence is increased by one and the items at index and above have their indices increases by one. Throw an error if ind is out of range.
int(0..1) delete(mixed item) Remove the first value that are equal to item from the sequence. Return 1 if a item was removed, otherwise 0;
void add(mixed item) Add the object item as fast as possible to the sequence.
int delete_at(int ind) Remove the item at the index ind. The size of the sequence is decreased by one and the items above ind have their indices decreases by one. Return 0 if the removal failed. Throw an error if ind is out of range.
iterator get_iterator(int ind) Get an iterator for this ADT positioned at the index ind. Throw an error if ind is out of range.
array cast("array") The sequence should at least be able to cast to an array, additional castings are optional.
2.6. OrderedMap An OrderedMap is a map where the keys are orderable and the iterator belonging to an OrderedMap will traverse the values in that order.
2.7. OrderedSet An OrderedSet is a set were the values are orderable and the iterator belonging to an OrderedSet will traverse the values in that order.
- Iterator concepts
3.1. Iterator The iterator concept does not differ much from the present day iterator. Only a few new methods are added to make it possible to step backwards and compare iterators.
int(0 1) '!() Should return 0 (zero) when not at end of collection, and 1 at end of collection
void '+=(int n) Should advance the specified number of steps
iterator '-=(int n) Move the iterator n steps backwards
int '==(iterator iter) Return true if the two iterators is pointing to the same object. Throw an error if the two iterators belong to different collections.
mixed index() Should return the current index.
mixed value() Should return the current value
int has_previous() Return true if the iterator can move one step backwards.
int has_next() Return true if the iterator can move one step forwards. It have the same use as '!().
collection get_collection() Return the collection this iterator iterates over.
Optional void set_value(mixed value) Set the value of this iterator. This method should only be available if it is possible to set the value via an iterator.
Optional void add(mixed value) Add a new item to the collection after the item the iterator currently is pointing at. This method should only be available if it is possible to add entries via the iterator
Optional mixed remove() Remove the item the iterator currently is pointing at and move to the next item. This method should only be available if it is possible to remove entries via the iterator
3.2. OrderedIterator An OrderedIterator is an iterator working on an OrderedCollection. It has two characteristics, first it traverse the collection in the same order each time it is traversed, if the there hasn't been any changes in the collection, beginning with the item with the lowest index and ending with the item with the highest index. Secondly it is possibly to say that an iterator is less then or greater then another iterator.
int(0..1) '<(OrderedIterator iter) Return 1 if iter is positioned after this iterator, otherwise 0. Throw an error if the two iterators belong to different collections.
Int(0..1) '>(OrderedIterator iter) Return 1 if iter is positioned before this iterator otherwise 0. Throw an error if the two iterators belong to different collections.
- Index
If a type could be used as an index in an OrderedCollection then it must be possible to use the following operations on it.
'== If the two indexes a and b is equal it should return 1 otherwise 0. The following rules must also be satisfied. a == a must be true a == b implies b == a !(a == b) implies a != b a == b and b == c implies a == c
'< If the index a is ordered lower than b it should return 1 otherwise 0. The following rules must also be satisfied. a < a must be false a < b implies !(b < a) a < b implies (b > a) a < b and b < c implies a < c
'> If the index a is ordered higher than b it should return 1 otherwise 0. The following rules must also be satisfied. a > a must be false a > b implies !(b > a) a > b implies b < a a > b and b > c implies a > c
/ Peta, jo det är jag
I take the presence of this in Collection but not _indices() as a sign that e.g. the Set class would return its contents with this function. That doesn't work very well with the current multiset implementation. If it's possible to do lookup on a value with `[], then it should (by definition) be part of the index set and not the value set. (The same applies to _has_value.)
There is nothing that prevents that _indices() should be added to all types that need it. But I did not made it a demand for being a collection since there is after all types that does not have indices that could be a collection. About sets, I regard the content as both indices and values.
Are elements compared with equal() or `==? To be consistent it should be the first, but there clearly can be situations where one would rather do the latter.
This function should also be optional. It might be possible to define for some unlimited collections, but far from all of them.
Shouldn't it be `==, wouldn't it be hard to handle basic types as elements otherwise?
I assume you mean _get_iterator. The argument must be optional to make the interface compatible with Collection.
No I did mean get_iterator since the _get_iterator() didn't take a position, but making it optional argument instead could be a good solution.
Isn't it more useful if it replaces the value in the second case? Then a Map that doesn't allow multiple indices can simply let `[]= and insert be the same.
Don't think that an insert should do a replace.
How is the range defined? Is it possible to index from the end with negative integers? If so, how will that work together with unlimited Sequences?
If the end could not be found it should probably also throw an error. Another solution is not to allow negative integers.
What about being able to set the order in these?
I had a get_comparator method but realised that it will just give another interface and not be necessary in a minimal base.
That won't work well with how the == operator is used. It can be called between arbitrary things e.g. in the master, inside mappings etc. It must be able to cope with any argument; it's not even limited to objects.
This is actually a generic problem with the current use of `==, `< and `>. One would like to have some sort of domain restriction of them, but there's currently no such thing.
Yes it should be _equal()
With this definition a mapping would be an ordered collection, and also almost every collection implementation I can think of that can be iterated over at all.
Sorry been reading in the definition of 'strictly ordered' from OrderedCollection when I've been reading this.
If there is a well defined order, I don't think the exception "if the there hasn't been any changes in the collection" should be necessary; it's possible to do better than that.
There could be collections where you can change the order of the indices.
o If an `[] operator exists, it's able to check if the given key exists in the collection and, in that case, find the corresponding element. That operation must have a time complexity strictly less than O(the size of the collection).
Question do you think I should keep the '[] at all since a for example a linked list will be O(n) and there will probably be others. After all the only reason there is '[] is to resemble the existing types delete, insert and replace will work just as well.
Finally a few words about unlimited/undefined sizes. They are undoubtedly cause a lot of problems and since one of the goals with the structure is to make it possible to use the collections and its sub-concepts in algorithms and most algorithms are quite useless on unlimited and undefined sizes. Therefore I think it's best to not let collections be of a unlimited/undefined size.
/ Peta, jo det är jag
Previous text:
2002-11-09 02:09: Subject: More about ADTs
int _sizeof() Return the current size of the ADT. Return -1 if the size could not be decided or is truly unlimited.
Is there any way to tell the last two cases apart? If not, do you consider it useless to be able to do that?
array _values() Return an array with the values in the ADT.
I take the presence of this in Collection but not _indices() as a sign that e.g. the Set class would return its contents with this function. That doesn't work very well with the current multiset implementation. If it's possible to do lookup on a value with `[], then it should (by definition) be part of the index set and not the value set. (The same applies to _has_value.)
Also, this function has to be optional, or else it's not possible to have unlimited collections.
int _equal(mixed coll) Decide if the collection coll and this collection contains the same elements, an if it is important, in the same order.
Are elements compared with equal() or `==? To be consistent it should be the first, but there clearly can be situations where one would rather do the latter.
This function should also be optional. It might be possible to define for some unlimited collections, but far from all of them.
iterator last() Return an iterator for this ADT positioned at the last item in the ADT.
Note that this function can't be implemented for unlimited collections, so it must be optional.
2.2. Set
/.../
iterator get_iterator(mixed item) Get an iterator for this ADT positioned at the item or null if the item could not be found
I assume you mean _get_iterator. The argument must be optional to make the interface compatible with Collection.
int insert(mixed item)
/.../
mixed delete(mixed item)
/.../
These should probably be lfuns, since they need to be implemented for the new multisets too. Therefore it would probably be better if they followed the convention with an initial "_" as the other lfuns.
Actually most, if not all, functions in Collection and inheriting classes will likely become lfuns sooner or later so this remark applies in general.
Perhaps some explanation of the lfun design might be in order: They are used by various global functions and builtin language constructs to let objects behave like builtin data types. Wherever they are used, it's only tested whether an identifier with that name exists in the object, and if it does then it's used under the assumption that it actually carries out the intended operation; no check is done that the object inherits some interface that declares the function.
In other words, the existence of the identifier itself is interpreted as if the object adheres to an interface. This use of identifiers is something that makes object orientation in Pike in general radically different from e.g. C++ and Java. The advantage is that it's very convenient to implement only those parts of an interface that are necessary to get something to work; to accomplish the same design in e.g. C++ you'd have to have an abstract class for almost every function and then inherit a bunch of classes for the functions you want to implement. The disadvantage is that the identifier namespace in classes is essentially global in this regard.
I think that the advantage outweights the disadvantage when it comes to core interfaces like collections. The namespace issue is then solved either by the use of the backquote operator syntax, or by the "_" prefix. (The notable exception is "cast", which I consider an oversight. "create" and "destroy" might also be considered exceptions, but they are otoh part of the interface to objects themselves and not how they are used to emulate other data types.)
Note that this argumentation does not apply to iterators since they are explicitly requested and therefore can't be confused with anything else, as opposed to e.g. equal() which can be fed (and must work on) any object.
As for your reasons for not having "_" in front of insert and delete:
- There is currently no lfun insert and delete.
That's only an argument insofar that there's not any existing interface to adhere to (the same applies to _search and _has_value, for that matter). I.e. it's not an argument against using "_", only one that it doesn't have to be used.
- I belive that it should be possible to remove and insert without using and lfun
Of course. All lfuns can be called as ordinary functions too.
- If there will be a insert and delete lfun then it should not be difficult to use the current insert and delete methods.
Well, the inconsistent behavior of delete() in Sequence (see below) currently makes that a bit hard, but I hope you'll change that. I don't see how this is an argument either for or against the underline convention.
My argument for using "_" is that if these functions are made into lfuns (which very likely will happen) then they would collide with existing functions that doesn't adhere to this interface (just do a grep through lib/modules to see a couple of examples of that).
multiset cast("multiset") The set should at least be able to cast to a multiset, /.../
Is there a particular reason for that? That means that an unlimited collection can't be a Set. Besides, I can't see how this operation is a necessary part of something being a set, abstractly speaking. I suggest you make it an optional but recommended operation instead.
2.3. Map A map is a data structure that allows you to associate a key with one or more values /.../
Afaics this means that the property unique vs duplicate indices is connected with the property of whether values exist or not. Why?
mixed '[](mixed key) Return the item associated with the key if it could be found, 0 if it couldn't
0 ought to be UNDEFINED instead (i.e. a 0 for which zero_type returns 1).
mixed '[]=(mixed key, mixed item) If there is an item with the key key then the associated item will be overwritten with the supplied item. If there is no key in the map a new key-item pair will be added.
The return value is item?
iterator get_iterator(mixed key) Get an iterator for this ADT positioned at the key or null if the key could not be found.
This doesn't mention the case when a Map contains duplicate indices. (Also applies to the other functions that uses a key here, for that matter.)
mixed insert(mixed key, mixed item) If the key doesn't exist or duplicates are allowed, associate the value with the key and return 0, otherwise don't modify the map and return the current value associated with the key.
Isn't it more useful if it replaces the value in the second case? Then a Map that doesn't allow multiple indices can simply let `[]= and insert be the same.
mixed delete(mixed key) Remove the pair with the key key. Return the keys value if it was removed, if the key could not be removed return zero_type.
You probably mean UNDEFINED lastly. (Yes, I'm being picky but in my experience interface docs better be correct down to the letter.)
2.4. OrderedCollection An OrderedCollection is a collection that has indices that could be used to find the values and the indices are strictly ordered is some way. /.../ int _has_index(int index)
/.../
array _indices()
/.../
This does not make sense to me. For me, and for the primitive data types in Pike, an index is something that can be looked up (in an operation that is at least faster than O(n), otherwise I'd call it "searched for"). That has nothing to do with order between indices.
2.5. Sequence A sequence is a sub-concept of an Ordered Collection. It is a linear collection that allows index-based access and automatically expands to accommodate new elements. A vector is a good example.
Odd definition. Isn't the key concept that it has a well defined order and that the order is _arbitrary_, i.e. that each element has an order wrt to every other element and that order is specified only by how the sequence is created and not on any sort of comparison function?
mixed '[](int ind) Return the item at index ind in the sequence. Throw an error if ind is out of range.
How is the range defined? Is it possible to index from the end with negative integers? If so, how will that work together with unlimited Sequences?
int _search(mixed item, void|int start) Return the index of the first item in the sequence with the index start or higher that match the item item.
Why is this not already in Collection? Note that a (well defined) order isn't necessary to be able to do this; see how search() works on mappings, for example. (Ideally one would always use an iterator to do a search through the values, but then it could not be used with the current search() function, so I agree that an lfun like this one is necessary.)
[Return] -1 if no object was found.
This doesn't work if this function is extended to Maps (or OrderedMaps). UNDEFINED is a better return value to get a consistent interface with those.
int(0..1) delete(mixed item) Remove the first value that are equal to item from the sequence. Return 1 if a item was removed, otherwise 0;
This is inconsistent with delete() in Set and Map, which takes an index. Now it takes a value all of a sudden. I don't think this operation is necessary at all; then there could just as well be a variant of insert() that searches for a value and inserts another one before/after it.
int delete_at(int ind) Remove the item at the index ind. /.../
This is the one that ought to be named delete(), I think.
Return 0 if the removal failed.
Why would it fail, given that the index is in range?
2.6. OrderedMap
/.../
2.7. OrderedSet
/.../
What about being able to set the order in these?
I miss functions to get an iterator based on inequalities, i.e. functions that returns an iterator pointing to the closest element less/greater (or equal) to a given key.
I also miss _search, but otoh I think that one should be present already in Collection.
3.1. Iterator
/.../
int '==(iterator iter) /.../ Throw an error if the two iterators belong to different collections.
That won't work well with how the == operator is used. It can be called between arbitrary things e.g. in the master, inside mappings etc. It must be able to cope with any argument; it's not even limited to objects.
This is actually a generic problem with the current use of `==, `< and `>. One would like to have some sort of domain restriction of them, but there's currently no such thing.
I miss operations to be able to search in either direction, e.g. a search_eq_forward which steps the iterator forward until a value that is equal with the argument according to `==. A more general variant might be a function that takes a predicate function to do the matching. It could also be useful with iterator variants of map() and filter(), i.e. map_forward, filter_backward etc.
3.2. OrderedIterator
An OrderedIterator is an iterator working on an OrderedCollection. It has two characteristics, first it traverse the collection in the same order each time it is traversed, if the there hasn't been any changes in the collection, beginning with the item with the lowest index and ending with the item with the highest index. Secondly it is possibly to say that an iterator is less then or greater then another iterator.
With this definition a mapping would be an ordered collection, and also almost every collection implementation I can think of that can be iterated over at all.
It's more interesting to differentiate between collections with and without a well defined order. (Actually, most of the time when I've said "ordered" in this conference, I really ment "with well defined order".) I can only see two ways to create a well defined order: Either through keeping the arbitrary order created by the user operations (i.e. like in Sequence), or by having a comparison function.
If there is a well defined order, I don't think the exception "if the there hasn't been any changes in the collection" should be necessary; it's possible to do better than that.
Also note that there are several occasions where an iterator over a collection without a well defined order can have meaningful `< and `> operations, but the exception above then becomes essential. See the discussion starting in 9265430.
int(0..1) '<(OrderedIterator iter) Return 1 if iter is positioned after this iterator, otherwise 0. Throw an error if the two iterators belong to different collections.
This exception handling might be a problem in the same way as in `==, but it's not equally severe here. The best way to solve it would probably be to make a typed exception that the caller can intercept and discard when appropriate.
- Index
/.../
a < b implies !(b < a)
But !(b < a) does not imply a < b, I hope, since that'd rule out partial orders. Another relation I hope doesn't hold is:
!(a < b) && !(b < a) implies a == b
Or alternatively:
a != b implies a < b || b < a
Well, that was quite a lot of comments. Generally speaking, I think you should especially reconsider your designation of the index and value sets; see for instance how has_index() and has_value() operates on arrays and multisets. I think a good definition for them is built around the indexing operation:
o If an `[] operator exists, it's able to check if the given key exists in the collection and, in that case, find the corresponding element. That operation must have a time complexity strictly less than O(the size of the collection).
o The key given to `[] is matched within the set of _indices_ of the elements in the collection.
o The result returned by `[] comes from the set of _values_ of the elements in the collection.
/ Martin Stjernholm, Roxen IS
There is nothing that prevents that _indices() should be added to all types that need it. But I did not made it a demand for being a collection since there is after all types that does not have indices that could be a collection.
You're right, e.g. a queue would probably have no indexing operation and hence no indexes. It should be optional, like _values().
About sets, I regard the content as both indices and values.
I think that implies that the [] operation would return the index as the value if it exists. I.e. foo[0] would return 0, making it necessary to use zero_type. No, please regard the elements as indices and all values as 1. That makes [] and []= work with a minimal amount of special cases (only []= needs one), and besides it's compatible with the multiset data type.
Are elements compared with equal() or `==? To be consistent it should be the first, but there clearly can be situations where one would rather do the latter.
/.../
Shouldn't it be `==, wouldn't it be hard to handle basic types as elements otherwise?
That's not a problem; equal() works like == on integers, floats and strings (which I presume you mean with "basic types"). Or did I misunderstand you?
Isn't it more useful if it replaces the value in the second case? Then a Map that doesn't allow multiple indices can simply let `[]= and insert be the same.
Don't think that an insert should do a replace.
Good point, but then it should perhaps throw an error instead? The insert isn't successful, afterall. I don't think it's much use being error tolerant in the insert operation since it's typically only used in ADTs that actually allows duplicate indices, otherwise []= works just as well. Perhaps the insert function shouldn't be defined at all in ADTs that don't allow duplicate indices.
How is the range defined? Is it possible to index from the end with negative integers? If so, how will that work together with unlimited Sequences?
If the end could not be found it should probably also throw an error. Another solution is not to allow negative integers.
My vote is then to throw an error; negative indices are useful as a convenience, although they do add slightly more work for the ADT implementation.
I had a get_comparator method but realised that it will just give another interface and not be necessary in a minimal base.
Is it really a good goal to be minimal? Then you should remove a lot of other functions too, even _get_iterator since there might be collections that have no use for iterators (e.g. queues are perfectly useful without such a thing).
I think it's much better to define these things centrally and optionally, so that if they exist they have a known name and behavior. Otherwise there's the risk that the function to set the comparison function in one ADT is named e.g. set_less and takes a `<, while it in another is called set_comparator and takes a three-way comparison function returning -1, 0 or 1.
(Besides, global functions for setting and getting the `< function is necessary in the new multisets, so these will probably be candidates for lfuns too.)
That won't work well with how the == operator is used. /.../
Yes it should be _equal()
Excellent solution, actually. Didn't think of that.
If there is a well defined order, I don't think the exception "if the there hasn't been any changes in the collection" should be necessary; it's possible to do better than that.
There could be collections where you can change the order of the indices.
Then the iterator can/should "follow" its index to its new position and then continue stepping from there in a well defined way. The only case where the iterator could reasonably loose track is if the element it's "standing on" is removed. (An ADT could solve this either by refusing the remove, by doing a copy-on-write, or by documenting that the iterator more or less loose track in that case. The new multisets take the last route.)
Question do you think I should keep the '[] at all since a for example a linked list will be O(n) and there will probably be others. After all the only reason there is '[] is to resemble the existing types delete, insert and replace will work just as well.
Any other name for it would be confusing in the cases the operation does exist, since the idiomatic name is `[] and nothing else. With this reasoning, a linked list would simply not have any `[].
But drawing lines to Lisp, that would perhaps be unfortunate if one really wants to use such things in vector-like ways. Maybe it's better to soften that efficiency argument to: "If an `[] exists, it should be the fastest available lookup method." That'd still be enough to have some sort of definition to distinguish between the index and value sets, which is the key point here.
Finally a few words about unlimited/undefined sizes. They are undoubtedly cause a lot of problems and since one of the goals with the structure is to make it possible to use the collections and its sub-concepts in algorithms and most algorithms are quite useless on unlimited and undefined sizes. Therefore I think it's best to not let collections be of a unlimited/undefined size.
Then a superclass to Collection would be necessary, and even then it probably would not work out well if Collection isn't designed with a few thoughts in that direction. Most of these issues are solved simply by making functions optional. (That's not really a cost, since a runtime check that a function exists before calling it is unavoidable anyway in Pike.)
Also, at least collections of unknown size aren't that esoteric, just think of a queue that works between processes as well as threads. Such a thing might use a pipe with some system dependent buffer size. It can also reasonably have other properties like not being able to be iterated over, there's no `[]=, etc.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-11 12:39: Subject: More about ADTs
I take the presence of this in Collection but not _indices() as a sign that e.g. the Set class would return its contents with this function. That doesn't work very well with the current multiset implementation. If it's possible to do lookup on a value with `[], then it should (by definition) be part of the index set and not the value set. (The same applies to _has_value.)
There is nothing that prevents that _indices() should be added to all types that need it. But I did not made it a demand for being a collection since there is after all types that does not have indices that could be a collection. About sets, I regard the content as both indices and values.
Are elements compared with equal() or `==? To be consistent it should be the first, but there clearly can be situations where one would rather do the latter.
This function should also be optional. It might be possible to define for some unlimited collections, but far from all of them.
Shouldn't it be `==, wouldn't it be hard to handle basic types as elements otherwise?
I assume you mean _get_iterator. The argument must be optional to make the interface compatible with Collection.
No I did mean get_iterator since the _get_iterator() didn't take a position, but making it optional argument instead could be a good solution.
Isn't it more useful if it replaces the value in the second case? Then a Map that doesn't allow multiple indices can simply let `[]= and insert be the same.
Don't think that an insert should do a replace.
How is the range defined? Is it possible to index from the end with negative integers? If so, how will that work together with unlimited Sequences?
If the end could not be found it should probably also throw an error. Another solution is not to allow negative integers.
What about being able to set the order in these?
I had a get_comparator method but realised that it will just give another interface and not be necessary in a minimal base.
That won't work well with how the == operator is used. It can be called between arbitrary things e.g. in the master, inside mappings etc. It must be able to cope with any argument; it's not even limited to objects.
This is actually a generic problem with the current use of `==, `< and `>. One would like to have some sort of domain restriction of them, but there's currently no such thing.
Yes it should be _equal()
With this definition a mapping would be an ordered collection, and also almost every collection implementation I can think of that can be iterated over at all.
Sorry been reading in the definition of 'strictly ordered' from OrderedCollection when I've been reading this.
If there is a well defined order, I don't think the exception "if the there hasn't been any changes in the collection" should be necessary; it's possible to do better than that.
There could be collections where you can change the order of the indices.
o If an `[] operator exists, it's able to check if the given key exists in the collection and, in that case, find the corresponding element. That operation must have a time complexity strictly less than O(the size of the collection).
Question do you think I should keep the '[] at all since a for example a linked list will be O(n) and there will probably be others. After all the only reason there is '[] is to resemble the existing types delete, insert and replace will work just as well.
Finally a few words about unlimited/undefined sizes. They are undoubtedly cause a lot of problems and since one of the goals with the structure is to make it possible to use the collections and its sub-concepts in algorithms and most algorithms are quite useless on unlimited and undefined sizes. Therefore I think it's best to not let collections be of a unlimited/undefined size.
/ Peta, jo det är jag
On Tue, Nov 12, 2002 at 04:20:12AM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
I don't think it's much use being error tolerant in the insert operation since it's typically only used in ADTs that actually allows duplicate indices, otherwise []= works just as well. Perhaps the insert function shouldn't be defined at all in ADTs that don't allow duplicate indices.
hmm, insert would allow you to get an error if the element is already there, which can be very usefull. an assignment would never fail.
greetings, martin.
I think that implies that the [] operation would return the index as the value if it exists. I.e. foo[0] would return 0, making it necessary to use zero_type. No, please regard the elements as indices and all values as 1. That makes [] and []= work with a minimal amount of special cases (only []= needs one), and besides it's compatible with the multiset data type.
Well I might, but just because you ask so politely ;-). Serious, you're right, I've been looking at java, but that is the Pike way.
Good point, but then it should perhaps throw an error instead? The insert isn't successful, afterall. I don't think it's much use being error tolerant in the insert operation since it's typically only used in ADTs that actually allows duplicate indices, otherwise []= works just as well. Perhaps the insert function shouldn't be defined at all in ADTs that don't allow duplicate indices.
No I think I keep it as it is, then you have the possibility to insert and replace if exist with '[]= and insert but do not replace with insert. And I don't see it as an error if a key already exist, and by returning the value belonging to the key you can make a choice if you want to replace the value or not.
(Besides, global functions for setting and getting the `< function is necessary in the new multisets, so these will probably be candidates for lfuns too.)
What is the name and syntax you have planned for them?
Then the iterator can/should "follow" its index to its new position and then continue stepping from there in a well defined way. The only case where the iterator could reasonably loose track is if the element it's "standing on" is removed. (An ADT could solve this either by refusing the remove, by doing a copy-on-write, or by documenting that the iterator more or less loose track in that case. The new multisets take the last route.)
With "if the there hasn't been any changes in the collection", I did not mean that the iterator would not work, I mean that the iterator will not traverse the elements in the same order.
Then a superclass to Collection would be necessary, and even then it probably would not work out well if Collection isn't designed with a few thoughts in that direction. Most of these issues are solved simply by making functions optional. (That's not really a cost, since a runtime check that a function exists before calling it is unavoidable anyway in Pike.)
The main goal I had in mind when creating the collection interfaces was that you could write an algorithm that take a collection and then later reuse it with any other collection. By declaring a lot of methods optional this will spoil the abstractness of the whole thing because then you has to know the implementation of the algorithm before you know if your collection will work anyway.
/ Peta, jo det är jag
Previous text:
2002-11-12 04:19: Subject: More about ADTs
There is nothing that prevents that _indices() should be added to all types that need it. But I did not made it a demand for being a collection since there is after all types that does not have indices that could be a collection.
You're right, e.g. a queue would probably have no indexing operation and hence no indexes. It should be optional, like _values().
About sets, I regard the content as both indices and values.
I think that implies that the [] operation would return the index as the value if it exists. I.e. foo[0] would return 0, making it necessary to use zero_type. No, please regard the elements as indices and all values as 1. That makes [] and []= work with a minimal amount of special cases (only []= needs one), and besides it's compatible with the multiset data type.
Are elements compared with equal() or `==? To be consistent it should be the first, but there clearly can be situations where one would rather do the latter.
/.../
Shouldn't it be `==, wouldn't it be hard to handle basic types as elements otherwise?
That's not a problem; equal() works like == on integers, floats and strings (which I presume you mean with "basic types"). Or did I misunderstand you?
Isn't it more useful if it replaces the value in the second case? Then a Map that doesn't allow multiple indices can simply let `[]= and insert be the same.
Don't think that an insert should do a replace.
Good point, but then it should perhaps throw an error instead? The insert isn't successful, afterall. I don't think it's much use being error tolerant in the insert operation since it's typically only used in ADTs that actually allows duplicate indices, otherwise []= works just as well. Perhaps the insert function shouldn't be defined at all in ADTs that don't allow duplicate indices.
How is the range defined? Is it possible to index from the end with negative integers? If so, how will that work together with unlimited Sequences?
If the end could not be found it should probably also throw an error. Another solution is not to allow negative integers.
My vote is then to throw an error; negative indices are useful as a convenience, although they do add slightly more work for the ADT implementation.
I had a get_comparator method but realised that it will just give another interface and not be necessary in a minimal base.
Is it really a good goal to be minimal? Then you should remove a lot of other functions too, even _get_iterator since there might be collections that have no use for iterators (e.g. queues are perfectly useful without such a thing).
I think it's much better to define these things centrally and optionally, so that if they exist they have a known name and behavior. Otherwise there's the risk that the function to set the comparison function in one ADT is named e.g. set_less and takes a `<, while it in another is called set_comparator and takes a three-way comparison function returning -1, 0 or 1.
(Besides, global functions for setting and getting the `< function is necessary in the new multisets, so these will probably be candidates for lfuns too.)
That won't work well with how the == operator is used. /.../
Yes it should be _equal()
Excellent solution, actually. Didn't think of that.
If there is a well defined order, I don't think the exception "if the there hasn't been any changes in the collection" should be necessary; it's possible to do better than that.
There could be collections where you can change the order of the indices.
Then the iterator can/should "follow" its index to its new position and then continue stepping from there in a well defined way. The only case where the iterator could reasonably loose track is if the element it's "standing on" is removed. (An ADT could solve this either by refusing the remove, by doing a copy-on-write, or by documenting that the iterator more or less loose track in that case. The new multisets take the last route.)
Question do you think I should keep the '[] at all since a for example a linked list will be O(n) and there will probably be others. After all the only reason there is '[] is to resemble the existing types delete, insert and replace will work just as well.
Any other name for it would be confusing in the cases the operation does exist, since the idiomatic name is `[] and nothing else. With this reasoning, a linked list would simply not have any `[].
But drawing lines to Lisp, that would perhaps be unfortunate if one really wants to use such things in vector-like ways. Maybe it's better to soften that efficiency argument to: "If an `[] exists, it should be the fastest available lookup method." That'd still be enough to have some sort of definition to distinguish between the index and value sets, which is the key point here.
Finally a few words about unlimited/undefined sizes. They are undoubtedly cause a lot of problems and since one of the goals with the structure is to make it possible to use the collections and its sub-concepts in algorithms and most algorithms are quite useless on unlimited and undefined sizes. Therefore I think it's best to not let collections be of a unlimited/undefined size.
Then a superclass to Collection would be necessary, and even then it probably would not work out well if Collection isn't designed with a few thoughts in that direction. Most of these issues are solved simply by making functions optional. (That's not really a cost, since a runtime check that a function exists before calling it is unavoidable anyway in Pike.)
Also, at least collections of unknown size aren't that esoteric, just think of a queue that works between processes as well as threads. Such a thing might use a pipe with some system dependent buffer size. It can also reasonably have other properties like not being able to be iterated over, there's no `[]=, etc.
/ Martin Stjernholm, Roxen IS
No I think I keep it as it is, then you have the possibility to insert and replace if exist with '[]= and insert but do not replace with insert. /.../
That's a valid point. Don't really know when the second variety would be useful, but still.
and by returning the value belonging to the key you can make a choice if you want to replace the value or not.
The return value should then be UNDEFINED and not 0 if the insert is successful.
(Besides, global functions for setting and getting the `< function is necessary in the new multisets, so these will probably be candidates for lfuns too.)
What is the name and syntax you have planned for them?
I haven't thought about that yet, besides that it should be something with either a "`" or a "_" first (for the lfuns, that is). ADT.set_order and ADT.get_order seems like reasonable names for the global functions, and then the lfuns would be _set_order and _get_order. Would that do?
The ordering function is a less-than operation and not a three-way comparator (i.e. like strcmp(3)). My reasons for that are:
o It's simpler to make a less-than function. o It's actually not that often one can use the extra information given by a three-way comparator to save calls, at least in binary trees. o It can be slightly less work to do a less-than operation. o The issue of equality is not connected to the order; in a three-way comparator it's easier to make the mistake that it should return 0 only when the operands are equal, and not when they are orderwise unrelated (i.e. the order is partial).
With "if the there hasn't been any changes in the collection", I did not mean that the iterator would not work, I mean that the iterator will not traverse the elements in the same order.
Yes, well, you stated a premise that an iterator on both ordered and unordered collections should be able to fulfill. I wanted to point out that additional conditions can be added to an iterator on an ordered collection.
The main goal I had in mind when creating the collection interfaces was that you could write an algorithm that take a collection and then later reuse it with any other collection. By declaring a lot of methods optional this will spoil the abstractness of the whole thing because then you has to know the implementation of the algorithm before you know if your collection will work anyway.
Not the algorithm itself, only the functions it needs. The algorithms are inherently dependent on the properties of the collections they can work with, and those properties are more than this. I believe that the need by an algorithm for certain functions is an accurate way of expressing which properties it needs(*). Since all the functions are adequately abstracted and don't touch on the implementations in improper ways, it's also an abstract way.
In this view, you're saying that you wish to oversimplify the interface just to be able to make a simple rule that an algorithm that work on one collection will work on all (or on one of a few subsets). The backside is that you then also completely rule out both algorithms and types of collections that can be very useful on various subsets of each other.
Note that you can exploit the looks-like philosophy of the Pike type system to accurately check if a certain collection fits a certain algorithm, since it checks identifier by identifier that the class of the actual argument is compatible with the class of the formal argument.
*) In some cases the same property might be expressed through several functions. Then it's of course not appropriate for a collection implementation to arbitrarily define some of them and leaving out others. Such dependencies should also be part of the documented interface, e.g. by making those functions mandatory in an interface class that itself is optional.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-12 11:28: Subject: More about ADTs
I think that implies that the [] operation would return the index as the value if it exists. I.e. foo[0] would return 0, making it necessary to use zero_type. No, please regard the elements as indices and all values as 1. That makes [] and []= work with a minimal amount of special cases (only []= needs one), and besides it's compatible with the multiset data type.
Well I might, but just because you ask so politely ;-). Serious, you're right, I've been looking at java, but that is the Pike way.
Good point, but then it should perhaps throw an error instead? The insert isn't successful, afterall. I don't think it's much use being error tolerant in the insert operation since it's typically only used in ADTs that actually allows duplicate indices, otherwise []= works just as well. Perhaps the insert function shouldn't be defined at all in ADTs that don't allow duplicate indices.
No I think I keep it as it is, then you have the possibility to insert and replace if exist with '[]= and insert but do not replace with insert. And I don't see it as an error if a key already exist, and by returning the value belonging to the key you can make a choice if you want to replace the value or not.
(Besides, global functions for setting and getting the `< function is necessary in the new multisets, so these will probably be candidates for lfuns too.)
What is the name and syntax you have planned for them?
Then the iterator can/should "follow" its index to its new position and then continue stepping from there in a well defined way. The only case where the iterator could reasonably loose track is if the element it's "standing on" is removed. (An ADT could solve this either by refusing the remove, by doing a copy-on-write, or by documenting that the iterator more or less loose track in that case. The new multisets take the last route.)
With "if the there hasn't been any changes in the collection", I did not mean that the iterator would not work, I mean that the iterator will not traverse the elements in the same order.
Then a superclass to Collection would be necessary, and even then it probably would not work out well if Collection isn't designed with a few thoughts in that direction. Most of these issues are solved simply by making functions optional. (That's not really a cost, since a runtime check that a function exists before calling it is unavoidable anyway in Pike.)
The main goal I had in mind when creating the collection interfaces was that you could write an algorithm that take a collection and then later reuse it with any other collection. By declaring a lot of methods optional this will spoil the abstractness of the whole thing because then you has to know the implementation of the algorithm before you know if your collection will work anyway.
/ Peta, jo det är jag
I haven't thought about that yet, besides that it should be something with either a "`" or a "_" first (for the lfuns, that is). ADT.set_order and ADT.get_order seems like reasonable names for the global functions, and then the lfuns would be _set_order and _get_order. Would that do?
The ordering function is a less-than operation and not a three-way comparator (i.e. like strcmp(3)). My reasons for that are:
o It's simpler to make a less-than function. o It's actually not that often one can use the extra information given by a three-way comparator to save calls, at least in binary trees. o It can be slightly less work to do a less-than operation. o The issue of equality is not connected to the order; in a three-way comparator it's easier to make the mistake that it should return 0 only when the operands are equal, and not when they are orderwise unrelated (i.e. the order is partial).
I was thinking about the name _set_comparator or _set_orderer but that was for a 3-way comparator, I should probably call a less-than simply _set_less_than then it's quite obvious from the name what the function is expected to do. A less-than should likely be enough I'll try to dream up a scenario where it wouldn't be.
Not the algorithm itself, only the functions it needs. The algorithms are inherently dependent on the properties of the collections they can work with, and those properties are more than this. I believe that the need by an algorithm for certain functions is an accurate way of expressing which properties it needs(*). Since all the functions are adequately abstracted and don't touch on the implementations in improper ways, it's also an abstract way.
In this view, you're saying that you wish to oversimplify the interface just to be able to make a simple rule that an algorithm that work on one collection will work on all (or on one of a few subsets). The backside is that you then also completely rule out both algorithms and types of collections that can be very useful on various subsets of each other.
Note that you can exploit the looks-like philosophy of the Pike type system to accurately check if a certain collection fits a certain algorithm, since it checks identifier by identifier that the class of the actual argument is compatible with the class of the formal argument.
I still believe it's better to have a few relatively limited interfaces that include the 90% most used types instead of having a lot that include 100%. Otherwise it will end up in a vast number of different types that only works with a few algorithms each because the one who made them just had a those algorithms in mind when the type was done. But you're right that the current way to categorise the ADTs might have been totally wrong and may be a better way to try to look at possible algorithms and try to see what they need and make a few interfaces, one for each type of algorithm, just try to keep it down to four or five interfaces, ten maximum. Still keep the collection, set, map and the others, but declaring most of the methods optional and put the restriction in the interfaces. Let try it out and see what it result in.
/ Peta, jo det är jag
Previous text:
2002-11-13 02:46: Subject: More about ADTs
No I think I keep it as it is, then you have the possibility to insert and replace if exist with '[]= and insert but do not replace with insert. /.../
That's a valid point. Don't really know when the second variety would be useful, but still.
and by returning the value belonging to the key you can make a choice if you want to replace the value or not.
The return value should then be UNDEFINED and not 0 if the insert is successful.
(Besides, global functions for setting and getting the `< function is necessary in the new multisets, so these will probably be candidates for lfuns too.)
What is the name and syntax you have planned for them?
I haven't thought about that yet, besides that it should be something with either a "`" or a "_" first (for the lfuns, that is). ADT.set_order and ADT.get_order seems like reasonable names for the global functions, and then the lfuns would be _set_order and _get_order. Would that do?
The ordering function is a less-than operation and not a three-way comparator (i.e. like strcmp(3)). My reasons for that are:
o It's simpler to make a less-than function. o It's actually not that often one can use the extra information given by a three-way comparator to save calls, at least in binary trees. o It can be slightly less work to do a less-than operation. o The issue of equality is not connected to the order; in a three-way comparator it's easier to make the mistake that it should return 0 only when the operands are equal, and not when they are orderwise unrelated (i.e. the order is partial).
With "if the there hasn't been any changes in the collection", I did not mean that the iterator would not work, I mean that the iterator will not traverse the elements in the same order.
Yes, well, you stated a premise that an iterator on both ordered and unordered collections should be able to fulfill. I wanted to point out that additional conditions can be added to an iterator on an ordered collection.
The main goal I had in mind when creating the collection interfaces was that you could write an algorithm that take a collection and then later reuse it with any other collection. By declaring a lot of methods optional this will spoil the abstractness of the whole thing because then you has to know the implementation of the algorithm before you know if your collection will work anyway.
Not the algorithm itself, only the functions it needs. The algorithms are inherently dependent on the properties of the collections they can work with, and those properties are more than this. I believe that the need by an algorithm for certain functions is an accurate way of expressing which properties it needs(*). Since all the functions are adequately abstracted and don't touch on the implementations in improper ways, it's also an abstract way.
In this view, you're saying that you wish to oversimplify the interface just to be able to make a simple rule that an algorithm that work on one collection will work on all (or on one of a few subsets). The backside is that you then also completely rule out both algorithms and types of collections that can be very useful on various subsets of each other.
Note that you can exploit the looks-like philosophy of the Pike type system to accurately check if a certain collection fits a certain algorithm, since it checks identifier by identifier that the class of the actual argument is compatible with the class of the formal argument.
*) In some cases the same property might be expressed through several functions. Then it's of course not appropriate for a collection implementation to arbitrarily define some of them and leaving out others. Such dependencies should also be part of the documented interface, e.g. by making those functions mandatory in an interface class that itself is optional.
/ Martin Stjernholm, Roxen IS
I should probably call a less-than simply _set_less_than then it's quite obvious from the name what the function is expected to do.
It is, but it's slightly less obvious how the function is used otoh. In this case one could perhaps believe that the lfun sets the `< in the collection itself. To capture both use and form, the name would be _set_order_using_less_than or something similar. I think it's usually enough if the exact form is described in the docs only. If an ordering function always is of a certain form within the domain, then "the ordering function" is just as descriptive as "the less-than ordering function".
I still believe it's better to have a few relatively limited interfaces that include the 90% most used types instead of having a lot that include 100%.
Fine, as long as those 90% include the interesting data types. That means, in my opinion, that e.g. the property of unknown size has to be covered for the sake of pipe queues.
Besides, the interface will probably be extended over time anyway, as all actively used interfaces do. It's a good thing if one tries to think ahead a bit to avoid future incompatible interface changes, but that doesn't mean one has to incorporate all that from the beginning. And making mandatory functions (and trailing arguments) optional rarely leads to serious incompatibilities.
/.../ because the one who made them just had a those algorithms in mind when the type was done.
I'd like to point out that there's no room for such arbitrariness if it works to describe properties through the existence of functions. If the data type has a certain property then it's clearly a mistake to not implement the corresponding function(s), and it's not up to the implementor to choose whether a data structure has a property or not. The intended algorithms are therefore irrelevant; they can rather be used as a check: If an algorithm that is known to work with a certain collection type turns out not to work then there's an error either in the algorithm implementation, in the collection implementation, or in the interface spec.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-14 13:18: Subject: More about ADTs
I haven't thought about that yet, besides that it should be something with either a "`" or a "_" first (for the lfuns, that is). ADT.set_order and ADT.get_order seems like reasonable names for the global functions, and then the lfuns would be _set_order and _get_order. Would that do?
The ordering function is a less-than operation and not a three-way comparator (i.e. like strcmp(3)). My reasons for that are:
o It's simpler to make a less-than function. o It's actually not that often one can use the extra information given by a three-way comparator to save calls, at least in binary trees. o It can be slightly less work to do a less-than operation. o The issue of equality is not connected to the order; in a three-way comparator it's easier to make the mistake that it should return 0 only when the operands are equal, and not when they are orderwise unrelated (i.e. the order is partial).
I was thinking about the name _set_comparator or _set_orderer but that was for a 3-way comparator, I should probably call a less-than simply _set_less_than then it's quite obvious from the name what the function is expected to do. A less-than should likely be enough I'll try to dream up a scenario where it wouldn't be.
Not the algorithm itself, only the functions it needs. The algorithms are inherently dependent on the properties of the collections they can work with, and those properties are more than this. I believe that the need by an algorithm for certain functions is an accurate way of expressing which properties it needs(*). Since all the functions are adequately abstracted and don't touch on the implementations in improper ways, it's also an abstract way.
In this view, you're saying that you wish to oversimplify the interface just to be able to make a simple rule that an algorithm that work on one collection will work on all (or on one of a few subsets). The backside is that you then also completely rule out both algorithms and types of collections that can be very useful on various subsets of each other.
Note that you can exploit the looks-like philosophy of the Pike type system to accurately check if a certain collection fits a certain algorithm, since it checks identifier by identifier that the class of the actual argument is compatible with the class of the formal argument.
I still believe it's better to have a few relatively limited interfaces that include the 90% most used types instead of having a lot that include 100%. Otherwise it will end up in a vast number of different types that only works with a few algorithms each because the one who made them just had a those algorithms in mind when the type was done. But you're right that the current way to categorise the ADTs might have been totally wrong and may be a better way to try to look at possible algorithms and try to see what they need and make a few interfaces, one for each type of algorithm, just try to keep it down to four or five interfaces, ten maximum. Still keep the collection, set, map and the others, but declaring most of the methods optional and put the restriction in the interfaces. Let try it out and see what it result in.
/ Peta, jo det är jag
I'm missing the `+, `-, `| and `& operations between ADTs of, at least, the same type. It would also be nice to have rotation and decreasing shifts on the ordered collections. Trivial example (since I felt my description wasn't as descriptive):
class ADT (array data) {
this_program rol() { data = data[1..] + data[0]; return this_object(); }
this_program ror() { data = data[-1] + data[..sizeof(data)-2]; return this_object(); }
mixed shl() { mixed ret = data[0]; data = data[1..]; return ret; }
mixed shr() { mixed ret = data[-1]; data = data[..sizeof(data)-2]; return ret; } }
/ Martin Nilsson (Fake Build Master)
Previous text:
2002-11-07 13:08: Subject: More about ADTs
I've made som changes and here is a new suggestion for a common interface of the new ADTs.
Common API for new ADTs in Pike
- Introduction
This document describes a common platform for new ADTs in pike to stand on. The purpose of the platform is to make the new ADTs as flexible as possible and possible to use together with generic algorithms. The base of the platform is the Collection concept, being the base concepts of all the ADTs, and the Iterator concept making it possible to iterate over collections without knowing the underlying structure. Besides the two base concepts this document also describes a few sub-concepts that are related to the two base concepts according to the two figures below.
Figur 1 Collection concepts
Collection |-----------> Ordered Collection | | | |--------> Sequence | |--------> Ordered Set <------| | |--------> Ordered Map <--| | |----> Map --------------------------------------| | |----> Set ------------------------------------------|
Figur 2 Interface concepts
Iterator -------> Ordered Iterator
- Collection concepts
2.1. Collection Collection is the base concept of all the ADTs. The reason it is called collection is that all the ADTs in the library is in some way a collection of items. To implement a collection the following methods must be implemented
iterator _get_iterator() Return an iterator for this ADT positioned at the first item in the ADT.
int _sizeof() Return the current size of the ADT. Return -1 if the size could not be decided or is truly unlimited.
array _values() Return an array with the values in the ADT.
int _has_value(mixed value) Returns 1 if value is in the value domain of collection, or 0 (zero) if not found
int _equal(mixed coll) Decide if the collection coll and this collection contains the same elements, an if it is important, in the same order.
string _sprintf(int conversion_type, mapping(string:int)|void params) Present the collection in a human readable way. For more information see _sprintf in the pike manual.
int is_empty() Return 0 if the ADT do has elements.
int max_size() Return the maximal size of the ADT if there is any, otherwise -1.
void clear() Clear the ADT.
iterator first() Return an iterator for this ADT positioned at the first item in the ADT, same as _get_iterator().
iterator last() Return an iterator for this ADT positioned at the last item in the ADT.
2.2. Set Set is a collection containing any number of items but only one of each item. It is a sub-concept of Collection and must thereby besides implementing the methods below also implements the methods in Collections.
int '[](mixed item) Return 1 if the item could be found in the set, 0 if it couldn't
int '[]=(mixed item, int exist) Add the item to the set if exist is true remove the item if exist is false.
iterator get_iterator(mixed item) Get an iterator for this ADT positioned at the item or null if the item could not be found
int insert(mixed item) Add the item to the set if it isn't a part of the set. Return true if the item was added, else false.
mixed delete(mixed item) Remove the item from the set. If the item was removed it will be returned otherwise a 0 zero_type is returned.
multiset cast("multiset") The set should at least be able to cast to a multiset, additional castings are optional.
2.3. Map A map is a data structure that allows you to associate a key with one or more values and then quickly retrieve those values using the key. To implement a Map the ADT must have all the methods in this concept and in the Collection concept.
mixed '[](mixed key) Return the item associated with the key if it could be found, 0 if it couldn't
mixed '[]=(mixed key, mixed item) If there is an item with the key key then the associated item will be overwritten with the supplied item. If there is no key in the map a new key-item pair will be added.
iterator get_iterator(mixed key) Get an iterator for this ADT positioned at the key or null if the key could not be found.
mixed insert(mixed key, mixed item) If the key doesn't exist or duplicates are allowed, associate the value with the key and return 0, otherwise don't modify the map and return the current value associated with the key.
mixed delete(mixed key) Remove the pair with the key key. Return the keys value if it was removed, if the key could not be removed return zero_type.
mapping cast("mapping") The map should at leas be able to cast to a mapping, additional castings are optional.
2.4. OrderedCollection An OrderedCollection is a collection that has indices that could be used to find the values and the indices are strictly ordered is some way. Any type could be used as an index if it satisfies a few rules. See Index for more information. There is also nothing that prevents the use of the values as indices as long as the value satisfies the rules for an Index.
int _has_index(int index) Returns 1 if index is in the index domain of collection, or 0 (zero) if not found
array _indices() Return an array with the indices of the collection.
2.5. Sequence A sequence is a sub-concept of an Ordered Collection. It is a linear collection that allows index-based access and automatically expands to accommodate new elements. A vector is a good example.
mixed '[](int ind) Return the item at index ind in the sequence. Throw an error if ind is out of range.
mixed '[]=(int ind, mixed item) Set the index ind to be item. Throw an error if ind is out of range
int _search(mixed item, void|int start) Return the index of the first item in the sequence with the index start or higher that match the item item. -1 if no object was found. Throw an error if start is out of range.
void insert(index ind, mixed item) Insert an item in the sequence at the index ind. The size of the sequence is increased by one and the items at index and above have their indices increases by one. Throw an error if ind is out of range.
int(0..1) delete(mixed item) Remove the first value that are equal to item from the sequence. Return 1 if a item was removed, otherwise 0;
void add(mixed item) Add the object item as fast as possible to the sequence.
int delete_at(int ind) Remove the item at the index ind. The size of the sequence is decreased by one and the items above ind have their indices decreases by one. Return 0 if the removal failed. Throw an error if ind is out of range.
iterator get_iterator(int ind) Get an iterator for this ADT positioned at the index ind. Throw an error if ind is out of range.
array cast("array") The sequence should at least be able to cast to an array, additional castings are optional.
2.6. OrderedMap An OrderedMap is a map where the keys are orderable and the iterator belonging to an OrderedMap will traverse the values in that order.
2.7. OrderedSet An OrderedSet is a set were the values are orderable and the iterator belonging to an OrderedSet will traverse the values in that order.
- Iterator concepts
3.1. Iterator The iterator concept does not differ much from the present day iterator. Only a few new methods are added to make it possible to step backwards and compare iterators.
int(0 1) '!() Should return 0 (zero) when not at end of collection, and 1 at end of collection
void '+=(int n) Should advance the specified number of steps
iterator '-=(int n) Move the iterator n steps backwards
int '==(iterator iter) Return true if the two iterators is pointing to the same object. Throw an error if the two iterators belong to different collections.
mixed index() Should return the current index.
mixed value() Should return the current value
int has_previous() Return true if the iterator can move one step backwards.
int has_next() Return true if the iterator can move one step forwards. It have the same use as '!().
collection get_collection() Return the collection this iterator iterates over.
Optional void set_value(mixed value) Set the value of this iterator. This method should only be available if it is possible to set the value via an iterator.
Optional void add(mixed value) Add a new item to the collection after the item the iterator currently is pointing at. This method should only be available if it is possible to add entries via the iterator
Optional mixed remove() Remove the item the iterator currently is pointing at and move to the next item. This method should only be available if it is possible to remove entries via the iterator
3.2. OrderedIterator An OrderedIterator is an iterator working on an OrderedCollection. It has two characteristics, first it traverse the collection in the same order each time it is traversed, if the there hasn't been any changes in the collection, beginning with the item with the lowest index and ending with the item with the highest index. Secondly it is possibly to say that an iterator is less then or greater then another iterator.
int(0..1) '<(OrderedIterator iter) Return 1 if iter is positioned after this iterator, otherwise 0. Throw an error if the two iterators belong to different collections.
Int(0..1) '>(OrderedIterator iter) Return 1 if iter is positioned before this iterator otherwise 0. Throw an error if the two iterators belong to different collections.
- Index
If a type could be used as an index in an OrderedCollection then it must be possible to use the following operations on it.
'== If the two indexes a and b is equal it should return 1 otherwise 0. The following rules must also be satisfied. a == a must be true a == b implies b == a !(a == b) implies a != b a == b and b == c implies a == c
'< If the index a is ordered lower than b it should return 1 otherwise 0. The following rules must also be satisfied. a < a must be false a < b implies !(b < a) a < b implies (b > a) a < b and b < c implies a < c
'> If the index a is ordered higher than b it should return 1 otherwise 0. The following rules must also be satisfied. a > a must be false a > b implies !(b > a) a > b implies b < a a > b and b > c implies a > c
/ Peta, jo det är jag
Yes I know, actually it was a deliberate choise to leave them outside the interface. Since i try to keep the interface as small as possible, just basic, and the +. -, /, %, &, | only work on some ADTs and not others i decided to put them on ADT level instead of on this interface level.
/ Peta, jo det är jag
Previous text:
2002-11-09 02:41: Subject: More about ADTs
I'm missing the `+, `-, `| and `& operations between ADTs of, at least, the same type. It would also be nice to have rotation and decreasing shifts on the ordered collections. Trivial example (since I felt my description wasn't as descriptive):
class ADT (array data) {
this_program rol() { data = data[1..] + data[0]; return this_object(); }
this_program ror() { data = data[-1] + data[..sizeof(data)-2]; return this_object(); }
mixed shl() { mixed ret = data[0]; data = data[1..]; return ret; }
mixed shr() { mixed ret = data[-1]; data = data[..sizeof(data)-2]; return ret; } }
/ Martin Nilsson (Fake Build Master)
A notice about this in a paragraph in the spec would be nice, to ensure the reader that thought has been applied. I even think it would be good to explicitly list the lfuns that isn't in the API, to make it easier to spot missing lfuns (Anyone feel like documenting the _random lfun?).
/ Martin Nilsson (Fake Build Master)
Previous text:
2002-11-09 20:13: Subject: More about ADTs
Yes I know, actually it was a deliberate choise to leave them outside the interface. Since i try to keep the interface as small as possible, just basic, and the +. -, /, %, &, | only work on some ADTs and not others i decided to put them on ADT level instead of on this interface level.
/ Peta, jo det är jag
I think they should be mentioned in the interface docs, so that if they are implemented by an ADT it's done in a uniform way. Like many other functions, they might very well be optional.
Anyway, it's not a problem to add them later on; they are de facto fairly well defined (i.e. an ADT implementing e.g. `| ought to do it in the same way as it works on arrays, multisets and mappings already), so it's mostly a matter of writing it down.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-09 20:13: Subject: More about ADTs
Yes I know, actually it was a deliberate choise to leave them outside the interface. Since i try to keep the interface as small as possible, just basic, and the +. -, /, %, &, | only work on some ADTs and not others i decided to put them on ADT level instead of on this interface level.
/ Peta, jo det är jag
int [Sequence::]delete_at(int ind)
Remove the item at the index ind. The size of the sequence is decreased by one and the items above ind have their indices decreases by one. Return 0 if the removal failed. Throw an error if ind is out of range.
In addition to Mast's good and thorough analysis, I'd like to add two ponderings here. First, returning the value corresponding to the index being deleted is IMO a too valuable convenience to leave out (it was added to m_delete for mappings a while back, as a way of eating off configuration options and similar from a mapping, for further processing, and the like). If the case of removal failure can happen, for some reason, returning UNDEFINED would feel more in line with how a mapping would have behaved, if the range check doesn't to cover that problem.
Second (and now I address a wider public): was there some form of consensus about not naming this _m_delete, for the sake of the somewhat annoyingly ugly prefix? And in that case, was some other LFUN for delete operations under consideration?
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Previous text:
2002-11-07 13:08: Subject: More about ADTs
I've made som changes and here is a new suggestion for a common interface of the new ADTs.
Common API for new ADTs in Pike
- Introduction
This document describes a common platform for new ADTs in pike to stand on. The purpose of the platform is to make the new ADTs as flexible as possible and possible to use together with generic algorithms. The base of the platform is the Collection concept, being the base concepts of all the ADTs, and the Iterator concept making it possible to iterate over collections without knowing the underlying structure. Besides the two base concepts this document also describes a few sub-concepts that are related to the two base concepts according to the two figures below.
Figur 1 Collection concepts
Collection |-----------> Ordered Collection | | | |--------> Sequence | |--------> Ordered Set <------| | |--------> Ordered Map <--| | |----> Map --------------------------------------| | |----> Set ------------------------------------------|
Figur 2 Interface concepts
Iterator -------> Ordered Iterator
- Collection concepts
2.1. Collection Collection is the base concept of all the ADTs. The reason it is called collection is that all the ADTs in the library is in some way a collection of items. To implement a collection the following methods must be implemented
iterator _get_iterator() Return an iterator for this ADT positioned at the first item in the ADT.
int _sizeof() Return the current size of the ADT. Return -1 if the size could not be decided or is truly unlimited.
array _values() Return an array with the values in the ADT.
int _has_value(mixed value) Returns 1 if value is in the value domain of collection, or 0 (zero) if not found
int _equal(mixed coll) Decide if the collection coll and this collection contains the same elements, an if it is important, in the same order.
string _sprintf(int conversion_type, mapping(string:int)|void params) Present the collection in a human readable way. For more information see _sprintf in the pike manual.
int is_empty() Return 0 if the ADT do has elements.
int max_size() Return the maximal size of the ADT if there is any, otherwise -1.
void clear() Clear the ADT.
iterator first() Return an iterator for this ADT positioned at the first item in the ADT, same as _get_iterator().
iterator last() Return an iterator for this ADT positioned at the last item in the ADT.
2.2. Set Set is a collection containing any number of items but only one of each item. It is a sub-concept of Collection and must thereby besides implementing the methods below also implements the methods in Collections.
int '[](mixed item) Return 1 if the item could be found in the set, 0 if it couldn't
int '[]=(mixed item, int exist) Add the item to the set if exist is true remove the item if exist is false.
iterator get_iterator(mixed item) Get an iterator for this ADT positioned at the item or null if the item could not be found
int insert(mixed item) Add the item to the set if it isn't a part of the set. Return true if the item was added, else false.
mixed delete(mixed item) Remove the item from the set. If the item was removed it will be returned otherwise a 0 zero_type is returned.
multiset cast("multiset") The set should at least be able to cast to a multiset, additional castings are optional.
2.3. Map A map is a data structure that allows you to associate a key with one or more values and then quickly retrieve those values using the key. To implement a Map the ADT must have all the methods in this concept and in the Collection concept.
mixed '[](mixed key) Return the item associated with the key if it could be found, 0 if it couldn't
mixed '[]=(mixed key, mixed item) If there is an item with the key key then the associated item will be overwritten with the supplied item. If there is no key in the map a new key-item pair will be added.
iterator get_iterator(mixed key) Get an iterator for this ADT positioned at the key or null if the key could not be found.
mixed insert(mixed key, mixed item) If the key doesn't exist or duplicates are allowed, associate the value with the key and return 0, otherwise don't modify the map and return the current value associated with the key.
mixed delete(mixed key) Remove the pair with the key key. Return the keys value if it was removed, if the key could not be removed return zero_type.
mapping cast("mapping") The map should at leas be able to cast to a mapping, additional castings are optional.
2.4. OrderedCollection An OrderedCollection is a collection that has indices that could be used to find the values and the indices are strictly ordered is some way. Any type could be used as an index if it satisfies a few rules. See Index for more information. There is also nothing that prevents the use of the values as indices as long as the value satisfies the rules for an Index.
int _has_index(int index) Returns 1 if index is in the index domain of collection, or 0 (zero) if not found
array _indices() Return an array with the indices of the collection.
2.5. Sequence A sequence is a sub-concept of an Ordered Collection. It is a linear collection that allows index-based access and automatically expands to accommodate new elements. A vector is a good example.
mixed '[](int ind) Return the item at index ind in the sequence. Throw an error if ind is out of range.
mixed '[]=(int ind, mixed item) Set the index ind to be item. Throw an error if ind is out of range
int _search(mixed item, void|int start) Return the index of the first item in the sequence with the index start or higher that match the item item. -1 if no object was found. Throw an error if start is out of range.
void insert(index ind, mixed item) Insert an item in the sequence at the index ind. The size of the sequence is increased by one and the items at index and above have their indices increases by one. Throw an error if ind is out of range.
int(0..1) delete(mixed item) Remove the first value that are equal to item from the sequence. Return 1 if a item was removed, otherwise 0;
void add(mixed item) Add the object item as fast as possible to the sequence.
int delete_at(int ind) Remove the item at the index ind. The size of the sequence is decreased by one and the items above ind have their indices decreases by one. Return 0 if the removal failed. Throw an error if ind is out of range.
iterator get_iterator(int ind) Get an iterator for this ADT positioned at the index ind. Throw an error if ind is out of range.
array cast("array") The sequence should at least be able to cast to an array, additional castings are optional.
2.6. OrderedMap An OrderedMap is a map where the keys are orderable and the iterator belonging to an OrderedMap will traverse the values in that order.
2.7. OrderedSet An OrderedSet is a set were the values are orderable and the iterator belonging to an OrderedSet will traverse the values in that order.
- Iterator concepts
3.1. Iterator The iterator concept does not differ much from the present day iterator. Only a few new methods are added to make it possible to step backwards and compare iterators.
int(0 1) '!() Should return 0 (zero) when not at end of collection, and 1 at end of collection
void '+=(int n) Should advance the specified number of steps
iterator '-=(int n) Move the iterator n steps backwards
int '==(iterator iter) Return true if the two iterators is pointing to the same object. Throw an error if the two iterators belong to different collections.
mixed index() Should return the current index.
mixed value() Should return the current value
int has_previous() Return true if the iterator can move one step backwards.
int has_next() Return true if the iterator can move one step forwards. It have the same use as '!().
collection get_collection() Return the collection this iterator iterates over.
Optional void set_value(mixed value) Set the value of this iterator. This method should only be available if it is possible to set the value via an iterator.
Optional void add(mixed value) Add a new item to the collection after the item the iterator currently is pointing at. This method should only be available if it is possible to add entries via the iterator
Optional mixed remove() Remove the item the iterator currently is pointing at and move to the next item. This method should only be available if it is possible to remove entries via the iterator
3.2. OrderedIterator An OrderedIterator is an iterator working on an OrderedCollection. It has two characteristics, first it traverse the collection in the same order each time it is traversed, if the there hasn't been any changes in the collection, beginning with the item with the lowest index and ending with the item with the highest index. Secondly it is possibly to say that an iterator is less then or greater then another iterator.
int(0..1) '<(OrderedIterator iter) Return 1 if iter is positioned after this iterator, otherwise 0. Throw an error if the two iterators belong to different collections.
Int(0..1) '>(OrderedIterator iter) Return 1 if iter is positioned before this iterator otherwise 0. Throw an error if the two iterators belong to different collections.
- Index
If a type could be used as an index in an OrderedCollection then it must be possible to use the following operations on it.
'== If the two indexes a and b is equal it should return 1 otherwise 0. The following rules must also be satisfied. a == a must be true a == b implies b == a !(a == b) implies a != b a == b and b == c implies a == c
'< If the index a is ordered lower than b it should return 1 otherwise 0. The following rules must also be satisfied. a < a must be false a < b implies !(b < a) a < b implies (b > a) a < b and b < c implies a < c
'> If the index a is ordered higher than b it should return 1 otherwise 0. The following rules must also be satisfied. a > a must be false a > b implies !(b > a) a > b implies b < a a > b and b > c implies a > c
/ Peta, jo det är jag
Well, if we measure consensus by the lack of protests, then yes. I at least thought it could be worthwhile to replace "_m_delete" with a better name, although I don't have a strong opinion either for or against that.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-09 04:13: Subject: More about ADTs
int [Sequence::]delete_at(int ind)
Remove the item at the index ind. The size of the sequence is decreased by one and the items above ind have their indices decreases by one. Return 0 if the removal failed. Throw an error if ind is out of range.
In addition to Mast's good and thorough analysis, I'd like to add two ponderings here. First, returning the value corresponding to the index being deleted is IMO a too valuable convenience to leave out (it was added to m_delete for mappings a while back, as a way of eating off configuration options and similar from a mapping, for further processing, and the like). If the case of removal failure can happen, for some reason, returning UNDEFINED would feel more in line with how a mapping would have behaved, if the range check doesn't to cover that problem.
Second (and now I address a wider public): was there some form of consensus about not naming this _m_delete, for the sake of the somewhat annoyingly ugly prefix? And in that case, was some other LFUN for delete operations under consideration?
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Since that is the only measurable way (that works in LysKOM, anyway), it's what we have to act on IMO. I presume the better name is _delete?
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Previous text:
2002-11-09 16:50: Subject: More about ADTs
Well, if we measure consensus by the lack of protests, then yes. I at least thought it could be worthwhile to replace "_m_delete" with a better name, although I don't have a strong opinion either for or against that.
/ Martin Stjernholm, Roxen IS
That's what I think anyway. Peta has suggested "delete".
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-09 17:05: Subject: More about ADTs
Since that is the only measurable way (that works in LysKOM, anyway), it's what we have to act on IMO. I presume the better name is _delete?
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
The only argument I see in favour of that is the lesser ugliness when used OO-wise (i e my_object->delete(...)) rather than how the LFUN is ordinarily invoked, via some toplevel command (like m_delete(...)). A solution that would look good from both aspects would be nice, though I can't come up with one I really like.
Does having both methods (_delete for the LFUN, delete for nice looks in the OO-style class interface) smell better or worse than the two problems (ugly name / reserving delete for this global interface)? I think I personally lean towards "better".
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Previous text:
2002-11-09 17:09: Subject: More about ADTs
That's what I think anyway. Peta has suggested "delete".
/ Martin Stjernholm, Roxen IS
I most definitely think that it would be worse to have two different names for the same function. `[] and `-> is enough.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-09 17:21: Subject: More about ADTs
The only argument I see in favour of that is the lesser ugliness when used OO-wise (i e my_object->delete(...)) rather than how the LFUN is ordinarily invoked, via some toplevel command (like m_delete(...)). A solution that would look good from both aspects would be nice, though I can't come up with one I really like.
Does having both methods (_delete for the LFUN, delete for nice looks in the OO-style class interface) smell better or worse than the two problems (ugly name / reserving delete for this global interface)? I think I personally lean towards "better".
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
It would make some sense to have "delete" be the name of a global function, and "_delete" be the name of the method. I.e.
delete(mixed collection, mixed index) { if (objectp(collection)) return collecction->_delete(index); else if (mappingp(collecction)) return ...; ... }
I guess this is just a new name for the global m_delete function.
/ Niels Möller ()
Previous text:
2002-11-09 17:39: Subject: More about ADTs
I most definitely think that it would be worse to have two different names for the same function. `[] and `-> is enough.
/ Martin Stjernholm, Roxen IS
Yes. Some sort of global function corresponding to m_delete for the insert operation will also be necessary for the new multisets. To be consistent it should then be called m_insert, but I don't particularly like the "m_" prefix in this case either. (As it happens, it works for multisets too, but the operation is really more fundamental than that.)
Perhaps we can come up with a new operator syntax instead?
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-10 11:20: Subject: More about ADTs
It would make some sense to have "delete" be the name of a global function, and "_delete" be the name of the method. I.e.
delete(mixed collection, mixed index) { if (objectp(collection)) return collecction->_delete(index); else if (mappingp(collecction)) return ...; ... }
I guess this is just a new name for the global m_delete function.
/ Niels Möller ()
No, the insert lfun is not `[]=. The difference is that the insert operation always adds an element, causing a duplicate index if necessary. On mappings it would either be undefined or throw an error if the index already exists.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-10 16:18: Subject: Re: More about ADTs
Perhaps we can come up with a new operator syntax instead?
That would be nice.
I got this idea, how about using a form like
x[y]=; or x[y]=void;
for delete? Is that doable? That doesn't solve the problem with a delete lfun, though. (The insert lfun is obviously "`[]="...?)
/ Brevbäraren
"Martin Stjernholm, Roxen IS @ Pike developers forum" 10353@lyskom.lysator.liu.se writes:
No, the insert lfun is not `[]=. The difference is that the insert operation always adds an element, causing a duplicate index if necessary. On mappings it would either be undefined or throw an error if the index already exists.
Ah, I see. So there is need for a true insert operator...
Hmm, I can't think of any that doesn't have parsing problems. All I've concluded so far is that it's damn unfortunate that +=, -= etc have the copy semantics they do. :(
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-10 16:08: Subject: More about ADTs
Yes. Some sort of global function corresponding to m_delete for the insert operation will also be necessary for the new multisets. To be consistent it should then be called m_insert, but I don't particularly like the "m_" prefix in this case either. (As it happens, it works for multisets too, but the operation is really more fundamental than that.)
Perhaps we can come up with a new operator syntax instead?
/ Martin Stjernholm, Roxen IS
On Sun, Nov 10, 2002 at 05:55:05PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
Hmm, I can't think of any that doesn't have parsing problems. All I've concluded so far is that it's damn unfortunate that +=, -= etc have the copy semantics they do. :(
these copy semantics are actually a problem, and it would be nice if we could get rid of them.
marek was showing me an example of such a problem a while ago...
greetings, martin.
Unfortunately I think that's next to impossible compatibility-wise. It's very hard to go through old code and find the cases that depends on the copy behavior, and the bugs that would result if a case is missed will likely be hard to track down.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-11 18:39: Subject: Re: More about ADTs
On Sun, Nov 10, 2002 at 05:55:05PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
Hmm, I can't think of any that doesn't have parsing problems. All I've concluded so far is that it's damn unfortunate that +=, -= etc have the copy semantics they do. :(
these copy semantics are actually a problem, and it would be nice if we could get rid of them.
marek was showing me an example of such a problem a while ago...
greetings, martin.
/ Brevbäraren
Question. Assume there is a insert lfun. But the map insert need two arguments, key and value, but the set only need one argument, value, will this be alot of special case handling?
/ Peta, jo det är jag
Previous text:
2002-11-10 16:08: Subject: More about ADTs
Yes. Some sort of global function corresponding to m_delete for the insert operation will also be necessary for the new multisets. To be consistent it should then be called m_insert, but I don't particularly like the "m_" prefix in this case either. (As it happens, it works for multisets too, but the operation is really more fundamental than that.)
Perhaps we can come up with a new operator syntax instead?
/ Martin Stjernholm, Roxen IS
No, the insert in Set needs one argument: The index, not the value. That distinction is vital when it comes to getting the ADT:s to play nicely together.
I don't think there will be any difficult special cases there; just let the insert lfun take one argument for Sets and two for Maps.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-11 12:15: Subject: More about ADTs
Question. Assume there is a insert lfun. But the map insert need two arguments, key and value, but the set only need one argument, value, will this be alot of special case handling?
/ Peta, jo det är jag
That might be confused with "destroy", especially for people with C++ background. Perhaps "_remove" is a better name?
/ Niels Möller ()
Previous text:
2002-11-09 17:05: Subject: More about ADTs
Since that is the only measurable way (that works in LysKOM, anyway), it's what we have to act on IMO. I presume the better name is _delete?
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Possibly. I'm considering the inverse too: how confused would people be by a top-level command "delete" or "remove"? Either could easily be mistaken for a file operation too, if we are to consider all possible misunderstandings. :-)
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Previous text:
2002-11-10 11:16: Subject: More about ADTs
That might be confused with "destroy", especially for people with C++ background. Perhaps "_remove" is a better name?
/ Niels Möller ()
Possibly. I'm considering the inverse too: how confused would people be by a top-level command "delete" or "remove"? Either could easily be mistaken for a file operation too, if we are to consider all possible misunderstandings. :-)
Maybe it could be called "remove_element", "insert_element"?
Hmm like that, then is no fuzz about what it is.
/ Peta, jo det är jag
Previous text:
2002-11-11 07:40: Subject: Re: More about ADTs
Possibly. I'm considering the inverse too: how confused would people be by a top-level command "delete" or "remove"? Either could easily be mistaken for a file operation too, if we are to consider all possible misunderstandings. :-)
Maybe it could be called "remove_element", "insert_element"?
/ Brevbäraren
Somewhat verbose, but I think that's ok for destructive operations that are expected to be used an order of magnitude more seldom than for instance plain indexing (->).
/ Niels Möller ()
Previous text:
2002-11-11 07:40: Subject: Re: More about ADTs
Possibly. I'm considering the inverse too: how confused would people be by a top-level command "delete" or "remove"? Either could easily be mistaken for a file operation too, if we are to consider all possible misunderstandings. :-)
Maybe it could be called "remove_element", "insert_element"?
/ Brevbäraren
Somewhat verbose, but I think that's ok for destructive operations that are expected to be used an order of magnitude more seldom than for instance plain indexing (->).
They might be sufficient untils someone finds out operators for these operations.
Are shift operators used for these datatypes?
set<<elem; set>>elem;
A little C++-ish though, and lacks a syntax for key:value tupels...
Maybe set<<[elem] set>>[elem] map<<[key;elem] map>>[key]
...is that possible to parse?
I guess it could work, but I personally find such abuses of << and >>
quite ugly.
/ Niels Möller ()
Previous text:
2002-11-11 19:55: Subject: Re: More about ADTs
Somewhat verbose, but I think that's ok for destructive operations that are expected to be used an order of magnitude more seldom than for instance plain indexing (->).
They might be sufficient untils someone finds out operators for these operations.
Are shift operators used for these datatypes?
set<<elem; set>>elem;
A little C++-ish though, and lacks a syntax for key:value tupels...
Maybe set<<[elem] set>>[elem] map<<[key;elem] map>>[key]
...is that possible to parse?
/ Brevbäraren
Perhaps one can find something involving `-, but I rather not have another operator. At least not in the latin-1 table. There were some far reaching plans for how to add operators from the wide collection of special characters in Unicode a while back. Anyone remember anything from that discussion?
/ Martin Nilsson (Fake Build Master)
Previous text:
2002-11-11 21:08: Subject: Re: More about ADTs
Me too. Any other suggestions?
/ Brevbäraren
The only fancy operators I remember was from the upper half of latin 1. And I don't think it's a good idea to add fancy characters for operators. It's fine for languages like APL, I guess, which were designed from the start to have lots of fancy operators and compact syntax, but pike code should look reasonably similar to C.
/ Niels Möller ()
Previous text:
2002-11-11 21:14: Subject: Re: More about ADTs
Perhaps one can find something involving `-, but I rather not have another operator. At least not in the latin-1 table. There were some far reaching plans for how to add operators from the wide collection of special characters in Unicode a while back. Anyone remember anything from that discussion?
/ Martin Nilsson (Fake Build Master)
I don't want operators which you can not reach from a ASCII-terminal, but that doesn't mean that it shouldn't be possible to code with "difficult" characters. We already accept identifiers from the full unicode character realm.
/ Martin Nilsson (Fake Build Master)
Previous text:
2002-11-11 21:38: Subject: Re: More about ADTs
The only fancy operators I remember was from the upper half of latin
- And I don't think it's a good idea to add fancy characters for
operators. It's fine for languages like APL, I guess, which were designed from the start to have lots of fancy operators and compact syntax, but pike code should look reasonably similar to C.
/ Niels Möller ()
The problem with the last case is that [...] can be both a prefix and a postfix operator. E.g. in this case:
x << [y] - z
it can either be a type cast:
x << ([y] -z)
(the minus is unary) or your new operator:
(x << [y]) - z
But I think that's solvable just by applying operator precedence. Since postfix operators in general have higher precedence than prefix operators, it would reasonably be disambiguated to the second case. (This is not a "real" case of precedence, though; in reality it's a shift/reduce conflict.)
I'm thinking along the lines of building on the [] and []= operators, since these operations to a large extent is an index operation. I thought it'd be better to use + and - since those are used elsewhere for adding and removing elements, while the shift operators would have to be stretched a bit to include that. Your placement of the operator before the open bracket is the only place that both makes some sense and (probably) is parseable. Thus:
Insert: foo+[bar] = gnu (lfun: `+[]=) Delete: foo-[bar] (lfun: `-[])
On the same line, one could perhaps also dream up an operator to replace the use of zero_type(foo[bar]), which I've always considered a bit ugly since the single operation "index exists" is broken up in "lookup index" and "check for magic value".
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-11 19:55: Subject: Re: More about ADTs
Somewhat verbose, but I think that's ok for destructive operations that are expected to be used an order of magnitude more seldom than for instance plain indexing (->).
They might be sufficient untils someone finds out operators for these operations.
Are shift operators used for these datatypes?
set<<elem; set>>elem;
A little C++-ish though, and lacks a syntax for key:value tupels...
Maybe set<<[elem] set>>[elem] map<<[key;elem] map>>[key]
...is that possible to parse?
/ Brevbäraren
Sorry, that syntax already has semantics:
int a; array(int) b; b=[a]=({3});
(1) Result: ({ /* 1 element */ 3 })
/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)
Previous text:
2002-11-12 11:41: Subject: Re: More about ADTs
On Tue, Nov 12, 2002 at 02:55:02AM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
Insert: foo+[bar] = gnu (lfun: `+[]=) Delete: foo-[bar] (lfun: `-[])
foo=[bar]=gnu
for an assignment without copy_value?
greetings, martin.
/ Brevbäraren
On Tue, Nov 12, 2002 at 12:15:04PM +0100, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Sorry, that syntax already has semantics:
which are?
int a; array(int) b; b=[a]=({3});
(1) Result: ({ /* 1 element */ 3 })
this example is not very conclusive,
b=[a]=({3});
seems to be of the type:
foo=bar=baz;
but [a] can't stand by itself yet, the fact that this assigns ({3}) to b seems to be unrelated to the [a] inbetween.
i would even argue that this is a bug, because a is not affected by nor has it any effect on the assignment.
so what is the semantics of allowing this statement?
greetings, martin.
b=[a]=({3});
seems to be of the type:
foo=bar=baz;
but [a] can't stand by itself
Yes it can. When it is an lvalue. The semantics is that a is assigned the first element of the array. The more general syntax is
[a,b,c] = expr
which assigns the three elements in the array produces by expr to the three variables a, b and c. The value of the assignment expression is the entire array.
yet, the fact that this assigns ({3}) to b seems to be unrelated to the [a] inbetween.
Yes. That is always the case with chained assignments.
i would even argue that this is a bug, because a is not affected by nor has it any effect on the assignment.
But a is affected. It's set to 3.
/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)
Previous text:
2002-11-12 12:48: Subject: Re: More about ADTs
On Tue, Nov 12, 2002 at 12:15:04PM +0100, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Sorry, that syntax already has semantics:
which are?
int a; array(int) b; b=[a]=({3});
(1) Result: ({ /* 1 element */ 3 })
this example is not very conclusive,
b=[a]=({3});
seems to be of the type:
foo=bar=baz;
but [a] can't stand by itself yet, the fact that this assigns ({3}) to b seems to be unrelated to the [a] inbetween.
i would even argue that this is a bug, because a is not affected by nor has it any effect on the assignment.
so what is the semantics of allowing this statement?
greetings, martin.
/ Brevbäraren
On Tue, Nov 12, 2002 at 01:00:05PM +0100, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
but [a] can't stand by itself
Yes it can. [a,b,c] = expr
right, forgot about that.
a is not affected by the assignment.
But a is affected. It's set to 3.
oops, indeed, missed that one, darn.
greetings, martin.
What do you mean? I'm not aware of any situation where an assignment implies a copy_value.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-12 11:41: Subject: Re: More about ADTs
On Tue, Nov 12, 2002 at 02:55:02AM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
Insert: foo+[bar] = gnu (lfun: `+[]=) Delete: foo-[bar] (lfun: `-[])
foo=[bar]=gnu
for an assignment without copy_value?
greetings, martin.
/ Brevbäraren
On Tue, Nov 12, 2002 at 07:45:08PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
What do you mean? I'm not aware of any situation where an assignment implies a copy_value.
i was refering to the copy semantics you talked about earlier regarding += and -=
didn't you mean the same as copy_value there?
greetings, martin.
Ah, ok. Then it's not the assignment that does the copying but rather the merge operation before it. (Besides, the copying in that case is only one level deep while copy_value is recursive.)
Also, those operations have nothing to do with indexing, so it's not sane if operators for them involve a pair of square brackets. They would rather be variants of "+=", "|=" etc, e.g. by adding an extra character somewhere.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-12 19:52: Subject: Re: More about ADTs
On Tue, Nov 12, 2002 at 07:45:08PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
What do you mean? I'm not aware of any situation where an assignment implies a copy_value.
i was refering to the copy semantics you talked about earlier regarding += and -=
didn't you mean the same as copy_value there?
greetings, martin.
/ Brevbäraren
On Tue, Nov 12, 2002 at 08:05:15PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
Ah, ok. Then it's not the assignment that does the copying but rather the merge operation before it. (Besides, the copying in that case is only one level deep while copy_value is recursive.)
you are right, my bad. (so much for quoting from memory :-)
Also, those operations have nothing to do with indexing, so it's not sane if operators for them involve a pair of square brackets.
? assignments to arrays/mappings involve indexing, don't they? assignments to simple types don't have any copy semantics do they?
greetings, martin.
But it's not really any assignment involved. It's a destructive merge, nothing else. The assignment is necessary only in the current copying operation, since that is a nondestructive merge followed by an assignment.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-12 20:15: Subject: Re: More about ADTs
On Tue, Nov 12, 2002 at 08:05:15PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
Ah, ok. Then it's not the assignment that does the copying but rather the merge operation before it. (Besides, the copying in that case is only one level deep while copy_value is recursive.)
you are right, my bad. (so much for quoting from memory :-)
Also, those operations have nothing to do with indexing, so it's not sane if operators for them involve a pair of square brackets.
? assignments to arrays/mappings involve indexing, don't they? assignments to simple types don't have any copy semantics do they?
greetings, martin.
/ Brevbäraren
On Tue, Nov 12, 2002 at 08:25:03PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
But it's not really any assignment involved. It's a destructive merge, nothing else. The assignment is necessary only in the current copying operation, since that is a nondestructive merge followed by an assignment.
ah, wait, forget the whole ithing. the mistake was a lot earlier. the copy semantics we seek to get rid of are only with += and -= but not with foo[bar]=baz
hence no need for another assignment without copysemantics, and no need for further confusing each other :-)
greetings, martin.
On the same line, one could perhaps also dream up an operator to replace the use of zero_type(foo[bar]), which I've always considered a bit ugly since the single operation "index exists" is broken up in "lookup index" and "check for magic value".
I'd like that too; perhaps foo![bar], if we'd settle for foo+[bar]=baz and foo-[bar]?
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Previous text:
2002-11-12 02:50: Subject: Re: More about ADTs
The problem with the last case is that [...] can be both a prefix and a postfix operator. E.g. in this case:
x << [y] - z
it can either be a type cast:
x << ([y] -z)
(the minus is unary) or your new operator:
(x << [y]) - z
But I think that's solvable just by applying operator precedence. Since postfix operators in general have higher precedence than prefix operators, it would reasonably be disambiguated to the second case. (This is not a "real" case of precedence, though; in reality it's a shift/reduce conflict.)
I'm thinking along the lines of building on the [] and []= operators, since these operations to a large extent is an index operation. I thought it'd be better to use + and - since those are used elsewhere for adding and removing elements, while the shift operators would have to be stretched a bit to include that. Your placement of the operator before the open bracket is the only place that both makes some sense and (probably) is parseable. Thus:
Insert: foo+[bar] = gnu (lfun: `+[]=) Delete: foo-[bar] (lfun: `-[])
On the same line, one could perhaps also dream up an operator to replace the use of zero_type(foo[bar]), which I've always considered a bit ugly since the single operation "index exists" is broken up in "lookup index" and "check for magic value".
/ Martin Stjernholm, Roxen IS
Thought about that, but "!" is a little problematic since it stands for a negation. Would foo![bar] return 0 or 1 if the element exists? The similarity with !foo[bar] suggests it'd return 0, but one could also disregard that and think it'd be 1. Either way we choose, it might be difficult to remember.
On a side note, the foo+[bar]=gnu and foo-[bar] syntaxes begin to feel more and more intuitive to me. I don't know about you, but I think they would be fairly easy to get used to.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-12 12:00: Subject: Re: More about ADTs
On the same line, one could perhaps also dream up an operator to replace the use of zero_type(foo[bar]), which I've always considered a bit ugly since the single operation "index exists" is broken up in "lookup index" and "check for magic value".
I'd like that too; perhaps foo![bar], if we'd settle for foo+[bar]=baz and foo-[bar]?
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
On Tue, Nov 12, 2002 at 08:30:03PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
On a side note, the foo+[bar]=gnu and foo-[bar] syntaxes begin to feel more and more intuitive to me. I don't know about you, but I think they would be fairly easy to get used to.
foo[bar]=gnu; foo+=([ bar:gnu ]); foo+[bar]=gnu; foo-[bar];
yes, i agree...
foo+[bar]; (for sets with keys without values)
greetings, martin.
You might be right. Would foo?[bar] be parsable? It doesn't have the problem foo![bar] has regarding which interpretation is logical, IMO.
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Previous text:
2002-11-12 20:25: Subject: Re: More about ADTs
Thought about that, but "!" is a little problematic since it stands for a negation. Would foo![bar] return 0 or 1 if the element exists? The similarity with !foo[bar] suggests it'd return 0, but one could also disregard that and think it'd be 1. Either way we choose, it might be difficult to remember.
On a side note, the foo+[bar]=gnu and foo-[bar] syntaxes begin to feel more and more intuitive to me. I don't know about you, but I think they would be fairly easy to get used to.
/ Martin Stjernholm, Roxen IS
It feels like the beginning of foo?[bar]=zonk():zerblat().
/ Martin Nilsson (Fake Build Master)
Previous text:
2002-11-12 21:50: Subject: Re: More about ADTs
You might be right. Would foo?[bar] be parsable? It doesn't have the problem foo![bar] has regarding which interpretation is logical, IMO.
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
Yes, it could be problematic if we extend the ambigious case in 9305871. Begin with:
x ? [y] - z
Ok, using the "precedence" rule, we reduce this to:
(x?[y]) - z
and go on:
x ? [y] - z : v
Oops, it was a ? : operator afterall. Too late now; have to report the ':' as a syntax error.
It'd be necessary to use a parenthesis:
x ? ([y] - z) : v
Essentially we get the rule: If the second expression in a ? : operator begins with a type cast, it has to be surrounded by parenthesis. Not that that will happen often, but it's a very ugly special case to document in the manual.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-12 21:52: Subject: Re: More about ADTs
It feels like the beginning of foo?[bar]=zonk():zerblat().
/ Martin Nilsson (Fake Build Master)
Ugga, that is not very pretty...
/ Fredrik (Naranek) Hubinette (Real Build Master)
Previous text:
2002-11-11 19:55: Subject: Re: More about ADTs
Somewhat verbose, but I think that's ok for destructive operations that are expected to be used an order of magnitude more seldom than for instance plain indexing (->).
They might be sufficient untils someone finds out operators for these operations.
Are shift operators used for these datatypes?
set<<elem; set>>elem;
A little C++-ish though, and lacks a syntax for key:value tupels...
Maybe set<<[elem] set>>[elem] map<<[key;elem] map>>[key]
...is that possible to parse?
/ Brevbäraren
Ugga, that is not very pretty...
/ Fredrik (Naranek) Hubinette (Real Build Master)
Previous text:
2002-11-11 19:55: Subject: Re: More about ADTs
Somewhat verbose, but I think that's ok for destructive operations that are expected to be used an order of magnitude more seldom than for instance plain indexing (->).
They might be sufficient untils someone finds out operators for these operations.
Are shift operators used for these datatypes?
set<<elem; set>>elem;
A little C++-ish though, and lacks a syntax for key:value tupels...
Maybe set<<[elem] set>>[elem] map<<[key;elem] map>>[key]
...is that possible to parse?
/ Brevbäraren
I wouldn't like remove_element to replace m_delete (at least as the recommended function); it's too long for such a common operation. The insert function isn't so critical since `[]= still works fine most of the time.
/ Martin Stjernholm, Roxen IS
Previous text:
2002-11-11 19:44: Subject: Re: More about ADTs
Somewhat verbose, but I think that's ok for destructive operations that are expected to be used an order of magnitude more seldom than for instance plain indexing (->).
/ Niels Möller ()
Since "remove" is the standard C function for removing a file, I'd say that name is highly unsuitable for any other purpose. (I consider it a bug in Pike that the remove-a-file function is called "rm". It's too short, cryptic and non-standard. The same things goes for "mv".)
And "delete" isn't ideal either, since it's the standard C++ keyword for invoking an object destructor.
/ Leif Stensson, Lysator
Previous text:
2002-11-11 00:29: Subject: More about ADTs
Possibly. I'm considering the inverse too: how confused would people be by a top-level command "delete" or "remove"? Either could easily be mistaken for a file operation too, if we are to consider all possible misunderstandings. :-)
/ Johan Sundström (ärkehertig av Lysators rootgrupp)
But we have many more to chose from. erase, abolish, extinguish, repeal, abrogate, annul, expel, oust, evict, expunge, efface.
/ Martin Nilsson (Fake Build Master)
Previous text:
2002-11-11 14:58: Subject: More about ADTs
Since "remove" is the standard C function for removing a file, I'd say that name is highly unsuitable for any other purpose. (I consider it a bug in Pike that the remove-a-file function is called "rm". It's too short, cryptic and non-standard. The same things goes for "mv".)
And "delete" isn't ideal either, since it's the standard C++ keyword for invoking an object destructor.
/ Leif Stensson, Lysator
"Annihilate, kill, kill, kill!" (0.5p)
/ Marcus Comstedt (ACROSS) (Hail Ilpalazzo!)
Previous text:
2002-11-11 15:06: Subject: More about ADTs
But we have many more to chose from. erase, abolish, extinguish, repeal, abrogate, annul, expel, oust, evict, expunge, efface.
/ Martin Nilsson (Fake Build Master)
black (out), blot out, cancel, efface, expunge, obliterate, wipe (out), x (out), eliminate, exclude, rule out; omit. eradicate might also be an option. (www.m-w.com *is* fun!)
/ Marcus Agehall (Trådlös)
Previous text:
2002-11-11 15:26: Subject: More about ADTs
"Ex-terminate! Ex-terminate!" (0.25p)
/ Leif Stensson, Lysator
Pike is not C. Pike is not C++.
Your arguments are valid, but compatibility with C/C++ is not a goal in itself.
/ Fredrik (Naranek) Hubinette (Real Build Master)
Previous text:
2002-11-11 14:58: Subject: More about ADTs
Since "remove" is the standard C function for removing a file, I'd say that name is highly unsuitable for any other purpose. (I consider it a bug in Pike that the remove-a-file function is called "rm". It's too short, cryptic and non-standard. The same things goes for "mv".)
And "delete" isn't ideal either, since it's the standard C++ keyword for invoking an object destructor.
/ Leif Stensson, Lysator
Pike is not C. Pike is not C++.
Your arguments are valid, but compatibility with C/C++ is not a goal in itself.
/ Fredrik (Naranek) Hubinette (Real Build Master)
Previous text:
2002-11-11 14:58: Subject: More about ADTs
Since "remove" is the standard C function for removing a file, I'd say that name is highly unsuitable for any other purpose. (I consider it a bug in Pike that the remove-a-file function is called "rm". It's too short, cryptic and non-standard. The same things goes for "mv".)
And "delete" isn't ideal either, since it's the standard C++ keyword for invoking an object destructor.
/ Leif Stensson, Lysator
Pike is not C. Pike is not C++.
Your arguments are valid, but compatibility with C/C++ is not a goal in itself.
/ Fredrik (Naranek) Hubinette (Real Build Master)
Previous text:
2002-11-11 14:58: Subject: More about ADTs
Since "remove" is the standard C function for removing a file, I'd say that name is highly unsuitable for any other purpose. (I consider it a bug in Pike that the remove-a-file function is called "rm". It's too short, cryptic and non-standard. The same things goes for "mv".)
And "delete" isn't ideal either, since it's the standard C++ keyword for invoking an object destructor.
/ Leif Stensson, Lysator
Well, if we measure consensus by the lack of protests, then yes. I at least thought it could be worthwhile to replace "_m_delete" with a better name, although I don't have a strong opinion either for or against that.
I have no opinion what the lfun will be called. I think it's a bad idea renaming the m_delete efun to something else... If anything, maybe a delete operator could take it's place long-term.
pike-devel@lists.lysator.liu.se