More about ADTs

12 Nov 2002

      ...
I think that implies that the [] operation would return the index as
the value if it exists. I.e. foo[0] would return 0, making it
necessary to use zero_type. No, please regard the elements as indices
and all values as 1. That makes [] and []= work with a minimal amount
of special cases (only []= needs one), and besides it's compatible
with the multiset data type.
Well I might, but just because you ask so politely ;-). 
Serious, you're right, I've been looking at java, but that is the Pike way.
...
Good point, but then it should perhaps throw an error instead? The
insert isn't successful, afterall. I don't think it's much use being
error tolerant in the insert operation since it's typically only used
in ADTs that actually allows duplicate indices, otherwise []= works
just as well. Perhaps the insert function shouldn't be defined at all
in ADTs that don't allow duplicate indices.
No I think I keep it as it is, then you have the possibility to insert
and replace if exist with '[]= and insert but do not replace with
insert. And I don't see it as an error if a key already exist, and by 
returning the value belonging to the key you can make a choice if you
want to replace the value or not.
...
(Besides, global functions for setting and getting the `< function is
necessary in the new multisets, so these will probably be candidates
for lfuns too.)
What is the name and syntax you have planned for them?
...
Then the iterator can/should "follow" its index to its new position
and then continue stepping from there in a well defined way. The only
case where the iterator could reasonably loose track is if the element
it's "standing on" is removed. (An ADT could solve this either by
refusing the remove, by doing a copy-on-write, or by documenting that
the iterator more or less loose track in that case. The new multisets
take the last route.)
With "if the there hasn't been any changes in the collection", I did
not mean that the iterator would not work, I mean that the iterator
will not traverse the elements in the same order.
...
Then a superclass to Collection would be necessary, and even then it
probably would not work out well if Collection isn't designed with a
few thoughts in that direction. Most of these issues are solved simply
by making functions optional. (That's not really a cost, since a
runtime check that a function exists before calling it is unavoidable
anyway in Pike.)
The main goal I had in mind when creating the collection interfaces
was that you could write an algorithm that take a collection and then
later reuse it with any other collection. By declaring a lot of
methods optional this will spoil the abstractness of the whole thing
because then you has to know the implementation of the algorithm
before you know if your collection will work anyway.
/ Peta, jo det är jag
Previous text:
...
2002-11-12 04:19:
Subject: More about ADTs

...
There is nothing that prevents that _indices() should be added to all
types that need it. But I did not made it a demand for being a
collection since there is after all types that does not have indices
that could be a collection.
You're right, e.g. a queue would probably have no indexing operation
and hence no indexes. It should be optional, like _values().
...
About sets, I regard the content as both indices and values.
I think that implies that the [] operation would return the index as
the value if it exists. I.e. foo[0] would return 0, making it
necessary to use zero_type. No, please regard the elements as indices
and all values as 1. That makes [] and []= work with a minimal amount
of special cases (only []= needs one), and besides it's compatible
with the multiset data type.
...
...
Are elements compared with equal() or `==? To be consistent it should
be the first, but there clearly can be situations where one would
rather do the latter.
/.../
...
Shouldn't it be `==, wouldn't it be hard to handle basic types as
elements otherwise?
That's not a problem; equal() works like == on integers, floats and
strings (which I presume you mean with "basic types"). Or did I
misunderstand you?
...
...
Isn't it more useful if it replaces the value in the second case? Then
a Map that doesn't allow multiple indices can simply let `[]= and
insert be the same.
Don't think that an insert should do a replace.
Good point, but then it should perhaps throw an error instead? The
insert isn't successful, afterall. I don't think it's much use being
error tolerant in the insert operation since it's typically only used
in ADTs that actually allows duplicate indices, otherwise []= works
just as well. Perhaps the insert function shouldn't be defined at all
in ADTs that don't allow duplicate indices.
...
...
How is the range defined? Is it possible to index from the end with
negative integers? If so, how will that work together with unlimited
Sequences?
If the end could not be found it should probably also throw an
error. Another solution is not to allow negative integers.
My vote is then to throw an error; negative indices are useful as a
convenience, although they do add slightly more work for the ADT
implementation.
...
I had a get_comparator method but realised that it will just give
another interface and not be necessary in a minimal base.
Is it really a good goal to be minimal? Then you should remove a lot
of other functions too, even _get_iterator since there might be
collections that have no use for iterators (e.g. queues are perfectly
useful without such a thing).
I think it's much better to define these things centrally and
optionally, so that if they exist they have a known name and behavior.
Otherwise there's the risk that the function to set the comparison
function in one ADT is named e.g. set_less and takes a `<, while it in
another is called set_comparator and takes a three-way comparison
function returning -1, 0 or 1.
(Besides, global functions for setting and getting the `< function is
necessary in the new multisets, so these will probably be candidates
for lfuns too.)
...
...
That won't work well with how the == operator is used. /.../
Yes it should be _equal()
Excellent solution, actually. Didn't think of that.
...
...
If there is a well defined order, I don't think the exception "if the
there hasn't been any changes in the collection" should be necessary;
it's possible to do better than that.
There could be collections where you can change the order of the indices.
Then the iterator can/should "follow" its index to its new position
and then continue stepping from there in a well defined way. The only
case where the iterator could reasonably loose track is if the element
it's "standing on" is removed. (An ADT could solve this either by
refusing the remove, by doing a copy-on-write, or by documenting that
the iterator more or less loose track in that case. The new multisets
take the last route.)
...
Question do you think I should keep the '[] at all since a for example
a linked list will be O(n) and there will probably be others. After
all the only reason there is '[] is to resemble the existing types
delete, insert and replace will work just as well.
Any other name for it would be confusing in the cases the operation
does exist, since the idiomatic name is `[] and nothing else. With
this reasoning, a linked list would simply not have any `[].
But drawing lines to Lisp, that would perhaps be unfortunate if one
really wants to use such things in vector-like ways. Maybe it's better
to soften that efficiency argument to: "If an `[] exists, it should be
the fastest available lookup method." That'd still be enough to have
some sort of definition to distinguish between the index and value
sets, which is the key point here.
...
Finally a few words about unlimited/undefined sizes. They are
undoubtedly cause a lot of problems and since one of the goals with
the structure is to make it possible to use the collections and its 
sub-concepts in algorithms and most algorithms are quite useless on 
unlimited and undefined sizes. Therefore I think it's best to not let 
collections be of a unlimited/undefined size.
Then a superclass to Collection would be necessary, and even then it
probably would not work out well if Collection isn't designed with a
few thoughts in that direction. Most of these issues are solved simply
by making functions optional. (That's not really a cost, since a
runtime check that a function exists before calling it is unavoidable
anyway in Pike.)
Also, at least collections of unknown size aren't that esoteric, just
think of a queue that works between processes as well as threads. Such
a thing might use a pipe with some system dependent buffer size. It
can also reasonably have other properties like not being able to be
iterated over, there's no `[]=, etc.
/ Martin Stjernholm, Roxen IS

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

2004

2003

2002

More about ADTs