Consider:
class A {
  constant x = -1;
  string f() {
    return ([1: "foo"])[x] || "bar"; // Warning: Indexing on illegal type. Got int(-1..-1)
  }
}

class B {
  inherit A;
  constant x = 1;
}

int main() {
  werror ("%O\n", B()->f());
}
This gives a warning on the indicated line. I guess the assumption is that x is constant there, but that's not necessarily true as the example shows.
I guess the problem here again is that constants lack explicit types. If it was possible to type it then one would normally write "constant int x = -1;" and the situation wouldn't occur.
Isn't the problem rather that the constant isn't constant? I'd expect a (non-abstract) constant to actually be constant, and thus nomask.
Both constants are constant, but the expression x isn't constant since it resolves to different constants in different classes. That's just the result of overloading and is a very useful feature. You can make your constants explicitly nomask or local if you want to, or you can resolve them statically using local::x to get a constant expression.
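As a rough sketch of the difference between the two lookups mentioned above (the method names dynamic_x and static_x are made up for illustration, and the expected values follow from the description in this thread rather than from a verified run):

```pike
class A {
  constant x = -1;

  // Ordinary lookup: resolved per object, so a subclass constant can override it.
  int dynamic_x() { return x; }

  // local:: pins the lookup to the class where it is written.
  int static_x() { return local::x; }
}

class B {
  inherit A;
  constant x = 1;
}

int main() {
  B b = B();
  write("%d %d\n", b->dynamic_x(), b->static_x());
  return 0;
}
```

Called on a B instance, dynamic_x() would be expected to give 1 and static_x() to give -1.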
> Both constants are constant, but the expression x isn't constant since it resolves to different constants in different classes.
Then the problem is that x doesn't bind statically. If A::x is constant, and the use of "x" is statically bound, it should be statically bound to the constant value -1. Otherwise we have some other kind of binding for constants.
I'm not sure I understand exactly what you mean with 'the use of "x" is statically bound'.
x is looked up using the standard rules, and the result happens to be a constant (in A or B, depending on the class instance). That doesn't mean that the expression x itself is constant. It would be truly odd if the result of the lookup would affect the lookup process itself. The current behavior in that regard is compatible, highly useful and imo quite natural.
> I'm not sure I understand exactly what you mean with 'the use of "x" is statically bound'.
I mean that "x" in the expression '([1: "foo"])[x] || "bar"' is associated with a declared object (preferably 'constant x = -1;') during compile time. If the association to a declared object is done at runtime, then it is not static binding.
If, for example, I write
int foo() { int y=17; return y; }
then the use of "y" is statically bound to the declaration "int y=17;". It can not later be persuaded to suddenly refer to some different "y", of a different type (or value).
Is this a "truly odd" case of a "result of the lookup process affecting the lookup process itself"? Because the lookup process during compilation yielded a local variable, a static binding was used. I don't see why this could not be the case with constants.
Ok, then the expression "x" isn't statically bound in my example.
The situation is different for a local variable inside a function body. If the lookup process for an identifier resolves such a variable, then it will always resolve to the same variable, regardless of the (static or dynamic) context of the function body. Thus the identifier expression is indeed statically bound in that case.
That is no longer (necessarily) true when the lookup process starts to search for declarations in a class scope, because class declarations can be overloaded. If it were somehow possible to inherit function bodies too and overload specific parts inside them then it wouldn't be true for local variables either.
I think it's right that the "constant" keyword controls read-onlyness without affecting binding.
> If the lookup process for an identifier resolves such a variable, then it will always resolve to the same variable, regardless of the (static or dynamic) context of the function body. Thus the identifier expression is indeed statically bound in that case.
Exactly. And that's how my intuition thinks a "constant" should work as well.
> I think it's right that the "constant" keyword controls read-onlyness without affecting binding.
"constant" isn't just a modifier, it's its own type of entity. If it had been "constant int x=7;" then I agree that it should only differ from "int x=7;" in read-onlyness. But since it is different from an instance variable, I don't see any problem with it having different binding from an instance variable, just like a local variable has a different binding from an instance variable.
By the way, it appears that constants _already_ have different binding rules than instance variables. Consider this code:
class foo {
  constant x=3;
  int f() { return x; }
}

class bar {
  inherit foo;
  constant x=7;
  array(int) g() { return ({ foo::x, bar::x, x, f() }); }
}
Here bar()->g() will return ({ 3, 7, 7, 7 }). However, change the "constant" to "int", and you'll instead get ({ 7, 7, 7, 7 }). So a constant is clearly already different in more ways than read-onlyness.
What you're seeing there is a peculiar result of how variable overloading works: When a variable (without "local" modifier) is overloaded, both the overloaded and the overloading definition refer to the same storage.
Immutable definitions, i.e. both functions(*) and constants, don't behave that way; they retain their original values (quite naturally, since they wouldn't be immutable otherwise).
This behavior of variable overloading is not self-evident. Long ago overloaded variables actually didn't share storage - each class got its own and the normal binding rules were applied to the identifier. Then the variable variety of your example would behave exactly as the constant case.
That was changed, if I recall correctly, mostly on pragmatic grounds: The only reason to overload a variable is either to modify its type or (more commonly) to override the initializer. In those cases, retaining the storage for the overloaded variable would just cause confusion (and hence bugs) and waste space in the objects.
Changing the type of an overloaded variable shouldn't really be allowed from a strict typing perspective either (the overloading type should only be allowed to be more restrictive to work when the variable is queried, and may only be more lax when the variable is set).
So the overloading paradigm doesn't really work that well with variables in several perspectives. Strictly speaking it shouldn't be allowed, but it is useful to override initializers.
These problems do not apply to immutable definitions though, to which both constants and functions belong.
*) One can replace the constant x with a function returning the value in your example - the function behaves in the same way as the constant.
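A minimal sketch of that footnote, reusing the class names from the example above with the constant swapped for a function (the expected result is inferred from the claim that functions behave like constants here, not from a verified run):

```pike
class foo {
  // Function standing in for "constant x = 3".
  int x() { return 3; }
  int f() { return x(); }
}

class bar {
  inherit foo;
  // Overrides foo::x for ordinary lookups, but foo::x keeps its own value.
  int x() { return 7; }
  array(int) g() { return ({ foo::x(), x(), f() }); }
}

int main() {
  write("%O\n", bar()->g());
  return 0;
}
```

If functions do follow the same rules as constants, g() would give ({ 3, 7, 7 }): the explicit foo::x() still reaches the inherited definition, while the unqualified calls see the override.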
Interestingly, it looks like getters and setters have inherited the variable behavior when it comes to overloading:
class A {
  int `x() {return 3;}
}
class B {
  inherit A;
  int `x() {return 7;}
  int f() {return A::x;}
}
Here B()->f() prints 7, i.e. B::`x is called even when A::x is queried. Is that correct? Spontaneously I'm inclined to regard a getter only as syntactic sugar for the corresponding function syntax:
class A {
  int get_x() {return 3;}
}
class B {
  inherit A;
  int get_x() {return 7;}
  int f() {return A::get_x();}
}
which correctly prints 3.
It does sound like a bug, but it is correct, if getters and setters are to mimic the behaviour of variables. If you do the following
class A { int x = 3; }
class B {
  inherit A;
  int x = 4;
  int get_a() { return A::x; }
}
you will always get 4 back from B's methods.
and how does one access the value of x that was set in the parent class?
greetings, martin.
You don't. The x:s use the same storage, so the 3 is not stored anywhere, only the 4.
Well, it is stored in __INIT for A, but that's not exactly easy to get at from B...
On a side note; one thing that would be useful when using getters and setters would be the ability to access the getter/setter function that was overloaded. eg:
class A {
  private string(0..255) x_storage;
  void `x=(string(0..255) x) {
    if (string.width(x)) error("String contains wide characters.\n");
    x_storage = x;
  }
}

class B {
  inherit A;
  void `x=(string(0..255) x) {
    string q = utf8_to_string(x); // Check that x is valid UTF8.
    A::x = x;
  }
}
Currently the above code will not work, since the line
A::x = x
will call B::`x=() rather than A::`x=().
I assume you meant "A::x = q;" (and string_to_utf8())?
If you use "::`x=(q);" instead it will work.
But then you'd have to know there is a setter in the inherited class. That could change as an implementation detail.
I'd still like to know the reasons for using the variable override rules for getters and setters. I think this example too shows that it would make more sense to treat them the same way as all other functions.
> But then you'd have to know there is a setter in the inherited class. That could change as an implementation detail.
Indeed. That could be fixed by letting ::`x= be the implicit setter if x is declared as a variable without getter or setter in the parent class, though.
By the way, that's how it already works for `[]. You can use ::`[] as a fallback in `[] without needing to know if the superclass implements `[] or not.
> I assume you meant "A::x = q;" (and string_to_utf8())?
No, I meant what I wrote. utf8_to_string() throws errors if the argument isn't a valid UTF-8 string. The assignment to q is there to ensure that the call isn't optimized away. The result is thus a variable where it is only possible to store valid UTF-8 strings.
> If you use "::`x=(q);" instead it will work.
True, I forgot that I had removed the forced private of the getter/setter functions. From program.c:
/* NOTE: The function needs to have the same PRIVATE/INLINE
 *       behaviour as the variable for overloading to behave
 *       as expected.
 *
 * FIXME: Force PRIVATE?
 */
> No, I meant what I wrote.
Ok, I misunderstood the purpose of the subclass. I don't think you meant what you wrote on this line though:
if (string.width(x)) error("String contains wide characters.\n");
Find two errors. :-)
> True, I forgot that I had removed the forced private of the getter/setter functions. From program.c:
> /* NOTE: The function needs to have the same PRIVATE/INLINE
>  *       behaviour as the variable for overloading to behave
>  *       as expected.
>  *
>  * FIXME: Force PRIVATE?
>  */
This is somewhat confusing to me. "The variable"? The setter is for "x", which is not declared as a variable. And if you were to make the setter private, wouldn't that make it impossible for B to override it in the first place?
> if (string.width(x)) error("String contains wide characters.\n");
> Find two errors. :-)
Oops... :-)
> > True, I forgot that I had removed the forced private of the getter/setter functions. From program.c:
> > /* NOTE: The function needs to have the same PRIVATE/INLINE
> >  *       behaviour as the variable for overloading to behave
> >  *       as expected.
> >  *
> >  * FIXME: Force PRIVATE?
> >  */
> This is somewhat confusing to me. "The variable"? The setter is for "x", which is not declared as a variable. And if you were to make the setter private, wouldn't that make it impossible for B to override it in the first place?
The declaration of a getter or setter function implicitly defines a variable with run-time storage type PIKE_T_GET_SET. The implicit variable derives its modifiers from the modifiers of the getter and setter functions. Forcing the setter function private would mean that it could only be accessed via the variable, in which case the inh::var syntax mentioned earlier would be necessary.
ah, now i understand. thanks.
that makes sense because unless the variable is declared local it won't be used anyways. but the same is true for functions, yet, when functions are overloaded, the original is still accessible through parent::
is there a reason why variables should not behave the same way? other than saving some storage space i can't see any advantage to share the storage and completely replace the parent version.
greetings, martin.
Saving storage is a good enough reason in itself, I think. But it has already been discussed elsewhere in this thread:
> This behavior of variable overloading is not self-evident. Long ago overloaded variables actually didn't share storage - each class got its own and the normal binding rules were applied to the identifier. Then the variable variety of your example would behave exactly as the constant case.
> That was changed, if I recall correctly, mostly on pragmatic grounds: The only reason to overload a variable is either to modify its type or (more commonly) to override the initializer. In those cases, retaining the storage for the overloaded variable would just cause confusion (and hence bugs) and waste space in the objects.
While your reasoning is logically flawless, Martin's is IMO a much more useful behaviour. And making constant a property on normal typed variables (changing their variability to constantness instead), would be even better still.
I think being able to inline values during compilation is rather useful, actually. I don't mind having a "read-only property" _as well_ though.
You ignore the main point of my argument, namely that the binding being static for a local variable simply is a result of the fact that there can't ever be any alternatives in that case. Or put another way, the question whether the binding is static or not is moot for local variables; one can just as well regard it as non-static and arrive at exactly the same result. In the case of class declarations that's clearly no longer true.
As for the looks of the constant construct, I don't think the syntactic difference is really relevant as to whether it should be orthogonal wrt binding or not. But anyway:
In my view "constant" is a modifier. Then the type is left out altogether from a constant declaration. I believe the reason for that has been that it can be inferred from the value. That approach is however not without problems since the type can get overly narrow (my original example in this thread shows one problematic case, but there are others). Due to those problems, it's just a matter of time before constant declarations will allow explicit typing.
> You ignore the main point of my argument, namely that the binding being static for a local variable simply is a result of the fact that there can't ever be any alternatives in that case.
You say that there "can't" be any alternatives, but what you mean is that we have constructed the language such that no alternatives based on the dynamic environment are considered. In the same way, we can make binding of constants not consider alternatives based on the dynamic environment.
To show what I mean, consider:
class X {
  constant a=1;
  int foo() { int b=7; return a+b; }
}
class Y {
  inherit X;
  constant a=2;
  int foo() { int b=14; return ::foo(); }
}
The variable "int b=14" _could_ be an alternative to "int b=7", when X::foo() is invoked in an object of class Y (dynamic context). But it isn't, because we don't want it to be. In the same way, whether "constant a=2" is an alternative to "constant a=1" is purely a matter of choice.
For that to become an issue we'd have to introduce function body inheritance first. Indeed there is choice, but the choice is - to begin with - whether to introduce that concept or not (my vote on that matter is no).
For the class level, inheritance and overriding is already a fact, and there are observable effects whether or not "constant" should imply static binding. I think it should not: 1) It'd unnecessarily break the orthogonality principle, 2) it'd introduce a spurious difference wrt binding of function identifiers, 3) we'd have to introduce a new "nolocal" modifier, and 4) it'd massively break compatibility.
How do other languages handle the same situation? Maybe they even prohibit redefining a const-flagged member variable in a subclass?
Java (approximating "constant" with "final"): Works like I want, the constant is statically bound. Defining a new constant with the same name (and a different type!) in a subclass is allowed, but does not alter the behaviour of inherited methods. The behaviour is the same regardless of whether I declare the constant as "static" or not.
C++ (approximating "constant" with "static const"): Like Java. Here the "static" is needed though, because I'm not allowed to give an initializer otherwise.
Your choice of approximation in the Java case reflects your own view; there is a "final" in Pike too which maps better to Java's "final".
If it wasn't clear earlier, I regard "constant" as a way to declare something that is like a function in almost all respects except for the type. With that view, "constant" has very clean semantics in Pike, and it's just as useful for overloading as functions are.
> Your choice of approximation in the Java case reflects your own view; there is a "final" in Pike too which maps better to Java's "final".
What approximation would you choose to reflect your view then?
About "final" in Pike, I wasn't aware of that. Using "final constant" seems to solve half the problem (the larger half). It doesn't make it possible to create a different constant with the same name in a subclass though.
> If it wasn't clear earlier, I regard "constant" as a way to declare something that is like a function in almost all respects except for the type. With that view, "constant" has very clean semantics in Pike, and it's just as useful for overloading as functions are.
Yes, but "something which is almost like a function" isn't really what you'd normally associate with the word "constant". You'd rather think "something which is almost like a #define". The fact that the bug in the type checker which was the start of this thread has appeared is a clear indication that my intuition isn't unique.
Why do we need something which is "almost like a function" anyway? Why not just use a function? To save some typing?
> What approximation would you choose to reflect your view then?
I don't know. My Java is too rusty and I'm not inclined to dig into it right now.
> About "final" in Pike, I wasn't aware of that. Using "final constant" seems to solve half the problem (the larger half). It doesn't make it possible to create a different constant with the same name in a subclass though.
No, but "local" does.
> Why do we need something which is "almost like a function" anyway?
Because we support access to non-functions in classes in Pike. If we thought functions were good enough to access constant values then we'd reasonably also apply the same approach to variables and therefore force all variables to be private, or at least protected. From an OO-theoretical point of view that'd make the language cleaner.
> Why not just use a function? To save some typing?
Maybe, to a small extent. More importantly it saves unnecessary function calls, and it's more convenient when there is a substantial risk that the identifier doesn't exist at all. That's why Error.Generic uses constants:
class MyError {
  inherit Error.Generic;
  constant error_name = "MyError";
  // ...
}

mixed err = catch (blabla());
if (objectp (err) && err->error_name == "MyError")
  my_error_handling (err);
else
  throw (err);
If we'd be forced to use functions here then the error test would become even clumsier:
if (objectp (err) && functionp (err->error_name) && err->error_name() == "MyError") ...
> Because we support access to non-functions in classes in Pike. If we'd think functions are good enough to access constant values then we'd reasonably also apply the same approach to variables and therefore force all variables to be private, or at least protected. From a OO-theoretical point of view that'd make the language more clean.
I don't see how this follows. Are you referring to the fact that constants can be accessed without adding "()", and that removing this, we should also remove the possibility to access variables without "()"? In that case, I did not suggest removing the ()s. Simply use a getter function.
int `x() { return 3; }
makes "x" accessible in exactly the same way as
constant x=3;
does. Well, apart from the fact that you currently can't call functions in class-values, but that is a bug we could fix. Other major OO languages allow it.
In the case of variables, we conceptually get default getters and setters for non-private variables unless they are overridden. I don't see any reason to change that.
> [...] More importantly it saves unnecessary function calls,
That sounds like an implementation detail that could be handled by the optimizer.
> and it's more convenient when there is a substantial risk that the identifier doesn't exist at all. That's why Error.Generic uses constants:
> class MyError {
>   inherit Error.Generic;
>   constant error_name = "MyError";
>   // ...
> }
>
> mixed err = catch (blabla());
> if (objectp (err) && err->error_name == "MyError")
>   my_error_handling (err);
> else
>   throw (err);
> If we'd be forced to use functions here then the error test would become even clumsier:
> if (objectp (err) && functionp (err->error_name) && err->error_name() == "MyError") ...
Again, define `error_name rather than error_name, and you can still use the more convenient variant.
string `error_name() { return "MyError"; } isn't as convenient though.
Isn't #define really what you want constants to be?
How about
string `error_name() = "MyError";
? If all we need is syntactic sugar, there are various ways that sugar could look.
#define is not what I want because it has a completely different (flat) scope system. Among other things, you can not access #defines in modules by indexing the module. Also, a #define contains a sequence of tokens rather than a value, which can be problematic at times.
Getters and setters still have some semantic issues, I think; see e.g. 16636404. But given that they are sorted out I'm all for that
constant x = 17;
would become semantically equivalent to
int `x() {return 17;}
provided that the binding semantics don't change.
> In the case of variables, we conceptually get default getters and setters for non-private variables unless they are overridden.
That's not quite true when it comes to overloading. If only the conceptual default getters and setters would be overridden for variables then they wouldn't share storage.
> That sounds like an implementation detail that could be handled by the optimizer.
Well, it better be sorted out first. There's currently quite a difference in direct access vs going through getters, so I suspect it'll require some doing to get the optimizer to cover that. Until then the performance considerations remain.
> Getters and setters still have some semantic issues, I think; see e.g. 16636404.
Finding issues to be fixed is what this thread is all about. :-) I think we have identified various improvements on functions to add to the 8.0 roadmap.
> > In the case of variables, we conceptually get default getters and setters for non-private variables unless they are overridden.
> That's not quite true when it comes to overloading. If only the conceptual default getters and setters would be overridden for variables then they wouldn't share storage.
I'm sorry, but I don't quite follow here. Can variables be overloaded? Can you give a code example of what you mean?
> Well, it better be sorted out first. There's currently quite a difference in direct access vs going through getters, so I suspect it'll require some doing to get the optimizer to cover that. Until then the performance considerations remain.
All in good time, of course. Right now we're just trying to decide where we want to go, and what is preventing us from getting there.
Ok. But I still don't see what you are getting at. The overridden variable shares storage with the overriding variable because of the declaration. It doesn't matter if you use the default setter/getter pair or if you write one explicitly. Or does it?
Case 1: Implicit setter/getter
class A { int x=3; }
class B { inherit A; int x=7; }
Both x's will share storage, and this storage will initially contain 7, if a B is instantiated. B()->x will return 7, and B()->x=4 will set the single storage for "x" to 4.
Case 2: Explicit setter/getter
class A {
  int x=3;
  int `x() { return x; }
  void `x=(int n) { x=n; }
}
class B {
  inherit A;
  int x=7;
  int `x() { return x; }
  void `x=(int n) { x=n; }
}
This should give the same result, no? Currently the compiler doesn't like this, so that is yet another thing to fix. But I don't get why the x:s would not share storage in one case when they do in the other.
local constant should do what you are looking for.
subclasses can overload local members, but the parent class will not use the overloaded version.
greetings, martin.
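A small sketch of that suggestion, loosely based on the classes from the start of the thread (the methods f and g are simplified or hypothetical, and the expected values follow from the description of local rather than from a verified run):

```pike
class A {
  // local: references to x from within A bind to this definition.
  local constant x = -1;
  int f() { return x; }
}

class B {
  inherit A;
  // A separate constant with the same name; A::f is not affected by it.
  constant x = 1;
  int g() { return x; }
}

int main() {
  B b = B();
  write("%d %d\n", b->f(), b->g());
  return 0;
}
```

With local on A's constant, f() would be expected to keep using -1 even in a B object, while g() sees B's own value 1.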
> For that to become an issue we'd have to introduce function body inheritance first. Indeed there is choice, but the choice is - to begin with - whether to introduce that concept or not (my vote on that matter is no).
And mine as well. And for the same reason that I don't want that (it is unintuitive), I don't want constants that work like they do now. Overriding of constants is also a concept that we had to introduce (after all, we had to introduce the concept of constants in the first place)...
But it appears that this behaviour has been implemented for a long time, so changing it would indeed be a compatibility problem, I can buy that argument.
As for the "nolocal" modifier, maybe this would be cool to have anyway? You could apply it to function scope variables to give them object lifetime. (And twice to achieve what I sketched in 16636824? :)
As for the lifetime issue, I'm for a more generic solution (which I think has been discussed earlier), namely a modifier where the storage is named explicitly. Something like this:
class X {
  int f() {
    // c is visible only in f but stored in the object instance.
    storage(X) int c;
    return c++;
  }
}

class X {
  int f() {
    // c is visible only in f but stored in the topmost class instance.
    storage(global::this) int c;
    return c++;
  }
}
(That could theoretically also be extended to quite wild things:
void kilroy (object x) {
  // Attach our own value to x.
  storage(x) marker = "kilroy was here";
}
But I'm not particularly keen on taking it that far.)
Hm, regarding the "nolocal" modifier, maybe it's needed after all? It turns out that enum, which is syntactic sugar for constant, implicitly adds local. So if it's important for constants to be able to be nonlocal, isn't it for enums (which are constants) as well?
I consider that a mishap. But it's not quite so severe as for normal constants since an enum is a collection of values: Overriding it is more a question of modifying the collection (i.e. adding or removing values, although removing values wouldn't be safe typewise) rather than changing the individual values.
But I don't really know how enums react to overriding in the collection sense either. I consider both enums and typedefs as incomplete features since the types they define can't be used everywhere. I hardly use them at all.
Regarding constant binding, there's even more murkiness in that area, though: Constants can be used in case labels in switches, and the binding there is implicitly static since the jump table for switches is always fixed at compile time. I'd like to see a warning for that so that people are encouraged to use local::foo instead.
The optimizer could also choose to recompile functions in inheriting classes to allow static binding. In some cases that can be an interesting optimization. I had that in mind when I wrote the RXML 2 parser; specifically RXML.Frame._eval is a rather unwieldy function that is designed to be tailored to the specific RXML.Frame instance by simply removing through constant optimization the 9/10th of it that doesn't apply in each specific case.
How come
class A { constant x = -1; }
class B { inherit A; constant x = 1; }
doesn't give a warning, but
class A { int(-1..-1) x = -1; }
class B { inherit A; int(1..1) x = 1; }
doesn't even compile?
I'm sure there's a good explanation, but it's probably related to this exact problem.
pike-devel@lists.lysator.liu.se