I've been annoyed at the divergent and fragmented implementations of html_encode_string, http_decode_string, etc. in Pike for quite some time.
This is what it looks like today, afaik:
       encode:                             decode:
       ----------------------------------------------------------------------
html | _Roxen.html_encode_string           Protocols.HTTP.unentity (partial)
http | Protocols.HTTP.http_encode_string   _Roxen.http_decode_string
xml  | -                                   -
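To make the split concrete, here is a minimal sketch that round-trips a string through the http pair from the table. It assumes a Pike build where both functions are still available under exactly these names; the helper name round_trip is made up for illustration only.

```pike
// Hypothetical helper (not part of Pike): encode with one module and
// decode with another, which is the mismatch the table shows.
string round_trip(string s)
{
  string encoded = Protocols.HTTP.http_encode_string(s);
  return _Roxen.http_decode_string(encoded);
}

int main()
{
  string s = "a b&c";
  write("%s -> %s\n", s, Protocols.HTTP.http_encode_string(s));
  write("round trip ok: %d\n", round_trip(s) == s);
  return 0;
}
```

Note that the caller has to know about two unrelated modules just to undo what one of them did.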
Where should we put these methods instead? Standards.{HTTP,HTML,XML}? Somewhere else?
The http-function belongs in the URL module, really. HTML functions in Parser.HTML (or Standards.HTML) somewhere, presumably. And XML in the XML module.
There is also Parser.html_entities, Parser.html_entity_parser, Parser.parse_html_entities and Parser.encode_html_entities.
I recently wanted to move the URI variable fiddling out of Protocols.HTTP and into Standards.URI, since Standards.URI is used anyway. Once I had added get_query_variables, set_query_variables, add_query_variable and add_query_variables, the problem of where the encoding should live arose, so I didn't check anything in.
Indeed it is; I didn't read properly. I thought a more natural place would be Protocols.HTTP, but now that I have checked the relevant RFC, the encoding is actually part of the URI syntax rather than HTTP. So instead I wonder why the function got named "http_encode_string" in the first place and not "uri_encode_string". So shouldn't it actually be Standards.URI.uri_encode?
Another reason why I want it in the object is that I like interfaces to be in the class that is going to be encoded rather than having an external function operating on that class.
The name is probably very old, like from the spider/µlpc time. Or even older (lpc4 or mudos).
Well, the fact that Standards.URI is a class and not a module is a bit of a problem, I think. If we're talking about a better name and place for http_encode_string, it's a function that just encodes a string and returns a string. That doesn't need a URI object and shouldn't require one. So in hindsight it's a bit unfortunate that Standards.URI wasn't made a pmod with a Standards.URI.URI inside it instead.
Of course, having a method in the URI class to let it encode itself to a properly quoted string is a different matter, but is of course nice too.
(Sorry, the last sentence there got a bit strange.)
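For illustration, a stand-alone encoder of the "string in, string out" kind described above could look roughly like this. It is only a sketch: the name uri_encode, the constant UNRESERVED, and the choice to escape everything outside the RFC 3986 unreserved set are assumptions for the example, not existing Pike API or anything decided in this thread.

```pike
// Characters that pass through unescaped (assumption: the RFC 3986
// "unreserved" set).
constant UNRESERVED =
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~";

// Hypothetical stand-alone function: needs no URI object at all.
string uri_encode(string s)
{
  String.Buffer buf = String.Buffer();
  // Work on the UTF-8 bytes so wide characters become %XX sequences.
  foreach ((array(int)) string_to_utf8(s), int c) {
    if (search(UNRESERVED, c) >= 0)
      buf->add(sprintf("%c", c));
    else
      buf->add(sprintf("%%%02X", c));
  }
  return buf->get();
}

int main()
{
  write("%s\n", uri_encode("a b/c?d=e"));
  return 0;
}
```

With this definition, "a b/c?d=e" comes out as "a%20b%2Fc%3Fd%3De". A URI class could still offer its own encode method on top of such a function.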
I see there's a quote() method in Standards.URI. It encodes only partially, in a bogus way that makes it dangerous to use. I suggest deprecating it and introducing a complete encoding function like the Roxen.http_encode_url found in WebServer 4.0 or later.
> So in hindsight it's a bit unfortunate that Standards.URI wasn't made a pmod with a Standards.URI.URI inside it instead.
In hinder still hindsight, it's a bit unfortunate that the Pike module and class system makes this the proper way of solving the issue of how to add stand-alone methods related to a problem domain of some class, but where a class instance isn't necessarily always around. It's a bit counter-intuitive and silly-looking until you understand the internals of the language.
Whether it smells better or worse than modules sporting a `() method to return class instances is of course also a matter of taste; that's a tradeoff between pretty interface names and pretty or even useful at all type information, IIRC? (At least from what I understand, that would make the module name useless as a type for the class returned.) Are there other problems with using `() methods like that?
Ohh, you're getting me started on that one now. It's really a mast-bait.. ;) Anyways, here it goes:
I'm of quite the opposite opinion regarding that.
Prime example: Thread and Thread.Thread. Thread is the module for all things thread related, while Thread.Thread is for actual thread instances. These two things are really very different, and they contain entirely different stuff. E.g:
o A thread related thing: A mutex, not related to any specific thread: Thread.Mutex
o A thread instance related thing: The thread identifying number: Thread.Thread()->id_number().
So why try to coerce these two things into the same namespace? What's the beauty in that? I think it's prettier when different things actually have different names too. So even if Pike had proper static class members, I'd strongly vote for having different names in cases like this.
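A small sketch of the distinction, assuming a Pike built with thread support. The function name run_worker is made up for the example; Thread.Mutex, Thread.Thread and id_number() are the ones named above.

```pike
// Hypothetical helper: starts one worker and returns its id number.
int run_worker(Thread.Mutex mutex)
{
  // Instance-level thing: every Thread.Thread object has its own id.
  Thread.Thread worker = Thread.Thread(lambda() {
    // Module-level thing: the mutex belongs to no particular thread.
    object key = mutex->lock();   // released when key goes out of scope
    return this_thread()->id_number();
  });
  // wait() hands back whatever the thread function returned.
  return worker->wait();
}

int main()
{
  write("worker id: %d\n", run_worker(Thread.Mutex()));
  return 0;
}
```

The mutex is created and passed around without any thread instance in sight, while the id only exists once there is one.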
One case is for the sake of inherits: It makes perfect sense to inherit the instance class without inheriting the module, and in many cases it also makes sense to inherit the module without inheriting the instance class. Not that I can come up with a good reason to inherit the Thread module, but e.g. inheriting a parser module for some language to make another one for a language derivate doesn't seem far fetched.
Another (closely related) case is typing: Obviously a class attempting to behave like a thread instance shouldn't need to contain mutexes and stuff to fulfill the Thread module too. Dreadful kludgery is necessary to coerce the type system to accept this - just take a peek in Thread.pmod where an attempt has been made.
Given the simple principle that different things should have different names, it's actually not the case that Thread.Thread and Standards.URI.URI are silly looking and only make sense if internals are considered. I can hardly think of more natural names, in fact. The closest could perhaps be Thread.Instance and Standards.URI.Instance, or the other way around Threading.Thread and, uhh, Standards.URIStandard.URI? Standards.URIStuff.URI? Nope, can't come up with anything there.
It's probably the repetition that bothers most people who think Thread.Thread looks ugly or even badly designed. Let's draw a parallel to Stdio.File: It's also a class for the principal object in a module that contains the related stuff. In that regard it doesn't seem far fetched that the module could have been called "File" instead of "Stdio". If it were, I bet there would have been opinions about joining File and File.File. But now the names happen to be different, and at least I haven't heard of anyone wanting to make Stdio.File() into Stdio().
As for having a `() in the module as an alias for create() in the instance class, that doesn't do much harm. I wouldn't regard it as particularly good either - it would obfuscate the code just to save a bit of space. Using the module name as an alias for the instance class type is otoh very messy indeed.
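As a rough sketch of what such a `() alias amounts to, a class can stand in for the module here so that everything fits in one file; in a real tree it would be a .pmod. All names (URIModule, URI, raw) are hypothetical and only serve the illustration.

```pike
// Stand-in for a module; hypothetical, not how Standards.URI is built.
class URIModule
{
  // The instance class proper.
  class URI
  {
    string raw;
    protected void create(string s) { raw = s; }
  }

  // Calling the "module" itself just forwards to the URI constructor.
  URI `()(string s) { return URI(s); }
}

int main()
{
  object mod = URIModule();
  object a = mod->URI("http://example.com/");  // explicit class name
  object b = mod("http://example.com/");       // via the `() alias
  write("%d\n", a->raw == b->raw);
  return 0;
}
```

Both calls build the same kind of object, but a variable declared with the type URIModule would describe the container, not the URI instances it returns, which is the typing mess referred to above.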
> Ohh, you're getting me started on that one now. It's really a mast-bait.. ;) Anyways, here it goes:
I know. :-) But it's an interesting subject. I just found that I would share most of your ideas, were the names just picked as collective nouns for modules and singular nouns for classes. (Which would translate to "Threads.Thread" in the specific case of "Thread.Thread".) A bit like the Files.File name of ancient ages, I suppose. On the other hand, I'm not sure I'd like Threads.Mutex, though probably mostly from being used to the name it has now, and that naming policy becomes absurd for GTK, GL and similar modules.
You are right on target about the repetition being the sole source of aesthetic mutterings -- in Stdio.File, I don't think anybody finds it the least bit unpleasant.
> As for having a `() in the module as an alias for create() in the instance class, that doesn't do much harm. I wouldn't regard it as particularly good either - it would obfuscate the code just to save a bit of space. Using the module name as an alias for the instance class type is otoh very messy indeed.
I suppose that mostly moves the issue from instantiation to the type, yes (assuming you meant having a Standards.URI.URI that also could be created via Standards.URI.`()).
We seem to agree that joining the module and class namespaces into one is bad, both for types and inheritance. I just wish we'd come up with some nice way of attaching methods and constants to a class name (i.e. Standards.URI.encode()) without by necessity messing with the looks of its object instances (Standards.URI()->encode).
Well, there should be a Standards.URI()->encode() too as Mirar suggested, but that's a different matter.
If we had real static functions it'd of course be simplest to add one in Standards.URI, but I don't think it's really the right place for it.
Perhaps a way to go is to move the module entirely? I'm not particularly fond of the "Standards" category at all. It's too vague and overlaps with other things. (I mean, http is a standard, so why not Standards.HTTP? MIME is a standard and yet it's at the top level.) Same goes for the "Tools" directory, for that matter.
I favor a flat namespace, and when it comes to well known acronyms like "URI" I don't see any problem with having it directly at the top level. So my suggestion is to move Standards.URI to URI.URI and add URI.encode_string.
Is it 100% impossible to add a backwards-compatible mechanism to Pike to allow for static (in the java meaning of the word) member methods and variables?
'protected' and 'private' were once upon a time implemented so that 'static' could get its "usual" meaning, unless I remember incorrectly.
'global' might work. But is not quite correct, since the identifier is not really global, only moved 'up' one level.
I'd say it's a decent name, especially if thought of in the sense of "class global". Would it risk confusing people about being related to the "global." prefix?
keyword suggestions: permanent, persistent, sticky, omnipresent, external
regarding the meaning of static i just read in kernighan and ritchie that static in c actually has two meanings, one of which seems to be very similar to static in pike: a function or variable declared static is only visible in the file it is declared in.
only inside a function static takes the meaning of permanent storage within a single function.
this suggests that pike's use of static is actually not as strange as once thought, considering its roots in c. (another mystery unraveled, making a note of this in the pike book)
greetings, martin.
We did once upon a time decide that this alternative modifier should be "static" for that, if I remember correctly.
Too bad we didn't choose hidden instead of static. That better describes what static means in Pike IMHO.
As far as new keyword I suggest "persistent".
"Persistent" carries the connotation that it is stored somewhere permanent. "Invariant" is probably more correct.
To be honest, I don't like either of those suggestions. I would really like to see the current use of static deprecated and changed in the future. The current static keyword could be changed to "hidden", which I think describes what it does fairly well.
pike-devel@lists.lysator.liu.se