Why doesn't format_iso_short() in the calendar module work as advertised?
refdocs:
iso_short "2000-06-02T00:00:00"
reality:
Pike v7.7 release 34 running Hilfe v3.5 (Incremental Pike Frontend)
Calendar.ISO_UTC.Second(time())->format_iso_short();
(1) Result: "20070813T19:56:13"
Where did the dashes go? :-(
In the last episode (Aug 13), Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum said:
> Why doesn't format_iso_short() in the calendar module work as advertised?
> refdocs:
> iso_short "2000-06-02T00:00:00"
> reality:
> Pike v7.7 release 34 running Hilfe v3.5 (Incremental Pike Frontend)
> Calendar.ISO_UTC.Second(time())->format_iso_short();
> (1) Result: "20070813T19:56:13"
> Where did the dashes go? :-(
Looks like format_iso_short() should call format_ymd() instead of format_ymd_short().
The standard allows both variants. The ContentDirectory:1 specification (which is what I'm trying to implement ATM) requires the variant with dashes though. So the current behaviour is not non-ISO, just non-what's-documented. And besides, it seems silly to omit the dashes but keep the colons (which are also allowed to be omitted by ISO 8601)...
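For anyone who needs the dashed variant right now, here is a minimal sketch that glues the date and time-of-day parts together by hand. format_ymd() is the method named above; format_tod() is an assumption taken from the Calendar reference docs (listed there as giving "00:00:00"), and iso_with_dashes() is just a hypothetical helper name.

```pike
// Sketch only: build "YYYY-MM-DDTHH:MM:SS" without format_iso_short().
// Assumes format_ymd() yields "YYYY-MM-DD" and format_tod() "HH:MM:SS".
string iso_with_dashes(object t)
{
  return t->format_ymd() + "T" + t->format_tod();
}

int main()
{
  // Fixed timestamp (the one from the Hilfe session above) so that the
  // result does not depend on when this is run.
  object t = Calendar.ISO_UTC.Second(2007, 8, 13, 19, 56, 13);
  write("%s\n", iso_with_dashes(t));
  return 0;
}
```

This sidesteps the compatibility question entirely, since it does not depend on what format_iso_short() ends up returning.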
Fix it to comply with the docs. Preferably, add a testsuite test too. (I'm guessing you already did either or both, but am too lazy to peek)
Wouldn't it be better to fix the docs to match the current output if it's in fact valid?
Since no one reacted before, I doubt anyone uses it, and I agree that the dashes are better and make more sense (just like I (?) wrote in the docs).
Speaking for myself I often try indices(foo) and then foo->bar() to see if bar is the method I'm looking for without even reading the docs. A lack of reaction doesn't mean nobody is using the function; it could just as well be that people have adapted their code to the current output format.
I also agree dashes would be more consistent, but this is above all a compatibility issue. I couldn't find any calls in our Roxen source base, but that doesn't include any customer-written modules for instance.
I didn't mean that people avoid it because it doesn't work as documented, but that its output is so inconsistent, and so rarely useful, that some other function was selected instead.
I'm betting 75% of all users trial-and-error their way through functions until they find one that works, due to the state of the documentation, so I wouldn't be surprised if someone was dependent on that format.
It's true that it's an easy way to find the functions, but as iso_short is pretty useless, I would say there's only about a 10% chance that anyone is using it and depends on its exact output.
I don't use it, and I use a lot of nasty Calendar stuff.
So whatever solution you decide on, I think it's OK.
The most likely dependency graph goes like this:
1. Need timestamp for log. ISO sounds like it's standardized and good.
2. Implement log parser.
3. Watch logparser break when format_iso_short is changed.
Other than that there is very little chance of something breaking internally in programs if the format is changed.
Which reminds me of an idea I had yesterday, that there should be a function which takes a string in ISO 8601 format (_any_ ISO 8601 format) and returns an appropriate Calendar object (Second, Minute, Day, Week, Month or Year depending on the string)...
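A rough sketch of how the dispatch part of such a function might look, restricted to a handful of extended-format layouts. Nothing here is an existing Calendar API: shape() and iso_unit() are hypothetical helpers, and the sketch only names the unit instead of constructing the object.

```pike
// Replace every digit with "d" so that only the layout of the string
// remains, e.g. "2007-08-13" becomes "dddd-dd-dd".
string shape(string s)
{
  return replace(s, "0123456789" / "", ({ "d" }) * 10);
}

// Returns the name of the Calendar unit that matches the layout of s,
// or 0 if the layout is not one of the few this sketch knows about.
string iso_unit(string s)
{
  // A literal "d" in the input would otherwise pass for a digit.
  if (has_value(s, "d")) return 0;
  switch (shape(s)) {
    case "dddd":                return "Year";
    case "dddd-dd":             return "Month";
    case "dddd-Wdd":            return "Week";
    case "dddd-dd-dd":          return "Day";
    case "dddd-dd-ddTdd:dd":    return "Minute";
    case "dddd-dd-ddTdd:dd:dd": return "Second";
  }
  return 0;
}

int main()
{
  foreach (({ "2007", "2007-08", "2007-W33", "2007-08-13",
              "2007-08-13T19:56", "2007-08-13T19:56:13", "bogus" }),
           string s)
    write("%-22s -> %O\n", s, iso_unit(s));
  return 0;
}
```

Basic (separator-free) formats, time zones and range checks on the fields are all left out on purpose; they are where the real work would be.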
Can the Calendar module handle ISO 8601 time zones (including +hh:mm = +hhmm, and +hh)?
I tried to figure out how the timezone management in the Calendar module worked yesterday, but I failed...
More specifically, I know how to make a time object in my current timezone and in UTC:
Pike v7.6 release 86 running Hilfe v3.5 (Incremental Pike Frontend)
Calendar.ISO.Second(2007,01,01,20,23,14);
(1) Result: Second(Mon 1 Jan 2007 20:23:14 CET)
Calendar.ISO.Second(2007,01,01,20,23,14)->tzname();
(2) Result: "CET"
Calendar.ISO_UTC.Second(2007,01,01,20,23,14);
(3) Result: Second(Mon 1 Jan 2007 20:23:14 UTC)
Calendar.ISO_UTC.Second(2007,01,01,20,23,14)->tzname();
(4) Result: "UTC"
But how do I create one with an arbitrary timezone?
Calendar.ISO->set_timezone("Europe/Helsinki")->Second(2006,11,17,10,11,12);
(2) Result: Second(Fri 17 Nov 2006 10:11:12 EET)
set_timezone should take any Calendar.Timezone-compatible object, in case you feel like making +47:11. It exists in all Calendar and Time objects and returns the same object with the timezone changed.
Great. Any particular reason why it accepts "-14" but not "-15"? AFAIK -12 is the "latest" timezone actually in existence (and nobody lives there... :).
I think someone actually made -14 for a bit. Or at least +14, something about being first into the new millennium... Other than that I can't really answer.
+14 is in use by Kiribati (Christmas Island and friends), yes. Also in use are +13:45 and +12:45...
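Independently of what the Calendar module accepts, the three ISO 8601 spellings of an offset asked about earlier (+hh:mm, +hhmm and +hh) are easy to normalise by hand. A minimal sketch, where offset_minutes() is a hypothetical helper and not part of Calendar:

```pike
// Converts "+hh", "+hhmm" or "+hh:mm" (or the "-" variants) into
// minutes east of UTC. Throws an error on anything else.
// Lenient sketch: it strips every ":" before counting digits and does
// not range-check the hour or minute fields.
int offset_minutes(string tz)
{
  if (sizeof(tz) < 3 || (tz[0] != '+' && tz[0] != '-'))
    error("Not a numeric UTC offset: %O\n", tz);
  int sign = (tz[0] == '-') ? -1 : 1;
  string digits = tz[1..] - ":";
  int hh, mm;
  if (sizeof(digits) == 2) {
    hh = (int)digits;
  } else if (sizeof(digits) == 4) {
    hh = (int)digits[..1];
    mm = (int)digits[2..];
  } else {
    error("Not a numeric UTC offset: %O\n", tz);
  }
  return sign * (hh * 60 + mm);
}

int main()
{
  foreach (({ "+14", "+13:45", "+1245", "-05" }), string tz)
    write("%-7s -> %d minutes\n", tz, offset_minutes(tz));
  return 0;
}
```

Working in minutes rather than whole hours is what makes offsets such as +13:45 and +12:45 representable at all.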
+1.
That reminds me of a feature I miss in sscanf. Present good behaviour:
array_sscanf("1234567890123","%10d%d");
(1) Result: ({ /* 2 elements */ 1234567890, 123 })
array_sscanf("1234567890123","%3d%d");
(2) Result: ({ /* 2 elements */ 123, 4567890123 })
Present bad behaviour:
array_sscanf("1234567890123","%-3d%d");
(3) Result: ({ /* 2 elements */ 123, 4567890123 })
Wanted behaviour:
array_sscanf("1234567890123","%-3d%d");
(4) Result: ({ /* 2 elements */ 1234567890, 123 })
array_sscanf("1234567890123","%-4d%2d%d");
(5) Result: ({ /* 3 elements */ 123456789, 1, 23 })
> Present bad behaviour:
> array_sscanf("1234567890123","%-3d%d");
> (3) Result: ({ /* 2 elements */ 123, 4567890123 })
Yes? This is the expected behaviour.
> Wanted behaviour:
> array_sscanf("1234567890123","%-3d%d");
> (4) Result: ({ /* 2 elements */ 1234567890, 123 })
?? You want it to be relative to the size of the first argument?
>> Present bad behaviour:
>> array_sscanf("1234567890123","%-3d%d");
>> (3) Result: ({ /* 2 elements */ 123, 4567890123 })
> Yes? This is the expected behaviour.
It is, from old habit, but as it is the same behaviour as
array_sscanf("1234567890123","%3d%d");
(2) Result: ({ /* 2 elements */ 123, 4567890123 })
it adds no value. That is, a bad phrasing of "we have a presently useless redundant flag which does not affect %d's behaviour" on my part. (Only confirmed by testing; the source, or a bearded hacker's better knowledge, may prove this wrong.)
>> Wanted behaviour:
>> array_sscanf("1234567890123","%-3d%d");
>> (4) Result: ({ /* 2 elements */ 1234567890, 123 })
> ?? You want it to be relative to the size of the first argument?
Specifically, I want it to be relative to the current integer part being processed, the way present %d is the special case %-0d with my proposed improvement (rather than implementing %1d). Better example:
array_sscanf("+1234567890123 is a valid ISO date","+%-4d%2d%d%s");
(6) Result: ({ /* 4 elements */ 123456789, 1, 23, " is a valid ISO date" })
Which would be good, because sscanf presently can't parse those.
Could of course be argued that it's hackish and the job be better left to regexes and string-to-int casts, too.
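For comparison, the wanted result can be had today with a character-set directive and plain string slicing, without regexes. A minimal sketch of what "+%-4d%2d%d%s" is meant to do in the example above; parse_long_date() is a hypothetical helper, and it relies on sscanf's %[0-9] set matching the whole run of digits.

```pike
// Emulates the proposed "+%-4d%2d%d%s": the first field takes every
// digit of the run except the last four, which are then split 2 + 2.
// Returns 0 if the input does not have that shape.
array parse_long_date(string s)
{
  string digits, rest;
  if (sscanf(s, "+%[0-9]%s", digits, rest) != 2 || sizeof(digits) < 5)
    return 0;
  int n = sizeof(digits);
  return ({ (int)digits[..n - 5],      // everything but the last 4
            (int)digits[n - 4..n - 3], // next two digits
            (int)digits[n - 2..],      // final two digits
            rest });
}

int main()
{
  write("%O\n", parse_long_date("+1234567890123 is a valid ISO date"));
  return 0;
}
```

That it takes this much bookkeeping for one field is arguably the case for the flag; that it is possible at all is the case against.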
Hm, maybe that "_any_" has to be qualified a bit. I just realized that the standard is a bit ambiguous.
2007-05
can mean either May 2007, or the time 20:07 in Mexico (UTC-5).
Give precedence to extended formats? (i e: if there is one possible reading that uses proper separators, where the other reading is one that has traded them away, like the : missing from 20:07-05, always pick the extended format instead.)
Yeah, so the qualification in question would be something like "any format that is unabbreviated, or in an abbreviated format which can not be misinterpreted as an unabbreviated one".
I think all ambiguities that arise are caused by time-only specifications where the colon has been omitted. Apart from the one I mentioned, there's "2007" which can mean either the year 2007, or the time 20:07 (without timezone information).
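To make the two ambiguous cases concrete, here is a small sketch that lists every reading of a string it can find, calendar reading first, so that element 0 is the one the "prefer extended formats" rule would pick for "2007-05". readings() is a hypothetical helper; it assumes the input holds only digits and dashes, and it covers just the two layouts discussed here. Putting the year before the time for "2007" is an arbitrary choice of this sketch, not something the rule decides.

```pike
// Lists the possible interpretations of s, calendar reading first.
array(string) readings(string s)
{
  array(string) found = ({});
  int a, b, c;

  if (sizeof(s) == 7) {
    // Extended calendar format: YYYY-MM
    if (sscanf(s, "%4d-%2d", a, b) == 2 && b >= 1 && b <= 12)
      found += ({ sprintf("month %04d-%02d", a, b) });
    // Basic time with an hour-only negative offset: hhmm-hh
    if (sscanf(s, "%2d%2d-%2d", a, b, c) == 3 && a < 24 && b < 60)
      found += ({ sprintf("time %02d:%02d at UTC-%02d", a, b, c) });
  }
  if (sizeof(s) == 4) {
    // A bare year: YYYY
    if (sscanf(s, "%4d", a) == 1)
      found += ({ sprintf("year %04d", a) });
    // Basic time without offset: hhmm
    if (sscanf(s, "%2d%2d", a, b) == 2 && a < 24 && b < 60)
      found += ({ sprintf("time %02d:%02d", a, b) });
  }
  return found;
}

int main()
{
  foreach (({ "2007-05", "2007", "2507" }), string s)
    write("%-8s -> %O\n", s, readings(s));
  return 0;
}
```

A string like "2507" has only one reading here, since 25 is not a valid hour, which is the kind of case the qualified "_any_" would still accept.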
Ambitiously also adding the proper fallback in 7.6 compat mode would be another improvement in style. I don't think a behaviour so clearly inconsistent with itself is anything to stick with though, even if it doesn't break the letter of the standard it vaguely references.
It is a lot easier to get "2007-08-14T09:26:22", if that is what you wanted, from replace(Calendar.ISO_UTC.Second(time())->format_time()," ","T") than from the output of Calendar.ISO_UTC.Second(time())->format_iso_short(), so I don't quite buy into your argument. The given output also looks too suspicious to be anything I would depend on before having verified in the docs that it is indeed intended that way.
Possible, but not worth keeping around API cruft for, IMO.
pike-devel@lists.lysator.liu.se