The function low_stat in fdlib.c uses a sequence of FileTimeToSystemTime, SystemTimeToTzSpecificLocalTime and the obscure function __loctotime_t to convert FILETIME to time_t.
Not only is this unnecessarily complex, it completely breaks Pike on W9x - SystemTimeToTzSpecificLocalTime is not supported and always returns FALSE, so low_stat always fails and Pike refuses to find any modules.
The following calculation achieves the same result and makes Pike work on W9x again (example for ftLastWriteTime):
buf->st_mtime = (time_t)((((INT64)findbuf.ftLastWriteTime.dwHighDateTime << 32) + findbuf.ftLastWriteTime.dwLowDateTime - 116444736000000000) / 10000000);
Reference:
http://msdn.microsoft.com/library/en-us/sysinfo/base/converting_a_time_t_val...
Please consider implementing this change, as I'd really like to use Pike on a W9x system.
Regards, Axel
Does that work in the transition between DST and non-DST time? The reason for that code is to get correct time stamps when a file is modified in non-DST time and the stat is run in DST time or vice versa. I think that code actually comes from some knowledge base article or Visual Studio sample or something, but I fail to find any KB article with that code snippet (there are however several about this bug, notably 158588, 129574, and 190315).
With this in mind I doubt that SystemTimeToTzSpecificLocalTime only does that simple calculation, at least on post-W9x system. Anyway, if the code is changed this DST-switching case must be tested.
13.11.04 22:35:03, Martin Stjernholm wrote:
buf->st_mtime = (time_t)((((INT64)findbuf.ftLastWriteTime.dwHighDateTime << 32)
- findbuf.ftLastWriteTime.dwLowDateTime - 116444736000000000) / 10000000);
Does that work in the transition between DST and non-DST time?
Yes.
FILETIME holds the number of 100-nanosecond intervals since January 1, 1601 00:00 UTC.
time_t holds the number of seconds since January 1, 1970 00:00 UTC.
The reference points are absolute points in time - the notion of days and time-of-day doesn't come into it at all; thus timezones, DST and all that becomes irrelevant. You simply have to compensate the different reference points in time (the subtraction) and go from 100-nanoseconds to seconds (the division).
With this in mind I doubt that SystemTimeToTzSpecificLocalTime only does that simple calculation, at least on post-W9x system.
The calculation it does is actually even simpler. That function works on SYSTEMTIME structures, which hold a date and time split into year, month, day, hour and so on. This representation is dependent on the timezone and DST, in contrast to FILETIME or time_t.
If you use FileTimeToSystemTime to convert a FILETIME into the SYSTEMTIME representation, you get GMT time. Now if you want to know what time that was in *your* timezone, say GMT+02:00, you feed that SYSTEMTIME to SystemTimeToTzSpecificLocalTime. That function will now simply add 2 to the "hour" member of SYSTEMTIME and cope with overflows into the day, month and year members. Nothing more.
Now __loctotime_t does exactly the opposite - take a SYSTEMTIME specifying the date and time in the local timezone, adjust the time and maybe date to get GMT time, and then convert that into the time_t representation.
So SystemTimeToTzSpecificLocalTime and the first part of __loctotime_t exactly cancel each other out! I'd be quite surprised if that was indeed part of a KB article or coding sample, especially considering that http://msdn.microsoft.com/library/en-us/sysinfo/base/converting_a_time_t_val... describes the right way to do it.
Axel
Ok. I missed that you talked about the combination of SystemTimeToTzSpecificLocalTime and __loctotime_t.
Now I've investigated the origins of that code a bit more. It's actually originally from Microsofts own implementation of _stat in its C Runtime Library, which you get the source for with Visual Studio. It's then been hacked up a bit to fit pike (and the bug fixed, of course), but it's still a bit dubious if it's really kosher to have in pike.
This is what CRT does:
SYSTEMTIME SystemTime; FILETIME LocalFTime;
if ( !FileTimeToLocalFileTime( &findbuf.ftLastWriteTime, &LocalFTime ) || !FileTimeToSystemTime( &LocalFTime, &SystemTime ) ) { _dosmaperr( GetLastError() ); FindClose( findhandle ); return( -1 ); }
buf->st_mtime = __loctotime_t( SystemTime.wYear, SystemTime.wMonth, SystemTime.wDay, SystemTime.wHour, SystemTime.wMinute, SystemTime.wSecond, -1 );
In pike it's modified to:
SYSTEMTIME SystemTime; SYSTEMTIME UtcSTime;
if ( !FileTimeToSystemTime( &findbuf.ftLastWriteTime, &UtcSTime ) || !SystemTimeToTzSpecificLocalTime(NULL, &UtcSTime, &SystemTime) ) { _dosmaperr( GetLastError() ); FindClose( hFind ); return( -1 ); }
buf->st_mtime = __loctotime_t( SystemTime.wYear, SystemTime.wMonth, SystemTime.wDay, SystemTime.wHour, SystemTime.wMinute, SystemTime.wSecond, -1 );
To me it pretty much looks like the person that did the pike fix didn't know a better way to convert a SYSTEMTIME to time_t. The bug that it corrects is in the use of FileTimeToLocalFileTime, which uses the current time and not the filestamp time to decide whether to add the DST hour or not.
Now why does CRT use the roundabout conversion it does? There's a comment a bit earlier:
* Note: We cannot directly use the file time stamps returned in the * WIN32_FIND_DATA structure. The values are supposedly in system time * and system time is ambiguously defined (it is UTC for Windows NT, local * time for Win32S and probably local time for Win32C). Therefore, these * values must be converted to local time before than can be used.
So I gather that older windowsen store local time in the timestamps, as opposed to NT, and that FileTimeToLocalFileTime is a nop on the old OS's. I wonder if the pike version works correctly on Win32S and Win32C systems (whichever Win9x they map to). Is the same OS specific conversion difference present in FileTimeToSystemTime?
14.11.04 02:55:01, Martin Stjernholm wrote:
So I gather that older windowsen store local time in the timestamps, as opposed to NT, and that FileTimeToLocalFileTime is a nop on the old OS's.
No. The FILETIME values in the WIN32_FIND_DATA structure are defined as the number of 100-nanosecond intervals since January 1, 1601 00:00 UTC on *all* Windows versions. Likewise, the behaviour of FileTimeToLocalFileTime, and the bug, is identical on NT and W9x.
It all depends on the underlying *file system*.
On NTFS volumes, the time is stored as UTC, so it's easy to calculate the difference to January 1, 1601 00:00 UTC directly.
On FAT volumes, the time is stored as local time, as a heritage from MS-DOS days. This means that this local time must be converted to UTC first before calculating the FILETIME. During this conversion, all Windows versions exhibit the same bug, using the *current* DST status instead of the DST status at that time. So the FILETIME value can be off by one hour for FAT volumes.
Now the CRT _stat implementation uses FileTimeToLocalFileTime to convert this value back to local time - which suffers from the same DST bug. Then the undocumented function __loctotime_t is used to convert this local time to a time_t, and this function correctly accounts for DST at that time.
On FAT volumes, the conversion error while retrieving the FILETIME, and the conversion error of FileTimeToLocalFileTime exactly cancel each other out, so the final result of the _stat implementation is always correct.
On NTFS volumes however, there is no conversion error while retrieving the FILETIME because the file system already stores it as UTC, so the conversion error of FileTimeToLocalFileTime can cause the final result to be 1 hour off.
The current Pike implementation does it exactly the other way round - the result is always correct on NTFS volumes, but can be 1 hour off on FAT volumes.
The simple calculation I proposed has exactly the same properties as the current Pike implementation - the result is always correct on NTFS volumes, but can be 1 hour off on FAT volumes. What's more, it's considerably less bloated and does not completely break Pike on W9x.
If you want to always get correct results for NTFS *and* FAT volumes, you'd have to detect which kind of file system the file is in, and then use my proposed calculation for NTFS and the standard _stat implementation for FAT. However this would be quite a hassle just because of an occasional 1 hour off on FAT volumes.
Summing up, I renew my pledge to replace the current Pike implementation with my proposed calculation.
Cheers, Axel
No. The FILETIME values in the WIN32_FIND_DATA structure are defined as the number of 100-nanosecond intervals since January 1, 1601 00:00 UTC on *all* Windows versions. Likewise, the behaviour of FileTimeToLocalFileTime, and the bug, is identical on NT and W9x.
Doesn't that contradict the comment in the CRT source? (Not that I necessarily believe that the comment must be correct.)
However this would be quite a hassle just because of an occasional 1 hour off on FAT volumes.
The occasional one hour difference can produce quite nasty bugs, not least because it's so occasional. But if it isn't cheap to check the filesystem type, we probably can't afford to do it anyway. And in my view it's more important that it works correctly on ntfs.
Summing up, I renew my pledge to replace the current Pike implementation with my proposed calculation.
Your theory makes sense. I'd just like to know what this Win32S and Win32C that the comment talks about is.
Btw, speaking of DST bugs, it seems GetFileTime, which is used by Stdio.File.stat, has an even more nasty one:
FAT records times on disk in local time. GetFileTime retrieves cached UTC times from FAT. When it becomes daylight saving time, the time retrieved by GetFileTime will be off an hour, because the cache has not been updated. When you restart the machine, the cached time retrieved by GetFileTime will be correct.
Firstly, I don't see why it would get incorrect; it's afterall cached in UTC where there's no such thing as DST. Secondly, provided it actually is incorrect, I wonder why they don't fix it instead of just documenting it. Are they still stuck in the "well, everyone reboots at least once a day anyway" paradigm?
I wonder how many of these comments are still there from Windows 3.0. Or earlier.
15.11.04 00:50:02, Martin Stjernholm wrote:
No. The FILETIME values in the WIN32_FIND_DATA structure are defined as the number of 100-nanosecond intervals since January 1, 1601 00:00 UTC on *all* Windows versions. Likewise, the behaviour of FileTimeToLocalFileTime, and the bug, is identical on NT and W9x.
Doesn't that contradict the comment in the CRT source? (Not that I necessarily believe that the comment must be correct.)
The comment is very old, and what it refers to, albeit in a quite obscure way, is that Win32s incorrectly implements FILETIME based on local times. However this has not been carried over to Win32c aka W9x, I have tested and confirmed that W9x implements FILETIME exactly like NT/2K/XP. So the comment is not relevant for Pike anymore, which would never run on Win32s anyway.
But if it isn't cheap to check the filesystem type, we probably can't afford to do it anyway.
Well, it's not THAT expensive:
GetVolumeInformation( "C:", NULL, 0, NULL, NULL, NULL, szFs, 8);
For the first parameter, the root directory must be extracted from the file name. If the file name doesn't contain a drive specification, that parameter should be chosen as NULL, then GetVolumeInformation uses the current drive.
Then, if "NTFS" or "HPFS" is returned in szFs, the Pike implementation low_stat (changed to use my proposed calculation) should be used, otherwise the standard _stat function.
Btw, speaking of DST bugs, it seems GetFileTime, which is used by Stdio.File.stat, has an even more nasty one:
FAT records times on disk in local time. GetFileTime retrieves cached UTC times from FAT. When it becomes daylight saving time, the time retrieved by GetFileTime will be off an hour, because the cache has not been updated. When you restart the machine, the cached time retrieved by GetFileTime will be correct.
Firstly, I don't see why it would get incorrect; it's afterall cached in UTC where there's no such thing as DST.
Indeed, it seems that Micro$oft is very, very confused by DST. By "incorrect" they probably mean "incorrect in comparison to our already incorrect implementation of FILETIME values in WIN32_FIND_DATA", and not "incorrect" in the absolute sense of the word :)
Regarding Stdio.File.stat, it should probably be changed to use my proposed combination of _stat/_low_stat as well - avoiding redundancy and more bugs. BTW, I just noticed the function "convert_filetime_to_time_t" in that context - which exactly implements my proposed calculation, albeit a little more expensive by using floats instead of INT64...
Axel
GetVolumeInformation( "C:", NULL, 0, NULL, NULL, NULL, szFs, 8);
For the first parameter, the root directory must be extracted from the file name. If the file name doesn't contain a drive specification, that parameter should be chosen as NULL, then GetVolumeInformation uses the current drive.
What about paths starting with \?
15.11.04 13:40:03, Marcus Comstedt wrote:
GetVolumeInformation( "C:", NULL, 0, NULL, NULL, NULL, szFs, 8);
For the first parameter, the root directory must be extracted from the file name. If the file name doesn't contain a drive specification, that parameter should be chosen as NULL, then GetVolumeInformation uses the current drive.
What about paths starting with \?
For UNC paths, one has no option but to trust the corresponding network resource to return the correct FILETIME. So yes, UNC paths should be detected and the same calculation as for NTFS should be used.
Axel
Ok then. Do you already have a patch?
Well, it's not THAT expensive:
GetVolumeInformation( "C:", NULL, 0, NULL, NULL, NULL, szFs, 8);
The drive is already extracted in that function, so it's afterall not that much additional work. One probably ought to bench it properly, but..
Indeed, it seems that Micro$oft is very, very confused by DST. By "incorrect" they probably mean "incorrect in comparison to our already incorrect implementation of FILETIME values in WIN32_FIND_DATA", and not "incorrect" in the absolute sense of the word :)
Even with that quirky interpretation of "incorrect", it doesn't really explain how a reboot should change matters.
/.../ BTW, I just noticed the function "convert_filetime_to_time_t" in that context - which exactly implements my proposed calculation, albeit a little more expensive by using floats instead of INT64...
Yes, it could probably use a touch-up too.
15.11.04 23:40:02, Martin Stjernholm wrote:
Ok then. Do you already have a patch?
No, because the proposed change has evolved quite a bit during this thread. I think I'll better leave it to an experienced Pike developer to make the final implementation.
Indeed, it seems that Micro$oft is very, very confused by DST. By "incorrect" they probably mean "incorrect in comparison to our already incorrect implementation of FILETIME values in WIN32_FIND_DATA", and not "incorrect" in the absolute sense of the word :)
Even with that quirky interpretation of "incorrect", it doesn't really explain how a reboot should change matters.
After a reboot, GetFileTime will recalculate the cached UTC times according to the current DST status, so they will once again be consistent with the values returned in WIN32_FIND_DATA, and thus by M$ definition "correct".
Axel
I think I'll better leave it to an experienced Pike developer to make the final implementation.
In this case I'd say the pike pecularities are minimal while the windows pecularities are monumental. Oh well, I'll try to whip together something. The main problem is that I don't have much time to do the proper testing. And constructing testsuite tests are probably not possible, either.
After a reboot, GetFileTime will recalculate the cached UTC times according to the current DST status, so they will once again be consistent with the values returned in WIN32_FIND_DATA, and thus by M$ definition "correct".
In other words, the cache will after reboot be recalculated with the bogus DST compensation, and it will at that point not be UTC afterall, but offset with an hour. Thus, if pike tries to compensate for the bogusness in the way we've discussed here, it will be one hour off in the opposite direction until the machine is rebooted. So this is still an issue, and working around it would require fd_fstat to look at the system uptime or something. :P
An attempt at fixing it is committed to 7.7. It's not tested at all, not even compiled. It's actually rather unfortunate that there's no NT system in xenofarm anymore.
I gave up on fixing a workaround for GetFileTime since I couldn't find any reasonably simple way to get the volume information for a file handle.
I began setting up a semipermanent machine for XP-testing last weekend. I'll try to get it finished this weekend.
We need to finish the MinGW port, so that it's easy to compile pike nativly under Win32.
pike-devel@lists.lysator.liu.se