Well. Are [ and \ really that common in glob patterns? (The addition of quoting and making [ a control character are the incompatible changes.)
It is possible to use 8.0::glob (not yet in place).
Or this, which is future compatible (8.0:: would be removed after a while):
    add_constant("glob", lambda(array|string a, array|string b) {
        // Quote the characters that became special in 8.1 so that
        // old patterns keep their literal meaning.
        array(string)|string compat(array|string what) {
          if (arrayp(what)) return map(what, this_function);
          return replace(what, ([ "\\": "\\\\", "[": "\\[" ]));
        };
        return !!glob(compat(a), b);
      });
(since I don't think any of the array versions are used in roxen, it can be simplified a bit.. :))
Anyway, the glob function in 8.1 complies with the definition of glob in Wikipedia, for whatever that is worth.
As for why glob was modified:
The lack of quoting made it impossible to match * or ?. The [ is less important, to me at least; I just added that to be feature complete (or, well, complete with respect to the 'basic' glob pattern syntax).
After perusing the roxen source, there are very few locations where the pattern is not a simple constant *.<ext>; only a few places take the pattern as input from the user (web developer). In most cases this seems to then be matched against a MIME type ([ and \ being somewhat uncommon in those) or a file (not path) name (no \, though theoretically [ can occur).
Of course, custom code can do whatever.
On Fri, May 13, 2016 at 4:10 PM, Jonas Walldén @ Pike developers forum 10353@lyskom.lysator.liu.se wrote:
Just saw Per's checkin (781fcde2) in 8.1:
Extended glob pattern syntax:
  o \ can now be used to quote special characters in the pattern
  o [ can be used for ranges of characters ([bx] [a-c0-9] [^a] etc).
Also changed glob to return the matching glob instead of 1 when an array is passed as the first (pattern) argument.
This can be used to remove some loops where you want to do different things depending on which pattern matched.
Both these changes are incompatible.
Can we please put this in a new method or use opt-in with an extra flag? I'm not too worried about the return value, but the changes in glob parsing sound problematic to me.
At least for our customers there are plenty of globs in user data (RXML code, system configs, custom modules etc) that could misbehave silently, i.e. without producing compile-time errors.
It is also how glob works in other languages, such as Perl, Python and Go, if that is seen as more relevant.
However, only some of them have escaping; in the others, [*] is used to match a literal *. I can remove the \, but then [ is more needed.
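A minimal sketch of the [*] idiom, using Python's fnmatch module as one example of a glob without backslash escaping. The helper name `matches` is made up for this illustration, and Pike's own glob may differ in details.

```python
import fnmatch

def matches(pattern, name):
    """Case-sensitive glob match (hypothetical helper around fnmatch)."""
    return fnmatch.fnmatchcase(name, pattern)

# A one-character class containing only '*' matches a literal asterisk.
print(matches("a[*]b", "a*b"))   # True
print(matches("a[*]b", "axb"))   # False

# fnmatch has no escape character: the backslash is matched literally,
# and the '*' after it is still a wildcard.
print(matches(r"a\*b", "a*b"))   # False
print(matches(r"a\*b", r"a\xb")) # True
```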
On Fri, May 13, 2016 at 4:38 PM, Per Hedbor per@hedbor.org wrote:
Well. Are [ and \ really that common in glob patterns? (The addition of quoting and making [ a control character are the incompatible changes.)
It is possible to use 8.0::glob (not yet in place).
Or this, which is future compatible (8.0:: would be removed after a while):
    add_constant("glob", lambda(array|string a, array|string b) {
        // Quote the characters that became special in 8.1 so that
        // old patterns keep their literal meaning.
        array(string)|string compat(array|string what) {
          if (arrayp(what)) return map(what, this_function);
          return replace(what, ([ "\\": "\\\\", "[": "\\[" ]));
        };
        return !!glob(compat(a), b);
      });
(since I don't think any of the array versions are used in roxen, it can be simplified a bit.. :))
Anyway, the glob function in 8.1 complies with the definition of glob in Wikipedia, for whatever that is worth.
As for why glob was modified:
The lack of quoting made it impossible to match * or ?. The [ is less important, to me at least; I just added that to be feature complete (or, well, complete with respect to the 'basic' glob pattern syntax).
After perusing the roxen source, there are very few locations where the pattern is not a simple constant *.<ext>; only a few places take the pattern as input from the user (web developer). In most cases this seems to then be matched against a MIME type ([ and \ being somewhat uncommon in those) or a file (not path) name (no \, though theoretically [ can occur).
Of course, custom code can do whatever.
On Fri, May 13, 2016 at 4:10 PM, Jonas Walldén @ Pike developers forum 10353@lyskom.lysator.liu.se wrote:
Just saw Per's checkin (781fcde2) in 8.1:
Extended glob pattern syntax:
  o \ can now be used to quote special characters in the pattern
  o [ can be used for ranges of characters ([bx] [a-c0-9] [^a] etc).
Also changed glob to return the matching glob instead of 1 when an array is passed as the first (pattern) argument.
This can be used to remove some loops where you want to do different things depending on which pattern matched.
Both these changes are incompatible.
Can we please put this in a new method or use opt-in with an extra flag? I'm not too worried about the return value, but the changes in glob parsing sound problematic to me.
At least for our customers there are plenty of globs in user data (RXML code, system configs, custom modules etc) that could misbehave silently, i.e. without producing compile-time errors.
Not all of our source is public so I doubt you can check that. :-)
Just one example where there could be a problem: filename pattern matching. We have customers that get news wires with odd filenames, and the pattern is user-configurable in the import module. It's the [ character that I think poses the biggest challenge, not the */? escaping.
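To make the concern concrete, here is a small sketch in Python rather than Pike. Python's fnmatch stands in for a bracket-aware glob, and `legacy_glob` is a hypothetical helper that treats only * and ? as special, roughly like the old behaviour. The file names are invented, and Pike 8.1 may differ in details.

```python
import fnmatch
import re

def legacy_glob(pattern, name):
    """Hypothetical old-style glob: only '*' and '?' are wildcards."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # '[' and '\' stay literal
    return re.fullmatch("".join(parts), name, re.DOTALL) is not None

def bracket_glob(pattern, name):
    """Bracket-aware glob, using fnmatch as a stand-in."""
    return fnmatch.fnmatchcase(name, pattern)

pattern = "wire[2016]*.xml"  # user-configured pattern with literal brackets

# Old reading: the brackets are part of the file name.
print(legacy_glob(pattern, "wire[2016]-01.xml"))   # True
print(legacy_glob(pattern, "wire2-01.xml"))        # False

# Bracket-aware reading: [2016] is a character class, so the same
# pattern silently selects a different set of files.
print(bracket_glob(pattern, "wire[2016]-01.xml"))  # False
print(bracket_glob(pattern, "wire2-01.xml"))       # True
```

Neither reading raises an error, which is the silent change in behaviour described above.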
It is also how glob works in other languages, such as Perl, Python and Go, if that is seen as more relevant.
I don't see how that justifies the incompatible change. And this seems like feature creep toward regexps, so it's not like it's been impossible to accommodate these needs already with a regexp.
Could it not be named Glob(), String.glob(), globx() or glob(a, b, 1)?
Another approach would be to claim glob(array, ...) as the extended syntax since it's only been available since 8.0, and the caller needing this for a single glob can arrayify the parameter easily.