Thanks for the replies, everybody! I also ran into a CPU-eating regexp using '*', but I assumed it was somehow my own fault :) But where do I go from here? There's also no trim_right (or trim_left) function; does anybody have something like that lying around? Otherwise I guess I just have to process the string right-to-left, char by char...
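For what it's worth, the right-to-left scan described above only takes a few lines. The following is a minimal sketch, not a library function: the name trim_right, the optional chars argument and its default set of whitespace characters are all made up for illustration. (String.trim_whites and String.trim_all_whites do exist, but they trim both ends.)

```pike
// Hypothetical helper, not part of Pike's standard library.
// Removes every trailing character of s that occurs in chars.
string trim_right(string s, void|string chars)
{
  // Assumed default: space, tab, CR and LF.
  if (!chars) chars = " \t\r\n";

  int end = sizeof(s);
  // Walk backwards while the last kept character is in the strip set.
  while (end > 0 && search(chars, s[end-1..end-1]) != -1)
    end--;

  // Guard the empty case explicitly instead of relying on s[..-1].
  return end ? s[..end-1] : "";
}

int main()
{
  write("%O\n", trim_right("foo bar  \t\n"));   // "foo bar"
  write("%O\n", trim_right("path///", "/"));    // "path"
  return 0;
}
```

A trim_left would be the mirror image: advance a start index from 0 while the character at that index is in the strip set, then return s[start..].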
Greetings,
Coen
Bertrand LUPART wrote:
Mirar @ Pike developers forum 10353@lyskom.lysator.liu.se wrote:
However, I'd like to know why, when doing this:
----8<----8<----8<----8<----
object r = Regexp.PCRE.Studied("[\W]*$");
r->replace(foo, "");
---->8---->8---->8---->8----
Pike eats all my CPU and the command never finishes.
Good question. The PCRE code should be easy to read though, feel free to investigate? :)
When using n* (zero or more n), Regexp.PCRE._pcre()->exec() can return an array of two identical ints, i.e. a zero-length match:
----8<----8<----8<----8<----
$ pike
Pike v7.6 release 112 running Hilfe v3.5 (Incremental Pike Frontend)
Regexp.PCRE._pcre("o+")->exec("foobar",0);
(1) Result: ({ /* 2 elements */ 1, 3 })
Regexp.PCRE._pcre("o*")->exec("foobar",0);
(2) Result: ({ /* 2 elements */ 0, 0 })
---->8---->8---->8---->8----
For each replacement, Regexp.PCRE()->replace() re-runs the regular expression starting at the end of the previous hit, using the array returned by exec() as the start and end offsets of the match. When exec() returns identical start and end offsets (a zero-length match), the search position never advances, which results in an infinite loop.
----8<----8<----8<----8<----
$ pike
Pike v7.6 release 112 running Hilfe v3.5 (Incremental Pike Frontend)
string foo = "foobar";
Regexp.PCRE("o+")->replace(foo,"");
({ /* 2 elements */ 1, 3 })
-1
(1) Result: "fbar"
Regexp.PCRE("o*")->replace(foo,"");
({ /* 2 elements */ 0, 0 })
({ /* 2 elements */ 0, 0 })
({ /* 2 elements */ 0, 0 })
({ /* 2 elements */ 0, 0 })
... infinite loop ...
---->8---->8---->8---->8----
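One possible way around this on the caller's side is to step one character forward whenever a match is empty. The following is only a sketch of that idea, not the actual Regexp.PCRE code and not a proposed patch: all_matches is a made-up name, and it assumes that exec() returns a negative int when nothing matches (as the -1 in the session above suggests), that the first two array elements are the start and end offsets of the whole match, and that exec() accepts a start offset equal to the subject length.

```pike
// Hypothetical helper: collects ({ start, end }) for every match,
// stepping past zero-length matches so the loop always terminates.
array(array(int)) all_matches(string pattern, string subject)
{
  object re = Regexp.PCRE._pcre(pattern);
  array(array(int)) hits = ({});
  int pos = 0;

  // Assumption: a start offset equal to sizeof(subject) is accepted,
  // so that an empty match at the very end can still be found.
  while (pos <= sizeof(subject)) {
    mixed m = re->exec(subject, pos);
    if (!arrayp(m)) break;            // assumed: an int means "no match"

    hits += ({ ({ m[0], m[1] }) });

    // Normal case: continue at the end of the match.
    // Zero-length match: force progress by one character.
    pos = (m[1] > m[0]) ? m[1] : m[1] + 1;
  }
  return hits;
}

int main()
{
  write("%O\n", all_matches("o+", "foobar"));
  write("%O\n", all_matches("o*", "foobar"));
  return 0;
}
```

The same "advance by one on an empty match" rule could in principle be applied inside replace() and matchall(), but whether that is the right place for the fix is exactly the open question here.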
Please note that Regexp.PCRE()->matchall() obviously suffers from the same problem:
----8<----8<----8<----8<----
$ pike
Pike v7.6 release 112 running Hilfe v3.5 (Incremental Pike Frontend)
string foo = "foobar";
Regexp.PCRE("o+")->matchall(foo, lambda(mixed s){ werror("%O\n",s); } );
({ /* 1 element */ "oo" })
(1) Result: Regexp.PCRE.StudiedWidestring("o+")
Regexp.PCRE("o*")->matchall(foo, lambda(mixed s){ werror("%O\n",s); } );
({ /* 1 element */ "" })
({ /* 1 element */ "" })
... infinite loop ...
---->8---->8---->8---->8----
For now, I don't know which code should be fixed. I don't understand what the documentation for exec() says about the case where it returns an array: http://pike.ida.liu.se/generated/manual/modref/ex/predef_3A_3A/Regexp/PCRE/_pcre/exec.html