New subject: [Bug] Regexp.PCRE("n*") multiple match endless loop

16 Apr 2008


Mirar @ Pike developers forum <10353@lyskom.lysator.liu.se> wrote:
...
...
However, I'd like to know why, when doing this:
----8<----8<----8<----8<----
...
object r = Regexp.PCRE.Studied("[\W]*$");
r->replace(foo, "");
---->8---->8---->8---->8----
Pike eats all my CPU and the command never finishes.
Good question. The PCRE code should be easy to read though, feel free
to investigate? :)
When using n* (zero or more n), Regexp.PCRE._pcre()->exec() can return
an array of two identical ints, i.e. a zero-length match.
----8<----8<----8<----8<----
$ pike
Pike v7.6 release 112 running Hilfe v3.5 (Incremental Pike Frontend)
...
Regexp.PCRE._pcre("o+")->exec("foobar",0);
(1) Result: ({ /* 2 elements */
                1,
                3
            })
...
Regexp.PCRE._pcre("o*")->exec("foobar",0);
(2) Result: ({ /* 2 elements */
                0,
                0
            })
---->8---->8---->8---->8----
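A minimal sketch of how the two ints can be read, going by the results
above. describe_match() is a hypothetical helper, not part of
Regexp.PCRE. It assumes exec() returns an int rather than an array when
nothing matches, that the first int is the offset of the first matched
character, and that the second is the offset just past the last one.

```pike
// Hypothetical helper (not part of Regexp.PCRE): describe the first match.
string describe_match(string pattern, string subject)
{
  int|array(int) v = Regexp.PCRE._pcre(pattern)->exec(subject, 0);

  // Assumption: exec() returns an int (not an array) when nothing matches.
  if (!arrayp(v)) return "no match";

  // Assumption: v[0] is the offset of the first matched character and
  // v[1] the offset just past the last one.
  if (v[0] == v[1])
    return sprintf("empty match at offset %d", v[0]);
  return sprintf("%O at offsets %d..%d", subject[v[0]..v[1]-1], v[0], v[1]);
}

int main()
{
  write("%s\n", describe_match("o+", "foobar"));
  write("%s\n", describe_match("o*", "foobar"));
  return 0;
}
```

With the offsets from the session above, the first call would report
"oo" at offsets 1..3 and the second an empty match at offset 0: "o*" is
satisfied by zero characters in front of the "f".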
For each replacement, Regexp.PCRE()->replace() runs the regular
expression again, starting at the end of the previous hit. It uses the
array returned by exec() as the start and end offsets of that hit.
When exec() returns identical start and end offsets (an empty match),
the next search starts at the same position and finds the same empty
match again, which results in an infinite loop.
----8<----8<----8<----8<----
$ pike
Pike v7.6 release 112 running Hilfe v3.5 (Incremental Pike Frontend)
...
string foo = "foobar";
Regexp.PCRE("o+")->replace(foo,"");
({ /* 2 elements */
    1,
    3
})
-1
(1) Result: "fbar"
...
Regexp.PCRE("o*")->replace(foo,"");
({ /* 2 elements */
    0,
    0
})
({ /* 2 elements */
    0,
    0
})
({ /* 2 elements */
    0,
    0
})
({ /* 2 elements */
    0,
    0
})
... infinite loop ...
---->8---->8---->8---->8----
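One common way to avoid this kind of loop is to step one character
forward after an empty match. A minimal sketch under that assumption
follows. replace_all() is a hypothetical helper, not the code in
Regexp.PCRE and not necessarily the right fix for it. It also assumes
that exec() returns an int when nothing matches and that it accepts a
start offset equal to the length of the subject.

```pike
// Hypothetical helper (not part of Regexp.PCRE): replace every match of
// pattern in subject with the string with, without looping forever on
// empty matches.
string replace_all(string pattern, string subject, string with)
{
  object re = Regexp.PCRE._pcre(pattern);
  String.Buffer res = String.Buffer();
  int i = 0;

  while (i <= sizeof(subject)) {
    int|array(int) v = re->exec(subject, i);
    if (!arrayp(v)) break;            // assumption: an int means no match

    if (v[0] > i) res->add(subject[i..v[0]-1]);   // text before the hit
    res->add(with);

    if (v[1] > v[0]) {
      i = v[1];                       // normal hit: continue after it
    } else {
      // Empty hit: copy one character and step past it, otherwise
      // exec() would return the same offsets again.
      res->add(subject[v[1]..v[1]]);
      i = v[1] + 1;
    }
  }

  // Copy whatever is left after the last hit.
  if (i < sizeof(subject)) res->add(subject[i..]);
  return (string)res;
}

int main()
{
  write("%O\n", replace_all("o+", "foobar", ""));
  write("%O\n", replace_all("o*", "foobar", ""));
  return 0;
}
```

This is a simplification. The pcredemo sample shipped with PCRE handles
an empty match by retrying at the same offset with PCRE_NOTEMPTY and
PCRE_ANCHORED set, and only steps forward if that retry fails.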
Please note that Regexp.PCRE()->matchall() obviously suffers from the
same problem:
----8<----8<----8<----8<----
$ pike
Pike v7.6 release 112 running Hilfe v3.5 (Incremental Pike Frontend)
...
string foo = "foobar";
Regexp.PCRE("o+")->matchall(foo, lambda(mixed s){ werror("%O\n",s); } );
({ /* 1 element */
    "oo"
})
(1) Result: Regexp.PCRE.StudiedWidestring("o+")
...
Regexp.PCRE("o*")->matchall(foo, lambda(mixed s){ werror("%O\n",s); } );
({ /* 1 element */
    ""
})
({ /* 1 element */
    ""
})
... infinite loop ...
---->8---->8---->8---->8----
For now, I don't know which code should be fixed.
I also don't understand what the documentation for exec() says about the
array it returns, here:
http://pike.ida.liu.se/generated/manual/modref/ex/predef_3A_3A/Regexp/PCRE/_pcre/exec.html
-- 
Bertrand LUPART

http://bertrand.gotpike.org/

[Bug] Regexp.PCRE("n*") multiple match endless loop (was: Re: Regexp troubles)