On Thu, 27 Jun 2019 13:58:02 +0200, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se wrote:
/paper-hyperscan-a-fast-multi-pattern-regex-matcher-for-modern-cpus/
For modern CPUs? Looks more like they are targeting a certain 1970s architecture:
cpuid(1, 0, &eax, &ebx, &ecx, &edx);
Did you test it on RISC-V?
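For what it's worth, the portability complaint comes down to that cpuid call being unconditional. A minimal sketch of how such a probe could be guarded so it at least compiles elsewhere, assuming GCC or Clang (for <cpuid.h>); the helper name have_ssse3 is made up for illustration, SSSE3 is just an example feature bit, and none of this is taken from the Hyperscan sources:

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header (assumption: not MSVC) */
#endif

/* Hypothetical helper, not from Hyperscan: returns 1 if the CPU reports
   SSSE3, and 0 otherwise. On non-x86 targets (RISC-V, MIPS, ...) there is
   no CPUID instruction at all, so it always reports 0 there. */
int have_ssse3(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 when the requested leaf is unsupported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    return (int)((ecx >> 9) & 1u);   /* CPUID leaf 1, ECX bit 9 = SSSE3 */
#else
    return 0;                        /* caller takes a portable fallback */
#endif
}
```

A caller would pick a SIMD path when have_ssse3() returns 1 and a plain C path otherwise. That only helps, of course, if a plain C path exists in the first place.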
From https://news.ycombinator.com/item?id=19270199:
""" I suppose "Some Modern CPUs" was too long-winded a title? As I said, it doesn't take a genius to understand Intel's motivations. They bought the project, after all.
Not being @ Intel anymore, I don't have access to the older stuff, and even if I did, the codebase has diverged significantly since.
Without going on too much of a tirade - the experience of developing for all those platforms really sucked. Almost all the non-x86 platforms had significant bugs in their toolchains. One of the MIPS variants (particular architecture elided to spare the guilty) had bugs in their gcc intrinsics in a way that suggested that no-one had ever done any significant third-party dev on the platform.
Big-endian was also a huge PITA.
It was a ton of work to keep all those systems alive, and our machine rack looked like a zoo of dev boards and weirdo devices.
In the "ure3" system I mention, I would make retargetability/portability to other systems a first-class goal.
One way of achieving this is not having such a huge profusion of methods and complexity. Hyperscan is over-engineered for many use cases if you aren't a network company looking to scan 5,000 complex regexes in streaming mode at hopefully maximal performance. """
Also: 114 kLOC for a regexp matcher. Srsly?
For a 40x speed boost? Worth at least trying.
BR, tj