Re: Hyperscan regexp engine

27 Jun 2019


On Thu, 27 Jun 2019 13:58:02 +0200, Marcus Comstedt (ACROSS) (Hail
Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se
wrote:
...
...
/paper-hyperscan-a-fast-multi-pattern-regex-matcher-for-modern-cpus/
For modern CPUs?  Looks more like they are targeting a certain 1970s
architecture:
cpuid(1, 0, &eax, &ebx, &ecx, &edx);
Did you test it on RISC-V?
From https://news.ycombinator.com/item?id=19270199:
"""
I suppose "Some Modern CPUs" was too long-winded a title?
As I said, it doesn't take a genius to understand Intel's motivations.  
They bought the project, after all.
Not being @ Intel anymore, I don't have access to the older stuff, and  
even if I did, the codebase has diverged significantly since.
Without going on too much of a tirade - the experience of developing for  
all those platforms really sucked. Almost all the non-x86 platforms had  
significant bugs in their toolchains. One of the MIPS variants (particular  
architecture elided to spare the guilty) had bugs in their gcc intrinsics  
in a way that suggested that no-one had ever done any significant  
third-party dev on the platform.
Big-endian was also a huge PITA.
It was a ton of work to keep all those systems alive, and our machine rack  
looked like a zoo of dev boards and weirdo devices.
In the "ure3" system I mention, I would make retargetability/portability  
to other systems a first-class goal.
One way of achieving this is not having such a huge profusion of methods  
and complexity. Hyperscan is over-engineered for many use cases if you  
aren't a network company looking to scan 5,000 complex regexes in  
streaming mode at hopefully maximal performance.
"""
...
Also:  114 kLOC for a regexp matcher.  Srsly?
For a 40x speed boost?  Worth at least trying.
BR,
tj
