testsuite

17 Sep 2014


...
Well, yes, an HTML tokenizer would be useful. HTML5 has a very readable
specification.
So maybe it's time for a new tool.
...
Ironically enough, it is about 10 times slower than ye olde Opera HTML5
parser at actually parsing HTML. :)
Yes, but I believe it was written to search for specific tags, not to
parse every single tag or even to build a data structure around it. So
it's naturally pretty bad at anything not RXML (as RXML was at the
time, too, probably) :)
...
I have seriously considered writing one. But the name 'Parser.HTML' is
already taken. :)
Which is bad. But it shouldn't be the largest obstacle. :)
Use a subtree. Parser.HTML.Tokenizer?
