utf8_char_index

24 Sep 2003

Hmm, it still doesn't work very well if the input data is huge...
but I guess it won't be that much worse than the utf8 string.
The most important thing right now, though, is a quick function that
converts start_index from a character index to a byte index.
/ Mirar
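
For illustration only, here is a minimal sketch of such a character-to-byte conversion. It is written in Python rather than Pike, and the names utf8_len and char_to_byte_index are hypothetical, not existing functions. It assumes the input is a wide (decoded) string and sums the UTF-8 length of each character before the requested position, so no encoded copy or index table is built.

```python
def utf8_len(cp: int) -> int:
    """Number of bytes UTF-8 uses to encode one code point."""
    if cp < 0x80:
        return 1
    if cp < 0x800:
        return 2
    if cp < 0x10000:
        return 3
    return 4


def char_to_byte_index(text: str, char_index: int) -> int:
    """Byte offset, in the UTF-8 encoding of text, where the character
    at char_index starts. char_index == len(text) gives the total length."""
    if not 0 <= char_index <= len(text):
        raise IndexError("char_index out of range")
    # Hypothetical helper: add up the encoded size of every earlier character.
    return sum(utf8_len(ord(ch)) for ch in text[:char_index])
```

This costs time proportional to start_index per call but no extra memory, which may matter when the input is huge.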
Previous text:
...
2003-09-24 13:14:
Subject: utf8_char_index

The best would probably be a method like 
'array(string,string) string_to_utf8_with_index( string input );'
that returns the utf8 string and a string (or an array of integers, but
that would use even more memory) with the byte->character mapping.
[string utf8,string index] = string_to_utf8_with_index( data );
array(int) offsets = rows(index,regexp_function_utf8( utf8 ));
/ Per Hedbor ()
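
A rough sketch of the idea above, in Python rather than Pike and under a few assumptions: string_to_utf8_with_index is the proposed (not existing) method, the index is kept as a plain list of integers for readability, and Python's re module on bytes stands in for regexp_function_utf8. One extra entry is appended to the index so that an end offset equal to the byte length can also be mapped.

```python
import re


def string_to_utf8_with_index(text: str):
    """Return (utf8_bytes, index) where index[b] is the character
    position that byte b of the encoded string belongs to."""
    encoded = bytearray()
    index = []
    for char_pos, ch in enumerate(text):
        chunk = ch.encode("utf-8")
        encoded += chunk
        # Every byte of this character maps back to the same position.
        index.extend([char_pos] * len(chunk))
    # Sentinel: lets a match end at the very end of the data.
    index.append(len(text))
    return bytes(encoded), index


def char_offsets(index, byte_offsets):
    """Equivalent of rows(index, byte_offsets): map byte offsets to characters."""
    return [index[b] for b in byte_offsets]


utf8, index = string_to_utf8_with_index("bl\u00e5b\u00e4r \u20ac5")
match = re.search("\u20ac".encode("utf-8"), utf8)  # byte-level regexp match
offsets = char_offsets(index, match.span())
```

As noted in the message, an integer array costs more memory than the encoded string itself; a packed representation would reduce that at the price of a more involved lookup.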
