On Wed, Nov 03, 2004 at 07:50:03PM +0100, Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
Yes, I can also reason theoretically what the gain is. What I meant was to actually _measure_ it. Is it 0.1% speed/memory gain? 1%? 10%?
OK, the facts. Simple test case: ca. 50 MB of data (UTF-8), in 256-byte chunks, read from file, converted, processed, converted back, and written to file (buffered, i.e. HDD latency is not counted; only CPU time is measured, using gauge{}). P4-1.7, 768 MB RAM, IDE HDD, Linux.
Results for read => UTF8 => UTF16 => processing => UTF8 => write:
Preparing file... Done! Measuring... 52428915 bytes processed; time spent: 2.900; 0.058000 s/M
Results for read => processing (without conversion) => write:
Preparing file... Done! Measuring... 52428915 bytes processed; time spent: 0.780; 0.015600 s/M
As you can see, the version with conversion takes roughly 3.7 times the CPU time of plain processing without conversion (2.900 s vs. 0.780 s). That is not 1%, and not even 100%. Yes, I ran it several times, and the times shown above are averages over all runs.
The test case is simple but reflects (more or less) the behavior of my real application. "Processing" in this test case was simulated by a search() for something non-existent. The actual amount of data processed runs to tens of gigabytes, and the actual processing is similar to a search but uses regular expressions and XML processing, and involves data extraction and manipulation (= more time and memory spent on 16-bit wide strings).
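For the record, the measuring loop looks roughly like this. This is a sketch of the kind of benchmark described above, not the exact test program; the file names, the search needle, and the chunk handling are my assumptions (note in particular that a real 256-byte chunking can split a UTF-8 multibyte sequence, which the actual test would have to avoid or handle):

```pike
// Hypothetical benchmark sketch: read 256-byte chunks, decode UTF-8 to
// Pike's wide (16-bit) strings, "process" via search(), re-encode, write.
int main()
{
    Stdio.File in = Stdio.File("test.dat", "r");
    Stdio.File out = Stdio.File("out.dat", "wct");
    int total = 0;

    float t = gauge {
        string chunk;
        while (sizeof(chunk = in->read(256))) {
            // Caveat: assumes chunk boundaries never split a multibyte
            // UTF-8 sequence; real code must align chunks or buffer.
            string wide = utf8_to_string(chunk);   // UTF-8 -> wide string
            search(wide, "\x1234");                // simulated processing:
                                                   // look for something
                                                   // that is not there
            out->write(string_to_utf8(wide));      // wide string -> UTF-8
            total += sizeof(chunk);
        }
    };

    write("%d bytes processed; time spent: %.3f; %f s/M\n",
          total, t, t / (total / 1048576.0));
    return 0;
}
```

The no-conversion variant is the same loop with the utf8_to_string()/string_to_utf8() calls removed, searching the raw 8-bit chunk directly.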
Maybe when 1 TB of memory and 128 GHz CPUs cost $500 I won't run any benchmarks, but... :)
How do you intend to implement the flag in that case?
sql->set_encoding(), maybe.
Regards, /Al