Ah. So that is why nobody is replying to the mails I send.
It would appear that the KOM<->email bridge is not working correctly, so the last few replies I have written have not, actually, been sent.
First, on this topic:
Object `+= lfuns are not called when they should have been.
Well. That depends on how you define "should have been". :)
`+= has never been called except when there is only one reference to the object, which is why it can be somewhat confusing. It does allow optimizations since the value can be modified destructively, however.
And, really, X += Y only implies that the variable X will have Y added to it; it is not the same as whatever X points to getting Y added.
(this is much the same as if there had been a `=, assignment overloading)
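To make that concrete, here is a rough sketch of a class providing both lfuns (the class and the numbers are made up for illustration):
| class Counter
| {
|   int v;
|   protected void create(int x) { v = x; }
|
|   // Non-destructive addition: returns a new object.
|   protected object `+(int x) { return Counter(v + x); }
|
|   // Destructive addition: only considered when the object has a
|   // single reference, so it is safe to modify it in place.
|   protected object `+=(int x) { v += x; return this; }
| }
|
| Counter a = Counter(1);
| a += 5;          // may call `+=, since only a references the object
| Counter b = a;
| a += 5;          // falls back to `+; b still sees the old value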
Consider:
| mapping m = ([]);
| object o = 1003443894984389348989434389;
| multiset z = (<>);
| array q = ({});
| string b = "";
In which cases should using += change not only the variable but also what it points to?
I think this would be somewhat confusing, as an example:
| object a = 18042893489438948390;
| object b = 182389983498439843983498;
| object c = b;
|
| a += b;
|
| // afterwards c == b, while I would expect (a-b == c) to be true.
As for how the += optimization is supposed to work:
- For one, the object might have an extra instance on the stack, in which case the minimum number of references becomes 2 already.
Well, this should be mostly fixed by the fact that a = a + X; is handled differently from other additions: the lvalue (the variable being assigned to and read from) is actually cleared before the addition happens.
The code that is supposed to ensure this is the case lives in line 1337 in las.c (nice..).
The elimination of +=/-= etc. as actual opcodes is done in treeopt.in, but note that even if those rules are removed there is still no += opcode in the generated code, and to my knowledge there never has been. The conversion was previously done in docode.c, however, which led to a lot of code duplication.
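A familiar illustration of why that matters (plain arrays, nothing object-specific): because the lvalue is cleared first, the value being appended to usually ends up with a single reference and can be extended in place instead of being copied on every iteration:
| array a = ({});
| for( int i = 0; i < 100000; i++ )
|   a += ({ i });  // the variable is cleared before the addition, so the
|                  // array normally has one reference and is grown
|                  // destructively rather than copied each time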
- And for another, the object might have multiple instances which all refer to the same object; which *implies* that updating one of those using += should modify all of them.
Well. Not really, in pike. Adding things to a variable only ever changes the variable, not the thing it points to.
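A quick demonstration of that rule using arrays:
| array q = ({ 1 });
| array p = q;     // p and q now refer to the same array
| q += ({ 2 });    // rebinds q to a new array, ({ 1, 2 })
| // p is still ({ 1 }); the value q used to point at was never modified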
IOBuffer talk:
The actual reason I beefed up Buffer a bit lately is *because* I need to do some protocol decoding of a byte stream.
Now I see IOBuffer arrive. In order to avoid code bloat, wouldn't it be a better idea to integrate the functionality of IOBuffer into Buffer and just keep one?
Sorry about the timing; I have had IOBuffer on the way for some time. (I am still wondering where to put it, however; that has, believe it or not, been a blocker for me. Perhaps Stdio.Buffer? I will create a buffered stream that reads into and writes from said object, without creating pike strings.)
Unfortunately it is not possible to make String.Buffer even close to as efficient as IOBuffer as long as it uses a stringbuilder. And not using a stringbuilder slows some things down (sprintf comes to mind) and makes others more or less impossible (wide strings) without excessive code duplication.
The whole reason for IOBuffer is that it uses pointer arithmetic on guaranteed 8bit strings to be fast at both reading from the beginning and writing to the end at the same time (I am, by the way, considering converting it to a circular buffer to avoid the one memmove it does at times), and it is also efficient at creating sub-buffers.
The fact that it is guaranteed to only contain 8bit characters helps a lot too.
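As a small usage sketch (sticking to the calls used elsewhere in this mail), the typical pattern is to append raw 8-bit data at the end and consume parsed values from the front:
| Stdio.IOBuffer b = Stdio.IOBuffer();
| b->add( "\0\5" );           // a 2-byte big-endian length prefix: 5
| b->add( "hello" );          // payload bytes, appended at the end
| int len  = b->read_int(2);  // consumed from the front, no string created
| string s = b->read(len);    // "hello"; only this step makes a pike string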
Or is the performance difference so strikingly great that this sounds like a bad idea?
As things stand now, yes.
Things like subbuffers are unfortunately impossible when using a stringbuilder.
I guess I might outline the plan for IOBuffer some more (I actually did this during the last pike conference, but it has been a while. :))
o Add support for reading from and writing to file objects. Either add support in Stdio.File (also add System.Memory + String.Buffer?) or do it the other way around (that way lies madness, however; see also: Shuffler)
The main goal here is to do one copy less and avoid the pike string creation
o Add support for System.Memory & String.Buffer to add()
o Add support for reading other things than "binary hollerith" and integers: line + word + json object + encode_value?
o Add support for "throw mode".
It is rather useful to be able to change what happens when you try to read data that is not in the buffer.
| void read_callback()
| {
|   inbuffer->add( new_data );
|   // Read next packet
|   while( Buffer packet = inbuffer->read_hbuffer(2) )
|   {
|     packet->set_error_mode(Buffer.THROW_ERROR);
|     if( mixed e = catch(handle_packet( packet ) ) )
|       if( e->buffer_error )
|         protocol_error(); // illegal data in packet
|       else
|         throw(e); // the other code did something bad
|   }
| }
The handle_packet function then just assumes the packet is well formed, and reads without checking if the read succeeds (since it will, or an error will be thrown).
This code snippet also demonstrates why the subbuffers are handy: read_hbuffer does not actually copy any data; the returned buffer just keeps a reference to the data in the original buffer.
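For completeness, a hypothetical handle_packet() in that style (the field layout is made up):
| void handle_packet( Buffer packet )
| {
|   // The caller put the packet in throw mode, so there is no need to
|   // check whether enough data is available; a short read throws and
|   // ends up as protocol_error() in read_callback().
|   int type       = packet->read_int(1);
|   string payload = packet->read_hstring(1);
|   werror( "packet type %d, %d bytes of payload\n", type, sizeof(payload) );
| }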
-- mail #2
I've been tracking IOBuffer extensions back to String.Buffer; I'll present what I have shortly.
I suspect (but benchmarks will have to tell) that the String.Buffer implementation is not significantly slower than the current IOBuffer one (whilst supporting the full range of character widths).
Well. You have some minor optimizations to do:
| int perf(object buffer)
| {
|   buffer->add("11");
|   for(int i=0; i<10000; i++ )
|   {
|     int l;
|     if( buffer->cut )
|     {
|       l = (buffer[0]<<8) | buffer[1];
|       buffer->cut(0,2,1);
|     }
|     else
|       l = buffer->read_int(2);
|     buffer->add(random_string(l));
|   }
|   return sizeof(buffer);
| }
perf( String.Buffer() );
Result 2: 325250971 Compilation: 624ns, Execution: 144.86s
perf( Stdio.IOBuffer() );
Result 3: 328787331 Compilation: 639ns, Execution: 194.06ms
(note that the length differs due to random_string)
However, reviewing the IOBuffer interface, I wonder about the following issues:
- Isn't it prudent to drop set_error_mode() and simply implement this functionality (the throw()) using a custom range_error() override?
Well. That would work, yes. I simply did not remove the old way of throwing errors since the common case is just calling buff->set_error_mode(1) when doing sub-parsing, as I showed in the documentation for set_error_mode.
Having to do rather complex sub-classing for that common use case seemed somewhat pointless.
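For comparison, the subclassing route would look roughly like this (treat the exact range_error() signature as an assumption on my part):
| class ThrowingBuffer
| {
|   inherit Stdio.IOBuffer;
|
|   // Called when a read asks for more data than the buffer holds;
|   // throwing here gives much the same effect as set_error_mode(1).
|   protected int range_error( int howmuch )
|   {
|     error( "Not enough data in buffer.\n" );
|   }
| }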
- Why insist on lock()ing the buffer when subbuffers are active? Couldn't the code figure out by itself when a subbuffer exists and then decide on-demand and automatically when a copy needs to be made to transparently support the desired operation?
Not really. The subbuffer only contains a pointer directly into the memory area of the main buffer, so if the main buffer moves that memory using realloc or malloc the pointer would become invalid. This could of course be fixed by adding a list of subbuffers to the main buffer, but then you run into issues with refcounting and such. Since the use case where you have a subbuffer active and want to modify the main buffer is rather uncommon, I thought it was OK that you have to call trim() on the subbuffer to do that.
Why not return IOBuffers practically everywhere, and then let the caller decide when and if to cast them to a string? It gets rid of excessive method diversification due to there needing to be both a string-returning and a buffer-returning variant. Returning a buffer is cheap; it doesn't copy the content.
Well, there is about a factor of 3 performance difference:
| string perf2(object b) { while( b->read(1) ); }
| string perf3(object b) { while( b->read_buffer(1) ); }
perf2(Stdio.IOBuffer(mb100));
Result 5: 0 Compilation: 664ns, Execution: 92.19ms
perf3(Stdio.IOBuffer(mb100));
Result 6: 0 Compilation: 680ns, Execution: 173.68ms
Since that does not include the cast to string, which should take about as long as the plain read, it ends up about 3x slower in total.
And most of the time you actually want the string version, not the buffer version.
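Spelled out as a sketch (given an IOBuffer b with enough data in it):
| string s           = b->read(100);         // one shared pike string
| Stdio.IOBuffer sub = b->read_buffer(100);  // no copy, but a new object
| string s2          = (string)sub;          // the cast builds the string anyway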
-- mail #3
Erm... We are *in* a Buffer object, so by definition we have one. So returning a readonly-copy with zero-copy effort is easy. It basically delays the creation of the shared string as long as possible.
Not really: we are in /a/ buffer object, not the subsection of it that should be returned. You have to create a new object to return a subsection; read_buffer is about as fast as it gets, it does the minimal amount of work.
As long as one is doing string operations (adding/subtracting/matching) Buffer objects are better. Once done with that, the final "result" can/should be a shared string.
Nothing that adds data to the buffer returns a string in the current buffer code.
The only thing that returns a string is if you call read() or read_hstring() on it.
Once done with that, the final "result" can/should be a shared string.
An additional comment: By definition you are almost never actually "done" with an IOBuffer.
They are designed to be input and output buffers for IO.
-- mail #4
An additional comment: By definition you are almost never actually "done" with an IOBuffer.
They are designed to be input and output buffers for IO.
And now the basic support is there to use them for Stdio.File objects.
Stdio.File now has a new nonblocking mode: Buffered I/O
In this mode the file object maintains two buffers, one for input and one for output.
The read callback will get the buffer as the second argument, and data that the user does not read from that buffer is kept until the next time data arrives from the file (this means you do not have to do your own buffering of input)
The output buffer is, unsurprisingly, used to output data from.
This has at least three somewhat convenient effects:
o The write callback will now receive that buffer as a second argument. You just add data to it to write it.
o Adding data to the buffer when /not/ in the write callback will still trigger sending of data if no write callback is pending.
o Your write callback will not be called until the buffer is actually empty.
An extremely small demo:
| void null() {}
|
| int main()
| {
|   Stdio.IOBuffer output = Stdio.IOBuffer();
|   Stdio.File fd = Stdio.File(0);
|
|   fd->set_buffer_mode( 0, output );
|
|   fd->set_nonblocking( Function.uncurry(output->add), null, null );
|
|   return -1;
| }
This will cause all data received on stdin to be echoed to .. stdin, using buffered output.
Not the most useful application, but it does show how easy it is. Buffered mode in general is mainly useful because it removes the need for you to handle the buffers manually in your code.
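As a slightly more realistic sketch, a typical pair of callbacks could look like this (handle_packet() and produce_more_data() are placeholders):
| void got_data( mixed id, Stdio.IOBuffer in )
| {
|   // Anything not consumed here stays in the buffer until more data
|   // arrives, so partial packets can simply be left for the next round.
|   while( Stdio.IOBuffer packet = in->read_hbuffer(2) )
|     handle_packet( packet );
| }
|
| void write_ready( mixed id, Stdio.IOBuffer out )
| {
|   // Only called once the output buffer has drained; adding data to
|   // the buffer is all that is needed to get it written.
|   out->add( produce_more_data() );
| }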
Currently read() and write() are not in any way modified by having buffered output enabled; if you interact directly with the file object you bypass the buffers. I am unsure if this is a good idea or not.
-- Per Hedbor