I've just done some experiments with a replacement for ADT.struct and ADT.Struct:
program struct =
ADT.Serialize.SimpleStruct("struct", ADT.Serialize.Int32, "integer", ADT.Serialize.SimpleStruct("substruct", ADT.Serialize.String, "data" ), "sub");
object o = struct(14); indices(o);
(1) Result: ({ /* 5 elements */ "encode", "decode", "_integer", "integer", "sub" })
o->integer;
(2) Result: 14
o->sub;
(3) Result: substruct()
o->sub->data = "foobar";
(4) Result: "foobar"
A decent encoding API is easy:
o->encode();
(5) Result: "\0\0\0\16foobar"
Now to the hard part; a streaming decoder API...
As far as I know, there are two main approaches:
* Pull -- the user sends a data source (e.g. a file object) to the object, and it's up to the object to actually read the data.
* Push -- the user reads the data in segments (typically with non-blocking I/O), and pushes it into the object.
The Pull API is the easiest to implement, but it has the following problems:
* Does not support non-blocking I/O.
* Error recovery is complicated (e.g. if the source is a pipe/socket).
* Extra overhead for the common case where you have already read the data.
The Push API on the other hand has the following problems:
* How to find where to restart the decoding when more data arrives.
* How to signal that more data is needed/too much data was received.
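To make the two Push problems concrete, here is a minimal sketch of one possible answer, not a proposal for the real API. PushDecoder and feed() are hypothetical names, and the payload length is assumed to be known up front (the encoding above has no length prefix). It restarts by simply re-decoding from the start of its buffer, signals "more data needed" by returning 0, and signals "too much data" by handing back the leftover bytes.

```pike
// Hypothetical push-style decoder for a 32-bit big-endian integer
// followed by a payload of known size. Not part of any Pike module.
class PushDecoder(int payload_size)
{
  protected string buf = "";  // bytes received but not yet decoded

  // Push one segment into the decoder.
  // Returns 0 while more data is needed, otherwise
  // ({ integer, payload, leftover }) where leftover is any surplus input.
  array|int feed(string segment)
  {
    buf += segment;
    if (sizeof(buf) < 4 + payload_size)
      return 0;               // incomplete: keep the buffer and wait

    int value;
    string payload, rest;
    sscanf(buf, "%4c%" + payload_size + "s%s", value, payload, rest);
    buf = "";                 // everything has been handed to the caller
    return ({ value, payload, rest });
  }
}

int main()
{
  PushDecoder d = PushDecoder(6);
  write("%O\n", d->feed("\0\0"));      // 0, more data needed
  write("%O\n", d->feed("\0\16foo"));  // 0, still incomplete
  write("%O\n", d->feed("barXY"));     // 14, "foobar" and leftover "XY"
  return 0;
}
```

Re-decoding from the start of the buffer is only reasonable for small structs; a real implementation would presumably have to remember which item it stopped at.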
Suggestions?
Without any analysis of this proposal I would like to point out a few of the weaknesses in today's solutions.
1. Efficiency. We desperately need something better than
  array_sscanf(f->read(n), "%"+n+"c")[0];
to read n-byte integers from file objects. I think the best solution is a generalized "stream" wrapper kind of thing:
  class RichStreamAPI(Stdio.File|API.Stream|string x)
(This is in line with the wishlist item to unify the stream APIs in Pike, like charset codecs, gz/bzip2/etc compression, crypto functions etc)
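As a rough illustration of item 1, a wrapper along these lines would at least hide the sscanf idiom behind one call. This is only a sketch: IntReader, read_int() and read_string() are hypothetical names, it accepts just a string or a Stdio.File (no API.Stream, which does not exist yet), and it still uses the same sscanf internally, so it fixes the interface rather than the speed.

```pike
// Hypothetical helper, not an existing Pike class: reads big-endian
// unsigned integers from either a string or an open Stdio.File.
class IntReader
{
  protected Stdio.File file;
  protected string buf;

  protected void create(Stdio.File|string src)
  {
    if (stringp(src)) buf = src;
    else file = src;
  }

  // Return exactly n bytes, or throw if the source runs short.
  protected string read_bytes(int n)
  {
    string s;
    if (file)
      s = file->read(n);
    else {
      s = buf[..n-1];
      buf = buf[n..];
    }
    if (!s || sizeof(s) != n) error("Short read.\n");
    return s;
  }

  // Decode an n-byte integer (n >= 1), most significant byte first.
  int read_int(int n)
  {
    return array_sscanf(read_bytes(n), "%" + n + "c")[0];
  }

  // Return the next n bytes unchanged.
  string read_string(int n)
  {
    return read_bytes(n);
  }
}

int main()
{
  IntReader r = IntReader("\0\0\0\16foobar");
  int i = r->read_int(4);         // 14
  string s = r->read_string(6);   // "foobar"
  write("%d %s\n", i, s);
  return 0;
}
```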
2. Conditional struct items. I am fairly pleased with how my ADT.Struct worked out in my SWF decoder, but when items depend on other items to determine their size or existence, it doesn't feel well supported. E.g. this is an example of the RECT primitive in Flash, which consists of a 5-bit size field followed by the box coordinates, each using that size as its number of bits.
class RECT {
  inherit BitStruct;
  Item nbits = UB(5);
  Item xmin = SB(0);
  Item xmax = SB(0);
  Item ymin = SB(0);
  Item ymax = SB(0);

  void set_size(Item item, object file) {
    item->size = this->nbits;
  }

  void create() {
    xmin->add_decoder_cb(set_size);
    xmax->add_decoder_cb(set_size);
    ymin->add_decoder_cb(set_size);
    ymax->add_decoder_cb(set_size);
    ::create();
  }
}
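For comparison, here is what the same RECT decode looks like when written by hand, which is the dependency the struct description needs to be able to express. This is a standalone sketch and does not use BitStruct; BitReader, read_ub(), read_sb() and decode_rect() are hypothetical names, and bits are assumed to be packed most significant bit first.

```pike
// Hypothetical MSB-first bit reader over a string.
class BitReader(string data)
{
  protected int pos;  // current offset in bits

  // Read an unsigned value of nbits bits.
  int read_ub(int nbits)
  {
    int v = 0;
    while (nbits--) {
      if (pos >= sizeof(data) * 8) error("Out of data.\n");
      int bit = (data[pos >> 3] >> (7 - (pos & 7))) & 1;
      v = (v << 1) | bit;
      pos++;
    }
    return v;
  }

  // Read a two's complement signed value of nbits bits.
  int read_sb(int nbits)
  {
    int v = read_ub(nbits);
    if (nbits && (v & (1 << (nbits - 1))))
      v -= 1 << nbits;
    return v;
  }
}

// The size of the four coordinates is only known after nbits is read.
mapping(string:int) decode_rect(BitReader r)
{
  int nbits = r->read_ub(5);
  int xmin = r->read_sb(nbits);
  int xmax = r->read_sb(nbits);
  int ymin = r->read_sb(nbits);
  int ymax = r->read_sb(nbits);
  return ([ "xmin": xmin, "xmax": xmax, "ymin": ymin, "ymax": ymax ]);
}

int main()
{
  // nbits=5, xmin=3, xmax=-2, ymin=0, ymax=15, padded to 4 bytes.
  string data = (string)({ 0x28, 0xfc, 0x07, 0x80 });
  write("%O\n", decode_rect(BitReader(data)));
  return 0;
}
```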
I should mention that the ability to use a finished struct class as an item in a new struct class has proven to be very helpful.
3. Alignment. Again drawing on my Flash decoder, you need to be able to define rules for how items fit together. In Flash files, whenever a byte item follows a sub-byte item, byte alignment occurs. It is possible to work around this by tracking, in the input object (a Stdio.File based class in this case), how much data has been consumed at the sub-byte level, but this is something that would benefit from some redesign.
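The alignment rule itself is small once the reader knows its bit position. A minimal sketch, assuming a string source and MSB-first bit order; AlignedReader, read_bits() and read_bytes() are hypothetical names, and a real version would sit on top of whatever stream wrapper is chosen:

```pike
// Hypothetical reader that keeps a bit offset and aligns to the next
// byte boundary whenever a byte-level read follows a sub-byte read.
class AlignedReader(string data)
{
  protected int bitpos;  // current offset in bits

  // Read n bits as an unsigned integer; no alignment is applied.
  int read_bits(int n)
  {
    int v = 0;
    while (n--) {
      if (bitpos >= sizeof(data) * 8) error("Out of data.\n");
      v = (v << 1) | ((data[bitpos >> 3] >> (7 - (bitpos & 7))) & 1);
      bitpos++;
    }
    return v;
  }

  // Read n whole bytes, first skipping any partially consumed byte.
  string read_bytes(int n)
  {
    bitpos = (bitpos + 7) & ~7;   // round up to a byte boundary
    int start = bitpos >> 3;
    if (start + n > sizeof(data)) error("Out of data.\n");
    bitpos += n * 8;
    return data[start..start + n - 1];
  }
}

int main()
{
  // 0xa0 is 101 followed by five padding bits, then two plain bytes.
  AlignedReader r = AlignedReader((string)({ 0xa0, 'h', 'i' }));
  int flags = r->read_bits(3);     // 5
  string tag = r->read_bytes(2);   // "hi", after skipping the padding
  write("%d %s\n", flags, tag);
  return 0;
}
```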
pike-devel@lists.lysator.liu.se