I think the new streaming behavior in Sql.mysql.big_query is nice and as it should be, but it's causing too many headaches in existing applications - the old de-facto behavior is too ingrained.
So my proposal is to do it like this instead:
query: Like today - slow and memory consuming.
big_query: A variant of query() that still is nonstreaming. I.e. it only avoids the very bulky response format of query().
streaming_query: Like big_query but guaranteed to stream if it exists. It's database dependent whether other queries can be issued while a streaming_query response object exists. It's of course also database dependent if there are locks on the tables in the server while the response object exists.
While at it, we could also consider generalizing the big_typed_query interface that the oracle module provides:
big_typed_query: Like big_query but doesn't convert everything to strings. Integers and floats are kept that way, Date/timestamps are Calendar objects, there are objects for representing the NULL value for each type.
streaming_typed_query: Streaming variant of big_typed_query.
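To make the difference between the proposed variants concrete, here is a minimal sketch in Python rather than Pike. It is a toy model, not the Sql module: ROWS, COLUMNS, Result and the helper functions are all hypothetical, the "server" is an in-memory list, and NULL is represented by None instead of per-type NULL objects.

```python
from datetime import date

# Hypothetical in-memory "table" standing in for what a server would send.
COLUMNS = ("id", "score", "day", "note")
ROWS = [(1, 2.5, date(2004, 9, 29), None), (2, 4.0, date(2004, 10, 14), "x")]


def _server_rows():
    """Yield rows one at a time, as a server connection would."""
    for row in ROWS:
        yield row


def _stringify(row):
    # The untyped variants hand every value back as a string (NULL stays None).
    return tuple(None if v is None else str(v) for v in row)


class Result:
    """Result object with a fetch_row() style interface."""

    def __init__(self, rows):
        self._rows = iter(rows)

    def fetch_row(self):
        # Next row, or None once the rows are exhausted.
        return next(self._rows, None)


def query(sql):
    # Bulky format: the whole result as a list of column-name mappings.
    return [dict(zip(COLUMNS, _stringify(r))) for r in _server_rows()]


def big_query(sql):
    # Non-streaming: all rows are read up front, but kept as compact tuples.
    return Result([_stringify(r) for r in _server_rows()])


def streaming_query(sql):
    # Streaming: a row is pulled from the server only when fetch_row() asks.
    return Result(_stringify(r) for r in _server_rows())


def big_typed_query(sql):
    # Like big_query, but native types are kept instead of strings.
    return Result(list(_server_rows()))


def streaming_typed_query(sql):
    # Streaming variant of big_typed_query.
    return Result(_server_rows())
```

The only difference between the buffered and the streaming variants in this model is whether Result is handed a list or a generator, which is the memory/locking trade-off the proposal is about.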
Excellent suggestion. +1
/ Johan Sundström (Achtung Liebe!)
Previous text:
2004-09-29 17:16: Subject: MySQL big_query in 7.7
/ Martin Stjernholm, Roxen IS
Martin Stjernholm, Roxen IS @ Pike developers forum wrote:
big_query: A variant of query() that still is nonstreaming. I.e. streaming_query: Like big_query but guaranteed to stream if it
Sounds like a plan. But I still think that it is feasible to make big_query() intelligent and pull in the rest of the stream as soon as it notices another big_query() being started on the same connection. I.e. it should not be a large performance hit, yet it still offers all the advantages and is compatible with existing applications. It basically makes a runtime decision about whether streaming is possible or not.
While at it, we could also consider generalizing the big_typed_query interface that the oracle module provides:
big_typed_query: Like big_query but doesn't convert everything to
No argument here.
But I still think that it is feasible to make big_query() intelligent and pull in the rest of the stream, as soon as it notices another big_query() being started on the same connection.
I was hoping Per would respond to this, since he's the one who's having trouble with it. Anyway, as far as I've understood, that might not be good enough since it can keep the tables locked for too long. I.e. it can be important that the whole response is read quickly if there are other users of the database. And having a Pike loop that reads it all in before it's used would create the same kind of bulky Pike structures that query() suffers from.
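The runtime fallback discussed here can be sketched roughly as follows. This is a Python toy model under stated assumptions, not the MySQL glue: Connection, StreamingResult and drain() are hypothetical names, and the "server" is a dictionary of row lists.

```python
from collections import deque


class StreamingResult:
    """Streams rows until the connection is needed for something else."""

    def __init__(self, source):
        self._source = source      # iterator over rows still "on the wire"
        self._buffer = deque()     # rows already pulled into memory
        self.streaming = True

    def drain(self):
        # Pull everything that is left into memory so the connection is free.
        self._buffer.extend(self._source)
        self._source = iter(())
        self.streaming = False

    def fetch_row(self):
        if self._buffer:
            return self._buffer.popleft()
        return next(self._source, None)


class Connection:
    """Hypothetical connection that allows one streaming result at a time."""

    def __init__(self, tables):
        self._tables = tables
        self._active = None

    def big_query(self, table):
        # Runtime decision: a new query forces the previous result to buffer.
        if self._active is not None and self._active.streaming:
            self._active.drain()
        self._active = StreamingResult(iter(self._tables[table]))
        return self._active
```

In this model the first result keeps streaming only until a second big_query() arrives. The objection above still applies: the drain may happen late, so the server side stays busy until then, and the drained rows end up buffered in memory anyway.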
/ Martin Stjernholm, Roxen IS
Previous text:
2004-09-30 22:49: Subject: Re: MySQL big_query in 7.7
Sincerely, srb@cuci.nl Stephen R. van den Berg (AKA BuGless). Gravity is running out! Conserve gravity: walk with a light step, use tape, magnets or glue instead of paperweights, avoid showers... take baths instead.
/ Brevbäraren
Looking at other aspects of the Sql module I think it might be an idea to dispose of the wrapper. Instead of having Sql.Sql wrap Sql.* that inherits C-code, we could have the C-modules inherit Sql.DB and then let Sql.open create the actual objects (from Sql.* programs). That way we can access database specific functions directly and remove one level of indirection when calling API functions. Since the access API is a new one, we can redesign whatever we want...
/ Martin Nilsson (DivX Networks)
Previous text:
2004-10-13 23:02: Subject: Re: MySQL big_query in 7.7
/ Martin Stjernholm, Roxen IS
The only really new API here is big_typed_query and the streaming variants, but I think all the query variants except the basic query() function should follow the same API for consistency.
Still, it ought to be possible to do away with the Sql.Sql wrapper. That might need changes in some of the db specific implementations, but if there are compatibility issues with that, they shouldn't be so bad that #pike compat goo isn't an acceptable solution.
There is a type problem if we try to remove the Sql.Sql wrapper, though: The base class for the db specific variants should be Sql.Sql to get the typing correct, but that means its create function would have to replace itself with a different object. An easy and fairly passable way to deal with that is to allow create() to return the object that should be the result of the clone operation. That's only ugly because it unnecessarily instantiates an Sql.Sql object solely to call create() in it. True static functions would be nice.
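The "create() returns the replacement object" idea can be modelled like this. It is a Python sketch of the proposal, not Pike semantics: clone(), the backends table and the Mysql class are hypothetical stand-ins, and clone() plays the role of the clone operation.

```python
def clone(program, *args):
    """Model of a clone operation that honours a return value from create()."""
    obj = object.__new__(program)        # an instance is always allocated first
    replacement = obj.create(*args)
    return obj if replacement is None else replacement


class Sql:
    """Common interface; stands in for Sql.Sql as the base class."""

    backends = {}                        # URL scheme -> db specific class

    def create(self, url):
        if type(self) is Sql:
            scheme = url.split("://", 1)[0]
            # The Sql instance in `self` exists only to make this call and
            # is thrown away afterwards; that is the ugly part.
            return clone(Sql.backends[scheme], url)
        self.url = url
        return None

    def query(self, sql):
        raise NotImplementedError


class Mysql(Sql):
    def query(self, sql):
        return "mysql ran: " + sql


Sql.backends["mysql"] = Mysql
```

With this, clone(Sql, "mysql://localhost/test") yields a Mysql object that still type-checks as Sql, at the cost of one discarded Sql instance per clone.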
/ Martin Stjernholm, Roxen IS
Previous text:
2004-10-13 23:12: Subject: Re: MySQL big_query in 7.7
/ Martin Nilsson (DivX Networks)
The Sql.Sql object will however be created only once, while the wrapper acts on every call, so that trade-off is easy. The question, however, is whether we want the possibility to return different objects from create methods.
It would be possible to let Sql.pike/Sql.pmod inherit Sql.DB and add an optional `() that returns the actual objects. Also ugly, but doesn't require any language changes.
/ Martin Nilsson (DivX Networks)
Previous text:
2004-10-13 23:39: Subject: Re: MySQL big_query in 7.7
/ Martin Stjernholm, Roxen IS
Generally speaking, the functionality to have more control of the object returned by a clone operation makes sense to me: It's useful to return already created objects for caching purposes or in cases where it's important to avoid multiple instances. In such cases it's not wrong to hide the implementation detail whether or not new objects are returned. It's also useful for compatibility/refactoring, as in this case.
Ideally it should be accomplished through a true static function which is named something other than "create". If it exists, it's called for a clone operation before any object instance is created, and it always returns the object that should be the result of the clone operation.
But since true static functions aren't likely to happen anytime soon, I think a slight misuse of the return value from create() is passable. It's easy to deprecate when a cleaner solution exists, and keeping support for it doesn't get in the way of that solution afaics.
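For comparison, Python's __new__ behaves much like the static hook described above: it runs before any instance exists and decides which object the instantiation returns. A minimal sketch of the caching use case, with a hypothetical Connection class and cache:

```python
class Connection:
    """Returns an already created object when one exists for the same URL."""

    _cache = {}

    def __new__(cls, url):
        # Called before any instance exists; whatever it returns is the
        # result of Connection(url).
        cached = cls._cache.get(url)
        if cached is not None:
            return cached
        obj = super().__new__(cls)
        obj.url = url
        cls._cache[url] = obj
        return obj
```

The caller cannot tell from Connection(url) whether a new object was made, which is the implementation detail argued above to be legitimate to hide.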
Having both an Sql.Sql and an Sql.DB doesn't make sense to me. It's Sql.Sql that defines the common interface, ergo it should be the class that all the db variants inherit. I also don't particularly like muddling the distinction between modules and classes. It might work ok to inherit Sql.Sql and use it as a type even though it is a module, but inheriting the `() function isn't meaningful. But the worst is that it might give other users the impression that that sort of thing is a good idea, especially those who don't have a clear understanding of the distinction between classes and modules.
/ Martin Stjernholm, Roxen IS
Previous text:
2004-10-14 00:03: Subject: Re: MySQL big_query in 7.7
/ Martin Nilsson (DivX Networks)
Like static methods in C++ or Java: A function that isn't connected to any specific instance of the class it's declared in.
/ Martin Stjernholm, Roxen IS
Previous text:
2004-10-14 02:15: Subject: Re: MySQL big_query in 7.7
what is a true static function?
greetings, martin.
/ Brevbäraren
Martin Stjernholm, Roxen IS @ Pike developers forum wrote: [...]
But the worst is that it might give other users the impression that that sort of thing is a good idea, especially those who don't have a clear understanding of the distinction between classes and modules.
/ Martin Stjernholm, Roxen IS
I'm not sure I have a clear understanding of the distinction. Is it documented somewhere, or is it appropriate to ask for a brief explanation?
Looking at the Pike tutorial (http://pike.ida.liu.se/docs/tutorial/) I get the feeling that modules are .pmod files and classes are .pike files (or a class { ... } in either of these). My only conclusion from this is that modules are pre-created by the Pike compiler, ready to be used whenever referenced, while classes need to be cloned before use, as was made clear to me in previous discussions about Pike modules.
What part is 'muddling with the distinction between modules and classes'? Is it when inheriting a module into a class?
// Andreas
I think of modules as objects without programs and classes as instances of some program. A module is normally a .pmod file and a class is implemented in a .pike file.
The main difference is that the module only exists in one instance. You cannot create multiple instances of it, whereas the class can be instantiated into several instances.
I'm sure mast, grubba or Nilsson or someone else can give a more detailed and technical explanation of classes vs modules but that is what I consider the most important difference.
/ Marcus Agehall (PacketFront)
Previous text:
2004-10-14 09:07: Subject: distinction between modules and classes? (Was: Re: MySQL big_query in 7.7)
/ Brevbäraren
Thanks. This confirms what I've learned so far. My remaining question is what mast referred to as bad practice regarding the muddling of modules and classes. I didn't quite follow the discussion about inheritance (what was a bad idea, and why). But it might fall into place later on.
Marcus Agehall (PacketFront) @ Pike (-) developers forum wrote:
I think of modules as objects without programs and classes as instances of some program. A module is normally a .pmod file and a class is implemented in a .pike file.
The main difference is, that the module only exists in one instance. You cannot create multiple instances of it whereas the class can be instantiated into several instances.
I'm sure mast, grubba or Nilsson or someone else can give a more detailed and technical explanation of classes vs modules but that is what I consider the most important difference.
/ Marcus Agehall (PacketFront)
The bad practice is to use an object (.pmod) as type and not a program (class/.pike).
/ Martin Nilsson (DivX Networks)
Previous text:
2004-10-14 09:23: Subject: Re: distinction between modules and classes? (Was: Re: MySQL big_query in 7.7)
/ Brevbäraren
Ah, makes sense. Got it now, thanks :)
Martin Nilsson (DivX Networks) @ Pike (-) developers forum wrote:
The bad practice is to use an object (.pmod) as type and not a program (class/.pike).
/ Martin Nilsson (DivX Networks)
Yes, but the really sinister badness becomes apparent later on as more functionality is added. There are two different sets of functions that are placed in classes and modules:
o The functions placed in modules are meaningless to inherit and to include in the type check (hence the need to make them optional). In the discussion about making Sql.Sql a module, such a function is `(). It only exists to be compatible with the Sql.Sql("...") syntax to get a database connection, but you also get a `() in the resulting object which only is confusing and might make errors harder to detect.
o The functions placed in classes are meaningless to call with the module syntax, and if you make a module of a class you get a meaningless object instance. In the Sql.Sql module example, the master will create an object for the module hierarchy, but as opposed to all other Sql.Sql objects, that one doesn't correspond to an actual database connection. It's possible to make calls like Sql.Sql.query which are bogus (but they will probably fail sooner or later when trying to call a function that lacks a definition in Sql.Sql).
An excellent example of how horrible it can get is the Thread module. Someone thought it'd be neat if the type name for thread objects was Thread and not Thread.Thread. Thus there's a shitload of optional functions there, because in reality there are many things that belong in a thread module but not in the thread objects (Mutex, Queue, all_threads, to name a few). Furthermore, if the type name is Thread, it's reasonable to expect that you can inherit it to make your own thread objects with some extra context. That doesn't work, since Thread.pmod overrides create() to avoid a thread being created when the module is compiled.
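Both problems can be reproduced in a small Python model. This is only an analogy under stated assumptions: SqlBase, MysqlConnection and the module-like Sql object are hypothetical, and __call__ stands in for Pike's `().

```python
class SqlBase:
    """One class used both as the "module" and as the connection base class."""

    def __init__(self, url=None):
        self.url = url

    def __call__(self, url):
        # Stands in for `(): keeps the Sql("...") syntax working.
        return MysqlConnection(url)

    def query(self, sql):
        if self.url is None:
            raise RuntimeError("this object is not a database connection")
        return self.url + " ran: " + sql


class MysqlConnection(SqlBase):
    pass


# The single "module" object. Unlike every other SqlBase object, it has no
# database connection behind it.
Sql = SqlBase()
conn = Sql("mysql://localhost/test")
```

Here conn("mysql://other/db") is accepted because every connection inherits the module-only call function, and Sql.query(...) looks like a normal call but can only fail. These are the two kinds of confusion described in the list above.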
/ Martin Stjernholm, Roxen IS
Previous text:
2004-10-14 10:24: Subject: Re: distinction between modules and classes? (Was: Re: MySQL big_query in 7.7)
/ Martin Nilsson (DivX Networks)
Is anyone really using the Thread type?
/ Martin Nilsson (DivX Networks)
Previous text:
2004-10-14 13:15: Subject: Re: distinction between modules and classes? (Was: Re: MySQL big_query in 7.7)
/ Martin Stjernholm, Roxen IS
Ok. streaming_query() is now implemented. It turns out, however, that eof() is rather flawed. In big_query(), eof() in the result object always returns 1. In streaming_query() it returns 0 until one "failed" fetch_row() has been performed, i.e. at a point where the user already knows eof is reached.
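The eof() behaviour described above can be modelled as follows. This is a Python sketch with hypothetical classes that reproduce only what is reported here, not the actual result objects.

```python
class BufferedResult:
    """Models a big_query() result as reported: eof() is always 1."""

    def __init__(self, rows):
        self._rows = list(rows)

    def eof(self):
        # Always 1, even while rows are still waiting to be fetched.
        return 1

    def fetch_row(self):
        return self._rows.pop(0) if self._rows else None


class StreamingResult:
    """Models a streaming_query() result: eof() lags one fetch behind."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self._hit_end = 0

    def eof(self):
        return self._hit_end

    def fetch_row(self):
        row = next(self._rows, None)
        if row is None:
            self._hit_end = 1    # only a "failed" fetch flips eof()
        return row


def read_all(result):
    # Works for both kinds of result because it ignores eof() entirely and
    # relies on fetch_row() returning None at the end.
    rows = []
    while True:
        row = result.fetch_row()
        if row is None:
            break
        rows.append(row)
    return rows
```

A loop written as "while not result.eof()" would read nothing from the buffered result and would attempt one fetch too many on the streaming one, so in this model eof() tells the caller nothing that fetch_row() has not already told it.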
pike-devel@lists.lysator.liu.se