We have been working on a CritBit tree implementation for pike. We would like to get it upstream, however we have made a few changes elsewhere that may need discussion:
precompile.pike templating: we added a simple template support to precompile.pike. It basically replaces a set of identifiers in the template file before precompilation. The syntax is admittedly not very beautiful. Autodoc to support global paths in AUTODOC_SRC_IN: we could not figure out how to make the autodoc generator to extract docs from several .cmod files without the AUTODOC_SRC_IN feature applied on the generated .c-files with absolute paths.
You can check it out using git from http://fortranland.com/git/pike. The branch is called critbit. We have not done any testing on non x86 machines. It would be nice if someone could run the testsuite on some big endian machine. Also, we only compiled on linux machines. There might be problems on other environments because of the gcc intrinsics we are using for prefix searches.
Currently there are trees supporting float, integers and pike strings and two derived types for ipv4 ip addresses and date objects. The lookup, insert and delete operations are comparable in speed to pike mappings (1.X slower in our tests). In worst case scenario critbit trees can be slow in insert and lookup (e.g. keys like "a"*i).
However, critbit trees should be more memory efficient in many cases because of the mapping overhead. Also, they support several operations that mappings don't offer, namely, iteration in lexicographical order, range lookup and finding biggest/smallest index, etc.
We have written quite extensive tests and a complete documentation, which you can find at http://fortranland.com/refdoc/modref/ex/predef_3A_3A/ADT/CritBit.html
Any hints, suggestions, bugreports and comments are welcome.
arne and tobias
Hi. This code seems to depend on Linux-specific header files:
#### Making dynamic: post_modules/CritBit Compiling post_modules/CritBit/tree_block_alloc.c In file included from /tmp/pike/src/post_modules/CritBit/ptr_key.h:7:0, from /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c:9: /tmp/pike/src/post_modules/CritBit/bitvector.h:4:20: fatal error: endian.h: No such file or directory compilation terminated.
Please fix. If you want to know the byteorder, just use the define PIKE_BYTEORDER provided by the Pike include file global.h.
On Wed, 26 Jan 2011, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Hi. This code seems to depend on Linux-specific header files:
#### Making dynamic: post_modules/CritBit Compiling post_modules/CritBit/tree_block_alloc.c In file included from /tmp/pike/src/post_modules/CritBit/ptr_key.h:7:0, from /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c:9: /tmp/pike/src/post_modules/CritBit/bitvector.h:4:20: fatal error: endian.h: No such file or directory compilation terminated.
Please fix. If you want to know the byteorder, just use the define PIKE_BYTEORDER provided by the Pike include file global.h.
Thanks, will do. When searching the source for BYTEORDER there is several places (encode.c and pcx.c) where BYTEORDER is used instead of PIKE_BYTEORDER. It seems that those are just wrong.
Looks like you forgot to test-compile before committing :-)
#### Making dynamic: post_modules/CritBit Compiling post_modules/CritBit/tree_block_alloc.c In file included from /tmp/pike/src/post_modules/CritBit/ptr_key.h:7:0, from /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c:9: /tmp/pike/src/post_modules/CritBit/bitvector.h:36:6: error: missing ')' in expression
Line 36 is
#if !(defined(PIKE_BYTEORDER)
which obviously has unbalanced parentheses...
Its a shame. I fixed that, and another typo and pushed.
On Wed, 26 Jan 2011, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Looks like you forgot to test-compile before committing :-)
#### Making dynamic: post_modules/CritBit Compiling post_modules/CritBit/tree_block_alloc.c In file included from /tmp/pike/src/post_modules/CritBit/ptr_key.h:7:0, from /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c:9: /tmp/pike/src/post_modules/CritBit/bitvector.h:36:6: error: missing ')' in expression
Line 36 is
#if !(defined(PIKE_BYTEORDER)
which obviously has unbalanced parentheses...
Thanks. Two more problems:
1) The Makefile.in contains no rules for creating the c files from cmod. I had too add these lines:
tree.c : tree.cmod
inttree.c : inttree.cmod
floattree.c : floattree.cmod
2) After that, inttree.c was generated, but did not compile:
Compiling inttree.c /tmp/pike/src/post_modules/CritBit/inttree.cmod:41:10: error: expected '=', ',', ';', 'asm' or '__attribute__' before string constant
This is the generated code:
[...] static int IntTree_program_fun_num=-1; #line 41 "/tmp/pike/src/post_modules/CritBit/inttree.cmod" SET_TVAR "iterator_class" "Iterator"; SET_TVAR "tree_class" "IntTree"; SET_TVAR "OBJ2_TREE" "OBJ2_INTTREE"; [...]
It looks like a definition of "SET_TVAR" is missing somewhere...
On Wed, Jan 26, 2011 at 9:10 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se wrote:
Thanks. Two more problems:
Also thanks, we'll look into it.
It looks like a definition of "SET_TVAR" is missing somewhere...
SET_TVAR is a (new) Tools.Standalone.Precompile token (it sets template variables for the also new INCLUDE_TEMPLATE feature). Unfortunately the Pike build process uses the installed Pike and thus it's precompile.pike. We were under the impression that post_modules would use the precompile.pike of the newly compiled Pike version. We're not sure what to do there, although for testing purposes copying the $CRITBIT_PIKE precompile.pike to $SYSTEM_PIKE/lib/modules/Tools.pmod/Standalone.pmod helps.
Thanks,
tobias
Simply removing the lines of code introduced by that commit seems to solve the problem, but I do not know the purpose for which they were added (the commit message wasn't terribly helpful), so I can't say if that is indeed the correct fix...
On Thu, Jan 27, 2011 at 2:55 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se wrote:
Simply removing the lines of code introduced by that commit seems to solve the problem, but I do not know the purpose for which they were added (the commit message wasn't terribly helpful), so I can't say if that is indeed the correct fix...
This was, as far as I know, a "fix" to support cross compilation...
That doesn't make sense. The code further down can use the system pike with the precompile.pike from the build tree (this is what happens for me, even though I'm not cross compiling), which is exactly what you want in the cross compilation case...
In that case, I might've misattributed the commit. In any case, I've now heard several things about the Pike crosscompilation:
* System Pike will be used * System Pike will be used, but newly installed Pike for post_modules * ...what you just said
and then of course, experienced Pike to just use the system Pike (and precompile.pike). We're a little lost here, but without changes to precompile.pike our project doesn't make sense (tree code is used multiple times and at multiple levels, for example the trees exported to Pike use a block allocator itself using trees again to manage the memory. Without source code reuse, it would blow up and become an unmaintainable mess. :) ).
On Wed, Jan 26, 2011 at 4:45 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se wrote:
Hi. This code seems to depend on Linux-specific header files:
We have worked on that and got it to compile under a virtual FreeBSD 8.1 on a amd64 machine. During this process, we discovered an imcompatibility with GCC 4.2 and resolved that, too. The code is pushed to http://fortranland.com/git/pike.
Unfortunately, the Pike build as a whole does not work on FreeBSD - but this does not depend on whether post_modules/CritBit is in the source tree or not. The CritBits compile without any error or warning. Pike fails to build with the following issue there:
Compiling module.c /usr/home/dev/hash/pike/build/freebsd-8.1-release-amd64/smartlink gcc -DPIKE_SRC_ROOT="/usr/home/dev/hash/pike" -I. -I/usr/home/dev/hash/pike/src -I/usr/home/dev/hash/pike/build/freebsd-8.1-release-amd64/bundles/include -I/usr/local/include -DPIKE_CORE -g -ggdb3 -pthread -O3 -pipe -W -Wall -Wno-unused -Wcomment -Wformat -Wimplicit-function-declaration -Wmultichar -Wswitch -Wuninitialized -Wpointer-arith -Wchar-subscripts -Wno-long-long -Wdeclaration-after-statement -L/usr/home/dev/hash/pike/build/freebsd-8.1-release-amd64/bundles/lib64 -L/usr/home/dev/hash/pike/build/freebsd-8.1-release-amd64/bundles/lib/64 -L/usr/home/dev/hash/pike/build/freebsd-8.1-release-amd64/bundles/lib/. -L/usr/local/lib -R/usr/local/lib -pthread -rdynamic main.o language.o security.o bignum.o pike_cpulib.o interpret.o constants.o cpp.o fdlib.o cyclic.o array.o backend.o callback.o encode.o docode.o dynamic_buffer.o dynamic_load.o error.o fd_control.o fsort.o gc.o hashtable.o lex.o multiset.o signal_handler.o pike_search.o pike_types.o pike_embed.o mapping.o pike_memory.o module_support.o pikecode.o object.o opcodes.o operators.o pike_float.o port.o program.o rbtree.o rusage.o sscanf.o stralloc.o stuff.o threads.o version.o queue.o builtin.o iterators.o facetgroup.o svalue.o las.o builtin_functions.o peep.o module.o `cat modules/linker_options post_modules/linker_options` -lrt -lm -lcrypt -o pike /usr/home/dev/hash/pike/src/object.c:695: Fatal error: Couldn't load master object. No stack - no backtrace. *** Signal 6
Pike from /usr/ports/lang/pike78 compiled fine. More analysis on the Pike build issue is not available as of now, unfortunately.
Thanks!
With some fiddling I got the CritBit module to compile on Solaris, but it does not load:
Pike v7.9 release 5 running Hilfe v3.5 (Incremental Pike Frontend)
ADT.CritBit;
lib/modules/ADT.pmod/_CritBit.so:-: Warning: Failed to load library: ld.so.1: pike: fatal: relocation error: file /tmp/pike/build/sunos-5.11-sun4v/lib/modules/ADT.pmod/_CritBit.so: symbol MIN: referenced symbol not found Compiler Error: 1: Index 'CritBit' not present in module ADT. Compiler Error: 1: Indexed module was: ADT.
There is a MINIMUM() macro in pike_macros.h you can use if you like.
On Thu, Jan 27, 2011 at 3:50 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se wrote:
There is a MINIMUM() macro in pike_macros.h you can use if you like.
I have replaced all occurences of MIN and MAX accordingly and pushed to the GIT on fortranland.com. Additionally I have testcompiled the CritBits with the code changed so that big endian code was used. Compile was fine, but result the result would of course be unusable.
Thanks. Moving forward...
Doing tests in build/sunos-5.11-sun4v/post_modules/CritBit/testsuite (704 tests) Bus error (core dumped)
Well, I'll have to inspect this closer in a debugger later...
On Thu, Jan 27, 2011 at 6:10 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se wrote:
Well, I'll have to inspect this closer in a debugger later...
Not to discourage you, but the error might not be where the crash occurs. The backtrace might help a little, though. In my lack of big endian machines, I've tried emulating some with qemu, but no OS I tried really got on with the virtual machine even that much that I could have completed the installation procedure. Does anyone happen to have a spare login on a big endian machine?
The backtrace might help a little, though.
Not as such:
Program received signal SIGSEGV, Segmentation fault. 0xffffffff5dc1d55c in ?? () (gdb) bt #0 0xffffffff5dc1d55c in ?? () #1 0xffffffff5dc1d554 in ?? () Backtrace stopped: previous frame identical to this frame (corrupt stack?) (gdb)
The offending instruction is
0xffffffff5dc1d55c: sth %g1, [ %i0 + 0x18 ]
which attempts to store a 16-bit value at an odd address (since %i0 contains 0x10092d6e9).
Apparently I need to compile a new gdb, since I don't get any symbolic info from shared objects, but I've managed to map the code back to this line, using pattern matching of the machine code:
/tmp/pike/src/post_modules/CritBit/tree_low.c:156 CB_SET_KEY(node, s);
So appantly "node" contains an odd address. I instrumented the function node_init() to print the addresses of nodes it allocates, and sure enough:
tree = 1008d8480 tree = 100931289 *** Error code 138 make: Fatal error: Command failed for target `just_verify'
So the bug is that CB_NODE_ALLOC() does not provide enough alignment for an object of type "struct cb_node" in the pointer it returns.
On Fri, Jan 28, 2011 at 4:30 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se wrote:
So the bug is that CB_NODE_ALLOC() does not provide enough alignment for an object of type "struct cb_node" in the pointer it returns.
I seem to have less time available this weekend than I thought beforehand, so I have for now simply disabled the block allocator on big endian machines. Could you probably check if at least the tree logic is correct?
Thanks,
tobias
I seem to have less time available this weekend than I thought beforehand, so I have for now simply disabled the block allocator on big endian machines. Could you probably check if at least the tree logic is correct?
This has nothing to do with endianness. There are plenty of little endian archtectures which don't support bad alignment either.
But I'll try it out nevertheless. :-)
Well, there are still alignment bugs:
Program received signal SIGSEGV, Segmentation fault. 0xffffffff5dc1cd18 in ?? ()
The offending instruction is
0xffffffff5dc1cd18: ld [ %g4 + %l0 ], %g3
%g4 is 0x100a001e0, but %l0 is 0x1, so together they form an odd address. The code comes from line 79 in widestring.h, which says
PREFIX(32);
The definition of PREFIX contains
uint ##n ##_t x = ((*((uint ##n ##_t *)(p1+j)) ^ *((uint ##n ##_t *) (p2+j))));
which will be instantiated to
uint32_t x = ((*((uint32_t *)(p1+j)) ^ *((uint32_t *) (p2+j))));
It seems like "j" is odd (1) which will of course not work for an uint32_t.
On Sun, Jan 30, 2011 at 5:50 PM, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum 10353@lyskom.lysator.liu.se wrote:
Well, there are still alignment bugs:
Ok, no escape from learning more about alignment. Will do.
Thanks!
Well, it's not terribly complicated. What it all boils down to is this: Any primitive object (i.e. scalar, floating point or pointer) needs to reside at an address which is a multiple of the size of said object. The C compiler will normally take care of this for you, but by performing pointer arithmetic on type-punned pointers it is possible to shoot oneself in the foot.
I fixed that one. Sorry for the late reply, but we have both been a little busy. I cannot guarantee that there are no other places where we are doing misaligned memory checks. We only played around with valgrind and the AC flag. I feel a little guilty of wasting your time with these bugs. Unfortunately though, I only have logins on x86 linux machines for testing. So just in case someone has a machine somewhere and feels like giving us a login...
best
arne
On Sun, 30 Jan 2011, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Well, it's not terribly complicated. What it all boils down to is this: Any primitive object (i.e. scalar, floating point or pointer) needs to reside at an address which is a multiple of the size of said object. The C compiler will normally take care of this for you, but by performing pointer arithmetic on type-punned pointers it is possible to shoot oneself in the foot.
Hm, I'm unable to fetch your HEAD:
error: Unable to find ce8ecbed625b14643e67e114c904e58320907579 under http://fortranland.com/git/pike Cannot obtain needed object ce8ecbed625b14643e67e114c904e58320907579
Did you forget to run "git update-server-info"?
Maybe its because i rebased to latest 7.9? I also ran pike update-server-info again, but its in the post-update hook which i remember working before.
On Tue, 8 Feb 2011, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Hm, I'm unable to fetch your HEAD:
error: Unable to find ce8ecbed625b14643e67e114c904e58320907579 under http://fortranland.com/git/pike Cannot obtain needed object ce8ecbed625b14643e67e114c904e58320907579
Did you forget to run "git update-server-info"?
The thing is that your refs/heads/critbit says that ce8ecbed625b14643e67e114c904e58320907579 is the head, but apparently no such commit object can be found in objects/?
And no, fetch still isn't working. Making a new clone seems to work though, and gets this object. Very weird. Well, that allows me to work around the problem at least (by fetching from the new clone first)...
As for the actual code, it's now possible to run make verify without memory faults, so that's good progress. :-)
12 tests are failing, please see http://mc.pp.se/critbit_verify.txt for details.
12 tests are failing, please see http://mc.pp.se/critbit_verify.txt for details.
We have finally gotten hold of a big endian machine and resolved these issues. The code is pushed to the fortranland git repository.
Hm, did you commit everything?
Compiling floattree.c In file included from /tmp/pike/src/post_modules/CritBit/floattree.cmod:28:0: /tmp/pike/src/post_modules/CritBit/floattree.h:15:9: warning: #pragma pack(push[, id], <n>) is not supported on this target /tmp/pike/src/post_modules/CritBit/floattree.h:20:9: warning: #pragma pack(pop[, id], <n>) is not supported on this target Compiling post_modules/CritBit/tree_block_alloc.c /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c: In function ¡free_item¢: /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c:93:6: warning: implicit declaration of function ¡LBITN¢ /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c:93:17: error: expected expression before ¡unsigned¢ {...]
I can't find any declaration of "LBITN" anywhere in the CritBit sources...
Its rather that we removed something which we should have kept. Its only used in the unused block_alloc code. but that one is still in OBJS and being compiled. I readded the defines, thanks.
arne
On Tue, 22 Feb 2011, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Hm, did you commit everything?
Compiling floattree.c In file included from /tmp/pike/src/post_modules/CritBit/floattree.cmod:28:0: /tmp/pike/src/post_modules/CritBit/floattree.h:15:9: warning: #pragma pack(push[, id], <n>) is not supported on this target /tmp/pike/src/post_modules/CritBit/floattree.h:20:9: warning: #pragma pack(pop[, id], <n>) is not supported on this target Compiling post_modules/CritBit/tree_block_alloc.c /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c: In function �free_item�: /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c:93:6: warning: implicit declaration of function �LBITN� /tmp/pike/src/post_modules/CritBit/tree_block_alloc.c:93:17: error: expected expression before �unsigned� {...]
I can't find any declaration of "LBITN" anywhere in the CritBit sources...
Right:
| Begin tests at Tue Feb 22 11:44:41 2011 (pid 5107) | Doing tests in post_modules/CritBit/testsuite (704 tests) | Total tests: 704 (0 tests skipped) | Finished tests at Tue Feb 22 11:44:49 2011
So, the following issues remain then:
* Makefile.in needs to be fixed, see 9a3d000 in the main repository for reference.
* #pragmas should be removed
* Use of __attribute should be removed, or at least made conditional on gcc (including version)
* __bultin_X must be tested for, and portable alternatives used if they do not exist (looks like you did this for __builtin_expect already, so that leaves __builtin_clz, __builtin_clzl, __builtin_clzll, __builtin_bswap32, and __builtin_bswap64)
We have tried to address all those. We have a set of fallbacks and tests for builtins for the compilers we could think of. One unrelated problem we ran into was that a pike compiled with icc segfaults when compiled with default --with-machine-code. Without that everything seems to be fine. We did not try to compile on windows, however we will give that a try aswell.
arne
On Tue, 22 Feb 2011, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
Right:
| Begin tests at Tue Feb 22 11:44:41 2011 (pid 5107) | Doing tests in post_modules/CritBit/testsuite (704 tests) | Total tests: 704 (0 tests skipped) | Finished tests at Tue Feb 22 11:44:49 2011
So, the following issues remain then:
- Makefile.in needs to be fixed, see 9a3d000 in the main repository
for reference.
#pragmas should be removed
Use of __attribute should be removed, or at least made conditional
on gcc (including version)
- __bultin_X must be tested for, and portable alternatives used if
they do not exist (looks like you did this for __builtin_expect already, so that leaves __builtin_clz, __builtin_clzl, __builtin_clzll, __builtin_bswap32, and __builtin_bswap64)
We have tried to address all those. We have a set of fallbacks and tests for builtins for the compilers we could think of.
Great! Is everything pushed to fortranland?
One unrelated problem we ran into was that a pike compiled with icc segfaults when compiled with default --with-machine-code.
That does indeed sound unrelated, but thanks for reporting it.
pike-devel@lists.lysator.liu.se