Nice. If my import from the SVN version was correct, this should match my import. An automated check between the two git repositories is not difficult and is actually very fast.
Note that the $Id$ markers in pike-full-10100103 have been extracted in -ko mode, so they most likely don't match the SVN repository. This has been fixed in the current converter.
Er, yes, indeed. The check will only be very fast if the file content is identical (in which case the tree hashes for entire commits can be matched).
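For what it's worth, a rough sketch of how such a check could look, assuming both imports are available as local clones and that the commits come out in the same order on the branch being compared. The function names are made up for illustration; the only git feature relied on is `git log --format=%T`, which prints the tree hash of each commit.

```python
import subprocess
from itertools import zip_longest


def tree_hashes(repo, rev="HEAD"):
    """Return the tree hash of every commit reachable from rev, newest first."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--format=%T", rev],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()


def first_mismatch(a, b):
    """Return the index of the first differing tree hash, or None if equal."""
    # zip_longest pads the shorter list with None, so a length
    # difference is reported as a mismatch too.
    for i, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return i
    return None
```

Calling `first_mismatch(tree_hashes(path_a), tree_hashes(path_b))` with the two repository paths returns None when every tree matches. If the $Id$ expansion differs between the imports, the trees will differ from the first affected commit onwards and the cheap comparison no longer helps.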
Case in point, around label v7.8.350 there is an artificial "branch loop", and looking back through history I see it again at v7.8.336, and probably
No, those are actually proper splits. They have typically been caused by the build system running in parallel with someone committing stuff. They are (in the general case) also necessary to keep the tags on the correct versions of files.
Ok. Excellent. Just make sure then that the build system is using the second-parent slot, and the regular commits use the first-parent slot (so that --first-parent on git commands yields only the proper commits).
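To illustrate the parent-slot point, here is a toy model of what following only the first parent does. It is not how git implements it, just a sketch on a hypothetical four-commit history in which merge M has the regular commit C as its first parent and the build-system commit B as its second.

```python
def first_parent_chain(parents, tip):
    """Follow only the first parent from tip, as --first-parent does."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        ps = parents.get(commit, [])
        # Ignore every parent except the first one.
        commit = ps[0] if ps else None
    return chain


# Hypothetical history: A is the root, C and B both descend from A,
# and M merges them with C in the first slot and B in the second.
parents = {"M": ["C", "B"], "C": ["A"], "B": ["A"], "A": []}
print(first_parent_chain(parents, "M"))  # ['M', 'C', 'A']
```

The build-system commit B never shows up in the chain, which is the effect wanted here. If the slots were swapped, the walk would go through B and skip C instead.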
The import repository I made should be around 50MB; yours is around 150MB. It looks like it still contains a lot of extra garbage.
The main reason is probably that it's a raw repository as generated by git-fast-import. The git-fast-import manual recommends running git-repack -f --window=50 on the repository afterwards.
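A possible way to test that theory, sketched under a few assumptions: the repository is a local path, and `-a -d` (repack everything and delete the old packs) is added on top of the `-f --window=50` quoted above. The helper names are hypothetical. `git count-objects -v` reports the pack size in KiB in its `size-pack` field.

```python
import subprocess


def parse_count_objects(text):
    """Parse `git count-objects -v` output into a dict of integer fields."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep and value.strip().isdigit():
            stats[key] = int(value)
    return stats


def repack_and_measure(repo):
    """Repack repo from scratch and return its pack size in KiB."""
    # -a -d is an assumption added here; -f --window=50 is the
    # setting mentioned in the thread.
    subprocess.run(
        ["git", "-C", repo, "repack", "-a", "-d", "-f", "--window=50"],
        check=True,
    )
    out = subprocess.run(
        ["git", "-C", repo, "count-objects", "-v"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_count_objects(out)["size-pack"]
```

Running `repack_and_measure` on both imports would show whether the size gap survives a full repack or is only an artifact of the raw fast-import packfile.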
I think I tried that already. But I'll look into it once more.