I'm currently in the process of comparing grubba's git conversion of the Pike repository to my svn conversion, which should improve the quality of both.
I've encountered a systematic difference where some kind of policy decision on what is "right" is needed though.
The issue is with tags. In CVS, each file is tagged separately. If a file does not have a particular tag, you will not get that file when you check out the tag. There are in fact quite a few files in Pike which do not have certain tags even though they were current on trunk when the tag was made. This is typically the case with modules which were initially developed outside the main tree (and imported using the CVS "module" concept, which is not understood by cvs rtag). Examples include:
lib/modules/SSL.pmod/
src/post_modules/COM/
src/post_modules/SDL/
src/post_modules/Shuffler/
src/post_modules/_Image_SVG/
lib/modules/Audio.pmod
lib/modules/Tools.pmod/Legal.pmod
Then there are also a few tags (so far I have encountered v7_1_7, v7_1_20, and v7_7_28) where large parts of the tree (including src/ in its entirety) do not have the tag, suggesting that the tagging operation was interrupted and the resulting broken tag was just left there instead of being (re)moved.
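One way to spot such interrupted tags is to count, per top-level directory, how many files carry the tag. The following is only a minimal sketch, not the tooling used for either conversion. The function names and the `file_tags` mapping (path to the set of tag names on that file, as one might extract from the ,v files) are hypothetical, the sample paths are made up, and the sketch ignores whether a file already existed when the tag was made.

```python
from collections import defaultdict


def tag_coverage(file_tags, tag):
    """Return {top-level entry: (files carrying the tag, total files)}.

    file_tags is a hypothetical mapping: path -> set of tag names.
    """
    counts = defaultdict(lambda: [0, 0])
    for path, tags in file_tags.items():
        top = path.split("/", 1)[0]
        counts[top][1] += 1          # every file counts toward the total
        if tag in tags:
            counts[top][0] += 1      # only tagged files count here
    return {top: (tagged, total) for top, (tagged, total) in counts.items()}


def untagged_dirs(file_tags, tag):
    """Top-level entries in which no file at all carries the tag."""
    coverage = tag_coverage(file_tags, tag)
    return sorted(top for top, (tagged, _total) in coverage.items() if tagged == 0)


# Made-up sample data, for illustration only.
sample = {
    "src/a.c": set(),
    "src/b.c": set(),
    "lib/x.pike": {"v7_1_7"},
    "README": {"v7_1_7"},
}
print(untagged_dirs(sample, "v7_1_7"))
```

A directory that shows up here as entirely untagged while its siblings are tagged is a hint that the tagging run stopped partway, rather than that the files were deliberately left out.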
So, there are two principal approaches here: Should the tags in the converted repository accurately reflect the tags in the CVS repository, so that you only get those files you would have gotten out of CVS, or do we take the revisionist approach of also including files which do not have the tag but which were present on head when the tag was created, and therefore "ought" to have been tagged?
I'm of the opinion that the latter is preferable, since the lack of a tag for a file is of minimal interest, and the alternative would have to introduce artificial commits that are not on the main branches to represent these tags.
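To make the two policies concrete, here is a rough sketch of how the file set for a tag would differ between them. It is an illustration under stated assumptions, not how either converter works: `FileHistory`, `faithful_tag`, `revisionist_tag` and `estimate_tag_time` are hypothetical names, timestamps are plain integers, and only trunk revisions are modelled. CVS does not record when a tag was created, so the sketch guesses it as the newest commit time among the tagged revisions, which is only a lower bound.

```python
from dataclasses import dataclass, field


@dataclass
class FileHistory:
    """Hypothetical stand-in for the trunk history of one ,v file."""
    path: str
    # (timestamp, revision, state) sorted by timestamp; state "dead" means removed
    revisions: list
    tags: dict = field(default_factory=dict)  # tag name -> revision


def faithful_tag(files, tag):
    """Only the files that actually carry the tag, as a CVS checkout gives."""
    return {f.path: f.tags[tag] for f in files if tag in f.tags}


def estimate_tag_time(files, tag):
    """Heuristic (assumption): newest commit time among the tagged revisions."""
    times = [ts for f in files if tag in f.tags
             for ts, rev, _state in f.revisions if rev == f.tags[tag]]
    return max(times) if times else None


def trunk_revision_at(f, when):
    """Latest live trunk revision of f at time `when`, or None."""
    current = None
    for ts, rev, state in f.revisions:
        if ts > when:
            break
        current = None if state == "dead" else rev
    return current


def revisionist_tag(files, tag, when):
    """Tagged files, plus untagged files that were live on trunk at `when`."""
    result = faithful_tag(files, tag)
    for f in files:
        if f.path not in result:
            rev = trunk_revision_at(f, when)
            if rev is not None:
                result[f.path] = rev
    return result
```

With a tagged file in src/ and an untagged file under lib/modules/SSL.pmod/, `faithful_tag` returns only the former, while `revisionist_tag` also picks up whatever revision of the latter was current at the estimated tag time. Files that were dead on trunk at that time are left out in both cases.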
Normally I'm with the "conserve history at all costs" crowd, but revisionism seems the most pragmatic option here. The history is probably best preserved by tar-ing up all the sources that were used in creating the modern repositories and archiving that somewhere public.
On Sun, Mar 21, 2010 at 12:45:02PM +0000, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
There are in fact quite a few files in Pike which do not have certain tags even though they were current on trunk when the tag was made. This is typically the case with modules which were initially developed outside the main tree (and imported using the CVS "module" concept, which is not understood by cvs rtag).
if they were developed outside the main tree, doesn't that mean that they were not part of the trunk at that time but merged later?
shouldn't such cases be imported into an independent branch (that has no common base with pike) and then merged into the main pike branch at the appropriate point?
using a merge would represent the true history (developed separately and then merged)
greetings, martin.
On Sun, Mar 21, 2010 at 12:45:02PM +0000, Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
There are in fact quite a few files in Pike which do not have certain tags even though they were current on trunk when the tag was made. This is typically the case with modules which were initially developed outside the main tree (and imported using the CVS "module" concept, which is not understood by cvs rtag).
if they were developed outside the main tree, doesn't that mean that they were not part of the trunk at that time but merged later?
Often they were merged via the cvs module system.
On Mon, Mar 22, 2010 at 09:35:02AM +0000, Henrik Grubbström (Lysator) @ Pike (-) developers forum wrote:
if they were developed outside the main tree, doesn't that mean that they were not part of the trunk at that time but merged later?
Often they were merged via the cvs module system.
what does that mean specifically?
how does that change the fact that they were merged?
how does merging with the module system differ from copying a separate cvs tree into a subdirectory of the pike repository?
in both cases there is a point in time when the copy/merge happened, and that point would be the merge point for git.
greetings, martin.
if they were developed outside the main tree, doesn't that mean that they were not part of the trunk at that time but merged later?
They were not merged. They were initially included as a "module", which is the equivalent of e.g. svn:externals in subversion. So when checking out Pike, the module would get checked out as well, giving the impression that it was part of the main tree. Later, the ,v files were manually moved to the main tree, and the "module" directive removed. This made no practical difference except that the path to the repository for those directories changed slightly. There was never any merge operation involved.
that is what i meant with merging. regardless of the method used, they were combined somehow, now giving the appearance that they were always part of the repo. but in reality they were not part of the repo until they were included.
either this point when the modules were added to the main repo, or the point when they were actually moved to the main tree can be regarded as a merge point which in git could be represented as a real merge commit.
what i mean is, that currently in cvs the history looks like:
P---M---P-t-M--P------M---P-t-M---------P---M---...
(where P are commits to pike itself, M commits to external modules, and t are tags that are only made to the main repo)
while in reality what happened is:
M-------M---
            \
             m1---M-------M---
                              \
P-------P-T----P----------P-T------m2---P---M---...
m1 is the point where the code was added with cvs module, and m2 where it was moved to the main tree. either m1 or m2 could be considered mergepoints, and represented with a merge commit in git and the tags represent a state that matches history as it happened
greetings, martin.
that is what i meant with merging. regardless of the method used, they were combined somehow, now giving the appearance that they were always part of the repo. but in reality they were not part of the repo until they were included.
They were always part of the repo, just with a different path. And only the path within the repo changed, the path in a checkout remained the same.
m1 is the point where the code was added with cvs module, and m2 where it was moved to the main tree. either m1 or m2 could be considered mergepoints, and represented with a merge commit in git and the tags represent a state that matches history as it happened
How? I agree in principle as far as m1 is concerned, but what could possibly be merged at m2? The tree is the union of all files in the main tree and in all modules, so it looks exactly the same before and after m2.
well, i was not sure if m1 or m2 would be the more logical mergepoint. if you say it is m1 then we don't need to discuss m2 as it does not make sense to have two mergepoints here.
greetings, martin.
Well, in that case we're back to square one: What to do with T:s between m1 and m2. :-)
Marcus Comstedt (ACROSS) (Hail Ilpalazzo!) @ Pike (-) developers forum wrote:
No more opinions? There seem to be two votes for revisionism so far.
I'd consider revisionism the path of least resistance in this case.
The extra information gleaned from doing it any other way is negligible; it might even be considered counterproductive, because it would increase the amount of work one needs to do to see the big picture (the "module" in relation to the mainstream Pike of that time).
pike-devel@lists.lysator.liu.se