
git remote svn features

flyingflo edited this page Apr 15, 2012 · 4 revisions

Features as originally suggested by Jonathan Nieder.

  • baseline: remote helper in C

  • option to import starting with a particular numbered revision. This would be good practice for seeing how options passed to "git clone -c" can be read from the config file. -c (that is, --config) allows options to be set before git starts fetching.

  • option or URL scheme to import a single project from a large Subversion repository that houses several projects. This would already be useful in practice, since importing the entire Apache Software Foundation repository takes a while, which is a waste when one only wants the history of the Subversion project. How should the importer handle Subversion copy commands that refer to other projects in this case? svnrdump can import subdirectories, but it doesn’t follow the history of the copy source, losing the history of branches before they split off from their parent.

  • automatically detecting trunk when importing a project with the standard layout. The trunk usually is not branched from elsewhere so this does not require copyfrom info. Some design questions come up here: should the remote helper import the entire project tree, too? (I think "yes", since copy commands that copy from other branches are very common and that would ensure the relevant info is available to git.) This means storing a commit that represents the svn directory prior to splitting it into branches. This could be useful for debugging and more.

  • mapping of git commit names to Subversion revision numbers in notes. "git notes" provide a way to attach additional information to a commit. Because "git notes" are stored in the git object db as native objects, they can be shared using the usual "git fetch" / "git push" commands as long as you specify the appropriate source and destination refs on the command line or in git’s configuration file. Commands like "git rebase" that modify history also have some support for carrying notes along. A note on each imported commit can therefore map the git commit to its svn revision number.

  • mapping of svn rev numbers to git commits. "git fast-import" has a facility that fits well for this mapping: a marks file can store an arbitrary mapping from numbers to objects (usually objects that were part of the import). svn-fe writes a mark for each Subversion revision it imports.

Note
There is a marks-to-notes converter (by Jonathan) that keeps both mappings in sync.
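The revision-to-commit lookup described above can be sketched in a few lines. This is only an illustration: it assumes the marks file uses the `:<mark> <object name>` line format that "git fast-import" exports, and that the mark number equals the Subversion revision number, as the text says svn-fe does. The function names and the sample data are hypothetical.

```python
def parse_marks(text):
    """Parse marks-file text into {svn_revision: git_object_name}.

    Assumes one ":<mark> <object name>" entry per line, with the mark
    number standing for the Subversion revision.
    """
    rev_to_commit = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mark, name = line.split(" ", 1)
        if not mark.startswith(":"):
            raise ValueError("not a marks line: %r" % line)
        rev_to_commit[int(mark[1:])] = name.strip()
    return rev_to_commit


def invert(rev_to_commit):
    """Build the reverse lookup {git_object_name: svn_revision}."""
    return {name: rev for rev, name in rev_to_commit.items()}


# Made-up object names, for illustration only.
sample = "\n".join([":1 " + "a" * 40, ":2 " + "b" * 40]) + "\n"
marks = parse_marks(sample)
commits = invert(marks)
```

The inverted dictionary is the same information a notes-based mapping would carry, which is why a converter can keep the two in sync.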
  • detecting trunk and branches and exposing them as different remote branches. This is a small step that just involves understanding how remote helpers expose branches.

  • Metadata: storing path properties and copyfrom information in the commits produced by the vcs-svn/ library. How should these be stored? For example, there could be a parallel directory structure in the tree with properties for <path> stored at .properties/<path>.properties. This strawman scheme doesn’t work if the repository being imported has any paths ending with ".properties", though. Ideas? Another option is to store it in a JSON file in .git.

        foo/
                bar.c
        baz/
                qux.c
        .properties/
                foo.properties
                foo/
                        bar.c.properties
                baz/
                        qux.c.properties
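To make the weakness of the strawman scheme concrete, here is a small sketch of the path mapping together with a check for one way it can break: a property file that would also have to be a directory. The helper names are hypothetical, and the check covers only this one kind of clash, not every problem a real importer would face.

```python
import posixpath

PROPS_ROOT = ".properties"
SUFFIX = ".properties"


def properties_path(path):
    """Map a repository path to the file holding its properties."""
    return posixpath.join(PROPS_ROOT, path + SUFFIX)


def find_conflicts(paths):
    """Return property files that would also need to be directories."""
    prop_files = {properties_path(p) for p in paths}
    conflicts = set()
    for prop_file in prop_files:
        # Walk up the ancestors of each property file, stopping at the root.
        parent = posixpath.dirname(prop_file)
        while parent and parent != PROPS_ROOT:
            if parent in prop_files:
                conflicts.add(parent)
            parent = posixpath.dirname(parent)
    return sorted(conflicts)


ok = find_conflicts(["foo", "foo/bar.c", "baz/qux.c"])
clash = find_conflicts(["foo", "foo.properties/x"])
```

In the second call, the properties of the directory `foo` and the properties of anything below a directory named `foo.properties` compete for the same name, which is the kind of collision the bullet warns about.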
  • tracing history past branch creation events, using the now-saved copyfrom information.

  • tracing second-parent history using svn:mergeinfo properties. Handling merges in general. In svn, merging means "Apply the differences between two sources to a working copy path." The mergeinfo property stores a list of revisions that have been applied to that file, so it’s not straightforward to translate it into a git merge. This is a file property.
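As a rough illustration of what the importer would have to work with, the sketch below parses an svn:mergeinfo value into per-source revision sets. It assumes the usual layout of one `/source/path:revision-list` entry per line, where the list holds single revisions and `N-M` ranges and a trailing `*` marks a non-inheritable range. The function name is hypothetical, and deciding when such a set amounts to a git merge is left open, as in the text.

```python
def parse_mergeinfo(value):
    """Parse an svn:mergeinfo value into {source_path: set_of_revisions}."""
    merged = {}
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so that colons in the path survive.
        source, ranges = line.rsplit(":", 1)
        revs = set()
        for item in ranges.split(","):
            item = item.rstrip("*")  # drop the non-inheritable marker
            if "-" in item:
                first, last = item.split("-")
                revs.update(range(int(first), int(last) + 1))
            else:
                revs.add(int(item))
        merged[source] = revs
    return merged


info = parse_mergeinfo("/branches/feature:3-5,8\n/trunk:10*")
```

Because the result is a set of cherry-picked revisions per source rather than a single merged tip, a second parent can only be inferred when the set covers the whole history of the source branch.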

  • cloning a git repository that was fetched from svn: the cloned repo should be able to fetch from svn too.

  • empty directory

  • cross branch revisions: the only sane way to translate them to git is creating two commits. This requires the revision-commit mapping to be 1-to-n if not n-to-n.

  • detecting tags: In svn, tags are created by an "svn copy" to a directory, which is equivalent to creating a branch. The difference is that tags are usually never changed by a subsequent revision. This property could be used to detect tags. However, there is one problem: whenever we import a directory that looks like a tag, we don’t know if it will be changed tomorrow.

  • branch detection: Andrew suggests these rules for default branch detection. A directory is a branch if…

    1. it is not a subdirectory of an existing branch; and

    2. either:

      • 2a. it is in a list of branches specified by the user, or

      • 2b. it is copied from a (subdirectory of a) branch. For unclear cases, the helper can fail with an error message and rely on a pre-created SBL file that specifies what a directory actually represents.
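The rules above can be written down as a small predicate. This is a sketch under simplifying assumptions: paths are plain strings without trailing slashes, the set of already known branches is passed in by the caller, and the error path for unclear cases is not modelled. All names are hypothetical.

```python
def is_under(path, ancestor):
    """True if path equals ancestor or lies somewhere below it."""
    return path == ancestor or path.startswith(ancestor.rstrip("/") + "/")


def is_branch(path, existing_branches, user_branches, copyfrom=None):
    """Apply rules 1, 2a and 2b to a directory path."""
    # Rule 1: a subdirectory of an existing branch is never a branch.
    if any(is_under(path, b) and path != b for b in existing_branches):
        return False
    # Rule 2a: the user listed it explicitly.
    if path in user_branches:
        return True
    # Rule 2b: it was copied from a branch or from a subdirectory of one.
    if copyfrom is not None:
        return any(is_under(copyfrom, b) for b in existing_branches)
    return False


known = {"trunk"}
copied = is_branch("branches/feature", known, set(), copyfrom="trunk")
nested = is_branch("trunk/src", known, {"trunk/src"})
```

Rule 1 is checked first, so a user-listed directory inside an existing branch is still rejected; whether that ordering is what the rules intend is a design question the list leaves open.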

  • branches deleted in svn:

    1. Just delete the branch and let git gc clean up the objects later

    2. Just leave them be for the user to deal with at a later date

    3. Move them to another namespace
