Skip to content

User Guide

Matthias Sohn edited this page Nov 30, 2023 · 1 revision

Getting Started

If you're new to Git or distributed version control systems generally, then you might want to read Git for Eclipse Users first. If you need more details and background read the book Pro Git. Find details about each release in the new and noteworthy and the release notes.

Taking JGit for a Spin

Although you are probably interested in JGit because you want to integrate it into an existing application or create a tool, JGit is more than simply a Java library for working with git repository. So before diving into the different aspects of the library let's take JGit for a spin.

You are probably familiar with the git command line interface (CLI) that can be used from the shell or in scripts. JGit comes with its own small CLI, which, although not as feature-full as the git CLI, is a good way to showcase what JGIt can do. Furthermore, the programs serve as an excellent source of inspiration for how to accomplish different tasks.

Building the JGit CLI

Assuming that you have the EGit git repository cloned and ready, build the jgit binary by running the jgit maven build (see the Contributor Guide):

~/src/jgit$ mvn clean install

Find the jgit binary here (path relative to root of working tree of your clone of the jgit repository):

org.eclipse.jgit.pgm/target/jgit

Check your build by running the "version" command:

prompt$ ./jgit version
jgit version 0.10.0-SNAPSHOT

If you want to use jgit frequently you may consider to ease running it via a symbolic link (usually goes under /usr/local/bin)

sudo ln -s /path/to/jgit /usr/local/bin/jgit

Running the JGit CLI

Overview

When given the -h flag, commands provide a helpful message listing what flags they support.

prompt$ ./jgit version -h
jgit version [--help (-h)]

 --help (-h) : display this help text

Running jgit with no arguments lists the most commonly used commands.

prompt$ ./jgit
jgit --git-dir GIT_DIR --help (-h) --show-stack-trace command [ARG ...]

The most commonly used commands are:
 branch   List, create, or delete branches
 clone    Clone a repository into a new directory
 commit   Record changes to the repository
 daemon   Export repositories over `[`git://`](git://)
 diff     Show diffs
 fetch    Update remote refs from another repository
 init     Create an empty git repository
 log      View commit history
 push     Update remote repository from local refs
 rm       Stop tracking a file
 tag      Create a tag
 version  Display the version of jgit

The commands are modeled after their corresponding command in the git CLI. We will not cover all the commands here, but simply give some examples.

jgit also provides a number of debug and test commands, to list all the available commands run

prompt$ ./jgit debug-show-commands

Inspecting the Repository

Before inspecting the most recent commits, you probably want to know which branches the repository contains and what branch is currently checked out. Using the branch commands -v flag, you get a small summary of branches, their revision, and the first line of the revision's commit message.

prompt$ ./jgit branch -v
  master       4d4adfb Git Project import: don't hide but gray out existing projects
* traceHistory 6b9fe04 [historyView] Add trace instrumentation

The log command, like git-log(1), shows the commit log. For example,

prompt$ ./jgit log --author Matthias --grep tycho master
commit 482442b599abf75b63b397680aaff09c4e48c0ed
Author: Matthias Sohn <[email protected]>
Date:   Fri Oct 08 10:58:52 2010 +0200

    Update build to use tycho 0.10.0

will show you all commits in the "master" branch, where the author name matches "Matthias" and the commit messages contains the word tycho. More search criteria to filter the commit log, such as committer name, can be given.

Graphical History View

Finally, to show some of the graphical capabilities of JGit, we will end this small tour by launching the graphical log tool.

prompt$ ./jgit glog

This should give you a window with the revision graph plotted to the left and three columns containing the first line of the message, the author name, and the commit date.

Image:Jgit-glog.png

Concepts

API

Repository

A Repository holds all objects and refs used for managing source code.

To build a repository, you invoke flavors of RepositoryBuilder.

FileRepositoryBuilder builder = new FileRepositoryBuilder();
Repository repository = builder.setGitDir(new File("/my/git/directory"))
  .readEnvironment() // scan environment GIT_* variables
  .findGitDir() // scan up the file system tree
  .build();

Git Objects

All objects are represented by a SHA-1 id in the Git object model. In JGit, this is represented by the AnyObjectId and ObjectId classes.

There are four types of objects in the Git object model:

  • blob
    • is used to store file data
  • tree
    • can be thought of as a directory; it references other trees and blobs
  • commit
    • a commit points to a single tree
  • tag
    • marks a commit as special; generally used to mark specific releases

To resolve an object from a repository, simply pass in the right revision string.

ObjectId head = repository.resolve("HEAD");

Ref

A ref is a variable that holds a single object identifier. The object identifier can be any valid Git object (blob, tree, commit, tag).

For example, to query for the reference to head, you can simply call

Ref HEAD = repository.getRef("refs/heads/master");`

RevWalk

A RevWalk walks a commit graph and produces the matching commits in order.

RevWalk walk = new RevWalk(repository);

TODO talk about filters

RevCommit

A RevCommit represents a commit in the Git object model.

To parse a commit, simply use a RevWalk instance:

RevWalk walk = new RevWalk(repository);
RevCommit commit = walk.parseCommit(objectIdOfCommit);

RevTag

A RevTag represents a tag in the Git object model.

To parse a tag, simply use a RevWalk instance:

RevWalk walk = new RevWalk(repository);
RevTag tag = walk.parseTag(objectIdOfTag);

RevTree

A RevTree represents a tree in the Git object model.

To parse a tree, simply use a RevWalk instance:

RevWalk walk = new RevWalk(repository);
RevTree tree = walk.parseTree(objectIdOfTree);

Reference

Porcelain API

While JGit contains a lot of low level code to work with Git repositories, it also contains a higher level API that mimics some of the Git porcelain commands in the org.eclipse.jgit.api package.

Most users of JGit should start here.

AddCommand (git-add)

AddCommand allows you to add files to the index and has options available via its setter methods.

  • addFilepattern()

Here's a quick example of how to add a set of files to the index using the porcelain API.

Git git = new Git(db);
AddCommand add = git.add();
add.addFilepattern("someDirectory").call();

CommitCommand (git-commit)

CommitCommand allows you to perform commits and has options available via its setter methods.

  • setAuthor()
  • setCommitter()
  • setAll()

Here's a quick example of how to commit using the porcelain API.

Git git = new Git(db);
CommitCommand commit = git.commit();
commit.setMessage("initial commit").call();

TagCommand (git-tag)

TagCommand supports a variety of tagging options through its setter methods.

  • setName()
  • setMessage()
  • setTagger()
  • setObjectId()
  • setForceUpdate()
  • setSigned() - not supported yet, will throw exception

Here's a quick example of how to tag a commit using the porcelain API.

Git git = new Git(db);
RevCommit commit = git.commit().setMessage("initial commit").call();
RevTag tag = git.tag().setName("tag").call();

LogCommand (git-log)

LogCommand allows you to easily walk a commit graph.

  • add(AnyObjectId start)
  • addRange(AnyObjectId since, AnyObjectId until)

Here's a quick example of how get some log messages.

Git git = new Git(db);
Iterable`<RevCommit>` log = git.log().call();

MergeCommand (git-merge)

TODO

Ant Tasks

JGit has Ant tasks for some common tasks contained in the org.eclipse.jgit.ant bundle.

To use these tasks:

<taskdef resource="org/eclipse/jgit/ant/ant-tasks.properties">
   <classpath>
     <pathelement location="path/to/org.eclipse.jgit.ant-VERSION.jar"/>
     <pathelement location="path/to/org.eclipse.jgit-VERSION.jar"/>
     <pathelement location="path/to/jsch-0.1.44-1.jar"/>
   </classpath>
</taskdef>

This would then provide git-clone, git-init and git-checkout tasks.

git-clone

<git-clone uri="http://egit.eclipse.org/jgit.git" />

The following attributes are required:

  • uri: the uri to clone from

The following attributes are optional:

  • dest: the destination to clone to (defaults to use a human readable directory name based on the last path component of the uri)
  • bare: true/false/yes/no to indicate if the cloned repository should be bare or not (defaults to false)
  • branch: the initial branch to check out when cloning the repository (defaults to HEAD)

git-init

<git-init />

The following attributes are optional:

  • dest: the path where a git repository is initialized (defaults $GIT_DIR or the current directory)
  • bare: true/false/yes/no to indicate if the repository should be bare or not (defaults to false)

git-checkout

<git-checkout src="path/to/repo" branch="origin/experimental" />

The following attributes are required:

  • src: the path to the git repository
  • branch: the initial branch to checkout

The following attributes are optional:

  • createbranch: true/false/yes/no to indicate whether the branch should be created if it does not already exist (defaults to false)
  • force: true/false/yes/no: if true/yes and the branch with the given name already exists, the start-point of an existing branch will be set to a new start-point; if false, the existing branch will not be changed (defaults to false)

git-add

TODO

Snippets

Finding children of a commit

PlotWalk revWalk = new PlotWalk(repo());
ObjectId rootId = (branch==null)?repo().resolve(HEAD):branch.getObjectId();
RevCommit root = revWalk.parseCommit(rootId);
revWalk.markStart(root);

PlotCommitList<PlotLane> plotCommitList = new PlotCommitList<PlotLane>();
plotCommitList.source(revWalk);
plotCommitList.fillTo(Integer.MAX_VALUE);
return revWalk;

Snippet Collection

There is a collection of ready-to-run JGit code snippets available at https://github.com/centic9/jgit-cookbook

Advanced Topics

Reducing memory usage with RevWalk

The revision walk interface and the RevWalk and RevCommit classes are designed to be light-weight. However, when used with any repository of considerable size they may still require a lot of memory. This section provides hints on what you can do to reduce memory when walking the revision graph.

Restrict the walked revision graph

Try to walk only the amount of the graph you actually need to walk. That is, if you are looking for the commits in refs/heads/master not yet in refs/remotes/origin/master, make sure you markStart() for refs/heads/master and markUninteresting() refs/remotes/origin/master. The RevWalk traversal will only parse the commits necessary for it to answer you, and will try to avoid looking back further in history. That reduces the size of the internal object map, and thus reduces overall memory usage.

RevWalk walk = new RevWalk(repository);
ObjectId from = repository.resolve("refs/heads/master");
ObjectId to = repository.resolve("refs/remotes/origin/master");

walk.markStart(walk.parseCommit(from));
walk.markUninteresting(walk.parseCommit(to));

// ...

Discard the body of a commit

There is a setRetainBody(false) method you can use to discard the body of a commit if you don't need the author, committer or message information during the traversal. Examples of when you don't need this data is when you are only using the RevWalk to compute the merge base between branches, or to perform a task you would have used `git rev-list` with its default formatting for.

RevWalk walk = new RevWalk(repository);
walk.setRetainBody(false);
// ...

If you do need the body, consider extracting the data you need and then calling dispose() on the RevCommit, assuming you only need the data once and can then discard it. If you need to hang onto the data, you may find that JGit's internal representation uses less overall memory than if you held onto it yourself, especially if you want the full message. This is because JGit uses a byte[] internally to store the message in UTF-8. Java String storage would be bigger using UTF-16, assuming the message is mostly US-ASCII data.

RevWalk walk = new RevWalk(repository);
// more setup
Set`<String>` authorEmails = new HashSet`<String>`();

for (RevCommit commit : walk) {
   // extract the commit fields you need, for example:
   authorEmails.add(commit.getAuthorIdent().getEmailAddress());
   commit.dispose();
}

Subclassing RevWalk and RevCommit

If you need to attach additional data to a commit, consider subclassing both RevWalk and RevCommit, and using the createCommit() method in RevWalk to consruct an instance of your RevCommit subclass. Put the additional data as fields in your RevCommit subclass, so that you don't need to use an auxiliary HashMap to translate from RevCommit or ObjectId to your additional data fields.

public class ReviewedRevision extends RevCommit {
   private final Date reviewDate;

   private ReviewedRevision(AnyObjectId idDate reviewDate) {
       super(id);
       this.reviewDate = reviewDate;
   }

   public List`<String>` getReviewedBy() {
       return getFooterLines("Reviewed-by");`
   }

   public Date getReviewDate() {
       return reviewDate;`
   }`

   public static class Walk extends RevWalk {
       public Walk(Repository repo) {
           super(repo);
       }

       @Override
       protected RevCommit createCommit(AnyObjectId id) {
           return new ReviewedRevision(idgetReviewDate(id));
       }

       private Date getReviewDate(AnyObjectId id) {
           // ...
       }
   }
}

Cleaning up after a revision walk

A RevWalk cannot shrink its internal object map. If you have just done a huge traversal of say all history of the repository, that will load everything into the object map, and it cannot be released. If you don't need this data in the near future, it may be a good idea to throw away the RevWalk and allocate a new one for your next traversal. That will let the GC reclaim everything and make it available for another use. On the other hand, reusing an existing object map is much faster than building a new one from scratch. So you need to balance the reclaiming of memory against the user's desire to perform fast updates of an existing repository view.

RevWalk walk = new RevWalk(repository);
// ...
for (RevCommit commit : walk) {
   // ...
}
walk.dispose();`
Clone this wiki locally