How I use Jujutsu
Over my career in software engineering, I have had a love/hate relationship with version control. In the early days it was dealing with all the idiosyncrasies of SVN. Then git came along and I pretty much never looked back. However, I never really bought into all the power features of git. I relied heavily on forges like GitHub or GitLab to manage the life cycle of a project. I never really worried about history because I so rarely had to look back. When I did, it was largely to figure out where something went wrong, not to back out part of a change. When I did need to back out a change, I usually leveraged my understanding of the codebase instead of my understanding of git. That often meant copying out a change at one commit, maybe modifying it, and then reapplying it later on in the history.
My units of work were consistently branches with many meaningless commits mingled among partial commits of actual meaning. I often made huge sweeping changes that took weeks to land and often changed multiple areas of code to the chagrin of my collaborators. Conflicts were inevitable as these work efforts took longer and got bigger. I swore by the Squashed Merge Commit and looked sideways at anyone who told me “Rebasing is better, you know?” Well, I didn’t know. Every time I rebased one of my massive PRs, I spent an inordinate amount of time dealing with conflicts that I could not figure out by myself how to resolve. Merging made these problems go away. Well, most of the time.
To be fair, I acknowledge that these are bad habits. It's just that git really encourages programmers to lean into these bad habits.
My journey to correcting these bad habits started when I discovered the jj-vcs project. At first, I mainly looked at it as a tool to make amending commits a core part of the workflow, but as I dug into it and the ecosystem around it, it became much more of a paradigm shift. There are many blogs that focus on how JJ works, so I will only touch on some of the high-level functionality and instead focus on how I use it personally. You can read a much more in-depth exploration of JJ here.
Commits vs Branches
As I said above, I was always committing all day long into a branch, so that by the time it was time to merge, I would have hundreds of commits. Many of them were basically worthless. You know, “fixed typo” type commits. Nothing that anyone would need to look back on to see if there was a hidden regression. As a result, I relied exclusively on squashing my branch when merging a PR. Again, as said above, this resulted in much bigger PRs than may have originally been intended.
How JJ changes this is by making the commit the unit of work. It achieves this by always having your working copy checked in. Yes, there is no more `git commit -asm "did stuff"`. It's just already checked in the moment you make the change. Now, the theoretical loss of functionality here is that since you are no longer making discrete commits, you are left with one giant commit and no way of moving backwards and forwards within your unit of work. If you do some exploratory code spelunking and come to realize that the giant refactor you are doing in the middle of your branch is not going to work, you want a way out. With git you would use incremental commits to give yourself breadcrumbs to lead back to a known good state. How does it work if you are always checked in?
Jujutsu mitigates this with the Operation Log. Almost every interaction with JJ creates a snapshot under the hood. Look at a log of changes, snapshot. Describe a commit, snapshot. JJ also provides tooling to interact with the Operation Log. So if you want to back out changes that were made 3 hours ago, you can, while still maintaining a single commit as your unit of work. In fact, you can even undo any operation you have performed. If you go ahead and rebase, then fix conflicts, then make a bunch of changes, and then decide you’ve made a huge mistake, you can just undo them. It's incredible.
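As a rough illustration of the Operation Log, here is a minimal sketch that assumes a reasonably recent `jj` on your PATH. The scratch repository, the function name `demo_undo`, and the description text are all made up for the example, and `<operation-id>` is a placeholder you would copy from the `jj op log` output.

```shell
#!/bin/sh
# Sketch only: exercise the operation log in a throwaway repo.
# demo_undo and the commit description are hypothetical example names.
demo_undo() {
  # Skip quietly when jj is not available on this machine.
  command -v jj >/dev/null 2>&1 || { echo "jj not installed; skipping"; return 0; }
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    jj git init . >/dev/null 2>&1 || exit 1
    # Describing the working-copy commit is recorded as an operation.
    jj describe -m "feat: exploratory refactor" >/dev/null 2>&1 || exit 1
    # List the recorded operations, newest first.
    jj op log
    # Roll back the most recent operation (the describe above).
    jj undo >/dev/null 2>&1
    # To jump further back, restore the repo to an earlier operation:
    # jj op restore <operation-id>
  )
  status=$?
  rm -rf "$repo"
  return "$status"
}

demo_undo
```

The point of the sketch is that nothing here needed an intermediate commit: the description change was recorded and reverted purely through the operation log.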
Stacked PRs
Now that I was focusing on making each commit meaningful, it got me thinking about how to be a better collaborator. Enter Stacked PRs. Breaking PRs into smaller consumable parts is nothing new; however, Stacked PRs codify how you should approach breaking up your code, specifically by creating a branch of PRs that flow into each other. Conversely, you can also write your code in such a way that any PR can land in any order, which is awesome, but often harder to accomplish depending on the code base.
In the figure below you can see a “simple” branch model for articulating a change that is broken up into three PRs. Each PR can be reviewed independently, though reviews should start at the base and work out, i.e., pr1 -> pr2 -> pr3. Then once reviews are in, they should be merged from the outermost in, i.e., pr3 -> pr2 -> pr1.
gitGraph LR:
    commit
    commit
    branch pr1
    checkout pr1
    commit id: "database"
    commit id: "review feedback pr1"
    branch pr2
    checkout pr2
    commit id: "service"
    commit id: "review feedback pr2"
    branch pr3
    checkout pr3
    commit id: "UI"
    commit id: "review feedback pr3"
    checkout pr2
    merge pr3
    checkout pr1
    merge pr2
    checkout main
    merge pr1
    commit id: "work continues"
While this is great for reviews, it's considerable overhead for the developer orchestrating all of the reviews. For this reason, I like to use Super Pull Requests[1] (SPR) to manage the life cycle of a PR. SPR manages this by creating shadow branches behind the scenes. It ensures that the base PR is diffed against main, but all dependent PRs are diffed against a synthetic branch that includes main, but also the previous PR. It then ensures that the base PR lands before it can land any dependent PRs.
As I mentioned above, JJ leans into a commit being the unit of work. So we are generally always working on a single “commit,” but in PRs, we want to be able to see changes. So SPR manages the branch-based workflow to show reviewers the changes made to a PR, while also keeping JJ’s commit-based workflow by landing the PR as a single rebased commit. It ends up looking more like this:
gitGraph LR:
    commit
    commit
    branch spr
    checkout spr
    commit id: "database"
    commit id: "review feedback spr"
    checkout main
    merge spr
    checkout spr
    commit id: "service"
    commit id: "review feedback pr2"
    checkout main
    merge spr
    checkout spr
    commit id: "UI"
    commit id: "review feedback pr3"
    checkout main
    merge spr
    commit id: "work continues"
The above diagram is a gross oversimplification of the process, but from a conceptual point of view, it is accurate. Under the hood, there are 7 branches that get spun up to facilitate this process. However, at no point does anyone have to care about them and they all get cleaned up in the process.[2]
Conventional Commits & Release Notes
Now that the tooling for working with a “commit as the unit of work” workflow is in place, I wanted to make the commits valuable as a record of what work was done. Enter Conventional Commits. They’ve been around for years and have been adopted by many teams and projects, but I never prioritized using them. However, now was a great time for the change. To facilitate this with JJ+SPR, I adopted Koji to enforce the structure. You can see below how the workflow works, but it basically boils down to filling out a form with the required info when writing a commit message.
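To make the structure concrete, here is a small sketch of a check for the `type(scope)!: description` header that Conventional Commits describes. It is not how Koji works internally; the function name `is_conventional`, the list of accepted types, and the sample messages are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch only: succeed when the first line of a message looks like
# "type(scope)!: description". The accepted type list is an assumption.
is_conventional() {
  printf '%s\n' "$1" | head -n 1 |
    grep -Eq '^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([a-z0-9_-]+\))?!?: .+'
}

# Hypothetical sample messages.
is_conventional "feat(db): add users table" && echo "ok"
is_conventional "fixed typo" || echo "rejected"
```

Run as a script, this prints `ok` for the first message and `rejected` for the second, which is exactly the kind of “fixed typo” commit the structure is meant to weed out.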
With commit history being valuable, now it was time to aggregate the data so that it could be published when needed. For this I chose to use git-cliff. With git-cliff your commits become your release notes. When added to a release pipeline, your commits are organized by type, linked to issues if links exist, and marked so the user can see when a change is breaking. Historically this was a painful process of cleaning up git history and making it look like release notes. Now, it's 99% ready for consumption and easily updated if not.
Workflow
# I generally init with the `--colocate` flag so that other tooling that expects a `git` repo still works.
# `koji` is specifically designed to work with `git` repos, so it actually
# commits when interacting with `git`. In this case, we only want to describe
# the revision, so we use the `--stdout` flag.
Then do work. Sometimes, particularly when a work effort spans multiple days, I will use the described commit as a “branch.” I will then use an undescribed revision as my working copy.
# do more work
`jj squash` in the above block just squashes whatever is in the current working copy into its parent.
When ready to initiate a pull request:
# if you are working on your described revision. This will create a new
# working copy and let you revise your commit message. `jj new` will create a new
# working copy without editing the commit message
# if you are already on a new working-copy, jump straight to `spr`
This will create a new working copy off of your previous revision that contains your changes. Then it initiates a pull request with the revision containing your changes as the initial commit.
Deal with PR feedback in your current empty new working copy.
Each time `spr diff` is issued, a new commit is added to the PR. These commits are purely for the reviewers to be able to see what has changed in the PR process. Everything will be squashed into a single commit at the end of this process. Repeat this as many times as needed.
If the git forge has made changes to the PR, like GitHub adding text to the description, you may have to amend your working copy.
Once all feedback has been dealt with, you can “land” the PR. This squashes all commits created by the `spr diff` command. It then completes the PR on the git forge by rebasing all your changes onto main with your original title and description.
If you have multiple dependent PRs, rebasing after landing one of the PRs is a simple process.
One issue with SPR is that when it completes a PR, it creates a new revision, orphaning your original commit. To deal with this, I create a new working copy off of main, then abandon my original commit.
This cleans up your local tree so that the old revision no longer shows up in the log.
Command Line Aliases
I use the fish shell, so I create many abbreviations to help with the fairly clear, but verbose, API.
# Jujutsu
# I use `jj` to exit insert mode