CAUTION: This post is primarily a note-to-self - something I can refer back to when I need to remember how this is done.
The other day, I made a comment on Twitter about an idea for an interview question regarding the treatment of Pull Requests (PRs) that contained unexpected white-space edits. This kind of scenario comes up all the time when you work at a company like InVision, where engineers are constantly floating from team to team and repository to repository, bring with them all of their own personal preferences and peccadillos. The tweet drew a variety of responses, from the trolling and humorous to the practical and inspiring. Which is, to me, exactly what an interview question should do: provide people with an opportunity to express their personality, thought process, and technical know-how. What I didn't expect, however, was to walk away from the conversation feeling inspired to learn; but, that's what's so great about working with other people - opportunities to inspire and teach are everywhere.
If I were asked this question in an interview, the best answer I could come up with would be to use the "?w=1" feature of GitHub (which hides white-space-only changes from the list of edits). And, in fact, in real life, this is what I do. For the most part, I try to avoid superflous edits in a PR (Pull Request); but, sometimes, this is unavoidable, especially when you wrap a chunk of code in a Try/Catch or a Lock or a Conditional Statement or something else that changes a large swath of indentation. In such cases, I'll offer up the "?w=1" flag as the most reasonable way to view the PR:
NOTE: While "?w=1" is an amazing tool, it does have a huge limitation in GitHub; at the time of this writing, you can't leave per-line comments in a PR that is hiding white-space changes. As such, if you see something you want to comment-on, you have to change the URL, reload the page, find the original line, and leave your comment. Hopefully they fix this in the future.
After discussing all of this on Twitter, however, I became dissatisfied with my own answer. Inspired by others, I wanted to take this as an opportunity to learn more about Git; and, about how I might be able to rewrite a PR such that white-space changes are moved to their own commit (or removed entirely from the change-set). It turns out, Git makes this possible with interactive rebasing and "patching".
The "TLDR" (Too Long, Didn't Read) of this is that you can interactively rebase the contents of a Pull Request, asking git to "edit" one of the commits. Then, when editing said commit, you can reset the branch and then re-add the changes using the "patch" flag. This will iterate through the "hunks" of code, prompting you to either keep (stage) a hunk; or, leave it in the working tree. You can then commit the staged changes and continue the rebase.
Now, let's change a single line of logic in that file:
Notice that we've only changed the console.log() statement. Or, so we think. If we look at the delta between the current commit and the previous commit, you can see that superfluous white-space edits have snuck in:
To move these white-space edits to their own commit, we can interactively rebase on the previous commit:
git rebase -i head~
This will prompt us with a list of commits that include our one commit:
pick 3a981c3 Changed console logging.
If we change "pick" to "edit":
edit 3a981c3 Changed console logging.
... then the rebasing process will stop on the selected commit and allow us to modify it. In this case, we want to take the file modification and reset it ("mixed mode" by default) to the previous commit such that the index / staging is rolled-back but the working tree is kept as-is with pending changes:
Now, we can add our file back to the staging area using "-p" or "--patch":
git add -p index.js
Git will then start iterating through the "hunks", asking us how and if we want to stage the changes:
As you can see, we have a whole host of options to choose from. If we enter the "?" response, Git will explain what each option means:
- y - stage this hunk
- n - do not stage this hunk
- q - quit; do not stage this hunk or any of the remaining ones
- a - stage this hunk and all later hunks in the file
- d - do not stage this hunk or any of the later hunks in the file
- g - select a hunk to go to
- / - search for a hunk matching the given regex
- j - leave this hunk undecided, see next undecided hunk
- J - leave this hunk undecided, see next hunk
- k - leave this hunk undecided, see previous undecided hunk
- K - leave this hunk undecided, see previous hunk
- s - split the current hunk into smaller hunks
- e - manually edit the current hunk
- ? - print help
In this case, we're going to use "s" to split the hunks. At this point, Git will start iterating through the smaller blocks of contiguous changes, asking the same questions (less the option to "s" split any further). We're going to use "n" for "do not stage" for each white-space hunk until we get to our console.log() change:
Once we've said "y" (yes) to the console.log() change, we can select "d" (deny rest of hunks) to jump back into the command-line. At this point, we've staged our single-line change, leaving our white-space edits in the working tree:
At this point, we can commit the console.log() change to the local repository:
git commit -m "Updating console logging."
Now, the only uncommitted and unstaged changes in our working tree will be our white-space only edits. We could choose to remove them from the commit-history altogether by resetting the current branch, mid-rebase:
git reset --hard
... or, we could simply stage them and commit them. Since the original goal here was to split them out into their own commit, that's what we'll do:
git add index.js
git commit -m "Fixing white-space."
At this point, we're ready to move on with our rebase:
git rebase --continue
Since we only had the one edit operation, this "--continue" action will complete the rebase. If we now examine the commit history, we can see that our console.log() update is in one commit and our white-space edits are in a separate commit:
And there we go! We've used an interactive rebase to rewrite the git commit history, moving the white-space changes to their own commit. Now, when your teammate goes to review the PR (Pull Request), they can quickly focus in on the commit that contains the business logic changes, ignoring the commit that contains the stylistic changes.
Of course, all things in moderation. It's up to each of us to know when this kind of clean-up is worth while.
Special thanks to John Kary who has a great "Intro to git add patch mode tutorial" on YouTube:
This really feels like it should be scriptable, which would be awesome for git tools and plugins:
- Start the rebase
- Pick the last commit - or why not all since the last push?
- For each commit, split the hunks and check each whether it is a whitespace-only change or not and accept or reject as described.
- re-commit the non-whitespace and the whitespace commits
- do a little dance
Would be pretty cool to have this automated in some way.
It's funny you mention that -- I was just sort of hoping that you could use the "difftool" for something like this. It feels like it would be perfect. One one side, it could show you the current content of the working tree; and on the other side (of the difftool UI), it would be the "staged" content. Then, you could simply move the portions of the current tree over to the staged tree.
I ran across some discussion thread somewhere in which someone else suggested something similar; and, apparently this would actually be really complicated because git would have to save temporary copies of the code or something. I didn't really follow -- my git skills aren't really all that good.