Help! Why does git think I have been decapitated?

Or: all about “detached HEAD” in git

You’ve been using git for a while and you’re finally getting the hang of it. Then one day, you get this weird message from git:

You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

Wait, why does git think I have been decapitated? I feel fine.

Technically, nothing is wrong. This message is a warning, not an error. But you’re probably a little confused: it isn’t immediately obvious why you have been warned or what might happen if you fail to heed the warning.

Let’s back up a bit and try to unpack some of the issues here.

How did this happen?

A “detached HEAD” happens when you use git checkout to check out something that isn’t a branch. Most often, this happens when you check out a specific SHA, or a shortcut like HEAD^.

Another common “detached HEAD” scenario is making a small typo when checking out a colleague’s remote branch.

If you’re trying to work on a branch called feature and you type git checkout feature, git will see origin/feature and assume you want to create a local feature branch tracking changes from origin/feature (the equivalent of git checkout -b feature --track origin/feature).

However, if you type git checkout origin/feature, you’ll be in “detached HEAD” – that extraneous origin/ instead leads git to assume that you want something else altogether.
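If you want to see the difference between the two commands for yourself, here is a minimal sketch. It assumes a throwaway pair of repositories in a temporary directory; the folder names upstream and mine and the example identity are made up for illustration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for the shared repository, with a branch called "feature".
git init -q upstream
cd upstream
git config user.email "you@example.com"
git config user.name "Example User"
git commit -q --allow-empty -m "First commit"
git branch feature
cd ..

# Your own clone: so far it only knows "feature" as origin/feature.
git clone -q upstream mine
cd mine

# No local "feature" exists yet, so git creates one tracking origin/feature.
git checkout -q feature
after_branch=$(git rev-parse --abbrev-ref HEAD)

# The extra "origin/" names a remote-tracking ref, not a local branch.
git checkout -q origin/feature
after_remote=$(git rev-parse --abbrev-ref HEAD)

echo "after 'git checkout feature':        $after_branch"
echo "after 'git checkout origin/feature': $after_remote"
```

When HEAD is detached, git rev-parse --abbrev-ref HEAD has no branch name to report, so it prints the literal word HEAD.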

What really is HEAD?

Usually, we say that HEAD is a “pointer”, but that isn’t literally true. HEAD is literally a file inside the .git folder, the folder that lives inside your project folder and makes it a git repository.

Normally, this file contains a single line, which points or links or refers to the currently checked-out branch.

You can inspect this file and see for yourself. Say you have recently used git checkout master. If you now use cat .git/HEAD (cat is a Unix command that prints a file to the screen. I know it makes no sense.) you will see:

ref: refs/heads/master

If you check out a branch called feature and cat .git/HEAD again, you will now see:

ref: refs/heads/feature

So: HEAD is a file. The contents of that file are (more-or-less) a branch name. And that file changes when you check out a different branch.

In “normal operation”, git updates .git/HEAD with the name of the currently-checked-out branch. But then: what happens when you’re in “detached HEAD”?

If you aren’t already, let’s put ourselves in “detached HEAD” and see what happens. Run this command in a git repository with more than three commits and it’ll put you in “detached HEAD”: git checkout HEAD^. You should see the long warning message I included earlier.

Now try cat .git/HEAD and instead of refs/heads, you should see a long alphanumeric code - I got 49aa5ca3fd952ed1e48759f0e842650841d332b1, which just happens to be the SHA identifier for a commit I named “One commit before the most recent commit” (which is what the HEAD^ short syntax expands to become).

So, this is “detached HEAD”: instead of a branch in the form of ref: refs/heads/<branch>, HEAD is “detached” if the contents of .git/HEAD point us at a specific commit by its SHA identifier.
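Here is the same experiment as a script you can run somewhere harmless. It is only a sketch: it assumes a brand-new repository in a temporary directory, and the identity and commit messages are placeholders.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"
git commit -q --allow-empty -m "One commit before the most recent commit"
git commit -q --allow-empty -m "The most recent commit"

# On a branch, .git/HEAD holds a "ref: refs/heads/<branch>" line.
attached=$(cat .git/HEAD)

# Remember which commit HEAD^ refers to, then check it out.
expected=$(git rev-parse HEAD^)
git checkout -q HEAD^

# Detached: .git/HEAD now holds the bare SHA of that commit.
detached=$(cat .git/HEAD)

echo "attached: $attached"
echo "detached: $detached"
```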

Still: why is this a “detached HEAD”? And what’s the big deal anyway?

Why is HEAD ‘attached’ to refs/heads?

What you call “branches”, git internally calls “refs”, or sometimes “heads”. (All branches are “refs”, but not all “refs” are branches. Tags are also “refs”.)

A “ref” is another kind of pointer in git. For git, a branch (or tag) is not just a human-friendly label: the refs stored under “refs/heads” (and the tags stored under “refs/tags”) are indicators of the present, from which git can start to trace backwards to find the entire repository history.

Any individual git commit has no knowledge of its future, but each git commit (except the first) has a record indicating its past - the identifier of the previous commit in history is stored as the commit’s “parent”. (If you think about it, it has to be this way, or git would have to edit older commits whenever you made a new commit later on - it’d be a small nightmare to keep things consistent.)

To see this in action, look at the last three commits in your repository, with the full commit meta-data on display using the command git log -3 --format=raw. You’ll see something like this:

commit 823171e770806ffe05e4698dfcfe14eeec260898
tree 3f3b28fed915450b25fe845bfc0dc6b3ee9ec811
parent 49aa5ca3fd952ed1e48759f0e842650841d332b1
author Joshua Wehner <jaw6@fake.com> 1460487755 -0500
committer Joshua Wehner <jaw6@fake.com> 1460487755 -0500

    The most recent commit

commit 49aa5ca3fd952ed1e48759f0e842650841d332b1
tree a332444208e9700d95cb78948a41da840d796780
parent 6822da008523291da1d559ec5b4f03f75c88e217
author Joshua Wehner <jaw6@fake.com> 1460487755 -0500
committer Joshua Wehner <jaw6@fake.com> 1460487755 -0500

    One commit before the most recent commit

commit 6822da008523291da1d559ec5b4f03f75c88e217
tree f87f39484d60fd23dae45a220c604745d3453a0f
parent 23482b60590927e3939815b366c3ee00073b58dc
author Joshua Wehner <jaw6@fake.com> 1460487755 -0500
committer Joshua Wehner <jaw6@fake.com> 1460487755 -0500

    Two commits before the most recent commit

So, my commit 823171 knows that its parent is 49aa5ca3, which knows that its parent is 6822da0, and so on.
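You can confirm the parent link without reading the whole log. The sketch below prints the raw commit object with git cat-file -p and compares its parent line with the commit that HEAD^ resolves to. It assumes a throwaway repository in a temporary directory, with placeholder identity and messages.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"
git commit -q --allow-empty -m "One commit before the most recent commit"
git commit -q --allow-empty -m "The most recent commit"

# The raw commit object records its parent's identifier...
parent_line=$(git cat-file -p HEAD | grep '^parent ')
# ...which is the same commit that the HEAD^ shortcut resolves to.
resolved=$(git rev-parse HEAD^)

echo "$parent_line"
echo "HEAD^ is $resolved"

# The very first commit has no parent line at all.
# (grep -c exits non-zero when it counts 0 matches, hence "|| true".)
root_parents=$(git cat-file -p HEAD^ | grep -c '^parent ' || true)
echo "parent lines in the first commit: $root_parents"
```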

Commits themselves are also stored as files. They can also be found inside the .git folder, in .git/objects. The objects folder is further split into sub-folders using the first two digits of the commit identifier, a little like organizing your documents by year (2016, 2015, etc.)
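To find the file for a particular commit, split its identifier after the first two characters. A minimal sketch, assuming a fresh repository in a temporary directory; in an older or busier repository git may have packed the object into .git/objects/pack, in which case the individual file will not be there.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"
git commit -q --allow-empty -m "The most recent commit"

sha=$(git rev-parse HEAD)

# First two characters name the sub-folder, the rest names the file.
dir=$(printf '%s' "$sha" | cut -c1-2)
file=$(printf '%s' "$sha" | cut -c3-)
object_path=".git/objects/$dir/$file"

ls -l "$object_path"

# Ask git what kind of object that identifier refers to.
object_type=$(git cat-file -t "$sha")
echo "$sha is a $object_type stored at $object_path"
```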

Now, imagine that you are git. Your user has just asked you for a list of all the commits in the history (git log). You know they are in .git/objects but there’s a lot of stuff in there. There could easily be millions of objects. And they wouldn’t necessarily be in a helpful order (those first two digits aren’t sequential in any useful way).

But: if you could get your “hands” on some relatively recent commits – like, maybe the most recent work on the active branches – you could start with those and gradually walk backwards from parent to parent, until you had found all the commits in the repository history.

That’s, basically, what refs (branches under “refs/heads” and tags under “refs/tags”) let git do: identify a set of relatively recent commits from which it can work backwards, via parent identifiers, to build a tree of all the commits.
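The backwards walk can be imitated by hand. This sketch starts from the commit a branch points at and follows parent identifiers until there are none left. It assumes a throwaway repository with a straight-line history (one parent per commit, no merges); the variable names are made up for illustration, and real git does this far more efficiently.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"
git commit -q --allow-empty -m "Two commits before the most recent commit"
git commit -q --allow-empty -m "One commit before the most recent commit"
git commit -q --allow-empty -m "The most recent commit"

# Start from the commit that the only branch in refs/heads points at.
sha=$(git for-each-ref --format='%(objectname)' refs/heads)

walked=0
while [ -n "$sha" ]; do
  git log -1 --format='%h %s' "$sha"
  walked=$((walked + 1))
  # %P prints the parent identifier; it is empty for the first commit.
  sha=$(git log -1 --format=%P "$sha")
done

echo "walked $walked commits; git counts $(git rev-list --count HEAD)"
```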

In fact, if there were a commit that fell outside of that tree structure, that commit would be considered by git to be “unreachable” - it wouldn’t show up in git log and it would eventually be removed from .git/objects.

What might cause a commit to fall off the tree and be “unreachable”?

  • Deleting a branch with un-merged work (git branch -D <branch>; the lowercase -d refuses to delete a branch that hasn’t been merged) - those commits no longer have a “refs/heads” entry indicating a commit in the “present”, so they become “unreachable” when the branch is deleted

  • Resetting branch history - a command like git reset HEAD^ moves that branch’s “refs/heads” pointer to an earlier commit, making any commits in between “unreachable” by git

And, most significantly for our purposes: Making a new commit after “detached HEAD”
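To watch a commit become unreachable, here is a sketch of the deleted-branch case. It assumes a throwaway repository in a temporary directory; the branch name experiment is made up for illustration.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"
git commit -q --allow-empty -m "Shared history"
main_branch=$(git symbolic-ref --short HEAD)

# Do some work on a side branch and never merge it.
git checkout -q -b experiment
git commit -q --allow-empty -m "Un-merged work"
orphan=$(git rev-parse HEAD)
git checkout -q "$main_branch"

# While the branch exists, the commit shows up in the full history.
before=$(git log --all --format=%H | grep -c "$orphan" || true)

# -D forces the deletion even though the work is un-merged.
git branch -D experiment >/dev/null

# Nothing points at the commit any more, so the log cannot find it...
after=$(git log --all --format=%H | grep -c "$orphan" || true)
# ...even though the object itself has not been cleaned up yet.
still_stored=$(git cat-file -t "$orphan")

echo "listed before delete: $before, after delete: $after"
echo "object still stored as: $still_stored"
```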

The risk of “detached HEAD” is accidentally creating unreachable commits

This is why git is trying to warn you about “detached HEAD”: there’s nothing wrong with git checkout <SHA> as long as you don’t make any new commits over there. If you treat “detached HEAD” like being in “read-only mode”, you’ll be fine.

If you do make new commits, those commits will become unreachable when you switch away to another branch – that is, when .git/HEAD is no longer pointing at your new commits, git has nothing pointing to those commits.
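Here is that situation as a sketch, assuming a throwaway repository in a temporary directory with placeholder commit messages. It makes a commit while detached, switches back to the branch, and then asks git which branches contain the new commit.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"
git commit -q --allow-empty -m "One commit before the most recent commit"
git commit -q --allow-empty -m "The most recent commit"
main_branch=$(git symbolic-ref --short HEAD)

# Detach HEAD and commit there.
git checkout -q HEAD^
git commit -q --allow-empty -m "This commit will be unreachable soon"
stray=$(git rev-parse HEAD)

# While HEAD still points at it, the commit is easy to find.
while_detached=$(git log --format=%H | grep -c "$stray" || true)

# Switch away (-q keeps the output short).
git checkout -q "$main_branch"

# No branch contains the commit, so this prints nothing.
containing=$(git branch --contains "$stray")

echo "found while detached: $while_detached"
echo "branches containing it afterwards: '${containing}'"
```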

Maybe you were trying to solve a mysterious bug. You know when the bug came in, maybe even the exact commit, but it’s not clear what exactly is wrong. You thought it would help if you could “see” exactly what the code looked like back on Friday - maybe you could load the project with a debugger attached, try to pin-point the problem.

You did a git checkout <bad commit SHA>, attached the debugger, ran some tests and voila! you’ve figured it out. You write the new, fixed code, commit it and you’re done.

Except: git checkout <bad SHA> detached HEAD. You forgot you were in “detached HEAD” when you made your fix commits. Now there isn’t a branch (or tag) that points to these commits (or anything in their future), so they will become unreachable as soon as you git checkout <away>, and git will eventually remove them from your repository.

If you are using git on the command-line, modern versions of the git commands will try to warn you about leaving these unreachable commits behind. If I make a commit “This commit will be unreachable soon” on a “detached HEAD” and then try to git checkout master, I’ll see:

Warning: you are leaving 1 commit behind, not connected to
any of your branches:

  c10b6d2 This commit will be unreachable soon

If you want to keep it by creating a new branch, this may be a good time to do so with:

 git branch <new-branch-name> c10b6d2

Switched to branch 'master'

This lengthy warning is trying to tell me how to prevent these commits from becoming lost in the shuffle. If I create a new branch pointing to the commit, or the latest of these commits, the branch will maintain these commits as “reachable”. The command git branch <new-branch-name> <SHA> will create a new branch and point it specifically at the given SHA identifier.
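The rescue can be sketched end to end like this. It assumes a throwaway repository in a temporary directory; the branch name rescued is made up for illustration, and in real life you would copy the SHA from git’s warning instead of saving it in a variable.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"
git commit -q --allow-empty -m "One commit before the most recent commit"
git commit -q --allow-empty -m "The most recent commit"
main_branch=$(git symbolic-ref --short HEAD)

# Make a commit on a detached HEAD, then leave it behind.
git checkout -q HEAD^
git commit -q --allow-empty -m "This commit will be unreachable soon"
stray=$(git rev-parse HEAD)
git checkout -q "$main_branch"

# Point a new branch at the stray commit to make it reachable again.
git branch rescued "$stray"

rescued_tip=$(git rev-parse rescued)
subject=$(git log -1 --format=%s rescued)

echo "rescued now points at $rescued_tip"
echo "its latest commit is: $subject"
```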

So, let’s summarize what we know

  • Most of the time, git stores the name of the current branch in .git/HEAD.
  • When you check out something that is not a branch, a specific commit SHA identifier is stored in .git/HEAD instead – this is what puts you in “detached HEAD”.
  • Commits only know the identifier of the previous commit or “parent”; they don’t know about the future.
  • Git uses refs such as branches (“refs/heads”) and tags as indicators of “the present”, from which it can work backwards to build the repository history.
  • Commits which fall outside of that backwards-working history are “unreachable”.
  • The “detached HEAD” state is dangerous because there is a risk that new commits become unreachable when we leave them behind by checking out another branch.
  • Git will try to warn us about “detached HEAD” and also about leaving commits behind.
  • You can avoid losing commits by creating a new branch pointing at the last of the would-be-unreachable commits.

And that’s all there is to keeping your HEAD attached in git!

