To users, Git is a version control system (VCS), but to its designers it is a content-addressable filesystem. At its core, Git is essentially a key-value store: you can insert any kind of data, and Git returns a unique key (the hash of that data) that you can later use to fetch the object.
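To make the "hash table" idea concrete, here is a minimal Python sketch of a content-addressable store. The class name `ContentStore` and its methods are invented for illustration; Git's real object database adds a header, compression and on-disk storage, which this toy version leaves out.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (not Git's real implementation)."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content itself, so identical
        # content always maps to the same key.
        key = hashlib.sha1(data).hexdigest()
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


store = ContentStore()
key = store.put(b"any kind of data")
print(key, store.get(key))
```

Storing the same bytes twice returns the same key, which is why Git never needs to store two copies of identical content.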
Git is designed like a bathroom: it has porcelain commands and plumbing commands. You interact with the porcelain, but underneath there is plumbing. Porcelain is for humans, plumbing is for scripts and tools, and porcelain is built on top of plumbing. There's a section towards the end called "bottoms up git", where we will learn how to create a commit using only plumbing commands.
Git commands are a leaky abstraction over the data storage. You tell Git that you want to save a snapshot of your project and it basically records a manifest of what all of the files in your project look like at that point. Git is more like a mini filesystem with some incredibly powerful tools built on top of it, rather than simply a VCS.
We will explore 5 famous git commands.
Apart from that, we will also explore the HEAD pointers.
This is what an empty Git repository looks like after running git init. We will focus only on HEAD, objects/, refs/, heads/ and tags/.
    $ git init
    $ tree .git
    .git
    ├── HEAD
    ├── config
    ├── description
    ├── hooks
    │   ├── applypatch-msg.sample
    │   ├── commit-msg.sample
    │   ├── fsmonitor-watchman.sample
    │   ├── post-update.sample
    │   ├── pre-applypatch.sample
    │   ├── pre-commit.sample
    │   ├── pre-merge-commit.sample
    │   ├── pre-push.sample
    │   ├── pre-rebase.sample
    │   ├── pre-receive.sample
    │   ├── prepare-commit-msg.sample
    │   ├── push-to-checkout.sample
    │   └── update.sample
    ├── info
    │   └── exclude
    ├── objects
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags
Create a new file and put some content in it.
    # Single quotes keep the inner double quotes intact.
    $ echo 'console.log("Hello World");' > main.js && cat main.js
    console.log("Hello World");
After git add, a new object is created: ac/cefceba62b4874a613a2336de33ee716e99931.
    $ git add main.js
    $ tree .git
    .git
    ├── HEAD
    ├── objects
    │   ├── ac
    │   │   └── cefceba62b4874a613a2336de33ee716e99931
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags
The object name is a SHA-1 hash, and we can refer to it by an abbreviated prefix such as acce. However, the directory structure is slightly odd. Why is there a subdirectory?
A repository can easily accumulate tens of thousands of objects, and filesystems do not cope well with a very large number of files in a single directory. To keep things manageable, Git uses the first two characters of the hash as a directory name and the remaining 38 characters as the file name.
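The directory layout and the abbreviated hash can both be sketched in a few lines of Python. The helper names `loose_object_path` and `resolve_prefix` are hypothetical, and the lookup only loosely mirrors what Git does when it expands a short hash.

```python
from pathlib import PurePosixPath

# The only object in the repository so far.
OBJECT_IDS = ["accefceba62b4874a613a2336de33ee716e99931"]


def loose_object_path(object_id):
    # First two hex characters name the directory, the remaining 38 the file.
    return PurePosixPath(".git/objects") / object_id[:2] / object_id[2:]


def resolve_prefix(prefix, object_ids):
    # Hypothetical helper: expand an abbreviated id to the full one.
    if len(prefix) < 4:
        raise ValueError("prefix must be at least 4 characters")
    matches = [oid for oid in object_ids if oid.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError(f"prefix {prefix!r} matches {len(matches)} objects")
    return matches[0]


full_id = resolve_prefix("acce", OBJECT_IDS)
print(loose_object_path(full_id))
```

A prefix only works while it is unambiguous, so in a large repository a longer prefix may be required.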
Git ships with a really convenient plumbing command, cat-file, for inspecting objects. Let's look at the object acce. The command accepts any unambiguous prefix of the hash that is at least 4 characters long; in a repository this small, 4 characters are enough. The object holds the content of main.js, and its type is blob. Blob, short for "binary large object", is the Git object type for storing file contents. When we git add a file, Git creates a blob object for it.
    $ git cat-file -p acce
    console.log("Hello World");

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    $ git cat-file -t acce
    blob
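What cat-file reads from disk is a zlib-compressed file whose payload starts with a small header: the object type, a space, the content size and a NUL byte. The Python sketch below builds such a payload in memory and decodes it again; the function names are made up for this example, and no real repository is touched.

```python
import zlib


def encode_loose_object(obj_type: str, content: bytes) -> bytes:
    # On disk: zlib("<type> <size>\0<content>")
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return zlib.compress(header + content)


def decode_loose_object(raw: bytes):
    # Split on the first NUL only; the content itself may contain NUL bytes.
    header, _, content = zlib.decompress(raw).partition(b"\x00")
    obj_type, size = header.decode().split(" ")
    assert int(size) == len(content)
    return obj_type, content


raw = encode_loose_object("blob", b'console.log("Hello World");\n')
obj_type, content = decode_loose_object(raw)
print(obj_type)          # the part `cat-file -t` reports
print(content.decode())  # the part `cat-file -p` reports for a blob
```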
    # commit the changes after adding.
    $ git commit -m "first commit"

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # look at the contents of the git directory.
    $ tree .git
    .git
    ├── COMMIT_EDITMSG
    ├── HEAD
    ├── logs
    │   ├── HEAD
    │   └── refs
    │       └── heads
    │           └── master
    ├── objects
    │   ├── 26
    │   │   └── c7fccd29746f6775d8f291c6e0bbdfba6a4aac
    │   ├── 8e
    │   │   └── 62e9859f9e0283f159a0a94a6ea7a7372e9b56
    │   ├── ac
    │   │   └── cefceba62b4874a613a2336de33ee716e99931
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        │   └── master
        └── tags

    14 directories, 25 files
After the commit, we have two new objects, 26c7 and 8e62. One is a tree object and the other is a commit object. The tree is created first, and then the commit object that refers to it. While a tree represents a particular state of the working directory, a commit places that state in "time" and explains how to get there. The commit object contains the hash of the root tree, the parent commit hash (if any), the author, the committer, the date and the message. We'll come back to the tree later. Let's explore the commit object 26c7.
The commit object refers to a tree object by its hash. That tree, 8e62, is the other object that was created. We can check the type of any Git object using git cat-file -t; for 26c7 it returns commit.
    $ git cat-file -p 26c7
    tree 8e62e9859f9e0283f159a0a94a6ea7a7372e9b56
    author Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656713957 +0200
    committer Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656713957 +0200

    first commit

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    $ git cat-file -t 26c7
    commit
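The commit payload is plain text: a few "key value" header lines, a blank line, and then the message. A rough Python sketch of a parser follows. `parse_commit` is a name chosen for this example, and it ignores multi-line headers such as signatures.

```python
COMMIT_TEXT = """\
tree 8e62e9859f9e0283f159a0a94a6ea7a7372e9b56
author Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656713957 +0200
committer Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656713957 +0200

first commit
"""


def parse_commit(text: str) -> dict:
    # Headers and message are separated by the first blank line.
    headers, _, message = text.partition("\n\n")
    commit = {"parents": [], "message": message}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # a merge commit has several
        else:
            commit[key] = value
    return commit


commit = parse_commit(COMMIT_TEXT)
print(commit["tree"], commit["parents"], commit["message"].strip())
```

The first commit in a repository has no parent line at all, which is why `parents` comes back empty here.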
The commit object references the tree 8e62e9859f9e0283f159a0a94a6ea7a7372e9b56. Trees map file names to content and to other trees. A tree stores file names and groups files together, much like a directory. Git stores content in a manner similar to a UNIX filesystem, but a bit simplified. All the content is stored as tree and blob objects, with trees corresponding to UNIX directory entries and blobs corresponding more or less to inodes or file contents. A single tree object contains one or more entries, each of which is the SHA-1 hash of a blob or subtree with its associated mode, type, and filename.
This is what a tree object looks like.
    # -p pretty-prints the object; cat-file needs either a type or an option such as -p.
    $ git cat-file -p 8e62
    100644 blob accefceba62b4874a613a2336de33ee716e99931 main.js

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    $ git cat-file -t 8e62
    tree
The tree object contains one line per file or subdirectory, with each line giving the file mode (100644), the object type (blob), the object hash (acce...) and the filename (main.js). The object type is either "blob" for a file or "tree" for a subdirectory.
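The lines that cat-file -p prints are a rendering of a more compact binary layout: each entry is the mode, a space, the file name, a NUL byte and the 20 raw bytes of the hash. The Python sketch below serializes and re-parses such entries. The function names are illustrative, sorting by plain name is a simplification of Git's ordering rule, and submodule entries are ignored.

```python
def serialize_tree(entries):
    # entries: list of (mode, name, hex_object_id)
    body = b""
    for mode, name, object_id in sorted(entries, key=lambda e: e[1]):
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(object_id)
    return body


def parse_tree(body: bytes):
    lines = []
    pos = 0
    while pos < len(body):
        nul = body.index(b"\x00", pos)
        mode, name = body[pos:nul].decode().split(" ", 1)
        object_id = body[nul + 1:nul + 21].hex()
        # The type is not stored; it follows from the mode (40000 is a directory).
        obj_type = "tree" if mode == "40000" else "blob"
        lines.append(f"{mode.zfill(6)} {obj_type} {object_id}\t{name}")
        pos = nul + 21
    return lines


body = serialize_tree(
    [("100644", "main.js", "accefceba62b4874a613a2336de33ee716e99931")]
)
print(parse_tree(body)[0])
```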
We have a file refs/heads/master, the branch reference (head) for master, which points to the latest commit on that branch. Each branch you create gets its own reference file.
    # content of master pointer
    $ cat .git/refs/heads/master
    26c7fccd29746f6775d8f291c6e0bbdfba6a4aac

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # HEAD and master
    $ git log
    commit 26c7fccd29746f6775d8f291c6e0bbdfba6a4aac (HEAD -> master)
    Author: Shubham Srivastava <shbm@Shubhams-MacBook-Air.local>
    Date:   Sat Jul 2 00:19:17 2022 +0200

        first commit
git checkout -b feature creates a new branch and switches to it. It also creates a new reference, refs/heads/feature. At the moment of branching, the feature branch points to the same commit as master. We can verify this by looking at the reflog in .git/logs/refs/heads/feature.
    $ git checkout -b feature
    $ tree .git
    .git
    ├── COMMIT_EDITMSG
    ├── HEAD
    ├── logs
    │   ├── HEAD
    │   └── refs
    │       └── heads
    │           ├── feature
    │           └── master
    ├── objects
    │   ├── 26
    │   │   └── c7fccd29746f6775d8f291c6e0bbdfba6a4aac
    │   ├── 8e
    │   │   └── 62e9859f9e0283f159a0a94a6ea7a7372e9b56
    │   ├── ac
    │   │   └── cefceba62b4874a613a2336de33ee716e99931
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        │   ├── feature
        │   └── master
        └── tags

    14 directories, 25 files

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # Where does the feature branch point? At the same commit as master.
    $ cat .git/logs/refs/heads/feature
    0000000000000000000000000000000000000000 26c7fccd29746f6775d8f291c6e0bbdfba6a4aac Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656722666 +0200 branch: Created from HEAD
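Each reflog line records the old commit, the new commit, who made the change, when, and a short message that follows a tab character. A hedged Python sketch of parsing the line above is shown here; the regular expression is written for this example and is not taken from Git's source.

```python
import re

REFLOG_LINE = " ".join([
    "0" * 40,  # old value: all zeros means the ref did not exist before
    "26c7fccd29746f6775d8f291c6e0bbdfba6a4aac",
    "Shubham Srivastava",
    "<shbm@Shubhams-MacBook-Air.local>",
    "1656722666",
    "+0200",
]) + "\tbranch: Created from HEAD"

REFLOG_RE = re.compile(
    r"(?P<old>[0-9a-f]{40}) (?P<new>[0-9a-f]{40}) "
    r"(?P<name>.*) <(?P<email>[^>]*)> "
    r"(?P<time>\d+) (?P<tz>[+-]\d{4})\t(?P<message>.*)"
)

entry = REFLOG_RE.fullmatch(REFLOG_LINE).groupdict()
print(entry["new"], entry["message"])
```

The all-zero old value is how the log shows that the feature ref was created rather than moved.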
Modify main.js to make a change on the feature branch. The file now contains:
    $ cat main.js
    console.log("Hello World");
    console.log("Feature");
Adding the modified file creates a new object. As seen previously, it is a new blob object.
    $ git add main.js
    $ tree .git
    .git
    ├── COMMIT_EDITMSG
    ├── HEAD
    ├── logs
    │   ├── HEAD
    │   └── refs
    │       └── heads
    │           ├── feature
    │           └── master
    ├── objects
    │   ├── 07
    │   │   └── 99851535ee3b53930befa9a383691eaa29ed9d
    │   ├── 26
    │   │   └── c7fccd29746f6775d8f291c6e0bbdfba6a4aac
    │   ├── 8e
    │   │   └── 62e9859f9e0283f159a0a94a6ea7a7372e9b56
    │   ├── ac
    │   │   └── cefceba62b4874a613a2336de33ee716e99931
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        │   ├── feature
        │   └── master
        └── tags

    15 directories, 28 files
The new blob object contains the latest version of main.js.
    # What does the newly created object contain?
    $ git cat-file -p 0799
    console.log("Hello World");
    console.log("Feature");
As seen previously, committing creates two objects: the tree object and the commit object. a150 is the tree and d23a is the commit. The feature branch reference has also moved and now contains the latest commit on the feature branch. And since we're on that branch, HEAD contains a ref to refs/heads/feature.
    $ git commit -m "feature commit"
    $ tree .git
    .git
    ├── COMMIT_EDITMSG
    ├── HEAD
    ├── logs
    │   ├── HEAD
    │   └── refs
    │       └── heads
    │           ├── feature
    │           └── master
    ├── objects
    │   ├── 07
    │   │   └── 99851535ee3b53930befa9a383691eaa29ed9d
    │   ├── 26
    │   │   └── c7fccd29746f6775d8f291c6e0bbdfba6a4aac
    │   ├── 8e
    │   │   └── 62e9859f9e0283f159a0a94a6ea7a7372e9b56
    │   ├── a1
    │   │   └── 50a1687ff7dd85b374a223d99259836fa8a0cd
    │   ├── ac
    │   │   └── cefceba62b4874a613a2336de33ee716e99931
    │   ├── d2
    │   │   └── 3a3ba983a7d4ab08cc47e9a5b8189139e6712a
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        │   ├── feature
        │   └── master
        └── tags

    17 directories, 30 files

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # a150 is the tree object. It references 0799, the blob for main.js with the new edits.
    $ git cat-file -p a150
    100644 blob 0799851535ee3b53930befa9a383691eaa29ed9d main.js

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # d23a is the commit object on the feature branch. It references tree a150 and parent 26c7.
    $ git cat-file -p d23a
    tree a150a1687ff7dd85b374a223d99259836fa8a0cd
    parent 26c7fccd29746f6775d8f291c6e0bbdfba6a4aac
    author Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656714340 +0200
    committer Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656714340 +0200

    feature commit

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # the reference for the feature branch
    $ cat .git/refs/heads/feature
    d23a3ba983a7d4ab08cc47e9a5b8189139e6712a

    # the current HEAD
    $ cat .git/HEAD
    ref: refs/heads/feature
Checking out master changes HEAD: it now refers to refs/heads/master again.
    # Let's change the branch and print HEAD again
    $ git checkout master
    Switched to branch 'master'

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # .git/HEAD now references the master branch again.
    $ cat .git/HEAD
    ref: refs/heads/master
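Resolving HEAD to a commit is therefore a two-step lookup: read .git/HEAD, and if it says "ref: ...", read the file it names. The Python sketch below reproduces this in a throwaway directory. `resolve_head` is a hypothetical helper, and it deliberately ignores packed refs, which real repositories may also use.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic ref: follow it to the branch file.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head  # detached HEAD: the file holds a commit id directly


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "refs/heads/master").write_text(
        "26c7fccd29746f6775d8f291c6e0bbdfba6a4aac\n")
    (git_dir / "refs/heads/feature").write_text(
        "d23a3ba983a7d4ab08cc47e9a5b8189139e6712a\n")

    (git_dir / "HEAD").write_text("ref: refs/heads/feature\n")
    on_feature = resolve_head(git_dir)

    # What `git checkout master` does to HEAD, in effect.
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    on_master = resolve_head(git_dir)

print(on_feature, on_master)
```

Switching branches rewrites one small file; the commits themselves do not move.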
We've learned about the Git object types and have a basic idea of what happens when we execute some common Git commands. Now we will create a new commit using only plumbing commands.
Let's start from a fresh repository whose .git directory looks like this:
    $ tree .git
    .git
    ├── HEAD
    ├── config
    ├── info
    │   └── exclude
    ├── objects
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags
git hash-object computes the object ID for an object of a given type from the contents of the named file (which can be outside of the work tree) or, with --stdin, from standard input. -w optionally writes the resulting object into the object database. When <type> is not specified, it defaults to "blob". Here we create a new blob with the content "Hello World" and write it to the database.
    $ echo "Hello World" | git hash-object --stdin -w
    557db03de997c86a4a028e1ebd3a1ceb225be238

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    $ tree .git
    .git
    ├── HEAD
    ├── config
    ├── info
    │   └── exclude
    ├── objects
    │   ├── 55
    │   │   └── 7db03de997c86a4a028e1ebd3a1ceb225be238
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags
We can verify the contents using cat-file.
    $ git cat-file -p 557d
    Hello World

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    $ git cat-file -t 557d
    blob
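The object ID is not a hash of the raw content alone: Git hashes a short header ("blob", the size and a NUL byte) followed by the content. A small Python sketch using only the standard library reproduces the ID printed above; the function name simply mirrors the command and is not an official API.

```python
import hashlib


def hash_object(content: bytes, obj_type: str = "blob") -> str:
    header = f"{obj_type} {len(content)}".encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


# echo appends a newline, so the blob content is "Hello World\n".
object_id = hash_object(b"Hello World\n")
print(object_id)
# 557db03de997c86a4a028e1ebd3a1ceb225be238
```

Because the ID depends only on the type and the bytes, the same content yields the same blob ID in any repository.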
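git update-index modifies the index (the staging area); each file mentioned in the command is updated there. Here we use --cacheinfo to stage the blob under the name hello without having a file on disk. If we look at the status, it looks strange at first: it shows hello as a new file and also as a deleted file. The index contains hello, which HEAD does not, so it is a staged new file; the working directory has no such file, so relative to the index it looks deleted.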
    # 100644 is the mode of a regular, non-executable file.
    $ git update-index --add --cacheinfo 100644 557db03de997c86a4a028e1ebd3a1ceb225be238 hello

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    $ git status
    On branch master

    No commits yet

    Changes to be committed:
      (use "git rm --cached <file>..." to unstage)
            new file:   hello

    Changes not staged for commit:
      (use "git add/rm <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            deleted:    hello
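The odd-looking status makes sense once you see it as two separate comparisons: HEAD against the index, and the index against the working directory. The Python sketch below models each of the three as a plain dictionary from path to blob ID. It is a simplified model made up for this article, not Git's algorithm, and it ignores untracked files.

```python
def status(head: dict, index: dict, worktree: dict):
    staged, unstaged = [], []
    # "Changes to be committed": HEAD compared with the index.
    for path in sorted(set(head) | set(index)):
        if path not in head:
            staged.append(("new file", path))
        elif path not in index:
            staged.append(("deleted", path))
        elif head[path] != index[path]:
            staged.append(("modified", path))
    # "Changes not staged for commit": index compared with the working directory.
    for path in sorted(index):
        if path not in worktree:
            unstaged.append(("deleted", path))
        elif worktree[path] != index[path]:
            unstaged.append(("modified", path))
    return staged, unstaged


blob = "557db03de997c86a4a028e1ebd3a1ceb225be238"
# No commits yet, hello staged via update-index, no file on disk.
staged, unstaged = status(head={}, index={"hello": blob}, worktree={})
print(staged, unstaged)
```

Once all three dictionaries agree, both lists are empty, which corresponds to a clean status.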
We can create a new tree object using write-tree, which builds a tree object from the current index and prints the name of the new tree object to standard output. Conceptually, git write-tree syncs the current index contents into a set of tree objects. We can see the new object in the .git/objects directory and verify its contents using cat-file.
    # creates a new tree object
    $ git write-tree
    117c62a8c5e01758bd284126a6af69deab9dbbe2

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    $ tree .git
    .git
    ├── HEAD
    ├── config
    ├── index
    ├── info
    │   └── exclude
    ├── objects
    │   ├── 11
    │   │   └── 7c62a8c5e01758bd284126a6af69deab9dbbe2
    │   ├── 55
    │   │   └── 7db03de997c86a4a028e1ebd3a1ceb225be238
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags

    $ git cat-file -p 117c
    100644 blob 557db03de997c86a4a028e1ebd3a1ceb225be238 hello
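As a rough model of what write-tree does for a flat index, the Python sketch below serializes the entries and hashes them with a "tree" header. The dictionary standing in for the index and the function name are assumptions made for illustration, and nested directories are not handled. The printed ID can be compared against the one write-tree reported above.

```python
import hashlib

# Stand-in for the index: name -> (mode, blob id). Real index files are binary.
index = {"hello": ("100644", "557db03de997c86a4a028e1ebd3a1ceb225be238")}


def write_tree(index: dict) -> str:
    body = b""
    for name in sorted(index):
        mode, object_id = index[name]
        # Entry layout: "<mode> <name>\0" followed by the 20 raw hash bytes.
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(object_id)
    header = f"tree {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


tree_id = write_tree(index)
print(tree_id)
```

The tree ID changes whenever a name, a mode or a blob ID in the index changes, so it fingerprints the whole snapshot.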
However, the status does not change, because the tree object is not yet part of a commit. To create one, Git provides commit-tree, which creates a new commit object based on the provided tree object and emits the new commit object ID on stdout.
    $ git commit-tree 117c -m "First Commit"
    63dc01736bdd6b7e5d15e3b871590573550704fd

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    $ tree .git
    .git
    ├── HEAD
    ├── config
    ├── index
    ├── info
    │   └── exclude
    ├── objects
    │   ├── 11
    │   │   └── 7c62a8c5e01758bd284126a6af69deab9dbbe2
    │   ├── 55
    │   │   └── 7db03de997c86a4a028e1ebd3a1ceb225be238
    │   ├── 63
    │   │   └── dc01736bdd6b7e5d15e3b871590573550704fd
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags

    10 directories, 7 files

    $ git cat-file -p 63dc01736bdd6b7e5d15e3b871590573550704fd
    tree 117c62a8c5e01758bd284126a6af69deab9dbbe2
    author Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656808916 +0200
    committer Shubham Srivastava <shbm@Shubhams-MacBook-Air.local> 1656808916 +0200

    First Commit
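A commit object is assembled the same way as the other objects: build the text payload, prepend a "commit" header with the size, and hash the result. The Python sketch below uses the values from the transcript above. `commit_tree` is an illustrative function, not Git's API, and it assumes the author and the committer are the same person.

```python
import hashlib


def commit_tree(tree_id, message, author, timestamp, tz, parents=()):
    lines = [f"tree {tree_id}"]
    lines += [f"parent {p}" for p in parents]
    lines.append(f"author {author} {timestamp} {tz}")
    lines.append(f"committer {author} {timestamp} {tz}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    header = f"commit {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest(), body


commit_id, body = commit_tree(
    "117c62a8c5e01758bd284126a6af69deab9dbbe2",
    "First Commit",
    "Shubham Srivastava <shbm@Shubhams-MacBook-Air.local>",
    1656808916,
    "+0200",
)
print(commit_id)
print(body.decode())
```

Since the timestamp is part of the payload, running commit-tree again on the same tree at a later time produces a different commit ID.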
However, the status is still not happy.
    $ git status
    On branch master

    No commits yet

    Changes to be committed:
      (use "git rm --cached <file>..." to unstage)
            new file:   hello

    Changes not staged for commit:
      (use "git add/rm <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            deleted:    hello
There is a file called HEAD that references refs/heads/master, but no file exists at that location yet. We need to create it: echo 63dc01736bdd6b7e5d15e3b871590573550704fd > .git/refs/heads/master. HEAD refers to the current branch, and the branch file holds the latest commit on that branch. Normally a commit would identify a new "HEAD" state, and while Git doesn't care where you save the note about that state, in practice we tend to just write the result to the file that is pointed at by .git/HEAD, so that we can always see what the last committed state was.
    $ cat .git/HEAD
    ref: refs/heads/master

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # no file refs/heads/master
    $ tree .git
    .git
    ├── HEAD
    ├── config
    ├── index
    ├── info
    │   └── exclude
    ├── objects
    │   ├── 11
    │   │   └── 7c62a8c5e01758bd284126a6af69deab9dbbe2
    │   ├── 55
    │   │   └── 7db03de997c86a4a028e1ebd3a1ceb225be238
    │   ├── 63
    │   │   └── dc01736bdd6b7e5d15e3b871590573550704fd
    │   ├── info
    │   └── pack
    └── refs
        ├── heads
        └── tags

    10 directories, 7 files

    # Git is looking for the latest pointer in HEAD.
    # The latest pointer should be the latest commit.
    $ echo 63dc01736bdd6b7e5d15e3b871590573550704fd > .git/refs/heads/master

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # the log contains the hash now
    $ git log
    commit 63dc01736bdd6b7e5d15e3b871590573550704fd (HEAD -> master)
    Author: Shubham Srivastava <shbm@Shubhams-MacBook-Air.local>
    Date:   Sun Jul 3 02:41:56 2022 +0200

        First Commit

    #=-=-=-=-=-=-=-=-=-=-=-=-=-=-=#

    # Now if we look at the status we get something different.
    # Still not 100% happy.
    $ git status
    On branch master
    Changes not staged for commit:
      (use "git add/rm <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            deleted:    hello

    no changes added to commit (use "git add" and/or "git commit -a")
Why is the status still not completely happy? The log shows the latest commit, but a final piece is missing: the commit and the index both contain hello, yet the file does not exist in the working directory. We need to use checkout to restore it from HEAD.
TIP: Git uses -- to separate revisions and options from file paths; everything after it is treated as a path.
    # Everything after -- is treated as a path. hello exists only in the index and in HEAD so far.
    $ git checkout HEAD -- hello
Finally, Git is happy.
    # The status is now clean
    $ git status
    On branch master
    nothing to commit, working tree clean