GIT DIY Exercise for Beginners
This exercise is for folks who are beginning to learn git, and are maybe even transitioning from a traditional version control tool like svn or cvs. Before proceeding with this exercise, I would suggest a quick reading of this article on Getting Started - About Version Control.
Pre-requisites :
1. Install git on your machine - https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
2. Create an account on GitHub, if you do not have one.
3. Create a new repository on GitHub. [Quick link: https://help.github.com/articles/create-a-repo/ ]
4. Open Terminal or Command Prompt, switch to a folder of your choice and clone the repository locally, as follows:
git clone <path to your repository on github>
You can either use “HTTPS” or “SSH” protocol to work with the GitHub repository.
An HTTPS path would look something like – https://github.com/deveshchanchlani/train4git.git,
while an SSH path would look something like – git@github.com:deveshchanchlani/train4git.git
The above command will create a new folder by the name of your repository, and place the repository contents in it.
5. Now, initialize your repository. In my case it would be:
# 'train4git' is the name of my new repository.
cd train4git
# create a file to begin with.
touch README.md
# add the new file to be recognized by git.
git add README.md
# provide a comment for your first commit.
git commit -m "first commit"
# push the new commit to the remote GitHub repository.
git push -u origin master
6. Finally, to get running with the exercise, run the following commands in the same order:
# create a local repository branch called 'canary'
git branch canary
# create a local repository branch called 'dev'
git branch dev
# push the local 'dev' branch to remote GitHub repo
git push -u origin dev
# push the local 'canary' branch to remote GitHub repo
git push -u origin canary
Now, browse to your repository page on GitHub. You would see a drop-down labeled "Branch: master". Clicking on this drop-down, you would notice 3 branches - dev, canary and master. "master" is the default branch. "dev" and "canary" branches are the ones we just created above, for our exercise.
Let's start rolling !
Knowing remote path of the repo
git remote -v
This would show output similar to the following:
[If you have checked out using SSH]
origin git@github.com:deveshchanchlani/train4git.git (fetch)
origin git@github.com:deveshchanchlani/train4git.git (push)
[If you have checked out using HTTPS]
origin https://github.com/deveshchanchlani/train4git.git (fetch)
origin https://github.com/deveshchanchlani/train4git.git (push)
You see the URL there twice because Git allows you to have different push and fetch URLs for each remote in case you want to use different protocols or addresses for reads and writes.
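To see the two URLs actually differ, here is a minimal sketch that can be run in a throwaway folder. The example.com URLs and the "demo" folder name are placeholders of my own, not part of the exercise repository, and nothing is contacted over the network.

```shell
# Throwaway sandbox; no network access is needed.
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q demo
cd demo

# Placeholder URLs, for illustration only.
git remote add origin https://example.com/team/demo.git
# Override only the push URL; the fetch URL stays as it was.
git remote set-url --push origin git@example.com:team/demo.git

# The (fetch) and (push) lines now show different URLs.
git remote -v
fetch_url=$(git remote get-url origin)
push_url=$(git remote get-url --push origin)
```

In the exercise repository both URLs stay identical, so this is only needed if you really want to read over one protocol and write over another.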
Tracking branches
$ git branch # To list all the "local" branches
  canary
  dev
* master
Notice that the current working branch is prefixed by a * sign.
$ git branch -r # To list all the "remote" branches
  origin/canary
  origin/dev
  origin/master
You can see 3 remote branches - master, dev and canary. The branches dev and canary were created by us in the "pre-requisites" section.
$ git branch -a # To list "all" the branches
  canary
  dev
* master
  remotes/origin/canary
  remotes/origin/dev
  remotes/origin/master
The remote branches begin with "remotes" keyword. The branches are listed in alphabetical order, with local branches preceding the remote branches. This shows us that there are 3 local branches, and corresponding 3 remote branches.
Tracking changes and commits
If you want to ignore certain type of files from being tracked by git (like *.log files), you make their entries in a file named .gitignore. This file is usually placed inside the root directory of the repo. However, it can also be placed inside any directory in the repo.
Let us add the below contents to our .gitignore file. Notice that the comment lines start with the character '#'.
# ignore .jar files
*.jar
# but do track xyz.jar, even though you're ignoring
# .jar files above
!xyz.jar
# only ignore the TODO file in the current directory,
# not subdir/TODO
/TODO
# ignore the build/ directory
build/
# ignore doc/*.txt files, but not doc/server/*.txt files
doc/*.txt
# ignore all .pdf files in the doc/ directory and its subdirectories
doc/**/*.pdf
You can find more information on .gitignore at https://git-scm.com/docs/gitignore
A few .gitignore templates can be found at https://github.com/github/gitignore
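If you are unsure whether a rule catches a given path, 'git check-ignore' can answer that. The sketch below assumes a throwaway folder and uses the same rules as the .gitignore above; the file names being tested (lib.jar, doc/notes.txt and so on) are hypothetical and need not exist on disk.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q demo
cd demo

# Same rules as the .gitignore discussed above, without the comments.
cat > .gitignore <<'EOF'
*.jar
!xyz.jar
/TODO
build/
doc/*.txt
doc/**/*.pdf
EOF

# Hypothetical paths; 'git check-ignore -q' exits with 0 when a path is ignored.
for path in lib.jar xyz.jar TODO subdir/TODO doc/notes.txt doc/server/arch.txt; do
    if git check-ignore -q "$path"; then
        echo "ignored:     $path"
    else
        echo "not ignored: $path"
    fi
done
```

Running 'git check-ignore -v <path>' instead prints the exact .gitignore line that matched, which helps when several rules overlap.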
Now if you fire -
git status
It would list the new file .gitignore under "Untracked files". This means the file is untracked and will not be considered for committing. So, you would be required to start tracking the file in order to commit it. This would be done as:
git add .gitignore
Now, if 'git status' command is fired again, it can be seen that the file .gitignore has been labeled as new file:, and is now being tracked by git.
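The same transition can be watched in compact form. This is a small sketch in a throwaway folder (the "demo" folder and the *.log rule are my own assumptions); it uses 'git status --porcelain', which prints one short line per file.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q demo
cd demo
echo "*.log" > .gitignore

# '??' marks an untracked file.
before=$(git status --porcelain)
git add .gitignore
# 'A ' marks a file staged as new.
after=$(git status --porcelain)

echo "before add: $before"
echo "after add:  $after"
```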
Committing and pushing change-sets
git commit -a -m "adding .gitignore"
The above command would commit the changes and create a change-set locally, that is, the change-set exists only on your machine. The change-set can be observed by typing
git log
If you go to GitHub and check the commit history of the master branch, you would not find this change-set. To move this change-set to the remote (so that even your team can see it), you would now be required to push it:
git push origin master:master
The syntax for push command is:
git push |remote| |local-branch-name|:|remote-branch-to-push-into|
We will talk more about “git push” going ahead.
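To convince yourself that a commit stays local until it is pushed, here is a rough offline sketch. It assumes a local bare repository ("remote.git") standing in for GitHub, plus a made-up user identity; neither is part of the exercise repository.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
# A local bare repository plays the role of GitHub here.
git -c init.defaultBranch=master init -q --bare remote.git

git -c init.defaultBranch=master init -q work
cd work
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$sandbox/remote.git"

echo "hello" > README.md
git add README.md
git commit -q -m "first commit"
git push -q -u origin master

# A new commit that exists only locally.
echo "*.log" > .gitignore
git add .gitignore
git commit -q -m "adding .gitignore"
unpushed_before=$(git rev-list --count origin/master..master)

# |remote| |local-branch-name|:|remote-branch-to-push-into|
git push -q origin master:master
unpushed_after=$(git rev-list --count origin/master..master)

echo "unpushed commits before push: $unpushed_before, after push: $unpushed_after"
```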
Deleting a file and checking in
Before that, let us quickly add a few files and check them in, as follows:
touch tempFile.txt file1.txt
git add tempFile.txt file1.txt
git commit -a -m "adding tempFile.txt and file1.txt"
git push origin master:master
Assuming now you no longer need tempFile.txt, let us now begin to delete it from the git repo, as follows:
rm tempFile.txt
Now, if you do 'git status' you would see "deleted: tempFile.txt" under "Changes not staged for commit:" heading. This means that these changes are under “unstaged” bucket and will not be considered for committing purpose. So, you are now required to bring it under “staged” bucket, before committing the changes.
git rm tempFile.txt
followed by 'git status' you would see 'deleted: tempFile.txt' under 'Changes to be committed:' heading. Now, just commit and push this change, as below
git commit -m "deleting tempFile.txt"
git push origin master:master
Another way of making these changes would have been:
rm tempFile.txt
git commit -a -m "deleting tempFile.txt"
git push origin master:master
In the above code-snippet the 'git rm' is missing. All this command was supposed to do was to bring the changes from "unstaged" bucket to the "staged" bucket. Another way of doing this is by making use of the -a option of 'git commit' command. This option makes sure that all the modified and deleted files are staged automatically. However, it does not work for new files.
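The last point, that -a skips new files, is easy to check. The sketch below runs in a throwaway folder with a made-up user identity; notes.txt is a hypothetical new file.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > file1.txt
git add file1.txt
git commit -q -m "adding file1.txt"

echo "v2" > file1.txt      # modify a file git already tracks
echo "draft" > notes.txt   # create a brand-new, untracked file
git commit -q -a -m "updating file1.txt"

# The modification was committed; the new file is still untracked.
remaining=$(git status --porcelain)
echo "$remaining"
```

So a new file always needs an explicit 'git add' first, even if you use 'git commit -a' for everything else.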
Playing with branches
Switching between branches
Doing a git branch will show that the current branch is master. Now fire
git checkout canary
This will switch the current branch from master to canary. It can be confirmed by firing 'git branch' again. It will show canary being the current branch.
Next, on firing a 'git log', only a single change-set will be seen, with the title “first commit”. All other change-sets will not be shown since they were committed on the master branch! (Remember that master was the current branch when we committed them, and that we pushed them with master:master.)
So, the commit change-sets also get switched when the current branch is switched from one to another. This is an obvious behavior.
But, what happens to the “staged” and “unstaged” buckets? Answer: They remain intact!
Try this out, keeping in mind the current branch is “canary”:
touch unstagedFile.txt
git checkout master
git status
Notice that unstagedFile.txt is still shown in the status of the master branch, though it was created while on the canary branch. It will continue to show across branches until it is committed to a certain branch.
For now, let us delete this file and move ahead, fire - 'rm unstagedFile.txt'
Creating new local branch
# switch to 'canary' branch
git checkout canary
# create new branch 'newFeatureCanary' identical
# to current branch 'canary'
git branch newFeatureCanary
# switch to the new local branch
git checkout newFeatureCanary
This would first switch to canary and then create a new local branch called newFeatureCanary, which has identical change-sets as canary. Right after 'git branch newFeatureCanary', firing 'git branch' would list newFeatureCanary, but the current branch would still be canary. That is why the last command is needed to switch to the newly created local branch.
Now, any commit made here will be exclusive to newFeatureCanary branch, that is will not be seen in any other branch, including canary.
There's a shortcut, shown below, to create a local branch and check it out in one go. Let's try it out on master.
# switch to master branch
git checkout master
# create a new branch from master and check it out as well
git checkout -b newFeatureMaster
Another very important thing to note here is, that the branches newFeatureCanary or newFeatureMaster are local to your machine, and will not be visible to any one else. These are different from remote branches. You can confirm this by checking the repository-page on GitHub, and observe the number of branches present.
Deleting a local branch
# make sure the branch-to-delete 'newFeatureMaster' is not
# the current branch, switch to some other branch
git checkout master
# delete the 'local' branch
git branch -d newFeatureMaster
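One thing worth knowing here: 'git branch -d' refuses to delete a branch whose commits have not been merged anywhere, while 'git branch -D' deletes it regardless. A minimal sketch, assuming a throwaway folder, a made-up user identity and a hypothetical branch called 'experiment':

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > README.md
git add README.md
git commit -q -m "first commit"

# A branch with one commit that master does not have.
git checkout -q -b experiment
echo "wip" > idea.txt
git add idea.txt
git commit -q -m "experimental work"
git checkout -q master

# The safe delete fails because 'experiment' is not fully merged.
if git branch -d experiment 2>/dev/null; then
    safe_delete="deleted"
else
    safe_delete="refused"
fi

# The forced delete removes it anyway, along with its unmerged commit.
git branch -D experiment
left=$(git branch --list experiment)
echo "safe delete was $safe_delete"
```

In the exercise, newFeatureMaster has no commits of its own, so the plain -d works there.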
Listing commits
# Getting change-sets unique to canary branch
# and not in dev branch
git log dev..canary
# Getting change-sets unique to dev branch
# and not in canary branch
git log canary..dev
# Getting unique change-sets in current branch wrt master
git log master..
# Getting unique change-sets in master branch wrt current branch
git log ..master
# Getting change-set log of any branch
git log dev
# Knowing incoming change-sets from 'canary remote'
# to local current branch
git fetch && git log ..remotes/origin/canary
# Knowing outgoing change-sets to 'remote master'
# from local current branch
git fetch && git log remotes/origin/master..
We will discuss “git fetch” in detail a bit later.
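In the exercise repository dev and canary are still identical, so the range queries above print nothing. The sketch below builds a small throwaway repository where the two branches differ, to show what 'X..Y' selects. The commit messages and the user identity are my own assumptions.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "first commit"
git branch dev
git branch canary

git checkout -q canary
echo "c" > canary.txt
git add canary.txt
git commit -q -m "canary-only change"

git checkout -q dev
echo "d" > dev.txt
git add dev.txt
git commit -q -m "dev-only change"

# 'X..Y' means: commits reachable from Y but not from X.
only_canary=$(git log --format=%s dev..canary)
only_dev=$(git log --format=%s canary..dev)
# An empty side defaults to the current branch, which is dev here.
ahead_of_master=$(git log --format=%s master..)

echo "only on canary: $only_canary"
echo "only on dev:    $only_dev"
```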
Updating from remote
There are 2 ways in which you could update from remote repo - "pull" and "fetch+merge".
Approach A. FETCH + MERGE
In GIT, there are 3 types of branches for each line of development:
- Remote branch
- Local branch
- Remote tracking branch
Let us consider canary as the line of work (or branch), and have a look at the below illustration of change-sets and where each branch points.
The branch remote/canary points to the current state of the remote repository. This branch is present in the remote and will reflect change-sets that others too would have pushed.
The remote-tracking-canary branch points to the change-set, which was at the tip when you last took an update from the remote. Hence, currently your local repo does not have any knowledge of change-sets 'D' and 'E'.
The branch local/canary points to the current tip of the local canary branch, and the change-sets 'F' and 'G' are your local change-sets.
So, git branch -a lists all your local branches and remote-tracking branches, while a plain git branch lists only the local ones.
Now, if you fire -
git fetch
it would align the remote-tracking-canary and remote/canary pointers, thus bringing in the snapshots of change-sets 'D' and 'E' into the local repository copy. This means that the information regarding 'D' and 'E' is now there locally (in your local .git folder), but still it will not be reflected in the local branches. Now your local repo state would be something like below:
Your local branch canary has no knowledge of the change-sets 'D' and 'E', and it is in continuation from change-set 'C'. Now your local branch canary can be fast-forwarded to the current state of the remote repo. And, now if you do git checkout canary, this is what you would get to see:
$ git checkout canary
Switched to branch 'canary'
Your branch is behind 'origin/canary' by 2 commits, and can be fast-forwarded.
Now to align your local branch canary with the remote canary, you would be required to do:
# merge from remote-tracker-branch
git merge remotes/origin/canary
This would bring your repo in the following state:
To summarize: You can do a git fetch at any time to update your local copy of a remote branch. This will update all the remote-tracking branches. This operation never changes any of your own local branches, and is safe to perform without changing your local working copy.
Now to try Approach A, we may first create a change-set on the canary branch through GitHub. (https://help.github.com/articles/creating-new-files/, ignore the last step to "create a pull request") Now, we need to update our local copy with the remote changes.
# switch to canary
git checkout canary
# check change-sets unique to remote-tracker-canary
# wrt canary branch
git log ..remotes/origin/canary
# should not show any change-sets
# fetch remote changes into local repo copy
git fetch
# check again, change-sets unique to remote-tracker-canary
# wrt canary
git log ..remotes/origin/canary
# will show the remotely added change-set
# through GitHub in the log
# merge changes from remote-tracker-canary into current
# i.e. canary branch
git merge remotes/origin/canary
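If you would rather try this without touching GitHub, the following offline sketch reproduces the same steps. It assumes a local bare repository ("remote.git") in place of GitHub and a second clone ("teammate") in place of the change made through the web page; these names and the user identities are made up for the illustration.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q --bare remote.git

# "mine" is our own working copy.
git -c init.defaultBranch=master init -q mine
cd mine
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$sandbox/remote.git"
echo "A" > file.txt
git add file.txt
git commit -q -m "A"
git push -q -u origin master
git checkout -q -b canary
git push -q -u origin canary

# "teammate" pushes a new change-set 'D' to the remote canary.
cd "$sandbox"
git clone -q -b canary remote.git teammate
cd teammate
git config user.name "Teammate"
git config user.email "teammate@example.com"
echo "D" >> file.txt
git commit -q -a -m "D"
git push -q origin canary

# Back in our copy: fetch updates only the remote-tracker.
cd "$sandbox/mine"
before=$(git rev-parse canary)
git fetch -q
after_fetch=$(git rev-parse canary)
incoming=$(git log --format=%s ..remotes/origin/canary)

# The merge is what moves the local branch.
git merge -q remotes/origin/canary
after_merge=$(git rev-parse canary)
tracker=$(git rev-parse remotes/origin/canary)
echo "incoming change-sets after fetch: $incoming"
```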
Approach B. PULL
Pull is just doing fetch and merge in one command. In the simplest terms, git pull does a git fetch followed by a git merge.
If you are on branch canary, and do git pull, it would fetch the remote-repo, update the local-repo pointers, but merge only the current branch and not others.
Let us consider the below illustration, as current repo state:
Now, if you were on branch canary, and do
git pull
the new repo-state would be as below:
So, git pull actually aligned all the remote-tracking branches with their corresponding remote branches. Since, the current local branch was canary, even the local-canary pointer was updated but not local-dev pointer.
To try it out, first create a change-set on dev and canary branches on GitHub. Next fire -
# check change-sets unique to remote-tracker-canary wrt canary
git log canary..remotes/origin/canary
# should not show any change-sets
# check change-sets unique to remote-tracker-dev wrt dev branch
git log dev..remotes/origin/dev
# should not show any change-sets
# pull remote changes into canary
git checkout canary
git pull
# notice the new change-set present in canary
git log canary
# check again, change-sets unique to remote-tracker-canary
# wrt canary
git log canary..remotes/origin/canary
# will not show any change-sets since canary
# branch also got updated
# check again, change-sets unique to remote-tracker-dev wrt dev
git log dev..remotes/origin/dev
# will show remotely added change-set to remote-dev
# through GitHub, since local-dev branch does not get updated
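The same experiment can be approximated offline. As before, this is only a sketch: a local bare repository ("remote.git") stands in for GitHub, a second clone ("teammate") stands in for the changes made on the web page, and the user identities are made up.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q --bare remote.git

git -c init.defaultBranch=master init -q mine
cd mine
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$sandbox/remote.git"
echo "A" > file.txt
git add file.txt
git commit -q -m "A"
git push -q -u origin master
for b in dev canary; do
    git branch "$b"
    git push -q -u origin "$b"
done

# "teammate" adds one change-set to each of dev and canary on the remote.
cd "$sandbox"
git clone -q remote.git teammate
cd teammate
git config user.name "Teammate"
git config user.email "teammate@example.com"
for b in dev canary; do
    git checkout -q "$b"
    echo "change on $b" > "$b.txt"
    git add "$b.txt"
    git commit -q -m "remote change on $b"
    git push -q origin "$b"
done

# Pull while on canary: all remote-trackers move, but only canary is merged.
cd "$sandbox/mine"
git checkout -q canary
git pull -q
canary_behind=$(git rev-list --count canary..remotes/origin/canary)
dev_behind=$(git rev-list --count dev..remotes/origin/dev)
echo "canary is behind by $canary_behind, dev is behind by $dev_behind"
```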
Updating a local branch from desired remote branch
Let us create a new local branch called newCanaryFeature from canary branch, as follows:
git checkout canary
git checkout -b newCanaryFeature
Now in this case, the branch newCanaryFeature will not have a corresponding remote-tracker branch. Therefore, if you now try doing git pull, you would get an error message saying …
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
There are 3 ways of updating the branch newCanaryFeature -
Approach A. (Pull + Merge)
# 1. Update canary to latest
git checkout canary
git pull
# 2. checkout new branch
git checkout newCanaryFeature
# 3. take merge from canary
git merge canary
Approach B. (Fetch + Merge)
# 1. Update local copy of repo, which means all remote-trackers
git fetch
# 2. checkout new branch
git checkout newCanaryFeature
# 3. take merge from canary remote-tracker
git merge remotes/origin/canary
I always prefer Approach B to Approach A. The reason is that it makes me stick to one common mechanism for updating any local branch, irrespective of whether the branch has an associated remote-tracker. For the same reason, I prefer the fetch + merge approach to the pull approach in general.
Approach C. (Assign Remote-tracker)
# 1. Assign remote-tracker to the local branch
git branch --set-upstream-to=origin/canary newCanaryFeature
# 2. pull recent changes from remote
git pull
This approach simply adds a remote-tracker to newCanaryFeature, and then does a git pull on it. I do not prefer this approach, since I like to have only one local branch corresponding to each remote-tracker.
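Whichever approach you choose, you can check whether a branch has a remote-tracker assigned by asking git for its upstream. A small sketch, assuming a local bare repository ("remote.git") in place of GitHub and a made-up user identity:

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q --bare remote.git
git -c init.defaultBranch=master init -q mine
cd mine
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$sandbox/remote.git"
echo "A" > file.txt
git add file.txt
git commit -q -m "A"
git push -q -u origin master
git checkout -q -b canary
git push -q -u origin canary

# A branch created from a local branch gets no remote-tracker.
git checkout -q -b newCanaryFeature
if git rev-parse --abbrev-ref '@{upstream}' >/dev/null 2>&1; then
    had_upstream="yes"
else
    had_upstream="no"
fi

# Approach C: assign the remote-tracker explicitly.
git branch --set-upstream-to=origin/canary newCanaryFeature
upstream=$(git rev-parse --abbrev-ref '@{upstream}')
echo "upstream before: $had_upstream, upstream now: $upstream"
```

'git branch -vv' shows the same information for all local branches at once.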
Understanding "merges" between branches
Suppose you have two branches, "stable" and "new-idea", whose tips are at revisions E and F, respectively:
So the commits A, C and E are on "stable" and A, B, D and F are on "new-idea". If you then merge "new-idea" onto "stable" with the following commands:
git checkout stable
git merge new-idea
Then you have the following:
Observe, commit G being the merge-commit on branch "stable". If you carry on committing on "new-idea" and on "stable", you get:
So now A, B, C, D, E, F, G and H are on "stable", while A, B, D, F and I are on "new-idea".
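The history described above can be rebuilt in a throwaway repository to check which commits end up on which branch. This is a sketch under my own assumptions: each commit adds a file named after its letter (so the merge never conflicts), make_commit is a hypothetical helper, and the user identity is made up.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Hypothetical helper: one commit per letter, each touching its own file.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit A
git checkout -q -b stable
make_commit C
make_commit E
git checkout -q -b new-idea master   # branch off at A
make_commit B
make_commit D
make_commit F

# Merge new-idea into stable; the merge-commit is G.
git checkout -q stable
git merge -q -m "G" new-idea
make_commit H
git checkout -q new-idea
make_commit I

on_stable=$(git log --format=%s stable | sort | tr -d '\n')
on_new_idea=$(git log --format=%s new-idea | sort | tr -d '\n')
echo "stable:   $on_stable"
echo "new-idea: $on_new_idea"
```

Note that the merge only changes "stable". The branch "new-idea" does not receive C, E, G or H unless you merge in the other direction as well.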
Creating Diff and Applying Patches
# Getting diff between the tips of dev and canary
# (the changes that turn dev into canary)
git diff dev..canary > diff1.patch
# Getting diff between the tips of canary and dev
# (the changes that turn canary into dev)
git diff canary..dev > diff2.patch
# Getting diff from master to the current branch
git diff master.. > diff3.patch
# OR, to also include uncommitted changes in the working copy,
git diff master > diff4.patch
# Getting diff from the current branch to master
git diff ..master > diff5.patch
# Knowing incoming diff in canary after 'git fetch'
git fetch && git diff ..remotes/origin/canary > diff6.patch
# Knowing outgoing diff from current to remote-master branch
git fetch && git diff remotes/origin/master.. > diff7.patch
The above commands would redirect the output of the respective diff commands into *.patch files. To apply a patch -
# Syntax to apply a patch => patch -p1 < {/path/to/patch/file}
# To apply a patch on canary branch
git checkout canary
patch -p1 < diff.patch
# To reverse an applied patch
patch -R -p1 < /path/to/file
The patching example above makes use of patch command. If not already installed, you may need to install it.
The URL to install it on Windows - http://gnuwin32.sourceforge.net/packages/patch.htm
To install it on Ubuntu, fire - sudo apt-get install patch
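If installing patch is not an option, git ships its own 'git apply' command, which understands the output of 'git diff' directly. The sketch below is an alternative to the patch command used above, not a replacement for it; it assumes a throwaway folder, a made-up user identity and a hypothetical patch file called change.patch.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"
git -c init.defaultBranch=master init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "line one" > file1.txt
git add file1.txt
git commit -q -m "first commit"

git checkout -q -b canary
echo "line two" >> file1.txt
git commit -q -a -m "adding line two"

# Save the difference between master and canary outside the repository.
git diff master..canary > "$sandbox/change.patch"

git checkout -q master
# Dry run first: reports problems without touching any file.
git apply --check "$sandbox/change.patch"
git apply "$sandbox/change.patch"
patched=$(cat file1.txt)

# Reverse the patch again.
git apply -R "$sandbox/change.patch"
reverted=$(cat file1.txt)
```

Like patch, 'git apply' only changes the working copy; you still need to commit the result yourself.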
GIT Repository Browser
gitk
The above command brings up a graphical interface for browsing the git repository. More details on this can be found at - https://git-scm.com/docs/gitk