Git and GitHub workflow

This guide skims through the use of Git in a collaborative project. The workflow is based on typical open-source projects on GitHub, though a similar experience can be expected regardless of where the project is hosted.

GitLens vs. git command

Instructions in this document come in two versions, one with VS Code + GitLens and one with the git command. The former is provided because the graphical UI is easier to work with and covers most daily usage. However, note that the command line provides more functionality, and you might need it for edge cases (e.g., to find a lost commit, or to rewrite history).

Git repository

A git repository (repo) is a folder managed by git. That means git keeps track of the files in it, and you can switch between, compare, and combine different versions with git.

You can use a git repo for local folders only, but it is common to synchronize your local repo with a remote server (e.g. on GitHub). While working with a git repo, it's a good idea to synchronize local and remote changes often.

Getting familiar with the local and remote repos:

  • Run the "Git: Fetch From All Remotes" command;
  • Search for the "Commit Graph" command and show the commit graph.

You should see two remotes: one for your own private fork (origin), and one for the shared main repo (upstream). This is a common setup where you have full control over "origin", while "upstream" is administered by the core developer or a team.

As you proceed with this tutorial, watch how the changes are reflected in the graph.

Inspecting a repo from CLI:

  • Run git remote -v in a git repository to see how many remotes you have;
  • git fetch to update the remotes;
  • git status to compare your local copy to the remote;
  • git pull to apply remote changes to your local repo (you might have to set up the default remotes).
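The inspection commands above can be tried safely in a throwaway sandbox. The sketch below is only an illustration: the two empty bare repositories are local stand-ins for "origin" and "upstream", and the variable names (sandbox, remote_count) are made up for this example.

```shell
#!/bin/sh
set -e

# Throwaway sandbox: two local bare repos stand in for "origin" and "upstream".
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"
git init -q --bare "$sandbox/upstream.git"
git init -q "$sandbox/work"
cd "$sandbox/work"
git remote add origin "$sandbox/origin.git"
git remote add upstream "$sandbox/upstream.git"

# List the remotes; each one is shown twice, once for fetch and once for push.
git remote -v
remote_count=$(git remote | wc -l | tr -d ' ')

# Update the remote-tracking branches of every configured remote.
git fetch --all --quiet

# Summarize the state of the working copy in a short form.
git status --short --branch
```

git pull is left out of the sketch because it needs a branch with a configured default remote, which a brand-new sandbox does not have yet.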

Commits

Commits are the basic unit of change in git. Each commit is identified by a hex code (called a SHA) like 3c03f5. The code itself is very long, but it is usually OK to refer to it by its first 6 or 7 digits.

To make a commit, you first have to "stage" your changes. Staging selects the changes to be "committed", which means taking a snapshot of your project.

Practice

Create a new file in the documentation and commit it following the instructions below.

Make a commit with GitLens:

  • Changes in VS Code can be found in the source control side panel;
  • You can click on a file to see what changes are made;
  • Click the + (plus) sign to stage your change;
  • Write a short message and click commit to make a commit;
  • Check the history in Commit Graph.

Make a commit from CLI

  • git add filename.suffix to stage the change;
  • git commit to start a commit; an editor of your choice should pop up,
    and the commit will be made after you save and exit;
  • If you are in a hurry, you can attach the message directly with git commit -m "commit message";
  • Use git log to inspect the history.
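The CLI steps above can be sketched end to end in a scratch repository. The file names (my-page.md, notes.txt), the author identity, and the commit message are hypothetical; the point is only that git add picks which changes go into the commit.

```shell
#!/bin/sh
set -e

# Scratch repository with a local identity, so no global config is needed.
sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Author"
git config user.email "author@example.com"

# Two new files, but only one of them belongs to this commit.
echo "# My page" > my-page.md
echo "scratch notes" > notes.txt

# Stage only the documentation page, then commit it with an inline message.
git add my-page.md
git commit -q -m "Add my-page to the documentation"

# Inspect the history and see what is still uncommitted.
git log --oneline
commit_count=$(git rev-list --count HEAD)
leftover=$(git status --short)
echo "$leftover"
```

The unstaged file stays in the working folder as an untracked change, ready to go into a later commit of its own.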

As you can see, you don't have to commit all your changes at once. A good practice is to arrange your commits in some logical order and write concise commit messages. But for now, don't worry too much and just keep in mind that:

  1. Frequent commits are better than big mixed commits;
  2. Commit messages should be informative.

For example, write "work in progress for XXX task" rather than "combined progress in May". Later in this tutorial you will see how to rearrange your commits, which is easier with small commits than with big ones.

You should now see your commit and its SHA code. The code allows you to switch between different versions of your repo; when you do so, you "check out" a commit (e.g. git checkout 3c03f5d). You will not often check out commits directly; you will usually use "branches" instead. We do not need new branches in this tutorial, but you can read more here.
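As a small sketch of checking out a commit by its SHA, the script below makes two versions of a file in a scratch repository and switches between them. The file name page.md and the branch name main are assumptions for this example; in your own repo you would copy the SHA from git log.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Author"
git config user.email "author@example.com"
# Name the initial branch "main" regardless of the local git default.
git symbolic-ref HEAD refs/heads/main

# Two commits, each a snapshot with different file content.
echo "v1" > page.md
git add page.md
git commit -q -m "First version"
first_sha=$(git rev-parse --short HEAD)   # abbreviated SHA of the first commit

echo "v2" > page.md
git commit -q -a -m "Second version"

# Check out the older snapshot by its SHA; the file goes back to "v1".
git checkout -q "$first_sha"
old_content=$(cat page.md)

# Return to the branch; the file is "v2" again.
git checkout -q main
new_content=$(cat page.md)

echo "at $first_sha: $old_content, at main: $new_content"
```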

Rebase

Rebasing is commonly used to synchronize your commits with remote changes. When you rebase your commits, you "transplant" them onto some other commit, as in the illustration below:

Before the rebase:

      A---B---C local main
     /
D---E---F---G upstream main

After the rebase:

              A'--B'--C' local main
             /
D---E---F---G upstream main
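The transplant illustrated above can be reproduced in a scratch repository. This is a simplified sketch: a local branch named upstream stands in for the real upstream main, and there is a single local commit instead of three. The commit messages borrow the letters from the illustration.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Author"
git config user.email "author@example.com"
git symbolic-ref HEAD refs/heads/main

# Shared starting point (commit E in the illustration).
echo "base" > base.txt
git add base.txt
git commit -q -m "E: shared base"

# The stand-in "upstream" branch gains a commit of its own (G).
git branch upstream
git checkout -q upstream
echo "theirs" > theirs.txt
git add theirs.txt
git commit -q -m "G: upstream change"

# Local main gains commit A on top of the old base.
git checkout -q main
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "A: local change"
before=$(git rev-parse HEAD)

# Transplant the local commit onto the tip of upstream.
git rebase -q upstream
after=$(git rev-parse HEAD)
parent=$(git rev-parse HEAD~1)
upstream_tip=$(git rev-parse upstream)

git log --oneline
```

After the rebase, the local commit has a different SHA than before, and its parent is the tip of the stand-in upstream branch.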

Notice how this turns your commits A, B, C into new commits A', B', C'; they get new SHA codes because each of them is now based on a different starting point. The rebase command also offers a lot of freedom for rearranging your commits. See how you can "squash" multiple commits into one to clean up a long commit history below:

Make a new commit and combine the two commits you've made with rebase:

  • Edit mkdocs.yml, add your page into the navigation and commit the change;
  • In the Commit Graph, choose the commit before your first change, choose "Rebase current branch onto commit" and select "Interactive Rebase";
  • In the editor, rearrange the order of previous commits, select "squash" for the second commit you made, and click "Start Rebase". You will be able to write a new commit message for the combined commit.

With the command line interface, you can initiate an interactive rebase with the git rebase --interactive command. You can learn more about the rebase command in this documentation.

After saving and exiting the message editor, you should see a new combined commit, with a new SHA code. The same tool is useful if you want to rewrite one commit message, or simply fix some small error in a previous commit.
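On the command line you would normally run git rebase --interactive HEAD~2 and change "pick" to "squash" by hand in the editor. The sketch below scripts that same edit so it can run unattended: it assumes a sed that supports -i, uses the GIT_SEQUENCE_EDITOR variable to rewrite the second line of the todo list, and sets GIT_EDITOR=true to accept the default combined message. File names and commit messages are hypothetical.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Author"
git config user.email "author@example.com"

# A base commit, followed by the two commits that will be combined.
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

echo "# My page" > my-page.md
git add my-page.md
git commit -q -m "Add my page"

echo "nav: [my-page.md]" > mkdocs.yml
git add mkdocs.yml
git commit -q -m "Add my page to the navigation"

# Interactive rebase over the last two commits. The sequence editor turns the
# second todo line from "pick" into "squash"; GIT_EDITOR=true keeps the
# default message that git proposes for the combined commit.
GIT_SEQUENCE_EDITOR="sed -i -e '2s/^pick/squash/'" GIT_EDITOR=true \
  git rebase --interactive HEAD~2

commit_count=$(git rev-list --count HEAD)
squashed_files=$(git show --name-only --format= HEAD)
git log --oneline
```

Three commits went in and two come out: the base commit plus one combined commit that contains both files.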

Push

With git, it is easy to synchronize code changes.

Click on "Sync Changes" in the Source Control panel to "push" your changes to the "origin" remote, i.e., your fork of this private repo.

This usually works fine. However, when your local history conflicts with the remote one (for example, after a rebase), you will need to force the update by running push manually. Find the command "GitLens: Push". Selecting "Force Push" will overwrite the remote branch with your local one.

To sync changes from the CLI, you can use the git push command. Git also allows you to set up multiple remotes and configure default push destinations. You can find more detailed explanations about synchronization in this documentation.
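A sketch of pushing from the CLI, including the forced update needed after rewriting history. A local bare repository stands in for your fork on GitHub, and the names here are illustrative. The forced push uses --force-with-lease, a variant of --force that refuses to overwrite the remote if it has changed since your last fetch; that choice is a suggestion of this sketch, not something the workflow above prescribes.

```shell
#!/bin/sh
set -e

# A local bare repository plays the role of the "origin" fork.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"
git init -q "$sandbox/work"
cd "$sandbox/work"
git config user.name "Example Author"
git config user.email "author@example.com"
git symbolic-ref HEAD refs/heads/main
git remote add origin "$sandbox/origin.git"

echo "draft" > page.md
git add page.md
git commit -q -m "Add page"

# First push; -u records origin/main as the default destination for this branch.
git push -q -u origin main

# Rewrite the commit that was already pushed, so the two histories now conflict.
git commit -q --amend -m "Add page (reworded)"

# A plain push is rejected, because it would discard the commit on the remote.
if git push -q origin main 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi
echo "plain push: $plain_push"

# Force the update of the remote branch.
git push -q --force-with-lease origin main

local_sha=$(git rev-parse main)
remote_sha=$(git rev-parse origin/main)
```

Forcing a push replaces the remote history, so it is best kept to branches that only you work on, such as those in your own fork.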

Check that your changes appear on the GitHub page.

Collaboration on GitHub

In the sections above, you saw how to use Git to track changes to your code. In a collaborative setting the ideas are the same; the difference is that not everyone can be allowed to write to the main repo.

The standard procedure is called a "pull request" (PR). As the name suggests, you submit your changes to the main repo, and the administrator decides whether the changes are accepted (or "pulled").

You do not need to finish your changes to start a PR; in fact, many repos use PRs as a way to discuss changes to the code. You can start a PR now and describe the changes you will make. After you have completed your work, comment in the PR to ask the administrator to review it.

Finish your documentation and leave a note in the PR when it's done. Someone should review your changes and comment on them. Address the comments if necessary, and your PR will be merged in the end.

Rebase vs. merge

In a shared repo, some extra rules might be needed to keep the history traceable. There are two general styles of collaboration, namely rebase and merge.

In the sections above, we have used only one branch and rebased changes when necessary. Some projects require this for the entire project, so that the history stays "linear" in the long run. In other repositories, one can work on different branches and keep the history of "merging", so you might see parallel lines of development in the log.

There are debates about the preferable style of git history. For this documentation, we keep a linear history with the "rebase" workflow for its simplicity. This means your pull request will only be accepted when your commits sit on top of the current shared "main" branch.
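One way to check whether your commits sit on top of the shared branch is git merge-base --is-ancestor, which succeeds when the first commit is an ancestor of the second. The sketch below is an illustration only: a local branch named shared-main stands in for the shared "main" branch, and on_top is a helper defined just for this example.

```shell
#!/bin/sh
set -e

sandbox=$(mktemp -d)
git init -q "$sandbox/repo"
cd "$sandbox/repo"
git config user.name "Example Author"
git config user.email "author@example.com"
git symbolic-ref HEAD refs/heads/main

echo "base" > base.txt
git add base.txt
git commit -q -m "Shared base"

# "shared-main" is a local stand-in for the shared main branch.
git branch shared-main

# Your own work on main.
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "My change"

# Meanwhile, someone else's change lands on the shared branch.
git checkout -q shared-main
echo "theirs" > theirs.txt
git add theirs.txt
git commit -q -m "Someone else's change"
git checkout -q main

# Hypothetical helper: succeeds only if main contains everything on shared-main.
on_top() { git merge-base --is-ancestor shared-main main; }

if on_top; then before=yes; else before=no; fi

# Put the local commit on top of the shared branch, then check again.
git rebase -q shared-main
if on_top; then after=yes; else after=no; fi

# A linear history contains no merge commits.
merge_commits=$(git rev-list --merges --count main)
echo "on top before: $before, after: $after, merge commits: $merge_commits"
```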
