Collaboration

Author

Government Analysis Function and ONS Data Science Campus

Git can also be used with remote repositories, which are provided by online services such as GitHub or GitLab. Which service you use depends on context, so it is always best to check with your organisation. These training materials use GitHub because it is free to sign up and is considered the most user-friendly option for people new to Git and version control.

The ONS uses a system called GitLab, an open source alternative, which allows the department to host repos within the secure ONS ecosystem.

Regardless of which service you use, the principles and benefits of remote repositories are broadly the same.

In general, it is good practice to make code available in online repositories so that it can be viewed by anyone interested. However, as civil servants we must consider the nature of the information, data and documents we put into the public domain, just as we would with any other document.

1 Linking Remote to Local

You’ll need a GitHub account before attempting this section of the course. This means you can log into GitHub in a web browser and view your profile or organisational page, which is where your projects are stored. However, you cannot yet interact with GitHub from your own machine’s command line, because your computer and the remote repository service must first be securely connected.

To make the connection to the remote server secure, you must generate a unique code that authenticates your machine to the service.

For GitHub this is done using a Personal Access Token (PAT).

The documentation for getting started with GitHub has everything you need to be ready for the next sections. You should already have Git installed on your machine from previous exercises and be familiar with making commits.

If you need instructions for connecting to a secure internal service such as GitLab or an alternative repository service, please ask your colleagues or contact your IT service.

2 Creating a Repository

Your device can now connect to the remote system, which means you can set up a project to work with on GitHub. The concepts covered from here are demonstrated specifically in GitHub, but are often similar across different services.

There are many ways to set up a repository in GitHub. The simplest is to follow the guide at GitHub Docs for creating repos.

Once you’ve completed the steps in the tutorial you’ll have a repository ready to connect to your local Git environment.

Note: The remote repository service will not automatically sync our project to our machine. We need to get the project onto our machine first, and then explicitly sync it. This is detailed in the sections below.

3 Clone

The project now exists in our remote repository, and we want to get it onto our machine. To do this we use the git command clone: we clone our remote repo to our computer.

To clone a GitHub repository, follow the instructions on the GitHub Docs for cloning.

Cloning is also used when starting work on a project that already exists on a remote repo. For example, suppose your team has been working on a task for a while and you have been asked to join them. You do not yet have the files on your computer, but you do have access to the project on GitHub or an equivalent service. You would therefore clone the project to access it on your computer.

You would now have an instance of the project associated with the remote repo on your device.

The folder created will have the same name as the project.

If I have a project called “my-super-awesome-statistics-project-that-takes-ages-to-type” on the remote server, cloning it will create a folder of the same name on my machine.
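As a minimal sketch of what cloning produces, the commands below use a local bare repository as a stand-in for a GitHub project, so they run without a network connection. The scratch directory and the project name are hypothetical; with a real project you would pass the HTTPS URL copied from GitHub instead of a local path.

```shell
set -e
# Scratch area for this sketch; every name below is hypothetical.
demo=$(mktemp -d)

# Stand-in for the remote: a bare repository named after the project.
git init -q --bare "$demo/my-statistics-project.git"

# Clone it. With GitHub, the argument would be the project's HTTPS URL.
cd "$demo"
git clone -q "$demo/my-statistics-project.git"

# The new folder takes the project's name (without the .git suffix),
# and the place it was cloned from is recorded as a remote.
ls "$demo"
git -C my-statistics-project remote -v
```

Git warns that the cloned repository is empty, which is expected here because nothing has been pushed to the stand-in remote yet.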

4 Fetch and Pull

As mentioned previously, our remote repo will not continuously sync files with our local repository (think: why could that be a bad idea?).

This means the user needs to manually ensure that changes on the remote repository are tracked by our device. We must explicitly ask git to check what changes have been made since we last updated our local repo.

This is achieved by calling the git fetch command. By fetching we ask the remote repo to inform our device of all the updates to the branches that we are tracking.

We can add an additional argument after fetch that specifies which remote repo we are referring to. By default, the remote repo we cloned from is called origin, which makes our full command:

$ git fetch origin

This fetches the information about our origin remote repo and shows us the changes at the remote.

Note: This does not change the files on our branches; it only updates the information we have about the remote tracked branches.
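The effect of fetch can be seen in a small, self-contained sketch. It assumes a local bare repository standing in for the remote and a hypothetical colleague who pushes to dev; the directory names, file name and commit messages are all made up for illustration.

```shell
set -e
# Identity used only for the commits in this sketch.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"      # stand-in for the remote repo

# A colleague creates the dev branch and pushes it.
git clone -q "$demo/remote.git" "$demo/colleague"
cd "$demo/colleague"
git checkout -q -b dev
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin dev

# We clone the project, then the colleague pushes another commit.
git clone -q -b dev "$demo/remote.git" "$demo/ours"
echo "second line" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q origin dev

# Fetching updates what we know about the remote...
cd "$demo/ours"
git fetch origin
git log --oneline dev..origin/dev    # commits on the remote that our dev lacks

# ...but our own copy of the file still holds only "first line".
cat notes.txt
```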

To actually get the changes made on the remote repo to our device we use a different command, pull.

When we use the pull command we are merging our current branch with a certain branch on the remote repo. If the merge is successful our local branch will therefore be up-to-date with the remote repo branch.

You should ensure that all changes to your local repository branch have been committed (called a clean working copy) before you try to pull from the remote repo, so that uncommitted changes do not get in the way of the merge.
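One way to check for a clean working copy from a script is sketched below. It relies on git status --porcelain, which prints one line per changed or untracked file and nothing when the working copy is clean. The scratch repository and file name are hypothetical.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/project"
cd "$demo/project"
echo "draft" > analysis.txt    # a new file that has not been committed

# Empty output from --porcelain means there is nothing left to commit.
if [ -z "$(git status --porcelain)" ]; then
    state="clean"
else
    state="uncommitted changes"
fi
echo "Working copy: $state"
```

Here the check reports uncommitted changes, so the file should be committed before pulling.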

Because git pull merges the remote branch into your local branch, merge conflicts can occur. These can be handled in the same way as discussed in the ‘Merges’ section.

Performing a pull will automatically perform a fetch command first. We use fetch on its own to understand what has changed on the remote repo.

The full form of the pull command is:

$ git pull <remote name> <branch name>

If someone else has updated the dev branch of our repository we can pull it onto ours using:

$ git pull origin dev

This will then list all of the changes made to your local files, including insertions, deletions and files created or removed.
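The following sketch shows a pull bringing a colleague's change into our local files. As an assumption for illustration, a local bare repository plays the part of the remote, and the colleague, file name and commit messages are hypothetical.

```shell
set -e
# Identity used only for the commits in this sketch.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"      # stand-in for the remote repo

# A colleague creates the dev branch and pushes it.
git clone -q "$demo/remote.git" "$demo/colleague"
cd "$demo/colleague"
git checkout -q -b dev
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin dev

# We clone the project, then the colleague updates dev on the remote.
git clone -q -b dev "$demo/remote.git" "$demo/ours"
echo "second line" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q origin dev

# Pull fetches from origin and merges its dev branch into our current branch.
cd "$demo/ours"
git pull origin dev

# Our copy of the file now includes the colleague's second line.
cat notes.txt
```

Because our local dev branch had no commits of its own, the merge here is a simple fast-forward and cannot conflict.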

We can check the state of our local repo in relation to the remote repository using a command we have already introduced, status.

$ git status

For GitHub, the following guide covers the specifics of fetch and pull (plus a few other tips and tricks).

4.1 Tracking

We make the distinction between tracked and untracked branches. Tracked branches are those which are connected explicitly to a remote repository branch.

An untracked branch is one which is just local to the device being used. We can only pull/fetch and do other remote interacting operations with branches that are tracked.
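A short sketch of how to see which local branches are tracked, assuming a local bare repository as the remote and hypothetical branch and file names. Pushing with the -u option records the remote branch that a local branch tracks, and git branch -vv shows that link in square brackets.

```shell
set -e
# Identity used only for the commit in this sketch.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"      # stand-in for the remote repo
git clone -q "$demo/remote.git" "$demo/ours"
cd "$demo/ours"

# dev is pushed with -u, so it becomes a tracked branch.
git checkout -q -b dev
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q -u origin dev

# scratch exists only on this machine, so it is untracked.
git checkout -q -b scratch

# dev is listed with [origin/dev]; scratch has no remote branch beside it.
git branch -vv
```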

5 Push

Up until now we have connected our machine to the remote repository and used clone to get the project onto our computer.

After some time there were changes to the project on the remote repo, as our colleague had been working on a branch. This caused the remote repo to be different from our local repo.

We then used fetch and pull to get and then merge the changes from the remote repo into our local files. Great, we are up-to-date.

But now we want to work on the project ourselves. This means we need to make changes to our local repo, and then update the remote repo so it has our changes integrated.

This is done using the push command, which updates a remote branch with the commits from a local branch. It follows the syntax below.

$ git push <remote name> <branch name> 

If we were pushing to the dev branch of the remote repo with the default name origin, we would write:

$ git push origin dev

Again, the GitHub Docs have a guide available for pushing commits to a remote.
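As an illustrative sketch, the commands below commit a file locally and push it, then confirm that the remote's dev branch points at the same commit. The remote is a local bare repository used as a stand-in, and the file name and commit message are hypothetical.

```shell
set -e
# Identity used only for the commit in this sketch.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"      # stand-in for the remote repo
git clone -q "$demo/remote.git" "$demo/ours"
cd "$demo/ours"

# Make a change on a local dev branch and commit it.
git checkout -q -b dev
echo "summary table" > report.txt
git add report.txt
git commit -q -m "Add summary table"

# Before the push the remote has no branches, so this prints nothing.
git ls-remote origin

# Push the local dev branch to the remote called origin.
git push origin dev

# Afterwards the remote's dev branch and our dev branch share one commit id.
git ls-remote origin dev
git rev-parse dev
```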

Here is a diagram of the different repos and commands to move files between them.

repository flow

5.1 When to Push?

We want to make sure that all our changes to the code are tracked and recorded at all times, which is why we commit early and often. However, we do not want to be constantly pushing up to the remote repo if our work is unfinished. Below are general rules of thumb for when to commit and push:

  • commit when a “unit” of work is done; such as a new function made or part of a script written.
  • push when a complete feature is made. This feature shouldn’t contain any bugs and should work well with the rest of the code.

Remember, we are documenting the code with each commit message, which allows us to trace when things are added. This allows us to work backwards if we made a mistake.

We then push when we are happy with the feature we have created and want others to be able to access the changes made to a specific branch.
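The rhythm of several commits followed by one push might look like the sketch below. It assumes a local bare repository as the remote and hypothetical file contents and commit messages; counting the commits in origin/dev..dev is just one way to see how much work has not been pushed yet.

```shell
set -e
# Identity used only for the commits in this sketch.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"      # stand-in for the remote repo
git clone -q "$demo/remote.git" "$demo/ours"
cd "$demo/ours"
git checkout -q -b dev
echo "load data" > pipeline.txt
git add pipeline.txt
git commit -q -m "Add data loading step"
git push -q -u origin dev

# Commit each unit of work as it is finished...
echo "clean data" >> pipeline.txt
git commit -q -a -m "Add data cleaning step"
echo "fit model" >> pipeline.txt
git commit -q -a -m "Add model fitting step"

# ...and count the commits that exist locally but not on the remote.
before=$(git rev-list --count origin/dev..dev)
echo "Commits not yet pushed: $before"

# Push once the feature is complete.
git push -q origin dev
after=$(git rev-list --count origin/dev..dev)
echo "Commits not yet pushed: $after"
```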

6 Existing Projects

If you want to create a Git repo for a project you have already started working on, you can do so easily.

REMEMBER: Do not use the GitHub account for your actual work. Check what you are authorised to use with your IT team.

  1. Ensure that you have a Git area on your local machine. This is achieved using git init; you should have one area to store all your projects.

  2. Create a repo on the remote service you are using.

  3. Clone that repo into your git area such that a folder connected to the remote exists.

  4. Move the files from the project you are working on into the new cloned folder.

  5. add, commit and push the files up to the remote repo.

This means you can now move any of your non-sensitive work (such as your training materials) without having to start from scratch!
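Steps 2 to 5 above can be sketched as follows. This is an illustration under assumptions rather than a definitive recipe: a local bare repository stands in for the repo created on the remote service, and the folder names, file name and commit message are hypothetical.

```shell
set -e
# Identity used only for the commit in this sketch.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)

# An existing project that is not yet under version control.
mkdir "$demo/old-project"
echo "print('hello')" > "$demo/old-project/analysis.py"

# Step 2: stand-in for the repo created on the remote service.
git init -q --bare "$demo/my-project.git"

# Step 3: clone it into the folder that holds all your projects.
mkdir "$demo/git-area"
cd "$demo/git-area"
git clone -q "$demo/my-project.git"

# Step 4: move the existing files into the cloned folder
# (the * pattern does not match hidden files).
mv "$demo/old-project/"* my-project/

# Step 5: add, commit and push; HEAD here means the current branch.
cd my-project
git add .
git commit -q -m "Add existing project files"
git push -q origin HEAD
```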

7 Merging

We can now update remote repo branches with our local changes by using push. This allows us to alter the contents of a branch, but not to combine features from our different branches. For this we want to merge branches in the remote repo.

This is very similar to the merging we did previously, which was done locally on a computer. However, this time it is done using the remote repo, which means we can combine different branches worked on by different people.

Now you can be working on feature A while your colleague works on feature B, and you can then merge your work so that a branch has both feature A and feature B.

The source branch is the branch that is merged into the target branch. For example, if you have been working on some new Feature C and you want to merge it into the main development branch dev then you would do the merge shown above.

When creating a pull request (called a merge request in GitLab) there is a range of information you can add to it, which allows clearer documentation and customisation. You can select the title of the request and add a description of the merge, and why it is necessary.

Shows GitHub page where merges between branches can be made. The user has the option to select which branch is merged into which.

In addition, you can determine whether the source branch will be removed after the merge occurs. This can be useful if a feature is fully complete, but may not be the best idea if you will still need to work on elements of that source branch.

Setting up a pull request does not action the merge; it documents the intention to merge and tracks the relevant commits made to the source branch.

If you need an in-depth guide to creating and allocating pull requests, please follow the guidance at GitHub Docs.

Peer reviewing code is an important part of working collaboratively, and a pull request is a great time to put this into practice.

Merge conflicts can be resolved in GitHub itself. In this case, the site will show us, for each file and each conflict, the difference between the source and target branches. You can then select which version of that section of code you would like to use and commit this version.

7.1 Alternative Method

Merges between branches do not need to be done via the GitHub webpage. Instead, they can be performed on the command line using the methods we have already discussed.

First you would add then commit to the branch you are working on.

Then you would fetch/pull the target branch you would like to merge to.

You would then merge locally between your source and target branch, resolving any conflicts.

And finally you would push up to the remote target branch.
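The four steps above can be sketched end to end as below. As assumptions for illustration, a local bare repository stands in for the remote, the source branch is called feature-c, the target branch is dev, and a hypothetical colleague pushes feature B to dev while we work.

```shell
set -e
# Identity used only for the commits in this sketch.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"      # stand-in for the remote repo

# A colleague sets up the dev branch on the remote.
git clone -q "$demo/remote.git" "$demo/colleague"
cd "$demo/colleague"
git checkout -q -b dev
echo "shared notes" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin dev

# Step 1: we add and commit on our source branch.
git clone -q -b dev "$demo/remote.git" "$demo/ours"
cd "$demo/ours"
git checkout -q -b feature-c
echo "feature C" > feature_c.txt
git add feature_c.txt
git commit -q -m "Add feature C"

# Meanwhile the colleague pushes feature B to dev.
cd "$demo/colleague"
echo "feature B" > feature_b.txt
git add feature_b.txt
git commit -q -m "Add feature B"
git push -q origin dev

# Step 2: bring our local copy of the target branch up to date.
cd "$demo/ours"
git checkout -q dev
git pull -q origin dev

# Step 3: merge the source branch into the target branch locally.
git merge -q --no-edit feature-c

# Step 4: push the merged target branch up to the remote.
git push -q origin dev

# The remote dev branch now holds both features.
git ls-tree --name-only origin/dev
```

The two features touch different files here, so the merge completes without conflicts; when both sides change the same lines, git merge stops and asks for the conflicts to be resolved before committing.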

8 Example

Below is a diagram of what happens with two people working on the same project. Time goes from left to right. Bob and Alice are both working on different parts of the project, but there could be conflicts.

Alice not only has to submit a merge request; the request also needs to be granted before the source and target branches merge.

Shows diagram of two people working on the same project with a remote repo.

9 Exercise

This will require you to have a GitHub account, or to use your department’s guidance on remote git repositories.

REMEMBER: Do not use the GitHub account for your actual work.

  1. Create your own project within GitHub called “practice”.

  2. clone your newly created project onto your local device. You will need to be in your Git repo locally to do this in Git Bash.

  3. Create and checkout a new branch dev locally.

  4. Create a new text file called development_feature.txt in the repo. Add in some text of your choice to the file and save it.

  5. add and commit this change with a relevant message. push this commit up to the remote repo.

  6. On GitHub, create a pull request between the source branch dev and target branch master (newer GitHub repositories name this branch main). Using the GitHub website, merge the two branches.

  7. Solve any merge conflicts.

  8. Using Git Bash, change to the master branch.

  9. Check the status of the repo.

  10. fetch and pull any changes to the master branch to your machine, then open development_feature.txt to check that your text is there.

Solutions to the steps above:

  1. On your GitHub account select either New > Repository or Create New Repo.

  2. Copy the HTTPS URL at the top of the project.

Open Git Bash in your local Git Area.

Write the following code:

$ git clone <paste URL here>

Remember you can paste into Git Bash using Ctrl+Shift+Insert.

  3. Write in Git Bash:

$ git checkout -b dev

  4. Either:

Go to the file location in your file explorer, and create a new file with Notepad, entering the relevant text.

Or:

In Git Bash move to the location you want the file and type:

$ cat > development_feature.txt
"Your input text here"

Then press Enter followed by Ctrl+D to finish.

  5. Use the following commands:

$ git add development_feature.txt

$ git commit -m "relevant commit message"

$ git push origin dev

  6. Go onto the GitHub website, to the project, and create a pull request, entering the source branch (dev) and the target branch (master). Then merge it.

  7. Resolve any conflicts using the GitHub IDE, or locally.

  8. Write the following command in Git Bash:

$ git checkout master

  9. Write the following command in Git Bash:

$ git status

  10. Write the following commands in Git Bash:

$ git fetch

$ git pull origin master

Open the file either by clicking it in File Explorer, or by typing in Git Bash:

$ cat development_feature.txt

Reuse

Open Government Licence 3.0