Open at the start

Are you intending to develop in the open? A guide to starting out with an open source project.

Guidance for Coding in the Open

Overview

The preferred approach to open source development is to be open from the start [1]. Given the ICO guidance on Freedom of Information (FoI) requests [2], developing analytical code in private is not without risk. You may need to turn around an FoI request within 20 working days, and such a request could place significant demand on a team to anonymise and quality assure a project. Wherever possible, it is preferable to take steps that allow you to work under the scrutiny of the open source community.

A key consideration when releasing code at the end of a project is whether to release it with the entire commit history. Releasing at intervals with an entire commit history vastly increases the number of lines of code to review before each release. Take the time to assess the risks associated with a project and develop accordingly; the decision should not default to private in all cases.

This guidance is intended to help developers at the Data Science Campus when setting up a public repository on our Campus Team GitHub. It discusses some suggested practice for setting up a project board, but it does not document the current state of the GitHub user interface, as this changes frequently.

Before You Start

Work with the Campus delivery team to ensure appropriate project governance is set up. This should include (but is not limited to):

  • Complete an ethical self-assessment [3].
  • Gain written agreement from the project lead to code in the open.
  • Consider the needs of the development team.
    • Does everyone understand the risks with the project and how to mitigate them?
    • Could some dedicated pair-programming help to standardise the development approach?
    • Could revisiting the Campus GitHub training [4] & SyOps help?
    • Consider a Risks log that sets out the sensitive aspects of a project. Ensure the dev team have access to this document and that it is maintained.
    • This can also help to ensure everyone is up and running with key mitigations such as pre-commit [5] and that style conventions are adhered to.
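As one illustration of the pre-commit mitigation mentioned above, here is a minimal sketch of a custom check that pre-commit could run before each commit. It assumes the hook receives the staged file names as command-line arguments, which is how pre-commit calls hooks by default. The patterns and the names `SENSITIVE_PATTERNS` and `find_sensitive_lines` are hypothetical; a real check would need to reflect the sensitive items recorded in your own Risks log.

```python
"""Sketch of a local pre-commit check that flags possibly sensitive text."""
import re
import sys
from pathlib import Path

# Hypothetical patterns; replace with the sensitive items in your Risks log.
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "hard-coded secret": re.compile(
        r"(?i)\b(password|api_key|secret)\s*=\s*['\"]"
    ),
}


def find_sensitive_lines(text):
    """Return (line_number, label) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, label))
    return hits


def main(filenames):
    """Check each file; return 1 (blocking the commit) if anything is found."""
    failed = False
    for name in filenames:
        text = Path(name).read_text(encoding="utf-8", errors="ignore")
        for number, label in find_sensitive_lines(text):
            print(f"{name}:{number}: possible {label}")
            failed = True
    return 1 if failed else 0


# Only run the check when file names were actually passed in.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

A script like this could be registered as a `repo: local` hook in the project's `.pre-commit-config.yaml`. It is a starting point only and is no substitute for reviewing changes before they are pushed to a public repository.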

Repository Setup

  • Choose a repository template that suits the complexity of the project. There are several to choose from, and the Campus’ minimal repository template [6] may be useful. For more involved projects, you may wish to consider GovCookieCutter [7].
  • Configure branch protection [8] under your repository settings.
  • Ensure a license file is included. The MIT license is adequate for most purposes.
  • Documentation or other Intellectual Property beyond code is distributed under OGL 3.0 [9].
  • It’s a good idea to provide issue and pull request (PR) templates to help nudge collaborators (internal or external) towards the habits you expect.
  • Include a code of conduct to help safeguard against poor online behaviour.
  • Include contribution guidance so that team members and external collaborators understand the working conventions of your repository.
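The file-based items in the list above can be checked automatically. The sketch below reports which recommended files a local copy of a repository is missing. The file names and search folders are assumptions based on common GitHub conventions, not a Campus requirement, and `missing_items` is a hypothetical helper name.

```python
"""Sketch: report which recommended community files a repository lacks."""
from pathlib import Path

# Folders to search, relative to the repository root (assumed conventions).
SEARCH_DIRS = ["", ".github", "docs"]

# Lower-case file names that would satisfy each checklist item (assumed).
RECOMMENDED = {
    "license": {"license", "license.md", "license.txt"},
    "code of conduct": {"code_of_conduct.md"},
    "contributing guide": {"contributing.md"},
    "pull request template": {"pull_request_template.md"},
}


def missing_items(repo_root):
    """Return the checklist items with no matching file in the repository."""
    root = Path(repo_root)
    found = set()
    for sub in SEARCH_DIRS:
        folder = root / sub
        if folder.is_dir():
            # Compare names case-insensitively.
            found.update(
                p.name.lower() for p in folder.iterdir() if p.is_file()
            )
    return [item for item, names in RECOMMENDED.items() if not names & found]
```

Calling `missing_items(".")` from the root of a cloned repository returns a list of the items still to add. Settings such as branch protection are not stored as files, so they still need to be checked in the repository settings.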

External Collaboration

  • In an open repo, any GitHub user can fork the codebase and raise a PR.
  • Colleagues should consider that they are representing the ONS [10] in all their interactions on our corporate GitHub teams account. Refer to the GitHub SyOps for more information.
  • When collaborating with external agencies, consider whether access to the public repository is enough, or whether access to a private project board needs to be approved by the Campus delivery team.

Project Board

Depending on the sensitivity of the work undertaken, you may need to decide whether a public or private project board (kanban) [11] is needed. A public code repository can be linked to a private project board. To facilitate this way of working, please consider:

  • Issues on the public repo are visible to all GitHub users. Adding them to a private project board does not change this: the issues stay public, and the board and its other content stay private.
  • Private project boards allow the creation of draft tickets (sometimes referred to as notes). Converting a note to an issue on a public repo will change its visibility.
  • For more information on the visibility of a project, please see the GitHub documentation [12].
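The visibility rules above can be summarised as a toy model. This is only an illustration of the reasoning in this guidance and does not call the GitHub API. The `Item` class and both function names are hypothetical.

```python
"""Sketch: a toy model of the visibility rules described in this section."""
from dataclasses import dataclass


@dataclass
class Item:
    kind: str                  # "draft" or "issue"
    repo_public: bool = False  # only meaningful for issues


def is_publicly_visible(item, board_public):
    """Issues follow their repository; drafts follow their project board."""
    if item.kind == "issue":
        return item.repo_public
    if item.kind == "draft":
        return board_public
    raise ValueError(f"unknown item kind: {item.kind!r}")


def convert_to_issue(item, repo_public):
    """Converting a draft creates an issue in the chosen repository."""
    return Item(kind="issue", repo_public=repo_public)


# A draft on a private board is hidden, but becomes visible to everyone
# once it is converted to an issue on a public repository.
draft = Item(kind="draft")
print(is_publicly_visible(draft, board_public=False))
issue = convert_to_issue(draft, repo_public=True)
print(is_publicly_visible(issue, board_public=False))
```

The point to take away is that the privacy of the board does not protect an issue that lives in a public repository, so check a draft for sensitive content before converting it.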

References

[1]
Analytical Standards and Pipelines team at the Office for National Statistics (ONS), “Open sourcing analytical code.” https://analysisfunction.civilservice.gov.uk/policy-store/open-sourcing-analytical-code/
[2]
Information Commissioner’s Office, “How to access information from a public authority.” https://ico.org.uk/for-the-public/official-information/
[3]
[4]
Office for National Statistics Data Science Campus, “Data science campus GitHub training.” https://github.com/datasciencecampus/DSCA_GitHub_Training
[5]
A. Sottile et al., “Pre-commit.” https://pre-commit.com/
[6]
[7]
Office for National Statistics Best Practice & Impact, “Govcookiecutter on GitHub.” https://github.com/best-practice-and-impact/govcookiecutter
[8]
[9]
The National Archives, “Open Government Licence 3.0.” https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
[10]
Civil Service, “Civil Service code.” https://www.gov.uk/government/publications/civil-service-code
[11]
[12]