At a glance
Open source code can be viewed, downloaded and used by anyone. This guide outlines the benefits of publishing analytical code as open source and how to do it within the NHSBSA.
What is open source?
Open source is a way of developing and distributing software in which the code can be viewed, downloaded and used by anyone. Anyone can propose contributions to open source code, subject to approval by the code owner (typically via pull requests), which promotes collaboration across teams and organisations. Open source code is often hosted on platforms like GitHub, where a community of developers can share it and work on it together. It is typically released under an open source licence, which specifies how others can use, modify and share the code.
Why publish analytical code as open source?
Open source code:
- allows others to see how analyses are performed, increasing transparency and trust in results
- shows the workings behind published analyses (reports, data, dashboards, etc), making it easier to verify and reproduce results
- enables others to reuse code, which increases its value
- invites open review and feedback, which help identify issues and improve code
- encourages collaboration and knowledge sharing across teams and organisations
How to publish analytical code openly
There are two main routes to open-sourcing analytical code:
- Work in the open from the start: this is the ideal scenario, which maximises the benefits of open source, but it is usually only suitable when you are working on open, published datasets (for example, data from the NHSBSA Open Data Portal).
- Retrospectively publish closed-source codebases: many of the datasets we analyse at the NHSBSA are not publicly available, so we often work within secure internal environments to protect sensitive and confidential data. It is still recommended to retrospectively publish your analytical code, although this may need additional review (see below for details).
To publish analytical code:
- Include a file named LICENCE (British English spelling) with the complete licence text in your repository, not just a statement in the README.
- Document your code and provide clear instructions for use, dependencies, and purpose. Ensure the code itself is written to be easy to understand and maintain, and include a CONTRIBUTE file to guide others who may want to contribute.
- Host your code on the NHSBSA Data Analytics GitHub.
- Follow NHSBSA policies and approval processes for open source coding.
- Use secret management systems and keep credentials out of repositories. Check the commit history as well as the current version of the codebase. See GOV.UK guidance.
- Check for sensitive information: ensure no personal, confidential, or proprietary data is included, again in the commit history as well as the current version of the codebase.
See the GOV.UK Service Manual and NHS Digital RAP Community guidance for detailed steps and best practices.
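The file and credential checks in the list above can be partly automated. The following is a minimal sketch in Python, not an NHSBSA tool: the function names, the choice to match files by stem (so that LICENCE.md also counts), and the two regular expressions are all illustrative assumptions. It does not replace a dedicated secret scanner or a manual review.

```python
import re
from pathlib import Path

# File names taken from the checklist above. Matching on the stem, so that
# LICENCE, LICENCE.md and LICENCE.txt all count, is an assumption.
REQUIRED_STEMS = ("LICENCE", "README", "CONTRIBUTE")

# Illustrative patterns only; a dedicated secret scanner covers far more cases
# and these will produce false positives.
SECRET_PATTERNS = (
    re.compile(r"(password|passwd|secret|token|api[_-]?key)\s*[:=]\s*\S+", re.IGNORECASE),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
)


def missing_publication_files(repo_root):
    """Return the required file stems not found at the top level of repo_root."""
    present = {p.stem.upper() for p in Path(repo_root).iterdir() if p.is_file()}
    return [stem for stem in REQUIRED_STEMS if stem not in present]


def find_secret_like_lines(text):
    """Return (line_number, line) pairs for lines matching a secret-like pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

To cover the history as well as the current version, one option is to pass the text output of `git log -p --all` to `find_secret_like_lines`, since that output contains every added and removed line across all branches.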
Retrospective open sourcing
If you are retrospectively open sourcing your code (after having developed it closed source), there are additional considerations:
- Follow the NHSBSA Retrospective Open Sourcing Guidance (only visible to internal NHSBSA colleagues).
- Consider whether you would benefit from using a ‘fit-for-publishing checklist’ to ensure your code is ready for release, including internal and external review steps. See the NHS Digital RAP Community Fit for publishing checklist.
Licensing open code
When publishing code or content openly, it is essential to include a clear licence to specify how others can use, modify, and share your work. Open code should include a LICENCE file with a copyright notice. The year in the notice should be the year of first publication, or a range of years if the code has been significantly updated since.
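As an illustration of the year rule, the sketch below builds a notice with either a single year or a range. The function name, the exact wording of the notice, and the holder name in the usage note are hypothetical; use the wording your chosen licence and organisation require.

```python
def copyright_notice(holder, first_published, last_significant_update=None):
    """Build a one-line copyright notice with a single year or a year range."""
    if last_significant_update is None or last_significant_update == first_published:
        # No significant update since first publication: one year only.
        years = str(first_published)
    elif last_significant_update < first_published:
        raise ValueError("last_significant_update cannot precede first_published")
    else:
        # Significantly updated later: first publication year to update year.
        years = f"{first_published}-{last_significant_update}"
    return f"Copyright {years} {holder}"
```

For example, `copyright_notice("Example Organisation", 2023)` gives `Copyright 2023 Example Organisation`, while adding `last_significant_update=2025` gives `Copyright 2023-2025 Example Organisation`.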
The NHSBSA uses two types of licences to serve different purposes:
- For software/code use the Apache 2 Licence. This is generally the licence to use in the code repository via the LICENCE file.
- For published content, documentation, and data use the Open Government Licence v3.0 (OGL v3). A link to the OGL v3 licence should be included in the footer or main documentation of published outputs. The NHSBSA uses a standard footer format which includes the appropriate licensing information and can be reused across NHSBSA products.
For more details on licensing, see the NHSBSA Digital Playbook.