Before beginning work on your problem definition, you will need to update JIRA and set up a GitHub repository to store and monitor code.
Setting up the resources
- Remember to set the new GitHub repository to private.
- Invite relevant Data Science team members as collaborators and grant the wider team read access.
- Code should be written to a professional standard, on the assumption that it could be made public.
You must check that the repository does not compromise data security before making it public.
Git commit history
Your git commit history will be made public.
- Commit messages can be changed later, but you are advised to follow good commit practice from the start.
- Each commit should be limited to a single change.
- Commit history can be removed, but this is not recommended as we strive to be transparent.
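The "single change per commit" advice above can be sketched as follows. This is a minimal illustration run in a throwaway repository, not a prescribed workflow; the file names, commit messages and the placeholder Git identity are all assumptions made for the example.

```shell
# Work in a temporary repository so the demo cannot touch real project code.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Analyst"        # placeholder identity (assumption)
git config user.email "analyst@example.org"   # placeholder identity (assumption)

# Change 1: add a data-loading helper and commit it on its own.
echo 'load_data <- function(path) read.csv(path)' > load_data.R
git add load_data.R
git commit --quiet -m "Add load_data() helper for reading CSV input"

# Change 2: an unrelated housekeeping change gets its own commit.
echo '.Rhistory' > .gitignore
git add .gitignore
git commit --quiet -m "Ignore .Rhistory files"

# Each line of the log now describes exactly one change.
git log --oneline
```

Keeping commits this small means that, if the history is later made public, each message reads as a clear and self-contained record of what changed.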
Never expose Personally Identifiable Information (PII) on GitHub
Make sure that you never push Personally Identifiable Information (PII) to the GitHub repository.
- Use Gitleaks to reduce the risk of pushing PII to Git.
- Never write PII directly in R scripts.
- Always reference PII from the database.
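As a rough illustration of checking for PII before it reaches the repository, the sketch below looks for email-like strings in staged changes. It is a complement to a dedicated scanner, not a replacement for one: the function names `has_email_like` and `check_staged` are hypothetical, the pattern is deliberately crude, and it covers only one kind of PII.

```shell
# Hypothetical helper: reads text on stdin and succeeds (exit 0)
# if any line contains an email-like string.
has_email_like() {
  grep -E -q '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}'
}

# Hypothetical check intended to be called from .git/hooks/pre-commit.
# It inspects only the added lines of the staged diff and returns
# non-zero (blocking the commit) when something email-like is found.
check_staged() {
  if git diff --cached --unified=0 | grep '^+' | has_email_like; then
    echo "Possible PII (email-like string) in staged changes; commit blocked." >&2
    return 1
  fi
}

# Quick demonstration on a made-up value (example.com is a reserved domain).
if printf 'contact <- "jane.doe@example.com"\n' | has_email_like; then
  echo "email-like string detected"
fi
```

A check like this will miss names, addresses and identifiers, so the safest approach remains the one above: keep PII in the database and reference it from there.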
Further guidance on setting up resources can be found on the internal Data Science Wiki. After creating all resources, remember to:
- provide regular updates to the JIRA ticket
- regularly push code additions and alterations to the GitHub repository