Github Resources

CB Genetics Working Best Practices

Best practices for working on analyses in github: 

  • All repositories with NOAA work should be housed in Github Enterprise under the NWFSC organization, NOT in your personal/NOAA github account. This ensures that code will be retained and remain accessible even if the author leaves the agency.

  • Work and commit in the main branch of a repo (see special cases for branching and forking below)

  • Commits

    • Perform commits daily at minimum when working on analysis

    • Make sure to push to origin (if working from github desktop) daily as well!

    • If working in a branch, commit daily to the branch, and determine how often to merge to main

  • Forking

    • Generally do not use!

    • Forking may be useful if contributors to the repository are outside of enterprise, but you must create and uphold good team standards on how often forks are merged back into the main repository.

  • Branching

    • Use sparingly!

    • Create a branch for a specific task, complete task, merge to main immediately. 

      • Use when you are making a change that could “break” the main repo for some reason

      • Branches are easy to “get lost in” and can create confusion and complicate organization of your repo

  • Projects can be created within our CB-genetics team. Use them to organize tasks or discussions associated with one or multiple repositories.

CB Genetics Github Repository Guidelines

See How to: Create a github repository in enterprise for a full tutorial walk through of creating a github repository.

From the NMFS Github Guide, your repository should include a README, the government product DISCLAIMER, and LICENSE file. In addition, all CB Genetics github repositories should include the following components:

README

The README file should provide a detailed description of the repository. The contents of the README will vary depending on the project type and degree of completeness, but should include a few common elements.

Recommended README components:

  • Primary contacts: Include name and email of at least one primary contact

  • Citation/Reference/DOI: If the work is something that can be cited, then also include citation information and a DOI. This will be more common in repositories containing finished projects.

  • Objective: Briefly describe the objective of the project/analysis to provide context for your repository.

  • Methods: Include relevant information regarding the methods used in the analysis for this project repository. This can be broken up in sections if needed for a larger repository that houses multiple analyses.

  • Major Results: For projects in mostly finished states, add major takeaways from your analysis. Including figures is optional.

  • Disclaimer: Repositories and web content shared on GitHub should make it clear to the audience that no information should be considered or interpreted as official communication of NOAA. The simplest method for doing this is to include the disclaimer text as a footnote within the repository’s README.md.

DISCLAIMER

This repository is a scientific product and is not official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA GitHub project code is provided on an ‘as is’ basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this GitHub project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.

LICENSE File

You should add an open license file to the GitHub repository before work begins. This establishes from the moment work starts that all work is open source (as required for publicly released federally funded work).

To add a license to your repository, navigate to the base level of your repository (so not in a subdirectory) and add a file called LICENSE. Under the file name, click Choose a license template, select Apache-2.0 license.

Gitignore

.gitignore files are used to tell Github to ignore some files in your working directory when pushing to Github remote. This is useful because Github cannot handle large files (such as fastq) and sometimes we don’t want every file in our working directory to be pushed up to Github.

A .gitignore file should be created when you first set up your repository - often we choose a template like the R template because we are using R code in our repository and the .gitignore file will ignore syncing the temporary files that are created when working in R.

If you do not see a .gitignore file in your repository directory:

  1. Check to see that it is not just a hidden file- On a mac, press Command + Shift + . to show hidden files
  2. If no .gitignore file exists, create one from a simple text file and save it as “.gitignore” in your main repository directory. You can use a template for a certain language from the list here or can create a useful .gitignore for your project with one or more coding languages here.

To edit your .gitignore for specific directories or file types, use the examples below:

To ignore a whole directory

**/directory_name 

To ignore the files within a directory (the directory will show up in github remote but will be empty- if for some reason you want to preserve directory structure)

directory_name/** 

To ignore a specific file extension, from anywhere in your repository directory (this will also ignore these file extensions when they exist in sub-diretories)

*.fastq

Assign CB-Genetics Team

Assigning the CB-Genetics team to your repository keeps our group’s work organized and easy to find within the NWFSC enterprise organization. Simply navigate to the NWFSC enterprise organization, click on the Teams tab, expand the CB team to find the CB-Genetics team within it, then click on Repositories.

Repositories in the CB-Genetics Team

To do this: Settings -> Collaborators and teams -> Manage access -> Add teams -> CB-Genetics (You choose permissions when establishing this access.)

For a guided tutorial, see How to: Create a github repository in enterprise

Add the nwfsc-cb-genetics Topic to Your Repository

Topics are used in Github like hashtags are used in social media. It is another way to categorize repositories and organize them. For example, the NWFSC Github Organization landing page has a link to all repositories with the nwfsc-cb-genetics topic.

The “nwfsc-cb-genetics” topic should be added to all repositories in our group.

To do this: navigate to the main page of the repository -> in the top right corner, to the right of “About”, clikc the gear symbol -> under “Topics” start to type nwfsc-cb-genetics, click the matching topic from the dropdown menu -> click “Save changes”

For a guided tutorial, see How to: Create a github repository in enterprise

NOAA Github Repository Guidelines

From the NMFS Github Guide, all repositories, regardless of purpose, must follow these general guidelines:

  • PII and BII should never be shared (on purpose or inadvertently) on GitHub regardless of whether the repository is in a private or public repository.
  • No sensitive information should be shared in repositories, including (but is not limited to) usernames, passwords, login information, port numbers, IP addresses, server names, Application Programming Interface (API) keys, Personally Identifiable Information (PII), Business Identifiable Information (BII), or confidential data.
  • GitHub is not a back-up service nor is it a data repository with archiving. Other tools are designed for this purpose.
  • Only scientific content that can be reasonably classified as FISMA Low should be shared on GitHub.
  • Repositories that have code that interacts with APIs using IP addresses, usernames, passwords, secrets, or credentials must take steps to prevent committing of “secrets” to GitHub.