Public Code Repository Best Practices

The Johns Hopkins OSPO recommends a set of best practices for researchers maintaining publicly available code repositories in our GitHub Enterprise instance or other version control systems. For research software code repositories, we recommend adding a license, a README, a CONTRIBUTING file, a version, and identifiers such as DOIs, SWHIDs, and/or citations. Publicly available repositories that aren’t research software can still benefit from a license, a README, and a version.

Below you’ll find information about each recommended best practice, and you can always contact the OSPO at ospo@jhu.edu if you have questions or need assistance implementing these recommendations.

Licensing

The rights to use, inspect, distribute, and modify open-source code are granted by the open-source license, which is both an intellectual property license and a legal agreement. If your code is stored in a publicly available repository on GitHub, GitLab, Bitbucket, or elsewhere, you should add a license to it. It’s a common misconception that code without a license is free for others to use. In fact, the opposite is true! Copyright law in the United States grants you copyright protection for your code as soon as you create it. Without a license, others may not legally use your code without permission.

The OSPO has a video Explainer about open source licensing and a comprehensive Licensing guide developed in collaboration with JHTV that provides information about types of OS licenses, choosing the right license for your project, and the differences among licenses, copyright, patents, and trademarks.

Licensing FAQ
1. What exactly is Open-Source Software and what rights does it grant? Open-source software is defined as software with publicly available source code, the human-readable instructions written in languages like Python or C++. It is specifically licensed to allow anyone to use, inspect, modify, and distribute the code. These rights are legally granted through an open-source license, which acts as an intellectual property agreement.
2. Why should I add an open source license to my research software? For researchers, “research software” includes the tools, code, or libraries used to generate or analyze data. Making this software open source is beneficial, leading to greater transparency in the research process and creating more opportunities for collaboration with others in the field.
3. What are the primary types of open-source licenses I can choose from? Copyleft (or Reciprocal) Licenses require anyone who distributes a modified version of the software to release those changes under the same license. Examples of copyleft licenses include the GPL and AGPL. Permissive Licenses offer significant freedom, allowing the software to be used for various purposes, including integration into commercial products. Examples include the Apache, BSD, and MIT licenses.
4. Is open source just about the legal license? While the license is the legal foundation, open source also represents a set of core values. These include collaboration, transparency, and community. Effective projects often feature a transparent development process, support a strong community of contributors, and use participatory governance for decision-making.
5. What happens if I don’t add a license to my code? It’s a common misconception that code without a license is free for others to use. In fact, the opposite is true! Copyright law in the United States grants you copyright protection for your code as soon as you create it. Without a license, others may not legally use your code without permission. Note that sharing code on platforms like GitHub may grant certain limited rights (such as viewing and copying) through the platform’s terms of service, but this doesn’t replace an open-source license.
6. Should I create my own open-source license? No, it’s not recommended to create your own open-source license or use licenses that weren’t intended for code, such as Creative Commons. Instead, use established licenses that have been legally vetted and are widely recognized. A good starting point for choosing an open-source license is the website https://choosealicense.com, or you can view a complete list of licenses approved by the Open Source Initiative at https://opensource.org.
7. What if I want to restrict commercial use of my code? If you’d like to restrict use of your code to non-commercial applications, consider licenses such as the Johns Hopkins University Academic Software License or the Business Source License (BUSL). However, it’s important to note that these licenses are considered “source-available,” not open source, as they include usage restrictions.
8. How do I actually apply a license to my code repository? If you’re using GitHub, you can add a known license file when creating a new repository. If you’re working with an existing repository, the basic steps are to create a license file named LICENSE or LICENSE.md in the root directory of your repository, copy and paste the license text into that file, and then commit the new file and push the changes.
9. Where can I get help with licensing questions? The OSPO has an extensive Licensing FAQ on its website at https://ospo.library.jhu.edu/licensing/, or you can reach out to ospo@jhu.edu to schedule a time to talk through your projects with an OSPO staff member. For specific legal guidance, Johns Hopkins Tech Ventures (JHTV) is also available at https://ventures.jhu.edu.
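
The steps in question 8 for an existing repository can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names find_license_file and add_license_file are hypothetical, the license text is whatever you copied from an established license, and the script does not commit or push anything for you.

```python
from pathlib import Path
from typing import Optional

# File names taken from the steps above; other spellings are not checked here.
LICENSE_NAMES = ("LICENSE", "LICENSE.md")


def find_license_file(repo_root: str) -> Optional[Path]:
    """Return the first license file found in the repository root, or None."""
    root = Path(repo_root)
    for name in LICENSE_NAMES:
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None


def add_license_file(repo_root: str, license_text: str) -> Path:
    """Write license_text to LICENSE in the root unless a license file already exists."""
    existing = find_license_file(repo_root)
    if existing is not None:
        # Never overwrite an existing license; return it unchanged.
        return existing
    target = Path(repo_root) / "LICENSE"
    target.write_text(license_text, encoding="utf-8")
    return target
```

After the file exists, the remaining step is the usual one: commit the new file to your repository and push the change.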

README

A README serves as the introduction to your software and provides users with important information on how to get started. Add a README to your repository to help others understand the purpose of your code, why it was created, what it does, and how it works; provide clear instructions on installation, usage, and dependencies; and encourage collaboration by providing clear guidelines for potential contributors. The OSPO has a video Explainer that provides a quick introduction to writing READMEs, and a GitHub template to copy for your own repository. The website Markdown Guide is a great resource for getting started with Markdown, a useful formatting language for GitHub.

README FAQ
1. What is the main purpose of a README file? A README serves as the primary introduction to your software and provides users with the essential information needed to get started. It helps others understand the purpose of your code—why it was created, what it does, and how it works—while providing clear instructions for installation and collaboration.
2. How should I name and format my README file? You should name the file README and can use either plain text or a markup language.
3. What essential sections should a good README include? According to the OSPO’s guidelines, a comprehensive README should feature:
  
  Identification: The project’s name and logo as the first heading, followed by a project URL and the names of the authors or owners.
  
  Evaluation: A description of what the project achieves (focusing on the “what,” not the “how”) and its target audience.
  
  Usage: A list of prerequisites and a simple “one-time” setup guide to help a reader go from having the files to using the project for the first time.
  
  Engagement: Links to full documentation, help resources (like mailing lists or forums), and clear guidelines for contributing or reporting bugs.
4. When and how often should I update my README? You should set a reminder to review your README on a regular basis. Key times to update it include annually, during a major release, or upon reaching other significant project milestones.
5. Do I need a README if my code isn’t open source? Yes! READMEs aren’t just for open source projects. Any code that will be shared publicly or privately can benefit from a README. Whether you’re sharing code with collaborators, your research team, or keeping it for your future self, a README helps anyone understand the purpose of your code, why it was created, what it does, and how it works.
6. How do I write a good project description for my README? When describing your project, focus on what it does or achieves—not how it does it. This is not the place to talk about languages, technologies, or tools; that comes later in the documentation. If you’re having trouble getting started, try a “MadLib” approach: “With [project name], you can [verb] [plural noun].” You should also identify who the project was built for or is used by if it has a specific audience. For open-source software, this section should also describe who may use the project and under what terms—be sure to name the license.
7. What should my installation instructions include in the README? Your installation instructions should help your reader go from having your project’s files to using the project for the first time. These instructions should stop once the project works once—extended usage instructions really belong in your full documentation. Always make sure you test your installation and setup steps. If you have somebody outside the project test them, you can get even better and more honest feedback.
8. What “engagement” information should I include to help users interact with my project? To help readers engage with your project, include several key pieces of information:
  
  Tell them where to go for more project documentation (this might include your website, documentation files, or companion files like your license or code of conduct)
  
  Tell your readers where to go for help (descriptions and links to mailing lists, issue trackers, or forums)
  
  Tell your audience how they can help (for open source software, link to and summarize the project’s contributors guide, or describe how and where you want contributors to add to your project)
  
  Describe how and where bugs should be reported
9. How do I handle READMEs for projects with multiple repositories? For projects with more than one repository, create a README for the overall project. You can choose one repository to be the primary and place the core README there. Alternatively, if your repository platform supports it, you can place a README at the project or organization level. Then, create a separate README in each individual repository that points to the primary project README and describes in more detail the purpose of that specific component.
10. How should I structure a long README? If your README is long, add a table of contents after your project description to help readers navigate. If your README is very long, you might consider moving some content to other documentation files. This keeps the README focused on getting users started while more detailed information lives elsewhere.
11. Should I identify authorship in my README? Yes! Underneath the project name, you should clearly identify the owner or author of the project. If your project uses a notice file to identify copyright or authorship, you should point to that file in your README.
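
As one possible starting point, the four groups of README content described in question 3 (Identification, Evaluation, Usage, Engagement) can be turned into a Markdown skeleton. The sketch below is a minimal illustration, not the OSPO template: the function name readme_skeleton and the specific heading titles are assumptions chosen for this example, and the description line uses the “MadLib” pattern from question 6.

```python
from typing import Sequence


def readme_skeleton(name: str, verb: str, plural_noun: str,
                    authors: Sequence[str], license_name: str) -> str:
    """Build a starter README in Markdown; the headings are illustrative choices."""
    lines = [
        # Identification: project name first, then authors or owners.
        f"# {name}",
        "",
        f"Authors: {', '.join(authors)}",
        "",
        # Evaluation: what the project does and under what terms it may be used.
        "## Description",
        "",
        f"With {name}, you can {verb} {plural_noun}.",
        "",
        f"Released under the {license_name} license.",
        "",
        # Usage: prerequisites and a one-time setup guide.
        "## Getting Started",
        "",
        "### Prerequisites",
        "",
        "### Installation",
        "",
        # Engagement: documentation, help, contributing, and bug reports.
        "## Documentation",
        "",
        "## Getting Help",
        "",
        "## Contributing",
        "",
        "## Reporting Bugs",
        "",
    ]
    return "\n".join(lines)
```

For example, readme_skeleton("demo", "clean", "survey datasets", ["Jane Doe"], "MIT") produces a file whose description reads “With demo, you can clean survey datasets.” The empty sections are left for you to fill in.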

Contributing

A well-written CONTRIBUTING.md should guide would-be contributors through everything they need to participate confidently and successfully in your project. At minimum, it should welcome contributors, explain the development environment, tools, and package managers needed to build and test the project, describe the workflow (branching strategy, how to run tests, how to submit patches/pull requests), and point to related resources such as the issue tracker or code of conduct. The OSPO has a video Explainer that provides a quick introduction to writing CONTRIBUTING.md files and a GitHub template to copy for your own repository.

Contributing FAQ
1. What is a CONTRIBUTING file and why is it important for a project? A CONTRIBUTING file is a guide within your open-source repository that explains to potential contributors exactly how they can help with your project. It acts as an anchor for building a community and complements other essential project files like the README and license. Importantly, it also helps project owners organize and communicate their specific policies around contributions.
2. What should be included in a good CONTRIBUTING file? While there is no single “right” way to make one, a helpful CONTRIBUTING file should ideally include:
  
  Welcome and Resources: A welcome message for contributors and links to essential resources like documentation, issue trackers, and communication platforms (e.g., forums or web forms).
  
  Technical Guidance: Instructions on how to test the project, where those tests are located, and a link to the project’s style guide or coding conventions.
  
  Issue Reporting: Clear steps on how and where to report bugs, potentially including a link to a bug report template and directing users toward “good first issues”.
  
  Pull Request Protocol: An outline of the pull request process, including expected response times and how to request enhancements.
  
  Contact Information: A list of core contributors and their preferred contact methods.
3. Does every project need a CONTRIBUTING file if it’s not looking for outside help? Yes. Even if you decide not to accept outside contributions, it is still useful to have a CONTRIBUTING file to clearly note that you are not seeking contributors at this time. This manages expectations for anyone who might come across your repository.
4. How should the file be named and saved? By convention, the file should be titled in all caps as CONTRIBUTING and saved as a Markdown file (e.g., CONTRIBUTING.md) in your project’s root directory.
5. How should I prepare before writing my CONTRIBUTING file? Before starting to write, connect with your project’s core contributors to discuss what types of contributions you’d like to invite. Consider whether you want to accept bug fixes, new features, or non-code contributions like documentation or graphic design. This planning step helps ensure your CONTRIBUTING file accurately reflects your team’s capacity and goals.
6. What is a “good first issue” and should I include them in my CONTRIBUTING file? If you want users to fix bugs and not just report them, consider labeling some bugs as “good first issues” and directing users to them in your CONTRIBUTING file. These are typically smaller, well-defined problems that are ideal entry points for new contributors who want to get involved but may not be familiar with your entire codebase yet.
7. What is a Style Guide and why should I reference it in my CONTRIBUTING file? A Style Guide or Coding Conventions document defines standards and conventions that dictate how software code should be formatted, structured, and written to ensure consistency across your project. Including a link or overview of your project’s style guide in your CONTRIBUTING file helps new contributors write code that fits seamlessly with your existing codebase.
8. Should I use a bug report template? Yes! You can link to a bug report “template” that contributors can copy and add context to. This will keep your bugs tidy and relevant, ensuring that bug reports contain the information you need to address them effectively.
9. What should I tell potential contributors about response times? When outlining your pull request protocol, include what kind of response a user will get back from the team on submission, and any caveats about the speed of response. Setting realistic expectations about response times helps manage contributor expectations and reduces frustration.
10. What is a humans.txt file and should I use one? You can list the core contributors and their preferred methods of contact directly in your CONTRIBUTING file, or link to a humans.txt file in your root directory. A humans.txt file is a simple text file that provides information about the people who contributed to your project.
11. Should I include information about project meetings in my CONTRIBUTING file? Yes! If the project has standing meetings or calls where new contributors would be welcome, include information about when they occur and how to attend. This helps new contributors feel welcomed and provides them with direct opportunities to connect with the team.
12. How should I structure a long CONTRIBUTING file? If your CONTRIBUTING.md file ends up long, consider including a table of contents with links to different headings in your document. This helps contributors quickly navigate to the information most relevant to them.
13. I’m feeling overwhelmed about creating a CONTRIBUTING file. What’s the most important thing to focus on? Focus on your Contributors! There’s no right way to make a CONTRIBUTING file, and examples that you find can vary considerably, unlike many other standard project files such as licenses or citations. The most important thing to keep in mind is that someone reading your CONTRIBUTING file wants to contribute to your project but doesn’t know how. Make sure they find the information they need to make those first connections.
14. Are there examples or templates I can reference? Yes! Some good CONTRIBUTING.md examples and templates include Ruby on Rails, Open Government, and Nadia Asparouhova’s CONTRIBUTING template.
15. Do I need to know Markdown to create a CONTRIBUTING file? While it’s convention to save your CONTRIBUTING file as markdown in your root directory, you don’t need to be an expert. If you’re not familiar with markdown formatting, there’s a great how-to guide in the GitHub documentation that can help you get started, or you can visit the Markdown Guide.
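
For a long CONTRIBUTING.md (question 12), a linked table of contents can be generated from the file’s headings instead of being written by hand. The sketch below is a rough illustration with hypothetical function names. It assumes headings use Markdown’s “#” syntax, and it only approximates how GitHub builds heading anchors (lowercase, punctuation dropped, spaces turned into hyphens), so check the generated links on your own platform. It also does not skip lines inside fenced code blocks.

```python
import re


def heading_anchor(title: str) -> str:
    """Approximate a heading anchor: lowercase, drop punctuation, spaces to hyphens."""
    slug = title.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)
    return slug.replace(" ", "-")


def table_of_contents(markdown: str, max_level: int = 3) -> str:
    """Return a nested Markdown list linking to level-2 through max_level headings."""
    entries = []
    for line in markdown.splitlines():
        match = re.match(r"^(#{2,6})\s+(.*?)\s*$", line)
        if match and len(match.group(1)) <= max_level:
            depth = len(match.group(1)) - 2  # level-2 headings sit at the left margin
            title = match.group(2)
            entries.append(f"{'  ' * depth}- [{title}](#{heading_anchor(title)})")
    return "\n".join(entries)
```

The level-1 heading is skipped on the assumption that it is the document title; paste the returned list near the top of the file.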

Versioning

Software versioning is the structured assignment of unique identifiers to distinct states of software so that changes, whether new features, bug fixes, performance improvements, or architectural revisions, can be clearly tracked, referenced, and managed across the lifecycle of a project. Beyond supporting deployments and rollbacks, versioning enhances reproducibility in research by enabling others to retrieve the exact code state used in experiments or analyses, which is critical for verification and reuse in academic work.

Version Control and Versioning FAQ
1. What is a version control system, and why should I use one? A version control system is a combination of technologies and practices for tracking and controlling changes to a project’s files, in particular to source code, documentation, and web pages. If you’re writing code and saving it locally, even if it’s just one script, it is worth moving to a version control system. Version control can: track changes to your code over time; help you experiment without fear of breaking things; and allow others to see and cite your exact methods.
2. Which version control system should I use? Karl Fogel’s book “Producing Open Source Software” advises: “If you don’t already have an opinion about which version control system your project should use, then choose Git, and host your project’s repositories at GitHub.” Git is the de facto version control system standard in open source. The GitHub hosting platform provides free hosting and convenient tooling that is built around Git. GitHub allows you to work privately as long as you need and easily transition your project to publicly viewable when you’re ready to share or accept contributions. New to Git and GitHub? There are lots of great resources at Hopkins and beyond to help you get started:
  
  JHU Data Services provides Git and GitHub workshops each semester: https://dataservices.library.jhu.edu/training-workshops/
  
  GitHub provides free training resources: https://docs.github.com/en/get-started/start-your-journey/git-and-github-learning-resources and https://github.com/git-guides
  
  Software Carpentry’s self-directed lesson Version Control with Git, for novices: https://swcarpentry.github.io/git-novice/
3. What is a software release? A release is a snapshot of your code at a specific point in time.
4. My project is not that big, do I need a formal release? Even smaller projects can benefit from releases, for example, before submitting a paper, after fixing important bugs or making significant code changes, when sharing code with collaborators, or for a major milestone in your analysis. Even if you’re not planning on making further changes to your code, a release gives your code an identifiable version that you and others can reference precisely.
5. How do I create a release? If you’re using GitHub, follow these steps to create a release:
  
  Go to your repository and click on Releases in the right sidebar
  
  In Releases, select Draft a new release
  
  Click on the Tag field, and enter a tag/release number for your version.
  
  Give your release a title and a description. Your description should include what this version does, changes since the last release, which paper/analysis it supports (if any), and updated installation/usage instructions.
  
  Hit Publish!
6. How do I create a release if I’m not using GitHub? If you’re using a different hosting platform for your code, check that platform’s documentation for instructions on creating a new release: GitLab (https://docs.gitlab.com/ee/user/project/releases/), Bitbucket (https://support.atlassian.com/bitbucket-cloud/docs/use-tags-and-releases/), or SourceForge (https://sourceforge.net/p/forge/documentation).
7. What release numbers or tags should I use?
  
  If you’ll only have one release, you can tag it v1.0.0
  
  If you’re planning on ongoing development, consider semantic versioning, which follows the pattern: major.minor.patch (e.g. v2.3.6)
  
  Increment the first number and set the second and third to zero if you have “breaking” changes that require users to modify code to upgrade, e.g. 2.3.6 → 3.0.0
  
  Increment the second number and set the third to zero if you’ve added new features that don’t break existing functionality, e.g. 2.3.6 → 2.4.0
  
  Increment just the third number for bug fixes and small improvements without new features, e.g. 2.3.6 → 2.3.7
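
The three increment rules above are mechanical enough to express in code. The following is a minimal sketch covering only plain major.minor.patch tags with an optional leading “v”. The function name bump and the change labels "major", "minor", and "patch" are assumptions made for this example, and pre-release or build suffixes are not handled.

```python
import re

# Matches tags such as 2.3.6 or v2.3.6; anything else is rejected.
_SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def bump(version: str, change: str) -> str:
    """Return the next version for a 'major', 'minor', or 'patch' change."""
    match = _SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())

    if change == "major":
        # Breaking change: increment the first number, reset the other two.
        major, minor, patch = major + 1, 0, 0
    elif change == "minor":
        # New, compatible feature: increment the second number, reset the third.
        minor, patch = minor + 1, 0
    elif change == "patch":
        # Bug fix or small improvement: increment only the third number.
        patch += 1
    else:
        raise ValueError(f"unknown change type: {change!r}")

    prefix = "v" if version.startswith("v") else ""
    return f"{prefix}{major}.{minor}.{patch}"
```

With this sketch, bump("2.3.6", "major") returns "3.0.0", bump("2.3.6", "minor") returns "2.4.0", and bump("2.3.6", "patch") returns "2.3.7", matching the examples in the list.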

Findability

The FAIR principles—Findable, Accessible, Interoperable, and Reusable—provide a framework for making research outputs more discoverable and useful to the broader scientific community. Originally developed for research data, these principles have been adapted for research software to address the unique challenges of preserving and citing code.

For our GitHub Campus users with public repositories, we recommend adding a citation file and persistent identifiers to your repository. A citation file can be easily created using the web tool CFFINIT. For persistent identifiers, we recommend Digital Object Identifiers (DOIs), which provide a permanent link to specific releases of your software and are widely recognized in academic publishing, and/or Software Hash Identifiers (SWHIDs), which create an intrinsic, permanent reference to your code as preserved in the Software Heritage archive. DOIs can be minted by depositing your code into the Hopkins Research Data Repository or into Zenodo. A SWHID can be created by syncing your repository with the Software Heritage archive. We also recommend adding the ORCID identifiers of everyone who contributed to the software to the citation file and the repository.
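
To give a sense of what a citation file contains, the sketch below writes a small CITATION.cff using keys from the Citation File Format 1.2.0 (cff-version, message, title, version, date-released, doi, and authors with family-names, given-names, and orcid). It is an illustration only: the function name citation_cff is hypothetical, every value you pass in is your own metadata, and a web tool such as CFFINIT remains the easier route to a validated file.

```python
import json
from typing import Dict, Optional, Sequence


def _quote(value: str) -> str:
    """Quote a value as a YAML double-quoted string (JSON string syntax is valid YAML)."""
    return json.dumps(value, ensure_ascii=False)


def citation_cff(title: str, version: str, date_released: str,
                 authors: Sequence[Dict[str, str]],
                 doi: Optional[str] = None) -> str:
    """Return the text of a minimal CITATION.cff file."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {_quote(title)}",
        f"version: {_quote(version)}",
        f"date-released: {_quote(date_released)}",  # expected as YYYY-MM-DD
    ]
    if doi:
        lines.append(f"doi: {_quote(doi)}")
    lines.append("authors:")
    for author in authors:
        lines.append(f"  - family-names: {_quote(author['family-names'])}")
        lines.append(f"    given-names: {_quote(author['given-names'])}")
        if author.get("orcid"):
            # ORCID identifiers are written as full https://orcid.org/... URLs.
            lines.append(f"    orcid: {_quote(author['orcid'])}")
    return "\n".join(lines) + "\n"
```

Save the returned text as CITATION.cff in the root of the repository, and add the doi value once a DOI has been minted for the release.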

The OSPO has Explainers for DOIs and Citations if you need more information!