Version control keeps a complete history of your work on a given project. It facilitates collaboration – everyone can work freely on any part of the project without overwriting others’ changes. You can move between past versions and roll back when needed, review your project’s history through the commit messages describing each change, and see exactly what changed in the content, who changed it, and when.
Version control is a powerful tool and a fundamental practice in software development. When coupled with a code hosting service, it easily allows contributions from outside collaborators. Version control benefits both individuals and teams and should be adopted in almost all projects.
- Introduction to version control from MolSSI’s Best Practices Workshop
- Software Carpentry Version Control with Git
- GitHub 15 Minutes to Learn Git
- Git Commit Best Practices
Sharing software is vital to allow multiple people to collaborate on the same codebase. For open-source software, sharing code also gives the public access to the code for reviewing, testing, and contributing. When papers and software are both published, it increases the reproducibility and reusability of scientific work. Sharing code in a public online repository also allows others to cite the software and credit its authors.
MolSSI recommends using git for version control, and GitHub as a hosting service, though there are other options.
Tutorials for sharing code with GitHub
Software should be tested regularly throughout the entire development cycle, from the first lines of code through production releases, to ensure correct operation. Thorough testing is too often treated as an afterthought, yet it is essential for ensuring that changes to one part of the code do not negatively affect other parts. This holds for projects of every size and for teams of every size, from a single developer to hundreds.
With all of these test types, you should understand the concept of “expected behavior” as distinct from “accurate returns.” With scientific code, we expect a call like 2 + 2 to return 4, which we can think of as an “accurate return” (it is also “expected behavior” in this case). But if you pass in something like “Waffle” + 2, the “expected behavior” might be that an error is thrown, or that the 2 is cast to a string and appended – whatever you have decided should happen. In such a case, the code would still be behaving as expected, even if it were not returning scientifically accurate information. Unexpected behavior would be “Waffle” + 2 not throwing an error, or calling some routine you did not expect.
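As a concrete sketch of this distinction (the `add` wrapper here is illustrative, not from any particular package), a test can assert the accurate return for valid input and assert that invalid input fails in the expected way:

```python
def add(a, b):
    """Illustrative wrapper around +; relies on Python's own type rules."""
    return a + b

# Accurate return (which is also expected behavior here):
assert add(2, 2) == 4

# Expected behavior for bad input: in Python, "Waffle" + 2 raises a
# TypeError, so a good test asserts that the error actually occurs.
try:
    add("Waffle", 2)
    raised = False
except TypeError:
    raised = True
assert raised, 'expected "Waffle" + 2 to raise a TypeError'
```

In a test framework such as PyTest, the same idea is usually written with `pytest.raises(TypeError)`.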
Two main types of testing are strongly encouraged:
- Unit tests – Small tests designed to check that individual functions or operations behave correctly. This means that not only do the operations run without unexpected errors, but also that they behave as expected with the given inputs, even if that behavior is to throw or raise an error. Unit tests should be added whenever new features/code are added, ensuring that the code is covered by tests.
- Regression tests – given a known input, does the software correctly and consistently return the correct values, even after changes to the code? These tests can occasionally take longer to run than Unit Tests, as they often require running the whole code base, or large parts of it. Regression tests can also include tests of previously fixed bugs. This second use case is the more common definition outside of scientific code bases. Ensuring that old bugs are not reintroduced during development is often extremely helpful, as this helps you and others to avoid making the same mistakes.
A third type of testing is recommended depending on how many dependents your code has, i.e., “How many other places is your code used?”
- Integration Tests – Wide-spanning tests that check that subsystem operations behave correctly, including within other environments, such as on specific hardware or inside specific packages. You likely won’t have to think about these until you try to include your package within other packages.
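To make the first two test types concrete, here is a minimal sketch using plain assertions (a framework like PyTest would normally collect these); the `molecular_mass` helper and its reference value are hypothetical:

```python
import math

def molecular_mass(counts, masses):
    """Hypothetical helper: total mass from {element: count} and {element: mass}."""
    return sum(n * masses[el] for el, n in counts.items())

MASSES = {"H": 1.008, "O": 15.999}

# Unit test: one small piece of behavior, checked in isolation.
assert molecular_mass({"H": 2}, MASSES) == 2 * 1.008

# Regression test: compare against a previously computed reference value,
# so future changes cannot silently alter the result.
REFERENCE_WATER_MASS = 18.015  # value recorded from an earlier, trusted run
assert math.isclose(molecular_mass({"H": 2, "O": 1}, MASSES),
                    REFERENCE_WATER_MASS, rel_tol=1e-4)
```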
Code coverage measures how many code paths your tests touch, out of all the possible paths through your code. It is often reported as “percent of lines executed relative to total lines of code,” but it should be read as “what percent of decision branches are executed,” since multiple choices/conditions can lead to the same code being executed. There is no hard and fast rule for what percentage you should aim for; 90–95% is a good target, especially early on. 100% is admirable but often not realistically attainable, especially in a sprawling code base, since touching every line of code with tests can require such esoteric inputs that the calculations become unreasonable. Every code is different, however, and you should still strive for 100% coverage until you hit the limit of sane programming.
It’s also important to understand the limits of code coverage. Full coverage does not in any way ensure that the code ran correctly; it states only that all of the code was touched by the interpreter. You still have to write tests that assess the “expected behavior” of your code, not just that it ran.
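A small illustration of why line coverage can overstate how well a code is exercised (the function is hypothetical):

```python
def grade(score):
    """Hypothetical pass/fail check with two branches on one line."""
    return "pass" if score >= 50 else "fail"

# This single assertion executes 100% of the lines of grade()...
assert grade(80) == "pass"

# ...but only one of its two decision branches. Branch coverage also
# needs the other outcome, and a real test must assert the behavior,
# not merely execute the code:
assert grade(20) == "fail"
```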
Lastly, there is the concept of “number of tests” as a metric for how well tested a code base is. This metric is meaningless by itself: for example, a test could simply do “assert x == x” and then repeat that for every one of the 2^32 − 1 32-bit unsigned integers. Tests should never be written with “how many tests” in mind; they should always be written within the context of three things, in order: the type of the test (listed above), whether it captures all reasonable expected behaviors, and how much code coverage the collection of tests provides.
- Python: PyTest
- C/C++/Fortran: CTest
- Rust: Test Attribute
- Julia: Test Macros
Continuous integration (CI) automatically builds your code, runs tests on a variety of different platforms, and deploys all manner of builds and documentation as desired. Typically, CI is run when new code is proposed (e.g. through GitHub Pull Requests) or committed to the repository. CI is useful for catching bugs before they reach your end users, and for running tests automatically, including on platforms that are not available to every developer.
CI can be broken down into several stages. Most CI should at least build the code and then run unit tests. The build stage takes the source code and performs compilation and dependency resolution/installation for the next stage. Compiled languages like C++ and Rust require this step to turn the source code into executables. Interpreted languages like Python or R do not usually need this step, as their code is not compiled, but they still typically need to install dependencies. The unit test stage runs a series of tests to ensure that the code is working as expected, without syntactic or logical errors. Most, if not all, codes should have unit tests. Codes where reproducibility of results is highly sought after (especially in the sciences) should also include a regression test stage, in which the results of the code are compared against previously computed values. Regression tests can take significantly longer than unit tests and may need to be relegated to infrequent CI runs or handled through separate means. Lastly, a deploy stage can take the compiled and verified code and push it to an appropriate branch or service to make it available. Deployment can also include documentation pages, APIs, and experimental/nightly builds.
GitHub itself now provides a CI service for its repositories called “GitHub Actions,” which can be configured to run with most repositories. However, there are also many other CI services, most of which have webhooks for integration with GitHub, as well as CI services for non-GitHub-based code repositories.
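As an illustrative sketch (the file path, action versions, and install command are assumptions, not requirements), a minimal GitHub Actions workflow that builds and tests a Python package on every push and pull request might look like:

```yaml
# .github/workflows/ci.yml -- a minimal, illustrative workflow
name: CI
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install dependencies   # the "build" stage for an interpreted language
        run: pip install -e ".[test]"
      - name: Run unit tests
        run: pytest
```

The build and unit test stages map onto the two `run` steps here; regression test or deploy stages would be added as further steps or jobs.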
Examples of CI Software/Services
- Web Based Services
- GitHub Actions
- Azure Pipelines
- CircleCI (Non-Microsoft example)
- Self Hosted Options
- Examples of projects using CI:
- Basis Set Exchange (Python with GitHub Actions)
- QCFractal (Python and SQL with GitHub Actions)
Code that lives beyond its initial development will be read many times as part of routine maintenance and as new features are added. For this reason, it is essential to comment your code thoroughly: it will reduce the time and effort to understand your code when you come back to it some months or years later. Establishing and following a standard style in your projects will also increase readability, make maintenance easier, and can reduce onboarding time for new developers.
While code style can be personal, languages usually have at least a few dominant coding styles that are familiar to most programmers in that language. When programming in Python, the most commonly followed style is some variation of PEP 8. In Python, you might also consider adopting type hinting for large projects. Documentation embedded in the code through documentation strings or comments is a crucial aspect of code style that you should also establish for your projects.
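A small sketch of these style elements together – PEP 8 naming, type hints, and an embedded documentation string (the function itself is hypothetical):

```python
def mean_velocity(distances: list[float], time: float) -> float:
    """Return the mean velocity over ``time`` for the given ``distances``.

    Parameters
    ----------
    distances : list of float
        Distances traveled, in meters.
    time : float
        Elapsed time, in seconds. Must be nonzero.
    """
    return sum(distances) / (len(distances) * time)
```

Tools such as Sphinx can render docstrings written in this (NumPy-style) format directly into API documentation.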
MolSSI recommends using automatic formatting tools that can enforce a particular coding style. These are often configurable for each project.
- Python: black
- C/C++: clang-format, astyle
Examples of coding style guides:
Code Style Tutorials:
The importance of documentation in an organization is often determined by multiple factors, including the adopted software development practices (waterfall, agile, etc.) and the size of the software being documented. Regardless, documentation is a showcase for the software that reflects its health and vitality. A thriving software ecosystem has regularly updated documentation.
Ideally, documentation not only offers brief and informative guidelines to help busy users achieve their goals rapidly, but also includes detailed user/developer manuals that provide deeper insight into the software infrastructure. The former type of documentation is often titled “getting started,” “quick guide,” “10 minutes to …,” etc. The 10 minutes to Pandas and 10 minutes to Dask are great examples of such documentation. The documentation can also be complemented by short video tutorials or brief blog posts with practical examples.
The developer documentation, on the other hand, involves several more detailed pieces of information such as build requirements and dependencies, or how to compile/build/test/install the software.
Documentation also commonly covers the application programming interface (API). The API reference often involves the documentation of various internal files, function and class signatures as well as the reasoning behind the naming conventions and certain adopted designs. Mature scientific and engineering libraries such as oneAPI Math Kernel Library (oneMKL) and oneAPI Deep Neural Network (oneDNN) library from Intel provide great examples of this class of documentation.
The documentation should be kept up to date with changes in the code. This is not an easy task, especially for large and fast-moving code bases using agile software engineering practices. However, slightly out-of-date documentation is generally preferable to no documentation. We recommend compiling and testing the examples provided within the documentation regularly, in order to maintain the quality and usefulness of the documentation over time.
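One lightweight way to test documentation examples regularly is sketched here with Python’s built-in `doctest` module (the conversion function is illustrative):

```python
def kelvin_to_celsius(t_kelvin):
    """Convert a temperature from kelvin to degrees Celsius.

    Examples
    --------
    >>> kelvin_to_celsius(273.15)
    0.0
    """
    return t_kelvin - 273.15

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # fails loudly if the documented example drifts from the code
```

Running this module (or `pytest --doctest-modules`) executes the example in the docstring, so the documentation is verified every time the tests run.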
Popular documentation packages:
- C/C++/Fortran – Doxygen
- Python – Sphinx and Read The Docs
- Fortran – Ford and Doxygen
- Julia – Documenter.jl
Examples of good documentation:
- SEAMM – (Sphinx)
- Arbitrary precision math library (arb) – (Sphinx)
- MolSSI Driver Interface (MDI) – (Doxygen)
- Fortran Package Manager – (Ford)
While interpreted languages such as Python have been steadily gaining popularity, compiled languages such as C and Fortran are still prevalent and many new software projects even choose to mix compiled and interpreted languages. Compiled software generally requires a build system to manage internal and external dependencies and to automate most of the tedious process of generating libraries and executables. The popular standards for build systems have shifted over the years, from Makefiles, to GNU Autotools, and in recent years to CMake. While there are modern competitors such as Meson, we recommend the use of CMake for all build systems, as new users and developers are more likely to be familiar with it and more likely to have it already installed on their computers. CMake is a flexible build system that makes it straightforward to implement simple build systems for new projects that can be arbitrarily extended to handle growing project complexity and more diverse needs. CMake is also actively supported and responsive to emerging software and hardware trends in computing.
There is an abundance of online information, documentation, and examples regarding CMake. Its developers provide a tutorial that is good for guiding unfamiliar developers through the implementation of a simple CMake build system, and the main documentation is versioned and clearly delineates the version requirements of specific features.
An important consideration in using CMake is what minimum version to require, which is a decision that every project needs to make for itself, although it can certainly be changed with time. Using older CMake versions makes it more likely that users will already have a compatible version of CMake installed on their computers, but it will also prevent the use of newer features. Our recommendation is to use the minimum version of CMake that contains all of the features that are actively being used by a project and increase the minimum version requirement as more recent features become necessary. We do however recommend starting with CMake 3.x rather than CMake 2.x, and using a target-based build system.
The main use cases of CMake are (1) developers building a project, (2) users building a project, (3) automated builds for continuous integration, and less commonly (4) automated builds as part of building a dependent project and (5) packaging a project for distribution. Ideally, a project should compile and build without any custom arguments to CMake in a properly set up software environment, and any necessary, non-standard arguments should be clearly articulated in a project’s documentation. These different use cases can each introduce complications into a build system, and it is important to keep track of and document which features are serving which use cases.
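As a sketch of a minimal target-based setup (the project name, file layout, and version floor are illustrative assumptions), a CMakeLists.txt for a small C++ library with a test executable might look like:

```cmake
# CMakeLists.txt -- minimal, target-based sketch for a hypothetical C++ library
cmake_minimum_required(VERSION 3.16)
project(mylib VERSION 0.1.0 LANGUAGES CXX)

add_library(mylib src/mylib.cpp)
target_include_directories(mylib PUBLIC include)
target_compile_features(mylib PUBLIC cxx_std_17)

enable_testing()
add_executable(test_mylib tests/test_mylib.cpp)
target_link_libraries(test_mylib PRIVATE mylib)
add_test(NAME unit COMMAND test_mylib)
```

Because requirements are attached to targets (rather than set globally), dependent projects that link against `mylib` inherit its include directories and language standard automatically.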
Software quality depends on many factors, such as functionality, usability, performance, reliability, portability, interoperability, scalability, and reusability (see the full description here).
Many aspects contribute to a good design and to the quality of your software, and it is important to follow best practices and to think deliberately about your software’s design. Luckily, experienced programmers have developed such best practices over a substantial period of time, and they can help inexperienced developers learn good software design quickly.
The first thing you can learn that will immediately improve the quality of your software is to follow the SOLID principles of software design. Following these five principles will result in more understandable, flexible, and maintainable code. You can read more here:
- Dev IQ: The SOLID principles of Object Oriented Design
- The Team Coder: SOLID Principles of Software Design by Examples
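As a sketch of the first principle, Single Responsibility (all class names here are hypothetical), each class has exactly one reason to change: computing a result and reporting it are separate jobs.

```python
class EnergyCalculator:
    """Only computes; knows nothing about output formats."""

    def total_energy(self, energies):
        return sum(energies)


class ReportWriter:
    """Only formats; knows nothing about the physics."""

    def as_text(self, value):
        return f"Total energy: {value:.3f}"


calc = EnergyCalculator()
writer = ReportWriter()
report = writer.as_text(calc.total_energy([1.0, 2.5, -0.5]))
assert report == "Total energy: 3.000"
```

If the report format changes, only `ReportWriter` is touched; if the physics changes, only `EnergyCalculator` is.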
Design patterns are well-thought-out, reusable solutions to common recurring problems that developers face during software development. They serve as a common terminology among experienced developers. Design patterns are general and can be applied in any programming language. The following are some references to get you started.
- MolSSI Workshop in Object Oriented Programming and Design Patterns
- Python Design Patterns: For Sleek And Fashionable Code
- Design Pattern in Python (Github examples)
- A General tutorial on Design Patterns
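A brief sketch of one classic pattern, Strategy, in which interchangeable algorithms sit behind a single shared interface (the class and function names are illustrative):

```python
class BubbleSort:
    """One interchangeable sorting strategy."""

    def sort(self, data):
        data = list(data)
        for i in range(len(data)):
            for j in range(len(data) - 1 - i):
                if data[j] > data[j + 1]:
                    data[j], data[j + 1] = data[j + 1], data[j]
        return data


class BuiltinSort:
    """Another strategy with the same interface."""

    def sort(self, data):
        return sorted(data)


def run(strategy, data):
    # The caller depends only on the shared `sort` interface,
    # so strategies can be swapped without changing this code.
    return strategy.sort(data)


assert run(BubbleSort(), [3, 1, 2]) == run(BuiltinSort(), [3, 1, 2]) == [1, 2, 3]
```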
Object Oriented Programming (OOP):
Object Oriented Programming (OOP) is a method of structuring functions and data into objects, which can help organize software projects. It has a number of advantages, including improved reusability and maintainability. Using OOP in large-scale projects is highly encouraged.
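A minimal illustration of that structure (the classes here are hypothetical): data and the functions that operate on it live together, and a subclass reuses shared behavior rather than duplicating it.

```python
class Particle:
    """Bundles data (mass, velocity) with the behavior that uses it."""

    def __init__(self, mass, velocity):
        self.mass = mass
        self.velocity = velocity

    def kinetic_energy(self):
        return 0.5 * self.mass * self.velocity ** 2


class ChargedParticle(Particle):
    """Reuses Particle's behavior and adds one new attribute."""

    def __init__(self, mass, velocity, charge):
        super().__init__(mass, velocity)
        self.charge = charge


p = ChargedParticle(mass=2.0, velocity=3.0, charge=-1.0)
assert p.kinetic_energy() == 9.0  # inherited, not reimplemented
```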
Containerization is a tool that allows you to launch and work with many different environments (known as “containers”) on a single computer, each of which might be running a different operating system, a different set of installed libraries, different environment variables, etc. Each container operates in isolation from the others; effectively, you can think of each container as a totally different computer, which just happens to be sharing the same hardware as the other containers. Although similar in some ways to Virtual Machines, containerization is based on fundamentally different technology, and is generally much easier to set up and has a dramatically smaller performance cost. For most purposes, the performance cost of using containers is negligible.
This opens up all sorts of possibilities. Do you own a Windows machine, but want to test a code on Linux? No problem – just launch a Linux container on your Windows machine, and test away! Are you having trouble reproducing a bug someone else has encountered, and suspect the problem might be dependent on some detail of the runtime environment? There’s no need to mess with (and potentially break) your own environment in pursuit of the bug – just try some tests in a few containers, leaving your own environment unchanged. In fact, because of the clean isolation and reproducibility of environments that is provided by containerization, anything you can do in a container should probably be done in a container.
Moreover, you can easily build and deploy containers. For example, you could build a code you are developing (along with all of its dependencies) within a container, and then deploy the container. Because the container contains your compiled software and everything needed to run it, including the operating system, the end user doesn’t need to install anything on their system. All they need to do is launch your container and start running calculations.
By far the most commonly used containerization tool is Docker, which is what we recommend getting started with. It is important to note that using Docker effectively requires root access. HPC centers are never going to give you root access to their expensive machines, which precludes Docker from use in an HPC context. Fortunately, there are several containerization alternatives that have been developed specifically for use on HPC machines, with the most prominent being Apptainer. If you are interested in using containerization on an HPC machine, ask the organization that operates the machine about which containerization solution(s) they recommend.
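As an illustrative sketch (the base image, paths, and the `mytool` entry point are assumptions, not a prescribed layout), a Dockerfile that packages a Python application and its dependencies for deployment might look like:

```dockerfile
# Dockerfile -- illustrative sketch: bundle an application and its
# dependencies so end users only need a container runtime.
FROM python:3.12-slim

WORKDIR /app
COPY . .
RUN pip install --no-cache-dir .

# Default command run when the container starts; "mytool" is a
# hypothetical console entry point installed by the package above.
CMD ["mytool"]
```

Building this image (`docker build`) and publishing it to a registry lets users run the software without installing anything beyond the container runtime itself.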
Recommended Software (not usable with HPC):
Recommended Hosting Service:
Alternatives for HPC:
- Apptainer (previously known as Singularity)