Being an interdisciplinary organization, we have the pleasure of watching our scientists succeed as part of The MolSSI and within their individual fields. Dr. Daniel G.A. Smith has celebrated a recent accomplishment we are thrilled about: a published paper in the Journal of Chemical Theory and Computation. His work, “Psi4NumPy: An Interactive Quantum Chemistry Programming Environment for Reference Implementations and Rapid Development,” is an excellent example of how his research values align with The MolSSI’s commitment to educate, empower, and enable the computational molecular science field.
Psi4NumPy is a self-explanatory title for this program: Psi4 and NumPy, two different Python modules, glued together. This combination allows for the quick development of clear, understandable Python code for new quantum chemical methods. With Psi4NumPy, quantum chemistry programs will be more accessible for both students and seasoned researchers to learn, use, and create.
While earning his Ph.D. in quantum chemistry, Daniel discovered that running tests and computations and implementing quantum chemical methodologies was a painfully repetitive process. “For me I have always thought that quantum mechanics programming is portrayed as much harder than it really is,” Daniel said. “From basic computer science education, to new students being thrown into programs with millions of lines of code dependent on institutional knowledge.” While existing codes are straightforward for someone with advanced knowledge of the system and of coding, they make learning, testing, and reproducing methods difficult for many in the community. What Daniel hoped for instead was a more effective system to educate novice coders, enable reproducibility of quantum chemistry methodologies, and provide researchers a rapid prototyping playground.
In order to produce the best research it can, the community must be able to communicate and reproduce the ideas of others. “I think an idea that I and many others adhere to is that unless you have programmed a method it is very hard to intuitively understand it. When coming up with new methodologies there is the theoretical understanding of the method and then there is the software that enables the community to try the theory,” Daniel explained. The field has plenty of theoretical understanding to go around; it’s the software that’s needed. Scientists struggle to recode what is presented on paper. “After ~5 years of developing code and theory in this field I can pick up a paper in the literature and with about a 50% success rate recode what they have done after several weeks of effort,” Daniel said. Those are not great odds, and Daniel felt it was definitely not the best use of his time, especially when the only way to complete the picture was in person. As he recalled, “we once had to fly out the author of a method so that we could sit in a room together and work on it for about a week.”
So when his Ph.D. advisor encouraged him towards the complex catalog of codes, Daniel considered whether there might be a path better suited to his needs. Daniel’s idea was to combine the linear algebra library NumPy with the quantum chemistry program Psi4, joining the two in order to quickly create quantum chemistry software without the decades of workarounds and accrued notes. This hybridization of general-purpose scientific Python and optimized C++ “has taken off quite rapidly within our field,” Daniel said. “It is one of those general ‘good ideas’ that many in the community started working on around the same time we did.”
Psi4NumPy is a fully open-source project, hosted on GitHub so that any researcher in the world is free to use it or contribute to it. Further, this project is a collaborative team effort by thirty members of the molecular science community, both inside and outside The MolSSI. Open access to high-quality code created by a team of researchers encourages a culture shift towards collaboration, community, and contribution.
This program has incredible potential to open up the world of quantum chemistry to an even wider range of students, something that is important for the sustainability of the field. Psi4NumPy gives users detailed step-by-step instructions with clean code. Within the new program, extraneous options and performance tweaks are removed, leaving only the necessary code in a highly verbose format. Students can start programming (and, as a result, pursuing their own research) sooner, with hands-on experience that builds a better understanding of the field. And because the programs aren’t oversaturated with performance tweaks and workarounds, learning and using them requires far less prerequisite knowledge than current approaches.
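To give a feel for what "only the necessary code in a highly verbose format" can look like, here is a minimal NumPy-only sketch of one textbook step in quantum chemistry: solving the generalized eigenvalue problem F C = S C ε by symmetric orthogonalization. It is not taken from the Psi4NumPy repository. The function names are hypothetical, and the small matrices are made-up stand-ins for the overlap and Fock matrices that a program such as Psi4 would normally supply.

```python
import numpy as np


def symmetric_orthogonalizer(S):
    """Return A = S^(-1/2) for a symmetric, positive-definite overlap matrix S."""
    evals, evecs = np.linalg.eigh(S)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T


def solve_roothaan(F, S):
    """Solve F C = S C eps for orbital energies eps and coefficients C."""
    # Step 1: build the orthogonalizer so that A.T @ S @ A is the identity.
    A = symmetric_orthogonalizer(S)
    # Step 2: transform the Fock matrix into the orthogonal basis.
    F_ortho = A.T @ F @ A
    # Step 3: solve an ordinary symmetric eigenvalue problem.
    eps, C_ortho = np.linalg.eigh(F_ortho)
    # Step 4: transform the coefficients back to the original basis.
    C = A @ C_ortho
    return eps, C


# Illustrative toy matrices (assumed values, not computed from a real molecule).
S = np.array([[1.00, 0.65],
              [0.65, 1.00]])
F = np.array([[-1.12, -0.96],
              [-0.96, -1.12]])

eps, C = solve_roothaan(F, S)
print("Orbital energies:", eps)
```

Each line maps onto one step of the written procedure, with no performance tuning or extra options, which is the trade-off this style of reference code makes in exchange for readability.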
“There is a deficiency in our field for the ability to convey knowledge of ‘how’ to build code and our reproducibility has suffered outside of canonical methods in our field,” Daniel said. “This publication represents a step forward not particularly in science itself, but in the ability to share the science and ideas involved in our field and assists newer students in understanding how to program these theories.”
Computational molecular science problems are enormous and complex. While Psi4NumPy does not fully solve the problems of code reproduction and education, it is a step forward that may benefit many. Daniel’s innovative thinking demonstrates how reframing coding to consider education and reproducibility first can better empower and enable the community at large. This is the heartbeat of The MolSSI, and we are thrilled to see what will come of this project as it grows.