As a graduate student, Steven Weisberg helped develop a college campus – albeit a virtual one. Called Virtual Silcton, the software tests spatial-navigation skills: it teaches people the layout of a virtual campus, then challenges them to point in the direction of specific landmarks1. It’s been used by more than a dozen labs, says Weisberg, who is now a cognitive neuroscientist at the University of Florida in Gainesville.
But in February 2020, a colleague who was testing the software identified a problem: it could not accurately calculate pointing directions that were more than 90 degrees away from the target. “The first thing I thought was, ‘oh, that’s weird,’” Weisberg recalls. But it was true: his software was generating errors that could alter his calculations and his conclusions.
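The article doesn’t spell out the underlying arithmetic, but pointing-error bugs of this kind often come down to angle wrap-around: naively subtracting two compass bearings misbehaves once the true answer crosses 90 or 180 degrees. A minimal sketch of a correctly wrapped signed error, in Python (the function and variable names are illustrative, not Weisberg’s actual code):

```python
def pointing_error(target_bearing, response_bearing):
    """Signed pointing error in degrees, wrapped into (-180, 180]."""
    diff = (response_bearing - target_bearing) % 360.0
    if diff > 180.0:
        diff -= 360.0  # e.g. a raw difference of 340 degrees is really -20
    return diff

# Naive subtraction of bearings 350 and 10 reports a 340-degree error;
# the wrapped version correctly reports -20 degrees.
```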
“We have to take it all out,” he thought.
When it comes to software, bugs are inevitable – especially in academia, where code tends to be written by graduate students and postdocs who have never been formally trained in software development. But simple strategies can minimize the likelihood of bugs, and ease the recovery process when they do strike.
Julia Strand, a psychologist at Carleton College in Northfield, Minnesota, studies how people manage to hold a conversation in, say, a loud, crowded restaurant. In 2018, she reported that a visual cue, such as a flashing dot on a computer screen that coincided with speech, reduced the cognitive effort needed to understand what was being said2. The finding suggested that a simple smartphone app could reduce the mental fatigue that sometimes arises in such situations.
But it wasn’t true. Strand had inadvertently programmed the test software to start timing one condition earlier than the other – which, as she wrote in 2020, “is akin to starting a stopwatch before a runner arrives at the line”.
“I felt physically ill,” she wrote – the mistake could have harmed her students, collaborators, funding and work. It didn’t: she corrected her article, kept her grants and was granted tenure. But to help others avoid a similar experience, she created an educational resource called Error Tight3.
Error Tight provides practical guidance that echoes computational-reproducibility checklists: use version control; document code and workflow; and adopt standardized file-naming and organization strategies.
Her other recommendations are more philosophical. An “error-tight” lab, says Strand, recognizes that even careful researchers make mistakes. Accordingly, her team has adopted a strategy that is common in professional software development: code review. Rather than assuming bugs don’t exist, the team proactively hunts for them by having two people review their code.
Joana Grave, a doctoral student in psychology at the University of Aveiro, Portugal, also uses code review. In 2021, Grave retracted a study after finding that the tests she had programmed were miscoded to show the wrong images. Now the experienced programmers on the team check her work, she says, and Grave repeats coding tasks to make sure she gets the same answer.
Scientific software can be difficult to review, warns C. Titus Brown, a bioinformatician at the University of California, Davis. “If we’re operating on the edge of novelty, there may only be one person who understands the code, and it may take a long time for another person to understand it. And even then, they may not be asking the right questions.”
Weisberg shared other helpful practices in a Twitter thread about his experience. These include sharing code, data and computational environments on sites such as GitHub and Binder; checking that computational results agree with evidence collected using different methods; and adopting widely used software libraries instead of custom algorithms where possible, because these libraries have often been extensively tested by the scientific community.
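That last point – prefer battle-tested libraries, and cross-check any custom code against them – can be illustrated with a toy example (hypothetical, not from Weisberg’s thread): a hand-rolled sample standard deviation verified against Python’s standard library before it touches real data.

```python
import math
import statistics

def custom_stdev(values):
    """Hand-rolled sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# Cross-check the custom routine against the extensively tested
# standard-library implementation before trusting it on real data.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
assert math.isclose(custom_stdev(data), statistics.stdev(data))
```

A subtle bug here – say, dividing by n instead of n − 1 – would be caught by the cross-check immediately, rather than surfacing later in a published result.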
Wherever your code comes from, validate it before using it – and then again periodically, such as after updating your operating system, advises Philip Williams, a natural-products chemist at the University of Hawaii at Manoa in Honolulu. “If something changes, the best practice is to go back and make sure everything is okay, rather than just assuming that those black boxes will always give the right answer,” he says.
Williams and his colleagues identified what they called a “glitch” in another researcher’s published code for interpreting nuclear-magnetic-resonance data4, which sorted datasets differently depending on the user’s operating system. Checking the numbers against a model dataset with known “correct” answers could have alerted the authors that the code was not working as expected, he says.
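This class of glitch is easy to reproduce: in Python, for example, `os.listdir()` returns file names in an order that depends on the operating system and file system. A small self-contained sketch of the problem and the one-line fix (the file names are invented):

```python
import os
import tempfile

# Scripts that pair data files by position can silently mismatch them on
# another machine, because directory listings have no guaranteed order.
with tempfile.TemporaryDirectory() as workdir:
    for name in ["b.txt", "a.txt", "c.txt"]:
        open(os.path.join(workdir, name), "w").close()

    unsorted_names = os.listdir(workdir)        # order is platform-dependent
    sorted_names = sorted(os.listdir(workdir))  # explicit sort is deterministic
```

Any downstream step that consumes `sorted_names` now behaves identically on every operating system; a known-answer test, as Williams suggests, would flag the unsorted version on at least one platform.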
If code can’t be bug-free, it can at least be developed in such a way that any bugs are relatively easy to find. Lorena Barba, a mechanical and aerospace engineer at George Washington University in Washington DC, says that when she and her then graduate student, Natalia Clementi, discovered a coding error underlying a study5 they had published in 2019, “there were poo emojis sent over Slack and all kinds of screaming emojis and stuff for a few hours”. But the pair were able to resolve the problem quickly, thanks to the reproducibility packages (called repro-packs) that Barba’s lab creates for all of its published work.
A repro-pack is an open-access archive of all the scripts, datasets and configuration files needed to perform an analysis and reproduce the results published in a paper, which Barba’s team uploads to open-access repositories such as Zenodo and Figshare. Once they realized their code contained an error – they had accidentally omitted a mathematical term from one of their equations – Clementi retrieved the relevant repro-pack, corrected the code, reran her calculations and compared the results. Without the repro-pack, she would have had to remember exactly how the data had been processed. “It probably would have taken me months to try to see if this [code] was correct or not,” she says. Instead, it took only two days.
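The comparison step a repro-pack enables can be sketched as a small helper (an illustrative assumption, not code from Barba’s lab): rerun the analysis, then check the fresh numbers against the archived ones, value by value.

```python
import math

def results_match(archived, rerun, rel_tol=1e-9):
    """Compare freshly recomputed results with values archived in a repro-pack.

    Both arguments are dicts mapping quantity names to numbers.
    """
    if archived.keys() != rerun.keys():
        return False  # a missing or extra quantity is itself a red flag
    return all(math.isclose(archived[k], rerun[k], rel_tol=rel_tol)
               for k in archived)
```

In practice the archived values would be loaded from the repro-pack’s data files rather than typed in by hand; the point is that the reference numbers exist at all, so “did my fix change the answer?” becomes a two-day check instead of a months-long reconstruction.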
Brown needed much longer to fix a bug he discovered in 2020, when he tried to apply his lab’s metagenome research tool, called spacegraphcats, to a new question. The software contained a faulty filtering step that deleted some data. “I started thinking, ‘oh my God, maybe this calls the original publication into question,’” he deadpans. Brown fixed the software in less than two weeks, but rerunning the calculations delayed the project by several months.
To minimize delays, good documentation is crucial. Milan Curcic, an oceanographer at the University of Miami, Florida, co-authored a 2020 study6 that examined the impact of hurricane wind speed on ocean waves. As part of that work, Curcic and his colleagues repeated calculations that had been performed in the same lab in 2004, only to discover that the original code had used the wrong data file for some of its calculations, producing an “offset” of about 30%.
According to Google Scholar, the 2004 study7 has been cited more than 800 times, and its predictions inform hurricane forecasts today, Curcic says. Yet its code, written in the MATLAB programming language, was never posted online. And it was so poorly documented that Curcic had to go through it line by line to work out how it worked. When he found the error, he says, “the question was, am I misunderstanding this, or is it really incorrect?”
Strand asks team members to read each other’s code to familiarize them with programming and encourage good documentation. “The code should be commented enough that even someone who doesn’t know how to code can understand what’s going on and how the data changes at each step,” she says.
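A short, invented example of the commenting style Strand describes, in which each step states in plain language what happens to the data (the values and thresholds are made up for illustration):

```python
# Invented reaction-time data, in milliseconds; one value per trial.
reaction_times = [412, 397, 2051, 388, 405, 1998, 421]

# Step 1: drop trials slower than 1,000 ms, which we treat as attention lapses.
kept_trials = [rt for rt in reaction_times if rt < 1000]

# Step 2: average the remaining trials into a single score for this participant.
mean_rt = sum(kept_trials) / len(kept_trials)
```

Someone who has never programmed can still follow what happened to the data: two slow trials were removed, and the rest were averaged.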
And she encourages students to view mistakes as part of science rather than as personal failings. “Labs that have a culture of ‘smart, careful people don’t make mistakes’ are setting themselves up to be labs that don’t admit their mistakes,” she says.
Bugs do not always mean retraction. Barba’s, Brown’s and Weisberg’s errors had only minor impacts on their results, and none required changes to their publications. In 2016, Marcos Gallego Llorente, then a graduate student in genetics at the University of Cambridge, UK, identified an error in code he had written to study human migration patterns in Africa 4,500 years ago. When he reanalyzed the data, the overall conclusion was unchanged – although the geographical extent of the effect was – and a correction was enough.
Thomas Hoye, an organic chemist at the University of Minnesota in Minneapolis, co-authored a study that used the code in which Williams discovered the glitch. When Williams contacted him, Hoye says, he didn’t have “a particularly strong reaction”. He and his colleagues fixed their code, updated their online protocols and moved on.
“I couldn’t help but think at the end, ‘this is how science should work,'” he says. “You find a mistake, you step back, you improve, you correct, you move forward.”