Software builds, that is, compiling programs into machine-executable code, are an important part of most developers’ lives. When a build fails because of compilation errors, programmers have to spend extra time and brainpower finding and fixing the problem, which reduces their productivity. A better understanding of the causes of frequent build errors, then, could lead to new or improved development tools that reduce these errors and increase developer output.
That was the motivation behind a new study from a group of researchers from Google, the Hong Kong University of Science and Technology and the University of Nebraska. The team wanted to address three main questions: How often do builds fail, why do builds fail and how long does it take to fix builds?
To answer these questions, they looked at the results of more than 26 million builds by 18,000 Google engineers from November 2012 through July 2013. The builds were of Java or C++ code, Google’s most common languages, and errors were generated by either the javac compiler (for Java) or the LLVM Clang compiler (for C++). A build was defined as “a single request from a programmer which executes one or more compiles” and was deemed a failure if any compile in the build failed. Compile error messages were grouped into one of five categories (dependency, type mismatch, syntax, semantic and other).
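To make the failure definition concrete, here is a minimal sketch of how a build and a failure ratio could be modeled. The `Build` class, its fields and the `failure_ratio` helper are hypothetical names made up for illustration; the study's actual data model is not described in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Build:
    """One build request (hypothetical model, not the study's schema)."""
    developer: str
    compile_results: List[bool]  # True means that compile succeeded

    @property
    def failed(self) -> bool:
        # A build counts as a failure if any compile in it failed.
        return not all(self.compile_results)


def failure_ratio(builds: List[Build]) -> float:
    """Fraction of builds that failed; 0.0 for an empty list."""
    if not builds:
        return 0.0
    return sum(1 for b in builds if b.failed) / len(builds)
```

For example, a developer with one clean build and one build in which a single compile failed would have a failure ratio of 0.5 under this definition.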
After reading through the study, I found a few of the findings particularly interesting:
Build failure rates are not related to build frequency or developer experience
Going in, the researchers had hypothesized that developers who build more frequently would experience a higher rate of build failure. The researchers found no correlation between a developer’s build count and the build failure ratio.
They also theorized that more experienced developers would have a lower failure rate. Again, though, this was found not to be the case. The researchers found no evidence in these data that “experienced” developers (defined as those with at least 1,000 builds in the previous nine months) had a lower failure ratio than “novice” developers (those with fewer than 200 builds in the previous three months).
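As a rough illustration of what "no correlation" between build count and failure ratio means, the sketch below computes a plain Pearson correlation coefficient. This is an assumption on my part: the article does not say which statistic the researchers used, and the numbers in the usage example are made up, not taken from the study.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-developer values, for illustration only.
build_counts = [120, 450, 900, 1500, 3000]
failure_ratios = [0.31, 0.28, 0.35, 0.27, 0.33]
r = pearson(build_counts, failure_ratios)
```

A value of `r` near zero would indicate that developers who build more often do not fail at a noticeably different rate, which is the kind of result the researchers reported.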
The majority of build errors are dependency-related
Almost 65% of all Java build errors were classified as dependency-related, such as cases where the compiler couldn’t find a symbol (the most common one, 43% of all build errors), a package didn’t exist or a Google-specific dependency check failed.
Similarly, almost 53% of all C++ build errors were classified as dependency-related. The most common such errors were using an undeclared identifier and missing class variables.
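To give a feel for how compiler output might be bucketed this way, here is a small sketch that flags messages resembling the dependency errors named above. The patterns and the `looks_dependency_related` helper are my own assumptions based on typical javac and Clang wording; they are not the researchers' classifier, and real compiler messages vary by version.

```python
import re

# Hypothetical patterns, loosely matching the error kinds named above.
DEPENDENCY_PATTERNS = [
    re.compile(r"cannot find symbol"),            # javac: unresolved symbol
    re.compile(r"package \S+ does not exist"),    # javac: missing package
    re.compile(r"use of undeclared identifier"),  # Clang: undeclared name
    re.compile(r"no member named"),               # Clang: missing class member
]


def looks_dependency_related(message: str) -> bool:
    """Return True if the message matches any of the patterns above."""
    return any(p.search(message) for p in DEPENDENCY_PATTERNS)
```

A fuller version would need patterns for the other four categories (type mismatch, syntax, semantic and other), which the article does not spell out.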
C++ generates more build errors than Java, but they’re easier to fix
The study found that the median build failure rate for C++ code was 38.4%, while the median for Java was 28.5%. Syntax errors also occurred more frequently when building C++ code than Java. The researchers attribute this difference to the greater use of IDEs in Java development, which helps to cut down on these simpler errors. The larger share of simple errors in C++ probably also helps to explain why C++ build errors tended to be resolved more quickly than Java errors.
One of the key implications of the study is that tool developers should focus on helping software engineers prevent or resolve dependency errors. Cutting back on the number of these build errors or, at least, the time required to resolve them should help improve developer productivity. In fact, the authors say that, based on this study, the Google infrastructure team is now trying to do just that.
How generalizable these Google-specific findings are to the rest of the software developer population is unclear. But it’s a good start on a type of research that could eventually make developers’ lives easier and improve the efficiency of the whole software development process.
Read more of Phil Johnson's #Tech blog and follow the latest IT news at ITworld. Follow Phil on Twitter at @itwphiljohnson. For the latest IT news, analysis and how-tos, follow ITworld on Twitter and Facebook.