A team of researchers at Texas A&M University studied the best way to document computer code by using code samples and found that good naming, followed by good comments, mattered most.

A team of researchers at Texas A&M University looked at the best way to document computer code by using samples that were functionally the same but documented differently. Because only the documentation varied, the team could link higher scores and faster performance to specific documentation styles and see which styles worked better than others.
“This research is exciting to me because even with a relatively small amount of data we had statistically significant results that really help to inform how we should document the code we write,” said Scott Kolodziej, a graduate student at Texas A&M’s department of computer science and engineering. “Ultimately, we’ve helped to answer the question of what constitutes good documentation: first, good naming, and second, good comments.”
They also uncovered an interesting correlation: poor documentation seems to lead to a more correct understanding of the code, at the cost of time.
“While we’re not advocating that you document your code poorly, it may imply that too much documentation distracts the reader from what the code actually does, or lulls them into a false sense of understanding, even when the documentation is not meant to mislead,” Kolodziej said.
Opinions vary about what makes documentation good, clear and understandable. But hard evidence in support of these opinions can be hard to come by. Kolodziej and his team wanted to provide objective data about what makes quality software.
“Ultimately, we would like to add to a statistically and methodologically sound foundation to software engineering,” Kolodziej said. “In this case, what does good documentation look like?”
The project started as part of the Aggie Research Leadership/Scholars Program, an on-campus program that bridges the gap between undergraduates interested in research and graduate students looking for mentoring opportunities. Kolodziej built a team of undergraduates, and together they planned and designed the details of the study. Once the experiment was designed, another 24 undergraduate students were recruited to participate in the study.
“I first became interested in empirical software engineering after reading some papers by Dr. Andreas Stefik from the University of Nevada, Las Vegas,” Kolodziej said. “I wanted to help contribute to the body of empirical evidence underpinning software engineering by conducting my own experimental study.”
The study found that better variable and function naming was much more effective than relying on comments alone.
“This was very surprising, especially given the traditional importance given to commenting code,” Kolodziej said. “It implies that a software engineer should spend at least as much time coming up with descriptive and clear names for their variables and functions than simply commenting their code and hoping that makes up for names like ‘x’ and ‘y’.”
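To make the contrast concrete, here is a minimal sketch of the kind of pair Kolodziej describes: two functions that behave identically, one leaning on comments to explain names like "x" and "y," the other relying on descriptive names. The example is hypothetical and is not taken from the study's materials; the function names, parameters and the tax calculation are assumptions chosen only for illustration.

```python
# Hypothetical illustration; not one of the study's actual code samples.

# Style 1: terse names, with comments carrying all of the meaning.
def f(x, y):
    # x is a list of prices, y is the tax rate (e.g. 0.5 for 50%)
    z = 0
    for i in x:
        z += i  # add up the prices
    return z * (1 + y)  # apply the tax to the sum


# Style 2: descriptive names, so the code largely explains itself.
def total_with_tax(prices, tax_rate):
    subtotal = sum(prices)
    return subtotal * (1 + tax_rate)


# Both versions compute the same result for the same inputs.
print(f([10, 20], 0.5), total_with_tax([10, 20], 0.5))
```

A reader of the second version can likely follow the calculation without any comments, while the first version only makes sense after reading the comment that explains what "x" and "y" stand for.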
Texas A&M University, College Station, TX
– Edited by Chris Vavra, production editor, Control Engineering, CFE Media.