Enrollment in undergraduate computer science courses is at an all-time high at colleges nationwide. But this trend, hailed by the U.S. tech industry, has a dark side: a disproportionate number of students taking these courses are caught cheating.
More students are caught cheating in introductory computer science courses than in any other course on campus, thanks to automated tools that professors use to detect unauthorized code reuse, excessive collaboration and other forbidden ways of completing homework assignments.
Computer science professors say their students are not more dishonest than students in other fields; they're just more likely to get caught because software is available to check for plagiarism.
Half the academic dishonesty cases at the University of Washington involve computer science students.
"The truth is that on every campus, a large proportion of the reported cases of academic dishonesty come from introductory computer science courses, and the reason is totally obvious: we use automated tools to detect plagiarism," explains Professor Ed Lazowska, chair of computer science and engineering at the University of Washington. "We compare against other student submissions, and we compare against previous student submissions and against code that may be on the Web. These tools flag suspicious cases, which are then manually examined."
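As a rough illustration of the workflow Lazowska describes (compare every submission against the others, flag suspicious pairs, then hand them to a person), here is a minimal sketch in Python. It is not the tool any of these universities uses, which the article does not name. The function names, the 0.9 threshold and the sample submissions are all assumptions made for the example, and it relies only on Python's standard `difflib` module.

```python
import difflib
import itertools
import re


def normalize(source: str) -> str:
    """Drop '#' comments and collapse whitespace so cosmetic edits do not hide copying.

    Assumption: a naive regex is good enough for a sketch; it would also
    strip '#' characters that appear inside string literals.
    """
    no_comments = re.sub(r"#.*", "", source)
    return " ".join(no_comments.split())


def flag_similar(submissions, threshold=0.9):
    """Return (name_a, name_b, ratio) for every pair at or above the threshold.

    The threshold is a hypothetical value; flagged pairs are only
    candidates for manual review, not proof of cheating.
    """
    cleaned = {name: normalize(text) for name, text in submissions.items()}
    flagged = []
    for a, b in itertools.combinations(sorted(cleaned), 2):
        ratio = difflib.SequenceMatcher(None, cleaned[a], cleaned[b]).ratio()
        if ratio >= threshold:
            flagged.append((a, b, round(ratio, 2)))
    return flagged


# Hypothetical submissions: bob's differs from alice's only by a comment.
sample = {
    "alice": "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n",
    "bob": "def total(xs):  # add them up\n    s = 0\n    for x in xs:\n        s += x\n    return s\n",
    "carol": "print('hello world')\n",
}

flagged = flag_similar(sample)
print(flagged)
```

Comparing raw characters is the simplest possible choice. A real detector would more likely compare token streams or program structure, so that renaming variables does not defeat it.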
Stanford University disclosed in February that 23% of its honor code violations involved computer science students, although these students represent only 6.5% of the student body. Of 123 honor code violations investigated last year by Stanford's Judicial Panel, 28 involved computer science students.
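The Stanford figures can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the variable names are ours.

```python
# Figures as reported for Stanford's Judicial Panel.
cs_cases = 28
total_cases = 123
cs_share_of_students = 0.065  # 6.5% of the student body

# Share of honor code cases involving computer science students.
cs_share_of_cases = cs_cases / total_cases

# How many times larger that share is than the students' share of enrollment.
overrepresentation = cs_share_of_cases / cs_share_of_students

print(f"{cs_share_of_cases:.1%} of cases, "
      f"{overrepresentation:.1f}x the enrollment share")
```

The share of cases works out to just under 23%, roughly three and a half times the students' share of the student body.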
"The tools that we employ make it easy to catch students cheating," says Professor Mehran Sahami, associate professor of computer science at Stanford University. "I wouldn't say that computer science students violate the honor code more often or are any more dishonest."
Sahami says computer science students mistakenly believe that writing code is like completing a mathematical proof, where only one correct answer exists. What they don't realize is that writing code is more akin to writing an essay: a significant amount of creativity is involved.
"One of the things that happens in computer science and contributes to the cheating rates is that students are unaware of how dissimilar programs that do the same task really look," Sahami says. "They tend to think that it's OK if they copy portions of someone else's program. But our tools can discover this."
Behind the cheating epidemic
As more college students study computer science, the number of cheating incidents is on the rise.
At the University of Washington, total enrollment in the two introductory computer science courses is the highest ever, approaching 2,750 students over the last four quarters compared to 2,500 students a year ago. In these two courses, between 1% and 2% of the assignments are identified as involving academic dishonesty, Lazowska says.
The top computer science schools in the country are experiencing the same trend.
"We see more cheating. It's something we're working hard to stop through a number of means," says Professor Lenny Pitt, director of undergraduate programs at the Department of Computer Science at the University of Illinois. "Cheating has gone up across many fields because it's easier to find with the tools we have today."
Cheating in introductory computer science classes typically involves homework assignments rather than exams. Students get frustrated when their code won't run, and that's when they are tempted to borrow solutions from someone else.
Because computer science assignments are complicated and need debugging, professors assign the same homework several years in a row. That means students can find code written by someone who has already taken the course.
"There's a lot of infrastructure that needs to be built and a lot of effort that goes into creating a homework problem. There's a disincentive for professors to change those problems every semester. So we tend to reassign similar problems, and that causes cheating because past solutions are available," Pitt says, adding that the University of Illinois checks homework against a repository of past solutions.
The other common form of cheating involves excessive collaboration.
"Many of our students like to collaborate, but at what point are you copying?" Pitt asks. "The course policy needs to be really clear. Some courses will allow you to work in pairs but not in triples. If you don't follow that policy, we would call that cheating."
At Stanford, students are allowed to discuss strategies for solving homework problems, but they need to write their own code. "You shouldn't be looking at someone else's program, and you shouldn't be sharing yours with other computer science students," Sahami says.
Sahami says the introductory computer science courses require students to code their own programs, while higher-level courses allow for more teamwork. "We want them to learn the mechanics first, and then open up the world of collaboration," he adds.
Students often resort to cheating because they feel excessive pressure from their peers and their families to excel, professors say.
"Anecdotally, we see a larger incidence of cheating among foreign students than domestic students, and I think part of that is that they are under extreme pressure to do well and succeed," Pitt says. "That pressure contributes to the idea that they need to do as well as possible even if it means taking some shortcuts."
Computer science professors say just as much copying and excess collaboration goes on in other college courses.
"Does anyone in their right mind think that [cheating] isn't happening in large introductory courses in other fields? If so, they're smoking something," Lazowska says. "There have been several cases in which faculty in other disciplines have adapted these tools to detect plagiarism in term papers, and have found plagiarism rates far greater than typically encountered in computer science courses."
Why industry should care
As college recruiters traipse across the country seeking the best computer science graduates from the Class of 2010, they need to be aware of the rising incidence of cheating in computer science courses.
"Ethics is a huge issue in the IT industry, not just with copyrights and licensing but with privacy," says David Foote, CEO of Foote Partners, a research firm tracking IT workforce management and compensation issues. Foote sits on the advisory council for Hiram College's Center for the Study of Ethics and Values. "We need to be looking at the idea of how do we breed people to be concerned naturally about issues of ethics and morality."
For CIOs and other hiring executives, the issue is whether copying code for homework assignments could lead prospective employees to ethical lapses on the job. Businesses want to hire entry-level workers who won't get in trouble for software licensing, patent infringement and other intellectual property problems.
"There are not enough programs in American universities that are looking at ethics and cheating," Foote says. "Don't pick on computer science people. Pick on the fact that universities are not doing enough in general about calling attention to cheating."
Foote points out that some incidents of excess collaboration that qualify as cheating on college campuses would be encouraged by industry, which wants employees to reuse software to improve efficiency and cut costs.
"In the real world, people write code in teams where they are given pieces of a project to work on," Foote says. "The academic world should be mapping onto the real world…They shouldn't be handing out assignments where people are coding on their own."
To encourage collaboration, Georgia Tech changed its approach to cheating in its introductory computer science courses in 2007. Instead of trying to catch and punish students for excessive collaboration, Georgia Tech requires students to disclose the names of the other students who helped them complete their homework.
"Students sign a collaboration agreement," explains Cedric Stallworth, assistant dean for Undergraduate Enrollment at Georgia Tech's College of Computing. "We realize that computing is one of the subjects that is best learned in a group. If students are using somebody else's code and are learning from it, that's all right."
To ensure that the students are mastering the material, Georgia Tech requires them to give an oral demonstration of how their software works for one of their teaching assistants.
"We worry less about catching cheaters. We worry more about properly assessing the student's skill set," Stallworth says. "Less percentage of the grade is from homework and more percentage is from the assessment, and the assessment is designed to truly [measure] skills. Then you can cheat on homework, but that's not going to help you with the assessment that counts for the bulk of your grade."
No rise in failure rates
Computer science professors do not think that cheating is a result of students being unprepared to take introductory-level courses. They are not seeing a rise in failure rates or a drop in grades. Instead, they say their students are more qualified than ever.
"We haven't seen an increase in failure rates," Stanford's Sahami says, adding that 5% or less of students typically fail introductory computer classes. "This is not a student body that accepts failure. For them to pass all of their classes is an important thing."
Lazowska says rising enrollment in computer science courses is a more important trend than the resulting increase in plagiarism cases.
"An ever-broader range of students is recognizing that, even if they major in something else, college-level preparation in computational thinking is essential," Lazowska says. "There is no reason to believe that computer science students are anything other than better than ever."
This story, "Why computer science students cheat," was originally published by Network World.