Automated Plagiarism Detection for Computer Programming Exercises Based on Patterns of Resubmission

Proceedings of the 2018 ACM Conference on International Computing Education Research (ICER '18), pages 178--186. New York, NY, USA, ACM, 2018.
DOI: 10.1145/3230977.3231006

Abstract

Plagiarism detection for computer programming exercises is a difficult problem. A traditional strategy has been to compare the submissions from all of the students in a class, searching for similarities between submissions suggestive of copying. Automated tools exist that compare submissions in order to help with this search. Increasingly, however, instructors have allowed students to submit multiple solutions, receiving formative feedback between submissions, with feedback often generated by automated assessment systems. Permitting multiple submissions enables a fundamentally new way to detect plagiarism. Specifically, students may struggle with an exercise until frustration leads them to submit work that is not their own. We present a method for detecting plagiarism from the sequence of submissions made by an individual student. We have explored a variety of measures of program change over submissions, and we have found a set of features that can be transformed, using logistic regression, into a score capturing the likelihood of plagiarism. We have applied this method to data from four exercises from an undergraduate programming class. We show that our automatically generated scores are strongly correlated with the assessments of plagiarism made by an expert instructor. Thus, the scores can act as a powerful tool for searching for cases of academic dishonesty.
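The core idea — measure how much a student's code changes between consecutive submissions, then map those change features to a plagiarism-likelihood score with logistic regression — can be sketched as below. This is a minimal illustration, not the paper's actual feature set or fitted model: the two features (largest one-step change, relative size jump at that step) and the weights are hypothetical placeholders.

```python
import math
from difflib import SequenceMatcher

def change_features(submissions):
    """Features from one student's submission sequence (illustrative,
    not the features used in the paper): the largest one-step textual
    change between consecutive submissions, and the relative size jump
    at that same step. A sudden wholesale rewrite — consistent with
    pasting in someone else's solution — drives both features up."""
    max_change = 0.0
    size_jump = 0.0
    for prev, curr in zip(submissions, submissions[1:]):
        similarity = SequenceMatcher(None, prev, curr).ratio()
        change = 1.0 - similarity
        if change > max_change:
            max_change = change
            size_jump = (len(curr) - len(prev)) / max(len(prev), 1)
    return [max_change, size_jump]

def plagiarism_score(features, weights=(6.0, 2.0), bias=-4.0):
    """Logistic-regression-style score in [0, 1]. The weights and bias
    here are made-up placeholders; the paper fits real coefficients to
    instructor-labeled cases."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# A sequence of incremental edits scores lower than one whose code
# changes wholesale in a single step.
gradual = ["print(1)", "print(1)\nprint(2)", "print(1)\nprint(2)\nprint(3)"]
sudden = ["x = 1", "def solve(n):\n    return sum(range(n + 1))  # pasted in whole"]
print(plagiarism_score(change_features(gradual)))
print(plagiarism_score(change_features(sudden)))
```

The scores are not verdicts: as in the paper, they would serve to rank students' submission histories so an instructor can examine the most suspicious cases first.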
