Plagiarism-detection software harms students

Students are often not terribly clear about what constitutes plagiarism. You may chalk this up to inadequate instruction in high school or first-year undergraduate writing classes. But I’ll let you in on a secret: academics, in general, are not always clear about what constitutes plagiarism. Despite this real-world lack of clarity, plagiarism-detection software often presents itself as providing clear and concrete answers about plagiarism. Recently, I met a student who was unkindly caught up in the confluence of these rough waters.

Muddier than we think

Sure, most folks in the academic world agree with a roughly sketched definition of plagiarism as “passing off someone else’s ideas as your own” or “using someone else’s work without proper credit”. But when you get into the details of what “passing off” or “proper credit” means, academic disciplines, communities of work or practice, and cultures and subcultures rapidly diverge. The cite-for-every-factual-statement-or-reference-and-then-cite-some-more practices of legal academia would be distasteful to researchers in many scientific disciplines, in which it is usually accepted that you do not have to provide citations for commonly understood facts and basic ideas. And while academics in the US and many other countries valorize original compositions, some cultures encourage copying from one’s predecessors as a form of demonstrating competence within a field.

When I encourage a roomful of academics to consider the ethics of Roy Lichtenstein’s copying of comic-book panels, the tenor of the discussion often turns on whether there are any fine arts faculty members in the room. (Very generally speaking, fine artists are more in favor of his kind of copying than folks in other disciplines.)

And I can’t begin to count the number of minutes I’ve spent explaining to senior faculty members, K-12 teachers, graduate students, undergraduates, librarians of every stripe, and many more people, that in the U.S. legal system, there is no legal cause of action for plagiarism, and that plagiarism is an entirely distinct issue from copyright. (You can infringe copyrights without plagiarizing, and plagiarize without infringing.)

Detection software – a questionable solution

Into this world of confusion that people don’t seem to know they’re confused about, enter the Internet (which supposedly makes plagiarism tons easier than before), high teaching loads (and increasing reliance on underpaid adjuncts), and greater emphasis on writing-enriched college curricula. I get why plagiarism-detection software -exists-; I just don’t think it’s helping as much as people think it is. A number of other folks have written great critiques of this kind of software on a number of points. I’ve always worried about false positives (but trusted that thoughtful instructors would probably control for that), about the false sense of security it may be giving (false negatives), about the costs to campuses, and most importantly about the semi-extortionate practice of requiring students to hand over their intellectual property to a system that will use it against them and their peers.

So I’ve had my concerns about this software for a while, but more recently I’ve become aware that many of the major plagiarism-detection services have started marketing separately and directly to students. This strikes me as something institutional subscribers should be -very- concerned about: this is double-dipping, and it’s encouraging an arms race between different market segments for plagiarism-detection.

Adding up to a stressed-out student

Recently I encountered a student with some plagiarism-detection problems I hadn’t seen before. First, he was using plagiarism-detection software of his own volition, before submitting a piece of writing. He ran his work through Grammarly, and received a report of “matches”: content that might be plagiarized. I’m not entirely certain how the software framed this, whether it was presented as instances of plagiarism or as simply matches to existing content, but from the student’s seeming level of anxiety, I suspect the former. Concerningly, he had to pay the software provider to learn what these supposed matches were. (I think the initial submission may have been free, but do not know whether they retained his paper for future submissions to be matched against.)

Once he had paid, the matches that they revealed to him were extremely generic pieces of academic-ese prose, from sources he had never read or even heard of, in disciplines entirely unrelated to his work. Although I have his permission to talk about his experience, I don’t want to quote the specific phrases, because they might identify him to Grammarly. They were along the lines of these examples: “results illustrate these and other basic principles of” or “our methods were both efficient and reproducible, as evidenced by” (roughly this length and style of content). While I literally just made up those examples, I wouldn’t be surprised if they matched in a search against a large database of writing samples.

Again, I don’t know that these were presented as definitely plagiarised, but the student was deeply concerned about how to deal with the situation, while seeming otherwise a pretty well-informed and level-headed person. I suggested he could talk about it with an academic advisor if he had ongoing concerns about whether he might get in trouble for copying language from sources he hadn’t read. Hopefully, an advisor would agree that this is not actually an issue.

You may find the student’s confusion and concern evidence of a poor understanding of what constitutes plagiarism, but I find it pretty understandable. First, as I said above, a lot of students (even graduate students, as he was) are still a bit unclear on plagiarism, and matches like these may confuse further. But it’s also understandable even if he does fully understand that he can’t have plagiarised from sources he never read: he does not know whether his work will be run through other plagiarism detection software, or whether those evaluating his work will be able to recognize a false positive.

This student suffered actual harm: financial harm from the highly questionable business practice of identifying “matches” and then charging users to see what the matches are. (I would be less concerned if Grammarly had charged entirely up-front, or if the first couple of matches had been identifiable for free, so the student could have seen they were false positives without paying.) But there’s also a more nebulous harm: someone made him nervous enough to want to know the content of the “matches” the software had identified, and to be unsure how to proceed when the matches were from works he hadn’t consulted. While I think there are other factors contributing there, I’m pretty sure that the existence and use of plagiarism-detection software (and the way that software is marketed and sold) is undermining students’ confidence in their own work, and creating legitimate concerns for them as to how it will be received by evaluators.
