Overview

Turnitin has announced a preview of a new AI Writing Detection Indicator to help support academic integrity. This feature will check any Turnitin submission and predict which parts were written by a human and which were generated using AI writing tools such as ChatGPT. The preview begins April 4th, 2023; the tool is still in development.

USC instructors will be able to find this feature when grading and providing feedback on student submissions in the Turnitin Feedback Studio. You will also be able to access an AI writing detection report, which will have more resources from Turnitin on their new feature.

Limitations

According to Turnitin, its AI writing detection is at best 97% accurate, and the company acknowledges a false positive rate of at least 1%. Furthermore, Turnitin’s detection model is currently trained only on content generated by GPT-3, GPT-3.5, and variants of these language models, which include ChatGPT.
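
To put the false positive rate in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the class size and the function name are hypothetical, and it assumes the 1% rate quoted above applies per paper.

```python
def expected_false_positives(human_written_papers: int, false_positive_rate: float) -> float:
    """Expected number of human-written papers wrongly flagged as AI-generated."""
    return human_written_papers * false_positive_rate


# Hypothetical course: 200 papers, all written by students without AI tools.
# Assumption: the 1% false positive rate applies to each paper independently.
flagged = expected_false_positives(200, 0.01)
print(f"Expected wrongly flagged papers: {flagged:.0f}")
```

Under these assumptions, roughly two honest papers in a 200-student course would be flagged, which is why the indicator alone should not be treated as proof.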

While we feel this feature can be a powerful tool for detecting potential instances of AI-generated writing, the tool is still in its infancy. Thus, it should be used only as a starting point for further conversations with students about writing and academic integrity, rather than as the final word on student work.

Feature Testing: Updated April 14th, 2023

The USC Blackboard Support Team has been testing the features of the Turnitin AI writing detection tool. Since this is a tool still in its infancy, we want to make sure you have the most accurate information on it if you choose to try it out. Here are some of the concerns we have so far as we continue testing this tool:

False positives

We’ve submitted a few sample papers of our own and have already begun to see Turnitin misclassify some of them. This includes false positives (a paper written by a human gets treated as if it’s written by AI) as well as false negatives (a paper written by AI gets treated as if it’s written by a human). We will continue to monitor the rate at which Turnitin misclassifies papers for AI writing.

Papers submitted before April 4th

For any papers that were submitted to Turnitin before the release of this tool (April 4th, 2023), Turnitin will not give any score for whether the paper was written by a human or by AI. When you open past papers in Turnitin, the AI writing detection indicator will either remain blank or show 0%. A 0% in this case is not an accurate AI writing score; it only means the paper was never analyzed. Any papers submitted before April 4th will have to be resubmitted to Turnitin if you’d like to run them through Turnitin’s AI writing detector.

Guidance

We continue to recommend using Turnitin in order to detect instances of content plagiarism in student submissions. If you choose to try out Turnitin’s new AI writing detection feature, when you receive an AI writing detection report, take the time to carefully review the report in combination with your own experience and judgment. Again, this tool should not be used as the final word on a student’s work.

If you have any doubts or concerns about a submission, follow up with the student to gather more information to help you assess the situation. Additionally, the Office of Academic Integrity (academicintegrity@usc.edu) can provide guidance and support.

Further Support

Visit Turnitin’s help page on AI writing detection to learn more about this tool. Turnitin has also created a user guide and an FAQ sheet to help you navigate the tool’s features.

If you have any other comments or questions about this feature, please email us at blackboard@usc.edu.

Frequently Asked Questions

How does the solution work?

When a paper is submitted to Turnitin, the submission is first broken into segments of text that are roughly a few hundred words (about five to ten sentences). Those segments are then overlapped with each other to capture each sentence in context.

The segments are run against the AI detection model, which gives each sentence a score between 0 and 1 to indicate whether it was written by a human or by AI. If the model determines that a sentence was not generated by AI, the sentence receives a score of 0. If it determines that the entire sentence was generated by AI, the sentence receives a score of 1.

Using the average scores of all the segments within the document, the model then generates an overall prediction of how much text in the submission we believe has been generated by AI (with 98% confidence, based on data that was collected and verified in our AI Innovation lab). For example, when we say that 40% of the overall text has been AI-generated, we’re 98% confident that is the case.

Currently, Turnitin’s AI writing detection model is trained to detect content from the GPT-3 and GPT-3.5 language models, which includes ChatGPT. We are actively working on expanding our model to enable us to better detect content from other AI language models.
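
The steps described above (split into overlapping segments, score, average) can be sketched roughly as follows. This is not Turnitin’s implementation: every function name is hypothetical, segment size is counted in sentences for simplicity, `score_segment` is a stand-in for the real detection model, and taking a sentence’s score as the mean of the segments that contain it is an assumption made for illustration.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def overlapping_segments(n_sentences: int, size: int, step: int) -> List[List[int]]:
    """Return windows of sentence indices; consecutive windows share sentences."""
    if n_sentences <= size:
        # Short document: a single segment, so there is nothing to overlap.
        return [list(range(n_sentences))]
    starts = list(range(0, n_sentences - size + 1, step))
    if starts[-1] != n_sentences - size:
        starts.append(n_sentences - size)  # make sure the tail is covered
    return [list(range(s, s + size)) for s in starts]


def sentence_scores(
    sentences: List[str],
    score_segment: Callable[[str], float],  # stand-in for the real model
    size: int = 5,
    step: int = 2,
) -> List[float]:
    """Assumed rule: a sentence's score is the mean score of its segments."""
    totals = [0.0] * len(sentences)
    counts = [0] * len(sentences)
    for window in overlapping_segments(len(sentences), size, step):
        score = score_segment(" ".join(sentences[i] for i in window))
        for i in window:
            totals[i] += score
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def overall_percentage(scores: List[float]) -> float:
    """Average the sentence scores into one document-level percentage."""
    return 100 * sum(scores) / len(scores) if scores else 0.0


def toy_scorer(segment: str) -> float:
    """Toy stand-in, NOT a real detector: flags segments containing 'robot'."""
    return 1.0 if "robot" in segment else 0.0


text = "I wrote this. I wrote that too. The robot wrote this. The robot wrote that."
scores = sentence_scores(split_sentences(text), toy_scorer, size=2, step=1)
print(scores)                      # [0.0, 0.5, 1.0, 1.0]
print(overall_percentage(scores))  # 62.5
```

In this toy run, the second sentence gets 0.5 because it sits in one flagged segment and one unflagged segment. That shows how overlap lets context from neighboring sentences influence a sentence’s score.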

Will this impact the current Turnitin Similarity Report?

No. This additional functionality does not change the way you use the Similarity report or your existing workflows. The AI detection capabilities have been added to the Similarity report to provide a seamless experience for users.

What does the percentage in the AI writing detection indicator mean?

The percentage indicates the amount of qualifying text within the submission that Turnitin’s AI writing detection model determines was generated by AI (with 98% confidence, based on data that was carefully collected and verified in a controlled lab environment). Qualifying text includes only prose sentences: the model analyzes only blocks of text written in standard grammatical sentences and excludes other types of writing such as lists, bullet points, or other non-sentence structures.

This percentage is not necessarily the percentage of the entire submission. If text within the submission is not considered long-form prose text, it will not be included. Unlike the Similarity Report, the AI writing percentage does not necessarily correlate to the amount of text in the submission. Turnitin’s AI writing detection model only looks for prose sentences contained in long-form writing, that is, individual sentences that make up a longer piece of written work such as an essay, a dissertation, or an article. The model does not detect AI-generated text such as poetry, scripts, or code. Nor does it detect short-form or unconventional writing such as bullet points, tables, or short exam answers.
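
A small worked example may help show why the indicator can differ from the share of the whole submission. The numbers and function names below are hypothetical, and counting in words is an assumption; Turnitin does not state here which unit it uses.

```python
def ai_indicator(qualifying_words: int, flagged_words: int) -> float:
    """Percentage of qualifying prose that is predicted to be AI-generated."""
    if qualifying_words == 0:
        return 0.0  # nothing qualified, so there is nothing to report
    return 100 * flagged_words / qualifying_words


def share_of_submission(total_words: int, flagged_words: int) -> float:
    """Percentage of the entire submission that the flagged text represents."""
    return 100 * flagged_words / total_words


# Hypothetical paper: 1,000 words in total, of which 600 are long-form prose
# (the rest are tables and bullet points), and 300 prose words are flagged.
print(ai_indicator(600, 300))          # 50.0
print(share_of_submission(1000, 300))  # 30.0
```

Here the indicator would read 50% even though the flagged text makes up only 30% of the paper, because the tables and bullet points were never analyzed.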

Will students see the AI writing detection results?

No, the AI writing detection indicator and report are not visible to students.

What is the difference between the Similarity score and AI writing detection percentage?

The Similarity score and the AI writing detection percentage are completely independent and do not influence each other.

  • The Similarity score indicates the percentage of matching text found in the submitted document when compared to Turnitin’s comprehensive collection of content for similarity checking.
  • The AI writing detection percentage, on the other hand, shows the overall percentage of text in a submission that Turnitin’s AI writing detection model predicts was generated by AI writing tools.

Which AI writing models can Turnitin’s technology detect?

The first iteration of Turnitin’s AI writing detection capabilities has been trained to detect content from models including GPT-3, GPT-3.5, and their variants. The technology can also detect content from other AI writing tools that are based on these models, such as ChatGPT.

The detectors are trained on the outputs of GPT-3, GPT-3.5, and ChatGPT, and modifying text generated by these systems will have an impact on the detectors’ ability to identify AI-written text. Turnitin’s AI Innovation Lab conducted tests using open-source paraphrasing tools (including different LLMs), and in most cases the detector retained its effectiveness and was able to identify text as AI-generated even when a paraphrasing tool had been used to change the AI output.

What can I do if I feel that the AI indicator is incorrect? How are false positives dealt with?

False positives (human-written text incorrectly flagged as AI-generated) sometimes occur in lists without much structural variation, text that literally repeats itself, or text that has been paraphrased without developing new ideas. If the indicator shows a higher amount of AI writing in such text, Turnitin advises you to take that into consideration when looking at the percentage indicated.

In a longer document with a mix of authentic writing and AI-generated text, it can be difficult to determine exactly where the AI writing begins and original writing ends.

In shorter documents where there are only a few hundred words, the prediction will be mostly “all or nothing” because the tool is predicting on a single segment without the opportunity to overlap. This means that some text that is a mix of AI-generated and original content could be flagged as entirely AI-generated.

Please consider these points as you are reviewing the data and following up with students or others.

What is the threshold percentage at which students have committed academic misconduct?

While Turnitin has confidence in their model, this tool does not make a determination of misconduct; rather, it provides data for educators to make an informed decision based on their academic and institutional policies and personal experience. This is why we would not recommend using a specific percentage on the AI writing detector as the sole basis for action or a definitive grading measure by instructors.