Big Data News Hubb
Advertisement
  • Home
  • Big Data
  • News
  • Contact us
No Result
View All Result
  • Home
  • Big Data
  • News
  • Contact us
No Result
View All Result
Big Data News Hubb
No Result
View All Result
Home Big Data

New AI classifier for indicating AI-written text

admin by admin
February 1, 2023
in Big Data


We’re launching a classifier trained to distinguish between AI-written and human-written text.

We’ve trained a classifier to distinguish between text written by a human and text written by AIs from a variety of providers. While it is impossible to reliably detect all AI-written text, we believe good classifiers can inform mitigations for false claims that AI-generated text was written by a human: for example, running automated misinformation campaigns, using AI tools for academic dishonesty, and positioning an AI chatbot as a human.

Our classifier is not fully reliable. In our evaluations on a “challenge set” of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as “likely AI-written,” while incorrectly labeling human-written text as AI-written 9% of the time (false positives). Our classifier’s reliability typically improves as the length of the input text increases. Compared to our previously released classifier, this new classifier is significantly more reliable on text from more recent AI systems.

We’re making this classifier publicly available to get feedback on whether imperfect tools like this one are useful. Our work on the detection of AI-generated text will continue, and we hope to share improved methods in the future.

Try our free work-in-progress classifier yourself:

Limitations

Our classifier has a number of important limitations. It should not be used as a primary decision-making tool, but instead as a complement to other methods of determining the source of a piece of text.

  1. The classifier is very unreliable on short texts (below 1,000 characters). Even longer texts are sometimes incorrectly labeled by the classifier.
  2. Sometimes human-written text will be incorrectly but confidently labeled as AI-written by our classifier.
  3. We recommend using the classifier only for English text. It performs significantly worse in other languages and it is unreliable on code.
  4. Text that is very predictable cannot be reliably identified. For example, it is impossible to predict whether a list of the first 1,000 prime numbers was written by AI or humans, because the correct answer is always the same.
  5. AI-written text can be edited to evade the classifier. Classifiers like ours can be updated and retrained based on successful attacks, but it is unclear whether detection has an advantage in the long-term.
  6. Classifiers based on neural networks are known to be poorly calibrated outside of their training data. For inputs that are very different from text in our training set, the classifier is sometimes extremely confident in a wrong prediction.

Training the classifier

Our classifier is a language model fine-tuned on a dataset of pairs of human-written text and AI-written text on the same topic. We collected this dataset from a variety of sources that we believe to be written by humans, such as the pretraining data and human demonstrations on prompts submitted to InstructGPT. We divided each text into a prompt and a response. On these prompts we generated responses from a variety of different language models trained by us and other organizations. For our web app, we adjust the confidence threshold to keep the false positive rate low; in other words, we only mark text as likely AI-written if the classifier is very confident.

Impact on educators and call for input

We recognize that identifying AI-written text has been an important point of discussion among educators, and equally important is recognizing the limits and impacts of AI generated text classifiers in the classroom. We have developed a preliminary resource on the use of ChatGPT for educators, which outlines some of the uses and associated limitations and considerations. While this resource is focused on educators, we expect our classifier and associated classifier tools to have an impact on journalists, mis/dis-information researchers, and other groups.

We are engaging with educators in the US to learn what they are seeing in their classrooms and to discuss ChatGPT’s capabilities and limitations, and we will continue to broaden our outreach as we learn. These are important conversations to have as part of our mission is to deploy large language models safely, in direct contact with affected communities.

If you’re directly impacted by these issues (including but not limited to teachers, administrators, parents, students, and education service providers), please provide us with feedback using this form. Direct feedback on the preliminary resource is helpful, and we also welcome any resources that educators are developing or have found helpful (e.g., course guidelines, honor code and policy updates, interactive tools, AI literacy programs).



Source link

Previous Post

Per Scholas and Cloudera Innovate to Solve for the Skills Gap in Data

Next Post

ChatGPT vs Chatsonic – Big Data Analytics News

Next Post

ChatGPT vs Chatsonic - Big Data Analytics News

Recommended

Introducing Whisper

January 19, 2023

Why a Data-driven Culture is Important to the Success of your SaaS Business

January 21, 2023

Are We Nearing the End of ML Modeling?

February 7, 2023

Don't miss it

News

Bill Gates Says the Age of AI Has Begun, Bringing Opportunity and Responsibility

March 25, 2023
Big Data

Techniques for training large neural networks

March 25, 2023
Big Data

O’Reilly 2023 Tech Trends Report Reveals Growing Interest in Artificial Intelligence Topics, Driven by Generative AI Advancement

March 24, 2023
Big Data

Democratizing the magic of ChatGPT with open models

March 24, 2023
News

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 2: AWS Glue Studio Visual Editor

March 24, 2023
News

ChatGPT Puts AI At Inflection Point, Nvidia CEO Huang Says

March 24, 2023

big-data-footer-white

© 2022 Big Data News Hubb All rights reserved.

Use of these names, logos, and brands does not imply endorsement unless specified. By using this site, you agree to the Privacy Policy and Terms & Conditions.

Navigate Site

  • Home
  • Big Data
  • News
  • Contact us

Newsletter Sign Up

No Result
View All Result
  • Home
  • Big Data
  • News
  • Contact us

© 2022 Big Data News Hubb All rights reserved.