Big Data News Hubb

Language models can explain neurons in language models

by admin
May 9, 2023
in Big Data


Although the vast majority of our explanations score poorly, we believe we can now use ML techniques to further improve our ability to produce explanations. For example, we found we were able to improve scores by:

  • Iterating on explanations. We can increase scores by asking GPT-4 to come up with possible counterexamples, then revising explanations in light of their activations.
  • Using larger models to give explanations. The average score goes up as the explainer model’s capabilities increase. However, even GPT-4 gives worse explanations than humans, suggesting room for improvement.
  • Changing the architecture of the explained model. Training models with different activation functions improved explanation scores.
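The first bullet describes a revise-and-rescore loop. A minimal sketch of that loop follows, assuming the explainer calls, the activation lookup, and the scorer are supplied as plain Python callables. All names here (`refine_explanation`, `propose_counterexamples`, `get_activations`, `revise`, `score`) are hypothetical and are not taken from the released code; keeping a revision only when it scores higher is also an assumption.

```python
from typing import Callable, List, Sequence, Tuple

Evidence = List[Tuple[str, Sequence[float]]]


def refine_explanation(
    explanation: str,
    propose_counterexamples: Callable[[str], Sequence[str]],
    get_activations: Callable[[str], Sequence[float]],
    revise: Callable[[str, Evidence], str],
    score: Callable[[str], float],
    rounds: int = 3,
) -> Tuple[str, float]:
    """Iteratively revise an explanation, keeping the best-scoring version.

    Hypothetical sketch: each callable stands in for a model call or a
    lookup of the neuron's real activations.
    """
    best, best_score = explanation, score(explanation)
    for _ in range(rounds):
        # Ask the explainer for texts that might contradict the explanation.
        texts = propose_counterexamples(best)
        # Record how the neuron actually responds to each proposed text.
        evidence = [(text, get_activations(text)) for text in texts]
        # Ask the explainer to revise in light of those activations.
        candidate = revise(best, evidence)
        candidate_score = score(candidate)
        # Assumption: only accept a revision that improves the score.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score
```

In practice the three model-facing callables would wrap API requests; passing them in as arguments keeps the loop itself easy to test with stubs.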

We are open-sourcing our datasets and visualization tools for GPT-4-written explanations of all 307,200 neurons in GPT-2, as well as code for explanation and scoring using publicly available models on the OpenAI API. We hope the research community will develop new techniques for generating higher-scoring explanations and better tools for exploring GPT-2 using explanations.
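The 307,200 figure is consistent with the MLP neurons of the largest GPT-2 model. As a quick check, assuming the explained model is GPT-2 XL with 48 layers, a hidden size of 1,600, and MLP layers four times that width (none of these figures appear in the text above):

```python
# Assumed GPT-2 XL dimensions; not stated in the article itself.
N_LAYERS = 48
HIDDEN_SIZE = 1600
MLP_EXPANSION = 4  # MLP width as a multiple of the hidden size

neurons_per_layer = HIDDEN_SIZE * MLP_EXPANSION
total_neurons = N_LAYERS * neurons_per_layer
print(neurons_per_layer, total_neurons)
```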

We found over 1,000 neurons with explanations that scored at least 0.8, meaning that according to GPT-4 they account for most of the neuron’s top-activating behavior. Most of these well-explained neurons are not very interesting. However, we also found many interesting neurons that GPT-4 didn’t understand. We hope as explanations improve we may be able to rapidly uncover interesting qualitative understanding of model computations.
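The excerpt does not define how an explanation is scored. As a rough sketch, assume the score is the correlation between activations simulated from the explanation and the neuron's real activations; the snippet below computes that and filters neurons at the 0.8 threshold mentioned above. The function names and the dictionary layout are hypothetical and do not describe the released tooling.

```python
import math
from typing import Dict, Sequence


def correlation_score(simulated: Sequence[float], actual: Sequence[float]) -> float:
    """Pearson correlation between simulated and real activations.

    Assumption: this stands in for the explanation score; the article
    does not spell out the exact metric.
    """
    n = len(simulated)
    if n != len(actual) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_s = sum(simulated) / n
    mean_a = sum(actual) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in zip(simulated, actual))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_s == 0 or var_a == 0:
        # A constant sequence carries no signal; treat it as unexplained.
        return 0.0
    return cov / math.sqrt(var_s * var_a)


def well_explained(scores: Dict[str, float], threshold: float = 0.8) -> Dict[str, float]:
    """Keep only neurons whose explanation score meets the threshold."""
    return {neuron: s for neuron, s in scores.items() if s >= threshold}
```

For example, `well_explained({"layer0/neuron5": 0.91, "layer3/neuron12": 0.42})` keeps only the first entry; the neuron identifiers are made up for illustration.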



© Big Data News Hubb All rights reserved.
