Mechanistic Unlearning: A New AI Method that Uses Mechanistic Interpretability to Localize and Edit Specific Model Components Associated with Factual Recall Mechanisms

by Techaiapp
4 minutes read

Mechanistic Unlearning: A New AI Method that Uses Mechanistic Interpretability to Localize and Edit Specific Model Components Associated with Factual Recall Mechanisms

 Large language models (LLMs) sometimes learn the things that we don’t want them to learn and understand
Send this to a friend