LughM to

[email protected]English • 6 months ago

Anthropic just published new research that successfully identified and mapped millions of human-interpretable concepts, called “features”, within the neural networks of Claude.

www.anthropic.com

Mapping the Mind of a Large Language Model

We have identified how millions of concepts are represented inside Claude Sonnet, one of our deployed large language models. This is the first ever detailed look inside a modern, production-grade large language model.

Diplomjodler
5 • 6 months ago
Wow. This is potentially huge.