Toy Models of Superposition (2022) (transformer-circuits.pub)
46 points by ZeljkoS on Aug 22, 2023 | 4 comments


I am curious why the network isn't always in an "interference" state, but sometimes collapses into a "restricted" ONB?

> the neural networks we observe in practice are in some sense noisily simulating larger, highly sparse networks

This seems somewhat related to a point made by Ilya Sutskever here [1]: NNs can be thought of as an approximation to the Kolmogorov compressor. Speculating, one could say any network is a projection of the ideal compressor (which arguably perfectly represents all n features in an n-dimensional ONB) into a lower-dimensional space, hence the interference. But why is there not always such an interference?

[1] https://www.youtube.com/watch?v=AKMuA_TVz3A
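To make the "projection causes interference" intuition concrete, here is a minimal NumPy sketch (my own illustration, not from the article): cram more feature directions than dimensions into a linear map, write a single sparse feature in, and read it back out. All names and sizes here are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_dims = 8, 3  # more features than dimensions available

# Each feature gets a random unit direction in the low-dimensional space.
# With n_features > n_dims they cannot all be orthogonal.
W = rng.normal(size=(n_features, n_dims))
W /= np.linalg.norm(W, axis=1, keepdims=True)

# A sparse input: only feature 0 is active.
x = np.zeros(n_features)
x[0] = 1.0

h = W.T @ x    # compress into n_dims dimensions
x_hat = W @ h  # read the features back out

# x_hat[0] recovers ~1.0, but the other entries are generally nonzero:
# that leakage is the interference between non-orthogonal features.
print(np.round(x_hat, 3))
```

If you shrink `n_features` down to `n_dims` or fewer, the rows of `W` can be made exactly orthonormal and the leakage vanishes, which is the "restricted ONB" case the parent comment contrasts with.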


This is one of the best presentations of a complex topic, or any topic for that matter, that I've seen in a long time!

To the authors if they happen to find themselves here, I say: bravo!


Anthropic is awesome


guys. the mobile layout. people consume hobby content on mobile



