
Interpretability research is basically a projection of the original function implemented by the neural network onto a sub-space of "explanatory" functions that people consider more understandable. You're right that the language they use to sell the research is completely nonsensical, because that abstract process has nothing to do with anything causal.
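
As a rough illustration of "projecting onto a sub-space of explanatory functions" (not something from the comment itself), here is a minimal sketch: a stand-in network is approximated by an affine surrogate via least squares, and we measure how much of its behaviour the simpler function class captures. The network, inputs, and the choice of a linear explanatory class are all hypothetical.

  import numpy as np

  rng = np.random.default_rng(0)

  # Stand-in for a trained network: a fixed random 2-layer MLP (hypothetical).
  W1 = rng.standard_normal((8, 16))
  W2 = rng.standard_normal((16, 1))

  def network(x):
      return np.tanh(x @ W1) @ W2

  # "Explanatory" sub-space: affine functions of the input.
  X = rng.standard_normal((1000, 8))
  y = network(X)

  # Least-squares projection of the network's behaviour onto that sub-space.
  A = np.hstack([X, np.ones((X.shape[0], 1))])
  coef, *_ = np.linalg.lstsq(A, y, rcond=None)

  # Fraction of the network's behaviour the linear "explanation" captures.
  r2 = 1 - np.sum((A @ coef - y) ** 2) / np.sum((y - y.mean()) ** 2)
  print(f"R^2 of linear surrogate: {r2:.3f}")

Whatever R^2 comes out, the surrogate is just the closest member of a pre-chosen function class; nothing in the procedure establishes a causal account of the original network.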


All code is causal.


Which makes it entirely irrelevant as a descriptive term.



