
The model is capable of generating many different responses to the same prompt. An ensemble of fact checking models can be used to reject paths that contain "facts" that are not present in the reference data (i.e. a fixed knowledge graph plus the context).

My guess is that fact checking is actually easier than generation, and the checking models can be smaller since they don't need to store the facts themselves.
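A minimal sketch of that rejection step, under heavy assumptions: facts are already extracted from each candidate response as (subject, relation, object) triples, and each "checker" is a plain set lookup standing in for a trained model. The names `Fact`, `make_lookup_checker`, and `filter_candidates` are hypothetical, as is the sample data.

```python
from typing import Callable, List, Sequence, Set, Tuple

# Hypothetical representation: a fact is a (subject, relation, object) triple.
Fact = Tuple[str, str, str]
# A candidate is the response text plus the facts it claims (extraction assumed done).
Candidate = Tuple[str, List[Fact]]
Checker = Callable[[Fact], bool]


def make_lookup_checker(known: Set[Fact]) -> Checker:
    """Stand-in for a fact-checking model: accepts only facts present in `known`."""
    return lambda fact: fact in known


def filter_candidates(
    candidates: Sequence[Candidate], checkers: Sequence[Checker]
) -> List[str]:
    """Keep responses whose every claimed fact is found by at least one checker."""
    kept = []
    for text, facts in candidates:
        if all(any(check(fact) for check in checkers) for fact in facts):
            kept.append(text)
    return kept


# Reference data: a fixed knowledge graph plus the current context (toy values).
knowledge_graph = {("Paris", "capital_of", "France")}
context = {("Alice", "works_at", "Acme")}
checkers = [make_lookup_checker(knowledge_graph), make_lookup_checker(context)]

candidates = [
    ("Paris is the capital of France.", [("Paris", "capital_of", "France")]),
    (
        "Alice works at Acme in Lyon.",
        [("Alice", "works_at", "Acme"), ("Acme", "located_in", "Lyon")],
    ),
]

accepted = filter_candidates(candidates, checkers)
print(accepted)
```

The second candidate is dropped because one of its claims appears in neither source. Nothing here decides whether that claim is true; it only checks presence in the reference data.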



Exactly. Given a source of truth, it can't be that hard to train a separate analytic model to evaluate answers from the existing synthetic model. (Neglecting for the moment the whole Gödel thing.)

The problem isn't going to be developing the model, it's going to be how to arrive at an uncontroversial source of ground truth for it to draw from.

Meanwhile, people are complaining that the talking dog they got for Christmas is no good because the C++ code it wrote for them has bugs. Give it time.


That’s quite the system that can take in any natural language statement and confirm whether it’s true or false.

You might be underestimating the scope of some task here.


Not true or false; just present or absent in the reference data. Note that false negatives (a supported fact flagged as absent) only discard a valid response and never produce erroneous output, so the model can safely err on the side of caution.

Also, 100% accuracy is probably not the real threshold for being useful. There is plenty of low-hanging fruit today that could be handled by absolutely tiny error-correcting models (e.g. arithmetic and rhyming).
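To illustrate how small such a checker could be for the arithmetic case, here is a toy sketch that uses a regular expression rather than a learned model. It only recognizes integer statements of the form `a op b = c` with `+`, `-`, or `*`; the function name `find_arithmetic_errors` is made up for this example.

```python
import re
from typing import List

# Matches simple integer claims such as "2 + 2 = 5" (only +, -, * are handled).
_CLAIM = re.compile(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)")


def find_arithmetic_errors(text: str) -> List[str]:
    """Return a correction message for each wrong arithmetic claim found in text."""
    errors = []
    for match in _CLAIM.finditer(text):
        a, op, b, claimed = (
            int(match.group(1)),
            match.group(2),
            int(match.group(3)),
            int(match.group(4)),
        )
        actual = {"+": a + b, "-": a - b, "*": a * b}[op]
        if actual != claimed:
            errors.append(f"{a} {op} {b} = {actual}, not {claimed}")
    return errors


print(find_arithmetic_errors("We know 2 + 2 = 5 and 3 * 4 = 12."))
```

A response that trips the checker could be rejected and resampled. Claims the pattern does not recognize pass through unchecked, so this only catches a narrow class of errors.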


There's research showing you can tell if something is a hallucination or memorized fact based on the activation patterns inside the LM.



