The other thing about chat was when we had these instruct mo...

The other thing about chat was when we had these instruct models. The task of “complete this text, but in a nice or helpful way” is a pretty poorly defined task. That task is confusing both for the model and for the human who’s supposed to do the data labeling.

Whereas for chat, people had an intuitive sense of what a helpful robot should be like. So it was just much easier for people to get an idea of what the model was supposed to do. As a result, the model had a much more coherent personality and it was much easier to get pretty sensible behavior robustly.

Another reason we have a chat interface for LLMs and other large machine-learned models: chat was easier for humans to evaluate as a form of output and interaction.

www.joshbeckman.org/notes/720259126