Distillation is a technique by which larger, more capable models are used to train smaller models to similar performance within certain domains. Replit has a great writeup on how they created the first release of their codegen model in just a few weeks this way!
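As a rough illustration of the idea (not Replit's actual training setup), a common form of distillation blends the usual cross-entropy on hard labels with a KL term that pulls the student's softened output distribution toward the teacher's. The function name, `temperature`, and `alpha` below are illustrative choices, sketched here in PyTorch:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Mix hard-label cross-entropy with a soft-target KL term."""
    # Soften both distributions with a shared temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # KL divergence from student to teacher; the T^2 factor keeps
    # gradient magnitudes comparable to the hard-label term.
    kl = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2

    # Standard cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    return alpha * kl + (1 - alpha) * ce
```

The appeal is that the teacher's full probability distribution carries more signal per example than the hard label alone, which is part of why a smaller student can reach similar in-domain performance quickly.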
