Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America
Arvid Lundmark
like really really caring as much as you can because like the the naive version of caring as much as you can about inference time compute is what people have already done with like the llama models or just over training the shit out of 7b models um on way way way more tokens than essential optimal right but if you really care about it maybe the thing to do is what gamma did which is let's just not let's not just train on tokens let's literally train on uh
0
💬
0
Comments
Log in to comment.
There are no comments yet.