
Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

8369.105 - 8386.136 Arvid Lundmark

minimizing the KL divergence with the distribution of Gemma 27B, right? So knowledge distillation there. And you're spending the compute of literally training this 27-billion-parameter model on all these tokens just to get out this, I don't know, smaller model.
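
For readers who want the idea spelled out, here is a minimal sketch of the kind of distillation objective described above: the smaller (student) model is trained so that its next-token distribution matches that of the larger (teacher) model, measured by KL divergence. The function names, the toy logits, and the choice of KL(teacher || student) as the direction are assumptions made for illustration; the speaker does not specify these details.

```python
import math

def softmax(logits):
    """Turn a list of raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distillation_loss(teacher_logits, student_logits):
    """Mean per-token KL(teacher || student).

    Both arguments are lists of per-token logit vectors. The direction of
    the KL term is an assumption, not something stated in the transcript.
    """
    losses = [
        kl_divergence(softmax(t), softmax(s))
        for t, s in zip(teacher_logits, student_logits)
    ]
    return sum(losses) / len(losses)

# Hypothetical toy data: two token positions, vocabulary of three tokens.
teacher = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.0]]
student = [[1.5, 1.2, 0.3], [0.2, 2.0, 0.4]]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution exactly and positive otherwise. In practice the teacher's logits come from a forward pass of the large model over the training tokens, which is the compute cost the speaker is pointing at.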
