Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI
Arvid Lundmark
And even if it has maybe 10X worse scaling properties during training, meaning you have to spend 10X more compute to train the model to get the same level of capability, it's worth it, because you care most about that inference budget for really long context windows. So it'll be interesting to see how people play with all these dimensions.
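A minimal back-of-envelope sketch in Python of the tradeoff being described: if per-query inference cost at long context dominates lifetime compute, a one-time 10X training penalty can still come out ahead. All numbers here are hypothetical illustrations, not figures from the episode.

```python
# Hypothetical compute-budget comparison (arbitrary units, not real data):
# an architecture that costs ~10X more to train but is much cheaper per
# long-context query versus a baseline that is cheap to train but expensive
# to serve at long context.

def total_cost(train_cost, infer_cost_per_query, num_queries):
    """Lifetime compute = one-time training cost + inference over all queries."""
    return train_cost + infer_cost_per_query * num_queries

# Baseline: cheap training, expensive long-context inference.
baseline_train, baseline_infer = 1.0, 1.0
# Alternative: 10X training compute, hypothetically 10X cheaper inference.
alt_train, alt_infer = 10.0, 0.1

for queries in (1, 100, 10_000, 1_000_000):
    b = total_cost(baseline_train, baseline_infer, queries)
    a = total_cost(alt_train, alt_infer, queries)
    print(f"{queries:>9} queries: baseline={b:12.1f}  alternative={a:12.1f}")

# Past a modest number of served queries, the one-time training penalty is
# dwarfed by the savings accumulated on every long-context query.
```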