Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

1481.355 - 1498.349 Dylan Patel

In some of DeepSeek's earlier papers, they talk about their training data being distilled for math (I shouldn't use this word yet), but taken from Common Crawl. And that's publicly accessible; anyone listening to this could go download data from the Common Crawl website. This is a crawler that is maintained publicly.

