Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
In some of DeepSeek's earlier papers, they talk about their training data being distilled for math (I shouldn't use this word yet), but taken from Common Crawl. And that's publicly accessible: anyone listening to this could go download data from the Common Crawl website. This is a crawler that is maintained publicly.