
Dylan Patel

👤 Person
1122 total appearances

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

I think we'll make sure we want to go down the license rabbit hole before we do specifics.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Yeah, especially in the DeepSeek-V3 paper, which is their pre-training paper. They were very clear that they are doing interventions on the technical stack that go at many different levels. For example, to get highly efficient training, they're making modifications at or below the CUDA layer for NVIDIA chips.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

I have never worked there myself, and there are a few people in the world that do that very well, and some of them are at DeepSeek. And these types of people are... at DeepSeek and leading American frontier labs, but there are not many places.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Yeah, so these weights that you can download from Hugging Face or other platforms are very big matrices of numbers. You can download them to a computer in your own house that has no internet and you can run this model and you're totally in control of your data.
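
To make the "very big matrices of numbers" point concrete, here is a minimal sketch in plain Python. It is an illustration only: the file name `toy_weights.json`, the two tiny matrices, and the helper functions are all hypothetical, and real checkpoints use dedicated binary formats and hold billions of values. The point it shows is that once the numbers are on local disk, running the model is just arithmetic and needs no network connection.

```python
import json
import os
import tempfile


def save_weights(path, weights):
    """Write a dict of named matrices (lists of lists of floats) to disk."""
    with open(path, "w") as f:
        json.dump(weights, f)


def load_weights(path):
    """Read the matrices back from local disk; no network access is involved."""
    with open(path) as f:
        return json.load(f)


def matvec(matrix, vector):
    """Multiply a matrix (rows x cols) by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def forward(weights, x):
    """Toy two-layer network: linear -> ReLU -> linear."""
    hidden = [max(0.0, h) for h in matvec(weights["w1"], x)]
    return matvec(weights["w2"], hidden)


# Hypothetical toy weights standing in for a downloaded checkpoint.
toy_weights = {
    "w1": [[1.0, -1.0], [0.5, 0.5]],
    "w2": [[2.0, 1.0]],
}

# Simulate "download once, then run offline": save, reload, compute locally.
path = os.path.join(tempfile.mkdtemp(), "toy_weights.json")
save_weights(path, toy_weights)
output = forward(load_weights(path), [3.0, 1.0])
print(output)
```

Everything in this sketch happens on the local machine, which is the sense in which the input data never has to leave your own computer.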

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

That is something that is different than how a lot of language model usage is actually done today, which is mostly through APIs, where you send your prompt to GPUs run by certain companies. And these companies will have different distributions and policies on how your data is stored, if it is used to train future models, where it is stored, if it is encrypted, and so on.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

So with open weights, you have the fate of your data in your own hands. And that is something that is deeply connected to the soul of open source computing.

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Yes. So for one, I am very understanding of many people being confused by these two model names. So I would say the best way to think about this is that when training a language model, you have what is called pre-training, which is when you're predicting large amounts of mostly internet text. You're trying to predict the next token.
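
As a rough illustration of "trying to predict the next token", the sketch below computes the standard next-token cross-entropy loss in plain Python. It is a toy under stated assumptions, not the training code of any lab: the three-word vocabulary, the hand-written logits, and the function names are all hypothetical. In a real model the logits would come from a neural network and the loss would be averaged over a very large text corpus.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_loss(logits_per_position, token_ids):
    """Average cross-entropy of predicting token t+1 from position t."""
    losses = []
    for t in range(len(token_ids) - 1):
        probs = softmax(logits_per_position[t])
        target = token_ids[t + 1]  # the token that actually came next
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)


# Hypothetical vocabulary and sequence: "the cat sat" -> ids [0, 1, 2].
vocab = ["the", "cat", "sat"]
token_ids = [0, 1, 2]

# A model with no preference among tokens versus one that favors the true next token.
uniform_logits = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
confident_logits = [[0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]

uniform_loss = next_token_loss(uniform_logits, token_ids)
confident_loss = next_token_loss(confident_logits, token_ids)
print(uniform_loss, confident_loss)
```

The uniform model's loss equals the natural log of the vocabulary size, and the loss falls as the model puts more probability on the token that actually follows. Pre-training pushes this number down across the corpus.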
