Dylan Patel

👤 Person
1122 total appearances

Podcast Appearances

Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Yeah, so DeepSeek-V3 is a new mixture-of-experts transformer language model from DeepSeek, which is based in China. They have some new specifics in the model that we'll get into. Largely, this is an open-weight model, and it's an instruction model like what you would use in ChatGPT. They also released what is called the base model, which is before these techniques of post-training.

Most people use instruction models today, and those are what's served in all sorts of applications. This was released on, I believe, December 26th, or that week. And then weeks later, on January 20th, DeepSeek released DeepSeek-R1, which is a reasoning model, which really accelerated a lot of this discussion.

This reasoning model has a lot of overlapping training steps with DeepSeek-V3, and it's confusing that you have a base model called V3 that you do something to in order to get a chat model, and then you do some different things to get a reasoning model. I think a lot of the AI industry is going through this challenge of communications right now, where OpenAI makes fun of their own naming schemes.

They have GPT-4o. They have OpenAI o1. And there are a lot of types of models. So we're going to break down what each of them is. There are a lot of technical specifics on training, and we'll go from high level to specific and kind of go through each of them.

Yeah, so this discussion has been going on for a long time in AI. It became more important, or more focal, since ChatGPT at the end of 2022. Open weights is the accepted term for when model weights of a language model are available on the internet for people to download. Those weights can have different licenses, which are effectively the terms by which you can use the model.

There are licenses that come from history in open source software. There are licenses that are designed by companies specifically. All of Llama, DeepSeek, Qwen, Mistral, these popular names in open-weight models, have some of their own licenses. It's complicated because not all the models have the same terms. The big debate is on what makes a model open weight. Why are we saying this term?

It's kind of a mouthful. It sounds close to open source, but it's not the same. There's still a lot of debate on the definition and soul of open source AI. Open source software has a rich history on freedom to modify, freedom to take on your own, freedom from any restrictions on how you would use the software, and what that means for AI is still being defined.
