Dylan Patel
It's kind of a mouthful. It sounds close to open source, but it's not the same. There's still a lot of debate on the definition and soul of open source AI. Open source software has a rich history of freedom to modify, freedom to take the software and make it your own, freedom from restrictions on how you use it, and what that means for AI is still being defined.
So for what I do, I work at the Allen Institute for AI. We're a nonprofit. We want to make AI open for everybody. And we try to lead on what we think is truly open source. There's not full agreement in the community. But for us, that means releasing the training data, releasing the training code, and then also having open weights like this. And we'll get into the details of the models.
Again and again, as we try to get deeper into how the models were trained, we'll say things like this: data processing, data filtering, data quality is the number one determinant of model quality. And then a lot of the training code determines how long it takes to train and how fast your experimentation is.
So without fully open source models where you have access to this data, it is... hard to know, or it's harder to replicate. So we'll get into cost numbers for DeepSeek V3, mostly in GPU hours and how much you could pay to rent those yourself. But without the data, the replication cost is going to be far, far higher. And the same goes for the code.
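The GPU-hour point above can be turned into back-of-envelope arithmetic. This is a sketch, assuming the roughly 2.788 million H800 GPU hours reported in the DeepSeek-V3 technical report for the final training run; the $2-per-GPU-hour rental rate is an illustrative assumption, not a quoted market price, and it covers compute only, not data work or failed experiments.

```python
# Back-of-envelope replication cost in rented GPU hours.
# Figures: ~2.788M H800 GPU hours (reported by DeepSeek for V3's
# final run); $2/GPU-hour is an assumed rental rate for illustration.
gpu_hours = 2_788_000        # reported final-run compute, H800 GPU hours
rental_rate_usd = 2.00       # assumed cloud rental price per GPU hour

compute_cost = gpu_hours * rental_rate_usd
print(f"Estimated rental cost: ${compute_cost:,.0f}")
# Roughly $5.6M for the final run alone -- without the released data
# and code, total replication cost would be far higher.
```

The point of the sketch is that the headline figure is a floor: it prices one successful training run, while recreating the data pipeline from scratch multiplies the bill.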
Yeah. DeepSeek is doing fantastic work for disseminating understanding of AI. Their papers are extremely detailed in what they do. And for other companies and teams around the world, they're very actionable in terms of improving your own training techniques. And we'll talk about licenses more. The DeepSeek R1 model has a very permissive license. It's called the MIT license.
That effectively means there are no downstream restrictions on commercial use. There are no use case restrictions. You can use the outputs from the models to create synthetic data. And this is all fantastic. I think the closest peer is something like Llama, where you have the weights and you have a technical report. And the technical report is very good for Llama.
One of the most-read PDFs of last year is the Llama 3 paper. But in some ways, it's slightly less actionable. It has fewer details on the training specifics, fewer plots, and so on. And the Llama 3 license is more restrictive than MIT. And then between the DeepSeek custom license and the Llama license, we could get into this whole rabbit hole.
I think we should make sure we want to go down the license rabbit hole before we get into specifics.