Steeve Morin

Appearances

The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch

20VC: Why Google Will Win the AI Arms Race & OpenAI Will Not | NVIDIA vs AMD: Who Wins and Why | The Future of Inference vs Training | The Economics of Compute & Why To Win You Must Have Product, Data & Compute with Steeve Morin @ ZML

[00:00]

The thing with NVIDIA is that they spend a lot of energy making you care about stuff you shouldn't care about. And they were very successful. Like, who gives a shit about CUDA? OpenAI is amazing, but it's not their compute. Ultimately, if you don't own your compute, you're starting with something at your ankle. In five years, I would say 95% inference, 5% training.

[16:50]

You get ripped off. Here's the dirty secret: TSMC sells to NVIDIA at roughly a 60% margin, and NVIDIA sells to you at a 90% margin. And on top of that, there's Amazon, which takes, let's say, a 30% margin. So you are a very thin crust on a very big cake. It's a bit of a losing game if you go all in on one provider; you want optionality.
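The margin stacking he describes can be sanity-checked with quick arithmetic. A minimal sketch, under the simplifying assumption that each layer's only cost is the price charged by the layer below it:

```python
# Simplified stacked-margin model (assumption: each layer's only cost
# is the price charged by the layer below it).
def cost_share(margins):
    """Fraction of the end customer's dollar remaining below each layer."""
    share = 1.0
    out = []
    for name, margin in margins:
        share *= 1.0 - margin      # margin m  =>  cost = (1 - m) * price
        out.append((name, share))
    return out

layers = [("cloud, 30% margin", 0.30),
          ("NVIDIA, 90% margin", 0.90),
          ("TSMC, 60% margin", 0.60)]

for name, share in cost_share(layers):
    print(f"below {name}: {share:.3f} of each customer dollar")
```

Under those assumptions, under 3 cents of each customer dollar reaches the silicon itself: the thin crust on a very big cake.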

[17:21]

Absolutely, yes. Here's the problem, though. Let's say you are on Google Cloud and you're on TPUs. Suddenly, you just removed that 90% chunk from the spend. The problem, for multiple software reasons, which we are solving at ZML, is that they're not really, I would say, a commercial success. They are very much successful inside of Google, but not so much outside of Google.

[17:45]

Amazon, same, is pushing very, very hard for their Trainium chips. So the future I see is that you use whatever your provider has, because you don't want to pay an outrageous 90% margin and then try to make a profit on top of that. Okay.

[18:16]

So these two obey fundamentally different, I would say, tectonic forces. So in training, more is better. You want more of everything, essentially. And the recipe for success is the speed of iteration. You change stuff, you see how it works, and you do it again. Hopefully it converges. And it's like changing the wheel of a moving car, so to speak. So that is training.

[18:40]

On inference, it's the complete reverse. Less is better. You want fewer headaches. You don't want to be waking up at night, because inference is production. You could say that training is research and inference is production. And it's fundamentally different. In terms of infra, probably the number one difference between the two is the need for interconnect.

[19:01]

So if you do production, if you can avoid having interconnect between, let's say, a cluster of GPUs, of course you will avoid it if you can. And this is why models have the sizes they have: so that people can run them without needing to connect multiple machines together. It's very constraining in terms of the environment.
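The sizing constraint he alludes to can be made concrete with back-of-envelope numbers. A sketch under assumed figures (fp16 weights at 2 bytes per parameter, 8 GPUs per machine, 80 GB of HBM each):

```python
# Back-of-envelope check (assumed figures): can a model's weights be
# served on one machine, i.e. without cross-machine interconnect?
GPUS_PER_NODE = 8
HBM_PER_GPU = 80e9   # bytes, e.g. an 80 GB part

def fits_on_one_node(params_billion, bytes_per_param=2):
    weights = params_billion * 1e9 * bytes_per_param
    return weights <= GPUS_PER_NODE * HBM_PER_GPU

for size in (8, 70, 405):
    print(f"{size}B fits on one node: {fits_on_one_node(size)}")
```

On those assumptions a 70B model fits comfortably in one box while a 405B model does not, which is one way to read "models have the sizes they have".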

[19:26]

So that is probably the fundamental difference, the need for interconnect. And number two is, ultimately, do you really care about what your model is running on as long as it's outputting whatever you want it to output?

[19:49]

Think of it like doing one painting versus doing a million paintings. The tools you use and the process you follow are different. If you do one painting, what you favor is the speed at which you can do a stroke and iterate. If you do a million, what you want is a reliable process that can deliver a million paintings efficiently. So it's the same for training versus inference.

[20:14]

If you run millions of instances of a model, you cannot hack your way to that. By the way, people do hack their way today, but this is probably the fundamental difference.

[20:37]

There's a lot of duct tape. Here's also probably one of the problems: training, from first principles, is actually two passes, forward and backward, right? It's called the forward pass and the backward pass. Inference runs only the forward pass. So that's how things are today. There are people trying to specialize a bit, because at some point duct tape doesn't really work out.
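The forward/backward distinction can be sketched in a few lines. This is a deliberately toy one-weight model, not any real framework:

```python
# Toy illustration of the two-pass vs one-pass distinction, using a
# one-weight linear "model" y = w * x (no framework, gradient by hand).
def forward(w, x):
    return w * x

def training_step(w, x, target, lr=0.1):
    y = forward(w, x)              # forward pass
    loss = (y - target) ** 2
    grad = 2 * (y - target) * x    # backward pass: d(loss)/dw
    return w - lr * grad, loss     # gradient update

def inference(w, x):
    return forward(w, x)           # forward pass only, weights frozen

w = 0.0
for _ in range(50):
    w, loss = training_step(w, x=1.0, target=3.0)
print(round(inference(w, 1.0), 3))   # converges toward the target 3.0
```

Training touches every weight twice per step (read for forward, update after backward); inference only streams the weights forward, which is part of why the two workloads want different hardware.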

[21:02]

And at big scale, that creates a problem. And it's a problem that's growing, because a lot of people are coming onto the market with inference needs. That wasn't the case, you know, a year or a year and a half ago. OpenAI had this problem, right? Maybe Anthropic had this problem. But it wasn't a universal problem yet. And now it's becoming a universal problem.

[21:26]

So, for instance, depending on how you deploy, but if you deploy inference, probably the number one thing that will get you is what's called autoscaling. As your systems get more and more loaded, you want to provision capacity as you scale, because these things are tremendously expensive. So you don't want to say: I have 1,000 GPUs, 24 hours a day.

[21:52]

Even if there's nobody in production, I will pay for them, which is, mind you, what people are doing today. This is crazy. So what you want to do is provision compute as your needs grow, and you want to scale it up, and you want to scale it down. It's probably the number one thing that gives you a lot of efficiency in terms of spend.
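A hypothetical sketch of the reactive autoscaling he describes; the capacity figure and function names are illustrative, not any real provider's API:

```python
import math

# Hypothetical reactive autoscaler: size the GPU fleet to the current
# request load instead of paying for a fixed fleet 24 hours a day.
def desired_replicas(load_rps, capacity_per_gpu_rps,
                     min_replicas=1, max_replicas=1000):
    need = math.ceil(load_rps / capacity_per_gpu_rps)
    return max(min_replicas, min(max_replicas, need))

# Load over a day (requests/sec): quiet night, busy peak, back down.
for load in (5, 80, 400, 1200, 300, 20):
    print(load, "rps ->", desired_replicas(load, capacity_per_gpu_rps=10), "GPUs")
```

The spend saving comes entirely from the gap between peak and trough: a fixed fleet must be sized for the 1,200 rps peak even while serving 5 rps at night.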

[22:14]

We're talking multiples, like 5x, sometimes 10x improvement. The thing is, in regular back-end engineering, I would say, this is a problem everybody knows. Everybody is doing it because the savings are so huge. But in AI, nobody really had the problem. So now they're coming up against it. So this is one example.

[22:47]

That's one example, yeah. Another one is choosing the right compute. It's kind of a vicious circle, I would say, because provisioning compute is very hard, so losing compute is very bad.

[23:00]

You are essentially incentivized to overbuy; in the case of Amazon or Google, that would be buying reserved compute, which you're not going to use, because if you buy it on demand, you will get tremendously ripped off. So that creates this fake scarcity of compute that people buy preemptively, because they raised a shit-ton of money, and then they're not using it. So this is a major problem too.

[23:30]

It might well be, yes. We are being spared a bit because Blackwell is late and other orders are getting canceled. And so the H series, I would say, is still active. But yes, absolutely. But you know, what choice do you have? This is the thing.

[24:05]

I might tell you that I think it has already started. I'm getting cold emails offering discounts from services I'd never heard of. And I started getting these emails probably around October, November. Some people are left with a lot of capex that they don't know what to do with.

[24:23]

It's a different thing to build a cluster and do a training run than it is to build literally a cloud provider, or hyperscaler, or whatever you want to call it. There are a lot of people who do their training runs on those providers but then move to regular hyperscalers when they go to production. So I'm very much worried there will be an oversupply of these chips.

[24:48]

The problem is that, remember, the chips are the collateral. So somewhere in the US or wherever, there's going to be a data center with, like, a thousand GPUs that people may buy at 30 cents on the dollar. This is what might happen.

[25:32]

Technically speaking, he is right. But realistically speaking, I'm not sure I agree. The thing is, these chips are on the market. They're here; I can open a tab in Chrome and get one. That is something that I don't take lightly. Availability, that is, right?

[25:49]

I think NVIDIA is here to stay, if only through the H100 bubble bust, because these chips are going to be on the market and people will buy them and do inference with them. It remains to be seen what the OPEX and the electricity look like, but... The thing is, the only chips that are really frontier in that sense are probably TPUs and then the upcoming chips.

[26:15]

But the thing is, they're great chips, but they're not on the market, or only at outrageous prices, like millions of dollars to run a model. So which chips are great, and why aren't they on the market? Take, for instance, Cerebras: incredible technology, incredibly expensive. So how will the market value the premium of having single-stream, very high tokens per second?

[26:37]

There is value in that, right? As we saw with Mistral and Perplexity. But I think it was done at a loss; I don't have the details, but I think what Cerebras put out was done at a loss. So today there are three actors on the market that can deliver this. And I think this will be, I would say, the pushing force for change in the inference landscape: agents and reasoning.

[27:14]

So there's this trick. Because here's the thing, there's no magic. This little trick is called SRAM. SRAM is memory directly on the chip, so it is very, very fast memory. But here's the problem with SRAM: it consumes surface area on the chip, which makes for a bigger chip, which is very hard in terms of yield, right? Because the chances of defects are higher, and so on.

[27:40]

So SRAM is, I would say, very, very, very fast memory, which gives you a big advantage when you do very high-throughput inference, but it's terribly expensive. And if you look at, for instance, Groq, on this generation they have 230 megabytes of SRAM per chip. A 70B model is 140 gigabytes. So you do the math, right?
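Doing the math he invites, with the figures quoted here and the Cerebras figure he gives next (70B parameters at 2 bytes each in fp16):

```python
import math

# "You do the math": fitting 70B fp16 weights entirely in SRAM,
# using the per-chip figures quoted in the conversation.
model_bytes = 70e9 * 2        # 140 GB of weights
groq_sram = 230e6             # ~230 MB SRAM per Groq chip
cerebras_sram = 44e9          # ~44 GB SRAM per Cerebras wafer-scale engine

chips_on_groq = math.ceil(model_bytes / groq_sram)
wafers_on_cerebras = math.ceil(model_bytes / cerebras_sram)
print(chips_on_groq, wafers_on_cerebras)   # -> 609 4
```

Hundreds of chips, or several wafer-scale engines, just to hold one model's weights: that is the cost structure behind "terribly expensive".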

[28:05]

Cerebras has 44 gigabytes of SRAM in what they call their wafer-scale engine, which is a chip the size of a wafer. I mean, most likely it's interconnected internally, but it's huge, right? And it has to be water-cooled. They have copper needles, I would say, that touch the chip. It's crazy stuff. Very, very impressive technology, mind you, but very, very expensive. So my bet is...

[28:30]

I think there will be chips on the market that do that at a much lower price. And there are two companies I see going in that direction: one is called Etched, and the other one is called VSORA. Those are the two I see. Because if you can deliver this at, I would say, a price comparable to GPUs, you've won.

[28:55]

It's hard to say. I mean, you need some SRAM, but if you can use a smaller process node and hook yourself up to external memory, then yes, you can do a lot better. But the thing is, if you go full-blown SRAM, then there's no magic. You will have to pay the price.

[29:25]

pushed by reasoning. So reasoning, not in the sense that you see in DeepSeek and the like, but what's called latent-space reasoning. Latent-space reasoning and agents will push the market towards different types of compute.

[29:43]

So the way models reason today is that they reason in tokens. It's as if, to think to yourself, you had to say out loud what you're thinking. So yes, it works, but it's a bit inefficient, right? And you lose information doing it. Latent-space reasoning is the same thing without going out, I would say, to English or whatever the language is, right?
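A toy, purely illustrative contrast between the two reasoning styles he describes; the "model" here is a stand-in arithmetic function, not a real transformer, and the decode step is a crude quantizer:

```python
# Toy contrast (illustrative only): "token-space" reasoning collapses the
# hidden state to a discrete token every step and re-embeds it (lossy),
# while "latent-space" reasoning keeps iterating on the continuous state.
def step(h):
    return [x * 0.9 + 0.1 for x in h]   # stand-in for one model pass

def decode(h):
    return round(sum(h) / len(h), 1)    # quantize = information loss

def embed(tok, dim=4):
    return [tok] * dim

def token_space_reasoning(h, steps):
    for _ in range(steps):
        h = embed(decode(step(h)))      # say each thought "out loud"
    return h

def latent_space_reasoning(h, steps):
    for _ in range(steps):
        h = step(h)                     # stay in the continuous state
    return h
```

The latent loop preserves the full state between steps; the token loop throws away precision (and the per-dimension structure) at every step, which is the inefficiency he is pointing at.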

[30:06]

So staying in what's called the latent space, which is where all the information of an LLM lives, right? This is very much how we work as humans. And we move toward what Yann LeCun calls an energy-based model, in which we have different, longer or shorter, I would say, thinking times, if you will, right?

[30:29]

So that, fundamentally, GPUs cannot deliver, plain and simple, at scale. Why can't GPUs deliver it? Because the access to external memory prevents it. So HBM is all the rage, right? But HBM compared to SRAM is absolutely dead slow. So this is the problem you get. HBM is the best we can do, but it's still slow versus SRAM.
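A rough roofline-style estimate of why memory speed caps single-stream generation; the bandwidth figures are assumed ballpark numbers, not vendor specs:

```python
# Roofline-style ceiling (assumed, ballpark figures): generating one token
# requires streaming roughly all the weights past the compute units, so
# tokens/sec is bounded by memory bandwidth / model size.
WEIGHT_BYTES = 70e9 * 2                  # 70B params, fp16

def max_tokens_per_sec(bandwidth_bytes_per_sec):
    return bandwidth_bytes_per_sec / WEIGHT_BYTES

print(round(max_tokens_per_sec(3.35e12), 1))  # ~3.35 TB/s HBM-class part
print(round(max_tokens_per_sec(80e12), 1))    # tens of TB/s SRAM-class aggregate
```

On those assumed numbers a single HBM-class chip tops out around a couple dozen tokens per second per stream, while SRAM-class bandwidth is an order of magnitude or two higher: the gap the whole SRAM discussion turns on.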

[31:11]

No, you want HBM, to be clear. SRAM alone will not deliver; it's a dead end in terms of scaling. Scaling SRAM means scaling the die surface, which means compounding yield problems. It explodes everywhere, right? So you need some SRAM, and we'll have bigger amounts of SRAM in chips, and of course bigger amounts of what's called external memory on chips. The issue with HBM is that it's still slow.

[00:19]

You have the product, the data, and the compute. Who has all three? Google. They have Android, Google Docs, they have everything. They can sprinkle it everywhere. This is the sleeping giant in my mind.

[31:40]

And yes, maybe NVIDIA has a stranglehold and they can prevent you from getting some. So that would be, I call it, the Nutella situation: Nutella, they own, say, 80% of the hazelnut market, right? So yes, you can build a competitor, but who will you buy the nuts from? So there will be a need for HBM. There will be a need for SRAM, right?

[32:00]

I would say better, more dedicated architectures will be able to deliver these things. And then there's the next frontier after that, which is called compute-in-memory. There are two companies in that market. One is called Rain (rain.ai); Sam Altman is one of the investors, no surprise there. The other one is called Fractile. So this is the next frontier.

[32:21]

And the idea is that instead of transferring the data between external memory and the processor and doing the compute there, you actually bring the compute to the memory and do everything in place. It's crazy stuff, but it's coming. Maybe not this year, but... How does that change the situation? It makes it much more efficient, but what does that actually mean in reality?

[32:44]

It means you get maybe not SRAM-level performance, but a lot faster performance in terms of compute. And if you translate that to LLMs, let's say, you get much, much higher tokens per second on a single stream, which is exactly what you want when you go into reasoning. You want your model to think for, let's say, half a second, and then boom.

[33:07]

You don't want to wait 50 seconds and context-switch to something else, which is the problem everybody has today, mind you. So yeah, I think the inference compute landscape will be pushed to change because of these two constraints.

[33:42]

Depends on the supply. I think there's a shot that they don't. Because here's the thing: let's imagine we have a new chip from Amazon with, you know, the same amount of compute. Oh, wait, we do. It's called Trainium. Why would I pay NVIDIA's 90% margin if I can freely change to Trainium? My whole production runs on AWS anyway, right?

[34:07]

If you run in the cloud and you're running on NVIDIA, you're getting squeezed out of your money, right? If you're in production on dedicated chips, of course. So maybe through commoditization: hey, I'm on AWS, I can just click and boom, it runs on AWS's chips. Who cares, right? I just run my model like I did two minutes ago.

[34:39]

They are. They have a product, NIM, that sort of does that. The thing with NVIDIA is that they spend a lot of energy making you care about stuff you shouldn't care about. And they were very successful. Like, who gives a shit about CUDA? I'm sorry, but I don't want to care about that, right? I want to do my stuff.

[34:56]

And NVIDIA got me into saying: hey, you should care about this, because there's nothing else on the market. Well, that's not true. But ultimately, this is the GPU I have in my machine, so off I go. If tomorrow that changes, why would I pay a 90% margin on my compute? That's insane. This is why I believe it ultimately goes through the software. That is the entry point to the ecosystem.

[35:21]

If the software abstracts away those idiosyncrasies, as it does on CPUs, then the providers will compete on specs and not on fake moats or circumstantial moats, right? So this is where I think the market is going. And of course, there's the availability problem. If you, you know, piss off Jensen, you might need to kiss the ring to get back in line, right?
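A hypothetical sketch of that software-abstraction argument; the interface and backend names here are invented for illustration, not a real API:

```python
from typing import Protocol

# Hypothetical sketch: if inference code targets a neutral interface,
# the accelerator becomes a deployment detail and vendors compete on
# specs, the way CPU vendors do. All names below are illustrative.
class Accelerator(Protocol):
    name: str
    def run(self, model: str, prompt: str) -> str: ...

class CudaBackend:
    name = "nvidia-cuda"
    def run(self, model, prompt):
        return f"[{self.name}] {model}: output for {prompt!r}"

class TrainiumBackend:
    name = "aws-trainium"
    def run(self, model, prompt):
        return f"[{self.name}] {model}: output for {prompt!r}"

def serve(backend: Accelerator, model: str, prompt: str) -> str:
    # Application code never mentions CUDA; swapping vendors is one line.
    return backend.run(model, prompt)

print(serve(CudaBackend(), "llama-70b", "hello"))
print(serve(TrainiumBackend(), "llama-70b", "hello"))
```

The point of the sketch is the call site: `serve` is identical for both backends, so the chip choice stops being a software lock-in and becomes a price/spec decision.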

[36:17]

So all chip makers, I would say, have a GTM problem. All of them, whether it's Google, whether it's AMD, whether it's Tenstorrent. There are, I would say, probably two fundamental problems. Number one is that maintaining multiple stacks today is very, very hard. So you don't. So let's say I buy AMD. I want to buy AMD, right? That means I'm going to abandon NVIDIA.

[36:44]

Oh, crap, I have a six-year amortization plan on that. Oh, man, what do I do? So do I need to support both stacks? Unclear. Maybe until AMD tells me: hey, you have, let's say, 1,000 NVIDIA GPUs, and you're about to buy 100,000 from AMD. I mean, come on, right? And I'm like, okay, that makes it worth my while, right?

[37:07]

But that is ultimately the fundamental problem: the steps are very high, right? I need a lot of incentive to buy into that ecosystem, so I need to buy a lot of them. So if you're AMD, that is already a problem. But then Microsoft comes along and buys it all, which, by the way, puts OpenAI, at least on the inference side, in the green because of the efficiency gains.


2263.527

It's actually both. Yeah. The buy-in is very high. So to make it worth it, you have to buy a lot. And if you buy a lot... you know what, we talk to all of them. They always have the same questions, and it's completely understandable. They say, this is great, but who's the customer? Because on the other side, let's take Amazon, for instance, with Trainium.


2285.063

Apple just came and said, hey, we're going to buy 100,000 of them. So you want to buy 10,000, you feel like the big shot, right? Yeah, but go back to the queue because there's Apple before you, right? So they have to have very high commitments. You cannot be incrementally better. It's very hard, right? And also very hard, I can give you one metric if you want.


2306.168

I know for a fact that being seven times better, and take whatever metric you want, whether it's spend, whether it's whatever, is not enough to get people to switch. People will choose nothing over something. So this is a very hard market to enter, because you also cannot compete on incremental gains. It's very hard, right? So you have to convince a lot of people.


2329.761

Maybe you can go the Middle East route, in which they sprinkle everything and they evaluate everything. That's not, I would say, a very sustainable strategy in the long term.


2351.212

Absolutely. The right approach to me is making the buy-in zero. If the buy-in is zero, you don't worry about this. You just buy whatever is best today. How do you do that? By renting? Oh, because this is what we do. This is our promise. Our thesis is that if the buy-in is zero, you know, you completely unlock that value.


2374.63

It means that you can freely switch, you know, compute to compute, like freely, right? You just say, hey, now it's AMD, boom, it runs. You just say, oh, it's Tenstorrent, and boom, it runs, right? How do you do that then?


2390.138

Oh, yeah, yeah, yeah. Not agreements, but we work with them to support their chips. But the thing is, at least as a user myself of our tech, is that if it's free for me to switch or to choose whichever provider I want in terms of compute, AMD, Nvidia, whatever, then I can take whatever is best today, and I can take whatever is best tomorrow, and I can run both.


2414.708

I can run three different platforms at the same time. I don't care. I only run what is good at the moment. And that unlocks, to me, a very cool thing, which is incremental improvement. If you are 30% better, I'll switch to you.


2438.879

This is actually a great question. I think that if you are doing it bottom-up, infra to applications, you will lose because nobody will care, as they don't today, right? If you look at TPUs, they're available, they're great. Nobody cares. Why does nobody care about TPUs, sorry? Because the cost of buying, it's always the same, right? You have to spend six months of engineering to switch to TPUs.


2462.72

And mind you, TPUs do training. They are the only ones. We're training them now. But AMD can do training, but in terms of maturity, by far the most mature software and compute is TPUs, and then it's NVIDIA, right? So the buy-in is so high that people are like, no, fuck. We'll see, right? I'm not on Google Cloud. I have to sign up. Oh my God, right? So these are tremendous chips.


2488.356

These are tremendous assets. Now, in terms of the risk, I think if you want to do it, you have to do it top to bottom. You have to start with whatever it is you're going to build and then permeate downwards into the infrastructure. Take, for example, Microsoft with OpenAI. They just bought all of AMD's supply and they run ChatGPT on it. That's it. And that puts them in the green.


2511.871

That's actually what makes them profitable on inference. Or at least, let's say, not lose money.


2523.715

Because I can give you actual numbers. If you run eight H100, you can put two 70B models on them because of the RAM. That's number one. Number two is if you go from one GPU to two, you don't get twice the performance. Maybe you get 10% better performance. Yeah, that's the dirty secret nobody talks about. I'm talking inference, right?


2546.434

So you go from, let's say, 100 to 110 by doubling the amount of GPUs. That is insane. So you'd rather have two by one than one by two, right? So with one machine of H100s, you can run two 70B models if you do four GPUs and four GPUs, right? That's number one. If you run on AMD, well, there's enough memory inside the GPU to run one model per card. So you get eight GPUs, eight times the throughput.


2575.051

While on the other hand, you get eight GPUs, two, maybe two and a half times the throughput. So that is a 4x right there, just by virtue of this. So that is the compute part. But if you look at all of these things, there's a tremendous amount of... we talked to companies who have chips upcoming with almost 300 gigabytes of memory on them, right?


2598.706

So that is, you know, like one chip per model. This is the best thing you want if you're on 70Bs, right? Which is, I would say, not the state of the art, but the regular stuff people will use for serving.
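The arithmetic across these last few answers can be sketched roughly in code. This is a back-of-envelope model only, under assumptions not stated by the speaker: fp16 weights at 2 bytes per parameter (a 70B model is then roughly 140 GB), 80 GB per H100, 192 GB for an MI300X-class AMD card, with KV cache and activations ignored; the function names are ours.

```python
import math

def gpus_per_replica(model_gb: float, gpu_mem_gb: float) -> int:
    """Minimum number of cards a single model copy must be sharded across."""
    return math.ceil(model_gb / gpu_mem_gb)

def replicas_per_node(model_gb: float, gpu_mem_gb: float, n_gpus: int = 8) -> int:
    """How many independent model copies fit on one n-GPU machine."""
    return n_gpus // gpus_per_replica(model_gb, gpu_mem_gb)

MODEL_GB = 70 * 2  # 70B params at ~2 bytes each (fp16) is about 140 GB

# 80 GB cards: each copy needs at least 2 cards for weights alone
# (in practice 4 once KV cache is included, hence "two 70Bs" per machine).
print(replicas_per_node(MODEL_GB, 80))   # 4 on weights alone

# 192 GB cards: one copy fits per card, so 8 copies per machine.
print(replicas_per_node(MODEL_GB, 192))  # 8
```

With the sharded setup delivering only two to two-and-a-half times the throughput of one copy, versus eight independent copies on the bigger cards, you land near the 4x the speaker cites.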


2613.698

So if you look, you know, top to bottom and you know what you're going to build with them, then it's a lot better to do the efficiency gains because four times is a big deal, right? And mind you, these chips are 30% cheaper than Nvidia's. It's like a no brainer. But if you go bottom up and say, I'm going to rent them out, people will not rent them. Simple.


2633.946

So that's why, you know, I think it's a good way to attack it from the software because ultimately, do you really care about that your MacBook, let's say, is an M2 or an M3? It's like, oh, it's the better one. And that's it, right? And imagine if you had to care about these things. That would be insane.


2667.346

Stock? Yeah. I used to think the market was efficient. So probably I would still go with NVIDIA today, because of the supply. But, you know, if we play our cards right and we ship our stuff, hopefully I will come back and tell you to buy AMD as much as you can. Or Tenstorrent, you know, if they go public, or whoever else. These chips are amazing, by the way.


2698.339

Probably not a lot of people are accustomed to what it entails to run production. So that inference is production, and production is hard. Somebody has to wake up at night. And I used to be that guy, right? I don't want to do it again. So production is hard.


2715.746

Thankfully, we have a lot of software nowadays to do that a lot better, but there's not a lot of reuse, because the AI field, at least, is not really accustomed to that yet. It's changing; the discussions I had a year ago and the discussions I have today are not the same. They're going in the right direction, but they're not exactly there yet. So probably that would be the number one thing.


2761.545

I mean, they're still going after training. So there's still this frontier. It's probably also why NVIDIA is the better buy right now. Because on the NVIDIA side, if you do training, it's incremental: if you have bought 1,000 NVIDIA GPUs and you buy 1,000 new NVIDIA GPUs, that gives you 2,000 GPUs, right? But if you buy 1,000 NVIDIA and 1,000 AMD, that gives you two separate pools of 1,000, right? It's a bit different.


2786.24

So they're still going after training, definitely. And they're very pragmatic in doing so. But, I mean, they have the capex to spend. They're not making their money out of it, probably. The only ones, by the way, that own their compute are Google. There's this triangle of, I would say, winning. This is my mental model, mind you: you have the product, the data, and the compute.


2807.807

Who has all three? And everything flows from there. Product, data, compute. Who has all three? Google? Amazon? Amazon, they don't have products. They have Amazon, right? They have AWS, but they don't have actual products. Google has, you know, Android, Google Docs, whatever. They have everything. They can sprinkle it everywhere. This is the sleeping giant in my mind.


281.573

So at the very bottom of things, ZML is an ML framework that runs any model on any hardware. We sit ultimately at the infrastructure layer. We enable anybody to run their model better, faster, more reliably, but on any compute whatsoever. It doesn't really matter: it could be NVIDIA, it could be AMD, it could be TPU, and whatnot. And we do all that without compromise.


2843.545

I mean, OpenAI is amazing, but it's not their compute. It is Microsoft's compute. And if you own your compute, you own your margin, is essentially what you're saying. Yeah. Even Microsoft, when they were running NVIDIA, they bought NVIDIA at some outrageous margins. I talk to a lot of people that build data centers, and mind you, these people buy tens of thousands of GPUs.


2868.453

And I asked them, hey, do you get at least a discount or something? And they're like, no, the only thing we get is the supply. So, I mean, ultimately, if you don't own your compute, you're starting with, you know, something at your ankle. Definitely. And so this is why I like to think in this triangle, product, data, compute.


2912.296

There's like a brute-force approach to this. It is a very American approach: more and more and more. But the thing is, if you look at, for instance, the xAI cluster, it's not 100,000 GPUs. It is four times 25,000. You're starting to see something, because InfiniBand, or in their case RoCE, which is anyway the technology they use to bridge their GPUs together, has upper bounds, right?


2937.833

At some point you're fighting physics. So you can push, it's like, you know, trying to get to the speed of light. As you approach it, the amount of energy you need is a lot higher and a lot higher and it grows and grows.


2948.543

So there are two, I would say, counters to that. Number one is that we still scale, but there's a lot of waste and excess, you know, spending on the engineering side, which is the DeepSeek approach, right? Very successful at that, mind you. They said, yeah, if we do this and this differently, then we get, you know, multiples sometimes, right?


2968.777

So virtually you increase your compute capacity because you're more efficient. And the other approach is Yann LeCun's approach, which is: this is not scaling, and at some point we need to look the problem in the face and do something better, right? So of course we push and push and push because there's still capital, but I'm more on these two approaches. I think you can do more with less.


2998.706

I think until somebody does it. DeepSeek was a good wake-up call, right? Suddenly efficiency is in. That's number one. And number two is until there's a new architecture that comes out and changes the game. So in the case of LLMs, for instance, you have these so-called non-transformer models that fundamentally change the compute requirements.


3019.199

That might be a frontier that completely obsoletes transformers. Transformers are the building block by which current models work. The way they work is that for each token, or syllable, the model will look at everything behind it. You can see that as you add more text, you have more work to do.
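The growth he's describing can be sketched with a toy cost count: in a plain transformer decoder, token t looks back at all t earlier positions, so total attention work over a sequence is quadratic in its length. This is illustrative only; real attention cost also depends on heads, hidden dimensions, and KV caching.

```python
def attention_cost(seq_len: int, d_model: int = 1) -> int:
    """Toy count of attention 'work': token t attends to t positions."""
    # sum(1..n) = n(n+1)/2, so this grows roughly as seq_len^2 / 2
    return sum(t * d_model for t in range(1, seq_len + 1))

# Doubling the context roughly quadruples the attention work.
print(attention_cost(1000))   # 500500
print(attention_cost(2000))   # 2001000
```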


3040.8

So there are these new architectures that do not require this, that might change these things and probably shift the amount of compute needed to do training or to do inference. And then there's the new thing, which is Yann's thesis, which is the world model. As in, LLMs are at the end. What we need is something that understands the world fundamentally. It's called the JEPA thesis.


3065.48

I'm very bullish on this, but it's very frontier. Why are you bullish on it? And why is it so frontier? Because it's Yann LeCun. It's hard to. He's no bullshit, right? So he explained to me how it worked and I was blown away. But it makes a lot of sense. We are creeped out because the machine talks back to us. But it's not a new thing, right?


307.784

That's the key point, because if there's a compromise, then it's not really, you know, agnostic, right?


3087.33

You know, this was not new technology when it exploded. But suddenly it was talking back, and that freaked us out. And we got crazy on it, right? But language is one form of communication, and it is ultimately a very narrow window into, you know, the world. We use it to describe the world, arguably with some loss, right?


3110.289

And so the JEPA approach, long story short, is that you have essentially two things you want to do, and you try to minimize the energy to do them. And from this, understanding emerges, physics emerges, et cetera, because you're trying to minimize the amount of energy to go from one state to the other. And that actually makes sense. Like, if you try to pick up this AirPods case,


3133.385

I'm not going to go round trip around the block to get it, right? I just get it. And in my brain, it's wired to just do the thing. If I go and, you know, talk to myself out loud, put the hand down, move to the left and whatever, that feels very, you know, inefficient. So probably this will be something that changes. And in the case of LLMs, there's good work also.


3157.253

on what's called diffusion-based LLMs. Instead of thinking, you know, what's called auto-regressively, which means you get a new token, you re-inject it and you redo, et cetera, they think more like what we do, which is in patches, right? Imagine a paragraph of text where words appear until it's done.
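The autoregressive loop he contrasts against can be sketched as a toy decode function. The `model` callable here is a stand-in, not a real LLM; the point is only the shape of the loop.

```python
from typing import Callable, List

def autoregressive_decode(model: Callable[[List[str]], str],
                          prompt: List[str], n_tokens: int) -> List[str]:
    """One forward pass per new token; each pass sees the whole sequence so far."""
    tokens = list(prompt)
    for _ in range(n_tokens):
        # Strictly sequential: step t cannot start before step t-1 finishes.
        tokens.append(model(tokens))
    return tokens

# Toy 'model' that just echoes the last token with a tick mark appended.
toy = lambda seq: seq[-1] + "'"
print(autoregressive_decode(toy, ["a"], 3))  # ['a', "a'", "a''", "a'''"]
```

A diffusion-style LLM would instead refine all positions of a block in parallel over a few denoising steps, removing the strict token-by-token dependency: that is the "patches" intuition.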


3189.708

I think it's fair game, to be honest. I will not shed a tear. It's fair game. There were some people who tried to ask, I think it was, I don't remember if it was an OpenAI model, so a diffusion model, image, right? They asked it to generate an image from a Star Wars movie at whatever timestamp. And it came out with the Star Wars movie screenshot.


3210.485

Obviously it was trained with it. I think it's fair game because there's no free lunch, right? It was trained with data. You had a good ride. Somebody was sneaky and took it, but you took it from the beginning too. So let's just accept it's fair game.


3230.66

Absolutely, absolutely. I take my cup and enjoy it very much, that movie, every single day.


3255.327

I'm a bit split on this. There's a part of me that said that if you re-inject data into the system, the system deteriorates. That feels a bit, I would say, intuitive. But if you look at AlphaGo, for instance, the moment it's ramped up in its skills is when they started generating games. or synthetic games, right?


3274.116

So I'm a bit, you know, split, but there are some verticals that very much benefit from this. Code LLMs, for instance. We can run code, right? So this is the Poolside thesis. Just so I understand, why does it work for coding and not for other things? Because you don't use the AI model to generate output. You use the machine. You just run the code, right?


328.778

Yes, you actually can see it. It's been happening for a while. Models now are not the right abstractions, at least if you look at closed source models, they're not really models. They're more like backends. And there are a lot of tricks that you feel like you're talking to one model, but ultimately you're talking to a constellation, an assembly of backends that produces a response.


3296.669

And you see what it makes, and you run all this code and create data out of it. Whereas if you run an LLM and say to it, all right, generate me two trillion tokens of text, it will do it with its, you know... so you re-inject and stuff. So there's a lot of tricks, but ultimately my gut tells me that it feels wrong, right? Because you re-inject data that was there.


3320.805

And so it will deteriorate. There's loss. So yeah, I'm a bit bullish. I'm not sure exactly on what vertical. Code is one. We'll see. Distillation is, in some sense, a bit like that. You create synthetic data from a bigger model into a smaller one. Probably the most, I would say, mind-blowing thing about distillation is that sometimes the smaller models become better than the bigger model.


3352.721

One theory is that the smarter model is better at generating output that you would want it to generate, essentially. It's not better in the general sense. It's better at the task at which you were measuring it. This is what it learned to imitate.


3377.49

Sometimes it's wasteful to run big models. A lot of times it's actually wasteful to run big models. I think there's going to be a lot of smaller models for efficiency reasons, but...

3388.737

There's a but, which is you talk to people at DeepMind and they don't even fine-tune anymore, because they have such a big context window, which is, you know, the amount of data you can inject into the model at one time, that nowadays they just dump data into it and say, do whatever that data tells you to do, instead of fine-tuning as we used to do.

3412.228

So the efficiency gains, we're not there yet, right? But if the efficiency gains, I would say, pass that threshold, we'll just do it at runtime. We'll just have a great model that will just specialize at each request. But that's not for tomorrow, I think.

3432.225

It's a very, very clever trick. What you do is you represent knowledge in what's called a vector space or latent space, and you query it through what's called vector search. So imagine you have, let's say, a 3D space that represents all knowledge, all of everything. And let's say a cat sits here, a dog sits close because it's an animal, but it's far from some other property and so on.

3460.72

So what you do is you run the user's request through this same system. It's called an embedding. And that will give you a vector, and you will take whatever is closest to you, what's called semantically close. And then, it's actually very clever, you actually insert those pieces of text before the request.

3482.608

So it's as if you would say, knowing the following, and you give the data, let's say it's law or whatever, please answer my request. And that's it. So that's a bit of a clever trick. It's a bit dirty because, of course, you know, you are limited by the amount of data you can input, right? So there's this problem of how you chunk, you know, the data that you input.
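The retrieval trick described here can be sketched in a few lines. This is a toy, not a real system: the three-dimensional vectors are hand-made stand-ins for real embedding-model output, and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math

# Toy in-memory "vector store": each chunk of text mapped to a tiny
# hand-made embedding. In a real system these come from an embedding model.
CHUNKS = {
    "A cat is a small domesticated animal.": [0.9, 0.8, 0.1],
    "A dog is a loyal domesticated animal.": [0.8, 0.9, 0.1],
    "A contract requires offer and acceptance.": [0.1, 0.1, 0.9],
}

def cosine(a, b):
    # Cosine similarity: how "semantically close" two vectors are.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, k=1):
    # Take whatever is closest to the query vector in the latent space.
    ranked = sorted(CHUNKS, key=lambda c: cosine(CHUNKS[c], query_vec), reverse=True)
    return ranked[:k]

def build_prompt(question, query_vec):
    # Insert the retrieved pieces of text before the request, as described.
    context = "\n".join(retrieve(query_vec))
    return f"Knowing the following:\n{context}\n\nPlease answer: {question}"

# A query vector that sits near the "legal" region of the toy space.
prompt = build_prompt("What makes a contract valid?", [0.1, 0.2, 0.95])
print(prompt)
```

The point of the sketch is the shape of the pipeline: embed the request, rank chunks by similarity, and prepend the winners as a "knowing the following" preamble.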

349.64

Probably the number one, you know, I would say obvious thing would be that if you ask a model to generate an image, then it will, you know, switch to a diffusion model, right? Not an LLM. And there's many, many more tricks. The turbo models at OpenAI do that. There's a lot of tricks.

3517.237

It is, it is. Depends on how it works, but yes, sometimes it is. But think of it as in, it's like a preamble to your question. Knowing the following, and the following is a tiny window into the content. Please answer my question. And of course, as you talk more and more, it will forget because that window is fixed.

3544.165

What pushes smaller models is efficiency, roughly, speed. You know, less is better. So if we can do with less, then less it is. Simple as this, right? In terms of RAG, the key frontier is what we call attention-level search. But this is something we're working on. You have the exclusivity now, I'm putting it out there. It doesn't push, I would say, model sizes.

3566.182

What really drives model sizes is efficiency rather than specialization. Meaning that if you can get the same performance from a smaller model that is fine-tuned or used with RAG or whatever, then you'll do it with the smaller one because, again, less is better.

3596.297

Oh, I love it. Constraint is the mother of innovation. Yes, you know, we can, you know, troll a bit about, you know, the Singapore, you know, gray market and all of these things. But ultimately, like they had no choice. Here's the thing. If you can buy more, why would you give a damn, right? You can just buy more. So if you are pushed to efficiency, then you will deliver efficiency.

3618.994

These are very, very skilled people. This is the coolest thing to me about AI, honestly: geography doesn't matter anymore. You can just do things. You appear out of nowhere, boom, you're on the map. And so I'm very, very glad that they did. I found the reaction very entertaining, to be honest. So yeah, I mean, constraint is a very good driver of efficiency.

365.108

So definitely models, in the sense of getting, you know, weights and running them, is something that is ultimately going away in favor of, like, full-blown backends, right? You feel like you're talking to a model, but ultimately you're talking to an API. The thing is, that API will be running locally, in your own cloud instances and so on.

3652.102

I'm not sure who is a threat to OpenAI at the moment. Here's why. You look at the numbers. I mean, we live in a bubble. We, you know, we follow every new episode, the whatever new model, whatever, who said what and so on. But, you know, I go to my mother and I ask her, you know, do you know ChatGPT? And she says, yes.

3669.985

And do you know, I don't know, I don't want to dunk on anybody, but do you know some other model? And she says, what is it, right? Even Gemini, right? Google, right? So they have a strong brand, they have a strong product, but there's a balance between the product and the models, honestly.

3685.615

So this is Gary from Fluidstack, actually, who told me that his mental model is that model providers will be like car makers. There's no winner-take-all. Everybody will have their own, because ultimately human knowledge, everybody has everything. So we're converging. But I liked the analogy.

3703.789

Yes, DeepSeek made waves, very good waves, but they were, you know, waves that were amplified by the media and the narrative and the drama.

3719.036

Today, maybe. Tomorrow, I'm not sure. They're a bit late in terms of ASICs. They are like A100 level, but probably, I would say, one of their unfair advantages is that it's like when you do exercise in the water, right? It's like this. So this is their state. They are constrained, so they are bound to do better. They can just not buy their way into better compute.

3757.942

No, I don't care. This is something that makes me wonder sometimes. I understand the narrative and so on, but I am absolutely not fearful. Let's be successful first, and then we'll talk about the politics. But again, I'm not Mistral. I'm not building gigawatt data centers and so on. If you build gigawatt data centers, maybe you run into these problems.

3798.981

They are very competent. I think it's easy to spread FUD. There's a lot of FUD going around, especially about regulation and everything. But here's the thing: I look around me and I don't see, you know, what I read, right? So I'm hardly convinced. You know, everybody was saying that they were dead, and boom, they came out with their release and it was insane.

3819.509

So what I know is that I hope they don't have too much money. That's for sure. You want to be clever, right?

3836.181

My first impression was that I don't buy it. I would say, you know, American style, right? You start with the claim and we'll figure it out later. I don't buy it. And ultimately, I'm not sure I care that much about it. Let's imagine it's true, right? Congratulations. Amazing. But it is more of the same. It is a vertical scaling. And as you know, my days are spent on efficiency.

3859.756

So I look at these things as being like, all right, this is a bigger, you know, this is an American car of AI. It's big. It consumes a lot of gas. But ultimately, you know, it's not a good car, right? I think there has to be sufficient capital, but at some point, I'm not sure it is really a differentiator. That was prior to DeepSeek, then DeepSeek came.

3881.67

That was always my thesis, but you need money, you need infrastructure, you need... But ultimately, probably the two limiting factors today are talent and energy. That's it. The rest, yes, of course, you can buy 500 billion of GPUs. By the way, at 90% margin. So if we work on that margin, we can probably shrink that number. So I'm not easily entertained by these numbers.

3911.32

I've seen how the sausage is made way too many times. Dude, I want to do a quick fire with you.

3925.774

Oh, yeah. Latency, reasoning. Definitely. This year. What does that mean? It's the shift from throughput, how fast my answer streams, to how long it takes for my complete answer to appear. That is probably one of the fundamental shifts, like, this year, right? Longer term, I'm really rooting for non-transformer models that will also change the compute landscape.

3961.445

Probably the number one thing I would say is do not resell compute if you can. A lot of AI startups that are building on top of AI are trying to make a margin on top of a very big cake. And ultimately what they sell is compute. For $1 of spend, maybe 98% of it goes to somebody else's margin.
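The stacked-margin point can be sketched with the rough numbers quoted in the conversation (a cloud at about 30% gross margin, NVIDIA at about 90%, TSMC at about 60%). This is illustrative arithmetic only, not real financials: if each layer keeps its margin share of what it charges, very little of the end customer's dollar survives the stack.

```python
# Gross margins per layer, top of the stack first. Illustrative figures
# taken from the conversation, not actual company financials.
margins = {"cloud": 0.30, "nvidia": 0.90, "tsmc": 0.60}

dollar = 1.00
for layer, margin in margins.items():
    kept = dollar * margin   # margin captured at this layer
    dollar -= kept           # what flows down as this layer's input cost
    print(f"{layer}: keeps ${kept:.3f}, passes down ${dollar:.3f}")

# After three layers, roughly 3 cents of the original dollar remains,
# which is the "very thin crust on a very big cake" point.
```

Running the sketch, about 97-98 cents of each dollar is captured as someone else's margin before the spend reaches the bottom of the stack.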

3983.872

So if you do AI, as much as you can, try to verticalize on the product, but not on the compute. If your business model implies buying a lot of tokens, it's a very hard circle to square to put that into $20 a month. So I always say, please look at it from that angle. And if you can, try and avoid it. What's the biggest challenge that Jensen Huang faces today?

4011.914

The highs are very high, but they don't last forever. So probably it's how to navigate the downslope. Blackwell is probably something that keeps him awake at night.

4029.9

Because orders are getting canceled. Why are they getting canceled? They have a lot of problems with these chips. So a lot of people, you know, are canceling their orders. These chips are like on the frontier of scaling. And so, you know, they were supposed to come out last summer, but there was that heat dissipation and, you know, chip-bending problem.

405.334

No, absolutely. You can get probably an order of magnitude more efficiency depending on the hardware you run on. That is substantial. Not a lot of people have that problem at the moment. Things are getting built as we speak. But a simple example is if you switch from NVIDIA to AMD on a 70B model, you can get four times better efficiency in terms of spend. So that is substantial.

4052.154

The people who are very privy to silicon told me, this is what we call a pretty big fucking problem, right? End quote. So probably, how to navigate the downslope. Maybe you don't know this, but the supply of H100s was actually smoothed out over the year so that there wasn't a big spike in deliveries and then a quarter with less, right? Which pissed off a lot of people, mind you, who bought a lot of them.

4078.096

Some of them even haven't received their order from last year. And they already see, like, the new chip, the B200, and then the one after, you know, and they're super pissed. There will be a downslope at some point. The question is, you know, when and how, like, if the H100 bubble pops, of course it will impact Nvidia. But Blackwell, I'm probably going to get a lot of flak for this.

4099.914

But, you know, I've seen some very worrying numbers about it, and varying testimonies from people who operate these things, right? So that ride will stop, or at least, you know, slow down.

432.244

That is very much substantial. Now, the problem is getting some AMD GPUs, right? I'm really sorry.

442.628

So there's a few reasons. Probably the most important one is the PyTorch-CUDA, I would say, duo. And that's very, very hard to break. These two are very much intertwined. Can you just explain to us what PyTorch and CUDA are? Oh, yes, absolutely. Yeah, yeah. PyTorch is the ML framework that people use to build and actually train models, right?

464.378

You can do inference with it, but by far the most successful framework for training is PyTorch. And PyTorch was very much built on top of CUDA, which is NVIDIA software, right? Let's just say the strengths of PyTorch make it ultimately very, very bound to CUDA. So of course it runs on, you know, it runs on AMD, it runs on, you know, even Apple and so on.

489.353

But there were always, you know, the tens of little details that don't run exactly like, you know, you would expect, and there's work involved. But then there's also supply. So probably that's the number one thing. The second thing is there's a lot of GPUs on the market. Pretty much all of them are NVIDIA.

508.146

The reason being that if you think, you know, in layers and you say, all right, I'm going to buy, let's say, GPUs and I'm going to sell them to folks to maybe not even do training, right? Just do inference. Then most likely, if you look at it that way, you'll end up buying Nvidia, because everybody will want to run on Nvidia, because nobody really knows how to do anything else.

528.177

And they've trained on Nvidia, so they're like, I can just reuse my code and so on. So there's like this self-perpetuating circle of people just buy Nvidia because they want to resell and people just use Nvidia because it's there, right? But it's by far not the most efficient platform. And arguably, even in terms of software, it's not the best software platform.

572.279

Because the chips are there. There's a lot of things, but in my opinion, there's going to be a need for inference. Very hard to say whether it will be worth everybody's money to do it on H100. That is a bubble that I think will blow some time. I'm kind of afraid of that, to be honest.

593.252

Because it was built on the A100, I would say, financial model, which was: at generation zero, we do training, and when it's last generation, we do inference. And it worked beautifully, right? For the A100. Then the H100 comes along, and for inference, it's worth five times the price and maybe runs twice as fast in terms of performance. On inference, that is. On training, it's a lot better, but on inference,

619.016

It's like maybe twice as fast. Actually, when it came out, it ran at the same speed as the A100. So there's a money gap that's going to have to, you know, be bridged sometime, right? And the part that worries me is that I see, you know, amortization plans over, like, six, seven years, right? With the GPUs as the collateral.

638.729

And I'm like, well, I'm not sure how it's going to work because at least when they came out, they were worth five times the price and they're just two times, you know, faster. Something has got to give.

659.966

Not much, ultimately. The two things that could very much shake the industry, the chip industry, in my opinion, are agents and reasoning.

674.358

I think this is where NVIDIA can be attacked. I mean, why agents and why reasoning? The difference is for agents and reasoning, you need to wait until the end of the request to get whatever it is you came for. You don't really care about the speed at which the text outputs, which is what you want in a chat, right?

695.387

You only care about how much time it takes between the beginning of my request and the end. And so that fundamentally changes the incentives from throughput-bound to latency-bound. And so GPUs, let's say you're running a GPU at, let's say, 10,000 tokens per second. It would very much like to do it as, you know, 100 streams times 100 tokens, right?

718.772

And they can do that, but they cannot give you 10,000 tokens per second only on you, per stream, as we say. But in terms of agents or reasoning, that is exactly what you want, because you don't want to wait like 50 seconds for whatever thinking, right? And agents, it's the same. So these two, I think, are the shot that might make NVIDIA change its course with respect to chips.
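The throughput-versus-latency point can be sketched with toy arithmetic. The 10,000 tokens per second and 100-stream figures are illustrative numbers from the conversation, not benchmarks of any particular chip.

```python
# Toy throughput-vs-latency arithmetic: the aggregate number (tokens/s
# across all batched streams) and the per-stream number are very
# different things. Illustrative figures only.
aggregate_tps = 10_000      # total tokens/s the chip sustains, batched
concurrent_streams = 100    # requests being served at once

per_stream_tps = aggregate_tps / concurrent_streams
answer_tokens = 5_000       # a long reasoning trace or agent transcript

wait_seconds = answer_tokens / per_stream_tps
print(f"per-stream speed: {per_stream_tps:.0f} tok/s")
print(f"time until the full {answer_tokens}-token answer: {wait_seconds:.0f} s")
```

With these numbers a single stream only sees 100 tokens per second, so a long reasoning answer takes about 50 seconds end to end, which is exactly the wait a chat user tolerates but an agent or reasoning workload does not.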

743.621

I mean, they're not idiots, right? How should agents change NVIDIA's strategy? Hard to say, because NVIDIA has a very, very vertical approach. They do more of more, right? Like if you look at Blackwell, it's actually crazy what they did for Blackwell. They assembled two chips.

763.226

But the surface was so big that the chip started to bend a bit, which further perpetuated the problem, because it then didn't make contact with the heat sink and so on. And, you know, the power envelope, they pushed it to a thousand watts. It requires liquid cooling and so on. So they are very much in a vertical, foot-on-the-pedal mode of GPU scaling.

785.48

But the thing is, GPUs are a good trick for AI, but they're not built for AI. It's not a specialized chip. It is a specialization of a GPU, but it is not an AI chip.

806.495

So the way it worked is that you can think of a screen as a matrix. And if you have to render pixels on a screen, there's a lot of pixels and everything has to happen in parallel, right? So that you don't waste time. Turns out, you know, matrices are a very important thing in AI.
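
That parallel structure can be sketched in a few lines of plain Python (a minimal illustration, not GPU code): shading pixels and multiplying matrices are the same pattern, where every output element is an independent computation.

```python
# A screen is a matrix of pixels; each pixel is an independent function
# of its coordinates, so all of them could be computed in parallel.
height, width = 4, 8
screen = [[(x + y) % 2 for x in range(width)] for y in range(height)]

# Matrix multiplication has the same shape: C[i][j] is an independent
# dot product of row i of A and column j of B -- the same data-parallel
# workload a GPU was built to exploit for pixels.
A = [[0.0, 1.0, 2.0],
     [3.0, 4.0, 5.0]]
B = [[float(4 * i + j) for j in range(4)] for i in range(3)]

C = [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(4)]
     for i in range(2)]
```

Nothing in either loop depends on any other output element, which is why the same silicon serves both jobs.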

825.808

So there was this cool trick, probably 20 years ago now, where we would trick the GPU into believing it was doing graphics rendering when actually we were making it do general parallel work, right? It was called GPGPU at the time. So it was always a cool trick, but it was never dedicated to this.

846.65

The pioneers probably were, of course, Google with the TPU, which is much more advanced at the architectural level. But essentially, the way they work kind of works for AI. But for LLMs, that starts to crack, because they're so big and there's a lot of memory transfers and so on.

866.724

Actually, that's why Groq (the chip company, not Grok), Cerebras, and all these folks achieve very high single-stream performance: the data is right there in the chip. They don't have to fetch it from memory, which is slow, which a GPU has to do. So there's a lot of these things that ultimately make it a good trick but not a dedicated solution per se.
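
Why keeping the data in the chip helps single-stream speed can be shown with a toy bandwidth model (all numbers below are illustrative assumptions, not vendor specs): generating one token streams the whole model's weights once, so tokens per second is roughly memory bandwidth divided by model size.

```python
# Toy model of memory-bound single-stream decoding.
# All numbers are illustrative assumptions, not vendor specifications.
model_bytes = 14e9        # e.g. a 7B-parameter model at 2 bytes per weight
hbm_bandwidth = 3e12      # ~3 TB/s, off-chip HBM-class memory
sram_bandwidth = 80e12    # ~80 TB/s, on-chip SRAM-class memory

def tokens_per_second(bandwidth_bytes_per_s, bytes_per_token):
    # Each generated token reads every weight once; compute is assumed free,
    # so memory bandwidth is the only bottleneck in this sketch.
    return bandwidth_bytes_per_s / bytes_per_token

hbm_rate = tokens_per_second(hbm_bandwidth, model_bytes)    # ~214 tok/s
sram_rate = tokens_per_second(sram_bandwidth, model_bytes)  # ~5714 tok/s
```

The order-of-magnitude gap between the two rates, not the exact figures, is the point.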

890.454

That said, though, the reason NVIDIA probably won, at least in the training space, is Mellanox, not the raw compute. Because you need to run lots of these GPUs in parallel, the interconnect between them is ultimately what matters. How fast can they exchange data? Because remember,

912.674

When you do a matrix multiplication, let's say, the matrix is read like hundreds of times during the multiplication. So there's a lot of transfers going on. And so far, Mellanox, with, you know, InfiniBand, had the best technology. And when you do training, by the way, the interconnect is the name of the game. When you do inference, not so much.
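
The repeated-reads point is simple arithmetic (a back-of-the-envelope sketch, not hardware measurements): in a naive n-by-n matrix multiplication, every element of A is read once per column of B, so without on-chip caching the same data crosses the memory or interconnect boundary n times.

```python
# Back-of-the-envelope for naive C = A @ B with square n x n matrices.
n = 1024

# Each element of A meets every one of B's n columns, so it is read
# n times unless it stays cached on-chip; B's elements behave the same.
reads_per_element = n
elements_in_A = n * n
scalar_reads_of_A = reads_per_element * elements_in_A   # n**3 reads

# Useful math also grows like n**3 (one multiply + one add per pair),
# so data movement scales with compute and becomes the bottleneck.
flops = 2 * n**3
bytes_if_uncached = scalar_reads_of_A * 4               # fp32, A alone
```

With n = 1024 that is over a billion scalar reads of A alone, which is why the speed of moving data between chips, not the arithmetic, sets the pace in training.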

959.628

So I would divide it in two categories. Well, three categories: the GPUs you can buy or rent, the TPUs you can rent, and the dedicated chips you can buy. This is how the market is structured today, right? Right now, if you want to go dedicated, at least in the cloud, there's two options, TPUs and Trainium. TPUs on Google, Trainium on Amazon.

985.021

So these are available chips, you can rent them today. If you want to buy or rent GPUs, well, they're GPUs, we know them well. And then there's this new wave of computing, which is dedicated chips you can actually buy: Tenstorrent, Etched, VSORA. So I think it will be a mix. For instance, let's say you are on Google Cloud, of course you don't want to do NVIDIA.