Simon Willison

👤 Person
236 total appearances

Podcast Appearances

Oxide and Friends
Predictions 2025

There is at least one significant model now where the training data is open, as in you can download a copy of the training data. It includes stuff from the Common Crawl, so it includes a bunch of copyrighted websites that they've scraped. But there is at least one model now that has complete transparency on the training data itself, which is good, you know.

Oxide and Friends
Predictions 2025

One of the other things that I've been tracking is, I love this idea of a vegan model, an LLM, which really was trained entirely on openly licensed material, such that all of the holdouts on ethical grounds over the training, which is a position I fully respect. If you're going to look at these things and say, I'm not using them, I don't agree with the ethics of how they were trained,

Oxide and Friends
Predictions 2025

That's a perfectly rational decision for you to make. I want those people to be able to use this technology. So actually, one of my potential guesses for the next year was I think we will get to see a vegan model released. Somebody will put out an openly licensed model that was trained entirely on licensed or public domain work. I think when that happens, it will be a complete flop.

Oxide and Friends
Predictions 2025

I think what will happen is it won't be as good as the... It'll be notably not as useful. But more importantly, I think a lot of the holdouts will reject it because we've already seen this. People saying, no, it's got GPL code in it. The GPL says that you have to attribute the... There's attribution requirements not being met, which is entirely true. That is, again, a rational position to take.

Oxide and Friends
Predictions 2025

But I think that... It's both true and it makes sense to me, but it's also a case of moving the goalposts. So I think what would happen with a vegan model is the people who it was aimed at will find reasons not to use it. And I'm not going to say those are bad reasons, but I think that will happen.

Oxide and Friends
Predictions 2025

In the meantime, it's just not going to be very good, because it won't know anything about modern culture or anything where it would have had to rip off a newspaper article to learn about something that happened.

Oxide and Friends
Predictions 2025

I'm very sold on that with one sort of edge case. And that's the thing about writing. The most tedious part of learning is learning to write essays. That's the thing that people cheat on. And that's the thing where I don't see how you learn those writing skills without the miserable slog, without the tedium.

Oxide and Friends
Predictions 2025

And so that's the one part of education I'm most nervous about is how do people learn the tedious slog of writing when they've got this tempting devil on their shoulder that will just write it for them.

Oxide and Friends
Predictions 2025

I will say one thing about LLMs for feedback. They can't do spell checking. I only noticed this recently. Claude, amazing model, it can't spot spelling mistakes. If I ask it for spell checking, it hallucinates words that I didn't misspell, and it misses the words that I did. And it's because of the tokenization, presumably. But that was a bit of a surprise. It's like, it's a language model.

Oxide and Friends
Predictions 2025

You would have thought that spell checking would work. Anything they output is spelled correctly, but they actually have difficulty spotting spelling mistakes, which I thought was interesting.
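
[Editor's illustration] One way to picture the tokenization guess in the quote above is the toy sketch below. It is not how Claude or any real tokenizer works: the vocabulary, the IDs, and the greedy longest-match rule are all made up for illustration. It only shows that a model receiving token IDs does not directly see individual letters, so a one-letter typo can appear as a different sequence of pieces rather than as "the same word with one letter missing".

```python
# Toy illustration only: this vocabulary and its IDs are hypothetical,
# and real tokenizers (for example BPE-based ones) are built differently.
TOY_VOCAB = {
    "spell": 0, "ing": 1, "sp": 2, "el": 3,
    "s": 4, "p": 5, "e": 6, "l": 7, "i": 8, "n": 9, "g": 10,
}


def tokenize(word, vocab=TOY_VOCAB):
    """Split `word` into vocabulary pieces using greedy longest match."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, shrinking until one matches.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece covers {word[i]!r}")
    return pieces


def encode(word, vocab=TOY_VOCAB):
    """Map a word to the list of integer IDs a model would actually receive."""
    return [vocab[piece] for piece in tokenize(word, vocab)]


if __name__ == "__main__":
    for word in ("spelling", "speling"):
        print(word, tokenize(word), encode(word))
```

In this toy setup the correctly spelled word becomes two pieces and the typo becomes three unrelated-looking ones, so nothing in the ID sequence says "one l is missing". Whether this is the actual cause of the behaviour described in the quote is, as the speaker says, a presumption.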
