Pekka Enberg

Appearances

Code Story

S10 Turso Update with Pekka Enberg

1018.26

There's basically two main things. In the mobile space, you basically have two platforms, right? So you have iOS and Android, but then you also have things like React Native and all that stuff. So it's actually pretty fragmented. So one of the main things we're focusing on is basically making sure that the

Code Story

S10 Turso Update with Pekka Enberg

1036.713

our SDKs for those platforms are top-notch and that you really just get the developer experience. Because for us, it's always this combination of trying to get the best possible developer experience, but combining that with robust infrastructure. So that's one thing which is actually a surprisingly big investment.

Code Story

S10 Turso Update with Pekka Enberg

104.252

My story is tied with Glauber's as well. We both worked on the Linux kernel. It feels like it was yesterday, but more than a decade ago, we were working with the Linux kernel. That's when we met. We joined a company to do an operating systems product, which then pivoted to something completely different, a database product. So me and Glauber worked at ScyllaDB.

Code Story

S10 Turso Update with Pekka Enberg

1055.1

You really have to go and do the work for every fragment of the ecosystem separately. So that's one. The other part is basically, as you also hinted towards, even if you do this stuff on mobile or in the browser, you want to offload to the cloud. And a big part of what we build is the stuff that basically runs on the cloud and getting all of that right and doing the scaling there.

Code Story

S10 Turso Update with Pekka Enberg

1079.627

Disaggregated storage is something that is super interesting. And basically, we're keeping SQLite in the client, but we're also doing this server-side SQLite in a sense, which is also a big part of what we do.

Code Story

S10 Turso Update with Pekka Enberg

1113.233

That's a great question. We struggled with that as well a little bit. So as I mentioned, two main platforms, and then you have React Native, you have Flutter, and other things as well. But I think those are the four main ones to consider.

Code Story

S10 Turso Update with Pekka Enberg

1129.688

For us, actually starting with React Native just turned out to be pretty natural because our JavaScript SDK for the server side thing is something that is by far the most popular one. And unsurprisingly, JavaScript is such a huge ecosystem. That is the main one we want to tackle first because there's this upside, of course, that you get this portability between the platforms.

Code Story

S10 Turso Update with Pekka Enberg

1152.776

But I think down the road, the reality of things is that it really depends on the types of applications. If you look at the big ones, they will go native. So React Native is something we can start with, but then you probably have to do both Android and iOS. You can't really make a decision to go with just one of them.

Code Story

S10 Turso Update with Pekka Enberg

1175.302

Then the rest of it, Flutter and frameworks like that, I think for us, it's just going to be demand-based. We're going to see if we have enough people basically wanting to do something beyond those three.

Code Story

S10 Turso Update with Pekka Enberg

1219.984

So right now, unfortunately, it's not production grade yet. So you just come to our Discord and ask for the beta of the React Native client, essentially. That's basically just the starting point. The way we're integrating is by trying to find existing open source ecosystem libraries, for example, and integrating with them. So the one we use today is a library called op-sqlite.

Code Story

S10 Turso Update with Pekka Enberg

124.666

I think you've had their founder here as a guest as well. I did do real backend programming also before switching to working on databases themselves. So Java was a hot, cool technology at the time. That's the general introduction I usually give.

Code Story

S10 Turso Update with Pekka Enberg

1245.318

And basically, it will have Turso support out of the box. And basically, then you sign up to our service and get your databases on the cloud. And off you go. Because once you have a managed database on the cloud, then it's just a configuration thing on the client to connect to your remotely managed database, and it will do all the sync and all those things.

Code Story

S10 Turso Update with Pekka Enberg

154.969

The tagline is SQLite for production. And basically what we're trying to do is bring those capabilities to SQLite. And for the audience that doesn't know what SQLite is, SQLite is an embedded database. It basically runs already everywhere on embedded systems. The claim is that it's the most deployed database on the planet, which is probably true.

Code Story

S10 Turso Update with Pekka Enberg

176.778

But if you look at more modern workloads, people doing serverless, even on mobile to some extent, there are things that are just missing from SQLite. And actually, I don't know how much you went through this with Glauber on the inception story, but Turso actually was a company doing something completely different in the beginning.

Code Story

S10 Turso Update with Pekka Enberg

194.07

And we were heavy users of SQLite for a local development toolkit, essentially. And we wanted to do something like that. The thing was to build something like a managed service, essentially. And we were always thinking that we're going to get a proper SQL database to do it in production.

Code Story

S10 Turso Update with Pekka Enberg

211.097

We quickly discovered that SQLite itself is pretty agile, worked really well, but it was just missing some features like replication, which we needed at the time. So that's the thing that we do. We built those features to SQLite to make it really awesome for modern production workloads.

Code Story

S10 Turso Update with Pekka Enberg

237.768

So actually really early on, one of the things which we focused on is this ability to bring the database close to, essentially, the user. For people doing web apps where latency is essential for a great customer experience, that is something that really resonates.

Code Story

S10 Turso Update with Pekka Enberg

256.438

And if you combine it with something like Cloudflare Workers, for example, you can actually cut down a lot of the sort of overall latency, just have a better response time, essentially. But we also, something that we did, which is maybe slightly counterintuitive, SQLite is an in-process database.

Code Story

S10 Turso Update with Pekka Enberg

275.267

So it's a library that you put into your application and now the database is in the same sort of, it's in your application. But we also added a mode where you can access SQLite remotely. So turning it into something similar to Postgres. But the thing with SQLite is it's so lightweight that you can get lots and lots of databases.

Code Story

S10 Turso Update with Pekka Enberg

294.839

So one interesting thing that people are using Turso for is essentially having a database per tenant. So a database-per-user architecture, for example, which is great for SaaS applications and things like that.
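Editor's note: the database-per-tenant idea described here can be sketched with plain SQLite. This is a minimal illustration using Python's standard `sqlite3` module, not Turso's API; the directory layout, the `notes` table, and the helper names `open_tenant_db`, `add_note`, and `list_notes` are all hypothetical.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path

# Hypothetical layout: one SQLite file per tenant under a base directory.
BASE_DIR = Path(tempfile.mkdtemp(prefix="tenants-"))


def open_tenant_db(tenant_id: str) -> sqlite3.Connection:
    """Open (and lazily create) the database that belongs to one tenant."""
    if not tenant_id.isalnum():
        # Keep tenant ids from escaping the base directory.
        raise ValueError("tenant id must be alphanumeric")
    conn = sqlite3.connect(str(BASE_DIR / f"{tenant_id}.db"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def add_note(tenant_id: str, body: str) -> None:
    """Insert a row into this tenant's own database file."""
    with closing(open_tenant_db(tenant_id)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))


def list_notes(tenant_id: str) -> list[str]:
    """Read rows from this tenant's database only."""
    with closing(open_tenant_db(tenant_id)) as conn:
        return [row[0] for row in conn.execute("SELECT body FROM notes ORDER BY id")]


add_note("alice", "alice's private note")
add_note("bob", "bob's private note")
print(list_notes("alice"))  # only alice's rows live in alice's file
```

Because each tenant has a separate file, no query against one tenant's connection can read another tenant's rows, which is the isolation property the segment describes.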

Code Story

S10 Turso Update with Pekka Enberg

320.984

Two things to update on. And maybe the first one is actually, let's say, more boring incremental part. Because actually, when we first started to work on Turso, as I mentioned, like this edge part was a key thing for us. So even if it's infrastructure, you can't really be that agile in infrastructure space.

Code Story

S10 Turso Update with Pekka Enberg

340.118

But you can still find ways to do proof of concepts and really aggressively validate and work with them. Around the time Glauber was on this podcast, we had just announced something called Embedded Replicas, and we were actually working on the multi-tenancy thing that I mentioned. And those were actually just things that we already had on the roadmap since after the first six months.

Code Story

S10 Turso Update with Pekka Enberg

361.98

And it's just basically, once you start to work on it... It's a different way of using the database. You can have the database inside your application, but you can still have replication. So you can have durability by basically offloading to the cloud, but also this multi-tenancy thing and all the schema migration and all that stuff. And it is still ongoing work.

Code Story

S10 Turso Update with Pekka Enberg

382.116

So there's lots and lots of work to do to get that into sort of production shape. So that's the sort of more boring incremental part. But the really interesting thing that we recently did, recently released, and it's actually not GA yet, is basically bringing a capability, a new capability to SQLite. So vector search. Probably people are already getting a little bit bored with all the...

Code Story

S10 Turso Update with Pekka Enberg

404.441

LLM and AI stuff. But basically, that's a super interesting thing that happened over the past 18 months: you have a completely new type of workload, new types of applications, which need this capability. And that's something that I personally find super exciting, because this was the first time we really had to dig deep and change a core part of SQLite. Very cool.

Code Story

S10 Turso Update with Pekka Enberg

444.275

There are many benefits to, as I mentioned, the database per tenant model. So if you zoom out a little bit, what typically people have to do when they're building whatever application, you take a big database and you start with that.

Code Story

S10 Turso Update with Pekka Enberg

459.204

All your user data is in the same database, all the product catalogs, if you're doing an online store or things like that, they're all shared in this one database. And then problems pop up. As you start to scale, you need to start thinking about sharding and all of those things. But maybe more importantly, it's really hard to keep the data isolated.

Code Story

S10 Turso Update with Pekka Enberg

480.455

So that's one of the reasons why the multi-tenancy story essentially resonates so well, because now you can have these databases which are isolated from each other. So just imagine managing your user data or even some more confidential data. And it really extends all the way from the backend to mobile as well, right?

Code Story

S10 Turso Update with Pekka Enberg

497.686

So you can imagine having your application data essentially sharded per user and having that replicated in the mobile device, for example. So it's partly about scaling, but it's also about privacy and data isolation.

Code Story

S10 Turso Update with Pekka Enberg

634.938

When there was this first wave of ChatGPT 3.5, I think that was the sort of, at least for me, the sort of turning point. Then all of a sudden we were in this situation that everybody wanted to apply these large language models to their applications. And basically the models themselves are super useful, but there's this problem called hallucination because they just make up stuff.

Code Story

S10 Turso Update with Pekka Enberg

656.459

So these large language models essentially are limited to whatever they saw during training. And these things get trained by reading essentially through the whole internet. But there's always a cutoff date, right? So you train it and then after that, it doesn't really know about the new things that appear. But also for enterprises, these models don't really know your company-specific data.

Code Story

S10 Turso Update with Pekka Enberg

677.222

And that's why people came up with this retrieval-augmented generation, which is essentially just retrieving data for the model. And this is where the vector search part comes in. Imagine an interface where you have a customer typing a question. So the way it essentially works is that you take that question, you run it through a

Code Story

S10 Turso Update with Pekka Enberg

697.122

large language model, generate an embedding, which is a vector. And then you use this vector or this embedding to find relevant information. And that relevant information is found through vector search, which is managed in some database. For me, the really interesting thing is that initially what happened was that there was this explosion of different special-purpose databases, vector databases.
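Editor's note: the retrieval step described here (embed the question, then find the stored items whose vectors are closest) can be sketched as follows. This is a toy illustration under stated assumptions, not Turso's vector search: `embed` is a hypothetical word-count stand-in for a real model-generated embedding, the `docs` table and `VOCAB` list are made up, and the search is a brute-force cosine-similarity scan over rows stored in an in-memory SQLite database.

```python
import json
import math
import sqlite3

# Hypothetical toy vocabulary; a real system would use a model-generated embedding.
VOCAB = ["refund", "shipping", "password", "invoice"]


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: count occurrences of vocabulary words."""
    words = [w.strip(".,?!:") for w in text.lower().split()]
    return [float(words.count(term)) for term in VOCAB]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)")
for body in [
    "Refund requests are accepted within 30 days.",
    "Shipping usually takes five business days.",
    "Reset your password from the settings page.",
]:
    # Store each document next to its embedding, serialized as JSON text.
    conn.execute(
        "INSERT INTO docs (body, embedding) VALUES (?, ?)",
        (body, json.dumps(embed(body))),
    )


def search(question: str, k: int = 1) -> list[str]:
    """Embed the question, then return the k most similar stored documents."""
    q = embed(question)
    rows = conn.execute("SELECT body, embedding FROM docs").fetchall()
    ranked = sorted(rows, key=lambda r: cosine(q, json.loads(r[1])), reverse=True)
    return [body for body, _ in ranked[:k]]


print(search("how do I get a refund"))
```

In a retrieval-augmented setup, the documents returned by `search` would be passed to the language model together with the customer's question. A production system would typically replace the full scan with an index built for nearest-neighbor search.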

Code Story

S10 Turso Update with Pekka Enberg

719.589

At some point they were called embedding databases, but then I think everybody converged on vector databases. And these are special-purpose things that do just the retrieval part. But quickly people also discovered that, hey, we still have this traditional data that we want to access, but also lots of different databases and data sources. So how can we simplify this thing?

Code Story

S10 Turso Update with Pekka Enberg

743.07

And then you had Postgres adding this extension and so forth. But with SQLite, what is really interesting is that because it is such a lightweight thing and you can run it on mobile devices, for example, when you actually bring this vector capability to SQLite, you can do all of this model-related processing and all of that searching within the device itself.

Code Story

S10 Turso Update with Pekka Enberg

763.127

And you can imagine you have the latency advantage, but also increasingly people are super interested in the sort of privacy aspect, right? Because now you can have the private information on the device. It doesn't necessarily have to leave the device. So I think that's the cool part in using the old traditional SQL database and then vector search.

Code Story

S10 Turso Update with Pekka Enberg

81.977

Thank you for having me.

Code Story

S10 Turso Update with Pekka Enberg

810.183

Zooming out a little bit and going back to the large language models. So for us, we actually initially didn't do anything. When the first vector databases came out, we decided, okay, we don't really understand this space. We're just going to wait it out and see what happens in the market. But then

Code Story

S10 Turso Update with Pekka Enberg

826.152

six months later, you could see the Postgres community, for example, really stepping up and doing this. And then we started thinking, this kind of becomes like an existential thing. It's one workload, but it still becomes a super important workload. So what can we do? And then we basically just started exploring how we could implement it.

Code Story

S10 Turso Update with Pekka Enberg

846.683

And actually, it probably wasn't us first pointing out the mobile aspect. It was just that we were going to do this feature. And then through our design partners, people were like, hey, this is perfect. I can use this for my LLM-powered application and all that stuff. But you could also see the trend of large language models basically splitting into two different directions.

Code Story

S10 Turso Update with Pekka Enberg

866.695

You have the really large models. So Llama 3.0-something just got released, and it's one of the biggest open source models out there. But you also have the smaller ones, which actually fit on devices. We could see Apple, for example, doing some research in that area. I don't know if they'll roll it out.

Code Story

S10 Turso Update with Pekka Enberg

884.745

But basically, you could see that if you have powerful retrieval-augmented generation, so you have this ability to search for data, then you can probably get pretty far with a simpler model. So for us, it started to make sense that, okay, this is something that probably is useful in the mobile space.