Pekka Enberg
Appearances
Code Story
S10 Turso Update with Pekka Enberg
There are basically two main things. In the mobile space, you basically have two platforms, right? So you have iOS and Android, but then you also have things like React Native and all that stuff. So it's actually pretty fragmented. So one of the main things we're focusing on is basically making sure that
our SDKs for those platforms are top-notch and that you really get a good developer experience. Because for us, it's always this combination of trying to get the best possible developer experience and combining that with robust infrastructure. So that's one thing, which is actually a surprisingly big investment.
My story is tied with Glauber's as well. We both worked on the Linux kernel. It feels like it was yesterday, but more than a decade ago, we were working on the Linux kernel. That's when we met. We joined a company to do an operating systems product, which then pivoted to something completely different, a database product. So Glauber and I worked at ScyllaDB.
You really have to go and do the work for every fragment of the ecosystem separately. So that's one. The other part is basically, as you also hinted towards, even if you do this stuff on mobile or in the browser, you want to offload to the cloud. And a big part of what we build is the stuff that basically runs on the cloud and getting all of that right and doing the scaling there.
Disaggregated storage is something that is super interesting. And basically, we're keeping SQLite in the client, but we're also doing this server-side SQLite in a sense, which is also a big part of what we do.
That's a great question. We struggled with that as well a little bit. So as I mentioned, two main platforms, and then you have React Native, you have Flutter, and other things as well. But I think those are the four main ones to consider.
For us, actually starting with React Native just turned out to be pretty natural because our JavaScript SDK for the server side thing is something that is by far the most popular one. And unsurprisingly, JavaScript is such a huge ecosystem. That is the main one we want to tackle first because there's this upside, of course, that you get this portability between the platforms.
But I think down the road, the reality is that it really depends on the types of applications. If you look at the big ones, they will go native. So React Native is something we can start with, but then you probably have to do both Android and iOS. You can't really decide to go with just one of them.
Then the rest of it, Flutter and frameworks like that, I think for us, it's just going to be demand-based. We're going to see if we have enough people basically wanting to do something beyond those three.
So right now, unfortunately, it's not production grade yet. So you just come to our Discord and ask for the beta of the React Native client, essentially. That's basically just the starting point. The way we're integrating is by finding existing open source libraries in the ecosystem and integrating with them. So the one we use today is a library called op-sqlite.
I think you've had their founder here as a guest as well. I also did real backend programming before switching to working on databases themselves. Java was a hot technology at the time. That's the general introduction I usually give.
And basically, it will have Turso support out of the box. And basically, then you sign up to our service and get your databases on the cloud. And off you go. Because once you have a managed database on the cloud, then it's just a configuration thing on the client to connect to your remotely managed database, and it will do all the sync and all those things.
The tagline is SQLite for production. And basically what we're trying to do is bring those capabilities to SQLite. And for the audience that doesn't know what SQLite is, SQLite is an embedded database. It basically already runs everywhere on embedded systems. The claim is that it's the most deployed database on the planet, which is probably true.
But if you look at more modern workloads, people doing serverless, even on mobile to some extent, there are things that are just missing from SQLite. And actually, I don't know how much you went through this with Glauber on the inception story, but Turso actually was a company doing something completely different in the beginning.
And we were heavy users of SQLite for a local development toolkit, essentially. And we wanted to do something like that. The thing was to build something like a managed service, essentially. And we were always thinking that we're going to get a proper SQL database to do it in production.
We quickly discovered that SQLite itself is pretty agile and worked really well, but it was just missing some features like replication, which we needed at the time. So that's the thing that we do. We built those features into SQLite to make it really awesome for modern production workloads.
So actually really early on, one of the things we focused on is this ability to bring the database close to the user. For people doing web apps where latency is essential for a great customer experience, that is something that really resonates.
And if you combine it with something like Cloudflare Workers, for example, you can actually cut down a lot of the overall latency and just have a better response time, essentially. But we also did something that is maybe slightly counterintuitive. SQLite is an in-process database.
So it's a library that you put into your application, and now the database is in your application. But we also added a mode where you can access SQLite remotely, turning it into something similar to Postgres. But the thing with SQLite is that it's so lightweight that you can have lots and lots of databases.
So one interesting thing that people are using Turso for is essentially having a database per tenant. So a database-per-user architecture, for example, which is great for SaaS applications and things like that.
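The database-per-tenant idea described above can be sketched in a few lines. This is only an illustration using Python's standard `sqlite3` module, not Turso's SDK: the `TenantDatabases` class, the `notes` table, and the helper functions are hypothetical names, and in-memory databases stand in for what would be one file or one managed database per tenant.

```python
import sqlite3


class TenantDatabases:
    """Hypothetical registry holding one independent SQLite database per tenant."""

    def __init__(self):
        self._connections = {}

    def get(self, tenant_id):
        """Return the tenant's own database, creating it on first use."""
        if tenant_id not in self._connections:
            # Each ":memory:" connection is a separate, isolated database.
            conn = sqlite3.connect(":memory:")
            conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
            self._connections[tenant_id] = conn
        return self._connections[tenant_id]


def add_note(dbs, tenant_id, body):
    """Write a row into the given tenant's database only."""
    conn = dbs.get(tenant_id)
    conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    conn.commit()


def list_notes(dbs, tenant_id):
    """Read rows from the given tenant's database only."""
    rows = dbs.get(tenant_id).execute("SELECT body FROM notes ORDER BY id").fetchall()
    return [body for (body,) in rows]
```

Because every tenant has its own connection and its own tables, a query for one tenant cannot see another tenant's rows, which is the isolation property the model relies on.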
Two things to update on. And maybe the first one is actually, let's say, the more boring incremental part. Because actually, when we first started to work on Turso, as I mentioned, this edge part was a key thing for us. So even if it's infrastructure, you can't really be that agile in the infrastructure space.
But you can still find ways to do proofs of concept and really aggressively validate them. Around the time Glauber was on this podcast, we had just announced something called Embedded Replicas, and we were actually working on the multi-tenancy thing that I mentioned. And those were actually just things that we already had on the roadmap since after the first six months.
And it's just basically, once you start to work on it, it's a different way of using the database. You can have the database inside your application, but you can still have replication. So you can have durability by basically offloading to the cloud, but also this multi-tenancy thing and all the schema migration and all that stuff. And it is still ongoing work.
So there's lots and lots of work to do to get that into sort of production shape. So that's the sort of more boring incremental part. But the really interesting thing that we recently did, recently released, and it's actually not GA yet, is basically bringing a capability, a new capability to SQLite. So vector search. Probably people are already getting a little bit bored with all the...
LLM and AI stuff. But basically, that's a super interesting thing that happened over the past 18 months: you have a completely new type of workload, new types of applications, which need this capability. And that's something that I personally find super exciting, because this was the first time we really had to dig deep and change a core part of SQLite. Very cool.
There are many benefits to, as I mentioned, the database per tenant model. So if you zoom out a little bit, what typically people have to do when they're building whatever application, you take a big database and you start with that.
All your user data is in the same database, all the product catalogs, if you're doing an online store or things like that, they all share this one database. And then problems pop up. As you start to scale, you need to start thinking about sharding and all of those things. But maybe more importantly, it's really hard to keep the data isolated.
So that's one of the reasons why the multi-tenancy story resonates so well, because now you can have these databases which are isolated from each other. So just imagine managing your user data or even some more confidential data. And it really extends all the way from the backend to mobile as well, right?
So you can imagine having your application data essentially sharded per user and having that replicated in the mobile device, for example. So it's partly about scaling, but it's also about privacy and data isolation.
When there was this first wave of ChatGPT 3.5, I think that was the sort of, at least for me, the sort of turning point. Then all of a sudden we were in this situation that everybody wanted to apply these large language models to their applications. And basically the models themselves are super useful, but there's this problem called hallucination because they just make up stuff.
So these large language models are essentially limited to whatever they saw during training. And these things get trained by reading essentially through the whole internet. But there's always a cutoff date, right? So you train it, and after that, it doesn't really know about the new things that appear. But also for enterprises, these models don't really know your company-specific data.
And that's why people came up with this retrieval-augmented generation, which is essentially just retrieving data for the model. And this is where the vector search part comes in. Imagine an interface where you have a customer typing a question. So the way it essentially works is that you take that question, you run it through
a large language model, generate an embedding, which is a vector. And then you use this vector, or this embedding, to find relevant information. And that relevant information is found through vector search, which is managed in some database. For me, the really interesting thing is that initially what happened was that there was this explosion of different special-purpose databases, vector databases.
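The retrieval step described above (embed the question, then find the closest stored items) can be sketched as follows. This is a toy illustration under stated assumptions: the `embed` function is a hypothetical stand-in that counts words from a small fixed vocabulary, where a real system would call an embedding model, and `VOCAB`, `cosine_similarity`, and `retrieve` are illustrative names.

```python
import math

# Assumed toy vocabulary; a real embedding model learns its own dimensions.
VOCAB = ["refund", "shipping", "password", "reset", "order", "delivery"]


def embed(text):
    """Toy stand-in for an embedding model: word counts over VOCAB."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, or 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents whose embeddings are most similar to the question's."""
    query_vector = embed(question)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:k]
```

In a retrieval-augmented setup, the documents returned by `retrieve` would then be passed to the language model as context alongside the customer's question.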
At some point they were called embedding databases, but then I think everybody converged on vector databases. And these are special-purpose things that do just the retrieval part. But quickly people also discovered that, hey, we still have this traditional data that we want to access, but also lots of different databases and data sources. So how can we simplify this thing?
And then you had Postgres adding this extension and so forth. But with SQLite, what is really interesting is that because it is such a lightweight thing and you can run it on mobile devices, for example, when you actually bring this vector capability to SQLite, you can do all of this model-related processing and all of that searching within the device itself.
And you can imagine you have the latency advantage, but also increasingly people are super interested in the sort of privacy aspect, right? Because now you can have the private information on the device. It doesn't necessarily have to leave the device. So I think that's the cool part in using the old traditional SQL database and then vector search.
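To make the on-device idea concrete, here is a minimal sketch of keeping vectors next to ordinary rows in a local SQLite database and searching them without any network call. It uses only Python's standard `sqlite3` module; the `cosine_distance` SQL function is registered by this sketch itself and is an assumption for illustration, not Turso's native vector search, and the `docs` table and helper names are hypothetical.

```python
import math
import sqlite3
import struct


def pack(vector):
    """Serialize a list of floats into a BLOB of 32-bit floats."""
    return struct.pack(f"{len(vector)}f", *vector)


def unpack(blob):
    """Deserialize a BLOB of 32-bit floats back into a list."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def _cosine_distance(blob_a, blob_b):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    a, b = unpack(blob_a), unpack(blob_b)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def open_local_db():
    """Create a local database with a table that stores text and its embedding."""
    conn = sqlite3.connect(":memory:")
    # Register the Python function so it can be called from SQL.
    conn.create_function("cosine_distance", 2, _cosine_distance)
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding BLOB)")
    return conn


def add_doc(conn, body, embedding):
    conn.execute("INSERT INTO docs (body, embedding) VALUES (?, ?)", (body, pack(embedding)))


def nearest(conn, query, k=1):
    """Brute-force nearest neighbours, ranked inside the SQL query."""
    rows = conn.execute(
        "SELECT body FROM docs ORDER BY cosine_distance(embedding, ?) LIMIT ?",
        (pack(query), k),
    ).fetchall()
    return [body for (body,) in rows]
```

This scans every row, so it only suits small local datasets; a database with built-in vector support would typically use an index instead. The point of the sketch is that both the data and the search stay in the local database file.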
Zooming out a little bit and going back to the large language models. So for us, we actually initially didn't do anything. When the first vector databases came out, we decided, okay, we don't really understand this space. We're just going to wait it out and see what happens in the market. But then
six months later, you could see the Postgres community, for example, really stepping up and doing this. And then we started thinking, this kind of becomes an existential thing. It's one workload, but it still becomes a super important workload. So what can we do? And then we basically just started exploring how we could implement it.
And actually, it probably wasn't us who first pointed out the mobile aspect. It was just that we were going to do this feature. And then through our design partners, people were like, hey, this is perfect, I can use this for my LLM-powered application and all that stuff. But you could also see the trend of large language models basically splitting into two different directions.
You have the really large models. So Llama 3.something just got released, and it's one of the biggest open source models out there. But you also have the smaller ones, which actually fit on devices. We could see Apple, for example, doing some research in that area. I don't know if they rolled it out.
But basically, you could see that if you have powerful retrieval-augmented generation, so you have this ability to search for data, then you can probably get pretty far with a simpler model. So for us, it started to make sense that, okay, this is something that is probably useful in the mobile space.