Code Story

S10 Turso Update with Pekka Enberg

Thu, 12 Sep 2024

Description

Hello listeners. Today, I have an incredible follow-up episode from our friends at Turso. You may remember our episode with Glauber Costa in Season 8, where he told us the creation story of the platform. Today, I'm speaking with his co-founder, Pekka, to hear the update on Turso and what the team has been building over the past year.

Now with Turso, you can not only have embedded replicas on your device or browser, with multi-tenancy and syncing to Turso's edge network, but the tool now also powers vector search on the device itself, leading to natively serverless, low-latency SQLite production loads. Turso continues to push the envelope with their product and to expand use cases for developers.

If you would like to learn more about Turso, go to turso.tech. If you'd like to learn more specifically about vector search, go to turso.tech/vector.

Sponsors
Speakeasy

Links
https://turso.tech/
https://turso.tech/vector
https://codestory.co/podcast/bonus-glauber-costa-turso/
https://codestory.co/podcast/bonus-dor-laor-scylladb/

Our Sponsors:
* Check out Kinsta: https://kinsta.com
* Check out Vanta: https://vanta.com/CODESTORY

Support this podcast at: https://redcircle.com/code-story/donations
Advertising Inquiries: https://redcircle.com/brands
Privacy & Opt-Out: https://redcircle.com/privacy

Transcription

0.523 - 25.09 Noah Labhart

This episode is sponsored by Speakeasy. Grow your API user adoption and improve engineering velocity with friction-free integration experiences. With Speakeasy's platform, you can now automatically generate SDKs in 10 languages and Terraform providers in minutes. Visit speakeasy.com slash codestory and generate your first SDK for free. This message is sponsored by QA Wolf.

25.41 - 49.147 Noah Labhart

QA Wolf gets engineering teams to 80% automated end-to-end test coverage and helps them ship five times faster by reducing QA cycles from hours to minutes. With over 100 five-star reviews on G2 and customer testimonials from SalesLoft, Drata, and Autotrader, you're in good hands. Join the Wolf Pack at qawolf.com. Hello, listeners.

49.447 - 69.132 Noah Labhart

Today, I have an incredible follow-up episode from our friends at Turso. You may remember our episode with Glauber Costa in Season 8, where he told us the creation story of the platform. Today, I'm speaking with his co-founder, Pekka, to hear the update on Turso and what the team has been building over the past year. Have a listen.

73.151 - 81.937 Noah Labhart

Well, today I have another special guest on the Code Story podcast, Pekka Enberg of Turso. Pekka, thank you for being on the show today. Thank you.

81.977 - 82.737 Pekka Enberg

Thank you for having me.

83.338 - 99.809 Noah Labhart

Absolutely. You know, recently we had your partner Glauber on the podcast to tell us about the creation, the inception story of Turso. But we're going to dive into a bit of an update of the things you've been working on. But before we do, tell me a little bit about you.

104.252 - 124.385 Pekka Enberg

My story is tied with Glauber's as well. We both worked on the Linux kernel. It feels like it was yesterday, but more than a decade ago, we were working with the Linux kernel. That's when we met. We joined a company to do an operating systems product, which then pivoted to something completely different, a database product. So me and Glauber worked at ScyllaDB.

124.666 - 141.605 Pekka Enberg

I think you've had their founder here as a guest as well. I also did real backend programming before switching to working on databases themselves. So Java was a hot, cool technology at the time. That's the general introduction I usually give.

142.446 - 154.249 Noah Labhart

Remind my audience what Turso is. So you gave the high level, that you're a founder of Turso and bringing more loads to SQLite. But give me and the audience a bit of a reminder of what Turso really is.

154.969 - 175.938 Pekka Enberg

The tagline is SQLite for production. And basically what we're trying to do is bring those capabilities to SQLite. And for the audience that doesn't know what SQLite is, SQLite is an embedded database. It basically runs everywhere already, on embedded systems. Like the claim is that it's the most deployed database on the planet, which is probably true.

176.778 - 193.91 Pekka Enberg

But if you look at more modern workloads, people doing serverless, even on mobile to some extent, there are things that are just missing from SQLite. And actually, I don't know how much you went through this with Glauber on the inception story, but Turso actually was a company doing something completely different in the beginning.

194.07 - 211.057 Pekka Enberg

And we were heavy users of SQLite for a local development toolkit, essentially. And we wanted to do something like that. The thing was to build something like a managed service, essentially. And we were always thinking that we're going to get a proper SQL database to do it in production.

211.097 - 227.55 Pekka Enberg

We quickly discovered that SQLite itself is pretty agile, worked really well, but it was just missing some features like replication, which we needed at the time. So that's the thing that we do. We built those features into SQLite to make it really awesome for modern production workloads.

228.663 - 237.167 Noah Labhart

Tell me about some of the success stories of the customers. What are they doing with Turso and how have they found success with bringing that into production?

237.768 - 255.917 Pekka Enberg

So actually really early on, one of the things which we focused on is this ability to bring the database close to the user. For people doing web apps where latency is essential for a great customer experience, that is something that really resonates.

256.438 - 275.247 Pekka Enberg

And if you combine it with something like Cloudflare Workers, for example, you can actually cut down a lot of the sort of overall latency, just have a better response time, essentially. But we also, something that we did, which is maybe slightly counterintuitive, SQLite is an in-process database.

275.267 - 294.819 Pekka Enberg

So it's a library that you put into your application and now the database is in the same sort of, it's in your application. But we also added a mode where you can access SQLite remotely. So turning it into something similar to Postgres. But the thing with SQLite is it's so lightweight that you can get lots and lots of databases.

294.839 - 307.368 Pekka Enberg

So one interesting thing that people are using Turso for is essentially having a database per tenant. So a database-per-user architecture, for example, which is great for SaaS applications and things like that.
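Editor's sketch: the database-per-tenant idea Pekka describes can be illustrated with plain SQLite, since each database is just a lightweight file. This is a minimal sketch using Python's standard `sqlite3` module, not Turso's API; the `notes` table, the file naming scheme, and all function names are assumptions made up for illustration.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def tenant_db_path(root: Path, tenant_id: str) -> Path:
    """Map a tenant ID to its own SQLite file (hypothetical naming scheme)."""
    # Keep only safe characters so a tenant ID cannot point outside the root directory.
    safe = "".join(ch for ch in tenant_id if ch.isalnum() or ch in "-_")
    if not safe:
        raise ValueError("tenant_id has no usable characters")
    return root / f"{safe}.db"


def open_tenant_db(root: Path, tenant_id: str) -> sqlite3.Connection:
    """Open the tenant's database, creating the file and schema on first use."""
    conn = sqlite3.connect(tenant_db_path(root, tenant_id))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def add_note(root: Path, tenant_id: str, body: str) -> None:
    """Insert a row into one tenant's database only."""
    with closing(open_tenant_db(root, tenant_id)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))


def list_notes(root: Path, tenant_id: str) -> list[str]:
    """Read rows back; other tenants' data lives in different files entirely."""
    with closing(open_tenant_db(root, tenant_id)) as conn:
        rows = conn.execute("SELECT body FROM notes ORDER BY id").fetchall()
    return [body for (body,) in rows]
```

Because every tenant gets a separate file, a query for one tenant has no way to return another tenant's rows, which is the isolation property discussed later in the episode.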

308.447 - 319.871 Noah Labhart

Okay, so tell me the update. We talked about a few things before recording the episode, but I'm curious about what has gone on with Turso. What have you built? What have you shipped to the world? And what have been the big changes with the product?

320.984 - 339.998 Pekka Enberg

Two things to update on. And maybe the first one is actually, let's say, more boring incremental part. Because actually, when we first started to work on Turso, as I mentioned, like this edge part was a key thing for us. So even if it's infrastructure, you can't really be that agile in infrastructure space.

340.118 - 361.78 Pekka Enberg

But you can still find ways to do proofs of concept and really aggressively validate and work with them. Around the time Glauber was on this podcast, we had just announced something called Embedded Replicas, and we were actually working on the multi-tenancy thing that I mentioned. And those were actually just things that we already had on the roadmap since after the first six months.

361.98 - 382.096 Pekka Enberg

And it's just basically, once you start to work on it, it's a different way of using the database. Like you can have the database inside your application, but you can still have replication. So you can have durability by basically offloading to the cloud, but also this multi-tenancy thing and all the sort of schema migration and all that stuff. And it is still ongoing work.

382.116 - 403.881 Pekka Enberg

So there's lots and lots of work to do to get that into sort of production shape. So that's the sort of more boring incremental part. But the really interesting thing that we recently did, recently released, and it's actually not GA yet, is basically bringing a capability, a new capability to SQLite. So vector search. Probably people are already getting a little bit bored with all the...

404.441 - 425.63 Pekka Enberg

LLM and AI stuff. But basically, that's a super interesting thing that happened over the past 18 months, like you have a completely new type of workload, new types of applications, which need this capability. And that's something that I personally find super exciting, because this was the first time we really had to dig deep and change a core part of SQLite. Very cool.

425.79 - 443.406 Noah Labhart

Okay, so I want to dig into the boring part before I get into the not boring part with the vector search. So from an engineering standpoint, I couldn't imagine multi-tenancy and all the things you're working on or worked on there was boring. But tell me why that's important. Why is it important for multi-tenancy for your customers?

444.275 - 459.104 Pekka Enberg

There are many benefits to, as I mentioned, the database per tenant model. So if you zoom out a little bit, what typically people have to do when they're building whatever application, you take a big database and you start with that.

459.204 - 480.415 Pekka Enberg

All your user data is in the same database, all the product catalogs, if you're doing an online store or things like that, they're all shared in this one database. And then problems pop up. As you start to scale, you need to start thinking about sharding and all of those things. But maybe more importantly, it's really hard to keep the data isolated.

480.455 - 497.666 Pekka Enberg

So that's one of the reasons why the multi-tenancy story essentially resonates so well, because now you can have these databases which are isolated from each other. So just imagine managing your user data or even some more confidential data. And it really extends all the way from backend to the mobile as well, right?

497.686 - 513.423 Pekka Enberg

So you can imagine having your application data essentially sharded per user and having that replicated in the mobile device, for example. So it's partly about scaling, but it's also about privacy and data isolation.

514.244 - 533.23 Noah Labhart

This episode is sponsored by Speakeasy. Whether you're growing the user adoption of your public API or streamlining internal development, SDKs can turn the chore of API integration into effortless implementation. Unburden your API users from guessing their way around your API while keeping your team focused on your product.

533.57 - 554.036 Noah Labhart

Shorten the time to live integration and provide a delightful experience for your customers. With Speakeasy's platform, you can now automatically generate up-to-date, robust, idiomatic SDKs in 10 languages and Terraform providers in just a matter of minutes. SDKs are feature-rich with type safety, auto-retries, and pagination.

554.536 - 581.847 Noah Labhart

Everything you need to give your API the developer experience it deserves. Deliver a premium API experience without the premium price tag. Visit speakeasy.com slash codestory to get started and generate your first SDK for free. This message is sponsored by SnapTrade. Link end-user brokerage accounts and build world-class investing experiences with SnapTrade's unified brokerage API.

582.308 - 605.432 Noah Labhart

With over $12 billion in connected assets and over 300,000 connected accounts, SnapTrade's API quality and developer experience are second to none. SnapTrade is SOC 2 certified and uses industry-leading security practices. Developers can use the company's official client SDKs to build investing experiences in minutes without the limitations of traditional aggregators.

605.972 - 630.115 Noah Labhart

Get started for free today by visiting snaptrade.com slash codestory. Certainly. Okay, that makes sense. That gives me a good idea there of what people would use that for. Okay, so tell me about vector search. So people are, you know, are getting, you know, bored, sure, with the LLM stuff, because it's all in the buzz and things, but it's super useful. And it's really valuable.

630.495 - 634.418 Noah Labhart

Give me some use cases of the vector search. What drove you to that?

634.938 - 656.439 Pekka Enberg

When there was this first wave of ChatGPT 3.5, I think that was the sort of, at least for me, the sort of turning point. Then all of a sudden we were in this situation that everybody wanted to apply these large language models to their applications. And basically the models themselves are super useful, but there's this problem called hallucination because they just make up stuff.

656.459 - 676.982 Pekka Enberg

So these large language models essentially are limited to whatever they saw during training. And these things get trained by reading essentially through the whole internet. But there's always a cutoff date, right? So you train it and then after that, it doesn't really know about the new things that appear. But also for enterprises, these models don't really know your company-specific data.

677.222 - 695.901 Pekka Enberg

And that's why people came up with this retrieval augmented generation, which is essentially just retrieving data for the model. And this is where the vector search part comes in. Imagine an interface where you have a customer typing a question. So the way it essentially works is that you take that question, you run it through a

697.122 - 719.309 Pekka Enberg

large language model, generate an embedding, which is a vector. And then you use this vector or this embedding to find relevant information. And that relevant information is found through vector search, which is managed in some database. For me, the really interesting thing is that initially what happened was that there was this explosion of different special-purpose databases, vector databases.
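Editor's sketch: the retrieval step described here (embed the question, find the closest stored items, hand them to the model as context) can be shown in a few lines. The `toy_embed` function below is a deliberately crude stand-in for a real embedding model, which would return dense vectors of floats; it, the prompt wording, and all other names are assumptions for illustration only.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a sparse bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (1.0 means same direction)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Vector search by brute force: rank every document against the question."""
    query_vec = toy_embed(question)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, toy_embed(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, documents: list[str], k: int = 2) -> str:
    """Retrieval augmented generation: put the retrieved text in front of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

A production system would replace the brute-force ranking with an index inside the database, which is the part the conversation turns to next.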

719.589 - 742.309 Pekka Enberg

At some point they were embedding databases, but then I think everybody's converged on vector databases. And these are special-purpose things to do just the retrieval part. But quickly people also discovered that, hey, we still have this traditional data that we want to access, but also lots of different databases and data sources. So like, how can we simplify this thing?

743.07 - 763.067 Pekka Enberg

And then you had Postgres adding this extension and so forth. But with SQLite, what is really interesting is because it is such a lightweight thing and you can run it in mobile devices, for example, when you actually bring this vector capability to SQLite, you can do all of this model-related processing and all of that searching within the device itself.

763.127 - 780.515 Pekka Enberg

And you can imagine you have the latency advantage, but also increasingly people are super interested in the sort of privacy aspect, right? Because now you can have the private information on the device. It doesn't necessarily have to leave the device. So I think that's the cool part in using the old traditional SQL database and then vector search.
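Editor's sketch: to make "traditional SQL data and vector search in one local database" concrete, the example below keeps embeddings in a BLOB column next to ordinary columns and scores them in application code. It uses only Python's standard `sqlite3` module and is not Turso's native vector support, which has its own SQL-level features not shown here; the schema, the `private` flag, and the function names are all assumptions.

```python
import math
import sqlite3
import struct


def pack_vector(vec: list[float]) -> bytes:
    """Serialize a vector as little-endian float32 values for a BLOB column."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob: bytes) -> list[float]:
    """Inverse of pack_vector (4 bytes per float32 component)."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def cosine_distance(a: list[float], b: list[float]) -> float:
    """0.0 for vectors pointing the same way, larger for less similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
        "private INTEGER NOT NULL DEFAULT 0, embedding BLOB NOT NULL)"
    )


def add_note(conn: sqlite3.Connection, title: str, vec: list[float], private: bool = False) -> None:
    conn.execute(
        "INSERT INTO notes (title, private, embedding) VALUES (?, ?, ?)",
        (title, int(private), pack_vector(vec)),
    )
    conn.commit()


def nearest_notes(
    conn: sqlite3.Connection, query_vec: list[float], k: int = 3, include_private: bool = True
) -> list[str]:
    """Filter with ordinary SQL, then rank the surviving rows by vector distance."""
    rows = conn.execute(
        "SELECT title, embedding FROM notes WHERE private <= ?",
        (1 if include_private else 0,),
    )
    scored = [(cosine_distance(query_vec, unpack_vector(blob)), title) for title, blob in rows]
    scored.sort()
    return [title for _, title in scored[:k]]
```

Everything here runs in-process against a local file or in-memory database, so the stored text and vectors never have to leave the device, which is the privacy point made above. A full scan like this is only reasonable for small tables; a real deployment would rely on an index.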

781.175 - 801.712 Noah Labhart

Yeah, wow. So that's really interesting. You can pull data essentially from within the client itself, whether it be mobile device, and we're talking about mobile devices, eventually, maybe it could be browsers or something like that. But a client, you don't have to go to a server itself to do the heavy computing. How did you figure that out? Because that's not a small thing.

802.172 - 809.318 Noah Labhart

Usually, you're reliant on the power of the device, right? So if you're using a weak device, it's not going to work very well. Tell me about that.

810.183 - 825.412 Pekka Enberg

Zooming out a little bit and going back to the large language model. So like for us, we actually initially didn't do anything. When the first vector databases came out, we decided, okay, we don't really understand this space. We're just going to wait it out and see what happens in the market. But then

826.152 - 846.302 Pekka Enberg

Six months later, you could see the Postgres community, for example, really stepping up and doing this. And then we started thinking, this kind of becomes like an existential thing. It's one workload, but it still becomes like a super important workload. So what can we do? And then we basically just started exploring how we could implement it.

846.683 - 866.655 Pekka Enberg

And actually, it probably wasn't us first pointing out the mobile aspect. It was just that we're going to do this feature. And then through our design partners, people were like, hey, this is perfect. Like I can use this for my LLM-powered application and all that stuff. But also, you could see the trend of large language models basically splitting into two different directions.

866.695 - 884.725 Pekka Enberg

You have the really large models. So Llama 3.0-something just got released. And it's one of the biggest open source models out there. But you also have the smaller ones, which actually fit on devices. We could see Apple, for example, doing some research in that area. I don't know if they rolled it out.

884.745 - 900.629 Pekka Enberg

But basically, you could see that if you have a powerful retrieval augmented generation, so you have this ability to search for data, then you can probably get pretty far with the simpler model. So for us, it started to make sense that, okay, this is something that probably is useful in the mobile space.

901.297 - 928.759 Noah Labhart

No doubt. It's a really smart way to solve that problem, right, of the local replica, the being able to vector search on the device while still, you know, or eventually syncing, right, to your edge network, to your remote databases. This message is sponsored by QA Wolf. If slow QA processes bottleneck your software engineering team and you're releasing slower because of it, you need a solution.

929.24 - 952.12 Noah Labhart

You need QA Wolf. QA Wolf gets engineering teams to 80% automated end-to-end test coverage and helps them ship five times faster by reducing QA cycles from hours to minutes. With over 100 five-star reviews on G2 and customer testimonials from SalesLoft, Drata, Autotrader, and many more, you're in good hands. Ready to ship faster with fewer bugs?

952.721 - 979.05 Noah Labhart

Join the Wolfpack at QAwolf.com to see if they can help you squash the QA bottleneck. This message is sponsored by SnapTrade. Link end-user brokerage accounts and build world-class investing experiences with SnapTrade's unified brokerage API. With over $12 billion in connected assets and over 300,000 connected accounts, SnapTrade's API quality and developer experience are second to none.

979.83 - 1005.77 Noah Labhart

SnapTrade is SOC 2 certified and uses industry-leading security practices. Developers can use the company's official client SDKs to build investing experiences in minutes without the limitations of traditional aggregators. Get started for free today by visiting snaptrade.com slash codestory. So then, so you've got Vector Search, right? You already have an amazing SQLite product.

1005.83 - 1017.8 Noah Labhart

Now you have Vector Search. You're fueling this local replica process. Where do you take this next? This is already an incredibly powerful platform, but where do you take this next? Where do you see this going?

1018.26 - 1036.133 Pekka Enberg

There's basically two main things. In the mobile space, you basically have two platforms, right? So you have iOS and Android, but then you also have things like React Native and all that stuff. So it's actually pretty fragmented. So one of the main things we're focusing on is basically making sure that the

1036.713 - 1054.82 Pekka Enberg

SDKs for those platforms are top notch, things where you really just get the developer experience. Because for us, it's always this combination of trying to get the best possible developer experience, but combining that with robust infrastructure. So that's one thing, which is actually a surprisingly big investment.

1055.1 - 1079.407 Pekka Enberg

You really have to go and do the work for every fragment of the ecosystem separately. So that's one. The other part is basically, as you also hinted towards, even if you do this stuff on mobile or in the browser, you want to offload to the cloud. And a big part of what we build is the stuff that basically runs on the cloud and getting all of that right and doing the scaling there.

1079.627 - 1092.956 Pekka Enberg

Disaggregated storage is something that is super interesting. And basically, we're keeping SQLite in the client, but we're also doing this server-side SQLite in a sense, which is also a big part of what we do.

1093.951 - 1112.232 Noah Labhart

Awesome. I appreciate that. I want to dive into the SDK portion because you're right. It's super fragmented. There's all kinds of things that you would need to support. How do you choose what is the most important SDK to go make sure is perfect? And how do you go about deciding if you're going to add new ones?

1113.233 - 1128.707 Pekka Enberg

That's a great question. We struggled with that as well a little bit. So as I mentioned, two main platforms, and then you have React Native, you have Flutter, and other things as well. But I think those are the four main ones to consider.

1129.688 - 1152.163 Pekka Enberg

For us, actually starting with React Native just turned out to be pretty natural because our JavaScript SDK for the server side thing is something that is by far the most popular one. And unsurprisingly, JavaScript is such a huge ecosystem. That is the main one we want to tackle first because there's this upside, of course, that you get this portability between the platforms.

1152.776 - 1174.862 Pekka Enberg

But I think down the road, the reality of things is that, anyway, it really depends on the types of applications. But if you look at the big ones, they will go native. So it just becomes: React Native, we can start with that. But then you probably have to just do Android and iOS both. You can't really make a decision to go with just one of them.

1175.302 - 1186.338 Pekka Enberg

Then the rest of it, Flutter and frameworks like that, I think for us, it's just going to be demand-based. We're going to see if we have enough people basically wanting to do something beyond those three.

1187.202 - 1211.515 Noah Labhart

Makes sense, totally makes sense. And it is a struggle. There's so much out there. Okay, so I'm a developer. I'm building this really cool product and I wanna be able to use SQLite. I wanna be able to use something like Turso where I can remotely update my databases, but on the device itself, I want to be able to do the vector search. I wanna be able to have the embedded replica.

1211.635 - 1219.643 Noah Labhart

I wanna essentially be able to process data in a fast way on my local device. How do I get started using Turso? What do I need to do?

1219.984 - 1244.937 Pekka Enberg

So right now, unfortunately, it's not production grade yet. So you just come to our Discord and ask for the beta for React Native client essentially. That's basically just the starting point. The way we're integrating is just trying to find existing open source ecosystem libraries, for example, and integrating that. So the one we use today is a library called op-sqlite.

1245.318 - 1267.038 Pekka Enberg

And basically, it will have Turso support out of the box. And basically, then you sign up to our service and get your databases on the cloud. And off you go. Because once you have a managed database on the cloud, then it's just a configuration thing on the client to connect to your remotely managed database, and it will do all the sync and all those things.

1267.399 - 1279.449 Noah Labhart

Fantastic. Well, Pekka, I really appreciate you being on the show today and giving the update on Turso. You've done some really amazing things with the things that you're calling, quote unquote, boring with multi-tenancy and embedded replicas.

1279.589 - 1295.544 Noah Labhart

But the vector search capabilities that are on device and being able to power that fast data retrieval and remote syncing with your edge network is really, really fantastic and fascinating. Really appreciate you being on the show, giving the update. Cool. Thank you for having me. Incredible.

1295.884 - 1317.054 Noah Labhart

Now with Turso, you can not only have embedded replicas on your device with multi-tenancy and syncing to Turso's edge network, but now the tool powers vector search from the device itself, leading to natively serverless, low-latency SQLite production loads. Turso continues to push the envelope forward with their product and to expand use cases for developers.

1317.834 - 1331.024 Noah Labhart

If you'd like to learn more about Turso, go to turso.tech. If you'd like to learn more specifically about Vector Search, go to turso.tech/vector or sign up for their Discord. And thanks again for listening.
