Ruby Rogues
Practical Observability: Logging, Tracing, and Metrics for Better Debugging - RUBY 656
Wed, 16 Oct 2024
Today, they dive deep into the world of observability in programming, particularly within Rails applications, with special guest John Gallagher. John openly shares his struggles with engineering challenges and the frustration of recurring issues in his company's customer account app. They explore a five-step process John has developed to tackle these problems and emphasize the critical role of defining use cases and focusing on relevant data for effective observability.

In this episode, they talk about the emotional journey of dealing with bugs, the importance of capturing every event within an app, and why metrics, logs, and tracing each play a unique role in debugging. They also touch on tools like Datadog, New Relic, and OpenTelemetry, discussing their practical applications and limitations. Valentino and John shed light on how structured logging, tracing, and the concept of high-cardinality attributes can transform debugging experiences, ultimately aiming for a more intuitive and robust approach to observability.

Join them as they delve into the nexus of frustration, learning, and technological solutions, offering valuable insights for every developer striving to improve their application's resilience and performance.

Socials
LinkedIn: John Gallagher

Become a supporter of this podcast: https://www.spreaker.com/podcast/ruby-rogues--6102073/support.
Hey, everybody. Welcome to another episode of the Ruby Rogues podcast. I am your host today, Valentino Stoll, and we are joined by a very special guest today, John Gallagher. John, can you introduce yourself and tell everybody a little bit about yourself and why we had you on today?
Sure. Thanks for having me on. My name is John Gallagher, and I'm a senior engineer at a company called BiggerPockets; we teach people how to invest in real estate, and we're based in the US. I also run my own business on the side called Joyful Programming, to introduce more joy to the world of programming. And I'm on today to talk a bit about observability, which is one of my many passions.
I'm a bit of a polymath. This is one of the things that is really, really important to me and I'm passionate about. I'm particularly passionate about introducing this into Rails apps. So thanks for having me on.
Yeah, and thank you for all the joy you're bringing to people, I hope. You definitely picked the right language. If you're not familiar with this podcast, Ruby is, for me personally, a very joyful experience. So it's very cool. I've loved digging into all of the observability talk that you have on Joyful Programming.
And it's a very important topic that I feel is definitely overlooked. If you're starting up, maybe you get some bug alerting or something like that in place as a standard, but anything performance-monitoring-wise is kind of an "oh no, something happened, let's look into it now." I feel like that's the typical flow of things as people start up. Do you want to just give us a high level: what is observability, and why should we care? We can drill into the details of it after.
Well, I don't actually think anybody should care about observability, and I don't care about observability as a thing, because it's just a means to an end. So what's the actual goal? It doesn't matter how you get there, but the goal is being able to, number one, understand your Rails app in production, and number two, be able to ask unusual questions.
Not questions that you've thought of a day, two days, three weeks ago, because that's not really very useful or interesting. If we knew exactly the questions to ask in the future of our apps, everything would be easy. Just be like, how many 200s have we had in the last week? It's kind of a boring question to ask. Maybe a bit useful. I find the more obvious the question, the less useful it is.
So observability is the practice of making a black-box system more transparent. I like to think of it this way: imagine your entire Rails app, all the hosting, everything to do with that app, is wrapped up in an opaque black box. And somebody says, how does it work, and why is this thing going wrong? You would have no hope of understanding it. Now imagine the box is completely transparent
and you can see everything, which of course is completely impossible in software. But in theory, you'd have this completely transparent box, and you could ask all these questions and get instant answers. That's like 100% observability. And of course, that is absolutely impossible.
And so what we're trying to do with observability is understand what is going on, not just when it goes wrong, although that's the obvious use case. When we have an incident, that's the most critical point where observability comes into play, and it's the exact scenario I landed in two weeks into a new role. So it was two weeks in, the site had gone down,
I am in the UK, and the rest of my team were in the US, and there were two other engineers in my time zone. And all of us had been at the company for a total of five weeks. So we've got this app. It's down. It's on fire. And we need to put the fire out. And the three of us looked at each other. We were like, should we just restart the dynos? Yeah. So we restarted the dynos.
We crossed our fingers. And it was pure luck that the app came back up. That is the exact opposite of what we want. And we've now moved to a situation where we can ask our app a whole load of very unusual questions. And we will get an answer to that. Why are there a peak of 404s on iOS at 3 a.m.? Looks like a lot of them are coming from this IP address.
Okay, what's that IP address doing on the site? Okay, interesting. How many users are using that IP address? Five. So only five people are using it. So that's the point of observability to me, to be able to ask unusual questions that you haven't thought of already,
dynamically, explore the space, and come to some conclusions.

Yeah, I think that's a great overview. And your debugging story reminded me of the lucky experience I had running Rails on Ruby 1.8.7, where every once in a while you just had to give the server a little kick because it started to grow in memory size. Giving it a quick little flush reset things. And you're just like, oh, I guess that's how we're going to do it until we can get some insight into what's happening, right? And I think that definitely underlines the importance of observability in general.
Like, you know, how do you get those insights to begin with? Yeah. And maybe that's a great starting point: where do you start looking at adding these insights, right? Is there a modular approach you could take, or is it more of a you-should-do-everything-all-at-once kind of thing?
You should definitely not look at doing everything all at once. As I think we can all agree in software, doing everything all at once is a recipe for disaster, no matter what you're doing.
There's no vendor you can just pay money to and get 100% observability.
There are vendors that tell you that you can do that. Whether you actually can or not is a different matter. Spoiler alert: you can't. So I just want to back up a little bit and talk about the feelings, because I think the feelings are where all of this starts for me.
So I got into observability, and it's funny, because for the first year or so of my journey doing this, I didn't even realize I was doing observability. I'd heard about this observability thing, and it was out there in the universe. Okay, maybe I should learn that. I should learn that. And I kept using "should": I should learn this. But I had loads of other stuff to do.
I've got loads of other things. I don't know what it is. I know it comes from control theory, and there's a Wikipedia page that's really complex and really confusing. Whatever. I've got real work to do. But what I know is that I kept coming across these bugs in Bugsnag, Sentry, Airbrake, choose your error-reporting tool. They all help you to a degree, but they're not a silver bullet.
And I kept coming across these defects over and over, and the story was exactly the same. Come across a defect, I'd see the stack trace in the error reporting tool, and I would look at it, and first emotion right out the gate, complete confusion. What is going on here? No idea. So I dig a little bit into the code. I dig a little bit into the stack trace. So it's coming from here.
And this thing is nil. Classic, right? This thing is nil. Where was it being passed in as nil? I don't know. So now I'm like, well, I can't just say I can't fix this. So I now have to, well, do what exactly? I don't have any information to go off. Well, I guess we'll do that bug later. Let's look at the next one. And this just kept happening.
And I would find myself going through all the bugs in the backlog and I couldn't fix any of them. And I just wasted four hours looking at things, asking questions that I couldn't explain, looking at things I didn't understand. And for years, I thought the problem was with me. I honestly thought I'm just not smart enough. I'm not a good engineer, blah, blah, blah, blah, blah.
Bug fixing just isn't really my thing. I'm just not really good at it. And then after many, many years of this, I was in a company, and I just got really sick of this. We just released a brand-new app, and it was a customer account app. And we were getting all these weird bug reports, people saying I can't log in, people saying I can't reset my password.
And every time we did this, we would add a little bit of ad hoc logging and then put the bug back in the backlog. And then it would come up again and come up again. And after a while, I was just like, this is ridiculous. We're highly paid engineers. There has to be a better way.
So then I started looking into it. We were using Kibana at the time, or rather, I should say, we were not using Kibana at the time. Kibana was there; we were paying for it. And I was like, I've heard this is something to do with logging. So where do we do our logging? People were like, Kibana. I have no idea what this even is. Let's open it up. And there was just all of this trash, all of this rubbish.
I was like, what's this? How's this supposed to be useful? People are like, oh, we don't really look at that. It's not very useful. I said, so how do you figure out bugs? And they're like, well, we just, we just figure it out. Well, yes, but we're not figuring it out. So all of this was born through frustration.
And so what I did back then is what I recommend everybody does now, to answer your question. Come back to the point, John. Yeah. Which is: take a question that you wish you knew the answer to. A very specific question, not "why is our app not performing as we want?" A very, very specific question. So take your big, big question. And at the time, this was:
Why are people being locked out of the app? Why can they not reset their password? They're clicking on this password link and they're saying it's expired or it goes nowhere or it doesn't work. Okay. Why are those people, like, why is that happening? So that's quite a general question, and you want to break it down into some hypotheses. So that's the first thing.
I have a five-step process, and this is step one. I'll go through the five-step process in a minute. So step one is think of a specific question. So a specific question in this case might be, Okay, I've got one customer here. There's many, many different types of defects. So this one customer here is saying it was expired. I went to the webpage and the link said it had expired.
Okay, when did they click on that link? What response did the app give to them? And when did the token timeout? So those are three questions. Now they're not going to get us to the answer directly, but there are three questions, very specific questions that we can add instrumentation for. So I would take one of those questions. When did the token timeout? Great question.
So in order to do that, we need to know when the token was created and what the expiry of the token was. This is just a random example off the top of my head. So you'd be like, okay, well, we need to know the customer ID. We need to know the token. We don't actually need to know the exact token, but we need to know the customer ID.
We need to know the time that the token was created and the expiry time of that token. Is it 15 minutes? Is it two hours? Whatever. So I would then look into the code. So we've done step two. Step two is define the data that you want to collect. User ID, token expiry, and an event saying the token has been created now for this user ID. Okay, so that's the second step.
The third step is build the instrumentation to do that. So whatever you have to do, maybe it's you need to actually add structured logging to your entire app. I don't know. Maybe it's that you've got the structured logging fine, but there's nothing listening to it. Maybe. Maybe the tool just can't actually measure what you want it to measure.
So maybe you need to invest in a new tool, whatever it is. And then you build some code to instrument just that very small piece of functionality. And then once you've done that, you wait for it to deploy. And then you look at the graphs, you look at the logs, you look at the charts, whatever output you've got.
And what normally happens is, for me, I look at the charts and I say, that is not what I wanted at all, actually. I've misunderstood the problem. I've misunderstood the data I want. Now that I see it, ah! Just like you would with agility, true agility, not agile, because agile means something else now.
But true agility is you do a little bit of work, you develop a feature, you show the customer, they say, not quite right. Go back, adjust it. Closer, but still not quite right. But if you ask them to describe it exactly right from the beginning, it doesn't align with what they want at all. You need to show them, and it's only by showing them that you get feedback.
And the same is true for ourselves. It's only by looking at the graphs and the logs that I realize that actually isn't what I wanted to begin with, or it is, or I'm onto something there. And then I keep iterating. Maybe the graph was unusable. Maybe I couldn't query the parameter. All sorts of things might be happening there. So the last stage is improve.
And so from improve, you can go back to the very beginning, ask a different question, or maybe you just want to iterate on the instrumentation a bit, deploy it again. Oh, that's more like it. Okay. So now we know the token expiry. What's the next question we want to ask? Well, when did the user actually hit the site? Was it after the token expiry or before? Hmm. Okay.
Sounds like an obvious question, but maybe it's after, which would indicate the token really had expired. Oh, it's before. Huh? How could it be expired when it was before? Oh, hang on. What's the time zone of the token? Now we're getting into it, right? So you log the time zone. Holy cow, the time zone of the token is out of sync with the time zone of the user. That's what it is.
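The five steps above can be sketched in plain Ruby. Everything here is hypothetical: the event name, the field list, and the logger setup are just one way to instrument the token question, using only the standard library rather than any particular logging gem.

```ruby
require "json"
require "logger"
require "time"

# Minimal structured logger: one JSON object per line. A gem like
# semantic_logger gives you this out of the box; this is just a sketch.
LOG = Logger.new($stdout)
LOG.formatter = ->(_severity, time, _progname, payload) {
  JSON.generate({ at: time.getutc.iso8601 }.merge(payload)) + "\n"
}

# Step 2: the data we decided we need. The event name and fields are
# illustrative, not a real schema.
def token_created_event(customer_id:, created_at:, ttl_minutes:)
  {
    event: "password_reset.token_created",
    customer_id: customer_id,
    # Normalizing both timestamps to UTC makes timezone drift visible.
    token_created_at: created_at.getutc.iso8601,
    token_expires_at: (created_at + ttl_minutes * 60).getutc.iso8601
  }
end

# Step 3: emit the event at the single point where the token is created.
LOG.info(token_created_event(customer_id: 42, created_at: Time.now, ttl_minutes: 15))
```

Once an event like this is in the logs, the timezone question from the story answers itself: both timestamps are normalized to UTC, so a token that "expired" before the user visited points straight at a timezone mismatch somewhere else in the system.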
Yeah, I love that. I love that analogy of identifying the use case in order to expose what to observe and where to insert, you know, all of these pieces that are missing or identify them really, right? Not to just insert them, but to identify them. I think that's very important.
I think, in general, it's about trying to identify the actual use cases in order to know what you even want to capture to begin with, right? Like, yeah, we can throw a wall of logs at a resource like Kibana, and it's not very useful.
But once you start to abstract the ideas and use cases and how people are actually using the thing that you've built, you can definitely isolate what it is that you actually care about. And I think you're right. That is kind of the whole importance of observability.
is identifying that use case and exposing what you actually care about. Because there are HTTP logs, and all kinds of other logs and information available that are just emitting all the time. How do you know and identify which are really important? And I think it just depends, right?

What are you trying to capture? So it's a great stepwise way to start to figure that out, right? Because depending on your role and what your responsibilities are, that could change, and that could be different. And your observability needs will change with that. So identifying that is probably most important, I think.
But as with everything else, I would say if you're really not feeling any pain, don't bother. Just don't bother. I'm not into kind of – I'm not really interested in telling people what they should be doing or could be doing. I mean, goodness me, we hear enough of that in engineering, don't we? You should really learn a language every year. You should be Blair. You should be Blair.
I'm sick of it, absolutely sick of all these gurus telling me what to do and what I should be learning and what I – And very few of them talk about, well, what's the benefit to me? And in order for me to do anything, in order for me to change as a human being in any way, learn anything, I have to feel the pain of it. If you're not feeling the pain, don't bother.
But if you are feeling the pain, if deploys are really glitchy, if you keep asking, for me, the kicker is if I keep asking questions I don't have the answer to, That's a concern. And if they're just minor, oh, like, why did I wake up 10 minutes late today? Who cares? It's not important.
But if the site's gone down for the fourth time this month, and every time the site goes down, we lose at least five grand, 10 grand, maybe even more. And even worse, every single time the site does go down, we just kind of get it back up more by luck than good judgment. This kind of feeling of, oh, we kind of got away with it that time. That's OK.
I know there was this weird thing and it's still not really figured that one out, but that's OK. We'll just put it in the backlog. Um, it's the operational risk. You've got to decide, are you comfortable with that operational risk or not? Is it big enough? And in my experience, you've kind of got to hit rock bottom with this stuff.
As I did, there were loads and loads of bugs that I could have investigated and added logging for and fixed, but you know, it's pushing a boulder up a hill. It's not actually worth it. And it was only when it reached my threshold of pain. I was like, you know what? I have to do something about this now. This is just ridiculous. We're professional people.
We're being paid a lot of money and it's not working. The app that we've delivered is not working. What's more, we don't know why. But also I do just want to add, and this may broaden out the conversation a little bit. You may want to, we may want to keep it narrow on Rails apps, but I've realized that observability principles go way beyond how does our web app work? It applies to any black box.
So as an example, a few years ago, I was working at a company and their SEO wasn't great. And they just kind of were like, oh, you know, we'll try and fix it. And they had several attempts to fix it. None of them really worked. And every attempt was the same. They would get some expert in. The expert would give us a list of 100 things to do. We would do 80 of the 100.
And then nothing would really improve. And then they'd be like, well, we did everything you said. And then they'd move on to another. And rinse and repeat, keep doing that. And then one day, within four weeks, 20% of the site traffic disappeared. And nobody could tell us why. Nobody understood why. Observability. Now, Google is a black box.
So, you know, you're not going to be able to instrument Google. But there are lots of tools that allow you to peer into its inner workings: SEMrush, Screaming Frog, all these kinds of tools. They are, in my opinion, to some degree actually in the observability space. Everybody thinks of them as marketing tools, SERP tools, search engine optimization tools, whatever.
They're allowing you to make reasoned guesses about why your searches aren't performing the way they are. And then you can actually take action on that because now you have some data. Oh, this keyword dropped from place four to place 100. Why is that?
Okay, let's try hypothesis A. Put that live and see if Google will respond to that. Oh, and now it's up to position 80, or whatever it is. So the idea of observability goes way, way beyond Datadog and New Relic and all of those people in the observability space. I see it as a much, much wider and much more applicable topic.

Yeah, I hear you there.
And I'm all also like, you know, let's not just add New Relic to every app that we deploy. Or, you know, is Bugsnag even needed for every app? Like, these are questions that I ask myself, too. Like, what value are you getting from all these auxiliary services that give you the observability into, like, just blanket things? Yeah.
At what point do you stop that kind of mentality and be like, every Rails app should at least be able to get insight into the logs so that you can see what the application is doing. How long do you capture that? What kind of time frame? Do you have any default standards where you're like, well, I know that I'm going to need to look at this at some point in the application cycle.
What are your defaults?
Great question. I would say it's thresholds, like anything else. Say you're making a small app with very little traffic. I have a client at the moment I'm consulting for, and I've made them an app, and it has maybe flipping 20 visits a day or something, 20 hits a day. So I installed Rollbar, the free version of Rollbar.
Anything goes wrong, I get a notification. It's fine. The further up the stack you move, the more the defaults change. For a Rails app that's mission-critical, or I'm not even going to say mission-critical, just serving a decent number of hits a month, 10,000, 20,000, I don't know. I've tried a lot of observability tools, and there's not one yet that I can unreservedly recommend.

They've all got their pros and cons. Datadog is a good option if money is no object. I kind of don't want to get into the tooling debate, because I think it's a bit of a red herring in many ways. There are various cost-benefit trade-offs there. But in terms of the defaults, in terms of what you observe, requests have got to be up there.
So for every app in my care of any significant size, I would always say install Semantic Logger. Semantic Logger is the best logger I've found. It does JSON out of the box. It's quite extensible. There are many problems with it, but it's the best option that we've got. So that's number one. Rails already logs every request for you, and Semantic Logger will format that in JSON for you. There are some notable missing defaults in Semantic Logger, though, and I'm working on a gem at the moment that will add some more sensible defaults to it. For example, I believe that request headers do not get logged out of the box. Certainly the request body does not get logged out of the box. Request headers might be.
The user agent doesn't get logged out of the box. I mean, this is just crazy. Pretty basic stuff, right? So I have a setup that I use that logs a whole load of things about the requests out of the box. I like to add in user ID out of the box. It depends what kind of setup you have for authentication, but at the very, very least, if somebody's logged in,
their ID should be logged in every single request. That is absolutely basic stuff. A request ID is also a really, really useful one. I have a complex relationship with logs and tracing, because tracing is essentially the pinnacle of observability. I hear a lot of people talk as if logging is the be-all and end-all.
Logging is a great place to start, but tracing is really where it's at. And I can go into that, why that is in a bit. But logging is a great default. Logging is a good place to start. Start with semantic logger. Basically, every single thing that's important in any request should be logged. So that's every header.
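As a sketch of those request-logging defaults, here is a minimal Rack-style middleware, assuming nothing beyond the standard library. In a real app, Semantic Logger's Rack instrumentation covers much of this; the `"app.current_user_id"` env key is a hypothetical stand-in for however your authentication layer exposes the logged-in user.

```ruby
require "json"
require "securerandom"

# Emits one structured JSON line per request with the fields discussed
# above: request ID, user ID, user agent, path, status, and latency.
class RequestLogger
  def initialize(app, io: $stdout)
    @app = app
    @io = io
  end

  def call(env)
    # Reuse an upstream request ID if a proxy set one; otherwise mint one.
    request_id = env["HTTP_X_REQUEST_ID"] || SecureRandom.uuid
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    @io.puts JSON.generate({
      event: "http.request",
      request_id: request_id,
      method: env["REQUEST_METHOD"],
      path: env["PATH_INFO"],
      user_agent: env["HTTP_USER_AGENT"],
      user_id: env["app.current_user_id"], # nil when nobody is logged in
      status: status,
      duration_ms: ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round(1)
    })
    [status, headers, body]
  end
end
```

In a real Rails app you would also run sensitive parameters and headers through Rails' `config.filter_parameters` before emitting anything.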
Obviously, you need to be careful with sensitive data in headers; use Rails' parameter filtering. I can't remember what it's called, but there's the filtering module that you can add in. And sometimes Semantic Logger doesn't give you that by default, so you need to be a bit careful. A good default as well is logging all background jobs. Background jobs are one of the most painful areas of observability that I've experienced, and we still haven't really cracked it. We have some very, very basic logging out of the box in Semantic Logger. I believe it logs the job class, the job ID, and a few other things, but it doesn't log the latency, which is a huge, huge missed opportunity. And I don't believe it logs the request ID from whence it was enqueued.
So when a job is enqueued, by default Semantic Logger will trigger a little entry in the logs, this job has been enqueued, and it will tell you what request it came from. But on the other side, when it's picked up and the job is performed, that request ID is missing. So you need to go into the request ID, find the enqueued job, find the job ID, and then take that next leap.
So, I mean, it's a bit clunky, but it's manageable. So in short, semantic logging gives you some okay defaults out of the box, but there's some really basics that it still misses. And so background jobs, requests, those are the two really, really big ones to start out with. But as you can imagine, there are a ton more.
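The enqueue/perform gap described above can be closed by carrying the request ID in the job payload itself. A toy sketch, with an in-memory array standing in for Sidekiq or ActiveJob and illustrative event names:

```ruby
require "json"
require "securerandom"

# Carry the originating request ID across the queue hop so the "enqueued"
# and "performed" log lines share one correlation key.
QUEUE = []

def enqueue(job_class, args, request_id:)
  payload = {
    "job"        => job_class,
    "args"       => args,
    "job_id"     => SecureRandom.uuid,
    "request_id" => request_id
  }
  puts JSON.generate(payload.merge("event" => "job.enqueued"))
  QUEUE << payload
  payload
end

def perform_next
  payload = QUEUE.shift
  # This is the log line that, per the discussion above, often loses the
  # request ID; keeping it in the payload closes the gap.
  puts JSON.generate(payload.merge("event" => "job.performed"))
  payload
end

enqueue("PasswordResetJob", { "user_id" => 42 }, request_id: "req-abc123")
performed = perform_next
```

With Sidekiq or ActiveJob the same idea usually takes the form of client- and server-side middleware that copy the ID into and out of the job hash.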
Yeah, you mentioned kind of some key pieces I always think of with observability in general, which is like separating the pieces into their own puzzle, right? Like we have logs, which are kind of just like our data. And then we have individual metrics that we're like snapshotting the logs for particular segments like traffic or number of people using it, like the number of jobs that are running.
And then there are traces, which we could dig into next, because I have a lot of love for all of the standards that are coming out of this, with OpenTracing and things like that. I'd love to dig in there. But also alerting: how does anybody ever know that there's a problem?
Yeah, I mean, and I love, I love like thinking about it in these separate groups and categories, because I think it also helps to think about like the overarching theme, which is like getting insight, but also like, getting meaningful insight.
And really, the only reason anybody ever cares about observability anyway is when something goes wrong, or something problematic happens that causes something to go wrong. And you want to either catch it early or try and remediate. And so.
Where do you find... I mean, I feel like background jobs are the first instance where people realize, oh, we need to start looking at what it's doing, right? You start throwing stuff in the background, and you're like, OK, great, it's doing the work. And then you maybe don't realize, if you're on the same node, that those slow jobs can block the web requests, right? And then, OK, if you split those up, finally you've got that resolved, but then one problematic job can back up the queue that it's on. You know, where do you...
To me, the background-processing aspect is why we have tracing to begin with, because it's concurrency, right? That's where everybody ends up hitting their pitfalls: as soon as you start doing things all at once, thinking, oh, we'll just throw it in the background and process things as they come.
And as things start to scale, it causes more problems as you try and figure out timing and stuff like that. Where do you find the most important pieces of making sure that you are capturing the right segments and the right flows in that process?
Yeah. There's so many things you touched on there I want to come back to. To answer your question, first of all, it's the five steps that I walked through. And that's the short answer is if you have a specific question that you cannot answer, what we're really talking about is the implementation details of how you answer that question.
So what question you pick determines a whole load of stuff. I can't just give you a bog-standard answer, because it just depends. I hate saying that, but it does. So the first step is to ask the question, figure out what data is missing, and then choose the right piece to add into your logs. I feel like I've maybe not understood your question.
Yeah, I mean, it's more of an open question. One of my biggest debugging pitfalls is trying to reconstruct the state of what happened when something went wrong. I feel like that's one of the most typical things. OK, something happened, but the data has changed since it happened.

Maybe the change resolved the issue. But trying to figure out what that is means running through those questions, right? How do you think about reconstructing data, or reconstructing the state of an issue? Is that not the right way to go about it, or do you try and do something else?
Fantastic question. So, and this gets to the root of why the three pillars are complete nonsense.
Okay, so there'll be a lot of- Wait, what are the three pillars?
Metrics, traces, and logs.
Okay.
Nonsense. They're not three pillars. The analogy I like to use is saying that observability is three pillars and it's traces, logs, and metrics is a bit like saying programming is three pillars. It's arrays, integers, and strings. It's the same kind of deal. No, it's nothing to do with those things. Well, it is because you use those every day. Yes, but you're kind of missing the point.
So thanks to some amazing work by the people at Honeycomb, reading Charity Majors and their incredible work, I've realized that metrics, tracing, and logs are missing the point. The point is we want to see events that happened at some point in time. And that neatly answers your question about how you reconstruct the state of the app.
I mean, the short answer is, of course, you can't. If you're not in an event-driven system, if you're in a CRUD app, if you're storing state to the database, there is no way you can go back in time and accurately recreate it. But we can give it a reasonably good stab.
And we can do this by capturing the state of each event when it happened. Forget about observability tools and logging and structured logging and tracing just for now. Imagine when that incident happened; let's say my expired token is potentially a good example. There are several points in that timeline that we want to understand. Number one, when the token was created.
Number two, when the user hit the website. And maybe there's a third one, when the account was created, let's say that. So imagine if at each of those three points, we had a rich event with everything related to that event in it. So when the account was created,
we had the account ID, the status of the account, whether it's pending or not, the creation date, the customer, the customer ID, blah, blah, blah, blah, blah. And then when the user visited the site, what was the request? What was the request ID? What was the user ID? What was the anonymous user ID? Et cetera, et cetera. And then when the token was created, what was the expiry? What was the this?
What was the that? What was the user ID? Okay. So if we have those three events, and we have enough rich data gathered with each of the events, we can answer your question. Does that make sense so far? There's a whole load of more blah, blah, blah, but does that make sense so far?
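To make John's example concrete, here is a minimal, framework-free Ruby sketch of those three rich events. The event names, attribute keys, and the `emit_event` helper are all made up for illustration; in practice each event would go to a structured log line or a trace span rather than an in-memory array.

```ruby
require "time"

# A hypothetical event sink: in a real app this would be your
# observability backend; here it just collects structured events.
EVENTS = []

def emit_event(name, attributes)
  EVENTS << { name: name, at: Time.now.utc.iso8601 }.merge(attributes)
end

# The three points on the incident timeline, each as a wide, rich event.
emit_event("account.created",
  account_id: "acct_123", status: "pending", customer_id: "cust_9")

emit_event("request.received",
  request_id: "req_abc", user_id: "user_42", path: "/dashboard")

emit_event("token.created",
  token_id: "tok_7", user_id: "user_42", expires_at: "2024-10-01T00:00:00Z")

# With rich data on every event, questions become filters over attributes.
token_events = EVENTS.select { |e| e[:name] == "token.created" }
puts token_events.first[:expires_at]
```

The point is the width: once each event carries its full context, "which tokens were created with a past expiry" becomes a filter over attributes instead of detective work.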
I think you're making some great points about capturing the transactional, like, user information, or the user's actions. Yes.
And also other events that are happening in the system. So there's user did something, computer did something, computer enqueued a background job, performed a job, et cetera, et cetera. So the way I think about it is everything that happens in your app, whether it's initialized by the computer, an external data source, a user, it's basic event storming stuff really. That creates an event.
And that event, if you don't capture enough data, that is it. The data is lost forever if you're not in an event. Assuming you're not doing event sourcing and assuming you're not in an event-driven system. So the way I think about it at the most core fundamental level is whether it's logs, traces, metrics, whatever it is, we need a way of capturing those events.
And more importantly, ideally, we need to link the events together. And this is really, really, really important. So if somebody create, let's say somebody hits our app and it creates the token. Well, there's two parts to that. They hit the app. There was a request to our app. And then in the call stack somewhere, the token is created. Those two things are two separate events, but they're nested.
We want to capture that causal relationship. One calls the other. One is a subset of the other. One is a parent, a child, however you want to put it. Without that causal link, We're lost again. We don't know what's caused what. So there are some three or four ideas here. Number one, events. Number two, contextual data with each of those events.
And number three, nested events, if you like, causal relationships between events. And with those three things, you can debug any problem that you would like, is my claim. And so if you just keep that model in mind, let's examine traces, logs, and metrics and see where they fall short, see which one meets those criteria. So tracing gives us all three.
So for those of you, I should explain what tracing is because I was confused about what tracing even was for absolutely years. So tracing allows you to, when somebody hits your app, a trace is started. So there are two concepts in tracing. There's traces and there are spans. And then there's the data associated with spans. But let's just leave that to one side.
So when somebody hits your app with a request, a trace is started. And so the trace will be like, okay, I've started. Here I am. You can append any data that you want to me whilst I'm open. It's like opening the cupboard door, and then you keep putting stuff in the cupboard, and then once the cupboard door's closed, you can't put any more stuff in it. Very simple analogy.
So we open the door, we start the trace, and so it goes down to the controller level. And the controller says, oh, I'm going to glom on some data into whatever the existing trace is about the method, the post body, the request, blah, blah, blah, blah, blah, headers, whatever it is. I'm going to glom that on to the current trace. And then we get down into maybe you've got a service object.
I know some people hate them. I love them. Blah, blah, blah, whatever. That's not what this podcast is about. So you get into a service object, and the service object says, oh, whatever is in the current trace, I want you to know you hit me, and you hit me with these arguments. Cool. I'm going to append that to the trace as well. And then we enqueue a background job. That event gets added onto the trace.
And then even more excitingly, there's a setting in OpenTelemetry where when the job is picked up and performed, the trace is kept open. And there's a whole load of debate about whether this is a good idea or not. But you can do it. You can keep the trace open until that job is started. And so the job says, ah, I've kicked off now. It gloms a whole load more stuff.
Maybe you make an API request in the job. It gloms a whole load more stuff into the trace. And then it comes all the way back up the stack. And you have this trace with all this nested context. And when it's saying, I'm going to glom this data onto the trace, that's called a span. And a span is nested. So you can have spans nested inside spans inside spans.
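The trace/span nesting John describes can be illustrated with a toy tracer. To be clear, this is not the OpenTelemetry API (the real Ruby SDK exposes an `in_span` block on a tracer object); it is just a sketch of the data structure: spans carry attributes and nest inside parent spans, and closing the block "closes the cupboard door".

```ruby
# A toy tracer to illustrate the trace/span model. Spans accumulate
# attributes while open, and children nest inside parents, preserving
# the causal relationship John describes.
class Span
  attr_reader :name, :attributes, :children

  def initialize(name)
    @name = name
    @attributes = {}
    @children = []
  end

  def set_attribute(key, value)
    @attributes[key] = value
  end

  # Open a child span, yield it to the block, and nest it under this one.
  def in_span(name)
    child = Span.new(name)
    @children << child
    yield child
    child
  end
end

trace = Span.new("GET /signup")
trace.set_attribute("http.status", 200)

trace.in_span("SignupService#call") do |svc|
  svc.set_attribute("args.plan", "pro")
  svc.in_span("TokenCreator#call") do |tok|
    tok.set_attribute("token.expires_at", "2024-10-01")
  end
end

# The nested causal structure is preserved: span inside span inside trace.
puts trace.children.first.children.first.name
```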
So essentially, it's this big tree structure. And you might have seen this before. It's the flame graph that you get in Datadog and New Relic and all these kinds of things. And everybody looks at these things and thinks they're really pretty. And they are. Indeed, they are. So that's the pinnacle of observability in my head. Traces give us it all. And we can say,
as you can do in any of these observability tools that support tracing, you can do some really cool stuff. Show me all the requests that were a 200 that enqueued a job where the job lasted for more than three seconds. Holy cow, now we're cooking with gas. We've got everything that we need. Show me all the spans that indicated anything to do with the background job.
where it was a 500 response, but the user was logged in, and, and, and, and. And so we can start to not only query the spans, but query the parents of the spans. So you've got all of these nested causal relationships, and it gets ridiculously powerful. So that's traces. Cool. Let's look at logs. What do logs give us? Well, it gives us events. That's all logs are, really.
It's a series of events that happen. Does it give us the ability to nest events inside one another? Nope. Sorry. Your luck's out. You can log causation IDs and link them together. And obviously you can log request IDs and filter everything by the request ID. But there's no concept in the log of this log being nested inside this other log. So that information? Goodbye. It's gone. Don't have it. But you do have the rich data in every event.
Let's look at metrics. What do metrics give you? They don't give you the events. They don't give you the nesting. They just give you some aggregated numbers. So I don't think of them as three pillars. They're three rungs of a ladder. The very top rung is tracing. Awesome. The next rung down is logs. Pretty good.
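The logs rung can be sketched with nothing but the standard library: every line is a JSON event with rich attributes, and the best you can do to link events is stamp a shared request ID on each one. The field names here are arbitrary.

```ruby
require "logger"
require "json"
require "time"
require "stringio"

# Structured logging: every line is a JSON event with rich attributes.
buffer = StringIO.new
logger = Logger.new(buffer)
logger.formatter = proc do |severity, time, _progname, payload|
  JSON.generate({ severity: severity, at: time.utc.iso8601 }.merge(payload)) + "\n"
end

request_id = "req_abc"
logger.info(event: "request.received", request_id: request_id, path: "/dashboard")
logger.info(event: "token.created", request_id: request_id, token_id: "tok_7")

# Filtering by request_id links the events, but their parent/child
# relationship is gone: that's the information logs can't carry.
lines = buffer.string.lines.map { |line| JSON.parse(line) }
puts lines.count { |l| l["request_id"] == request_id }
```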
And metrics are useless. Now, when I say metrics are useless, people get upset with me and say, oh, well, I look at metrics all the time to understand my app. Yeah. Okay. But if you derive metrics from higher rungs, that's totally cool. Totally fine. But what's a really bad idea is to...
directly say, I'm going to send this metric right now to my backend. And people do this all the time. People think this is a good idea. It's okay. I mean, it's better than nothing, right? It just depends on the fidelity of information you want. But the problem is, well, there's two problems actually, but the main one is, you've sent that data. Okay, you sent it to Prometheus, Datadog, whatever. You sent that one data point.
So then you look in the metrics and you say, holy cow, we're getting all these 500s. Why is that? I'll sit here and wait as long as you want. You're not going to be able to tell me the answer to the question unless it's blindingly obvious, unless you can say, oh, well, this other bit of data over here is like correlates with it time-wise and maybe it might be that. Yeah, okay, it might be that.
How do you know it's that? Well, we're having to guess. Guessing is not a strategy. Hope is not a strategy. I don't really want to debug by just guessing. I want to know. And the only way of knowing is having traces. So the way I like to think of it is: tracing is the pinnacle. Logs can be derived from traces, which is why it's the three rungs of a ladder.
And everything can be derived as a metric from the two rungs above. So if you've got only logs, you don't have any nested context. But you can get metrics from logs. Fine. If you just have metrics, I would say you're not in great shape because you can't understand why without pure guessing. And it amazes me how many people push back on this idea and think just having some metrics is enough.
It's nowhere near enough. Not in my experience. If somebody wants to refute me and come on this podcast or have a chat with me after, I would love to listen to how metrics allow you to debug very, very deliberately and get the exact data that you need.
You can send off dimensions to metrics and then your metrics bill explodes within about five seconds, especially if it's high cardinality data like IP addresses. I've made that mistake before. We're going to send a dimension of IP with our metrics so that we can understand what's going on. In a week, my manager usually messages me, usually in less than a week, saying, can you turn that off?
We just got a Datadog bill of like five grand. Whoopsies.
I guess I do have maybe some specific instances where metrics alone can help identify things. And that's more where the granular metrics are the things that you actually care about, right?
Like, let's say, for example, back to the Sidekiq background jobs example. If you notice your queues piling up, and you happen to have your dashboard of metrics just looking at queue size and looking at throughput, you can easily say, oh, there's something blocking it. And it gives you kind of a point of confidence, where to look in this specific instance.
Or, as another example, you can notice there's a memory leak by monitoring, you know, your memory consumption of the app, just looking at the metrics for that, getting an alert, and saying, why has the memory not stopped growing after a certain amount of time? I mean, these are very specific examples that I'm giving, but, uh, I agree.
Like, if you're looking for, you know... back to your token expiration example, it's not going to tell you, are people having a problem with this application that we've made? Like, uh, you know, we keep getting these, you know, client emails coming in, like, oh, I can't sign into your app. What's happening?
You know, you can't just take that and be like, oh yeah, it's obviously the token expiration, right? Your customers' emails aren't going to translate directly to that. And you're not going to know right away, uh, without having your tracing in place. Uh,
So a few things there. Number one, you bring up a really good exception I'd forgotten to mention conveniently. If it's infrastructure stuff, if it's like memory, hard disk space, all that kind of stuff, fair game for metrics. Fine. The second thing is I'm quite hyperbolic. So I'm quite an extreme person. So when I say they're useless, I don't mean literally they're completely useless.
I think of metrics as a hint. Hey, there's something going on over here. Cool, that's not useless. Obviously, it's useful. But then the next question is why? And if you've got a super simple system, then it's probably like three things. And you go, well, there's only three jobs in the system. So cool. And maybe you've segregated your metrics by background jobs, which is fair.
You know, it gives you a place to look. It gives you a starting point. But I've, yeah, yeah. They're useful in the aggregate and they're useful at giving you a hint. And yes, they're useful in terms of like making sure the infrastructure is still running. But I see a lot of people depending on them. And I, you know, there's a guy I really respect, used to work with him called Lewis Jones.
And him and I have gone back and forth on this over LinkedIn. And he is convinced I'm wrong about this. He's like, we run everything through metrics. Metrics are awesome. You're just on cloud nine if you think you can trace everything. And there's also a significant weakness with tracing as well, which is you can't trace everything unless you've got relatively low throughput.
or even medium throughput, you can make it work. If you trace every single request and you're doing millions of requests a day, I dread to think what your bill is going to be. And that's where head sampling and tail sampling come into it. And we can get into that if you would like.
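Head sampling can be sketched in a few lines. The real OpenTelemetry SDKs ship samplers for this (the Ruby SDK has a trace-ID-ratio-based sampler, for example); this stdlib-only version just shows the core trick: make a deterministic keep/drop decision from the trace ID at the start of the trace, so every service that sees the same trace ID makes the same call.

```ruby
require "digest"

# Head sampling decides at the START of a trace whether to keep it,
# so high-throughput apps only pay for a fraction of their traces.
# Hashing the trace id keeps the decision deterministic across services.
def sampled?(trace_id, ratio)
  # Map the trace id onto [0, 1] and keep it if it falls under the ratio.
  bucket = Digest::SHA256.hexdigest(trace_id)[0, 8].to_i(16)
  bucket.fdiv(0xFFFFFFFF) < ratio
end

trace_ids = (1..10_000).map { |i| "trace-#{i}" }
kept = trace_ids.count { |id| sampled?(id, 0.1) }

# Roughly 10% of traces are kept, since the hash spreads ids evenly.
puts kept
```

Tail sampling is the complement: buffer the whole trace and decide at the end, so you can keep every trace that errored or was slow, at the cost of much more infrastructure.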
I mean, I would love to dig more into tracing in general and maybe more of the distributed aspect of it. Because I think what you're talking about is very important. Like, If we're just talking about tracing through a single request in a Rails app, it's not as useful as maybe where tracing really comes into play is where there's multiple things that start happening.
Once you start having more than one application, and the, you know, the data starts trickling from one application to the other. Or even in the Sidekiq example, right? If you're throwing stuff into the background, how does that data snapshot transition through the background jobs? Especially if you have ones that start depending on each other, how do you then manage the queue?
Like in the making sure that you know where it started and you know where it's going, because sometimes you can catch a problem before it starts, uh, by having the traces in play and know where it's heading. Right. Uh, And so I would love to dig into those aspects.
Like where do you, like what tooling, or maybe we shouldn't talk about tooling specifically, but like what aspects of tracing are most important for like holistically looking at your system outside of like, you know, running through your question. Like I think at this point we're beyond like having your questions of what you're trying to look at and that you already know what those
questions are, and where do you start setting up tracing? Because I know at Doximity we use OpenTracing as an open standard for tracing and observability across platforms, languages, and things like that. Do you find that the industry standards are heading in the right direction, or where are the pitfalls there? Because I know it just introduces a lot
of dependencies once you start to adopt a lot of these things.
Totally. So I should say I am singing the praises of tracing, but it's a slightly utopian vision that I'm painting, because 90% of the work I've done is with logging, purely because it's simple to get going. It's more of a known quantity.
And in a lot of my talks, this is why I'm not talking a lot about tracing and I'm talking about structured logging, because I think structured logging gives you this kind of event-based mindset that you can then start extending to tracing.
And the reverse is not true. Like, you can't take that event-based mindset into metrics, because metrics is just that aggregation, right? So, um, recently I've been doing a lot of queries in our Rails app, and I've been going to, we use New Relic, sorry, we use Datadog at work, and I've been going to Datadog's tracing, um, interface
and really trying to answer my questions there instead of in logging. So we have both tracing and logging. Our tracing is hobbled a little bit, just purely because of cost reasons. And our logging is not so hobbled. So are the standards heading in the right direction? Yes, but it's going to take a really long time to get there is my short answer.
There are a lot of different ways of going about tracing. The most promising, as we all know, is OpenTelemetry. But, I mean, I've read some pretty harsh critiques of OpenTelemetry. It's kind of a topic that generally divides people. If you don't know anything about OpenTelemetry, it sounds an absolute utopia. And I got really excited when I started researching it.
The more you dig into it, the more you realize... how much complexity there is to resolve and how many challenges that project faces in order to resolve them. And so, I mean, what it's trying to resolve is 30, maybe 40 years, possibly even more, of legacy software, right? Because that's how long logging has been around.
And they're trying to aggregate all of that into one single standard. Good luck. It's a very, very difficult problem to solve. And they're doing an incredible job. But it's very, very difficult. So OpenTelemetry is where I'd start with the answer to your question. OpenTelemetry is 100% the future. I've not seen anything that rivals it.
And OpenTracing, I believe, came first and then evolved into OpenTelemetry, from my understanding. Apologies if I've got that slightly wrong. And so, yeah, I think there are a few options if you're in Ruby, none of which are ideal. So the OpenTelemetry client in Ruby is not ready for primetime. It's quite behind the current standards in OpenTelemetry.
It doesn't obey any of the latest semantic standards, for example. I have played around with it in an example project. And when it's working, it's absolutely incredible. It's next-level brilliant. There are a few problems with it. It's extremely slow. So I tried to use tracing on our test suite at work using this OpenTelemetry tracing.
And it just, it's like, I can't remember the numbers, but it really slowed down our test suite to the point where it really just wasn't practical to use because we were trying to measure the performance of the test suite. So, you know,
Um, I could have been doing something stupid there. It's very possible that I just wasn't using it the right way. So sorry, OpenTelemetry folks, if I've got that wrong. Um, I think there's a lady called Kaylee who is from New Relic, and she, and, um, I'm so sorry, the names escape me. But there's a whole bunch of people in the Ruby space who are working really hard on OpenTelemetry.
But it's just that the OpenTelemetry project is moving so fast, that's the other problem. So that's option number one, OpenTelemetry. You could maybe fork it and tweak it yourself. The second option and what we use at work is, because we're using Datadog, we use Datadog's tracing tool, which is pretty good. But then even with tracing or logging, I feel like we're kind of,
maybe 20 years behind where everybody else is in programming in terms of observability. Because one of the questions I often have when I look at this stuff and even think about tracing, I maybe have like five, six, seven questions that even I can't resolve, which is what do I trace? How much detail do I trace in? How much is this going to cost me?
And we're still in the stone age with a lot of this stuff. So I don't have any good answers for you in that regard. So we use the vendor tooling for tracing. I'm sure New Relic has its own version of that. In fact, I know they do. I know Sentry does. There are certain other providers that don't have any tracing capabilities at all.
So I would say for now, the best option we have is relying on the vendor tracing tools, I would say.
Yeah, it's funny you mentioned Datadog. We've had Ivo on before from Datadog to talk about, I think, memory profiling. He works on a lot of granular Ruby performance tooling, really interesting stuff. But yeah, I mean, I would love to see maybe some more, I don't know, higher-level examples of making use of OpenTelemetry in the Ruby space in general.
Because I think at that level, I mean, especially with all of the Solid Queue or Solid trifecta or whatever stuff that's coming around, it would be nice to see something like tracing specifically introduced to Rails that would make more sense in that ecosystem. Yeah. I mean, where do you start profiling stuff is kind of like an intro to tracing. Yeah.
Like, if you wanted to see the request, it reminds me of the rack-mini-profiler tool, right? Where you can just see a little tiny tab that says, oh, it took this number of seconds to load this particular page you wanted to get. And you can click on and expand and see, oh, well, what did your application do at each step of the way, and see how long each thing took, right?
And I think of that as like a trace a lot of the times, right? Yeah. And it's very like useful, like even when you're just starting out to see that. Right. And it helps you visualize that.
And so I feel like maybe that's what's missing is a lot of like visualization aspects of all this tracing stuff, because there's something that you look at or find useful when you're starting to dig into like structuring the traces and things like that.
Definitely. That's leading me up to one of my big kind of rants, passions, whatever, within the observability space. And I don't see anybody talking about this. Um, I feel like either I'm onto a really great idea, or it's an unbelievably idiotic idea for some reason that I don't know. It's usually the latter, as a spoiler. Um, okay.
So when I'm looking at traces, there's almost never enough information. Almost never enough information. And this is why Charity Majors and the team at Honeycomb and Liz Fong-Jones always talk about having wide, context-rich events. That's their mantra. Wide, context-rich events. Events, we've already talked about. Context, we've already talked about. We haven't talked much about the wide.
So wide means lots of attributes. So their take on it is add as many attributes as you can to every event. And make them high cardinality attributes. What does that mean? It took me about three months to wrap my head around what high cardinality means. It means anything ending in an ID. There you go. That's an easy explanation. So a request ID. Oops. Sorry, that was me and my microphone.
Anything that looks GUID-like. Anything that is a unique identifier for anything, so that's user ID, request ID, but anything that is a domain object, and this is the real missed opportunity, I think, that we have in the Rails community and in observability community potentially in general. When something goes wrong, or even when something goes right, let's take the token as an example.
When that token is created, the token is a domain object. Now, okay, it's to do with authentication. So it's not really a domain object in a way. But let's say that customer is signing up for an account. The account definitely is a domain object.
And if you want to understand what I mean by domain object, I just mean an object that belongs to the domain, the business domain in which you're operating. It's a business object, a domain object, call it what you will. But when the CTO or even better, the CEO or somebody in marketing talks about this customer account, they talk about people creating accounts. They use that word account.
That's your first clue that it's a really important concept in the domain. So that's what I say when I mean domain objects. I mean words that non-technical people use to describe your app. So they're domain objects. Why are we not adding every relevant domain object to every event? We don't do it.
And so what you'll see is people do this kind of half-hearted, oh, well, we'll add the ID to the current span or the current trace or even the current log. We'll add the ID. And that's okay. That'll be enough. But you're not capturing the state of the object. Why not just take the object, in this case the account, convert it into a hash, and attach it to the event? Why can't we do that?
Now there's a number of reasons why we actually can't do that in some cases. If you're billed in terms of the size of your event, so if you're billed on data, obviously that's going to get expensive fast.
But if you're billed on pure events, as in your observability provider, your observability tooling, is saying for every X number of events or X number of logs per month, we will charge you this much, but the size doesn't matter. then this is a perfect use case to be taking those rich domain objects, converting them into a structured format, and dumping them in the log or the trace.
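A minimal sketch of that idea, assuming the attributes came from something like ActiveRecord's `account.attributes`: flatten the whole domain object into namespaced keys on the event, rather than logging a bare ID.

```ruby
require "json"

# Pretend this hash came from `account.attributes` on an ActiveRecord
# model; here it is hard-coded so the sketch runs anywhere.
account_attributes = {
  "id" => "acct_123", "status" => "pending",
  "customer_id" => "cust_9", "created_at" => "2024-09-01"
}

# Instead of logging only account_id, merge the full structured state
# into the event, namespaced so the attributes stay queryable.
event = { "event" => "account.created" }
account_attributes.each { |key, value| event["account.#{key}"] = value }

puts JSON.generate(event)
# Now "show me every event where account.status was pending" is a
# query over attributes, not an archaeology dig.
```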
And so I've thought about this quite a lot, and I've come up with a few quite simple ideas that people can use starting tomorrow in their Rails apps. Not without their problems. But the first of which is, I don't know if anybody's worked with to_formatted_s for date-time strings. And we have this idea in Ruby, don't we, of duck typing.
We have an object, and really good OO design says you shouldn't need to understand anything about that object. You just know it's got four methods on it. And it can be an account. It can be an invoice. It can be many different things. So my approach, and I'm testing this approach out at work at the moment, is: instead of having to_formatted_s, have to_formatted_h. What does that mean?
It means you're going to format the domain object as a hash. And so to_formatted_s allows you to pass in a symbol to define the kind of format that you want. So it can be short, ordinal, long, humanized, and it will output a string. It will output a stringified version of that date in these different formats.
So my idea is, why can't we have a method on every single domain object in our Rails app called to_formatted_h, and you pass it a format? That format could then be OpenTelemetry. It could be any number of things: short, compact. And so for every trace, the way I like to think of it is, I want to add into that trace every object that's related to it.
And you could format those in OpenTelemetry format, for example, or you could have a full format or a long format, whatever you want. And so that way you can say, oh, I just want to, I want a representation of the account that is short and it's just got the ID. And that's a totally minimal skeleton. And that's enough for me. But actually here, the work I'm doing is a bit more involved.
So I want to call to_formatted_h with full. And that will give the full account, like the updated_at, created_at, everything about it. And then that will be sent to my logs and traces. And I now have a standardized way of observing what's going on, with all the rich data of my app state at that point, with all the relevant domain objects in it.
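A sketch of what that `to_formatted_h` protocol might look like. To be clear, this is John's proposal, not an existing Rails API; the class, attributes, and format names (`:short`, `:full`) are all illustrative.

```ruby
# Mirror `to_formatted_s` on dates, but for domain objects: one
# duck-typed method, several named formats.
class Account
  attr_reader :id, :status, :created_at, :updated_at

  def initialize(id:, status:, created_at:, updated_at:)
    @id = id
    @status = status
    @created_at = created_at
    @updated_at = updated_at
  end

  def to_formatted_h(format = :short)
    case format
    when :short # minimal skeleton: just enough to join on
      { id: id }
    when :full  # everything about the object, for rich traces
      { id: id, status: status, created_at: created_at, updated_at: updated_at }
    else
      raise ArgumentError, "unknown format: #{format}"
    end
  end
end

account = Account.new(id: "acct_123", status: "pending",
                      created_at: "2024-09-01", updated_at: "2024-09-02")

p account.to_formatted_h(:short)
p account.to_formatted_h(:full)
```

Because every domain object answers the same message, the logging or tracing layer never needs to know what it is holding: it just calls `to_formatted_h` and gloms the result onto the current span.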
So that's my dream that I'm headed towards with this gem. So that's kind of the way I think about structuring it. And I think about the, like, people, I see people doing all this ad hoc kind of, well, this is an ID, and then we'll call the job ID, job underscore ID, I suppose. And what's the account? We can call that account underscore ID. And I just like to think of it as,
Imagine your domain object. So an account has a customer. A customer has some bank details. Bank details is a bad idea, but address maybe. And so we could have these different formats that load nested relationships or not. And obviously, you've got to be careful about the performance problems with that. And so you'll have the exact structure of your domain object in your logs, in your traces.
That, for me, is a dream. And then every single time an account is
is logged, it's in the same structure. Awesome. So I know that an account is always going to have an ID. It's always going to have whatever other attributes, an account pending status, whatever it is. And so therefore I can say, show me every trace where the account was pending. Boom. Yeah, I love that idea. And it does remind me a little of the introduction of the, uh,
you know, the new Rails, you know, logger, where you could tag... you know, the tagged logger was kind of a start to this idea of, okay, capture all of these pieces with this tag. And it's almost a pseudo trace, I call it. But it does go along with that formatting aspect of, like, okay, format all the things like this in a specific way.
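For readers who have not seen it: in Rails this is `ActiveSupport::TaggedLogging`. Here is a stdlib-only mimic of the idea, which shows why it feels like a pseudo trace: tags are pushed for the duration of a block, so they nest the way spans do. (This is a sketch, not the real Rails implementation.)

```ruby
require "logger"
require "stringio"

# Push tags onto a stack for the duration of a block and prefix every
# log line with the current tag stack, roughly what Rails' tagged
# logger does.
class TaggedLogger
  def initialize(io)
    @logger = Logger.new(io)
    @logger.formatter = proc { |_sev, _time, _prog, msg| "#{msg}\n" }
    @tags = []
  end

  def tagged(tag)
    @tags.push(tag)
    yield
  ensure
    @tags.pop
  end

  def info(message)
    prefix = @tags.map { |t| "[#{t}]" }.join(" ")
    @logger.info(prefix.empty? ? message : "#{prefix} #{message}")
  end
end

log_io = StringIO.new
log = TaggedLogger.new(log_io)

log.tagged("req_abc") do
  log.info("request received")
  log.tagged("TokenCreator") { log.info("token created") }
end

print log_io.string
```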
And I agree that there's definitely a lot to unwind there. Uh, we'll have to have you on more, uh, if you, you know, when you, uh, you know, put this together as a gem or something, cause, uh, I would love to dig into that. Um, I love the idea of the domain objects and extracting those out into a formattable way that you can then trace and follow through.
That design decision is definitely missed a lot. Seeing things like Packwerk as an example was a great step in the right direction, I thought. I'd like to see more of that evolve in the Rails ecosystem: abstracting the domains into their own kind of segments and then being able to format them for traceability and things like that. I think you're onto the right track. You're onto a lot here.
And then, I mean, the thing that I think is unbelievably ironic is all I'm talking about is convention over configuration. And is that not why we all got into Rails? I know Ruby is a different thing, but Rails is all about convention over configuration. and the entire area of observability, it strikes me, could do with a massive dollop of convention over configuration.
And that's what OpenTelemetry are trying to do. The one last thing, and I know that time is getting on, but one last thing I want to just say on that is the other huge opportunity is adding context to errors. So we have these exception objects in Ruby, and people store strings with them, and it's like,
what... how am I supposed to understand anything from a string? And then people try and put IDs in the strings, and you're like, no, stop. So at work I've made this extremely simple, um, basically a subclass of StandardError where you can attach context. So when you create the error, you pass in structured context. So if our logs are structured, surely our errors should be structured as well. Makes sense, right? So
You can say, this error happened, and here was the account associated with it when that error happened. And here's a user, and here's this. So it gets attached to the error. And then using Rails' new error handling, Rails.error.handle, if you've not used it before, look it up. It's absolutely awesome.
It's one of my favorite things that they've added to Rails recently, relatively recently in the last few years. And you can... basically have listeners to these events, to these errors, beg your pardon. It will catch the errors and then the context is encapsulated in the error. So you can pass these errors around and then you can do interesting stuff with that context.
And all I do is pull out all the context and send it straight into the logs. And that has absolutely changed the way I debug. Because whenever there's an error and it has all this rich data, you just look in the rich data and you're like, oh, that was the account. That was the Shopify ID. That was a product ID. I've got it.
And then you just look at the ID in your external system, and, oh, right, okay, it's out of sync, whatever it is. It makes life so much easier. So that's something I'm really passionate about as well, having domain objects encapsulated within errors. So we've got structured errors, not just structured logs.
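A minimal version of that structured-error idea: a StandardError subclass carrying a context hash, so the domain state travels with the error. The class name and fields here are illustrative; John pairs this with `Rails.error.handle` at work, while this sketch uses a plain begin/rescue so it runs anywhere.

```ruby
# An error that carries structured context instead of ids smuggled
# into a message string.
class ContextualError < StandardError
  attr_reader :context

  def initialize(message, context = {})
    super(message)
    @context = context
  end
end

def sync_product!
  # Attach the relevant domain objects at the raise site, where they
  # are cheap to gather and impossible to reconstruct later.
  raise ContextualError.new("product out of sync",
    account_id: "acct_123", shopify_id: "shp_99", product_id: "prod_7")
end

captured = nil
begin
  sync_product!
rescue ContextualError => e
  # Instead of parsing ids out of a message string, read them directly,
  # and forward the whole hash into your structured logs.
  captured = e.context
end

puts captured[:shopify_id]
```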
Yeah, I mean, that's definitely one thing that I look for when I'm looking for, you know, installing dependencies, right?
Like, does the gem have its own, you know, base error class that can then, you know, give metadata about whatever it's raising the errors about? More than just a string of some error that you then have to figure out. Having that extra metadata that you could just... because you can just add attributes to a class, right?
And say, this error has these attributes, like it, it has meaning associated with the error. I think more people doing that is definitely going to be making that easier to do, first of all. But yeah, and then also getting more people to take on that convention. I completely agree with you there. Yeah, I mean, we are getting at time here.
Are there any last pieces you wanted to quickly highlight or mention before we move into picks?
I think the main thing is if you're listening to this and anything that I'm saying is resonating, forget about the domain object stuff. That's like getting really into the nitty gritty. But coming back to the beginning, if you're frustrated by your debugging experience, if you're thinking, why am I not smart enough to understand this? Chances are the problem is not with you. It's with the tools.
So if you improve the tools, not only do you make your life easier and better, but you level up everybody around you, because all the engineers can use the same tools. That's what we've experienced at BiggerPockets. Observability has really worked its way into our culture, so that now anybody is equipped to go into the logs and ask any question that they want.
So it is a long road, but it all starts with a single step. If you are feeling that pain, feel free to reach out to me. I'll go through all my socials in a minute. Ask me any questions. I'm happy to jump on a Zoom call for half an hour and help you for free. But basically, it all starts by taking very small steps towards a very specific question.
Don't try to add "observability" wholesale, because you'll still be here next Christmas. So take heed. There is hope. And if anything that I say resonates, please feel free to reach out to me and I'll help you figure it out.
That's awesome. Yeah, I also echo that sentiment: tooling is so important. And OpenTelemetry definitely is a great framework, and if we can improve that in the Ruby space, we'll all be reaping the rewards as well. So let's move into picks. John, do you have anything that you want to share first, or do you want me to go?
Am I limited to one pick? Because I have many. No, go ahead. Cool. So the first one is a new language, and earlier I really thoroughly trounced the idea that we should be learning one programming language a year. Or rather, I just dismissed it without actually giving much justification.
So I'm going to go back on what I just said and say that this language has changed the way I think pretty much forever. And it's changed the way I see Ruby and Rails and just programming in general. And the language is called Unison. Now, it's a very, very strange, unusual language. It's maybe not that readable in places. And it's also extremely new.
I mean, it's been going for five or six years, but what they're trying to do is incredibly ambitious. But look it up. Yeah, it's an incredibly interesting language, and it will expand your mind. And that's certainly what it's done for me. And so it's kind of a language that's targeted at creating programs that are just much, much simpler, but actually more difficult to get your head around.
It's a completely new paradigm for distributed computing, basically. And it's absolutely fascinating. So I would highly suggest checking that out. When I spoke at EuRuKo recently, Dave Thomas was on stage championing Unison, and he called it the future of programming. I could not agree more. It's an incredible language made by some incredibly smart people.
So that's number one. Number two is a static site builder. I've used pretty much all the static site builders on planet Earth, and this is my favorite. It's called Eleventy. It's a really odd name. But I am embarking upon this project at work
that really is exciting me, which is: how do you serve UI components from a dynamic app, so Rails, and meld them into a static site builder without a pile of JavaScript that you have to wade through? I want to author my UI components in Rails, and I want to deliver them extremely fast through a static site that's just a blog, without having to run that blog on Rails. So Eleventy
is my go-to tool for doing all that stuff. It also encompasses this thing called WebC, which is my new favorite templating language. Yes, I know, another templating language. I promise it's really good. It's not another retread of all these other templating languages that are very, very niche. WebC is compatible with Web Components,
And it's a fantastic way of making HTML-like components that are server-side rendered. And I would love to see a plugin for that come to Rails because it is absolutely phenomenal. So those are my two favorite things at the moment.
If anybody's trying to wrestle with UI components in Rails and trying to extract them out of Rails, I'd also love to chat through that with anybody who's interested in that kind of stuff, because I think there's the potential to really break new ground. How about you?
Yeah, thanks. I'll definitely be digging into some of those. Yeah, I was in New York City the other day for the Ruby AI happy hour that they've been doing every couple months. This time they did demos, and I demoed this real-time podcast demo
buddy that I've made. It's called Podcast Buddy, and it just kind of listens in the background and, in real time, keeps track of the topics and the discussions, and some example questions worth mentioning, or maybe some topics to transition to. It's a lot of fun. I just did it for fun, but I recently refactored it to use the Async framework.
And shout out to Samuel Williams. Just phenomenal, so well put together. The documentation is coming along, it is lacking in some areas, but I was able to just completely refactor the code so that it works with Async and runs things as they come in. It's streaming the Whisper transcripts, and it performs actions in the background, just
like in the same thread, all managed with Async. I love it. So check out Podcast Buddy, and check out Async. You can't go wrong. There's also async-websocket, so now you can handle even WebSockets asynchronously, completely seamless, HTTP/2 and HTTP/1 compatible. Love it. So check those out.
And John, if people want to reach out to you on the web or just in general, how can they reach you?
Thank you. Yes. So I'm on LinkedIn. That's the platform I'm most active on. And my LinkedIn handle is Synaptic Mishap, which... yeah, I really regret that. Sorry, everybody. But if you just search for John Gallagher, G-A-L-L-A-G-H-E-R, and maybe Rails or observability, you should be able to find me. I've got quite a cheesy photo, a black and white photo of me in a suit.
It's a horrible photo. And I blog at joyfulprogramming.com. It's a Substack, so is it still a blog? I have no idea, but that's where I write. I'm on Twitter at Synaptic Mishap, and my GitHub handle is John Gallagher, all one word. So, yeah, Joyful Programming is the main source of goodies for me. I've also got a fairly minimal YouTube channel, also called Joyful Programming.
So feel free to reach out to me, connection request me, ask me any question. I would love to engage with some Ruby folks about observability. Tell me your problems and I'll try and help you wherever I can.
Awesome. I love it. Keep up the great work, and keep, you know, shouting from the mountaintop about observability, pulling those pillars down and just focusing on the important stuff, right? I love it. So until next time, everybody, I'm Valentino. Thanks, John, for coming on, and I look forward to next time.
Thanks for having me, Valentino. It's been amazing.
Awesome.