Paul Zaich

👤 Person
168 total appearances

Podcast Appearances

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

So one thing I've been experimenting with is trying to create more automated reports that go into sort of a Slack channel that we can look at. And so people can review that. And we've also implemented basically a bi-weekly review during our retro where we just look at our metrics and some of the longer running trends so that we can see if those look correct. Is there anything that's wrong?

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

We can talk about it, see if there's things that we want to actually action on based on that review. So we're trying to find some ways to do check-ins that don't require us to be all in the office together.

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

We have implemented what I consider custom metrics. We use Datadog. So a lot of this is out of the box. You can use their implementation, but you're adding some code to specific parts of your application. Maybe it's a callback on your Active Record model. When something is created, you send a message to a queue, and then that triggers a message into StatsD that goes to Datadog.
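
The flow described here (a model callback emits an event that ends up as a StatsD metric) can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: `TinyStatsd`, `Order`, and the metric name `orders.created` are hypothetical, the intermediate queue is left out for brevity, and a real application would more likely use Datadog's own client library. The datagram uses the StatsD counter format with DogStatsD-style tags.

```ruby
require "socket"

# Hypothetical minimal StatsD client: formats a counter and sends it over UDP.
class TinyStatsd
  def initialize(host: "127.0.0.1", port: 8125)
    @host = host
    @port = port
    @socket = UDPSocket.new
  end

  # Builds a counter datagram, e.g. "orders.created:1|c|#source:api".
  def self.counter_datagram(name, value = 1, tags: {})
    line = "#{name}:#{value}|c"
    line += "|#" + tags.map { |key, val| "#{key}:#{val}" }.join(",") unless tags.empty?
    line
  end

  # Fire-and-forget: a metrics failure should never break the request.
  def increment(name, tags: {})
    @socket.send(self.class.counter_datagram(name, 1, tags: tags), 0, @host, @port)
  rescue SystemCallError
    nil
  end
end

# Stand-in for an Active Record model with an after-create callback.
class Order
  def initialize(statsd)
    @statsd = statsd
  end

  def save
    # ... persistence would happen here ...
    after_create
    true
  end

  private

  # The "callback": record that an order was created.
  def after_create
    @statsd.increment("orders.created", tags: { source: "api" })
  end
end

Order.new(TinyStatsd.new).save
```

Because UDP is connectionless, the send returns immediately whether or not an agent is listening, which is part of why this kind of instrumentation stays lightweight.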

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

Anyways, it's a pretty lightweight implementation in terms of what you can do, but you're adding specific events that you want to track. And then you can create your own monitors and alerting around those or correlations between different events in your system.

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

So you could potentially look at a custom metric and then look at that compared to HTTP statuses that are coming through or the latency of an endpoint. And then you could correlate those two metrics as well. So there's some more advanced things you can do there as well if you need to. But again, it's not really a lot of custom work.
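
In Datadog this kind of comparison is normally done on dashboards, but the underlying idea can be illustrated with a small calculation. The sketch below computes a Pearson correlation coefficient between two equally sampled series; the function name `pearson` and the sample data are hypothetical and not taken from the episode.

```ruby
# Pearson correlation between two metric series sampled at the same timestamps.
# Returns a value between -1.0 and 1.0.
def pearson(xs, ys)
  raise ArgumentError, "series must have the same length" unless xs.length == ys.length
  raise ArgumentError, "need at least two points" if xs.length < 2

  n = xs.length.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n

  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  spread_x = xs.sum { |x| (x - mean_x)**2 }
  spread_y = ys.sum { |y| (y - mean_y)**2 }

  # Convention chosen for this sketch: a flat series has no correlation.
  return 0.0 if spread_x.zero? || spread_y.zero?

  covariance / Math.sqrt(spread_x * spread_y)
end

# Hypothetical per-minute samples: custom event count vs. endpoint latency (ms).
events_per_minute = [120, 118, 90, 40, 5]
latency_ms        = [80, 85, 140, 400, 900]

puts pearson(events_per_minute, latency_ms).round(2)
```

A strongly negative value for data like this would suggest that the custom event dries up as latency climbs, which is the kind of relationship worth turning into a monitor.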

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

It's just adding some specific points in your code base that you feel like are really important to track. And one example of this for Rails users is, I believe there's something like this already set up for Datadog for Sidekiq. So we instrument it on a lot of our

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

Sidekiq jobs, and we can see when the lag is growing on one of those queues. We can see what the average completion time is and look at the p90 completion time for different types of jobs. So you get a lot of visibility into your Sidekiq workers and processes very easily, basically for free.
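
To make the terms concrete, here is a small sketch of how average completion time, p90 completion time, and queue wait could be derived from raw job timings. It does not use the Sidekiq or Datadog APIs; `JobRun`, `percentile`, and `summarize` are hypothetical names, timestamps are plain numbers of seconds, and the p90 uses the nearest-rank method (monitoring tools may interpolate differently).

```ruby
# Hypothetical record of one finished background job (times in seconds).
JobRun = Struct.new(:job_class, :enqueued_at, :started_at, :finished_at) do
  # How long the job took to run once a worker picked it up.
  def duration
    finished_at - started_at
  end

  # How long the job sat in the queue before starting (the "lag").
  def wait
    started_at - enqueued_at
  end
end

# Nearest-rank percentile: the smallest value with at least `percent`
# percent of the samples at or below it.
def percentile(values, percent)
  raise ArgumentError, "values must not be empty" if values.empty?

  sorted = values.sort
  rank = (percent * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Per job class: average duration, p90 duration, and worst queue wait.
def summarize(runs)
  runs.group_by(&:job_class).transform_values do |group|
    durations = group.map(&:duration)
    {
      average: durations.sum / durations.length.to_f,
      p90: percentile(durations, 90),
      max_wait: group.map(&:wait).max
    }
  end
end

# Ten hypothetical runs taking 1..10 seconds, each waiting 2 seconds in the queue.
runs = (1..10).map { |seconds| JobRun.new("MailJob", 0, 2, 2 + seconds) }
p summarize(runs)
```

The p90 is more useful than the average for spotting a slow tail: a handful of very slow jobs barely moves the mean but shows up clearly in the upper percentile.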

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

Just to be clear, we capture all of our errors in Sentry. We do have some alerting that goes to Slack, but I would also want to emphasize that anything that truly has any chance of being a serious issue should never be either an email or a Slack alert.

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

You really should have some kind of escalation via either maybe it's text, maybe it's an actual incident response system like PagerDuty where you can have an escalation policy. For us, that's what we're using. It should have this synchronous alerting that really forces someone to look at it. You can't rely on something asynchronous like Slack in this case for serious response on issues.
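
The distinction between asynchronous notification and escalation can be sketched as two small decisions: which channel an alert goes to, and who should have been paged by a given point if nobody has acknowledged it. This is an illustrative model only, not the PagerDuty API; the step names, the timings, and the functions `channel_for` and `responders_to_page` are all assumptions.

```ruby
# One step in a hypothetical escalation policy.
EscalationStep = Struct.new(:responder, :after_minutes)

# Assumed policy: page the primary at once, then widen the circle
# if the incident stays unacknowledged.
POLICY = [
  EscalationStep.new("primary on-call", 0),
  EscalationStep.new("secondary on-call", 15),
  EscalationStep.new("engineering manager", 30)
].freeze

# Anything that might be serious gets a page; everything else can go to chat.
def channel_for(severity)
  severity == :critical ? :page : :slack
end

# Everyone who should have been paged after the given number of
# minutes without an acknowledgement.
def responders_to_page(policy, minutes_unacknowledged)
  policy
    .select { |step| step.after_minutes <= minutes_unacknowledged }
    .map(&:responder)
end

puts channel_for(:critical)
puts responders_to_page(POLICY, 20).join(", ")
```

The point of the policy list is that silence is itself treated as a signal: an unacknowledged page keeps reaching more people instead of scrolling out of view in a channel.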

Ruby Rogues
The Sounds of Silence: Lessons From an API Outage with Paul Zaich - RUBY 652

You can actually do that, I believe, at least with iOS. You can set up an override where you snooze everything else, and then you just have to put whatever numbers you think you're going to receive critical notifications from in your personal contacts. And then that'll actually ring through.
