The Haunted House of APIs

Today, we are releasing another episode for Cybersecurity Awareness Month, in our series entitled The Haunted House of APIs, sponsored by our friends at Traceable AI. In this series, we are building awareness around APIs, their security risks, and what you can do about them. Traceable AI is building One Platform to secure every API, so you can discover, protect, and test all your APIs with contextual API security, enabling organizations to minimize risk and maximize the value APIs bring to their customers.

The Haunted Web: Navigating API Sprawl and Creepy Crawlers

Today's episode is titled The Haunted Web: Navigating API Sprawl and Creepy Crawlers, with Traceable's Chief Security Officer, Richard Bird. As organizations scale and evolve, so does the complexity of their APIs. API sprawl, the uncontrolled expansion of APIs, creates a tangled web where vulnerabilities linger in the shadows. These unseen APIs become the "creepy crawlers" of your digital infrastructure, creeping through your systems and posing security risks.

Richard will discuss how unmanaged and undocumented APIs contribute to blind spots in security, the risks they create for organizations, and the best strategies for securing a sprawling ecosystem.

Discussion questions:
Can you explain what we mean by "unknown APIs" and the different types, like shadow, rogue, zombie, and undocumented?
Why do these APIs often go unnoticed, and how do they become security risks?
What makes these APIs such an attractive target for attackers, and can you share an example of how one has been exploited?
How can organizations begin to uncover these hidden APIs, and what tools or strategies are effective in doing so?
In your experience, what are some common mistakes organizations make that lead to these unknown APIs being created or overlooked?

Sponsors
Traceable

Links
https://www.traceable.ai/
https://www.linkedin.com/in/rbird/
https://richardbird.com/
Hello, listeners. Today, we are releasing another episode for Cybersecurity Awareness Month as part of our series, The Haunted House of APIs, sponsored by our friends, Traceable. In this series, we are building awareness around APIs, their security risks, and what you can do about it.
Traceable AI is building one platform to secure every API so you can discover, protect, and test all your APIs with contextual security, enabling organizations to minimize risk and maximize the value APIs bring to their customers. Today's episode is titled The Haunted Web, Navigating API Sprawl and Creepy Crawlers with Traceable's Chief Security Officer, Richard Bird.
As organizations scale and evolve, so does the complexity of their APIs. The uncontrolled expansion of APIs creates a tangled web where vulnerabilities linger in the shadows. These unseen APIs become creepy crawlers of your digital infrastructure, creeping through your systems and posing security risks.
Richard will discuss how unmanaged and undocumented APIs contribute to blind spots, the risks they create for organizations, and the best strategies for securing a sprawling ecosystem. Well, Richard, thank you for being on the show today. It's great to be here.
Before we jump into our topic for today, which is the haunted web, navigating API sprawl and creepy crawlers, tell me and my audience a little bit more about you.
I'm a longtime technologist, 30 years this year. I don't feel that old. I'm the chief security officer for Traceable, and I've been on the startup and solution side for about six years now. I spent more than 24 years in the corporate world, where I was an executive over a number of different things.
All I like to say is it took me 25 years of work in the corporate world to become an overnight sensation in the startup world. So if you're willing to put the work in for a quarter of a century, you can be recognized for being good at almost anything. I met my now wife about eight years ago. We looked at each other and said we both love music.
I had been a young dad, so I hadn't been in the music scene for 25 years. In fact, I always like to tell people I saw Red Hot Chili Peppers in Columbus, Ohio in 1985 or 86. We looked at each other and said, who are we going to go see? And that's like asking your spouse, where are you going to go to dinner? And we looked at each other and said, have you ever been to a music festival? He said, no.
We're some 55 music festivals later now. So that for me is fun, number one. Fun number two is hopping in our van and going to any national park, going to any trailhead and hiking for as long as we're able to and hiking back out. I keep myself busy.
Well, awesome. Let's dive into our topic today. Again, the title is The Haunted Web, Navigating API Sprawl and Creepy Crawlers. So what is API Sprawl and why is it such a significant security challenge for organizations?
I think if you want to know what API sprawl is, and we're going to stay thematic with the creepy crawlers, watch The Last of Us, a virus, a thing propagating out of control and representing a threat to everybody. When we look at APIs, in the last dozen years, APIs have been used to create massive amounts of business value, but with very little to no security oversight.
The reason for that historically is that the goal was to find ways to get applications, particularly in the cloud, to communicate with each other without having to build all of these really heavy integration points that we used to do back in the old data center and application days. And so as soon as folks realized how they could use those APIs, they started doing it like crazy, right?
And they didn't have any security tools. They didn't have any guidance. An example of this is you can walk into any large company today and there'll be 30 organizations within that company that are developing APIs. And they're not using any standard protocols. They're using GraphQL. They're using SOAP. They're using REST. They're using all of these different language types.
And so now you think about, okay, what can that kind of sprawl create in terms of problems? Go back to The Last of Us, right? I don't know. What kind of problems can self-propagating technologies create once they're out in the wild besides making more of themselves? There are more and more APIs that are being built without these oversight components in place.
And I think it's always really important to point out, API sprawl is not a security problem. API sprawl is an operational problem. It only becomes a security problem when it is a security problem. And when somebody finds an exploitable API that's in this massive mess of APIs that have been created, now the bad guy can just simply take a pathway using that one exposed API.
And it wasn't sprawl necessarily that caused it. It was all of the lack of discipline and control that happens when you have a sprawl. I build it without authentication. I build it without the necessary safeguards. I build it and I put private information in it when I'm not supposed to.
These are all characteristics of behaviors that we see in the market today, and the scale of it is just staggering. We have a new API security report that's coming out. And 57% of the organizations that we've talked to have suffered an API breach in the last two years. And of those, 73% have had at least three. And 41% of them faced five or more API breaches just in the last two years.
And that is the consequence of what happens when API sprawl is allowed to continue uncontrolled and unchecked.
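To put those survey percentages on a common footing, the short sketch below converts the "of those" figures into shares of all respondents. It assumes, which the conversation does not state outright, that the 41% figure is also a share of the breached group rather than of everyone surveyed; the variable and function names are illustrative only.

```python
# Survey figures quoted in the conversation, expressed as fractions.
breached = 0.57        # organizations with an API breach in the last two years
three_or_more = 0.73   # share of the breached group with at least three breaches
five_or_more = 0.41    # ASSUMPTION: also a share of the breached group


def share_of_all(conditional: float, base: float = breached) -> float:
    """Convert a share of the breached group into a share of all respondents."""
    return conditional * base


share_three = share_of_all(three_or_more)
share_five = share_of_all(five_or_more)

print(f"At least three breaches: {share_three:.1%} of all respondents")
print(f"Five or more breaches:   {share_five:.1%} of all respondents")
```

Under that reading, roughly four in ten of all surveyed organizations had at least three API breaches in two years.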
So that's a large number of attacks suffered by these businesses in the last couple of years. And it makes sense what you're saying about sprawl and how sprawl isn't necessarily a technology problem. It's an operations problem. So in that operation, how do APIs become part of this sprawl?
And I think you touched on it at a high level, but I'm curious, why do security teams often lose track of these APIs?
It goes back to the beginnings, which is security teams had no responsibilities or obligations to observe, manage, or secure APIs to begin with. When you look at organizations today, API creation definitely doesn't belong to security. It belongs to DevOps. When you look at remediation, say a vulnerable API that was found in testing, security people aren't developers anymore.
So you see a lot of tension in those organizations around mitigating or remediating the risk or the vulnerability that is associated. And so we really are living in a world where almost all of the traffic, like 75, 80 percent of the daily Internet traffic in the world is APIs. And we have security organizations that have been kept out of the equation for years and years.
And then we have an accelerating growth curve of APIs being developed. And we have a much slower curve of security organizations catching up. I always like to call that the time machine. When one curve, API creation, is growing exponentially faster than another curve, API security, is growing, you literally would be better off not doing anything because you're so far behind.
Now, obviously, that's not the right security answer. But it's the mathematical part of this problem.
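The arithmetic behind that two-curve argument can be sketched in a few lines. Every number here is hypothetical (the starting counts and both growth multipliers are assumptions chosen only for illustration); the point is that when the API count compounds faster than the secured count, the secured share shrinks every year even though the security team is making progress in absolute terms.

```python
def coverage_over_time(years, api_growth, security_growth,
                       apis0=1000.0, secured0=500.0):
    """Return (year, total_apis, secured_apis, coverage) rows.

    All inputs are hypothetical. Each count is multiplied by its own
    yearly growth factor; coverage is secured / total.
    """
    rows = []
    apis, secured = apis0, secured0
    for year in range(years + 1):
        covered = min(secured, apis)  # cannot secure more APIs than exist
        rows.append((year, apis, covered, covered / apis))
        apis *= api_growth
        secured *= security_growth
    return rows


# Hypothetical scenario: APIs triple yearly, secured APIs grow 50% yearly.
rows = coverage_over_time(years=4, api_growth=3.0, security_growth=1.5)
for year, apis, covered, coverage in rows:
    print(f"year {year}: {apis:>8.0f} APIs, {covered:>6.0f} secured, "
          f"coverage {coverage:.1%}")
```

In this made-up scenario the secured count grows every year, yet coverage halves every year, which is the gap being described.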
If API use and API componentry continue to grow at an exponential rate, and any study that you see will suggest API usage is growing anywhere from 3x to 7x a year, but API security is still a cognitive dissonance gap within an organization where people are arguing about whether I need API security because I have a web application firewall in place, then you can see where the trend is going, which is even more sprawl, even less security and guardrail control. And then probably more importantly, within the DevOps side of the equation, nobody's in charge of APIs, right? On the operational side, API ownership is fragmented across all the organizations that are developing them. So there's no head of API governance.
There's no head of API compliance and control. APIs haven't been looked at that way historically. And that will change. Inevitably, catastrophic consequences will change behaviors in that space. But it is the biggest gap I have ever seen. And I've said this now for more than two years. It is the biggest gap I've ever seen in a situation where people go, yes, I know I have an API security problem.
But no, I'm doing absolutely nothing about it. And that really is where the market is currently sitting for the most part. There are a lot of very mature and evolved API users in the corporate world that recognize the scale and size of this threat. They are definitely moving down the path, but that is a very small percentage of the overall Fortune 2000, Fortune 3000 landscape.
Sure. It's fascinating that there's so many APIs out there. It's the bedrock of applications, yet there's a small percentage of people that are managing them correctly. You said a term earlier, the creepy crawlers, right? And sticking with the theme, what are they within an API ecosystem and how do they contribute to security blind spots?
Those creepy crawlers are definitely the APIs that are engineered to exchange information without a tremendous amount of oversight. And this is really interesting because I think we're in a time right now where so much of the attention is being put on catalog and discovery, on creating an inventory or directory of all the APIs that we're exposed to.
That focus tends to orient people towards their old line technology providers where they go, oh, I've got a CDN. I've got a web application firewall. And they should know because all the APIs go over those channels. The estimate is somewhere around 30% of your API traffic in any large enterprise actually goes through those connectivity points. So now you've got 70% that you can't see.
There's your creepy crawlers. You've got 70% that are interacting with each other across these applications and are also finding or being built with pathways out of your organization that either bypass or just functionally ignore those web application firewall tools and those CDN tools. Now you've got this really interesting space where you don't know exactly what the API is doing.
You don't know exactly what it's supposed to be doing. You definitely don't understand how it's currently behaving. And in the meantime, information, revenue, reputation are leaking out of whatever access pathway that API is being directed to, to push things out externally or receive things internally.
So the creepy crawlers are really all the things that you don't know about in your environment that are associated with APIs that are not in any kind of channel where you can see them.
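One way to picture that visibility gap is as a set difference: endpoints observed anywhere in the environment minus endpoints that show up at the WAF or CDN. The sketch below is only an illustration of that idea, not how any particular product performs discovery; the function name and every endpoint path are hypothetical, and the sample data is sized to mirror the rough 30/70 split mentioned above.

```python
def find_unseen(gateway_seen: set[str], internal_seen: set[str]):
    """Return endpoints the edge tools never saw, plus the blind fraction."""
    all_endpoints = gateway_seen | internal_seen
    unseen = internal_seen - gateway_seen
    blind_fraction = len(unseen) / len(all_endpoints) if all_endpoints else 0.0
    return unseen, blind_fraction


# Hypothetical inventories: what the WAF/CDN logs show versus what
# internal traffic observation shows.
gateway_seen = {"/v1/login", "/v1/orders", "/v1/profile"}
internal_seen = {
    "/v1/orders",
    "/internal/billing-export",
    "/internal/user-dump",
    "/internal/metrics-push",
    "/internal/report",
    "/v0/legacy-sync",
    "/v1/debug",
    "/v2/partner-feed",
}

unseen, blind_fraction = find_unseen(gateway_seen, internal_seen)
print(f"{len(unseen)} endpoints never cross the edge tools "
      f"({blind_fraction:.0%} of the known total)")
for path in sorted(unseen):
    print("  unseen:", path)
```

The inventory built from edge tooling alone would list three endpoints here and miss the other seven.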
That's clear. And I see the problem and how that can be obviously a major problem. And you live this space on a day-to-day basis. Can you share an example of an incident where sprawl contributed to a major breach or to a vulnerability?
There's two very precise ones that have received a lot of publicity. First of all, they resulted in tens of millions of customer records being lost. One was one of the largest mobile carriers in the world. And the other was a healthcare services organization.
And in both of those cases, I think this is such a powerful example of why so many people in the survey that we presented said their current technology is so ineffective in finding these API exploits. The reason that these particular breaches were successful was because at some point an API was taken out of production, an API that was already resident.
So now the argument that I'll catch that API in testing is completely irrelevant, right? These APIs were already there because there are already tens of thousands of APIs out in the wild that aren't going to go through this whole dev lifecycle thing. And those APIs in both of those cases, those APIs were taken out of production.
They were fixed, tuned, changed in some way, shape, or form. And a developer put them back into production. And in doing so, there were no development lifecycle management practices put over that. It was like, hey, go fix that API and then go put it back in production.
And in both of those cases, the developers forgot to reinstate encryption on the endpoints that were associated with those APIs. So now you have this creepy crawler. You've got an API that you thought you knew what it was supposed to be doing, but it's now doing something it wasn't supposed to do: expose a publicly open endpoint to a bot army that was fired off by bad actors who look for open API endpoints that are missing encryption. And then they found it. And as soon as they found it, they executed the moves that were necessary to go exfiltrate tens of millions of customer records, not just name and address, but like in the case of the mobile carrier, what your payment record was.
The reason why that's so important to the bad guys is because the most valuable thing on the dark web is a phone number with a confirmed live user on the end of it. And now all of that stuff was exposed that said, hey, you're a current customer and you pay regularly like you're supposed to. Bet you that's going to be somebody on the other end of that line that I can scam or that I can exploit.
The last thing that's most important about those two breach examples is no web application firewall, no CDN on the planet could catch that. And the reason is because that API looked like it was supposed to be doing what it was supposed to be doing. And the context of the information about the need and requirement for encryption to be on that endpoint simply did not exist in the system.
If you aren't controlling an API's encryption from an observation standpoint, you know it's supposed to have encryption, and it's been put back into production, and now it doesn't have encryption. If you're not controlling at that level of fine-grained granularity, there is no possible way for today's current technologies to catch those breaches. Wow. That's crazy is what that is. It is.
I can't disagree with you on that. And the one thing that I can tell you that I did see before was the mistakes that we made 20 years ago in forgetting to put encryption on an actual physical firewall and all the bad things that happened from that. So this isn't new. It's creepy crawlers, but it's a remake of a movie that we've seen before. It's a remake of The Living Dead.
It's a remake of any number of scary scenarios that we have seen in security before. The only difference may be volume and speed, but it doesn't make it different from a contextual standpoint. It just means that we've got to have technologies that can also operate at that kind of massive scale and that kind of speed in order to be successful against the bad guys.
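The failure mode in those two breaches, an API returning to production without a control it used to have, can be expressed as a comparison between the expected state of an endpoint and the state observed after redeployment. The sketch below assumes such records exist somewhere in the organization; the `EndpointState` type, its fields, and the endpoint paths are hypothetical and stand in for whatever a real platform would record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointState:
    path: str
    tls_required: bool
    auth_required: bool


def regressions(expected: list[EndpointState],
                observed: list[EndpointState]) -> list[str]:
    """List controls that were expected but are missing after redeploy."""
    observed_by_path = {e.path: e for e in observed}
    findings = []
    for want in expected:
        got = observed_by_path.get(want.path)
        if got is None:
            continue  # endpoint is not currently deployed
        if want.tls_required and not got.tls_required:
            findings.append(f"{want.path}: encryption no longer enforced")
        if want.auth_required and not got.auth_required:
            findings.append(f"{want.path}: authentication no longer enforced")
    return findings


# Hypothetical record of what each endpoint is supposed to enforce.
expected = [
    EndpointState("/v1/customers", tls_required=True, auth_required=True),
    EndpointState("/v1/status", tls_required=True, auth_required=False),
]
# Hypothetical observation after a fix was pushed back to production.
observed = [
    EndpointState("/v1/customers", tls_required=False, auth_required=True),
    EndpointState("/v1/status", tls_required=True, auth_required=False),
]

findings = regressions(expected, observed)
for finding in findings:
    print("REGRESSION:", finding)
```

The check only works if the expectation (this endpoint must be encrypted) is recorded somewhere, which is the missing context described in the conversation.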
Okay, so clearly this is a problem. It's happening. We see it happening for the big players in the space. What can we do about it? What sort of strategies or tools can organizations use to manage and secure their API ecosystems as they grow and scale? How can they keep up with this?
There's an answer to that before technology. And the answer to that initially is the old seven steps answer, which is, hi, I'm Richard Bird and I have an API security problem. A, you have to admit that you have a problem to begin with.
And that sounds a bit trite, but the reality is in today's market, a large number of companies who have built the internet-enabled world that we are riding on today, they had on average nearly 18 or 19 years since the rise of those technologies to address the API space from a security standpoint. And they didn't. They didn't build the fine-grained capabilities.
They didn't build a catalog and discovery capability that takes into account the entire organization's digital footprint, but only the things that moved across their channel. And that's resulted in a lot of people in leadership and companies today going, this solution provider I've had for the last eight or nine years or 10 years has said they do it.
And this other solution provider that we've worked with for years has said they can collect off of other gateways or WAFs. That leads to the second piece when it comes to API security tooling: you have to look at the reality of a next-gen set of capabilities, because the first and second gen have shown themselves incapable of applying the necessary level of granularity and context to achieve API security. So the first is admitting you have a problem.
The second, truthfully, from a pure outcomes standpoint: evaluate your current tool base and recognize that in every API security breach of the last decade, six years, every one of those organizations had a web application firewall or CDN in place. So why did they get breached if those technologies are now telling them we could have stopped that or we can stop that?
And then the next step is to move into where API security is actually happening today from a startup and solution standpoint, which is in the API security platform space, and recognize that this is a holistic effort, not a point solution. It's not enough to know all the APIs that you have. You need to understand the risk and criticality of those APIs.
It's not enough to test those APIs, say, on the AST DevOps side of the equation. You need to be able to address the current vulnerabilities and risk associated with the APIs that have been in production in your organization for years. That's threat and vulnerability management. It's not enough to understand signature attacks from a tooling standpoint.
You have to have a platform that has the capability to divine and understand unknown unknowns because it's comparing known normal of an API, what that spec is, to how that API is being abused and used for bad purposes.
And unless you understand the delta between those two, then you're always going to be relying on somebody giving you vulnerabilities in the old kind of Symantec, AVG way of giving you a subscription list, as opposed to finding those exploits and vulnerabilities without having to sign up for all of that research feed. And then I think finally, you have to look at an API security tool from the standpoint of what will come next, which is a move into runtime protection, where a signal will be taken off of that intelligent engine that's comparing normal to abnormal. And then that signal will be passed to an application, to a microservice, to any number of other ways that APIs are used, where security will be invoked in that moment, and it doesn't go through some kind of policy creation, some firewall somewhere, some other type of kludgy method to try and protect an organization with a sledgehammer by doing an IP block, but you'll be able to use a surgical scalpel to address the actual weakness that's manifesting.
The short answer to the long answer I just gave is that the API security platform you need to be looking for needs to be answering all those questions, because APIs operate across that entire infinite loop lifecycle. They aren't just one and done. And so you need to be able to actually address security across the entirety of an API's existence.
And not just one, but the hundreds and the thousands and tens of thousands you're exposed to.
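The idea of comparing an API's known-normal behavior against how it is actually being used can be sketched as a baseline check over observed calls. This is a deliberately small illustration under stated assumptions, not a description of how Traceable or any other platform works: the `Baseline` and `Call` types, the two checks, and all sample values are hypothetical. Findings are keyed by client so that a response could target the specific caller rather than a blanket IP block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    methods: frozenset[str]  # HTTP methods the spec allows
    max_records: int         # most records a normal response returns


@dataclass(frozen=True)
class Call:
    path: str
    method: str
    records_returned: int
    client_id: str


def deviations(baselines: dict[str, Baseline],
               calls: list[Call]) -> list[tuple[str, str]]:
    """Return (client_id, reason) for each call that departs from baseline."""
    out = []
    for call in calls:
        base = baselines.get(call.path)
        if base is None:
            out.append((call.client_id, f"{call.path}: no baseline, unknown API"))
            continue
        if call.method not in base.methods:
            out.append((call.client_id,
                        f"{call.path}: method {call.method} not in spec"))
        if call.records_returned > base.max_records:
            out.append((call.client_id,
                        f"{call.path}: returned {call.records_returned} records, "
                        f"baseline max is {base.max_records}"))
    return out


# Hypothetical baseline and traffic sample.
baselines = {"/v1/customers": Baseline(frozenset({"GET"}), max_records=50)}
calls = [
    Call("/v1/customers", "GET", 20, "app-mobile"),
    Call("/v1/customers", "GET", 40000, "client-x"),
    Call("/v1/customers", "DELETE", 0, "client-x"),
    Call("/v0/legacy-sync", "POST", 5, "client-y"),
]

flagged = deviations(baselines, calls)
for client_id, reason in flagged:
    print(f"{client_id}: {reason}")
```

No signature or vulnerability feed is involved here; the bulk read is flagged only because it exceeds what the baseline says is normal for that endpoint.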
That all makes sense to me, Richard. And I really appreciate you being on the show and explaining this because the sprawl is real.
The problem with API sprawl is showing up out in the wild for the big guys, and the three-step process you're describing of admitting you have a problem, understanding that problem, and finding the right tool to really help you with continuous discovery, documentation, and automated monitoring, and to take you forward towards the next step, is really, really critical.
So Richard, I appreciate you being on the show today. No, I had a blast. Truly enjoyed it. Thanks for having me. And this concludes The Haunted Web, Navigating API Sprawl and Creepy Crawlers with Richard Bird. Stay tuned for more episodes in our series, The Haunted House of APIs. And if you'd like to learn more about Traceable, go to traceable.ai. That's traceable.ai.
And thanks again for listening.