API Catalog: The First Step in Protecting your APIs
Dan Gordon is the DevSecOps Evangelist at Traceable AI. When we think of API security, many aspects must be considered: you need to understand your security posture, protect your APIs, gain security insights, and develop secure APIs. This article by Dan covers the first step in protecting your APIs.
An API catalog is a concept from API management. It is about knowing your assets: understanding what you have, tracking it, and making it available so that developers and teams that might use those APIs can find them and avoid duplicating work. From an API management perspective, a catalog may also cover risk and change management. But API catalogs have a bigger role to play in security.
Why has the API catalog become critical?
APIs connect everything. Thousands of APIs run across multiple clouds, and very few software applications do not use APIs or are not API-based. That means we have many APIs to protect. Moreover, APIs grow and change constantly, so the way we watch and manage them must keep pace. A static catalog, or point-in-time checks of your catalog and inventory, will not cut it.
Most companies do not have visibility into how many APIs they have, where they reside, and what those APIs are doing. API sprawl creates unknowns, exposure, a larger attack surface, and additional risk.
Solving API sprawl
To solve API sprawl, start with a single source of truth that tracks all your APIs through automatic discovery. If it is static, it will not work for long, because APIs are constantly changing. We also need accurate documentation: it lets our tools talk to each other, helps teams understand what they have and what they are working with, and lets us see what is in the APIs and what is changing in them. Last but not least, understand the connectivity chains and the upstream and downstream communications. That is what lets you look over time, understand what is happening, and spot patterns and anomalies in those patterns that indicate an attack or a compromise.
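As a minimal sketch, a single source of truth might track one record per endpoint, including its exposure and its upstream and downstream dependencies. All field names here are illustrative assumptions, not taken from any specific product:

```python
from dataclasses import dataclass, field

# Hypothetical catalog record; the field names are illustrative assumptions.
@dataclass
class ApiCatalogEntry:
    method: str                 # HTTP method, e.g. "GET"
    path: str                   # endpoint path, e.g. "/v1/users/{id}"
    service: str                # owning service
    exposure: str               # "external", "internal", or "third-party"
    upstream: list = field(default_factory=list)    # services that call this API
    downstream: list = field(default_factory=list)  # services this API calls on
    last_seen: str = ""         # timestamp of the last observed traffic

    def key(self) -> str:
        """Stable identity used to tell new endpoints from changed ones."""
        return f"{self.method} {self.path}"

entry = ApiCatalogEntry("GET", "/v1/users/{id}", "user-service", "external",
                        downstream=["billing-service"],
                        last_seen="2023-01-01T00:00:00Z")
```

Keeping the dependency lists on the record itself is what makes the "look over time" analysis possible: the connectivity chain is queryable, not buried in logs.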
API discovery is challenging for a few reasons –
- Deployment – APIs may run in the public cloud, on-premises, or in a hybrid environment, which means a discovery solution must work with many different technologies.
- It is distributed – Cloud applications and microservice architectures have made applications massively distributed. That also means there are different boundaries to cross when collecting data from those applications and services.
- Agile releases – Development teams are releasing multiple times a week.
So, when evaluating a discovery solution, make sure it offers a very flexible way of collecting data, because you do not want your organizational or architectural requirements to get in the way of having comprehensive data about your APIs.
Discovery should include several things –
- It needs to be able to automatically discover all the API endpoints by inspecting traffic.
- The discovery system should be able to automatically group the APIs into apps, domains, and the services they’re part of. It makes it more digestible for the teams that are using this information.
- The tool should classify APIs as external-facing, internal-facing, or third-party.
- The tool should automatically catalog new APIs and changes to existing APIs.
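The discovery steps above can be sketched as a small traffic-inspection pass. The log format and the `internal.` host-naming convention are assumptions made purely for illustration:

```python
# Sketch: discover endpoints by inspecting access-log-style traffic records,
# group them by host, and classify exposure. Format and data are illustrative.
import re
from collections import defaultdict

LOG_RE = re.compile(r'(?P<host>\S+) "(?P<method>[A-Z]+) (?P<path>\S+)"')

def discover(log_lines):
    """Deduplicate observed traffic into one (method, path) set per host."""
    catalog = defaultdict(set)
    for line in log_lines:
        m = LOG_RE.match(line)
        if m:
            catalog[m.group("host")].add((m.group("method"), m.group("path")))
    return catalog

def classify(host):
    # Naive exposure heuristic for the sketch: internal hosts share a prefix.
    return "internal" if host.startswith("internal.") else "external"

logs = [
    'api.example.com "GET /v1/users"',
    'api.example.com "GET /v1/users"',          # repeated traffic, one endpoint
    'internal.example.com "POST /v1/metrics"',
]
catalog = discover(logs)
```

A real system would watch live traffic continuously rather than batch-process logs, so the catalog updates as APIs change.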
This is discovery: knowing what you have. But what else should an API catalog solution include to help with API security? An API catalog should be actionable. Having data is important, but being able to act on that data, within its context and the challenges it presents, is even better.
The catalog should support the OpenAPI Specification (OAS, formerly Swagger). This gives teams and partners a shared language and makes tools interoperable: one tool can create the specification and another can analyze it. It also makes it easier for tools to identify where APIs expose sensitive data.
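That portability is easy to see in miniature. Below, a minimal OpenAPI 3.0 document (the endpoint is hypothetical) is held as plain data, and a separate function enumerates its endpoints, as a second tool might:

```python
# A minimal OpenAPI 3.0 document as plain data; the endpoint is hypothetical.
spec = {
    "openapi": "3.0.0",
    "info": {"title": "User API", "version": "1.0.0"},
    "paths": {
        "/v1/users/{id}": {
            "get": {"responses": {"200": {"description": "A user record"}}}
        }
    },
}

def endpoints(spec):
    """Enumerate every documented (METHOD, path) pair from the spec."""
    return sorted(
        (method.upper(), path)
        for path, operations in spec.get("paths", {}).items()
        for method in operations
    )
```

Because the spec is a standard structure rather than a proprietary format, any tool in the chain can produce or consume it.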
Conformance analysis at scale. The collected data is useless if we cannot consume all of it, so the catalog should use automated, AI-assisted analysis to flag outliers.
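The core of a conformance check can be sketched as a set comparison between what the spec documents and what traffic shows. The terms "shadow" and "zombie" and the sample endpoints are illustrative, not from any specific tool:

```python
# Sketch of conformance checking: documented endpoints vs. observed traffic.
# The category names and sample endpoints are illustrative assumptions.
def conformance(documented, observed):
    documented, observed = set(documented), set(observed)
    return {
        "shadow": observed - documented,   # live in traffic but undocumented
        "zombie": documented - observed,   # documented but never seen in traffic
    }

report = conformance(
    documented={("GET", "/v1/users"), ("POST", "/v1/orders")},
    observed={("GET", "/v1/users"), ("GET", "/v1/debug")},
)
```

Both directions matter for security: undocumented endpoints are unreviewed attack surface, and unused documented endpoints are forgotten attack surface.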
Sensitive data exposure. We should be able to see where sensitive data is exposed, because the purpose of the catalog is to help us understand the risk and security posture of our APIs. That means being able to identify and classify sensitive data and take action on it.
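A simple form of that classification is pattern matching over observed payload fields. These patterns are deliberately simplified illustrations, not production-grade detectors:

```python
# Sketch of sensitive-data classification over observed payload values.
# The patterns are simplified illustrations, not production-grade detectors.
import re

PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "ssn":   re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}

def classify_value(value):
    """Return the sensitive-data types a single value appears to match."""
    return [name for name, rx in PATTERNS.items() if rx.match(value)]

def scan_payload(payload):
    """Map each payload field to the sensitive types detected in its value."""
    findings = {}
    for key, value in payload.items():
        if isinstance(value, str):
            hits = classify_value(value)
            if hits:
                findings[key] = hits
    return findings

findings = scan_payload({"user": "alice",
                         "contact": "alice@example.com",
                         "tax_id": "123-45-6789"})
```

Attaching findings like these to catalog entries is what lets the risk scoring below account for data sensitivity.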
Risk scoring. We want a tool that can look at all the endpoints in a service and assess them for us, telling us which API endpoints carry the highest risk and need to be prioritized. This needs to be based on two criteria: the likelihood of a breach and its impact. It lets the security and development teams focus on fixing the riskiest APIs first.
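A minimal sketch of likelihood-times-impact scoring might look like this; the factor weights and the sample endpoints are assumptions chosen for illustration:

```python
# Sketch of likelihood x impact risk scoring; weights and data are
# illustrative assumptions, not a real scoring model.
def risk_score(endpoint):
    likelihood = (
        (2 if endpoint["exposure"] == "external" else 1)   # reachable by attackers?
        + (1 if not endpoint["authenticated"] else 0)      # no auth raises likelihood
    )
    impact = 3 if endpoint["sensitive_data"] else 1        # data sensitivity drives impact
    return likelihood * impact

endpoints = [
    {"path": "/v1/health", "exposure": "external",
     "authenticated": False, "sensitive_data": False},
    {"path": "/v1/users", "exposure": "external",
     "authenticated": True, "sensitive_data": True},
]

# Highest-risk endpoints first, so teams know where to start remediating.
prioritized = sorted(endpoints, key=risk_score, reverse=True)
```

Here the authenticated endpoint still outranks the unauthenticated health check, because the sensitive data behind it makes a breach far more costly.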
An API catalog of this sort becomes valuable to multiple teams. It gives security teams a comprehensive view of the territory, the attack surface, and what they need to protect. It helps DevOps teams address security issues early in the pipeline, when fixes are cheaper and easier than in production. And it helps compliance teams confirm that sensitive data is being handled properly.
To summarize, an API catalog should offer automatic API discovery, OpenAPI spec support, change detection, conformance analysis, API dependency mapping, sensitive data exposure detection, API risk scoring, and CI/CD integration.