Anoop Gupta is the Director of Software Engineering at Capital One. Over the past 20 years, he has worked on building highly secure, resilient, performant, and scalable enterprise platforms specializing in API Security, API Governance, Cybersecurity, Identity Access Management, Data Protection, High-Performance Distributed Systems, and Privacy. In this article, he delves into the crucial topic of API drift detection and sheds light on its importance for large enterprises.
Background
As businesses continue to grow, there is an increasing need for APIs to cater to the requirements of both internal and external customers. The rise of cloud computing and the widespread adoption of microservices architecture have added complexity to the situation, resulting in a substantial increase in demand. Consequently, there has been a rapid proliferation of APIs, leading to a phenomenon known as API Sprawl. This proliferation often leads to the development of numerous redundant and duplicate APIs, posing challenges in managing and maintaining consistent security policies across the entire enterprise, which ultimately impacts the overall security posture.
The potential for security breaches is a significant concern, particularly when it comes to unauthorized access to sensitive data through rogue APIs. This kind of exposure can lead to data breaches and financial losses, harming the company’s reputation. Both rogue and approved APIs can be identified through out-of-band network discovery using an API Gateway, which serves as a centralized entry point for APIs, or through a centralized API Management system that leverages a cataloging solution.
However, issues such as inconsistencies in API design, documentation standards, data quality, and third-party contracts can still create security vulnerabilities. These inconsistencies in APIs are referred to as API Drift.
What is API Drift Detection
Drift detection is the process of identifying any deviations in the current operational state compared to the expected state. These discrepancies can arise due to various factors, and it is primarily the responsibility of API producers to comply with the specifications set during the design phase to minimize the occurrence of such deviations.
Types of API Drift
OpenAPI Specification Drift
OpenAPI specifications provide a structured framework for defining APIs, enabling engineers to gain comprehensive insights into API functionality. When backed by API Management, an API undergoes the producer lifecycle, which encompasses planning, design, review, development, testing, and deployment phases. Despite processes and comprehensive reviews intended to align the API implementation with the prescribed OpenAPI specification, developers may inadvertently deviate from the specified standards for various reasons, including:
- The delay between documenting specifications and implementing the API can cause a disconnect.
- The developer who writes the specification may not be the one who implements it.
- Product or technical modifications may occur after the specifications have been registered.
Drift between the design and the runtime behavior can lead to unauthorized access to sensitive data beyond organizational boundaries.
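To make the comparison between design and runtime concrete, here is a minimal sketch. It assumes the response schema from the OpenAPI specification has been reduced to a flat mapping of property names to JSON types; the names `SPEC_PROPERTIES` and `find_spec_drift` are hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical, simplified slice of an OpenAPI response schema:
# property name -> JSON type declared at design time.
SPEC_PROPERTIES = {"id": "integer", "name": "string", "email": "string"}

# Map Python runtime types to JSON type names.
JSON_TYPES = {bool: "boolean", int: "integer", float: "number", str: "string",
              list: "array", dict: "object", type(None): "null"}


def find_spec_drift(spec: dict[str, str], observed: dict[str, Any]) -> dict[str, list[str]]:
    """Compare one observed response body against the declared properties."""
    undocumented = sorted(set(observed) - set(spec))  # returned but never specified
    missing = sorted(set(spec) - set(observed))       # specified but not returned
    type_mismatch = sorted(
        name for name in set(spec) & set(observed)
        if JSON_TYPES.get(type(observed[name])) != spec[name]
    )
    return {"undocumented": undocumented, "missing": missing,
            "type_mismatch": type_mismatch}


# An observed response that exposes an extra field and changes a type.
observed_response = {"id": "42", "name": "Ada", "ssn": "000-00-0000"}
print(find_spec_drift(SPEC_PROPERTIES, observed_response))
```

A real specification has nested schemas, required flags, and multiple response codes, so a production check would likely rely on a full OpenAPI validator rather than a flat comparison like this one.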
Data Attribute Drift
API Sprawl can lead to a proliferation of competing standards across an organization. In a large enterprise with numerous APIs and multiple versions, both consumers and producers may deviate from established enterprise data standards. This can result in non-compliance with data standards and enterprise security policies, potentially exposing sensitive data.
The lack of consistent attribute naming patterns among APIs can result in conflicting standards for the same underlying attribute. This inconsistency makes it difficult to enforce governance and security measures, such as encryption or tokenization, across the enterprise. As a result, data privacy issues may arise from the challenges of governing these attributes.
Data Attribute drift is closely linked to data quality but is significant enough to be considered a distinct type of drift.
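One simple way to surface conflicting names for the same underlying attribute is to normalize spellings and group them. The sketch below assumes an inventory of attribute names per API is already available; the inventory contents and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical inventory: API name -> attribute names it exposes.
API_ATTRIBUTES = {
    "accounts-v1": ["accountNumber", "customer_id"],
    "payments-v2": ["account_number", "customerId"],
    "cards-v1": ["AccountNumber", "expiry"],
}


def normalize(name: str) -> str:
    """Collapse camelCase, PascalCase, and snake_case spellings to one key."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def find_naming_conflicts(inventory: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return normalized keys that appear under more than one spelling."""
    variants = defaultdict(set)
    for attributes in inventory.values():
        for name in attributes:
            variants[normalize(name)].add(name)
    return {key: sorted(names) for key, names in variants.items() if len(names) > 1}


for key, names in find_naming_conflicts(API_ATTRIBUTES).items():
    print(key, "->", names)
```

This only catches differences in casing and separators. Synonyms such as an abbreviated name for the same attribute need a richer matching approach.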
Data Quality Drift
An API comprises multiple endpoints, each exposing various data attributes that consumers can access. These data attributes may have different data validations and types, and it’s common for an API to deviate from the initial design specifications.
One frequent issue is data type drift, in which integers are incorrectly interpreted as strings, leading to potential issues when the backends consume this data without proper validation.
Another common problem is inadequate validation of an attribute, such as a missing regular expression or range check. When consumers and producers operate based on undocumented specifications, it can result in unauthorized access to the data.
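The type, pattern, and range checks described above can be sketched as follows. The rule set, attribute names, and limits are assumptions made up for this example, not values from any real specification.

```python
import re
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class AttributeRule:
    """Hypothetical design-time rule for one attribute."""
    expected_type: type
    pattern: Optional[str] = None     # regular expression, for strings
    minimum: Optional[float] = None   # inclusive lower bound, for numbers
    maximum: Optional[float] = None   # inclusive upper bound, for numbers


RULES = {
    "age": AttributeRule(int, minimum=0, maximum=150),
    "zip_code": AttributeRule(str, pattern=r"\d{5}"),
}


def check_quality(payload: dict[str, Any],
                  rules: dict[str, AttributeRule] = RULES) -> list[str]:
    """Return one finding per attribute value that violates its rule."""
    findings = []
    for name, rule in rules.items():
        if name not in payload:
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so treat it as a mismatch
        # unless the rule explicitly expects bool.
        stray_bool = isinstance(value, bool) and rule.expected_type is not bool
        if stray_bool or not isinstance(value, rule.expected_type):
            findings.append(f"{name}: expected {rule.expected_type.__name__}, "
                            f"got {type(value).__name__}")
            continue
        if rule.pattern is not None and re.fullmatch(rule.pattern, value) is None:
            findings.append(f"{name}: value does not match pattern {rule.pattern}")
        if rule.minimum is not None and value < rule.minimum:
            findings.append(f"{name}: {value} is below minimum {rule.minimum}")
        if rule.maximum is not None and value > rule.maximum:
            findings.append(f"{name}: {value} is above maximum {rule.maximum}")
    return findings


# An integer sent as a string, and a ZIP code that fails its pattern.
print(check_quality({"age": "37", "zip_code": "2310"}))
```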
Data Contract Drift
Large organizations create APIs not only for internal use but also for external third-party integrations. Because customer data is shared beyond the company’s boundaries, it’s crucial to establish data contracts with external clients. These data contracts specify the attributes shared with clients and outline the governance policies associated with the shared data.
As with the other types of drift, API producers may unintentionally begin transmitting additional information to clients, resulting in data leakage.
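A data contract check can be as simple as comparing the attributes in an outbound payload with the attributes the contract allows. The sketch below assumes the contract is expressed as a set of dotted attribute paths; `PARTNER_CONTRACT` and the helper names are hypothetical.

```python
from typing import Any

# Hypothetical data contract: attribute paths the third party may receive.
PARTNER_CONTRACT = {"customer.id", "customer.name", "order.total"}


def flatten(payload: dict[str, Any], prefix: str = "") -> set[str]:
    """Return dotted paths for every leaf attribute in a nested payload."""
    paths = set()
    for key, value in payload.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            paths |= flatten(value, prefix=f"{path}.")
        else:
            paths.add(path)
    return paths


def leaked_attributes(payload: dict[str, Any], contract: set[str]) -> list[str]:
    """List attributes present in the payload but absent from the contract."""
    return sorted(flatten(payload) - contract)


outbound = {
    "customer": {"id": 7, "name": "Ada", "ssn": "000-00-0000"},
    "order": {"total": 19.99},
}
print(leaked_attributes(outbound, PARTNER_CONTRACT))
```

Whether to block, redact, or only report the extra attributes is a policy decision that this sketch leaves open.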
Detection
Even with a shift-left approach that ensures specifications are registered during the design phase, and with governance and standardization in place, the drifts mentioned above still occur.
Methods
As business demand increases, so do API transactions, making it increasingly challenging to scan and detect drift at that scale. This calls for techniques that are less invasive, decoupled, and scalable, such as:
- Out-of-band network scanning similar to API discovery.
- Extending the API intermediary (API Gateway) to scan the traffic.
- API producers calling centralized drift services.
Every drift detection method can be a loosely coupled microservice that provides detection and remediation of the drift.
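The loose coupling described here can be illustrated in-process with a small detector registry. This is only a sketch: in a real deployment each detector would more likely run as its own service, and the `TrafficSample` shape, the registry, and the example detector are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class TrafficSample:
    """One captured API exchange, from whichever source observed it."""
    api: str
    source: str  # e.g. "network-scan", "gateway", or "producer"
    response_body: dict[str, Any]


# A detector takes a sample and returns human-readable findings.
Detector = Callable[[TrafficSample], list[str]]
DETECTORS: dict[str, Detector] = {}


def register(name: str) -> Callable[[Detector], Detector]:
    """Add a detector to the registry without changing the dispatcher."""
    def decorator(func: Detector) -> Detector:
        DETECTORS[name] = func
        return func
    return decorator


@register("contract")
def contract_detector(sample: TrafficSample) -> list[str]:
    allowed = {"id", "name"}  # hypothetical contract for this example
    return [f"uncontracted attribute: {key}"
            for key in sorted(sample.response_body) if key not in allowed]


def run_detectors(sample: TrafficSample) -> dict[str, list[str]]:
    """Run every registered detector and keep only those with findings."""
    results = {}
    for name, detect in DETECTORS.items():
        findings = detect(sample)
        if findings:
            results[name] = findings
    return results


sample = TrafficSample("accounts-v1", "gateway", {"id": 1, "ssn": "000-00-0000"})
print(run_detectors(sample))
```

Because the dispatcher only depends on the `TrafficSample` shape, the same detectors can be fed by a network scanner, a gateway extension, or a producer-side call.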
Machine Learning
Certain deviations mentioned earlier can be easily identified with the correct procedures. However, leveraging machine learning to identify non-standardized data attributes can be highly advantageous. By inputting a comprehensive and expanding library of data attributes and their properties, along with domain and sub-domain information, machine learning models can recognize whether there is a more suitable attribute for use and apply the relevant governance and security policy based on the risk profile.
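The matching step can be illustrated with a deliberately simple stand-in. The sketch below uses plain string similarity from Python's standard `difflib` module in place of a trained model, so it is not machine learning; it only shows the shape of the lookup. The attribute library, policy names, and cutoff value are assumptions.

```python
import difflib
from typing import Optional

# Hypothetical attribute library: standard attribute name -> governance policy.
ATTRIBUTE_LIBRARY = {
    "social_security_number": "tokenize",
    "account_number": "encrypt",
    "email_address": "mask",
}


def suggest_standard_attribute(name: str, cutoff: float = 0.6) -> Optional[tuple[str, str]]:
    """Return the closest standard attribute and its policy, if any is close enough.

    String similarity stands in for a model that would also weigh data
    properties and domain or sub-domain information.
    """
    matches = difflib.get_close_matches(
        name.lower(), list(ATTRIBUTE_LIBRARY), n=1, cutoff=cutoff
    )
    if not matches:
        return None
    return matches[0], ATTRIBUTE_LIBRARY[matches[0]]


print(suggest_standard_attribute("acct_number"))
print(suggest_standard_attribute("xyz"))
```

A trained model could replace `difflib.get_close_matches` behind the same function signature, which keeps the policy lookup unchanged.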
Remediation
Addressing drift issues can be just as demanding as detecting them. Remediation involves pinpointing the potential risk posed by the drift, evaluating the severity of the issue, identifying the API producer and consumers, tracking occurrences and metrics to recognize patterns, and establishing a process to address and remediate the vulnerability. A centralized solution can offer an effective approach to resolving these challenges.
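As a rough illustration of the triage steps above, the sketch below records each finding, scores its severity, and counts occurrences per producer. The severity weights and the `DriftFinding` fields are hypothetical; a real program would derive them from its own risk model.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical severity weights per drift type.
SEVERITY = {"contract": 4, "specification": 3, "attribute": 2, "quality": 1}


@dataclass(frozen=True)
class DriftFinding:
    api: str
    producer: str
    drift_type: str
    exposes_sensitive_data: bool


def score(finding: DriftFinding) -> int:
    """Double the base weight when sensitive data is exposed."""
    base = SEVERITY.get(finding.drift_type, 1)
    return base * 2 if finding.exposes_sensitive_data else base


def remediation_queue(findings: list[DriftFinding]) -> list[DriftFinding]:
    """Order findings so the highest-risk drift is addressed first."""
    return sorted(findings, key=score, reverse=True)


def occurrences_by_producer(findings: list[DriftFinding]) -> Counter:
    """Count findings per producer to help recognize recurring patterns."""
    return Counter(finding.producer for finding in findings)


findings = [
    DriftFinding("accounts-v1", "team-a", "quality", False),
    DriftFinding("payments-v2", "team-b", "contract", True),
    DriftFinding("cards-v1", "team-a", "attribute", True),
]
for finding in remediation_queue(findings):
    print(score(finding), finding.api, finding.drift_type)
print(occurrences_by_producer(findings))
```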