Data Abstraction using APIs
This is an article by Vikram Vasudevan, founder of Ekahaa solutions, a company that specializes in data abstraction solutions.
When it comes to APIs, there has always been a lot of focus on API management, monetization, performance monitoring, caching, and the like. But few people talk about a key, possibly underrated aspect of APIs: abstraction. When you build APIs, you connect to different types of data sources, and each brings its own share of complexity. Abstraction, in general, means hiding the complexities of an underlying sub-system. In the case of APIs, there are many different levels of abstraction.
What is Abstraction?
In computing terms, abstraction is a way of hiding the working details of a sub-system. For example, when you see a restaurant, you look at the ambiance and want to eat there. You do not see the complexity of how it was built or is run.
In a database context, we create views, procedures, functions, etc. These abstract the complexity of the underlying data model. Somebody who uses a view does not see what it takes to build that view. That in itself is a big abstraction at the database level. An API is itself a big abstraction.
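The database view idea can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module; the tables, the view name customer_totals, and the sample rows are assumptions made up for the example, not part of any real schema.

```python
import sqlite3

# In-memory database with two hypothetical tables and one view.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (1, 1, 40.0), (2, 1, 60.0), (3, 2, 25.0);

    -- The view hides the join and the aggregation from its consumers.
    CREATE VIEW customer_totals AS
        SELECT c.name AS name, SUM(o.amount) AS total
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name;
""")


def customer_totals(connection):
    """Read from the view; the caller never sees the underlying data model."""
    cursor = connection.execute(
        "SELECT name, total FROM customer_totals ORDER BY name"
    )
    return cursor.fetchall()


totals = customer_totals(conn)
print(totals)
```

The consumer only ever queries customer_totals. If the underlying tables are later split or renamed, the view definition changes but the consumer's query does not.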
You have a beautiful web application. What you and the users see is the front end. Behind that is a large infrastructure, micro-services, gateways, etc. It is a highly complex system.
After the advent of the cloud, the cloud tools you use are another form of abstraction. When you ask a tool to deploy an application to GCP, it provisions your compute, creates your databases, sets up your networking, etc. This is infrastructure abstraction.
Key principles of abstraction
DRY – Don’t Repeat Yourself – Eliminate repetition of software patterns. Redundancy is acceptable where it improves performance, but don’t repeat your code everywhere, as maintenance cost and technical debt will build up.
WET – Write Everything Twice – only if that’s the best way to solve your problem.
AHA – Avoid Hasty Abstractions – sometimes the best way is to not abstract at all.
MASK – End users do not need to know the complexities. Hide it from them.
Abstraction of Datasources and APIs
In today’s world, there are many different types of data sources: SQL databases, legacy systems, NoSQL data sources, graph DBs, time series databases such as InfluxDB, ERPs, etc. Building a system that can talk to all of them effectively, without its users having to understand the underlying semantics of each, is a big challenge. This is one of the key challenges of data source abstraction.
When you build a product, you have to talk to many different types of data sources, so you need a way to interface with them. Every product has its own way of dealing with this. Essentially, it boils down to four aspects of data source abstraction.
Four aspects of Datasource Abstraction
Unification – you need to unify your data by bringing multiple data sources together. You must provide a singular language through which you can talk to all of your different types of data sources.
Fortification – You must fortify your data sources by wrapping them in a governance layer. You must govern your data sources through a single data access layer with policy-driven access control at API, row, and field levels.
Restification – Provide a RESTful API interface with the flexibility of GraphQL. Perform CRUD operations on data sources.
Democratization – Data is useful only if it is shared. So you need to democratize your data. Propagate data securely and seamlessly across disparate systems based on an event, on a schedule, or ad hoc.
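The unification and fortification aspects above can be illustrated with a small sketch. Everything here is hypothetical: the DataSource interface, the two connector classes, the GovernedAccess layer, and the role names are assumptions made for the example, and the policy covers only field-level access, not row-level or API-level control.

```python
import sqlite3
from typing import Protocol


class DataSource(Protocol):
    """Hypothetical common interface that every connector implements."""

    def fetch(self, resource: str) -> list: ...


class SqliteSource:
    """Connector for a SQL source."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.row_factory = sqlite3.Row  # rows become name-addressable

    def fetch(self, resource: str) -> list:
        # In this sketch, resource names come from trusted configuration,
        # never from user input.
        rows = self.conn.execute(f'SELECT * FROM "{resource}"').fetchall()
        return [dict(row) for row in rows]


class InMemorySource:
    """Stands in for a non-SQL source such as a document store."""

    def __init__(self, collections: dict) -> None:
        self.collections = collections

    def fetch(self, resource: str) -> list:
        return [dict(doc) for doc in self.collections.get(resource, [])]


class GovernedAccess:
    """Single data access layer: one call shape, one place for policy."""

    def __init__(self, sources: dict, hidden_fields: dict) -> None:
        self.sources = sources              # source name -> DataSource
        self.hidden_fields = hidden_fields  # role -> set of field names to hide

    def read(self, role: str, source: str, resource: str) -> list:
        rows = self.sources[source].fetch(resource)
        hidden = self.hidden_fields.get(role, set())
        # Field-level policy: drop the fields this role may not see.
        return [{k: v for k, v in row.items() if k not in hidden} for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employees VALUES ('Asha', 90000)")

layer = GovernedAccess(
    sources={
        "hr_db": SqliteSource(conn),
        "crm": InMemorySource({"contacts": [{"name": "Ben", "email": "ben@example.com"}]}),
    },
    hidden_fields={"analyst": {"salary", "email"}},
)

analyst_view = layer.read("analyst", "hr_db", "employees")
admin_view = layer.read("admin", "hr_db", "employees")
crm_view = layer.read("analyst", "crm", "contacts")
print(analyst_view, admin_view, crm_view)
```

The caller uses the same read call for the SQL source and the in-memory one, and the policy lives in one place rather than being repeated in every connector.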
API Building Lifecycle
The API building lifecycle is very complex. The steps are as follows:
- Identify a data source
- Set up API infrastructure
- Write code to define the API
- Write code to define security policies
- Use SDKs to inject code to extract documentation
- Create separate repositories for code versioning.
- Plan and perform testing of APIs
- Deploy the APIs
An API-Driven Data Abstraction Platform needs to support API building, not just defining or documenting APIs.
API building on the fly is a key ingredient of an API-driven data abstraction platform. The platform has to connect to the major data sources in use today and, at the same time, keep up with the new data sources that appear all the time. We must keep up to date on the developments in the data world. This is where a community-driven connector model comes in very handy. Auto-discovery of resources from databases is important, as is the availability of customization for advanced users. Whenever you make changes, you need to ensure backward compatibility so that existing API consumers are not negatively affected.
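As a rough illustration of auto-discovery, the sketch below inspects a SQLite database and derives a CRUD route table from whatever tables it finds. The function names discover_resources and build_routes, the sample tables, and the route shapes are assumptions made for this example. A real platform would do the equivalent through each data source's own catalog or metadata API.

```python
import sqlite3


def discover_resources(conn):
    """Return {table_name: [column names]} for every user table."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    resources = {}
    for (table,) in tables:
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        resources[table] = [column[1] for column in columns]
    return resources


def build_routes(resources):
    """Derive one set of CRUD routes per discovered resource."""
    routes = []
    for table in resources:
        routes.extend([
            ("GET", f"/{table}"),          # list
            ("POST", f"/{table}"),         # create
            ("GET", f"/{table}/{{id}}"),   # read one
            ("PUT", f"/{table}/{{id}}"),   # update
            ("DELETE", f"/{table}/{{id}}"),
        ])
    return routes


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
""")

resources = discover_resources(conn)
routes = build_routes(resources)
for method, path in routes:
    print(method, path)
```

Because the routes are derived from the schema, adding a table exposes a new resource without hand-written endpoint code. Keeping existing paths stable when the schema changes is what protects current API consumers.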
Always have SSL support and a multi-tiered security model.
API Management is necessary to ensure that APIs are not scattered. You can either build your own solution or use an available product.
And last but not least are data management and governance capabilities.
Benefits of API-Driven Data Abstraction
- Faster API Releases – Rapid API development and testing allow frequent releases to your customers.
- TCO Reduction – Massive savings in API development and infrastructure costs.
- Operational Efficiency – Manage data democratization and governance and all your data pipeline activities using one platform.