Building API Platforms at Scale: Best Practices and Lessons Learned

Irakli Nadareishvili is the Managing Director, Global Banking Platform at JPMorgan Chase and Company. Over the past 15 years, he has worked on distributed systems and various APIs. He has authored a couple of books. He discusses building API platforms at scale and best practices in this article.

Let me start with a story. The story takes place in a large enterprise, and at its center is a talented, dedicated, brilliant CIO. We will call her Michelle. Michelle is a highly motivated technology executive who is very passionate about efficiency. Some would say that she has a healthy amount of contempt for any kind of wasteful activity.

One day Michelle was meeting with her leadership team to go through some organizational metrics. As she looked through the various reports and numbers, her heart started to sink. All those numbers were telling her that large parts of her organization were reinventing the same wheel over and over again. Michelle was a very experienced executive, so she was not one for knee-jerk reactions. But looking at the numbers, it was clear that she needed to act: she needed to push for more reuse in her organization to reduce the amount of wasted effort.

She decided to enact a new mandate: any new API initiative would have to go through a permission-to-build process. A team of highly experienced architects would review every new API initiative to determine whether it truly delivered unique business value, or whether the team should instead be directed to reuse functionality that already existed elsewhere. She also declared that most permissions to build would be granted to platform efforts, meaning efforts that delivered a large set of unique and logically interconnected APIs with significant business value to the organization. For instance, an accounting platform would allow any application that needed accounting functionality to get it from that platform. Similarly, there would be an enterprise customer platform: any application that needed to read or modify customer information would go to that platform, the one place where all of these APIs would live.

As Michelle explained this new mandate to her team and collected feedback about implementation details, she noticed that one person was suspiciously quiet. John was a longtime lieutenant of Michelle's and not usually known for being quiet, so his silence stood out. Michelle called on him and asked why he looked so pale. John explained that he was worried about the total cost of reuse (TCR) in time. Michelle had not heard of it and asked John to explain.

John said that far too often, we only look at the benefits of reuse at a single frozen moment: the time of the decision. This is misleading. The long-term coordination costs, created as a side effect of the standardization and centralization needed to achieve reuse, must also be forecast.

A good example of this is when multiple departments in a large company need the same functionality. Let’s imagine that we’re working for an online retailer. All of its business units (direct-to-customer, small business, and wholesale) want to start a loyalty and rewards program. So, the decision is made to implement it once. We’re not going to implement it multiple times, so we’re going to save a lot of cost, maybe even save time, and everybody will be happy. The decision is made, development is done, and the program is implemented. The first six months go smoothly. Then, the small business unit has some business requirements that need a change in logic, and this change would help them make an additional five million dollars. If they had an independent program, they would have made the change in three weeks; but because it is a centralized program, things are more complicated and the change may take up to three months.

So, you can see how in time, because of future changes, the initial benefit of reuse can start diminishing because every further change is a coordination effort that can delay things or make things more expensive.

When we only evaluate the benefits of reuse and centralization at the beginning of the project, everything looks really good. But if we look over a longer period, the cost of change coordination can greatly diminish the initial benefit or make it worthless. We need to evaluate the total cost of reuse in time, and not just at the inception.
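The argument above can be made concrete with a small back-of-the-envelope model. This is only a sketch under stated assumptions: the `ReuseScenario` type, its field names, and every figure in the example are hypothetical and not taken from the story. The only input loosely derived from the loyalty program example is the extra delay of about ten weeks per change (three weeks independently versus roughly three months, or about thirteen weeks, centrally).

```python
from dataclasses import dataclass


@dataclass
class ReuseScenario:
    """Hypothetical inputs for a rough total-cost-of-reuse estimate."""
    upfront_saving: float           # cost avoided by building once instead of several times
    changes_per_year: float         # how often consumers need the shared logic changed
    extra_delay_weeks: float        # added lead time per change caused by coordination
    value_per_week_of_delay: float  # business value lost for each week a change is late


def coordination_cost_per_year(s: ReuseScenario) -> float:
    """Yearly cost of waiting on the shared team for every change."""
    return s.changes_per_year * s.extra_delay_weeks * s.value_per_week_of_delay


def net_benefit_of_reuse(s: ReuseScenario, years: float) -> float:
    """Upfront saving minus the coordination cost accumulated over `years`."""
    return s.upfront_saving - coordination_cost_per_year(s) * years


def break_even_years(s: ReuseScenario) -> float:
    """Years until coordination cost cancels the upfront saving (inf if never)."""
    per_year = coordination_cost_per_year(s)
    return float("inf") if per_year == 0 else s.upfront_saving / per_year


# Illustrative, made-up numbers for the shared loyalty program.
loyalty = ReuseScenario(
    upfront_saving=2_000_000,
    changes_per_year=2,
    extra_delay_weeks=10,           # roughly 13 weeks centralized vs. 3 weeks independent
    value_per_week_of_delay=50_000,
)

for year in range(4):
    print(year, net_benefit_of_reuse(loyalty, year))
print("break-even after", break_even_years(loyalty), "years")
```

With these made-up inputs the initial saving is fully consumed after two years and the net benefit turns negative after that. Evaluated only at year zero, the same scenario looks like a clear win, which is exactly the trap John describes.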

Does this mean that we should never share anything, or never create central platforms? Not really. We need to be very conscious of the total cost of reuse in time and always evaluate the full lifecycle of the platform.

So, while deciding on centralization, look at the frequency of change. The functionality that we are centralizing should not be expected to change frequently. Slow-changing shared capabilities are safe for centralization and reuse. Frequently-changing ones are usually not suitable, even if they represent shared needs.

TCR evaluation and appreciation of long-term coordination costs also lead to three main principles of building resilient API-driven platforms that separate the platform and consumer.

  1. Platforms should never be the arbiters of common behavior for consumers built on top of those platforms. Many times, in large companies, when you build your enterprise systems, you go with the functionality that’s the least common denominator of your customers. So, you build what everybody needs. But this is not how long-lasting platforms should be built. They should have an identity and solve a problem.
  2. Platforms should never orchestrate calls to other platforms on behalf of consumers.
  3. Platforms should never implement consumer-specific logic.
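As one way to picture these principles, here is a minimal sketch that reuses the loyalty program from the earlier example. All class names, methods, and business rules below are hypothetical and invented for illustration. The platform exposes a generic points ledger and knows nothing about who calls it, while the small business consumer keeps its own rules on its own side of the API.

```python
class LoyaltyPlatform:
    """Hypothetical platform: a points ledger with no consumer-specific branches."""

    def __init__(self) -> None:
        self._balances: dict[str, int] = {}

    def credit(self, account_id: str, points: int) -> int:
        """Add points to an account and return the new balance."""
        if points <= 0:
            raise ValueError("points must be positive")
        self._balances[account_id] = self._balances.get(account_id, 0) + points
        return self._balances[account_id]

    def balance(self, account_id: str) -> int:
        """Return the current balance, or 0 for an unknown account."""
        return self._balances.get(account_id, 0)


class SmallBusinessRewards:
    """Hypothetical consumer: owns its own business rules and orchestration."""

    BULK_ORDER_THRESHOLD = 1_000  # assumed rule: order total in dollars
    BULK_MULTIPLIER = 2           # assumed rule: double points on bulk orders

    def __init__(self, platform: LoyaltyPlatform) -> None:
        self._platform = platform

    def reward_order(self, account_id: str, order_total: int) -> int:
        """Apply the small business rule, then call the generic platform API."""
        points = order_total  # assumed base rate: one point per dollar
        if order_total >= self.BULK_ORDER_THRESHOLD:
            points *= self.BULK_MULTIPLIER
        return self._platform.credit(account_id, points)


platform = LoyaltyPlatform()
small_business = SmallBusinessRewards(platform)
small_business.reward_order("acme", 1_500)
small_business.reward_order("acme", 200)
print(platform.balance("acme"))
```

In this arrangement, a change to the bulk-order rule touches only `SmallBusinessRewards`, so the small business unit can ship it without coordinating with the platform team or with the other consumers.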

This is easy to get wrong. When you build the platforms, you often become a bit of a crowd-pleaser; you want to prove that you’re useful and try to deliver too much.

To conclude, I want you to carry this one thought with you. The time for a wild forest of APIs at large companies is over. Going forward, it will be all about purpose-built platforms that are self-service and delivered as APIs.

 

Irakli Nadareishvili

Managing Director, Global Banking Platform at JPMorgan Chase & Co
Builder of large-scale tech, delightful products, and high-performing organizations focused on long-term strategy, business outcomes, and customer value. Has repeatedly delivered digital products and platforms with cutting-edge capabilities, at the scale of tens of millions of customers, in public cloud environments. Principally focused on effective engineering management, alignment of business and technology strategy, and cloud-native architecture. Authored a couple of popular books on cloud engineering that were published by O'Reilly and translated into other languages. Core value is building diverse, collaborative, and fearless organizations where people can achieve peak performance in a continuous learning environment. Key specialties: financial systems, cloud-native development, building large-scale technology platforms with vibrant developer communities, innovative execution & results, strategic alignment of technology with business goals, Agile development, system and software architecture at scale, growing high performing teams, quality and delivery automation, cloud adoption & modernization + migration.
