
Nilesh Gule is an enterprise architect for a leading bank in Singapore. In this article, he discusses implementing modular architectures for resilience and adaptability.
After COVID, when travel resumed, I wanted to visit my family in Goa, India. I travel with a budget airline. I tried their mobile app, but it failed to show me any flight routes. When I approached their helpdesk, they raised a ticket to fix it. I then tried their website. It had a few problems, but I was able to book tickets. I noticed that the error messages on the mobile app and the website were different. Even without a root cause analysis, we can infer that the two channels did not share common APIs and that they were managed by different teams with minimal coordination. So, although I was dealing with the same vendor, my experience was not consistent across channels. Even resetting a password took them four days, which is a poor customer experience.
How can we ensure we do not end up in a similar situation? Let us look at some best practices.
12-Factor Apps
Engineers at Heroku published a set of guidelines in 2011, known as the twelve-factor app, based on their experience deploying thousands of apps on their platform. Since then, Docker has arrived, Kubernetes has become almost the de facto orchestration platform, and microservices and cloud adoption have grown. As a result, not all of the original guidelines are fully relevant today. Kevin Hoffman revised them in 2016 and added three more factors, giving what is now called the 15-factor app. The three new factors are API first, telemetry, and authentication and authorization.
API First
API first means defining a service contract up front and using it for all your development. It is particularly well suited to the cloud. Smart devices are a common example where APIs are becoming popular. An API-first approach allows rapid prototyping and helps support an ecosystem of services. With techniques like virtualization (for example, a stand-in that mimics the API), teams can work independently against the contract even when the real API is not ready. This helps us get to market faster.
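To make the idea of a service contract concrete, here is a minimal Python sketch. The `FlightRoutesApi` contract, the `Route` type, and the airport codes are hypothetical names chosen for illustration and are not from the talk. The stub honours the contract, so a client team could build against it before the real service exists.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Route:
    origin: str
    destination: str


class FlightRoutesApi(Protocol):
    """Hypothetical service contract, agreed on before any implementation exists."""

    def list_routes(self, origin: str) -> list[Route]:
        ...


class StubFlightRoutesApi:
    """Stand-in that honours the contract while the real service is being built."""

    def __init__(self, routes: list[Route]) -> None:
        self._routes = routes

    def list_routes(self, origin: str) -> list[Route]:
        return [route for route in self._routes if route.origin == origin]


def destinations_from(api: FlightRoutesApi, origin: str) -> list[str]:
    """Client code (mobile app or website) depends only on the contract."""
    return sorted(route.destination for route in api.list_routes(origin))


stub = StubFlightRoutesApi(
    [Route("SIN", "GOI"), Route("SIN", "BOM"), Route("BOM", "GOI")]
)
print(destinations_from(stub, "SIN"))  # ['BOM', 'GOI']
```

Because both channels would call the same contract, they would also receive the same results and the same errors, which is the consistency that was missing in the airline example.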
Telemetry
Historically, we gathered data from systems based on resource usage: we monitored CPU, RAM, and hard disk usage and built our monitoring around that. We lose that level of control as microservices move to the cloud, to platform as a service, or to other platforms. This is where telemetry comes into the picture. It is not just about application performance monitoring; we also need domain-specific telemetry, as well as health and system logs.
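As a rough illustration of domain-specific telemetry, the sketch below records business events rather than CPU or memory figures. The `DomainTelemetry` class and the event names are hypothetical; a real system would typically send these events to a telemetry backend instead of keeping them in memory.

```python
import json
import time
from collections import Counter


class DomainTelemetry:
    """Hypothetical in-memory recorder for business-level events."""

    def __init__(self) -> None:
        self.counters: Counter = Counter()
        self.events: list[dict] = []

    def record(self, name: str, **attributes) -> None:
        # Count the event and keep its attributes for later analysis.
        self.counters[name] += 1
        self.events.append({"name": name, "timestamp": time.time(), **attributes})

    def export(self) -> str:
        # Serialize everything as JSON, ready to ship to a backend.
        return json.dumps({"counters": dict(self.counters), "events": self.events})


telemetry = DomainTelemetry()
telemetry.record("booking.completed", channel="web", route="SIN-GOI")
telemetry.record("route_search.failed", channel="mobile", error="NO_ROUTES")
telemetry.record("booking.completed", channel="web", route="SIN-BOM")
print(telemetry.counters["booking.completed"])  # 2
```

Events like `route_search.failed` on the mobile channel are the kind of signal that resource monitoring alone would not surface.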
Frameworks and Tools
Along with this, we also need frameworks and tools, such as Spring Boot. Dapr is an open-source tool that helps us build modern applications. It provides cross-cutting building blocks such as service invocation and state management, and it supports multiple deployment models. It is designed for microservices architectures, but we can also make it work for applications that do not use microservices. Using Dapr, we can make our application adaptable: we can build it once and deploy it to any environment.
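The sketch below shows roughly what service invocation and state management look like through the HTTP API of a Dapr sidecar. It only builds the requests and assumes a sidecar listening on localhost (port 3500 unless `DAPR_HTTP_PORT` is set). The app ID `booking-service`, the method `routes`, and the state store name `statestore` are hypothetical.

```python
import json
import os
import urllib.request

# The sidecar port is taken from the environment, falling back to 3500.
DAPR_PORT = os.environ.get("DAPR_HTTP_PORT", "3500")
BASE_URL = f"http://localhost:{DAPR_PORT}/v1.0"


def invoke_request(app_id: str, method: str, payload: dict) -> urllib.request.Request:
    """Build a service invocation call routed through the sidecar."""
    return urllib.request.Request(
        f"{BASE_URL}/invoke/{app_id}/method/{method}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_state_request(store: str, key: str, value: dict) -> urllib.request.Request:
    """Build a state management call; the body is a list of key/value items."""
    body = [{"key": key, "value": value}]
    return urllib.request.Request(
        f"{BASE_URL}/state/{store}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> bytes:
    """Send a request; this only works when a Dapr sidecar is running."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read()


request = invoke_request("booking-service", "routes", {"origin": "SIN"})
print(request.get_method(), request.full_url)
```

The application code talks only to the local sidecar, so the same code could run unchanged when the target service or the state store behind it is swapped out.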
Improve resiliency and adaptability
- Build loosely coupled applications. This is where microservices and event-driven architectures help us to decouple the applications.
- Automation – Treat everything as code. Multiple tools can help us keep everything we build, including infrastructure, as code in source control. That gives us a trace of whatever happens, not just for the application code but also for our infrastructure.
- Testing – We can use chaos testing. It is a way of deliberately introducing faults into the system to test how it behaves when things fail.
- Delivery – We want to minimize downtime. So, progressive delivery, canary releases, dark launches, and phased releases can help us deliver with less risk.
- Serverless – Serverless is another way to help us build a resilient system, reduce cost, and optimize resources.
- Observability – Measure and monitor the performance of your applications and infrastructure.
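The chaos testing idea from the list above can be sketched in a few lines of Python. This is a toy, in-process illustration under stated assumptions, not a substitute for a real chaos engineering tool: `with_faults`, `call_with_retry`, and `lookup_routes` are hypothetical names, and the 30% failure rate is arbitrary.

```python
import random


class InjectedFault(Exception):
    """Raised by the chaos wrapper to simulate a failing dependency."""


def with_faults(func, failure_rate: float, rng: random.Random):
    """Wrap func so that a fraction of calls fail on purpose."""

    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFault("simulated dependency failure")
        return func(*args, **kwargs)

    return wrapper


def call_with_retry(func, attempts: int, *args, **kwargs):
    """Resilience measure under test: retry a bounded number of times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func(*args, **kwargs)
        except InjectedFault as error:
            last_error = error
    raise last_error


def lookup_routes(origin: str) -> list[str]:
    """Hypothetical dependency that normally always succeeds."""
    return [f"{origin}-GOI"]


# Inject faults into roughly 30% of calls and see how often retries recover.
rng = random.Random(42)
flaky_lookup = with_faults(lookup_routes, failure_rate=0.3, rng=rng)
successes = 0
for _ in range(1000):
    try:
        call_with_retry(flaky_lookup, 3, "SIN")
        successes += 1
    except InjectedFault:
        pass
print(f"{successes} of 1000 calls succeeded with up to 3 attempts")
```

The point of the experiment is to measure whether the resilience measure (here, a simple retry) actually absorbs the injected faults, before a real outage does the testing for us.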
So using these principles, I think we can build highly resilient, highly observable, and adaptable systems.