Designing a scalable MLOps Pipeline – Insights and Best Practices


Victoria Lo is a solutions engineer and a technical writer. She is a “WomenWhoCode” Leader. In this article, she discusses designing a scalable machine-learning operations pipeline.

How Machine Learning (ML) Works

A machine learning system rests on three pillars. The first is the data used to train the model; GPT, for example, is trained on hundreds of billions of tokens of text. The second is the ML model itself, which makes predictions, generates text, and so on. The third is the code, which covers deploying the model, maintaining it, and monitoring its performance.

Fraud Detection Model

Let us consider an example of a fraud detection model in the payments industry. Billions of dollars are lost through fraud every year.

Every time a customer makes a payment or buys something online, a transaction is recorded. This transaction data is used to train the fraud detector model. Once a payment is executed, the model is triggered and predicts a risk score: the higher the score, the riskier the transaction is predicted to be. A decision rule then determines whether the score counts as risky, typically using a threshold; for example, any risk score above 600 would be considered risky. Once fraud is detected, the system can hold the transaction or block it from going through. This is how an end-to-end fraud detection system works.
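The threshold step can be sketched in a few lines of Python. This is only an illustration: the `Transaction` class, the `decide` function, and the "hold"/"approve" labels are hypothetical names, and the risk score is assumed to come from an already trained fraud detector model. Only the threshold of 600 is taken from the example above.

```python
from dataclasses import dataclass

# Threshold taken from the example in the text; real systems tune this value.
RISK_THRESHOLD = 600


@dataclass
class Transaction:
    transaction_id: str
    amount: float
    risk_score: int  # assumed to be produced by the fraud detector model


def decide(transaction: Transaction, threshold: int = RISK_THRESHOLD) -> str:
    """Return "hold" for risky transactions and "approve" otherwise."""
    # Only scores strictly above the threshold are treated as risky.
    if transaction.risk_score > threshold:
        return "hold"
    return "approve"
```

A transaction scored at 720 would be held, while one scored at exactly 600 would still be approved, since only scores above the threshold count as risky.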

But data changes over time, so the way the model is trained also needs to change. Fraud methods evolve: fraudsters find new ways to bypass the fraud detector model, so the billions of transactions used to train it become outdated. The model is then no longer accurate at predicting whether a transaction is fraudulent and needs to be updated, which means the code needs to be updated as well. So, as data changes, a new model needs to be deployed over time.

This poses some challenges. Model performance starts to decay with time, and keeping it up is a continuous loop. The three pillars are highly interconnected: if any one of them changes, the others need updating too. As a result, overall performance decays, and the system becomes hard to maintain and keep reliable, which puts the business at risk. This is where MLOps comes in.

MLOps is the practice of designing, building, and deploying machine learning models into continuous production using DevOps principles. Going back to the fraud detection example, we will build pipelines around each of the three pillars: data, model, and code.

The data pipelines are in charge of ingesting, validating, and cleaning the data. Data ingestion means collecting data from various sources. Validation ensures the data is relevant to the model and aligned with the business goals. Cleaning means formatting the data and ensuring it is prepared for training the model.
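As a rough sketch, the three data stages could be chained like this in plain Python. The record schema (`transaction_id`, `amount`, `currency`) and all function names are assumptions made for illustration; a real pipeline would read from actual sources and apply business-specific validation rules.

```python
from typing import Iterable

# Hypothetical schema: the fields every transaction record must carry.
REQUIRED_FIELDS = ("transaction_id", "amount", "currency")


def ingest(*sources: Iterable[dict]) -> list[dict]:
    """Collect raw records from several sources into one list."""
    return [record for source in sources for record in source]


def validate(records: list[dict]) -> list[dict]:
    """Keep only records that contain a value for every required field."""
    return [
        r for r in records
        if all(r.get(field) not in (None, "") for field in REQUIRED_FIELDS)
    ]


def clean(records: list[dict]) -> list[dict]:
    """Normalise formats so the records are ready for training."""
    return [
        {
            "transaction_id": str(r["transaction_id"]).strip(),
            "amount": round(float(r["amount"]), 2),
            "currency": str(r["currency"]).strip().upper(),
        }
        for r in records
    ]


def run_data_pipeline(*sources: Iterable[dict]) -> list[dict]:
    """Run ingestion, validation, and cleaning in order."""
    return clean(validate(ingest(*sources)))
```

Keeping each stage as a separate function makes it possible to test and replace them independently, which is the main point of treating data handling as a pipeline.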

The machine learning model pipeline handles feature engineering and sets the parameters used to train the model. We automate the training of the model and then evaluate it by computing the model's metrics, such as accuracy and precision.
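The evaluation step might look like the following minimal sketch, which computes accuracy and precision for a binary fraud classifier. The function name `evaluate` and the label convention (1 for fraud, 0 for legitimate) are assumptions for this example; in practice a library such as scikit-learn would usually supply these metrics.

```python
def evaluate(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Compute accuracy and precision for binary labels (1 = fraud)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    true_positives = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_positives = sum(1 for t, p in pairs if t == 0 and p == 1)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Precision is undefined when nothing was flagged; report 0.0 in that case.
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return {"accuracy": accuracy, "precision": precision}
```

An automated pipeline could compare these numbers against minimum values and only promote a newly trained model when it meets them.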

Last but not least, we have the code pipelines. These automate the deployment of the model, ensuring that it is available to be used. Once deployed, the model needs to be monitored for performance, with logging in place.
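Monitoring and logging can be illustrated with a small rolling-window check. This is a sketch under stated assumptions: the `ModelMonitor` class, its window size, and the minimum accuracy are all hypothetical, and it assumes the true outcome of each transaction eventually becomes known so that predictions can be compared against it.

```python
import logging
from collections import deque

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("fraud_model_monitor")


class ModelMonitor:
    """Track recent prediction outcomes and flag when accuracy decays."""

    def __init__(self, window_size: int = 100, min_accuracy: float = 0.9):
        # Each entry is True when the prediction matched the real outcome.
        self.outcomes = deque(maxlen=window_size)
        self.min_accuracy = min_accuracy

    def record(self, predicted_fraud: bool, actual_fraud: bool) -> None:
        """Store one outcome and log it for later inspection."""
        self.outcomes.append(predicted_fraud == actual_fraud)
        logger.info("predicted=%s actual=%s", predicted_fraud, actual_fraud)

    def accuracy(self) -> float:
        """Accuracy over the current window (1.0 when no data yet)."""
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def needs_retraining(self) -> bool:
        """True once the window is full and accuracy is below the minimum."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy
```

A signal like `needs_retraining()` is one way to close the loop: when it fires, the data and model pipelines can be triggered again to train and deploy an updated model.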

Tools are available to help build these pipelines. Some of them are ZenML, Amazon SageMaker, PyTorch Lightning, Vertex AI, and MLflow.

Vertex AI, the Google Cloud (GCP) option on this list:

  • Covers end-to-end processes in the ML workflow, from data ingestion to model deployment and management
  • Unifies all GCP products under one convenient platform for easy use and integration
  • Creates pipelines easily with pre-built components
  • Is serverless and fully managed
  • Saves time and infrastructure costs, as it is solutions-focused

Victoria Lo
Having lived in Indonesia, Singapore and Canada, I have been exposed to many different cultures. These global experiences have made me versatile and curious by nature, willing to accept new challenges and meet new people to make meaningful connections. My passion lies in technology and finance. I graduated from the Smith School of Business specializing in finance with a computer science background. I am a self-taught programmer in Java, JavaScript, Python and C#, which fuels my passion for developing software that can make a difference and, hopefully, revolutionize both the finance and tech industries. My blog, Articles by Victoria (lo-victoria.com), is also part of my mission to create communities of lifelong learners. Through writing and content creation, I have been able to leverage my expertise and lend my voice to the tech community. While being an active member of the tech blogging community, I am also a WomenWhoCode Singapore leader, where I organize community-based events to empower any individual to find success in the tech space.
