MLOps for a smart and efficient machine learning model lifecycle

EdoQ
Into the Cluster ⭐
4 min read · Nov 17, 2021


If you are wondering “How can I automate my machine learning model lifecycle smartly and efficiently?”, well, you are in the right place and reading the right article. Ain’t much, but good job so far.

The problems without the adoption of MLOps

Machine learning models built by data scientists are only one part of what an enterprise production deployment workflow involves. To operationalize these models, data scientists need to work closely with many other teams, such as business, engineering, and operations. This is because the role of data scientists should be to create machine learning algorithms and models, not to gather requirements or manage deployments as well.

Imagine yourself doing all of this alone… mind-blowing, right? You could run into many problems: low model performance, CI/CD pipelines broken for mysterious reasons, and, on top of that, customers who change their minds frequently.

Dividing all these processes among several teams brings organizational challenges in terms of communication, collaboration, and coordination. The goal of MLOps is to overcome these challenges with precise, well-established practices. Additionally, MLOps brings the dynamism and speed that are fundamental in today’s world, allowing models to be trained and deployed automatically whenever new data becomes available or on a fixed schedule.

What is MLOps and how does it solve these problems?

Essentially, MLOps is very similar to the practices used in DevOps, applied to the machine learning world, though the two aren’t exactly the same because of some ad-hoc mechanisms specific to the ML model lifecycle.

At Cluster Reply, we work constantly with Microsoft’s Azure cloud platform, which lets us define an MLOps process using its dedicated services.

With MLOps, you are writing code in an Azure Machine Learning workspace to train a model, rather than code for a general application. Here, in Machine Learning Operations, “bugs” are, for instance, low accuracy of the trained model or a bad choice of hyperparameters.

So what do DevOps practices look like in MLOps? Firstly, you are still checking in code, in this case code that performs feature engineering, model building, and training. After that, things start to look different. The automation process in MLOps doesn’t compile the code and run unit tests to check quality and catch bugs; instead, it trains the model on the data it is fed. Once the model is trained, it is scored against whichever metrics you prefer, such as accuracy, precision, and recall. If one or more metrics beat those of your current model, nice catch: the new model is deployed in place of the old one, providing better results to the people who use it. The final piece is regular retraining. In DevOps, when new code is checked in, an automated build and integration tests are triggered. In MLOps, models are instead retrained and retested whenever new data becomes available.
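The metric-gated promotion step described above can be sketched in a few lines of Python. This is an illustrative scikit-learn example, not the actual Azure Machine Learning pipeline: the dataset, the model choice, and the 0.90 “production” accuracy are made up for demonstration.

```python
# Sketch of the MLOps "build" step: retrain a candidate model on the latest
# data, score it, and promote it only if it beats the current production
# metric. Dataset and production accuracy are illustrative placeholders.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def retrain_and_evaluate(X_train, X_test, y_train, y_test):
    """Train a candidate model and return it together with its test accuracy."""
    model = LogisticRegression(max_iter=5000)
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))

def promote_if_better(new_acc, current_acc):
    # The deployment gate: replace the production model only when the
    # candidate improves on the metric currently in production.
    return new_acc > current_acc

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

current_production_accuracy = 0.90  # pretend metric of the deployed model
candidate, new_accuracy = retrain_and_evaluate(X_train, X_test, y_train, y_test)

if promote_if_better(new_accuracy, current_production_accuracy):
    print(f"Deploying new model (accuracy {new_accuracy:.3f})")
else:
    print(f"Keeping current model (candidate scored {new_accuracy:.3f})")
```

In a real pipeline, the same gate would run as an automated stage after training, and the “current” metric would come from a model registry rather than a hard-coded constant.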

As machine learning and AI become integrated into almost every piece of software and technology we use, the number of practitioners keeps growing, and so does the size of data science teams. Data science is no longer a one-person job in an organization. To collaborate within these bigger teams, data scientists have to adopt the same practices that software engineers have been using for decades.

MLOps not only makes collaboration and integration easier, but also allows data scientists to take on more projects thanks to a clear division of responsibilities: data engineers clean and prepare the data, data scientists do what they do best and develop machine learning models, and deployment is left to the operations staff, who build pipelines to automate the whole process. With MLOps, retraining, testing, and deployment are automated. The nightly build isn’t compiling your new code and running unit tests; it’s taking all of the new data you collected during the day and retraining your models.

Recap and conclusions

Well, we have clearly understood that MLOps is a godsend. Damn, why wasn’t it invented earlier?

We can sum up MLOps with these benefits:

· Open communication: with the adoption of these best practices, communication among teams is amplified, improving cohesion, especially between data engineers, data scientists, and operations.

· Collect and prepare your data faster: thanks to the practices involved in MLOps, you can create an automatic process that gathers data into Azure Blob Storage and processes it with Azure Databricks, for instance.

· Forget manually scheduled model training and validation: after your data is gathered and prepared, trigger an Azure DevOps CI pipeline with an Azure Function, feed the data into your training code, and create a new version of the model. Then validate the outcome using your preferred metrics. If the outcome is not good enough, the model is discarded; otherwise, it is deployed through an Azure DevOps CD pipeline.

· Update your production environment without anyone noticing: with automated builds and deployments, you can refresh the model used in production during low-usage hours, such as at night. With this strategy, you avoid pop-ups like “We are updating our website. Please be patient.” We all know how annoying it is to bump into these kinds of messages.
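As a rough illustration of that last point, a zero-downtime model refresh can be as simple as an atomic file swap, assuming a serving process that always loads its model from one fixed path. The file names here are hypothetical, and a real deployment would more likely swap endpoints or registry versions than files.

```python
# Hypothetical sketch of a no-downtime model refresh: the deployment job
# writes the new model to a temporary file first, then swaps it over the
# production path with os.replace, which is atomic within one filesystem,
# so the serving side never reads a half-written file.
import os
import pickle
import tempfile

def deploy_model(model, production_path="model.pkl"):
    """Atomically replace the production model file with a new model."""
    dir_name = os.path.dirname(os.path.abspath(production_path))
    # The temp file must live on the same filesystem for os.replace
    # to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        pickle.dump(model, f)
    os.replace(tmp_path, production_path)

# Stand-in "model": any picklable object works for the sketch.
deploy_model({"weights": [0.1, 0.2]}, "model.pkl")
with open("model.pkl", "rb") as f:
    print(pickle.load(f))  # the serving side only ever sees a complete model
```

Users hitting the service during the swap keep getting answers from whichever complete model file their request happened to load, so no maintenance pop-up is needed.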

Very good, that’s MLOps. One last question, though: when will you adopt it?

Giuseppe Ruggiero
