Implementation of MLOps CI/CD Pipeline
Objective
In this blog post, we’ll dive into the foundations of MLOps and walk through the implementation of a CI/CD pipeline using GitHub Actions. The goal is to automate the process of linting, testing, and deploying a simple machine learning model. This approach helps maintain code quality and streamlines the development and deployment processes.
CI/CD Pipeline Overview
The CI/CD pipeline for this project is defined in the ci-cd.yml file, which orchestrates the entire workflow. The pipeline includes three key stages:
- Linting
- Testing
- Build and Deploy
Each stage plays a vital role in ensuring the quality and correctness of the code, leading to successful deployment of the machine learning model.
Pipeline flow: Linting → Testing → Build and Deploy → EC2 Deployment
1. Linting Stage
Purpose: The linting stage ensures that the code follows predefined style guidelines and is free from syntax errors, which helps maintain clean, readable, and maintainable code.
Tool Used: flake8
Configuration: The linting process runs flake8 on the src and tests directories, checking the code for style issues and syntax violations.
Example:
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install flake8
      - name: Lint with flake8
        run: |
          flake8 src tests
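The style rules themselves can be tuned through a flake8 configuration file in the repository root. A minimal sketch is shown below; the specific limits and exclusions are assumptions for illustration, not the project's actual settings.

# .flake8 -- illustrative configuration; the values below are assumptions
[flake8]
max-line-length = 88
exclude = .git,__pycache__,venv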
2. Testing Stage
Purpose: The testing stage verifies the correctness of the code by running unit tests. This is crucial for catching bugs early in the development process.
Tool Used: pytest
Configuration: This stage uses pytest to execute unit tests. Any issues with the functionality of the code will be caught here, ensuring that only valid code makes it to deployment.
Example:
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
      - name: Install dependencies
        run: |
          pip install pytest
      - name: Run tests
        run: |
          pytest
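To give a feel for what this stage picks up, here is a minimal test file pytest would collect from the tests directory. The normalize helper is a hypothetical example written for illustration, not a function from the actual project.

# tests/test_preprocessing.py -- hypothetical example of a test pytest would collect
import pytest


def normalize(values):
    # Hypothetical helper: scale a list of numbers to the 0-1 range
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def test_normalize_scales_to_unit_range():
    result = normalize([2, 4, 6, 8])
    assert result[0] == 0.0
    assert result[-1] == 1.0


def test_normalize_rejects_constant_input():
    # A constant input gives a zero range, so the division should fail loudly
    with pytest.raises(ZeroDivisionError):
        normalize([5, 5, 5])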
3. Build and Deploy Stage
Purpose: The final stage is responsible for building the project and deploying it to an EC2 instance. This step ensures that the model is correctly packaged and deployed in a production environment.
Tools Used: Docker, SSH
Configuration: This stage runs only for the main branch and is responsible for building the Docker image and deploying it to an EC2 instance. It uses SSH to securely connect to the server and orchestrate the deployment process.
Example:
jobs:
  build_and_deploy:
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v2
      - name: Build Docker image
        run: |
          docker build -t my-ml-model .
      - name: Deploy to EC2
        run: |
          ssh -i ~/.ssh/id_rsa ec2-user@your-ec2-instance 'bash deploy.sh'
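One detail worth noting: a fresh GitHub-hosted runner has no key at ~/.ssh/id_rsa, so the key typically has to be written from a repository secret before the ssh command runs. A sketch of such a step is shown below; the secret name EC2_SSH_KEY and the host placeholder are assumptions, not values taken from the actual workflow.

      - name: Configure SSH key   # sketch; the secret name EC2_SSH_KEY is an assumption
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.EC2_SSH_KEY }}" > ~/.ssh/id_rsa
          chmod 600 ~/.ssh/id_rsa
          # Pre-trust the host so ssh does not prompt interactively
          ssh-keyscan -H your-ec2-instance >> ~/.ssh/known_hosts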
GitHub Repository Link
You can find the full repository, along with the branches and merge history, at the following link: GitHub Repository: Olympic Winner Prediction
Process and Tooling
1. Experiment Tracking with MLflow
Task: Track experiments, metrics, parameters, and results for at least three different model training runs using MLflow.
Tool Used: MLflow
Implementation: MLflow is used to record and compare different experiments, capturing essential metrics like accuracy, precision, recall, and F1-score. Here’s a summary of three different experiments performed.
Experiment Dashboard:
Each experiment run is logged with its own parameters and metrics in the MLflow tracking UI, making side-by-side comparison straightforward.
Experiment 1:
- Model: Random Forest
- Parameters: n_estimators=100, max_depth=10
- Accuracy: 85%
Experiment 2:
- Model: Decision Tree
- Parameters: max_depth=15
- Accuracy: 80%
Experiment 3:
- Model: XGBoost
- Parameters: n_estimators=200, learning_rate=0.05
- Accuracy: 88%
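For reference, the kind of training script that produces runs like these is sketched below. The experiment name and the synthetic stand-in data are assumptions for illustration; the parameter and metric names mirror the experiments above.

# train_with_mlflow.py -- illustrative sketch of logging one run to MLflow
import mlflow
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

mlflow.set_experiment("olympic-winner-prediction")  # experiment name is an assumption

# Stand-in data; the real project loads its own dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="random_forest"):
    params = {"n_estimators": 100, "max_depth": 10}
    model = RandomForestClassifier(**params, random_state=42)
    model.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, model.predict(X_test))

    # Log parameters and metrics so runs can be compared in the MLflow UI
    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy)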
2. Data Versioning with DVC
Task: Use DVC to version control a dataset and demonstrate how to revert to a previous version.
Tool Used: DVC
Configuration: DVC is used to manage large datasets. Here’s an example of how to version control a dataset and restore a previous version.
Command Example:
dvc init
dvc add data/raw_dataset.csv
git add data/raw_dataset.csv.dvc .gitignore
git commit -m "Add raw dataset"
Reverting to a previous dataset version:
git checkout <commit_hash>
dvc checkout
Model Experimentation and Packaging
Hyperparameter Tuning
Task: Document the hyperparameter tuning process and record the best parameters found using GridSearchCV.
Best Parameters Found:
{
  "max_depth": null,
  "min_samples_split": 2,
  "n_estimators": 200
}
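As a rough illustration of how these parameters could be found, a GridSearchCV setup along the following lines searches the grid and reports the best combination. The grid values, the stand-in data, and the scoring metric are assumptions; only the best parameters above come from the actual project.

# tune.py -- illustrative GridSearchCV sketch; the grid values are assumptions
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Stand-in data; the real project uses its own feature matrix
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,                # 5-fold cross-validation
    scoring="accuracy",  # scoring metric is an assumption
    n_jobs=-1,
)
search.fit(X, y)

print(search.best_params_)  # e.g. {'max_depth': None, 'min_samples_split': 2, 'n_estimators': 200}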
Model Packaging
Task: Package the model in a Docker container for deployment. Here's the command used to run the packaged model in a Docker container:
docker run -p 8000:8000 my-ml-model
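For context, a minimal Dockerfile along these lines would produce the my-ml-model image used above. The base image, the serving framework, and the app module path are assumptions made for illustration; the actual Dockerfile lives in the repository.

# Dockerfile -- minimal sketch; the base image and serving command are assumptions
FROM python:3.10-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and the trained model artifact
COPY . .

EXPOSE 8000

# Serve the model API on port 8000 (assumes a FastAPI app in app/main.py)
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]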
Model Deployment & Orchestration
Deployed Model Endpoint
The model has been successfully deployed and is accessible at the following endpoint: http://16.171.115.123:8000
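If the container exposes a prediction route, the endpoint can be exercised with a request like the one below. The /predict path and the payload fields are hypothetical; the actual request format depends on how the API is defined in the repository.

# Hypothetical request; the /predict route and payload fields are assumptions
curl -X POST http://16.171.115.123:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"features": [0.5, 1.2, 3.4]}'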
Orchestration Process
Deployment Script: The deploy.sh script on the EC2 instance handles the deployment by pulling the latest Docker image, stopping any running containers, and starting a new container.
Continuous Deployment: The CI/CD pipeline triggers the deployment process on every push to the main branch, ensuring continuous delivery of updates.
#!/bin/bash
# Pull the latest image (assumes the image is published somewhere the instance can reach)
docker pull my-ml-model:latest
# Stop any currently running containers; ignore the error if none are running
docker stop $(docker ps -q) || true
# Start the new container, exposing the model API on port 8000
docker run -d -p 8000:8000 my-ml-model
Environment Management: GitHub Secrets are used to securely manage sensitive information like API keys, which are passed to the EC2 instance during deployment.
env:
  API_KEY: ${{ secrets.API_KEY }}
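In context, the secret is attached to the deploy step and can be forwarded to the instance over SSH. The sketch below shows one way this might look; how the variable is ultimately consumed depends on deploy.sh, so the SSH line is illustrative only.

      - name: Deploy to EC2
        env:
          API_KEY: ${{ secrets.API_KEY }}  # injected from GitHub Secrets
        run: |
          # Forward the secret to the remote deployment script (illustrative)
          ssh -i ~/.ssh/id_rsa ec2-user@your-ec2-instance "API_KEY='$API_KEY' bash deploy.sh"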
Conclusion
This post outlined foundational MLOps concepts, walking through the process of setting up a robust CI/CD pipeline, experiment tracking, data versioning, hyperparameter tuning, model packaging, and deployment. By leveraging tools like GitHub Actions, MLflow, DVC, and Docker, we can ensure efficient, reliable, and scalable machine learning workflows.