
MLOps on GCP Vertex AI: a gradual tutorial

Please read "MLOps: Continuous delivery and automation pipelines in machine learning" before beginning this tutorial.

To summarize, the MLOps levels are:

  • MLOps level 0: Manual process
  • MLOps level 1: ML pipeline automation
  • MLOps level 2: CI/CD pipeline automation

The goal of MLOps level 2 is to achieve the same velocity and quality as DevOps teams: new web applications and features are created, tested, deployed, and destroyed daily, if not multiple times a day, all with zero impact on users. For example, Google constantly adds or updates products across its entire portfolio, which means hundreds to thousands of new deployments on any given day. Even if your company is not as big as Google, your AI/ML team can and should aspire to the same velocity as its application development teams. That means adopting the practices and principles of DevOps.

MLOps level 0: Manual process

Topics

  • Jupyter Notebooks

Data Scientist workflow

  1. I load the data in a Jupyter notebook
  2. I iterate on the model
  3. I run the training notebook to output a model (see the sketch after this list)
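
At this level the whole workflow is one hand-run notebook or script. Below is a minimal sketch of those three steps; it assumes scikit-learn is installed and uses a hypothetical data.csv with a label column, so adjust names to your data.

```python
# Level 0 in one script: every step below is run by hand.
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("data.csv")                       # 1. load the data
X, y = df.drop(columns=["label"]), df["label"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

model = RandomForestClassifier(random_state=42)    # 2. iterate on the model
model.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))

joblib.dump(model, "model.joblib")                 # 3. output the model artifact
```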

MLOps level 1: ML pipeline automation

Topics

  • Non-Jupyter IDE: code is written in .py files to accommodate containers, not in .ipynb notebooks
  • Docker containers: package custom code so it runs repeatably
  • Vertex AI Pipelines: chain the containerized steps into a workflow

Data Scientist workflow

  1. I may iterate on the training container
    1. Manually build the Docker image and push it to Artifact Registry
  2. I may modify the Vertex AI pipeline (see the compile sketch after this list)
    1. Manually recompile pipeline.yaml
  3. I may need a new bucket
    1. Manually create the new bucket in the Console
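
For step 2, "recompiling pipeline.yaml" usually means re-running the Kubeflow Pipelines (KFP v2) compiler over a pipeline defined in Python. A minimal sketch, assuming the kfp SDK is installed and using a hypothetical trainer image URI that was built and pushed by hand in step 1:

```python
# compile_pipeline.py -- rerun whenever the pipeline definition changes.
from kfp import compiler, dsl

# Hypothetical image URI; at level 1 it was built and pushed manually.
TRAINER_IMAGE = "us-docker.pkg.dev/my-project/my-repo/trainer:latest"

@dsl.container_component
def train():
    # Run the training code baked into the custom container.
    return dsl.ContainerSpec(image=TRAINER_IMAGE, command=["python", "train.py"])

@dsl.pipeline(name="training-pipeline")
def training_pipeline():
    train()

if __name__ == "__main__":
    # "Manually recompile pipeline.yaml"
    compiler.Compiler().compile(
        pipeline_func=training_pipeline, package_path="pipeline.yaml"
    )
```

The resulting pipeline.yaml can then be uploaded and run from the Vertex AI Pipelines console.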

MLOps level 2: CI/CD pipeline automation

Topics

  • Unit testing: code changes must pass automated tests before deployment (see the pytest sketch after this list)
  • Production environments: never touched manually; only vetted code can deploy
  • Terraform: Infrastructure as Code, e.g. create new buckets from code instead of the Console
  • Cloud Build: a managed build service, e.g. for docker build and push
  • Git automation: when source code changes, the changes are automatically tested and applied
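
To make the unit-testing topic concrete: factoring training code into importable functions lets Cloud Build test it without touching GCP at all. A minimal pytest sketch, where trainer.preprocess is a hypothetical function inside the training container's source:

```python
# test_trainer.py -- run with `pytest`; needs no GCP credentials.
import pandas as pd

from trainer import preprocess  # hypothetical module in the training source


def test_preprocess_drops_rows_with_missing_labels():
    df = pd.DataFrame({"feature": [1.0, 2.0], "label": [0.0, None]})
    out = preprocess(df)
    assert out["label"].notna().all()
```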

Data Scientist workflow

  1. I iterate on the training container locally
  2. I git push the changes to a dev branch in the repo of choice, e.g. GitLab or GitHub
  3. Cloud Build detects the change, then runs the steps in cloudbuild.yaml (*), which may include:
    1. Running unit tests
    2. Running docker build and pushing the image to Artifact Registry
    3. Running terraform apply
    4. Running functional tests
  4. Once everything passes in dev, the changes may be promoted to production automatically, depending on your DevOps process
  5. The same checks and builds run in production, and your new model is launched (see the launch sketch below)

(*) In this tutorial we will not connect Cloud Build to a repo; instead, we will run Cloud Build manually to mimic what the trigger would execute.
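
Whether the build is triggered by a push or run by hand, its final step typically launches the freshly compiled pipeline, which is what puts the new model into production in step 5. A minimal sketch using the google-cloud-aiplatform SDK, with hypothetical project, region, and bucket names:

```python
# launch_pipeline.py -- submit the compiled pipeline to Vertex AI Pipelines.
from google.cloud import aiplatform

# Hypothetical project and region.
aiplatform.init(project="my-project", location="us-central1")

job = aiplatform.PipelineJob(
    display_name="training-pipeline",
    template_path="pipeline.yaml",                        # compiled earlier
    pipeline_root="gs://my-pipeline-bucket/pipeline-root",  # hypothetical bucket
)
job.run()  # blocks until the pipeline run finishes
```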

Best practices

Resources
