From 5c0c3c45eca1bfafe2a74e95bf3f6a0ea74af229 Mon Sep 17 00:00:00 2001
From: David Tesar
Date: Fri, 12 Jun 2020 17:25:36 -0700
Subject: [PATCH 1/5] Simplify flow

---
 bootstrap/README.md     | 17 +----------------
 docs/custom_model.md    | 24 ++++++++++++++++++------
 docs/getting_started.md |  4 ++--
 3 files changed, 21 insertions(+), 24 deletions(-)

diff --git a/bootstrap/README.md b/bootstrap/README.md
index 27051f2b..86e770f0 100644
--- a/bootstrap/README.md
+++ b/bootstrap/README.md
@@ -1,18 +1,3 @@
 # Bootstrap from MLOpsPython repository

-To use this existing project structure and scripts for your new ML project, you can quickly get started from the existing repository, bootstrap and create a template that works for your ML project.
-
-Bootstrapping will prepare a directory structure for your project which includes:
-
-* renaming files and folders from the base project name `diabetes_regression` to your project name
-* fixing imports and absolute path based on your project name
-* deleting and cleaning up some directories
-
-To bootstrap from the existing MLOpsPython repository:
-
-1. Ensure Python 3 is installed locally
-1. Clone this repository locally
-1. Run bootstrap.py script
-`python bootstrap.py -d [dirpath] -n [projectname]`
-    * `[dirpath]` is the absolute path to the root of the directory where MLOpsPython is cloned
-    * `[projectname]` is the name of your ML project
+For steps on how to use the bootstrap script, please see the "Bootstrap the project" section of the [custom model guide](..\docs\custom_model.md).
diff --git a/docs/custom_model.md b/docs/custom_model.md
index bce1fb8a..32eeba44 100644
--- a/docs/custom_model.md
+++ b/docs/custom_model.md
@@ -3,11 +3,11 @@
 This document provides steps to follow when using this repository as a template to train models and deploy the models with real-time inference in Azure ML with your own scripts and data.

 1. Follow the MLOpsPython [Getting Started](getting_started.md) guide
-1. Follow the MLOpsPython [bootstrap instructions](../bootstrap/README.md) to create your project starting point
+1. Bootstrap the project
 1. Configure training data
 1. [If necessary] Convert your ML experimental code into production ready code
 1. Replace the training code
-1. Update the evaluation code
+1. [Optional] Update the evaluation code
 1. Customize the build agent environment
 1. [If appropriate] Replace the score code

@@ -17,24 +17,36 @@ Follow the [Getting Started](getting_started.md) guide to set up the infrastruct

 Take a look at the [Repo Details](code_description.md) document for a description of the structure of this repository.

-## Follow the Bootstrap instructions
+## Bootstrap the project

-The [Bootstrap from MLOpsPython repository](../bootstrap/README.md) guide will help you to quickly prepare the repository for your project.
+Bootstrapping will prepare a directory structure for your project which includes:
+
+* renaming files and folders from the base project name `diabetes_regression` to your project name
+* fixing imports and absolute path based on your project name
+* deleting and cleaning up some directories

 **Note:** Since the bootstrap script will rename the `diabetes_regression` folder to the project name of your choice, we'll refer to your project as `[project name]` when paths are involved.

+To bootstrap from the existing MLOpsPython repository:
+
+1. Ensure Python 3 is installed locally
+1. From a local copy of the code, run the `bootstrap.py` script in the `bootstrap` folder
+`python bootstrap.py -d [dirpath] -n [projectname]`
+    * `[dirpath]` is the absolute path to the root of the directory where MLOpsPython is cloned
+    * `[projectname]` is the name of your ML project
+
 ## Configure training data

 The training ML pipeline uses a [sample diabetes dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html) as training data.

-To use your own data:
+**Important** Convert the template to use your own model Azure ML Dataset for model training via these steps:

 1. [Create a Dataset](https://docs.microsoft.com/azure/machine-learning/how-to-create-register-datasets) in your Azure ML workspace
 1. Update the `DATASET_NAME` and `DATASTORE_NAME` variables in `.pipelines/[project name]-variables-template.yml`

 ## Convert your ML experimental code into production ready code

-The MLOpsPython template creates an Azure Machine Learning (ML) pipeline that invokes a set of [Azure ML pipeline steps](https://docs.microsoft.com/python/api/azureml-pipeline-steps/azureml.pipeline.steps) (see `ml_service/pipelines/[project name]_build_train_pipeline.py`). If your experiment is currently in a Jupyter notebook, it will need to be refactored into scripts that can be run independantly and dropped into the template which the existing Azure ML pipeline steps utilize.
+The MLOpsPython template creates an Azure Machine Learning (ML) pipeline that invokes a set of [Azure ML pipeline steps](https://docs.microsoft.com/python/api/azureml-pipeline-steps/azureml.pipeline.steps) (see `ml_service/pipelines/[project name]_build_train_pipeline.py`). If your experiment is currently in a Jupyter notebook, it will need to be refactored into scripts that can be run independently and dropped into the template which the existing Azure ML pipeline steps utilize.

 1. Refactor your experiment code into scripts
 1. [Recommended] Prepare unit tests
diff --git a/docs/getting_started.md b/docs/getting_started.md
index 2c63fa94..fe593069 100644
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
@@ -1,9 +1,9 @@
 # Getting Started with MLOpsPython

-This guide shows how to get MLOpsPython working with a sample ML project ***diabetes_regression***. The project creates a linear regression model to predict diabetes. You can adapt this example to use with your own project.
+This guide shows how to get MLOpsPython working with a sample ML project ***diabetes_regression***. The project creates a linear regression model to predict diabetes and has CI/CD DevOps practices enabled for model training and serving when these steps are completed in this getting started guide.


-We recommend working through this guide completely to ensure everything is working in your environment. After the sample is working, follow the [bootstrap instructions](../bootstrap/README.md) to convert the ***diabetes_regression*** sample into a starting point for your project.
+If you would like to bring your own model code to use this template structure, follow the [custom model](custom_model.md) guide. We recommend completing this getting started guide with the diabetes model first to ensure everything is working in your environment before converting the template to use your own model code.

 - [Setting up Azure DevOps](#setting-up-azure-devops)
 - [Get the code](#get-the-code)
From f0074c65ae1326e0ed0c50ebc0fa1cfe5bd17930 Mon Sep 17 00:00:00 2001
From: David Tesar
Date: Fri, 12 Jun 2020 17:36:07 -0700
Subject: [PATCH 2/5] minor tweaks

---
 bootstrap/README.md     | 2 +-
 docs/getting_started.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/bootstrap/README.md b/bootstrap/README.md
index 86e770f0..bed2fe0f 100644
--- a/bootstrap/README.md
+++ b/bootstrap/README.md
@@ -1,3 +1,3 @@
 # Bootstrap from MLOpsPython repository

-For steps on how to use the bootstrap script, please see the "Bootstrap the project" section of the [custom model guide](..\docs\custom_model.md).
+For steps on how to use the bootstrap script, please see the "Bootstrap the project" section of the [custom model guide](../docs/custom_model.md).
diff --git a/docs/getting_started.md b/docs/getting_started.md
index fe593069..1025efa8 100644
--- a/docs/getting_started.md
+++ b/docs/getting_started.md
@@ -3,7 +3,7 @@
 This guide shows how to get MLOpsPython working with a sample ML project ***diabetes_regression***. The project creates a linear regression model to predict diabetes and has CI/CD DevOps practices enabled for model training and serving when these steps are completed in this getting started guide.


-If you would like to bring your own model code to use this template structure, follow the [custom model](custom_model.md) guide. We recommend completing this getting started guide with the diabetes model first to ensure everything is working in your environment before converting the template to use your own model code.
+If you would like to bring your own model code to use this template structure, follow the [custom model](custom_model.md) guide. We recommend completing this getting started guide with the diabetes model through ACI deployment first to ensure everything is working in your environment before converting the template to use your own model code.

 - [Setting up Azure DevOps](#setting-up-azure-devops)
 - [Get the code](#get-the-code)
From c33dc0df960bc87714c027ed9961fb8aa6b8b6a5 Mon Sep 17 00:00:00 2001
From: David Tesar
Date: Fri, 12 Jun 2020 17:41:31 -0700
Subject: [PATCH 3/5] link to bootstrap section of doc

---
 bootstrap/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/bootstrap/README.md b/bootstrap/README.md
index bed2fe0f..0841cc30 100644
--- a/bootstrap/README.md
+++ b/bootstrap/README.md
@@ -1,3 +1,3 @@
 # Bootstrap from MLOpsPython repository

-For steps on how to use the bootstrap script, please see the "Bootstrap the project" section of the [custom model guide](../docs/custom_model.md).
+For steps on how to use the bootstrap script, please see the "Bootstrap the project" section of the [custom model guide](../docs/custom_model.md#bootstrap-the-project).
From 07ad6e6a92e4b41f2b75c0aa32413b48949ffd00 Mon Sep 17 00:00:00 2001
From: David Tesar
Date: Fri, 12 Jun 2020 17:44:07 -0700
Subject: [PATCH 4/5] clarify bootstrap wording

---
 docs/custom_model.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/custom_model.md b/docs/custom_model.md
index 32eeba44..523490d6 100644
--- a/docs/custom_model.md
+++ b/docs/custom_model.md
@@ -19,7 +19,7 @@ Take a look at the [Repo Details](code_description.md) document for a descriptio

 ## Bootstrap the project

-Bootstrapping will prepare a directory structure for your project which includes:
+Bootstrapping will prepare the directory structure to be used for your project name which includes:

 * renaming files and folders from the base project name `diabetes_regression` to your project name
 * fixing imports and absolute path based on your project name
From 4ec837f985623c45ef631de1b9d67d485840d04c Mon Sep 17 00:00:00 2001
From: David Tesar
Date: Mon, 15 Jun 2020 12:18:59 -0700
Subject: [PATCH 5/5] address feedback

---
 docs/custom_model.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/custom_model.md b/docs/custom_model.md
index 523490d6..d21c8b8d 100644
--- a/docs/custom_model.md
+++ b/docs/custom_model.md
@@ -39,7 +39,7 @@ To bootstrap from the existing MLOpsPython repository:

 The training ML pipeline uses a [sample diabetes dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html) as training data.

-**Important** Convert the template to use your own model Azure ML Dataset for model training via these steps:
+**Important** Convert the template to use your own Azure ML Dataset for model training via these steps:

 1. [Create a Dataset](https://docs.microsoft.com/azure/machine-learning/how-to-create-register-datasets) in your Azure ML workspace
 1. Update the `DATASET_NAME` and `DATASTORE_NAME` variables in `.pipelines/[project name]-variables-template.yml`