|
6 | 6 | 5. [Running using python command](#pythoncommand) |
7 | 7 | 6. [Troubleshooting](#troubleshooting) |
8 | 8 |
|
9 | | -# Auto ML Introduction <a name="introduction"></a> |
10 | | -AutoML builds high quality Machine Learning models for you by automating model and hyperparameter selection. Bring a labelled dataset that you want to build a model for, AutoML will give you a high quality machine learning model that you can use for predictions. |
| 9 | +# Automated machine learning introduction <a name="introduction"></a> |
| 10 | +Automated machine learning (automated ML) builds high-quality machine learning models for you by automating model and hyperparameter selection. Bring a labelled dataset that you want to build a model for, and automated ML will give you a high-quality machine learning model that you can use for predictions. |
11 | 11 |
|
12 | | -If you are new to Data Science, AutoML will help you get jumpstarted by simplifying machine learning model building. It abstracts you from needing to perform model selection, hyperparameter selection and in one step creates a high quality trained model for you to use. |
| 12 | +If you are new to data science, automated ML will help you get started by simplifying machine learning model building. It removes the need to perform model selection and hyperparameter selection yourself, and creates a high-quality trained model for you in one step. |
| 13 | + |
| 14 | +If you are an experienced data scientist, automated ML will help increase your productivity by intelligently performing the model and hyperparameter selection for your training, generating high-quality models much more quickly than manually specifying several combinations of parameters and running training jobs. Automated ML provides visibility and access to all the training jobs and the performance characteristics of the models to help you further tune the pipeline if you desire. |
13 | 15 |
|
14 | | -If you are an experienced data scientist, AutoML will help increase your productivity by intelligently performing the model and hyperparameter selection for your training and generates high quality models much quicker than manually specifying several combinations of the parameters and running training jobs. AutoML provides visibility and access to all the training jobs and the performance characteristics of the models to help you further tune the pipeline if you desire. |
15 | 16 |
|
16 | 17 | # Running samples in a Local Conda environment <a name="localconda"></a> |
17 | 18 |
|
| 19 | +You can run these notebooks in Azure Notebooks without any extra installation. To run these notebooks on your own notebook server, use these installation instructions. |
| 20 | + |
18 | 21 | It is best to create a new conda environment locally to try this SDK, so it doesn't interfere with your existing Python environment. |
19 | 22 |
|
20 | 23 | ### 1. Install mini-conda from [here](https://conda.io/miniconda.html), choose Python 3.7 or higher. |
21 | 24 | - **Note**: if you already have conda installed, you can keep using it but it should be version 4.4.10 or later (as shown by: conda -V). If you have a previous version installed, you can update it using the command: conda update conda. |
22 | 25 | There's no need to install mini-conda specifically. |
23 | 26 |
|
24 | 27 | ### 2. Downloading the sample notebooks |
25 | | -- Download the sample notebooks from [GitHub](https://github.com/Azure/MachineLearningNotebooks) as zip and extract the contents to a local directory. The AutoML sample notebooks are in the "automl" folder. |
| 28 | +- Download the sample notebooks from [GitHub](https://github.com/Azure/MachineLearningNotebooks) as zip and extract the contents to a local directory. The automated ML sample notebooks are in the "automl" folder. |
26 | 29 |
|
27 | 30 | ### 3. Setup a new conda environment |
28 | | -The automl_setup script creates a new conda environment, installs the necessary packages, configures the widget and starts jupyter notebook. |
| 31 | +The **automl/automl_setup** script creates a new conda environment, installs the necessary packages, configures the widget and starts a jupyter notebook. |
29 | 32 | It takes the conda environment name as an optional parameter. The default conda environment name is azure_automl. The exact command depends on the operating system. It can take about 30 minutes to execute. |
30 | 33 | ## Windows |
31 | | -Start a conda command windows, cd to the "automl" folder where the sample notebooks were extracted and then run: automl_setup |
| 34 | +Start a conda command window, cd to the **automl** folder where the sample notebooks were extracted and then run: |
| 35 | +``` |
| 36 | +automl_setup |
| 37 | +``` |
32 | 38 | ## Mac |
33 | | -Install "Command line developer tools" if it is not already installed (you can use the command: xcode-select --install). |
34 | | -Start a Terminal windows, cd to the "automl" folder where the sample notebooks were extracted and then run: bash automl_setup_mac.sh |
| 39 | +Install "Command line developer tools" if it is not already installed (you can use the command: `xcode-select --install`). |
| 40 | + |
| 41 | +Start a Terminal window, cd to the **automl** folder where the sample notebooks were extracted and then run: |
| 42 | + |
| 43 | +``` |
| 44 | +bash automl_setup_mac.sh |
| 45 | +``` |
| 46 | + |
35 | 47 | ## Linux |
36 | | -cd to the "automl" folder where the sample notebooks were extracted and then run: automl_setup_linux.sh |
| 48 | +cd to the **automl** folder where the sample notebooks were extracted and then run: |
| 49 | + |
| 50 | +``` |
| 51 | +automl_setup_linux.sh |
| 52 | +``` |
37 | 53 |
|
38 | 54 | ### 4. Running configuration.ipynb |
39 | | -- Before running any samples you would need to run the configuration notebook. Click on 00.configuration.ipynb notebook |
| 55 | +- Before running any samples, you need to run the configuration notebook. Click on the 00.configuration.ipynb notebook. |
40 | 56 | - Please make sure you use the Python [conda env:azure_automl] kernel when running this notebook. |
41 | 57 | - Execute the cells in the notebook to Register Machine Learning Services Resource Provider and create a workspace. (*instructions in notebook*) |
42 | 58 |
|
43 | 59 | ### 5. Running Samples |
44 | 60 | - Please make sure you use the Python [conda env:azure_automl] kernel when trying the sample Notebooks. |
45 | | -- Follow the instructions in the individual notebooks to explore various features in AutoML |
| 61 | +- Follow the instructions in the individual notebooks to explore various features in automated ML |
46 | 62 |
|
47 | 63 | # Auto ML SDK Sample Notebooks <a name="samples"></a> |
48 | 64 | - [00.configuration.ipynb](00.configuration.ipynb) |
@@ -97,8 +113,8 @@ cd to the "automl" folder where the sample notebooks were extracted and then run |
97 | 113 |
|
98 | 114 | - [07.auto-ml-exploring-previous-runs.ipynb](07.auto-ml-exploring-previous-runs) |
99 | 115 | - List all projects for the workspace |
100 | | - - List all AutoML Runs for a given project |
101 | | - - Get details for a AutoML Run. (Automl settings, run widget & all metrics) |
| 116 | + - List all automated ML Runs for a given project |
| 117 | + - Get details for an automated ML Run. (Automl settings, run widget & all metrics) |
102 | 118 | - Download fitted pipeline for any iteration |
103 | 119 |
|
104 | 120 | - [08.auto-ml-remote-execution-with-text-file-on-DSVM](08.auto-ml-remote-execution-with-text-file-on-DSVM.ipynb) |
@@ -135,11 +151,11 @@ cd to the "automl" folder where the sample notebooks were extracted and then run |
135 | 151 |
|
136 | 152 | # Documentation <a name="documentation"></a> |
137 | 153 | ## Table of Contents |
138 | | -1. [Auto ML Settings ](#automlsettings) |
| 154 | +1. [Automated ML Settings ](#automlsettings) |
139 | 155 | 2. [Cross validation split options](#cvsplits) |
140 | 156 | 3. [Get Data Syntax](#getdata) |
141 | 157 |
|
142 | | -## Auto ML Settings <a name="automlsettings"></a> |
| 158 | +## Automated ML Settings <a name="automlsettings"></a> |
143 | 159 | |Property|Description|Default| |
144 | 160 | |-|-|-| |
145 | 161 | |**primary_metric**|This is the metric that you want to optimize.<br><br> Classification supports the following primary metrics <br><i>accuracy</i><br><i>AUC_weighted</i><br><i>balanced_accuracy</i><br><i>average_precision_score_weighted</i><br><i>precision_score_weighted</i><br><br> Regression supports the following primary metrics <br><i>spearman_correlation</i><br><i>normalized_root_mean_squared_error</i><br><i>r2_score</i><br><i>normalized_mean_absolute_error</i><br><i>normalized_root_mean_squared_log_error</i>| Classification: accuracy <br><br> Regression: spearman_correlation |
@@ -191,7 +207,7 @@ The main code of the file must be indented so that it is under this condition. |
191 | 207 |
|
192 | 208 | # Troubleshooting <a name="troubleshooting"></a> |
193 | 209 | ## Iterations fail and the log contains "MemoryError" |
194 | | -This can be caused by insufficient memory on the DSVM. AutoML loads all training data into memory. So, the available memory should be more than the training data size. |
| 210 | +This can be caused by insufficient memory on the DSVM. Automated ML loads all training data into memory, so the available memory must exceed the training data size. |
195 | 211 | If you are using a remote DSVM, memory is needed for each concurrent iteration. The concurrent_iterations setting specifies the maximum concurrent iterations. For example, if the training data size is 8 GB and concurrent_iterations is set to 10, at least 80 GB of memory is required. |
196 | 212 | To resolve this issue, allocate a DSVM with more memory or reduce the value specified for concurrent_iterations. |
197 | 213 |
|
|
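The memory rule in the Troubleshooting hunk above (training data size multiplied by `concurrent_iterations`) can be sketched as a quick estimate. This is a minimal illustration, not part of the automated ML SDK: the function names `min_memory_gb` and `max_concurrent_iterations` are hypothetical, and the result is only the lower bound stated in the README, so a real DSVM would likely need additional headroom.

```python
# Hypothetical helpers for illustration only; they are not part of the automated ML SDK.
# They encode the README's rule: each concurrent iteration needs the training data in memory.


def min_memory_gb(training_data_gb: float, concurrent_iterations: int) -> float:
    """Lower bound on DSVM memory: one in-memory copy of the data per concurrent iteration."""
    if training_data_gb <= 0:
        raise ValueError("training_data_gb must be positive")
    if concurrent_iterations < 1:
        raise ValueError("concurrent_iterations must be at least 1")
    return training_data_gb * concurrent_iterations


def max_concurrent_iterations(available_memory_gb: float, training_data_gb: float) -> int:
    """Largest concurrent_iterations value whose lower bound still fits in available memory."""
    if training_data_gb <= 0:
        raise ValueError("training_data_gb must be positive")
    if available_memory_gb < 0:
        raise ValueError("available_memory_gb must not be negative")
    return int(available_memory_gb // training_data_gb)


if __name__ == "__main__":
    # The README's example: 8 GB of training data with concurrent_iterations set to 10.
    print(min_memory_gb(8, 10))
    # Working backwards for an assumed 64 GB DSVM with the same training data.
    print(max_concurrent_iterations(64, 8))
```

Working backwards with `max_concurrent_iterations` gives a starting value for `concurrent_iterations` when the DSVM size is fixed; a result of 0 means the training data does not fit in memory at all under this rule.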