@@ -106,52 +106,87 @@ jupyter notebook
106106<a name =" samples " ></a >
107107# Automated ML SDK Sample Notebooks
108108
109- - [ auto-ml-classification-credit-card-fraud.ipynb] ( classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb )
110- - Dataset: Kaggle's [ credit card fraud detection dataset] ( https://www.kaggle.com/mlg-ulb/creditcardfraud )
111- - Simple example of using automated ML for classification to fraudulent credit card transactions
112- - Uses azure compute for training
113-
114- - [ auto-ml-regression.ipynb] ( regression/auto-ml-regression.ipynb )
115- - Dataset: Hardware Performance Dataset
116- - Simple example of using automated ML for regression
117- - Uses azure compute for training
118-
119- - [ auto-ml-regression-explanation-featurization.ipynb] ( regression-explanation-featurization/auto-ml-regression-explanation-featurization.ipynb )
109+ ## Classification
110+ - ** Classify Credit Card Fraud**
111+ - Dataset: [ Kaggle's credit card fraud detection dataset] ( https://www.kaggle.com/mlg-ulb/creditcardfraud )
112+ - ** [ Jupyter Notebook (remote run)] ( classification-credit-card-fraud/auto-ml-classification-credit-card-fraud.ipynb ) **
113+ - run the experiment remotely on AML Compute cluster
114+ - test the performance of the best model in the local environment
115+ - ** [ Jupyter Notebook (local run)] ( local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb ) **
116+ - run experiment in the local environment
117+ - use Mimic Explainer for computing feature importance
118+ - deploy the best model along with the explainer to an Azure Kubernetes (AKS) cluster, which will compute the raw and engineered feature importances at inference time
119+ - ** Predict Term Deposit Subscriptions in a Bank**
120+ - Dataset: [ UCI's bank marketing dataset] ( https://www.kaggle.com/janiobachmann/bank-marketing-dataset )
121+ - ** [ Jupyter Notebook] ( classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb ) **
122+ - run experiment remotely on AML Compute cluster to generate ONNX compatible models
123+ - view the featurization steps that were applied during training
124+ - view feature importance for the best model
125+ - download the best model in ONNX format and use it for inferencing using ONNXRuntime
126+ - deploy the best model in PKL format to Azure Container Instance (ACI)
127+ - ** Predict Newsgroup based on Text from News Article**
128+ - Dataset: [ 20 newsgroups text dataset] ( https://scikit-learn.org/0.19/datasets/twenty_newsgroups.html )
129+ - ** [ Jupyter Notebook] ( classification-text-dnn/auto-ml-classification-text-dnn.ipynb ) **
130+ - AutoML highlights here include using deep neural networks (DNNs) to create embedded features from text data
131+ - AutoML will use Bidirectional Encoder Representations from Transformers (BERT) when a GPU compute is used
132+ - A Bidirectional Long Short-Term Memory (BiLSTM) network will be used when a CPU compute is used, thereby optimizing the choice of DNN
133+
134+ ## Regression
135+ - ** Predict Performance of Hardware Parts**
120136 - Dataset: Hardware Performance Dataset
121- - Shows featurization and excplanation
122- - Uses azure compute for training
123-
124- - [ auto-ml-forecasting-energy-demand.ipynb] ( forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb )
125- - Dataset: [ NYC energy demand data] ( forecasting-a/nyc_energy.csv )
126- - Example of using automated ML for training a forecasting model
127-
128- - [ auto-ml-classification-credit-card-fraud-local.ipynb] ( local-run-classification-credit-card-fraud/auto-ml-classification-credit-card-fraud-local.ipynb )
129- - Dataset: Kaggle's [ credit card fraud detection dataset] ( https://www.kaggle.com/mlg-ulb/creditcardfraud )
130- - Simple example of using automated ML for classification to fraudulent credit card transactions
131- - Uses local compute for training
132-
133- - [ auto-ml-classification-bank-marketing-all-features.ipynb] ( classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb )
134- - Dataset: UCI's [ bank marketing dataset] ( https://www.kaggle.com/janiobachmann/bank-marketing-dataset )
135- - Simple example of using automated ML for classification to predict term deposit subscriptions for a bank
136- - Uses azure compute for training
137-
138- - [ auto-ml-forecasting-orange-juice-sales.ipynb] ( forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb )
139- - Dataset: [ Dominick's grocery sales of orange juice] ( forecasting-b/dominicks_OJ.csv )
140- - Example of training an automated ML forecasting model on multiple time-series
141-
142- - [ auto-ml-forecasting-bike-share.ipynb] ( forecasting-bike-share/auto-ml-forecasting-bike-share.ipynb )
143- - Dataset: forecasting for a bike-sharing
144- - Example of training an automated ML forecasting model on multiple time-series
145-
146- - [ auto-ml-forecasting-function.ipynb] ( forecasting-forecast-function/auto-ml-forecasting-function.ipynb )
147- - Example of training an automated ML forecasting model on multiple time-series
148-
149- - [ auto-ml-forecasting-beer-remote.ipynb] ( forecasting-beer-remote/auto-ml-forecasting-beer-remote.ipynb )
150- - Example of training an automated ML forecasting model on multiple time-series
151- - Beer Production Forecasting
152-
153- - [ auto-ml-continuous-retraining.ipynb] ( continuous-retraining/auto-ml-continuous-retraining.ipynb )
154- - Continuous retraining using Pipelines and Time-Series TabularDataset
137+ - ** [ Jupyter Notebook] ( regression/auto-ml-regression.ipynb ) **
138+ - run the experiment remotely on AML Compute cluster
139+ - get best trained model for a different metric than the one the experiment was optimized for
140+ - test the performance of the best model in the local environment
141+ - ** [ Jupyter Notebook (advanced)] ( regression-explanation-featurization/auto-ml-regression-explanation-featurization.ipynb ) **
142+ - run the experiment remotely on AML Compute cluster
143+ - customize featurization: override column purpose within the dataset, configure transformer parameters
144+ - get best trained model for a different metric than the one the experiment was optimized for
145+ - run a model explanation experiment on the remote cluster
146+ - deploy the model along with the explainer and run online inferencing
147+
148+ ## Time Series Forecasting
149+ - ** Forecast Energy Demand**
150+ - Dataset: [ NYC energy demand data] ( http://mis.nyiso.com/public/P-58Blist.htm )
151+ - ** [ Jupyter Notebook] ( forecasting-energy-demand/auto-ml-forecasting-energy-demand.ipynb ) **
152+ - run experiment remotely on AML Compute cluster
153+ - use lags and rolling window features
154+ - view the featurization steps that were applied during training
155+ - get the best model, use it to forecast on test data and compare the accuracy of predictions against real data
156+ - ** Forecast Orange Juice Sales (Multi-Series)**
157+ - Dataset: [ Dominick's grocery sales of orange juice] ( forecasting-orange-juice-sales/dominicks_OJ.csv )
158+ - ** [ Jupyter Notebook] ( forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb ) **
159+ - run experiment remotely on AML Compute cluster
160+ - customize time-series featurization, change column purpose and override transformer hyperparameters
161+ - evaluate locally the performance of the generated best model
162+ - deploy the best model as a webservice on Azure Container Instance (ACI)
163+ - get online predictions from the deployed model
164+ - ** Forecast Demand of a Bike-Sharing Service**
165+ - Dataset: [ Bike demand data] ( forecasting-bike-share/bike-no.csv )
166+ - ** [ Jupyter Notebook] ( forecasting-bike-share/auto-ml-forecasting-bike-share.ipynb ) **
167+ - run experiment remotely on AML Compute cluster
168+ - integrate holiday features
169+ - run rolling forecast for test set that is longer than the forecast horizon
170+ - compute metrics on the predictions from the remote forecast
171+ - ** The Forecast Function Interface**
172+ - Dataset: Generated for sample purposes
173+ - ** [ Jupyter Notebook] ( forecasting-forecast-function/auto-ml-forecasting-function.ipynb ) **
174+ - train a forecaster using a remote AML Compute cluster
175+ - capabilities of forecast function (e.g. forecast farther into the horizon)
176+ - generate confidence intervals
177+ - ** Forecast Beverage Production**
178+ - Dataset: [ Monthly beer production data] ( forecasting-beer-remote/Beer_no_valid_split_train.csv )
179+ - ** [ Jupyter Notebook] ( forecasting-beer-remote/auto-ml-forecasting-beer-remote.ipynb ) **
180+ - train using a remote AML Compute cluster
181+ - enable the DNN learning model
182+ - forecast on a remote compute cluster and compare different model performance
183+ - ** Continuous Retraining with NOAA Weather Data**
184+ - Dataset: [ NOAA weather data from Azure Open Datasets] ( https://azure.microsoft.com/en-us/services/open-datasets/ )
185+ - ** [ Jupyter Notebook] ( continuous-retraining/auto-ml-continuous-retraining.ipynb ) **
186+ - continuously retrain a model using Pipelines and AutoML
187+ - create a Pipeline to upload a time series dataset to an Azure blob
188+ - create a Pipeline to run an AutoML experiment and register the best resulting model in the Workspace
189+ - publish the training pipeline created and schedule it to run daily
155190
156191<a name =" documentation " ></a >
157192See [ Configure automated machine learning experiments] ( https://docs.microsoft.com/azure/machine-learning/service/how-to-configure-auto-train ) to learn more about the settings and features available for automated machine learning experiments.
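
Note on the "use lags and rolling window features" item listed under Forecast Energy Demand: the sketch below illustrates the general idea in plain Python. It is not the AutoML implementation and does not use the Azure ML SDK; the function names `lag_feature` and `rolling_mean_feature` and the sample values are hypothetical, chosen only for illustration. A lag feature copies the target from a fixed number of steps in the past, and a rolling window feature aggregates (here, averages) a window of past values, using only values before the current step so the target does not leak into its own feature.

```python
from typing import List, Optional


def lag_feature(series: List[float], lag: int) -> List[Optional[float]]:
    """Shift the series forward by `lag` steps.

    The first `lag` positions have no history, so they are filled with None.
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    kept = max(len(series) - lag, 0)
    return [None] * min(lag, len(series)) + series[:kept]


def rolling_mean_feature(series: List[float], window: int) -> List[Optional[float]]:
    """Mean of the `window` values strictly before each position.

    Positions with fewer than `window` past values are filled with None.
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    features: List[Optional[float]] = []
    for i in range(len(series)):
        if i < window:
            features.append(None)
        else:
            past = series[i - window:i]  # excludes the current value
            features.append(sum(past) / window)
    return features


if __name__ == "__main__":
    # Hypothetical hourly demand values, for illustration only.
    demand = [10.0, 12.0, 14.0, 16.0, 18.0]
    print(lag_feature(demand, 1))
    print(rolling_mean_feature(demand, 2))
```

In the notebooks themselves these features are requested through the experiment's forecasting settings rather than computed by hand; the sketch only shows what such columns contain.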