Add details on the Batch scoring session of Getting Started (microsoft#389)
* Revision of getting started guide up to Batch scoring. Also new diagram and fix to ARM template to remove region restrictions.
* Detail on Batch scoring for Getting Started and additional debug message in the copy step to ease diagnosing issues
* Tweaked text and added a NOQA for message
Co-authored-by: Joao Pedro Martins <[email protected]>
docs/getting_started.md (21 additions & 11 deletions)
@@ -286,39 +286,49 @@ The pipeline has the following stage:
### Set up the Batch Scoring pipeline
-In your Azure DevOps project, create and run a new build pipeline based on the [diabetes_regression-batchscoring-ci.yml](../.pipelines/diabetes_regression-batchscoring-ci.yml)
-pipeline definition in your forked repository.
+In your Azure DevOps project, create and run a new build pipeline based on the [.pipelines/diabetes_regression-batchscoring-ci.yml](../.pipelines/diabetes_regression-batchscoring-ci.yml)
+pipeline definition in your forked repository. Rename this pipeline to `Batch-Scoring`.
Once the pipeline is finished, check the execution result:
-Also check the published batch scoring pipeline in the **mlops-AML-WS** workspace in [Azure Portal](https://portal.azure.com/):
+Also check the published batch scoring pipeline in your AML workspace in the [Azure Portal](https://portal.azure.com/):
Great, you now have the build pipeline set up for batch scoring which automatically triggers every time there's a change in the master branch!
-The pipeline stages are summarized below:
+The pipeline stages are described in detail below. Note that further configuration is needed before you can actually see the batch inferences:
#### Batch Scoring CI
- Linting (code quality analysis)
- Unit tests and code coverage analysis
-- Build and publish *ML Batch Scoring Pipeline* in an *ML Workspace*
+- Build and publish *ML Batch Scoring Pipeline* in an *AML Workspace*
#### Batch Score model
- Determine the model to be used based on the model name (required), model version, model tag name and model tag value bound pipeline parameters.
  - If run via Azure DevOps pipeline, the batch scoring pipeline will take the model name and version from the `Model-Train-Register-CI` build used as input.
  - If run locally without the model version, the batch scoring pipeline will use the model's latest version.
-- Trigger the *ML Batch Scoring Pipeline* and waits for it to complete.
+- Trigger the *ML Batch Scoring Pipeline* and wait for it to complete.
  - This is an **agentless** job. The CI pipeline can wait for ML pipeline completion for hours or even days without using agent resources.
-- Use the scoring input data supplied via the SCORING_DATASTORE_INPUT_* configuration variables, or uses the default datastore and sample data.
-- Once scoring is completed, the scores are made available in the same blob storage at the locations specified via the SCORING_DATASTORE_OUTPUT_* configuration variables.
-
-To configure your own custom scoring data, see [Configure Custom Batch Scoring](custom_model.md#Configure-Custom-Batch-Scoring).
-
+- Create an Azure ML pipeline with two steps. The pipeline is created by the code in `ml_service\pipelines\diabetes_regression_build_parallel_batchscore_pipeline.py`:
+  - `scoringstep` - a **`ParallelRunStep`** that executes the code in `diabetes_regression\scoring\parallel_batchscore.py` on several different batches of the data to be scored.
+  - `scorecopystep` - a **`PythonScriptStep`** that copies the output inferences from Azure ML's internal storage into a target location in another storage account.
+    - If you run the instructions as defined above with no changes to variables, this step will **not** be executed. You'll see a message in the logs for the corresponding step saying `Missing Parameters`. In this case, you'll be able to find the file with the inferences in the same Storage Account associated with Azure ML, in a location similar to `azureml-blobstore-SomeGuid\azureml\SomeOtherGuid\defaultoutput\parallel_run_step.txt`. One way to find the right path is this:
+      - Open your experiment in Azure ML (by default called `mlopspython`).
+      - Open the run that you want to look at (named something like `neat_morning_qc10dzjy`).
+      - In the graphical pipeline view with 2 steps, click the button to open the details tab: `Show run overview`.
+      - You'll see two steps (corresponding to `scoringstep` and `scorecopystep` as described above).
+      - Click the step with the older "Submitted time".
+      - Click "Output + logs" at the top, and you'll see something like the following:
+
+      - The `defaultoutput` file will have JSON content with the path to a file called `parallel_run_step.txt` containing the scores.
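To make the role of `scoringstep` more concrete: an entry script for a `ParallelRunStep` is expected to define an `init()` function, called once per worker, and a `run(mini_batch)` function, called once per batch of input. The sketch below illustrates that contract only. It is not the content of `parallel_batchscore.py`; the `MeanModel` class is a hypothetical stand-in for a registered model, and the mini-batch is simplified to a list of numeric rows (with a tabular dataset, Azure ML would normally pass a pandas DataFrame).

```python
from typing import List

# Populated once per worker process by init(); hypothetical stand-in model.
model = None


class MeanModel:
    """Assumed placeholder model: predicts the mean of each row's features."""

    def predict(self, rows: List[List[float]]) -> List[float]:
        return [sum(row) / len(row) for row in rows]


def init() -> None:
    """Called once per worker before any mini-batch is processed."""
    global model
    # A real entry script would load the registered model here instead.
    model = MeanModel()


def run(mini_batch: List[List[float]]) -> List[str]:
    """Called once per mini-batch; returns one result line per input row."""
    predictions = model.predict(mini_batch)
    return [f"{row} {pred}" for row, pred in zip(mini_batch, predictions)]
```

When the step is configured to append rows, the lines returned by each `run` call are typically collected into a single output file, which would be consistent with the `parallel_run_step.txt` file mentioned above.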
+
+To properly configure this step for your own custom scoring data, you must follow the instructions in [Configure Custom Batch Scoring](custom_model.md#Configure-Custom-Batch-Scoring), which let you specify both the location of the files to score (via the `SCORING_DATASTORE_INPUT_*` configuration variables) and where to store the inferences (via the `SCORING_DATASTORE_OUTPUT_*` configuration variables).
+
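The skip behaviour of `scorecopystep` described above can be pictured as a simple guard: if the output location is not fully configured, log `Missing Parameters` and do nothing. The following is a minimal sketch under stated assumptions, not the repository's implementation. The parameter names `output_container` and `output_path`, the helper names, and the exact wording after `Missing Parameters` are all hypothetical; the guide only refers to the variables as `SCORING_DATASTORE_OUTPUT_*`.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical parameter names; the guide does not list the individual
# SCORING_DATASTORE_OUTPUT_* variables.
REQUIRED_OUTPUT_PARAMS = ("output_container", "output_path")


def missing_params(params: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of required parameters that are unset or blank."""
    return [
        name for name in REQUIRED_OUTPUT_PARAMS
        if not (params.get(name) or "").strip()
    ]


def copy_scores(params: Dict[str, Optional[str]],
                copy_fn: Callable[[Dict[str, Optional[str]]], None]) -> bool:
    """Run copy_fn only when every required parameter is present.

    Returns True if the copy ran, False if it was skipped.
    """
    missing = missing_params(params)
    if missing:
        # Assumed message format; only the "Missing Parameters" text
        # is taken from the guide.
        print(f"Missing Parameters: {', '.join(missing)}")
        return False
    copy_fn(params)
    return True
```

Passing the copy operation in as `copy_fn` keeps the guard testable without a storage account; in a real step it would be replaced by the actual blob copy.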
## Further Exploration
You should now have a working set of pipelines that can get you started with MLOpsPython. Below are some additional features offered that might suit your scenario.