|
40 | 40 | "metadata": {}, |
41 | 41 | "source": [ |
42 | 42 | "### Prerequisites\n", |
43 | | - "You'll need to create a compute Instance by following the instructions in the [EnvironmentSetup.md](../Setup_Resources/EnvironmentSetup.md)." |
| 43 | + "You'll need to create a compute instance by following the [create and manage a compute instance](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-manage-compute-instance?tabs=python) instructions." |
44 | 44 | ] |
45 | 45 | }, |
46 | 46 | { |
|
259 | 259 | "| **forecast_horizon** | The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly). Periods are inferred from your data. |\n", |
260 | 260 | "| **time_column_name** | The name of your time column. |\n", |
261 | 261 | "| **time_series_id_column_names** | The column names used to uniquely identify timeseries in data that has multiple rows with the same timestamp. |\n", |
| 262 | + "| **cv_step_size** | Number of periods between two consecutive cross-validation folds. The default value is \\\"auto\\\", in which case AutoML determines the cross-validation step size automatically if a validation set is not provided. Alternatively, users can specify an integer value. |\n", |
262 | 263 | "\n", |
263 | 264 | "#### ``AutoMLConfig`` arguments\n", |
264 | 265 | "| Property | Description|\n", |
|
268 | 269 | "| **blocked_models** | Blocked models won't be used by AutoML. |\n", |
269 | 270 | "| **iteration_timeout_minutes** | Maximum amount of time in minutes that the model can train. This is optional but provides customers with greater control on exit criteria. |\n", |
270 | 271 | "| **iterations** | Number of models to train. This is optional but provides customers with greater control on exit criteria. |\n", |
271 | | - "| **experiment_timeout_hours** | Maximum amount of time in hours that the experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. |\n", |
| 272 | + "| **experiment_timeout_hours** | Maximum amount of time in hours that each experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. **It does not control the overall timeout for the pipeline run; instead, it controls the timeout for each training run per partitioned time series.** |\n", |
272 | 273 | "| **label_column_name** | The name of the label column. |\n", |
273 | | - "| **n_cross_validations** | Number of cross-validation folds to use for model/pipeline selection. The default value is \\\"auto\\\", in which case AutoMl determines the number of cross-validations automatically, if a validation set is not provided. Or users could specify an integer value. |\n", |
274 | | - "| **cv_step_size** | Number of periods between two consecutive cross-validation folds. The default value is \\\"auto\\\", in which case AutoMl determines the cross-validation step size automatically, if a validation set is not provided. Or users could specify an integer value. |\n", |
275 | | - "| **enable_early_stopping** | Flag to enable early termination if the score is not improving in the short term. |\n", |
| 274 | + "| **n_cross_validations** | Number of cross-validation splits. The default value is \\\"auto\\\", in which case AutoML determines the number of cross-validations automatically if a validation set is not provided. Alternatively, users can specify an integer value. Rolling Origin Validation is used to split time series in a temporally consistent way. |\n", |
| 275 | + "| **enable_early_stopping** | Flag to enable early termination if the primary metric is no longer improving. |\n", |
276 | 276 | "| **enable_engineered_explanations** | Engineered feature explanations will be downloaded if enable_engineered_explanations flag is set to True. By default it is set to False to save storage space. |\n", |
277 | 277 | "| **track_child_runs** | Flag to disable tracking of child runs. Only best run is tracked if the flag is set to False (this includes the model and metrics of the run). |\n", |
278 | 278 | "| **pipeline_fetch_max_batch_size** | Determines how many pipelines (training algorithms) to fetch at a time for training, this helps reduce throttling when training at large scale. |\n", |
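
The interaction between ``n_cross_validations``, ``cv_step_size`` and ``forecast_horizon`` can be illustrated with a minimal sketch of rolling origin splitting. This is not AutoML's implementation: the function name `rolling_origin_splits` and its arguments are hypothetical, and the sketch assumes each validation window is exactly one forecast horizon long and that consecutive folds are shifted by the step size.

```python
def rolling_origin_splits(n_obs, horizon, n_folds, step):
    """Return (train_end, valid_end) index pairs, oldest fold first.

    Hypothetical helper for illustration only. For each pair, the
    training window is [0, train_end) and the validation window is
    [train_end, valid_end), so validation data always comes after the
    data used for training.
    """
    splits = []
    for k in range(n_folds):
        # The most recent fold ends at the last observation; each older
        # fold is shifted back by `step` periods.
        valid_end = n_obs - k * step
        train_end = valid_end - horizon
        if train_end <= 0:
            raise ValueError("not enough observations for the requested folds")
        splits.append((train_end, valid_end))
    return splits[::-1]


# 20 periods of history, a horizon of 3, 3 folds, 2 periods between folds.
folds = rolling_origin_splits(n_obs=20, horizon=3, n_folds=3, step=2)
for train_end, valid_end in folds:
    print(f"train [0, {train_end})  validate [{train_end}, {valid_end})")
```

A larger step size spreads the folds further back in time, which needs more history per series; a smaller one makes the validation windows overlap more.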
|
281 | 281 | "#### ``HTSTrainParameters`` arguments\n", |
282 | 282 | "| Property | Description|\n", |
283 | 283 | "| :--------------- | :------------------- |\n", |
284 | | - "| **automl_settings** | ``AutoMLConfig`` object.\n", |
| 284 | + "| **automl_settings** | The ``AutoMLConfig`` object defined above. |\n", |
285 | 285 | "| **hierarchy_column_names** | The names of columns that define the hierarchical structure of the data from highest level to most granular. |\n", |
286 | 286 | "| **training_level** | The level of the hierarchy to be used for training models. |\n", |
287 | 287 | "| **enable_engineered_explanations** | The switch controls engineered explanations. |" |
|
354 | 354 | "cell_type": "markdown", |
355 | 355 | "metadata": {}, |
356 | 356 | "source": [ |
357 | | - "Parallel run step is leveraged to train the hierarchy. To configure the ParallelRunConfig you will need to determine the appropriate number of workers and nodes for your use case. The `process_count_per_node` is based off the number of cores of the compute VM. The node_count will determine the number of master nodes to use, increasing the node count will speed up the training process.\n", |
| 357 | + "Parallel run step is leveraged to train multiple models at once. To configure the ``ParallelRunConfig`` you will need to determine the appropriate number of workers and nodes for your use case. The ``process_count_per_node`` is based on the number of cores of the compute VM. The ``node_count`` determines the number of master nodes to use; increasing the node count speeds up the training process.\n", |
| 358 | + "\n", |
| 359 | + "| Property | Description|\n", |
| 360 | + "| :--------------- | :------------------- |\n", |
| 361 | + "| **experiment** | The experiment used for training. |\n", |
| 362 | + "| **train_data** | The file dataset to be used as input to the training run. |\n", |
| 363 | + "| **node_count** | The number of compute nodes to be used for running the user script. We recommend starting with 3 and increasing ``node_count`` if training takes too long. |\n", |
| 364 | + "| **process_count_per_node** | The number of processes per node. We recommend a 2:1 ratio of cores to processes per node. For example, if a node has 16 cores, configure a process count of 8 or less per node for optimal performance. |\n", |
| 365 | + "| **train_pipeline_parameters** | The set of configuration parameters defined in the previous section. |\n", |
| 366 | + "| **run_invocation_timeout** | Maximum amount of time in seconds that the ``ParallelRunStep`` class is allowed. This is optional but provides customers with greater control on exit criteria. This must be greater than ``experiment_timeout_hours`` by at least 300 seconds. |\n", |
358 | 367 | "\n", |
359 | | - "* **experiment:** The experiment used for training.\n", |
360 | | - "* **train_data:** The tabular dataset to be used as input to the training run.\n", |
361 | | - "* **node_count:** The number of compute nodes to be used for running the user script. We recommend to start with 3 and increase the node_count if the training time is taking too long.\n", |
362 | | - "* **process_count_per_node:** Process count per node, we recommend 2:1 ratio for number of cores: number of processes per node. eg. If node has 16 cores then configure 8 or less process count per node or optimal performance.\n", |
363 | | - "* **train_pipeline_parameters:** The set of configuration parameters defined in the previous section. \n", |
364 | | - "* **run_invocation_timeout:** Maximum amount of time in seconds that the ``ParallelRunStep`` class is allowed. This is optional but provides customers with greater control on exit criteria. This must be greater than ``experiment_timeout_hours`` by at least 300 seconds.\n", |
| 368 | + "Calling this method will create a new aggregated dataset which is generated dynamically on pipeline execution.\n", |
365 | 369 | "\n", |
366 | | - "Calling this method will create a new aggregated dataset which is generated dynamically on pipeline execution." |
| 370 | + "**Note**: Total time taken for the **training step** in the pipeline to complete = $ \\frac{t}{ p \\times n } \\times ts $\n", |
| 371 | + "where,\n", |
| 372 | + "- $ t $ is time taken for training one partition (can be viewed in the training logs)\n", |
| 373 | + "- $ p $ is ``process_count_per_node``\n", |
| 374 | + "- $ n $ is ``node_count``\n", |
| 375 | + "- $ ts $ is total number of partitions in time series based on ``partition_column_names``" |
367 | 376 | ] |
368 | 377 | }, |
369 | 378 | { |
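
The sizing guidance in this hunk (the 2:1 core-to-process ratio, the 300-second timeout margin, and the training-time formula in the note) reduces to simple arithmetic. The sketch below only restates those rules; the helper names are hypothetical and none of them belong to the Azure ML SDK.

```python
def max_recommended_processes(cores_per_node):
    """Apply the 2:1 cores-to-processes recommendation."""
    return max(1, cores_per_node // 2)


def estimate_training_seconds(t, p, n, ts):
    """Estimate the training-step duration as t / (p * n) * ts.

    t  : seconds needed to train one partition
    p  : process_count_per_node
    n  : node_count
    ts : total number of partitions
    """
    return t / (p * n) * ts


def timeout_is_valid(run_invocation_timeout_s, experiment_timeout_hours):
    """Check that run_invocation_timeout exceeds the experiment timeout
    (converted to seconds) by at least 300 seconds."""
    return run_invocation_timeout_s >= experiment_timeout_hours * 3600 + 300


# Example values are made up: 16-core nodes, 3 nodes,
# 240 partitions, 120 seconds per partition.
processes = max_recommended_processes(16)
estimate = estimate_training_seconds(t=120, p=processes, n=3, ts=240)
print(f"process_count_per_node={processes}, estimated seconds={estimate}")
print("timeout ok:", timeout_is_valid(3900, experiment_timeout_hours=1))
```

The estimate assumes every partition takes roughly the same time and that all processes stay busy, so treat it as a rough estimate rather than a guarantee.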
|
527 | 536 | "source": [ |
528 | 537 | "## 5.0 Forecasting\n", |
529 | 538 | "For hierarchical forecasting we need to provide the HTSInferenceParameters object.\n", |
530 | | - "#### HTSInferenceParameters arguments\n", |
531 | | - "* **hierarchy_forecast_level:** The default level of the hierarchy to produce prediction/forecast on.\n", |
532 | | - "* **allocation_method:** \\[Optional] The disaggregation method to use if the hierarchy forecast level specified is below the define hierarchy training level. <br><i>(average historical proportions) 'average_historical_proportions'</i><br><i>(proportions of the historical averages) 'proportions_of_historical_average'</i>\n", |
533 | | - "\n", |
534 | | - "#### get_many_models_batch_inference_steps arguments\n", |
535 | | - "* **experiment:** The experiment used for inference run.\n", |
536 | | - "* **inference_data:** The data to use for inferencing. It should be the same schema as used for training.\n", |
537 | | - "* **compute_target:** The compute target that runs the inference pipeline.\n", |
538 | | - "* **node_count:** The number of compute nodes to be used for running the user script. We recommend to start with the number of cores per node (varies by compute sku).\n", |
539 | | - "* **process_count_per_node:** The number of processes per node.\n", |
540 | | - "* **train_run_id:** \\[Optional] The run id of the hierarchy training, by default it is the latest successful training hts run in the experiment.\n", |
541 | | - "* **train_experiment_name:** \\[Optional] The train experiment that contains the train pipeline. This one is only needed when the train pipeline is not in the same experiement as the inference pipeline.\n", |
542 | | - "* **process_count_per_node:** \\[Optional] The number of processes per node, by default it's 4." |
| 539 | + "#### ``HTSInferenceParameters`` arguments\n", |
| 540 | + "| Property | Description|\n", |
| 541 | + "| :--------------- | :------------------- |\n", |
| 542 | + "| **hierarchy_forecast_level** | The default level of the hierarchy to produce prediction/forecast on. |\n", |
| 543 | + "| **allocation_method** | \\[Optional] The disaggregation method to use if the hierarchy forecast level specified is below the defined hierarchy training level. <br><i>(average historical proportions) 'average_historical_proportions'</i><br><i>(proportions of the historical averages) 'proportions_of_historical_average'</i> |\n", |
| 544 | + "\n", |
| 545 | + "#### ``get_many_models_batch_inference_steps`` arguments\n", |
| 546 | + "| Property | Description|\n", |
| 547 | + "| :--------------- | :------------------- |\n", |
| 548 | + "| **experiment** | The experiment used for inference run. |\n", |
| 549 | + "| **inference_data** | The data to use for inferencing. It should be the same schema as used for training. |\n", |
| 550 | + "| **compute_target** | The compute target that runs the inference pipeline. |\n", |
| 551 | + "| **node_count** | The number of compute nodes to be used for running the user script. We recommend starting with the number of cores per node (varies by compute SKU). |\n", |
| 552 | + "| **process_count_per_node** | \\[Optional] The number of processes per node. The default is 2. It should be at most half the number of cores in a single node of the compute cluster used for the experiment. |\n", |
| 553 | + "| **inference_pipeline_parameters** | \\[Optional] The ``HTSInferenceParameters`` object defined above. |\n", |
| 554 | + "| **train_run_id** | \\[Optional] The run id of the **training pipeline**. By default it is the latest successful training pipeline run in the experiment. |\n", |
| 555 | + "| **train_experiment_name** | \\[Optional] The train experiment that contains the train pipeline. This is only needed when the train pipeline is not in the same experiment as the inference pipeline. |\n", |
| 556 | + "| **run_invocation_timeout** | \\[Optional] Maximum amount of time in seconds that the ``ParallelRunStep`` class is allowed. Setting it provides customers with greater control on exit criteria. |" |
543 | 557 | ] |
544 | 558 | }, |
545 | 559 | { |
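
The two ``allocation_method`` options differ only in the order of averaging and dividing. The sketch below follows the textbook definitions of the two top-down disaggregation methods; it is an assumption that AutoML computes them the same way, and the function names and sample numbers are hypothetical.

```python
def average_historical_proportions(child_history, parent_history):
    """Average, over time, of the child's share of the parent in each period."""
    ratios = [c / p for c, p in zip(child_history, parent_history)]
    return sum(ratios) / len(ratios)


def proportions_of_historical_average(child_history, parent_history):
    """Child's historical average divided by the parent's historical average."""
    child_mean = sum(child_history) / len(child_history)
    parent_mean = sum(parent_history) / len(parent_history)
    return child_mean / parent_mean


# Made-up history: one parent series with two children over two periods.
store_a = [10, 30]
store_b = [30, 30]
parent = [a + b for a, b in zip(store_a, store_b)]

parent_forecast = 100.0  # forecast produced at the training level
for name, history in (("store_a", store_a), ("store_b", store_b)):
    ahp = average_historical_proportions(history, parent)
    pha = proportions_of_historical_average(history, parent)
    print(name, "AHP share:", ahp * parent_forecast, "PHA share:", pha * parent_forecast)
```

Both methods produce child shares that sum to the parent forecast, but they weight periods differently: the first gives every period equal weight, the second gives high-volume periods more influence.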
|