Fix registry-share notebook online deployment stability#3927

Open
Chakradhar886 wants to merge 5 commits into main from fix/mlflow-azureml-ai-monitoring

@Chakradhar886
Member

Summary

  • add azureml-ai-monitoring to the nyc_taxi_data_regression training image
  • update share-models-components-environments notebook managed online deployment SKU from Standard_F4s_v2 to Standard_DS3_v2
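For reference, the SKU change amounts to a one-argument edit on the deployment object. A minimal sketch, assuming the notebook builds its deployment with `ManagedOnlineDeployment` from `azure-ai-ml`; the helper name, the deployment and endpoint names, the model reference, and the instance count are hypothetical placeholders, not values taken from the notebook:

```python
# Hypothetical helper: collects the keyword arguments for a managed online
# deployment so the SKU default lives in one place.
OLD_SKU = "Standard_F4s_v2"  # previous default in the sample
NEW_SKU = "Standard_DS3_v2"  # default after this PR


def deployment_kwargs(name, endpoint_name, model,
                      instance_type=NEW_SKU, instance_count=1):
    """Return kwargs suitable for azure.ai.ml.entities.ManagedOnlineDeployment."""
    return {
        "name": name,
        "endpoint_name": endpoint_name,
        "model": model,
        "instance_type": instance_type,
        "instance_count": instance_count,
    }


kwargs = deployment_kwargs(
    name="demo-deployment",               # placeholder
    endpoint_name="demo-endpoint",        # placeholder
    model="azureml:placeholder-model:1",  # placeholder model reference
)

# With azure-ai-ml installed, the notebook-side usage would look like:
#   from azure.ai.ml.entities import ManagedOnlineDeployment
#   deployment = ManagedOnlineDeployment(**kwargs)
#   ml_client.online_deployments.begin_create_or_update(deployment).result()
```

Keeping the SKU as a default argument means a reader who lacks quota for `Standard_DS3_v2` can override it in one call without editing the rest of the cell.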

Why

Notebook CI failed during managed online deployment: the container liveness check returned 502 after platform/runtime changes. This PR hardens the training image dependencies and switches the sample to a more stable default SKU.

Validation

  • notebook and Dockerfile changes committed in 2e70a8f
  • push succeeded to fix/mlflow-azureml-ai-monitoring

Add azureml-ai-monitoring to nyc_taxi_data_regression training image and update managed online endpoint SKU in share-models-components-environments notebook.
Use azureml-ai-monitoring~=0.1.0b1 in nyc_taxi_data_regression env_train Dockerfile to avoid image build failures on Python 3.8 during registry environment creation.
Add explicit import of azureml.ai.monitoring in training script so MLflow captures it as a dependency when saving the model. This ensures the monitoring package is included in the model's conda environment for deployment scoring.
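The last commit depends on the monitoring package being importable when the training script runs. A minimal sketch of a guarded import, assuming the script should fail early with a clear message if the image lacks the package; the helper name and error text are hypothetical, and the sketch only checks importability in the training image, not what MLflow ends up recording in the model's environment:

```python
import importlib


def require_module(module_name="azureml.ai.monitoring"):
    """Import module_name or raise a descriptive error (hypothetical helper)."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package and a broken install both end up here.
        raise RuntimeError(
            f"{module_name} is not importable; check that the training image "
            "installs azureml-ai-monitoring"
        ) from exc
```

Calling `require_module()` near the top of the training script would surface a missing dependency at training time rather than later, during deployment scoring.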
