Randomly working and failing model evaluation step and failing during publishing artifacts. #375

@lafranke

I am working on the CI step of the pipeline.

It consists of three main steps: Get Pipeline ID, Trigger ML Training Pipeline, and Publish artifact.

The first step always succeeds.

The second step is intermittent: sometimes it succeeds, and sometimes it crashes in the evaluation step because it does not get an MSE value. This happens randomly. Without any change to the code, a pipeline that previously passed might suddenly throw an error.
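To narrow down the missing MSE, it may help to make the evaluation step fail with a descriptive message instead of crashing later on. The following is only a minimal sketch: it assumes the metrics are available as a plain dictionary, and `require_metric` and the key `"mse"` are hypothetical names, not taken from the repository.

```python
import math
from typing import Mapping, Optional


def require_metric(metrics: Mapping[str, Optional[float]], name: str = "mse") -> float:
    """Return metrics[name] as a float, or raise an error that says what is wrong.

    Assumption: `metrics` is a plain mapping of metric names to values.
    How the real pipeline fetches its metrics is not shown in this issue.
    """
    value = metrics.get(name)
    if value is None:
        # List the keys that are present so the build log shows what was logged.
        raise ValueError(
            f"metric {name!r} is missing; available metrics: {sorted(metrics)}"
        )
    value = float(value)
    if math.isnan(value):
        raise ValueError(f"metric {name!r} is NaN")
    return value
```

With a check like this, the log of a failed run would show whether the metric was never written or was written under a different name, which helps separate a timing problem from a naming problem.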

So far, the third step has also always failed, specifically at the "Determine if evaluation succeeded" step. I have traced the error to the "automobile-publish-model-artifact-template.yml" file, to the line "FOUND_MODEL=$(az ml model list -g $(RESOURCE_GROUP) --workspace-name $(WORKSPACE_NAME) --tag BuildId=$(Build.BuildId) --query '[0]')". I could not investigate the error further, because the pipeline randomly fails at the second step for long periods of time.
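One possible cause, which this issue does not confirm, is that the model lookup returns nothing because no model tagged with the build ID exists yet when the query runs. The sketch below shows how the lookup could be retried and its result interpreted. It assumes that an empty string or the JSON literal `null` means "no match"; `parse_first_model`, `wait_for_model`, and `fetch_raw` are hypothetical names, and `fetch_raw` stands in for whatever runs the lookup command and returns its raw output.

```python
import json
import time
from typing import Callable, Optional


def parse_first_model(raw: str) -> Optional[dict]:
    """Interpret the raw text returned by a `--query '[0]'` style lookup.

    Assumption: empty output or the JSON literal null means no model matched.
    """
    text = raw.strip()
    if not text:
        return None
    parsed = json.loads(text)
    return parsed if isinstance(parsed, dict) else None


def wait_for_model(
    fetch_raw: Callable[[], str],
    attempts: int = 5,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call fetch_raw up to `attempts` times until it yields a model."""
    for attempt in range(1, attempts + 1):
        model = parse_first_model(fetch_raw())
        if model is not None:
            return model
        print(f"attempt {attempt}/{attempts}: no model found yet")
        if attempt < attempts:
            # Wait between attempts, but not after the last one.
            sleep(delay_seconds)
    return None
```

If the lookup still returns nothing after several attempts, timing is probably not the cause, and the tag value or the training run itself would be the next thing to check.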
