Merged
code review changes
deepanker13 committed Jan 10, 2024
commit 32041490e003a5e13d686fd70392173d2ce1fc20
4 changes: 3 additions & 1 deletion .github/workflows/publish-example-images.yaml
@@ -52,7 +52,9 @@ jobs:
   - component-name: mxnet-auto-tuning
     dockerfile: examples/mxnet/tune/Dockerfile
     context: examples/mxnet/tune
-
+  - component-name: train-api-hf-image
+    dockerfile: sdk/python/kubeflow/trainer/hf_dockerfile
+    context: sdk/python/kubeflow/trainer
   # TODO (tenzen-y): Fix the below broken Dockerfiles
   # - component-name: pytorch-dist-mnist-mpi
   # dockerfile: examples/pytorch/mnist/Dockerfile-mpi
26 changes: 0 additions & 26 deletions .github/workflows/publish-sdk-images.yaml

This file was deleted.

2 changes: 1 addition & 1 deletion sdk/python/kubeflow/trainer/hf_dockerfile
@@ -1,5 +1,5 @@
# Use an official Pytorch runtime as a parent image
-FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
+FROM nvcr.io/nvidia/pytorch:23.12-py3
Member

Do we need to use the NVIDIA PyTorch image for this trainer?
Would it be better to use the official PyTorch image, similar to the one we use in the SDK?
docker.io/pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime

Contributor Author

As suggested by @tenzen-y:
#1963 (comment)

Member

I see. @tenzen-y, do you know whether PyTorch publishes an official image that is supported on all platforms?

Member

@andreyvelich If I remember correctly, PyTorch doesn't provide multi-architecture images with GPU support, so we need to use the official NVIDIA images.
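
One way to check a platform claim like this is to read the image's manifest list (for example, the JSON printed by `docker manifest inspect <image>`) and collect the `os/architecture` pairs it advertises. The sketch below is only an illustration: the helper name `list_platforms` and the `SAMPLE` document are hypothetical, and `SAMPLE` is not real registry output for any image mentioned in this thread.

```python
import json


def list_platforms(manifest_list_json: str) -> list[str]:
    """Return sorted 'os/architecture' strings found in a manifest list document."""
    doc = json.loads(manifest_list_json)
    platforms = set()
    # A single-platform image manifest has no "manifests" array, so this yields [].
    for entry in doc.get("manifests", []):
        platform = entry.get("platform", {})
        os_name = platform.get("os")
        arch = platform.get("architecture")
        # Skip entries with a missing or "unknown" platform (some tools emit
        # these for non-image entries such as attestations).
        if os_name and arch and "unknown" not in (os_name, arch):
            platforms.add(f"{os_name}/{arch}")
    return sorted(platforms)


# Hypothetical sample shaped like a manifest list; NOT real output for any image.
SAMPLE = json.dumps(
    {
        "schemaVersion": 2,
        "manifests": [
            {"digest": "sha256:aaa", "platform": {"architecture": "amd64", "os": "linux"}},
            {"digest": "sha256:bbb", "platform": {"architecture": "arm64", "os": "linux"}},
            {"digest": "sha256:ccc", "platform": {"architecture": "unknown", "os": "unknown"}},
        ],
    }
)

if __name__ == "__main__":
    print(list_platforms(SAMPLE))
```

An image that lists more than one platform here is published as multi-architecture. Whether a given tag also ships GPU libraries still has to be checked against the image's own documentation.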


# Set the working directory in the container
WORKDIR /app