Pass device to enable_model_cpu_offload in maybe_free_model_hooks #6937

Disty0 · 2024-02-11T11:55:20Z

What does this PR do?

Fixes enable_model_cpu_offload on non-CUDA devices.
The targets are DirectML and IPEX / XPU devices.

When maybe_free_model_hooks re-enables enable_model_cpu_offload, it does not pass the device, so enable_model_cpu_offload falls back to its default, cuda.
This PR adds self._offload_device and passes it to enable_model_cpu_offload in maybe_free_model_hooks.

self._offload_device is also set in enable_sequential_cpu_offload for compatibility reasons.
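The change described above can be sketched with a toy class. This is only an illustration, not the diffusers implementation: ToyPipeline, hook_devices, maybe_free_model_hooks_before and maybe_free_model_hooks_after are hypothetical names, and no real models or hooks are involved. Only the _offload_device attribute, the device argument, and the "cuda" default are taken from this PR.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline; it only records devices."""

    def __init__(self):
        # Device used by each offload (re-)registration, in call order.
        self.hook_devices = []

    def enable_model_cpu_offload(self, device="cuda"):
        # Remember the requested device so later re-registrations can reuse it.
        self._offload_device = device
        self.hook_devices.append(device)

    def maybe_free_model_hooks_before(self):
        # Behavior before this PR: the device is dropped, so "cuda" is used.
        self.enable_model_cpu_offload()

    def maybe_free_model_hooks_after(self):
        # Behavior with this PR: reuse the stored device, falling back to "cuda".
        self.enable_model_cpu_offload(device=getattr(self, "_offload_device", "cuda"))


old = ToyPipeline()
old.enable_model_cpu_offload(device="xpu")
old.maybe_free_model_hooks_before()
print(old.hook_devices)  # ['xpu', 'cuda']

new = ToyPipeline()
new.enable_model_cpu_offload(device="xpu")
new.maybe_free_model_hooks_after()
print(new.hook_devices)  # ['xpu', 'xpu']
```

In the toy version, the second registration silently switches to "cuda" without the stored device, which mirrors why only the second pipeline call breaks on a machine without CUDA.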

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you read our philosophy doc (important for complex PRs)?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in this PR.

@patrickvonplaten and @sayakpaul

sayakpaul · 2024-02-11T11:57:25Z

Can you provide a code snippet?

sayakpaul · 2024-02-11T11:58:32Z

src/diffusers/pipelines/pipeline_utils.py


        # make sure the model is in the same state as before calling it
-        self.enable_model_cpu_offload()
+        self.enable_model_cpu_offload(device=getattr(self, "_offload_device", "cuda"))


Personally I think it's a non-breaking change since we default to "cuda" for device in enable_model_cpu_offload().

I think it's ok too :)
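A small sketch of why the getattr fallback keeps existing behavior. The two classes and the resolve_offload_device helper are hypothetical and exist only for this example; the getattr expression itself is the one from the diff.

```python
class LegacyPipeline:
    """Hypothetical pipeline object that never stored an offload device."""


class XpuPipeline:
    """Hypothetical pipeline object that stored "xpu" when offload was enabled."""

    _offload_device = "xpu"


def resolve_offload_device(pipe):
    # Same expression as in the diff: stored device if present, else "cuda".
    return getattr(pipe, "_offload_device", "cuda")


print(resolve_offload_device(LegacyPipeline()))  # cuda
print(resolve_offload_device(XpuPipeline()))     # xpu
```

An object without the attribute resolves to "cuda", which matches the previous default, so only pipelines that explicitly asked for another device see a different result.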

HuggingFaceDocBuilderDev · 2024-02-11T12:04:16Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Disty0 · 2024-02-11T12:06:16Z

import torch
import intel_extension_for_pytorch
from diffusers import AutoPipelineForText2Image, AutoencoderKL, EulerAncestralDiscreteScheduler



model="cagliostrolab/animagine-xl-3.0"
vae="madebyollin/sdxl-vae-fp16-fix"
prompt="masterpiece, best quality, newest, 1girl, solo, depth of field, rim lighting, flowers, petals, crystals, butterfly, scenery, upper body, dark red hair, straight hair, long hair, blue eyes, cat ears, mature female, white sweater, blush, slight smile,"
negative_prompt = "nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name"
seed=123456789
num_inference_steps=20



pipeline = AutoPipelineForText2Image.from_pretrained(model, vae=AutoencoderKL.from_pretrained(vae))
pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(pipeline.scheduler.config)
pipeline.safety_checker = None

pipeline = pipeline.to(torch.bfloat16)
pipeline.enable_model_cpu_offload(device="xpu")


# First call works: offload hooks were registered for "xpu" above
image = pipeline(prompt, negative_prompt=negative_prompt, width=1080, height=1080, seed=seed, num_inference_steps=num_inference_steps).images[0]
image.save("image.jpg")

# Second call fails: maybe_free_model_hooks re-enabled offload with the default "cuda" device
image = pipeline(prompt, negative_prompt=negative_prompt, width=1080, height=1080, seed=seed, num_inference_steps=num_inference_steps).images[0]
image.save("image2.jpg")

The code above will fail with these logs:

Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:02<00:00,  2.64it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00,  1.35it/s]
Traceback (most recent call last):
  File "/home/disty/Downloads/diffusers/diffusion.py", line 31, in <module>
    image = pipeline(prompt, negative_prompt=negative_prompt, width=1080, height=1080, seed=seed, num_inference_steps=num_inference_steps).images[0]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DataSSD/AI/Apps/automatic/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DataSSD/AI/Apps/automatic/venv/lib/python3.11/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 1125, in __call__
    ) = self.encode_prompt(
        ^^^^^^^^^^^^^^^^^^^
  File "/mnt/DataSSD/AI/Apps/automatic/venv/lib/python3.11/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 415, in encode_prompt
    prompt_embeds = text_encoder(text_input_ids.to(device), output_hidden_states=True)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DataSSD/AI/Apps/automatic/venv/lib/python3.11/site-packages/torch/cuda/__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Works as expected with this PR:

Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:02<00:00,  2.97it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:15<00:00,  1.26it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:13<00:00,  1.47it/s]

yiyixuxu

thanks!

yiyixuxu · 2024-02-12T08:11:41Z

I think it's ok too :)

Pass device to enable_model_cpu_offload in maybe_free_model_hooks

0f07760

sayakpaul reviewed Feb 11, 2024

sayakpaul requested review from patrickvonplaten and yiyixuxu February 11, 2024 11:59

yiyixuxu approved these changes Feb 12, 2024

yiyixuxu requested a review from DN6 February 12, 2024 08:12

sayakpaul merged commit 9254d1f into huggingface:main Feb 12, 2024
