null-text-inversion-pipeline-implementation #6329
Could you please:
|
Could we remove the image siamese.jpg from the PR to make sure the repo is kept light-weight and add some lines to the README?
OK
On Wed, Dec 27, 2023 at 05:27, Patrick von Platen <***@***.***> wrote:
Could we remove the image siamese.jpg from the PR to make sure the repo
is kept light-weight and add some lines to the README?
|
…s into null-text-inversion
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
Now we just need to run |
Done~ |
Perfect! |
While implementing my UI for Null-Text, ran into a few issues worth pointing out.. In the doc, it shows to import with |
I will fix it later
On Wed, Jan 17, 2024 at 15:20, Alan Bedian <***@***.***> wrote:
While implementing my UI for Null-Text, I ran into a few issues worth pointing out. The doc shows the import as from examples.community.pipeline_null_text_inversion import NullTextPipeline, but it took me a while to figure out that it should be from pipeline_null_text_inversion import NullTextPipeline instead.
Then the scheduler in the example gave warnings about steps_offset and clip_sample, so it should be scheduler = DDIMScheduler(num_train_timesteps=1000, beta_start=0.00085, beta_end=0.0120, beta_schedule="scaled_linear", steps_offset=1, clip_sample=False). I'm not sure whether it's compatible with all the other schedulers, but I can stick with DDIM for now.
I also found it strange that the input image is supposed to be a path string instead of a PIL image or NumPy array like in every other pipeline, but I could live with that. I also wouldn't mind a callback function for my progress bar; generation seems slow, but it's probably worth it.
The main issue, once I finally got it working, is the returned StableDiffusionPipelineOutput, which gave an error about .save not being on str. I checked the returned images value and it was a list with a string, so ["images"] was the output data. I looked into the script, and at the very end I saw this on line 256: image, output_type=output_type, do_denormalize=[True] * image.shape[0], which looks like the variables are reversed, so the variable image = output_type, which is "images". Then the last line is return StableDiffusionPipelineOutput(images=image, nsfw_content_detected=False), where I think images should be a list, but maybe it can take a single image.
Everything looks like an easy fix. Looking forward to testing the results; this seems useful for edits. Thanks.
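For context on the scheduler arguments mentioned above, here is a minimal pure-Python sketch of what those settings are understood to control. It assumes that diffusers computes the "scaled_linear" schedule as the square of a linear ramp between the square roots of beta_start and beta_end, and that DDIM's default "leading" timestep spacing adds steps_offset to every timestep. The helper names are hypothetical and this is not the library's own code.

```python
# Sketch only: re-derives the beta schedule and timestep grid by hand.
# Both helpers are hypothetical stand-ins, not diffusers functions.

def scaled_linear_betas(beta_start=0.00085, beta_end=0.0120, num_train_timesteps=1000):
    """Linear ramp in sqrt(beta) space, squared back to beta (assumed formula)."""
    lo, hi = beta_start ** 0.5, beta_end ** 0.5
    step = (hi - lo) / (num_train_timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(num_train_timesteps)]


def leading_timesteps(num_inference_steps=50, num_train_timesteps=1000, steps_offset=1):
    """Evenly spaced timesteps in descending order, shifted by steps_offset."""
    step_ratio = num_train_timesteps // num_inference_steps
    return [i * step_ratio + steps_offset for i in reversed(range(num_inference_steps))]


betas = scaled_linear_betas()
timesteps = leading_timesteps()

print(timesteps[:3])   # [981, 961, 941]
print(timesteps[-3:])  # [41, 21, 1]
print(leading_timesteps(steps_offset=0)[-3:])  # [40, 20, 0]
```

With steps_offset=1 the last sampled timestep is 1 rather than 0, which is the grid the Stable Diffusion pipelines are generally configured for. clip_sample=False presumably matters because latents, unlike pixel values, are not confined to [-1, 1].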
|
Hello! Is there any way to use it with float16? I tried scaling the loss by a factor of 10 (or 1e3, 1e4) and lowering the learning rate, but both failed. Is there any solution? I really, really want to use it with torch.float16 o(╥﹏╥)o
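The thread does not diagnose why float16 fails here, so the following is only a sketch of two general half-precision limits that commonly affect optimization loops. It uses the standard library's struct module to round Python floats to IEEE 754 half precision. The to_half helper and the example magnitudes are hypothetical and are not taken from the pipeline.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Near 1.0, adjacent float16 values are 2**-10 (about 9.8e-4) apart,
# so a smaller change to a value of that size is rounded away.
small_update = 1e-4  # hypothetical per-step change
update_is_lost = to_half(1.0 + small_update) == 1.0

# Magnitudes far below the smallest float16 subnormal (2**-24, about 6e-8)
# round to zero. 1e-8 is, for example, the square of a 1e-4 gradient and
# also the default eps of torch.optim.Adam.
tiny_is_zero = to_half(1e-8) == 0.0

print(update_is_lost, tiny_is_zero)  # True True
```

Whether either effect is what breaks the null-text optimization is an open question. A common workaround in other training setups is to keep the optimized tensor and the optimizer state in float32 while running the rest of the model in float16, but that has not been tested against this pipeline.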
* null-text-inversion-implementation
* edited
* edited
* edited
* edited
* edited
* edit
* makestyle
---------
Co-authored-by: Sayak Paul <[email protected]>
Hi, thanks for contributing. How can this be combined with prompt-to-prompt for image editing? Could you please provide an example code snippet? Thanks.
What does this PR do?
Issue: https://github.com/huggingface/diffusers/issues/6313
This PR implements a Null-Text Inversion pipeline. NullTextPipeline is mostly adapted from the official code: https://github.com/google/prompt-to-prompt/blob/main/null_text_w_ptp.ipynb
Usage:
invert_prompt is used for DDIM-Inversion and optimization.
prompt can be the same as invert_prompt for reconstruction, or set to a different one such as "A lying dog" for image editing.
Note that the null-text embedding fails to optimize in float16, so use float32.
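The notes above can be sketched roughly as follows. The import path and the DDIMScheduler arguments are the ones reported in this thread. The argument names passed to invert and to the pipeline call are recalled from the community README and should be checked against pipeline_null_text_inversion.py. The wrapper function, its parameters, and the example values in the trailing comment are hypothetical.

```python
def reconstruct_or_edit(model_path, image_path, invert_prompt, prompt, steps=50, device="cuda"):
    """Invert an image with invert_prompt, then regenerate it with prompt.

    Hypothetical wrapper: heavy imports live inside the function so that
    defining it does not require torch or diffusers to be installed.
    """
    import torch
    from diffusers.schedulers import DDIMScheduler
    # Import path as reported in this thread; assumes the community file is importable.
    from pipeline_null_text_inversion import NullTextPipeline

    scheduler = DDIMScheduler(
        num_train_timesteps=1000,
        beta_start=0.00085,
        beta_end=0.0120,
        beta_schedule="scaled_linear",
        steps_offset=1,
        clip_sample=False,
    )
    # float32 on purpose: the PR notes that float16 fails to optimize the embedding.
    pipeline = NullTextPipeline.from_pretrained(
        model_path, scheduler=scheduler, torch_dtype=torch.float32
    ).to(device)

    # DDIM inversion plus null-text optimization (argument names are assumptions).
    inverted_latent, uncond = pipeline.invert(
        image_path, invert_prompt, num_inner_steps=10,
        early_stop_epsilon=1e-5, num_inference_steps=steps,
    )
    # Same prompt reconstructs the image; a different prompt edits it.
    result = pipeline(
        prompt, uncond, inverted_latent,
        guidance_scale=7.5, num_inference_steps=steps,
    )
    return result.images[0]


# Hypothetical call (model id and invert prompt are placeholders):
# image = reconstruct_or_edit("<stable-diffusion-checkpoint>", "siamese.jpg",
#                             "<prompt describing the image>", "A lying dog")
```

Note that image_path is a path string rather than a PIL image, matching the behavior described earlier in this conversation.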