-
Notifications
You must be signed in to change notification settings - Fork 6.4k
Description
Is your feature request related to a problem? Please describe.
There is a ridiculous amount of code duplication going on in this library. Could anyone please tell me why the schedulers function betas_for_alpha_bar
is duplicated - identically - 23 times, appearing in about half of the schedulers files? There are many other duplicated functions such as retrieve_timesteps
, appearing 6 times. Many many member functions of classes are also duplicated. This seems like extremely bad practice, and makes it very hard to modify these functions.
This is just the tip of the iceberg - there are similar patterns across the whole codebase. The string "# Copied from" occurs in 149 files, and often many times in each file too. Bad sign. Why are there separate standalone files for StableDiffusionPipeline
, StableDiffusionImg2ImgPipeline
, StableDiffusionInpaintPipeline
, StableDiffusionPipelineSafe
and so on (and so on and so on) when they have a huge amount in common, and could be expressed as single methods of or even arguments to one pipeline? Even if you want them to be separate classes, surely they should be sharing code via OOP rather than using this huge amount of duplicated code? Surely, functionally, StableDiffusionControlNetImg2ImgPipeline
could have been expressed as a very simple combination of the existing ControlNet and Img2Img implementations, rather than being its own 1269 line file (with 17 occurrences of "# Copied from")?
In my opinion this methodology will make this codebase increasingly difficult to maintain and contribute to as the features continue to grow. While I unfortunately don't currently have time to attempt this myself, I really believe this package needs a serious refactor. There are already problems with advanced use; if I were to train a ControlNet on an image variation model, I would be unable to use the result with diffusers as StableDiffusionControlNetImageVariationPipeline
does not yet exist (not that it should!).
It's worth running pipelines/stable_diffusion/pipeline_stable_diffusion.py
and pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py
through diff to see the extent of the issue. The code and documentation are extremely similar, with huge identical sections. check_inputs
is slightly different, Img2Img has a new function get_timesteps
, there are slight differences to the arguments to __call__
, and slight differences to the body - and that's it. Both files are ~1000 lines long.
Describe the solution you'd like.
Share code between pipelines and schedulers wherever possible. There should surely be some kind of StableDiffusionPipeline
base class given the number of related classes. ControlNet should be incorporated in to existing Stable Diffusion pipelines - to my understanding wherever SD is used, ControlNet may also be used (we just need to provide an image condition), so to me it seems very overkill to have separate pipelines for every ControlNet use case. For example, StableDiffusionControlNetImg2ImgPipeline
could inherit from StableDiffusionImg2ImgPipeline
, or StableDiffusionImg2ImgPipeline
could take optional image conditions as input to its call.
Of course these are just suggestions, it's worth thinking very carefully about how to proceed! But I really believe that change is necessary or this library will become more and more bloated as new features are added.
I was originally intending to add a custom beta schedule to this library (the sqrt schedule from Diffusion-LM), but the amount of code duplication that would be necessary for this really put me off. I believe the noise schedule system needs a rewrite - there should really be just one function which maps from a string (and num timesteps etc) to a list of betas, rather than 23 identical functions + additional logic in many different constructors. (perhaps I should separate this into a separate issue?)