IP adapter key error when loading an LCM LoRA trained with train_lcm_distill_lora_sdxl.py together with SDXL and IP adapter #6382

@cjt222

Describe the bug

I trained an LCM LoRA for SDXL with the train_lcm_distill_lora_sdxl.py script. After training, loading the LoRA into an SDXL pipeline that also uses the IP adapter produces a key error. The message reports unexpected keys:

Loading adapter weights from state_dict led to unexpected keys not found in the model: ['down_blocks.0.resnets.0.conv1.lora_A_1.default_0.weight', 'down_blocks.0.resnets.0.conv1.lora_B_1.default_0.weight', 'down_blocks.0.resnets.0.time_emb_proj.lora_A_1.default_0.weight', 'down_blocks.0.resnets.0.time_emb_proj.lora_B_1.default_0.weight', 'down_blocks.0.resnets.0.conv2.lora_A_1.default_0.weight', 'down_blocks.0.resnets.0.conv2.lora_B_1.default_0.weight', 'down_blocks.0.resnets.1.conv1.lora_A_1.default_0.weight', 'down_blocks.0.resnets.1.conv1.lora_B_1.default_0.weight', 'down_blocks.0.resnets.1.time_emb_proj.lora_A_1.default_0.weight', 'down_blocks.0.resnets.1.time_emb_proj.lora_B_1.default_0.weight', 'down_blocks.0.resnets.1.conv2.lora_A_1.default_0.weight', 'down_blocks.0.resnets.1.conv2.lora_B_1.default_0.weight', 'down_blocks.0.downsamplers.0.conv.lora_A_1.default_0.weight', 'down_blocks.0.downsamplers.0.conv.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.proj_in.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.proj_in.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_q.lora_B_1.default_0.weight', 
'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.0.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_k.lora_B_1.default_0.weight', 
'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.transformer_blocks.1.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.1.attentions.0.proj_out.lora_A_1.default_0.weight', 'down_blocks.1.attentions.0.proj_out.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.proj_in.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.proj_in.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.lora_A_1.default_0.weight', 
'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.0.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_v.lora_A_1.default_0.weight', 
'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.transformer_blocks.1.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.1.attentions.1.proj_out.lora_A_1.default_0.weight', 'down_blocks.1.attentions.1.proj_out.lora_B_1.default_0.weight', 'down_blocks.1.resnets.0.conv1.lora_A_1.default_0.weight', 'down_blocks.1.resnets.0.conv1.lora_B_1.default_0.weight', 'down_blocks.1.resnets.0.time_emb_proj.lora_A_1.default_0.weight', 'down_blocks.1.resnets.0.time_emb_proj.lora_B_1.default_0.weight', 'down_blocks.1.resnets.0.conv2.lora_A_1.default_0.weight', 'down_blocks.1.resnets.0.conv2.lora_B_1.default_0.weight', 'down_blocks.1.resnets.0.conv_shortcut.lora_A_1.default_0.weight', 'down_blocks.1.resnets.0.conv_shortcut.lora_B_1.default_0.weight', 'down_blocks.1.resnets.1.conv1.lora_A_1.default_0.weight', 'down_blocks.1.resnets.1.conv1.lora_B_1.default_0.weight', 'down_blocks.1.resnets.1.time_emb_proj.lora_A_1.default_0.weight', 'down_blocks.1.resnets.1.time_emb_proj.lora_B_1.default_0.weight', 'down_blocks.1.resnets.1.conv2.lora_A_1.default_0.weight', 'down_blocks.1.resnets.1.conv2.lora_B_1.default_0.weight', 'down_blocks.1.downsamplers.0.conv.lora_A_1.default_0.weight', 'down_blocks.1.downsamplers.0.conv.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.proj_in.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.proj_in.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.0.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_k.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.1.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_v.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.2.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_out.0.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.3.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.3.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_q.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.4.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_k.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.5.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_v.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.6.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_out.0.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.7.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.ff.net.0.proj.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.7.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.attn2.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.ff.net.0.proj.lora_A_1.default_0.weight', 
'down_blocks.2.attentions.0.transformer_blocks.8.ff.net.0.proj.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.ff.net.2.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.8.ff.net.2.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_q.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_q.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_k.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_k.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_v.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_v.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_out.0.lora_A_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn1.to_out.0.lora_B_1.default_0.weight', 'down_blocks.2.attentions.0.transformer_blocks.9.attn2.to_q.lora_A_1.default_0.weight'
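The key list above is long enough that the pattern is hard to see by eye. As a minimal sketch, assuming the reported keys have been copied into a Python list, the helper below counts them per leading block and per LoRA suffix. `summarize_lora_keys` is a hypothetical name used only for illustration; it is not part of diffusers or peft.

```python
import re
from collections import Counter

# Hypothetical helper for reading the warning; not a diffusers or peft API.
# Matches a trailing LoRA suffix such as ".lora_A_1.default_0.weight".
_LORA_SUFFIX = re.compile(r"\.(lora_[AB][^.]*)\.([^.]+)\.weight$")


def summarize_lora_keys(keys):
    """Count keys by their first two path components and by LoRA suffix."""
    by_block = Counter()
    by_suffix = Counter()
    for key in keys:
        parts = key.split(".")
        # First two components, e.g. "down_blocks.0".
        by_block[".".join(parts[:2])] += 1
        match = _LORA_SUFFIX.search(key)
        if match:
            # (LoRA layer name, adapter name), e.g. ("lora_A_1", "default_0").
            by_suffix[(match.group(1), match.group(2))] += 1
        else:
            by_suffix[("<no LoRA suffix>", "")] += 1
    return by_block, by_suffix


# Three keys copied from the message above.
sample = [
    "down_blocks.0.resnets.0.conv1.lora_A_1.default_0.weight",
    "down_blocks.0.resnets.0.conv1.lora_B_1.default_0.weight",
    "down_blocks.1.attentions.0.proj_in.lora_A_1.default_0.weight",
]
by_block, by_suffix = summarize_lora_keys(sample)
print(dict(by_block))
print(dict(by_suffix))
```

In the portion of the message shown, every key ends in `lora_A_1.default_0.weight` or `lora_B_1.default_0.weight`, so the summary collapses to two suffix groups. That suggests the mismatch concerns how the LoRA layers are named rather than a handful of missing modules.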

Reproduction

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, StableDiffusionXLPipeline
import torch
import numpy as np
import cv2
from PIL import Image
from diffusers.utils import load_image
from transformers import pipeline, CLIPVisionModelWithProjection
from diffusers import LCMScheduler
from controlnet_aux import CannyDetector
from controlnet_aux import ContentShuffleDetector

lcm_lora_id = "/home/kas/kas_workspace/cjt/save_lcm_sdxl_models/checkpoint-34000"
pipe = StableDiffusionXLPipeline.from_pretrained(
    "/home/kas/kas_workspace/zijunhuang/MODEL_LIBS/models/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16)
# Alternative base model (commented out):
# pipe = StableDiffusionXLPipeline.from_pretrained(
#     "/home/kas/style_transfer/ip_adapters/models/RealVisXL_V3.0", torch_dtype=torch.float16)

pipe.to("cuda")
pipe.safety_checker=None

pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter-plus_sdxl_vit-h.safetensors")
pipe.image_encoder = CLIPVisionModelWithProjection.from_pretrained("h94/IP-Adapter", subfolder="models/image_encoder",torch_dtype=torch.float16).to('cuda')
pipe.set_ip_adapter_scale(0.6)
pipe.load_lora_weights(lcm_lora_id)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

ref_image = load_image("/home/kas/style_transfer/AesPA-Net/style_image/dog.jpg")
#ref_image_2 = load_image("/home/kas/style_transfer/AesPA-Net/style_image/23.jpg")

generator = torch.Generator(device="cpu").manual_seed(330)
images = pipe(
    prompt='Whimsical steampunk-inspired airship soaring through the skies amidst floating islands, best quality, high quality',
    ip_adapter_image=ref_image,
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
    num_inference_steps=4,
    generator=generator,
    guidance_scale=1,
    clip_skip=1
).images
images[0].save("ip_adapters_xl.png")

Logs

No response

System Info

  • diffusers version: 0.25.0.dev0
  • Platform: Linux-5.4.0-48-generic-x86_64-with-glibc2.27
  • Python version: 3.10.13
  • PyTorch version (GPU?): 2.1.0+cu121 (True)
  • Huggingface_hub version: 0.20.1
  • Transformers version: 4.36.2
  • Accelerate version: 0.24.1
  • xFormers version: 0.0.22.post7
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@sayakpaul @yiyixuxu

Labels: bug
