@woct0rdho
Contributor

The NewBie model uses detailed XML prompts that can exceed 1024 tokens, so it's time to implement the sliding attention that was previously missing from Gemma3.

I also found a mistake in the config: Gemma3 should use 5 sliding layers for every 1 non-sliding layer, not vice versa. rope_theta and rope_scale are swapped accordingly. (Before sliding attention was implemented, this mistake did not affect results for prompts with < 1024 tokens.)
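
For illustration, here is a minimal pure-Python sketch of the two ideas above. It is not the code from this PR: the helper names `layer_pattern` and `sliding_causal_mask`, the exact window boundary, and the placement of the non-sliding layer as the last of each group of six are assumptions made for the example.

```python
def layer_pattern(num_layers, period=6):
    """Return one bool per layer: True = sliding attention, False = non-sliding.

    Assumption: the non-sliding layer is the last one in each group of
    `period` layers, giving 5 sliding + 1 non-sliding when period=6.
    """
    return [(i + 1) % period != 0 for i in range(num_layers)]


def sliding_causal_mask(seq_len, window):
    """Return mask[q][k], True where query position q may attend to key k.

    A key is visible if it is not in the future (k <= q) and lies within
    the last `window` positions (q - k < window). The strict inequality
    is an assumption; implementations may differ by one at the boundary.
    """
    return [
        [(k <= q) and (q - k < window) for k in range(seq_len)]
        for q in range(seq_len)
    ]


def causal_mask(seq_len):
    """Plain causal mask, used by the non-sliding layers in this sketch."""
    return [[k <= q for k in range(seq_len)] for q in range(seq_len)]


if __name__ == "__main__":
    # 12 layers: two groups of 5 sliding layers followed by 1 non-sliding layer.
    print(layer_pattern(12))

    # With a window of 2, the last query only sees itself and one earlier token.
    for row in sliding_causal_mask(4, window=2):
        print(["x" if allowed else "." for allowed in row])

    # When the sequence fits inside the window, both masks coincide.
    print(sliding_causal_mask(4, window=1024) == causal_mask(4))
```

In this sketch the sliding mask is identical to the plain causal mask whenever the sequence is no longer than the window, which is consistent with short prompts being unaffected before sliding attention was implemented.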

After this PR, the condition tensor from Gemma3 in ComfyUI is much closer to the one from Gemma3 in Transformers.

@comfyanonymous comfyanonymous merged commit 0aa7fa4 into Comfy-Org:master Dec 20, 2025
10 checks passed
@woct0rdho woct0rdho deleted the gemma-sliding-attn branch December 20, 2025 06:03
lrivera pushed a commit to Research-Warrant/ComfyUI that referenced this pull request Jan 8, 2026