Skip to content

Add position_ids to BloomForCausalLM forward pass#44706

Closed
saivedant169 wants to merge 2 commits into
huggingface:mainfrom
saivedant169:fix/issue-32937-bloom-position-ids
Closed

Add position_ids to BloomForCausalLM forward pass#44706
saivedant169 wants to merge 2 commits into
huggingface:mainfrom
saivedant169:fix/issue-32937-bloom-position-ids

Conversation

@saivedant169
Copy link
Copy Markdown

Fixes part of #32937

What does this PR do?

Adds position_ids as an explicit parameter to BloomForCausalLM.forward() and BloomModel.forward(), bringing Bloom in line with other CausalLM models like Llama, Falcon, Gemma, and Mistral.

Bloom uses ALiBi for positional encoding, so position_ids isn't consumed by the attention computation. But it should still be accepted explicitly rather than silently absorbed through **kwargs:

  • GenerationMixin constructs and passes position_ids during autoregressive decoding. Without it in the signature, the parameter goes into **kwargs and is silently ignored — which works but is fragile and inconsistent.
  • Having a consistent forward API across all CausalLM models makes it easier for downstream tools and frameworks that iterate over model inputs.

This is a follow-up to #44705 (RoFormer), continuing the effort to standardize the CausalLM forward signature across architectures.

How was it tested?

  • Full Bloom test suite: 115 passed, 151 skipped, 0 failures
  • make style clean
python -m pytest tests/models/bloom/test_modeling_bloom.py -v

Coordination

Commented on #32937 here. This is part of a series — RoFormer (#44705) was the first.

Threads position_ids through BloomForCausalLM -> BloomModel for API
consistency with other CausalLM models (Llama, Falcon, Gemma, etc.).

Bloom uses ALiBi for positional encoding, so position_ids is accepted
but not consumed by the attention layer — this matches the pattern
where GenerationMixin passes position_ids and the model should accept
it explicitly rather than silently absorbing it through **kwargs.

Part of huggingface#32937

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: bloom

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants