fix(vllm): warn that stream interval is not respected for now #4650
base: main
Signed-off-by: alec-flowers <[email protected]>
d2407d4 to b6bdbf1
Walkthrough
A runtime warning was added to the …
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
Pre-merge checks
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Signed-off-by: alec-flowers <[email protected]>
Actionable comments posted: 0
🧹 Nitpick comments (1)
components/src/dynamo/vllm/args.py (1)
205-205: Remove trailing space. There's a trailing space before the closing quote.
- "bypassing vLLM's OutputProcessor buffering. "
+ "bypassing vLLM's OutputProcessor buffering."
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
components/src/dynamo/vllm/args.py (1 hunk)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: vllm (arm64)
- GitHub Check: trtllm (arm64)
- GitHub Check: sglang (arm64)
- GitHub Check: operator (amd64)
- GitHub Check: operator (arm64)
- GitHub Check: Build and Test - dynamo
- GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (1)
components/src/dynamo/vllm/args.py (1)
201-207: Defensive implementation with correct assumptions confirmed.
The warning implementation is solid. The use of hasattr() ensures backward compatibility across vLLM versions, and the condition correctly checks for non-default values: vLLM's stream_interval default is indeed 1. The message clearly explains why Dynamo doesn't respect this flag, noting it bypasses vLLM's OutputProcessor buffering with its own post-processing implementation.
No changes needed.
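For readers without the diff at hand, the check described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from args.py: the function name `warn_if_stream_interval_set`, the constant `DEFAULT_STREAM_INTERVAL`, the boolean return value, and the exact warning wording are all hypothetical. The only parts taken from the review are the `hasattr()` guard, the default value of 1, and the "bypassing vLLM's OutputProcessor buffering" phrase.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)

# Assumption: vLLM's default stream_interval is 1, as stated in the review comment.
DEFAULT_STREAM_INTERVAL = 1


def warn_if_stream_interval_set(engine_args) -> bool:
    """Log a warning when stream_interval has a non-default value.

    Hypothetical helper; returns True when a warning was logged so the
    behavior is easy to check in a test.
    """
    # hasattr() keeps the check safe on vLLM versions without this attribute.
    if not hasattr(engine_args, "stream_interval"):
        return False
    # The default value needs no warning.
    if engine_args.stream_interval == DEFAULT_STREAM_INTERVAL:
        return False
    # Illustrative message wording, not the text used in the PR.
    logger.warning(
        "--stream-interval=%s is not respected for now: Dynamo applies its own "
        "post-processing, bypassing vLLM's OutputProcessor buffering.",
        engine_args.stream_interval,
    )
    return True


# Usage with stand-in argument objects (SimpleNamespace replaces the real engine args).
warned = warn_if_stream_interval_set(SimpleNamespace(stream_interval=4))
```

The early returns keep the two "no warning" cases (attribute missing, value left at default) separate from the single warning path.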
Signed-off-by: Alec <[email protected]>
Overview:
As titled
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
Added a warning when the --stream-interval parameter is configured, informing users that this setting is not supported and that output buffering behavior differs from standard operation.