- https://rogerw.io
- in/rogerywang
- @rogerw0108
Stars: 10 repositories, written in Python
- A high-throughput and memory-efficient inference and serving engine for LLMs
- EasyR1: an efficient, scalable, multi-modality RL training framework based on veRL
- Entropy-based sampling and parallel CoT decoding
- A framework for efficient model inference with omni-modality models
- VeOmni: scaling any-modality model training with a model-centric distributed recipe zoo
- Train speculative decoding models effortlessly and port them smoothly to SGLang serving
- Discrete Diffusion Forcing (D2F): dLLMs can do faster-than-AR inference