Modern RL Post-training Infrastructure: Optimized for NVIDIA/AMD GPUs with a focus on vLLM and DeepSpeed integration, CUDA/ROCm/Triton kernels, and transparent hardware-aware scaling.

Python 122 20 Updated Jun 13, 2026
Python 230 21 Updated May 26, 2026

An AI skill pack for value investing, capital allocation, and behavioral discipline, distilled from Warren Buffett's 60+ years of shareholder letters.

38 8 Updated Apr 17, 2026

Jobs scraper library for LinkedIn, Indeed, Glassdoor, Google, ZipRecruiter & more

Python 3,655 718 Updated Feb 18, 2026

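Berkshire for the AI era: a value-investing research framework built on Claude Code. It combines the methodologies of four masters (Buffett, Munger, Duan Yongping, and Li Lu) with parallel multi-agent research.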

Python 33 6 Updated Jun 14, 2026

A curated collection of papers and resources on On-Policy Distillation for Large Language Models.

Python 301 6 Updated Jun 6, 2026

The next employee you distill doesn't have to be a colleague. Distill how anyone thinks: mental models, decision heuristics, and expressive DNA.

Python 24,202 3,555 Updated Jun 6, 2026

RLAnything (ICML 2026) & AutoTool (ICML 2026), DemyAgent: Open-Source RL for LLMs and Agentic Scenarios

Python 546 56 Updated Jun 12, 2026

OpenClaw-RL: Train any agent simply by talking

Python 5,494 595 Updated May 23, 2026

Code and implementations for the paper "AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning" by Zhiheng Xi et al.

Python 774 74 Updated Feb 15, 2026

AI agents running research on single-GPU nanochat training automatically

Python 86,581 12,541 Updated Mar 26, 2026

A Foundation Model for Generalist Gaming Agents

Python 2,076 233 Updated Jan 25, 2026

🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 68,774 8,786 Updated Jan 21, 2026

This repository provides a valuable reference for researchers in the field of multimodality. Start your exploration of RL-based Reasoning MLLMs here!

1,421 62 Updated May 11, 2026

Awesome Deep Learning papers for industrial Search, Recommendation and Advertisement. They focus on Embedding, Matching, Pre-Ranking, Ranking, Post Ranking, Relevance, LLM and RL. Please cite our p…

Python 2,525 288 Updated Apr 25, 2026

[ICLR2026] This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incen…

Python 1,357 27 Updated Mar 20, 2026

Ultralytics YOLO 🚀

Python 58,358 11,193 Updated Jun 14, 2026

Official implementation of GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Python 470 32 Updated May 20, 2026

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,444 120 Updated Apr 17, 2026

slime is an LLM post-training framework for RL Scaling.

Python 6,109 894 Updated Jun 13, 2026

Search-R1: An efficient, scalable RL training framework for LLMs that interleave reasoning with search engine calls, based on veRL

Python 4,930 442 Updated Nov 13, 2025

A curated list of reinforcement learning (RL) for agents.

98 3 Updated Jun 6, 2026

The official repo for "Vidi: Large Multimodal Models for Video Understanding and Editing"

Python 637 44 Updated Mar 4, 2026

P1: Mastering Physics Olympiads with Reinforcement Learning

86 4 Updated Dec 29, 2025

A curated guide to Generative Engine Optimization (GEO) resources: guides, tools & research to boost visibility in AI-powered search engines.

411 86 Updated Apr 14, 2026

Awesome list for research on GEO (Generative Engine Optimization)

103 11 Updated Feb 4, 2026

A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)

203 7 Updated Aug 6, 2025

Monitoring recent cross-research on LLM & RL on arXiv for control. If there are good papers, PRs are welcome.

554 37 Updated Nov 17, 2025

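微舆: a multi-agent public-opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, forecasts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.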

Python 41,372 7,594 Updated May 24, 2026