CUHK-Shenzhen
Stars
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
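张雪峰.skill: Zhang Xuefeng's cognitive operating system. A practical thinking framework for gaokao (college entrance exam) application choices, graduate school entrance exams, and career planning. Generated by 女娲.skill.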
A curated list of papers on reinforcement learning for video generation
Memory for 24/7 proactive agents like openclaw (moltbot, clawdbot).
Give your agents the power of the Hugging Face ecosystem
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your claude code/codex/gemini agent will be an AI research agent with full horsepowe…
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
[ICLR 2026] Official repo for paper "Video-As-Prompt: Unified Semantic Control for Video Generation"
SkyReels-A2: Compose anything in video diffusion transformers
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
"ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)"
Light Image Video Generation Inference Framework
[ICLR 26 Oral] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling
[ICLR 2026] LongLive: Real-time Interactive Long Video Generation
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
A unified inference and post-training framework for accelerated video generation.
An MCP server for searching against the Google Custom Search API
The Schema-Guided Dialogue Dataset
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.
Extracted system prompts from ChatGPT (GPT-5.4, GPT-5.3, Codex), Claude (Opus 4.6, Sonnet 4.6, Claude Code), Gemini (3.1 Pro, 3 Flash, CLI), Grok (4.2, 4), Perplexity, and more. Updated regularly.
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone

