Stars
A Claude Code skill that turns any codebase into a beautiful, interactive single-page HTML course for non-technical vibe coders.
Tutorials for Triton, a language for writing GPU kernels
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
Kandinsky 5.0: A family of diffusion models for Video & Image generation
[ICLR 2026] LongLive: Real-time Interactive Long Video Generation
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
[NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning
Wan: Open and Advanced Large-Scale Video Generative Models
Supercharge Your LLM with the Fastest KV Cache Layer
TinyChatEngine: On-Device LLM Inference Library
KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)
I seek to draw from the philosophies embedded in AI research to inspire deeper reflection on life — exploring how the principles that guide intelligent systems can illuminate the way we live, think…
[ICCV 2025 (Highlight)] DIMO: Diverse 3D Motion Generation for Arbitrary Objects
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
A curated list of recent papers on efficient video attention for video diffusion models, covering sparsification, quantization, caching, and related techniques.
Fast and memory-efficient exact attention
Efficient Triton Kernels for LLM Training
[ICCV 2025] GameFactory: Creating New Games with Generative Interactive Videos
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
MAGI-1: Autoregressive Video Generation at Scale
Let's make video diffusion practical!
CUDA integration for Python, plus shiny features
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
Interactive visualizations of the geometric intuition behind diffusion models.



