vivo · Hangzhou, Zhejiang, China
Stars
An open-source AI video generation workbench driven by AI Agents: novel → character/scene/prop design → script → storyboard → video, with consistent characters and scenes across shots. Powered by Nano Banana 2 & Veo 3.1 / Grok / Seedance / OpenAI.
Fast, accurate & comprehensive text measurement & layout
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A collection of modern C++ libraries, including coro_http, coro_rpc, compile-time reflection, struct_pack, struct_json, struct_xml, struct_pb, easylog, async_simple, etc.
A std::execution-style runtime context and high-performance RPC transport built on OpenUCX, including CUDA/ROCm/... devices with RDMA.
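A minimal sketch of the std::execution sender/receiver style this description refers to, written against NVIDIA's stdexec reference implementation (an assumption on my part; this project's UCX-backed scheduler and API may differ):

```cpp
#include <stdexec/execution.hpp>
#include <exec/static_thread_pool.hpp>
#include <cstdio>

int main() {
    // A thread pool stands in for a UCX-backed runtime context here.
    exec::static_thread_pool pool{4};
    auto sched = pool.get_scheduler();

    // Build a work graph lazily: nothing runs until it is awaited.
    auto work = stdexec::schedule(sched)
              | stdexec::then([] { return 21; })
              | stdexec::then([](int x) { return 2 * x; });

    // sync_wait drives the sender to completion and returns its value.
    auto [result] = stdexec::sync_wait(std::move(work)).value();
    std::printf("result = %d\n", result);  // result = 42
}
```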
A General-purpose Task-parallel Programming System in C++
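For what task parallelism looks like in code, here is a generic illustration of a diamond-shaped task graph using only the C++ standard library (not this project's own API):

```cpp
#include <future>
#include <cstdio>

// A diamond-shaped task graph: A -> (B, C) -> D.
// B and C run concurrently once A's result is ready.
int main() {
    std::future<int> a = std::async(std::launch::async, [] { return 10; });
    int av = a.get();

    std::future<int> b = std::async(std::launch::async, [av] { return av + 1; });
    std::future<int> c = std::async(std::launch::async, [av] { return av * 2; });

    int d = b.get() + c.get();   // D joins both branches
    std::printf("d = %d\n", d);  // d = 31
}
```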
A relative time formatting library, with no code.
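To show what "relative time formatting" means in practice, a minimal self-contained C++ sketch of the concept (illustrative only, not taken from the library):

```cpp
#include <chrono>
#include <string>
#include <cstdio>

// Format a past time point as a coarse human-readable offset,
// e.g. "42 seconds ago", "3 hours ago".
std::string relative(std::chrono::system_clock::time_point past) {
    using namespace std::chrono;
    auto s = duration_cast<seconds>(system_clock::now() - past).count();
    if (s < 60)    return std::to_string(s) + " seconds ago";
    if (s < 3600)  return std::to_string(s / 60) + " minutes ago";
    if (s < 86400) return std::to_string(s / 3600) + " hours ago";
    return std::to_string(s / 86400) + " days ago";
}

int main() {
    auto three_hours_ago = std::chrono::system_clock::now() - std::chrono::hours(3);
    std::printf("%s\n", relative(three_hours_ago).c_str());  // "3 hours ago"
}
```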
🦋 An infographic generation and rendering framework, bringing words to life with AI!
[CVPR 2025] FaithDiff for Classic Film Rejuvenation, Old Photo Revival, Social Media Restoration, Image Enhancement and AIGC Enhancement.
A minimalist SOTA LaTeX OCR model with only 20M parameters, running in the browser. The full training pipeline is open-sourced for self-reproduction.
Online click-to-read edition of New Concept English: click a sentence to hear it read aloud, with continuous playback; supports EN / EN+CN / CN modes.
sgl-project / DeepGEMM
Forked from deepseek-ai/DeepGEMM. DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling.
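To illustrate what "fine-grained scaling" means in such kernels: each small block of the quantized matrix carries its own scale factor, applied during accumulation, rather than one scale per whole tensor. A CPU reference sketch (int8 stands in for FP8, and the block size of 128 is an assumption; the actual CUDA kernels are far more involved):

```cpp
#include <cstdint>
#include <vector>

// C = A * B where A (MxK) is stored quantized with one float scale per
// 128-wide block of each row; B is kept in float for simplicity.
// Assumes K is divisible by BLK.
constexpr int BLK = 128;

void gemm_finegrained(int M, int N, int K,
                      const std::vector<int8_t>& A,
                      const std::vector<float>& a_scale,  // M * (K / BLK) scales
                      const std::vector<float>& B,
                      std::vector<float>& C) {
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n) {
            float acc = 0.0f;
            for (int kb = 0; kb < K / BLK; ++kb) {
                // Partial dot product over one block, dequantized by its own scale.
                float part = 0.0f;
                for (int k = kb * BLK; k < (kb + 1) * BLK; ++k)
                    part += static_cast<float>(A[m * K + k]) * B[k * N + n];
                acc += part * a_scale[m * (K / BLK) + kb];
            }
            C[m * N + n] = acc;
        }
}
```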
If you want to purchase loquats (pipa) from Miyi, Panzhihua, please contact me.
SGLang is a high-performance serving framework for large language models and multimodal models.
A Chinese-language tutorial on C++ templates. Unlike the well-known book C++ Templates, this series teaches C++ templates as a Turing-complete language in its own right, aiming to help readers thoroughly master meta-programming. (Work in progress)
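The "Turing-complete language" framing is easy to demonstrate: template instantiation alone can compute at compile time. A classic example (mine, not from the tutorial itself):

```cpp
// Compile-time factorial via recursive template instantiation:
// the "program" runs entirely in the type system.
template <unsigned N>
struct Factorial {
    static constexpr unsigned value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {  // base case terminates the recursion
    static constexpr unsigned value = 1;
};

static_assert(Factorial<5>::value == 120, "evaluated by the compiler");

int main() { return 0; }
```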
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Modern C++ Programming Course (C++03/11/14/17/20/23/26)
A low-latency & high-throughput serving engine for LLMs
📚 LeetCUDA: modern CUDA learning notes with PyTorch for beginners 🐑, covering 200+ CUDA kernels, Tensor Cores, HGEMM, and FA-2 MMA. 🎉
📚 A curated list of awesome LLM/VLM inference papers with code: Flash-Attention, Paged-Attention, WINT8/4, parallelism, etc. 🎉
fastllm is a high-performance LLM inference library with no backend dependencies. It supports both tensor-parallel inference for dense models and mixed-mode inference for MoE models; any GPU with 10 GB+ of memory can run the full DeepSeek model. On a dual-socket 9004/9005 server with a single GPU, the original full-precision DeepSeek model runs at 20 tps at single concurrency; the INT4-quantized model reaches 30 tps at single concurrency and 60+ tps under multiple concurrent requests.
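As background for the INT4 numbers above, a minimal sketch of symmetric 4-bit group quantization, the general technique behind such models (the group size of 32 and the packing layout are assumptions, not fastllm's actual format):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Quantize a group of 32 floats to signed 4-bit values in [-7, 7],
// packing two values per byte, with one float scale per group.
constexpr int GROUP = 32;

void quantize_group(const float* w, uint8_t* packed, float& scale) {
    float amax = 0.0f;
    for (int i = 0; i < GROUP; ++i) amax = std::max(amax, std::fabs(w[i]));
    scale = amax / 7.0f;
    if (scale == 0.0f) scale = 1.0f;  // all-zero group: avoid division by zero
    for (int i = 0; i < GROUP; i += 2) {
        auto q = [&](float v) {
            int x = static_cast<int>(std::lround(v / scale));
            return std::clamp(x, -7, 7) & 0x0F;  // two's-complement nibble
        };
        packed[i / 2] = static_cast<uint8_t>(q(w[i]) | (q(w[i + 1]) << 4));
    }
}

// Dequantize: sign-extend each nibble and multiply by the group scale.
float dequant_nibble(uint8_t nib, float scale) {
    int x = (nib & 0x08) ? static_cast<int>(nib) - 16 : static_cast<int>(nib);
    return x * scale;
}
```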
A lightweight, standalone C++ inference engine for Google's Gemma models.
How to optimize algorithms in CUDA.
Hackable and optimized Transformers building blocks, supporting a composable construction.
https://wavespeed.ai/ Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.

