Starred repositories
Fast, small, and fully autonomous AI assistant infrastructure — deploy anywhere, swap anything 🦀
"🐈 nanobot: The Ultra-Lightweight OpenClaw"
A lightweight alternative to OpenClaw that runs in containers for security. Connects to WhatsApp, Telegram, Slack, Discord, Gmail and other messaging apps, has memory, scheduled jobs, and runs dir…
An open-source SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills and subagents, it handles different levels of tasks that could take minute…
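🎬 卡卡字幕助手 (Kaka Subtitle Assistant) | VideoCaptioner - An LLM-based intelligent subtitle assistant - end-to-end handling of video subtitle generation, sentence segmentation, correction, and subtitle translation! - An LLM-powered tool for easy and efficient video subtitling.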
Anthropic's Interactive Prompt Engineering Tutorial
A free, open source, and extensible speech-to-text application that works completely offline.
We write your reusable computer vision tools. 💜
Reference PyTorch implementation and models for DINOv3
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
[NeurIPS 2025] Official implementation of "XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation".
A unified inference and post-training framework for accelerated video generation.
An inference and training framework for multiple image input in Flux Kontext dev
A PyTorch native platform for training generative AI models
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
The ultimate training toolkit for finetuning diffusion models
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
This node preserves image quality by selectively merging only the changed regions from AI-generated edits back into the original image.
Simple, Efficient, and Effective Negative Guidance in Few-Step Image Generation Models By Value Sign Flip
Repo for SeedVR2 (ICLR2026) & SeedVR (CVPR2025 Highlight)
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
LBM: Latent Bridge Matching for Fast Image-to-Image Translation ✨ (ICCV 2025 Highlight)
Context engineering is the new vibe coding - it's the way to actually make AI coding assistants work. Claude Code is the best for this so that's what this repo is centered around, but you can apply…
[NeurIPS 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent
An open-source AI agent that brings the power of Gemini directly into your terminal.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.