Stars
from vibe coding to agentic engineering - practice makes claude perfect
Model interpretability and understanding for PyTorch
Text-audio foundation model from Boson AI
LabelImg is now part of the Label Studio community. The popular image annotation tool created by Tzutalin is no longer actively being developed, but you can check out Label Studio, the open source …
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
[NeurIPS 2023] Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
TransNet V2: Shot Boundary Detection Neural Network
Robust Speech Recognition via Large-Scale Weak Supervision
From a nobody to a large language model (LLM) hero~ Stay tuned for updates!!!
A Trimap-Free Portrait Matting Solution in Real Time [AAAI 2022]
A collection of awesome text-to-image generation studies.
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
SEED-Story: Multimodal Long Story Generation with Large Language Model
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
SD-Trainer: LoRA & Dreambooth training scripts and a GUI that use kohya-ss's trainer, for diffusion models.
Stable Diffusion web UI
Latent Couple extension (two shot diffusion port)
Implementation of DragGAN: Interactive Point-based Manipulation on the Generative Image Manifold
Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
PyTorch implementation of "User-Controllable Latent Transformer for StyleGAN Image Layout Editing" [Computer Graphics Forum (Proc. of Pacific Graphics 2022)]
StyleGAN - Official TensorFlow Implementation
StyleGAN2 - Official TensorFlow Implementation