- Tongji University
- Shanghai
Stars
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
"CLI-Anything: Making ALL Software Agent-Native" -- CLI-Hub: https://clianything.cc/
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
AI agent assistant and development framework that integrates many IM platforms, LLMs, plugins, and AI features, and can be your openclaw alternative. ✨
GUI for a Vocal Remover that uses Deep Neural Networks.
A powerful MCP toolkit for coding, providing semantic retrieval and editing capabilities - the IDE for your agent
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking face, running locally across platforms
SD-Trainer. LoRA & Dreambooth training scripts and GUI for diffusion models, using kohya-ss's trainer.
A VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation
The PyTorch-based audio source separation toolkit for researchers
Graphormer is a general-purpose deep learning backbone for molecular modeling.
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
A recreation of Neuro-Sama originally created in 7 days.
yysijie / st-gcn
Forked from open-mmlab/mmskeleton
Spatial Temporal Graph Convolutional Networks (ST-GCN) for Skeleton-Based Action Recognition in PyTorch
[SIGGRAPH 2025] One Model to Rig Them All: Diverse Skeleton Rigging with UniRig
[ICCV 2023] PyTorch Implementation of "MotionBERT: A Unified Perspective on Learning Human Motion Representations"
A toolbox for skeleton-based action recognition.
Official Code Release for [SIGGRAPH 2025] RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination
Official implementation of "MeshDiffusion: Score-based Generative 3D Mesh Modeling" (ICLR 2023 Spotlight)
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer (NeurIPS 2019)
