-
CUHK
- Shenzhen
Stars
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Robust Speech Recognition via Large-Scale Weak Supervision
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
为GPT/GLM等LLM大语言模型提供实用化交互接口,特别优化论文阅读/润色/写作体验,模块化设计,支持自定义快捷按钮&函数插件,支持Python和C++等项目剖析&自译解功能,PDF/LaTex论文翻译&总结功能,支持并行问询多种LLM模型,支持chatglm3等本地模型。接入通义千问, deepseekcoder, 讯飞星火, 文心一言, llama2, rwkv, claude2, m…
Ansible is a radically simple IT automation platform that makes your applications and systems easier to deploy and maintain. Automate everything from code deployment to network configuration to clo…
The official gpt4free repository | various collection of powerful language models | opus 4.6 gpt 5.3 kimi 2.5 deepseek v3.2 gemini 3
Unsloth Studio is a web UI for training and running open models like Gemma 4, Qwen3.5, DeepSeek, gpt-oss locally.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
PyTorch Tutorial for Deep Learning Researchers
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
A generative world for general-purpose robotics & embodied AI learning.
deep learning for image processing including classification and object-detection etc.
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Official inference repo for FLUX.1 models
A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
WebUI extension for ControlNet
Wan: Open and Advanced Large-Scale Video Generative Models
Question and Answer based on Anything.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.

