Stars
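Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons & function plugins, project analysis & self-explanation for Python, C++, and other projects, PDF/LaTeX paper translation & summarization, parallel querying of multiple LLMs, and local models such as chatglm3. Integrates 通义千问 (Tongyi Qianwen), deepseekcoder, 讯飞星火 (iFlytek Spark), 文心一言 (ERNIE Bot), llama2, rwkv, claude2, m…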
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
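微舆: a multi-agent public-opinion analysis assistant that anyone can use. It breaks information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch, with no dependency on any framework.
📚 "Building Agents from Scratch" (《从零开始构建智能体》): a from-scratch tutorial on agent principles and practice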
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Open-Sora: Democratizing Efficient Video Production for All
Generative Models by Stability AI
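Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow: uses ChatGPT for full-paper summarization, professional translation, polishing, reviewing, and responses to reviewers.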
[NeurIPS 2022] Towards Robust Blind Face Restoration with Codebook Lookup Transformer
Let's make video diffusion practical!
🏛️ 三省六部制 (Three Departments and Six Ministries) · OpenClaw Multi-Agent Orchestration System: 9 specialized AI agents with real-time dashboard, model config, and full audit trails
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Official implementation of AnimateDiff.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Unlimited-length talking video generation that supports image-to-video and video-to-video generation
Magic to turn Cursor/Windsurf into 90% of Devin
High-resolution models for human tasks.
Recommend new arXiv papers of interest to you daily, based on your Zotero library.
[AAAI 2025] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
[ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
A unified inference and post-training framework for accelerated video generation.
Efficient vision foundation models for high-resolution generation and perception.
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
