-
FlashLabs-Chroma Public
Forked from FlashLabs-AI-Corp/FlashLabs-Chroma
World's first open-source real-time end-to-end spoken dialogue model with personalized voice cloning.
Jupyter Notebook Apache License 2.0 Updated Jan 28, 2026 -
VoxCPM Public
Forked from OpenBMB/VoxCPM
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Python Apache License 2.0 Updated Jan 4, 2026 -
supertonic Public
Forked from supertone-inc/supertonic
Lightning-fast, on-device TTS, running natively via ONNX.
Swift MIT License Updated Nov 22, 2025 -
index-tts Public
Forked from index-tts/index-tts
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Python Other Updated Nov 7, 2025 -
fish-speech Public
Forked from fishaudio/fish-speech
SOTA Open Source TTS
Python Apache License 2.0 Updated Nov 6, 2025 -
sherpa-onnx Public
Forked from k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…
C++ Apache License 2.0 Updated Nov 3, 2025 -
unmute Public
Forked from kyutai-labs/unmute
Make text LLMs listen and speak
Python MIT License Updated Oct 22, 2025 -
https://colab.research.google.com/drive/1fQpp981-Kgv4IgSo5wdh-YJTgmdlN_y_?usp=sharing
Jupyter Notebook Updated Oct 6, 2025 -
TTS-Dataset-Maker Public
Automated workflow to generate TTS datasets using A.I.
-
docling Public
Forked from docling-project/docling
Get your documents ready for gen AI
Python MIT License Updated Sep 24, 2025 -
tesseraction Public
Extract knowledge and insights from YouTube comments and weigh them against your ideas, like a 4-D view of your 3-D idea
Updated Sep 22, 2025 -
voxtream Public
Forked from herimor/voxtream
VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency
Python Apache License 2.0 Updated Sep 22, 2025 -
stem-remixer Public
Music -> cover generator using music style (timbre + others) transfer
-
kokoro-tts Public
Forked from nazdridoy/kokoro-tts
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.
Python MIT License Updated Sep 13, 2025 -
HT-Demucs & Spleeter in one UI live on HuggingFace
-
VibeVoice Public
Forked from microsoft/VibeVoice
Frontier Open-Source Text-to-Speech
Python MIT License Updated Aug 25, 2025 -
CycleGAN-Timbre-Transfer Public
A simple CycleGAN-based approach for timbre style transfer across five distinct musical instruments. Pre-trained weights will be available.
Updated Aug 5, 2025 -
gpt-oss Public
Forked from openai/gpt-oss
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Python Apache License 2.0 Updated Aug 5, 2025 -
Finetune-Experiments Public
Fine-tuning experiments on large speech, language, and vision models with Unsloth, LoRA, PEFT, etc.
Updated Aug 4, 2025 -
notebooks Public
Forked from unslothai/notebooks
100+ Fine-tuning LLM Notebooks on Google Colab, Kaggle, and more.
Jupyter Notebook GNU Lesser General Public License v3.0 Updated Jul 31, 2025 -
gryannote Public
Forked from clement-pages/gryannote
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
Svelte MIT License Updated Jul 27, 2025 -
Kimi-Audio Public
Forked from MoonshotAI/Kimi-Audio
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
Python Updated Jun 21, 2025 -
EmoSphere-TTS Public
Forked from Choddeok/EmoSphere-TTS
[INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech
Python Updated May 20, 2025 -
expo-kokoro-onnx Public
Forked from isaiahbjork/expo-kokoro-onnx
Run Kokoro TTS locally on device using Expo & ONNX Runtime
TypeScript MIT License Updated May 8, 2025 -
spleeter Public
Forked from deezer/spleeter
Deezer source separation library including pretrained models.

