ahk-d

Ali Dulaimi ahk-d

A.I. (Conversational) agents

13 followers · 31 following

FlashLabs-Chroma Public
Forked from FlashLabs-AI-Corp/FlashLabs-Chroma

World's first open-source real-time end-to-end spoken dialogue model with personalized voice cloning.

Jupyter Notebook Apache License 2.0 Updated Jan 28, 2026
VoxCPM Public
Forked from OpenBMB/VoxCPM

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python Apache License 2.0 Updated Jan 4, 2026
Evaluating-and-Improving-CoT Public

Jupyter Notebook Updated Dec 19, 2025
supertonic Public
Forked from supertone-inc/supertonic

Lightning-fast, on-device TTS — running natively via ONNX.

Swift MIT License Updated Nov 22, 2025
index-tts Public
Forked from index-tts/index-tts

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python Other Updated Nov 7, 2025
fish-speech Public
Forked from fishaudio/fish-speech

SOTA Open Source TTS

Python Apache License 2.0 Updated Nov 6, 2025
sherpa-onnx Public
Forked from k2-fsa/sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…

C++ Apache License 2.0 Updated Nov 3, 2025
unmute Public
Forked from kyutai-labs/unmute

Make text LLMs listen and speak

Python MIT License Updated Oct 22, 2025
GPT-from-scratch-implementation-and-tutorial Public

https://colab.research.google.com/drive/1fQpp981-Kgv4IgSo5wdh-YJTgmdlN_y_?usp=sharing

Jupyter Notebook Updated Oct 6, 2025
TTS-Dataset-Maker Public

Automated workflow to generate TTS datasets using A.I.

Python 1 Updated Sep 30, 2025
vadsplit Public

Updated Sep 25, 2025
docling Public
Forked from docling-project/docling

Get your documents ready for gen AI

Python MIT License Updated Sep 24, 2025
tesseraction Public

Extract knowledge and insights from YouTube comments against your ideas, like a 4-D view of your 3-D idea.

chatbot rag crewai

Updated Sep 22, 2025
voxtream Public
Forked from herimor/voxtream

VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency

Python Apache License 2.0 Updated Sep 22, 2025
stem-remixer Public

Music -> cover generator using music style (timbre + others) transfer

deep-learning nextjs music-processing reactflow

TypeScript 2 Updated Sep 20, 2025
kokoro-tts Public
Forked from nazdridoy/kokoro-tts

A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.

Python MIT License Updated Sep 13, 2025
music-stem-separation-ui-2025 Public

HT-Demucs & Spleeter in one UI live on HuggingFace

Python 5 Updated Sep 12, 2025
VibeVoice Public
Forked from microsoft/VibeVoice

Frontier Open-Source Text-to-Speech

Python MIT License Updated Aug 25, 2025
CycleGAN-Timbre-Transfer Public

A simple CycleGAN-based approach for timbre style transfer across five distinct musical instruments. Pre-trained weights will be available.

pytorch generative-adversarial-network deepl-learning

Updated Aug 5, 2025
gpt-oss Public
Forked from openai/gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI

Python Apache License 2.0 Updated Aug 5, 2025
Finetune-Experiments Public

Fine-tuning experiments for large speech, language, and vision models with Unsloth, LoRA, PEFT, etc.

Updated Aug 4, 2025
notebooks Public
Forked from unslothai/notebooks

100+ Fine-tuning LLM Notebooks on Google Colab, Kaggle, and more.

Jupyter Notebook GNU Lesser General Public License v3.0 Updated Jul 31, 2025
ahk-d Public

Updated Jul 30, 2025
tts Public
Forked from inworld-ai/tts

Inworld TTS

Python MIT License Updated Jul 30, 2025
gryannote Public
Forked from clement-pages/gryannote

Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.

Svelte MIT License Updated Jul 27, 2025
VQ-VAE-Timbre-Transfer-Demo-Gradio Public

Jupyter Notebook Updated Jul 13, 2025
Kimi-Audio Public
Forked from MoonshotAI/Kimi-Audio

Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation

Python Updated Jun 21, 2025
EmoSphere-TTS Public
Forked from Choddeok/EmoSphere-TTS

[INTERSPEECH 2024] The official implementation of EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech

Python Updated May 20, 2025
expo-kokoro-onnx Public
Forked from isaiahbjork/expo-kokoro-onnx

Run Kokoro TTS locally on device using Expo & ONNX Runtime

TypeScript MIT License Updated May 8, 2025
spleeter Public
Forked from deezer/spleeter

Deezer source separation library including pretrained models.

Python 1 MIT License Updated Apr 2, 2025
