Unicorn Execution Engine

Multi-Platform Hardware-Accelerated AI Execution Framework


πŸš€ Overview

The Unicorn Execution Engine is a high-performance runtime for deploying AI models on specialized hardware accelerators, including Intel iGPUs, AMD NPUs, and (on the roadmap) NVIDIA, Apple, and Qualcomm devices. Developed by Magic Unicorn Inc., the engine uses hardware-specific optimizations to reach speedups such as 220x for WhisperX on AMD NPUs and ~3x for Kokoro TTS on Intel iGPUs (see benchmarks below).

✨ Key Features

Intel iGPU (OpenVINO)

  • Up to 3x speedup for Kokoro TTS vs. CPU (see benchmarks below)
  • ~15W power draw, suitable for laptops
  • 50+ professional voices for TTS
  • Zero-copy shared memory between CPU and iGPU (see the sketch below)
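For a sense of what the iGPU path involves, the Kokoro ONNX model (see Model Files below) can be compiled for the Intel iGPU with stock OpenVINO. A minimal sketch, assuming the plain OpenVINO Python API rather than this engine's own wrappers:

```python
import openvino as ov

# Compile the Kokoro ONNX model for the Intel iGPU.
core = ov.Core()
model = core.read_model("models/kokoro-v0_19.onnx")
compiled = core.compile_model(model, "GPU")  # "GPU" selects the integrated GPU plugin
```

The `KokoroIntelTTS` wrapper shown in Quick Start presumably builds on this kind of compile step, adding voice handling on top.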

AMD NPU (MLIR-AIE2)

  • 220x Speedup for WhisperX vs CPU
  • Custom MLIR-AIE2 Kernels for optimal utilization
  • INT8/INT4 Quantization with minimal accuracy loss (illustrated in the sketch below)
  • 16 TOPS INT8 performance on Phoenix NPU
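To make the quantization claim concrete: symmetric INT8 quantization maps each float tensor onto the integer range [-127, 127] with a single scale factor. The NumPy sketch below illustrates the general technique only; it is not the engine's `Quantizer` (shown later under Custom Optimization):

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    # Symmetric per-tensor quantization: the scale maps the largest
    # absolute weight onto the INT8 range [-127, 127].
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale  # dequantized approximation
print(np.abs(w - w_hat).max())        # small error, hence "minimal accuracy loss"
```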

πŸ“Š Performance Benchmarks

Kokoro TTS v0.19 (Intel iGPU)

| Platform      | Latency | Power | Speedup  |
|---------------|---------|-------|----------|
| Intel Iris Xe | 150 ms  | 15 W  | 3.0x     |
| Intel UHD     | 250 ms  | 12 W  | 1.8x     |
| CPU (i7)      | 450 ms  | 35 W  | baseline |

WhisperX Speech Recognition (AMD NPU)

| Model    | CPU Time | NPU Time | Speedup | Accuracy |
|----------|----------|----------|---------|----------|
| Large-v3 | 59.4 min | 16.2 sec | 220x    | 99%      |
| Large-v2 | 54.0 min | 18.0 sec | 180x    | 98%      |
| Medium   | 27.0 min | 14.4 sec | 112x    | 95%      |

πŸ› οΈ Quick Start

Intel iGPU - Kokoro TTS

```bash
# Install with OpenVINO support
pip install unicorn-execution-engine[intel-igpu]

# Or use Docker
docker pull magicunicorn/unicorn-execution-engine:kokoro-intel-igpu
```

```python
from tts.kokoro_intel_igpu import KokoroIntelTTS

# Initialize with Intel iGPU
tts = KokoroIntelTTS(device="igpu")

# Synthesize with 50+ voices
audio = tts.synthesize("Hello world!", voice="af_bella")
```
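What to do with the returned `audio` is up to you; as a hypothetical continuation, assuming `synthesize` returns a NumPy waveform at Kokoro's usual 24 kHz output rate:

```python
import soundfile as sf

# 24000 Hz is an assumption based on Kokoro's typical output rate;
# check the engine docs for the actual sample rate.
sf.write("hello.wav", audio, samplerate=24000)
```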

AMD NPU - WhisperX

```bash
# Install with NPU support
pip install unicorn-execution-engine[amd-npu]
```

```python
from unicorn_engine import NPUWhisperX

# Load quantized model
model = NPUWhisperX.from_pretrained("magicunicorn/whisperx-large-v3-npu")

# Transcribe with 220x speedup
result = model.transcribe("meeting.wav")
```
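The result schema is not documented here; if it mirrors upstream WhisperX (a dict with a `segments` list of timestamped entries), consuming it would look roughly like this:

```python
# Assumes a WhisperX-style result: {"segments": [{"start", "end", "text"}, ...]}
for seg in result.get("segments", []):
    print(f"[{seg['start']:7.2f}s - {seg['end']:7.2f}s] {seg['text']}")
```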

πŸ—οΈ Architecture

```text
Unicorn Execution Engine
β”œβ”€β”€ TTS Module
β”‚   β”œβ”€β”€ Kokoro v0.19 (Intel iGPU) βœ…
β”‚   └── Bark (Planned)
β”œβ”€β”€ STT Module
β”‚   └── WhisperX (AMD NPU) βœ…
β”œβ”€β”€ LLM Module
β”‚   β”œβ”€β”€ Llama (AMD NPU) 🚧
β”‚   └── Mistral (NVIDIA) πŸ“‹
└── Vision Module
    β”œβ”€β”€ CLIP (Apple ANE) πŸ“‹
    └── SAM (Qualcomm) πŸ“‹
```

πŸ“¦ Pre-built Models & Packages

HuggingFace Models

Docker Images

```bash
# Intel iGPU with Kokoro TTS
docker run --device /dev/dri -p 8880:8880 \
    magicunicorn/unicorn-execution-engine:kokoro-intel-igpu

# AMD NPU with WhisperX
docker run --device /dev/accel -p 8881:8881 \
    magicunicorn/unicorn-execution-engine:whisperx-amd-npu
```

πŸ”§ Platform-Specific Setup

Intel iGPU Setup

```bash
# Install OpenVINO and drivers
sudo apt-get install intel-opencl-icd intel-level-zero-gpu level-zero
pip install openvino==2024.0.0 onnxruntime-openvino==1.17.0

# Verify Intel GPU
lspci | grep -i intel | grep -i vga
```
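As an extra sanity check, OpenVINO itself can enumerate the devices it sees:

```python
import openvino as ov

# "GPU" should appear in this list once the Intel driver stack above is installed.
print(ov.Core().available_devices)
```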

AMD NPU Setup

```bash
# Install XRT and drivers
sudo apt-get install xrt amd-npu-driver
pip install "pyxrt>=2.0.0"  # quoted so the shell doesn't treat >= as a redirect

# Verify NPU
ls /dev/accel/accel0
```
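Beyond checking the device node, the `pyxrt` bindings installed above can open the NPU directly. A minimal sketch, assuming the device sits at index 0:

```python
import pyxrt

# Constructing the device handle fails loudly if XRT cannot reach the NPU.
device = pyxrt.device(0)
print("NPU opened via XRT")
```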

πŸ’Ύ Model Files (Git LFS)

Kokoro TTS

  • models/kokoro-v0_19.onnx (311MB) - TTS model
  • models/voices-v1.0.bin (25MB) - 50+ voice embeddings

WhisperX

  • models/whisperx-large-v3.npumodel (1.5GB) - Quantized INT8 model
  • models/whisperx-kernels.xclbin (50MB) - Custom MLIR kernels
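Note that a plain `git clone` of an LFS-backed repo fetches only small pointer files; to download the actual weights, run `git lfs install` once, then `git lfs pull --include="models/*"` from inside the repo.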

πŸš€ Advanced Features

Multi-Hardware Pipeline

```python
import asyncio

from unicorn_engine import MultiPlatformEngine

engine = MultiPlatformEngine()

# Automatically selects the best hardware for each stage:
# Intel iGPU for TTS, AMD NPU for STT
pipeline = engine.create_pipeline([
    ("speech_recognition", "amd-npu"),
    ("text_synthesis", "intel-igpu"),
])

# pipeline.process() is a coroutine; drive it with an event loop
# (or `await` it from inside your own async code).
# audio_input: your input audio (e.g. a file path or waveform)
result = asyncio.run(pipeline.process(audio_input))
```

Custom Optimization

```python
# Intel iGPU optimization
from intel_igpu_module import IntelIGPUExecutor

executor = IntelIGPUExecutor()
executor.optimize_for_latency()

# AMD NPU quantization
from unicorn_engine import Quantizer

quantizer = Quantizer(target="npu", precision="int8")
```

πŸ“ˆ Roadmap

  • Intel iGPU support (OpenVINO) βœ…
  • AMD NPU support (MLIR-AIE2) βœ…
  • NVIDIA GPU support (TensorRT) πŸ“‹
  • Apple Neural Engine support πŸ“‹
  • Qualcomm Hexagon DSP support πŸ“‹
  • Multi-device distribution πŸ“‹

🀝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Development

```bash
# Clone repo
git clone https://github.com/Unicorn-Commander/Unicorn-Execution-Engine
cd Unicorn-Execution-Engine

# Install dev dependencies (quoted so it also works under zsh)
pip install -e ".[dev]"

# Run tests
pytest tests/
```

πŸ“š Documentation

🏒 About Magic Unicorn Inc.

Magic Unicorn Inc. develops enterprise AI solutions optimized for edge deployment. The Unicorn Commander Suite provides complete AI infrastructure for on-premise deployments.

Related Projects

πŸ“„ License

MIT License - see LICENSE file for details.

πŸ™ Acknowledgments

  • Intel for OpenVINO and iGPU support
  • AMD for NPU hardware and MLIR-AIE2
  • OpenAI for original Whisper models
  • The open-source community

πŸ“ž Contact


Β© 2025 Magic Unicorn Inc. | Part of the Unicorn Commander Suite

⭐ Star us on GitHub if you find this useful!
