# Essential Development Tools for AMD Ryzen AI NPU

🎯 A universal NPU development toolkit: detection, profiling, optimization, and debugging tools for AMD Ryzen AI NPUs.

## Tool Overview

### Setup & Diagnostics
- `npu-detect` - Hardware detection and capability reporting
- `npu-setup` - Automated driver installation and configuration
- `npu-doctor` - System health check and troubleshooting
- `npu-info` - Detailed hardware information and status

### Profiling & Monitoring
- `npu-profile` - Performance profiling and bottleneck analysis
- `npu-benchmark` - Standard benchmarks for NPU performance (RTF 0.213 achieved!)
- `npu-monitor` - Real-time utilization and temperature monitoring
- `npu-optimize` - Model optimization recommendations (30% turbo improvement)

### Development
- `npu-compile` - MLIR-AIE kernel compilation wrapper
- `npu-test` - NPU functionality testing suite
- `npu-debug` - Debugging tools for NPU applications
- `npu-validate` - Model validation for NPU deployment
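The real-time factor (RTF) that `npu-benchmark` reports is processing time divided by the duration of the audio processed, so values below 1.0 mean faster-than-real-time inference (RTF 0.213 is roughly 4.7x real time). A minimal sketch of the metric (the function name and sample numbers are illustrative, not part of the toolkit):

```python
def real_time_factor(processing_s: float, audio_s: float) -> float:
    """RTF = time spent processing / duration of the audio processed.

    RTF < 1.0 means the pipeline runs faster than real time.
    """
    if audio_s <= 0:
        raise ValueError("audio duration must be positive")
    return processing_s / audio_s

# Hypothetical run: 0.8s of compute to process 3.2s of audio
print(real_time_factor(0.8, 3.2))  # -> 0.25
```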
## Installation

```sh
# One-line installer
curl -fsSL https://raw.githubusercontent.com/Unicorn-Commander/amd-npu-utils/main/install.sh | bash

# Manual installation
git clone https://github.com/Unicorn-Commander/amd-npu-utils.git
cd amd-npu-utils
sudo ./install.sh
```

## Quick Start

```sh
# Detect NPU hardware
npu-detect

# Check system health
npu-doctor

# Monitor NPU in real-time
npu-monitor

# Run performance benchmark
npu-benchmark --model kokoro-tts
```

## Example Output

### `npu-detect`

```
$ npu-detect
AMD NPU Detection Report
───────────────────────────────────────
✅ NPU Hardware: AMD Ryzen AI (Phoenix)
✅ NPU Driver:   amdxdna v1.2.0
✅ XRT Runtime:  2.17.722
✅ VitisAI:      3.5.0
✅ MLIR-AIE:     2024.2

NPU Capabilities:
  - INT8 Quantization: Supported
  - FP16 Precision: Supported
  - Max Frequency: 1.3 GHz
  - Memory: 32MB Tile Memory
  - Compute Units: 4 AIE Tiles

🎯 Optimization Level: Fully Optimized
```

### `npu-doctor`

```
$ npu-doctor
NPU System Health Check
───────────────────────────────────────
✅ Driver Installation: OK
✅ XRT Communication:   OK
✅ Memory Allocation:   OK
✅ Thermal Management:  OK
⚠️ Power Management:    Needs attention

Recommendations:
  - Enable NPU power profile: sudo npu-setup --power-profile performance
  - Update firmware: Latest available

Overall Status: 🟢 Ready for Production
```

### `npu-monitor`

```
$ npu-monitor
┌─ NPU Monitor ──────────────────────────┐
│ AMD Ryzen AI (Phoenix)                 │
├────────────────────────────────────────┤
│ Utilization: ████████░░ 82%            │
│ Temperature: 65°C (Normal)             │
│ Frequency:   1.28 GHz                  │
│ Memory:      24MB / 32MB               │
│ Power:       8.2W / 15W                │
└────────────────────────────────────────┘
Press 'q' to quit, 'r' to reset stats
```

### `npu-profile`

```
$ npu-profile --model my_model.onnx --input sample.wav
NPU Performance Profile
───────────────────────────────────────
Model: my_model.onnx
Input: sample.wav (3.2s audio)

Execution Breakdown:
┌──────────────────┬──────────┬─────────┐
│ Stage            │ Time     │ % Total │
├──────────────────┼──────────┼─────────┤
│ Model Loading    │ 125ms    │ 5.1%    │
│ Input Prep       │ 45ms     │ 1.8%    │
│ NPU Execution    │ 1.89s    │ 77.2%   │
│ Output Process   │ 394ms    │ 16.1%   │
└──────────────────┴──────────┴─────────┘

Performance Metrics:
  - Total Time: 2.45s
  - RTF: 0.765 (Real-Time Factor)
  - Throughput: 1.31x real-time
  - NPU Efficiency: 94.2%

🎯 Optimization Suggestions:
  - Consider INT8 quantization (-15% latency)
  - Batch processing for multiple inputs
```

### `npu-compile`

```
$ npu-compile --model kokoro.onnx --target phoenix --optimize
NPU Model Compilation
───────────────────────────────────────
Input Model:  kokoro.onnx
Target:       AMD Phoenix NPU
Optimization: Enabled

Compilation Steps:
✅ ONNX Validation
✅ Graph Optimization
✅ Quantization (INT8)
✅ MLIR-AIE Lowering
✅ NPU Kernel Generation
✅ Runtime Integration

Output: kokoro_npu.so
  - Size: 24.7MB (vs 89.2MB original)
  - Estimated Performance: 2.1x speedup
  - Memory Usage: 18MB NPU memory

Ready for deployment!
```

### `npu-test`

```
$ npu-test --comprehensive
NPU Test Suite (Comprehensive)
───────────────────────────────────────
Hardware Tests:
✅ NPU Detection
✅ Memory Allocation
✅ Frequency Scaling
✅ Thermal Monitoring

Driver Tests:
✅ Kernel Module Loading
✅ Device Communication
✅ XRT Interface
✅ Error Handling

Performance Tests:
✅ Matrix Multiplication
✅ Convolution Operations
✅ Memory Bandwidth
✅ Sustained Workload

Integration Tests:
✅ ONNX Runtime
✅ VitisAI Provider
✅ PyTorch Backend
✅ TensorFlow Lite

All tests passed! NPU ready for production.
```

### `npu-setup`

```
$ npu-setup --interactive
AMD NPU Setup Assistant
───────────────────────────────────────
Current Status: Not Configured

1. Install NPU drivers
2. Configure XRT runtime
3. Set up VitisAI
4. Install development tools
5. Optimize system settings

Select option (1-5): 1

Installing NPU drivers...
✅ Downloaded amdxdna-driver-1.2.0
✅ Compiled kernel module
✅ Installed and loaded driver
✅ Created device nodes

NPU driver installation complete!
Reboot required: No
Continue with XRT setup? (y/n): y
```

### `npu-optimize`

```
$ npu-optimize --system
NPU System Optimization
───────────────────────────────────────
Analyzing system configuration...

Optimizations Applied:
✅ CPU Governor: performance
✅ NPU Power Profile: high-performance
✅ Memory Settings: optimized for NPU
✅ IRQ Affinity: balanced
✅ Thermal Throttling: tuned

Performance Impact:
  - Expected NPU boost: +12%
  - Memory bandwidth: +8%
  - Thermal headroom: +15%

System optimized for NPU workloads!
```

## Tool Reference

### Setup & Diagnostics

| Tool | Purpose | Usage |
|---|---|---|
| `npu-detect` | Hardware detection | `npu-detect --verbose` |
| `npu-setup` | Driver installation | `npu-setup --auto` |
| `npu-doctor` | Health diagnostics | `npu-doctor --fix` |
| `npu-info` | System information | `npu-info --json` |

### Profiling & Monitoring

| Tool | Purpose | Usage |
|---|---|---|
| `npu-monitor` | Real-time monitoring | `npu-monitor --interval 1s` |
| `npu-profile` | Performance profiling | `npu-profile --model app.onnx` |
| `npu-benchmark` | Standard benchmarks | `npu-benchmark --suite ml` |
| `npu-optimize` | Optimization suggestions | `npu-optimize --model conv.onnx` |

### Development

| Tool | Purpose | Usage |
|---|---|---|
| `npu-compile` | Model compilation | `npu-compile --target phoenix` |
| `npu-test` | Testing framework | `npu-test --quick` |
| `npu-debug` | Debugging utilities | `npu-debug --trace execution` |
| `npu-validate` | Model validation | `npu-validate model.onnx` |
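The `--json` flags in the reference tables make the tools scriptable. Below is a hedged sketch of consuming `npu-info --json` output; the key names in the payload are assumptions modeled on the `npu-detect` report above, not a documented schema:

```python
import json
import subprocess

def read_npu_info(raw: str) -> dict:
    """Parse npu-info --json output. Key names are assumptions."""
    info = json.loads(raw)
    return {
        "driver": info.get("driver", "unknown"),
        "max_freq_ghz": float(info.get("max_frequency_ghz", 0.0)),
        "tile_memory_mb": int(info.get("tile_memory_mb", 0)),
    }

def query_npu_info() -> dict:
    """Invoke the CLI directly (requires amd-npu-utils to be installed)."""
    out = subprocess.run(
        ["npu-info", "--json"], capture_output=True, text=True, check=True
    ).stdout
    return read_npu_info(out)

# Offline example with a sample payload echoing the npu-detect report:
sample = '{"driver": "amdxdna v1.2.0", "max_frequency_ghz": 1.3, "tile_memory_mb": 32}'
print(read_npu_info(sample)["max_freq_ghz"])  # -> 1.3
```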
## Supported Hardware

- **AMD Ryzen AI (Phoenix)** - Ryzen 7040/8040 series
  - AIE-ML tiles with 32MB memory
  - 1.3 GHz max frequency
  - Full toolchain support
- **AMD Ryzen AI (Strix Point)** - Ryzen AI 300 series
  - Enhanced AIE2 architecture
  - 50 TOPS AI performance
  - Advanced quantization support
- **Next-gen AMD NPUs** - Forward compatibility planned
- **Multi-NPU Systems** - Cluster management tools
- **Cloud NPU Instances** - Remote development support
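Scripts that branch on NPU generation can encode the support matrix above as data. The sketch below uses only the figures quoted in this README; the type and function names are illustrative, and fields the README does not quote are zeroed:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NpuSpec:
    family: str          # marketing family, e.g. "Ryzen 7040/8040"
    tile_memory_mb: int  # on-chip AIE tile memory (0 = not quoted here)
    max_freq_ghz: float  # maximum clock (0.0 = not quoted here)
    tops: int            # AI throughput in TOPS (0 = not quoted here)

# Figures taken from the supported-hardware list above.
NPU_SPECS = {
    "phoenix": NpuSpec("Ryzen 7040/8040", 32, 1.3, 0),
    "strix_point": NpuSpec("Ryzen AI 300", 0, 0.0, 50),
}

def spec_for(target: str) -> NpuSpec:
    try:
        return NPU_SPECS[target.lower()]
    except KeyError:
        raise ValueError(f"unsupported NPU target: {target}") from None

print(spec_for("phoenix").tile_memory_mb)  # -> 32
```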
## Installed Layout

```
/usr/local/bin/
├── npu-detect       # Hardware detection
├── npu-setup        # Installation assistant
├── npu-doctor       # Health diagnostics
├── npu-info         # System information
├── npu-monitor      # Real-time monitoring
├── npu-profile      # Performance profiling
├── npu-benchmark    # Benchmarking suite
├── npu-optimize     # Optimization tools
├── npu-compile      # Model compilation
├── npu-test         # Testing framework
├── npu-debug        # Debugging utilities
└── npu-validate     # Model validation

/usr/local/share/npu-utils/
├── configs/         # Configuration templates
├── scripts/         # Helper scripts
├── benchmarks/      # Benchmark datasets
└── docs/            # Documentation
```
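The layout above puts twelve executables on `PATH`. After installing, a few lines of shell can sanity-check that nothing is missing (this script is illustrative, not part of the toolkit):

```shell
#!/bin/sh
# Verify that every amd-npu-utils tool from the layout above is on PATH.
TOOLS="npu-detect npu-setup npu-doctor npu-info npu-monitor npu-profile \
npu-benchmark npu-optimize npu-compile npu-test npu-debug npu-validate"

missing=0
total=0
for t in $TOOLS; do
    total=$((total + 1))
    if ! command -v "$t" >/dev/null 2>&1; then
        echo "missing: $t"
        missing=$((missing + 1))
    fi
done
echo "checked $total tools, $missing missing"
```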
## Common Workflows

```sh
# Complete NPU setup on a fresh system
npu-detect                            # Check hardware
npu-setup --auto                      # Install drivers
npu-doctor --fix                      # Fix any issues
npu-optimize --system                 # Optimize performance
npu-test --quick                      # Verify installation
```

```sh
# Develop and optimize an ML model for the NPU
npu-validate my_model.onnx            # Check compatibility
npu-compile --optimize my_model.onnx  # Compile for NPU
npu-profile my_model_npu.so           # Profile performance
npu-optimize --model my_model_npu.so  # Get suggestions
```

```sh
# Monitor NPU in production
npu-monitor --log /var/log/npu.log    # Log metrics
npu-benchmark --continuous            # Continuous testing
npu-doctor --alerts                   # Health alerts
```

## Contributing

We welcome contributions! Areas where help is needed:
- **New NPU Hardware Support** - Detection and optimization
- **Performance Tools** - Advanced profiling and analysis
- **Integration Examples** - Framework-specific guides
- **Documentation** - User guides and tutorials
See CONTRIBUTING.md for details.
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Related Projects

- **Magic Unicorn TTS** - NPU-optimized TTS
- **NPU Prebuilds** - Pre-compiled components
- **MLIR-AIE** - Upstream AIE compiler

---

Powered by Unicorn Commander 🦄
*Making NPU development accessible to everyone*