A powerful desktop app that transforms video content into interactive language learning materials.
Features • Installation • Usage • Configuration • Development
- YouTube - Extract audio from any YouTube video
- Bilibili - Support for Bilibili videos
- Local Files - Process local audio/video files
- Microphone - Record directly (coming soon)
- Speech Recognition - Azure OpenAI Whisper for accurate transcription
- Smart Translation - Context-aware translation with GPT
- Grammar Annotations - Detailed grammar explanations for each sentence
- Vocabulary Highlights - Key words with meanings and usage examples
- Side-by-side Format - Original text paired with translation
- Grammar Notes - Key grammatical structures explained
- Vocabulary Tables - Consolidated word lists for review
- Word Format - Professional .docx output for easy printing
- Fluent Design - Microsoft-inspired dark theme
- Card Selection - Intuitive clickable cards (no boring radio buttons!)
- Timeline Log - Visual activity tracking with status icons
- Real-time Progress - Live updates during processing
- Download `LanguageLearner_Setup_v1.0.0.exe` from Releases
- Run the installer
- Launch from desktop or Start menu
```bash
# Clone the repository
git clone https://github.com/hancao/language-learner.git
cd language-learner

# Install dependencies
pip install -r requirements.txt

# Install yt-dlp and ffmpeg
pip install yt-dlp
# Download ffmpeg from https://ffmpeg.org and add it to PATH

# Run the application
python gui_fluent.py
```

- Create an Azure OpenAI resource at the Azure Portal
- Deploy these models:
  - Whisper - For speech recognition
  - GPT-4 or GPT-3.5 - For translation and annotation
- Get your endpoint URL and API key
- Enter credentials in the app's Settings panel
On first launch, you'll be prompted to configure the following (a minimal connection sketch follows this list):
- Endpoint URL
- API Key
- Whisper deployment name
- GPT deployment name
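To verify these values before processing a full video, here is a minimal sketch using the `openai` Python package (already in requirements.txt). The endpoint, key, and deployment names are placeholders, not values shipped with the app:

```python
from openai import AzureOpenAI

# Placeholder credentials - substitute the endpoint, key, and
# deployment names from your own Azure OpenAI resource.
client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com/",
    api_key="YOUR_API_KEY",
    api_version="2024-02-01",
)

# Transcribe a short clip with your Whisper deployment
with open("sample.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="my-whisper-deployment",
        file=f,
    )

# Translate a test sentence with your GPT deployment
reply = client.chat.completions.create(
    model="my-gpt-deployment",
    messages=[{"role": "user", "content": "Translate to French: Hello!"}],
)

print(transcript.text)
print(reply.choices[0].message.content)
```

If both calls succeed, the same credentials will work in the app's Settings panel.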
- Select Input Source - Click on YouTube, Bilibili, Local File, or Microphone
- Enter URL/Select File - Paste video URL or browse for local file
- Choose Target Language - Select your learning language
- Start Processing - Click the purple button and wait
- Open Document - Click the output file link when complete
| Language | Native Name |
|---|---|
| Chinese | 🇨🇳 中文 |
| English | 🇺🇸 English |
| Japanese | 🇯🇵 日本語 |
| Korean | 🇰🇷 한국어 |
| Spanish | 🇪🇸 Español |
| French | 🇫🇷 Français |
```
language-learner/
├── gui_fluent.py          # Main UI application
├── audio_extractor.py     # Audio extraction from various sources
├── transcriber.py         # Whisper API integration
├── translator.py          # GPT translation with optimizations
├── doc_generator.py       # Word document generation
├── requirements.txt       # Python dependencies
├── LanguageLearner.spec   # PyInstaller configuration
└── LanguageLearner.iss    # Inno Setup installer script
```
```bash
# Build standalone executable
pyinstaller LanguageLearner.spec

# Build installer (requires Inno Setup)
"C:\Program Files (x86)\Inno Setup 6\ISCC.exe" LanguageLearner.iss
```

```
customtkinter>=5.0.0
openai>=1.0.0
python-docx>=0.8.11
yt-dlp>=2023.0.0
Pillow>=9.0.0
scipy>=1.9.0
sounddevice>=0.4.0
```
The translator module includes several optimizations; a code sketch follows the example below:
| Feature | Description |
|---|---|
| Segment Merging | Combines short segments to reduce API calls |
| Batch Processing | Processes 40 segments per API call |
| Parallel Execution | Uses 3 concurrent threads |
| Auto Retry | Handles rate limits with exponential backoff |
For a video with 750 segments:
- Before: ~76 API calls, ~20 minutes
- After: ~10 API calls, ~5 minutes
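A minimal sketch of the batching, parallelism, and retry pattern described above; the function names and prompt are illustrative, not the actual translator.py API:

```python
import time
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 40   # segments per API call
MAX_WORKERS = 3   # concurrent threads

def translate_batch(client, deployment, segments, retries=5):
    """Translate one batch of segments, retrying with exponential backoff."""
    prompt = "\n".join(f"{i}. {s}" for i, s in enumerate(segments, 1))
    for attempt in range(retries):
        try:
            resp = client.chat.completions.create(
                model=deployment,
                messages=[
                    {"role": "system",
                     "content": "Translate each numbered line, keeping the numbering."},
                    {"role": "user", "content": prompt},
                ],
            )
            return resp.choices[0].message.content.splitlines()
        except Exception:
            time.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
    raise RuntimeError("Batch failed after all retries")

def translate_all(client, deployment, segments):
    """Split segments into batches and translate them on a small thread pool."""
    batches = [segments[i:i + BATCH_SIZE]
               for i in range(0, len(segments), BATCH_SIZE)]
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        results = pool.map(lambda b: translate_batch(client, deployment, b), batches)
    return [line for batch in results for line in batch]
```

Segment merging (not shown) runs before batching, which is why the example above drops to ~10 calls rather than the ~19 that batch size alone would give for 750 segments.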
- Ensure yt-dlp is installed: `pip install yt-dlp`
- Check that ffmpeg is in PATH or installed at `C:\ffmpeg\`
- Verify the video URL is accessible
- Check your Azure OpenAI endpoint URL (it should NOT include the deployment path; see the example below)
- Verify API key is correct
- Ensure deployment names match your Azure configuration
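The resource name below is a placeholder; what matters is the shape of the URL:

```python
# Correct - bare resource endpoint (the SDK adds the deployment path itself)
endpoint = "https://my-resource.openai.azure.com/"

# Wrong - deployment path included
# endpoint = "https://my-resource.openai.azure.com/openai/deployments/my-gpt/chat/completions"
```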
- The app requires Windows 10 or later
- CustomTkinter needs a graphical display (headless environments are not supported)
MIT License - feel free to use, modify, and distribute.
- CustomTkinter - Modern UI toolkit
- yt-dlp - Video downloading
- ffmpeg - Audio processing
- Azure OpenAI - AI APIs
Built with ❤️ and AI assistance
Vibe Coding Project - December 2024