Language Learning Assistant - Transform video content into interactive learning materials with AI-powered transcription, translation, and grammar annotations

haanc/language-learner

Language Learning Assistant 🎓

A powerful desktop app that transforms video content into interactive language learning materials.

Features | Installation | Usage | Configuration | Development


✨ Features

🎬 Multi-Source Input

  • YouTube - Extract audio from any YouTube video
  • Bilibili - Support for Bilibili videos
  • Local Files - Process local audio/video files
  • Microphone - Record directly (coming soon)

🤖 AI-Powered Processing

  • Speech Recognition - Azure OpenAI Whisper for accurate transcription
  • Smart Translation - Context-aware translation with GPT
  • Grammar Annotations - Detailed grammar explanations for each sentence
  • Vocabulary Highlights - Key words with meanings and usage examples

📄 Learning Documents

  • Side-by-side Format - Original text paired with translation
  • Grammar Notes - Key grammatical structures explained
  • Vocabulary Tables - Consolidated word lists for review
  • Word Format - Professional .docx output for easy printing

🎨 Modern UI

  • Fluent Design - Microsoft-inspired dark theme
  • Card Selection - Intuitive clickable cards (no boring radio buttons!)
  • Timeline Log - Visual activity tracking with status icons
  • Real-time Progress - Live updates during processing

📥 Installation

Option 1: Windows Installer (Recommended)

  1. Download LanguageLearner_Setup_v1.0.0.exe from Releases
  2. Run the installer
  3. Launch from desktop or Start menu

Option 2: Run from Source

# Clone the repository
git clone https://github.com/hancao/language-learner.git
cd language-learner

# Install dependencies
pip install -r requirements.txt

# Install yt-dlp and ffmpeg
pip install yt-dlp
# Download ffmpeg from https://ffmpeg.org and add to PATH

# Run the application
python gui_fluent.py

⚙️ Configuration

Azure OpenAI Setup

  1. Create an Azure OpenAI resource at Azure Portal
  2. Deploy these models:
    • Whisper - For speech recognition
    • GPT-4 or GPT-3.5 - For translation and annotation
  3. Get your endpoint URL and API key
  4. Enter credentials in the app's Settings panel

First Run

On first launch, you'll be prompted to configure:

  • Endpoint URL
  • API Key
  • Whisper deployment name
  • GPT deployment name

🚀 Usage

  1. Select Input Source - Click on YouTube, Bilibili, Local File, or Microphone
  2. Enter URL/Select File - Paste video URL or browse for local file
  3. Choose Target Language - Select the language you are learning
  4. Start Processing - Click the purple button and wait for processing to finish
  5. Open Document - Click the output file link when complete

Supported Languages

Language    Native Name
Chinese     🇨🇳 中文
English     🇺🇸 English
Japanese    🇯🇵 日本語
Korean      🇰🇷 한국어
Spanish     🇪🇸 Español
French      🇫🇷 Français

🛠️ Development

Project Structure

language-learner/
├── gui_fluent.py          # Main UI application
├── audio_extractor.py     # Audio extraction from various sources
├── transcriber.py         # Whisper API integration
├── translator.py          # GPT translation with optimizations
├── doc_generator.py       # Word document generation
├── requirements.txt       # Python dependencies
├── LanguageLearner.spec   # PyInstaller configuration
└── LanguageLearner.iss    # Inno Setup installer script

Building from Source

# Build standalone executable
pyinstaller LanguageLearner.spec

# Build installer (requires Inno Setup)
"C:\Program Files (x86)\Inno Setup 6\ISCC.exe" LanguageLearner.iss

Dependencies

customtkinter>=5.0.0
openai>=1.0.0
python-docx>=0.8.11
yt-dlp>=2023.0.0
Pillow>=9.0.0
scipy>=1.9.0
sounddevice>=0.4.0

📊 Performance

The translator module includes several optimizations:

Feature               Description
Segment Merging       Combines short segments to reduce API calls
Batch Processing      Processes 40 segments per API call
Parallel Execution    Uses 3 concurrent threads
Auto Retry            Handles rate limits with exponential backoff

For a video with 750 segments:

  • Before optimization: ~76 API calls, ~20 minutes
  • After optimization: ~10 API calls, ~5 minutes
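
The four optimizations in the table can be sketched roughly as follows. This is a minimal illustration, not the actual code in translator.py: the function names, the RateLimitError class, the min_chars threshold, and the retry limits are all assumptions made for the example. Only the batch size of 40 and the 3 worker threads come from the table.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 40   # segments per API call (from the table)
MAX_WORKERS = 3   # concurrent threads (from the table)


class RateLimitError(Exception):
    """Hypothetical stand-in for the API client's rate-limit error."""


def merge_short_segments(segments, min_chars=40):
    """Join short segments onto their neighbours (min_chars is an assumed threshold)."""
    merged, buffer = [], ""
    for text in segments:
        buffer = f"{buffer} {text}".strip()
        if len(buffer) >= min_chars:
            merged.append(buffer)
            buffer = ""
    if buffer:  # keep whatever is left over at the end
        merged.append(buffer)
    return merged


def make_batches(items, size=BATCH_SIZE):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def with_retry(func, batch, max_attempts=5, base_delay=1.0):
    """Call func(batch), doubling the wait after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return func(batch)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff plus a little jitter
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))


def translate_all(segments, translate_batch, base_delay=1.0):
    """Merge, batch, and translate in parallel; results keep the input order."""
    batches = make_batches(merge_short_segments(segments))
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        results = pool.map(
            lambda b: with_retry(translate_batch, b, base_delay=base_delay),
            batches,
        )
        return [line for batch in results for line in batch]
```

Here translate_batch is any callable that takes a list of strings and returns the translated list. At 40 segments per call, 750 unmerged segments would still need 19 calls, so the ~10 figure presumably depends on segment merging shrinking the list first.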

🐛 Troubleshooting

Audio Extraction Failed

  • Ensure yt-dlp is installed: pip install yt-dlp
  • Check that ffmpeg is on PATH or installed at C:\ffmpeg\
  • Verify the video URL is accessible
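
The first two checks can be automated with a short script like the one below. It is a sketch that uses only the Python standard library and is not part of the app. The helper names are hypothetical, and looking in C:\ffmpeg\bin as well as C:\ffmpeg is an assumption about where the executable might sit.

```python
import shutil
from pathlib import Path


def find_tool(name, fallback_dirs=()):
    """Return the path to an executable, or None if it cannot be found."""
    found = shutil.which(name)  # searches the directories on PATH
    if found:
        return found
    for directory in fallback_dirs:
        for candidate in (Path(directory) / name, Path(directory) / f"{name}.exe"):
            if candidate.is_file():
                return str(candidate)
    return None


def check_prerequisites():
    """Map each required tool to its location (None means missing)."""
    return {
        "yt-dlp": find_tool("yt-dlp"),
        # C:\ffmpeg\bin is an assumed extra location, not confirmed by the app
        "ffmpeg": find_tool("ffmpeg", fallback_dirs=[r"C:\ffmpeg", r"C:\ffmpeg\bin"]),
    }


if __name__ == "__main__":
    for tool, path in check_prerequisites().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```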

API Errors

  • Check your Azure OpenAI endpoint URL (should NOT include deployment path)
  • Verify API key is correct
  • Ensure deployment names match your Azure configuration
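
A common cause of the first error is pasting a full request URL instead of the resource endpoint. Azure OpenAI request URLs contain an /openai/deployments/... path after the host, and that path should be dropped before the value is entered in Settings. The helper below is a hypothetical sketch of that cleanup using only the standard library. It is not the app's own validation, and the example hostname is made up.

```python
from urllib.parse import urlparse


def has_deployment_path(url):
    """True if the URL still carries an Azure deployment path."""
    return "/openai/deployments/" in urlparse(url.strip()).path


def normalize_endpoint(url):
    """Reduce a pasted URL to the bare https://host/ endpoint."""
    parsed = urlparse(url.strip())
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"Not a valid https endpoint: {url!r}")
    # Drop path, query string, and fragment; keep only scheme and host
    return f"{parsed.scheme}://{parsed.netloc}/"


if __name__ == "__main__":
    pasted = "https://example.openai.azure.com/openai/deployments/whisper/audio/transcriptions"
    print(has_deployment_path(pasted))
    print(normalize_endpoint(pasted))
```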

UI Display Issues

  • The app requires Windows 10 or later
  • CustomTkinter needs a graphics display

📄 License

MIT License - feel free to use, modify, and distribute.

🙏 Acknowledgments


Built with ❤️ and AI assistance

Vibe Coding Project - December 2024
