A real-time speech-to-text transcription and summarization tool that converts spoken words into organized notes.
- 🎙️ Real-time audio transcription using Google Cloud Speech-to-Text
- 📝 Automatic summarization using Facebook's BART model
- 🖥️ React frontend with Vite for fast development
- ⚡ FastAPI backend for efficient processing
- 📱 Responsive UI with Tailwind CSS
- Audio Capture: The frontend captures microphone input using the Web Audio API
- Stream Processing: Audio chunks are sent to the backend via Socket.IO
- Transcription: Google Cloud Speech-to-Text converts speech to text
- Summarization: BART model generates concise bullet-point notes
- Display: Results are shown in a tabbed interface (live captions/summary)
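
The steps above can be sketched as a small loop on the backend side. This is a minimal illustration, not the project's actual code: `run_pipeline`, the `min_words` threshold, and the event tuples are hypothetical names, and the `transcribe` and `summarize` callables are stand-ins for the Google Cloud Speech-to-Text and BART calls.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# (kind, payload): kind is "caption" for live text or "summary" for notes.
Event = Tuple[str, str]


def run_pipeline(
    audio_chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],   # assumed wrapper around speech-to-text
    summarize: Callable[[str], str],      # assumed wrapper around the BART model
    min_words: int = 40,                  # hypothetical threshold, tune as needed
) -> Iterator[Event]:
    """Yield a caption per audio chunk and a summary every `min_words` words."""
    pending: List[str] = []  # captions that have not been summarized yet
    for chunk in audio_chunks:
        text = transcribe(chunk).strip()
        if not text:
            continue  # silence or an empty recognition result
        yield ("caption", text)
        pending.append(text)
        if sum(len(t.split()) for t in pending) >= min_words:
            yield ("summary", summarize(" ".join(pending)))
            pending.clear()
    if pending:
        # Flush the remaining captions when the audio stream ends.
        yield ("summary", summarize(" ".join(pending)))


if __name__ == "__main__":
    # Fake services so the sketch runs without credentials or model weights.
    chunks = [b"welcome to the meeting", b"", b"today we review the roadmap"]
    for kind, payload in run_pipeline(
        chunks,
        transcribe=lambda c: c.decode("utf-8"),
        summarize=lambda t: "- " + t,
        min_words=6,
    ):
        print(kind, payload)
```

In a real deployment each event would presumably be emitted back to the client over Socket.IO, with captions feeding the live tab and summaries feeding the notes tab.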
- React
- Socket.IO client
- Web Audio API
- FastAPI
- Google Cloud Speech-to-Text
- HuggingFace Transformers (BART model)
- Socket.IO server
- Uvicorn ASGI server
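
BART checkpoints accept a bounded input length (for example, `facebook/bart-large-cnn` is limited to 1024 tokens), so a long transcript usually has to be split before summarization. The helpers below are a hedged sketch of that pre- and post-processing. The function names and the 400-word default are assumptions, the word count is only a rough proxy for tokens, and this README does not state which checkpoint the project uses.

```python
import re
from typing import List


def split_into_chunks(text: str, max_words: int = 400) -> List[str]:
    """Split a transcript into pieces of at most `max_words` words each."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def to_bullets(summary: str) -> List[str]:
    """Turn a summary paragraph into one bullet per sentence."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", summary.strip())
    return ["- " + s for s in sentences if s]


if __name__ == "__main__":
    for piece in split_into_chunks("one two three four five", max_words=2):
        print(piece)
    for bullet in to_bullets("Budget approved. Launch moves to May."):
        print(bullet)
```

Each chunk would be passed to the summarizer separately, and the resulting bullets concatenated for the summary tab.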