This project is a full-stack application that processes audio files for transcription and summarization. The backend is a Python-based API using Docker, while the frontend is a React application that allows users to record audio, submit it to the API, and view or download the results.
This project is a full-stack application that processes audio files for transcription and summarization. The backend is a Python-based API using Docker, while the frontend is a React application that allows users to record audio, submit it to the API, and view or download the results.
- Audio Transcription: The backend converts audio files to text using OpenAI's Whisper model.
- Text Summarization: Summarizes the transcribed text using GPT-4.
- Text-to-Speech: Converts the summarized text into an MP3 file.
- Supports Various Audio Formats: Handles
.mp3,.mp4,.mpeg,.mpga,.m4a,.wav,.webm, and more. - Frontend Interface: The React frontend allows users to record audio, submit it for processing, and view the results.
- Docker: Ensure Docker is installed on your system. You can download it from here.
- Node.js and npm: Ensure Node.js and npm are installed on your system. You can download them from here.
-
Backend (Python API)
flask_app.py: Main Flask application file.main.py: Core logic for audio processing, transcription, and summarization.requirements.txt: Python dependencies for the project.Dockerfile: Docker configuration file for containerizing the application.
-
Frontend (React Application)
src/axiosInstance.js: Configures the Axios instance to communicate with the backend API.src/AudioProcessorForm.js: Main component that handles audio recording, submission, and displaying results.src/index.js: Entry point for the React application.src/index.css: Basic styling for the application.
git clone https://github.com/Goutcho/React-Audio-Summarizer.git
cd React-Audio-SummarizerNavigate to the backend directory and build the Docker image:
cd audio-summary-api
docker build -t audio-processor-backend .Run the Docker container, exposing it on port 8000:
docker run -d -p 8000:8000 audio-processor-backendNavigate to the frontend directory and install dependencies:
cd ../audio-summary-api
npm install --legacy-peer-depsStart the React application:
npm startThe React application will run on http://localhost:3000 and interact with the API running on http://localhost:8000.
- API Key: Enter your OpenAI API key in the input field on the React frontend.
- Record Audio: Use the frontend interface to start and stop recording audio.
- Submit: Click "Process Audio" to submit the recorded audio to the backend API.
- View Results: The transcribed and summarized text will be displayed on the frontend, with an option to download the summarized text as an MP3 file.
- POST
/process-audio: Upload an audio file for transcription and summarization.- Parameters:
api_key: Your OpenAI API key.audio: The audio file to be processed.
- Response:
transcribed_text: The transcribed text from the audio.summarized_text: The summarized text.download_link: Link to download the summarized text as an MP3 file.transcription_file: Link to download the transcribed text as a.txtfile.summary_file: Link to download the summarized text as a.txtfile.
- Parameters:
To stop the running Docker container:
docker ps # Find the container ID
docker stop <container_id>To stop the React application, simply close the terminal or stop the running process with Ctrl+C.
- This application is configured for development use. For production, consider using a production-grade WSGI server like
gunicornfor the backend and optimizing the frontend build. - Ensure your OpenAI API key is secure and not exposed in public repositories.
This project is licensed under the MIT License.
