This project is a Hindi-language chatbot designed to deliver quick, accurate, and contextually relevant responses in Hindi. It uses a Retrieval-Augmented Generation (RAG) pipeline: passages relevant to the user's question are retrieved from a vector database and passed to a language model, which generates the answer from them.
The frontend is built with React and the backend with FastAPI; the two communicate over HTTP.
- Frontend: Hosted on Vercel
- Backend: Hosted on Render
- Frontend Framework: React
- Frontend Hosting Platform: Vercel
- Backend Framework: FastAPI
- Architecture: RAG-based chatbot
- Database: Pinecone Vector Database
- LLM Model: Google Gemini
- Embedding Model: intfloat/multilingual-e5-large
- OCR Engine: pytesseract for accurate Hindi text extraction and proper encoding
The chatbot's database has been built using Jain Vidya Books, specifically Part 3. Text data is processed using OCR and embedded into the Pinecone vector database for semantic search and retrieval.
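The ingestion step described above (OCR text, then embeddings, then vector records) could look roughly like the sketch below. It is not the project's actual code: `chunk_text`, `build_records`, the chunk sizes, and the record layout are illustrative assumptions, and the embedding function is passed in so the example runs without downloading a model. The one detail taken from the E5 model family is the `passage: ` prefix it expects on stored passages.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical type: any function that maps a list of strings to a list of vectors.
Embedder = Callable[[Sequence[str]], List[List[float]]]


def chunk_text(text: str, max_chars: int = 500, overlap: int = 50) -> List[str]:
    """Split OCR output into overlapping character windows.

    Assumption: the input is plain text, e.g. what
    pytesseract.image_to_string(image, lang="hin") would return for a page.
    Character windows can cut through a Devanagari syllable; a real pipeline
    might split on sentence boundaries (the danda) instead.
    """
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    cleaned = " ".join(text.split())  # collapse line breaks and runs of spaces
    step = max_chars - overlap
    chunks: List[str] = []
    start = 0
    while start < len(cleaned):
        chunks.append(cleaned[start:start + max_chars])
        if start + max_chars >= len(cleaned):
            break  # the window already reached the end of the text
        start += step
    return chunks


def build_records(chunks: Sequence[str], embed: Embedder, source: str) -> List[Dict]:
    """Embed chunks and shape them as id/values/metadata records.

    E5 models expect stored passages to carry a "passage: " prefix.
    The dict layout mirrors the shape commonly used for vector-database
    upserts; the field names here are an assumption, not the project's schema.
    """
    vectors = embed(["passage: " + chunk for chunk in chunks])
    return [
        {
            "id": f"{source}-{i}",
            "values": vector,
            "metadata": {"text": chunk, "source": source},
        }
        for i, (chunk, vector) in enumerate(zip(chunks, vectors))
    ]
```

In a real run, `embed` would wrap the `intfloat/multilingual-e5-large` model and the returned records would be written to the Pinecone index; keeping the original chunk text in the metadata lets the retrieval step hand it directly to the LLM.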
- Conversational interface entirely in Hindi
- Understands and answers multiple-choice questions (MCQs) in Hindi
- RAG-based design for contextually rich answers
- High-speed, efficient response generation
- Multilingual embedding model for robust text understanding
- Accurate Hindi OCR processing using pytesseract
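As a rough illustration of the RAG-based design listed above, the query path can be sketched as: embed the question, rank stored passages by similarity, and build a Hindi prompt for the LLM. Everything here is an assumption for illustration: the function names, the prompt wording, and the in-memory cosine ranking, which stands in for a Pinecone query. The `embed` and `generate` callables are injected placeholders for the embedding model and Google Gemini. The `query: ` prefix is what E5 models expect on search queries.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: Sequence[float], records: List[Dict], top_k: int = 3) -> List[str]:
    """Return the text of the top_k most similar records.

    In-memory stand-in for a vector-database query; each record is assumed
    to look like {"values": [...], "metadata": {"text": "..."}}.
    """
    ranked = sorted(records, key=lambda r: cosine(query_vec, r["values"]), reverse=True)
    return [r["metadata"]["text"] for r in ranked[:top_k]]


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Assemble a Hindi prompt that asks the model to answer from the context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "नीचे दिए गए संदर्भ के आधार पर प्रश्न का उत्तर हिंदी में दीजिए।\n"
        f"संदर्भ:\n{context}\n"
        f"प्रश्न: {question}\n"
        "उत्तर:"
    )


def answer(
    question: str,
    embed: Callable[[Sequence[str]], List[List[float]]],
    records: List[Dict],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve context for the question, then ask the LLM to answer."""
    query_vec = embed(["query: " + question])[0]  # E5 query prefix
    passages = retrieve(query_vec, records, top_k)
    return generate(build_prompt(question, passages))
```

Passing `embed` and `generate` as arguments keeps the flow testable with stubs; for MCQs, the options would simply be included in `question` so they appear in the prompt.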