RAG Chat Web App

A Flask-based web application for chatting with a local LLM (via Ollama), enhanced with Retrieval-Augmented Generation (RAG) over your PDF documents, with ChromaDB as the vector store.

Features

  • Local LLM Integration: Uses Ollama to run local language models
  • RAG Enhancement: Retrieves relevant context from PDF documents to improve responses
  • Vector Storage: Uses ChromaDB for efficient document storage and retrieval
  • Streaming Responses: Real-time streaming of LLM responses
  • Chat History: Persistent chat history with download functionality
  • Modern UI: Clean, responsive web interface

Prerequisites

  1. Python 3.8+
  2. Ollama installed and running locally
    • Download from: https://ollama.ai/
    • Install and start Ollama service
    • Pull a model: ollama pull llama3.2 (or any other model)

Installation

  1. Clone or download this repository

  2. Navigate to the RAG directory:

     cd RAG

  3. Install Python dependencies:

     pip install -r requirements.txt

  4. Add your PDF documents:

     • Place your PDF files in the documents/ folder
     • The app will automatically process them on first run

Usage

  1. Start the Flask server:

     python server.py

  2. Open your web browser and go to:

     http://localhost:5001

  3. Start chatting!

     • Type your questions in the chat interface
     • The app will retrieve relevant context from your PDF documents
     • Responses are enhanced with the retrieved information

How It Works

Document Processing

  • PDFs in the documents/ folder are automatically loaded and chunked
  • Text chunks are embedded using HuggingFace's sentence-transformers/all-MiniLM-L6-v2
  • Embeddings are stored in ChromaDB for fast retrieval (see the sketch after this list)
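
As a rough illustration of that ingestion step, the sketch below uses PyPDF2, sentence-transformers, and the chromadb client directly. It is an assumption about the wiring, not a copy of rag_utils.py; the ingest_pdfs function and the "rag_docs" collection name are made up for the example.

from pathlib import Path

import chromadb
from PyPDF2 import PdfReader
from sentence_transformers import SentenceTransformer

EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"

def chunk_text(text, chunk_size=1000, chunk_overlap=100):
    """Split text into overlapping character chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - chunk_overlap
    return chunks

def ingest_pdfs(docs_dir="documents", db_dir="chroma_db"):
    """Hypothetical ingestion: read PDFs, chunk, embed, store in ChromaDB."""
    embedder = SentenceTransformer(EMBED_MODEL)
    client = chromadb.PersistentClient(path=db_dir)
    collection = client.get_or_create_collection("rag_docs")
    for pdf_path in Path(docs_dir).glob("*.pdf"):
        text = "".join(page.extract_text() or "" for page in PdfReader(str(pdf_path)).pages)
        chunks = chunk_text(text)
        if not chunks:
            continue
        collection.add(
            ids=[f"{pdf_path.stem}-{i}" for i in range(len(chunks))],
            documents=chunks,
            embeddings=embedder.encode(chunks).tolist(),
        )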

RAG Pipeline

  1. User sends a question
  2. System retrieves relevant document chunks from ChromaDB
  3. Context is combined with the user's question
  4. Enhanced prompt is sent to Ollama LLM
  5. Response is streamed back to the user (a code sketch of this flow follows)
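
A minimal sketch of that query path, reusing the collection and embedder from the ingestion sketch above and Ollama's standard /api/generate endpoint. The answer function and prompt wording are illustrative, not the actual server.py code.

import json
import requests

def answer(question, collection, embedder, k=4, model="llama3.2"):
    """Hypothetical RAG query: retrieve context, build prompt, stream from Ollama."""
    # Steps 1-2: embed the question and pull the k most similar chunks from ChromaDB.
    query_embedding = embedder.encode([question]).tolist()
    results = collection.query(query_embeddings=query_embedding, n_results=k)
    context = "\n\n".join(results["documents"][0])

    # Step 3: combine the retrieved context with the user's question.
    prompt = f"Answer using the context below.\n\nContext:\n{context}\n\nQuestion: {question}"

    # Steps 4-5: send the enhanced prompt to Ollama and stream the reply back.
    payload = {"model": model, "prompt": prompt, "stream": True}
    with requests.post("http://localhost:11434/api/generate", json=payload, stream=True) as r:
        for line in r.iter_lines():
            if line:
                yield json.loads(line).get("response", "")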

File Structure

RAG/
├── server.py              # Flask web server
├── rag_utils.py           # RAG utilities (ChromaDB, embeddings)
├── index.html             # Web interface
├── requirements.txt       # Python dependencies
├── documents/            # PDF files for RAG
│   ├── document1.pdf
│   ├── document2.pdf
│   └── ...
├── chroma_db/           # ChromaDB storage (auto-created)
└── chat_history.json    # Chat history storage

Configuration

Changing the LLM Model

Edit server.py and modify the model name:

payload = {
    'model': 'your-model-name',  # Change this
    'prompt': prompt,
    'stream': True
}

Adjusting RAG Parameters

In rag_utils.py, you can modify:

  • chunk_size: Size of text chunks (default: 1000)
  • chunk_overlap: Overlap between chunks (default: 100)
  • k: Number of retrieved documents (default: 4); see the example below
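
If rag_utils.py splits text with LangChain (listed under Dependencies; the exact import path varies by LangChain version), the first two knobs typically sit on the text splitter and k on the retrieval call, roughly like this:

from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # characters per chunk
    chunk_overlap=100,  # characters shared between consecutive chunks
)
chunks = splitter.split_text("replace this string with the extracted PDF text")

# At query time, n_results plays the role of k (number of retrieved chunks):
# results = collection.query(query_embeddings=query_embedding, n_results=4)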

Embedding Model

Change the embedding model in rag_utils.py:

EMBED_MODEL = "your-embedding-model"
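
For example, to try a larger sentence-transformers model (assuming rag_utils.py loads EMBED_MODEL through sentence-transformers or a HuggingFace embeddings wrapper):

from sentence_transformers import SentenceTransformer

EMBED_MODEL = "sentence-transformers/all-mpnet-base-v2"  # larger alternative to all-MiniLM-L6-v2
embedder = SentenceTransformer(EMBED_MODEL)
print(embedder.encode(["a quick test sentence"]).shape)  # (1, 768) for this model

Note that after switching embedding models you will generally need to delete the chroma_db/ folder and let the app re-index, since vectors produced by a different model have a different dimensionality.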

Troubleshooting

Ollama Connection Issues

  • Ensure Ollama is running: ollama serve
  • Check if your model is available: ollama list
  • Verify the Ollama API is accessible at http://localhost:11434 (see the check below)
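
A quick way to check the last two points from Python (Ollama's /api/tags endpoint lists the locally available models):

import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
print("Ollama is up. Models:", [m["name"] for m in resp.json().get("models", [])])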

PDF Processing Issues

  • Ensure PDFs are readable and not corrupted
  • Check file permissions in the documents/ folder
  • Large PDFs may take time to process on first run

Memory Issues

  • Reduce chunk_size in rag_utils.py for large documents
  • Use a smaller embedding model if needed
  • Consider using a GPU for embeddings if one is available (see the snippet below)
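
If the embedder is built with sentence-transformers (an assumption about rag_utils.py), moving it to the GPU is usually a single argument:

import torch
from sentence_transformers import SentenceTransformer

device = "cuda" if torch.cuda.is_available() else "cpu"
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2", device=device)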

Dependencies

  • Flask: Web framework
  • LangChain: LLM orchestration
  • ChromaDB: Vector database
  • HuggingFace: Embedding models
  • PyPDF2: PDF processing
  • Requests: HTTP client

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

License

This project is open source and available under the MIT License.

Support

For issues and questions:

  • Check the troubleshooting section above
  • Ensure all dependencies are properly installed
  • Verify Ollama is running and accessible
