A Flask-based web application that enables chat with a local LLM (via Ollama) enhanced with Retrieval-Augmented Generation (RAG) using ChromaDB vector storage and PDF documents.
## Features

- Local LLM Integration: Uses Ollama to run local language models
- RAG Enhancement: Retrieves relevant context from PDF documents to improve responses
- Vector Storage: Uses ChromaDB for efficient document storage and retrieval
- Streaming Responses: Real-time streaming of LLM responses
- Chat History: Persistent chat history with download functionality
- Modern UI: Clean, responsive web interface
## Prerequisites

- Python 3.8+
- Ollama installed and running locally
  - Download from: https://ollama.ai/
  - Install and start the Ollama service
  - Pull a model (llama3.2 or any other model):

    ```bash
    ollama pull llama3.2
    ```
## Installation

1. Clone or download this repository.

2. Navigate to the RAG directory:

   ```bash
   cd RAG
   ```

3. Install Python dependencies (a sketch of a typical `requirements.txt` follows this list):

   ```bash
   pip install -r requirements.txt
   ```

4. Add your PDF documents:
   - Place your PDF files in the `documents/` folder
   - The app will automatically process them on first run
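The repository ships its own `requirements.txt`; if you ever need to reconstruct it, the Dependencies section below maps roughly to the following PyPI packages (package names are assumptions and versions are left unpinned):

```text
flask
langchain
langchain-community
chromadb
sentence-transformers
PyPDF2
requests
```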
## Usage

1. Start the Flask server:

   ```bash
   python server.py
   ```

2. Open your web browser and go to: http://localhost:5001

3. Start chatting!
   - Type your questions in the chat interface
   - The app will retrieve relevant context from your PDF documents
   - Responses are enhanced with the retrieved information
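Under the hood, the server streams the model's output to the browser as it is generated. The sketch below shows one way a Flask streaming endpoint can do this; the route name, request shape, and hard-coded model are illustrative assumptions rather than the repo's actual code, and RAG retrieval is omitted for brevity (see How It Works below).

```python
# Minimal streaming route: forwards the prompt to Ollama and relays tokens to
# the client as they arrive (RAG retrieval omitted here for brevity).
import json

import requests
from flask import Flask, Response, request, stream_with_context

app = Flask(__name__)

@app.route("/chat", methods=["POST"])
def chat():
    prompt = request.json.get("message", "")

    def generate():
        payload = {"model": "llama3.2", "prompt": prompt, "stream": True}
        with requests.post("http://localhost:11434/api/generate",
                           json=payload, stream=True) as resp:
            for line in resp.iter_lines():
                if line:
                    yield json.loads(line).get("response", "")

    return Response(stream_with_context(generate()), mimetype="text/plain")

if __name__ == "__main__":
    app.run(port=5001)
```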
## How It Works

### Document Processing

- PDFs in the `documents/` folder are automatically loaded and chunked
- Text chunks are embedded using HuggingFace's `sentence-transformers/all-MiniLM-L6-v2`
- Embeddings are stored in ChromaDB for fast retrieval
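A minimal sketch of such an ingestion pipeline, assuming PyPDF2 for text extraction and LangChain wrappers for splitting, embeddings, and the Chroma store; the actual function names in `rag_utils.py` may differ, and LangChain import paths vary by version:

```python
# Rough sketch of the ingestion step (function names are illustrative, not the
# repo's actual API).
from pathlib import Path

from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"

def build_index(doc_dir: str = "documents", persist_dir: str = "chroma_db") -> Chroma:
    # 1. Load every PDF and extract its raw text.
    texts = []
    for pdf_path in sorted(Path(doc_dir).glob("*.pdf")):
        reader = PdfReader(str(pdf_path))
        texts.append("\n".join(page.extract_text() or "" for page in reader.pages))

    # 2. Split the text into overlapping chunks (defaults: 1000 chars, 100 overlap).
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_text("\n\n".join(texts))

    # 3. Embed the chunks and persist them in ChromaDB.
    embeddings = HuggingFaceEmbeddings(model_name=EMBED_MODEL)
    return Chroma.from_texts(chunks, embeddings, persist_directory=persist_dir)
```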
### Query Processing

- User sends a question
- System retrieves relevant document chunks from ChromaDB
- Context is combined with the user's question
- Enhanced prompt is sent to the Ollama LLM
- Response is streamed back to the user
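Roughly, the query path looks like the following sketch, which assumes the Chroma store built above and Ollama's `/api/generate` streaming endpoint; function and variable names are illustrative, not the repo's actual API:

```python
# Rough sketch of the query path (names are illustrative). Each streamed line
# from Ollama is a small JSON object; yielding its "response" field lets the
# caller relay tokens to the user as they arrive.
import json

import requests

def answer(question: str, vectordb, model: str = "llama3.2", k: int = 4):
    # 1. Retrieve the k most relevant chunks from ChromaDB.
    docs = vectordb.similarity_search(question, k=k)
    context = "\n\n".join(doc.page_content for doc in docs)

    # 2. Combine the retrieved context with the user's question.
    prompt = (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

    # 3. Send the enhanced prompt to Ollama and stream the response back.
    payload = {"model": model, "prompt": prompt, "stream": True}
    with requests.post("http://localhost:11434/api/generate", json=payload, stream=True) as resp:
        for line in resp.iter_lines():
            if line:
                yield json.loads(line).get("response", "")
```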
## Project Structure

```
RAG/
├── server.py            # Flask web server
├── rag_utils.py         # RAG utilities (ChromaDB, embeddings)
├── index.html           # Web interface
├── requirements.txt     # Python dependencies
├── documents/           # PDF files for RAG
│   ├── document1.pdf
│   ├── document2.pdf
│   └── ...
├── chroma_db/           # ChromaDB storage (auto-created)
└── chat_history.json    # Chat history storage
```
## Configuration

### Changing the Model

Edit `server.py` and modify the model name:

```python
payload = {
    'model': 'your-model-name',  # Change this
    'prompt': prompt,
    'stream': True
}
```

### Adjusting RAG Parameters

In `rag_utils.py`, you can modify:

- `chunk_size`: Size of text chunks (default: 1000)
- `chunk_overlap`: Overlap between chunks (default: 100)
- `k`: Number of retrieved documents (default: 4)
### Changing the Embedding Model

Change the embedding model in `rag_utils.py`:

```python
EMBED_MODEL = "your-embedding-model"
```

## Troubleshooting

### Ollama Connection Issues

- Ensure Ollama is running: `ollama serve`
- Check if your model is available: `ollama list`
- Verify the Ollama API is accessible at http://localhost:11434
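A quick way to confirm the API is reachable from Python is Ollama's `/api/tags` endpoint, which lists locally installed models; the default port is assumed here:

```python
# Check that the Ollama API answers on the default port and list installed models.
import requests

try:
    resp = requests.get("http://localhost:11434/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    print("Ollama is reachable. Installed models:", models)
except requests.RequestException as exc:
    print("Ollama is not reachable:", exc)
```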
### PDF Processing Issues

- Ensure PDFs are readable and not corrupted
- Check file permissions in the `documents/` folder
- Large PDFs may take time to process on first run
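To confirm a PDF is readable before the app ingests it, you can open it with PyPDF2 directly; the file name below is just an example:

```python
# Open a PDF with PyPDF2 and preview the extracted text of the first page.
from PyPDF2 import PdfReader

reader = PdfReader("documents/document1.pdf")  # example file name
print(f"{len(reader.pages)} pages")
print((reader.pages[0].extract_text() or "")[:200])
```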
### Performance Issues

- Reduce `chunk_size` in `rag_utils.py` for large documents
- Use a smaller embedding model if needed
- Consider using a GPU for embeddings if available
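If a GPU is available, the LangChain `HuggingFaceEmbeddings` wrapper can be pointed at it via `model_kwargs`; this sketch assumes the default embedding model and a CUDA device:

```python
# Run the embedding model on GPU (use "cpu" if no CUDA device is available).
from langchain_community.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"device": "cuda"},
)
```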
## Dependencies

- Flask: Web framework
- LangChain: LLM orchestration
- ChromaDB: Vector database
- HuggingFace: Embedding models
- PyPDF2: PDF processing
- Requests: HTTP client
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
## License

This project is open source and available under the MIT License.
## Support

For issues and questions:
- Check the troubleshooting section above
- Ensure all dependencies are properly installed
- Verify Ollama is running and accessible