This repository contains a Streamlit chatbot powered by LlamaIndex. The chatbot uses retrieval-augmented generation (RAG) over the Data Science Clinic's GitHub repository to answer questions about the Data Science Clinic.
Visit the live demo.
This project uses uv for fast Python package management. Install uv if you haven't already:
# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# On Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
Install dependencies:
uv sync
The code for ingesting data is located in src/ingest.py. It loads data from GitHub and Google Drive, parses metadata and inserts those documents into a remote Redis vector store.
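
The three stages named above (load, parse metadata, insert) can be pictured with a minimal, library-free sketch. Everything in it is hypothetical and for illustration only: the `Document` dataclass, the `parse_metadata` and `ingest` helpers, and the metadata fields are assumptions, not the contents of src/ingest.py, and a plain Python list stands in for the remote Redis vector store.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class Document:
    """Hypothetical stand-in for a document record; not the LlamaIndex class."""
    text: str
    metadata: dict = field(default_factory=dict)


def parse_metadata(source: str, path: str) -> dict:
    """Derive simple, illustrative metadata from where a file came from."""
    p = PurePosixPath(path)
    return {"source": source, "file_name": p.name, "file_type": p.suffix.lstrip(".")}


def ingest(raw_files, store):
    """Load, parse metadata, then insert: the three stages described above."""
    for source, path, text in raw_files:
        store.append(Document(text=text, metadata=parse_metadata(source, path)))
    return store


# An in-memory list stands in for the remote Redis vector store (assumption).
store = ingest(
    [
        ("github", "docs/README.md", "Example text from a repository file."),
        ("google_drive", "handbook/syllabus.pdf", "Example text from a Drive file."),
    ],
    store=[],
)
```

Keeping metadata parsing in its own function makes it easy to test that step without network access or credentials.
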
- Create a Google service account key and save it to the root of this directory as service_account_key.json.
- Get a Redis database password.
- Run the following command:
make run-ingest REDIS_PASSWORD=<password>
Create a project secrets file to store your keys at .streamlit/secrets.toml. The contents of this file should read:
OPENAI_API_KEY="sk-proj..."
REDIS_PASSWORD="xD2C..."
Run the Streamlit application locally with the following command:
make run-app
To add new dependencies, update the dependencies list in pyproject.toml and run:
uv sync
You can also run the application directly from the virtual environment that uv creates:
# Activate the virtual environment
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Run the ingestion script
python src/ingest.py
# Run the Streamlit app
streamlit run src/app.py