GitHub - PavloGor/campus-docs-assistant: 🎓 Campus Docs Assistant – Built to solve a common challenge in universities, this AI-powered chatbot uses LLMs, intelligent agents and RAG to make academic documents and institutional information easier to access and understand.

INTRODUCTION

The Campus Docs Assistant is an AI-powered platform designed to streamline access to academic and administrative information within universities. Through a user-friendly chatbot interface, students, faculty, and staff can interact naturally with an intelligent assistant capable of answering questions, retrieving official documents, and offering support grounded in institutional data. This natural language interaction simplifies complex information retrieval tasks, eliminating the need to manually navigate dense and often confusing documentation.

By leveraging state-of-the-art AI technologies, the assistant understands complex queries, performs context-aware document retrieval, and generates accurate and concise responses in real time. Its web-based interface ensures accessibility while promoting autonomy in accessing institutional knowledge. This makes the Campus Docs Assistant a valuable tool for educational institutions aiming to enhance user experience, reduce repetitive inquiries, and improve the overall efficiency of information management.

AVATARS

Avatar	Usage	Meaning
	User messages	Indicates messages sent by the user
	Assistant Responses	Standard replies generated by the assistant
	Assistant Streaming Direct Response	Responses generated using the assistant's own knowledge
	Assistant Streaming Tool Response	Responses generated with the help of integrated tools
	Indexing Operations	Indicates that the assistant is processing documents
	Error or Reset Messages	System-level feedback such as error messages or resets

This legend explains the avatars used in the application and what each one signifies.

MOTIVATION

This project was initially inspired by the specific challenges observed at the Federal University of Mato Grosso do Sul (UFMS), but the problem it addresses is common across many universities. In academic settings, students often struggle to obtain simple pieces of information due to the overwhelming complexity and volume of official documentation. Regulations, guidelines, and institutional policies are typically stored in dense, legalistic documents that are not user-friendly or easy to navigate.

In practice, students seeking a single answer — such as internship requirements, enrollment rules, or calendar dates — frequently end up reading through dozens of pages of official publications. Frustrated by this experience, many resort to contacting academic coordinators directly. However, from the administration's perspective, this creates a high volume of repetitive inquiries that could have been answered if students had easier access to the right part of the documentation.

This cycle results in inefficiency and dissatisfaction on both sides: students receive vague or delayed responses, and coordinators are overwhelmed by simple questions that require them to redirect students to existing official documents. The Campus Docs Assistant was developed to break this cycle, acting as a bridge between formal documentation and practical student needs. By enabling natural language interaction and intelligent information retrieval, it aims to reduce friction, save time, and promote autonomous access to institutional knowledge.

demo_web.mov

demo_file.mov

FEATURES

AI-Powered Query Handling

The assistant processes user queries using the Maritalk large language model, which is optimized for conversational AI and advanced natural language understanding.
It includes a tool decision system that evaluates the context of each query to determine whether to generate a direct response or trigger external tools, ensuring intelligent and context-aware interactions.
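The routing idea can be sketched as follows. This is only an illustration under stated assumptions: the source does not describe how the decision is made, so a hypothetical keyword heuristic (`decide_route`, `RETRIEVAL_HINTS`, `Decision`) stands in for the real decision logic so the example runs without an API key.

```python
from dataclasses import dataclass

# Hypothetical trigger words; the real system presumably relies on the
# language model rather than a fixed keyword list.
RETRIEVAL_HINTS = ("regulation", "enrollment", "internship", "calendar", "deadline")


@dataclass
class Decision:
    route: str   # "tool" or "direct"
    reason: str  # short explanation, useful for logging


def decide_route(query: str) -> Decision:
    """Pick between a direct answer and a tool call (illustrative heuristic)."""
    lowered = query.lower()
    for hint in RETRIEVAL_HINTS:
        if hint in lowered:
            return Decision("tool", f"query mentions '{hint}'")
    return Decision("direct", "no institutional keyword found")


if __name__ == "__main__":
    for q in ("What are the internship requirements?", "Hello, who are you?"):
        print(q, "->", decide_route(q))
```

Returning a reason alongside the route makes it easier to see why a given query did or did not trigger retrieval.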

Smart Document Retrieval

The assistant performs semantic search using Pinecone, a high-performance vector database, allowing retrieval of the most relevant documents based on meaning rather than keywords.
It uses Ollama embeddings to convert documents and user queries into vector representations, enabling fast and accurate similarity matching for academic and administrative content.
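To make "similarity matching" concrete, here is a minimal pure-Python sketch of ranking stored vectors against a query vector. It assumes cosine similarity as the metric (the source does not say which metric the Pinecone index uses) and uses toy 3-dimensional vectors; the function names `cosine_similarity` and `top_k` are illustrative, not part of the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k (doc_id, score) pairs most similar to query_vec."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy vectors standing in for real document embeddings.
    index = {
        "internship_rules": [0.9, 0.1, 0.0],
        "academic_calendar": [0.0, 0.2, 0.9],
        "enrollment_guide": [0.6, 0.6, 0.1],
    }
    print(top_k([1.0, 0.0, 0.0], index))
```

A vector database performs the same kind of ranking, but with approximate search structures so that it stays fast over large collections.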

Context-Aware Responses

Implements a retrieve-and-generate mechanism that blends user queries with retrieved content to produce accurate, relevant answers, leveraging LangChain's retrieval-augmented generation logic under the hood.
Maintains dynamic context management, keeping the conversation history clean and focused so that responses remain concise and contextually accurate.
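The two ideas above, blending retrieved content into the prompt and keeping the history short, can be sketched in plain Python. This is an assumption-laden illustration rather than the project's LangChain code: `trim_history`, `build_prompt`, the prompt wording, and the `max_turns` limit are all hypothetical.

```python
def trim_history(history, max_turns=3):
    """Keep only the most recent max_turns (role, text) messages."""
    return history[-max_turns:] if max_turns > 0 else []


def build_prompt(question, chunks, history):
    """Combine retrieved chunks, recent history, and the question into one prompt."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    dialogue = "\n".join(f"{role}: {text}" for role, text in history)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Conversation so far:\n{dialogue}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    history = [
        ("user", "Hi"),
        ("assistant", "Hello! How can I help?"),
        ("user", "I have a question about internships."),
        ("assistant", "Sure, go ahead."),
    ]
    chunks = ["Internships require at least 50% of course credits completed."]
    print(build_prompt("What are the internship requirements?",
                       chunks, trim_history(history, max_turns=2)))
```

The chunk text in the usage example is invented for demonstration and does not come from any real regulation.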

Web Scraping and Indexing

Integrates Playwright to render and scrape dynamic web pages, allowing the assistant to index and respond with external institutional content.
Utilizes intelligent document chunking to split large texts into digestible parts for efficient indexing and retrieval, enabling high performance even with large datasets.
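Chunking can be illustrated with a small character-based splitter that overlaps consecutive chunks, so text near a boundary appears in both neighbours. This is a simplified sketch: the source does not state which splitter or sizes the project uses, and `chunk_text`, `chunk_size`, and `overlap` are hypothetical names and values.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

Production splitters usually cut on paragraph or sentence boundaries instead of raw character offsets; the overlap principle is the same.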

Interactive User Interface

Built with Streamlit, the assistant features a responsive and interactive UI where users can submit queries, view answers, and configure behavior, all within an accessible web interface.

Modular and Scalable Architecture

Employs a LangGraph-based state machine, where conversational logic is handled through dynamic workflows, ensuring flexibility in managing tool calls, memory, and state transitions.
Designed with robust error handling to gracefully manage runtime issues, API failures, and unexpected user input across various system components.
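The state-machine idea can be shown with a plain-Python stand-in. It deliberately does not use the LangGraph API: the node names (`decide`, `retrieve`, `respond`), the routing rule, and the `run` loop are all hypothetical, chosen only to show how nodes update a shared state and pick the next transition.

```python
def decide(state):
    # Hypothetical rule: questions go through retrieval, anything else is answered directly.
    is_question = state["query"].rstrip().endswith("?")
    state["route"] = "retrieve" if is_question else "respond"
    return state


def retrieve(state):
    # Placeholder for a vector-store lookup.
    state["context"] = ["<retrieved chunk placeholder>"]
    state["route"] = "respond"
    return state


def respond(state):
    source = "retrieved context" if state.get("context") else "model knowledge"
    state["answer"] = f"Answer based on {source}."
    state["route"] = "end"
    return state


NODES = {"decide": decide, "retrieve": retrieve, "respond": respond}


def run(query, max_steps=10):
    """Walk the graph from 'decide' until a node routes to 'end'."""
    state = {"query": query, "route": "decide", "trace": []}
    for _ in range(max_steps):
        if state["route"] == "end":
            return state
        name = state["route"]
        state["trace"].append(name)  # record visited nodes for debugging
        state = NODES[name](state)
    raise RuntimeError("workflow did not terminate")


if __name__ == "__main__":
    print(run("When does enrollment open?")["trace"])
    print(run("Hello")["trace"])
```

The `max_steps` guard is one simple way to keep a misconfigured workflow from looping forever.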

demo_graph.mov

GETTING STARTED

This guide outlines how to use the Campus Docs Assistant in two distinct scenarios:

Creating and managing a new knowledge base for Coordinators & Institutions
Accessing and querying an existing knowledge base for Students & Users

All users, regardless of role, need the following:

Ollama installed and running locally
Access to a Maritalk API key
Access to Pinecone credentials
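Before launching the app, it can help to confirm that Ollama is actually reachable. The sketch below assumes Ollama's default local address (`http://localhost:11434`) and queries its `/api/tags` endpoint, which lists locally available models; the helper name `ollama_is_running` is illustrative and not part of the project.

```python
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def ollama_is_running(base_url=OLLAMA_URL, timeout=2.0):
    """Return True if an Ollama server answers at base_url, else False."""
    try:
        # Any 200 reply from /api/tags means the server is up.
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP errors
        # (urllib's URLError is a subclass of OSError).
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_running())
```

If the check fails, start Ollama and run the script again before opening the assistant.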

Role-Based Configuration [Coordinators / Institutions]

As a coordinator or institution representative, you are responsible for:

Creating and configuring the Pinecone vector database

Create a Pinecone index with the appropriate dimension size
Ensure the index dimension matches your chosen embedding model dimension

Indexing content into the knowledge base

Upload PDFs containing institutional content
Add URLs for web-based institutional resources
Maintain and update the knowledge base as needed

Providing access credentials to students

Share the Pinecone API key and index name
Provide Maritalk API access information
Communicate which embedding model students should use

Role-Based Configuration [Students / Users]

As a student or end user, you will focus on using the assistant, not on configuring indexing:

Use existing knowledge base credentials

Obtain necessary credentials from your institution
Configure the assistant with these credentials
Query the knowledge base using natural language

Avoid modifying the shared knowledge base

While it is technically possible to index content into a shared knowledge base, this is not recommended
If you need your own knowledge base, create a separate Pinecone index

SETUP GUIDELINES

To begin, it's recommended to use the nomic-embed-text model for generating embeddings, which produces vectors with 768 dimensions. When setting up your Pinecone index, ensure that its dimension matches the output of the embedding model; this compatibility is crucial for proper functioning. Additionally, make sure that Ollama is running locally, as the assistant depends on it to operate. Lastly, if you intend to incorporate personal knowledge bases, avoid making changes to the shared Pinecone index. Instead, create a separate index to keep your data isolated and organized.
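The dimension requirement can be checked before any vector is sent to the index. The following sketch assumes the 768-dimensional output mentioned above; `check_dimension` and `EXPECTED_DIMENSION` are hypothetical names, and the vectors are dummies rather than real embeddings.

```python
EXPECTED_DIMENSION = 768  # output size of nomic-embed-text, per the text above


def check_dimension(embedding, index_dimension=EXPECTED_DIMENSION):
    """Raise ValueError if the embedding length differs from the index dimension."""
    if len(embedding) != index_dimension:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, "
            f"but the index expects {index_dimension}"
        )
    return True


if __name__ == "__main__":
    check_dimension([0.0] * 768)       # passes silently
    try:
        check_dimension([0.0] * 1024)  # e.g. a different embedding model
    except ValueError as err:
        print("Mismatch detected:", err)
```

Running a check like this on the first embedding of a session surfaces a wrong model choice early, with a clearer message than a failed upload.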

INSTALLATION GUIDE

Clone the Repository

$ git clone https://github.com/GiovaneIwamoto/campus-docs-assistant.git
$ cd campus-docs-assistant

Install Dependencies

$ pip install -r requirements.txt

Install Playwright

$ pip install playwright
$ playwright install

Run the Application

$ cd app
$ streamlit run app.py

CONTACT & SUPPORT

Whether you're a developer, university representative, coordinator, or student exploring or deploying the Campus Docs Assistant, I'm here to help and collaborate! Feel free to reach out for:

General inquiries about the project
Feature requests or suggestions
Help provisioning resources
Troubleshooting installation or usage issues
Collaboration opportunities or academic use cases

Email: giovaneiwamoto@gmail.com

You can also open an issue on GitHub for bug reports or enhancements.

LIKE THE PROJECT

If you find this project useful or believe in its potential to enhance academic processes, consider giving it a ★ star on GitHub — it really helps with visibility and community support!
