The Campus Docs Assistant is an AI-powered platform designed to streamline access to academic and administrative information within universities. Through a user-friendly chatbot interface, students, faculty, and staff can interact naturally with an intelligent assistant capable of answering questions, retrieving official documents, and offering support grounded in institutional data. This natural language interaction simplifies complex information retrieval tasks, eliminating the need to manually navigate dense and often confusing documentation.
By combining a large language model (Maritalk) with semantic retrieval over indexed institutional documents, the assistant understands complex queries, performs context-aware document retrieval, and generates accurate and concise responses in real time. Its web-based interface ensures accessibility while promoting autonomy in accessing institutional knowledge. This makes the Campus Docs Assistant a valuable tool for educational institutions aiming to enhance user experience, reduce repetitive inquiries, and improve the overall efficiency of information management.
This legend provides a clear understanding of the avatars used in the application and their significance.
This project was initially inspired by the specific challenges observed at the Federal University of Mato Grosso do Sul (UFMS), but the problem it addresses is common across many universities. In academic settings, students often struggle to obtain simple pieces of information due to the overwhelming complexity and volume of official documentation. Regulations, guidelines, and institutional policies are typically stored in dense, legalistic documents that are not user-friendly or easy to navigate.
In practice, students seeking a single answer (such as internship requirements, enrollment rules, or calendar dates) frequently end up reading through dozens of pages of official publications. Frustrated by this experience, many resort to contacting academic coordinators directly. However, from the administration's perspective, this creates a high volume of repetitive inquiries that could have been answered if students had easier access to the right part of the documentation.
This cycle results in inefficiency and dissatisfaction on both sides: students receive vague or delayed responses, and coordinators are overwhelmed by simple questions that require them to redirect students to existing official documents. The Campus Docs Assistant was developed to break this cycle, acting as a bridge between formal documentation and practical student needs. By enabling natural language interaction and intelligent information retrieval, it aims to reduce friction, save time, and promote autonomous access to institutional knowledge.
- The assistant processes user queries using the Maritalk large language model, which is optimized for conversational AI and advanced natural language understanding.
- It includes a tool decision system that evaluates the context of each query to determine whether to generate a direct response or trigger external tools, ensuring intelligent and context-aware interactions.
- The assistant performs semantic search using Pinecone, a high-performance vector database, allowing retrieval of the most relevant documents based on meaning rather than keywords.
- It uses Ollama embeddings to convert documents and user queries into vector representations, enabling fast and accurate similarity matching for academic and administrative content.
- Implements a retrieve-and-generate mechanism that blends user queries with retrieved content to produce accurate, relevant answers, leveraging LangChain's retrieval-augmented generation logic under the hood.
- Maintains dynamic context management, keeping the conversation history clean and focused to ensure that responses remain concise and contextually accurate.
- Integrates Playwright to render and scrape dynamic web pages, allowing the assistant to index and respond with external institutional content.
- Utilizes intelligent document chunking to split large texts into digestible parts for efficient indexing and retrieval, enabling high performance even with large datasets.
- Built with Streamlit, the assistant features a responsive and interactive UI where users can submit queries, view answers, and configure behavior, all within an accessible web interface.
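The document chunking feature listed above can be illustrated with a minimal sketch. The project's actual splitter is not shown in this README, so the function name `chunk_text`, the character-based windows, and the default sizes are all assumptions chosen for illustration. The overlap keeps a sentence that straddles a boundary visible in two neighboring chunks.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    regulation = "Internship requirements are defined by each course. " * 30
    for i, chunk in enumerate(chunk_text(regulation, chunk_size=200, overlap=20)):
        print(i, len(chunk))
```

Each chunk would then be embedded and stored in the vector index. A production splitter would more likely cut on paragraph or sentence boundaries than on raw character counts.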
Modular and Scalable Architecture
- Employs a LangGraph-based state machine, where conversational logic is handled through dynamic workflows, ensuring flexibility in managing tool calls, memory, and state transitions.
- Designed with robust error handling to gracefully manage runtime issues, API failures, and unexpected user input across various system components.
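To make the state-machine idea concrete, here is a rough sketch of a decide, retrieve, respond workflow. It is written in plain Python and does not use the LangGraph API. The node names, the keyword rule standing in for the LLM's tool decision, and the placeholder answers are all hypothetical.

```python
def decide(state: dict) -> dict:
    # Hypothetical rule standing in for the LLM's tool decision.
    keywords = ("internship", "enrollment", "calendar")
    needs_docs = any(word in state["query"].lower() for word in keywords)
    state["next"] = "retrieve" if needs_docs else "respond"
    return state


def retrieve(state: dict) -> dict:
    # Placeholder for the vector search against the knowledge base.
    state["context"] = f"[documents matching: {state['query']}]"
    state["next"] = "respond"
    return state


def respond(state: dict) -> dict:
    context = state.get("context")
    state["answer"] = f"Answer grounded in {context}" if context else "Direct answer"
    state["next"] = None  # terminal node
    return state


NODES = {"decide": decide, "retrieve": retrieve, "respond": respond}


def run(query: str, max_steps: int = 10) -> dict:
    """Walk the graph from 'decide' until a node sets next to None."""
    state = {"query": query, "next": "decide", "trace": []}
    for _ in range(max_steps):
        node = state["next"]
        if node is None:
            return state
        state["trace"].append(node)
        try:
            state = NODES[node](state)
        except Exception as exc:  # degrade gracefully instead of crashing
            state["error"] = str(exc)
            state["answer"] = "Sorry, something went wrong while handling your question."
            return state
    raise RuntimeError("workflow exceeded max_steps")


if __name__ == "__main__":
    print(run("What are the internship requirements?")["trace"])
    print(run("Hello!")["trace"])
```

Keeping each step as a function that receives and returns the shared state is what makes it easy to add a new tool node without touching the others.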
This guide outlines how to use the Campus Docs Assistant in two distinct scenarios:
- Creating and managing a new knowledge base, for Coordinators & Institutions
- Accessing and querying an existing knowledge base, for Students & Users
All users, regardless of role, need the following:
- Ollama installed and running locally
- Access to a Maritalk API key
- Access to Pinecone credentials
As a coordinator or institution representative, you are responsible for:
Creating and configuring the Pinecone vector database
- Create a Pinecone index with the appropriate dimension size
- Ensure the index dimension matches your chosen embedding model dimension
Indexing content into the knowledge base
- Upload PDFs containing institutional content
- Add URLs for web-based institutional resources
- Maintain and update the knowledge base as needed
Providing access credentials to students
- Share the Pinecone API key and index name
- Provide Maritalk API access information
- Communicate which embedding model students should use
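Because the index dimension must match the embedding model, it can help to verify the match before indexing anything. The sketch below is only an illustration: `check_dimension` is a hypothetical helper, and how the sample vector is produced (for example, by embedding a short test sentence with your chosen Ollama model) is left outside the example.

```python
def check_dimension(embedding: list[float], index_dimension: int) -> None:
    """Raise if a sample embedding does not fit the index (hypothetical helper)."""
    actual = len(embedding)
    if actual != index_dimension:
        raise ValueError(
            f"Embedding has {actual} dimensions but the index expects "
            f"{index_dimension}; recreate the index or switch the embedding model."
        )


if __name__ == "__main__":
    # Toy vector standing in for a real embedding of a test sentence.
    sample_embedding = [0.0] * 8
    check_dimension(sample_embedding, index_dimension=8)  # passes silently
    try:
        check_dimension(sample_embedding, index_dimension=16)
    except ValueError as err:
        print(err)
```

Running a check like this once, right after creating the index, surfaces a mismatch immediately instead of at the first failed upload.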
As a student or end-user, you will focus on using the assistant rather than configuring indexing:
Use existing knowledge base credentials
- Obtain necessary credentials from your institution
- Configure the assistant with these credentials
- Query the knowledge base using natural language
Avoid modifying the shared knowledge base
- While it is technically possible to index content into a shared knowledge base, this is not recommended
- If you need your own knowledge base, create a separate Pinecone index
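One possible way to keep the credentials received from your institution together is to read them from environment variables. This is an assumption made for illustration, not a description of how the application itself is configured. The variable names (`MARITALK_API_KEY`, `PINECONE_API_KEY`, `PINECONE_INDEX_NAME`, `EMBEDDING_MODEL`) and the `AssistantSettings` class are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AssistantSettings:
    maritalk_api_key: str
    pinecone_api_key: str
    pinecone_index_name: str
    embedding_model: str


# Hypothetical mapping from settings fields to environment variable names.
REQUIRED = {
    "maritalk_api_key": "MARITALK_API_KEY",
    "pinecone_api_key": "PINECONE_API_KEY",
    "pinecone_index_name": "PINECONE_INDEX_NAME",
    "embedding_model": "EMBEDDING_MODEL",
}


def load_settings(env: Optional[Mapping[str, str]] = None) -> AssistantSettings:
    """Build settings from the environment, failing early on missing values."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing required settings: " + ", ".join(missing))
    return AssistantSettings(**{field: env[var] for field, var in REQUIRED.items()})


if __name__ == "__main__":
    try:
        settings = load_settings()
        print("Using index:", settings.pinecone_index_name)
    except KeyError as err:
        print(err)
```

Failing early with the full list of missing values is friendlier than discovering one missing key at a time when the first query is sent.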
To begin, it's recommended to use the nomic-embed-text model for generating embeddings, as it provides an output dimension of
Clone the Repository
$ git clone https://github.com/GiovaneIwamoto/campus-docs-assistant.git
$ cd campus-docs-assistant
Install Dependencies
$ pip install -r requirements.txt
Install Playwright
$ pip install playwright
$ playwright install
Run the Application
$ cd app
$ streamlit run app.py
Whether you're a developer, university representative, coordinator, or student exploring or deploying the Campus Docs Assistant, I'm here to help and collaborate! Feel free to reach out for:
- General inquiries about the project
- Feature requests or suggestions
- Help provisioning resources
- Troubleshooting installation or usage issues
- Collaboration opportunities or academic use cases
Email: giovaneiwamoto@gmail.com
You can also open an issue on GitHub for bug reports or enhancements.
If you find this project useful or believe in its potential to enhance academic processes, consider giving it a star on GitHub. It really helps with visibility and community support!



