A sophisticated Retrieval Augmented Generation (RAG) chatbot built with TypeScript that can ingest documents, perform semantic search, and provide intelligent responses with quality analysis and auto-improvement capabilities.
- Ollama Support: Run fully locally with open-source models (no API keys needed!)
- Multi-Provider: Support for both Ollama (local) and OpenAI (API)
- Document Ingestion: Support for PDF, DOCX, and TXT files
- Vector Embeddings: Uses Ollama's nomic-embed-text or OpenAI's text-embedding models
- Semantic Search: In-memory vector store with cosine similarity (see the sketch after this list)
- Conversation Management: Maintains conversation context and history
- Intelligent Analysis: Multi-layered quality assessment and auto-improvement
- Quality Scoring: Comprehensive metrics for retrieval and answer quality
- Hallucination Detection: Identifies and mitigates unsupported information
- Auto-Improvement: Automatically enhances poor-quality responses
- 10-Node Workflow: Advanced intelligent analysis pipeline for superior response quality
- REST API: Full API for web applications
- CLI Interface: Interactive command-line interface
- TypeScript: Full type safety and modern JavaScript features
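The semantic search feature ranks stored document chunks by cosine similarity between embedding vectors. As a rough illustration of the underlying math (a minimal sketch; the `StoredChunk` type and `topK` helper are invented for this example, not the project's actual vector store API):

```typescript
// Cosine similarity between two embedding vectors:
// dot(a, b) / (|a| * |b|), ranging from -1 to 1 (1 = same direction).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// A minimal in-memory "vector store": rank stored chunks against a query embedding.
interface StoredChunk { text: string; embedding: number[]; }

function topK(query: number[], chunks: StoredChunk[], k = 3): StoredChunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, k);
}
```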
This chatbot features an advanced 10-node intelligent analysis pipeline that ensures high-quality responses through systematic processing:
1. start - Initializes the conversation flow
2. check_memory_or_context - Checks whether existing conversation history and context can already answer the query
3. clarify_question - Identifies and resolves ambiguous queries
4. rephrase_or_simplify_query - Simplifies and restructures the query for better retrieval
5. retrieve_documents - Performs semantic search and document retrieval (RAG)
6. choose_tool_or_direct_answer - Decides whether an external tool is needed or the question can be answered directly
7. call_tool - Integrates with external APIs and tools when necessary
8. generate_answer - Generates the final response using the retrieved context
9. save_to_memory - Persists the conversation state for future reference
10. end - Returns the final processed response to the user
This workflow ensures that every response goes through rigorous analysis, context checking, and quality validation before being delivered to the user.
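As a rough sketch of how such a pipeline can be wired, each node can be modeled as an async function that receives and returns a shared state object (the state shape and handler signature here are illustrative assumptions, not the project's actual chatbot-workflow.ts):

```typescript
// Hypothetical state threaded through every workflow node.
interface WorkflowState {
  query: string;
  context: string[];
  answer?: string;
}

type WorkflowNode = (state: WorkflowState) => Promise<WorkflowState>;

// Run the nodes in order, passing the evolving state along.
async function runPipeline(nodes: WorkflowNode[], initial: WorkflowState): Promise<WorkflowState> {
  let state = initial;
  for (const node of nodes) {
    state = await node(state);
  }
  return state;
}

// Usage (node functions omitted):
// const result = await runPipeline([checkMemory, clarify, retrieve, generate], { query, context: [] });
```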
- Node.js 18+
- Option 1 (Recommended): Ollama for local inference
- Option 2: OpenAI API key for cloud inference
- TypeScript knowledge
1. Clone and set up the project:

cd agent-chatbot-rag2
npm install

2. Choose your provider:

Option 1 - Ollama (local):

# Install Ollama from https://ollama.ai
# Pull required models
ollama pull llama3.1          # Chat model (~4.7GB)
ollama pull nomic-embed-text  # Embedding model (~274MB)
# Configuration is already set in .env for Ollama

Option 2 - OpenAI (API):

# Edit .env file
LLM_PROVIDER=openai
OPENAI_API_KEY=your_openai_api_key_here

3. Build the project:

npm run build
# Start the interactive CLI
npm run cli
# Available CLI commands:
# /upload <file_path> - Upload and process a document
# /stats - Show document statistics
# /clear-docs - Clear all documents
# /new - Start a new conversation
# /history - Show conversation history
# /health - Check system health
# /help - Show help message
# /quit - Exit

# Start the API server
npm run api
# Server will start on http://localhost:3000

import { RAGChatbot } from './src/chatbot';
const chatbot = new RAGChatbot();
// Ingest a document
await chatbot.ingestDocument('./path/to/document.pdf');
// Chat with the bot
const response = await chatbot.chat('What is this document about?');
console.log(response.content);

When running the API server (npm run api), the following endpoints are available:

POST /upload
Content-Type: multipart/form-data

curl -X POST -F "document=@/path/to/file.pdf" http://localhost:3000/upload

POST /chat
Content-Type: application/json

{
  "query": "What is this document about?",
  "conversationId": "optional-conversation-id"
}
}

GET /conversation/:id
GET /conversations
GET /health
GET /documents/stats
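Since Node.js 18 ships a global fetch, the chat endpoint can also be called from TypeScript directly (a minimal sketch; the content field on the response is an assumption mirroring the library usage shown elsewhere in this README):

```typescript
// Send a query to the running API server and print the answer.
async function askChatbot(query: string, conversationId?: string) {
  const res = await fetch('http://localhost:3000/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, conversationId }),
  });
  if (!res.ok) throw new Error(`Chat request failed: ${res.status}`);
  const data = await res.json();
  console.log(data.content); // assumed response field
  return data;
}

// Usage: await askChatbot('What is this document about?');
```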
# 1. Start the CLI
npm run cli
# 2. Upload a document
/upload ./examples/sample-document.txt
# 3. Ask questions
What features does this chatbot have?
Can you explain the workflow?
What file formats are supported?

# 1. Start the API server
npm run api
# 2. Upload a document
curl -X POST -F "document=@./examples/sample-document.txt" http://localhost:3000/upload
# 3. Chat via API
curl -X POST -H "Content-Type: application/json" \
-d '{"query": "What is this chatbot capable of?"}' \
http://localhost:3000/chat

import { RAGChatbot } from './src/chatbot';
import path from 'path';
async function example() {
const chatbot = new RAGChatbot();
// Ingest multiple documents
await chatbot.ingestDocument('./docs/manual.pdf');
await chatbot.ingestDocument('./docs/faq.docx');
// Start a conversation
const response1 = await chatbot.chat('How do I get started?');
console.log(response1.content);
// Continue the conversation with context
const response2 = await chatbot.chat(
'Can you be more specific?',
response1.id // Same conversation
);
console.log(response2.content);
// Check quality metrics
console.log(`Quality Score: ${response2.metadata?.qualityScore}`);
console.log(`Sources: ${response2.metadata?.sources}`);
}

The chatbot can be configured via environment variables in .env:
# Provider Selection
LLM_PROVIDER=ollama # or 'openai'
# Ollama Settings (Default)
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_CHAT_MODEL=llama3.1
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
# OpenAI Settings (Alternative)
OPENAI_API_KEY=your_key_here
OPENAI_CHAT_MODEL=gpt-4
OPENAI_EMBEDDING_MODEL=text-embedding-ada-002
# Quality Thresholds
MIN_RETRIEVAL_SCORE=0.7
MIN_ANSWER_QUALITY_SCORE=0.6
HALLUCINATION_THRESHOLD=0.3
# Server Settings
PORT=3000
HOST=localhost
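A config loader along these lines can map those variables into a typed object (a simplified sketch using the documented defaults; not necessarily how the project's config.ts is structured):

```typescript
// Read provider settings from the environment, falling back to the documented defaults.
interface AppConfig {
  provider: 'ollama' | 'openai';
  ollamaBaseUrl: string;
  chatModel: string;
  minRetrievalScore: number;
}

function loadConfig(): AppConfig {
  const provider = (process.env.LLM_PROVIDER ?? 'ollama') as AppConfig['provider'];
  return {
    provider,
    ollamaBaseUrl: process.env.OLLAMA_BASE_URL ?? 'http://localhost:11434',
    chatModel: provider === 'ollama'
      ? process.env.OLLAMA_CHAT_MODEL ?? 'llama3.1'
      : process.env.OPENAI_CHAT_MODEL ?? 'gpt-4',
    minRetrievalScore: Number(process.env.MIN_RETRIEVAL_SCORE ?? 0.7),
  };
}
```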
The chatbot provides comprehensive quality analysis for every response:
- Retrieval Score: How well retrieved documents match the query (0.0-1.0)
- Answer Quality: How well the answer addresses the query (0.0-1.0)
- Hallucination Score: Amount of unsupported information (0.0-1.0, lower is better)
- Context Relevance: How relevant the context is to the answer (0.0-1.0)
- Overall Score: Combined quality metric
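One plausible way to combine the individual metrics into an overall score is a weighted sum with the hallucination score inverted, since lower is better there (the formula and weights below are assumptions for illustration, not the project's actual scoring code):

```typescript
// Combine individual metrics into one 0.0-1.0 score.
// Weights are illustrative, not the project's actual values.
interface QualityMetrics {
  retrievalScore: number;     // 0.0-1.0
  answerQuality: number;      // 0.0-1.0
  hallucinationScore: number; // 0.0-1.0, lower is better
  contextRelevance: number;   // 0.0-1.0
}

function overallScore(m: QualityMetrics): number {
  return 0.3 * m.retrievalScore
       + 0.3 * m.answerQuality
       + 0.2 * (1 - m.hallucinationScore)
       + 0.2 * m.contextRelevance;
}
```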
When quality scores fall below thresholds, the system automatically:
- Identifies specific quality issues
- Generates an improved response
- Marks the response as "improved" in metadata
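In sketch form, the auto-improvement step amounts to a check against the thresholds configured in .env (defaults match the values documented above; regenerateAnswer is a stand-in for whatever regeneration strategy the workflow actually uses):

```typescript
// Compare a response's metrics against the configured thresholds and,
// if any check fails, regenerate the answer and flag the result as improved.
interface ResponseMetrics {
  retrievalScore: number;
  answerQuality: number;
  hallucinationScore: number;
}

async function maybeImprove(
  metrics: ResponseMetrics,
  regenerateAnswer: () => Promise<string>, // hypothetical regeneration hook
): Promise<{ answer: string | null; improved: boolean }> {
  const failed =
    metrics.retrievalScore < Number(process.env.MIN_RETRIEVAL_SCORE ?? 0.7) ||
    metrics.answerQuality < Number(process.env.MIN_ANSWER_QUALITY_SCORE ?? 0.6) ||
    metrics.hallucinationScore > Number(process.env.HALLUCINATION_THRESHOLD ?? 0.3);

  if (!failed) return { answer: null, improved: false };
  return { answer: await regenerateAnswer(), improved: true };
}
```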
# Run example scripts
npm run dev examples/basic-usage.ts
npm run dev examples/document-ingestion.ts

npm run build  # Compile TypeScript
npm run clean  # Clean build directory
npm run watch  # Watch mode for development

src/
├── chatbot.ts             # Main chatbot class
├── types.ts               # TypeScript interfaces
├── config.ts              # Configuration loader
├── api.ts                 # REST API server
├── cli.ts                 # CLI interface
├── providers/             # LLM provider implementations
│   ├── ollama.ts
│   └── openai.ts
├── vectorstore/           # Vector storage implementations
│   └── index.ts
├── utils/                 # Utility functions
│   └── document-processor.ts
└── workflow/              # Intelligent workflow engine
    └── chatbot-workflow.ts
1. Ollama Connection Error
# Make sure Ollama is running
ollama serve
# Check if models are available
ollama list
# Pull required models if missing
ollama pull llama3.1
ollama pull nomic-embed-text

2. File Upload Issues
- Ensure file size is under 10MB
- Supported formats: PDF, DOCX, TXT
- Check file permissions
3. Low Quality Responses
- Adjust quality thresholds in .env
- Upload more relevant documents
- Try rephrasing questions more specifically
4. Memory Issues
- Clear conversations: chatbot.clearConversation(id)
- Clear documents: chatbot.clearDocuments()
- Restart the application
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
MIT License - see LICENSE file for details.
- Ollama for local LLM inference
- OpenAI for cloud-based AI services
- The TypeScript community for excellent tooling
Built with ❤️ and TypeScript