Smart PDF Viewer

A comprehensive PDF viewer and chat application.

Project Overview

This application serves as a demonstration of advanced PDF processing capabilities combined with intelligent chat functionality. The primary goal was to create a system that can handle dense documents, long syllabi, and multiple PDFs simultaneously while providing accurate, context-aware responses.

Core Objectives

Enhanced RAG System: Implement a retrieval-augmented generation system that can accurately process and respond to queries across multiple PDF documents
Advanced PDF Reader: Create a robust PDF viewer with sophisticated highlighting, annotation, and markup capabilities
Performance Optimization: Ensure the system can handle large documents and multiple PDFs without performance degradation
User Experience: Design an intuitive interface that has a clean aesthetic while providing powerful functionality

Technical Approach

RAG (Retrieval-Augmented Generation) Implementation

The RAG system was architected with a primary focus on accuracy and performance when handling multiple legal documents simultaneously. The core challenge was ensuring that dense legal content, often spanning hundreds of pages across multiple documents, could be processed and queried efficiently while maintaining contextual accuracy.

The text processing pipeline begins with reliable PDF text extraction using pdfjs-dist, which provides robust content parsing even for complex legal documents with varied formatting. The extracted text then undergoes intelligent chunking that carefully preserves semantic meaning across sentences and paragraphs, ensuring that legal concepts and references remain intact. This context-aware chunking is particularly crucial for legal documents where the meaning of a clause often depends on its surrounding context and references to other sections.
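The sentence-preserving chunking described above can be sketched as follows. This is a minimal illustration, not the project's textChunker.ts: the `TextChunk` shape, the function names, the character budget, and the one-sentence overlap are all assumptions, and the sentence splitter is deliberately naive (abbreviations such as "Sec." would be split incorrectly).

```typescript
// Hypothetical chunk shape; the real textChunker.ts may differ.
interface TextChunk {
  index: number;
  text: string;
}

// Naive sentence splitter: a run of non-terminators followed by any terminators.
function splitSentences(text: string): string[] {
  const matches: string[] = text.match(/[^.!?]+[.!?]*/g) || [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Pack whole sentences into chunks with a target size of maxChars.
// The last overlapSentences sentences of a chunk are repeated at the start of
// the next one so context carries across the boundary. A chunk can exceed
// maxChars when a single sentence or the carried overlap is long.
function chunkText(text: string, maxChars = 1000, overlapSentences = 1): TextChunk[] {
  const sentences = splitSentences(text);
  const chunks: TextChunk[] = [];
  let current: string[] = [];

  for (const sentence of sentences) {
    const candidate = current.concat(sentence).join(" ");
    if (current.length > 0 && candidate.length > maxChars) {
      chunks.push({ index: chunks.length, text: current.join(" ") });
      current = overlapSentences > 0 ? current.slice(-overlapSentences) : [];
    }
    current.push(sentence);
  }
  if (current.length > 0) {
    chunks.push({ index: chunks.length, text: current.join(" ") });
  }
  return chunks;
}
```

Splitting on sentence boundaries rather than fixed character offsets keeps a clause from being cut in half, and the overlap gives each chunk a little of the text that preceded it.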

For embedding generation, the system uses OpenAI embeddings, whose vector representations capture the nuanced language patterns found in legal documents. Embedding requests are batched to handle multiple PDFs efficiently, and a caching mechanism prevents reprocessing of documents that have already been analyzed. This reduces both processing time and API costs.
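One way to combine batching with a cache is to decide, before any API call, which chunks still need embedding. The sketch below covers only that planning step and makes no network requests; `EmbeddingCache`, `contentKey`, `planEmbeddingBatches`, and the batch size of 16 are hypothetical and not taken from the project.

```typescript
// Hypothetical cache: maps a content key to a previously computed embedding.
type EmbeddingCache = Map<string, number[]>;

// Cheap deterministic key for a piece of text (FNV-1a style, 32-bit, computed
// over UTF-16 code units). A production system might prefer a cryptographic
// hash to make collisions negligible.
function contentKey(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Return only the texts that are not cached yet, de-duplicated and grouped
// into batches so each embedding request can carry several inputs.
function planEmbeddingBatches(
  texts: string[],
  cache: EmbeddingCache,
  batchSize = 16
): string[][] {
  const seen = new Set<string>();
  const pending: string[] = [];
  for (const text of texts) {
    const key = contentKey(text);
    if (cache.has(key) || seen.has(key)) continue;
    seen.add(key);
    pending.push(text);
  }
  const batches: string[][] = [];
  for (let i = 0; i < pending.length; i += batchSize) {
    batches.push(pending.slice(i, i + batchSize));
  }
  return batches;
}
```

Each returned batch would then be sent to the embedding endpoint in a single request, and the results stored in the cache under the same keys.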

Query processing represents the most complex aspect of the system, requiring multi-PDF context aggregation to provide comprehensive responses. The system maintains source attribution with precise page references, ensuring transparency and allowing users to verify information directly in the source documents. Robust fallback mechanisms handle edge cases gracefully, providing meaningful responses even when the primary retrieval mechanisms encounter issues.
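Multi-PDF retrieval with source attribution can be illustrated with a brute-force similarity search. This is a sketch under stated assumptions: the `IndexedChunk` and `RetrievedSource` shapes, the in-memory array, and cosine similarity as the ranking function are illustrative choices, not a description of vectorStore.ts.

```typescript
// Hypothetical stored chunk: text plus its embedding and where it came from.
interface IndexedChunk {
  pdfId: string;
  page: number;
  text: string;
  embedding: number[];
}

// What the chat layer would receive: the passage, its source, and its score.
interface RetrievedSource {
  pdfId: string;
  page: number;
  text: string;
  score: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank chunks from the selected PDFs against the query embedding and keep the
// top K, carrying pdfId and page along so answers can cite their sources.
function retrieve(
  queryEmbedding: number[],
  chunks: IndexedChunk[],
  selectedPdfIds: string[],
  topK = 5
): RetrievedSource[] {
  const selected = new Set<string>(selectedPdfIds);
  return chunks
    .filter((c) => selected.has(c.pdfId))
    .map((c) => ({
      pdfId: c.pdfId,
      page: c.page,
      text: c.text,
      score: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Because every result keeps its `pdfId` and `page`, the response can list exactly which page of which document each passage came from.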

PDF Reader Architecture

The PDF reader was built using react-pdf-viewer as the foundation, chosen for its robust rendering capabilities and extensive plugin ecosystem. The highlighting system integrates seamlessly with the official highlight plugin, providing reliable text selection across multiple lines and accurate positioning that persists across sessions.

The highlighting functionality extends beyond simple text marking to include a comprehensive comment system linked to each highlight, enabling detailed annotations that enhance the research workflow. Historical highlights are managed through a dedicated system that allows users to navigate directly to previously marked sections, creating an efficient way to revisit important content across large documents.

The user interface was designed with responsiveness as a core principle, adapting gracefully to different screen sizes while maintaining functionality. Dark and light mode support accommodates user preferences and different working environments, while draggable resizing allows users to optimize their workspace layout. The toolbar provides context-sensitive controls that adapt based on the current interaction mode, reducing cognitive load and improving efficiency.

Architecture & Design Decisions

Frontend Architecture

Component Structure:

src/
├── components/
│   ├── ReactApp.tsx          # Main application orchestrator
│   ├── Auth.tsx              # Authentication system
│   ├── PDFManager.tsx        # PDF upload and management
│   ├── ReactPDFViewer.tsx    # PDF rendering and interaction
│   ├── ChatWithPDF.tsx       # Chat interface and RAG integration
│   ├── ChatList.tsx          # Conversation management
│   └── HistoricalHighlights.tsx # Highlight management
├── services/
│   ├── ragService.ts         # RAG system implementation
│   ├── databaseService.ts    # Supabase integration
│   ├── textExtractor.ts      # PDF text extraction
│   ├── textChunker.ts        # Text processing and chunking
│   └── vectorStore.ts        # Vector storage and retrieval
└── style.css                 # Global styles and theming

State Management:

React hooks for local component state
Context providers for global application state
Optimized re-rendering with useCallback and React.memo
Persistent state management through Supabase

Backend Integration

Database Design:

PostgreSQL through Supabase for reliable data persistence
Optimized schema for PDFs, conversations, highlights, and user data
UUID-based primary keys for scalability
Proper indexing for efficient queries

File Storage:

Supabase Storage for PDF file management
Automatic cleanup on PDF deletion
Efficient file serving with CDN integration

Key Features Implemented

1. Multi-PDF Chat System

Background Processing:

PDFs are processed automatically when selected, not when messages are sent
Batch processing (3 PDFs at a time) to prevent system overload
Visual progress indicators with pulsing animations
Smart caching to skip already processed documents

Conversation Management:

Multiple concurrent conversations with different PDF sets
Persistent chat history across sessions
Message editing and regeneration capabilities
Source attribution with expandable references

Advanced Features:

Text-to-speech for chat responses
Copy functionality for easy sharing
Markdown rendering with LaTeX math support
Dark/light mode toggle

2. User Authentication & Registration

Flexible Authentication:

User registration with custom username and password
Secure password hashing and validation
Session persistence across browser refreshes
Duplicate username validation with clear error messages

User Experience:

Toggle between login and registration modes
Intuitive error handling for authentication failures
Seamless transition between authentication states
User-specific data isolation and privacy

3. Advanced PDF Reader

Highlighting Capabilities:

Multi-line text selection with accurate highlighting
Custom color palette with persistent storage
Comment system for detailed annotations
Historical highlights with navigation to source locations

User Interface:

Responsive design that works on various screen sizes
Draggable chat window for optimal workspace utilization
Intuitive toolbar with context-sensitive controls
Smooth scrolling and zoom capabilities

4. Authentication & Data Persistence

User Management:

Simple authentication system for assessment purposes
Secure session management
User-specific data isolation

Data Persistence:

All PDFs, conversations, and highlights persist across sessions
Automatic synchronization between components
Robust error handling and recovery mechanisms

Performance Optimizations

The performance optimization phase was crucial for ensuring that the application could handle real-world usage scenarios with multiple large PDFs without degradation in user experience. The optimization strategy focused on two primary areas: multi-PDF processing efficiency and React component performance.

Multi-PDF Processing Optimization began with a fundamental architectural change: moving PDF processing from the message-sending phase to the PDF selection phase. With this background processing approach, PDFs are processed when selected, not when messages are sent, which dramatically improves perceived performance. Selection changes are debounced with a 1-second delay, so users can adjust their selection several times without triggering processing on every change. Multiple PDFs are processed in batches of three with 500ms delays between batches, preventing system overload while maintaining efficiency.
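The delay before processing and the pause between batches can be sketched as follows. This is an illustration rather than the project's code: `toBatches`, `processInBatches`, and `debounce` are hypothetical names, and running the PDFs inside one batch concurrently is an assumption. Only the 1-second delay, the 500ms pause, and the batch size of three come from this document.

```typescript
// Split a list into consecutive groups of at most `size` items.
function toBatches<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Run the worker over the items batch by batch. Items in one batch run
// concurrently (an assumption); the runner pauses between batches.
async function processInBatches<T>(
  items: T[],
  worker: (item: T) => Promise<void>,
  batchSize = 3,
  pauseMs = 500
): Promise<void> {
  let first = true;
  for (const batch of toBatches(items, batchSize)) {
    if (!first) await sleep(pauseMs);
    first = false;
    await Promise.all(batch.map((item) => worker(item)));
  }
}

// Debounce: only the last call within waitMs actually invokes fn.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}
```

A selection handler could then be built as `debounce((ids: string[]) => processInBatches(ids, processPdf), 1000)`, where `processPdf` stands in for whatever function extracts and embeds a single PDF.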

The caching strategy represents a sophisticated approach to avoiding redundant work. Already processed PDFs are skipped entirely, with in-memory caching for frequently accessed data. Smart cache invalidation ensures that when PDFs are updated, the cache is properly refreshed. The performance results speak to the effectiveness of this approach: background processing completes in just 1.006 seconds for 5 PDFs, with caching providing a 17.94x speedup for cached PDFs. Chat responses are 2.37x faster for subsequent messages, and all operations complete within 10 seconds even under heavy load.
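Skipping processed PDFs while still refreshing updated ones can be modeled with a version-aware in-memory cache. The sketch below is an assumption about how such a cache might look: `ProcessedPdfCache` is a hypothetical class, and the `version` string stands for any change marker, such as an updated-at timestamp or a content hash.

```typescript
interface CacheEntry<V> {
  version: string; // change marker for the PDF at processing time (assumed)
  value: V;
}

class ProcessedPdfCache<V> {
  private entries = new Map<string, CacheEntry<V>>();

  // Return the cached value only if it was stored for the same version.
  // A version mismatch means the PDF changed, so the stale entry is dropped.
  get(pdfId: string, version: string): V | undefined {
    const entry = this.entries.get(pdfId);
    if (entry === undefined) return undefined;
    if (entry.version !== version) {
      this.entries.delete(pdfId);
      return undefined;
    }
    return entry.value;
  }

  set(pdfId: string, version: string, value: V): void {
    this.entries.set(pdfId, { version, value });
  }

  // Explicit invalidation, for example when a PDF is deleted.
  invalidate(pdfId: string): void {
    this.entries.delete(pdfId);
  }
}
```

A caller would check `get` before processing a PDF and call `set` afterwards, so a hit skips the work entirely and a changed version forces reprocessing.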

React Component Optimization focused on minimizing unnecessary re-renders and optimizing component lifecycle management. The implementation uses useCallback for event handlers to prevent unnecessary re-renders, React.memo for expensive components, and carefully optimized dependency arrays in useEffect hooks. State management was refined to minimize state updates and reduce re-render cycles, with efficient state synchronization between components and proper cleanup of event listeners and timers. These optimizations ensure that the application remains responsive even when handling complex interactions across multiple components.

Technology Stack

Frontend

React 18 with TypeScript for type-safe development
Vite for fast development and optimized builds
react-pdf-viewer for robust PDF rendering
CSS3 with custom properties for theming
Font Awesome for professional iconography

Backend & Services

Supabase for database, authentication, and file storage
PostgreSQL for reliable data persistence
OpenAI API for embeddings and language model integration
Vercel for deployment and hosting

Development Tools

TypeScript for type safety and better developer experience
ESLint for code quality and consistency
Git for version control and collaboration
Vercel CLI for deployment automation

Development Process

The development journey was structured into six distinct phases, each building upon the previous work while addressing specific technical challenges and user experience requirements.

Phase 1: Foundation & Setup began with establishing a solid technical foundation using Vite and TypeScript for fast development and type safety. The initial focus was on creating a clean component architecture and implementing a robust authentication system. Database schema design was crucial at this stage, as the relationships between users, PDFs, conversations, and highlights needed to be carefully planned to support the complex interactions that would follow. Supabase integration provided the backend infrastructure needed for persistent data storage and real-time updates.

Phase 2: Core PDF Functionality centered on building the fundamental PDF handling capabilities. This involved creating a comprehensive upload and management system that could handle various PDF formats and sizes, implementing a reliable PDF viewer using react-pdf-viewer, and developing the text extraction and processing pipeline that would later power the RAG system. Initial highlighting capabilities were also implemented during this phase, though they would undergo significant refinement in later stages.

Phase 3: RAG System Development represented the most technically challenging phase, requiring the implementation of sophisticated text chunking algorithms, embedding generation, and vector storage systems. The chat interface was built to support multi-PDF queries, with careful attention to source attribution and reference management. This phase required extensive experimentation with different chunking strategies and embedding models to achieve optimal results for legal document processing.

Phase 4: Advanced Features focused on enhancing the user experience through features like historical highlights with navigation, message editing and regeneration capabilities, text-to-speech integration, and comprehensive theming support. Each feature was designed to address specific user workflow needs while maintaining the application's performance and reliability.

Phase 5: Performance Optimization became critical as the application's complexity grew. This phase involved implementing background PDF processing, batch processing strategies, sophisticated caching mechanisms, and React component optimization using memoization techniques. The goal was to ensure that the application remained responsive even when handling multiple large documents simultaneously.

Phase 6: Polish & Deployment brought together all the previous work into a cohesive, production-ready application. This included UI/UX refinements such as branding, comprehensive testing and quality assurance, performance benchmarking, and deployment optimization for Vercel. The focus was on creating a professional, polished experience that would demonstrate the full capabilities of the system.

Challenges & Solutions

Throughout the development process, several significant technical challenges emerged that required innovative solutions and careful architectural decisions.

Multi-PDF Processing Performance presented one of the most critical challenges. Initially, processing multiple large PDFs simultaneously caused significant delays and created a poor user experience, with users often waiting several minutes for responses. The solution involved a fundamental shift in approach: implementing background processing where PDFs are processed when selected rather than when messages are sent. This was combined with intelligent batch processing that handles three PDFs at a time with strategic delays to prevent system overload. A sophisticated caching system ensures that already processed PDFs are skipped entirely, while visual progress indicators keep users informed of the processing status. The result was a dramatic improvement in perceived performance, with subsequent messages responding nearly instantly.

Accurate Multi-line Highlighting proved to be more complex than initially anticipated. The custom highlighting implementation struggled with multi-line text selections and accurate positioning, particularly when dealing with complex legal documents with varied formatting. After extensive experimentation, the solution was to integrate the official react-pdf-viewer highlight plugin, which provides reliable text selection across multiple lines and accurate highlight positioning that persists across sessions. This integration also enabled a custom color palette and comprehensive comment system, significantly enhancing the annotation capabilities.

State Synchronization became increasingly complex as the application grew in functionality. Managing state between PDFs, chats, and highlights led to inconsistencies and synchronization issues that affected the user experience. The solution involved implementing a comprehensive state synchronization system with centralized state management and proper data flow patterns. Automatic cleanup of orphaned data ensures data integrity, while real-time updates across all components maintain consistency. Robust error handling and recovery mechanisms provide graceful degradation when issues occur.

RAG Accuracy with Multiple PDFs required careful consideration of how to ensure accurate responses when querying across multiple documents with potentially overlapping content. The solution involved enhancing the RAG system with improved text chunking that preserves context across document boundaries, better source attribution with precise page references, and sophisticated query preprocessing to handle multi-PDF scenarios effectively. Fallback mechanisms ensure that meaningful responses are provided even when the primary retrieval mechanisms encounter edge cases or unexpected content structures.

Testing & Quality Assurance

Performance Testing

Comprehensive performance benchmarks for multi-PDF processing
Load testing with various PDF sizes and quantities
Memory usage monitoring and optimization
Response time measurements for different scenarios

User Experience Testing

Cross-browser compatibility testing
Responsive design validation across devices
Accessibility considerations and improvements
User workflow optimization

Code Quality

TypeScript for compile-time error prevention
ESLint for code consistency and best practices
Component testing for critical functionality
Error boundary implementation for graceful failure handling

Deployment

Vercel Integration

Automatic deployments from GitHub
Environment variable management
CDN integration for optimal performance
Preview deployments for testing

Environment Configuration

Secure API key management
Database connection optimization
File storage configuration
Performance monitoring setup

Future Enhancements

Short-term Improvements

Advanced search capabilities across all PDFs
Export functionality for conversations and highlights
Collaborative features for team usage
Mobile app development

Long-term Vision

AI-powered document summarization
Advanced legal research capabilities
Integration with legal databases

Getting Started

For detailed setup instructions, please see the SETUP.md guide.

Prerequisites

Node.js 18+ and npm
Supabase account for backend services
OpenAI API key for RAG functionality

Installation

Clone the repository:

git clone https://github.com/AdiBak/SmartPDFReader.git
cd SmartPDFReader

Install dependencies:

npm install

Set up environment variables:

cp .env.example .env.local
# Add your Supabase and OpenAI API keys

Start the development server:

npm run dev

Open http://localhost:3000 in your browser

Configuration

Update the following environment variables in .env.local:

VITE_SUPABASE_URL: Your Supabase project URL
VITE_SUPABASE_ANON_KEY: Your Supabase anonymous key
VITE_OPENAI_API_KEY: Your OpenAI API key
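A small startup check can catch a missing variable early. The sketch below is hypothetical (`AppConfig` and `readConfig` are not part of the project) and takes the environment as a plain object so it can be run outside the browser.

```typescript
// Hypothetical typed view of the three variables listed above.
interface AppConfig {
  supabaseUrl: string;
  supabaseAnonKey: string;
  openAiApiKey: string;
}

const REQUIRED_KEYS = [
  "VITE_SUPABASE_URL",
  "VITE_SUPABASE_ANON_KEY",
  "VITE_OPENAI_API_KEY",
];

// Validate that every required variable is present and non-empty,
// then return them under typed names.
function readConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
  return {
    supabaseUrl: env["VITE_SUPABASE_URL"] as string,
    supabaseAnonKey: env["VITE_SUPABASE_ANON_KEY"] as string,
    openAiApiKey: env["VITE_OPENAI_API_KEY"] as string,
  };
}
```

In a Vite app, variables prefixed with VITE_ are exposed on `import.meta.env`, so the call would be `readConfig(import.meta.env)`.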

Usage

Authentication:
- Create Account: Click "Don't have an account? Sign up" to register with a new username and password
- Login: Use your registered credentials to sign in
Upload PDFs: Drag and drop or click to upload legal documents
Create Chats: Select PDFs and start conversations
Highlight Text: Use the highlight tool to annotate important sections
Ask Questions: Get intelligent responses based on your PDF content

Note: See SETUP.md for complete configuration instructions.

Demo Video

Watch the complete application walkthrough demonstrating all features:

Click to watch the full demo video showcasing the Smart Reader

Video Highlights:

User registration and authentication flow
Multi-PDF upload and processing
Advanced chat functionality with source attribution
PDF highlighting and annotation features
Performance optimizations and responsive design

Conclusion

This project demonstrates a comprehensive approach to building a sophisticated PDF analysis and chat application. The combination of advanced RAG capabilities, robust PDF processing, and performance optimizations creates a powerful tool for legal document analysis and research.

The journey from initial concept to deployed application involved numerous technical challenges, each addressed with careful consideration of user experience. The result is a scalable, performant application that showcases modern web development practices and AI integration.

Built with ❤️ by @AdiBak

Live Demo: https://pdfsmart.vercel.app/
