This CLI application processes receipt images, detects text areas, extracts text (items, prices, and totals), and compares multiple receipts by visualizing the totals in a chart.
- Image preprocessing (resizing, orientation correction, deskewing)
- Text area detection
- Optical Character Recognition (OCR) using Tesseract
- Summarization of receipt data (item name, quantity, price, subtotal, cash, change)
- Comparison of multiple receipts with a visual bar chart
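The summarization step can be illustrated with a minimal sketch that turns raw OCR text into the fields listed above. The line layouts, the regular expressions, and the `summarize` helper are assumptions made for illustration only; they are not taken from this project's code, and real OCR output usually needs more tolerant patterns.

```python
import re
from decimal import Decimal

# Hypothetical line layouts; real OCR output varies by receipt and needs tuning.
ITEM_RE = re.compile(r"^(?P<qty>\d+)\s+(?P<name>.+?)\s+(?P<price>\d+\.\d{2})$")
FIELD_RE = re.compile(
    r"^(?P<key>subtotal|total|cash|change)\s*:?\s*(?P<amount>\d+\.\d{2})$",
    re.IGNORECASE,
)


def summarize(ocr_text):
    """Split OCR text into line items and named amounts (subtotal, cash, change)."""
    summary = {"items": []}
    for raw in ocr_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Named amounts are checked first so they are never mistaken for items.
        field = FIELD_RE.match(line)
        if field:
            summary[field["key"].lower()] = Decimal(field["amount"])
            continue
        item = ITEM_RE.match(line)
        if item:
            summary["items"].append({
                "name": item["name"],
                "quantity": int(item["qty"]),
                "price": Decimal(item["price"]),
            })
    return summary


# Made-up sample text standing in for OCR output.
sample = """
2 Coffee 7.00
1 Bagel 2.50
Subtotal 9.50
Cash 10.00
Change 0.50
"""
summary = summarize(sample)
```

Amounts are kept as `Decimal` rather than `float` so that totals compared across receipts do not pick up rounding noise.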
Install Tesseract:
- Windows: Download and install Tesseract, then uncomment line 7 in ocr_recognition.py and set it to the path where the Tesseract OCR engine is installed.
- Linux: Install via terminal:
sudo apt install tesseract-ocr
- macOS: Install via Homebrew:
brew install tesseract
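Before running the application, it can help to confirm that the Tesseract executable is reachable. The following is a small sketch using only the Python standard library; the `tesseract_version` helper is a hypothetical name and is not part of this project.

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line of `tesseract --version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Depending on the build, the version banner may arrive on stdout or stderr.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    print(tesseract_version() or "Tesseract not found on PATH")
```

If this reports that Tesseract is not found on Windows, the executable is probably installed but not on PATH, which is the case the explicit path setting in ocr_recognition.py is meant to cover.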
Install Python dependencies by running the following command:
pip install -r requirements.txt
Process a single receipt:
python main.py path/to/receipt_image.jpg
- example:
python main.py images/input/r1.png
Compare multiple receipts:
python main.py --compare
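The two invocation styles, a positional image path and a --compare flag, could be parsed along the following lines. This is a sketch of one possible argparse setup, not the actual contents of main.py; the helper names and help texts are assumptions, and how --compare locates the receipts to compare is not specified here.

```python
import argparse


def build_parser():
    """Build a parser accepting either an image path or the --compare flag."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Process receipt images."
    )
    # The image path is optional so that `--compare` can be used on its own.
    parser.add_argument("image", nargs="?", help="path to a single receipt image")
    parser.add_argument(
        "--compare",
        action="store_true",
        help="compare receipt totals in a bar chart",
    )
    return parser


def parse_args(argv=None):
    """Parse arguments and reject a call with neither an image nor --compare."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.compare and args.image is None:
        parser.error("provide an image path or pass --compare")
    return args


single = parse_args(["images/input/r1.png"])
compare = parse_args(["--compare"])
```

Here `single.image` holds the receipt path, while `compare.compare` is `True` and `compare.image` is `None`.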