leanhtrung/business-reg-ocr


Business Registration OCR

Extract data from business registration documents using OCR.

Setup

  1. Clone this repository
  2. Create virtual environment:
   python3 -m venv venv
   source venv/bin/activate
  3. Install dependencies:
   brew install tesseract   # macOS (Homebrew); use your platform's package manager elsewhere
   pip install -r requirements.txt
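
Because Tesseract is installed separately from the Python dependencies, it can help to confirm the binary is reachable before running the application. The snippet below is a minimal sketch using only the standard library; the function name `tesseract_available` is hypothetical and not part of this repository.

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH.

    Hypothetical helper for illustration; not part of this repository.
    """
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("tesseract found")
    else:
        print("tesseract not found on PATH; install it before running the OCR step")
```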

Usage

  1. Place document images in data/sample_documents/
  2. Run:
   python src/main.py
  3. Check results in the output/ folder
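
The steps above imply a simple flow from input images to result files. The following is a rough sketch of how that mapping might look, not the actual logic in src/main.py. The accepted image extensions, the `.json` result format, and the names `find_images` and `result_path` are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; the repository does not state which it accepts.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def find_images(input_dir: Path) -> list:
    """Return image files in input_dir, sorted by name (empty if the folder is missing)."""
    if not input_dir.is_dir():
        return []
    return sorted(
        p for p in input_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def result_path(image: Path, output_dir: Path) -> Path:
    """Map an input image to a result file path (JSON output is an assumption)."""
    return output_dir / (image.stem + ".json")


if __name__ == "__main__":
    for image in find_images(Path("data/sample_documents")):
        print(image, "->", result_path(image, Path("output")))
```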

Project Structure

business-reg-ocr/
├── src/
│   ├── main.py              # Main application
│   ├── image_processor.py   # Image preprocessing
│   ├── ocr_engine.py        # OCR engine
│   └── parser.py            # Data extraction
├── tests/
├── data/sample_documents/   # Input images
└── output/                  # Results
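
To illustrate the data-extraction stage that follows OCR, here is a minimal sketch of pulling labeled fields out of recognized text with regular expressions. It is not the code in src/parser.py: the field names, the label wording ("Company Name", "Registration No"), and the function `parse_fields` are hypothetical, and real registration documents will likely need different patterns.

```python
import re
from typing import Dict, Optional

# Hypothetical field labels; adjust to the wording used on the actual documents.
FIELD_PATTERNS = {
    "company_name": re.compile(r"Company\s+Name\s*[:\-]?\s*(.+)", re.IGNORECASE),
    "registration_number": re.compile(
        r"Registration\s+(?:No\.?|Number)\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
}


def parse_fields(text: str) -> Dict[str, Optional[str]]:
    """Extract each known field from OCR text; missing fields map to None."""
    result: Dict[str, Optional[str]] = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1).strip() if match else None
    return result


if __name__ == "__main__":
    sample = "Company Name: Acme Trading Ltd\nRegistration No: 0123456789\n"
    print(parse_fields(sample))
```

Returning None for missing fields rather than raising keeps one unreadable line from discarding the rest of a document's results.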
