- Brazil
- https://linktr.ee/acsenrafilho
- in/acsenrafilho
OCR & Document Analysis
docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, and more.
Document binarization using deep learning
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Handwritten Text Synthesis and Recognition
Python tool for converting files and office documents to Markdown.
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
Hallucination-prevention RAG system with verbatim span extraction. Ensures all generated content is grounded in source documents with exact citations.
A Comprehensive Toolkit for High-Quality PDF Content Extraction




