PDF parsing doesn't support partially numbered lists #68

@majdalsado

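When parsing a PDF that uses partial numbering (".1", ".2", and so on), as is common in MasterFormat, a format used in most construction and government documents across the US and Canada, the parser fails to render numbered lists properly: the item numbers are emitted on their own lines, detached from the text they label. See the example below.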

MarkItDown Output:

RFP for Construction Management Services
Rotary Clubs of Grande Prairie Wellness Centre Society
Ken Sargent House

Section 00 00 43
Instructions to Respondents
Page 1 of 8

INTENT

.1

.2

.3

.4

.5

.6

.7

.8

The intent of this Request for Proposal (RFP) is to solicit submissions, in the format
detailed in this document, from qualified Construction Managers for the following project:

KEN SARGENT HOUSE
GRANDE PRAIRIE, ALBERTA

Available information relative to the project is included in Section 00 00 45 – Description
of Project.

Actual Document

Image
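
The symptom in the MarkItDown output above can be checked mechanically, which may help when writing a regression test. The following is a minimal sketch, not part of MarkItDown: `find_orphaned_markers` is a hypothetical helper, and the `sample` string is an abbreviated copy of the output above. It assumes partial numbers always have the form of a dot followed by digits.

```python
import re

# Hypothetical helper, not part of MarkItDown.
# Matches a line that contains only a partial number such as ".1" or ".12".
BARE_MARKER = re.compile(r"^\.\d+$")


def find_orphaned_markers(text: str) -> list[str]:
    """Return every partial-number marker that sits alone on its own line."""
    stripped_lines = (line.strip() for line in text.splitlines())
    return [line for line in stripped_lines if BARE_MARKER.match(line)]


# Abbreviated sample modeled on the output reported in this issue.
sample = """INTENT

.1

.2

The intent of this Request for Proposal (RFP) is to solicit submissions.
"""

print(find_orphaned_markers(sample))  # prints ['.1', '.2']
```

A correctly parsed list would keep each marker on the same line as its text (for example ".1 The intent of ..."), in which case the helper returns an empty list.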


Labels: bug, open for contribution
