This repository was archived by the owner on Jun 15, 2023. It is now read-only.
Home
Chris Hager edited this page Nov 2, 2015 · 5 revisions
Welcome to the pdfx wiki!
Discussions
Various
(Possibly) Useful Tools / Libraries
- http://blog.matt-swain.com/post/25650072381/a-lightweight-xmp-parser-for-extracting-pdf
- https://github.com/ckreibich/scholar.py (A parser for Google Scholar, written in Python)
- https://code.google.com/p/pdfmeat/ (PDF MEtadata Acquisition Tool, aka pdftobibtex/pdf2bibtex)
- https://code.google.com/p/pdfssa4met/ (PDF Structure and Syntactic Analysis for Metadata Extraction and Tagging)
- https://github.com/CrossRef/pdfextract (A tool and library that can extract various areas of text from a PDF, especially a scholarly article PDF) [ruby]
- https://github.com/ContentMine/quickscrape [nodejs]