Summary: PyMuPDF is a Python binding for the document renderer and toolkit MuPDF
12
12
Description:
13
-
Release date: April 10, 2021
13
+
Release date: April 29, 2021
14
14
15
15
Authors
16
16
=======
@@ -21,7 +21,7 @@ Description:
21
21
Introduction
22
22
============
23
23
24
-
PyMuPDF (current version 1.18.12) is a Python binding with support for `MuPDF <http://mupdf.com/>`_ (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer and toolkit, which is maintained and developed by Artifex Software, Inc.
24
+
PyMuPDF (current version 1.18.13) is a Python binding with support for `MuPDF <http://mupdf.com/>`_ (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer and toolkit, which is maintained and developed by Artifex Software, Inc.
25
25
26
26
MuPDF can access files in PDF, XPS, OpenXPS, CBZ, EPUB and FB2 (e-books) formats, and it is known for its top performance and high rendering quality.
@@ -15,7 +15,7 @@ On **[PyPI](https://pypi.org/project/PyMuPDF)** since August 2016: [![Downloads]
15
15
16
16
# Introduction
17
17
18
-
PyMuPDF (current version 1.18.12) is a Python binding with support for [MuPDF](https://mupdf.com/) (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc.
18
+
PyMuPDF (current version 1.18.13) is a Python binding with support for [MuPDF](https://mupdf.com/) (current version 1.18.*), a lightweight PDF, XPS, and E-book viewer, renderer, and toolkit, which is maintained and developed by Artifex Software, Inc.
19
19
20
20
MuPDF can access files in PDF, XPS, OpenXPS, CBZ, EPUB and FB2 (e-books) formats, and it is known for its top performance and high rendering quality.
docs/app4.rst
Lines changed: 6 additions & 2 deletions
@@ -15,10 +15,14 @@ Starting with version 1.18.11, the image transformation matrix is returned by so
15
15
16
16
The transformation matrix contains information about how an image was transformed to fit into the rectangle (its "bounding box" = "bbox") on some document page. By inspecting the image's bbox on the page and this matrix, one can determine, for example, whether and how the image is displayed scaled or rotated on a page.
17
17
18
-
The relationship between image width and height and the bbox on a page is the following:
18
+
The relationship between an image's dimensions and its bbox on a page is the following:
1. Using the original image's width and height, we can define the image rectangle ``imgrect = fitz.Rect(0, 0, width, height)`` and a "shrink matrix" ``shrink = fitz.Matrix(1/width, 0, 0, 1/height, 0, 0)``.
21
24
2. Transforming the image rectangle with its shrink matrix results in the unit rectangle: ``imgrect * shrink = fitz.Rect(0, 0, 1, 1)``.
25
+
22
26
3. Using the image **transformation matrix** "transform", the following steps will compute the bbox::
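The arithmetic behind steps 1 to 3 can be sketched in plain Python, without importing PyMuPDF. This is only an illustration under stated assumptions: matrices are written as 6-tuples ``(a, b, c, d, e, f)``, rectangles as ``(x0, y0, x1, y1)``, a transformed rectangle is taken to be the smallest rectangle enclosing its four transformed corners, and the image size and the ``transform`` values are invented for the example. The helper names are hypothetical and are not part of the library.

```python
# Pure-Python sketch of the bbox arithmetic; no PyMuPDF import required.
# A matrix is a 6-tuple (a, b, c, d, e, f); a rectangle is (x0, y0, x1, y1).


def concat(m1, m2):
    """Return the matrix that applies m1 first and m2 second."""
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def transform_point(x, y, m):
    """Apply matrix m to the point (x, y)."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


def transform_rect(rect, m):
    """Return the smallest rectangle enclosing the transformed corners."""
    x0, y0, x1, y1 = rect
    corners = [
        transform_point(x, y, m)
        for x, y in ((x0, y0), (x1, y0), (x0, y1), (x1, y1))
    ]
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical image size and transformation matrix, chosen for illustration.
width, height = 256, 128
imgrect = (0, 0, width, height)
shrink = (1 / width, 0, 0, 1 / height, 0, 0)
transform = (300, 0, 0, 150, 50, 60)  # scale to 300 x 150, move to (50, 60)

unit = transform_rect(imgrect, shrink)                      # step 2
bbox = transform_rect(imgrect, concat(shrink, transform))   # step 3
print(unit, bbox)
```

With these made-up numbers the image is displayed unrotated in a 300 x 150 area whose top-left corner is at (50, 60). A matrix with non-zero ``b`` and ``c`` entries would indicate rotation or shearing.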
* **Fixed** an internal memory leak when computing image bboxes -- :meth:`Page.get_image_bbox`.
8
+
* **Added** support for low-level access and modification of the PDF trailer. Applies to :meth:`Document.xref_get_keys`, :meth:`Document.xref_get_key`, and :meth:`Document.xref_set_key`.
9
+
* **Added** documentation for maintaining private entries in PDF metadata.
10
+
* **Added** documentation for handling transparent image insertions, :meth:`Page.insert_image`.
11
+
* **Added** :meth:`Page.get_image_rects`, an improved version of :meth:`Page.get_image_bbox`.
12
+
* **Changed** :meth:`Page.insert_image` to also accept the xref of an existing image in the file. This allows "copying" images between pages, and extremely fast multiple insertions.
13
+
* **Changed** :meth:`Pixmap.setAlpha` to support new parameters for pre-multiplying colors with their alpha values and setting a specific color to fully transparent (e.g. white).
14
+
* **Changed** :meth:`Document.embfile_add` to automatically set creation and modification date-time. Correspondingly, :meth:`Document.embfile_upd` automatically maintains modification date-time (``/ModDate`` PDF key), and :meth:`Document.embfile_info` correspondingly reports these data. In addition, the embedded file's associated "collection item" is included via its :data:`xref`. This supports the development of PDF portfolio applications.
15
+
16
+
Changes in Version 1.18.11 / 1.18.12
17
+
-------------------------------------
6
18
* **Fixed** issue `#972 <https://github.com/pymupdf/PyMuPDF/issues/972>`_. Improved layout of source distribution material.
7
19
* **Fixed** issue `#962 <https://github.com/pymupdf/PyMuPDF/issues/962>`_. Stabilized Linux distribution detection for generating PyMuPDF from sources.
8
20
* **Added:** :meth:`Page.get_xobjects` delivers the result of :meth:`Document.get_page_xobjects`.
docs/document.rst
Lines changed: 32 additions & 24 deletions
@@ -731,7 +731,7 @@ For details on **embedded files** refer to Appendix 3.
731
731
:arg int xref: the :data:`xref`. *Changed in v1.18.10:* Use ``-1`` to access the special dictionary "PDF trailer" (it has no identifying xref).
732
732
:arg str key: the desired PDF key. Must **exactly** match (case-sensitive) one of the keys contained in :meth:`Document.xref_get_keys`.
733
733
734
-
:returns: a tuple (type, value), where type is one of "xref", "array", "dict", "int", "float" "null", "bool", "float", "name", "string" or "unknown" (should not occur). Independent of "type", the value of the key is **always** formatted as a string -- see the following example -- and a faithful reflection of what is stored in the PDF. An argument like the return value can be used to modify the value of a key of :data:`xref`.
734
+
:returns: a tuple (type, value), where type is one of "xref", "array", "dict", "int", "float", "null", "bool", "name", "string" or "unknown" (should not occur). Independent of "type", the value of the key is **always** formatted as a string -- see the following example -- and is a faithful reflection of what is stored in the PDF. An argument like the returned value can be used to modify the value of a key of :data:`xref`.
735
735
736
736
>>> for key in doc.xref_get_keys(xref):
737
737
print(key, "=" , doc.xref_get_key(xref, key))
@@ -757,26 +757,30 @@ For details on **embedded files** refer to Appendix 3.
757
757
758
758
.. method:: xref_set_key(xref, key, value)
759
759
760
-
*(New in v 1.18.7)*
760
+
*(New in v 1.18.7, changed in v 1.18.13)*
761
761
762
-
PDF only: Set the value of a PDF key in the object given by an xref. This is an expert function: if you do not know what you are doing, there is a high risk to render (parts of) the PDF unusable. Please do consult :ref:`AdobeManual` about object specification formats (page 51) and the structure of special dictionary types like page objects.
762
+
PDF only: Set (add, update, delete) the value of a PDF key for the object given by an xref.
763
+
764
+
.. caution:: This is an expert function: if you do not know what you are doing, there is a high risk of rendering (parts of) the PDF unusable. Please consult :ref:`AdobeManual` about object specification formats (page 51) and the structure of special dictionary types like page objects.
763
765
764
-
:arg int xref: the :data:`xref`. Note that changing the content of the **PDF trailer** in this way is currently not enabled for safety reasons.
766
+
:arg int xref: the :data:`xref`. *Changed in v1.18.13:* To update the PDF trailer, specify -1.
765
767
:arg str key: the desired PDF key (without leading "/"). Must not be empty. Any valid PDF key may be used, whether it is already present in the object (its value will be overwritten) or new. It is possible to use PDF path notation like ``"Resources/ExtGState"`` -- which sets the value for key ``"/ExtGState"`` as a sub-object of ``"/Resources"``.
766
-
:arg str value: the value for the key. It must be a non-empty string and, depending on the desired PDF object type, the following rules must be observed -- there is some syntax, but no type checking and no checking of the PDFsemantics. Upper or lower case are important!
768
+
:arg str value: the value for the key. It must be a non-empty string and, depending on the desired PDF object type, the following rules must be observed. There is some syntax checking, but **no type checking** and no checking if it makes sense PDF-wise, i.e. **no semantics checking**. Upper or lower case are important!
767
769
768
-
* **xref** -- must be provided as ``"nnn 0 R"`` with a valid :data:`xref` number nnn of the PDF. The suffix "``0 R``" is required to be recognizable as a xref.
769
-
* **array** -- a string like ``"[a b c d e f ...]"``. The brackets are required. Array items must be separated by at least one space (not commas like in Python). An empty array ``"[]"`` is possible and equivalent to removing the key. Array items may be any PDF objects, like dictionaries, xrefs, other arrays, etc. Like in Python, array items need not be of the same type.
770
+
* **xref** -- must be provided as ``"nnn 0 R"`` with a valid :data:`xref` number nnn of the PDF. The suffix "``0 R``" is required to be recognizable as an xref by PDF applications.
771
+
* **array** -- a string like ``"[a b c d e f]"``. The brackets are required. Array items must be separated by at least one space (not commas like in Python). An empty array ``"[]"`` is possible and equivalent to removing the key. Array items may be any PDF objects, like dictionaries, xrefs, other arrays, etc. Like in Python, array items may be of different types.
770
772
* **dict** -- a string like ``"<< ... >>"``. The brackets are required and must enclose a valid PDF dictionary definition. The empty dictionary ``"<<>>"`` is possible and equivalent to removing the key.
771
773
* **int** -- an integer formatted **as a string**.
772
-
* **float** -- a float formatted **as a string**. Scientific notation (with exponents) is not allowed by PDF.
773
-
* **null** -- the string ``"null"``. This is the PDF equivalent to Python's ``None`` and causes the key to be ignored -- however not necessarily removed.
774
+
* **float** -- a float formatted **as a string**. Scientific notation (with exponents) is **not allowed by PDF**.
775
+
* **null** -- the string ``"null"``. This is the PDF equivalent to Python's ``None`` and causes the key to be ignored. The key is not necessarily removed, but it will be removed on saves with garbage collection.
774
776
* **bool** -- one of the strings ``"true"`` or ``"false"``.
775
-
* **name** -- a valid PDF name with a leading slash: ``"/PageLayout"``.
776
-
* **string** -- a valid PDF string. Denote the empty string as ``"()"``. Depending on its content, it must be enclosed in bracket types "(...)" or "<...>", and reserved PDF characters must be escaped. If in doubt, we **strongly recommend** to use :meth:`getPDFstr`! This function automatically generates the right bracketsand determines the required format. E.g. it will do conversions like these:
777
+
* **name** -- a valid PDF name with a leading slash: ``"/PageLayout"``. See page 56 of the :ref:`AdobeManual`.
778
+
* **string** -- a valid PDF string. **All PDF strings** must be enclosed by some type of brackets. Denote the empty string as ``"()"``. Depending on its content, the possible bracket types are "(...)" or "<...>". Reserved PDF characters must be escaped. If in doubt, we **strongly recommend** using :meth:`getPDFstr`! This function automatically generates the right brackets, escapes, and overall format. E.g. it will do conversions like these:
777
779
780
+
>>> # because of €, the following yields UTF-16BE BOM
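For the non-string value types in the list above, the formatting rules are simple enough to sketch in plain Python. The helpers ``pdf_value`` and ``pdf_xref`` below are hypothetical and not part of PyMuPDF. They cover only ``None``, booleans, integers, floats and arrays of those, and they limit floats to six decimal places as an arbitrary choice to avoid scientific notation. PDF strings are deliberately left out, since the text recommends :meth:`getPDFstr` for them.

```python
import math


def pdf_xref(number):
    """Format an xref number as an indirect reference like '12 0 R'."""
    return f"{int(number)} 0 R"


def pdf_value(obj):
    """Format a simple Python value as a PDF value string (hypothetical helper)."""
    if obj is None:
        return "null"
    if isinstance(obj, bool):  # test bool before int: bool is an int subclass
        return "true" if obj else "false"
    if isinstance(obj, int):
        return str(obj)
    if isinstance(obj, float):
        if not math.isfinite(obj):
            raise ValueError("PDF numbers must be finite")
        # Fixed-point formatting: PDF does not allow exponents.
        return f"{obj:.6f}".rstrip("0").rstrip(".")
    if isinstance(obj, (list, tuple)):
        # Array items are separated by spaces, not commas.
        return "[" + " ".join(pdf_value(item) for item in obj) + "]"
    raise TypeError(f"unsupported type: {type(obj).__name__}")


print(pdf_value([0, 0, 595, 842.5]))
print(pdf_value(0.00001))
print(pdf_xref(12))
```

A string produced this way could then be passed as the ``value`` argument, for instance ``doc.xref_set_key(xref, "MediaBox", pdf_value([0, 0, 595, 842]))``. Because the method performs no type or semantics checking, the caller remains responsible for choosing a value type that fits the key.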
*(Changed in version 1.14.13)* *io.BytesIO* is now also supported.
@@ -1322,6 +1326,9 @@ For details on **embedded files** refer to Appendix 3.
1322
1326
:arg str ufilename: optional unicode filename. Documentation only, will be set to *filename* if *None*.
1323
1327
:arg str desc: optional description. Documentation only, will be set to *name* if *None*.
1324
1328
1329
+
:rtype: int
1330
+
:returns: *(Changed in v1.18.13)* The method now returns the :data:`xref` of the inserted file. In addition, the file object is now automatically given the PDF keys ``/CreationDate`` and ``/ModDate`` based on the current date-time.
1331
+
1325
1332
1326
1333
.. method:: embfile_count()
1327
1334
@@ -1334,7 +1341,7 @@ For details on **embedded files** refer to Appendix 3.
1334
1341
1335
1342
PDF only: Retrieve the content of an embedded file by its entry number or name. If the document is not a PDF, or the entry cannot be found, an exception is raised.
1336
1343
1337
-
:arg int,str item: index or name of entry. An integer must be in *range(embfile_count())*.
1344
+
:arg int,str item: index or name of entry. An integer must be in ``range(embfile_count())``.
1338
1345
1339
1346
:rtype: bytes
1340
1347
@@ -1351,9 +1358,11 @@ For details on **embedded files** refer to Appendix 3.
1351
1358
1352
1359
.. method:: embfile_info(item)
1353
1360
1361
+
*(Changed in v1.18.13)*
1362
+
1354
1363
PDF only: Retrieve information of an embedded file given by its number or by its name.
1355
1364
1356
-
:arg int/str item: index or name of entry. An integer must be in *range(embfile_count())*.
1365
+
:arg int/str item: index or name of entry. An integer must be in ``range(embfile_count())``.
1357
1366
1358
1367
:rtype: dict
1359
1368
:returns: a dictionary with the following keys:
@@ -1364,12 +1373,16 @@ For details on **embedded files** refer to Appendix 3.
1364
1373
* *desc* -- (*str*) description
1365
1374
* *size* -- (*int*) original file size
1366
1375
* *length* -- (*int*) compressed file length
1376
+
* *creationDate* -- *(New in v1.18.13)* (*str*) date-time of item creation in PDF format
1377
+
* *modDate* -- *(New in v1.18.13)* (*str*) date-time of last change in PDF format
1378
+
* *collection* -- *(New in v1.18.13)* (*int*) :data:`xref` of the associated PDF portfolio item if any, else zero.
1379
+
* *checksum* -- *(New in v1.18.13)* (*str*) a hashcode of the stored file content as a hexadecimal string. Should be MD5 according to PDF specifications, but be prepared to see other hashing algorithms.
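The *creationDate* and *modDate* values are strings in PDF date format, which generally looks like ``D:YYYYMMDDHHmmSS`` followed by an optional time zone such as ``+02'00'`` or ``Z``. The following sketch shows one way to turn such a string into a Python ``datetime``. The function name ``parse_pdf_date`` is hypothetical and not part of PyMuPDF, and the sketch assumes well-formed input; real files may contain variants it does not handle.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYY, then optional MM DD HH mm SS, then optional "Z" or +HH'mm' / -HH'mm'
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(text):
    """Convert a PDF date string to a datetime (hypothetical helper)."""
    match = _PDF_DATE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {text!r}")
    year = int(match.group(1))
    month = int(match.group(2) or 1)   # missing month and day default to 1
    day = int(match.group(3) or 1)
    hour, minute, second = (int(match.group(i) or 0) for i in (4, 5, 6))
    if match.group(7) == "Z":
        tzinfo = timezone.utc
    elif match.group(8):
        offset = timedelta(
            hours=int(match.group(9)), minutes=int(match.group(10) or 0)
        )
        tzinfo = timezone(offset if match.group(8) == "+" else -offset)
    else:
        tzinfo = None  # no zone given: return a naive datetime
    return datetime(year, month, day, hour, minute, second, tzinfo=tzinfo)


stamp = parse_pdf_date("D:20210429120000+02'00'")
print(stamp.isoformat())
```

Applied to the dictionary returned by :meth:`Document.embfile_info`, this would be used as ``parse_pdf_date(info["creationDate"])``, assuming the entry holds a date string of this form.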
1367
1380
1368
1381
.. method:: embfile_names()
1369
1382
1370
1383
*(New in version 1.14.16)*
1371
1384
1372
-
PDF only: Return a list of embedded file names. The sequence of names equals the physical sequence in the document.
1385
+
PDF only: Return a list of embedded file names. The sequence of the names equals the physical sequence in the document.
1373
1386
1374
1387
:rtype: list
1375
1388
@@ -1382,7 +1395,7 @@ For details on **embedded files** refer to Appendix 3.
1382
1395
1383
1396
PDF only: Change an embedded file given its entry number or name. All parameters are optional. Letting them default leads to a no-operation.
1384
1397
1385
-
:arg int/str item: index or name of entry. An integer must be in *range(0, embfile_count())*.
1398
+
:arg int/str item: index or name of entry. An integer must be in ``range(embfile_count())``.
1386
1399
:arg bytes,bytearray,BytesIO buffer: the new file content.
1387
1400
1388
1401
*(Changed in version 1.14.13)* *io.BytesIO* is now also supported.
@@ -1391,16 +1404,11 @@ For details on **embedded files** refer to Appendix 3.