Examining a page
Pages are dictionaries
In PDFs, the main data structure is the dictionary, a key-value data
structure much like a Python dict or attrdict. The major difference is
that the keys can only be names, while values can be any type, including
other dictionaries.
PDF dictionaries are represented as pikepdf.Dictionary, and names
are of type pikepdf.Name. A page is just another dictionary, with a
few required fields that give it special status as a page.
A pikepdf.Name is usually an ASCII-encoded string beginning with
“/” followed by a capital letter.
In [1]: from pikepdf import Pdf
In [2]: example = Pdf.open('../tests/resources/congress.pdf')