pyhanko.pdf_utils.misc module

Utility functions for the PDF library. Taken from PyPDF2 with modifications and additions; see the PyPDF2 project for its original license.

Generally, all of these constitute internal API, except for the exception classes.

exception pyhanko.pdf_utils.misc.PdfError(msg: str, *args)

Bases: Exception

exception pyhanko.pdf_utils.misc.PdfReadError(msg: str, *args)

Bases: PdfError

exception pyhanko.pdf_utils.misc.PdfStrictReadError(msg: str, *args)

Bases: PdfReadError

exception pyhanko.pdf_utils.misc.PdfWriteError(msg: str, *args)

Bases: PdfError

exception pyhanko.pdf_utils.misc.PdfStreamError(msg: str, *args)

Bases: PdfReadError

exception pyhanko.pdf_utils.misc.IndirectObjectExpected(msg: Optional[str] = None)

Bases: PdfReadError

pyhanko.pdf_utils.misc.get_and_apply(dictionary: dict, key, function: Callable, *, default=None)
class pyhanko.pdf_utils.misc.OrderedEnum(value)

Bases: Enum

Ordered enum (from the Python documentation)
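
The ordered-enum recipe in the Python enum documentation can be sketched as follows. This is a minimal illustration of that recipe, not necessarily the exact pyHanko source; the Grade class is a hypothetical example enum.

```python
from enum import Enum


class OrderedEnum(Enum):
    """Enum whose members compare by value, but only within the same class."""

    def __ge__(self, other):
        if self.__class__ is other.__class__:
            return self.value >= other.value
        return NotImplemented

    def __gt__(self, other):
        if self.__class__ is other.__class__:
            return self.value > other.value
        return NotImplemented

    def __le__(self, other):
        if self.__class__ is other.__class__:
            return self.value <= other.value
        return NotImplemented

    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented


class Grade(OrderedEnum):
    # Hypothetical enum used only to demonstrate the ordering.
    A = 5
    B = 4
    C = 3


lowest_first = sorted(Grade)
```

Members of different enum classes stay incomparable, because each comparison method returns NotImplemented when the classes differ.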

class pyhanko.pdf_utils.misc.StringWithLanguage(value: str, lang_code: Optional[str] = None, country_code: Optional[str] = None)

Bases: object

A string with a language attached to it.

value: str
lang_code: Optional[str] = None
country_code: Optional[str] = None
pyhanko.pdf_utils.misc.is_regular_character(byte_value: int)
pyhanko.pdf_utils.misc.read_non_whitespace(stream, seek_back=False, allow_eof=False)

Finds and reads the next non-whitespace character (ignores whitespace).

pyhanko.pdf_utils.misc.read_until_whitespace(stream, maxchars=None)

Reads non-whitespace characters and returns them. Stops upon encountering whitespace or when maxchars is reached.
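
The behaviour described above can be approximated with a short loop over a byte stream. The sketch below is an illustrative re-implementation under stated assumptions, not the library's code: the function name, the PDF_WHITESPACE constant, and the choice to consume the terminating whitespace byte are all assumptions.

```python
from io import BytesIO

# The six PDF whitespace bytes: NUL, tab, line feed, form feed, carriage return, space.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "


def read_until_whitespace_sketch(stream, maxchars=None):
    """Collect bytes until whitespace, end of stream, or maxchars bytes."""
    out = bytearray()
    while maxchars is None or len(out) < maxchars:
        tok = stream.read(1)
        # Stop at end of stream or at the first whitespace byte
        # (assumption: that whitespace byte is consumed, not returned).
        if not tok or tok in PDF_WHITESPACE:
            break
        out += tok
    return bytes(out)


keyword = read_until_whitespace_sketch(BytesIO(b"obj 12"))
prefix = read_until_whitespace_sketch(BytesIO(b"endstream"), maxchars=3)
```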

pyhanko.pdf_utils.misc.read_until_regex(stream, regex, ignore_eof=False)

Reads until the regular expression pattern is matched (the match itself is ignored). Raises PdfStreamError on premature end-of-file.

Parameters:

ignore_eof (bool) – If true, ignore end-of-file and return immediately.
regex – Regex to match.
stream – Stream to search.
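
A rough sketch of this read-until-match pattern follows. It is an assumption-laden illustration rather than the library's implementation: the function name and the chunk_size parameter are hypothetical, EOFError stands in for PdfStreamError, and matches that straddle a chunk boundary are not handled.

```python
import re
from io import BytesIO


def read_until_regex_sketch(stream, regex, ignore_eof=False, chunk_size=16):
    """Return the bytes preceding the first match of a compiled bytes regex."""
    collected = b""
    while True:
        tok = stream.read(chunk_size)
        if not tok:
            if ignore_eof:
                return collected
            # Stand-in for PdfStreamError in this sketch.
            raise EOFError("stream ended before the pattern matched")
        m = regex.search(tok)
        if m is not None:
            collected += tok[: m.start()]
            # Rewind so the cursor sits at the start of the match.
            stream.seek(m.start() - len(tok), 1)
            return collected
        collected += tok


src = BytesIO(b"/Name1 /Name2")
name = read_until_regex_sketch(src, re.compile(rb"\s"))
next_byte = src.read(1)
```

After the call, the stream cursor points at the matched whitespace, so the caller decides how to consume it.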

pyhanko.pdf_utils.misc.skip_over_whitespace(stream, stop_after_eol=False) → bool

Similar to read_non_whitespace, but returns a boolean indicating whether more than one whitespace character was read.

Will return the cursor to before the first non-whitespace character encountered, or after the first end-of-line sequence if one is encountered.

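pyhanko.pdf_utils.misc.skip_over_comment(stream) → bool
pyhanko.pdf_utils.misc.assert_writable_and_random_access(output)

Raise an error if the buffer in question is not writable, and return a boolean to indicate whether it supports random-access reading.

pyhanko.pdf_utils.misc.prepare_rw_output_stream(output)
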
Prepare an output stream that supports both reading and writing. Intended to be used for writing & updating signed files: when producing a signature, we render the PDF to a byte buffer with placeholder values for the signature data, or straight to the provided output stream if possible.

More precisely: this function will return the original output stream if it is writable, readable and seekable. If the output parameter is None, not readable or not seekable, this function will return a BytesIO instance instead. If the output parameter is not None and not writable, IOError will be raised.

Parameters:

output – A writable file-like object, or None.

Returns:

A file-like object that supports reading, writing and seeking.
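
The selection rules above translate into a few capability checks. The sketch below is a minimal illustration of those rules, not the library's implementation; pick_rw_stream and WriteOnlySink are hypothetical names introduced for this example.

```python
import io


def pick_rw_stream(output):
    """Return a stream that supports reading, writing and seeking."""
    if output is None:
        return io.BytesIO()
    if not output.writable():
        raise IOError("output stream is not writable")
    if output.readable() and output.seekable():
        # The caller's stream already does everything we need.
        return output
    # Writable, but not readable or not seekable: use an in-memory buffer.
    return io.BytesIO()


class WriteOnlySink(io.RawIOBase):
    """Hypothetical write-only stream, e.g. a pipe or a network socket."""

    def writable(self):
        return True

    def write(self, data):
        return len(data)


in_memory = io.BytesIO()
same_stream = pick_rw_stream(in_memory)       # returned unchanged
replacement = pick_rw_stream(WriteOnlySink())  # swapped for a BytesIO
```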

pyhanko.pdf_utils.misc.finalise_output(orig_output, returned_output)

Several internal APIs transparently replace non-readable/seekable buffers with BytesIO for signing operations, but we don’t want to expose that to the public API user. This internal API function handles the unwrapping.
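
The unwrapping step might look roughly like the sketch below. It assumes that the stand-in buffer is a BytesIO and that the caller's stream only needs a single write; the function name and the Sink class are hypothetical.

```python
from io import BytesIO


def finalise_output_sketch(orig_output, returned_output):
    """Copy a temporary buffer back into the caller's stream if one was used."""
    if orig_output is not None and orig_output is not returned_output:
        # A BytesIO stood in for the caller's stream: flush it in one write.
        with returned_output.getbuffer() as view:
            orig_output.write(view)
        return orig_output
    # Either no stream was supplied, or it was used directly.
    return returned_output


class Sink:
    """Hypothetical write-only target that records what it receives."""

    def __init__(self):
        self.data = bytearray()

    def write(self, chunk):
        self.data += bytes(chunk)
        return len(chunk)


sink = Sink()
result = finalise_output_sketch(sink, BytesIO(b"signed bytes"))
```

The caller always gets back the stream it passed in, so the temporary buffer never leaks into the public API.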

pyhanko.pdf_utils.misc.DEFAULT_CHUNK_SIZE = 4096

Default chunk size for stream I/O.

pyhanko.pdf_utils.misc.chunked_write(temp_buffer: bytearray, stream, output, max_read=None)
pyhanko.pdf_utils.misc.chunked_digest(temp_buffer: bytearray, stream, md, max_read=None)
pyhanko.pdf_utils.misc.chunk_stream(temp_buffer: bytearray, stream, max_read=None)
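
The three signatures above share a reusable bytearray as scratch space. As a hedged sketch of how such chunked reading can work (not the library's code; chunk_stream_sketch is a hypothetical name), a generator can fill the buffer with readinto and yield a view of the bytes actually read:

```python
import hashlib
from io import BytesIO


def chunk_stream_sketch(temp_buffer, stream, max_read=None):
    """Yield memoryview slices of temp_buffer until EOF or max_read bytes."""
    view = memoryview(temp_buffer)
    total_read = 0
    while max_read is None or total_read < max_read:
        window = view
        if max_read is not None:
            # Clamp the read so we never go past max_read bytes in total.
            window = view[: min(len(view), max_read - total_read)]
        bytes_read = stream.readinto(window)
        if not bytes_read:
            break
        total_read += bytes_read
        # Only valid until the next iteration overwrites the buffer.
        yield view[:bytes_read]


data = b"x" * 10000
md = hashlib.sha256()
for chunk in chunk_stream_sketch(bytearray(4096), BytesIO(data)):
    md.update(chunk)
```

Reusing one buffer avoids allocating a new bytes object per chunk, which matters when digesting or copying large files.
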
class pyhanko.pdf_utils.misc.ConsList(head: object, tail: 'ConsList' = None)

Bases: object

head: object
tail: ConsList = None
static empty() → ConsList
static sing(value) → ConsList
class pyhanko.pdf_utils.misc.Singleton(name, bases, dct)

Bases: type
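
Since the class derives from type, it is a metaclass. One common way to write a singleton metaclass is sketched below; this is an assumption about the general technique, and the library's own implementation may differ. SingletonSketch and Registry are hypothetical names.

```python
class SingletonSketch(type):
    """Metaclass that hands out one shared instance per class."""

    def __new__(mcs, name, bases, dct):
        cls = super().__new__(mcs, name, bases, dct)
        cls._instance = None  # each class gets its own slot
        return cls

    def __call__(cls, *args, **kwargs):
        # Create the instance on first use, then keep returning it.
        if cls._instance is None:
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


class Registry(metaclass=SingletonSketch):
    # Hypothetical class used only to demonstrate the metaclass.
    def __init__(self):
        self.items = []


first = Registry()
second = Registry()
```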

pyhanko.pdf_utils.misc.isoparse(dt_str: str) → datetime