pyhanko.pdf_utils.incremental_writer module

Utility for writing incremental updates to existing PDF files.

class pyhanko.pdf_utils.incremental_writer.IncrementalPdfFileWriter(input_stream, prev: Optional[PdfFileReader] = None, strict=True)

Bases: BasePdfFileWriter

Class to incrementally update existing files.

This BasePdfFileWriter subclass encapsulates a PdfFileReader instance in addition to exposing an interface to add and modify PDF objects.

Incremental updates to a PDF file append modifications to the end of the file. This is critical when the original file contents are not to be modified directly (e.g. when it contains digital signatures). It has the additional advantage of providing an automatic audit trail of sorts.
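The append-only property can be illustrated without any PDF library. The following is a minimal sketch that does not use pyHanko: the byte strings are placeholders rather than valid PDF syntax, and the helper names `append_revision` and `is_incremental_update_of` are hypothetical. It only shows that the earlier revision survives byte-for-byte as a prefix of the updated file.

```python
def append_revision(original: bytes, update_section: bytes) -> bytes:
    """Return a new revision that leaves the original bytes untouched."""
    # An incremental update never rewrites earlier bytes; it only appends.
    return original + update_section


def is_incremental_update_of(original: bytes, updated: bytes) -> bool:
    """Check that `updated` extends `original` without altering it."""
    return len(updated) > len(original) and updated.startswith(original)


# Placeholder contents: NOT valid PDF, just stand-ins for two revisions.
revision_1 = b"%PDF-1.7\n...original objects, xref, trailer...\n%%EOF\n"
update = b"...changed objects, new xref, trailer with /Prev...\n%%EOF\n"

# A proper incremental update keeps revision_1 as an exact prefix.
revision_2 = append_revision(revision_1, update)

# A file whose earlier bytes were edited is not an incremental update.
rewritten = revision_1.replace(b"original", b"tampered") + update
```

Because the first revision is preserved verbatim, anything computed over its bytes (such as a digital signature) can still be checked against the updated file.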

  • input_stream – Input stream to read current revision from.

  • strict – Ingest the source file in strict mode. The default is True.

  • prev – Explicitly pass in a PDF reader. This parameter is internal API.
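A minimal sketch of how the constructor might be used, assuming pyHanko is installed. The helper name `open_for_update` is hypothetical, and returning the file handle alongside the writer is a design choice of this example, not something the library requires.

```python
def open_for_update(path: str, strict: bool = True):
    """Open `path` and wrap it in an incremental writer.

    The caller owns the returned file handle and should close it only
    once the writer is no longer needed, because the writer keeps
    reading from the underlying stream.
    """
    # Imported here so that defining the helper does not require pyHanko.
    from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter

    handle = open(path, "rb")
    try:
        writer = IncrementalPdfFileWriter(handle, strict=strict)
    except Exception:
        # Do not leak the file handle if parsing fails.
        handle.close()
        raise
    return writer, handle
```

The `prev` parameter is deliberately left out, since it is documented as internal API.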

classmethod from_reader(reader: PdfFileReader) → IncrementalPdfFileWriter

Instantiate an incremental writer from a PDF file reader.


reader – A PdfFileReader object with a PDF to extend.
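As a sketch of when `from_reader()` is convenient, the example below parses a stream with a reader first and derives the writer from it afterwards. It assumes pyHanko is installed and that `PdfFileReader` is importable from `pyhanko.pdf_utils.reader`; the helper name `writer_from_stream` is hypothetical.

```python
def writer_from_stream(stream):
    """Parse `stream` with PdfFileReader, then derive an incremental writer."""
    # Imported here so that defining the helper does not require pyHanko.
    from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter
    from pyhanko.pdf_utils.reader import PdfFileReader

    reader = PdfFileReader(stream)
    # The reader can be inspected at this point (for example to validate
    # the input) before committing to an update.
    writer = IncrementalPdfFileWriter.from_reader(reader)
    return reader, writer
```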

get_object(ido, as_metadata_stream: bool = False)

Retrieve the object associated with the provided reference from this PDF handler.

  • ido – An instance of generic.Reference.

  • as_metadata_stream – Whether to dereference the object as an XMP metadata stream.


Returns: A PDF object.

mark_update(obj_ref: Union[Reference, IndirectObject])

Mark an object reference to be updated. This is only relevant for incremental updates, but is included as a no-op by default for interoperability reasons.


obj_ref – An indirect object instance or a reference.

update_container(obj: PdfObject)

Mark the container of an object (as indicated by the container_ref attribute on PdfObject) for an update.

As with mark_update(), this only applies to incremental updates, but defaults to a no-op.


obj – The object whose top-level container needs to be rewritten.


update_root()

Signal that the document catalog should be written to the output. Equivalent to calling mark_update() with root_ref.
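The modify-then-mark pattern can be sketched as follows. This assumes pyHanko is installed and that the writer exposes the catalog dictionary as `writer.root`, an attribute not described in this section. The helper name `set_page_layout` and the choice of the `/PageLayout` entry are illustrative only.

```python
def set_page_layout(writer, layout: str = "/TwoColumnLeft") -> None:
    """Change the catalog's /PageLayout entry and flag the catalog as dirty."""
    # Imported here so that defining the helper does not require pyHanko.
    from pyhanko.pdf_utils import generic

    # Modify the in-memory catalog dictionary...
    writer.root["/PageLayout"] = generic.pdf_name(layout)
    # ...then mark it as changed so that the next write includes it.
    writer.update_root()
```

Without the marking step, an incremental writer has no way of knowing that an existing object was altered in memory, so the change would not appear in the appended revision.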

set_info(info: Optional[Union[IndirectObject, DictionaryObject]])

Set the /Info entry of the document trailer.


info – The new /Info dictionary, either as an indirect reference or as a DictionaryObject.

set_custom_trailer_entry(key: NameObject, value: PdfObject)

Set a custom, unmanaged entry in the document trailer or cross-reference stream dictionary.


Warning: Calling this method to set an entry that is managed by pyHanko internally (info dictionary, document catalog, etc.) has undefined results.

  • key – Dictionary key to use in the trailer.

  • value – Value to set.
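A short sketch of setting an unmanaged trailer entry, assuming pyHanko is installed. The helper name `tag_trailer` and the key `/ExampleTag` are hypothetical; any key used in practice should be one that pyHanko does not manage itself.

```python
def tag_trailer(writer, key: str = "/ExampleTag", value: str = "demo") -> None:
    """Attach a custom text entry to the trailer of the next revision."""
    # Imported here so that defining the helper does not require pyHanko.
    from pyhanko.pdf_utils import generic

    writer.set_custom_trailer_entry(
        generic.NameObject(key),  # keys are PDF name objects
        generic.TextStringObject(value),  # any PdfObject is accepted
    )
```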


write(stream)

Write the contents of this PDF writer to a stream.


stream – A writable output stream.

property document_meta_view: DocumentMetadata

write_in_place()

Write the updated file contents in-place to the same stream as the input stream. This requires a stream that supports both reading and writing.
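A minimal sketch of an in-place update, assuming pyHanko is installed. The helper name `update_in_place` is hypothetical, and the call to `update_root()` merely stands in for a real modification.

```python
def update_in_place(path: str) -> None:
    """Append an incremental update directly to the file at `path`."""
    # Imported here so that defining the helper does not require pyHanko.
    from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter

    # "rb+" opens the existing file for both reading and writing
    # without truncating it.
    with open(path, "rb+") as handle:
        writer = IncrementalPdfFileWriter(handle)
        writer.update_root()  # stand-in for a real modification
        writer.write_in_place()
```

Opening the file with mode `"rb"` would not work here, since the stream has to be writable as well.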


encrypt(user_pwd)

Method to handle updates to encrypted files.

This method handles decryption of the original file and ensures that the updated file is encrypted in a compatible way. The standard mandates that updates to encrypted files use the same encryption settings. In particular, incremental updates cannot remove file encryption.


user_pwd – The original file’s user password.


PdfReadError – Raised when there is a problem decrypting the file.
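The sketch below shows one way the password step might be wired in before any modifications are made. It assumes pyHanko is installed and that `PdfReadError` is importable from `pyhanko.pdf_utils.misc`. The helper name `open_encrypted_for_update` is hypothetical, and translating the error into a `ValueError` is a choice of this example only.

```python
def open_encrypted_for_update(stream, user_password: str):
    """Create an incremental writer for a password-protected file."""
    # Imported here so that defining the helper does not require pyHanko.
    from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter
    from pyhanko.pdf_utils.misc import PdfReadError

    writer = IncrementalPdfFileWriter(stream)
    try:
        # Decrypts the source and keeps the same encryption settings
        # for the revision that will be appended.
        writer.encrypt(user_password)
    except PdfReadError as exc:
        raise ValueError("could not decrypt the source file") from exc
    return writer
```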

encrypt_pubkey(credential: EnvelopeKeyDecrypter)

Method to handle updates to files encrypted using public-key encryption.

The same caveats as encrypt() apply here.


credential – The EnvelopeKeyDecrypter handling the recipient’s private key.


PdfReadError – Raised when there is a problem decrypting the file.

stream_xrefs: bool

Boolean controlling whether the output file stores its cross-references in stream format or as a classical XRef table.

The default for new files is True. For incremental updates, the writer adapts to the system used in the previous iteration of the document (as mandated by the standard).