pyhanko.pdf_utils.filters module

Implementation of stream filters for PDF.

Taken from PyPDF2 with modifications; see the PyPDF2 project for its original license.

Note that not all decoders specified in the standard are supported. In particular, /LZWDecode is missing. A /Crypt entry is registered in DECODERS, but its CryptDecoder class is not documented in this section.

class pyhanko.pdf_utils.filters.Decoder

Bases: object

General filter/decoder interface.

classmethod decode(data: bytes, decode_params: dict) → bytes

Decode a stream.

Parameters
  • data – Data to decode.

  • decode_params – Decoder parameters, sourced from the /DecodeParms entry associated with this filter.

Returns

Decoded data.

classmethod encode(data: bytes, decode_params: dict) → bytes

Encode a stream.

Parameters
  • data – Data to encode.

  • decode_params – Encoder parameters, sourced from the /DecodeParms entry associated with this filter.

Returns

Encoded data.
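
As an illustration of the interface above, the following is a minimal sketch of how a filter could be written as a pair of classmethods. The Decoder class here is a local stand-in that only mirrors the documented signatures (it is not imported from pyHanko), and RunLengthSketch is a hypothetical name; it follows the /RunLengthDecode rules of ISO 32000-1 and is not part of this module.

```python
class Decoder:
    """Local stand-in mirroring the documented interface (not pyHanko's class)."""

    @classmethod
    def decode(cls, data: bytes, decode_params: dict) -> bytes:
        raise NotImplementedError

    @classmethod
    def encode(cls, data: bytes, decode_params: dict) -> bytes:
        raise NotImplementedError


class RunLengthSketch(Decoder):
    """Hypothetical run-length filter, shown only to illustrate the interface."""

    EOD = 128  # length byte that marks end of data

    @classmethod
    def decode(cls, data: bytes, decode_params=None) -> bytes:
        out = bytearray()
        i = 0
        while i < len(data):
            length = data[i]
            i += 1
            if length == cls.EOD:
                break
            if length < 128:
                # Copy the next (length + 1) bytes literally.
                out += data[i:i + length + 1]
                i += length + 1
            else:
                # Repeat the next byte (257 - length) times.
                out += data[i:i + 1] * (257 - length)
                i += 1
        return bytes(out)

    @classmethod
    def encode(cls, data: bytes, decode_params=None) -> bytes:
        # Simplest valid encoding: literal runs only, no compression attempted.
        out = bytearray()
        for start in range(0, len(data), 128):
            chunk = data[start:start + 128]
            out.append(len(chunk) - 1)
            out += chunk
        out.append(cls.EOD)
        return bytes(out)


round_trip = RunLengthSketch.decode(RunLengthSketch.encode(b"abc" * 100))
```

Because both methods are classmethods, callers never instantiate a filter; they call decode() or encode() directly on the class.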

class pyhanko.pdf_utils.filters.ASCII85Decode

Bases: pyhanko.pdf_utils.filters.Decoder

Implementation of the base 85 encoding scheme specified in ISO 32000-1.

classmethod encode(data: bytes, decode_params=None) → bytes

Encode a stream.

Parameters
  • data – Data to encode.

  • decode_params – Encoder parameters, sourced from the /DecodeParms entry associated with this filter.

Returns

Encoded data.

classmethod decode(data, decode_params=None)

Decode a stream.

Parameters
  • data – Data to decode.

  • decode_params – Decoder parameters, sourced from the /DecodeParms entry associated with this filter.

Returns

Decoded data.
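
To give a feel for the base 85 scheme, here is a small sketch using Python's standard base64 module rather than this class. It assumes that the standard library's Adobe mode is a reasonable stand-in for the encoding described in ISO 32000-1; the exact bytes produced by ASCII85Decode.encode() (for example, whether a leading <~ is emitted) may differ.

```python
import base64

payload = b"Man "

# Plain Ascii85: every 4 input bytes become 5 printable characters.
encoded = base64.a85encode(payload)

# adobe=True wraps the output in <~ ... ~>; the trailing ~> is the
# end-of-data marker used by the PDF filter.
framed = base64.a85encode(payload, adobe=True)

# A group of four zero bytes is abbreviated to the single character "z".
zero_group = base64.a85encode(b"\x00\x00\x00\x00")

# Decoding the framed form recovers the original bytes.
decoded = base64.a85decode(framed, adobe=True)
```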

class pyhanko.pdf_utils.filters.ASCIIHexDecode

Bases: pyhanko.pdf_utils.filters.Decoder

Wrapper around binascii.hexlify() that implements the Decoder interface.

classmethod encode(data: bytes, decode_params=None) → bytes

Encode a stream.

Parameters
  • data – Data to encode.

  • decode_params – Encoder parameters, sourced from the /DecodeParms entry associated with this filter.

Returns

Encoded data.

classmethod decode(data, decode_params=None)

Decode a stream.

Parameters
  • data – Data to decode.

  • decode_params – Decoder parameters, sourced from the /DecodeParms entry associated with this filter.

Returns

Decoded data.
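
The hexadecimal scheme can be sketched with binascii alone. The helper names below are hypothetical, and the handling of whitespace, the > end-of-data marker and an odd number of digits follows the rules in ISO 32000-1; how closely ASCIIHexDecode itself matches these details is an assumption.

```python
import binascii


def hex_encode_sketch(data: bytes) -> bytes:
    # Two hex digits per byte, terminated by the ">" end-of-data marker.
    return binascii.hexlify(data) + b">"


def hex_decode_sketch(data: bytes) -> bytes:
    # Ignore everything after the end-of-data marker.
    digits = data.split(b">", 1)[0]
    # Whitespace between digits is not significant.
    digits = b"".join(digits.split())
    # An odd final digit is treated as if it were followed by "0".
    if len(digits) % 2:
        digits += b"0"
    return binascii.unhexlify(digits)


encoded = hex_encode_sketch(b"PDF")
decoded = hex_decode_sketch(b"50 44\n46>")
```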

class pyhanko.pdf_utils.filters.FlateDecode

Bases: pyhanko.pdf_utils.filters.Decoder

Implementation of the /FlateDecode filter.

Warning

Not all predictor values are currently supported. This may cause problems when extracting image data from PDF files.

classmethod decode(data: bytes, decode_params)

Decode a stream.

Parameters
  • data – Data to decode.

  • decode_params – Decoder parameters, sourced from the /DecodeParms entry associated with this filter.

Returns

Decoded data.

classmethod encode(data, decode_params=None)

Encode a stream.

Parameters
  • data – Data to encode.

  • decode_params – Encoder parameters, sourced from the /DecodeParms entry associated with this filter.

Returns

Encoded data.
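
The sketch below illustrates what a Flate filter with a predictor involves, using only zlib. It is not pyHanko's implementation: undo_png_rows is a hypothetical helper that handles just the PNG "None" (0) and "Up" (2) row filters, and it assumes one byte per sample, so each row holds columns bytes after its one-byte row tag.

```python
import zlib


def undo_png_rows(rows: bytes, columns: int) -> bytes:
    """Reverse PNG row filtering for row tags 0 (None) and 2 (Up) only."""
    row_len = columns + 1  # one tag byte precedes each row
    if len(rows) % row_len:
        raise ValueError("data is not a whole number of rows")
    prev = bytes(columns)  # the row above the first row counts as all zeros
    out = bytearray()
    for start in range(0, len(rows), row_len):
        tag = rows[start]
        body = rows[start + 1:start + row_len]
        if tag == 0:
            cur = bytes(body)
        elif tag == 2:
            # Up: each byte was stored as a difference from the byte above it.
            cur = bytes((b + p) & 0xFF for b, p in zip(body, prev))
        else:
            raise NotImplementedError(f"PNG row filter {tag} not handled here")
        out += cur
        prev = cur
    return bytes(out)


# Two rows of three samples, both stored with the "Up" row filter.
filtered = bytes([2, 1, 2, 3, 2, 1, 1, 1])
stream = zlib.compress(filtered)          # what the PDF stream would hold
inflated = zlib.decompress(stream)        # step 1: inflate
samples = undo_png_rows(inflated, 3)      # step 2: undo the predictor
```

Unhandled row filters raise NotImplementedError here, which mirrors the kind of limitation the warning above describes.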

pyhanko.pdf_utils.filters.DECODERS = {
    '/A85': <class 'pyhanko.pdf_utils.filters.ASCII85Decode'>,
    '/AHx': <class 'pyhanko.pdf_utils.filters.ASCIIHexDecode'>,
    '/ASCII85Decode': <class 'pyhanko.pdf_utils.filters.ASCII85Decode'>,
    '/ASCIIHexDecode': <class 'pyhanko.pdf_utils.filters.ASCIIHexDecode'>,
    '/Crypt': <class 'pyhanko.pdf_utils.filters.CryptDecoder'>,
    '/Fl': <class 'pyhanko.pdf_utils.filters.FlateDecode'>,
    '/FlateDecode': <class 'pyhanko.pdf_utils.filters.FlateDecode'>
}

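Dictionary mapping filter names to decoder implementations. Both the full names and their abbreviations (/A85, /AHx, /Fl) are registered, and each abbreviation maps to the same class as its full name.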
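
A name-to-implementation table like this is typically used to resolve the filters listed for a stream and apply them in order. The following sketch shows that pattern with a stand-in registry built from standard library functions; DEMO_DECODERS and decode_chain are hypothetical names, and the way pyHanko itself consumes DECODERS is not shown in this section.

```python
import binascii
import zlib


def _hex_decode(data: bytes) -> bytes:
    # Drop the ">" end-of-data marker before converting hex digits to bytes.
    return binascii.unhexlify(data.split(b">", 1)[0])


# Stand-in registry: abbreviations and full names share one implementation.
DEMO_DECODERS = {
    "/AHx": _hex_decode,
    "/ASCIIHexDecode": _hex_decode,
    "/Fl": zlib.decompress,
    "/FlateDecode": zlib.decompress,
}


def decode_chain(data: bytes, filter_names) -> bytes:
    """Apply each named filter's decoder in the order the names are listed."""
    for name in filter_names:
        try:
            decode = DEMO_DECODERS[name]
        except KeyError:
            raise NotImplementedError(f"unsupported filter {name}") from None
        data = decode(data)
    return data


original = b"stream content " * 4
# Encoding applies the filters in reverse order: compress first, then hex.
stored = binascii.hexlify(zlib.compress(original)) + b">"
recovered = decode_chain(stored, ["/AHx", "/Fl"])
```

Looking up a name that has no entry, such as /LZWDecode, fails with NotImplementedError in this sketch.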