pyhanko.pdf_utils.filters module
Implementation of stream filters for PDF.
Taken from PyPDF2 with modifications. See here for the original license of the PyPDF2 project.
Note that not all decoders specified in the standard are supported.
In particular /Crypt
and /LZWDecode
are missing.
- class pyhanko.pdf_utils.filters.Decoder
Bases:
object
General filter/decoder interface.
- decode(data: bytes, decode_params: dict) bytes
Decode a stream.
- Parameters
data – Data to decode.
decode_params – Decoder parameters, sourced from the
/DecodeParms
entry associated with this filter.
- Returns
Decoded data.
- encode(data: bytes, decode_params: dict) bytes
Encode a stream.
- Parameters
data – Data to encode.
decode_params – Encoder parameters, sourced from the
/DecodeParms
entry associated with this filter.
- Returns
Encoded data.
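As a rough illustration of the interface described above, the sketch below mirrors the encode/decode pair with a stand-in base class. The names StreamFilter, NullFilter, and roundtrips are hypothetical and exist only for this example; they are not part of pyHanko, and the real Decoder class may differ in its details.

```python
from typing import Optional


class StreamFilter:
    """Hypothetical stand-in that mirrors the encode/decode pair."""

    def decode(self, data: bytes, decode_params: Optional[dict] = None) -> bytes:
        raise NotImplementedError

    def encode(self, data: bytes, decode_params: Optional[dict] = None) -> bytes:
        raise NotImplementedError


class NullFilter(StreamFilter):
    """Illustrative filter that passes data through unchanged."""

    def decode(self, data, decode_params=None):
        return bytes(data)

    def encode(self, data, decode_params=None):
        return bytes(data)


def roundtrips(flt: StreamFilter, data: bytes) -> bool:
    """Check that decoding the encoded form gives back the input."""
    return flt.decode(flt.encode(data)) == data
```

A filter written against such an interface would normally be expected to satisfy roundtrips(flt, data) for any input it can encode.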
- class pyhanko.pdf_utils.filters.ASCII85Decode
Bases:
pyhanko.pdf_utils.filters.Decoder
Implementation of the base 85 encoding scheme specified in ISO 32000-1.
- encode(data: bytes, decode_params=None) bytes
Encode a stream.
- Parameters
data – Data to encode.
decode_params – Encoder parameters, sourced from the
/DecodeParms
entry associated with this filter.
- Returns
Encoded data.
- decode(data, decode_params=None)
Decode a stream.
- Parameters
data – Data to decode.
decode_params – Decoder parameters, sourced from the
/DecodeParms
entry associated with this filter.
- Returns
Decoded data.
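To give a feel for what base-85 encoding does, here is a minimal sketch built on the standard library's base64 module rather than on pyHanko itself. The function names ascii85_encode and ascii85_decode are hypothetical, and the handling of the ~> end-of-data marker is an assumption about how such data typically appears in a PDF stream, not a description of pyHanko's implementation.

```python
import base64


def ascii85_encode(data: bytes) -> bytes:
    # Encode with the standard library, then append the "~>" end-of-data
    # marker (assumed here; not taken from pyHanko's source).
    return base64.a85encode(data) + b"~>"


def ascii85_decode(data: bytes) -> bytes:
    # Drop surrounding whitespace and the trailing "~>" marker, if present,
    # before handing the payload to the standard library decoder.
    payload = data.strip()
    if payload.endswith(b"~>"):
        payload = payload[:-2]
    return base64.a85decode(payload)
```

Decoding the output of ascii85_encode with ascii85_decode should return the original bytes.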
- class pyhanko.pdf_utils.filters.ASCIIHexDecode
Bases:
pyhanko.pdf_utils.filters.Decoder
Wrapper around
binascii.hexlify()
that implements the Decoder
interface.
- encode(data: bytes, decode_params=None) bytes
Encode a stream.
- Parameters
data – Data to encode.
decode_params – Encoder parameters, sourced from the
/DecodeParms
entry associated with this filter.
- Returns
Encoded data.
- decode(data, decode_params=None)
Decode a stream.
- Parameters
data – Data to decode.
decode_params – Decoder parameters, sourced from the
/DecodeParms
entry associated with this filter.
- Returns
Decoded data.
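The hexadecimal conversion that this class wraps can be tried directly with the standard library. The sketch below calls binascii on its own and does not go through pyHanko; hex_encode and hex_decode are hypothetical helper names, and the sketch makes no attempt to handle embedded whitespace or the > end-of-data marker that PDF permits in hex-encoded streams.

```python
import binascii


def hex_encode(data: bytes) -> bytes:
    # Each input byte becomes two ASCII hex digits.
    return binascii.hexlify(data)


def hex_decode(data: bytes) -> bytes:
    # Inverse operation: pairs of hex digits back to raw bytes.
    return binascii.unhexlify(data)


print(hex_encode(b"PDF"))      # b'504446'
print(hex_decode(b"504446"))   # b'PDF'
```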
- class pyhanko.pdf_utils.filters.FlateDecode
Bases:
pyhanko.pdf_utils.filters.Decoder
Implementation of the
/FlateDecode
filter.
Warning
Currently not all predictor values are supported. This may cause problems when extracting image data from PDF files.
- decode(data: bytes, decode_params)
Decode a stream.
- Parameters
data – Data to decode.
decode_params – Decoder parameters, sourced from the
/DecodeParms
entry associated with this filter.
- Returns
Decoded data.
- encode(data, decode_params=None)
Encode a stream.
- Parameters
data – Data to encode.
decode_params – Encoder parameters, sourced from the
/DecodeParms
entry associated with this filter.
- Returns
Encoded data.
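The following sketch shows the two ingredients involved in Flate-encoded streams: zlib compression, and reversing a predictor that was applied before compression. It uses only the standard library and is not pyHanko's code. The helper names are hypothetical, and undo_png_up covers just the PNG "None" and "Up" row filters with one byte per column, purely as an example of what a predictor does.

```python
import zlib


def flate_encode(data: bytes) -> bytes:
    # Flate streams use the zlib/deflate format.
    return zlib.compress(data)


def flate_decode(data: bytes) -> bytes:
    return zlib.decompress(data)


def undo_png_up(data: bytes, columns: int) -> bytes:
    """Reverse PNG row filtering for filter types 0 (None) and 2 (Up).

    Assumes one byte per column; every row is preceded by a filter-type byte.
    """
    row_len = columns + 1
    if len(data) % row_len:
        raise ValueError("data length is not a multiple of the row length")
    prev = bytes(columns)  # the row above the first row counts as all zeros
    out = bytearray()
    for start in range(0, len(data), row_len):
        filter_type = data[start]
        row = data[start + 1:start + row_len]
        if filter_type == 0:
            cur = bytes(row)
        elif filter_type == 2:
            # "Up": each byte stores the difference to the byte above it.
            cur = bytes((b + p) & 0xFF for b, p in zip(row, prev))
        else:
            raise NotImplementedError(f"row filter {filter_type} not handled")
        out += cur
        prev = cur
    return bytes(out)
```

In this sketch, a decoder would first call flate_decode and then, if a predictor is specified, pass the result through a function such as undo_png_up.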
- pyhanko.pdf_utils.filters.get_generic_decoder(name: str) pyhanko.pdf_utils.filters.Decoder
Instantiate a specific stream filter decoder type by (PDF) name.
The following names are recognised:
/FlateDecode or /Fl for the decoder implementing Flate compression.
/ASCIIHexDecode or /AHx for the decoder that converts bytes to their hexadecimal representations.
/ASCII85Decode or /A85 for the decoder that converts byte strings to a base-85 textual representation.
Warning
/Crypt
is a special case because it requires access to the document’s security handler.
Warning
LZW compression is currently unsupported, as are most compression methods that are used specifically for image data.
- Parameters
name – Name of the decoder to instantiate.
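Name-based lookup of this kind can be pictured as a small table from PDF filter names to decoders. The sketch below is an assumption-laden illustration and not the actual body of get_generic_decoder: it maps the recognised names to standard-library decode functions instead of Decoder instances, and both the helper name lookup_decode_function and the choice of ValueError for unknown names are hypothetical.

```python
import base64
import binascii
import zlib

# Full names and their abbreviations point at the same decode function.
_DECODE_FUNCTIONS = {
    "/FlateDecode": zlib.decompress,
    "/Fl": zlib.decompress,
    "/ASCIIHexDecode": binascii.unhexlify,
    "/AHx": binascii.unhexlify,
    "/ASCII85Decode": base64.a85decode,
    "/A85": base64.a85decode,
}


def lookup_decode_function(name: str):
    """Return a decode function for a PDF filter name (illustrative only)."""
    try:
        return _DECODE_FUNCTIONS[name]
    except KeyError:
        # Names such as /LZWDecode or /Crypt are absent from the table.
        raise ValueError(f"Unsupported filter name: {name}") from None
```

With this table, lookup_decode_function("/AHx")(b"504446") yields b"PDF", while a name such as "/LZWDecode" is rejected.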