pyhanko.pdf_utils.filters module
Implementation of stream filters for PDF.
Taken from PyPDF2 with modifications. See here for the original license of the PyPDF2 project.
Note that not all decoders specified in the standard are supported. In particular, /LZWDecode and the various JPEG-based decoders are missing.
- class pyhanko.pdf_utils.filters.Decoder
Bases:
object
General filter/decoder interface.
- decode(data: bytes, decode_params: dict) → bytes
Decode a stream.
- Parameters:
data – Data to decode.
decode_params – Decoder parameters, sourced from the /DecodeParms entry associated with this filter.
- Returns:
Decoded data.
- encode(data: bytes, decode_params: dict) → bytes
Encode a stream.
- Parameters:
data – Data to encode.
decode_params – Encoder parameters, sourced from the /DecodeParms entry associated with this filter.
- Returns:
Encoded data.
- class pyhanko.pdf_utils.filters.ASCII85Decode
Bases:
Decoder
Implementation of the base 85 encoding scheme specified in ISO 32000-1.
- encode(data: bytes, decode_params=None) → bytes
Encode a stream.
- Parameters:
data – Data to encode.
decode_params – Encoder parameters, sourced from the /DecodeParms entry associated with this filter.
- Returns:
Encoded data.
- decode(data, decode_params=None)
Decode a stream.
- Parameters:
data – Data to decode.
decode_params – Decoder parameters, sourced from the /DecodeParms entry associated with this filter.
- Returns:
Decoded data.
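The base-85 scheme can be illustrated with the Python standard library alone. The following is a minimal sketch, not pyHanko's own implementation: the helper names `a85_encode` and `a85_decode` are hypothetical, and the sketch assumes the stream data ends with the `~>` end-of-data marker that ISO 32000-1 prescribes for this filter.

```python
import base64

# ISO 32000-1 terminates /ASCII85Decode data with the two characters "~>".
EOD = b"~>"


def a85_encode(data: bytes) -> bytes:
    """Base-85 encode the data and append the end-of-data marker."""
    return base64.a85encode(data) + EOD


def a85_decode(data: bytes) -> bytes:
    """Drop surrounding whitespace and the end-of-data marker, then decode."""
    payload = data.strip()
    if payload.endswith(EOD):
        payload = payload[:-len(EOD)]
    # a85decode() skips embedded ASCII whitespace by default.
    return base64.a85decode(payload)


raw = b"Hello, PDF stream!"
encoded = a85_encode(raw)
assert a85_decode(encoded) == raw
```

The encoded form consists of printable ASCII characters only, which is the point of this filter: binary data can pass through channels that are not 8-bit clean, at the cost of roughly 25% size overhead.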
- class pyhanko.pdf_utils.filters.ASCIIHexDecode
Bases:
Decoder
Wrapper around binascii.hexlify() that implements the Decoder interface.
- encode(data: bytes, decode_params=None) → bytes
Encode a stream.
- Parameters:
data – Data to encode.
decode_params – Encoder parameters, sourced from the /DecodeParms entry associated with this filter.
- Returns:
Encoded data.
- decode(data, decode_params=None)
Decode a stream.
- Parameters:
data – Data to decode.
decode_params – Decoder parameters, sourced from the /DecodeParms entry associated with this filter.
- Returns:
Decoded data.
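As an illustration of what a hexadecimal stream filter does, here is a minimal standard-library sketch. It is not pyHanko's code: the helper names `hex_encode` and `hex_decode` are hypothetical, and the handling of the `>` end-of-data marker and of an odd number of digits follows the rules given for /ASCIIHexDecode in ISO 32000-1.

```python
import binascii

# ISO 32000-1 terminates /ASCIIHexDecode data with a single ">".
EOD = b">"


def hex_encode(data: bytes) -> bytes:
    """Hex-encode the data and append the end-of-data marker."""
    return binascii.hexlify(data) + EOD


def hex_decode(data: bytes) -> bytes:
    """Decode hex digits up to the end-of-data marker, ignoring whitespace."""
    payload = data.split(EOD, 1)[0]
    payload = b"".join(payload.split())  # remove all ASCII whitespace
    if len(payload) % 2:
        # An odd number of digits is treated as if a trailing 0 followed.
        payload += b"0"
    return binascii.unhexlify(payload)


assert hex_encode(b"PDF") == b"504446>"
assert hex_decode(b"50 44\n46>") == b"PDF"
```

Hex encoding doubles the size of the data, so it is mostly useful where readability or 7-bit safety matters more than compactness.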
- class pyhanko.pdf_utils.filters.FlateDecode
Bases:
Decoder
Implementation of the /FlateDecode filter.
Warning
Currently not all predictor values are supported. This may cause problems when extracting image data from PDF files.
- decode(data: bytes, decode_params)
Decode a stream.
- Parameters:
data – Data to decode.
decode_params – Decoder parameters, sourced from the /DecodeParms entry associated with this filter.
- Returns:
Decoded data.
- encode(data, decode_params=None)
Encode a stream.
- Parameters:
data – Data to encode.
decode_params – Encoder parameters, sourced from the /DecodeParms entry associated with this filter.
- Returns:
Encoded data.
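To make the role of predictors concrete, the sketch below shows Flate compression with `zlib` together with the reversal of one PNG predictor. It is an illustration under stated assumptions, not pyHanko's implementation: the function names are hypothetical, only PNG row filter types 0 (None) and 2 (Up) are handled, and each sample is assumed to be exactly one byte wide, so a row holds `columns` data bytes after its filter-type byte.

```python
import zlib


def flate_encode(data: bytes) -> bytes:
    """Compress with zlib/deflate, as /FlateDecode streams are stored."""
    return zlib.compress(data)


def flate_decode(data: bytes) -> bytes:
    """Decompress zlib/deflate data (no predictor applied)."""
    return zlib.decompress(data)


def undo_png_predictor(data: bytes, columns: int) -> bytes:
    """Reverse PNG row filtering for filter types 0 (None) and 2 (Up)."""
    row_len = columns + 1  # one filter-type byte precedes each row
    if len(data) % row_len:
        raise ValueError("data length is not a multiple of the row length")
    prev = bytes(columns)  # the row above the first row counts as all zeros
    out = bytearray()
    for start in range(0, len(data), row_len):
        filter_type = data[start]
        row = data[start + 1:start + row_len]
        if filter_type == 0:
            cur = bytes(row)
        elif filter_type == 2:
            # "Up": each byte was stored as a difference from the byte above.
            cur = bytes((a + b) & 0xFF for a, b in zip(row, prev))
        else:
            raise NotImplementedError(f"PNG filter type {filter_type}")
        out += cur
        prev = cur
    return bytes(out)


# Two rows of three bytes, both stored with the "Up" filter.
filtered = bytes([2, 1, 2, 3, 2, 3, 4, 5])
stream = flate_encode(filtered)
assert undo_png_predictor(flate_decode(stream), columns=3) == bytes([1, 2, 3, 4, 6, 8])
```

A decoder that lacks support for a given predictor can still inflate the data, but the result is the filtered rows rather than the original samples, which is why unsupported predictors matter for image extraction.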
- pyhanko.pdf_utils.filters.get_generic_decoder(name: str) → Decoder
Instantiate a specific stream filter decoder type by (PDF) name.
The following names are recognised:
- /FlateDecode or /Fl for the decoder implementing Flate compression.
- /ASCIIHexDecode or /AHx for the decoder that converts bytes to their hexadecimal representations.
- /ASCII85Decode or /A85 for the decoder that converts byte strings to a base-85 textual representation.
Warning
/Crypt is a special case because it requires access to the document’s security handler.
Warning
LZW compression is currently unsupported, as are most compression methods that are used specifically for image data.
- Parameters:
name – Name of the decoder to instantiate.
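Lookup by PDF name, including the abbreviated forms, can be pictured as a small alias table in front of a registry. The sketch below is a standalone illustration and makes no claim about how pyHanko implements this function: the classes `FlateSketch` and `HexSketch`, the function `lookup_decoder`, and the choice of `ValueError` for unknown names are all assumptions.

```python
import binascii
import zlib


class FlateSketch:
    """Hypothetical stand-in for a Flate decoder."""

    def encode(self, data: bytes, decode_params=None) -> bytes:
        return zlib.compress(data)

    def decode(self, data: bytes, decode_params=None) -> bytes:
        return zlib.decompress(data)


class HexSketch:
    """Hypothetical stand-in for a hexadecimal decoder."""

    def encode(self, data: bytes, decode_params=None) -> bytes:
        return binascii.hexlify(data)

    def decode(self, data: bytes, decode_params=None) -> bytes:
        return binascii.unhexlify(data)


# Abbreviated names map to their full names first.
ALIASES = {"/Fl": "/FlateDecode", "/AHx": "/ASCIIHexDecode"}
DECODERS = {"/FlateDecode": FlateSketch, "/ASCIIHexDecode": HexSketch}


def lookup_decoder(name: str):
    """Return a fresh decoder instance for a full or abbreviated PDF name."""
    canonical = ALIASES.get(name, name)
    try:
        decoder_class = DECODERS[canonical]
    except KeyError:
        # Assumption: this sketch signals unsupported filters with ValueError.
        raise ValueError(f"unsupported filter: {name}") from None
    return decoder_class()


decoder = lookup_decoder("/Fl")
assert decoder.decode(decoder.encode(b"stream data")) == b"stream data"
```

Because both spellings resolve to the same class, calling code does not need to care whether a stream dictionary uses the full or the abbreviated filter name.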