pyhanko.pdf_utils.crypt package

Submodules

pyhanko.pdf_utils.crypt.api module

exception pyhanko.pdf_utils.crypt.api.PdfKeyNotAvailableError(msg: str, *args)

Bases: PdfReadError

class pyhanko.pdf_utils.crypt.api.AuthStatus(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: OrderedEnum

Describes the status after an authentication attempt.

FAILED = 0
USER = 1
OWNER = 2
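
Since AuthStatus derives from an ordered enum, access levels can be compared with the usual operators. The following is a minimal stand-alone sketch of that idea using only the standard library. AccessLevel and has_at_least_user_access are hypothetical names that mirror the members listed above; they are not pyHanko's own OrderedEnum implementation.

```python
from enum import Enum


class AccessLevel(Enum):
    """Hypothetical stand-in mirroring the AuthStatus members."""

    FAILED = 0
    USER = 1
    OWNER = 2

    # Ordering is defined on the member values, and only between
    # members of the same enum class.
    def __lt__(self, other):
        if self.__class__ is other.__class__:
            return self.value < other.value
        return NotImplemented

    def __le__(self, other):
        if self.__class__ is other.__class__:
            return self.value <= other.value
        return NotImplemented

    def __gt__(self, other):
        if self.__class__ is other.__class__:
            return self.value > other.value
        return NotImplemented

    def __ge__(self, other):
        if self.__class__ is other.__class__:
            return self.value >= other.value
        return NotImplemented


def has_at_least_user_access(level: AccessLevel) -> bool:
    # Any successful authentication ranks at or above USER.
    return level >= AccessLevel.USER
```

With such an ordering, a caller can test for "at least user access" in one comparison instead of enumerating the successful states.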
class pyhanko.pdf_utils.crypt.api.PdfMacStatus(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Status of PDF MAC validation.

NOT_APPLICABLE = 0
SUCCESSFUL = 1
FAILED = 2
class pyhanko.pdf_utils.crypt.api.AuthResult(status: AuthStatus, permission_flags: PdfPermissions | None = None, mac_status: PdfMacStatus = PdfMacStatus.NOT_APPLICABLE, mac_failure_reason: str | None = None)

Bases: object

Describes the result of an authentication attempt.

status: AuthStatus

Authentication status after the authentication attempt.

permission_flags: PdfPermissions | None = None

Granular permission flags. The precise meaning depends on the security handler.

mac_status: PdfMacStatus = PdfMacStatus.NOT_APPLICABLE

Status of PDF MAC validation.

mac_failure_reason: str | None = None

Reason for PDF MAC validation failure in human-readable form.

class pyhanko.pdf_utils.crypt.api.SecurityHandlerVersion(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: VersionEnum

Indicates the security handler’s version.

The enum constants are named more or less in accordance with the cryptographic algorithms they permit.

RC4_40 = 1
RC4_LONGER_KEYS = 2
RC4_OR_AES128 = 4
AES256 = 5
AES_GCM = 6
OTHER = None

Placeholder value for custom security handlers.

as_pdf_object() PdfObject
classmethod from_number(value) SecurityHandlerVersion
check_key_length(key_length: int) int
class pyhanko.pdf_utils.crypt.api.SecurityHandler(version: SecurityHandlerVersion, legacy_keylen, crypt_filter_config: CryptFilterConfiguration, encrypt_metadata=True, compat_entries=True, kdf_salt: bytes | None = None)

Bases: object

Generic PDF security handler interface.

This class contains relatively little actual functionality, except for some common initialisation logic and bookkeeping machinery to register security handler implementations.

Parameters:
  • version – Indicates the version of the security handler to use, as described in the specification. See SecurityHandlerVersion.

  • legacy_keylen – Key length in bytes (only relevant for legacy encryption handlers).

  • crypt_filter_config

    The crypt filter configuration for the security handler, in the form of a CryptFilterConfiguration object.

    Note

    PyHanko implements legacy security handlers (which, according to the standard, aren’t crypt filter-aware) using crypt filters as well, even though they aren’t serialised to the output file.

  • encrypt_metadata

    Flag indicating whether document (XMP) metadata is to be encrypted.

    Warning

    Currently, PyHanko does not manage metadata streams, so until that changes, it is the responsibility of the API user to mark metadata streams using the /Identity crypt filter as required.

    Nonetheless, the value of this flag is required in key derivation computations, so the security handler needs to know about it.

  • kdf_salt

    Optional salt value used when deriving additional key material from the main file encryption key.

    Note

    This is currently only relevant for the ISO/TS 32004 (PDF MAC) extension.

  • compat_entries – Write deprecated but technically unnecessary configuration settings for compatibility with certain implementations.

static register(cls: Type[SecurityHandler])

Register a security handler class. Intended to be used as a decorator on subclasses.

See build() for further information.

Parameters:

cls – A subclass of SecurityHandler.

static build(encrypt_dict: DictionaryObject) SecurityHandler

Instantiate an appropriate SecurityHandler from a PDF document’s encryption dictionary.

PyHanko will search the registry for a security handler with a name matching the /Filter entry. Failing that, a security handler implementing the protocol designated by the /SubFilter entry (see support_generic_subfilters()) will be chosen.

Once an appropriate SecurityHandler subclass has been selected, pyHanko will invoke the subclass’s instantiate_from_pdf_object() method with the original encryption dictionary as its argument.

Parameters:

encrypt_dict – A PDF encryption dictionary.
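
The lookup described for build() (match on /Filter first, then fall back to /SubFilter) can be sketched with a plain registry. This is an illustrative stand-alone example, not pyHanko's code: HandlerBase, select and DemoHandler are hypothetical names, and ordinary dicts stand in for PDF encryption dictionaries.

```python
from typing import Dict, Optional, Set, Type


class HandlerBase:
    """Hypothetical stand-in for a security handler base class."""

    _registry: Dict[str, Type["HandlerBase"]] = {}

    @classmethod
    def get_name(cls) -> str:
        raise NotImplementedError

    @classmethod
    def support_generic_subfilters(cls) -> Set[str]:
        # Defaults to the empty set, as in the documented interface.
        return set()

    @staticmethod
    def register(cls: Type["HandlerBase"]) -> Type["HandlerBase"]:
        # Returning the class lets this double as a class decorator.
        HandlerBase._registry[cls.get_name()] = cls
        return cls

    @staticmethod
    def select(encrypt_dict: dict) -> Optional[Type["HandlerBase"]]:
        # Step 1: exact match on the /Filter entry.
        handler = HandlerBase._registry.get(encrypt_dict.get("/Filter"))
        if handler is not None:
            return handler
        # Step 2: fall back to a handler implementing the /SubFilter protocol.
        subfilter = encrypt_dict.get("/SubFilter")
        for candidate in HandlerBase._registry.values():
            if subfilter in candidate.support_generic_subfilters():
                return candidate
        return None


@HandlerBase.register
class DemoHandler(HandlerBase):
    @classmethod
    def get_name(cls) -> str:
        return "/Demo"

    @classmethod
    def support_generic_subfilters(cls) -> Set[str]:
        return {"/demo.generic"}
```

In this sketch, a dictionary whose /Filter is unknown but whose /SubFilter is "/demo.generic" still resolves to DemoHandler.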

Returns:

An instance of the selected SecurityHandler subclass.

classmethod get_name() str

Retrieves the name of this security handler.

Returns:

The name of this security handler.

extract_credential() SerialisableCredential | None

Extract a serialisable credential for later use, if the security handler supports it. It should allow the security handler to be unlocked with the same access level as the current one.

Returns:

A serialisable credential, or None.

classmethod support_generic_subfilters() Set[str]

Indicates the generic /SubFilter values that this security handler supports.

Returns:

A set of generic protocols (indicated in the /SubFilter entry of an encryption dictionary) that this SecurityHandler class implements. Defaults to the empty set.

classmethod instantiate_from_pdf_object(encrypt_dict: DictionaryObject)

Instantiate an object of this class using a PDF encryption dictionary as input.

Parameters:

encrypt_dict – A PDF encryption dictionary.

Returns:

An instance of this class.

is_authenticated() bool

Return True if authentication against this security handler has succeeded for document encryption purposes.

The default implementation just attempts to call get_file_encryption_key() and returns True if that doesn’t raise an error.

as_pdf_object() DictionaryObject

Serialise this security handler to a PDF encryption dictionary.

Returns:

A PDF encryption dictionary.

authenticate(credential, id1=None) AuthResult

Authenticate a credential holder with this security handler.

Parameters:
  • credential – A credential. The type of the credential is left up to the subclasses.

  • id1 – The first part of the document ID of the document being accessed.

Returns:

An AuthResult object indicating the level of access obtained.

get_string_filter() CryptFilter
Returns:

The crypt filter responsible for decrypting strings for this security handler.

get_stream_filter(name=None) CryptFilter
Parameters:

name – Optionally specify a crypt filter by name.

Returns:

The default crypt filter responsible for decrypting streams for this security handler, or the crypt filter named name, if not None.

get_embedded_file_filter()
Returns:

The crypt filter responsible for decrypting embedded files for this security handler.

get_file_encryption_key() bytes

Retrieve the global file encryption key (used for streams and/or strings). If there is no such thing, or the key is not available, an error should be raised.

Raises:

PdfKeyNotAvailableError – when the key is not available

get_kdf_salt() bytes

Get KDF salt value, or raise an error if there is none.

Note

This is currently only relevant for the ISO/TS 32004 (PDF MAC) extension.

Returns:

The KDF salt value.

property pdf_mac_enabled: bool

Boolean indicating whether this security handler has PDF MAC support enabled.

classmethod read_cf_dictionary(cfdict: DictionaryObject, acts_as_default: bool) CryptFilter | None

Interpret a crypt filter dictionary for this type of security handler.

Parameters:
  • cfdict – A crypt filter dictionary.

  • acts_as_default – Indicates whether this filter is intended to be used in /StrF or /StmF.

Returns:

An appropriate CryptFilter object, or None if the crypt filter uses the /None method.

Raises:

NotImplementedError – Raised when the crypt filter’s /CFM entry indicates an unknown crypt filter method.

classmethod process_crypt_filters(encrypt_dict: DictionaryObject) CryptFilterConfiguration | None
classmethod register_crypt_filter(method: NameObject, factory: Callable[[DictionaryObject, bool], CryptFilter])
get_min_pdf_version() Tuple[int, int] | None
get_extensions() List[DeveloperExtension]
class pyhanko.pdf_utils.crypt.api.CryptFilter

Bases: object

Generic abstract crypt filter class.

The superclass only handles the binding with the security handler, and offers some default implementations for serialisation routines that may be overridden in subclasses.

There is generally no requirement for crypt filters to be compatible with any security handler (the leaf classes in this module aren’t), but the API supports mixin usage so code can be shared.

property method: NameObject
Returns:

The method name (/CFM entry) associated with this crypt filter.

get_extensions() List[DeveloperExtension] | None

Get applicable developer extensions for this crypt filter.

property keylen: int
Returns:

The key length (in bytes) of the key associated with this crypt filter.

encrypt(key, plaintext: bytes, params=None) bytes

Encrypt plaintext with the specified key.

Parameters:
  • key – The current local key, which may or may not be equal to this crypt filter’s global key.

  • plaintext – Plaintext to encrypt.

  • params – Optional parameters private to the crypt filter, specified as a PDF dictionary. These can only be used for explicit crypt filters; the parameters are then sourced from the corresponding entry in /DecodeParms.

Returns:

The resulting ciphertext.

decrypt(key, ciphertext: bytes, params=None) bytes

Decrypt ciphertext with the specified key.

Parameters:
  • key – The current local key, which may or may not be equal to this crypt filter’s global key.

  • ciphertext – Ciphertext to decrypt.

  • params – Optional parameters private to the crypt filter, specified as a PDF dictionary. These can only be used for explicit crypt filters; the parameters are then sourced from the corresponding entry in /DecodeParms.

Returns:

The resulting plaintext.

as_pdf_object() DictionaryObject

Serialise this crypt filter to a PDF crypt filter dictionary.

Note

Implementations are encouraged to use a cooperative inheritance model, where subclasses first call super().as_pdf_object() and add the keys they need before returning the result.

This makes it easy to write crypt filter mixins that can provide functionality to multiple handlers.

Returns:

A PDF crypt filter dictionary.
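
The cooperative inheritance model mentioned in the note can be illustrated as follows. This is a hedged sketch with hypothetical classes (BaseFilter, MethodMixin, RecipientsMixin, DemoFilter) and plain dicts in place of PDF dictionary objects; the keys are chosen for illustration only.

```python
class BaseFilter:
    """Hypothetical base class; a plain dict stands in for a PDF dictionary."""

    def as_pdf_object(self) -> dict:
        return {"/Type": "/CryptFilter"}


class MethodMixin(BaseFilter):
    method = "/AESV3"

    def as_pdf_object(self) -> dict:
        # Call up the MRO first, then add the keys this class owns.
        result = super().as_pdf_object()
        result["/CFM"] = self.method
        return result


class RecipientsMixin(BaseFilter):
    def __init__(self, recipients):
        self.recipients = list(recipients)

    def as_pdf_object(self) -> dict:
        result = super().as_pdf_object()
        result["/Recipients"] = self.recipients
        return result


class DemoFilter(RecipientsMixin, MethodMixin):
    """Combines both mixins; each contributes its own keys."""
```

Because every class defers to super() before adding its entries, DemoFilter collects the contributions of both mixins without either one knowing about the other.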

derive_shared_encryption_key() bytes

Compute the (global) file encryption key for this crypt filter.

Returns:

The key, as a bytes object.

Raises:

misc.PdfError – Raised if the data needed to derive the key is not present (e.g. because the caller hasn’t authenticated yet).

derive_object_key(idnum, generation) bytes

Derive the encryption key for a specific object, based on the shared file encryption key.

Parameters:
  • idnum – ID of the object being encrypted.

  • generation – Generation number of the object being encrypted.

Returns:

The local key to use for this object.

set_embedded_only()
property shared_key: bytes

Return the shared file encryption key for this crypt filter, or attempt to compute it using derive_shared_encryption_key() if not available.

class pyhanko.pdf_utils.crypt.api.IdentityCryptFilter

Bases: CryptFilter

Class implementing the trivial crypt filter.

This is a singleton class, so all its instances are identical. Additionally, some of the CryptFilter API is nonfunctional. In particular, as_pdf_object() always raises an error, since the /Identity filter cannot be serialised.

method = '/None'
keylen = 0
derive_shared_encryption_key() bytes

Always returns an empty byte string.

derive_object_key(idnum, generation) bytes

Always returns an empty byte string.

Parameters:
  • idnum – Ignored.

  • generation – Ignored.

Returns:

An empty byte string.

as_pdf_object()

Not implemented for this crypt filter.

Raises:

misc.PdfError – Always.

encrypt(key, plaintext: bytes, params=None) bytes

Identity function.

Parameters:
  • key – Ignored.

  • plaintext – Returned as-is.

  • params – Ignored.

Returns:

The original plaintext.

decrypt(key, ciphertext: bytes, params=None) bytes

Identity function.

Parameters:
  • key – Ignored.

  • ciphertext – Returned as-is.

  • params – Ignored.

Returns:

The original ciphertext.

class pyhanko.pdf_utils.crypt.api.CryptFilterConfiguration(crypt_filters: Dict[str, CryptFilter], default_stream_filter='/Identity', default_string_filter='/Identity', default_file_filter=None)

Bases: object

Crypt filter store attached to a security handler.

Instances of this class are not designed to be reusable.

Parameters:
  • crypt_filters – A dictionary mapping names to their corresponding crypt filters.

  • default_stream_filter – Name of the default crypt filter to use for streams.

  • default_string_filter – Name of the default crypt filter to use for strings.

  • default_file_filter

    Name of the default crypt filter to use for embedded files.

    Note

    PyHanko currently is not aware of embedded files, so managing these is the API user’s responsibility.

filters() Iterable[CryptFilter]

Enumerate all crypt filters in this configuration.

set_security_handler(handler: SecurityHandler)

Set the security handler on all crypt filters in this configuration.

Parameters:

handler – A SecurityHandler instance.

get_for_stream()

Retrieve the default crypt filter to use with streams.

Returns:

A CryptFilter instance.

get_for_string()

Retrieve the default crypt filter to use with strings.

Returns:

A CryptFilter instance.

get_for_embedded_file()

Retrieve the default crypt filter to use with embedded files.

Returns:

A CryptFilter instance.

property stream_filter_name: NameObject

The name of the default crypt filter to use with streams.

property string_filter_name: NameObject

The name of the default crypt filter to use with strings.

property embedded_file_filter_name: NameObject

Retrieve the name of the default crypt filter to use with embedded files.

as_pdf_object()

Serialise this crypt filter configuration to a dictionary object, including all its subordinate crypt filters (with the exception of the identity filter, if relevant).

standard_filters()

Return the “standard” filters associated with this crypt filter configuration, i.e. those registered as the defaults for strings, streams and embedded files, respectively.

These sometimes require special treatment (as per the specification).

Returns:

A set with one, two or three elements.

pyhanko.pdf_utils.crypt.api.build_crypt_filter(reg: Dict[NameObject, Callable[[DictionaryObject, bool], CryptFilter]], cfdict: DictionaryObject, acts_as_default: bool) CryptFilter | None

Interpret a crypt filter dictionary for a security handler.

Parameters:
  • reg – A registry of named crypt filters.

  • cfdict – A crypt filter dictionary.

  • acts_as_default – Indicates whether this filter is intended to be used in /StrF or /StmF.

Returns:

An appropriate CryptFilter object, or None if the crypt filter uses the /None method.

Raises:

NotImplementedError – Raised when the crypt filter’s /CFM entry indicates an unknown crypt filter method.

pyhanko.pdf_utils.crypt.cred_ser module

class pyhanko.pdf_utils.crypt.cred_ser.SerialisedCredential(credential_type: str, data: bytes)

Bases: object

A credential in serialised form.

credential_type: str

The registered type name of the credential (see SerialisableCredential.register()).

data: bytes

The credential data, as a byte string.

class pyhanko.pdf_utils.crypt.cred_ser.SerialisableCredential

Bases: ABC

Class representing a credential that can be serialised.

classmethod get_name() str

Get the type name of the credential, which will be embedded into serialised values and used on deserialisation.

static register(cls: Type[SerialisableCredential])

Register a subclass into the credential serialisation registry, using the name returned by get_name(). Can be used as a class decorator.

Parameters:

cls – The subclass.

Returns:

The subclass.

static deserialise(ser_value: SerialisedCredential) SerialisableCredential

Deserialise a SerialisedCredential value by looking up the proper subclass of SerialisableCredential and invoking its deserialisation method.

Parameters:

ser_value – The value to deserialise.

Returns:

The deserialised credential.

Raises:

misc.PdfReadError – If a deserialisation error occurs.

serialise() SerialisedCredential

Serialise a value to an annotated SerialisedCredential value.

Returns:

A SerialisedCredential value.

Raises:

misc.PdfWriteError – If a serialisation error occurs.
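
The register/serialise/deserialise round trip described above follows a common name-keyed registry pattern. The sketch below shows that pattern in isolation. All names (Serialised, Credential, PasswordCredential, to_bytes, from_bytes) are hypothetical, and the error type is a plain ValueError rather than pyHanko's exceptions.

```python
from dataclasses import dataclass
from typing import Dict, Type


@dataclass(frozen=True)
class Serialised:
    """Hypothetical stand-in for a serialised credential."""

    credential_type: str
    data: bytes


class Credential:
    _registry: Dict[str, Type["Credential"]] = {}

    @classmethod
    def get_name(cls) -> str:
        raise NotImplementedError

    @staticmethod
    def register(cls):
        # Keyed by the type name, which is also embedded on serialisation.
        Credential._registry[cls.get_name()] = cls
        return cls

    def to_bytes(self) -> bytes:  # hypothetical subclass hook
        raise NotImplementedError

    @classmethod
    def from_bytes(cls, data: bytes) -> "Credential":  # hypothetical hook
        raise NotImplementedError

    def serialise(self) -> Serialised:
        return Serialised(self.get_name(), self.to_bytes())

    @staticmethod
    def deserialise(value: Serialised) -> "Credential":
        try:
            target = Credential._registry[value.credential_type]
        except KeyError:
            raise ValueError(
                f"unknown credential type {value.credential_type!r}"
            ) from None
        return target.from_bytes(value.data)


@Credential.register
class PasswordCredential(Credential):
    def __init__(self, password: str):
        self.password = password

    @classmethod
    def get_name(cls) -> str:
        return "demo_password"

    def to_bytes(self) -> bytes:
        return self.password.encode("utf-8")

    @classmethod
    def from_bytes(cls, data: bytes) -> "PasswordCredential":
        return cls(data.decode("utf-8"))
```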

pyhanko.pdf_utils.crypt.filter_mixins module

class pyhanko.pdf_utils.crypt.filter_mixins.RC4CryptFilterMixin(*, keylen=5, **kwargs)

Bases: CryptFilter, ABC

Mixin for RC4-based crypt filters.

Parameters:

keylen – Key length, in bytes. Defaults to 5.

method = '/V2'
property keylen: int
Returns:

The key length (in bytes) of the key associated with this crypt filter.

encrypt(key, plaintext: bytes, params=None) bytes

Encrypt data using RC4.

Parameters:
  • key – Local encryption key.

  • plaintext – Plaintext to encrypt.

  • params – Ignored.

Returns:

Ciphertext.

decrypt(key, ciphertext: bytes, params=None) bytes

Decrypt data using RC4.

Parameters:
  • key – Local encryption key.

  • ciphertext – Ciphertext to decrypt.

  • params – Ignored.

Returns:

Plaintext.

derive_object_key(idnum, generation) bytes

Derive the local key for the given object ID and generation number, by calling legacy_derive_object_key().

Parameters:
  • idnum – ID of the object being encrypted.

  • generation – Generation number of the object being encrypted.

Returns:

The local key.

class pyhanko.pdf_utils.crypt.filter_mixins.AESCryptFilterMixin(*, keylen: int, **kwargs)

Bases: CryptFilter, ABC

Mixin for AES-based crypt filters.

property method: NameObject
Returns:

The method name (/CFM entry) associated with this crypt filter.

property keylen: int
Returns:

The key length (in bytes) of the key associated with this crypt filter.

encrypt(key, plaintext: bytes, params=None)

Encrypt data using AES in CBC mode, with PKCS#7 padding.

Parameters:
  • key – The key to use.

  • plaintext – The plaintext to be encrypted.

  • params – Ignored.

Returns:

The resulting ciphertext, prepended with a 16-byte initialisation vector.

decrypt(key, ciphertext: bytes, params=None) bytes

Decrypt data using AES in CBC mode, with PKCS#7 padding.

Parameters:
  • key – The key to use.

  • ciphertext – The ciphertext to be decrypted, prepended with a 16-byte initialisation vector.

  • params – Ignored.

Returns:

The resulting plaintext.
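
The payload layout used by the CBC-mode methods (a 16-byte initialisation vector followed by PKCS#7-padded ciphertext) can be sketched without any cipher at all. The helpers below are hypothetical and use only the standard library; they show the padding and framing, not the AES operation itself.

```python
BLOCK_SIZE = 16  # AES block size in bytes; also the IV length in CBC mode


def pkcs7_pad(data: bytes) -> bytes:
    # Always adds between 1 and BLOCK_SIZE bytes, each equal to the pad length.
    pad_len = BLOCK_SIZE - (len(data) % BLOCK_SIZE)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes) -> bytes:
    if not padded or len(padded) % BLOCK_SIZE:
        raise ValueError("padded data must be a non-empty multiple of 16 bytes")
    pad_len = padded[-1]
    if not 1 <= pad_len <= BLOCK_SIZE:
        raise ValueError("invalid PKCS#7 padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid PKCS#7 padding bytes")
    return padded[:-pad_len]


def split_iv(payload: bytes):
    # The IV travels in front of the ciphertext.
    if len(payload) < BLOCK_SIZE:
        raise ValueError("payload too short to contain an IV")
    return payload[:BLOCK_SIZE], payload[BLOCK_SIZE:]
```

Note that input that is already block-aligned still receives a full block of padding, so unpadding is unambiguous.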

derive_object_key(idnum, generation) bytes

Derive the local key for the given object ID and generation number.

If the associated handler is of version SecurityHandlerVersion.AES256 or greater, this method simply returns the global key as-is. If not, the computation is carried out by legacy_derive_object_key().

Parameters:
  • idnum – ID of the object being encrypted.

  • generation – Generation number of the object being encrypted.

Returns:

The local key.
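
The version-dependent behaviour described above can be sketched as follows. The legacy branch follows the per-object key derivation commonly described for ISO 32000 (MD5 over the file key, the low-order bytes of the object and generation numbers, and an extra salt for AES). This is an assumption-laden illustration: legacy_object_key and object_key are hypothetical functions, and pyHanko's own legacy_derive_object_key() may differ in detail.

```python
import hashlib

AES256_VERSION = 5  # numeric value of SecurityHandlerVersion.AES256 listed above


def legacy_object_key(shared_key: bytes, idnum: int, generation: int,
                      use_aes: bool) -> bytes:
    # Low-order 3 bytes of the object number and low-order 2 bytes of the
    # generation number, both little-endian.
    material = (
        shared_key
        + (idnum & 0xFFFFFF).to_bytes(3, "little")
        + (generation & 0xFFFF).to_bytes(2, "little")
    )
    if use_aes:
        material += b"sAlT"  # extra constant appended for AES crypt filters
    digest = hashlib.md5(material).digest()
    # The local key is n + 5 bytes long, capped at 16.
    return digest[: min(len(shared_key) + 5, 16)]


def object_key(shared_key: bytes, idnum: int, generation: int, *,
               handler_version: int, use_aes: bool) -> bytes:
    # From AES-256 onwards, the global key is used as-is for every object.
    if handler_version >= AES256_VERSION:
        return shared_key
    return legacy_object_key(shared_key, idnum, generation, use_aes)
```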

class pyhanko.pdf_utils.crypt.filter_mixins.AESGCMCryptFilterMixin(**kwargs)

Bases: CryptFilter, ABC

Mixin for AES-GCM-based crypt filters (ISO/TS 32003).

method = '/AESV4'
property keylen: int
Returns:

The key length (in bytes) of the key associated with this crypt filter.

encrypt(key, plaintext: bytes, params=None)

Encrypt data using AES-GCM.

Parameters:
  • key – The key to use.

  • plaintext – The plaintext to be encrypted.

  • params – Ignored.

Returns:

The resulting ciphertext and tag, prepended with a 12-byte nonce.

decrypt(key, ciphertext: bytes, params=None) bytes

Decrypt data using AES-GCM.

Parameters:
  • key – The key to use.

  • ciphertext – The ciphertext to be decrypted, prepended with a 12-byte initialisation vector, and suffixed with the 16-byte authentication tag.

  • params – Ignored.

Returns:

The resulting plaintext.
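
The AES-GCM payload layout described above (12-byte nonce, then ciphertext, then a 16-byte authentication tag) can be taken apart with simple slicing. split_gcm_payload is a hypothetical helper shown for illustration only; it performs no decryption or tag verification.

```python
NONCE_LEN = 12  # bytes of nonce / initialisation vector at the front
TAG_LEN = 16    # bytes of authentication tag at the end


def split_gcm_payload(payload: bytes):
    """Split a payload into (nonce, ciphertext, tag) without decrypting it."""
    if len(payload) < NONCE_LEN + TAG_LEN:
        raise ValueError("payload too short for a nonce and a tag")
    nonce = payload[:NONCE_LEN]
    tag = payload[-TAG_LEN:]
    ciphertext = payload[NONCE_LEN:-TAG_LEN]
    return nonce, ciphertext, tag
```

An empty plaintext is still valid in this layout: the payload is then exactly 28 bytes long and the ciphertext slice is empty.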

get_extensions() List[DeveloperExtension] | None

Get applicable developer extensions for this crypt filter.

pyhanko.pdf_utils.crypt.permissions module

class pyhanko.pdf_utils.crypt.permissions.PdfPermissions(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Flag

Utility mixin for PDF permission flags.

classmethod allow_everything()

Set all permissions.

classmethod from_uint(uint_flags: int)

Convert a 32-bit unsigned integer into PDF permission flags.

classmethod from_bytes(flags: bytes)

Convert a string of 4 bytes into PDF permission flags.

classmethod from_sint32(sint32_flags: int)

Convert a 32-bit signed integer into PDF permission flags.

as_uint32() int

Convert a set of PDF permission flags to their 32-bit unsigned integer representation.

This will already take into account some conventions in the PDF specification, i.e. to set as-yet undefined permission flags to ‘Allow’.

as_bytes() bytes

Convert a set of PDF permission flags to their binary representation.

as_sint32() int

Convert a set of PDF permission flags to their signed integer representation.
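
The signed, unsigned and byte-string representations handled by these methods describe the same 32 bits. The sketch below shows the conversions with the standard struct module. The function names are hypothetical, and the big-endian byte order used for the 4-byte form is an assumption rather than something stated in this section.

```python
import struct


def uint32_to_sint32(value: int) -> int:
    # Reinterpret the same 32 bits as a two's-complement signed integer.
    return struct.unpack(">i", struct.pack(">I", value & 0xFFFFFFFF))[0]


def sint32_to_uint32(value: int) -> int:
    return struct.unpack(">I", struct.pack(">i", value))[0]


def uint32_to_bytes(value: int) -> bytes:
    # Big-endian byte order is assumed here for illustration.
    return struct.pack(">I", value & 0xFFFFFFFF)


def bytes_to_uint32(flags: bytes) -> int:
    if len(flags) != 4:
        raise ValueError("expected exactly 4 bytes")
    return struct.unpack(">I", flags)[0]
```

Values with the high bit set, which is typical once undefined permission bits are set to "Allow", come out negative in the signed form.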

mac_required() bool
class pyhanko.pdf_utils.crypt.permissions.StandardPermissions(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: PdfPermissions, Flag

Permission flags for the standard security handler.

See Table 22 in ISO 32000-2:2020.

ALLOW_PRINTING = 4
ALLOW_MODIFICATION_GENERIC = 8
ALLOW_CONTENT_EXTRACTION = 16
ALLOW_ANNOTS_FORM_FILLING = 32
ALLOW_FORM_FILLING = 256
ALLOW_ASSISTIVE_TECHNOLOGY = 512
ALLOW_REASSEMBLY = 1024
ALLOW_HIGH_QUALITY_PRINTING = 2048
TOLERATE_MISSING_PDF_MAC = 4096
as_uint32() int

Convert a set of PDF permission flags to their 32-bit unsigned integer representation.

This will already take into account some conventions in the PDF specification, i.e. to set as-yet undefined permission flags to ‘Allow’.

mac_required() bool
class pyhanko.pdf_utils.crypt.permissions.PubKeyPermissions(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: PdfPermissions, Flag

Permission flags for the public-key security handler.

See Table 24 in ISO 32000-2:2020.

ALLOW_ENCRYPTION_CHANGE = 2
ALLOW_PRINTING = 4
ALLOW_MODIFICATION_GENERIC = 8
ALLOW_CONTENT_EXTRACTION = 16
ALLOW_ANNOTS_FORM_FILLING = 32
ALLOW_FORM_FILLING = 256
ALLOW_ASSISTIVE_TECHNOLOGY = 512
ALLOW_REASSEMBLY = 1024
ALLOW_HIGH_QUALITY_PRINTING = 2048
TOLERATE_MISSING_PDF_MAC = 4096
as_uint32() int

Convert a set of PDF permission flags to their 32-bit unsigned integer representation.

This will already take into account some conventions in the PDF specification, i.e. to set as-yet undefined permission flags to ‘Allow’.

mac_required() bool
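
The flag values listed above combine bitwise. As a stand-alone illustration, the hypothetical DemoPubKeyPermissions class below copies the documented values into a plain enum.Flag; it is not pyHanko's PubKeyPermissions class and omits its conversion helpers.

```python
from enum import Flag


class DemoPubKeyPermissions(Flag):
    """Hypothetical copy of the documented public-key permission values."""

    ALLOW_ENCRYPTION_CHANGE = 2
    ALLOW_PRINTING = 4
    ALLOW_MODIFICATION_GENERIC = 8
    ALLOW_CONTENT_EXTRACTION = 16
    ALLOW_ANNOTS_FORM_FILLING = 32
    ALLOW_FORM_FILLING = 256
    ALLOW_ASSISTIVE_TECHNOLOGY = 512
    ALLOW_REASSEMBLY = 1024
    ALLOW_HIGH_QUALITY_PRINTING = 2048
    TOLERATE_MISSING_PDF_MAC = 4096


# Build the "everything allowed" combination by OR-ing every member.
everything = DemoPubKeyPermissions(0)
for flag in DemoPubKeyPermissions:
    everything |= flag

# Remove a single permission by masking it out.
no_printing = everything & ~DemoPubKeyPermissions.ALLOW_PRINTING
```

The combined value of all ten flags is 7998, which matches the default shown for the perms parameter in the pubkey module signatures.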

pyhanko.pdf_utils.crypt.pubkey module

class pyhanko.pdf_utils.crypt.pubkey.RecipientEncryptionPolicy(ignore_key_usage: bool = False, prefer_oaep: bool = False)

Bases: object

ignore_key_usage: bool = False

Ignore key usage bits in the recipient’s certificate.

prefer_oaep: bool = False

For RSA recipients, encrypt with RSAES-OAEP.

Warning

This is not widely supported.

class pyhanko.pdf_utils.crypt.pubkey.PubKeyCryptFilter(*, recipients=None, acts_as_default=False, encrypt_metadata=True, **kwargs)

Bases: CryptFilter, ABC

Crypt filter for use with public key security handler. These are a little more independent than their counterparts for the standard security handlers, since different crypt filters can cater to different sets of recipients.

Parameters:
  • recipients – List of CMS objects encoding recipient information for this crypt filter.

  • acts_as_default – Indicates whether this filter is intended to be used in /StrF or /StmF.

  • encrypt_metadata

    Whether this crypt filter should encrypt document-level metadata.

    Warning

    See SecurityHandler for some background on the way pyHanko interprets this value.

add_recipients(certs: List[Certificate], policy: RecipientEncryptionPolicy, perms: PubKeyPermissions = <PubKeyPermissions.ALLOW_ENCRYPTION_CHANGE|ALLOW_PRINTING|ALLOW_MODIFICATION_GENERIC|ALLOW_CONTENT_EXTRACTION|ALLOW_ANNOTS_FORM_FILLING|ALLOW_FORM_FILLING|ALLOW_ASSISTIVE_TECHNOLOGY|ALLOW_REASSEMBLY|ALLOW_HIGH_QUALITY_PRINTING|TOLERATE_MISSING_PDF_MAC: 7998>)

Add recipients to this crypt filter. This always adds one full CMS object to the Recipients array.

Parameters:
  • certs – A list of recipient certificates.

  • policy – Encryption policy choices for the chosen set of recipients.

  • perms – The permission bits to assign to the listed recipients.

authenticate(credential) AuthResult

Authenticate to this crypt filter in particular. If used in /StmF or /StrF, you don’t need to worry about calling this method directly.

Parameters:

credential – The EnvelopeKeyDecrypter to authenticate with.

Returns:

An AuthResult object indicating the level of access obtained.

derive_shared_encryption_key() bytes

Compute the (global) file encryption key for this crypt filter.

Returns:

The key, as a bytes object.

Raises:

misc.PdfError – Raised if the data needed to derive the key is not present (e.g. because the caller hasn’t authenticated yet).

as_pdf_object()

Serialise this crypt filter to a PDF crypt filter dictionary.

Note

Implementations are encouraged to use a cooperative inheritance model, where subclasses first call super().as_pdf_object() and add the keys they need before returning the result.

This makes it easy to write crypt filter mixins that can provide functionality to multiple handlers.

Returns:

A PDF crypt filter dictionary.

class pyhanko.pdf_utils.crypt.pubkey.PubKeyAESCryptFilter(*, recipients=None, acts_as_default=False, encrypt_metadata=True, **kwargs)

Bases: PubKeyCryptFilter, AESCryptFilterMixin

AES crypt filter for public key security handlers.

class pyhanko.pdf_utils.crypt.pubkey.PubKeyAESGCMCryptFilter(*, recipients=None, acts_as_default=False, encrypt_metadata=True, **kwargs)

Bases: PubKeyCryptFilter, AESGCMCryptFilterMixin

AES-GCM crypt filter for public key security handlers.

class pyhanko.pdf_utils.crypt.pubkey.PubKeyRC4CryptFilter(*, recipients=None, acts_as_default=False, encrypt_metadata=True, **kwargs)

Bases: PubKeyCryptFilter, RC4CryptFilterMixin

RC4 crypt filter for public key security handlers.

pyhanko.pdf_utils.crypt.pubkey.DEFAULT_CRYPT_FILTER = '/DefaultCryptFilter'

Default name to use for the default crypt filter in public key security handlers.

pyhanko.pdf_utils.crypt.pubkey.DEF_EMBEDDED_FILE = '/DefEmbeddedFile'

Default name to use for the EFF crypt filter in public key security handlers for documents where only embedded files are encrypted.

class pyhanko.pdf_utils.crypt.pubkey.PubKeyAdbeSubFilter(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Enum describing the different subfilters that can be used for public key encryption in the PDF specification.

S3 = '/adbe.pkcs7.s3'
S4 = '/adbe.pkcs7.s4'
S5 = '/adbe.pkcs7.s5'
pyhanko.pdf_utils.crypt.pubkey.construct_envelope_content(seed: bytes, perms: PubKeyPermissions, include_permissions=True)
pyhanko.pdf_utils.crypt.pubkey.construct_recipient_cms(certificates: List[Certificate], seed: bytes, perms: PubKeyPermissions, policy: RecipientEncryptionPolicy, include_permissions=True) ContentInfo
exception pyhanko.pdf_utils.crypt.pubkey.InappropriateCredentialError

Bases: TypeError

class pyhanko.pdf_utils.crypt.pubkey.EnvelopeKeyDecrypter

Bases: object

General credential class for use with public key security handlers.

This allows the key decryption process to happen offline, e.g. on a smart card.

property cert: Certificate
Returns:

The recipient’s certificate.

decrypt(encrypted_key: bytes, algo_params: KeyEncryptionAlgorithm) bytes

Invoke the actual key decryption algorithm. Used with key transport.

Parameters:
  • encrypted_key – Payload to decrypt.

  • algo_params – Specification of the encryption algorithm as a CMS object.

Raises:

InappropriateCredentialError – if the credential cannot be used for key transport.

Returns:

The decrypted payload.

decrypt_with_exchange(encrypted_key: bytes, algo_params: KeyEncryptionAlgorithm, originator_identifier: OriginatorIdentifierOrKey, user_keying_material: bytes) bytes

Decrypt an envelope key using a key derived from a key exchange.

Parameters:
  • encrypted_key – Payload to decrypt.

  • algo_params – Specification of the encryption algorithm as a CMS object.

  • originator_identifier – Information about the originator necessary to complete the key exchange.

  • user_keying_material – The user keying material that will be used in the key derivation.

Returns:

The decrypted payload.

class pyhanko.pdf_utils.crypt.pubkey.ECCCMSSharedInfo(value=None, default=None, **kwargs)

Bases: Sequence

class pyhanko.pdf_utils.crypt.pubkey.SimpleEnvelopeKeyDecrypter(cert: Certificate, private_key: PrivateKeyInfo)

Bases: EnvelopeKeyDecrypter, SerialisableCredential

Implementation of EnvelopeKeyDecrypter where the private key is an RSA or ECC key residing in memory.

Parameters:
  • cert – The recipient’s certificate.

  • private_key – The recipient’s private key.

dhsinglepass_stddh_arc_pattern = re.compile('1\\.3\\.132\\.1\\.11\\.(\\d+)')
classmethod get_name() str

Get the type name of the credential, which will be embedded into serialised values and used on deserialisation.

property cert: Certificate
Returns:

The recipient’s certificate.

static load(key_file, cert_file, key_passphrase=None)

Load a key decrypter using key material from files on disk.

Parameters:
  • key_file – File containing the recipient’s private key.

  • cert_file – File containing the recipient’s certificate.

  • key_passphrase – Passphrase for the key file, if applicable.

Returns:

An instance of SimpleEnvelopeKeyDecrypter.

classmethod load_pkcs12(pfx_file, passphrase=None)

Load a key decrypter using key material from a PKCS#12 file on disk.

Parameters:
  • pfx_file – Path to the PKCS#12 file containing the key material.

  • passphrase – Passphrase for the private key, if applicable.

Returns:

An instance of SimpleEnvelopeKeyDecrypter.

decrypt(encrypted_key: bytes, algo_params: KeyEncryptionAlgorithm) bytes

Decrypt the payload using RSA with PKCS#1 v1.5 padding or OAEP. Other schemes are not (currently) supported by this implementation.

Parameters:
  • encrypted_key – Payload to decrypt.

  • algo_params – Specification of the encryption algorithm as a CMS object. Must use rsaes_pkcs1v15 or rsaes_oaep.

Returns:

The decrypted payload.

decrypt_with_exchange(encrypted_key: bytes, algo_params: KeyEncryptionAlgorithm, originator_identifier: OriginatorIdentifierOrKey, user_keying_material: bytes | None) bytes

Decrypt the payload using a key agreed via ephemeral-static standard (non-cofactor) ECDH with X9.63 key derivation. Other schemes are not supported at this time.

Parameters:
  • encrypted_key – Payload to decrypt.

  • algo_params – Specification of the encryption algorithm as a CMS object.

  • originator_identifier – The originator info, which must be an EC key.

  • user_keying_material – The user keying material that will be used in the key derivation.

Returns:

The decrypted payload.

pyhanko.pdf_utils.crypt.pubkey.read_envelope_key(ed: EnvelopedData, decrypter: EnvelopeKeyDecrypter) bytes | None
pyhanko.pdf_utils.crypt.pubkey.read_seed_from_recipient_cms(recipient_cms: ContentInfo, decrypter: EnvelopeKeyDecrypter) Tuple[bytes | None, PubKeyPermissions | None]
class pyhanko.pdf_utils.crypt.pubkey.PubKeySecurityHandler(version: SecurityHandlerVersion, pubkey_handler_subfilter: PubKeyAdbeSubFilter, legacy_keylen, encrypt_metadata=True, crypt_filter_config: CryptFilterConfiguration | None = None, recipient_objs: list | None = None, compat_entries=True, kdf_salt: bytes | None = None)

Bases: SecurityHandler

Security handler for public key encryption in PDF.

As with the standard security handler, you essentially shouldn’t ever have to instantiate these yourself (see build_from_certs()).

classmethod build_from_certs(certs: List[Certificate], keylen_bytes=16, version=SecurityHandlerVersion.AES256, use_aes=True, use_crypt_filters=True, perms: PubKeyPermissions = <PubKeyPermissions.ALLOW_ENCRYPTION_CHANGE|ALLOW_PRINTING|ALLOW_MODIFICATION_GENERIC|ALLOW_CONTENT_EXTRACTION|ALLOW_ANNOTS_FORM_FILLING|ALLOW_FORM_FILLING|ALLOW_ASSISTIVE_TECHNOLOGY|ALLOW_REASSEMBLY|ALLOW_HIGH_QUALITY_PRINTING|TOLERATE_MISSING_PDF_MAC: 7998>, encrypt_metadata=True, policy: RecipientEncryptionPolicy = RecipientEncryptionPolicy(ignore_key_usage=False, prefer_oaep=False), pdf_mac: bool = True, **kwargs) PubKeySecurityHandler

Create a new public key security handler.

This method takes many parameters, but only certs is mandatory. The default behaviour is to create a public key encryption handler where the underlying symmetric encryption is provided by AES-256. Any remaining keyword arguments will be passed to the constructor.

Parameters:
  • certs – The recipients’ certificates.

  • keylen_bytes – The key length (in bytes). This is only relevant for legacy security handlers.

  • version – The security handler version to use.

  • use_aes – Use AES-128 instead of RC4 (only meaningful if the version parameter is RC4_OR_AES128).

  • use_crypt_filters – Whether to use crypt filters. This is mandatory for security handlers of version RC4_OR_AES128 or higher.

  • perms – Permission flags.

  • encrypt_metadata

    Whether to encrypt document metadata.

    Warning

    See SecurityHandler for some background on the way pyHanko interprets this value.

  • pdf_mac

    Include an ISO/TS 32004 MAC.

    Warning

    Only works for PDF 2.0 security handlers.

  • policy – Encryption policy choices for the chosen set of recipients.

Returns:

An instance of PubKeySecurityHandler.
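
The default perms value in the signature is a combination of flags whose integer value is 7998. The sketch below reproduces that number with a stand-in IntFlag. The class PermsSketch is hypothetical, and the individual bit values are assumptions based on the usual PDF permission bit positions; they are chosen because they add up to the documented defaults (7998 here, and 7996 for the standard handler, which lacks ALLOW_ENCRYPTION_CHANGE).

```python
import enum


class PermsSketch(enum.IntFlag):
    # Assumed bit values; stand-in for illustration only.
    ALLOW_ENCRYPTION_CHANGE = 2
    ALLOW_PRINTING = 4
    ALLOW_MODIFICATION_GENERIC = 8
    ALLOW_CONTENT_EXTRACTION = 16
    ALLOW_ANNOTS_FORM_FILLING = 32
    ALLOW_FORM_FILLING = 256
    ALLOW_ASSISTIVE_TECHNOLOGY = 512
    ALLOW_REASSEMBLY = 1024
    ALLOW_HIGH_QUALITY_PRINTING = 2048
    TOLERATE_MISSING_PDF_MAC = 4096


# Combine every flag, mirroring the default shown in the signature.
all_flags = PermsSketch(0)
for flag in PermsSketch:
    all_flags |= flag

# Derive a more restrictive set by clearing both printing flags.
printing = PermsSketch.ALLOW_PRINTING | PermsSketch.ALLOW_HIGH_QUALITY_PRINTING
no_printing = all_flags & ~printing
```

Assuming the real permission enums behave like ordinary flag enums, a restricted set built this way would be passed as the perms argument.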

classmethod get_name() str

Retrieves the name of this security handler.

Returns:

The name of this security handler.

classmethod support_generic_subfilters() Set[str]

Indicates the generic /SubFilter values that this security handler supports.

Returns:

A set of generic protocols (indicated in the /SubFilter entry of an encryption dictionary) that this SecurityHandler class implements. Defaults to the empty set.

classmethod read_cf_dictionary(cfdict: DictionaryObject, acts_as_default: bool) CryptFilter

Interpret a crypt filter dictionary for this type of security handler.

Parameters:
  • cfdict – A crypt filter dictionary.

  • acts_as_default – Indicates whether this filter is intended to be used in /StrF or /StmF.

Returns:

An appropriate CryptFilter object, or None if the crypt filter uses the /None method.

Raises:

NotImplementedError – Raised when the crypt filter’s /CFM entry indicates an unknown crypt filter method.

classmethod process_crypt_filters(encrypt_dict: DictionaryObject) CryptFilterConfiguration | None
classmethod gather_pub_key_metadata(encrypt_dict: DictionaryObject)
classmethod instantiate_from_pdf_object(encrypt_dict: DictionaryObject)

Instantiate an object of this class using a PDF encryption dictionary as input.

Parameters:

encrypt_dict – A PDF encryption dictionary.

Returns:

An instance of this class.

as_pdf_object()

Serialise this security handler to a PDF encryption dictionary.

Returns:

A PDF encryption dictionary.

add_recipients(certs: List[Certificate], perms: PubKeyPermissions = <PubKeyPermissions.ALLOW_ENCRYPTION_CHANGE|ALLOW_PRINTING|ALLOW_MODIFICATION_GENERIC|ALLOW_CONTENT_EXTRACTION|ALLOW_ANNOTS_FORM_FILLING|ALLOW_FORM_FILLING|ALLOW_ASSISTIVE_TECHNOLOGY|ALLOW_REASSEMBLY|ALLOW_HIGH_QUALITY_PRINTING|TOLERATE_MISSING_PDF_MAC: 7998>, policy: RecipientEncryptionPolicy = RecipientEncryptionPolicy(ignore_key_usage=False, prefer_oaep=False))
authenticate(credential: EnvelopeKeyDecrypter | SerialisedCredential, id1=None) AuthResult

Authenticate a user to this security handler.

Parameters:
  • credential – The credential to use (an instance of EnvelopeKeyDecrypter in this case).

  • id1 – First part of the document ID. Public key encryption handlers ignore this value.

Returns:

An AuthResult object indicating the level of access obtained.

get_file_encryption_key() bytes

Retrieve the global file encryption key (used for streams and/or strings). If there is no such thing, or the key is not available, an error should be raised.

Raises:

PdfKeyNotAvailableError – Raised when the key is not available.

pyhanko.pdf_utils.crypt.standard module

class pyhanko.pdf_utils.crypt.standard.StandardSecuritySettingsRevision(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: VersionEnum

Indicate the standard security handler revision to emulate.

RC4_BASIC = 2
RC4_EXTENDED = 3
RC4_OR_AES128 = 4
AES256 = 6
AES_GCM = 7
OTHER = None

Placeholder value for custom security handlers.

as_pdf_object() PdfObject
classmethod from_number(value) StandardSecuritySettingsRevision
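
A minimal sketch of how a from_number() lookup with an OTHER placeholder could behave. RevisionSketch is a hypothetical stand-in built on the standard enum module, and the fallback to OTHER for unrecognised numbers is an assumption made for illustration, not a statement about pyHanko's implementation.

```python
import enum


class RevisionSketch(enum.Enum):
    RC4_BASIC = 2
    RC4_EXTENDED = 3
    RC4_OR_AES128 = 4
    AES256 = 6
    AES_GCM = 7
    OTHER = None  # placeholder for custom security handlers

    @classmethod
    def from_number(cls, value):
        # Assumed behaviour: numbers without a named member map to OTHER.
        try:
            return cls(value)
        except ValueError:
            return cls.OTHER


known = RevisionSketch.from_number(6)
unknown = RevisionSketch.from_number(42)
```
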
class pyhanko.pdf_utils.crypt.standard.StandardCryptFilter

Bases: CryptFilter, ABC

Crypt filter for use with the standard security handler.

derive_shared_encryption_key() bytes

Compute the (global) file encryption key for this crypt filter.

Returns:

The key, as a bytes object.

Raises:

misc.PdfError – Raised if the data needed to derive the key is not present (e.g. because the caller hasn’t authenticated yet).

as_pdf_object()

Serialise this crypt filter to a PDF crypt filter dictionary.

Note

Implementations are encouraged to use a cooperative inheritance model, where subclasses first call super().as_pdf_object() and add the keys they need before returning the result.

This makes it easy to write crypt filter mixins that can provide functionality to multiple handlers.

Returns:

A PDF crypt filter dictionary.
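
The cooperative inheritance model described in the note can be sketched as follows. All class names are hypothetical, plain dicts stand in for PDF dictionary objects, and the specific keys and values are illustrative; only the pattern of calling super().as_pdf_object() and adding keys comes from the note above.

```python
class BaseFilterSketch:
    def as_pdf_object(self) -> dict:
        # The root of the chain creates the dictionary.
        return {"/Type": "/CryptFilter"}


class HandlerFilterSketch(BaseFilterSketch):
    """Handler-specific part of a crypt filter."""

    def as_pdf_object(self) -> dict:
        result = super().as_pdf_object()
        result["/AuthEvent"] = "/DocOpen"
        return result


class AESMixinSketch(BaseFilterSketch):
    """Algorithm-specific mixin, reusable across handlers."""

    keylen = 16

    def as_pdf_object(self) -> dict:
        result = super().as_pdf_object()
        result["/CFM"] = "/AESV2"
        result["/Length"] = self.keylen
        return result


class HandlerAESFilterSketch(HandlerFilterSketch, AESMixinSketch):
    # No override needed: the method resolution order chains both parents.
    pass


cf_dict = HandlerAESFilterSketch().as_pdf_object()
```

Because every class delegates to super() before adding its own entries, the combined class picks up the keys of both parents without either parent knowing about the other.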

class pyhanko.pdf_utils.crypt.standard.StandardAESCryptFilter(*, keylen: int, **kwargs)

Bases: StandardCryptFilter, AESCryptFilterMixin

AES crypt filter for the standard security handler.

class pyhanko.pdf_utils.crypt.standard.StandardAESGCMCryptFilter(**kwargs)

Bases: StandardCryptFilter, AESGCMCryptFilterMixin

AES-GCM crypt filter for the standard security handler.

class pyhanko.pdf_utils.crypt.standard.StandardRC4CryptFilter(*, keylen=5, **kwargs)

Bases: StandardCryptFilter, RC4CryptFilterMixin

RC4 crypt filter for the standard security handler.

class pyhanko.pdf_utils.crypt.standard.StandardSecurityHandler(version: SecurityHandlerVersion, revision: StandardSecuritySettingsRevision, legacy_keylen, perm_flags: StandardPermissions, odata, udata, oeseed=None, ueseed=None, encrypted_perms=None, encrypt_metadata=True, crypt_filter_config: CryptFilterConfiguration | None = None, compat_entries=True, kdf_salt: bytes | None = None)

Bases: SecurityHandler

Implementation of the standard (password-based) security handler.

You shouldn’t have to instantiate StandardSecurityHandler objects yourself. For encrypting new documents, use build_from_pw() or build_from_pw_legacy().

For decrypting existing documents, pyHanko will take care of instantiating security handlers through SecurityHandler.build().

classmethod get_name() str

Retrieves the name of this security handler.

Returns:

The name of this security handler.

classmethod build_from_pw_legacy(rev: StandardSecuritySettingsRevision, id1, desired_owner_pass, desired_user_pass=None, keylen_bytes=16, use_aes128=True, perms: StandardPermissions = <StandardPermissions.ALLOW_PRINTING|ALLOW_MODIFICATION_GENERIC|ALLOW_CONTENT_EXTRACTION|ALLOW_ANNOTS_FORM_FILLING|ALLOW_FORM_FILLING|ALLOW_ASSISTIVE_TECHNOLOGY|ALLOW_REASSEMBLY|ALLOW_HIGH_QUALITY_PRINTING|TOLERATE_MISSING_PDF_MAC: 7996>, crypt_filter_config=None, encrypt_metadata=True, **kwargs)

Initialise a legacy password-based security handler, to attach to a PdfFileWriter. Any remaining keyword arguments will be passed to the constructor.

Danger

The functionality implemented by this handler is deprecated in the PDF standard. We only provide it for testing purposes, and to interface with legacy systems.

Parameters:
  • rev – Security handler revision to use, see StandardSecuritySettingsRevision.

  • id1 – The first part of the document ID.

  • desired_owner_pass – Desired owner password.

  • desired_user_pass – Desired user password.

  • keylen_bytes – Length of the key (in bytes).

  • use_aes128 – Use AES-128 instead of RC4 (default: True).

  • perms – Permission bits to set.

  • crypt_filter_config – Custom crypt filter configuration. PyHanko will supply a reasonable default if none is specified.

Returns:

A StandardSecurityHandler instance.

classmethod build_from_pw(desired_owner_pass, desired_user_pass=None, perms: StandardPermissions = <StandardPermissions.ALLOW_PRINTING|ALLOW_MODIFICATION_GENERIC|ALLOW_CONTENT_EXTRACTION|ALLOW_ANNOTS_FORM_FILLING|ALLOW_FORM_FILLING|ALLOW_ASSISTIVE_TECHNOLOGY|ALLOW_REASSEMBLY|ALLOW_HIGH_QUALITY_PRINTING|TOLERATE_MISSING_PDF_MAC: 7996>, encrypt_metadata=True, pdf_mac: bool = True, use_gcm: bool = False, **kwargs)

Initialise a password-based security handler backed by AES-256, to attach to a PdfFileWriter. This handler will use the new PDF 2.0 encryption scheme.

Any remaining keyword arguments will be passed to the constructor.

Parameters:
  • desired_owner_pass – Desired owner password.

  • desired_user_pass – Desired user password.

  • perms – Desired usage permissions.

  • encrypt_metadata – Whether to set up the security handler for encrypting metadata as well.

  • pdf_mac – Include an ISO/TS 32004 MAC.

  • use_gcm

    Use AES-GCM (ISO/TS 32003) to encrypt strings and streams.

    Danger

    Due to the way PDF encryption works, the authentication guarantees of AES-GCM only apply to the content of individual strings and streams. The PDF file structure itself is not authenticated. Document-level integrity protection is provided by the pdf_mac=True option.

    Warning

    This option is disabled by default because support for ISO/TS 32003 is not available in mainstream PDF software yet. This default may change in the future.

Returns:

A StandardSecurityHandler instance.

classmethod gather_encryption_metadata(encrypt_dict: DictionaryObject) dict

Gather and preprocess the “easy” metadata values in an encryption dictionary, and turn them into constructor kwargs.

This function processes /Length, /P, /Perms, /O, /U, /OE, /UE and /EncryptMetadata.

classmethod instantiate_from_pdf_object(encrypt_dict: DictionaryObject)

Instantiate an object of this class using a PDF encryption dictionary as input.

Parameters:

encrypt_dict – A PDF encryption dictionary.

Returns:

An instance of this class.

property pdf_mac_enabled: bool

Boolean indicating whether this security handler has PDF MAC support enabled.

as_pdf_object()

Serialise this security handler to a PDF encryption dictionary.

Returns:

A PDF encryption dictionary.

authenticate(credential, id1: bytes | None = None) AuthResult

Authenticate a user to this security handler.

Parameters:
  • credential – The credential to use (a password in this case).

  • id1 – First part of the document ID. This is mandatory for legacy encryption handlers, but meaningless otherwise.

Returns:

An AuthResult object indicating the level of access obtained.
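
A caller would typically branch on the status carried by the result. The sketch below uses hypothetical stand-ins (AuthStatusSketch, AuthResultSketch, describe_access) rather than the real classes; the member values mirror those documented for AuthStatus, and it assumes that ordering comparisons follow those numeric values.

```python
import enum
from dataclasses import dataclass
from typing import Optional


class AuthStatusSketch(enum.IntEnum):
    FAILED = 0
    USER = 1
    OWNER = 2


@dataclass(frozen=True)
class AuthResultSketch:
    status: AuthStatusSketch
    mac_failure_reason: Optional[str] = None


def describe_access(result: AuthResultSketch) -> str:
    """Map an authentication result to a coarse access level."""
    if result.status == AuthStatusSketch.FAILED:
        return "no access"
    if result.status >= AuthStatusSketch.OWNER:
        return "owner access"
    return "user access"


owner_level = describe_access(AuthResultSketch(AuthStatusSketch.OWNER))
user_level = describe_access(AuthResultSketch(AuthStatusSketch.USER))
```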

get_file_encryption_key() bytes

Retrieve the (global) file encryption key for this security handler.

Returns:

The file encryption key as a bytes object.

Raises:

misc.PdfReadError – Raised if this security handler was instantiated from an encryption dictionary and no credential is available.

Module contents

Changed in version 0.13.0: Refactor crypt module into package.

Changed in version 0.3.0: Added support for PDF 2.0 encryption standards and crypt filters.

Utilities for PDF encryption. This module covers all methods outlined in the standard:

  • Legacy RC4-based encryption (based on PyPDF2 code).

  • AES-128 encryption with legacy key derivation (partly based on PyPDF2 code).

  • PDF 2.0 AES-256 encryption.

  • Public key encryption backed by any of the above.

Following the language in the standard, encryption operations are backed by subclasses of the SecurityHandler class, which provides a more or less generic API.

Danger

The members of this package are all considered internal API, and are therefore subject to change without notice.

Danger

One should also be aware that the legacy encryption scheme implemented here is (very) weak, and we only support it for compatibility reasons. Under no circumstances should it still be used to encrypt new files.

About crypt filters

Crypt filters are objects that handle encryption and decryption of streams and strings, either for all of them, or for a specific subset (e.g. streams representing embedded files). In the context of the PDF standard, crypt filters are a notion that only makes sense for security handlers of version 4 and up. In pyHanko, however, all encryption and decryption operations pass through crypt filters, and the serialisation/deserialisation logic in SecurityHandler and its subclasses transparently deals with staying backwards compatible with earlier revisions.

Internally, pyHanko loosely distinguishes between implicit and explicit uses of crypt filters:

  • Explicit crypt filters are used by directly referring to them from the /Filter entry of a stream dictionary. These are invoked in the usual stream decoding process.

  • Implicit crypt filters are set by the /StmF and /StrF entries in the security handler’s crypt filter configuration, and are invoked by the object reading/writing procedures as necessary. These filters are invisible to the stream encoding/decoding process: the encoded_data attribute of an “implicitly encrypted” stream will therefore contain decrypted data ready to be decoded in the usual way.

As long as you don’t require access to encoded object data and/or raw encrypted object data, this distinction should be irrelevant to you as an API user.
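
The difference between the two modes can be illustrated with a toy model. Nothing below uses pyHanko: a single-byte XOR stands in for a real crypt filter, zlib stands in for the stream's ordinary filters, and all names are made up for the example.

```python
import zlib

KEY = 0x5A  # toy single-byte key; real crypt filters use RC4 or AES


def toy_crypt(data: bytes) -> bytes:
    """XOR 'cipher' standing in for a crypt filter (it is its own inverse)."""
    return bytes(b ^ KEY for b in data)


plaintext = b"BT /F1 12 Tf (Hello) Tj ET"

# Implicit (/StmF): the reader decrypts while loading the object, so the
# encoded data it exposes is already decrypted and only needs decoding.
raw_on_disk = toy_crypt(zlib.compress(plaintext))
encoded_data = toy_crypt(raw_on_disk)
implicit_result = zlib.decompress(encoded_data)

# Explicit (crypt filter listed in /Filter): decryption is one step of the
# regular decoding chain, so the encoded data is still encrypted.
explicit_encoded = toy_crypt(zlib.compress(plaintext))
filter_chain = [toy_crypt, zlib.decompress]
explicit_result = explicit_encoded
for decode in filter_chain:
    explicit_result = decode(explicit_result)
```

Both paths recover the same plaintext; they differ only in whether the intermediate encoded data is still encrypted, which matters solely to code that inspects it directly.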