pyhanko.pdf_utils.crypt package
Submodules
pyhanko.pdf_utils.crypt.api module
- exception pyhanko.pdf_utils.crypt.api.PdfKeyNotAvailableError(msg: str, *args)
Bases:
PdfReadError
- class pyhanko.pdf_utils.crypt.api.AuthStatus(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
OrderedEnum
Describes the status after an authentication attempt.
- FAILED = 0
- USER = 1
- OWNER = 2
- class pyhanko.pdf_utils.crypt.api.AuthResult(status: AuthStatus, permission_flags: int | None = None)
Bases:
object
Describes the result of an authentication attempt.
- status: AuthStatus
Authentication status after the authentication attempt.
- permission_flags: int | None = None
Granular permission flags. The precise meaning depends on the security handler.
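Because AuthStatus is an ordered enum, access levels can be compared directly. The following is a minimal sketch using stand-in classes (not the pyHanko implementations) that mirror the documented member names and values; it assumes that OrderedEnum members compare by value. The helper can_read is hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class AuthStatus(IntEnum):
    """Stand-in for AuthStatus, with the same member names and values."""
    FAILED = 0
    USER = 1
    OWNER = 2


@dataclass(frozen=True)
class AuthResult:
    """Stand-in for AuthResult."""
    status: AuthStatus
    permission_flags: Optional[int] = None


def can_read(result: AuthResult) -> bool:
    # Hypothetical helper: any status at or above USER grants access.
    return result.status >= AuthStatus.USER


failed = AuthResult(AuthStatus.FAILED)
owner = AuthResult(AuthStatus.OWNER, permission_flags=-4)
print(can_read(failed), can_read(owner))
```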
- class pyhanko.pdf_utils.crypt.api.SecurityHandlerVersion(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
VersionEnum
Indicates the security handler’s version.
The enum constants are named more or less in accordance with the cryptographic algorithms they permit.
- RC4_40 = 1
- RC4_LONGER_KEYS = 2
- RC4_OR_AES128 = 4
- AES256 = 5
- OTHER = None
Placeholder value for custom security handlers.
- classmethod from_number(value) SecurityHandlerVersion
- check_key_length(key_length: int) int
- class pyhanko.pdf_utils.crypt.api.SecurityHandler(version: SecurityHandlerVersion, legacy_keylen, crypt_filter_config: CryptFilterConfiguration, encrypt_metadata=True, compat_entries=True)
Bases:
object
Generic PDF security handler interface.
This class contains relatively little actual functionality, except for some common initialisation logic and bookkeeping machinery to register security handler implementations.
- Parameters:
version – Indicates the version of the security handler to use, as described in the specification. See SecurityHandlerVersion.
legacy_keylen – Key length in bytes (only relevant for legacy encryption handlers).
crypt_filter_config –
The crypt filter configuration for the security handler, in the form of a CryptFilterConfiguration object.
Note
PyHanko implements legacy security handlers (which, according to the standard, aren’t crypt filter-aware) using crypt filters as well, even though they aren’t serialised to the output file.
encrypt_metadata –
Flag indicating whether document (XMP) metadata is to be encrypted.
Warning
Currently, PyHanko does not manage metadata streams, so until that changes, it is the responsibility of the API user to mark metadata streams using the /Identity crypt filter as required.
Nonetheless, the value of this flag is required in key derivation computations, so the security handler needs to know about it.
compat_entries – Write deprecated and technically unnecessary configuration settings, for compatibility with certain implementations.
- static register(cls: Type[SecurityHandler])
Register a security handler class. Intended to be used as a decorator on subclasses.
See build() for further information.
- Parameters:
cls – A subclass of SecurityHandler.
- static build(encrypt_dict: DictionaryObject) SecurityHandler
Instantiate an appropriate SecurityHandler from a PDF document’s encryption dictionary.
PyHanko will search the registry for a security handler with a name matching the /Filter entry. Failing that, a security handler implementing the protocol designated by the /SubFilter entry (see support_generic_subfilters()) will be chosen.
Once an appropriate SecurityHandler subclass has been selected, pyHanko will invoke the subclass’s instantiate_from_pdf_object() method with the original encryption dictionary as its argument.
- Parameters:
encrypt_dict – A PDF encryption dictionary.
- Returns:
- classmethod get_name() str
Retrieves the name of this security handler.
- Returns:
The name of this security handler.
- extract_credential() SerialisableCredential | None
Extract a serialisable credential for later use, if the security handler supports it. It should allow the security handler to be unlocked with the same access level as the current one.
- Returns:
A serialisable credential, or None.
- classmethod support_generic_subfilters() Set[str]
Indicates the generic /SubFilter values that this security handler supports.
- Returns:
A set of generic protocols (indicated in the /SubFilter entry of an encryption dictionary) that this SecurityHandler class implements. Defaults to the empty set.
- classmethod instantiate_from_pdf_object(encrypt_dict: DictionaryObject)
Instantiate an object of this class using a PDF encryption dictionary as input.
- Parameters:
encrypt_dict – A PDF encryption dictionary.
- Returns:
- is_authenticated() bool
Return True if the security handler has been successfully authenticated against for document encryption purposes.
The default implementation just attempts to call get_file_encryption_key() and returns True if that doesn’t raise an error.
- as_pdf_object() DictionaryObject
Serialise this security handler to a PDF encryption dictionary.
- Returns:
A PDF encryption dictionary.
- authenticate(credential, id1=None) AuthResult
Authenticate a credential holder with this security handler.
- Parameters:
credential – A credential. The type of the credential is left up to the subclasses.
id1 – The first part of the document ID of the document being accessed.
- Returns:
An AuthResult object indicating the level of access obtained.
- get_string_filter() CryptFilter
- Returns:
The crypt filter responsible for decrypting strings for this security handler.
- get_stream_filter(name=None) CryptFilter
- Parameters:
name – Optionally specify a crypt filter by name.
- Returns:
The default crypt filter responsible for decrypting streams for this security handler, or the crypt filter named name, if name is not None.
- get_embedded_file_filter()
- Returns:
The crypt filter responsible for decrypting embedded files for this security handler.
- get_file_encryption_key() bytes
Retrieve the global file encryption key (used for streams and/or strings). If there is no such thing, or the key is not available, an error should be raised.
- Raises:
PdfKeyNotAvailableError – when the key is not available
- classmethod read_cf_dictionary(cfdict: DictionaryObject, acts_as_default: bool) CryptFilter | None
Interpret a crypt filter dictionary for this type of security handler.
- Parameters:
cfdict – A crypt filter dictionary.
acts_as_default – Indicates whether this filter is intended to be used in /StrF or /StmF.
- Returns:
An appropriate CryptFilter object, or None if the crypt filter uses the /None method.
- Raises:
NotImplementedError – Raised when the crypt filter’s /CFM entry indicates an unknown crypt filter method.
- classmethod process_crypt_filters(encrypt_dict: DictionaryObject) CryptFilterConfiguration | None
- classmethod register_crypt_filter(method: NameObject, factory: Callable[[DictionaryObject, bool], CryptFilter])
- get_min_pdf_version() Tuple[int, int] | None
- class pyhanko.pdf_utils.crypt.api.CryptFilter
Bases:
object
Generic abstract crypt filter class.
The superclass only handles the binding with the security handler, and offers some default implementations for serialisation routines that may be overridden in subclasses.
There is generally no requirement for crypt filters to be compatible with any security handler (the leaf classes in this module aren’t), but the API supports mixin usage so code can be shared.
- property method: NameObject
- Returns:
The method name (/CFM entry) associated with this crypt filter.
- property keylen: int
- Returns:
The keylength (in bytes) of the key associated with this crypt filter.
- encrypt(key, plaintext: bytes, params=None) bytes
Encrypt plaintext with the specified key.
- Parameters:
key – The current local key, which may or may not be equal to this crypt filter’s global key.
plaintext – Plaintext to encrypt.
params – Optional parameters private to the crypt filter, specified as a PDF dictionary. These can only be used for explicit crypt filters; the parameters are then sourced from the corresponding entry in /DecodeParms.
- Returns:
The resulting ciphertext.
- decrypt(key, ciphertext: bytes, params=None) bytes
Decrypt ciphertext with the specified key.
- Parameters:
key – The current local key, which may or may not be equal to this crypt filter’s global key.
ciphertext – Ciphertext to decrypt.
params – Optional parameters private to the crypt filter, specified as a PDF dictionary. These can only be used for explicit crypt filters; the parameters are then sourced from the corresponding entry in /DecodeParms.
- Returns:
The resulting plaintext.
- as_pdf_object() DictionaryObject
Serialise this crypt filter to a PDF crypt filter dictionary.
Note
Implementations are encouraged to use a cooperative inheritance model, where subclasses first call super().as_pdf_object() and add the keys they need before returning the result.
This makes it easy to write crypt filter mixins that can provide functionality to multiple handlers.
- Returns:
A PDF crypt filter dictionary.
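The cooperative inheritance model recommended in the note can be sketched as follows. All classes here are hypothetical stand-ins rather than pyHanko classes, plain dicts replace PDF dictionary objects, and the dictionary keys are placeholders chosen for illustration.

```python
class BaseFilter:
    """Stand-in for a crypt filter base class."""

    method = "/None"
    keylen = 0

    def as_pdf_object(self) -> dict:
        return {"/CFM": self.method, "/Length": self.keylen}


class RecipientsMixin(BaseFilter):
    """Hypothetical mixin contributing a /Recipients entry."""

    recipients = ()

    def as_pdf_object(self) -> dict:
        # Let the rest of the method resolution order go first...
        result = super().as_pdf_object()
        # ...then add only the keys this class is responsible for.
        result["/Recipients"] = list(self.recipients)
        return result


class MetadataFlagMixin(BaseFilter):
    """Hypothetical mixin contributing an /EncryptMetadata entry."""

    encrypt_metadata = True

    def as_pdf_object(self) -> dict:
        result = super().as_pdf_object()
        result["/EncryptMetadata"] = self.encrypt_metadata
        return result


class ExampleFilter(RecipientsMixin, MetadataFlagMixin):
    method = "/AESV2"
    keylen = 16
    recipients = ("recipient-cms-placeholder",)


serialised = ExampleFilter().as_pdf_object()
print(serialised)
```

Because every class calls super().as_pdf_object() before adding its own keys, the mixins can be combined in any order without overwriting each other's entries.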
- derive_shared_encryption_key() bytes
Compute the (global) file encryption key for this crypt filter.
- Returns:
The key, as a bytes object.
- Raises:
misc.PdfError – Raised if the data needed to derive the key is not present (e.g. because the caller hasn’t authenticated yet).
- derive_object_key(idnum, generation) bytes
Derive the encryption key for a specific object, based on the shared file encryption key.
- Parameters:
idnum – ID of the object being encrypted.
generation – Generation number of the object being encrypted.
- Returns:
The local key to use for this object.
- set_embedded_only()
- property shared_key: bytes
Return the shared file encryption key for this crypt filter, or attempt to compute it using derive_shared_encryption_key() if not available.
- class pyhanko.pdf_utils.crypt.api.IdentityCryptFilter
Bases:
CryptFilter
Class implementing the trivial crypt filter.
This is a singleton class, so all its instances are identical. Additionally, some of the CryptFilter API is nonfunctional. In particular, as_pdf_object() always raises an error, since the /Identity filter cannot be serialised.
- method = '/None'
- keylen = 0
- derive_shared_encryption_key() bytes
Always returns an empty byte string.
- derive_object_key(idnum, generation) bytes
Always returns an empty byte string.
- Parameters:
idnum – Ignored.
generation – Ignored.
- Returns:
- as_pdf_object()
Not implemented for this crypt filter.
- Raises:
misc.PdfError – Always.
- encrypt(key, plaintext: bytes, params=None) bytes
Identity function.
- Parameters:
key – Ignored.
plaintext – Returned as-is.
params – Ignored.
- Returns:
The original plaintext.
- decrypt(key, ciphertext: bytes, params=None) bytes
Identity function.
- Parameters:
key – Ignored.
ciphertext – Returned as-is.
params – Ignored.
- Returns:
The original ciphertext.
- class pyhanko.pdf_utils.crypt.api.CryptFilterConfiguration(crypt_filters: Dict[str, CryptFilter], default_stream_filter='/Identity', default_string_filter='/Identity', default_file_filter=None)
Bases:
object
Crypt filter store attached to a security handler.
Instances of this class are not designed to be reusable.
- Parameters:
crypt_filters – A dictionary mapping names to their corresponding crypt filters.
default_stream_filter – Name of the default crypt filter to use for streams.
default_string_filter – Name of the default crypt filter to use for strings.
default_file_filter –
Name of the default crypt filter to use for embedded files.
Note
PyHanko currently is not aware of embedded files, so managing these is the API user’s responsibility.
- filters()
Enumerate all crypt filters in this configuration.
- set_security_handler(handler: SecurityHandler)
Set the security handler on all crypt filters in this configuration.
- Parameters:
handler – A SecurityHandler instance.
- get_for_stream()
Retrieve the default crypt filter to use with streams.
- Returns:
A CryptFilter instance.
- get_for_string()
Retrieve the default crypt filter to use with strings.
- Returns:
A CryptFilter instance.
- get_for_embedded_file()
Retrieve the default crypt filter to use with embedded files.
- Returns:
A CryptFilter instance.
- property stream_filter_name: NameObject
The name of the default crypt filter to use with streams.
- property string_filter_name: NameObject
The name of the default crypt filter to use with strings.
- property embedded_file_filter_name: NameObject
Retrieve the name of the default crypt filter to use with embedded files.
- as_pdf_object()
Serialise this crypt filter configuration to a dictionary object, including all its subordinate crypt filters (with the exception of the identity filter, if relevant).
- standard_filters()
Return the “standard” filters associated with this crypt filter configuration, i.e. those registered as the defaults for strings, streams and embedded files, respectively.
These sometimes require special treatment (as per the specification).
- Returns:
A set with one, two or three elements.
- pyhanko.pdf_utils.crypt.api.build_crypt_filter(reg: Dict[NameObject, Callable[[DictionaryObject, bool], CryptFilter]], cfdict: DictionaryObject, acts_as_default: bool) CryptFilter | None
Interpret a crypt filter dictionary for a security handler.
- Parameters:
reg – A registry of named crypt filters.
cfdict – A crypt filter dictionary.
acts_as_default – Indicates whether this filter is intended to be used in /StrF or /StmF.
- Returns:
An appropriate CryptFilter object, or None if the crypt filter uses the /None method.
- Raises:
NotImplementedError – Raised when the crypt filter’s /CFM entry indicates an unknown crypt filter method.
- pyhanko.pdf_utils.crypt.api.ALL_PERMS = -4
Dummy value that translates to “everything is allowed” in an encrypted PDF document.
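To see why -4 reads as "everything is allowed", it helps to look at its bit pattern as a 4-byte signed integer: every bit is set except the two lowest-order ones. The snippet below is only an illustration; the big-endian byte order is chosen for display and is not a statement about how pyHanko serialises the value.

```python
import struct

ALL_PERMS = -4  # value documented above

# Reinterpret the signed 32-bit value as raw bytes and as an unsigned integer.
raw = struct.pack(">i", ALL_PERMS)
as_unsigned = struct.unpack(">I", raw)[0]

print(raw.hex())
print(f"{as_unsigned:032b}")
```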
pyhanko.pdf_utils.crypt.cred_ser module
- class pyhanko.pdf_utils.crypt.cred_ser.SerialisedCredential(credential_type: str, data: bytes)
Bases:
object
A credential in serialised form.
- credential_type: str
The registered type name of the credential (see SerialisableCredential.register()).
- data: bytes
The credential data, as a byte string.
- class pyhanko.pdf_utils.crypt.cred_ser.SerialisableCredential
Bases:
ABC
Class representing a credential that can be serialised.
- classmethod get_name() str
Get the type name of the credential, which will be embedded into serialised values and used on deserialisation.
- static register(cls: Type[SerialisableCredential])
Register a subclass into the credential serialisation registry, using the name returned by get_name(). Can be used as a class decorator.
- Parameters:
cls – The subclass.
- Returns:
The subclass.
- static deserialise(ser_value: SerialisedCredential) SerialisableCredential
Deserialise a SerialisedCredential value by looking up the proper subclass of SerialisableCredential and invoking its deserialisation method.
- Parameters:
ser_value – The value to deserialise.
- Returns:
The deserialised credential.
- Raises:
misc.PdfReadError – If a deserialisation error occurs.
- serialise() SerialisedCredential
Serialise a value to an annotated SerialisedCredential value.
- Returns:
A SerialisedCredential value.
- Raises:
misc.PdfWriteError – If a serialisation error occurs.
pyhanko.pdf_utils.crypt.filter_mixins module
- class pyhanko.pdf_utils.crypt.filter_mixins.RC4CryptFilterMixin(*, keylen=5, **kwargs)
Bases:
CryptFilter, ABC
Mixin for RC4-based crypt filters.
- Parameters:
keylen – Key length, in bytes. Defaults to 5.
- method = '/V2'
- property keylen: int
- Returns:
The keylength (in bytes) of the key associated with this crypt filter.
- encrypt(key, plaintext: bytes, params=None) bytes
Encrypt data using RC4.
- Parameters:
key – Local encryption key.
plaintext – Plaintext to encrypt.
params – Ignored.
- Returns:
Ciphertext.
- decrypt(key, ciphertext: bytes, params=None) bytes
Decrypt data using RC4.
- Parameters:
key – Local encryption key.
ciphertext – Ciphertext to decrypt.
params – Ignored.
- Returns:
Plaintext.
- derive_object_key(idnum, generation) bytes
Derive the local key for the given object ID and generation number, by calling legacy_derive_object_key().
- Parameters:
idnum – ID of the object being encrypted.
generation – Generation number of the object being encrypted.
- Returns:
The local key.
- class pyhanko.pdf_utils.crypt.filter_mixins.AESCryptFilterMixin(*, keylen: int, **kwargs)
Bases:
CryptFilter, ABC
Mixin for AES-based crypt filters.
- property method: NameObject
- Returns:
The method name (/CFM entry) associated with this crypt filter.
- property keylen: int
- Returns:
The keylength (in bytes) of the key associated with this crypt filter.
- encrypt(key, plaintext: bytes, params=None)
Encrypt data using AES in CBC mode, with PKCS#7 padding.
- Parameters:
key – The key to use.
plaintext – The plaintext to be encrypted.
params – Ignored.
- Returns:
The resulting ciphertext, prepended with a 16-byte initialisation vector.
- decrypt(key, ciphertext: bytes, params=None) bytes
Decrypt data using AES in CBC mode, with PKCS#7 padding.
- Parameters:
key – The key to use.
ciphertext – The ciphertext to be decrypted, prepended with a 16-byte initialisation vector.
params – Ignored.
- Returns:
The resulting plaintext.
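The message framing described for encrypt() and decrypt() (a 16-byte initialisation vector followed by PKCS#7-padded data) can be sketched with the standard library alone. The AES-CBC cipher step itself is deliberately omitted, so this only illustrates the layout; the helper names are hypothetical and not part of pyHanko.

```python
import os

BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes) -> bytes:
    # Always adds between 1 and BLOCK_SIZE bytes, each equal to the pad length.
    pad_len = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes) -> bytes:
    if not padded or len(padded) % BLOCK_SIZE != 0:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    pad_len = padded[-1]
    if not 1 <= pad_len <= BLOCK_SIZE:
        raise ValueError("invalid PKCS#7 padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid PKCS#7 padding bytes")
    return padded[:-pad_len]


def split_iv(message: bytes):
    # The first block is the IV; the remainder is the CBC ciphertext.
    return message[:BLOCK_SIZE], message[BLOCK_SIZE:]


iv = os.urandom(BLOCK_SIZE)
padded = pkcs7_pad(b"hello")
# A real implementation would AES-CBC-encrypt `padded` under `iv` here.
message = iv + padded
print(len(message))
```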
- derive_object_key(idnum, generation) bytes
Derive the local key for the given object ID and generation number.
If the associated handler is of version SecurityHandlerVersion.AES256 or greater, this method simply returns the global key as-is. If not, the computation is carried out by legacy_derive_object_key().
- Parameters:
idnum – ID of the object being encrypted.
generation – Generation number of the object being encrypted.
- Returns:
The local key.
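For handlers older than AES256, the per-object key is derived from the shared file encryption key. The sketch below follows the legacy per-object key algorithm from the PDF specification as commonly described (MD5 over the key, the low-order bytes of the object and generation numbers, plus a fixed salt for AES); it is not taken from pyHanko's source, and the function name legacy_object_key is hypothetical.

```python
import hashlib


def legacy_object_key(shared_key: bytes, idnum: int, generation: int,
                      use_aes: bool) -> bytes:
    material = (
        shared_key
        + idnum.to_bytes(4, "little")[:3]       # low-order 3 bytes of the object number
        + generation.to_bytes(4, "little")[:2]  # low-order 2 bytes of the generation number
    )
    if use_aes:
        # AES-based filters append a fixed 4-byte salt.
        material += b"sAlT"
    digest = hashlib.md5(material).digest()
    # Use the first (n + 5) bytes of the digest, capped at 16 bytes.
    return digest[: min(len(shared_key) + 5, 16)]


rc4_key = legacy_object_key(b"\x00" * 5, idnum=12, generation=0, use_aes=False)
aes_key = legacy_object_key(b"\x00" * 16, idnum=12, generation=0, use_aes=True)
print(len(rc4_key), len(aes_key))
```

With AES256 and later handlers none of this applies: as stated above, the global key is used directly for every object.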
pyhanko.pdf_utils.crypt.pubkey module
- class pyhanko.pdf_utils.crypt.pubkey.RecipientEncryptionPolicy(ignore_key_usage: bool = False, prefer_oaep: bool = False)
Bases:
object
- ignore_key_usage: bool = False
Ignore key usage bits in the recipient’s certificate.
- prefer_oaep: bool = False
For RSA recipients, encrypt with RSAES-OAEP.
Warning
This is not widely supported.
- class pyhanko.pdf_utils.crypt.pubkey.PubKeyCryptFilter(*, recipients=None, acts_as_default=False, encrypt_metadata=True, **kwargs)
Bases:
CryptFilter, ABC
Crypt filter for use with public key security handlers. These are a little more independent than their counterparts for the standard security handlers, since different crypt filters can cater to different sets of recipients.
- Parameters:
recipients – List of CMS objects encoding recipient information for this crypt filter.
acts_as_default – Indicates whether this filter is intended to be used in /StrF or /StmF.
encrypt_metadata –
Whether this crypt filter should encrypt document-level metadata.
Warning
See SecurityHandler for some background on the way pyHanko interprets this value.
- add_recipients(certs: List[Certificate], policy: RecipientEncryptionPolicy, perms=-4)
Add recipients to this crypt filter. This always adds one full CMS object to the Recipients array.
- Parameters:
certs – A list of recipient certificates.
policy – Encryption policy choices for the chosen set of recipients.
perms – The permission bits to assign to the listed recipients.
- authenticate(credential) AuthResult
Authenticate to this crypt filter in particular. If used in /StmF or /StrF, you don’t need to worry about calling this method directly.
- Parameters:
credential – The EnvelopeKeyDecrypter to authenticate with.
- Returns:
An AuthResult object indicating the level of access obtained.
- derive_shared_encryption_key() bytes
Compute the (global) file encryption key for this crypt filter.
- Returns:
The key, as a bytes object.
- Raises:
misc.PdfError – Raised if the data needed to derive the key is not present (e.g. because the caller hasn’t authenticated yet).
- as_pdf_object()
Serialise this crypt filter to a PDF crypt filter dictionary.
Note
Implementations are encouraged to use a cooperative inheritance model, where subclasses first call super().as_pdf_object() and add the keys they need before returning the result.
This makes it easy to write crypt filter mixins that can provide functionality to multiple handlers.
- Returns:
A PDF crypt filter dictionary.
- class pyhanko.pdf_utils.crypt.pubkey.PubKeyAESCryptFilter(*, recipients=None, acts_as_default=False, encrypt_metadata=True, **kwargs)
Bases:
PubKeyCryptFilter, AESCryptFilterMixin
AES crypt filter for public key security handlers.
- class pyhanko.pdf_utils.crypt.pubkey.PubKeyRC4CryptFilter(*, recipients=None, acts_as_default=False, encrypt_metadata=True, **kwargs)
Bases:
PubKeyCryptFilter, RC4CryptFilterMixin
RC4 crypt filter for public key security handlers.
- pyhanko.pdf_utils.crypt.pubkey.DEFAULT_CRYPT_FILTER = '/DefaultCryptFilter'
Default name to use for the default crypt filter in public key security handlers.
- pyhanko.pdf_utils.crypt.pubkey.DEF_EMBEDDED_FILE = '/DefEmbeddedFile'
Default name to use for the EFF crypt filter in public key security handlers for documents where only embedded files are encrypted.
- class pyhanko.pdf_utils.crypt.pubkey.PubKeyAdbeSubFilter(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
Enum
Enum describing the different subfilters that can be used for public key encryption in the PDF specification.
- S3 = '/adbe.pkcs7.s3'
- S4 = '/adbe.pkcs7.s4'
- S5 = '/adbe.pkcs7.s5'
- pyhanko.pdf_utils.crypt.pubkey.construct_envelope_content(seed: bytes, perms: int, include_permissions=True)
- pyhanko.pdf_utils.crypt.pubkey.construct_recipient_cms(certificates: List[Certificate], seed: bytes, perms: int, policy: RecipientEncryptionPolicy, include_permissions=True) ContentInfo
- exception pyhanko.pdf_utils.crypt.pubkey.InappropriateCredentialError
Bases:
TypeError
- class pyhanko.pdf_utils.crypt.pubkey.EnvelopeKeyDecrypter
Bases:
object
General credential class for use with public key security handlers.
This allows the key decryption process to happen offline, e.g. on a smart card.
- property cert: Certificate
- Returns:
The recipient’s certificate.
- decrypt(encrypted_key: bytes, algo_params: KeyEncryptionAlgorithm) bytes
Invoke the actual key decryption algorithm. Used with key transport.
- Parameters:
encrypted_key – Payload to decrypt.
algo_params – Specification of the encryption algorithm as a CMS object.
- Raises:
InappropriateCredentialError – if the credential cannot be used for key transport.
- Returns:
The decrypted payload.
- decrypt_with_exchange(encrypted_key: bytes, algo_params: KeyEncryptionAlgorithm, originator_identifier: OriginatorIdentifierOrKey, user_keying_material: bytes) bytes
Decrypt an envelope key using a key derived from a key exchange.
- Parameters:
encrypted_key – Payload to decrypt.
algo_params – Specification of the encryption algorithm as a CMS object.
originator_identifier – Information about the originator necessary to complete the key exchange.
user_keying_material – The user keying material that will be used in the key derivation.
- Returns:
The decrypted payload.
Bases:
Sequence
- class pyhanko.pdf_utils.crypt.pubkey.SimpleEnvelopeKeyDecrypter(cert: Certificate, private_key: PrivateKeyInfo)
Bases:
EnvelopeKeyDecrypter, SerialisableCredential
Implementation of EnvelopeKeyDecrypter where the private key is an RSA or ECC key residing in memory.
- Parameters:
cert – The recipient’s certificate.
private_key – The recipient’s private key.
- dhsinglepass_stddh_arc_pattern = re.compile('1\\.3\\.132\\.1\\.11\\.(\\d+)')
- classmethod get_name() str
Get the type name of the credential, which will be embedded into serialised values and used on deserialisation.
- property cert: Certificate
- Returns:
The recipient’s certificate.
- static load(key_file, cert_file, key_passphrase=None)
Load a key decrypter using key material from files on disk.
- Parameters:
key_file – File containing the recipient’s private key.
cert_file – File containing the recipient’s certificate.
key_passphrase – Passphrase for the key file, if applicable.
- Returns:
An instance of SimpleEnvelopeKeyDecrypter.
- classmethod load_pkcs12(pfx_file, passphrase=None)
Load a key decrypter using key material from a PKCS#12 file on disk.
- Parameters:
pfx_file – Path to the PKCS#12 file containing the key material.
passphrase – Passphrase for the private key, if applicable.
- Returns:
An instance of SimpleEnvelopeKeyDecrypter.
- decrypt(encrypted_key: bytes, algo_params: KeyEncryptionAlgorithm) bytes
Decrypt the payload using RSA with PKCS#1 v1.5 padding or OAEP. Other schemes are not (currently) supported by this implementation.
- Parameters:
encrypted_key – Payload to decrypt.
algo_params – Specification of the encryption algorithm as a CMS object. Must use rsaes_pkcs1v15 or rsaes_oaep.
- Returns:
The decrypted payload.
- decrypt_with_exchange(encrypted_key: bytes, algo_params: KeyEncryptionAlgorithm, originator_identifier: OriginatorIdentifierOrKey, user_keying_material: bytes | None) bytes
Decrypt the payload using a key agreed via ephemeral-static standard (non-cofactor) ECDH with X9.63 key derivation. Other schemes are not supported at this time.
- Parameters:
encrypted_key – Payload to decrypt.
algo_params – Specification of the encryption algorithm as a CMS object.
originator_identifier – The originator info, which must be an EC key.
user_keying_material – The user keying material that will be used in the key derivation.
- Returns:
The decrypted payload.
- pyhanko.pdf_utils.crypt.pubkey.read_envelope_key(ed: EnvelopedData, decrypter: EnvelopeKeyDecrypter) bytes | None
- pyhanko.pdf_utils.crypt.pubkey.read_seed_from_recipient_cms(recipient_cms: ContentInfo, decrypter: EnvelopeKeyDecrypter) Tuple[bytes | None, int | None]
- class pyhanko.pdf_utils.crypt.pubkey.PubKeySecurityHandler(version: SecurityHandlerVersion, pubkey_handler_subfilter: PubKeyAdbeSubFilter, legacy_keylen, encrypt_metadata=True, crypt_filter_config: CryptFilterConfiguration | None = None, recipient_objs: list | None = None, compat_entries=True)
Bases:
SecurityHandler
Security handler for public key encryption in PDF.
As with the standard security handler, you essentially shouldn’t ever have to instantiate these yourself (see build_from_certs()).
- classmethod build_from_certs(certs: List[Certificate], keylen_bytes=16, version=SecurityHandlerVersion.AES256, use_aes=True, use_crypt_filters=True, perms: int = -4, encrypt_metadata=True, policy: RecipientEncryptionPolicy = RecipientEncryptionPolicy(ignore_key_usage=False, prefer_oaep=False), **kwargs) PubKeySecurityHandler
Create a new public key security handler.
This method takes many parameters, but only certs is mandatory. The default behaviour is to create a public key encryption handler where the underlying symmetric encryption is provided by AES-256. Any remaining keyword arguments will be passed to the constructor.
- Parameters:
certs – The recipients’ certificates.
keylen_bytes – The key length (in bytes). This is only relevant for legacy security handlers.
version – The security handler version to use.
use_aes – Use AES-128 instead of RC4 (only meaningful if the version parameter is RC4_OR_AES128).
use_crypt_filters – Whether to use crypt filters. This is mandatory for security handlers of version RC4_OR_AES128 or higher.
perms – Permission flags (as a 4-byte signed integer).
encrypt_metadata –
Whether to encrypt document metadata.
Warning
See SecurityHandler for some background on the way pyHanko interprets this value.
policy – Encryption policy choices for the chosen set of recipients.
- Returns:
An instance of PubKeySecurityHandler.
- classmethod get_name() str
Retrieves the name of this security handler.
- Returns:
The name of this security handler.
- classmethod support_generic_subfilters() Set[str]
Indicates the generic /SubFilter values that this security handler supports.
- Returns:
A set of generic protocols (indicated in the /SubFilter entry of an encryption dictionary) that this SecurityHandler class implements. Defaults to the empty set.
- classmethod read_cf_dictionary(cfdict: DictionaryObject, acts_as_default: bool) CryptFilter
Interpret a crypt filter dictionary for this type of security handler.
- Parameters:
cfdict – A crypt filter dictionary.
acts_as_default – Indicates whether this filter is intended to be used in /StrF or /StmF.
- Returns:
An appropriate CryptFilter object, or None if the crypt filter uses the /None method.
- Raises:
NotImplementedError – Raised when the crypt filter’s /CFM entry indicates an unknown crypt filter method.
- classmethod process_crypt_filters(encrypt_dict: DictionaryObject) CryptFilterConfiguration | None
- classmethod gather_pub_key_metadata(encrypt_dict: DictionaryObject)
- classmethod instantiate_from_pdf_object(encrypt_dict: DictionaryObject)
Instantiate an object of this class using a PDF encryption dictionary as input.
- Parameters:
encrypt_dict – A PDF encryption dictionary.
- Returns:
- as_pdf_object()
Serialise this security handler to a PDF encryption dictionary.
- Returns:
A PDF encryption dictionary.
- add_recipients(certs: List[Certificate], perms=-4, policy: RecipientEncryptionPolicy = RecipientEncryptionPolicy(ignore_key_usage=False, prefer_oaep=False))
- authenticate(credential: EnvelopeKeyDecrypter | SerialisedCredential, id1=None) AuthResult
Authenticate a user to this security handler.
- Parameters:
credential – The credential to use (an instance of EnvelopeKeyDecrypter in this case).
id1 – First part of the document ID. Public key encryption handlers ignore this value.
- Returns:
An AuthResult object indicating the level of access obtained.
- get_file_encryption_key() bytes
Retrieve the global file encryption key (used for streams and/or strings). If there is no such thing, or the key is not available, an error should be raised.
- Raises:
PdfKeyNotAvailableError – when the key is not available
pyhanko.pdf_utils.crypt.standard module
- class pyhanko.pdf_utils.crypt.standard.StandardSecuritySettingsRevision(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
Bases:
VersionEnum
Indicate the standard security handler revision to emulate.
- RC4_BASIC = 2
- RC4_EXTENDED = 3
- RC4_OR_AES128 = 4
- AES256 = 6
- OTHER = None
Placeholder value for custom security handlers.
- classmethod from_number(value) StandardSecuritySettingsRevision
- class pyhanko.pdf_utils.crypt.standard.StandardCryptFilter
Bases:
CryptFilter, ABC
Crypt filter for use with the standard security handler.
- derive_shared_encryption_key() bytes
Compute the (global) file encryption key for this crypt filter.
- Returns:
The key, as a bytes object.
- Raises:
misc.PdfError – Raised if the data needed to derive the key is not present (e.g. because the caller hasn’t authenticated yet).
- as_pdf_object()
Serialise this crypt filter to a PDF crypt filter dictionary.
Note
Implementations are encouraged to use a cooperative inheritance model, where subclasses first call super().as_pdf_object() and add the keys they need before returning the result.
This makes it easy to write crypt filter mixins that can provide functionality to multiple handlers.
- Returns:
A PDF crypt filter dictionary.
- class pyhanko.pdf_utils.crypt.standard.StandardAESCryptFilter(*, keylen: int, **kwargs)
Bases:
StandardCryptFilter, AESCryptFilterMixin
AES crypt filter for the standard security handler.
- class pyhanko.pdf_utils.crypt.standard.StandardRC4CryptFilter(*, keylen=5, **kwargs)
Bases:
StandardCryptFilter, RC4CryptFilterMixin
RC4 crypt filter for the standard security handler.
- class pyhanko.pdf_utils.crypt.standard.StandardSecurityHandler(version: SecurityHandlerVersion, revision: StandardSecuritySettingsRevision, legacy_keylen, perm_flags: int, odata, udata, oeseed=None, ueseed=None, encrypted_perms=None, encrypt_metadata=True, crypt_filter_config: CryptFilterConfiguration | None = None, compat_entries=True)
Bases:
SecurityHandler
Implementation of the standard (password-based) security handler.
You shouldn’t have to instantiate StandardSecurityHandler objects yourself. For encrypting new documents, use build_from_pw() or build_from_pw_legacy().
For decrypting existing documents, pyHanko will take care of instantiating security handlers through SecurityHandler.build().
- classmethod get_name() str
Retrieves the name of this security handler.
- Returns:
The name of this security handler.
- classmethod build_from_pw_legacy(rev: StandardSecuritySettingsRevision, id1, desired_owner_pass, desired_user_pass=None, keylen_bytes=16, use_aes128=True, perms: int = -4, crypt_filter_config=None, encrypt_metadata=True, **kwargs)
Initialise a legacy password-based security handler, to attach to a PdfFileWriter. Any remaining keyword arguments will be passed to the constructor.
Danger
The functionality implemented by this handler is deprecated in the PDF standard. We only provide it for testing purposes, and to interface with legacy systems.
- Parameters:
rev – Security handler revision to use, see StandardSecuritySettingsRevision.
id1 – The first part of the document ID.
desired_owner_pass – Desired owner password.
desired_user_pass – Desired user password.
keylen_bytes – Length of the key (in bytes).
use_aes128 – Use AES-128 instead of RC4 (default: True).
perms – Permission bits to set (defined as an integer).
crypt_filter_config – Custom crypt filter configuration. PyHanko will supply a reasonable default if none is specified.
- Returns:
A
StandardSecurityHandler
instance.
- classmethod build_from_pw(desired_owner_pass, desired_user_pass=None, perms=-4, encrypt_metadata=True, **kwargs)
Initialise a password-based security handler backed by AES-256, to attach to a PdfFileWriter. This handler will use the new PDF 2.0 encryption scheme.
Any remaining keyword arguments will be passed to the constructor.
- Parameters:
desired_owner_pass – Desired owner password.
desired_user_pass – Desired user password.
perms – Desired usage permissions.
encrypt_metadata – Whether to set up the security handler for encrypting metadata as well.
- Returns:
A
StandardSecurityHandler
instance.
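The default `perms=-4` is easier to read as a bit field. The sketch below assumes the usual PDF convention that permission flags are stored as a 32-bit two's complement integer with bits numbered from 1, starting at the low-order bit. The helper names `perms_to_unsigned` and `set_bit_positions` are illustrative and are not part of pyHanko.

```python
import struct


def perms_to_unsigned(perms: int) -> int:
    """Reinterpret a signed 32-bit permission value as an unsigned bit field."""
    return struct.unpack(">I", struct.pack(">i", perms))[0]


def set_bit_positions(perms: int) -> list:
    """Return the 1-based positions of all bits set in the permission value."""
    unsigned = perms_to_unsigned(perms)
    return [pos for pos in range(1, 33) if unsigned & (1 << (pos - 1))]


default_perms = -4
print(hex(perms_to_unsigned(default_perms)))
print(set_bit_positions(default_perms))
```

Under that assumption, `-4` corresponds to `0xfffffffc`: every bit is set except the two low-order ones.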
- classmethod gather_encryption_metadata(encrypt_dict: DictionaryObject) dict
Gather and preprocess the “easy” metadata values in an encryption dictionary, and turn them into constructor kwargs.
This function processes /Length, /P, /Perms, /O, /U, /OE, /UE and /EncryptMetadata.
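As a rough illustration of what turning these entries into constructor kwargs could look like, the sketch below maps a plain `dict` onto the parameter names from the `StandardSecurityHandler` signature. It is not pyHanko's implementation: `gather_kwargs_sketch` is a hypothetical helper, the pairing of entries with parameter names is a guess, and the defaults and the bits-to-bytes conversion of `/Length` are assumptions based on the PDF standard.

```python
def gather_kwargs_sketch(encrypt_dict: dict) -> dict:
    """Map encryption dictionary entries to constructor-style kwargs (sketch)."""
    return {
        # Assumption: /Length is given in bits and defaults to 40,
        # while legacy_keylen is expressed in bytes.
        "legacy_keylen": encrypt_dict.get("/Length", 40) // 8,
        "perm_flags": encrypt_dict["/P"],
        "odata": encrypt_dict["/O"],
        "udata": encrypt_dict["/U"],
        # Treated as optional in this sketch.
        "oeseed": encrypt_dict.get("/OE"),
        "ueseed": encrypt_dict.get("/UE"),
        "encrypted_perms": encrypt_dict.get("/Perms"),
        # Assumption: metadata is encrypted unless the entry says otherwise.
        "encrypt_metadata": encrypt_dict.get("/EncryptMetadata", True),
    }


sample = {"/Length": 128, "/P": -4, "/O": b"owner-entry", "/U": b"user-entry"}
kwargs = gather_kwargs_sketch(sample)
print(kwargs)
```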
- classmethod instantiate_from_pdf_object(encrypt_dict: DictionaryObject)
Instantiate an object of this class using a PDF encryption dictionary as input.
- Parameters:
encrypt_dict – A PDF encryption dictionary.
- Returns:
- as_pdf_object()
Serialise this security handler to a PDF encryption dictionary.
- Returns:
A PDF encryption dictionary.
- authenticate(credential, id1: bytes | None = None) AuthResult
Authenticate a user to this security handler.
- Parameters:
credential – The credential to use (a password in this case).
id1 – First part of the document ID. This is mandatory for legacy encryption handlers, but meaningless otherwise.
- Returns:
An
AuthResult
object indicating the level of access obtained.
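Since `AuthStatus` is an ordered enum (`FAILED` < `USER` < `OWNER`), a caller can compare the returned status against a threshold instead of enumerating cases. The sketch below uses stand-in classes that mirror the documented `AuthStatus` and `AuthResult` so that it runs without pyHanko; with pyHanko installed, the real classes would come from `pyhanko.pdf_utils.crypt.api`. The helpers `has_access` and `is_owner` are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class AuthStatus(IntEnum):
    """Stand-in mirroring the documented enum values."""

    FAILED = 0
    USER = 1
    OWNER = 2


@dataclass(frozen=True)
class AuthResult:
    """Stand-in mirroring the documented result object."""

    status: AuthStatus
    permission_flags: Optional[int] = None


def has_access(result: AuthResult) -> bool:
    """True if authentication succeeded at any level."""
    return result.status > AuthStatus.FAILED


def is_owner(result: AuthResult) -> bool:
    """True only if owner-level access was obtained."""
    return result.status >= AuthStatus.OWNER


user_result = AuthResult(AuthStatus.USER, permission_flags=-4)
print(has_access(user_result), is_owner(user_result))
```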
- get_file_encryption_key() bytes
Retrieve the (global) file encryption key for this security handler.
- Returns:
The file encryption key as a bytes object.
- Raises:
misc.PdfReadError – Raised if this security handler was instantiated from an encryption dictionary and no credential is available.
Module contents
Changed in version 0.13.0: Refactor crypt module into package.
Changed in version 0.3.0: Added support for PDF 2.0 encryption standards and crypt filters.
Utilities for PDF encryption. This module covers all methods outlined in the standard:
Legacy RC4-based encryption (based on PyPDF2 code).
AES-128 encryption with legacy key derivation (partly based on PyPDF2 code).
PDF 2.0 AES-256 encryption.
Public key encryption backed by any of the above.
Following the language in the standard, encryption operations are backed by
subclasses of the SecurityHandler
class, which provides a more or less
generic API.
Danger
The members of this package are all considered internal API, and are therefore subject to change without notice.
Danger
One should also be aware that the legacy encryption scheme implemented here is (very) weak, and we only support it for compatibility reasons. Under no circumstances should it still be used to encrypt new files.
About crypt filters
Crypt filters are objects that handle encryption and decryption of streams and
strings, either for all of them, or for a specific subset (e.g. streams
representing embedded files). In the context of the PDF standard, crypt filters
are a notion that only makes sense for security handlers of version 4 and up.
In pyHanko, however, all encryption and decryption operations pass through
crypt filters, and the serialisation/deserialisation logic in
SecurityHandler
and its subclasses transparently deals with staying
backwards compatible with earlier revisions.
Internally, pyHanko loosely distinguishes between implicit and explicit uses of crypt filters:
Explicit crypt filters are used by directly referring to them from the /Filter entry of a stream dictionary. These are invoked in the usual stream decoding process.
Implicit crypt filters are set by the /StmF and /StrF entries in the security handler’s crypt filter configuration, and are invoked by the object reading/writing procedures as necessary. These filters are invisible to the stream encoding/decoding process: the encoded_data attribute of an “implicitly encrypted” stream will therefore contain decrypted data ready to be decoded in the usual way.
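The distinction can be illustrated by working out which crypt filter name applies to a given stream. The sketch below models dictionaries as plain `dict` objects and is not how pyHanko implements the lookup: `resolve_stream_filter` is a hypothetical helper, and it assumes that `/DecodeParms` is a single dictionary rather than an array parallel to `/Filter`.

```python
IDENTITY = "/Identity"


def resolve_stream_filter(stream_dict: dict, encrypt_dict: dict) -> str:
    """Return the name of the crypt filter that applies to a stream (sketch)."""
    filters = stream_dict.get("/Filter", [])
    if isinstance(filters, str):
        filters = [filters]
    if "/Crypt" in filters:
        # Explicit use: the stream names its own crypt filter.
        params = stream_dict.get("/DecodeParms") or {}
        return params.get("/Name", IDENTITY)
    # Implicit use: fall back to the document-wide stream filter.
    return encrypt_dict.get("/StmF", IDENTITY)


encrypt_dict = {"/StmF": "/StdCF", "/StrF": "/StdCF"}
implicit_stream = {"/Filter": "/FlateDecode"}
explicit_stream = {
    "/Filter": ["/Crypt", "/FlateDecode"],
    "/DecodeParms": {"/Name": "/Identity"},
}

print(resolve_stream_filter(implicit_stream, encrypt_dict))
print(resolve_stream_filter(explicit_stream, encrypt_dict))
```

In this sketch the first stream falls back to the `/StmF` filter, while the second overrides it by naming a filter of its own.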
As long as you don’t require access to encoded object data and/or raw encrypted object data, this distinction should be irrelevant to you as an API user.