pyhanko.sign.signers.pdf_cms module

This module defines utility classes to format CMS objects for use in PDF signatures.

class pyhanko.sign.signers.pdf_cms.Signer(prefer_pss=False, embed_roots=True)

Bases: object

Abstract signer object that is agnostic as to where the cryptographic operations actually happen.

As of now, pyHanko provides two implementations:

  • SimpleSigner implements the easy case where all the key material can be loaded into memory.

  • PKCS11Signer implements a signer that is capable of interfacing with a PKCS11 device (see also BEIDSigner).

Parameters
  • prefer_pss – When signing using an RSA key, prefer PSS padding to legacy PKCS#1 v1.5 padding. Default is False. This option has no effect on non-RSA signatures.

  • embed_roots

    New in version 0.9.0.

    Option that controls whether or not additional self-signed certificates should be embedded into the CMS payload. The default is True.

    Note

    The signer’s certificate is always embedded, even if it is self-signed.

    Note

    Trust roots are configured by the validator, so embedding them has no effect on a typical validation process, and they can safely be omitted in most cases. Nonetheless, embedding the roots can be useful for documentation purposes.

    Warning

    To be precise, if this flag is False, a certificate will be dropped if (a) it is not the signer’s, (b) it is self-issued and (c) its subject and authority key identifiers match (or either is missing). In other words, we never validate the actual self-signature. This heuristic is sufficiently accurate for most applications.
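The drop rule in the warning above can be sketched in plain Python. This is only an illustration of the stated conditions (a) to (c), not pyHanko's code: CertSummary, looks_like_root and certs_to_embed are hypothetical names, and a real implementation would inspect asn1crypto certificate objects instead.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CertSummary:
    """Hypothetical stand-in for the certificate fields the heuristic looks at."""
    subject: str
    issuer: str
    subject_key_id: Optional[bytes] = None
    authority_key_id: Optional[bytes] = None


def looks_like_root(cert: CertSummary) -> bool:
    # (b) self-issued: subject and issuer names coincide
    if cert.subject != cert.issuer:
        return False
    # (c) the key identifiers match, or at least one of them is missing.
    # Note that the self-signature itself is never verified.
    ski, aki = cert.subject_key_id, cert.authority_key_id
    return ski is None or aki is None or ski == aki


def certs_to_embed(signer_cert: CertSummary, others: List[CertSummary],
                   embed_roots: bool = True) -> List[CertSummary]:
    # (a) the signer's certificate is always kept, even if self-signed
    embedded = [signer_cert]
    for cert in others:
        if cert == signer_cert:
            continue
        if embed_roots or not looks_like_root(cert):
            embedded.append(cert)
    return embedded
```

With embed_roots=False, a chain of leaf, intermediate and root would keep only the leaf and the intermediate.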

signing_cert: Certificate

The certificate that will be used to create the signature.

cert_registry: CertificateStore

Collection of certificates associated with this signer. Note that this is simply a bookkeeping tool; in particular it doesn’t care about trust.

signature_mechanism: SignedDigestAlgorithm = None

The (cryptographic) signature mechanism to use.

attribute_certs: Iterable[AttributeCertificateV2] = ()

Attribute certificates to include with the signature.

Note

Only v2 attribute certificates are supported.

get_signature_mechanism(digest_algorithm)

Get the signature mechanism for this signer to use. If signature_mechanism is set, it will be used. Otherwise, this method will attempt to put together a default based on the mechanism used in the signer’s certificate.

Parameters

digest_algorithm – Digest algorithm to use as part of the signature mechanism. Only used if a signature mechanism object has to be put together on-the-fly.

Returns

A SignedDigestAlgorithm object.
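The fallback described above might look roughly like the sketch below. It is a simplification under stated assumptions: choose_mechanism_name is a hypothetical helper that returns a plain string modelled on asn1crypto-style algorithm names, whereas the real method returns a SignedDigestAlgorithm object and derives the key type from signing_cert.

```python
from typing import Optional


def choose_mechanism_name(explicit: Optional[str], key_algorithm: str,
                          digest_algorithm: str, prefer_pss: bool = False) -> str:
    # An explicitly configured mechanism always wins.
    if explicit is not None:
        return explicit
    # Otherwise derive a default from the key type (assumed inputs: 'rsa', 'ec').
    if key_algorithm == "rsa":
        # prefer_pss only matters for RSA keys.
        return "rsassa_pss" if prefer_pss else f"{digest_algorithm}_rsa"
    if key_algorithm == "ec":
        return f"{digest_algorithm}_ecdsa"
    raise ValueError(f"no default mechanism for key type {key_algorithm!r}")
```

In this sketch the digest algorithm only plays a role when no explicit mechanism is given, matching the parameter description above.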

property subject_name

Returns

The subject’s common name as a string, extracted from signing_cert.

static format_revinfo(ocsp_responses: Optional[list] = None, crls: Optional[list] = None)

Format Adobe-style revocation information for inclusion into a CMS object.

Parameters
  • ocsp_responses – A list of OCSP responses to include.

  • crls – A list of CRLs to include.

signer_info(digest_algorithm: str, signed_attrs, signature)

Format the SignerInfo entry for a CMS signature.

Parameters
  • digest_algorithm – Digest algorithm to use.

  • signed_attrs – Signed attributes (see signed_attrs()).

  • signature – The raw signature to embed (see sign_raw()).

Returns

An asn1crypto.cms.SignerInfo object.

async async_sign_raw(data: bytes, digest_algorithm: str, dry_run=False) bytes

Compute the raw cryptographic signature of the data provided, hashed using the digest algorithm provided.

Parameters
  • data – Data to sign.

  • digest_algorithm

    Digest algorithm to use.

    Warning

    If signature_mechanism also specifies a digest, they should match.

  • dry_run – Do not actually create a signature, but merely output placeholder bytes that would suffice to contain an actual signature.

Returns

Signature bytes.

async unsigned_attrs(digest_algorithm, signature: bytes, timestamper=None, dry_run=False) Optional[CMSAttributes]

Changed in version 0.9.0: Made asynchronous _(breaking change)_.

Compute the unsigned attributes to embed into the CMS object. This function is called after signing the hash of the signed attributes (see signed_attrs()).

By default, this method only handles timestamp requests, but other functionality may be added by subclasses.

If this method returns None, no unsigned attributes will be embedded.

Parameters
  • digest_algorithm – Digest algorithm used to hash the signed attributes.

  • signature – Signature of the signed attribute hash.

  • timestamper – Timestamp supplier to use.

  • dry_run – Flag indicating “dry run” mode. If True, only the approximate size of the output matters, so cryptographic operations can be replaced by placeholders.

Returns

The unsigned attributes to add, or None.
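The order in which signed_attrs(), async_sign_raw(), unsigned_attrs() and signer_info() fit together can be sketched as follows. This is an illustration inferred from the descriptions above, not the library's internal code; assemble_signer_info is a hypothetical helper, and it assumes the signed attributes expose asn1crypto's dump() method to obtain their DER encoding.

```python
async def assemble_signer_info(signer, data_digest: bytes, digest_algorithm: str):
    # 1. Build the signed attributes around the digest of the content.
    signed_attrs = await signer.signed_attrs(data_digest, digest_algorithm)
    # 2. Sign the encoded signed attributes (not the content itself).
    signature = await signer.async_sign_raw(signed_attrs.dump(), digest_algorithm)
    # 3. Only now can unsigned attributes (e.g. a signature timestamp) be
    #    computed, since they depend on the signature value.
    unsigned = await signer.unsigned_attrs(digest_algorithm, signature)
    # 4. Wrap everything into a SignerInfo entry.
    info = signer.signer_info(digest_algorithm, signed_attrs, signature)
    return info, unsigned
```

In normal use there is no need to call these methods by hand: async_sign() and async_sign_prescribed_attributes() drive the process.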

async signed_attrs(data_digest: bytes, digest_algorithm: str, attr_settings: Optional[PdfCMSSignedAttributes] = None, content_type='data', use_pades=False, timestamper=None, dry_run=False, is_pdf_sig=True)

Changed in version 0.4.0: Added positional digest_algorithm parameter _(breaking change)_.

Changed in version 0.5.0: Added dry_run, timestamper and cades_meta parameters.

Changed in version 0.9.0: Made asynchronous, grouped some parameters under attr_settings _(breaking change)_.

Format the signed attributes for a CMS signature.

Parameters
  • data_digest – Raw digest of the data to be signed.

  • digest_algorithm

    New in version 0.4.0.

    Name of the digest algorithm used to compute the digest.

  • use_pades – Respect PAdES requirements.

  • dry_run

    New in version 0.5.0.

    Flag indicating “dry run” mode. If True, only the approximate size of the output matters, so cryptographic operations can be replaced by placeholders.

  • attr_settings – PdfCMSSignedAttributes object describing the attributes to be added.

  • timestamper

    New in version 0.5.0.

    Timestamper to use when creating timestamp tokens.

  • content_type

    CMS content type of the encapsulated data. Default is data.

    Danger

    This parameter is internal API, and non-default values must not be used to produce PDF signatures.

  • is_pdf_sig

    Whether the signature being generated is for use in a PDF document.

    Danger

    This parameter is internal API.

Returns

An asn1crypto.cms.CMSAttributes object.

async async_sign(data_digest: bytes, digest_algorithm: str, dry_run=False, use_pades=False, timestamper=None, signed_attr_settings: Optional[PdfCMSSignedAttributes] = None, is_pdf_sig=True, encap_content_info=None) ContentInfo

New in version 0.9.0.

Produce a detached CMS signature from a raw data digest.

Parameters
  • data_digest – Digest of the actual content being signed.

  • digest_algorithm – Digest algorithm to use. This should be the same digest method as the one used to hash the (external) content.

  • dry_run

    If True, the actual signing step will be replaced with a placeholder.

    In a PDF signing context, this is necessary to estimate the size of the signature container before computing the actual digest of the document.

  • signed_attr_settings – PdfCMSSignedAttributes object describing the attributes to be added.

  • use_pades – Respect PAdES requirements.

  • timestamper

    TimeStamper used to obtain a trusted timestamp token that can be embedded into the signature container.

    Note

    If dry_run is true, the timestamper’s dummy_response() method will be called to obtain a placeholder token. Note that with a standard HTTPTimeStamper, this might still hit the timestamping server (in order to produce a realistic size estimate), but the dummy response will be cached.

  • is_pdf_sig

    Whether the signature being generated is for use in a PDF document.

    Danger

    This parameter is internal API.

  • encap_content_info

    Data to encapsulate in the CMS object.

    Danger

    This parameter is internal API, and must not be used to produce PDF signatures.

Returns

A ContentInfo object.
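The dry-run workflow described for async_sign() can be sketched as a two-step process: first estimate the container size from a placeholder, then produce the real signature and check that it fits. The helper names estimate_container_size and sign_if_it_fits are hypothetical, the all-zero dummy digest is an assumption made for illustration, and the sketch relies on the returned ContentInfo offering asn1crypto's dump() method.

```python
import hashlib


async def estimate_container_size(signer, digest_algorithm: str = "sha256") -> int:
    # The document digest is not known yet, so use a dummy digest of the
    # right length; in dry-run mode only the output size matters.
    dummy_digest = bytes(hashlib.new(digest_algorithm).digest_size)
    placeholder = await signer.async_sign(dummy_digest, digest_algorithm, dry_run=True)
    return len(placeholder.dump())


async def sign_if_it_fits(signer, data_digest: bytes, reserved_bytes: int,
                          digest_algorithm: str = "sha256") -> bytes:
    # Real signing pass, using the digest of the actual content.
    cms = await signer.async_sign(data_digest, digest_algorithm)
    cms_bytes = cms.dump()
    if len(cms_bytes) > reserved_bytes:
        raise ValueError(
            f"signature needs {len(cms_bytes)} bytes, only {reserved_bytes} reserved"
        )
    return cms_bytes
```

Reserving somewhat more space than the estimate is a reasonable precaution, since the size of the real signature can differ slightly from that of the placeholder.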

async async_sign_prescribed_attributes(digest_algorithm: str, signed_attrs: CMSAttributes, cms_version='v1', dry_run=False, timestamper=None, encap_content_info=None) ContentInfo

New in version 0.9.0.

Start the CMS signing process with the prescribed set of signed attributes.

Parameters
  • digest_algorithm – Digest algorithm to use. This should be the same digest method as the one used to hash the (external) content.

  • signed_attrs – CMS attributes to sign.

  • dry_run

    If True, the actual signing step will be replaced with a placeholder.

    In a PDF signing context, this is necessary to estimate the size of the signature container before computing the actual digest of the document.

  • timestamper

    TimeStamper used to obtain a trusted timestamp token that can be embedded into the signature container.

    Note

    If dry_run is true, the timestamper’s dummy_response() method will be called to obtain a placeholder token. Note that with a standard HTTPTimeStamper, this might still hit the timestamping server (in order to produce a realistic size estimate), but the dummy response will be cached.

  • cms_version – CMS version to use.

  • encap_content_info

    Data to encapsulate in the CMS object.

    Danger

    This parameter is internal API, and must not be used to produce PDF signatures.

Returns

A ContentInfo object.

async async_sign_general_data(input_data: Union[IO, bytes, ContentInfo, EncapsulatedContentInfo], digest_algorithm: str, detached=True, use_cades=False, timestamper=None, chunk_size=4096, signed_attr_settings: Optional[PdfCMSSignedAttributes] = None, max_read=None) ContentInfo

New in version 0.9.0.

Produce a CMS signature for an arbitrary data stream (not necessarily PDF data).

Parameters
  • input_data

    The input data to sign. This can be a bytes object, a file-like object, a cms.ContentInfo object or a cms.EncapsulatedContentInfo object.

    Warning

    asn1crypto mandates cms.ContentInfo for CMS v1 signatures. In practical terms, this means that you need to use cms.ContentInfo if the content type is data, and cms.EncapsulatedContentInfo otherwise.

    Warning

    We currently only support CMS v1, v3 and v4 signatures. This is only a concern if you need certificates or CRLs of type ‘other’, in which case you can change the version yourself (this will not invalidate any signatures). You’ll also need to do this if you need support for version 1 attribute certificates, or if you want to sign with subjectKeyIdentifier in the sid field.

  • digest_algorithm – The name of the digest algorithm to use.

  • detached – If True, create a CMS detached signature (i.e. an object where the encapsulated content is not embedded in the signature object itself). This is the default. If False, the content to be signed will be embedded as encapsulated content.

  • signed_attr_settings – PdfCMSSignedAttributes object describing the attributes to be added.

  • use_cades – Construct a CAdES-style CMS object.

  • timestamper

    PdfTimeStamper to use to create a signature timestamp.

    Note

    If you want to create a content timestamp (as opposed to a signature timestamp), see CAdESSignedAttrSpec.

  • chunk_size – Chunk size to use when consuming input data.

  • max_read – Maximal number of bytes to read from the input stream.

Returns

A CMS ContentInfo object of type signedData.
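To make the roles of chunk_size and max_read more concrete, the following standard-library sketch hashes a stream in chunks and stops after a byte limit. digest_stream is a hypothetical helper written for illustration; it is not part of pyHanko and makes no claim about how the library reads its input internally.

```python
import hashlib
from typing import IO, Optional


def digest_stream(stream: IO[bytes], digest_algorithm: str = "sha256",
                  chunk_size: int = 4096, max_read: Optional[int] = None) -> bytes:
    md = hashlib.new(digest_algorithm)
    remaining = max_read  # None means "read until end of stream"
    while remaining is None or remaining > 0:
        # Never request more than the remaining byte budget allows.
        to_read = chunk_size if remaining is None else min(chunk_size, remaining)
        chunk = stream.read(to_read)
        if not chunk:
            break  # end of stream reached
        md.update(chunk)
        if remaining is not None:
            remaining -= len(chunk)
    return md.digest()
```

Reading in chunks keeps memory use bounded for large inputs, and the byte limit allows signing only a prefix of a stream.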

sign(data_digest: bytes, digest_algorithm: str, timestamp: Optional[datetime] = None, dry_run=False, revocation_info=None, use_pades=False, timestamper=None, cades_signed_attr_meta: Optional[CAdESSignedAttrSpec] = None, encap_content_info=None) ContentInfo

Deprecated since version 0.9.0: Use async_sign() instead. The implementation of this method will invoke async_sign() using asyncio.run().

Produce a detached CMS signature from a raw data digest.

Parameters
  • data_digest – Digest of the actual content being signed.

  • digest_algorithm – Digest algorithm to use. This should be the same digest method as the one used to hash the (external) content.

  • timestamp

    Signing time to embed into the signed attributes (will be ignored if use_pades is True).

    Note

    This timestamp value is to be interpreted as an unfounded assertion by the signer, which may or may not be good enough for your purposes.

  • dry_run

    If True, the actual signing step will be replaced with a placeholder.

    In a PDF signing context, this is necessary to estimate the size of the signature container before computing the actual digest of the document.

  • revocation_info – Revocation information to embed; this should be the output of a call to Signer.format_revinfo() (ignored when use_pades is True).

  • use_pades – Respect PAdES requirements.

  • timestamper

    TimeStamper used to obtain a trusted timestamp token that can be embedded into the signature container.

    Note

    If dry_run is true, the timestamper’s dummy_response() method will be called to obtain a placeholder token. Note that with a standard HTTPTimeStamper, this might still hit the timestamping server (in order to produce a realistic size estimate), but the dummy response will be cached.

  • cades_signed_attr_meta

    New in version 0.5.0.

    Specification for CAdES-specific signed attributes.

  • encap_content_info

    Data to encapsulate in the CMS object.

    Danger

    This parameter is internal API, and must not be used to produce PDF signatures.

Returns

A ContentInfo object.
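The relationship between the deprecated synchronous methods and their asynchronous replacements can be illustrated with a toy class. DemoSigner is a hypothetical stand-in rather than a pyHanko class; it only mirrors the documented pattern of a synchronous wrapper that calls asyncio.run() on the coroutine.

```python
import asyncio


class DemoSigner:
    """Hypothetical stand-in that mimics the sync/async method pairing."""

    async def async_sign(self, data_digest: bytes, digest_algorithm: str) -> bytes:
        await asyncio.sleep(0)  # placeholder for real asynchronous work
        return b"cms:" + data_digest

    def sign(self, data_digest: bytes, digest_algorithm: str) -> bytes:
        # Legacy-style wrapper: starts a fresh event loop for each call.
        return asyncio.run(self.async_sign(data_digest, digest_algorithm))


def sign_from_sync_code(signer: DemoSigner, digest: bytes) -> bytes:
    # Fine as long as no event loop is already running in this thread.
    return signer.sign(digest, "sha256")


async def sign_from_async_code(signer: DemoSigner, digest: bytes) -> bytes:
    # Inside a coroutine, await the async variant directly.
    return await signer.async_sign(digest, "sha256")
```

Because asyncio.run() cannot be called while an event loop is already running in the same thread, code that is itself asynchronous should await the async_* methods rather than call the deprecated wrappers.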

sign_prescribed_attributes(digest_algorithm: str, signed_attrs: CMSAttributes, cms_version='v1', dry_run=False, timestamper=None, encap_content_info=None) ContentInfo

Deprecated since version 0.9.0: Use async_sign_prescribed_attributes() instead. The implementation of this method will invoke async_sign_prescribed_attributes() using asyncio.run().

Start the CMS signing process with the prescribed set of signed attributes.

Parameters
  • digest_algorithm – Digest algorithm to use. This should be the same digest method as the one used to hash the (external) content.

  • signed_attrs – CMS attributes to sign.

  • dry_run

    If True, the actual signing step will be replaced with a placeholder.

    In a PDF signing context, this is necessary to estimate the size of the signature container before computing the actual digest of the document.

  • timestamper

    TimeStamper used to obtain a trusted timestamp token that can be embedded into the signature container.

    Note

    If dry_run is true, the timestamper’s dummy_response() method will be called to obtain a placeholder token. Note that with a standard HTTPTimeStamper, this might still hit the timestamping server (in order to produce a realistic size estimate), but the dummy response will be cached.

  • cms_version – CMS version to use.

  • encap_content_info

    Data to encapsulate in the CMS object.

    Danger

    This parameter is internal API, and must not be used to produce PDF signatures.

Returns

A ContentInfo object.

sign_general_data(input_data: Union[IO, bytes, ContentInfo, EncapsulatedContentInfo], digest_algorithm: str, detached=True, timestamp: Optional[datetime] = None, use_cades=False, timestamper=None, cades_signed_attr_meta: Optional[CAdESSignedAttrSpec] = None, chunk_size=4096, max_read=None) ContentInfo

New in version 0.7.0.

Deprecated since version 0.9.0: Use async_sign_general_data() instead. The implementation of this method will invoke async_sign_general_data() using asyncio.run().

Produce a CMS signature for an arbitrary data stream (not necessarily PDF data).

Parameters
  • input_data

    The input data to sign. This can be a bytes object, a file-like object, a cms.ContentInfo object or a cms.EncapsulatedContentInfo object.

    Warning

    asn1crypto mandates cms.ContentInfo for CMS v1 signatures. In practical terms, this means that you need to use cms.ContentInfo if the content type is data, and cms.EncapsulatedContentInfo otherwise.

    Warning

    We currently only support CMS v1, v3 and v4 signatures. This is only a concern if you need certificates or CRLs of type ‘other’, in which case you can change the version yourself (this will not invalidate any signatures). You’ll also need to do this if you need support for version 1 attribute certificates, or if you want to sign with subjectKeyIdentifier in the sid field.

  • digest_algorithm – The name of the digest algorithm to use.

  • detached – If True, create a CMS detached signature (i.e. an object where the encapsulated content is not embedded in the signature object itself). This is the default. If False, the content to be signed will be embedded as encapsulated content.

  • timestamp

    Signing time to embed into the signed attributes (will be ignored if use_cades is True).

    Note

    This timestamp value is to be interpreted as an unfounded assertion by the signer, which may or may not be good enough for your purposes.

  • use_cades – Construct a CAdES-style CMS object.

  • timestamper

    PdfTimeStamper to use to create a signature timestamp.

    Note

    If you want to create a content timestamp (as opposed to a signature timestamp), see CAdESSignedAttrSpec.

  • cades_signed_attr_meta – Specification for CAdES-specific signed attributes.

  • chunk_size – Chunk size to use when consuming input data.

  • max_read – Maximal number of bytes to read from the input stream.

Returns

A CMS ContentInfo object of type signedData.

class pyhanko.sign.signers.pdf_cms.SimpleSigner(signing_cert: Certificate, signing_key: PrivateKeyInfo, cert_registry: CertificateStore, signature_mechanism: Optional[SignedDigestAlgorithm] = None, prefer_pss=False, embed_roots=True, attribute_certs=None)

Bases: Signer

Simple signer implementation where the key material is available in local memory.

signing_key: PrivateKeyInfo

Private key associated with the certificate in signing_cert.

async async_sign_raw(data: bytes, digest_algorithm: str, dry_run=False) bytes

Compute the raw cryptographic signature of the data provided, hashed using the digest algorithm provided.

Parameters
  • data – Data to sign.

  • digest_algorithm

    Digest algorithm to use.

    Warning

    If signature_mechanism also specifies a digest, they should match.

  • dry_run – Do not actually create a signature, but merely output placeholder bytes that would suffice to contain an actual signature.

Returns

Signature bytes.

sign_raw(data: bytes, digest_algorithm: str) bytes

Synchronous raw signature implementation.

Parameters
  • data – Data to be signed.

  • digest_algorithm – Digest algorithm to use.

Returns

Raw signature encoded according to the conventions of the signing algorithm used.

classmethod load_pkcs12(pfx_file, ca_chain_files=None, other_certs=None, passphrase=None, signature_mechanism=None, prefer_pss=False)

Load certificates and key material from a PKCS#12 archive (usually .pfx or .p12 files).

Parameters
  • pfx_file – Path to the PKCS#12 archive.

  • ca_chain_files – Paths to (PEM/DER) files containing other relevant certificates not included in the PKCS#12 file.

  • other_certs – Other relevant certificates, specified as a list of asn1crypto.x509.Certificate objects.

  • passphrase – Passphrase to decrypt the PKCS#12 archive, if required.

  • signature_mechanism – Override the signature mechanism to use.

  • prefer_pss – Prefer PSS signature mechanism over RSA PKCS#1 v1.5 if there’s a choice.

Returns

A SimpleSigner object initialised with key material loaded from the PKCS#12 file provided.

classmethod load(key_file, cert_file, ca_chain_files=None, key_passphrase=None, other_certs=None, signature_mechanism=None, prefer_pss=False)

Load certificates and key material from PEM/DER files.

Parameters
  • key_file – File containing the signer’s private key.

  • cert_file – File containing the signer’s certificate.

  • ca_chain_files – Files containing other relevant certificates.

  • key_passphrase – Passphrase to decrypt the private key (if required).

  • other_certs – Other relevant certificates, specified as a list of asn1crypto.x509.Certificate objects.

  • signature_mechanism – Override the signature mechanism to use.

  • prefer_pss – Prefer PSS signature mechanism over RSA PKCS#1 v1.5 if there’s a choice.

Returns

A SimpleSigner object initialised with key material loaded from the files provided.
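A call to load() might be wrapped as in the sketch below. load_simple_signer is a hypothetical helper and the file paths are supplied by the caller. Two things are assumptions rather than documented here: that the passphrase is given as bytes, and that ca_chain_files accepts a tuple of paths. The import is placed inside the function so that pyHanko is only required when the helper is actually called.

```python
from typing import Iterable, Optional


def load_simple_signer(key_file: str, cert_file: str,
                       ca_chain_files: Iterable[str] = (),
                       key_passphrase: Optional[bytes] = None):
    # Requires pyHanko to be installed; imported lazily for that reason.
    from pyhanko.sign.signers.pdf_cms import SimpleSigner

    return SimpleSigner.load(
        key_file,
        cert_file,
        ca_chain_files=tuple(ca_chain_files),
        key_passphrase=key_passphrase,  # assumed to be bytes, or None if unencrypted
    )
```

The resulting signer can then be passed to the asynchronous signing methods documented above.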

class pyhanko.sign.signers.pdf_cms.ExternalSigner(signing_cert: Certificate, cert_registry: CertificateStore, signature_value: bytes, signature_mechanism: Optional[SignedDigestAlgorithm] = None, prefer_pss=False, embed_roots=True)

Bases: Signer

Class to help format CMS objects for use with remote signing. It embeds a fixed signature value into the CMS, set at initialisation.

Intended for use with Interrupted signing.

signing_cert: Certificate

The certificate that will be used to create the signature.

cert_registry: CertificateStore

Collection of certificates associated with this signer. Note that this is simply a bookkeeping tool; in particular it doesn’t care about trust.

async async_sign_raw(data: bytes, digest_algorithm: str, dry_run=False) bytes

Return a fixed signature value.
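The behaviour described for ExternalSigner, where the signature value is fixed up front and returned regardless of the input, can be illustrated with a small stand-in. FixedValueSigner is hypothetical and deliberately omits certificates and CMS formatting. That it ignores dry_run as well is an assumption of this sketch.

```python
import asyncio


class FixedValueSigner:
    """Hypothetical stand-in illustrating a signer with a precomputed signature."""

    def __init__(self, signature_value: bytes):
        # The value was produced elsewhere, e.g. by a remote signing service.
        self._signature_value = signature_value

    async def async_sign_raw(self, data: bytes, digest_algorithm: str,
                             dry_run: bool = False) -> bytes:
        # No cryptography happens here: the inputs are ignored and the
        # precomputed value is handed back as-is.
        return self._signature_value
```

This is why the caller is responsible for making sure that the supplied value really is a signature over the data that ends up in the CMS object.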

class pyhanko.sign.signers.pdf_cms.PdfCMSSignedAttributes(signing_time: Optional[datetime] = None, adobe_revinfo_attr: Optional[CMSAttribute] = None, cades_signed_attrs: Optional[CAdESSignedAttrSpec] = None)

Bases: object

New in version 0.7.0.

Serialisable container class describing input for various signed attributes in a CMS object for a PDF signature.

signing_time: Optional[datetime] = None

Timestamp for the signingTime attribute. Will be ignored in a PAdES context.

adobe_revinfo_attr: Optional[CMSAttribute] = None

Adobe-style signed revocation info attribute.

cades_signed_attrs: Optional[CAdESSignedAttrSpec] = None

Optional settings for CAdES-style signed attributes.

async pyhanko.sign.signers.pdf_cms.format_attributes(attr_provs: List[CMSAttributeProvider], other_attrs: Iterable[CMSAttributes] = (), dry_run: bool = False) CMSAttributes

Format CMS attributes obtained from attribute providers.

Parameters
  • attr_provs – List of attribute providers.

  • other_attrs – Other (predetermined) attributes to include.

  • dry_run – Whether to invoke the attribute providers in dry-run mode or not.

Returns

A cms.CMSAttributes value.

async pyhanko.sign.signers.pdf_cms.format_signed_attributes(data_digest: bytes, attr_provs: List[CMSAttributeProvider], content_type='data', dry_run=False) CMSAttributes

Format signed attributes for a CMS SignerInfo value.

Parameters
  • data_digest – The byte string to put in the messageDigest attribute.

  • attr_provs – List of attribute providers to source attributes from.

  • content_type – The content type of the data being signed (default is data).

  • dry_run – Whether to invoke the attribute providers in dry-run mode or not.

Returns

A cms.CMSAttributes value representing the signed attributes.

pyhanko.sign.signers.pdf_cms.asyncify_signer(signer_cls)

Decorator to turn a legacy Signer subclass into one that works with the new async API.
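The general idea behind such a decorator can be sketched as follows. This is not pyHanko's implementation: asyncify_sketch and LegacySigner are hypothetical, and the sketch assumes that the legacy class exposes a blocking sign_raw(data, digest_algorithm, dry_run=False) method.

```python
import asyncio


def asyncify_sketch(signer_cls):
    """Hypothetical decorator: add an async_sign_raw that delegates to sign_raw."""

    async def async_sign_raw(self, data: bytes, digest_algorithm: str,
                             dry_run: bool = False) -> bytes:
        # Delegate to the blocking legacy method. Note that this call blocks
        # the event loop while the signature is being computed.
        return self.sign_raw(data, digest_algorithm, dry_run=dry_run)

    signer_cls.async_sign_raw = async_sign_raw
    return signer_cls


@asyncify_sketch
class LegacySigner:
    """Toy legacy-style class that only implements the blocking method."""

    def sign_raw(self, data: bytes, digest_algorithm: str, dry_run: bool = False) -> bytes:
        # Stand-in for a real signature: placeholder bytes in dry-run mode,
        # otherwise a trivially "transformed" copy of the input.
        return b"\x00" * 4 if dry_run else data[::-1]
```

After decoration, the class can be awaited like any natively asynchronous signer, for example with asyncio.run(LegacySigner().async_sign_raw(b"abc", "sha256")).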

pyhanko.sign.signers.pdf_cms.select_suitable_signing_md(key: PublicKeyInfo) str

Choose a reasonable default signing message digest given the properties of (the public part of) a key.

The fallback value is constants.DEFAULT_MD.

Parameters

key – A keys.PublicKeyInfo object.

Returns

The name of a message digest algorithm.