pyhanko.sign.signers package

Submodules

pyhanko.sign.signers.cms_embedder module

This module describes and implements the low-level PdfCMSEmbedder protocol for embedding CMS payloads into PDF signature objects.

class pyhanko.sign.signers.cms_embedder.PdfCMSEmbedder(new_field_spec: SigFieldSpec | None = None)

Bases: object

Low-level class that handles embedding CMS objects into PDF signature fields.

It also takes care of appearance generation and DocMDP configuration, but does not otherwise offer any of the conveniences of PdfSigner.

Parameters:

new_field_spec – SigFieldSpec to use when creating new fields on-the-fly.

write_cms(field_name: str | None, writer: BasePdfFileWriter, existing_fields_only=False)

Added in version 0.3.0.

Changed in version 0.7.0: Digest wrapped in PreparedByteRangeDigest in step 3; output returned in step 3 instead of step 4.

This method returns a generator coroutine that controls the process of embedding CMS data into a PDF signature field. It can be used for both timestamps and regular signatures.

Danger

This is a very low-level interface that performs virtually no error checking, and is intended to be used in situations where the construction of the CMS object to be embedded is not under the caller’s control (e.g. a remote signer that produces full-fledged CMS objects).

In almost every other case, you’re better off using PdfSigner instead, with a custom Signer implementation to handle the cryptographic operations if necessary.

The coroutine follows this specific protocol.

  1. First, it retrieves or creates the signature field to embed the CMS object in, and yields a reference to said field.

  2. The caller should then send in a SigObjSetup object, which is subsequently processed by the coroutine. For convenience, the coroutine will then yield a reference to the signature dictionary (as embedded in the PDF writer).

  3. Next, the caller should send a SigIOSetup object, describing how the resulting document should be hashed and written to the output. The coroutine will write the entire document with a placeholder region reserved for the signature and compute the document’s hash. It will then yield a (prepared_digest, output) tuple, where prepared_digest is a PreparedByteRangeDigest object containing the document digest and the relevant offsets, and output is the output stream to which the document to be signed was written.

    From this point onwards, no objects may be changed or added to the IncrementalPdfFileWriter currently in use.

  4. Finally, the caller should pass in a CMS object to place inside the signature dictionary. The CMS object can be supplied as a raw bytes object, or an asn1crypto-style object. The coroutine’s final yield is the value of the signature dictionary’s /Contents entry, given as a hexadecimal string.

Caution

It is the caller’s own responsibility to ensure that enough room is available in the placeholder signature object to contain the final CMS object.

Parameters:
  • field_name – The name of the field to fill in. This should be a field of type /Sig.

  • writer – An IncrementalPdfFileWriter containing the document to sign.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

Returns:

A generator coroutine implementing the protocol described above.
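
The four steps above can be sketched as a small driver function. This is a minimal sketch rather than a reference implementation: drive_cms_embedding and make_cms are hypothetical names, and fake_write_cms is only a stand-in that mimics the yield/send order so that the snippet runs without a PDF. With pyHanko itself, the generator would come from calling write_cms(field_name, writer) on a PdfCMSEmbedder, the two setup arguments would be SigObjSetup and SigIOSetup instances, and make_cms would call whatever service produces the CMS object.

```python
from collections import namedtuple
from io import BytesIO


def drive_cms_embedding(cms_writer, sig_obj_setup, sig_io_setup, make_cms):
    """Drive a write_cms()-style generator through its four steps."""
    # Step 1: the generator yields a reference to the signature field.
    next(cms_writer)
    # Step 2: send the signature object setup; the generator yields a
    # reference to the signature dictionary.
    cms_writer.send(sig_obj_setup)
    # Step 3: send the I/O setup; the document is written and hashed.
    prepared_digest, output = cms_writer.send(sig_io_setup)
    # Step 4: obtain a CMS object for the digest and send it in.
    cms_bytes = make_cms(prepared_digest.document_digest)
    contents_hex = cms_writer.send(cms_bytes)
    cms_writer.close()
    return output, contents_hex


# Stand-in objects for demonstration only (not part of pyHanko).
FakeDigest = namedtuple("FakeDigest", "document_digest")


def fake_write_cms():
    """Mimic the yield/send order of the protocol described above."""
    yield "sig-field-ref"
    yield "sig-dict-ref"
    cms = yield (FakeDigest(b"\x01" * 32), BytesIO())
    yield cms.hex().encode("ascii")


output, contents = drive_cms_embedding(
    fake_write_cms(), "obj-setup", "io-setup",
    make_cms=lambda digest: b"CMS" + digest[:2],
)
```

Keeping the CMS producer behind a callable mirrors the intended use case: the caller does not control how the CMS object is built, only when it is requested.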

class pyhanko.sign.signers.cms_embedder.SigMDPSetup(md_algorithm: str, certify: bool = False, field_lock: pyhanko.sign.fields.FieldMDPSpec | None = None, docmdp_perms: pyhanko.sign.fields.MDPPerm | None = None)

Bases: object

md_algorithm: str

Message digest algorithm to write into the signature reference dictionary, if one is written at all.

Warning

It is the caller’s responsibility to make sure that this value agrees with the value embedded into the CMS object, and with the algorithm used to hash the document. The low-level PdfCMSEmbedder API will simply take it at face value.

certify: bool = False

Sign with an author (certification) signature, as opposed to an approval signature. A document can contain at most one such signature, and it must be the first one.

field_lock: FieldMDPSpec | None = None

Field lock information to write to the signature reference dictionary.

docmdp_perms: MDPPerm | None = None

DocMDP permissions to write to the signature reference dictionary.

apply(sig_obj_ref, writer)

Apply the settings to a signature object.

Danger

This method is internal API.

class pyhanko.sign.signers.cms_embedder.SigObjSetup(sig_placeholder: PdfSignedData, mdp_setup: SigMDPSetup | None = None, appearance_setup: SigAppearanceSetup | None = None)

Bases: object

Describes the signature dictionary to be embedded as the form field’s value.

sig_placeholder: PdfSignedData

Bare-bones placeholder object, usually of type SignatureObject or DocumentTimestamp.

In particular, this determines the number of bytes to allocate for the CMS object.

mdp_setup: SigMDPSetup | None = None

Optional DocMDP settings, see SigMDPSetup.

appearance_setup: SigAppearanceSetup | None = None

Optional appearance settings, see SigAppearanceSetup.

class pyhanko.sign.signers.cms_embedder.SigAppearanceSetup(style: BaseStampStyle, timestamp: datetime, name: str | None, text_params: dict | None = None)

Bases: object

Signature appearance configuration.

Part of the low-level PdfCMSEmbedder API, see SigObjSetup.

style: BaseStampStyle

Stamp style to use to generate the appearance.

timestamp: datetime

Timestamp to show in the signature appearance.

name: str | None

Signer name to show in the signature appearance.

text_params: dict | None = None

Additional text interpolation parameters to pass to the underlying stamp style.

apply(sig_annot, writer)

Apply the settings to an annotation.

Danger

This method is internal API.

class pyhanko.sign.signers.cms_embedder.SigIOSetup(md_algorithm: str, in_place: bool = False, chunk_size: int = 4096, output: IO | None = None)

Bases: object

I/O settings for writing signed PDF documents.

Objects of this type are used in the penultimate phase of the PdfCMSEmbedder protocol.

md_algorithm: str

Message digest algorithm to use to compute the document hash. It should be supported by pyca/cryptography.

Warning

This is also the message digest algorithm that should appear in the corresponding signerInfo entry in the CMS object that ends up being embedded in the signature field.

in_place: bool = False

Sign the input in-place. If False, write output to a BytesIO object, or output if the latter is not None.

chunk_size: int = 4096

Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

output: IO | None = None

Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.
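
To illustrate what chunk_size controls, the following sketch feeds a stream to a message digest in fixed-size chunks. It is a simplified stand-in that uses hashlib from the standard library rather than pyca/cryptography, and digest_stream is a hypothetical helper, not pyHanko's actual hashing code.

```python
import hashlib
from io import BytesIO


def digest_stream(stream, md_algorithm="sha256", chunk_size=4096):
    """Hash a readable binary stream, at most chunk_size bytes at a time."""
    md = hashlib.new(md_algorithm)
    buf = bytearray(chunk_size)
    view = memoryview(buf)
    while True:
        n = stream.readinto(view)
        if not n:
            break
        # Only the first n bytes of the buffer hold fresh data.
        md.update(view[:n])
    return md.digest()


data = b"%PDF-1.7 placeholder content " * 1000
chunked = digest_stream(BytesIO(data), chunk_size=4096)
tiny_chunks = digest_stream(BytesIO(data), chunk_size=7)
```

The digest does not depend on the chunk size; the setting only trades memory for the number of read calls.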

pyhanko.sign.signers.constants module

This module defines constants & defaults used by pyHanko when creating digital signatures.

pyhanko.sign.signers.constants.DEFAULT_MD = 'sha256'

Default message digest algorithm used when computing digests for use in signatures.

pyhanko.sign.signers.constants.DEFAULT_SIG_SUBFILTER = SigSeedSubFilter.ADOBE_PKCS7_DETACHED

Default SubFilter to use for a PDF signature.

pyhanko.sign.signers.constants.DEFAULT_SIGNER_KEY_USAGE = {'non_repudiation'}

Default key usage bits required for the signer’s certificate.

pyhanko.sign.signers.constants.SIG_DETAILS_DEFAULT_TEMPLATE = 'Digitally signed by %(signer)s.\nTimestamp: %(ts)s.'

Default template string for signature appearances.

pyhanko.sign.signers.constants.DEFAULT_SIGNING_STAMP_STYLE = TextStampStyle(border_width=3, background=<pyhanko.pdf_utils.content.RawContent object>, background_layout=SimpleBoxLayoutRule(x_align=<AxisAlignment.ALIGN_MID: 2>, y_align=<AxisAlignment.ALIGN_MID: 2>, margins=Margins(left=5, right=5, top=5, bottom=5), inner_content_scaling=<InnerScaling.SHRINK_TO_FIT: 4>), background_opacity=0.6, text_box_style=TextBoxStyle(font=<pyhanko.pdf_utils.font.basic.SimpleFontEngineFactory object>, font_size=10, leading=None, border_width=0, box_layout_rule=None, vertical_text=False), inner_content_layout=None, stamp_text='Digitally signed by %(signer)s.\nTimestamp: %(ts)s.', timestamp_format='%Y-%m-%d %H:%M:%S %Z')

Default stamp style used for visible signatures.
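
As a rough illustration of how the default template and timestamp format combine, the sketch below applies plain %-style interpolation to the two strings shown above. The actual rendering is done by the stamp style, so treat this as an approximation; the signer name and signing time are made-up example values.

```python
from datetime import datetime, timezone

# Values copied from the constants documented above.
SIG_DETAILS_DEFAULT_TEMPLATE = (
    "Digitally signed by %(signer)s.\nTimestamp: %(ts)s."
)
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S %Z"

# Hypothetical example values.
signing_time = datetime(2024, 1, 31, 12, 0, 0, tzinfo=timezone.utc)
text = SIG_DETAILS_DEFAULT_TEMPLATE % {
    "signer": "Alice Example",
    "ts": signing_time.strftime(TIMESTAMP_FORMAT),
}
print(text)
```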

pyhanko.sign.signers.constants.ESIC_EXTENSION_1 = DeveloperExtension(prefix_name='/ESIC', base_version='/1.7', extension_level=1, url=None, extension_revision=None, compare_by_level=True, subsumed_by=(), subsumes=(), multivalued=<DevExtensionMultivalued.NEVER: 2>)

ESIC extension for PDF 1.7. Used to declare usage of PAdES structures.

pyhanko.sign.signers.constants.ISO32001 = DeveloperExtension(prefix_name='/ISO_', base_version='/2.0', extension_level=32001, url='https://www.iso.org/standard/45874.html', extension_revision=':2022', compare_by_level=False, subsumed_by=(), subsumes=(), multivalued=<DevExtensionMultivalued.ALWAYS: 1>)

ISO extension to PDF 2.0 to include SHA-3 and SHAKE256 support. This extension is defined in ISO/TS 32001.

Declared automatically whenever either of these is used in the signing or document digesting process.

pyhanko.sign.signers.constants.ISO32002 = DeveloperExtension(prefix_name='/ISO_', base_version='/2.0', extension_level=32002, url='https://www.iso.org/standard/45875.html', extension_revision=':2022', compare_by_level=False, subsumed_by=(), subsumes=(), multivalued=<DevExtensionMultivalued.ALWAYS: 1>)

ISO extension to PDF 2.0 to include EdDSA support and clarify supported curves for ECDSA. This extension is defined in ISO/TS 32002.

Declared automatically whenever Ed25519 or Ed448 are used, and when ECDSA is used with one of the curves listed in ISO/TS 32002.

pyhanko.sign.signers.csc_signer module

Added in version 0.10.0.

Asynchronous Signer implementation for interacting with a remote signing service using the Cloud Signature Consortium (CSC) API.

This implementation is based on version 1.0.4.0 (2019-06) of the CSC API specification.

Usage notes

This module’s CSCSigner class supplies an implementation of the Signer class in pyHanko. As such, it is flexible enough to be used either through pyHanko’s high-level API (sign_pdf() et al.), or through the interrupted signing API.

CSCSigner overview

CSCSigner is only directly responsible for calling the signatures/signHash endpoint in the CSC API. Other than that, it only handles batch control. This means that the following tasks require further action on the API user’s part:

  • authenticating to the signing service (typically using OAuth2);

  • obtaining Signature Activation Data (SAD) from the signing service;

  • provisioning the certificates to embed into the document (usually those are supplied by the signing service as well).

The first two involve a degree of implementation/vendor dependence that is difficult to cater to in full generality, and the third is out of scope for Signer subclasses in general.

However, this module still provides a number of convenient hooks and guardrails that should allow you to fill in these blanks with relative ease. We briefly discuss these below.

Throughout, the particulars of how pyHanko should connect to a signing service are supplied in a CSCServiceSessionInfo object. This object contains the base CSC API URL, the CSC credential ID to use, and authentication data.

Authenticating to the signing service

While the authentication process itself is the API user’s responsibility, CSCServiceSessionInfo includes an oauth_token field that will (by default) be used to populate the HTTP Authorization header for every request.

To handle OAuth-specific tasks, you might want to use a library like OAuthLib.

Obtaining SAD from the signing service

This is done by subclassing CSCAuthorizationManager and passing an instance to the CSCSigner. The CSCAuthorizationManager instance should call the signer’s credentials/authorize endpoint with the proper parameters required by the service. See the documentation for CSCAuthorizationManager for details and information about helper functions.

Certificate provisioning

In pyHanko’s API, Signer instances need to be initialised with the signer’s certificate, preferably together with other relevant CA certificates. In a CSC context, these are typically retrieved from the signing service by calling the credentials/info endpoint.

This module offers a helper function to handle that task, see fetch_certs_in_csc_credential().
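
Putting the pieces from these usage notes together, a signing flow might look roughly like the sketch below. It only uses constructors and functions documented in this package, but sign_with_csc, the field name Signature1 and the way the SAD is obtained beforehand are assumptions for the sake of the example. It also presumes a service that does not require hash pinning, since it relies on PrefetchedSADAuthorizationManager.

```python
async def sign_with_csc(pdf_in, pdf_out, service_url, credential_id,
                        oauth_token, sad):
    """Sign the PDF in pdf_in via a CSC service, writing to pdf_out."""
    # Imported lazily so that this sketch can be loaded even where
    # pyHanko and aiohttp are not installed.
    import aiohttp
    from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter
    from pyhanko.sign import signers
    from pyhanko.sign.signers import csc_signer

    session_info = csc_signer.CSCServiceSessionInfo(
        service_url=service_url,
        credential_id=credential_id,
        oauth_token=oauth_token,
    )
    async with aiohttp.ClientSession() as session:
        # Certificate provisioning through credentials/info.
        cred_info = await csc_signer.fetch_certs_in_csc_credential(
            session, session_info
        )
        # The SAD was obtained beforehand by the caller (assumption).
        auth_manager = csc_signer.PrefetchedSADAuthorizationManager(
            session_info, cred_info,
            csc_signer.CSCAuthorizationInfo(sad=sad),
        )
        signer = csc_signer.CSCSigner(session, auth_manager=auth_manager)
        return await signers.async_sign_pdf(
            IncrementalPdfFileWriter(pdf_in),
            signers.PdfSignatureMetadata(field_name="Signature1"),
            signer=signer,
            output=pdf_out,
        )
```

Services that do require hash pinning would need a custom CSCAuthorizationManager subclass in place of the prefetched one.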

class pyhanko.sign.signers.csc_signer.CSCSigner(session: ClientSession, auth_manager: CSCAuthorizationManager, sign_timeout: int = 300, prefer_pss: bool = False, embed_roots: bool = True, client_data: str | None = None, batch_autocommit: bool = True, batch_size: int | None = None, est_raw_signature_size=512)

Bases: Signer

Implements the Signer interface for a remote CSC signing service. Requests are made asynchronously, using aiohttp.

Parameters:
  • session – The aiohttp session to use when performing queries.

  • auth_manager – A CSCAuthorizationManager instance capable of procuring signature activation data from the signing service.

  • sign_timeout – Timeout for signing operations, in seconds. Defaults to 300 seconds (5 minutes).

  • prefer_pss – When signing using an RSA key, prefer PSS padding to legacy PKCS#1 v1.5 padding. Default is False. This option has no effect on non-RSA signatures.

  • embed_roots – Option that controls whether or not additional self-signed certificates should be embedded into the CMS payload. The default is True.

  • client_data – CSC client data to add to any signing request(s), if applicable.

  • batch_autocommit – Whether to automatically commit a signing transaction as soon as a batch is full. The default is True. If False, the caller has to trigger commit() manually.

  • batch_size – The number of signatures to sign in one transaction. This defaults to 1 (i.e. a separate signatures/signHash call is made for every signature).

  • est_raw_signature_size – Estimated raw signature size (in bytes). Defaults to 512 bytes, which, combined with other built-in safety margins, should provide a generous overestimate.

get_signature_mechanism_for_digest(digest_algorithm)

Get the signature mechanism for this signer to use. If signature_mechanism is set, it will be used. Otherwise, this method will attempt to put together a default based on the mechanism used in the signer’s certificate.

Parameters:

digest_algorithm – Digest algorithm to use as part of the signature mechanism. Only used if a signature mechanism object has to be put together on-the-fly.

Returns:

A SignedDigestAlgorithm object.

async format_csc_signing_req(tbs_hashes: List[str], digest_algorithm: str) dict

Populate the request data for a CSC signing request.

Parameters:
  • tbs_hashes – Base64-encoded hashes that require signing.

  • digest_algorithm – The digest algorithm to use.

Returns:

A dict that, when encoded as a JSON object, can be used as the request body for a call to signatures/signHash.
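
For orientation, a signatures/signHash request body could look roughly like the output of the sketch below. The field names follow version 1 of the CSC API specification; sketch_sign_hash_request is a hypothetical helper and does not claim to reproduce exactly what format_csc_signing_req() emits. The credential ID and SAD are made-up values.

```python
import base64
import hashlib


def sketch_sign_hash_request(credential_id, sad, tbs_hashes,
                             hash_algo_oid, sign_algo_oid, client_data=None):
    """Build a signatures/signHash body using CSC API v1 field names."""
    req = {
        "credentialID": credential_id,
        "SAD": sad,
        "hash": list(tbs_hashes),
        "hashAlgo": hash_algo_oid,
        "signAlgo": sign_algo_oid,
    }
    if client_data is not None:
        req["clientData"] = client_data
    return req


# The hashes are transmitted as base64 strings.
tbs_hash = base64.b64encode(hashlib.sha256(b"data to be signed").digest())
body = sketch_sign_hash_request(
    credential_id="example-credential",
    sad="opaque-sad-token",
    tbs_hashes=[tbs_hash.decode("ascii")],
    hash_algo_oid="2.16.840.1.101.3.4.2.1",   # SHA-256
    sign_algo_oid="1.2.840.113549.1.1.11",    # sha256WithRSAEncryption
)
```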

async async_sign_raw(data: bytes, digest_algorithm: str, dry_run=False) bytes

Compute the raw cryptographic signature of the data provided, hashed using the digest algorithm provided.

Parameters:
  • data – Data to sign.

  • digest_algorithm

    Digest algorithm to use.

    Warning

    If signature_mechanism also specifies a digest, they should match.

  • dry_run – Do not actually create a signature, but merely output placeholder bytes that would suffice to contain an actual signature.

Returns:

Signature bytes.

async commit()

Commit the current batch by calling the signatures/signHash endpoint on the CSC service.

This coroutine does not return anything; instead, it notifies all waiting signing coroutines that their signature has been fetched.
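
The interplay between batch_size, batch_autocommit and commit() can be modelled with plain asyncio futures. The class below is a toy model for illustration, not pyHanko's implementation: BatchSketch and fake_sign_batch are hypothetical names, and the "remote call" is simulated locally.

```python
import asyncio


class BatchSketch:
    """Toy model of batched signing with optional autocommit."""

    def __init__(self, sign_batch, batch_size=1, autocommit=True):
        self._sign_batch = sign_batch   # async: list of hashes -> signatures
        self._batch_size = batch_size
        self._autocommit = autocommit
        self._pending = []              # (hash, future) pairs

    async def sign(self, tbs_hash):
        fut = asyncio.get_running_loop().create_future()
        self._pending.append((tbs_hash, fut))
        if self._autocommit and len(self._pending) >= self._batch_size:
            await self.commit()
        return await fut

    async def commit(self):
        # One remote call for the whole batch, then wake every waiter.
        batch, self._pending = self._pending, []
        signatures = await self._sign_batch([h for h, _ in batch])
        for (_, fut), sig in zip(batch, signatures):
            fut.set_result(sig)


remote_calls = []


async def fake_sign_batch(hashes):
    remote_calls.append(list(hashes))
    return [b"sig:" + h.encode("ascii") for h in hashes]


async def demo():
    signer = BatchSketch(fake_sign_batch, batch_size=2)
    return await asyncio.gather(signer.sign("aaa"), signer.sign("bbb"))


results = asyncio.run(demo())
```

With autocommit disabled, the two sign() calls would stay suspended until some other task calls commit().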

class pyhanko.sign.signers.csc_signer.CSCServiceSessionInfo(service_url: str, credential_id: str, oauth_token: str | None = None, api_ver: str = 'v1')

Bases: object

Information about the CSC service, together with the required authentication data.

service_url: str

Base URL of the CSC service. This is the part that precedes /csc/<version>/... in the API endpoint URLs.

credential_id: str

The identifier of the CSC credential to use when signing. The format is vendor-dependent.

oauth_token: str | None = None

OAuth token to use when making requests to the CSC service.

api_ver: str = 'v1'

CSC API version.

Note

This setting does not affect any of the internal logic; it only changes how the URLs are formatted.

endpoint_url(endpoint_name)

Complete an endpoint name to a full URL.

Parameters:

endpoint_name – Name of the endpoint (e.g. credentials/info).

Returns:

A URL.

property auth_headers

HTTP Header(s) necessary for authentication, to be passed with every request.

Note

By default, this supplies the Authorization header with the value of oauth_token as the Bearer value.

Returns:

A dict of headers.
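
The behaviour of endpoint_url() and auth_headers described above can be approximated as follows. SessionInfoSketch is an illustrative stand-in, not the real class: the URL layout is inferred from the description of service_url, and returning an empty dict when no token is set is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionInfoSketch:
    """Illustrative stand-in for CSCServiceSessionInfo."""
    service_url: str
    credential_id: str
    oauth_token: Optional[str] = None
    api_ver: str = "v1"

    def endpoint_url(self, endpoint_name: str) -> str:
        # Assumed layout: <service_url>/csc/<api_ver>/<endpoint_name>
        base = self.service_url.rstrip("/")
        return f"{base}/csc/{self.api_ver}/{endpoint_name}"

    @property
    def auth_headers(self) -> dict:
        # Bearer token in the Authorization header, if a token is set.
        if self.oauth_token is None:
            return {}
        return {"Authorization": f"Bearer {self.oauth_token}"}


info = SessionInfoSketch("https://csc.example.com", "cred-1", "token123")
print(info.endpoint_url("credentials/info"))
print(info.auth_headers)
```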

class pyhanko.sign.signers.csc_signer.CSCCredentialInfo(signing_cert: Certificate, chain: List[Certificate], supported_mechanisms: FrozenSet[str], max_batch_size: int, hash_pinning_required: bool, response_data: dict)

Bases: object

Information about a CSC credential, typically fetched using a credentials/info call. See also fetch_certs_in_csc_credential().

signing_cert: Certificate

The signer’s certificate.

chain: List[Certificate]

Other relevant CA certificates.

supported_mechanisms: FrozenSet[str]

Signature mechanisms supported by the credential.

max_batch_size: int

The maximal batch size that can be used with this credential.

hash_pinning_required: bool

Flag controlling whether SAD must be tied to specific hashes.

response_data: dict

The JSON response data from the server as an otherwise unparsed dict.

as_cert_store() CertificateStore

Register the relevant certificates into a CertificateStore and return it.

Returns:

A CertificateStore.

async pyhanko.sign.signers.csc_signer.fetch_certs_in_csc_credential(session: ClientSession, csc_session_info: CSCServiceSessionInfo, timeout: int = 30) CSCCredentialInfo

Call the credentials/info endpoint of the CSC service for a specific credential, and encode the result into a CSCCredentialInfo object.

Parameters:
  • session – The aiohttp session to use when performing queries.

  • csc_session_info – General information about the CSC service and the credential.

  • timeout – How many seconds to allow before time-out.

Returns:

A CSCCredentialInfo object with the processed response.

class pyhanko.sign.signers.csc_signer.CSCAuthorizationInfo(sad: str, expires_at: datetime | None = None)

Bases: object

Authorization data to make a signing request. This is the result of a call to credentials/authorize.

sad: str

Signature activation data; opaque to the client.

expires_at: datetime | None = None

Expiry date of the signature activation data.

class pyhanko.sign.signers.csc_signer.CSCAuthorizationManager(csc_session_info: CSCServiceSessionInfo, credential_info: CSCCredentialInfo)

Bases: ABC

Abstract class that handles authorisation requests for the CSC signing client.

Note

Implementations may wish to make use of the format_csc_auth_request() convenience method to format requests to the credentials/authorize endpoint.

Parameters:
  • csc_session_info – General information about the CSC service and the credential.

  • credential_info – Details about the credential.

async authorize_signature(hash_b64s: List[str]) CSCAuthorizationInfo

Request a SAD token from the signing service, either freshly or to extend the current transaction.

Depending on the lifecycle of this object, pre-fetched SAD values may be used. All authorization transaction management is left to implementing subclasses.

Parameters:

hash_b64s – Base64-encoded hash values about to be signed.

Returns:

Authorization data.

format_csc_auth_request(num_signatures: int = 1, pin: str | None = None, otp: str | None = None, hash_b64s: List[str] | None = None, description: str | None = None, client_data: str | None = None) dict

Format the parameters for a call to credentials/authorize.

Parameters:
  • num_signatures – The number of signatures to request authorisation for.

  • pin – The user’s PIN (if applicable).

  • otp – The current value of an OTP token, provided by the user (if applicable).

  • hash_b64s – An explicit list of base64-encoded hashes to be tied to the SAD. Optional if the service’s SCAL value is 1, i.e. when hash_pinning_required is false.

  • description – A free-form description of the authorisation request (optional).

  • client_data – Custom vendor-specific data (if applicable).

Returns:

A dict that, when encoded as a JSON object, can be used as the request body for a call to credentials/authorize.
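
As an approximation of the mapping from these parameters to a credentials/authorize request body, consider the sketch below. The JSON field names are those of version 1 of the CSC API specification; sketch_auth_request is a hypothetical helper, and omitting unset optional fields is an assumption about sensible behaviour rather than a statement about pyHanko's output.

```python
def sketch_auth_request(credential_id, num_signatures=1, pin=None, otp=None,
                        hash_b64s=None, description=None, client_data=None):
    """Build a credentials/authorize body using CSC API v1 field names."""
    req = {"credentialID": credential_id, "numSignatures": num_signatures}
    optional = {
        "PIN": pin,
        "OTP": otp,
        "hash": hash_b64s,
        "description": description,
        "clientData": client_data,
    }
    # Leave out anything the caller did not supply.
    req.update({k: v for k, v in optional.items() if v is not None})
    return req


auth_body = sketch_auth_request("cred-1", pin="1234")
```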

static parse_csc_auth_response(response_data: dict) CSCAuthorizationInfo

Parse the response from a credentials/authorize call into a CSCAuthorizationInfo object.

Parameters:

response_data – The decoded response JSON.

Returns:

A CSCAuthorizationInfo object.
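
Conceptually, parsing such a response amounts to extracting the SAD and turning its lifetime into an absolute expiry time. The sketch below assumes the response fields defined in version 1 of the CSC API specification (SAD, and an optional expiresIn in seconds that defaults to 3600); sketch_parse_auth_response is a hypothetical helper that returns a plain tuple instead of a CSCAuthorizationInfo.

```python
from datetime import datetime, timedelta, timezone


def sketch_parse_auth_response(response_data, now=None):
    """Return (sad, expires_at) from a decoded credentials/authorize reply."""
    now = now or datetime.now(timezone.utc)
    sad = response_data["SAD"]
    # expiresIn is a lifetime in seconds; the spec's default is one hour.
    lifetime = int(response_data.get("expiresIn", 3600))
    return sad, now + timedelta(seconds=lifetime)


fixed_now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
sad, expires_at = sketch_parse_auth_response(
    {"SAD": "opaque-sad-token", "expiresIn": 600}, now=fixed_now
)
```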

property auth_headers

HTTP Header(s) necessary for authentication, to be passed with every request. By default, this delegates to CSCServiceSessionInfo.auth_headers.

Returns:

A dict of headers.

class pyhanko.sign.signers.csc_signer.PrefetchedSADAuthorizationManager(csc_session_info: CSCServiceSessionInfo, credential_info: CSCCredentialInfo, csc_auth_info: CSCAuthorizationInfo)

Bases: CSCAuthorizationManager

Simplistic CSCAuthorizationManager for use with pre-fetched signature activation data.

This class is effectively only useful for CSC services that do not require SAD to be pinned to specific document hashes. It allows you to use a SAD that was fetched before starting the signing process, for a one-shot signature.

This can simplify resource management in cases where obtaining a SAD is time-consuming, but the caller still wants the conveniences of pyHanko’s high-level API without having to keep too many pyHanko objects in memory while waiting for a credentials/authorize call to go through.

Legitimate uses are limited, but the implementation is trivial, so we provide it here.

Parameters:
  • csc_session_info – General information about the CSC service and the credential.

  • credential_info – Details about the credential.

  • csc_auth_info – The pre-fetched signature activation data.

async authorize_signature(hash_b64s: List[str]) CSCAuthorizationInfo

Return the prefetched SAD, or raise an error if called twice.

Parameters:

hash_b64s – List of hashes to be signed; ignored.

Returns:

The prefetched authorisation data.
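
The one-shot behaviour described above is simple enough to model in a few lines. OneShotSAD is a toy stand-in, and RuntimeError is a placeholder for whatever error the real class raises on a second call.

```python
import asyncio


class OneShotSAD:
    """Toy model: hand out a pre-fetched SAD exactly once."""

    def __init__(self, sad):
        self._sad = sad
        self._used = False

    async def authorize_signature(self, hash_b64s):
        # hash_b64s is ignored: the SAD is not pinned to specific hashes.
        if self._used:
            raise RuntimeError("Prefetched SAD has already been used")
        self._used = True
        return self._sad


async def demo():
    manager = OneShotSAD("opaque-sad-token")
    first = await manager.authorize_signature(["aGFzaA=="])
    try:
        await manager.authorize_signature(["aGFzaA=="])
    except RuntimeError:
        second_failed = True
    else:
        second_failed = False
    return first, second_failed


first, second_failed = asyncio.run(demo())
```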

pyhanko.sign.signers.functions module

This module defines pyHanko’s high-level API entry points.

pyhanko.sign.signers.functions.sign_pdf(pdf_out: BasePdfFileWriter, signature_meta: PdfSignatureMetadata, signer: Signer, timestamper: TimeStamper | None = None, new_field_spec: SigFieldSpec | None = None, existing_fields_only=False, bytes_reserved=None, in_place=False, output=None)

Thin convenience wrapper around PdfSigner.sign_pdf().

Parameters:
  • pdf_out – An IncrementalPdfFileWriter.

  • bytes_reserved – Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

  • signature_meta – The specification of the signature to add.

  • signer – Signer object to use to produce the signature object.

  • timestamper – TimeStamper object to use to produce any time stamp tokens that might be required.

  • in_place – Sign the input in-place. If False, write output to a BytesIO object.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

  • new_field_spec – If a new field is to be created, this parameter allows the caller to specify the field’s properties in the form of a SigFieldSpec. This parameter is only meaningful if existing_fields_only is False.

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

Returns:

The output stream containing the signed output.
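
A typical call might look like the sketch below, which signs a file on disk with key material loaded into a SimpleSigner. The helper name sign_file, the file paths and the field name Signature1 are placeholders chosen for the example.

```python
def sign_file(in_path, out_path, key_path, cert_path, key_passphrase=None):
    """Sign the PDF at in_path and write the result to out_path."""
    # Imported lazily so that this sketch can be loaded even where
    # pyHanko is not installed.
    from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter
    from pyhanko.sign import signers

    signer = signers.SimpleSigner.load(
        key_path, cert_path, key_passphrase=key_passphrase
    )
    with open(in_path, "rb") as inf, open(out_path, "wb") as outf:
        writer = IncrementalPdfFileWriter(inf)
        signers.sign_pdf(
            writer,
            signers.PdfSignatureMetadata(field_name="Signature1"),
            signer=signer,
            output=outf,
        )
```

If output is omitted, the signed document is returned in a BytesIO object instead, as described under Returns.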

async pyhanko.sign.signers.functions.async_sign_pdf(pdf_out: BasePdfFileWriter, signature_meta: PdfSignatureMetadata, signer: Signer, timestamper: TimeStamper | None = None, new_field_spec: SigFieldSpec | None = None, existing_fields_only=False, bytes_reserved=None, in_place=False, output=None)

Thin convenience wrapper around PdfSigner.async_sign_pdf().

Parameters:
  • pdf_out – An IncrementalPdfFileWriter.

  • bytes_reserved – Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

  • signature_meta – The specification of the signature to add.

  • signer – Signer object to use to produce the signature object.

  • timestamper – TimeStamper object to use to produce any time stamp tokens that might be required.

  • in_place – Sign the input in-place. If False, write output to a BytesIO object.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

  • new_field_spec – If a new field is to be created, this parameter allows the caller to specify the field’s properties in the form of a SigFieldSpec. This parameter is only meaningful if existing_fields_only is False.

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

Returns:

The output stream containing the signed output.

pyhanko.sign.signers.functions.embed_payload_with_cms(pdf_writer: BasePdfFileWriter, file_spec_string: str, payload: EmbeddedFileObject, cms_obj: ContentInfo, extension='.sig', file_name: str | None = None, file_spec_kwargs=None, cms_file_spec_kwargs=None)

Embed some data as an embedded file stream into a PDF, and associate it with a CMS object.

The resulting CMS object will also be turned into an embedded file, and associated with the original payload through a related file relationship.

This can be used to bundle (non-PDF) detached signatures with PDF attachments, for example.

Added in version 0.7.0.

Parameters:
  • pdf_writer – The PDF writer to use.

  • file_spec_string – See file_spec_string in FileSpec.

  • payload – Payload object.

  • cms_obj – CMS object pertaining to the payload.

  • extension – File extension to use for the CMS attachment.

  • file_name – See file_name in FileSpec.

  • file_spec_kwargs – Extra arguments to pass to the FileSpec constructor for the main attachment specification.

  • cms_file_spec_kwargs – Extra arguments to pass to the FileSpec constructor for the CMS attachment specification.

pyhanko.sign.signers.pdf_byterange module

This module contains the low-level building blocks for dealing with bookkeeping around /ByteRange digests in PDF files.

class pyhanko.sign.signers.pdf_byterange.PreparedByteRangeDigest(document_digest: bytes, reserved_region_start: int, reserved_region_end: int)

Bases: object

Added in version 0.7.0.

Changed in version 0.14.0: Removed md_algorithm attribute since it was unused.

Bookkeeping class that contains the digest of a document that is about to be signed (or otherwise authenticated) based on said digest. It also keeps track of the region in the output stream that is omitted in the byte range.

Instances of this class can easily be serialised, which allows for interrupting the signing process partway through.

document_digest: bytes

Digest of the document, computed over the appropriate /ByteRange.

reserved_region_start: int

Start of the reserved region in the output stream that is not part of the /ByteRange.

reserved_region_end: int

End of the reserved region in the output stream that is not part of the /ByteRange.

fill_with_cms(output: IO, cms_data: bytes | ContentInfo)

Write a DER-encoded CMS object to the reserved region indicated by reserved_region_start and reserved_region_end in the output stream.

Parameters:
  • output – Output stream to use. Must be writable and seekable.

  • cms_data – CMS object to write. Can be provided as an asn1crypto.cms.ContentInfo object, or as raw DER-encoded bytes.

Returns:

A bytes object containing the contents that were written, plus any additional padding.

fill_reserved_region(output: IO, content_bytes: bytes)

Write hex-encoded contents to the reserved region indicated by reserved_region_start and reserved_region_end in the output stream.

Parameters:
  • output – Output stream to use. Must be writable and seekable.

  • content_bytes – Content bytes. These will be padded, hexadecimally encoded and written to the appropriate location in output stream.

Returns:

A bytes object containing the contents that were written, plus any additional padding.
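
The bookkeeping done by this class can be illustrated with a simplified model built on the standard library. The sketch assumes that the reserved region spans the hexadecimal string including its angle brackets and that the placeholder consists of zeroes; digest_outside_region and fill_region are hypothetical helpers, not pyHanko functions.

```python
import hashlib
from io import BytesIO


def digest_outside_region(buf, start, end, md_algorithm="sha256"):
    """Hash everything except buf[start:end] (the reserved region)."""
    md = hashlib.new(md_algorithm)
    md.update(buf[:start])
    md.update(buf[end:])
    return md.digest()


def fill_region(output, start, end, content_bytes):
    """Write hex-encoded content between '<' and '>' in the region."""
    capacity = end - start - 2              # hex digits available
    content_hex = content_bytes.hex().encode("ascii")
    if len(content_hex) > capacity:
        raise ValueError("Reserved region is too small")
    output.seek(start + 1)                  # skip the opening '<'
    output.write(content_hex)               # the remainder stays zeroes
    return content_bytes + bytes(capacity // 2 - len(content_bytes))


prefix, suffix = b"/Contents ", b" /Type /Sig"
placeholder = b"<" + b"0" * 16 + b">"
doc = BytesIO(prefix + placeholder + suffix)
start, end = len(prefix), len(prefix) + len(placeholder)

digest_before = digest_outside_region(doc.getvalue(), start, end)
written = fill_region(doc, start, end, b"\xca\xfe")
digest_after = digest_outside_region(doc.getvalue(), start, end)
```

Because the digest skips the reserved region, filling in the contents afterwards does not invalidate it.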

class pyhanko.sign.signers.pdf_byterange.PdfByteRangeDigest(data_key='/Contents', *, bytes_reserved=None)

Bases: DictionaryObject

General class to model a PDF Dictionary that has a /ByteRange entry and another data entry (named /Contents by default) that will contain a value based on a digest computed over said /ByteRange. The /ByteRange will cover the entire file, except for the value of the data entry itself.

Danger

This is internal API.

Parameters:
  • data_key – Name of the data key, which is /Contents by default.

  • bytes_reserved – Number of bytes to reserve for the contents placeholder. If None, a generous default is applied, but you should try to estimate the size as accurately as possible.

fill(writer: BasePdfFileWriter, md_algorithm, in_place=False, output=None, chunk_size=4096)

Generator coroutine that handles the document hash computation and the actual filling of the placeholder data.

Danger

This is internal API; you should use PdfSigner wherever possible. If you really need fine-grained control, use PdfCMSEmbedder instead.

class pyhanko.sign.signers.pdf_byterange.PdfSignedData(obj_type, subfilter: SigSeedSubFilter = SigSeedSubFilter.ADOBE_PKCS7_DETACHED, timestamp: datetime | None = None, bytes_reserved=None)

Bases: PdfByteRangeDigest

Generic class to model signature dictionaries in a PDF file. See also SignatureObject and DocumentTimestamp.

Parameters:
  • obj_type – The type of signature object.

  • subfilter – See SigSeedSubFilter.

  • timestamp – The timestamp to embed into the /M entry.

  • bytes_reserved

    The number of bytes to reserve for the signature. Defaults to 16 KiB.

    Warning

    Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.
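
The arithmetic behind this warning is straightforward: every byte of DER output takes two hexadecimal digits in the file. A minimal sketch, in which hex_bytes_needed and its safety margin are hypothetical and not part of pyHanko:

```python
def hex_bytes_needed(der_length, margin=0.25):
    """Placeholder size for a CMS object of der_length DER-encoded bytes."""
    # Leave some headroom for the estimate being off, then double
    # because the object is written as a hexadecimal string.
    padded = int(der_length * (1 + margin))
    return 2 * padded


print(hex_bytes_needed(4000))  # 10000
```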

class pyhanko.sign.signers.pdf_byterange.SignatureObject(timestamp: datetime | None = None, subfilter: SigSeedSubFilter = SigSeedSubFilter.ADOBE_PKCS7_DETACHED, name=None, location=None, reason=None, contact_info=None, app_build_props: BuildProps | None = None, prop_auth_time: int | None = None, prop_auth_type: SigAuthType | None = None, bytes_reserved=None)

Bases: PdfSignedData

Class modelling a (placeholder for a) regular PDF signature.

Parameters:
  • timestamp – The (optional) timestamp to embed into the /M entry.

  • subfilter – See SigSeedSubFilter.

  • bytes_reserved

    The number of bytes to reserve for the signature. Defaults to 16 KiB.

    Warning

    Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.

  • name – Signer name. You probably want to leave this blank; viewers should default to the signer’s subject name.

  • location – Optional signing location.

  • reason – Optional signing reason. May be restricted by seed values.

  • app_build_props – Optional dictionary containing information about the computer environment used for signing. See BuildProps.

  • prop_auth_time – Optional information representing the number of seconds since the signer was last authenticated.

  • prop_auth_type – Optional information about the method used to authenticate the user. See SigAuthType.

  • contact_info – Optional information from the signer to enable the receiver to contact the signer and verify the signature.

class pyhanko.sign.signers.pdf_byterange.DocumentTimestamp(bytes_reserved=None)

Bases: PdfSignedData

Class modelling (a placeholder for) a document timestamp.

Parameters:

bytes_reserved

The number of bytes to reserve for the signature. Defaults to 16 KiB.

Warning

Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.

class pyhanko.sign.signers.pdf_byterange.BuildProps(name: str, revision: str | None = None)

Bases: object

Entries in a signature build properties dictionary; see Adobe PDF Signature Build Dictionary Specification.

name: str

The application’s name.

revision: str | None = None

The application’s revision ID string.

Note

This corresponds to the REx entry in the build properties dictionary.

as_pdf_object() DictionaryObject

Render the build properties as a PDF object.

Returns:

A PDF dictionary.

pyhanko.sign.signers.pdf_cms module

This module defines utility classes to format CMS objects for use in PDF signatures.

class pyhanko.sign.signers.pdf_cms.Signer(*, prefer_pss: bool = False, embed_roots: bool = True, signature_mechanism: SignedDigestAlgorithm | None = None, signing_cert: Certificate | None = None, cert_registry: CertificateStore | None = None, attribute_certs: Iterable[AttributeCertificateV2] = ())

Bases: object

Abstract signer object that is agnostic as to where the cryptographic operations actually happen.

As of now, pyHanko provides two implementations:

  • SimpleSigner implements the easy case where all the key material can be loaded into memory.

  • PKCS11Signer implements a signer that is capable of interfacing with a PKCS#11 device.

Parameters:
  • prefer_pss – When signing using an RSA key, prefer PSS padding to legacy PKCS#1 v1.5 padding. Default is False. This option has no effect on non-RSA signatures.

  • embed_roots

    Added in version 0.9.0.

    Option that controls whether or not additional self-signed certificates should be embedded into the CMS payload. The default is True.

    Note

    The signer’s certificate is always embedded, even if it is self-signed.

    Note

    Trust roots are configured by the validator, so embedding them doesn’t affect the semantics of a typical validation process. Therefore, they can be safely omitted in most cases. Nonetheless, embedding the roots can be useful for documentation purposes. In addition, some validators are poorly implemented, and will refuse to build paths if the roots are not present in the file.

    Warning

    To be precise, if this flag is False, a certificate will be dropped if (a) it is not the signer’s, (b) it is self-issued and (c) its subject and authority key identifiers match (or either is missing). In other words, we never validate the actual self-signature. This heuristic is sufficiently accurate for most applications.

  • signature_mechanism – The (cryptographic) signature mechanism to use for all signing operations. If unset, the default behaviour is to try to impute a reasonable one given the preferred digest algorithm and public key.

  • signing_cert – See signing_cert.

  • attribute_certs – See attribute_certs.

  • cert_registry – Initial value for cert_registry. If unset, an empty certificate store will be initialised.
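
The heuristic described in the embed_roots warning above can be sketched as follows. This is an illustration under stated assumptions, not pyHanko's implementation: CertInfo and would_be_dropped are hypothetical names, and certificates are reduced to the four fields that the three conditions refer to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CertInfo:
    """Simplified stand-in for a certificate (hypothetical, not a pyHanko type)."""
    subject: str
    issuer: str
    subject_key_id: Optional[bytes] = None
    authority_key_id: Optional[bytes] = None


def would_be_dropped(cert: CertInfo, signer_cert: CertInfo) -> bool:
    """Apply conditions (a)-(c) from the warning above to one certificate."""
    if cert == signer_cert:
        return False  # (a) the signer's certificate is always kept
    if cert.subject != cert.issuer:
        return False  # (b) not self-issued
    # (c) key identifiers match, or at least one of them is missing
    if cert.subject_key_id is None or cert.authority_key_id is None:
        return True
    return cert.subject_key_id == cert.authority_key_id


signer = CertInfo("CN=Alice", "CN=Intermediate CA", b"\x01", b"\x02")
intermediate = CertInfo("CN=Intermediate CA", "CN=Root CA", b"\x02", b"\x03")
root = CertInfo("CN=Root CA", "CN=Root CA", b"\x03", b"\x03")

embedded = [c for c in (signer, intermediate, root)
            if not would_be_dropped(c, signer)]
print([c.subject for c in embedded])
```

Note that no signature is verified anywhere in this check, which matches the caveat that the actual self-signature is never validated.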

property signature_mechanism: SignedDigestAlgorithm | None

Changed in version 0.18.0: Turned into a property instead of a class attribute.

The (cryptographic) signature mechanism to use for all signing operations.

property signing_cert: Certificate | None

Changed in version 0.14.0: Made optional (see note)

Changed in version 0.18.0: Turned into a property instead of a class attribute.

The certificate that will be used to create the signature.

Note

This is an optional field only to a limited extent. Subclasses may require it to be present, and not setting it at the beginning of the signing process implies that certain high-level convenience features will no longer work or be limited in function (e.g., automatic hash selection, appearance generation, revocation information collection, …).

However, making signing_cert optional enables certain signing workflows where the certificate of the signer is not known until the signature has actually been produced. This is most relevant in certain types of remote signing scenarios.

property cert_registry: CertificateStore

Changed in version 0.18.0: Turned into a property instead of a class attribute.

Collection of certificates associated with this signer. Note that this is simply a bookkeeping tool; in particular it doesn’t care about trust.

property attribute_certs: Iterable[AttributeCertificateV2]

Changed in version 0.18.0: Turned into a property instead of a class attribute.

Attribute certificates to include with the signature.

Note

Only v2 attribute certificates are supported.

get_signature_mechanism_for_digest(digest_algorithm: str | None) SignedDigestAlgorithm

Get the signature mechanism for this signer to use. If signature_mechanism is set, it will be used. Otherwise, this method will attempt to put together a default based on the mechanism used in the signer’s certificate.

Parameters:

digest_algorithm – Digest algorithm to use as part of the signature mechanism. Only used if a signature mechanism object has to be put together on-the-fly.

Returns:

A SignedDigestAlgorithm object.

property subject_name: str | None
Returns:

The subject’s common name as a string, extracted from signing_cert, or None if no signer’s certificate is available.

static format_revinfo(ocsp_responses: list | None = None, crls: list | None = None)

Format Adobe-style revocation information for inclusion into a CMS object.

Parameters:
  • ocsp_responses – A list of OCSP responses to include.

  • crls – A list of CRLs to include.

signer_info(digest_algorithm: str, signed_attrs, signature)

Format the SignerInfo entry for a CMS signature.

Parameters:
  • digest_algorithm – Digest algorithm to use.

  • signed_attrs – Signed attributes (see signed_attrs()).

  • signature – The raw signature to embed (see sign_raw()).

Returns:

An asn1crypto.cms.SignerInfo object.

async async_sign_raw(data: bytes, digest_algorithm: str, dry_run=False) bytes

Compute the raw cryptographic signature of the data provided, hashed using the digest algorithm provided.

Parameters:
  • data – Data to sign.

  • digest_algorithm

    Digest algorithm to use.

    Warning

    If signature_mechanism also specifies a digest, they should match.

  • dry_run – Do not actually create a signature, but merely output placeholder bytes that would suffice to contain an actual signature.

Returns:

Signature bytes.
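
As a rough illustration of the dry_run contract described above, the following sketch mimics the shape of async_sign_raw() for a remote signing service. RemoteRawSignerSketch and fake_remote are hypothetical stand-ins; a real implementation would subclass Signer and call an actual signing backend.

```python
import asyncio
from typing import Awaitable, Callable


class RemoteRawSignerSketch:
    """Stand-in that mirrors the async_sign_raw() contract described above.

    Hypothetical class: a real implementation would subclass pyHanko's Signer.
    """

    def __init__(self, remote_call: Callable[[bytes, str], Awaitable[bytes]],
                 signature_size: int):
        self.remote_call = remote_call        # assumed remote signing coroutine
        self.signature_size = signature_size  # e.g. 256 for a 2048-bit RSA key

    async def async_sign_raw(self, data: bytes, digest_algorithm: str,
                             dry_run: bool = False) -> bytes:
        if dry_run:
            # Placeholder bytes large enough to hold a real signature.
            return bytes(self.signature_size)
        return await self.remote_call(data, digest_algorithm)


async def fake_remote(data: bytes, digest_algorithm: str) -> bytes:
    # Not a real signature; just fixed-size filler for the demonstration.
    return b"\xab" * 256


signer = RemoteRawSignerSketch(fake_remote, signature_size=256)
placeholder = asyncio.run(signer.async_sign_raw(b"attrs", "sha256", dry_run=True))
actual = asyncio.run(signer.async_sign_raw(b"attrs", "sha256"))
print(len(placeholder), len(actual))
```

The placeholder only needs to be at least as large as a real signature, which is why the dry run never contacts the (assumed) remote service.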

async unsigned_attrs(digest_algorithm: str, signature: bytes, signed_attrs: CMSAttributes, timestamper=None, dry_run=False) CMSAttributes | None

Changed in version 0.9.0: Made asynchronous _(breaking change)_

Changed in version 0.14.0: Added signed_attrs parameter _(breaking change)_

Compute the unsigned attributes to embed into the CMS object. This function is called after signing the hash of the signed attributes (see signed_attrs()).

By default, this method only handles timestamp requests, but other functionality may be added by subclasses.

If this method returns None, no unsigned attributes will be embedded.

Parameters:
  • digest_algorithm – Digest algorithm used to hash the signed attributes.

  • signed_attrs – Signed attributes of the signature.

  • signature – Signature of the signed attribute hash.

  • timestamper – Timestamp supplier to use.

  • dry_run – Flag indicating “dry run” mode. If True, only the approximate size of the output matters, so cryptographic operations can be replaced by placeholders.

Returns:

The unsigned attributes to add, or None.

async signed_attrs(data_digest: bytes, digest_algorithm: str, attr_settings: PdfCMSSignedAttributes | None = None, content_type='data', use_pades=False, timestamper=None, dry_run=False, is_pdf_sig=True)

Changed in version 0.4.0: Added positional digest_algorithm parameter _(breaking change)_.

Changed in version 0.5.0: Added dry_run, timestamper and cades_meta parameters.

Changed in version 0.9.0: Made asynchronous, grouped some parameters under attr_settings _(breaking change)_

Format the signed attributes for a CMS signature.

Parameters:
  • data_digest – Raw digest of the data to be signed.

  • digest_algorithm

    Added in version 0.4.0.

    Name of the digest algorithm used to compute the digest.

  • use_pades – Respect PAdES requirements.

  • dry_run

    Added in version 0.5.0.

    Flag indicating “dry run” mode. If True, only the approximate size of the output matters, so cryptographic operations can be replaced by placeholders.

  • attr_settings – PdfCMSSignedAttributes object describing the attributes to be added.

  • timestamper

    Added in version 0.5.0.

    Timestamper to use when creating timestamp tokens.

  • content_type

    CMS content type of the encapsulated data. Default is data.

    Danger

    This parameter is internal API, and non-default values must not be used to produce PDF signatures.

  • is_pdf_sig

    Whether the signature being generated is for use in a PDF document.

    Danger

    This parameter is internal API.

Returns:

An asn1crypto.cms.CMSAttributes object.

async async_sign(data_digest: bytes, digest_algorithm: str, dry_run=False, use_pades=False, timestamper=None, signed_attr_settings: PdfCMSSignedAttributes | None = None, is_pdf_sig=True, encap_content_info=None) ContentInfo

Added in version 0.9.0.

Produce a detached CMS signature from a raw data digest.

Parameters:
  • data_digest – Digest of the actual content being signed.

  • digest_algorithm – Digest algorithm to use. This should be the same digest method as the one used to hash the (external) content.

  • dry_run

    If True, the actual signing step will be replaced with a placeholder.

    In a PDF signing context, this is necessary to estimate the size of the signature container before computing the actual digest of the document.

  • signed_attr_settings – PdfCMSSignedAttributes object describing the attributes to be added.

  • use_pades – Respect PAdES requirements.

  • timestamper

    TimeStamper used to obtain a trusted timestamp token that can be embedded into the signature container.

    Note

    If dry_run is true, the timestamper’s dummy_response() method will be called to obtain a placeholder token. Note that with a standard HTTPTimeStamper, this might still hit the timestamping server (in order to produce a realistic size estimate), but the dummy response will be cached.

  • is_pdf_sig

    Whether the signature being generated is for use in a PDF document.

    Danger

    This parameter is internal API.

  • encap_content_info

    Data to encapsulate in the CMS object.

    Danger

    This parameter is internal API, and must not be used to produce PDF signatures.

Returns:

A ContentInfo object.

async async_sign_prescribed_attributes(digest_algorithm: str, signed_attrs: CMSAttributes, cms_version='v1', dry_run=False, timestamper=None, encap_content_info=None) ContentInfo

Added in version 0.9.0.

Start the CMS signing process with the prescribed set of signed attributes.

Parameters:
  • digest_algorithm – Digest algorithm to use. This should be the same digest method as the one used to hash the (external) content.

  • signed_attrs – CMS attributes to sign.

  • dry_run

    If True, the actual signing step will be replaced with a placeholder.

    In a PDF signing context, this is necessary to estimate the size of the signature container before computing the actual digest of the document.

  • timestamper

    TimeStamper used to obtain a trusted timestamp token that can be embedded into the signature container.

    Note

    If dry_run is true, the timestamper’s dummy_response() method will be called to obtain a placeholder token. Note that with a standard HTTPTimeStamper, this might still hit the timestamping server (in order to produce a realistic size estimate), but the dummy response will be cached.

  • cms_version – CMS version to use.

  • encap_content_info

    Data to encapsulate in the CMS object.

    Danger

    This parameter is internal API, and must not be used to produce PDF signatures.

Returns:

A ContentInfo object.

async async_sign_general_data(input_data: IO | bytes | ContentInfo | EncapsulatedContentInfo, digest_algorithm: str, detached=True, use_cades=False, timestamper=None, chunk_size=4096, signed_attr_settings: PdfCMSSignedAttributes | None = None, max_read=None) ContentInfo

Added in version 0.9.0.

Produce a CMS signature for an arbitrary data stream (not necessarily PDF data).

Parameters:
  • input_data

    The input data to sign. This can be a bytes object, a file-like object, a cms.ContentInfo object or a cms.EncapsulatedContentInfo object.

    Warning

    asn1crypto mandates cms.ContentInfo for CMS v1 signatures. In practical terms, this means that you need to use cms.ContentInfo if the content type is data, and cms.EncapsulatedContentInfo otherwise.

    Warning

    We currently only support CMS v1, v3 and v4 signatures. This is only a concern if you need certificates or CRLs of type ‘other’, in which case you can change the version yourself (this will not invalidate any signatures). You’ll also need to do this if you need support for version 1 attribute certificates, or if you want to sign with subjectKeyIdentifier in the sid field.

  • digest_algorithm – The name of the digest algorithm to use.

  • detached – If True, create a CMS detached signature (i.e. an object where the encapsulated content is not embedded in the signature object itself). This is the default. If False, the content to be signed will be embedded as encapsulated content.

  • signed_attr_settings – PdfCMSSignedAttributes object describing the attributes to be added.

  • use_cades – Construct a CAdES-style CMS object.

  • timestamper

    PdfTimeStamper to use to create a signature timestamp.

    Note

    If you want to create a content timestamp (as opposed to a signature timestamp), see CAdESSignedAttrSpec.

  • chunk_size – Chunk size to use when consuming input data.

  • max_read – Maximal number of bytes to read from the input stream.

Returns:

A CMS ContentInfo object of type signedData.
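
To make the roles of chunk_size and max_read more concrete, here is a minimal sketch of hashing a stream in chunks while capping the number of bytes read. The function digest_stream is a hypothetical helper built on hashlib; it is not the routine pyHanko uses internally, and the way the two parameters interact here is an assumption.

```python
import hashlib
import io
from typing import IO, Optional


def digest_stream(stream: IO[bytes], digest_algorithm: str,
                  chunk_size: int = 4096,
                  max_read: Optional[int] = None) -> bytes:
    """Hash a binary stream chunk by chunk, reading at most max_read bytes."""
    md = hashlib.new(digest_algorithm)
    remaining = max_read
    while remaining is None or remaining > 0:
        to_read = chunk_size if remaining is None else min(chunk_size, remaining)
        chunk = stream.read(to_read)
        if not chunk:
            break  # end of stream
        md.update(chunk)
        if remaining is not None:
            remaining -= len(chunk)
    return md.digest()


payload = b"x" * 10_000
full = digest_stream(io.BytesIO(payload), "sha256")
partial = digest_stream(io.BytesIO(payload), "sha256", max_read=5000)
print(full.hex())
```

Reading in fixed-size chunks keeps memory use bounded regardless of how large the input stream is.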

sign(data_digest: bytes, digest_algorithm: str, timestamp: datetime | None = None, dry_run=False, revocation_info=None, use_pades=False, timestamper=None, cades_signed_attr_meta: CAdESSignedAttrSpec | None = None, encap_content_info=None) ContentInfo

Deprecated since version 0.9.0: Use async_sign() instead. The implementation of this method will invoke async_sign() using asyncio.run().

Produce a detached CMS signature from a raw data digest.

Parameters:
  • data_digest – Digest of the actual content being signed.

  • digest_algorithm – Digest algorithm to use. This should be the same digest method as the one used to hash the (external) content.

  • timestamp

    Signing time to embed into the signed attributes (will be ignored if use_pades is True).

    Note

    This timestamp value is to be interpreted as an unfounded assertion by the signer, which may or may not be good enough for your purposes.

  • dry_run

    If True, the actual signing step will be replaced with a placeholder.

    In a PDF signing context, this is necessary to estimate the size of the signature container before computing the actual digest of the document.

  • revocation_info – Revocation information to embed; this should be the output of a call to Signer.format_revinfo() (ignored when use_pades is True).

  • use_pades – Respect PAdES requirements.

  • timestamper

    TimeStamper used to obtain a trusted timestamp token that can be embedded into the signature container.

    Note

    If dry_run is true, the timestamper’s dummy_response() method will be called to obtain a placeholder token. Note that with a standard HTTPTimeStamper, this might still hit the timestamping server (in order to produce a realistic size estimate), but the dummy response will be cached.

  • cades_signed_attr_meta

    Added in version 0.5.0.

    Specification for CAdES-specific signed attributes.

  • encap_content_info

    Data to encapsulate in the CMS object.

    Danger

    This parameter is internal API, and must not be used to produce PDF signatures.

Returns:

A ContentInfo object.
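
The deprecation note for sign() says that it delegates to async_sign() through asyncio.run(). The general pattern looks roughly like the sketch below; SignerSketch is a hypothetical stand-in with a simplified signature and return type, not the actual Signer class.

```python
import asyncio


class SignerSketch:
    """Hypothetical stand-in showing the sync-over-async wrapper pattern."""

    async def async_sign(self, data_digest: bytes, digest_algorithm: str) -> str:
        await asyncio.sleep(0)  # placeholder for real asynchronous work
        return f"{digest_algorithm}:{data_digest.hex()}"

    def sign(self, data_digest: bytes, digest_algorithm: str) -> str:
        # Deprecated-style wrapper: starts an event loop for a single call.
        return asyncio.run(self.async_sign(data_digest, digest_algorithm))


result = SignerSketch().sign(b"\x01\x02", "sha256")
print(result)
```

Because asyncio.run() raises RuntimeError when an event loop is already running in the current thread, code that is itself asynchronous should await the async_ variant directly rather than call a wrapper of this kind.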

sign_prescribed_attributes(digest_algorithm: str, signed_attrs: CMSAttributes, cms_version='v1', dry_run=False, timestamper=None, encap_content_info=None) ContentInfo

Deprecated since version 0.9.0: Use async_sign_prescribed_attributes() instead. The implementation of this method will invoke async_sign_prescribed_attributes() using asyncio.run().

Start the CMS signing process with the prescribed set of signed attributes.

Parameters:
  • digest_algorithm – Digest algorithm to use. This should be the same digest method as the one used to hash the (external) content.

  • signed_attrs – CMS attributes to sign.

  • dry_run

    If True, the actual signing step will be replaced with a placeholder.

    In a PDF signing context, this is necessary to estimate the size of the signature container before computing the actual digest of the document.

  • timestamper

    TimeStamper used to obtain a trusted timestamp token that can be embedded into the signature container.

    Note

    If dry_run is true, the timestamper’s dummy_response() method will be called to obtain a placeholder token. Note that with a standard HTTPTimeStamper, this might still hit the timestamping server (in order to produce a realistic size estimate), but the dummy response will be cached.

  • cms_version – CMS version to use.

  • encap_content_info

    Data to encapsulate in the CMS object.

    Danger

    This parameter is internal API, and must not be used to produce PDF signatures.

Returns:

A ContentInfo object.

sign_general_data(input_data: IO | bytes | ContentInfo | EncapsulatedContentInfo, digest_algorithm: str, detached=True, timestamp: datetime | None = None, use_cades=False, timestamper=None, cades_signed_attr_meta: CAdESSignedAttrSpec | None = None, chunk_size=4096, max_read=None) ContentInfo

Added in version 0.7.0.

Deprecated since version 0.9.0: Use async_sign_general_data() instead. The implementation of this method will invoke async_sign_general_data() using asyncio.run().

Produce a CMS signature for an arbitrary data stream (not necessarily PDF data).

Parameters:
  • input_data

    The input data to sign. This can be a bytes object, a file-like object, a cms.ContentInfo object or a cms.EncapsulatedContentInfo object.

    Warning

    asn1crypto mandates cms.ContentInfo for CMS v1 signatures. In practical terms, this means that you need to use cms.ContentInfo if the content type is data, and cms.EncapsulatedContentInfo otherwise.

    Warning

    We currently only support CMS v1, v3 and v4 signatures. This is only a concern if you need certificates or CRLs of type ‘other’, in which case you can change the version yourself (this will not invalidate any signatures). You’ll also need to do this if you need support for version 1 attribute certificates, or if you want to sign with subjectKeyIdentifier in the sid field.

  • digest_algorithm – The name of the digest algorithm to use.

  • detached – If True, create a CMS detached signature (i.e. an object where the encapsulated content is not embedded in the signature object itself). This is the default. If False, the content to be signed will be embedded as encapsulated content.

  • timestamp

    Signing time to embed into the signed attributes (will be ignored if use_cades is True).

    Note

    This timestamp value is to be interpreted as an unfounded assertion by the signer, which may or may not be good enough for your purposes.

  • use_cades – Construct a CAdES-style CMS object.

  • timestamper

    PdfTimeStamper to use to create a signature timestamp.

    Note

    If you want to create a content timestamp (as opposed to a signature timestamp), see CAdESSignedAttrSpec.

  • cades_signed_attr_meta – Specification for CAdES-specific signed attributes.

  • chunk_size – Chunk size to use when consuming input data.

  • max_read – Maximal number of bytes to read from the input stream.

Returns:

A CMS ContentInfo object of type signedData.

class pyhanko.sign.signers.pdf_cms.SimpleSigner(signing_cert: Certificate, signing_key: PrivateKeyInfo, cert_registry: CertificateStore, signature_mechanism: SignedDigestAlgorithm | None = None, prefer_pss: bool = False, embed_roots: bool = True, attribute_certs: Iterable[AttributeCertificateV2] | None = None)

Bases: Signer

Simple signer implementation where the key material is available in local memory.

signing_key: PrivateKeyInfo

Private key associated with the certificate in signing_cert.

async async_sign_raw(data: bytes, digest_algorithm: str, dry_run=False) bytes

Compute the raw cryptographic signature of the data provided, hashed using the digest algorithm provided.

Parameters:
  • data – Data to sign.

  • digest_algorithm

    Digest algorithm to use.

    Warning

    If signature_mechanism also specifies a digest, they should match.

  • dry_run – Do not actually create a signature, but merely output placeholder bytes that would suffice to contain an actual signature.

Returns:

Signature bytes.

sign_raw(data: bytes, digest_algorithm: str) bytes

Synchronous raw signature implementation.

Parameters:
  • data – Data to be signed.

  • digest_algorithm – Digest algorithm to use.

Returns:

Raw signature encoded according to the conventions of the signing algorithm used.

classmethod load_pkcs12(pfx_file, ca_chain_files=None, other_certs=None, passphrase=None, signature_mechanism=None, prefer_pss=False)

Load certificates and key material from a PKCS#12 archive (usually .pfx or .p12 files).

Parameters:
  • pfx_file – Path to the PKCS#12 archive.

  • ca_chain_files – Path to (PEM/DER) files containing other relevant certificates not included in the PKCS#12 file.

  • other_certs – Other relevant certificates, specified as a list of asn1crypto.x509.Certificate objects.

  • passphrase – Passphrase to decrypt the PKCS#12 archive, if required.

  • signature_mechanism – Override the signature mechanism to use.

  • prefer_pss – Prefer PSS signature mechanism over RSA PKCS#1 v1.5 if there’s a choice.

Returns:

A SimpleSigner object initialised with key material loaded from the PKCS#12 file provided.

classmethod load(key_file, cert_file, ca_chain_files=None, key_passphrase=None, other_certs=None, signature_mechanism=None, prefer_pss=False)

Load certificates and key material from PEM/DER files.

Parameters:
  • key_file – File containing the signer’s private key.

  • cert_file – File containing the signer’s certificate.

  • ca_chain_files – File containing other relevant certificates.

  • key_passphrase – Passphrase to decrypt the private key (if required).

  • other_certs – Other relevant certificates, specified as a list of asn1crypto.x509.Certificate objects.

  • signature_mechanism – Override the signature mechanism to use.

  • prefer_pss – Prefer PSS signature mechanism over RSA PKCS#1 v1.5 if there’s a choice.

Returns:

A SimpleSigner object initialised with key material loaded from the files provided.
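
A call to load() might look like the sketch below, which uses only the parameters documented above. The wrapper load_pem_signer and its argument names are hypothetical, and every path is a placeholder to be replaced with real files.

```python
def load_pem_signer(key_path: str, cert_path: str, chain_path: str,
                    passphrase: bytes):
    """Build a SimpleSigner from PEM/DER files (all paths are placeholders)."""
    # Imported lazily so this module can be loaded without pyHanko installed.
    from pyhanko.sign import signers

    return signers.SimpleSigner.load(
        key_path,
        cert_path,
        ca_chain_files=(chain_path,),
        key_passphrase=passphrase,
    )


# Example call (requires pyHanko and actual key/certificate files):
# signer = load_pem_signer("key.pem", "cert.pem", "chain.pem", b"secret")
```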

class pyhanko.sign.signers.pdf_cms.ExternalSigner(signing_cert: Certificate | None, cert_registry: CertificateStore | None, signature_value: bytes | int | None = None, signature_mechanism: SignedDigestAlgorithm | None = None, prefer_pss: bool = False, embed_roots: bool = True)

Bases: Signer

Class to help format CMS objects for use with remote signing. It embeds a fixed signature value into the CMS, set at initialisation.

Intended for use with Interrupted signing.

Parameters:
  • signing_cert – The signer’s certificate.

  • cert_registry – The certificate registry to use in CMS generation.

  • signature_value – The value of the signature as a byte string, a placeholder length, or None.

  • signature_mechanism – The signature mechanism used by the external signing service.

  • prefer_pss – Switch to prefer PSS when producing RSA signatures, as opposed to RSA with PKCS#1 v1.5 padding.

  • embed_roots – Whether to embed relevant root certificates into the CMS payload.

async async_sign_raw(data: bytes, digest_algorithm: str, dry_run=False) bytes

Return a fixed signature value.
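
For size estimation in an interrupted signing workflow, an ExternalSigner can be set up with placeholder bytes in place of a real signature. The sketch below assumes a 2048-bit RSA key (whose signatures are 256 bytes long); make_placeholder_signer is a hypothetical helper name.

```python
def make_placeholder_signer(signature_size: int = 256):
    """Return an ExternalSigner that emits a fixed all-zero placeholder."""
    # Imported lazily so this module can be loaded without pyHanko installed.
    from pyhanko.sign import signers

    return signers.ExternalSigner(
        signing_cert=None,
        cert_registry=None,
        # Placeholder only; the real signature value is supplied later.
        signature_value=bytes(signature_size),
    )
```

Leaving signing_cert unset here reflects the case where the signer's certificate is not known until the remote service has produced the signature.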

class pyhanko.sign.signers.pdf_cms.PdfCMSSignedAttributes(signing_time: datetime | None = None, cades_signed_attrs: CAdESSignedAttrSpec | None = None, adobe_revinfo_attr: RevocationInfoArchival | None = None)

Bases: CMSSignedAttributes

Added in version 0.7.0.

Changed in version 0.14.0: Split off some fields into CMSSignedAttributes.

Serialisable container class describing input for various signed attributes in a CMS object for a PDF signature.

adobe_revinfo_attr: RevocationInfoArchival | None = None

Adobe-style signed revocation info attribute.

async pyhanko.sign.signers.pdf_cms.format_attributes(attr_provs: List[CMSAttributeProvider], other_attrs: Iterable[CMSAttributes] = (), dry_run: bool = False) CMSAttributes

Format CMS attributes obtained from attribute providers.

Parameters:
  • attr_provs – List of attribute providers.

  • other_attrs – Other (predetermined) attributes to include.

  • dry_run – Whether to invoke the attribute providers in dry-run mode or not.

Returns:

A cms.CMSAttributes value.

async pyhanko.sign.signers.pdf_cms.format_signed_attributes(data_digest: bytes, attr_provs: List[CMSAttributeProvider], content_type='data', dry_run=False) CMSAttributes

Format signed attributes for a CMS SignerInfo value.

Parameters:
  • data_digest – The byte string to put in the messageDigest attribute.

  • attr_provs – List of attribute providers to source attributes from.

  • content_type – The content type of the data being signed (default is data).

  • dry_run – Whether to invoke the attribute providers in dry-run mode or not.

Returns:

A cms.CMSAttributes value representing the signed attributes.

pyhanko.sign.signers.pdf_cms.asyncify_signer(signer_cls)

Decorator to turn a legacy Signer subclass into one that works with the new async API.
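
The general idea behind such a decorator can be sketched as follows. This illustrates the pattern only: asyncify_sketch and LegacySigner are hypothetical names, and the actual asyncify_signer may differ in its details.

```python
import asyncio


def asyncify_sketch(signer_cls):
    """Class decorator adding async_sign_raw() on top of a sync sign_raw().

    Illustration only; pyHanko's asyncify_signer may differ in its details.
    """
    sync_sign_raw = signer_cls.sign_raw

    async def async_sign_raw(self, data: bytes, digest_algorithm: str,
                             dry_run: bool = False) -> bytes:
        # Delegate to the legacy synchronous method.
        return sync_sign_raw(self, data, digest_algorithm, dry_run)

    signer_cls.async_sign_raw = async_sign_raw
    return signer_cls


@asyncify_sketch
class LegacySigner:
    """Hypothetical legacy-style class exposing only a synchronous method."""

    def sign_raw(self, data: bytes, digest_algorithm: str,
                 dry_run: bool = False) -> bytes:
        return bytes(64) if dry_run else b"\x01" * 64


sig = asyncio.run(LegacySigner().async_sign_raw(b"data", "sha256"))
print(len(sig))
```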

pyhanko.sign.signers.pdf_cms.select_suitable_signing_md(key: PublicKeyInfo) str

Choose a reasonable default signing message digest given the properties of (the public part of) a key.

The fallback value is constants.DEFAULT_MD.

Parameters:

key – A keys.PublicKeyInfo object.

Returns:

The name of a message digest algorithm.

pyhanko.sign.signers.pdf_cms.signer_from_p12_config(config: PKCS12SignatureConfig, provided_pfx_passphrase: bytes | None = None)

pyhanko.sign.signers.pdf_cms.signer_from_pemder_config(config: PemDerSignatureConfig, provided_key_passphrase: bytes | None = None)

pyhanko.sign.signers.pdf_signer module

This module implements support for PDF-specific signing functionality.

class pyhanko.sign.signers.pdf_signer.PdfSignatureMetadata(field_name: str | None = None, md_algorithm: str | None = None, location: str | None = None, reason: str | None = None, contact_info: str | None = None, name: str | None = None, app_build_props: BuildProps | None = None, prop_auth_time: int | None = None, prop_auth_type: SigAuthType | None = None, certify: bool = False, subfilter: SigSeedSubFilter | None = None, embed_validation_info: bool = False, use_pades_lta: bool = False, timestamp_field_name: str | None = None, validation_context: ValidationContext | None = None, docmdp_permissions: MDPPerm = MDPPerm.FILL_FORMS, signer_key_usage: Set[str] = <factory>, cades_signed_attr_spec: CAdESSignedAttrSpec | None = None, dss_settings: DSSContentSettings = DSSContentSettings(include_vri=True, skip_if_unneeded=True, placement=<SigDSSPlacementPreference.TOGETHER_WITH_NEXT_TS: 3>, next_ts_settings=None), tight_size_estimates: bool = False, ac_validation_context: ValidationContext | None = None)

Bases: object

Specification for a PDF signature.

field_name: str | None = None

The name of the form field to contain the signature. If there is only one available signature field, the name may be inferred.

md_algorithm: str | None = None

The name of the digest algorithm to use. It should be supported by pyca/cryptography.

If None, select_suitable_signing_md() will be invoked to generate a suitable default, unless a seed value dictionary happens to be available.

location: str | None = None

Location of signing.

reason: str | None = None

Reason for signing (textual).

contact_info: str | None = None

Information provided by the signer to enable the receiver to contact the signer to verify the signature.

name: str | None = None

Name of the signer. This value is usually not necessary to set, since it should appear on the signer’s certificate, but there are cases where it might be useful to specify it here (e.g. in situations where signing is delegated to a trusted third party).

app_build_props: BuildProps | None = None

Properties of the application that created the signature.

If specified, this data will be recorded in the Prop_Build dictionary of the signature.

prop_auth_time: int | None = None

Number of seconds since the signer was last authenticated.

prop_auth_type: SigAuthType | None = None

Signature /Prop_AuthType to use.

This should be one of PIN, PASSWORD or FINGERPRINT. If not specified, this property won’t be set on the signature dictionary.

certify: bool = False

Sign with an author (certification) signature, as opposed to an approval signature. A document can contain at most one such signature, and it must be the first one.

subfilter: SigSeedSubFilter | None = None

Signature subfilter to use.

This should be one of ADOBE_PKCS7_DETACHED or PADES. If not specified, the value may be inferred from the signature field’s seed value dictionary. Failing that, ADOBE_PKCS7_DETACHED is used as the default value.

embed_validation_info: bool = False

Flag indicating whether validation info (OCSP responses and/or CRLs) should be embedded or not. This is necessary to be able to validate signatures long after they have been made. This flag requires validation_context to be set.

The precise manner in which the validation info is embedded depends on the (effective) value of subfilter:

  • With ADOBE_PKCS7_DETACHED, the validation information will be embedded inside the CMS object containing the signature.

  • With PADES, the validation information will be embedded into the document security store (DSS).

use_pades_lta: bool = False

If True, the signer will append an additional document timestamp after writing the signature’s validation information to the document security store (DSS). This flag is only meaningful if subfilter is PADES.

The PAdES B-LTA profile solves the long-term validation problem by adding a timestamp chain to the document after the regular signatures, which is updated with new timestamps at regular intervals. This provides an audit trail that ensures the long-term integrity of the validation information in the DSS, since OCSP responses and CRLs also have a finite lifetime.

See also PdfTimeStamper.update_archival_timestamp_chain().

timestamp_field_name: str | None = None

Name of the timestamp field created when use_pades_lta is True. If not specified, a unique name will be generated using uuid.

validation_context: ValidationContext | None = None

The validation context to use when validating signatures. If provided, the signer’s certificate and any timestamp certificates will be validated before signing.

This parameter is mandatory when embed_validation_info is True.

docmdp_permissions: MDPPerm = 2

Indicates the document modification policy that will be in force after this signature is created. Only relevant for certification signatures or signatures that apply locking.

Warning

For non-certification signatures, this is only explicitly allowed since PDF 2.0 (ISO 32000-2), so older software may not respect this setting on approval signatures.

signer_key_usage: Set[str]

Key usage extensions required for the signer’s certificate. Defaults to non_repudiation only, but sometimes digital_signature or a combination of both may be more appropriate. See x509.KeyUsage for a complete list.

Only relevant if a validation context is also provided.

cades_signed_attr_spec: CAdESSignedAttrSpec | None = None

Added in version 0.5.0.

Specification for CAdES-specific attributes.

dss_settings: DSSContentSettings = DSSContentSettings(include_vri=True, skip_if_unneeded=True, placement=<SigDSSPlacementPreference.TOGETHER_WITH_NEXT_TS: 3>, next_ts_settings=None)

Added in version 0.8.0.

DSS output settings. See DSSContentSettings.

tight_size_estimates: bool = False

Added in version 0.8.0.

When estimating the size of a signature container, do not add safety margins.

Note

This should be OK if the entire CMS object is produced by pyHanko, and the signing scheme produces signatures of a fixed size. However, if the signature container includes unsigned attributes such as signature timestamps, the size of the signature is never entirely predictable.

ac_validation_context: ValidationContext | None = None

Added in version 0.11.0.

Validation context for attribute certificates.

class pyhanko.sign.signers.pdf_signer.DSSContentSettings(include_vri: bool = True, skip_if_unneeded: bool = True, placement: SigDSSPlacementPreference = SigDSSPlacementPreference.TOGETHER_WITH_NEXT_TS, next_ts_settings: TimestampDSSContentSettings | None = None)

Bases: GeneralDSSContentSettings

Added in version 0.8.0.

Settings for a DSS update with validation information for a signature.

placement: SigDSSPlacementPreference = 3

Preference for where to perform a DSS update with validation information for a specific signature. See SigDSSPlacementPreference.

The default is SigDSSPlacementPreference.TOGETHER_WITH_NEXT_TS.

next_ts_settings: TimestampDSSContentSettings | None = None

Explicit settings for DSS updates pertaining to a document timestamp added as part of the same signing workflow, if applicable.

If None, a default will be generated based on the values of this settings object.

Note

When consuming DSSContentSettings objects, you should call get_settings_for_ts() instead of relying on the value of this field.

get_settings_for_ts() → TimestampDSSContentSettings

Retrieve DSS update settings for document timestamps that are part of our signing workflow, if there are any.

assert_viable()

Check settings for consistency, and raise SigningError otherwise.
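A minimal sketch of how these settings might be put together and consumed, assuming the classes are importable from pyhanko.sign.signers.pdf_signer as documented here. The helper name single_revision_dss_settings is made up for the example.

```python
def single_revision_dss_settings():
    """Hypothetical helper: DSS settings for updating the DSS with the signature."""
    # pyHanko imports are kept local so the sketch has no import-time dependencies.
    from pyhanko.sign.signers.pdf_signer import (
        DSSContentSettings,
        SigDSSPlacementPreference,
    )

    settings = DSSContentSettings(
        # TOGETHER_WITH_SIGNATURE is documented as requiring include_vri=False.
        include_vri=False,
        placement=SigDSSPlacementPreference.TOGETHER_WITH_SIGNATURE,
    )
    # Raises SigningError if the combination of settings is inconsistent.
    settings.assert_viable()
    # Ask for the timestamp settings instead of reading next_ts_settings directly.
    ts_settings = settings.get_settings_for_ts()
    return settings, ts_settings
```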

class pyhanko.sign.signers.pdf_signer.TimestampDSSContentSettings(include_vri: bool = True, skip_if_unneeded: bool = True, update_before_ts: bool = False)

Bases: GeneralDSSContentSettings

Added in version 0.8.0.

Settings for a DSS update with validation information for a document timestamp.

Note

In most workflows, adding a document timestamp doesn’t trigger any DSS updates beyond VRI additions, because the same TSA is used for signature timestamps and for document timestamps.

update_before_ts: bool = False

Perform DSS update before creating the timestamp, instead of after.

Warning

This setting can only be used if include_vri is False.

assert_viable()

Check settings for consistency, and raise SigningError otherwise.

class pyhanko.sign.signers.pdf_signer.GeneralDSSContentSettings(include_vri: bool = True, skip_if_unneeded: bool = True)

Bases: object

Added in version 0.8.0.

Settings that govern DSS creation and updating in general.

include_vri: bool = True

Flag to control whether to create and update entries in the VRI dictionary. The default is to always update the VRI dictionary.

Note

The VRI dictionary is a relic of the past that is effectively deprecated in the current PAdES standards, and most modern validators don’t rely on it being there.

That said, there’s no real harm in creating these entries, other than that it occasionally forces DSS updates where none would otherwise be necessary, and that it prevents the DSS from being updated prior to signing (as opposed to after signing).

skip_if_unneeded: bool = True

Do not perform a write if updating the DSS would not add any new information.

Note

This setting is only used if the DSS update would happen in its own revision.

class pyhanko.sign.signers.pdf_signer.SigDSSPlacementPreference(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Added in version 0.8.0.

Preference for where to perform a DSS update with validation information for a specific signature.

TOGETHER_WITH_SIGNATURE = 1

Update the DSS in the revision that contains the signature. Doing so can be useful to create a PAdES-B-LT signature in a single revision. Such signatures can be processed by a validator that isn’t capable of incremental update analysis.

Warning

This setting can only be used if include_vri is False.

SEPARATE_REVISION = 2

Always perform the DSS update in a separate revision, after the signature, but before any timestamps are added.

Note

This is the old default behaviour.

TOGETHER_WITH_NEXT_TS = 3

If the signing workflow includes a document timestamp after the signature, update the DSS in the same revision as the timestamp. In the absence of document timestamps, this is equivalent to SEPARATE_REVISION.

Warning

This option controls the addition of validation info for the signature and its associated signature timestamp, not the validation info for the document timestamp itself. See DSSContentSettings.next_ts_settings.

In most practical situations, the distinction is only relevant in interrupted signing workflows (see Interrupted signing), where the lifecycle of the validation context is out of pyHanko’s hands.
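The placement rules above can be summarised in a small stand-alone model. This is not pyHanko code: the Placement enum merely mirrors the member values of SigDSSPlacementPreference, and dss_update_revision is a hypothetical function that restates the documented behaviour.

```python
from enum import Enum


class Placement(Enum):
    # Stand-in for SigDSSPlacementPreference, using the same member values.
    TOGETHER_WITH_SIGNATURE = 1
    SEPARATE_REVISION = 2
    TOGETHER_WITH_NEXT_TS = 3


def dss_update_revision(placement, has_document_timestamp, include_vri):
    """Return which revision receives the signature's validation info."""
    if placement is Placement.TOGETHER_WITH_SIGNATURE:
        # Documented restriction: only usable when VRI entries are disabled.
        if include_vri:
            raise ValueError("TOGETHER_WITH_SIGNATURE requires include_vri=False")
        return "signature revision"
    if placement is Placement.TOGETHER_WITH_NEXT_TS and has_document_timestamp:
        return "timestamp revision"
    # SEPARATE_REVISION, or TOGETHER_WITH_NEXT_TS without a document timestamp.
    return "separate revision"
```

In this model, the default preference only differs from SEPARATE_REVISION when a document timestamp follows the signature.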

class pyhanko.sign.signers.pdf_signer.PdfTimeStamper(timestamper: TimeStamper, field_name: str | None = None, invis_settings: InvisSigSettings = InvisSigSettings(set_print_flag=True, set_hidden_flag=False, box_out_of_bounds=False), readable_field_name: str = 'Timestamp')

Bases: object

Class to encapsulate the process of appending document timestamps to PDF files.

property field_name: str

Retrieve or generate the field name for the signature field to contain the document timestamp.

Returns:

The field name, as a (Python) string.

timestamp_pdf(pdf_out: IncrementalPdfFileWriter, md_algorithm, validation_context=None, bytes_reserved=None, validation_paths=None, timestamper: TimeStamper | None = None, *, in_place=False, output=None, dss_settings: TimestampDSSContentSettings = TimestampDSSContentSettings(include_vri=True, skip_if_unneeded=True, update_before_ts=False), chunk_size=4096, tight_size_estimates: bool = False)

Changed in version 0.9.0: Wrapper around async_timestamp_pdf().

Timestamp the contents of pdf_out. Note that pdf_out should not be written to after this operation.

Parameters:
  • pdf_out – An IncrementalPdfFileWriter.

  • md_algorithm – The hash algorithm to use when computing message digests.

  • validation_context – The pyhanko_certvalidator.ValidationContext against which the TSA response should be validated. This validation context will also be used to update the DSS.

  • bytes_reserved

    Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

    Warning

    Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.

  • validation_paths – If the validation path(s) for the TSA’s certificate are already known, you can pass them using this parameter to avoid having to run the validation logic again.

  • timestamper – Override the default TimeStamper associated with this PdfTimeStamper.

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

  • dss_settings – DSS output settings. See TimestampDSSContentSettings.

  • tight_size_estimates

    When estimating the size of a document timestamp container, do not add safety margins.

    Note

    External TSAs cannot be relied upon to always produce the exact same output length, which makes this option risky to use.

Returns:

The output stream containing the signed output.
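The warning about bytes_reserved comes down to simple arithmetic: every DER byte takes two hexadecimal characters in the file. The sketch below illustrates this with a hypothetical helper, bytes_to_reserve, whose margin parameter is an assumption added for the example.

```python
def bytes_to_reserve(der_length, margin=0):
    """Placeholder size for a CMS object of der_length DER-encoded bytes."""
    # Each DER byte is written as two hexadecimal characters.
    return 2 * (der_length + margin)


# A 5-byte DER SEQUENCE, used purely for illustration.
sample_der = bytes.fromhex("3003020101")
assert len(sample_der.hex()) == bytes_to_reserve(len(sample_der))
```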

async async_timestamp_pdf(pdf_out: IncrementalPdfFileWriter, md_algorithm, validation_context=None, bytes_reserved=None, validation_paths=None, timestamper: TimeStamper | None = None, *, in_place=False, output=None, dss_settings: TimestampDSSContentSettings = TimestampDSSContentSettings(include_vri=True, skip_if_unneeded=True, update_before_ts=False), chunk_size=4096, tight_size_estimates: bool = False, embed_roots: bool = True)

Added in version 0.9.0.

Timestamp the contents of pdf_out. Note that pdf_out should not be written to after this operation.

Parameters:
  • pdf_out – An IncrementalPdfFileWriter.

  • md_algorithm – The hash algorithm to use when computing message digests.

  • validation_context – The pyhanko_certvalidator.ValidationContext against which the TSA response should be validated. This validation context will also be used to update the DSS.

  • bytes_reserved

    Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

    Warning

    Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.

  • validation_paths – If the validation path(s) for the TSA’s certificate are already known, you can pass them using this parameter to avoid having to run the validation logic again.

  • timestamper – Override the default TimeStamper associated with this PdfTimeStamper.

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

  • dss_settings – DSS output settings. See TimestampDSSContentSettings.

  • tight_size_estimates

    When estimating the size of a document timestamp container, do not add safety margins.

    Note

    External TSAs cannot be relied upon to always produce the exact same output length, which makes this option risky to use.

  • embed_roots

    Option that controls whether the root certificate of each validation path should be embedded into the DSS. The default is True.

    Note

    Trust roots are configured by the validator, so embedding them usually has no effect on the validation process, and they can safely be omitted in most cases. Nonetheless, embedding the roots can be useful for documentation purposes.

Returns:

The output stream containing the signed output.
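A minimal sketch of calling this coroutine, assuming an HTTP-based TSA reachable through pyhanko.sign.timestamps.HTTPTimeStamper. The helper name add_document_timestamp and its parameters are placeholders.

```python
async def add_document_timestamp(pdf_path, tsa_url, validation_context=None):
    """Hypothetical helper: return the bytes of pdf_path plus a document timestamp."""
    # pyHanko imports are kept local so the sketch has no import-time dependencies.
    from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter
    from pyhanko.sign import signers, timestamps

    stamper = signers.PdfTimeStamper(timestamps.HTTPTimeStamper(tsa_url))
    with open(pdf_path, "rb") as inf:
        writer = IncrementalPdfFileWriter(inf)
        # With output=None, the result is written to a new BytesIO object.
        out = await stamper.async_timestamp_pdf(
            writer, "sha256", validation_context=validation_context
        )
        return out.getvalue()
```

The coroutine could be driven with asyncio.run() from synchronous code.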

update_archival_timestamp_chain(reader: PdfFileReader, validation_context, in_place=True, output=None, chunk_size=4096, default_md_algorithm='sha256')

Changed in version 0.9.0: Wrapper around async_update_archival_timestamp_chain().

Validate the last timestamp in the timestamp chain on a PDF file, and write an updated version to an output stream.

Parameters:
  • reader – A PdfFileReader encapsulating the input file.

  • validation_context – pyhanko_certvalidator.ValidationContext object to validate the last timestamp.

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

  • default_md_algorithm – Message digest to use if there are no preceding timestamps in the file.

Returns:

The output stream containing the signed output.

async async_update_archival_timestamp_chain(reader: PdfFileReader, validation_context, in_place=True, output=None, chunk_size=4096, default_md_algorithm='sha256', embed_roots: bool = True)

Added in version 0.9.0.

Validate the last timestamp in the timestamp chain on a PDF file, and write an updated version to an output stream.

Parameters:
  • reader – A PdfFileReader encapsulating the input file.

  • validation_context – pyhanko_certvalidator.ValidationContext object to validate the last timestamp.

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

  • default_md_algorithm – Message digest to use if there are no preceding timestamps in the file.

  • embed_roots

    Option that controls whether the root certificate of each validation path should be embedded into the DSS. The default is True.

    Note

    Trust roots are configured by the validator, so embedding them usually has no effect on the validation process, and they can safely be omitted in most cases. Nonetheless, embedding the roots can be useful for documentation purposes.

Returns:

The output stream containing the signed output.
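The following sketch shows one way to extend the timestamp chain of an existing file in place. The helper name extend_timestamp_chain is hypothetical, and the TSA is assumed to be reachable over HTTP.

```python
def extend_timestamp_chain(pdf_path, tsa_url, validation_context):
    """Hypothetical helper: add a fresh document timestamp to a PDF in place."""
    # pyHanko imports are kept local so the sketch has no import-time dependencies.
    from pyhanko.pdf_utils.reader import PdfFileReader
    from pyhanko.sign import signers, timestamps

    stamper = signers.PdfTimeStamper(timestamps.HTTPTimeStamper(tsa_url))
    # in_place defaults to True, so the file is opened for reading and writing.
    with open(pdf_path, "r+b") as doc:
        reader = PdfFileReader(doc)
        stamper.update_archival_timestamp_chain(reader, validation_context)
```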

class pyhanko.sign.signers.pdf_signer.PdfSigner(signature_meta: PdfSignatureMetadata, signer: Signer, *, timestamper: TimeStamper | None = None, stamp_style: BaseStampStyle | None = None, new_field_spec: SigFieldSpec | None = None)

Bases: object

Class to handle PDF signatures in general.

Parameters:
  • signature_meta – The specification of the signature to add.

  • signer – Signer object to use to produce the signature object.

  • timestamper – TimeStamper object to use to produce any time stamp tokens that might be required.

  • stamp_style – Stamp style specification to determine the visible style of the signature, typically an object of type TextStampStyle or QRStampStyle. Defaults to constants.DEFAULT_SIGNING_STAMP_STYLE.

  • new_field_spec – If a new field is to be created, this parameter allows the caller to specify the field’s properties in the form of a SigFieldSpec. This parameter is only meaningful if existing_fields_only is False.

property default_md_for_signer: str | None

Name of the default message digest algorithm for this signer, if there is one. This property will try the md_algorithm attribute on the signer’s signature_meta, or try to retrieve the digest algorithm associated with the underlying Signer.

Returns:

The name of the message digest algorithm, or None.

register_extensions(pdf_out: BasePdfFileWriter, *, md_algorithm: str)

init_signing_session(pdf_out: BasePdfFileWriter, existing_fields_only=False) → PdfSigningSession

Initialise a signing session with this PdfSigner for a specified PDF file writer.

This step in the signing process handles all field-level operations prior to signing: it creates the target form field if necessary, and makes sure the seed value dictionary gets processed.

See also digest_doc_for_signing() and sign_pdf().

Parameters:
  • pdf_out – The writer containing the PDF file to be signed.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

Returns:

A PdfSigningSession object modelling the signing session in its post-setup stage.

digest_doc_for_signing(pdf_out: BasePdfFileWriter, existing_fields_only=False, bytes_reserved=None, *, appearance_text_params=None, in_place=False, output=None, chunk_size=4096) → Tuple[PreparedByteRangeDigest, PdfTBSDocument, IO]

Deprecated since version 0.9.0: Use async_digest_doc_for_signing() instead.

Set up all stages of the signing process up to and including the point where the signature placeholder is allocated, and the document’s /ByteRange digest is computed.

See sign_pdf() for a less granular, more high-level approach.

Note

This method is useful in remote signing scenarios, where you might want to free up resources while waiting for the remote signer to respond. The PreparedByteRangeDigest object returned allows you to keep track of the required state to fill the signature container at some later point in time.

Parameters:
  • pdf_out – A PDF file writer (usually an IncrementalPdfFileWriter) containing the data to sign.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

  • bytes_reserved

    Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

    Warning

    Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.

  • appearance_text_params – Dictionary with text parameters that will be passed to the signature appearance constructor (if applicable).

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns:

A tuple containing a PreparedByteRangeDigest object, a PdfTBSDocument object and an output handle to which the document in its current state has been written.

async async_digest_doc_for_signing(pdf_out: BasePdfFileWriter, existing_fields_only=False, bytes_reserved=None, *, appearance_text_params=None, in_place=False, output=None, chunk_size=4096) → Tuple[PreparedByteRangeDigest, PdfTBSDocument, IO]

Added in version 0.9.0.

Set up all stages of the signing process up to and including the point where the signature placeholder is allocated, and the document’s /ByteRange digest is computed.

See sign_pdf() for a less granular, more high-level approach.

Note

This method is useful in remote signing scenarios, where you might want to free up resources while waiting for the remote signer to respond. The PreparedByteRangeDigest object returned allows you to keep track of the required state to fill the signature container at some later point in time.

Parameters:
  • pdf_out – A PDF file writer (usually an IncrementalPdfFileWriter) containing the data to sign.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

  • bytes_reserved

    Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

    Warning

    Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.

  • appearance_text_params – Dictionary with text parameters that will be passed to the signature appearance constructor (if applicable).

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns:

A tuple containing a PreparedByteRangeDigest object, a PdfTBSDocument object and an output handle to which the document in its current state has been written.

sign_pdf(pdf_out: BasePdfFileWriter, existing_fields_only=False, bytes_reserved=None, *, appearance_text_params=None, in_place=False, output=None, chunk_size=4096)

Changed in version 0.9.0: Wrapper around async_sign_pdf().

Sign a PDF file using the provided output writer.

Parameters:
  • pdf_out – A PDF file writer (usually an IncrementalPdfFileWriter) containing the data to sign.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

  • bytes_reserved – Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

  • appearance_text_params – Dictionary with text parameters that will be passed to the signature appearance constructor (if applicable).

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns:

The output stream containing the signed data.
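A minimal usage sketch, assuming signer is an already configured Signer instance. The helper name sign_file and the default field name are placeholders.

```python
def sign_file(pdf_path, signer, field_name="Signature1"):
    """Hypothetical helper: sign pdf_path and return the signed bytes."""
    # pyHanko imports are kept local so the sketch has no import-time dependencies.
    from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter
    from pyhanko.sign import signers

    pdf_signer = signers.PdfSigner(
        signers.PdfSignatureMetadata(field_name=field_name), signer=signer
    )
    with open(pdf_path, "rb") as inf:
        writer = IncrementalPdfFileWriter(inf)
        # With output=None, the signed document is written to a new BytesIO object.
        out = pdf_signer.sign_pdf(writer)
        return out.getvalue()
```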

async async_sign_pdf(pdf_out: BasePdfFileWriter, existing_fields_only=False, bytes_reserved=None, *, appearance_text_params=None, in_place=False, output=None, chunk_size=4096)

Added in version 0.9.0.

Sign a PDF file using the provided output writer.

Parameters:
  • pdf_out – A PDF file writer (usually an IncrementalPdfFileWriter) containing the data to sign.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

  • bytes_reserved – Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

  • appearance_text_params – Dictionary with text parameters that will be passed to the signature appearance constructor (if applicable).

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns:

The output stream containing the signed data.

class pyhanko.sign.signers.pdf_signer.PdfSigningSession(pdf_signer: PdfSigner, pdf_out: BasePdfFileWriter, cms_writer, sig_field, md_algorithm: str, timestamper: TimeStamper | None, subfilter: SigSeedSubFilter, system_time: datetime | None = None, sv_spec: SigSeedValueSpec | None = None)

Bases: object

Added in version 0.7.0.

Class modelling a PDF signing session in its initial state.

The __init__ method is internal API; get an instance using PdfSigner.init_signing_session().

async perform_presign_validation(pdf_out: BasePdfFileWriter | None = None) → PreSignValidationStatus | None

Perform certificate validation checks for the signer’s certificate, including any necessary revocation checks.

This function will also attempt to validate & collect revocation information for the relevant TSA (by requesting a dummy timestamp).

Parameters:

pdf_out – Current PDF writer. Technically optional; only used to look for the end of the timestamp chain in the previous revision when producing a PAdES-LTA signature in a document that is already signed (to ensure that the timestamp chain is uninterrupted).

Returns:

A PreSignValidationStatus object, or None if there is no validation context available.

async estimate_signature_container_size(validation_info: PreSignValidationStatus | None, tight=False)

prepare_tbs_document(validation_info: PreSignValidationStatus | None, bytes_reserved, appearance_text_params=None) → PdfTBSDocument

Set up the signature appearance (if necessary) and signature dictionary in the PDF file, to put the document in its final pre-signing state.

Parameters:
  • validation_info – Validation information collected prior to signing.

  • bytes_reserved – Bytes to reserve for the signature container.

  • appearance_text_params – Optional text parameters for the signature appearance content.

Returns:

A PdfTBSDocument describing the document in its final pre-signing state.

class pyhanko.sign.signers.pdf_signer.PdfTBSDocument(cms_writer, signer: Signer, md_algorithm: str, use_pades: bool, timestamper: TimeStamper | None = None, post_sign_instructions: PostSignInstructions | None = None, validation_context: ValidationContext | None = None)

Bases: object

Added in version 0.7.0.

A PDF document in its final pre-signing state.

The __init__ method is internal API; get an instance using PdfSigningSession.prepare_tbs_document(). Alternatively, use resume_signing() or finish_signing() to continue a previously interrupted signing process without instantiating a new PdfTBSDocument object.

digest_tbs_document(*, output: IO | None = None, in_place: bool = False, chunk_size=4096) → Tuple[PreparedByteRangeDigest, IO]

Write the document to an output stream and compute the digest, while keeping track of the (future) location of the signature contents in the output stream.

The digest can then be passed to the next part of the signing pipeline.

Warning

This method can only be called once.

Parameters:
  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns:

A tuple containing a PreparedByteRangeDigest and the output stream to which the output was written.

async perform_signature(document_digest: bytes, pdf_cms_signed_attrs: PdfCMSSignedAttributes) → PdfPostSignatureDocument

Perform the relevant cryptographic signing operations on the document digest, and write the resulting CMS object to the appropriate location in the output stream.

Warning

This method can only be called once, and must be invoked after digest_tbs_document().

Parameters:
  • document_digest – Digest of the document, as computed over the relevant /ByteRange.

  • pdf_cms_signed_attrs – Description of the signed attributes to include.

Returns:

A PdfPostSignatureDocument object.

classmethod resume_signing(output: IO, prepared_digest: PreparedByteRangeDigest, signature_cms: bytes | ContentInfo, post_sign_instr: PostSignInstructions | None = None, validation_context: ValidationContext | None = None) → PdfPostSignatureDocument

Resume signing after obtaining a CMS object from an external source.

This is a class method; it doesn’t require a PdfTBSDocument instance. Contrast with perform_signature().

Parameters:
  • output – Output stream housing the document in its final pre-signing state. This stream must at least be writable and seekable, and also readable if post-signature processing is required.

  • prepared_digest – The prepared digest returned by a prior call to digest_tbs_document().

  • signature_cms – CMS object to embed in the signature dictionary.

  • post_sign_instr – Instructions for post-signing processing (DSS updates and document timestamps).

  • validation_context – Validation context to use in post-signing operations. This is mainly intended for TSA certificate validation, but it can also contain additional validation data to embed in the DSS.

Returns:

A PdfPostSignatureDocument.

classmethod finish_signing(output: IO, prepared_digest: PreparedByteRangeDigest, signature_cms: bytes | ContentInfo, post_sign_instr: PostSignInstructions | None = None, validation_context: ValidationContext | None = None, chunk_size=4096)

Finish signing after obtaining a CMS object from an external source, and perform any required post-signature processing.

This is a class method; it doesn’t require a PdfTBSDocument instance. Contrast with perform_signature().

Parameters:
  • output – Output stream housing the document in its final pre-signing state.

  • prepared_digest – The prepared digest returned by a prior call to digest_tbs_document().

  • signature_cms – CMS object to embed in the signature dictionary.

  • post_sign_instr – Instructions for post-signing processing (DSS updates and document timestamps).

  • validation_context – Validation context to use in post-signing operations. This is mainly intended for TSA certificate validation, but it can also contain additional validation data to embed in the DSS.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

async classmethod async_finish_signing(output: IO, prepared_digest: PreparedByteRangeDigest, signature_cms: bytes | ContentInfo, post_sign_instr: PostSignInstructions | None = None, validation_context: ValidationContext | None = None, chunk_size=4096)

Finish signing after obtaining a CMS object from an external source, and perform any required post-signature processing.

This is a class method; it doesn’t require a PdfTBSDocument instance. Contrast with perform_signature().

Parameters:
  • output – Output stream housing the document in its final pre-signing state.

  • prepared_digest – The prepared digest returned by a prior call to digest_tbs_document().

  • signature_cms – CMS object to embed in the signature dictionary.

  • post_sign_instr – Instructions for post-signing processing (DSS updates and document timestamps).

  • validation_context – Validation context to use in post-signing operations. This is mainly intended for TSA certificate validation, but it can also contain additional validation data to embed in the DSS.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.
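The interrupted workflow these class methods support might look roughly like the sketch below. The helper name sign_with_external_cms is hypothetical, and fetch_cms is assumed to be a caller-supplied coroutine that takes the document digest and returns the CMS object produced by the external signer.

```python
async def sign_with_external_cms(pdf_signer, writer, fetch_cms, validation_context=None):
    """Hypothetical helper: digest, obtain a CMS object elsewhere, then finish."""
    # pyHanko imports are kept local so the sketch has no import-time dependencies.
    from pyhanko.sign.signers.pdf_signer import PdfTBSDocument

    # Allocate the signature placeholder and compute the /ByteRange digest.
    prepared_digest, tbs_document, output = (
        await pdf_signer.async_digest_doc_for_signing(writer)
    )
    post_sign_instr = tbs_document.post_sign_instructions

    # Assumption: fetch_cms wraps the remote signing service.
    signature_cms = await fetch_cms(prepared_digest.document_digest)

    # No PdfTBSDocument instance is needed to complete the process.
    await PdfTBSDocument.async_finish_signing(
        output,
        prepared_digest,
        signature_cms,
        post_sign_instr=post_sign_instr,
        validation_context=validation_context,
    )
    return output
```

Only output, prepared_digest and the post-signing instructions have to be kept around while waiting for the external signer.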

class pyhanko.sign.signers.pdf_signer.PdfPostSignatureDocument(sig_contents: bytes, post_sign_instr: PostSignInstructions | None = None, validation_context: ValidationContext | None = None)

Bases: object

Added in version 0.7.0.

Represents the final phase of the PDF signing process.

async post_signature_processing(output: IO, chunk_size=4096)

Handle DSS updates and LTA timestamps, if applicable.

Parameters:
  • output – I/O buffer containing the signed document. Must support reading, writing and seeking.

  • chunk_size – Chunk size to use for I/O operations that do not support the buffer protocol.
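Taken together, the session, TBS document and post-signature classes form a pipeline. The sketch below chains them in the order suggested by the descriptions above. It is an approximation, not necessarily how PdfSigner does it internally. The helper name sign_step_by_step is hypothetical, and it assumes PdfCMSSignedAttributes can be imported from pyhanko.sign.signers.pdf_cms and constructed without a signing time.

```python
async def sign_step_by_step(pdf_signer, writer):
    """Hypothetical helper tracing the granular signing pipeline."""
    # pyHanko imports are kept local so the sketch has no import-time dependencies.
    from pyhanko.sign.signers.pdf_cms import PdfCMSSignedAttributes

    # 1. Field-level setup.
    session = pdf_signer.init_signing_session(writer)
    # 2. Certificate validation (None if no validation context is available).
    validation_info = await session.perform_presign_validation(writer)
    # 3. Estimate how much room the signature container needs.
    bytes_reserved = await session.estimate_signature_container_size(validation_info)
    # 4. Put the document in its final pre-signing state and digest it.
    tbs_document = session.prepare_tbs_document(validation_info, bytes_reserved)
    prepared_digest, output = tbs_document.digest_tbs_document()
    # 5. Sign the digest and write the CMS object into the placeholder.
    revinfo = None if validation_info is None else validation_info.adobe_revinfo_attr
    post_sign_doc = await tbs_document.perform_signature(
        document_digest=prepared_digest.document_digest,
        pdf_cms_signed_attrs=PdfCMSSignedAttributes(adobe_revinfo_attr=revinfo),
    )
    # 6. DSS updates and LTA timestamps, if applicable.
    await post_sign_doc.post_signature_processing(output)
    return output
```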

class pyhanko.sign.signers.pdf_signer.PreSignValidationStatus(signer_path: ~pyhanko_certvalidator.path.ValidationPath, validation_paths: ~typing.List[~pyhanko_certvalidator.path.ValidationPath], ts_validation_paths: ~typing.List[~pyhanko_certvalidator.path.ValidationPath] | None = None, adobe_revinfo_attr: ~asn1crypto.pdf.RevocationInfoArchival | None = None, ocsps_to_embed: ~typing.List[~asn1crypto.ocsp.OCSPResponse] = <factory>, crls_to_embed: ~typing.List[~asn1crypto.crl.CertificateList] = <factory>, ac_validation_paths: ~typing.List[~pyhanko_certvalidator.path.ValidationPath] = <factory>)

Bases: object

Added in version 0.7.0.

Container for validation data collected prior to creating a signature, e.g. for later inclusion in a document’s DSS, or as a signed attribute on the signature.

signer_path: ValidationPath

Validation path for the signer’s certificate.

validation_paths: List[ValidationPath]

List of other relevant validation paths.

ts_validation_paths: List[ValidationPath] | None = None

List of validation paths relevant for embedded timestamps.

adobe_revinfo_attr: RevocationInfoArchival | None = None

Preformatted revocation info attribute to include, if requested by the settings.

ocsps_to_embed: List[OCSPResponse]

List of OCSP responses collected so far.

crls_to_embed: List[CertificateList]

List of CRLs collected so far.

ac_validation_paths: List[ValidationPath]

List of validation paths relevant for embedded attribute certificates.

class pyhanko.sign.signers.pdf_signer.PostSignInstructions(validation_info: PreSignValidationStatus, timestamper: TimeStamper | None = None, timestamp_md_algorithm: str | None = None, timestamp_field_name: str | None = None, dss_settings: DSSContentSettings = DSSContentSettings(include_vri=True, skip_if_unneeded=True, placement=<SigDSSPlacementPreference.TOGETHER_WITH_NEXT_TS: 3>, next_ts_settings=None), tight_size_estimates: bool = False, embed_roots: bool = True, file_credential: SerialisedCredential | None = None, strict: bool = True)

Bases: object

Added in version 0.7.0.

Container class housing instructions for incremental updates to the document after the signature has been put in place. Necessary for PAdES-LT and PAdES-LTA workflows.

validation_info: PreSignValidationStatus

Validation information to embed in the DSS (if not already present).

timestamper: TimeStamper | None = None

Timestamper to use to produce document timestamps. If None, no timestamp will be added.

timestamp_md_algorithm: str | None = None

Digest algorithm to use when producing timestamps. Defaults to DEFAULT_MD.

timestamp_field_name: str | None = None

Name of the timestamp field to use. If not specified, a field name will be generated.

dss_settings: DSSContentSettings = DSSContentSettings(include_vri=True, skip_if_unneeded=True, placement=<SigDSSPlacementPreference.TOGETHER_WITH_NEXT_TS: 3>, next_ts_settings=None)

Added in version 0.8.0.

Settings to fine-tune DSS generation.

tight_size_estimates: bool = False

Added in version 0.8.0.

When estimating the size of a document timestamp container, do not add safety margins.

Note

External TSAs cannot be relied upon to always produce the exact same output length, which makes this option risky to use.
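The risk can be made concrete with a small, purely illustrative sketch. The function reserved_size, the 512-byte margin and the token lengths below are all assumptions chosen for the example; they do not reflect pyHanko's actual estimation logic or any particular TSA.

```python
def reserved_size(estimated_len: int, tight: bool, margin: int = 512) -> int:
    """Bytes to reserve for a timestamp token (hypothetical helper).

    With `tight=True` the estimate is used as-is; otherwise a safety
    margin is added on top of it.
    """
    return estimated_len if tight else estimated_len + margin


# Assumed numbers: the size estimate came from a 6000-byte token,
# but the TSA's next response turns out to be 40 bytes longer.
estimated_len = 6000
actual_len = 6040

tight_fits = actual_len <= reserved_size(estimated_len, tight=True)
padded_fits = actual_len <= reserved_size(estimated_len, tight=False)
print(tight_fits, padded_fits)
```

In this scenario the tight estimate leaves no room for the longer token, while the padded reservation still accommodates it.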

embed_roots: bool = True

Added in version 0.9.0.

Option that controls whether the root certificate of each validation path should be embedded into the DSS. The default is True.

Note

Trust roots are configured by the validator, so embedding them has no effect in a typical validation process. They can therefore be safely omitted in most cases. Nonetheless, embedding the roots can be useful for documentation purposes.

Note

This setting is not part of DSSContentSettings because its value is taken from the corresponding property on the Signer involved, not from the initial configuration.

file_credential: SerialisedCredential | None = None

Added in version 0.13.0.

Serialised file credential, to update encrypted files.

strict: bool = True

Added in version 0.25.2.

Controls whether to open the signed document in strict mode before applying post-signing instructions.