pyhanko.sign.signers.pdf_signer module

This module implements support for PDF-specific signing functionality.

class pyhanko.sign.signers.pdf_signer.PdfSignatureMetadata(field_name: Optional[str] = None, md_algorithm: Optional[str] = None, location: Optional[str] = None, reason: Optional[str] = None, name: Optional[str] = None, certify: bool = False, subfilter: Optional[pyhanko.sign.fields.SigSeedSubFilter] = None, embed_validation_info: bool = False, use_pades_lta: bool = False, timestamp_field_name: Optional[str] = None, validation_context: Optional[pyhanko_certvalidator.context.ValidationContext] = None, docmdp_permissions: pyhanko.sign.fields.MDPPerm = MDPPerm.FILL_FORMS, signer_key_usage: Set[str] = <factory>, cades_signed_attr_spec: Optional[pyhanko.sign.ades.api.CAdESSignedAttrSpec] = None)

Bases: object

Specification for a PDF signature.

field_name: str = None

The name of the form field to contain the signature. If there is only one available signature field, the name may be inferred.

md_algorithm: str = None

The name of the digest algorithm to use. It should be supported by pyca/cryptography.

If None, this will ordinarily default to the value of constants.DEFAULT_MD, unless a seed value dictionary and/or a prior certification signature happen to be available.

location: str = None

Location of signing.

reason: str = None

Reason for signing (textual).

name: str = None

Name of the signer. This value is usually not necessary to set, since it should appear on the signer’s certificate, but there are cases where it might be useful to specify it here (e.g. in situations where signing is delegated to a trusted third party).

certify: bool = False

Sign with an author (certification) signature, as opposed to an approval signature. A document can contain at most one such signature, and it must be the first one.

subfilter: pyhanko.sign.fields.SigSeedSubFilter = None

Signature subfilter to use.

This should be one of ADOBE_PKCS7_DETACHED or PADES. If not specified, the value may be inferred from the signature field’s seed value dictionary. Failing that, ADOBE_PKCS7_DETACHED is used as the default value.
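The fallback order described above can be sketched as follows. This is a simplified illustration rather than pyHanko's own logic: effective_subfilter is a hypothetical helper, plain strings stand in for SigSeedSubFilter members, and mandatory seed value constraints are ignored.

```python
from typing import Optional

# Documented last-resort default.
DEFAULT_SUBFILTER = "ADOBE_PKCS7_DETACHED"


def effective_subfilter(explicit: Optional[str] = None,
                        seed_value: Optional[str] = None) -> str:
    """Pick a subfilter: explicit setting, then seed value, then default."""
    if explicit is not None:
        return explicit
    if seed_value is not None:
        return seed_value
    return DEFAULT_SUBFILTER
```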

embed_validation_info: bool = False

Flag indicating whether validation info (OCSP responses and/or CRLs) should be embedded or not. This is necessary to be able to validate signatures long after they have been made. This flag requires validation_context to be set.

The precise manner in which the validation info is embedded depends on the (effective) value of subfilter:

  • With ADOBE_PKCS7_DETACHED, the validation information will be embedded inside the CMS object containing the signature.

  • With PADES, the validation information will be embedded into the document security store (DSS).

use_pades_lta: bool = False

If True, the signer will append an additional document timestamp after writing the signature’s validation information to the document security store (DSS). This flag is only meaningful if subfilter is PADES.

The PAdES B-LTA profile solves the long-term validation problem by adding a timestamp chain to the document after the regular signatures, which is updated with new timestamps at regular intervals. This provides an audit trail that ensures the long-term integrity of the validation information in the DSS, since OCSP responses and CRLs also have a finite lifetime.

See also PdfTimeStamper.update_archival_timestamp_chain().

timestamp_field_name: str = None

Name of the timestamp field created when use_pades_lta is True. If not specified, a unique name will be generated using uuid.
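A minimal sketch of how such a name could be generated with the standard uuid module. The helper name and the "Timestamp-" prefix are assumptions made for illustration; the exact format pyHanko uses may differ.

```python
import uuid
from typing import Optional


def resolve_timestamp_field_name(configured: Optional[str] = None) -> str:
    """Return the configured field name, or generate a unique one."""
    if configured is not None:
        return configured
    # "Timestamp-" is an illustrative prefix, not necessarily pyHanko's.
    return f"Timestamp-{uuid.uuid4()}"
```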

validation_context: pyhanko_certvalidator.context.ValidationContext = None

The validation context to use when validating signatures. If provided, the signer’s certificate and any timestamp certificates will be validated before signing.

This parameter is mandatory when embed_validation_info is True.

docmdp_permissions: pyhanko.sign.fields.MDPPerm = MDPPerm.FILL_FORMS

Indicates the document modification policy that will be in force after this signature is created. Only relevant for certification signatures or signatures that apply locking.

Warning

For non-certification signatures, this is only explicitly allowed since PDF 2.0 (ISO 32000-2), so older software may not respect this setting on approval signatures.

signer_key_usage: Set[str]

Key usage extensions required for the signer’s certificate. Defaults to non_repudiation only, but sometimes digital_signature or a combination of both may be more appropriate. See x509.KeyUsage for a complete list.

Only relevant if a validation context is also provided.

cades_signed_attr_spec: Optional[pyhanko.sign.ades.api.CAdESSignedAttrSpec] = None

New in version 0.5.0.

Specification for CAdES-specific attributes.
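Several of the settings above depend on one another: embed_validation_info requires validation_context, and use_pades_lta is only meaningful with the PADES subfilter. The sketch below checks those two documented rules on any object exposing the same attribute names. check_signature_options is a hypothetical helper, not part of pyHanko, and it only looks at the explicit subfilter setting, not at a value inferred from a seed value dictionary.

```python
from types import SimpleNamespace
from typing import List


def check_signature_options(meta) -> List[str]:
    """Return a list of inconsistencies between the documented settings."""
    problems = []
    # The subfilter may be an enum member (use its name) or a plain string.
    subfilter = getattr(meta.subfilter, "name", meta.subfilter)
    if meta.embed_validation_info and meta.validation_context is None:
        problems.append("embed_validation_info requires validation_context")
    if meta.use_pades_lta and subfilter != "PADES":
        problems.append(
            "use_pades_lta is only meaningful with the PADES subfilter"
        )
    return problems


# Stand-in object for illustration; a real PdfSignatureMetadata exposes
# the same attribute names.
meta = SimpleNamespace(
    subfilter="PADES",
    embed_validation_info=True,
    use_pades_lta=True,
    validation_context=None,
)
print(check_signature_options(meta))
```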

class pyhanko.sign.signers.pdf_signer.PdfTimeStamper(timestamper: pyhanko.sign.timestamps.TimeStamper, field_name: Optional[str] = None)

Bases: object

Class to encapsulate the process of appending document timestamps to PDF files.

property field_name: str

Retrieve or generate the field name for the signature field to contain the document timestamp.

Returns

The field name, as a (Python) string.

timestamp_pdf(pdf_out: pyhanko.pdf_utils.incremental_writer.IncrementalPdfFileWriter, md_algorithm, validation_context=None, bytes_reserved=None, validation_paths=None, timestamper: Optional[pyhanko.sign.timestamps.TimeStamper] = None, *, in_place=False, output=None, chunk_size=4096)

Timestamp the contents of pdf_out. Note that pdf_out should not be written to after this operation.

Parameters
  • pdf_out – An IncrementalPdfFileWriter.

  • md_algorithm – The hash algorithm to use when computing message digests.

  • validation_context – The pyhanko_certvalidator.ValidationContext against which the TSA response should be validated. This validation context will also be used to update the DSS.

  • bytes_reserved

    Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

    Warning

    Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.

  • validation_paths – If the validation path(s) for the TSA’s certificate are already known, you can pass them using this parameter to avoid having to run the validation logic again.

  • timestamper – Override the default TimeStamper associated with this PdfTimeStamper.

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns

The output stream containing the signed output.
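The warning about bytes_reserved comes down to simple arithmetic: every DER byte becomes two hexadecimal characters in the file. A small sketch of that calculation follows; hex_bytes_reserved and its margin parameter are hypothetical and only illustrate the doubling.

```python
def hex_bytes_reserved(der_length: int, margin: int = 0) -> int:
    """Bytes to reserve for a CMS object of der_length DER-encoded bytes.

    margin is optional slack (in DER bytes) on top of the estimate.
    """
    if der_length < 0 or margin < 0:
        raise ValueError("lengths must be non-negative")
    # Each byte is written as two hexadecimal digits.
    return 2 * (der_length + margin)


# The hex encoding of any byte string is exactly twice as long.
sample_der = bytes(range(16))
print(len(sample_der), len(sample_der.hex()))
```

For instance, an estimated 8192-byte CMS object would call for bytes_reserved=16384.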

update_archival_timestamp_chain(reader: pyhanko.pdf_utils.reader.PdfFileReader, validation_context, in_place=True, output=None, chunk_size=4096, default_md_algorithm='sha256')

Validate the last timestamp in the timestamp chain on a PDF file, and write an updated version to an output stream.

Parameters
  • reader – A PdfFileReader encapsulating the input file.

  • validation_context – pyhanko_certvalidator.ValidationContext object used to validate the last timestamp.

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

  • default_md_algorithm – Message digest to use if there are no preceding timestamps in the file.

Returns

The output stream containing the signed output.
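The role of chunk_size can be illustrated with a chunked digest over a stream. This is a sketch using only the standard library; digest_stream is a hypothetical helper and pyHanko's internal implementation may differ.

```python
import hashlib
import io


def digest_stream(stream, md_algorithm: str = "sha256",
                  chunk_size: int = 4096) -> bytes:
    """Digest a binary stream by reading it in chunk_size pieces."""
    md = hashlib.new(md_algorithm)
    buf = bytearray(chunk_size)   # reusable internal buffer
    view = memoryview(buf)
    while True:
        n = stream.readinto(buf)  # number of bytes actually read
        if not n:
            break
        md.update(view[:n])       # feed only the filled part
    return md.digest()


data = b"x" * 10000
print(digest_stream(io.BytesIO(data)).hex())
```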

class pyhanko.sign.signers.pdf_signer.PdfSigner(signature_meta: pyhanko.sign.signers.pdf_signer.PdfSignatureMetadata, signer: pyhanko.sign.signers.pdf_cms.Signer, *, timestamper: Optional[pyhanko.sign.timestamps.TimeStamper] = None, stamp_style: Optional[pyhanko.stamp.BaseStampStyle] = None, new_field_spec: Optional[pyhanko.sign.fields.SigFieldSpec] = None)

Bases: object

Class to handle PDF signatures in general.

Parameters
  • signature_meta – The specification of the signature to add.

  • signer – Signer object to use to produce the signature object.

  • timestamper – TimeStamper object to use to produce any timestamp tokens that might be required.

  • stamp_style – Stamp style specification to determine the visible style of the signature, typically an object of type TextStampStyle or QRStampStyle. Defaults to constants.DEFAULT_SIGNING_STAMP_STYLE.

  • new_field_spec – If a new field is to be created, this parameter allows the caller to specify the field’s properties in the form of a SigFieldSpec. This parameter is only meaningful if existing_fields_only is False.

property default_md_for_signer: Optional[str]

Name of the default message digest algorithm for this signer, if there is one. This property first tries the md_algorithm attribute of the signer’s signature_meta, and otherwise tries to retrieve the digest algorithm associated with the underlying Signer.

Returns

The name of the message digest algorithm, or None.

init_signing_session(pdf_out: pyhanko.pdf_utils.writer.BasePdfFileWriter, existing_fields_only=False) pyhanko.sign.signers.pdf_signer.PdfSigningSession

Initialise a signing session with this PdfSigner for a specified PDF file writer.

This step in the signing process handles all field-level operations prior to signing: it creates the target form field if necessary, and makes sure the seed value dictionary gets processed.

See also digest_doc_for_signing() and sign_pdf().

Parameters
  • pdf_out – The writer containing the PDF file to be signed.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

Returns

A PdfSigningSession object modelling the signing session in its post-setup stage.

digest_doc_for_signing(pdf_out: pyhanko.pdf_utils.writer.BasePdfFileWriter, existing_fields_only=False, bytes_reserved=None, *, appearance_text_params=None, in_place=False, output=None, chunk_size=4096) Tuple[pyhanko.sign.signers.pdf_byterange.PreparedByteRangeDigest, pyhanko.sign.signers.pdf_signer.PdfTBSDocument, IO]

Set up all stages of the signing process up to and including the point where the signature placeholder is allocated, and the document’s /ByteRange digest is computed.

See sign_pdf() for a less granular, more high-level approach.

Note

This method is useful in remote signing scenarios, where you might want to free up resources while waiting for the remote signer to respond. The PreparedByteRangeDigest object returned allows you to keep track of the required state to fill the signature container at some later point in time.

Parameters
  • pdf_out – A PDF file writer (usually an IncrementalPdfFileWriter) containing the data to sign.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

  • bytes_reserved

    Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

    Warning

    Since the CMS object is written to the output file as a hexadecimal string, you should request twice the (estimated) number of bytes in the DER-encoded version of the CMS object.

  • appearance_text_params – Dictionary with text parameters that will be passed to the signature appearance constructor (if applicable).

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns

A tuple containing a PreparedByteRangeDigest object, a PdfTBSDocument object and an output handle to which the document in its current state has been written.
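To make the /ByteRange digest concrete: the digest covers every byte of the file except the placeholder reserved for the signature contents. The sketch below reproduces that idea on an in-memory byte string. byte_range_digest is a hypothetical helper written for illustration and is not how pyHanko exposes this step.

```python
import hashlib
from typing import List, Tuple


def byte_range_digest(data: bytes, gap_start: int, gap_end: int,
                      md_algorithm: str = "sha256") -> Tuple[List[int], bytes]:
    """Digest data while skipping the placeholder data[gap_start:gap_end]."""
    # Two (offset, length) pairs: before and after the placeholder.
    byte_range = [0, gap_start, gap_end, len(data) - gap_end]
    md = hashlib.new(md_algorithm)
    md.update(data[:gap_start])
    md.update(data[gap_end:])
    return byte_range, md.digest()


# Toy document: the placeholder spans the whole <...> token.
doc = b"HEAD<0000>TAIL"
start, end = doc.index(b"<"), doc.index(b">") + 1
print(byte_range_digest(doc, start, end)[0])
```

Because the placeholder is excluded, the CMS object can be written into it later without invalidating the digest.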

sign_pdf(pdf_out: pyhanko.pdf_utils.writer.BasePdfFileWriter, existing_fields_only=False, bytes_reserved=None, *, appearance_text_params=None, in_place=False, output=None, chunk_size=4096)

Sign a PDF file using the provided output writer.

Parameters
  • pdf_out – A PDF file writer (usually an IncrementalPdfFileWriter) containing the data to sign.

  • existing_fields_only – If True, never create a new empty signature field to contain the signature. If False, a new field may be created if no field matching field_name exists.

  • bytes_reserved – Bytes to reserve for the CMS object in the PDF file. If not specified, make an estimate based on a dummy signature.

  • appearance_text_params – Dictionary with text parameters that will be passed to the signature appearance constructor (if applicable).

  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns

The output stream containing the signed data.
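A minimal sketch of writing the signed result to a file through the output parameter. sign_to_path is a hypothetical wrapper; it only relies on the sign_pdf() signature documented above, and is duck-typed so that any object with a compatible sign_pdf method will do.

```python
def sign_to_path(pdf_signer, pdf_out, out_path: str) -> str:
    """Sign pdf_out with pdf_signer and write the result to out_path."""
    # Opened read/write: post-signature processing may need to read the
    # output back, not just write to it.
    with open(out_path, "wb+") as out:
        pdf_signer.sign_pdf(pdf_out, output=out)
    return out_path
```

In practice, pdf_signer would be a PdfSigner built from a PdfSignatureMetadata and a Signer, and pdf_out an IncrementalPdfFileWriter wrapping the input file.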

class pyhanko.sign.signers.pdf_signer.PdfSigningSession(pdf_signer: pyhanko.sign.signers.pdf_signer.PdfSigner, cms_writer, sig_field, md_algorithm: str, timestamper: pyhanko.sign.timestamps.TimeStamper, subfilter: pyhanko.sign.fields.SigSeedSubFilter, system_time: Optional[datetime.datetime] = None, sv_spec: Optional[pyhanko.sign.fields.SigSeedValueSpec] = None)

Bases: object

New in version 0.7.0.

Class modelling a PDF signing session in its initial state.

The __init__ method is internal API; get an instance using PdfSigner.init_signing_session().

perform_presign_validation(pdf_out: Optional[pyhanko.pdf_utils.writer.BasePdfFileWriter] = None) Optional[pyhanko.sign.signers.pdf_signer.PreSignValidationStatus]

Perform certificate validation checks for the signer’s certificate, including any necessary revocation checks.

This function will also attempt to validate and collect revocation information for the relevant TSA (by requesting a dummy timestamp).

Parameters

pdf_out – Current PDF writer. Technically optional; only used to look for the end of the timestamp chain in the previous revision when producing a PAdES-LTA signature in a document that is already signed (to ensure that the timestamp chain is uninterrupted).

Returns

A PreSignValidationStatus object, or None if there is no validation context available.

prepare_tbs_document(validation_info: pyhanko.sign.signers.pdf_signer.PreSignValidationStatus, bytes_reserved=None, appearance_text_params=None) pyhanko.sign.signers.pdf_signer.PdfTBSDocument

Set up the signature appearance (if necessary) and signature dictionary in the PDF file, to put the document in its final pre-signing state.

Parameters
  • validation_info – Validation information collected prior to signing.

  • bytes_reserved – Bytes to reserve for the signature container. If None, an estimate will be computed.

  • appearance_text_params – Optional text parameters for the signature appearance content.

Returns

A PdfTBSDocument describing the document in its final pre-signing state.

class pyhanko.sign.signers.pdf_signer.PdfTBSDocument(cms_writer, signer: pyhanko.sign.signers.pdf_cms.Signer, md_algorithm: str, use_pades: bool, timestamper: Optional[pyhanko.sign.timestamps.TimeStamper] = None, post_sign_instructions: Optional[pyhanko.sign.signers.pdf_signer.PostSignInstructions] = None, validation_context: Optional[pyhanko_certvalidator.context.ValidationContext] = None)

Bases: object

New in version 0.7.0.

A PDF document in its final pre-signing state.

The __init__ method is internal API; get an instance using PdfSigningSession.prepare_tbs_document(). Alternatively, use resume_signing() or finish_signing() to continue a previously interrupted signing process without instantiating a new PdfTBSDocument object.

digest_tbs_document(*, output: Optional[IO] = None, in_place: bool = False, chunk_size=4096) Tuple[pyhanko.sign.signers.pdf_byterange.PreparedByteRangeDigest, IO]

Write the document to an output stream and compute the digest, while keeping track of the (future) location of the signature contents in the output stream.

The digest can then be passed to the next part of the signing pipeline.

Warning

This method can only be called once.

Parameters
  • output – Write the output to the specified output stream. If None, write to a new BytesIO object. Default is None.

  • in_place – Sign the original input stream in-place. This parameter overrides output.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.

Returns

A tuple containing a PreparedByteRangeDigest and the output stream to which the output was written.

perform_signature(document_digest: bytes, pdf_cms_signed_attrs: pyhanko.sign.signers.pdf_cms.PdfCMSSignedAttributes) pyhanko.sign.signers.pdf_signer.PdfPostSignatureDocument

Perform the relevant cryptographic signing operations on the document digest, and write the resulting CMS object to the appropriate location in the output stream.

Warning

This method can only be called once, and must be invoked after digest_tbs_document().

Parameters
  • document_digest – Digest of the document, as computed over the relevant /ByteRange.

  • pdf_cms_signed_attrs – Description of the signed attributes to include.

Returns

A PdfPostSignatureDocument object.

classmethod resume_signing(output: IO, prepared_digest: pyhanko.sign.signers.pdf_byterange.PreparedByteRangeDigest, signature_cms: Union[bytes, asn1crypto.cms.ContentInfo], post_sign_instr: Optional[pyhanko.sign.signers.pdf_signer.PostSignInstructions] = None, validation_context: Optional[pyhanko_certvalidator.context.ValidationContext] = None) pyhanko.sign.signers.pdf_signer.PdfPostSignatureDocument

Resume signing after obtaining a CMS object from an external source.

This is a class method; it doesn’t require a PdfTBSDocument instance. Contrast with perform_signature().

Parameters
  • output – Output stream housing the document in its final pre-signing state. This stream must at least be writable and seekable, and also readable if post-signature processing is required.

  • prepared_digest – The prepared digest returned by a prior call to digest_tbs_document().

  • signature_cms – CMS object to embed in the signature dictionary.

  • post_sign_instr – Instructions for post-signing processing (DSS updates and document timestamps).

  • validation_context – Validation context to use in post-signing operations. This is mainly intended for TSA certificate validation, but it can also contain additional validation data to embed in the DSS.

Returns

A PdfPostSignatureDocument.

classmethod finish_signing(output: IO, prepared_digest: pyhanko.sign.signers.pdf_byterange.PreparedByteRangeDigest, signature_cms: Union[bytes, asn1crypto.cms.ContentInfo], post_sign_instr: Optional[pyhanko.sign.signers.pdf_signer.PostSignInstructions] = None, validation_context: Optional[pyhanko_certvalidator.context.ValidationContext] = None, chunk_size=4096)

Finish signing after obtaining a CMS object from an external source, and perform any required post-signature processing.

This is a class method; it doesn’t require a PdfTBSDocument instance. Contrast with perform_signature().

Parameters
  • output – Output stream housing the document in its final pre-signing state.

  • prepared_digest – The prepared digest returned by a prior call to digest_tbs_document().

  • signature_cms – CMS object to embed in the signature dictionary.

  • post_sign_instr – Instructions for post-signing processing (DSS updates and document timestamps).

  • validation_context – Validation context to use in post-signing operations. This is mainly intended for TSA certificate validation, but it can also contain additional validation data to embed in the DSS.

  • chunk_size – Size of the internal buffer (in bytes) used to feed data to the message digest function if the input stream does not support memoryview.
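The interrupted (remote) signing flow can be sketched as below. This is an illustration under stated assumptions, not a definitive recipe: sign_with_external_cms and request_cms are hypothetical names, the document_digest attribute on the prepared digest is assumed, and finish_signing is passed in as a callable (normally PdfTBSDocument.finish_signing) so the sketch stays free of imports.

```python
def sign_with_external_cms(pdf_signer, pdf_out, request_cms, finish_signing,
                           bytes_reserved=None):
    """Prepare the document, fetch a CMS object elsewhere, then finish."""
    # Step 1: allocate the placeholder and compute the /ByteRange digest.
    prepared_digest, _tbs_document, output = pdf_signer.digest_doc_for_signing(
        pdf_out, bytes_reserved=bytes_reserved
    )
    # Step 2: hand the digest to the external signer (hypothetical callable
    # that returns the CMS object as bytes).
    signature_cms = request_cms(prepared_digest.document_digest)
    # Step 3: embed the CMS object and run post-signature processing.
    finish_signing(output, prepared_digest, signature_cms)
    return output
```

Between steps 1 and 3 only the output stream and the prepared digest need to be kept around, which is what makes this flow suitable for freeing up resources while waiting for the remote signer.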

class pyhanko.sign.signers.pdf_signer.PdfPostSignatureDocument(sig_contents: bytes, post_sign_instr: Optional[pyhanko.sign.signers.pdf_signer.PostSignInstructions] = None, validation_context: Optional[pyhanko_certvalidator.context.ValidationContext] = None)

Bases: object

New in version 0.7.0.

Represents the final phase of the PDF signing process.

post_signature_processing(output: IO, chunk_size=4096)

Handle DSS updates and LTA timestamps, if applicable.

Parameters
  • output – I/O buffer containing the signed document. Must support reading, writing and seeking.

  • chunk_size – Chunk size to use for I/O operations that do not support the buffer protocol.
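Putting the classes in this module together, the granular signing pipeline might be chained as in the sketch below. It uses only the method signatures documented above. sign_step_by_step is a hypothetical helper, the document_digest attribute on the prepared digest is assumed, and signed_attrs is expected to be a PdfCMSSignedAttributes object supplied by the caller.

```python
def sign_step_by_step(pdf_signer, pdf_out, signed_attrs, output=None):
    """Run the signing stages one by one instead of calling sign_pdf()."""
    # Field-level setup: create or locate the signature field.
    session = pdf_signer.init_signing_session(pdf_out,
                                              existing_fields_only=False)
    # Returns None when no validation context is available.
    validation_info = session.perform_presign_validation(pdf_out)
    # Put the document in its final pre-signing state.
    tbs_document = session.prepare_tbs_document(
        validation_info=validation_info
    )
    # Write the document out and digest the /ByteRange.
    prepared_digest, out = tbs_document.digest_tbs_document(output=output)
    # Sign the digest and embed the resulting CMS object.
    post_sign_doc = tbs_document.perform_signature(
        prepared_digest.document_digest, signed_attrs
    )
    # DSS updates and LTA timestamps, if applicable.
    post_sign_doc.post_signature_processing(out)
    return out
```

For most purposes sign_pdf() does all of this in a single call; the step-by-step form is mainly useful when individual stages need to be customised or interrupted.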

class pyhanko.sign.signers.pdf_signer.PreSignValidationStatus(signer_path: pyhanko_certvalidator.path.ValidationPath, validation_paths: List[pyhanko_certvalidator.path.ValidationPath], ts_validation_paths: Optional[List[pyhanko_certvalidator.path.ValidationPath]] = None, adobe_revinfo_attr: Optional[asn1crypto.cms.CMSAttribute] = None, ocsps_to_embed: Optional[List[asn1crypto.ocsp.OCSPResponse]] = None, crls_to_embed: Optional[List[asn1crypto.crl.CertificateList]] = None)

Bases: object

New in version 0.7.0.

Container for validation data collected prior to creating a signature, e.g. for later inclusion in a document’s DSS, or as a signed attribute on the signature.

signer_path: pyhanko_certvalidator.path.ValidationPath

Validation path for the signer’s certificate.

validation_paths: List[pyhanko_certvalidator.path.ValidationPath]

List of other relevant validation paths.

ts_validation_paths: Optional[List[pyhanko_certvalidator.path.ValidationPath]] = None

List of validation paths relevant for embedded timestamps.

adobe_revinfo_attr: Optional[asn1crypto.cms.CMSAttribute] = None

Preformatted revocation info attribute to include, if requested by the settings.

ocsps_to_embed: List[asn1crypto.ocsp.OCSPResponse] = None

List of OCSP responses collected so far.

crls_to_embed: List[asn1crypto.crl.CertificateList] = None

List of CRLs collected so far.

class pyhanko.sign.signers.pdf_signer.PostSignInstructions(validation_info: pyhanko.sign.signers.pdf_signer.PreSignValidationStatus, timestamper: Optional[pyhanko.sign.timestamps.TimeStamper] = None, timestamp_md_algorithm: Optional[str] = None, timestamp_field_name: Optional[str] = None)

Bases: object

New in version 0.7.0.

Container class housing instructions for incremental updates to the document after the signature has been put in place. Necessary for PAdES-LT and PAdES-LTA workflows.

validation_info: pyhanko.sign.signers.pdf_signer.PreSignValidationStatus

Validation information to embed in the DSS (if not already present).

timestamper: Optional[pyhanko.sign.timestamps.TimeStamper] = None

Timestamper to use to produce document timestamps. If None, no timestamp will be added.

timestamp_md_algorithm: Optional[str] = None

Digest algorithm to use when producing timestamps. Defaults to DEFAULT_MD.

timestamp_field_name: Optional[str] = None

Name of the timestamp field to use. If not specified, a field name will be generated.