pyhanko.sign.diff_analysis.rules.metadata_rules module

class pyhanko.sign.diff_analysis.rules.metadata_rules.DocInfoRule

Bases: pyhanko.sign.diff_analysis.rules_api.WhitelistRule

Rule that allows the /Info dictionary in the trailer to be updated.

apply(old: pyhanko.pdf_utils.reader.HistoricalResolver, new: pyhanko.pdf_utils.reader.HistoricalResolver) → Iterable[pyhanko.sign.diff_analysis.rules_api.ReferenceUpdate]

Apply the rule to the changes between two revisions.

Parameters
  • old – The older, base revision.

  • new – The newer revision to be vetted.

class pyhanko.sign.diff_analysis.rules.metadata_rules.MetadataUpdateRule(check_xml_syntax=True, always_refuse_stream_override=False)

Bases: pyhanko.sign.diff_analysis.rules_api.WhitelistRule

Rule to adjudicate updates to the XMP metadata stream.

The content of the metadata isn’t actually validated in any significant way; this class only checks whether the XML is well-formed.

Parameters
  • check_xml_syntax – Do a well-formedness check on the XML syntax. Default True.

  • always_refuse_stream_override

    Always refuse overrides of the metadata stream if its object ID already existed in a prior revision, even if the new stream replaces the old metadata stream and the syntax check passes. Default False.

    Note

    In other situations, pyHanko will reject stream overrides on general principle, since combined with the fault-tolerance of some PDF readers, these can allow an attacker to manipulate parts of the signed content in subtle but significant ways.

    In the case of the metadata stream, the risk is significantly mitigated thanks to the XML syntax check on both versions of the stream, but if you’re feeling extra paranoid, you can restore the general behaviour of rejecting stream overrides by setting always_refuse_stream_override to True.
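
The policy described by these two parameters can be modelled as a small decision function. This is a simplified sketch of the documented behaviour, not pyHanko's implementation: the function name and the boolean inputs are hypothetical, and the handling of a stream with a fresh object ID (first branch) is an assumption.

```python
def may_accept_metadata_stream(
    existed_before: bool,
    replaces_old_metadata: bool,
    old_xml_ok: bool,
    new_xml_ok: bool,
    always_refuse_stream_override: bool = False,
) -> bool:
    """Hypothetical model of the override policy described above.

    existed_before: the stream's object ID was already used in a prior revision.
    replaces_old_metadata: the overridden object was the old metadata stream.
    old_xml_ok / new_xml_ok: results of the well-formedness check on each version.
    """
    if not existed_before:
        # Assumption: a fresh object ID is not an override, so only the
        # syntax of the new stream matters.
        return new_xml_ok
    if always_refuse_stream_override:
        # Strict mode: never accept an override of an existing object.
        return False
    # Default mode: accept only a metadata-for-metadata replacement where
    # both versions pass the syntax check.
    return replaces_old_metadata and old_xml_ok and new_xml_ok
```

Under this model, the strict setting only changes the outcome for streams that reuse an existing object ID; metadata written to a new object is unaffected.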

static is_well_formed_xml(metadata_ref: pyhanko.pdf_utils.generic.Reference)

Checks whether the provided stream consists of well-formed XML data. Note that this does not perform any more advanced XML or XMP validation; the check is purely syntactic.

Parameters

metadata_ref – A reference to a (purported) metadata stream.

Raises

SuspiciousModification – if there are indications that the reference doesn’t point to an XML stream.
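
To illustrate what a purely syntactic check accepts and rejects, here is a minimal standard-library sketch. It is not the code pyHanko uses: the helper name looks_well_formed is hypothetical, it operates on raw bytes rather than on a Reference, and it returns a boolean instead of raising SuspiciousModification.

```python
import xml.etree.ElementTree as ET


def looks_well_formed(data: bytes) -> bool:
    """Return True if `data` parses as well-formed XML (syntax only)."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        return False
    return True


# Properly nested tags with a declared namespace prefix: well-formed.
good = b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><title>Report</title></x:xmpmeta>'
# The <title> element is never closed: not well-formed.
bad = b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><title>Report</x:xmpmeta>'
# Well-formed, but not meaningful XMP; a syntax-only check still accepts it.
unrelated = b"<anything/>"

results = [looks_well_formed(doc) for doc in (good, bad, unrelated)]
```

The third input shows the limitation noted above: well-formedness says nothing about whether the content is valid XMP.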

apply(old: pyhanko.pdf_utils.reader.HistoricalResolver, new: pyhanko.pdf_utils.reader.HistoricalResolver) → Iterable[pyhanko.sign.diff_analysis.rules_api.ReferenceUpdate]

Apply the rule to the changes between two revisions.

Parameters
  • old – The older, base revision.

  • new – The newer revision to be vetted.