Release history
Release date: 2023-11-26
Dependency changes
Bumped the minimal supported Python version to 3.8 (dropping 3.7).
Bumped the lower bound on qrcode to 7.3.1.
Bumped pyhanko-certvalidator to 0.26.x.
Bumped the lower bound on click to 8.1.3.
Bumped the lower bound on requests to 2.31.0.
Bumped the lower bound on pyyaml to 6.0.
Bumped the lower bound on cryptography to 41.0.5.
Bumped aiohttp to 3.9.x.
Bumped certomancer-csc-dummy test dependency to 0.2.3.
Introduced new dependency group etsi with xsdata for features implementing functionality from AdES and related ETSI standards.
New features and enhancements
Signing
Add support for /ContactInfo, /Prop_AuthTime and /Prop_AuthType.
Validation
Experimental support for AdES validation reports (requires new etsi optional deps).
New API function for simulating PAdES-LTA validation at a time in the future; see simulate_future_ades_lta_validation().
Add support for asserting the nonrevoked status of a certificate chain.
CLI
Add --resave flag to addfields subcommand.
Bugs fixed
Fixed an oversight in the serialisation of the /ByteRange entry in a signature that prevented large documents from being signed correctly.
Various adjustments to the (still experimental) AdES validation API.
Various local documentation fixes.
PDF signatures that do not omit the eContent field in their encapsulated content info are now rejected as invalid.
Miscellaneous
Include PyPDF2 licence file in package metadata.
Cleaned up loading logic in PdfFileReader. The most important impact of this change is that structural errors in the encryption dictionary will now cause exceptions to be thrown when decryption is attempted, not in the __init__ function.
Release date: 2023-09-17
Dependency changes
Upgrade pyhanko-certvalidator to 0.24.x
Miscellaneous
Tolerate missing D: in date strings (see PR #296 <https://github.com/MatthiasValvekens/pyHanko/issues/296>).
Various minor documentation improvements.
Release workflow dependency bumps and minor improvements.
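As an illustration of what tolerating a missing D: prefix can look like, here is a minimal sketch of a lenient PDF date parser. The function name parse_pdf_date_lenient and the regular expression are hypothetical and written for this example only; they are not pyHanko's actual parsing code.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern for illustration; not pyHanko's actual parser.
_PDF_DATE = re.compile(
    r"^(?:D:)?"                              # the "D:" prefix is optional here
    r"(\d{4})(\d{2})?(\d{2})?"               # year, then optional month and day
    r"(\d{2})?(\d{2})?(\d{2})?"              # optional hour, minute, second
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"   # optional UTC marker or offset
)


def parse_pdf_date_lenient(value: str) -> datetime:
    """Parse a PDF date string, with or without the leading 'D:'."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    (year, month, day, hour, minute, second,
     zulu, sign, off_h, off_m) = match.groups()
    tz = None
    if zulu:
        tz = timezone.utc
    elif sign:
        delta = timedelta(hours=int(off_h), minutes=int(off_m or 0))
        tz = timezone(delta if sign == "+" else -delta)
    # Missing components fall back to the earliest valid value.
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )
```

With this sketch, "D:20230917120000Z" and "20230917120000Z" parse to the same timestamp.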
Release date: 2023-07-28
Dependency changes
Relax upper bound on uharfbuzz to <0.38.0 (allows more users to benefit from prebuilt wheels).
Bump python-barcode from 0.14.0 to 0.15.1.
Bump pytest-asyncio from 0.21.0 to 0.21.1.
Relax pytest-cov bound to allow 4.1.x
Miscellaneous
Various minor documentation improvements.
Improved unit test coverage, especially for error handling.
Release date: 2023-06-18
Dependency changes
Bump pyhanko-certvalidator to 0.23.0
certomancer updated to 0.11.0, certomancer-csc-dummy to 0.2.2
Breaking changes
Minor reorganisation of the EnvelopeKeyDecrypter. The change turns the cert property from an attribute into an abstract property, and adds a method to allow us to handle protocols based on key agreement in addition to key transport. Implementations need not implement both.
Move ignore_key_usage into the new RecipientEncryptionPolicy class.
New features and enhancements
Encryption
Support RSAES-OAEP for file encryption with the public-key security handler. This is not widely supported by PDF viewers in the wild.
Support some ECDH-based key exchange methods for file encryption with the public-key security handler. Concretely, pyHanko now supports the dhSinglePass-stdDH-sha*kdf family from RFC 5753, which is also implemented in Acrobat (for NIST curves). X25519 and X448 are also included.
CLI
Better UX for argument errors relating to visible signature creation.
Bugs fixed
Allow processing OCSP responses without nextUpdate.
Run non-cryptographic CLI commands in nonstrict mode.
Treat nulls the same as missing entries in dictionaries, as required by the standard.
Fix several default stamp style selection issues in CLI
Release date: 2023-04-29
Dependency changes
Remove dependency on pytz with fallback to backports.zoneinfo
Bump tzlocal version to 4.3.
Do not rely on deprecated timezone API anymore in the tests. See PR #257.
Release date: 2023-04-26
Note
This is largely a maintenance release in the sense that it adds relatively little in the way of core features, but it nevertheless comes with some major reorganisation and work to address technical debt.
This release also marks pyHanko’s move to beta status. That doesn’t mean that it’s feature-complete in every respect, but it does mean that we’ve now entered a stabilisation phase in anticipation of the 1.0.0 release, so until then the focus will be on fixing bugs and clearing up issues in the documentation (in particular regarding the API contract). After the 1.0.0 release, pyHanko will simply follow SemVer.
Breaking changes
Some changes have been made to the Signer class. For all practical purposes, these are mostly relevant for custom Signer implementations. Regular users should see fairly little impact.
The arguments to __init__ have been made keyword-only.
Several attributes have been turned into read-only properties. This change was made to better reflect the way the properties were used internally, and makes it easier to set expectations for the API: it doesn’t make sense to allow arbitrary modifications to these properties for all Signer implementations. The parameters to __init__ have been extended to allow setting defaults more cleanly. Implementation-wise, the properties are backed by an underscored internal variable (e.g. _signing_cert for signing_cert). Subclasses can of course still elect to make some of these read-only properties writable by declaring setters.
get_signature_mechanism was renamed to get_signature_mechanism_for_digest() to make it clearer that it does more than just fetch the underlying value of signature_mechanism.
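Concretely, this means that init logic of the form

    from asn1crypto import algos, x509
    from pyhanko.sign.signers import Signer
    from pyhanko_certvalidator.registry import CertificateStore

    class MySigner(Signer):
        def __init__(
            self,
            signing_cert: x509.Certificate,
            cert_registry: CertificateStore,
            signature_mechanism: algos.SignedDigestAlgorithm,
            *args, **kwargs
        ):
            # Old style: assign directly to the public attributes.
            self.signing_cert = signing_cert
            self.cert_registry = cert_registry
            self.signature_mechanism = signature_mechanism
            super().__init__()

needs to be rewritten as

    class MySigner(Signer):
        def __init__(
            self,
            signing_cert: x509.Certificate,
            cert_registry: CertificateStore,
            signature_mechanism: algos.SignedDigestAlgorithm,
            *args, **kwargs
        ):
            # New style: assign to the underscored backing variables.
            self._signing_cert = signing_cert
            self._cert_registry = cert_registry
            self._signature_mechanism = signature_mechanism
            super().__init__()

or, alternatively, as

    class MySigner(Signer):
        def __init__(
            self,
            signing_cert: x509.Certificate,
            cert_registry: CertificateStore,
            signature_mechanism: algos.SignedDigestAlgorithm,
            *args, **kwargs
        ):
            # New style: pass the values as keyword arguments.
            super().__init__(
                signing_cert=signing_cert,
                cert_registry=cert_registry,
                signature_mechanism=signature_mechanism
            )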
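The remark that subclasses can make read-only properties writable by declaring setters can be sketched in plain Python. The classes below are hypothetical stand-ins that only mirror the pattern described here (keyword-only __init__ arguments, underscored backing variable, read-only property); they are not pyHanko's Signer.

```python
class BaseSignerSketch:
    """Hypothetical stand-in for a base class; not pyHanko's Signer."""

    def __init__(self, *, signing_cert=None):
        # Keyword-only argument, stored in an underscored backing variable.
        self._signing_cert = signing_cert

    @property
    def signing_cert(self):
        # Read-only: the base class declares no setter.
        return self._signing_cert


class WritableSignerSketch(BaseSignerSketch):
    """Subclass that opts in to a writable signing_cert."""

    @property
    def signing_cert(self):
        return self._signing_cert

    @signing_cert.setter
    def signing_cert(self, value):
        self._signing_cert = value
```

Assigning to signing_cert on a BaseSignerSketch instance raises AttributeError, while the same assignment on a WritableSignerSketch instance updates the backing variable.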
Other than these, there have been some miscellaneous changes.
The CLI no longer allows signing files encrypted using public-key encryption targeted towards the signer’s certificate, because that feature didn’t make much sense in key management terms, was rarely used, and hard to integrate with the new plugin system.
APIs with status_cls parameters have made certain args keyword-only for strict type checking purposes.
Move add_content_to_page to add_to_page() to deal with a (conceptual) circular dependency between modules.
CertificateStore is no longer reexported by pyhanko.sign.general.
The BEIDSigner no longer allows convenient access to the authentication certificate.
Packaging-wise, underscores have been replaced with hyphens in optional dependency groups.
In pyhanko_certvalidator, InvalidCertificateError is no longer a subclass of PathValidationError.
Finally, some internal refactoring took place as well:
The cli.py module was refactored into a new subpackage (pyhanko.cli) and is now also tested systematically.
CLI config classes have been refactored, some configuration was moved to the new pyhanko.config package.
Time tolerance config now passes around timedelta objects instead of second values.
The qualify() function in the difference analysis has been split into qualify() and qualify_transforming().
Organisational changes
Certificate and key loading was moved to a new pyhanko.keys module, but pyhanko.sign.general still reexports the relevant functions for backwards compatibility. Concretely, the affected functions are
Onboarded mypy and flagged pyHanko as a typed library by adding py.typed.
Package metadata and tooling settings have now been centralised to pyproject.toml. Other configuration files like setup.py, requirements.txt and most tool-specific config have been eliminated.
The docstring-based documentation for pyhanko_certvalidator was added to the API reference.
Some non-autogenerated API reference documentation pages were consolidated to reduce the sprawl.
Heavily reworked the CI/CD pipeline. PyHanko releases are now published via GitHub Actions and signed with Sigstore. GPG signatures will continue to be provided for the time being.
Dependency changes
Bump pyhanko-certvalidator to 0.22.0.
Relax the upper bound on uharfbuzz for better Python 3.11 support
Bugs fixed
The AdES LTA validator now tolerates documents that don’t have a DSS (assuming that all the required information is otherwise present).
Ensure that the trusted attribute on SignatureStatus is not set if the validation path is not actually available.
Correct the typing on validation_path.
Fix several result presentation bugs in the AdES code.
Fix overeager sharing of POEManager objects in AdES code.
Correct algo policy handling in AdES-with-time validation.
Ensure that container_ref is also populated on past versions of the trailer dictionary.
New features and enhancements
Signing
The CLI now features plugins! All current addsig subcommands have been reimplemented to use the plugin interface. Other plugins will be auto-detected through package entry points.
Validation
Refine algorithm policy handling; put in place a subclass of AlgorithmUsagePolicy specifically for CMS validation; see CMSAlgorithmUsagePolicy.
Try to remember paths when validation fails.
Make certificates from local CMS context available during path building for past certificate validation (subject to PoE checks).
Move docmdp_ok up in the hierarchy to ModificationInfo.
0.17.2
Release date: 2023-03-10
Note
This is a follow-up on yesterday’s bugfix release, addressing a number of similar issues.
Bugs fixed
Address another potential infinite loop in the comment processing logic.
Fix some (rather esoteric) correctness issues w.r.t. PDF whitespace.
Release date: 2023-03-09
Note
This is a maintenance release without significant functionality changes. It contains a bugfix, addresses some documentation issues and applies the Black formatter to the codebase.
Bugs fixed
Address a potential infinite loop in the PDF parsing logic. See PR #237.
Release date: 2023-01-31
Note
This is a bit of an odd release. It comes with relatively few functional changes or enhancements to existing features, but it has nevertheless been in the works for quite a long time.
In early 2022, I decided that the time was right to equip pyHanko with its own AdES validation engine, implementing the machinery specified by ETSI EN 319 102-1. I knew ahead of time that this would not be an easy task:
PyHanko’s own validation code was put together in a fairly ad-hoc manner starting from the provisions in the CMS specification, so some refactoring would be necessary.
pyhanko-certvalidator also was never designed to be anything more than an RFC 5280 validation engine, and retrofitting the fine-tuning required by the AdES spec definitely wasn’t easy.
Initially, I estimated that this effort would take a few months tops. Yet here we are, approximately one year down the road: pyhanko.sign.validation.ades.
Truth be told, the implementation isn’t yet ready for prime time, but it is in a state where it’s at least useful for experimentation purposes, and can be iterated on. Also, given the volume of subtle changes and far-reaching refactoring in the internals of both the pyhanko and pyhanko-certvalidator packages, continually rebasing the feature/ades-validation feature branch turned into a chore quite quickly.
So, if you’re keen to start playing around with AdES validation: please do so, and let me know what you think. If standards-based validation is not something you care about, feel free to disregard everything I wrote above, it almost certainly won’t affect any of your code.
My plan is to incrementally build upon and polish the code in pyhanko.sign.validation.ades, and eventually deprecate the current ad-hoc LTV validation logic in pyhanko.sign.validation.ltv.async_validate_pdf_ltv_signature(). That’s still a ways off from now, though.
Dependency updates
pyhanko-certvalidator updated to 0.20.0
Breaking changes
There are various changes in the validation internals that are not backwards compatible, but all of those concern internal APIs.
There are some noteworthy changes to the pyhanko-certvalidator API. Those are documented in the change log. Most of these do not affect basic usage.
New features and enhancements
Validation
Experimental AdES validation engine pyhanko.sign.validation.ades.
In the status API, make a more meaningful distinction between valid and intact, and document that distinction.
0.16.0
Release date: 2022-12-21
Dependency updates
pyhanko-certvalidator updated to 0.19.8
Breaking changes
This release includes breaking changes to the difference analysis engine. Unless you’re implementing your own difference analysis policies, this change should not break your API usage.
New features and enhancements
Signing
Add support for Prop_Build metadata in signatures. See PR #192
Validation
Improvements to the difference analysis engine that allow more nuance to be expressed in the rule system.
Bugs fixed
Tolerate an indirect Extensions and MarkInfo dictionary in difference analysis. See PR #177.
Gracefully handle unreadable/undecodable producer strings.
Release date: 2022-10-27
Note
This release adds Python 3.11 to the list of supported Python versions.
Dependency updates
pyhanko-certvalidator updated to 0.19.6
certomancer updated to 0.9.1
Bugs fixed
Be more tolerant towards deviations from DER restrictions in signed attributes when validating signatures.
Release date: 2022-10-11
Note
Other than a few bug fixes, the highlight of this release is the addition of support for two very recently published PDF extension standards, ISO/TS 32001 and ISO/TS 32002.
Bugs fixed
Fix metadata handling in encrypted documents; see issue #160.
Make sure XMP stream dictionaries contain the required typing entries.
Respect visible_sig_settings on field autocreation.
Fix a division by zero corner case in the stamp layout code; see issue #170.
New features and enhancements
Signing
Add support for the new PDF extensions defined by ISO/TS 32001 and ISO/TS 32002; see PR #169.
SHA-3 support
EdDSA support for both the PKCS#11 signer and the in-memory signer
Auto-register developer extensions in the file
Make it easier to extract keys from bytes objects.
Validation
Add support for validating EdDSA signatures (as defined in ISO/TS 32002)
0.14.0
Release date: 2022-09-17
Note
This release contains a mixture of minor and major changes. Of particular note is the addition of automated metadata management support, including XMP metadata. This change affects almost every PDF write operation in the background. While pyHanko has very good test coverage, some instability and regressions may ensue. Bug reports are obviously welcome.
Breaking changes
The breaking changes in this release are all relatively minor. Chances are that your code isn’t affected at all, other than perhaps by the change to PreparedByteRangeDigest.
md_algorithm attribute removed from PreparedByteRangeDigest since it wasn’t necessary for further processing.
Low-level change in raw_get for PDF container object types (ArrayObject and DictionaryObject): the decrypt parameter is no longer a boolean, but a tri-state enum value of type EncryptedObjAccess.
Developer extension management API moved into pyhanko.pdf_utils.extensions.
get_courier() convenience function moved into pyhanko.pdf_utils.font.basic and now takes a mandatory writer argument.
The token_label attribute was removed from PKCS11SignatureConfig, but will still be parsed (with a deprecation warning).
The prompt_pin attribute in PKCS11SignatureConfig was changed from a bool to an enum. See PKCS11PinEntryMode.
Dependency updates
pytest-aiohttp updated to 1.0.4
certomancer updated to 0.9.0
certomancer-csc-dummy updated to 0.2.1
Relax bounds on uharfbuzz to allow everything up to the current version (i.e. 0.30.0) as well.
New optional dependency group xmp, which for now only contains defusedxml
Bugs fixed
Allow certificates with no CN in the certificate subject.
The extension dictionary handling logic can now deal with encrypted documents without actually decrypting the document contents.
Fix processing error when passing empty strings to uharfbuzz; see issue #132.
Use proper PDF text string serialisation routine in simple font handler, to ensure everything is escaped correctly.
Ensure that output_version is set to at least the input version in incrementally updated files.
New features and enhancements
Signing
Drop the requirement for signing_cert to be set from the start of the signing process in an interrupted signing workflow. This has come up on several occasions in the past, since it’s necessary in remote signing scenarios where the certificate is generated or provided on-demand when submitting the document digest to the signing service. See pull #141 for details.
Add convenience API to set the /TU entry on a signature field; see readable_field_name.
Allow greater control over the initialisation of document timestamp fields.
New class hierarchy for (un)signed attribute provisioning; see SignedAttributeProviderSpec and UnsignedAttributeProviderSpec.
Allow greater control over annotation flags for visible signatures. This is implemented using VisibleSigSettings. See discussion #150.
Factor out and improve PKCS#11 token finding; see TokenCriteria and issue #149.
Factor out and improve PKCS#11 mechanism selection, allowing more raw modes.
Change pin entry settings for PKCS#11 to be more granular, in order to also allow PROTECTED_AUTH; see issue #133.
Allow the PKCS#11 PIN to be sourced from an environment variable when pyHanko is invoked through the CLI and no PIN is provided in the configuration. PyHanko will now first check the PYHANKO_PKCS11_PIN variable before prompting for a PIN. This also works when prompting for PIN entry is disabled altogether.
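The PIN lookup order described in the last item (configuration first, then the PYHANKO_PKCS11_PIN environment variable, then an interactive prompt if prompting is allowed) can be sketched as follows. The function resolve_pin and its parameters are hypothetical and exist only for this illustration; only the environment variable name comes from the release note.

```python
import getpass
import os
from typing import Callable, Mapping, Optional

# Variable name taken from the release note above.
PIN_ENV_VAR = "PYHANKO_PKCS11_PIN"


def resolve_pin(
    configured_pin: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    allow_prompt: bool = True,
    prompt: Callable[[str], str] = getpass.getpass,
) -> Optional[str]:
    """Hypothetical helper showing the lookup order, not pyHanko's code."""
    # 1. A PIN from the configuration always wins.
    if configured_pin is not None:
        return configured_pin
    # 2. Otherwise consult the environment, even if prompting is disabled.
    env = os.environ if env is None else env
    env_pin = env.get(PIN_ENV_VAR)
    if env_pin is not None:
        return env_pin
    # 3. Only prompt interactively when that is allowed.
    if allow_prompt:
        return prompt("PKCS#11 PIN: ")
    return None
```

Passing env and prompt as parameters is a design choice of this sketch: it keeps the lookup order easy to exercise without touching the real environment or terminal.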
Note
The PKCS#11 code is now also tested in CI, using SoftHSMv2.
Validation
Allow validation time overrides in the CLI. Passing in the special value claimed tells pyHanko to take the stated signing time in the file at face value. See issue #130.
Encryption
Also return permissions on owner access to allow for easier inspection.
Better version enforcement for security handlers.
Layout
Allow metrics to be specified for simple fonts.
Provide metrics for default Courier font.
Experimental option that allows graphics to be embedded in the central area of the QR code; see qr_inner_content.
Miscellaneous
Basic XMP metadata support with optional xmp dependency group.
Automated metadata management (document info dictionary, XMP metadata).
Refactor some low-level digesting and CMS validation code.
Make the CLI print a warning when the key passphrase is left empty.
Tweak configuration management utilities to better cope with fallback logic for deprecated configuration parameters.
Move all cross-reference writing logic into pyhanko.pdf_utils.xref.
Improve error classes and error reporting in the CLI so that errors in non-verbose mode still provide a little more info.
0.13.2
Release date: 2022-07-02
Note
This is a patch release to address some dependency issues and bugs.
Dependency updates
python-barcode updated and pinned to 0.14.0.
Bugs fixed
Fix lack of newline after XRef stream header.
Do not write DigestMethod in signature reference dictionaries (deprecated/nonfunctional entry).
Make pyhanko.pdf_utils.writer.copy_into_new_writer() more flexible by allowing caller-specified keyword arguments for the writer object.
Refine settings for invisible signature fields (see pyhanko.sign.fields.InvisSigSettings).
Correctly read objects from object streams in encrypted documents.
Release date: 2022-05-01
Note
This is a patch release to update fontTools and uharfbuzz to address a conflict between the latest fontTools and older uharfbuzz versions.
Dependency updates
fontTools updated to 4.33.3
uharfbuzz updated to 0.25.0
Release date: 2022-04-25
Note
Like the previous two releases, this is largely a maintenance release.
Dependency updates
asn1crypto updated to 1.5.1
pyhanko-certvalidator updated to 0.19.5
certomancer updated to 0.8.2
Depend on certomancer-csc-dummy for tests; get rid of python-pae test dependency.
Bugs fixed
Various parsing robustness improvements.
Be consistent with security handler version bounds.
Improve coverage of encryption code.
Ensure owner password gets prioritised in the legacy security handler.
New features and enhancements
Miscellaneous
Replaced some ValueError usages with PdfError
Improvements to error handling in strict mode.
Make CLI stack traces less noisy by default.
Encryption
Refactor internal crypt module into package.
Add support for serialising credentials.
Cleaner credential inheritance for incremental writers.
Signing
Allow post-signing actions on encrypted files with serialised credentials.
Improve --use-pades-lta ergonomics in CLI.
Add --no-pass parameter to pemder CLI.
Validation
Preparatory scaffolding for AdES status reporting.
Provide some tolerance against malformed ACs.
Increase robustness against invalid DNs.
0.12.1
Release date: 2022-02-26
Dependency updates
uharfbuzz updated to 0.19.0
pyhanko-certvalidator updated to 0.19.4
certomancer updated to 0.8.1
Bugs fixed
Fix typing issue in DSS reading logic (see issue #81)
Release date: 2022-01-26
Note
This is largely a maintenance release, and contains no new high-level features or public API changes. As such, upgrading is strongly recommended.
The most significant change is the (rather minimalistic) support for hybrid reference files. Since working with hybrid reference files means dealing with potential ambiguity (which is dangerous when dealing with signatures), creation and validation of signatures in hybrid reference documents is only enabled in nonstrict mode. Hybrid reference files are relatively rare these days, but the internals need to be able to cope with them either way, in order to be able to update such files safely.
New features and enhancements
Miscellaneous
Significant refactor of cross-reference parsing internals. This doesn’t affect any public API entrypoints, but read the reference documentation for pyhanko.pdf_utils.xref if you happen to have code that directly relies on that internal logic.
Minimal support for hybrid reference files.
Add strict flag to IncrementalPdfFileWriter.
Expose --no-strict-syntax CLI flag in the addsig subcommand.
Bugs fixed
Ensure that signature appearance bounding boxes are rounded to a reasonable precision. Failure to do so caused issues with some viewers.
To be consistent with the purpose of the strictness flag, non-essential xref consistency checking is now only enabled when running in strict mode (which is the default).
The hybrid reference support indirectly fixes some potential silent file corruption issues that could arise when working on particularly ill-behaved hybrid reference files.
Release date: 2021-12-23
Dependency changes
Update pyhanko-certvalidator to 0.19.2
Bump fontTools to 4.28.2
Update certomancer test dependency to 0.7.1
Breaking changes
Due to import order issues resulting from refactoring of the validation code, some classes and class hierarchies in the higher-level API had to be moved. The affected classes are listed below, with links to their respective new locations in the API reference.
The low-level function validate_sig_integrity() was also moved.
New features and enhancements
Signing
Support embedding attribute certificates into CMS signatures, either in the certificates field or using the CAdES signer-attrs-v2 attribute.
More explicit errors on unfulfilled text parameters
Better use of asyncio when collecting validation information for timestamps
Internally disambiguate PAdES and CAdES for the purpose of attribute handling.
Validation
Refactor diff_analysis module into sub-package
Refactor validation module into sub-package (together with portions of pyhanko.sign.general); see Breaking changes.
Make extracted certificate information more easily accessible.
Integrated attribute certificate validation (requires a separate validation context with trust roots for attribute authorities)
Report on signer attributes as supplied by the CAdES signer-attrs-v2 attribute.
Miscellaneous
Various parsing and error handling improvements to xref processing, object streams, and object header handling.
Use NotImplementedError for unimplemented stream filters instead of less-appropriate exceptions
Always drop GPOS/GDEF/GSUB when subsetting OpenType and TrueType fonts
Initial support for string-keyed CFF fonts as CIDFonts (subsetting is still inefficient)
copy_into_new_writer() is now smarter about how it deals with the /Producer line
Fix a typo in the ASN.1 definition of signature-policy-store
Various, largely aesthetic, cleanup & docstring fixes in internal APIs
Bugs fixed
Fix a critical bug in content timestamp generation causing the wrong message imprint to be sent to the timestamping service. The bug only affected the signed content-time-stamp attribute from CAdES, not the (much more widely used) signature-time-stamp attribute. The former timestamps the content (and is part of the signed data), while the latter timestamps the signature (and is therefore not part of the signed data).
Fix a bug causing an empty unsigned attribute sequence to be written if there were no unsigned attributes. This is not allowed (although many validators accept it), and was a regression introduced in 0.9.0.
Ensure non-PDF CAdES signatures always have signingTime set.
Fix and improve timestamp summary reporting
Corrected TrueType subtype handling
Properly set ts_validation_paths
Gracefully deal with unsupported certificate types in CMS
Ensure attribute inspection internals can deal with SignerInfo without signedAttrs.
Release date: 2021-11-28
Dependency changes
Update pyhanko-certvalidator to 0.18.0
Update aiohttp to 3.8.0 (optional dependency)
Introduce python-pae==0.1.0 (tests)
New features and enhancements
Signing
There’s a new Signer implementation that allows pyHanko to be used with remote signing services that implement the Cloud Signature Consortium API. Since auth handling differs from vendor to vendor, using this feature still requires the caller to supply an authentication handler implementation; see pyhanko.sign.signers.csc_signer for more information. This feature is currently incubating.
Validation
Add CLI option to skip diff analysis.
Add CLI flag to disable strict syntax checks.
Use chunked digests while validating.
Improved difference analysis logging.
Miscellaneous
Better handling of nonexistent objects: clearer errors in strict mode, better fallback behaviour in nonstrict mode. This applies to both regular object dereferencing and xref history analysis.
Added many new tests for various edge cases, mainly in validation code.
Added Python :: 3 and Python :: 3.10 classifiers to distribution.
Bugs fixed
Fix bug in output handler in timestamp updater that caused empty output in some configurations.
Fix a config parsing error when no stamp styles are defined in the configuration file.
Release date: 2021-10-31
Dependency changes
Update pyhanko-certvalidator to 0.17.3
Update fontTools to 4.27.1
Update certomancer to 0.6.0 (tests)
Introduce pytest-aiohttp~=0.3.0 and aiohttp>=3.7.4 (tests)
API-breaking changes
This is a pretty big release, with a number of far-reaching changes in the lower levels of the API that may cause breakage.
Much of pyHanko’s internal logic has been refactored to prefer asynchronous I/O wherever possible (pyhanko-certvalidator was also refactored accordingly). Some compromises were made to allow non-async-aware code to continue working as-is.
If you’d like a quick overview of how you can take advantage of the new asynchronous library functions, take a look at this section in the signing docs.
Here’s an overview of low-level functionality that changed:
CMS signing logic was refactored and made asynchronous (only relevant if you implemented your own custom signers)
Time stamp client API was refactored and made asynchronous (only relevant if you implemented your own time stamping clients)
The interrupted signing workflow now involves more asyncio as well.
perform_presign_validation() was made asynchronous.
prepare_tbs_document(): the bytes_reserved parameter is mandatory now.
post_signature_processing() was made asynchronous.
collect_validation_info() was made asynchronous.
Other functions have been deprecated in favour of asynchronous equivalents; such deprecations are documented in the API reference. The section on extending Signer has also been updated.
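Letting non-async-aware code keep working on top of an async-first core is commonly done with a thin synchronous wrapper around the coroutine. The sketch below is a generic illustration of that pattern; async_sign_sketch and sign_sketch are hypothetical names, and this is not a description of how pyHanko implements its own compatibility layer.

```python
import asyncio


async def async_sign_sketch(data: bytes) -> bytes:
    """Hypothetical async-first operation standing in for real I/O."""
    await asyncio.sleep(0)  # placeholder for network or token access
    return b"signed:" + data


def sign_sketch(data: bytes) -> bytes:
    """Synchronous convenience wrapper for callers without an event loop."""
    # asyncio.run() starts a fresh event loop and must not be called
    # from code that is already running inside one.
    return asyncio.run(async_sign_sketch(data))
```

Code that already runs inside an event loop would await async_sign_sketch() directly instead of going through the wrapper.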
Warning
Even though we have pretty good test coverage, due to the volume of changes, some instability may ensue. Please do not hesitate to report bugs on the issue tracker!
New features and enhancements
Signing
Async-first signing API
Relax token-label requirements in PKCS#11 config, allowing slot-no as an alternative
Allow selecting keys and certificates by ID in the PKCS#11 signer
Allow the signer’s certificate to be sourced from a file in the PKCS#11 signer
Allow BeID module path to be specified in config
Tweak cert querying logic in PKCS#11 signer
Add support for raw ECDSA to the PKCS#11 signer
Basic DSA support (for completeness w.r.t. ISO 32000)
Choose a default message digest more cleverly, based on the signing algorithm and key size
Fail loudly when trying to add a certifying signature to an already-signed document using the high-level signing API
Provide a flag to skip embedding root certificates
Validation
Async-first validation API
Use non-zero exit code on failed CLI validation
Miscellaneous
Minor reorganisation of config.py functions
Move PKCS#11 pin prompt logic to cli.py
Improve font embedding efficiency (better stream management)
Ensure idempotence of object stream flushing
Improve PKCS#11 signer logging
Make stream_xrefs=False by default in copy_into_new_writer()
Removed a piece of fallback logic for md_algorithm that relied on obsolete parts of the standard
Fixed a number of issues related to unexpected cycles in PDF structures
Bugs fixed
Treat ASCII form feed (\f) as PDF whitespace
Fix a corner case with null incremental updates
Fix some font compatibility issues (relax assumptions about the presence of certain tables/entries)
Be more tolerant when parsing name objects
Correct some issues related to DSS update validation
Correct pdf_date() output for negative UTC offsets
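Negative UTC offsets are easy to get wrong when formatting PDF date strings, because Python's divmod() floors towards negative infinity: divmod(-210, 60) yields (-4, 30) rather than 3 hours and 30 minutes. The sketch below shows one way to avoid that pitfall by splitting the sign from the magnitude. The function format_pdf_date_sketch is hypothetical, the trailing apostrophe after the minutes is an assumption about the desired output style, and none of this describes the original defect in pdf_date().

```python
from datetime import datetime, timedelta, timezone


def format_pdf_date_sketch(dt: datetime) -> str:
    """Format a datetime as a PDF date string (illustrative helper)."""
    base = dt.strftime("D:%Y%m%d%H%M%S")
    offset = dt.utcoffset()
    if offset is None:
        # Naive datetime: emit no timezone information.
        return base
    total_minutes = int(offset.total_seconds()) // 60
    if total_minutes == 0:
        return base + "Z"
    # Handle the sign separately so divmod() only sees a positive value.
    sign = "+" if total_minutes > 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{base}{sign}{hours:02d}'{minutes:02d}'"
```

The sketch assumes whole-minute offsets, which covers the time zones in practical use.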
Release date: 2021-08-23
Dependency changes
Update pyhanko-certvalidator to 0.16.0.
API-breaking changes
Some fields and method names in the config API misspelled pkcs11 as pcks11. This has been corrected in this release. This is unlikely to cause issues for library users (since the config API is primarily used by the CLI code), but it’s a breaking change all the same. If you do have code that relies on the config API, simply substituting s/pcks/pkcs/g should fix things.
New features and enhancements
Signing
Make certificate fetching in the PKCS#11 signer more flexible.
Allow passing in the signer’s certificate from outside the token.
Improve certificate registry initialisation.
Give more control over updating the DSS in complex signature workflows. By default, pyHanko now tries to update the DSS in the revision that adds a document timestamp, after the signature (if applicable). In the absence of a timestamp, the old behaviour persists.
Added a flag to (attempt to) produce CMS signature containers without any padding.
Use signing-certificate-v2 instead of signing-certificate when producing signatures.
Default to empty appearance streams for empty signature fields.
Much like the pkcs11-setups config entry, there are now pemder-setups and pkcs12-setups at the top level of pyHanko’s config file. You can use those to store arguments for the pemder and pkcs12 subcommands of pyHanko’s addsig command, together with passphrases for non-interactive use. See Named setups for on-disk key material.
Validation
Enforce the end-entity cert constraint imposed by the signing-certificate or signing-certificate-v2 attribute (if present).
Improve issuer-serial matching logic.
Improve CMS attribute lookup routines.
Encryption
Add a flag to suppress creating “legacy compatibility” entries in the encryption dictionary if they aren’t actually required or meaningful (for now, this only applies to /Length).
Miscellaneous
Lazily load the version entry in the catalog.
Minor internal I/O handling improvements.
Allow constructing an IncrementalPdfFileWriter from a PdfFileReader object.
Expose common API to modify (most) trailer entries.
Automatically recurse into all configurable fields when processing configuration data.
Replace some certificate storage/indexing classes by references to their corresponding classes in pyhanko-certvalidator.
Bugs fixed
Add /NeedAppearances in the AcroForm dictionary to the whitelist for incremental update analysis.
Fixed several bugs related to difference analysis on encrypted files.
Improve behaviour of dev extensions in difference analysis.
Fix encoding issues with SignedDigestAlgorithm, in particular ensuring that the signature mechanism encodes the relevant digest when using ECDSA.
Process passfile contents more robustly in the CLI.
Correct timestamp revinfo fetching (by ensuring that a dummy response is present)
Release date: 2021-07-25
Dependency changes
Warning
If you used OTF/TTF fonts with pyHanko prior to the 0.7.0 release, you’ll need HarfBuzz going forward. Install pyHanko with the [opentype] optional dependency group to grab everything you need.
Update pyhanko-certvalidator to 0.15.3
TrueType/OpenType support moved to new optional dependency group labelled [opentype].
Dependency on fontTools moved from core dependencies to [opentype] group.
We now use HarfBuzz (uharfbuzz==0.16.1) for text shaping with OTF/TTF fonts.
API-breaking changes
Warning
If you use any of pyHanko’s lower-level APIs, review this section carefully before updating.
Signing code refactor
This release includes a refactor of the pyhanko.sign.signers module into a package with several submodules. The original API exposed by this module is reexported in full at the package level, so existing code using pyHanko’s publicly documented signing APIs should continue to work without modification.
There is one notable exception: as part of this refactor, the low-level PdfCMSEmbedder protocol was tweaked slightly, to support the new interrupted signing workflow (see below). The required changes to existing code should be minimal; have a look at the relevant section in the library documentation for a concrete description of the changes, and an updated usage example.
In addition, if you extended the PdfSigner class, then you’ll have to adapt to the new internal signing workflow as well. This may be tricky due to the fact that the separation of concerns between different steps in the signing process is now enforced more strictly.
I’m not aware of use cases requiring PdfSigner to be extended, but if you’re having trouble migrating your custom subclass to the new API structure, feel free to open an issue.
Merely having subclassed Signer shouldn’t require you to change anything.
Fonts
The low-level font loading API has been refactored to make font resource handling less painful, to provide smoother HarfBuzz integration and to expose more OpenType tweaks in the API.
To this end, the old pyhanko.pdf_utils.font module was turned into a package containing three modules: api, basic and opentype. The api module contains the definitions for the general FontEngine and FontEngineFactory classes, together with some other general plumbing logic. The basic module provides a minimalist implementation with a (non-embedded) monospaced font. If you need TrueType/OpenType support, you’ll need the opentype module together with the optional dependencies in the [opentype] dependency group (currently fontTools and uharfbuzz, see above).
Take a look at the section for pyhanko.pdf_utils.font in the API reference documentation for further details.
For the time being, there are no plans to support embedding Type1 fonts, or to offer support for Type3 fonts at all.
Miscellaneous
The content_stream parameter was removed from import_page_as_xobject(). Content streams are now merged automatically, since treating a page content stream array non-atomically is a bad idea.
PdfSigner is no longer a subclass of PdfTimeStamper.
New features and enhancements
Signing
Interrupted signing workflow: segmented signing workflow that can be interrupted partway through and resumed later (possibly in a different process or on a different machine). Useful for dealing with signing processes that rely on user interaction and/or remote signing services.
Generic data signing support: construct CMS signedData objects for arbitrary data (not necessarily for use in PDF signature fields).
Experimental API for signing individual embedded files (nonstandard).
PKCS#11 settings can now be set in the configuration file.
Validation
Add support for validating CMS signedData structures against arbitrary payloads (see also: Generic data signing)
Streamline CMS timestamp validation.
Support reporting on (CAdES) content timestamps in addition to signature timestamps.
Allow signer certificates to be identified by the subjectKeyIdentifier extension.
Encryption
Support granular crypt filters for embedded files
Add convenient API to encrypt and wrap a PDF document as a binary blob. The resulting file will open as usual in a viewer that supports PDF collections; a fallback page with alternative instructions is shown otherwise.
Miscellaneous
Complete overhaul of appearance generation & layout system. Most of these changes are internal, except for some font loading mechanics (see above). All use of OpenType / TrueType fonts now requires the [opentype] optional dependency group. New features:
Use HarfBuzz for shaping (incl. complex scripts)
Support TrueType fonts and OpenType fonts without a CFF table.
Support vertical writing (among other OpenType features).
Use ActualText marked content in addition to ToUnicode.
Introduce simple box layout & alignment rules, and apply them uniformly across all layout decisions where possible. See pyhanko.stamp and pyhanko.pdf_utils.layout for API documentation.
Refactored stamp style dataclass hierarchy. This should not affect existing code.
Allow externally generated PDF content to be used as a stamp appearance.
Utility API for embedding files into PDF documents.
Added support for PDF developer extension declarations.
Bugs fixed
Signing
Declare ESIC extension when producing a PAdES signature on a PDF 1.x file.
Validation
Fix handling of orphaned objects in diff analysis.
Tighten up tolerances for (visible) signature field creation.
Fix typo in BaseFieldModificationRule
Deal with some VRI-related corner cases in the DSS diffing logic.
Encryption
Improve identity crypt filter behaviour when applied to text strings.
Correct handling of non-default public-key crypt filters.
Miscellaneous
Promote stream manipulation methods to base writer.
Correct some edge cases w.r.t. PDF content import
Use floats for MediaBox.
Handle escapes in PDF name objects.
Correct ToUnicode CMap formatting.
Do not close over GSUB when computing font subsets.
Fix output_version handling oversight.
Misc. export list & type annotation corrections.
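To illustrate what escape handling in PDF name objects involves: in PDF syntax, a number sign followed by two hexadecimal digits inside a name stands for the byte with that value. Below is a minimal sketch of decoding such escapes; decode_pdf_name is a hypothetical helper for illustration and does not reflect how pyHanko's parser is actually structured.

```python
import re

# A number sign followed by exactly two hex digits encodes one byte.
_NAME_ESCAPE = re.compile(rb"#([0-9A-Fa-f]{2})")


def decode_pdf_name(raw: bytes) -> bytes:
    """Decode #xx escapes in a serialised PDF name (illustrative sketch)."""
    if not raw.startswith(b"/"):
        raise ValueError("a serialised PDF name starts with a solidus")
    # Drop the leading solidus, then replace each escape by its byte value.
    return _NAME_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw[1:])


print(decode_pdf_name(b"/Lime#20Green"))
```

The sketch leaves malformed escapes (a number sign not followed by two hex digits) untouched; a stricter parser might reject them instead.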
0.6.1
Release date: 2021-05-22
Dependency changes
Update pyhanko-certvalidator to 0.15.2
Replace constraint on certomancer and pyhanko-certvalidator by soft minor version constraint (~=)
Set version bound for freezegun
Bugs fixed
Add /Q and /DA keys to the whitelist for incremental update analysis on form fields.
Release date: 2021-05-15
Dependency changes
Warning
pyHanko’s 0.6.0 release includes quite a few changes to dependencies, some of which may break compatibility with existing code. Review this section carefully before updating.
The pyhanko-certvalidator dependency was updated to 0.15.1. This update adds support for name constraints, RSASSA-PSS and EdDSA for the purposes of X.509 path validation, OCSP checking and CRL validation.
Warning
Since pyhanko-certvalidator has considerably diverged from “mainline” certvalidator, the Python package containing its modules was also renamed from certvalidator to pyhanko_certvalidator, to avoid potential namespace conflicts down the line. You should update your code to reflect this change.
Concretely,
from certvalidator import ValidationContext
turns into
from pyhanko_certvalidator import ValidationContext
in the new release.
There were several changes to dependencies with native binary components:
The Pillow dependency has been relaxed to >=7.2.0, and is now optional. The same goes for python-barcode. Image & 1D barcode support now needs to be installed explicitly using the [image-support] installation parameter.
PKCS#11 support has also been made optional, and can be added using the [pkcs11] installation parameter.
The test suite now makes use of Certomancer. This also removed the dependency on ocspbuilder.
New features and enhancements
Signing
Make preferred hash inference more robust.
Populate /AP when creating an empty visible signature field (necessary in PDF 2.0)
Validation
Timestamp and DSS handling tweaks:
Preserve OCSP resps / CRLs from validation kwargs when reading the DSS.
Gracefully process revisions that don’t have a DSS.
When creating document timestamps, the validation_context parameter is now optional.
Enforce certvalidator’s weak_hash_algos when validating PDF signatures as well. Previously, this setting only applied to certificate validation. By default, MD5 and SHA-1 are considered weak (for digital signing purposes).
Expose DocTimeStamp/Sig distinction in a more user-friendly manner.
The sig_object_type property on EmbeddedPdfSignature now returns the signature’s type as a PDF name object.
PdfFileReader now has two extra convenience properties named embedded_regular_signatures and embedded_timestamp_signatures, that return a list of all regular signatures and document timestamps, respectively.
Encryption
Refactor internal APIs in pyHanko’s security handler implementation to make them easier to extend. Note that while anyone is free to register their own crypt filters for whatever purpose, pyHanko’s security handler is still considered internal API, so behaviour is subject to change between minor version upgrades (even after 1.0.0).
Miscellaneous
Broaden the scope of --soft-revocation-check.
Corrected a typo in the signature of validate_sig_integrity.
Less opaque error message on missing PKCS#11 key handle.
Ad-hoc hash selection now relies on pyca/cryptography rather than hashlib.
Bugs fixed
Correct handling of DocMDP permissions in approval signatures.
Refactor & correct handling of SigFlags when signing prepared form fields in unsigned files.
Fixed issue with trailing whitespace and/or NUL bytes in array literals.
Corrected the export lists of various modules.
Release date: 2021-03-24
Bugs fixed
Fixed a packaging blunder that caused an import error on fresh installs.
Release date: 2021-03-22
Dependency changes
Update pyhanko-certvalidator dependency to 0.13.0.
Dependency on cryptography is now mandatory, and oscrypto has been marked optional. This is because we now use the cryptography library for all signing and encryption operations, but some cryptographic algorithms listed in the PDF standard are not available in cryptography, so we rely on oscrypto for those. This is only relevant for the decryption of files encrypted with a public-key security handler that uses DES, triple DES or RC2 to encrypt the key seed.
In the public API, we exclusively work with asn1crypto representations of ASN.1 objects, to remain as backend-independent as possible.
Note: While oscrypto is listed as optional in pyHanko’s dependency list, it is still required in practice, since pyhanko-certvalidator depends on it.
New features and enhancements
Encryption
Enforce keyEncipherment key extension by default when using public-key encryption
Show a warning when signing a document using public-key encryption through the CLI. We currently don’t support using separate encryption credentials in the CLI, and using the same key pair for decryption and signing is bad practice.
Several minor CLI updates.
Signing
Allow customisation of key usage requirements in signer & validator, also in the CLI.
Actively preserve document timestamp chain in new PAdES-LTA signatures.
Support setups where fields and annotations are separate (i.e. unmerged).
Set the lock bit in the annotation flags by default.
Tolerate signing fields that don’t have any annotation associated with them.
Broader support for PAdES / CAdES signed attributes.
Validation
Support validating PKCS #7 signatures that don’t use signedAttrs. Nowadays, those are rare in the wild, but there’s at least one common commercial PDF library that outputs such signatures by default (vendor name redacted to protect the guilty).
Timestamp-related fixes:
Improve signature vs. document timestamp handling in the validation CLI.
Improve & test handling of malformed signature dictionaries in PDF files.
Align document timestamp updating logic with validation logic.
Correct key usage check for time stamp validation.
Allow customisation of key usage requirements in signer & validator, also in the CLI.
Allow LTA update function to be used to start the timestamp chain as well as continue it.
Tolerate indirect references in signature reference dictionaries.
Improve some potential ambiguities in the PAdES-LT and PAdES-LTA validation logic.
Revocation info handling changes:
Support “retroactive” mode for revocation info (i.e. treat revocation info as valid in the past).
Added functionality to append current revocation information to existing signatures.
Related CLI updates.
Miscellaneous
Some key material loading functions were cleaned up a little to make them easier to use.
I/O tweaks: use chunked writes with a fixed buffer when copying data for an incremental update
Warn when revocation info is embedded with an offline validation context.
Improve SV validation reporting.
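The idea behind chunked writes with a fixed buffer can be sketched as follows: a single preallocated buffer is reused for every read, so memory use stays constant regardless of the size of the file being copied. The function name copy_chunked and the 4096-byte default are assumptions chosen for illustration; they are not taken from pyHanko's code.

```python
import io


def copy_chunked(src, dst, buffer_size: int = 4096) -> int:
    """Copy src to dst through one reusable buffer; return bytes copied."""
    buf = bytearray(buffer_size)
    view = memoryview(buf)
    total = 0
    while True:
        # readinto() fills the existing buffer instead of allocating
        # a new bytes object on every iteration.
        n = src.readinto(view)
        if not n:
            break
        dst.write(view[:n])
        total += n
    return total


source = io.BytesIO(b"x" * 10000)
target = io.BytesIO()
print(copy_chunked(source, target))
```

This matters for incremental updates in particular, since the original file content has to be copied in full before the update is appended.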
Bugs fixed
Fix issue with /Certs not being properly dereferenced in the DSS (#4).
Fix loss of precision on FloatObject serialisation (#5).
Add missing dunders to BooleanObject.
Do not use .dump() with force=True in validation.
Corrected digest algorithm selection in timestamp validation.
Correct handling of writes with empty user password.
Do not automatically add xref streams to the object cache. This avoids a class of bugs with some kinds of updates to files with broken xref streams.
Due to a typo, the /Annots array of a page would not get updated correctly if it was an indirect object. This has been corrected.
Release date: 2021-02-14
New features and enhancements
Encryption
Expose permission flags outside security handler
Make file encryption key straightforward to grab
Signing
Mildly refactor PdfSignedData for non-signing uses
Make DSS API more flexible
Allow direct input of cert/ocsp/CRL objects as opposed to only certvalidator output
Allow input to not be associated with any concrete VRI.
Greatly improved PKCS#11 support
Added support for RSASSA-PSS and ECDSA.
Added tests for RSA functionality using SoftHSMv2.
Added a command to the CLI for generic PKCS#11.
Note: Tests don’t run in CI, and ECDSA is not included in the test suite yet (SoftHSMv2 doesn’t seem to expose all the necessary mechanisms).
Factor out unsigned_attrs in signer, added a digest_algorithm parameter to signed_attrs.
Allow signing with any BasePdfFileWriter (in particular, this allows creating signatures in the initial revision of a PDF file)
Add CMSAlgorithmProtection attribute when possible. Note: Not added to PAdES signatures for the time being.
Improved support for deep fields in the form hierarchy (arguably orthogonal to the standard, but it doesn’t hurt to be flexible)
Validation
Path handling improvements:
Paths in the structure tree are also simplified.
Paths can be resolved relative to objects in a file.
Limited support for tagged PDF in the validator.
Existing form fields can be filled in without tripping up the modification analysis module.
Adding new form fields to the structure tree after signing is not allowed for the time being.
Internal refactoring in CMS validation logic:
Isolate cryptographic integrity validation from trust validation
Rename externally_invalid API parameter to encap_data_invalid
Validate CMSAlgorithmProtection when present.
Improved support for deep fields in the form hierarchy (arguably orthogonal to the standard, but it doesn’t hurt to be flexible).
Miscellaneous
Export copy_into_new_writer.
Transparently handle non-seekable output streams in the signer.
Remove unused __iadd__ implementation from VRI class.
Clean up some corner cases in container_ref handling.
Refactored SignatureFormField initialisation (internal API).
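Signing needs to go back and fill in the signature contents after the rest of the document has been written, which requires a seekable stream. One generic way to cope with a non-seekable output is to do the work in an in-memory buffer and flush it in one go at the end. The sketch below illustrates that pattern under this assumption; run_with_seekable_output and the produce callback are hypothetical names, and this is not a description of pyHanko's internals.

```python
import io


def run_with_seekable_output(produce, output) -> None:
    """Call produce(stream) with a seekable stream, then deliver to output."""
    if output.seekable():
        produce(output)
        return
    # Fall back to an in-memory buffer that supports seeking, and copy
    # the finished result to the real output in a single write.
    buffer = io.BytesIO()
    produce(buffer)
    output.write(buffer.getvalue())


def write_with_placeholder(stream) -> None:
    """Write a placeholder first, then seek back and patch it."""
    stream.write(b"0000END")
    stream.seek(0)
    stream.write(b"1234")


out = io.BytesIO()
run_with_seekable_output(write_with_placeholder, out)
print(out.getvalue())
```

The trade-off is that the whole output is held in memory when the fallback is used.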
Bugs fixed
Deal with some XRef processing edge cases.
Make signed_revision on embedded signatures more robust.
Fix an issue where DocTimeStamp additions would trigger /All-type field locks.
Fix some issues with modification_level handling in validation status reports.
Fix a few logging calls.
Fix some minor issues with signing API input validation logic.
Release date: 2021-01-26
New features and enhancements
Encryption
Reworked internal crypto API.
Added support for PDF 2.0 encryption.
Added support for public key encryption.
Got rid of the homegrown RC4 class (not that it matters all too much, RC4 isn’t secure anyhow); all cryptographic operations in crypt.py are now delegated to oscrypto.
Signing
Encrypted files can now be signed from the CLI.
With the optional cryptography dependency, pyHanko can now create RSASSA-PSS signatures.
Factored out a low-level PdfCMSEmbedder API to cater to remote signing needs.
Miscellaneous
The document ID can now be accessed more conveniently.
The version number is now single-sourced in version.py.
Initialising the page tree in a PdfFileWriter is now optional.
Added a convenience function for copying files.
Validation
With the optional cryptography dependency, pyHanko can now validate RSASSA-PSS signatures.
Difference analysis checker was upgraded with capabilities to handle multiply referenced objects in a more straightforward way. This required API changes, and it comes at a significant performance cost, but the added cost is probably justified. The changes to the API are limited to the diff_analysis module itself, and do not impact the general validation API whatsoever.
Bugs fixed
Allow /DR and /Version updates in diff analysis
Fix revision handling in trailer.flatten()
Release date: 2021-01-10
New features and enhancements
Signing
Allow the caller to specify an output stream when signing.
Validation
The incremental update analysis functionality has been heavily refactored into something more rule-based and modular. The new difference analysis system is also much more user-configurable, and a (sufficiently motivated) library user could even plug in their own implementation.
The new validation system treats /Metadata updates more correctly, and fixes a number of other minor stability problems.
Improved validation logging and status reporting mechanisms.
Improved seed value constraint enforcement support: this includes added support for /V, /MDP, /LockDocument, /KeyUsage and (passive) support for /AppearanceFilter and /LegalAttestation.
CLI
You can now specify negative page numbers on the command line to refer to the pages of a document in reverse order.
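As an illustration of how negative page numbers can be mapped onto a document, here is a minimal sketch. It assumes that positive page numbers are 1-based and that -1 denotes the last page; resolve_page_index is a hypothetical helper, not the function the CLI uses.

```python
def resolve_page_index(page_number: int, total_pages: int) -> int:
    """Map a page number (negatives count from the end) to a 0-based index."""
    if page_number == 0 or abs(page_number) > total_pages:
        raise ValueError(
            f"page {page_number} is out of range for {total_pages} pages"
        )
    if page_number > 0:
        # Assumed convention: page 1 is the first page.
        return page_number - 1
    # -1 is the last page, -2 the one before it, and so on.
    return total_pages + page_number


print(resolve_page_index(-1, 10))
```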
General PDF API
Added convenience functions to retrieve references from dictionaries and arrays.
Tweaked handling of object freeing operations; these now produce PDF null objects instead of (Python) None.
Bugs fixed
root_ref now consistently returns a Reference object
Corrected wrong usage of @freeze_time in tests that caused some failures due to certificate expiry issues.
Fixed a gnarly caching bug in HistoricalResolver that sometimes leaked state from later revisions into older ones.
Prevented cross-reference stream updates from accidentally being saved with the same settings as their predecessor in the file. This was a problem when updating files generated by other PDF processing software.
Release date: 2020-12-30
Initial release.