openpgp-notes/book/source/04-certificates.md
Heiko Schaefer 59d956c706
minor fixes
2023-12-07 22:14:27 +01:00

(certificates_chapter)=

Certificates

OpenPGP fundamentally hinges on the concept of "{term}OpenPGP certificates<OpenPGP Certificate>," also known as "{term}OpenPGP public keys<OpenPGP Public Key>." These {term}certificates<OpenPGP Certificate> are complex data structures essential for {term}identity verification, data encryption, and {term}digital signatures<OpenPGP Signature Packet>. Understanding their structure and function is pivotal to effectively applying the OpenPGP standard.

An {term}OpenPGP certificate, by definition, does not contain {term}private key material.

Fundamentally, the effective management of {term}certificates<OpenPGP Certificate> and a thorough grasp of their {term}authentication and {term}trust models<Trust Model> are crucial for proficient OpenPGP usage. Although this document offers just a brief overview of these aspects, they form a fundamental part of the broader OpenPGP framework and warrant further study.

  • For an in-depth exploration of OpenPGP's {term}private key material, refer to {ref}private_key_chapter. This chapter provides essential insights into {term}private key<Transferable Secret Key> management and security practices.

  • The bindings that link the {term}components<Component> of a {term}certificate<OpenPGP Certificate> are comprehensively discussed in {ref}component_signatures_chapter, offering a deeper understanding of {term}certificate<OpenPGP Certificate> structure and integrity.

  • Finally, our chapter {ref}zoom_certificates discusses the internal structure of {term}certificates<OpenPGP Certificate> in detail.

Terminology: Understanding "keys"

The term "{term}(cryptographic) keys<Cryptographic Key>" is central to grasping the concept of {term}OpenPGP certificates<OpenPGP certificate>. However, it can refer to different entities, making it a potentially confusing term. Let's clarify those differences.

Public vs. private keys

The term "{term}key," without additional context, can refer to either public or private {term}asymmetric<Asymmetric Cryptography> key material. Additionally, {term}symmetric<Symmetric Cryptography> keys may be used in OpenPGP to encrypt {term}private key material, adding a layer of security and complexity.

(layers_of_keys_in_openpgp)=

Layers of keys in OpenPGP

In OpenPGP, the term "{term}key" may refer to three distinct layers, each serving a unique purpose:

  1. A (bare) "cryptographic key" comprises the private and/or public parameters forming a key. For instance, in the case of an RSA {term}private key<Transferable Secret Key>, the key consists of the exponent d along with the prime numbers p and q.
  2. An OpenPGP {term}component key<OpenPGP Component Key> is either an "{term}OpenPGP primary key" or an "{term}OpenPGP subkey." It is a building block of an {term}OpenPGP certificate, consisting of a cryptographic keypair coupled with some invariant {term}metadata, such as the key {term}creation time.
  3. An "{term}OpenPGP certificate" (or "OpenPGP key") consists of several {term}component keys<OpenPGP Component Key>, {term}identity components<Identity Component>, and other elements. These {term}certificates<Certificate> are dynamic, evolving over time as {term}components<Component> are added, {term}expire<Expiration>, or are marked as {term}invalid<Validation>.
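The three layers can be pictured as nested data structures. The following is only an illustrative sketch: the class and field names are hypothetical and do not correspond to any OpenPGP library, and the key material is placeholder bytes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CryptographicKey:
    """Layer 1: bare key parameters (only the public part is modeled here)."""
    algorithm: str
    public_params: bytes


@dataclass(frozen=True)
class ComponentKey:
    """Layer 2: a cryptographic key plus invariant metadata."""
    key: CryptographicKey
    creation_time: int  # seconds since the Unix epoch
    is_primary: bool


@dataclass
class Certificate:
    """Layer 3: a collection of components that can grow over time."""
    primary: ComponentKey
    subkeys: list[ComponentKey] = field(default_factory=list)
    user_ids: list[str] = field(default_factory=list)

    def add_subkey(self, subkey: ComponentKey) -> None:
        if subkey.is_primary:
            raise ValueError("a certificate has exactly one primary key")
        self.subkeys.append(subkey)


# Placeholder key material, for illustration only.
primary = ComponentKey(CryptographicKey("Ed25519", b"\x01" * 32), 1_700_000_000, True)
cert = Certificate(primary)
cert.add_subkey(ComponentKey(CryptographicKey("X25519", b"\x02" * 32), 1_700_000_000, False))
cert.user_ids.append("Alice Adams <alice@example.org>")
```

The component key classes are frozen to mirror the invariance of layer 2, while the certificate itself stays mutable, reflecting that layer 3 evolves over time.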

The following section will delve into the OpenPGP-specific layers (2 and 3) to provide a clearer understanding of their roles within {term}OpenPGP certificates<OpenPGP Certificate>.

Structure of OpenPGP certificates

An {term}OpenPGP certificate (or "{term}OpenPGP key") is a collection of an arbitrary number of elements[^1]:

  • {term}Component keys<OpenPGP Component Key>
  • {term}Identity components<Identity Component>
  • Additional {term}metadata, including connections between the {term}certificate<OpenPGP Certificate>'s {term}components<Component>

This documentation collectively refers to {term}component keys<OpenPGP Component Key> and {term}identity components<Identity Component> as "the {term}components<Component> of a {term}certificate<OpenPGP Certificate>."

:name: fig-openpgp-certificate-components
:alt: Depicts a box with white background and the title "OpenPGP certificate". In the box several other boxes and accompanying texts, representing component keys and User IDs, are shown. There are three component keys boxes with a green frame, each with a dotted lower-left section, that shows the text "key creation time" and the green public key symbol in the lower right area. All three have a title, a unique fingerprint below the box and a unique capability keyword, perpendicular to the box on the right side. The top-most component key box has a light-green background, with the title "Component Key (primary)" and capability keyword "certification". The second-to-top component key box has a white background, with the title "Component Key" and capability keyword "encryption". The lowest component key box has a white background, with the title "Component Key" and capability keyword "signing". There are two User ID boxes, each with a black frame, open to top left and lower right corner. Both boxes have a user icon on the top left side, the title "User ID" on the top right side and a User ID string at the bottom. The top box has "Alice Adams <alice@example.org>" and the lower box has "Alice" as User ID string.

Typical {term}`components<Component>` in an {term}`OpenPGP certificate`

Every element in an {term}OpenPGP certificate revolves around a central {term}component: the {term}OpenPGP primary key. The primary key acts as a personal {term}certification authority ({term}CA<Certification Authority>) for the {term}certificate<OpenPGP Certificate>'s owner, enabling cryptographic statements regarding {term}subkeys<OpenPGP Subkey>, {term}identities<Identity>, {term}expiration, {term}revocation, and more.

{term}`OpenPGP certificates<OpenPGP Certificate>` tend to have a long lifespan, with the potential for modifications (typically by their owner) over time. {term}`Components<Component>` may be added or {term}`invalidated<Validation>` throughout a {term}`certificate<OpenPGP Certificate>`'s lifetime. However, once published, {term}`components<Component>` [cannot be removed](append-only) from {term}`certificates<OpenPGP Certificate>`.

(component_keys)=

Component keys

An {term}OpenPGP certificate usually contains multiple {term}component keys<OpenPGP Component Key>. {term}Component keys<OpenPGP Component Key> serve in one of two roles: either as an "{term}OpenPGP primary key" or as an "{term}OpenPGP subkey."

{term}OpenPGP component keys<OpenPGP Component Key> logically consist of an asymmetric cryptographic keypair and a creation timestamp. Once created, these attributes of a {term}component key<OpenPGP Component Key> remain fixed (for ECDH keys, two additional parameters are part of a {term}component key's constitutive data[^2]).

:name: fig-component-key
:alt: Depicts a box with white background and no title. In the box one other box is shown. The inner box has a green frame, with a dotted lower-left section, that shows the text "key creation time" and the green public key symbol, as well as the red-dotted private key symbol in the lower right area. In the top left of the inner box the text reads "Component Key."

An {term}`OpenPGP component key`

{term}Component keys<OpenPGP Component Key> containing {term}private key material also include {term}metadata specifying the password protection scheme. This is another facet of {term}metadata, akin to the aforementioned creation timestamp and additional parameters for certain algorithms. However, this discussion focuses on {term}OpenPGP certificates<OpenPGP Certificate>, in which the {term}component keys<OpenPGP Component Key> contain only the public part of their cryptographic key data. For information on {term}private keys<Transferable Secret Key> in OpenPGP, see {numref}private_key_chapter.

(fingerprint)=

Fingerprint

Each {term}OpenPGP component key possesses an {term}OpenPGP fingerprint. This {term}fingerprint<OpenPGP Fingerprint> is derived from the {term}public key material<OpenPGP Certificate>, the {term}creation timestamp<Creation Time>, and, when relevant, the ECDH parameters.

:name: fig-fingerprint
:alt: Depicts a box with white background and the title "Fingerprint of an OpenPGP component key." Inside, another box with a green frame, the title "Component Key", the text "key creation time" on the lower left and the green public key symbol on the lower right is shown. Below the component key box a fingerprint in a box with a light-yellow background and a yellow dotted line is depicted. The word "Fingerprint" is shown left of the box with the fingerprint and both are connected with a yellow dotted line.

Every {term}`OpenPGP component key` is identifiable by a {term}`fingerprint<OpenPGP Fingerprint>`.

The {term}fingerprint<OpenPGP Fingerprint> of our example {term}OpenPGP component key is C0A5 8384 A438 E5A1 4F73 7124 26A4 D45D BAEE F4A3 9E6B 30B0 9D55 13F9 78AC CA94[^3].

In practice, the {term}`fingerprint<OpenPGP Fingerprint>` of a {term}`component key<OpenPGP Component Key>`, while not theoretically unique, functions effectively as a unique identifier. The use of a [cryptographic hash algorithm](crypto-hash) in generating {term}`fingerprints<OpenPGP Fingerprint>` makes the occurrence of two different {term}`component keys<OpenPGP Component Key>` with the same {term}`fingerprint<OpenPGP Fingerprint>` extremely unlikely[^finger-unique].
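To make the derivation concrete, here is a minimal sketch of a version 6 fingerprint computation. It follows our reading of RFC 9580 (SHA2-256 over the octet `0x9B`, a four-octet length, and the public key packet body). The helper names are hypothetical, the key material is an all-zero placeholder rather than a real key, and algorithm ID 27 (Ed25519) is used purely as an example.

```python
import hashlib
import struct


def v6_public_key_body(creation_time: int, algorithm_id: int, key_material: bytes) -> bytes:
    """Serialize the body of a version 6 public key packet (simplified sketch).

    Layout: version (1 octet) | creation time (4) | algorithm ID (1)
            | key material length (4) | algorithm-specific key material
    """
    header = struct.pack(">BIBI", 6, creation_time, algorithm_id, len(key_material))
    return header + key_material


def v6_fingerprint(body: bytes) -> bytes:
    """Hash 0x9B, the four-octet body length, and the body with SHA2-256."""
    return hashlib.sha256(b"\x9b" + struct.pack(">I", len(body)) + body).digest()


def format_fingerprint(fingerprint: bytes) -> str:
    """Render a fingerprint as groups of four uppercase hex digits."""
    digits = fingerprint.hex().upper()
    return " ".join(digits[i:i + 4] for i in range(0, len(digits), 4))


# Placeholder key material (32 zero octets), NOT a real public key.
body = v6_public_key_body(1_700_000_000, 27, bytes(32))
fp = v6_fingerprint(body)
print(format_fingerprint(fp))
```

Because the creation time is part of the hashed data, the same cryptographic key with a different creation timestamp yields a different fingerprint.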

(primary_key)=

Primary key

The {term}OpenPGP primary key is a {term}component key<OpenPGP Component Key> that serves a distinct, central role in an {term}OpenPGP certificate:

  • Its {term}fingerprint<OpenPGP Fingerprint> acts as an identifier for the entire {term}OpenPGP certificate.
  • It facilitates lifecycle operations, such as adding or {term}invalidating<Validation> {term}subkeys<OpenPGP Subkey> or {term}identities<Identity> within a {term}certificate<OpenPGP Certificate>.

In the {term}`RFC`, the {term}`OpenPGP primary key` is occasionally referred to as "top-level key." Informally, it has also been termed the "{term}`master key<OpenPGP Primary Key>`."

(subkeys)=

Subkeys

Modern {term}OpenPGP certificates<OpenPGP Certificate> typically include several {term}subkeys<OpenPGP Subkey> in addition to the {term}primary key<OpenPGP Primary Key>, although these {term}subkeys<OpenPGP Subkey> are optional.

While {term}subkeys<OpenPGP Subkey> have the same structural attributes as the {term}primary key<OpenPGP Primary Key>, they fulfill different roles. {term}Subkeys<OpenPGP Subkey> are cryptographically linked with the {term}primary key<OpenPGP Primary Key>, a relationship further discussed in {numref}binding_subkeys.

:name: fig-subkeys
:alt: Diagram depicting three component keys. The primary key is positioned at the top, designated for certification. Below it, connected by arrows, are two subkeys labeled as "for encryption" and "for signing," respectively.

{term}`OpenPGP certificates<OpenPGP Certificate>` can contain multiple {term}`subkeys<OpenPGP Subkey>`.

(identity_components)=

Identity components

{term}Identity components<Identity Component> in an {term}OpenPGP certificate are used by the {term}certificate holder to state that they are known by a certain identifier (like a name, or an email address).

(user_ids_in_openpgp_certificates)=

User IDs in OpenPGP certificates

{term}OpenPGP certificates<OpenPGP Certificate> can contain multiple User IDs. Each {term}User ID associates the {term}certificate<OpenPGP Certificate> with an {term}identity.

:name: fig-user-ids
:alt: Depicts a diagram with white background and the title "User IDs". Inside, a public primary component key for certification and a User ID is shown. A green arrow points from component key to User ID and is annotated with a signature.

Relationship of {term}`User ID` to primary {term}`component key` in an {term}`OpenPGP certificate`

A typical {term}User ID {term}identity is a UTF-8-encoded string composed of a name and an email address. By convention, {term}User IDs<User ID> follow the name-addr format described in RFC 2822.

For further conventions on {term}User IDs<User ID>, refer to the document draft-dkg-openpgp-userid-conventions-00, dated 25 August 2023.

Split User IDs

One proposed variant for encoding {term}identities<Identity> in {term}User IDs<User ID> is to use "split User IDs". Although uncommon, there are currently no significant technical barriers to implementing this format[^4].

The rationale for split {term}User IDs<User ID> lies in the distinction between a name and an email address, which represent two separate facets of an individual's {term}identity. Separating these elements simplifies the process for third parties tasked with certifying that an {term}identity is legitimately connected to a {term}certificate<OpenPGP Certificate>.

Consider this scenario: A third party is confident about the email-based {term}identity of an individual (e.g., <alice@example.org>) and is willing to certify it. However, they might not have sufficient knowledge about the person's name-based {term}identity (e.g., Alice Adams), so they are unwilling to extend the same level of {term}certification. Split {term}User IDs<User ID> address this dichotomy by allowing distinct {term}certification processes for each type of {term}identity.
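As a rough illustration of the idea, the following sketch turns a conventional combined User ID into two split User IDs. The function name and the simplified regular expression are our own assumptions; real User IDs can be more varied than this pattern accepts, and no OpenPGP library is involved.

```python
import re

# Simplified name-addr pattern: a display name followed by <local@domain>.
_NAME_ADDR = re.compile(r"^(?P<name>[^<>]+?)\s*<(?P<email>[^<>\s@]+@[^<>\s@]+)>$")


def split_user_id(user_id: str) -> list[str]:
    """Return the name and the email address as separate User ID strings.

    User IDs that do not match the combined form are returned unchanged.
    """
    match = _NAME_ADDR.match(user_id.strip())
    if match is None:
        return [user_id]
    return [match.group("name"), f"<{match.group('email')}>"]


print(split_user_id("Alice Adams <alice@example.org>"))
```

Each resulting string would then be bound to the certificate as its own User ID, so that a third party can certify the email-based identity without also vouching for the name.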

(primary_user_id)=

Implications of the Primary User ID

Within a {term}certificate<OpenPGP Certificate>, a specific {term}User ID is designated as the Primary User ID.

Each {term}User ID carries associated preference settings, such as preferred encryption algorithms (detailed in {numref}zooming_in_user_id). When a {term}certificate<OpenPGP Certificate> is used in the context of a specific {term}identity, the preferences associated with that {term}identity component are used. When a {term}certificate<OpenPGP Certificate> is used without reference to a specific {term}identity, the preferences associated with the {term}direct key signature or the {term}primary User ID take precedence by default.
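The lookup order described above can be sketched as follows. All names are hypothetical, preferences are modeled as plain lists of algorithm names, and checking the direct key signature before the primary User ID is an assumption of this sketch rather than something the text prescribes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CertificatePreferences:
    """Hypothetical, simplified view of where a certificate stores preferences."""
    direct_key_signature: Optional[list[str]] = None
    primary_user_id: Optional[str] = None
    per_user_id: dict[str, list[str]] = field(default_factory=dict)


def effective_preferences(cert: CertificatePreferences,
                          identity: Optional[str] = None) -> list[str]:
    # Used in the context of a specific identity: that identity's preferences apply.
    if identity is not None and identity in cert.per_user_id:
        return cert.per_user_id[identity]
    # Otherwise fall back to certificate-wide preferences.
    # Assumption: the direct key signature is consulted before the primary User ID.
    if cert.direct_key_signature is not None:
        return cert.direct_key_signature
    if cert.primary_user_id is not None:
        return cert.per_user_id.get(cert.primary_user_id, [])
    return []


prefs = CertificatePreferences(
    primary_user_id="Alice Adams <alice@example.org>",
    per_user_id={
        "Alice Adams <alice@example.org>": ["AES256", "AES128"],
        "Alice": ["AES128"],
    },
)
```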

The {term}primary User ID was historically the main store for preferences that apply to the {term}certificate<OpenPGP Certificate> as a whole. For more on this, see {ref}primary-metadata.

User attributes in OpenPGP

While user attributes are similar to {term}User IDs<User ID>, they are less commonly used.

Currently, the OpenPGP standard prescribes only one format to be stored in user attributes: an image in JPEG format. Typically, this image represents the key owner, although this is not required.

Linking the components

To form an {term}OpenPGP certificate, individual {term}components<Component> are interconnected by the {term}certificate holder using their OpenPGP software. Within OpenPGP, this process is termed "binding", as in "a {term}subkey<OpenPGP Subkey> is bound to the {term}primary key<OpenPGP Primary Key>." These bindings are realized using cryptographic {term}signatures<OpenPGP Signature Packet>. An in-depth discussion of this topic can be found in {ref}component_signatures_chapter.

In very abstract terms, the {term}primary key<OpenPGP Primary Key> of a {term}certificate<OpenPGP Certificate> acts as a root of trust or "{term}certification authority<Certification Authority>." It is responsible for:

  • issuing {term}signatures<OpenPGP Signature Packet> that express the {term}certificate holder's intent to use specific {term}subkeys<OpenPGP Subkey> or {term}identity components<Identity Component>;
  • conducting other lifecycle operations, including setting {term}expiration dates and marking {term}components<Component> as {term}invalidated<Validation> or "revoked<Revocation>."

Because {term}components<Component> are bound using digital {term}signatures<OpenPGP Signature Packet>, recipients of an {term}OpenPGP certificate need only {term}validate<Validation> the {term}authenticity<Authentication> of the {term}primary key that belongs to their communication partner. Traditionally, this is done by manually verifying the {term}fingerprint<OpenPGP Fingerprint> of the {term}primary key<OpenPGP Primary Key>. Once the {term}validity<Validation> of the {term}primary key<OpenPGP Primary Key> is confirmed, the {term}validity<Validation> of the remaining {term}components<Component> can be automatically assessed by the user's OpenPGP software. Generally, {term}components<Component> are {term}valid<Validation> parts of a {term}certificate<OpenPGP Certificate> if there is a statement signed by the {term}certificate<OpenPGP Certificate>'s {term}primary key<OpenPGP Primary Key> endorsing this {term}validity<Validation>.

(metadata_in_certificates)=

Metadata in certificates

{term}OpenPGP certificates<OpenPGP Certificate>, their {term}component keys<Component Key>, and {term}identities<Identity> possess {term}metadata that is not stored within the {term}components<Component> it pertains to. Instead, this {term}metadata is stored within signature packets, which are integral to the structure of an OpenPGP certificate.

Key attributes, such as {term}capabilities<Capability> (like signing or encryption) and {term}expiration times<Expiration Time>, are examples of {term}metadata not stored in the {term}component key data. How this {term}metadata is stored depends on the {term}component:

  • {term}Primary key<OpenPGP Primary Key> {term}metadata is defined either through a {term}direct key signature on the {term}primary key<OpenPGP Primary Key> (preferred in OpenPGP version 6), or by associating the {term}metadata with the Primary User ID.

  • {term}Subkey<OpenPGP Subkey> {term}metadata is defined within the subkey binding signature that links the {term}subkey<OpenPGP Subkey> to the {term}certificate<OpenPGP Certificate>.

  • {term}Identity component {term}metadata is associated via the certifying self-signature that links the {term}identity (usually in the form of a {term}User ID) to the {term}certificate<OpenPGP Certificate>.

It is crucial to note that the {term}components<Component> of an {term}OpenPGP certificate remain static after their creation. The use of {term}signatures<OpenPGP Signature Packet> to store {term}metadata allows for subsequent modifications without altering the original {term}component<Component>. For instance, a {term}certificate holder can update the {term}expiration time of a {term}component by issuing a new, superseding {term}signature<OpenPGP Signature Packet>.
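The effect of a superseding signature can be sketched as follows. This is a simplified model with hypothetical names: each self-signature carries a creation time and an optional validity period (in seconds after the key's creation time), and the most recent signature at a given reference time determines the effective expiration. Signature verification and revocations are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SelfSignature:
    created: int                        # signature creation time (Unix seconds)
    key_validity_period: Optional[int]  # seconds after key creation; None = no expiration


def effective_expiration(key_creation_time: int,
                         signatures: list[SelfSignature],
                         reference_time: int) -> Optional[int]:
    """Return the expiration time implied by the latest signature, or None."""
    candidates = [sig for sig in signatures if sig.created <= reference_time]
    if not candidates:
        raise ValueError("no self-signature exists at the reference time")
    latest = max(candidates, key=lambda sig: sig.created)
    if latest.key_validity_period is None:
        return None
    return key_creation_time + latest.key_validity_period


# The component key itself (created at t=1000) never changes.
# A second signature, appended later, extends its lifetime.
signatures = [
    SelfSignature(created=1000, key_validity_period=500),
    SelfSignature(created=1400, key_validity_period=2000),
]
```

Evaluating at `reference_time=1200` yields an expiration of 1500, while evaluating at 1400 or later yields 3000, because the newer signature supersedes the older one.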

:name: fig-primary-metadata
:alt: Depicts a direct key signature, associated with a primary component key.

{term}`Metadata` can be associated with the {term}`primary key<OpenPGP Primary Key>` using a *{term}`direct key signature`*.

(capabilities_key_flags)=

Defining operational capabilities of component keys with key flags

Each {term}component key has a set of "key flags" that delineate the operations a key can perform.

Commonly used {term}key flags<Key Flag> include:

  • {term}Certification<Certification Key Flag>: enables issuing third-party {term}certifications<Certification>
  • {term}Signing<Signing Key Flag>: allows the key to sign data
  • {term}Encryption<Encryption Key Flag>: allows the key to encrypt data
  • {term}Authentication<Authentication Key Flag>: primarily used for SSH authentication[^5]

Distinct {term}`component keys<Component Key>` handle specific operations. Only the {term}`primary key<OpenPGP Primary Key>` can be used for {term}`certification`, although it can have additional {term}`capabilities<Capability>`. {term}`Subkeys<OpenPGP Subkey>` can be used for signing, encryption, and authentication but cannot have the {term}`certification` {term}`capability`. A {term}`component key` can technically have multiple {term}`capabilities<Capability>`. It is considered good practice, however, to [use separate keys for each capability](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#section-10.1.5-7).

Notably, in many algorithms, encryption and signing-related functionalities (i.e., {term}`certification<Certification Key Flag>`, {term}`signing<Signing Key Flag>`, {term}`authentication<Authentication Key Flag>`) are mutually exclusive, because the algorithms only support one of those two families of operations[^key-flag-sharing].
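Key flags are stored as a bit field, which maps naturally onto an integer flag type. The sketch below uses the flag values defined in the OpenPGP specification (where encryption is split into two flags, one for communications and one for storage). The `check_flags` helper is hypothetical and only encodes the two rules stated in this section.

```python
from enum import IntFlag


class KeyFlags(IntFlag):
    CERTIFY = 0x01
    SIGN = 0x02
    ENCRYPT_COMMUNICATIONS = 0x04
    ENCRYPT_STORAGE = 0x08
    AUTHENTICATE = 0x20


ENCRYPTION = KeyFlags.ENCRYPT_COMMUNICATIONS | KeyFlags.ENCRYPT_STORAGE


def check_flags(flags: KeyFlags, is_primary: bool) -> list[str]:
    """Report violations of the rules described in the text (hypothetical helper)."""
    findings = []
    if KeyFlags.CERTIFY in flags and not is_primary:
        findings.append("error: subkeys cannot have the certification capability")
    families = [KeyFlags.CERTIFY, KeyFlags.SIGN, ENCRYPTION, KeyFlags.AUTHENTICATE]
    used = sum(1 for family in families if flags & family)
    if used > 1:
        findings.append("note: good practice is one capability per component key")
    return findings


print(check_flags(KeyFlags.CERTIFY | KeyFlags.SIGN, is_primary=True))
```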

Algorithm preferences and feature signaling

OpenPGP incorporates significant "cryptographic agility". It doesn't rely on a single fixed set of algorithms. Instead, it defines a suite of cryptographic primitives from which users (or their applications) can choose.

This agility facilitates the easy adoption of new cryptographic primitives into the standard, allowing for a seamless transition. Users can gradually migrate to new cryptographic mechanisms without disruption.

However, this approach requires that OpenPGP software determine the cryptographic mechanisms that a set of communication partners can handle and prefer. OpenPGP employs several mechanisms for this purpose, which allow negotiation between sender and recipient. It's important to note that OpenPGP is not an online scheme; thus, this negotiation is effectively one-way. The active party interprets the preferences expressed in the {term}certificate<OpenPGP Certificate> of the passive party.
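The one-way nature of this negotiation can be illustrated with a small sketch. The function and its parameters are hypothetical, algorithms are represented by plain names, and the selection strategy (walk the first recipient's preference list and take the first entry that everyone accepts) is a simplification chosen for this example, not a rule from the standard.

```python
def choose_algorithm(sender_supported: set[str],
                     recipient_preferences: list[list[str]],
                     fallback: str) -> str:
    """Pick an algorithm using only the preferences published in certificates.

    `recipient_preferences` holds one ordered preference list per recipient.
    `fallback` stands in for whatever algorithm applies when nothing matches.
    """
    if not recipient_preferences:
        return fallback
    first, *others = recipient_preferences
    for algorithm in first:
        if algorithm in sender_supported and all(algorithm in prefs for prefs in others):
            return algorithm
    return fallback


sender = {"AES256", "AES128"}
alice = ["AES256", "AES128"]
bob = ["AES128"]
print(choose_algorithm(sender, [alice, bob], fallback="AES128"))
```

The recipients never take part in the exchange: the sender alone evaluates the preference lists that were read from the recipients' certificates.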

Negotiation mechanisms in OpenPGP include:

Beyond these explicitly expressed preferences, implementations also deduce {term}capabilities<Capability> of communication partners based on the version of the {term}OpenPGP certificate they possess.

User ID-specific preferences

As a starting point, a {term}certificate<OpenPGP certificate> has a set of preferences that apply generally. These are defined either in a {term}direct key signature, or via the {term}primary User ID of the {term}certificate<OpenPGP certificate>.

Additionally, OpenPGP allows modeling {term}User ID-specific preferences. The idea is that a user may prefer a different suite of algorithms on their private email account compared to their work email account. Such {term}identity-specific preferences can be expressed on the certifying {term}signatures<OpenPGP Signature Packet> that bind {term}User IDs<User ID> to a {term}certificate<OpenPGP certificate>.

A typical OpenPGP certificate, revisited

Following our review of how {term}keys<Component Key> and {term}identity components<Identity Component> are linked, let's reexamine the {term}OpenPGP certificate from {numref}fig-openpgp-certificate-components. Our focus now extends to all of its binding signatures and the {term}direct key signature that contains {term}metadata for the full {term}certificate<OpenPGP certificate>:

:name: fig-openpgp-certificate
:alt: Depicts an OpenPGP certificate, including a set of components, binding signatures, and a direct key signature on the primary key.

This shows a typical {term}`OpenPGP certificate`, including binding {term}`signatures<OpenPGP Signature Packet>` for all of its {term}`components<Component>`, and a {term}`signature<OpenPGP Signature Packet>` that associates {term}`metadata` with the {term}`primary key<OpenPGP Primary Key>`.

(revocations)=

Revocations

When a {term}certificate holder needs to {term}invalidate<Validation> certain {term}components<Component> of their {term}certificate<OpenPGP Certificate>, or even the entire {term}certificate<OpenPGP Certificate>, they accomplish this through "{term}revocation." {term}Revoking<Revocation> the {term}primary key<OpenPGP Primary Key> renders the entire {term}certificate<OpenPGP Certificate> {term}invalid<Validation>.

Notably, {term}revocations<Revocation> are not the only means by which {term}components<Component> can become {term}invalid<Validation>. Other factors, such as the passing of a {term}component's {term}expiration time, can also render {term}components<Component> {term}invalid<Validation>.

For more detailed information on {term}revoking<Revocation> specific {term}components<Component> of a {term}certificate<OpenPGP Certificate>, see the section on {ref}self-revocations.

(third_party_identity_certifications)=

Third-party (identity) certifications

{term}Third-party identity certifications<Third-party Identity Certification> have been a cornerstone of the OpenPGP ecosystem since its inception. The original PGP designers, starting with Phil Zimmermann, advocated for decentralized {term}trust models<Trust Model> over reliance on centralized authorities. This decentralized approach in OpenPGP is known as the "Web of Trust."

Third-party {term}certifications<Certification> are statements by OpenPGP users confirming that a user with a specific {term}identity is the owner of a particular {term}OpenPGP certificate.

For example, Bob's OpenPGP software may issue a {term}certification stating that Bob has checked that the {term}User ID Alice Adams <alice@example.org> and the {term}certificate<OpenPGP Certificate> with the {term}fingerprint<OpenPGP Fingerprint> AAA1 8CBB 2546 85C5 8358 3205 63FD 37B6 7F33 00F9 FB0E C457 378C D29F 1026 98B3 are legitimately linked.

This process assumes that Bob knows the person known as Alice Adams and is confident that alice@example.org is indeed Alice's email address. Bob also verifies that the {term}certificate<OpenPGP Certificate> his OpenPGP software associates with Alice matches the one Alice uses. In essence, both users must have a {term}certificate<OpenPGP Certificate> for Alice with an identical {term}fingerprint<OpenPGP Fingerprint>. In OpenPGP version 6, manual {term}fingerprint<OpenPGP Fingerprint> comparison by end-users is discouraged, with a replacement {term}verification mechanism still under development. The {term}verification process must occur over a sufficiently secure channel, such as an end-to-end encrypted video call or a face-to-face meeting.

For more on third-party {term}certifications<Certification>, see {ref}third_party_cert.

Advanced topics

When are certificates valid?

Certificates are composed of smaller parts that are connected back to the primary key with signatures. Since OpenPGP certificates are append-only data structures, previous signatures cannot be removed. Instead, they can be revoked by issuing revocation signatures and appending them to the certificate. This also means that each component, such as a User ID or a subkey, may be revoked without affecting the rest of the certificate. A special case is the key revocation signature (type ID 0x20), which marks the primary key as revoked and thereby indirectly makes all other components unusable.

A related concept is key expiration, which also makes a component unusable. Unlike revocations, which are final, expiration is just a reminder for certificate users that the certificate is not fresh and that a newer version should be acquired. Only primary keys use Key Expiration Time subpackets to express their expiration time. All other components rely on the expiration of their binding signature. If the binding signature expires, the binding becomes invalid, and the component is considered expired.

Revocation, on the other hand, cannot be withdrawn, and it indicates that the component should not be used. Revocation signatures over components use the Reason for Revocation subpacket to specify why the component or certification was revoked.

Some libraries, such as Sequoia PGP, follow the guidance of the RFC and differentiate revocation signatures based on the Reason for Revocation subpacket. The reasons "Key is superseded", "Key is retired", and "User ID is no longer valid" are considered "soft" revocations. Any other reason makes the revocation "hard." The distinction plays a role when Sequoia constructs a view of the certificate at a specified point in time. At a reference time before a soft revocation was made, the component is considered valid. If the revocation is hard, the component is considered invalid at any point in time. The distinction stems from the following attack: if the key was compromised, the attacker can issue backdated signatures, so it's important to always consider compromised keys as suspect. On the other hand, if a subkey was merely retired and the certificate holder moved to a different subkey, then past signatures made by the retired key are still correct.
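The soft/hard distinction can be expressed in a few lines. This sketch is not Sequoia's API: the class and function names are hypothetical, and only the Reason for Revocation codes come from the OpenPGP specification.

```python
from dataclasses import dataclass
from enum import IntEnum


class ReasonForRevocation(IntEnum):
    NO_REASON = 0
    KEY_SUPERSEDED = 1
    KEY_COMPROMISED = 2
    KEY_RETIRED = 3
    USER_ID_NO_LONGER_VALID = 32


SOFT_REASONS = frozenset({
    ReasonForRevocation.KEY_SUPERSEDED,
    ReasonForRevocation.KEY_RETIRED,
    ReasonForRevocation.USER_ID_NO_LONGER_VALID,
})


@dataclass(frozen=True)
class Revocation:
    created: int  # revocation signature creation time (Unix seconds)
    reason: int   # raw reason code; unknown codes are possible


def is_hard(revocation: Revocation) -> bool:
    # Any reason that is not explicitly soft is treated as hard.
    return revocation.reason not in SOFT_REASONS


def is_revoked_at(revocation: Revocation, reference_time: int) -> bool:
    """Hard revocations apply at all times; soft ones only from their creation."""
    return is_hard(revocation) or revocation.created <= reference_time


retired = Revocation(created=2000, reason=ReasonForRevocation.KEY_RETIRED)
compromised = Revocation(created=2000, reason=ReasonForRevocation.KEY_COMPROMISED)
```

With these definitions, a signature made at time 1500 by the retired key can still be accepted, while the same signature by the compromised key cannot, because its claimed creation time might have been backdated.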

(append-only)=

Certificates are effectively append-only data structures

OpenPGP certificates act as append-only data structures in practice. By this, we mean that packets that are associated with a certificate cannot be "recalled" once they have been published. Third parties (such as other users, or keyservers) may keep and/or distribute copies of those packets.

While it is not possible to "remove" elements once they have been publicly associated with an OpenPGP certificate, it is possible to invalidate them by adding new metadata to the certificate. This new metadata could set an expiration time on a component, or explicitly revoke that component. In both cases, no packets are removed from the certificate.

Semantically, invalidation resembles removal of a component: the component is no longer a valid element of the certificate, at least starting from some point in time. Implementations that handle the certificate may omit the invalid component in their representation.

We have to distinguish the "packet level" information about a certificate from an application-level view of that certificate. The two may differ.

Reasoning about append-only properties in a distributed system

OpenPGP is a thoroughly distributed system. Users can obtain and transmit certificate information about their own, as well as other users', certificates using a broad range of mechanisms. These mechanisms include keyservers, manual handling, WKD and Autocrypt.

A user's OpenPGP software may obtain different views of a particular certificate over time. It has to reconcile and store a combined version of the possibly disparate elements it obtains from different sources.

In practice, this means that various OpenPGP users may have differing views of any given certificate. For various reasons, not all users will be in possession of a fully up-to-date and complete version of a certificate.

There are various potential problems associated with this fact. Users may not be aware that a component has been invalidated by the certificate holder, because the revocation has not been propagated to them. For example, they may not be aware that the certificate holder has rotated their encryption subkey to a new one and doesn't want to receive messages encrypted to the previous encryption subkey.

One mechanism that addresses a part of this issue is expiration: By setting their certificates to expire after an appropriate interval, certificate holders can force their communication partners to refresh their certificate, e.g., from a keyserver[^6].

Good practices, like setting appropriate expiration times, can mitigate the complexity of the inherently distributed nature of certificates.

However, such mitigations by definition cannot address all possible cases of outdated certificate information in a decentralized, asynchronous system such as OpenPGP. So a defensive approach is generally appropriate when reasoning about the view of certificates that different actors have.

When thinking about edge cases, it's useful to "assume the worst." For example:

  • Recipients may not obtain updates to a certificate in a timely manner (this could happen for various reasons, including, but not limited to, interference by malicious actors).
  • Data associated with a certificate may compound, and can become too large for convenient handling. If such a problem arises, then by definition, the certificate holder cannot address it: recall that the certificate holder cannot "recall" existing packets.

Differing "views" of a certificate exist

Another way to think about this discussion is that different OpenPGP users may have a different view of any certificate. There is a notional "canonical" version of the certificate, but we cannot assume that every user has exactly this copy. Besides propagation of elements that the certificate holder has linked to a certificate, third-party certifications are by design a distributed mechanism. A third-party certification is issued by a third party, and may or may not be distributed widely by them, or by the certificate holder. Not distributing third-party certifications widely is a workflow that may be entirely appropriate for some use cases.

As a general tendency, it is desirable for OpenPGP users to have the most complete possible view of all certificates that they interact with.

However, there are contexts in which it is preferable to only use a subset of the available elements of a certificate. We discuss this in the section {ref}cert-mini.

Merging

As described above, OpenPGP certificates are effectively append-only data structures. As part of the practical realization of this fact, OpenPGP software needs to merge different copies of a certificate.

For example, Bob's OpenPGP system may have a local copy of Alice's certificate, and obtain a different version of Alice's certificate from a keyserver. The goal of the implementation is to add new information about Alice's certificate, if any, to the local copy. Alice may have added a new identity, replaced a subkey with a new subkey, or revoked some components of her certificate. Or, Alice may have revoked her certificate, signaling that she doesn't want communication partners to use that certificate anymore. All of these updates could be crucial for Bob to be aware of.

Merging two versions of a certificate involves making decisions about which packets should be kept. The versions of the certificate will typically contain some packets that are identical. No duplicates of the exact same packet should be stored in the merged version of the certificate. Additionally, if the newly obtained copy contains packets that are in fact entirely unrelated to the certificate, those should not be retained (a third party may have included unrelated packets, either by mistake, or with malicious intent).
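The merge rules described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular implementation works: the `Packet` type and its `related` flag are hypothetical stand-ins. A real implementation parses the packets, checks their signatures to decide whether they belong to the certificate, and keeps signatures ordered next to the components they apply to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for one serialized OpenPGP packet."""
    raw: bytes            # the exact serialized bytes of the packet
    related: bool = True  # assumed result of a "belongs to this certificate" check


def merge_copies(local, fetched):
    """Return the local copy, extended by new related packets from `fetched`."""
    merged = list(local)
    seen = {packet.raw for packet in local}
    for packet in fetched:
        if not packet.related:
            continue  # unrelated packet: do not retain it
        if packet.raw in seen:
            continue  # exact duplicate: store it only once
        seen.add(packet.raw)
        merged.append(packet)
    return merged


local = [Packet(b"primary-key"), Packet(b"uid-alice")]
fetched = [
    Packet(b"primary-key"),            # already known
    Packet(b"subkey-revocation"),      # new information from Alice
    Packet(b"stray", related=False),   # injected by a third party
]
merged = merge_copies(local, fetched)
```

Note that nothing is ever removed from the local copy in this sketch, which mirrors the append-only nature of certificates.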

Handling unauthenticated information

For information that is related to the certificate, but not bound to it by a self-signature, there is no generally correct approach. The receiving implementation must resolve these cases, possibly in a context-specific manner. Such cases include:

  • Third-party certifications. These could be valuable information, where a third party attests that the association of an identity to a certificate is valid. On the other hand, they could also be a type of spam.
  • Subpackets in the unhashed area of a signature packet. Again, these could contain information that is useful to the recipient. However, the data could also be either useless, or even misleading/harmful.

(cert-mini)=

Certificate minimization

Certificate minimization is the practice of presenting a partial view of a certificate by filtering out some of its components.

Filtering out some elements of a certificate can have different benefits:

  • For some workflows it's clear that the full certificate is not required. For example, email clients only need encryption, signing and certification component keys. They don't need authentication subkeys, which are used for SSH connections.
  • In some contexts, data can be added to certificates by third parties, e.g. by adding third-party User ID certifications on some key servers. In the worst case this can lead to "certificate flooding" which inflates the target certificate to a point where consumer software rejects the certificate completely. Filtering out elements can mitigate this.
  • Sometimes, a certificate organically grows so big that the user software has problems handling it.

Elements that can be omitted as part of a minimization process

There are different types of elements that can be omitted during minimization:

  • Subkeys (along with signatures on those subkeys)
  • Identity components (along with both their self-signatures and third-party signatures)
  • Signatures, by themselves:
    • Self-signatures that have been superseded by newer self-signatures for the same purpose
    • Third-party certifications
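As an illustration of the "superseded self-signatures" case in the list above, the following Python sketch keeps only the newest self-signature per component and purpose. The `SelfSig` record and its field names are assumptions made for this example. Real signatures carry much more data, and whether one signature actually supersedes another depends on the implementation's policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelfSig:
    """Hypothetical, heavily simplified self-signature record."""
    component: str  # the component this signature is bound to
    purpose: str    # e.g. "binding"
    created: int    # creation time, in seconds since the epoch


def drop_superseded(sigs):
    """Keep only the most recent self-signature per (component, purpose)."""
    latest = {}
    for sig in sigs:
        key = (sig.component, sig.purpose)
        if key not in latest or sig.created > latest[key].created:
            latest[key] = sig
    # Sort for a deterministic result in this example.
    return sorted(latest.values(), key=lambda s: (s.component, s.purpose))


sigs = [
    SelfSig("uid:alice", "binding", 100),
    SelfSig("uid:alice", "binding", 300),   # supersedes the one above
    SelfSig("subkey:enc", "binding", 200),
]
kept = drop_superseded(sigs)
```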

Minimization in applications

Hagrid, which runs keys.openpgp.org

The Hagrid keyserver software doesn't publish the identity components in certificates by default. This is a central aspect of the privacy policy of the service. Certificates can be uploaded to the service by third parties, which is useful. However, identifying information is only distributed by the service on an explicit opt-in basis.

Separately, third-party certifications are currently filtered out by the service, to avoid flooding attacks.

GnuPG

GnuPG strips some signatures on key import.

In addition, GnuPG offers two explicit methods for certificate minimization, described in the GnuPG manual as:

clean
Compact (by removing all signatures except the selfsig) any user ID that is no longer usable (e.g. revoked, or expired). Then, remove any signatures that are not usable by the trust calculations. Specifically, this removes any signature that does not validate, any signature that is superseded by a later signature, revoked signatures, and signatures issued by keys that are not present on the keyring.
minimize
Make the key as small as possible. This removes all signatures from each user ID except for the most recent self-signature.

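clean removes third-party signatures issued by certificates that are not present in the current keyring, as well as other stale data. minimize goes further: it keeps only the most recent self-signature on each User ID, dropping third-party signatures as well as self-signatures that are superseded at the point when the command is executed.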

Limitations that can result from stripping historical self-signatures

Some implementations, such as Sequoia, prefer to rely on the full historical set of self-signatures to construct a view of the certificate over time. This way, signatures can be verified at different reference times. In this model, removing superseded self-signatures can cause problems with the validation of historical signatures.

An example of the tension between minimization and nuanced verification of the temporal validity of signatures can be seen in the case of rpm-sequoia. See this discussion for details:

Initially, when checking the validity of a data signature for a software package, rpm-sequoia used the signature's creation time as the reference time. However, the availability of historical self-signatures in certificates is limited. So sometimes only a more recent self-signature for the primary key is available, and there is no evidence that the primary key was valid at the reference time.

To deal with this reality, the rpm-sequoia implementation was adjusted to accept data signatures that predate the validity of the current primary key self-signature [7].
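The difference between the two approaches can be sketched as follows. This is a conceptual Python model, not rpm-sequoia's actual code: self-signatures are reduced to their creation times, and both function names are made up for this example.

```python
def active_self_sig(self_sig_times, reference_time):
    """Strict model: newest self-signature created at or before the reference time."""
    candidates = [t for t in self_sig_times if t <= reference_time]
    return max(candidates) if candidates else None


def active_self_sig_lenient(self_sig_times, reference_time):
    """Lenient model: if no self-signature covers the reference time,
    fall back to the oldest self-signature that is still available."""
    strict = active_self_sig(self_sig_times, reference_time)
    if strict is not None:
        return strict
    return min(self_sig_times) if self_sig_times else None


full_history = [100, 200, 300]  # creation times of all self-signatures
minimized = [300]               # only the most recent one was kept
data_sig_time = 150             # creation time of the data signature

strict_full = active_self_sig(full_history, data_sig_time)
strict_min = active_self_sig(minimized, data_sig_time)
lenient_min = active_self_sig_lenient(minimized, data_sig_time)
```

With the full history, the strict lookup finds the self-signature that was in effect when the data signature was made. With the minimized certificate, the strict lookup finds nothing, while the lenient variant accepts the newer self-signature as evidence.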

Autocrypt

The Autocrypt Level 1 specification defines a specific minimal format for OpenPGP certificates that are distributed by the Autocrypt mechanism.

One goal of the Autocrypt mechanism is to distribute certificates widely. To this end, Autocrypt sends certificates in mail headers, where smaller size is greatly preferable.

Basic encrypted email functionality requires only a small subset of the recipient's certificate, so small certificate size is feasible.

Minimization for email

Note that it's not generally clear if minimization brings more benefit than harm.

For example, we might consider minimizing a certificate for distribution via WKD, with the use-case of email in mind.

The following fragment processes an example certificate. It drops any subkey that is not valid at the time of export (because of revocation or expiration). Additionally, authentication subkeys are stripped, since they are irrelevant for email:

gpg --export-options export-minimal,export-clean,no-export-attributes \
    --export-filter keep-uid=mbox=wiktor@metacode.biz \
    --export-filter 'drop-subkey=expired -t || revoked -t || usage =~ a' \
    --export wiktor@metacode.biz

At the time of writing, the original certificate consists of 152322 bytes of data. The filtered variant consists of only 3771 bytes, which is roughly 40 times smaller. In some contexts, there are hard constraints on size, and minimization is unavoidable, e.g., when embedding certificate data in email headers.

The above minimization might be convenient when interacting with a ProtonMail client, which fetches OpenPGP certificates via WKD automatically, while composing a message. The ProtonMail use case requires only component keys, not third-party certifications, and it doesn't require historical component keys or self-signatures.

However, in a different context, the same certificate might be fetched to verify the authenticity of a signature. In that case, third-party certifications are crucial for the client. Stripping them could prevent the client from performing Web of Trust calculations and authenticating the signature.

Pitfalls of minimization

Disadvantages/risks of minimizing certificates:

  • Does not present a full view of how the certificate (and the validity of its components) evolved over time.
  • As other certificates are collected, third-party certifications that were previously unusable may become usable again. Dropping third-party certifications as a part of minimization prevents this mechanism.
  • Removing component keys that the minimizing implementation can't use means that the receiver does not receive a copy of those, even if the receiver supports them.
  • Refreshing certificates from key servers may inflate the certificate again, since OpenPGP certificates tend to act as append-only structures.
  • Carelessly stripping all invalid components may make the certificate unusable. Some libraries, such as anonaddy-sequoia, strip unusable encryption subkeys, but retain at least one subkey even if all encryption subkeys are unusable. Even though this may leave only an expired encryption subkey in the certificate, it presents a better UX for the end user, who is probably still in possession of the private key for decryption.
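The "retain at least one encryption subkey" rule from the last item can be sketched like this. The `Subkey` record is hypothetical, and the choice to keep the most recently created subkey as the fallback is an assumption of this example, not something taken from any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subkey:
    """Hypothetical, simplified subkey record."""
    name: str
    can_encrypt: bool
    usable: bool   # assumed to mean: neither expired nor revoked
    created: int   # creation time, in seconds since the epoch


def strip_unusable_encryption_subkeys(subkeys):
    """Drop unusable encryption subkeys, but never drop all of them."""
    encryption = [k for k in subkeys if k.can_encrypt]
    others = [k for k in subkeys if not k.can_encrypt]
    usable = [k for k in encryption if k.usable]
    if usable:
        kept = usable
    elif encryption:
        # Fallback (assumption): keep the newest one, even though it is unusable.
        kept = [max(encryption, key=lambda k: k.created)]
    else:
        kept = []
    return others + kept


subkeys = [
    Subkey("sign", can_encrypt=False, usable=True, created=10),
    Subkey("enc-old", can_encrypt=True, usable=False, created=20),
    Subkey("enc-newer", can_encrypt=True, usable=False, created=30),
]
result = strip_unusable_encryption_subkeys(subkeys)
```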

Guidelines

  1. Don't minimize certificates unless you have a good reason to.
  2. When presenting a minimized certificate view, consider when that view needs to be updated. Ideally, minimized certificates are freshly generated, on demand (e.g. the Autocrypt header is constructed while an email is sent or composed) and the client merges all data collected.

Fingerprints and beyond: "Naming" certificates in user-facing contexts

Version 4

With OpenPGP version 4 certificates, user-facing software customarily used the 20-byte fingerprint as an identifier for the certificate, or alternatively a shortened Key ID variant of the fingerprint. Both were represented in hexadecimal format, sometimes with whitespace to group the identifier into blocks for easier readability.

For example, in workflows to accept a certificate for a communication partner, or during third-party certification of an identity, users were shown hexadecimal representations of a fingerprint. Users were asked to manually verify that the fingerprint corresponds to the expected certificate.
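As a small illustration of these display conventions, the following Python sketch formats a version 4 fingerprint in groups of four hexadecimal digits, and derives the 64-bit Key ID from its rightmost eight bytes. The helper names are made up for this example. The fingerprint is the example value used in this chapter's footnote on Key IDs.

```python
def group_hex(data: bytes, group: int = 4) -> str:
    """Format bytes as uppercase hex, separated into blocks of `group` digits."""
    digits = data.hex().upper()
    return " ".join(digits[i:i + group] for i in range(0, len(digits), group))


def v4_key_id(fingerprint: bytes) -> bytes:
    """Return the Key ID of a version 4 key: the rightmost 64 bits of its fingerprint."""
    if len(fingerprint) != 20:
        raise ValueError("a version 4 fingerprint is 20 bytes long")
    return fingerprint[-8:]


fingerprint = bytes.fromhex("B3D27B09FBA412352B418972C8B86AC424554239")

print(group_hex(fingerprint))
# B3D2 7B09 FBA4 1235 2B41 8972 C8B8 6AC4 2455 4239
print("0x" + v4_key_id(fingerprint).hex().upper())
# 0xC8B86AC424554239
```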

Version 6

The OpenPGP version 6 standard uses 32-byte fingerprints, but explicitly defines no format for displaying those fingerprints in a human-readable form. The standard strongly recommends against using version 6 fingerprints as identifiers in user-facing workflows.

Instead, "mechanical fingerprint transfer and comparison" should be preferred, wherever possible. The reasoning is that humans tend to be bad at comparing high-entropy data (in addition, many users are probably put off by being asked to compare long hexadecimal strings).

Use in APIs

However, both fingerprints and Key IDs may (and usually must) be used programmatically, by software that handles OpenPGP data, to address specific certificates. This is equally true for OpenPGP version 6.

Note that regardless of the OpenPGP version, software that relies on 8-byte Key IDs should not assume that Key IDs are unique. It is trivial to generate collisions for 8-byte Key IDs, so applications must be able to handle Key ID collisions gracefully.

The historical 4-byte "short Key IDs" format should not be used anywhere, anymore (finding collisions in a 32-bit keyspace has been trivial for a long time).
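One way to handle Key ID collisions gracefully is to treat a Key ID lookup as returning a list of candidates rather than a single certificate. The following Python sketch shows the idea. The `CertIndex` class is hypothetical, and certificates are represented only by their primary key fingerprint. The sketch relies on the Key ID being the rightmost 64 bits of a version 4 fingerprint, and the leftmost 64 bits of a version 6 fingerprint.

```python
from collections import defaultdict


def key_id_of(fingerprint: bytes) -> bytes:
    """Derive the 8-byte Key ID from a version 4 or version 6 fingerprint."""
    if len(fingerprint) == 20:   # version 4: rightmost 64 bits
        return fingerprint[-8:]
    if len(fingerprint) == 32:   # version 6: leftmost 64 bits
        return fingerprint[:8]
    raise ValueError("unsupported fingerprint length")


class CertIndex:
    """Hypothetical lookup table that tolerates Key ID collisions."""

    def __init__(self):
        self._by_key_id = defaultdict(list)

    def add(self, fingerprint: bytes) -> None:
        bucket = self._by_key_id[key_id_of(fingerprint)]
        if fingerprint not in bucket:
            bucket.append(fingerprint)

    def lookup(self, key_id: bytes) -> list:
        """Return all known fingerprints that share this Key ID."""
        return list(self._by_key_id.get(key_id, []))


shared_key_id = bytes.fromhex("C8B86AC424554239")
fp_a = bytes(12) + shared_key_id        # two artificial version 4 fingerprints
fp_b = bytes([1]) * 12 + shared_key_id  # that collide on the Key ID

index = CertIndex()
index.add(fp_a)
index.add(fp_b)
candidates = index.lookup(shared_key_id)
```

A caller that receives more than one candidate can then try each of them, for example by attempting signature verification with every matching certificate.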

(cert-freshness)=

Certificate freshness: Triggering updates with expiration

For a certificate holder, one problem is that communication partners may not regularly poll for updates of their certificate.

A certificate holder usually prefers that everyone else regularly obtains updates for their certificate. This way, a third party will not, for example, mistakenly keep using a revoked certificate indefinitely. Instead, in the worst case, they will use the certificate until its expiration date.

Once the expiration date is reached, third parties have to obtain an update for the certificate, for example from a keyserver or via WKD. Ideally, certificate updates are obtained automatically by the user's OpenPGP software, without any need for human intervention.

After the update, the updated copy of the certificate will usually have a fresh expiration time. The same procedure will repeat once that new expiration time has been reached.
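From the point of view of the consuming software, this cycle boils down to a simple check before each use of a certificate. The following Python sketch shows one possible shape of that check. The function name and the optional `margin` parameter (refreshing slightly before the expiration time) are assumptions of this example.

```python
from datetime import datetime, timedelta, timezone


def needs_refresh(expires_at, now, margin=timedelta(0)):
    """Return True if the local copy should be refreshed before further use.

    `expires_at` is the expiration time of the local copy, or None if the
    certificate has no expiration time (in which case this trigger never fires).
    """
    if expires_at is None:
        return False
    return now >= expires_at - margin


expires_at = datetime(2024, 1, 1, tzinfo=timezone.utc)

before = needs_refresh(expires_at, datetime(2023, 12, 1, tzinfo=timezone.utc))
after = needs_refresh(expires_at, datetime(2024, 1, 2, tzinfo=timezone.utc))
early = needs_refresh(
    expires_at,
    datetime(2023, 12, 28, tzinfo=timezone.utc),
    margin=timedelta(days=7),
)
```

The `None` case illustrates why expiration is useful to the certificate holder: without an expiration time, nothing in the certificate itself prompts a refresh.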

Metadata leak of Social Graph

Third-party certifications, which are signatures made by other certificates over identity components, form the backbone of the OpenPGP trust model called the Web of Trust. The name stems from the fact that the collection of certifications forms a directed graph resembling a web. Each edge of the graph connects the signing certificate to an identity component associated with another certificate.

OpenPGP software can inspect that graph, and coupled with trust data and a trust anchor (which usually is the certificate holder's own key), can infer whether the target certificate is genuine.
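In its simplest form, this inference is a reachability search in the certification graph. The following Python sketch is a strong simplification: it treats whole certificates as nodes, and ignores trust amounts, delegation depth, expiration, and revocation, all of which matter in real Web of Trust calculations. The names used are illustrative only.

```python
from collections import deque


def is_reachable(certifications, trust_anchor, target):
    """Breadth-first search over (issuer, subject) certification edges."""
    graph = {}
    for issuer, subject in certifications:
        graph.setdefault(issuer, set()).add(subject)

    seen = {trust_anchor}
    queue = deque([trust_anchor])
    while queue:
        current = queue.popleft()
        if current == target:
            return True
        for neighbor in graph.get(current, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Alice certified Bob, and Bob certified Carol.
certifications = [("alice", "bob"), ("bob", "carol")]

alice_reaches_carol = is_reachable(certifications, "alice", "carol")
carol_reaches_alice = is_reachable(certifications, "carol", "alice")
```

Because the edges are directed, a path from Alice to Carol does not imply a path from Carol to Alice.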

Third-party certifications are published as part of the target certificate to facilitate the process of certificate authentication. Unfortunately, as a side effect of this approach, it's feasible to reconstruct the entire social graph of all people issuing certifications. A certification's signature creation time can be used to deduce whether the certificate owner attended a Key Signing Party (and, if the event was public, where it took place) and whom they interacted with.
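To make the privacy implication concrete, the following Python sketch groups certifications by the UTC day of their signature creation time. The input records (issuer, subject, creation time) are invented for this example. The point is only that this metadata is enough to cluster people who apparently met on the same day.

```python
from collections import defaultdict
from datetime import datetime, timezone


def participants_by_day(certifications):
    """Map each UTC date to the set of parties involved in certifications that day."""
    days = defaultdict(set)
    for issuer, subject, created in certifications:
        day = datetime.fromtimestamp(created, tz=timezone.utc).date().isoformat()
        days[day].update((issuer, subject))
    return dict(days)


# (issuer, subject, signature creation time in seconds since the epoch)
certifications = [
    ("alice", "bob", 1700000000),
    ("carol", "bob", 1700003600),   # one hour later, same day
    ("alice", "dave", 1700100000),  # a different day
]
clusters = participants_by_day(certifications)
```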

(unbound_user_ids)=

Adding unbound User IDs to a certificate

Some OpenPGP subsystems may add User IDs to a certificate, which are not bound to the primary key by the certificate's owner. This can be useful to store local identity information (e.g., Sequoia's public store attaches "pet-names" to certificates, in this way).

Sequoia additionally certifies these foreign User IDs with the local trust root, to facilitate authentication of certificates. It marks all of these additional signatures as non-exportable (using the Exportable Certification subpacket), so that they are not included when the certificate is published, e.g. on keyservers.
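The effect of the non-exportable marker on export can be sketched as follows. This is a conceptual Python model, not Sequoia's API: the `Certification` record is hypothetical, and dropping a User ID that is left without any signature is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certification:
    """Hypothetical, simplified certification over a User ID."""
    issuer: str
    exportable: bool = True  # False models a non-exportable (local) signature


def export_view(user_ids):
    """Build the exported view from a mapping of User ID to its certifications."""
    exported = {}
    for user_id, certifications in user_ids.items():
        kept = [c for c in certifications if c.exportable]
        if kept:  # assumption: a User ID with no remaining signature is omitted
            exported[user_id] = kept
    return exported


local_view = {
    "Alice <alice@example.org>": [Certification("alice")],  # self-signature
    "my friend Alice": [Certification("local-trust-root", exportable=False)],
}
published = export_view(local_view)
```

In this model, the locally attached pet name and its certification never leave the local system.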

(cert-flooding)=

Third-party certification flooding

While a convenience for consumers, indiscriminately accepting and integrating third-party identity certifications comes with significant risks.

Without any restrictions in place, malicious entities can flood a certificate with excessive certifications. Called "certificate flooding," this form of digital vandalism grossly expands the certificate size, making the certificate cumbersome and impractical for users.

It also opens the door to potential denial-of-service attacks, rendering the certificate non-functional or significantly impeding its operation.
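A consumer can defend against this by applying an acceptance policy when importing third-party certifications. The following Python sketch shows one hypothetical policy: accept only certifications from issuers that are already known locally, and enforce an upper bound on the total number kept. The function, its parameters, and the representation of a certification as an (issuer, raw bytes) pair are all assumptions of this example. Real policies differ between implementations.

```python
def accept_certifications(existing, incoming, known_issuers, max_total):
    """Return the certifications to keep after applying a simple import policy."""
    accepted = list(existing)
    seen = set(existing)
    for certification in incoming:
        if len(accepted) >= max_total:
            break  # hard cap: stop accepting once the limit is reached
        issuer, _raw = certification
        if issuer not in known_issuers:
            continue  # unknown issuer: not useful locally, possibly spam
        if certification in seen:
            continue  # exact duplicate
        seen.add(certification)
        accepted.append(certification)
    return accepted


existing = [("alice", b"sig-1")]
flood = [("unknown-%d" % i, b"junk") for i in range(1000)]
incoming = flood + [("bob", b"sig-2")]

kept = accept_certifications(existing, incoming, {"alice", "bob"}, max_total=10)
```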

The popular SKS keyserver network experienced certificate flooding firsthand, causing it to shut down operations in 2019.

TODO: merge in text from ch8:

However, in systems that unconditionally accept these certifications, it can lead to unintended consequences. Specifically, this approach has been exploited to cause denial-of-service attacks through [certificate flooding](https://dkg.fifthhorseman.net/blog/openpgp-certificate-flooding.html), a problem notably experienced by the SKS network of OpenPGP servers. 

  1. In technical terms, the elements of an {term}OpenPGP certificate are a collection of "{term}packets<Packet>." Each {term}component key<OpenPGP Component Key> and {term}identity component is internally represented as a {term}packet. Another common type of {term}packet is the "{term}signature" {term}packet, which connects the {term}components<Component> of a {term}certificate<OpenPGP Certificate>. ↩︎

  2. For ECDH {term}component keys<OpenPGP Component Key>, two additional algorithm parameters are integral to the {term}component key<OpenPGP Component Key>'s constitutive and immutable properties. Those parameters specify a hash function and a {term}symmetric<Symmetric Cryptography> encryption algorithm. ↩︎

  3. In OpenPGP version 4, the rightmost 64 bits were sometimes used as a shorter identifier, called "{term}Key ID." For example, an OpenPGP version 4 {term}certificate<OpenPGP Certificate> with the {term}fingerprint<OpenPGP Fingerprint> B3D2 7B09 FBA4 1235 2B41 8972 C8B8 6AC4 2455 4239 might be referenced by the 64-bit {term}Key ID C8B8 6AC4 2455 4239 or formatted as 0xC8B86AC424554239.
    Historically, even shorter 32-bit identifiers were used, like this: 2455 4239, or 0x24554239. Such identifiers still appear in very old documents about PGP. However, 32-bit identifiers have long been deemed unfit for purpose. At one point, 32-bit identifiers were called "short {term}Key ID," while 64-bit identifiers were referred to as "long Key ID." ↩︎

  4. Historically, the OpenPGP ecosystem faced challenges in this context. For further details, refer to Daniel Kahn Gillmor's January 2019 article, "What were Separated User IDs". ↩︎

  5. It's important to note that the function of the authentication {term}key flag is unrelated to the {term}authentication process used in certifying OpenPGP {term}identities<Identity> and linking them to {term}certificates<OpenPGP Certificate>. Rather, this flag indicates a mechanism that uses {term}cryptographic signatures<OpenPGP Signature Packet> to confirm control of {term}private key material with a remote system. ↩︎

  6. See, for example, here: "Expiration dates really serve two purposes: naturally eliminating unused keys, and enforcing periodical checks on the primary key." ↩︎

  7. Which in OpenPGP version 4 is often a primary User ID binding signature. ↩︎