(certificates_chapter)= # Certificates OpenPGP fundamentally hinges on the concept of "{term}`OpenPGP certificates`," also known as "{term}`OpenPGP public keys`." These {term}`certificates` are complex data structures essential for {term}`identity verification`, data encryption, and {term}`digital signatures`. Understanding their structure and function is pivotal to effectively applying the OpenPGP standard. An {term}`OpenPGP certificate`, by definition, does not contain {term}`private key material`. Fundamentally, the effective management of {term}`certificates` and a thorough grasp of their {term}`authentication` and {term}`trust models` are crucial for proficient OpenPGP usage. Although this document offers just a brief overview of these aspects, they form a fundamental part of the broader OpenPGP framework and warrant further study. - For an in-depth exploration of OpenPGP's {term}`private key material`, refer to {ref}`private_key_chapter`. This chapter provides essential insights into {term}`private key` management and security practices. - The bindings that link the {term}`components` of a {term}`certificate` are comprehensively discussed in {ref}`component_signatures_chapter`, offering a deeper understanding of {term}`certificate` structure and integrity. - Finally, our chapter {ref}`zoom_certificates` discusses the internal structure of {term}`certificates` in detail. ## Terminology: Understanding "keys" The term "{term}`(cryptographic) keys`" is central to grasping the concept of {term}`OpenPGP certificates`. However, it can refer to different entities, making it a potentially confusing term. Let's clarify those differences. ### Public vs. private keys The term "{term}`key`," without additional context, can refer to either public or private {term}`asymmetric` key material. Additionally, {term}`symmetric` keys may be used in OpenPGP to encrypt {term}`private key material`, adding a layer of security and complexity. 

(layers_of_keys_in_openpgp)=
### Layers of keys in OpenPGP

In OpenPGP, the term "{term}`key`" may refer to three distinct layers, each serving a unique purpose:

1. A (bare) ["cryptographic key"](asymmetric_key_pair) comprises the private and/or public parameters forming a key. For instance, in the case of an RSA {term}`private key`, the key consists of the exponent `d` along with the prime numbers `p` and `q`.
2. An OpenPGP *{term}`component key`* is either an "{term}`OpenPGP primary key`" or an "{term}`OpenPGP subkey`." It is a building block of an {term}`OpenPGP certificate`, consisting of a cryptographic keypair coupled with some invariant {term}`metadata`, such as key {term}`creation time`.
3. An "{term}`OpenPGP certificate`" (or "OpenPGP key") consists of several {term}`component keys`, {term}`identity components`, and other elements. These {term}`certificates` are dynamic, evolving over time as {term}`components` are added, {term}`expire`, or are marked as {term}`invalid`.

The following sections delve into the OpenPGP-specific layers (2 and 3) to provide a clearer understanding of their roles within {term}`OpenPGP certificates`.

## Structure of OpenPGP certificates

An {term}`OpenPGP certificate` (or "{term}`OpenPGP key`") is a collection of an arbitrary number of elements[^packets]:

- {term}`Component keys`
- {term}`Identity components`
- Additional {term}`metadata`, including connections between the {term}`certificate`'s {term}`components`

[^packets]: In technical terms, the elements of an {term}`OpenPGP certificate` are a collection of "{term}`packets`." Each {term}`component key` and {term}`identity component` is internally represented as a {term}`packet`. Another common type of {term}`packet` is the "{term}`signature`" {term}`packet`, which connects the {term}`components` of a {term}`certificate`.

This documentation collectively refers to {term}`component keys` and {term}`identity components` as "the {term}`components` of a {term}`certificate`."

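
To make the three layers more tangible, the following is a minimal Python sketch of how they nest. All class and field names here are hypothetical and chosen for illustration only; they do not correspond to the OpenPGP wire format or to any particular library, and signatures are left out entirely.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ComponentKey:
    """Layer 2: bare public key material (layer 1) plus invariant metadata."""
    public_key_material: bytes  # stands in for the bare cryptographic key
    creation_time: datetime     # fixed once the component key is created
    is_primary: bool = False


@dataclass(frozen=True)
class UserID:
    """An identity component, for example a name and an email address."""
    value: str


@dataclass
class Certificate:
    """Layer 3: a primary key plus further components; may grow over time."""
    primary_key: ComponentKey
    subkeys: list[ComponentKey] = field(default_factory=list)
    user_ids: list[UserID] = field(default_factory=list)

    def components(self) -> list:
        """Component keys and identity components, taken together."""
        return [self.primary_key, *self.subkeys, *self.user_ids]


created = datetime(2023, 1, 1, tzinfo=timezone.utc)
cert = Certificate(primary_key=ComponentKey(b"\x01", created, is_primary=True))
cert.subkeys.append(ComponentKey(b"\x02", created))
cert.user_ids.append(UserID("Alice Adams <alice@example.org>"))
```

The component keys are frozen because their constitutive data does not change after creation, while the certificate object is mutable because components can be added later.
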
```{figure} diag_converted/Components_of_an_OpenPGP_Certificate.svg :name: fig-openpgp-certificate-components :alt: Depicts a box with white background and the title "OpenPGP certificate". In the box several other boxes and accompanying texts, representing component keys and User IDs, are shown. There are three component keys boxes with a green frame, each with a dotted lower-left section, that shows the text "key creation time" and the green public key symbol in the lower right area. All three have a title, a unique fingerprint below the box and a unique capability keyword, perpendicular to the box on the right side. The top-most component key box has a light-green background, with the title "Component Key (primary)" and capability keyword "certification". The second-to-top component key box has a white background, with the title "Component Key" and capability keyword "encryption". The lowest component key box has a white background, with the title "Component Key" and capability keyword "signing". There are two User ID boxes, each with a black frame, open to top left and lower right corner. Both boxes have a user icon on the top left side, the title "User ID" on the top right side and a User ID string at the bottom. The top box has "Alice Adams " and the lower box has "Alice" as User ID string. Typical {term}`components` in an {term}`OpenPGP certificate` ``` Every element in an {term}`OpenPGP certificate` revolves around a central {term}`component`: the *{term}`OpenPGP primary key`*. The primary key acts as a personal *{term}`certification authority`* ({term}`CA`) for the {term}`certificate`'s owner, enabling cryptographic statements regarding {term}`subkeys`, {term}`identities`, {term}`expiration`, {term}`revocation`, and more. ```{note} {term}`OpenPGP certificates` tend to have a long lifespan, with the potential for modifications (typically by their owner) over time. 
{term}`Components` may be added or {term}`invalidated` throughout a {term}`certificate`'s lifetime. However, once published, {term}`components` [cannot be removed](append-only) from {term}`certificates`. ``` (component_keys)= ## Component keys An {term}`OpenPGP certificate` usually contains multiple {term}`component keys`. {term}`Component keys` serve in one of two roles: either as an "{term}`OpenPGP primary key`" or as an "{term}`OpenPGP subkey`." {term}`OpenPGP component keys` logically consist of an [asymmetric cryptographic keypair](asymmetric_key_pair) and a creation timestamp. Once created, these attributes of a {term}`component key` remain fixed (for ECDH keys, two additional parameters are part of a {term}`component key`'s constitutive data[^ecdh-parameters]). [^ecdh-parameters]: For [ECDH](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-10.html#name-algorithm-specific-part-for-ecd) {term}`component keys`, two additional algorithm parameters are integral to the {term}`component key`'s constitutive and immutable properties. Those parameters specify a hash function and a {term}`symmetric` encryption algorithm. ```{figure} diag_converted/Component_Key.svg :name: fig-component-key :alt: Depicts a box with white background and no title. In the box one other box is shown. The inner box has a green frame, with a dotted lower-left section, that shows the text "key creation time" and the green public key symbol, as well as the red-dotted private key symbol in the lower right area. In the top left of the inner box the text reads "Component Key." An {term}`OpenPGP component key` ``` {term}`Component keys` containing {term}`private key material` also include {term}`metadata` specifying the password protection scheme. This is another facet of {term}`metadata`, akin to the aforementioned creation timestamp and additional parameters for certain algorithms. 
However, this discussion focuses on {term}`OpenPGP certificates`, in which the {term}`component keys` contain only the public part of its cryptographic key data. For information on {term}`private keys` in OpenPGP, see {numref}`private_key_chapter`. (fingerprint)= ### Fingerprint Each {term}`OpenPGP component key` possesses an *{term}`OpenPGP fingerprint`*. This {term}`fingerprint` is derived from the {term}`public key material`, the {term}`creation timestamp`, and, when relevant, the ECDH parameters. ```{figure} diag_converted/Fingerprint.svg :name: fig-fingerprint :alt: Depicts a box with white background and the title "Fingerprint of an OpenPGP component key." Inside, another box with a green frame, the title "Component Key", the text "key creation time" on the lower left and a the green public key symbol on the lower right is shown. Below the component key box a fingerprint in a box with a light-yellow background and a yellow dotted line is depicted. The word "Fingerprint" is shown left of the box with the fingerprint and both are connected with a yellow dotted line. Every {term}`OpenPGP component key` is identifiable by a {term}`fingerprint`. ``` The {term}`fingerprint` of our example {term}`OpenPGP component key` is `C0A5 8384 A438 E5A1 4F73 7124 26A4 D45D BAEE F4A3 9E6B 30B0 9D55 13F9 78AC CA94`[^keyid]. [^keyid]: In OpenPGP version 4, the rightmost 64 bits were sometimes used as a shorter identifier, called "{term}`Key ID`." For example, an OpenPGP version 4 {term}`certificate` with the {term}`fingerprint` `B3D2 7B09 FBA4 1235 2B41 8972 C8B8 6AC4 2455 4239` might be referenced by the 64-bit {term}`Key ID` `C8B8 6AC4 2455 4239` or formatted as `0xC8B86AC424554239`. Historically, even shorter 32-bit identifiers were used, like this: `2455 4239`, or `0x24554239`. Such identifiers still appear in very old documents about PGP. However, [32-bit identifiers have been long deemed unfit for purpose](https://evil32.com/). 
At one point, 32-bit identifiers were called "short {term}`Key ID`," while 64-bit identifiers were referred to as "long Key ID." ```{note} In practice, the {term}`fingerprint` of a {term}`component key`, while not theoretically unique, functions effectively as a unique identifier. The use of a [cryptographic hash algorithm](crypto-hash) in generating {term}`fingerprints` makes the occurrence of two different {term}`component keys` with the same {term}`fingerprint` extremely unlikely[^finger-unique]. ``` [^finger-unique]: For both {term}`OpenPGP version 6` and version 4, the likelihood of accidental occurrence of duplicate {term}`fingerprints` is negligible when {term}`key material` is generated based on an acceptable source of entropy. A separate question is if an attacker can purposely craft a second key with the same {term}`fingerprint` as a given pre-existing {term}`component key`. With the current state of the art, this is not possible for OpenPGP version 6 and version 4 keys. However, at the time of this writing, the SHA-1-based {term}`fingerprints` of OpenPGP version 4 are considered insufficiently strong at protecting against the generation of pairs of {term}`key material` with the same {term}`fingerprint`. (primary_key)= ### Primary key The {term}`OpenPGP primary key` is a {term}`component key` that serves a distinct, central role in an {term}`OpenPGP certificate`: - Its {term}`fingerprint` acts as an identifier for the entire {term}`OpenPGP certificate`. - It facilitates lifecycle operations, such as adding or {term}`invalidating` {term}`subkeys` or {term}`identities` within a {term}`certificate`. ```{admonition} Terminology :class: note In the {term}`RFC`, the {term}`OpenPGP primary key` is occasionally referred to as "top-level key." Informally, it has also been termed the "{term}`master key`." 
``` (subkeys)= ### Subkeys Modern {term}`OpenPGP certificates` typically include several {term}`subkeys` in addition to the {term}`primary key`, although these {term}`subkeys` are optional. While {term}`subkeys` have the same structural attributes as the {term}`primary key`, they fulfill different roles. {term}`Subkeys` are cryptographically linked with the {term}`primary key`, a relationship further discussed in {numref}`binding_subkeys`. ```{figure} diag_converted/Binding_Subkeys.svg :name: fig-subkeys :alt: Diagram depicting three component keys. The primary key is positioned at the top, designated for certification. Below it, connected by arrows, are two subkeys labeled as "for encryption" and "for signing," respectively. {term}`OpenPGP certificates` can contain multiple {term}`subkeys`. ``` (identity_components)= ## Identity components {term}`Identity components` in an {term}`OpenPGP certificate` are used by the {term}`certificate holder` to state that they are known by a certain identifier (like a name, or an email address). (user_ids_in_openpgp_certificates)= ### User IDs in OpenPGP certificates {term}`OpenPGP certificates` can contain multiple [User IDs](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-10.html#name-user-id-packet-tag-13). Each {term}`User ID` associates the {term}`certificate` with an {term}`identity`. ```{figure} diag_converted/Binding_a_UserID.svg :name: fig-user-ids :alt: Depicts a diagram with white background and the title "User IDs". Inside, a public primary component key for certification and a User ID is shown. A green arrow points from component key to User ID and is annotated with a signature. Relationship of {term}`User ID` to primary {term}`component key` in an {term}`OpenPGP certificate` ``` A typical {term}`User ID` {term}`identity` is a UTF-8-encoded string composed of a name and an email address. 
By convention, {term}`User IDs` align with the format described in [RFC2822](https://www.rfc-editor.org/rfc/rfc2822) as a *name-addr*. For further conventions on {term}`User IDs`, refer to the document [draft-dkg-openpgp-userid-conventions-00](https://datatracker.ietf.org/doc/draft-dkg-openpgp-userid-conventions/), dated 25 August 2023.

**Split User IDs**

One proposed variant for encoding {term}`identities` in {term}`User IDs` is to use ["split User IDs"](https://dkg.fifthhorseman.net/blog/2019-dkg-openpgp-transition.html#split-user-ids). Although split {term}`User IDs` are uncommon, there are currently no significant technical barriers to using this format[^dkg-split].

[^dkg-split]: Historically, the OpenPGP ecosystem faced challenges in this context. For further details, refer to Daniel Kahn Gillmor's January 2019 article, ["What were Separated User IDs"](https://dkg.fifthhorseman.net/blog/2019-dkg-openpgp-transition.html#what-were-separated-user-ids).

The rationale for split {term}`User IDs` lies in the distinction between a name and an email address, which represent two separate facets of an individual's {term}`identity`. Separating these elements simplifies the process for third parties tasked with certifying that an {term}`identity` is legitimately connected to a {term}`certificate`.

Consider this scenario: A third party is confident about the email-based {term}`identity` of an individual (e.g., `<alice@example.org>`) and is willing to certify it. However, they might not have sufficient knowledge about the person's name-based {term}`identity` (e.g., `Alice Adams`), and so may be unwilling to extend the same level of {term}`certification`. Split {term}`User IDs` address this dichotomy by allowing distinct {term}`certification` processes for each type of {term}`identity`.

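
As an illustration of the difference between a conventional and a split User ID, the following sketch uses Python's standard `email.utils.parseaddr` to separate a *name-addr* style string into its two facets. The helper `split_user_id` is hypothetical, and it only handles the conventional `Name <address>` shape; it is not a full parser for arbitrary User ID strings.

```python
from email.utils import parseaddr


def split_user_id(user_id: str) -> list[str]:
    """Turn a conventional 'Name <address>' User ID into separate User IDs."""
    name, address = parseaddr(user_id)
    parts = []
    if name:
        parts.append(name)            # name-based identity
    if address:
        parts.append(f"<{address}>")  # email-based identity
    return parts


print(split_user_id("Alice Adams <alice@example.org>"))
```

With split User IDs, each resulting string would be bound to the certificate as its own identity component, so a third party could certify the email-based identity without making any statement about the name.
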

(primary_user_id)=
### Implications of the Primary User ID

Within a {term}`certificate`, a specific {term}`User ID` is designated as the [Primary User ID](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-10.html#name-primary-user-id).

Each {term}`User ID` carries associated preference settings, such as preferred encryption algorithms, which are detailed in {numref}`zooming_in_user_id`. When a {term}`certificate` is used in the context of a specific {term}`identity`, the preferences associated with that {term}`identity component` are used. When a {term}`certificate` is used without reference to a specific {term}`identity`, the preferences associated with the {term}`direct key signature` or the {term}`primary User ID` take precedence by default.

The {term}`primary User ID` was historically the main store for preferences that apply to the {term}`certificate` as a whole. For more on this, see {ref}`primary-metadata`.

### User attributes in OpenPGP

While [user attributes](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-10.html#name-user-attribute-packet-tag-1) are similar to {term}`User IDs`, they are less commonly used.

Currently, the OpenPGP standard prescribes only one format to be stored in user attributes: an [image](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-10.html#name-the-image-attribute-subpack) in JPEG format. Typically, this image represents the key owner, although this is not required.

## Linking the components

To form an {term}`OpenPGP certificate`, individual {term}`components` are interconnected by the {term}`certificate holder` using their OpenPGP software. Within OpenPGP, this process is termed "binding," as in "a {term}`subkey` is bound to the {term}`primary key`." These bindings are realized using cryptographic {term}`signatures`. An in-depth discussion of this topic can be found in {ref}`component_signatures_chapter`.

In very abstract terms, the {term}`primary key` of a {term}`certificate` acts as a root of trust or "{term}`certification authority`." It is responsible for: - issuing {term}`signatures` that express the {term}`certificate holder`'s intent to use specific {term}`subkeys` or {term}`identity components`; - conducting other lifecycle operations, including setting {term}`expiration` dates and marking {term}`components` as {term}`invalidated` or "`revoked`." By binding {term}`components` using digital {term}`signatures`, recipients of an {term}`OpenPGP certificate` need only {term}`validate` the {term}`authenticity` of the {term}`primary key` to use for their communication partner. Traditionally, this is done by manually verifying the *{term}`fingerprint`* of the {term}`primary key`. Once the {term}`validity` of the {term}`primary key` is confirmed, the {term}`validity` of the remaining {term}`components` can be automatically assessed by the user's OpenPGP software. Generally, {term}`components` are {term}`valid` parts of a {term}`certificate` if there is a statement signed by the {term}`certificate`'s {term}`primary key` endorsing this {term}`validity`. (metadata_in_certificates)= ## Metadata in certificates {term}`OpenPGP certificates`, their {term}`component keys`, and {term}`identities` possess {term}`metadata` that is not stored within the {term}`components` it pertains to. Instead, this {term}`metadata` is stored within signature packets, which are integral to the structure of an OpenPGP certificate. Key attributes, such as {term}`capabilities` (like *signing* or *encryption*) and {term}`expiration times`, are examples of {term}`metadata` not stored in the {term}`component key` data. 
How this {term}`metadata` is stored depends on the {term}`component`: - **{term}`Primary key` {term}`metadata`** is defined either through a {term}`direct key signature` on the {term}`primary key` (preferred in OpenPGP version 6), or by associating the {term}`metadata` with the [Primary User ID](primary_user_id). - **{term}`Subkey` {term}`metadata`** is defined within the [subkey binding signature](binding_subkeys) that links the {term}`subkey` to the {term}`certificate`. - **{term}`Identity component` {term}`metadata`** is associated via the [certifying self-signature](bind_ident) that links the {term}`identity` (usually in the form of a {term}`User ID`) to the {term}`certificate`. It is crucial to note that the {term}`components` of an {term}`OpenPGP certificate` remain static after their creation. The use of {term}`signatures` to store {term}`metadata` allows for subsequent modifications without altering the original {term}`component`. For instance, a {term}`certificate holder` can update the {term}`expiration time` of a {term}`component` by issuing a new, superseding {term}`signature`. ```{figure} diag_converted/Primary_key_metadata.svg :name: fig-primary-metadata :alt: Depicts a direct key signature, associated with a primary component key. {term}`Metadata` can be associated with the {term}`primary key` using a *{term}`direct key signature`*. ``` (capabilities_key_flags)= ### Defining operational capabilities of component keys with key flags Each {term}`component key` has a set of ["key flags"](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-10.html#key-flags) that delineate the operations a key can perform. 
Commonly used {term}`key flags` include:

- **{term}`Certification`**: enables issuing third-party {term}`certifications`
- **{term}`Signing`**: allows the key to sign data
- **{term}`Encryption`**: allows the key to encrypt data
- **{term}`Authentication`**: primarily used for SSH authentication[^auth-flag]

[^auth-flag]: It's important to note that the function of the [authentication](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-authentication-via-digital-) {term}`key flag` is unrelated to the {term}`authentication` process used in certifying OpenPGP {term}`identities` and linking them to a {term}`certificate`. Rather, this flag indicates a mechanism that uses {term}`cryptographic signatures` to confirm control of {term}`private key material` with a remote system.

```{note}
Distinct {term}`component keys` handle specific operations. Only the {term}`primary key` can be used for {term}`certification`, although it can have additional {term}`capabilities`. {term}`Subkeys` can be used for signing, encryption, and authentication but cannot have the {term}`certification` {term}`capability`.

A {term}`component key` can technically have multiple {term}`capabilities`. It is considered good practice, however, to [use separate keys for each capability](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#section-10.1.5-7).

Notably, in many algorithms, encryption and signing-related functionalities (i.e., {term}`certification`, {term}`signing`, {term}`authentication`) are mutually exclusive, because the algorithms only support one of those two families of operations[^key-flag-sharing].
```

[^key-flag-sharing]: With ECC algorithms, it's impossible to combine {term}`encryption` functions with those intended for {term}`signing`. For example, ed25519 is specifically used for {term}`signing`; cv25519 is designated for {term}`encryption`.

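
On the wire, key flags are stored as a bit field. The following sketch decodes the first octet of a Key Flags subpacket using the bit values defined in the specification. The class and function names are hypothetical, and only the commonly used flags are modeled. Note that the specification distinguishes two encryption flags (communications and storage), which this chapter groups under "encryption."

```python
from enum import IntFlag


class KeyFlags(IntFlag):
    """Commonly used bits in the first octet of the Key Flags subpacket."""
    CERTIFY = 0x01
    SIGN = 0x02
    ENCRYPT_COMMUNICATIONS = 0x04
    ENCRYPT_STORAGE = 0x08
    AUTHENTICATE = 0x20


def describe(first_octet: int) -> list[str]:
    """Return the names of all modeled flags that are set in the octet."""
    return [flag.name for flag in KeyFlags if first_octet & flag]


# A primary key that certifies and signs, and an encryption-only subkey.
print(describe(0x03))
print(describe(0x0C))
```

Bits that are not modeled here are simply ignored by `describe`.
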
### Algorithm preferences and feature signaling OpenPGP incorporates significant ["cryptographic agility"](https://en.wikipedia.org/wiki/Cryptographic_agility). It doesn't rely on a single fixed set of algorithms. Instead, it defines a suite of cryptographic primitives from which users (or their applications) can choose. This agility facilitates the easy adoption of new cryptographic primitives into the standard, allowing for a seamless transition. Users can gradually migrate to new cryptographic mechanisms without disruption. However, this approach requires that OpenPGP software determine the cryptographic mechanisms that a set of communication partners can handle and prefer. OpenPGP employs several mechanisms for this purpose, which allow negotiation between sender and recipient. It's important to note that OpenPGP is not an online scheme; thus, this negotiation is effectively one-way. The active party interprets the preferences expressed in the {term}`certificate` of the passive party. Negotiation mechanisms in OpenPGP include: - [Preferred hash algorithms](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#preferred-hashes-subpacket) - [Preferred symmetric ciphers for v1 SEIPD](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#preferred-v1-seipd) - [Preferred AEAD ciphersuites](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#preferred-v2-seipd) - [Features subpacket](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#features-subpacket) - [Preferred compression algorithms](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#preferred-compression-subpacket) Beyond these explicitly expressed preferences, implementations also deduce {term}`capabilities` of communication partners based on the version of the {term}`OpenPGP certificate` they possess. 
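
The one-way nature of this negotiation can be sketched as follows. This is a simplified illustration, not the selection algorithm of any specific implementation: the function name, the string identifiers for ciphers, and the `fallback` parameter (standing in for an algorithm that every recipient is assumed to support) are all assumptions made for the example.

```python
def choose_cipher(sender_supported, recipient_preferences, fallback="AES128"):
    """Pick one cipher for a message to several recipients.

    sender_supported: ciphers the sender can use, in the sender's preferred order.
    recipient_preferences: one preference list per recipient, as read from
        each recipient's certificate. The recipients are never asked directly.
    """
    for cipher in sender_supported:
        if all(cipher in prefs for prefs in recipient_preferences):
            return cipher
    return fallback


sender = ["AES256", "AES192", "AES128"]
alice_prefs = ["AES256", "AES128"]
bob_prefs = ["AES192", "AES128"]

print(choose_cipher(sender, [alice_prefs, bob_prefs]))
```

Here the sender settles on `AES128`, the only cipher that appears in both recipients' preference lists, even though each recipient individually would have allowed a stronger preference.
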
#### User ID-specific preferences As a starting point, a {term}`certificate` has a set of preferences that apply generally. These are defined either in a {term}`direct key signature`, or via the {term}`primary User ID` of the {term}`certificate`. Additionally, OpenPGP allows modeling {term}`User ID`-specific preferences. The idea is that a user may prefer a different suite of algorithms on their private email account compared to their work email account. Such {term}`identity`-specific preferences can be expressed on the certifying {term}`signatures` that bind {term}`User IDs` to a {term}`certificate`. ## A typical OpenPGP certificate, revisited Following our review of how {term}`keys` and {term}`identity components` are linked, let's reexamine the {term}`OpenPGP certificate` from {numref}`fig-openpgp-certificate-components`. Our focus now extends to all of its binding signatures and the {term}`direct key signature` that contains {term}`metadata` for the full {term}`certificate`: ```{figure} diag_converted/OpenPGP_Certificate.svg :name: fig-openpgp-certificate :alt: Depicts an OpenPGP certificate, including a set of components, binding signatures, and a direct key signature on the primary key. This shows a typical {term}`OpenPGP certificate`, including binding {term}`signatures` for all of its {term}`components`, and a {term}`signature` that associates {term}`metadata` with the {term}`primary key`. ``` (revocations)= ## Revocations When a {term}`certificate holder` needs to {term}`invalidate` certain {term}`components` of their {term}`certificate`, or even the entire {term}`certificate`, they accomplish this through "{term}`revocation`." {term}`Revoking` the {term}`primary key` renders the entire {term}`certificate` {term}`invalid`. Notably, {term}`revocations` are not the only means by which {term}`components` can become {term}`invalid`. 
Other factors, such as the passing of a {term}`component`'s {term}`expiration time`, can also render {term}`components` {term}`invalid`.

For more detailed information on {term}`revoking` specific {term}`components` of a {term}`certificate`, see the section on {ref}`self-revocations`.

(third_party_identity_certifications)=
## Third-party (identity) certifications

{term}`Third-party identity certifications` have been a cornerstone of the OpenPGP ecosystem since its inception. The original PGP designers, starting with Phil Zimmermann, advocated for decentralized {term}`trust models` over reliance on centralized authorities. This decentralized approach in OpenPGP is known as the ["Web of Trust."](wot)

Third-party {term}`certifications` are statements by OpenPGP users confirming that a user with a specific {term}`identity` is the owner of a particular {term}`OpenPGP certificate`.

For example, Bob's OpenPGP software may issue a {term}`certification` stating that Bob has checked that the {term}`User ID` `Alice Adams <alice@example.org>` and the {term}`certificate` with the {term}`fingerprint` `AAA1 8CBB 2546 85C5 8358 3205 63FD 37B6 7F33 00F9 FB0E C457 378C D29F 1026 98B3` are legitimately linked.

This process assumes that Bob knows the person known as `Alice Adams` and is confident that `alice@example.org` is indeed Alice's email address. Bob also verifies that the {term}`certificate` his OpenPGP software associates with Alice matches the one Alice uses. In essence, both users must have a {term}`certificate` for Alice with an identical {term}`fingerprint`.
In OpenPGP version 6, manual {term}`fingerprint` comparison by end-users is discouraged, with a replacement {term}`verification` mechanism still under development. The {term}`verification` process must occur over a sufficiently secure channel, such as an end-to-end encrypted video call or a face-to-face meeting.

For more on third-party {term}`certifications`, see {ref}`third_party_cert`.

## Advanced topics

### When are certificates valid?

Certificates are composites of components that are linked together using [signatures](08-signing_components). A certificate can be valid or invalid as a whole. However, even when a certificate is valid, individual components (subkeys or identities) of it can be invalid.

In this section, we discuss the validity of certificates and their components. This discussion is closely related to [signature validity](verification_chapter), and builds on that concept. The validity of the signatures that link a certificate's components is a necessary precondition for the validity of the certificate.

Two concepts are particularly central to the validity of certificates and components:

- Expiration
- Revocation

#### Expiration

Certificates and components can "expire," which renders them invalid. Each component of a certificate can have an expiration time, or be unlimited in its temporal validity.

The OpenPGP software of a sender will refuse to encrypt email to an expired certificate, or to an encryption component key that is expired. The sender's software rejects encryption to the key essentially as a courtesy to the certificate owner, respecting the preferences expressed in their certificate metadata.

The expiration mechanism in OpenPGP is complemented by a mechanism to extend or renew the expiration time.

Using the expiration mechanism is useful for two reasons:

- Expiration of a certificate means that it cannot be used anymore. This forces users of that certificate (or their OpenPGP software) to poll for updates to it, for example from a keyserver.
- It is a passive way for certificates to "time out," e.g., if their owner loses control over them, or isn't able to broadcast a revocation, for any reason.

Component keys use *Key Expiration Time* subpackets for expressing the expiration time. Identity components rely on the [*signature expiration time*](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-10.html#signature-expiration-subpacket) subpacket of their binding signature. If a binding signature expires, the binding becomes invalid, and the component is considered expired.

#### Revocation

Since OpenPGP certificates act as ["append only" data structures](append-only), existing components or signatures cannot simply be "removed." Instead, they can be marked as invalid by issuing revocation signatures. These additional revocation signatures are added to the certificate.

Each component, such as a User ID or a subkey, may be revoked without affecting the rest of the certificate. Revoking the primary key with a [*Key revocation signature*](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-key-revocation-signature-ty) (type ID `0x20`) is a special case: This marks the entire certificate, including all of its components, as unusable.

#### Semantics of Revocations

In contrast to expiration, revocation is typically final and not withdrawn. A revocation indicates that the component should not be used.

Revocation signatures over components use a [*Reason for Revocation*](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#reason-for-revocation) subpacket to specify further details about the reason why the component or certification was revoked. The OpenPGP format specifies a set of distinct [values for *Reasons for Revocation*](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#table-10), and additionally provides a human-readable free-text field for comments about the revocation.

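
As a small illustration of this structure, the following sketch decodes the body of a *Reason for Revocation* subpacket, which consists of a one-octet reason code followed by a UTF-8 comment string. The function and table names are hypothetical, and only the commonly defined reason codes are listed; other codes are reported as unknown.

```python
REVOCATION_REASONS = {
    0: "No reason specified",
    1: "Key is superseded",
    2: "Key material has been compromised",
    3: "Key is retired and no longer used",
    32: "User ID information is no longer valid",
}


def parse_reason_for_revocation(body: bytes) -> tuple[str, str]:
    """Split a subpacket body into (machine-readable reason, free-text comment)."""
    if not body:
        raise ValueError("empty Reason for Revocation subpacket body")
    code = body[0]
    comment = body[1:].decode("utf-8", errors="replace")
    reason = REVOCATION_REASONS.get(code, f"Unknown reason ({code})")
    return reason, comment


print(parse_reason_for_revocation(b"\x01Moved to a new key"))
```
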
Some libraries, such as Sequoia PGP, expose these distinct reasons to users, enabling nuanced machine-readable statements by the revoker. Other implementations focus mainly on the distinction between "hard" and "soft" revocations.

Of the defined revocation types, *Key is superseded*, *Key is retired*, and *User ID is no longer valid* are considered "soft" revocations. Any other reason (including a missing *Reason for Revocation* subpacket) means that the revocation is "hard."

The distinction between hard and soft revocations plays a role when evaluating the validity of a component or signature at a specified reference time: Hard revocations have unbounded [temporal validity](temporal-validity); they are in effect even before their creation time. Hard revocations invalidate the revoked component or signature at all points in time. By contrast, a soft revocation leaves the revoked component or signature valid before the creation time of the revocation signature. A soft revocation can technically be overridden, for example, with a newer binding signature.

Hard revocations address the following problem: If a private key was compromised, then the attacker can issue signatures using that key. This means that the attacker could issue a backdated signature, with a creation time before the revocation, impersonating the owner of the key. A recipient of that signature would mistakenly consider it valid if the issuing key had only been soft revoked. To counteract this problem, it is reasonable to clearly mark compromised keys as suspect at every point in time. That's what hard revocations do.

On the other hand, if a subkey was merely retired, and the certificate holder moved to a different subkey, then the signatures made in the past by the retired key are still valid.

(append-only)=
### Certificates are effectively append-only data structures

OpenPGP certificates act as *append-only data structures*, in practice.
By this, we mean that packets that are associated with a certificate cannot be "recalled" once they have been published. Third parties (such as other users, or keyservers) may keep and/or distribute copies of those packets.

While it is not possible to *remove* elements once they have been publicly associated with an OpenPGP certificate, it is possible to invalidate them by adding new metadata to the certificate. This new metadata could set an *expiration time* on a component, or explicitly *revoke* that component. In both cases, no packets are removed from the certificate.

Invalidation resembles removal of a component in a semantic sense. The component is not a valid element of the certificate anymore, at least starting from some point in time. Implementations that handle the certificate may omit the invalid component in their representation.

We have to distinguish the "packet level" information about a certificate from an application-level view of that certificate. The two may differ.

#### Reasoning about append-only properties in a distributed system

OpenPGP is a thoroughly distributed system. Users can obtain and transmit certificate information about their own, as well as other users', certificates using a broad range of mechanisms. These mechanisms include keyservers, manual handling, [Web Key Directory](https://datatracker.ietf.org/doc/draft-koch-openpgp-webkey-service/) (WKD) and [Autocrypt](https://en.wikipedia.org/wiki/Autocrypt).

Different users' OpenPGP software may obtain different views of a particular certificate, over time. Individual users' OpenPGP instances have to reconcile and store a combined version of the possibly disparate elements they obtain from different sources.

In practice, this means that various OpenPGP users may have differing views of any given certificate. For various reasons, not all users will be in possession of a fully up-to-date and complete version of a certificate.
There are various potential problems associated with this fact: Users may not be aware that a component has been invalidated by the certificate holder. Revocations may not have been propagated to some third party. For example, a third party may not be aware that the certificate holder has rotated their encryption subkey to a new one, and doesn't want to receive messages encrypted to the previous encryption subkey.

One mechanism that addresses a part of this issue is *expiration*: By setting their certificates to expire after an appropriate interval, certificate holders can force their communication partners to refresh their certificate, e.g. from a keyserver[^mgorny].

[^mgorny]: See, for example, [here](https://blogs.gentoo.org/mgorny/2018/08/13/openpgp-key-expiration-is-not-a-security-measure/): "Expiration dates really serve two purposes: naturally eliminating unused keys, and enforcing periodical checks on the primary key."

Good practices, like setting appropriate expiration times, can mitigate the complexity of the inherently distributed nature of certificates. However, such mitigations by definition cannot address all possible cases of outdated certificate information in a decentralized, asynchronous system such as OpenPGP.

So a defensive approach is generally appropriate when reasoning about the view of certificates that different actors have. When thinking about edge cases, it's useful to "assume the worst." For example:

- Recipients may not obtain updates to a certificate in a timely manner (this could happen for various reasons, including, but not limited to, interference by malicious actors).
- Data associated with a certificate may compound, and can become too large for convenient handling. If such a problem arises, then by definition, the certificate holder cannot address it: recall that the certificate holder cannot "recall" existing packets.
#### Differing "views" of a certificate exist

Another way to think about this discussion is that different OpenPGP users may have a different view of any certificate. There is a notional "canonical" version of the certificate, but we cannot assume that every user has exactly this copy.

Besides propagation of elements that the certificate holder has linked to a certificate, third-party certifications are by design a distributed mechanism. A third-party certification is issued by a third party, and may or may not be distributed widely by them, or by the certificate holder. Not distributing third-party certifications widely is a workflow that may be entirely appropriate for some use cases.

As a general tendency, it is desirable for OpenPGP users to have the most complete possible view of all certificates that they interact with.

However, there are contexts in which it is preferable to only use a subset of the available elements of a certificate. We discuss this in the section {ref}`cert-mini`.

### Merging

As described above, OpenPGP certificates are effectively [append-only](append-only) data structures. As part of the practical realization of this fact, OpenPGP software needs to *merge* different copies of a certificate.

For example, Bob's OpenPGP system may have a local copy of Alice's certificate, and obtain a different version of Alice's certificate from a keyserver. The goal of the implementation is to add new information about Alice's certificate, if any, to the local copy. Alice may have added a new identity, replaced a subkey with a new subkey, or revoked some components of her certificate. Or, Alice may have revoked her certificate, signaling that she doesn't want communication partners to use that certificate anymore. All of these updates could be crucial for Bob to be aware of.

Merging two versions of a certificate involves making decisions about which packets should be kept.
The versions of the certificate will typically contain some packets that are identical. No duplicates of the exact same packet should be stored in the merged version of the certificate. Additionally, if the newly obtained copy contains packets that are in fact entirely unrelated to the certificate, those should not be retained (a third party may have included unrelated packets, either by mistake, or with malicious intent).

#### Handling unauthenticated information

For information that *is* related to the certificate, but not bound to it by a self-signature, there is no generally correct approach. The receiving implementation must resolve these cases, possibly in a context-specific manner. Such cases include:

- Third-party certifications. These could be valuable information, where a third party attests that the association of an identity to a certificate is valid. On the other hand, they could also be a type of spam.
- Subpackets in the unhashed area of a signature packet. Again, these could contain information that is useful to the recipient. However, the data could also be either useless, or even misleading/harmful.

(cert-mini)=
### Certificate minimization

Certificate minimization is the practice of presenting a partial view of a certificate by filtering out some of its components.

Filtering out some elements of a certificate can have different benefits:

- For some workflows it's clear that the full certificate is not required. For example, email clients only need encryption, signing and certification component keys. They don't need authentication subkeys, which are used for SSH connections.
- In some contexts, data can be added to certificates by third parties, e.g. by adding third-party User ID certifications on some key servers.
  In the worst case this can lead to ["certificate flooding"](https://dkg.fifthhorseman.net/blog/openpgp-certificate-flooding.html) which inflates the target certificate to a point where consumer software rejects the certificate completely. Filtering out elements can mitigate this.
- Sometimes, a certificate organically grows so big that the user software [has problems handling it](https://www.reddit.com/r/GnuPG/comments/bp23p4/my_key_is_too_large/).

#### Elements that can be omitted as part of a minimization process

There are different types of elements that can be omitted during minimization:

- Subkeys (along with signatures on those subkeys)
- Identity components (along with both their self-signatures and third-party signatures)
- Signatures, by themselves:
  - Self-signatures that have been superseded by newer self-signatures for the same purpose
  - Third-party certifications

#### Minimization in applications

##### Hagrid, which runs keys.openpgp.org

The [hagrid keyserver software](https://gitlab.com/keys.openpgp.org/hagrid) doesn't publish the identity components in certificates by default. This is a central aspect of the [privacy policy](https://keys.openpgp.org/about/privacy) of the service. Certificates can be uploaded to the service by third parties, which is useful. However, identifying information is only distributed by the service on an explicit opt-in basis.

Separately, third-party certifications are currently filtered out by the service, to avoid flooding attacks.

##### GnuPG

GnuPG [strips some signatures on key import](https://dev.gnupg.org/T4607#127792).

In addition, GnuPG offers two explicit methods for certificate minimization, described [in the GnuPG manual](https://www.gnupg.org/documentation/manuals/gnupg-devel/OpenPGP-Key-Management.html) as:

*clean*
: *Compact (by removing all signatures except the selfsig) any user ID that is no longer usable (e.g. revoked, or expired). Then, remove any signatures that are not usable by the trust calculations.
Specifically, this removes any signature that does not validate, any signature that is superseded by a later signature, revoked signatures, and signatures issued by keys that are not present on the keyring.*

*minimize*
: *Make the key as small as possible. This removes all signatures from each user ID except for the most recent self-signature.*

`clean` removes third-party signatures by certificates that are not present in the current keyring, as well as other stale data. `minimize` removes superseded signatures that are not needed at the point when the command is executed.

#### Limitations that can result from stripping historical self-signatures

Some implementations, such as Sequoia, prefer to rely on the full historical set of self-signatures to construct a view of the certificate over time. This way, signatures can be verified at different reference times. In this model, removing superseded self-signatures can cause problems with the validation of historical signatures.

An example of the tension between minimization and nuanced verification of the [temporal validity](temporal-validity) of signatures can be seen in the case of rpm-sequoia. See [this discussion](https://github.com/rpm-software-management/rpm-sequoia/issues/50#issuecomment-1689642607) for details:

Initially, when checking the validity of a data signature for a software package, `rpm-sequoia` used the signature's creation time as the reference time. However, the availability of historical self-signatures in certificates is limited. So sometimes only a more recent self-signature for the primary key is available, and there is no evidence that the primary key was valid at the reference time. To deal with this reality, the rpm-sequoia implementation was adjusted to accept data signatures that predate the validity of the current primary key self-signature[^primary-self-sig].

[^primary-self-sig]: Which in OpenPGP version 4 is often a primary User ID binding signature.
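The effect of stripped self-signatures on time-based verification can be illustrated with a small Python sketch. It is a simplified model, not the rpm-sequoia or Sequoia implementation: self-signatures are reduced to their creation times, the function names are hypothetical, and the fallback policy (use the oldest remaining self-signature when none covers the reference time) is an assumption chosen only to mirror the behavior described above.

```python
from datetime import datetime, timezone
from typing import List, Optional


def binding_in_effect(self_sig_times: List[datetime],
                      reference: datetime) -> Optional[datetime]:
    """Strict model: newest self-signature created at or before `reference`."""
    candidates = [t for t in self_sig_times if t <= reference]
    return max(candidates) if candidates else None


def binding_with_fallback(self_sig_times: List[datetime],
                          reference: datetime) -> Optional[datetime]:
    """Lenient model (assumption): if no self-signature covers `reference`,
    fall back to the oldest self-signature that is still available."""
    strict = binding_in_effect(self_sig_times, reference)
    if strict is not None:
        return strict
    return min(self_sig_times) if self_sig_times else None


def utc(year: int) -> datetime:
    return datetime(year, 1, 1, tzinfo=timezone.utc)


full_history = [utc(2019), utc(2021), utc(2023)]
minimized = [utc(2023)]          # superseded self-signatures were stripped
signature_time = utc(2020)       # reference time of an old data signature

print(binding_in_effect(full_history, signature_time))   # the 2019 binding
print(binding_in_effect(minimized, signature_time))      # None: no evidence
print(binding_with_fallback(minimized, signature_time))  # the 2023 binding
```

With the full history, the strict model finds a binding that was in effect when the data signature was made. With the minimized certificate, only the lenient model can accept the signature.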
#### Autocrypt

The Autocrypt Level 1 specification defines a specific [minimal format for OpenPGP certificates](https://autocrypt.org/level1.html#openpgp-based-key-data) that are distributed by the Autocrypt mechanism.

One goal of the Autocrypt mechanism is to distribute certificates widely. To this end, Autocrypt sends certificates in mail headers, where smaller size is greatly preferable.

Basic encrypted email functionality requires only a small subset of the recipient's certificate, so small certificate size is feasible.

#### Minimization for email

Note that it's not generally clear if minimization brings more benefit than harm. For example, we might consider minimizing a certificate for distribution via WKD, with the use case of email in mind.

Many certificates can be significantly pruned if the only goal of distributing them is to enable encryption and signature verification. For such cases, many components can be dropped, including invalid subkeys and their binding signatures, authentication subkeys (which are irrelevant to email), shadowed self-signatures, and third-party certifications. With many real-world certificates, the space savings of such a minimization are significant[^space-example].

Such minimization might be appropriate and convenient to enable encrypted communication with a ProtonMail client, which automatically fetches OpenPGP certificates via WKD while composing a message. The ProtonMail use case requires only component keys, not third-party certifications, and it doesn't require historical component keys or self-signatures.

However, in a different context, the same certificate might be fetched to verify the authenticity of a signature. In that case, third-party certifications may be crucial for the client. Stripping them could prevent the client from performing Web of Trust calculations and verifying the authenticity of the certificate.

[^space-example]: The following fragment processes an example certificate.
    It drops any subkey that is not valid at the time of export (because of revocation or expiration), authentication subkeys, and any third-party certifications:

    ```sh
    gpg --export-options export-minimal,export-clean,no-export-attributes \
      --export-filter keep-uid=mbox=wiktor@metacode.biz \
      --export-filter 'drop-subkey=expired -t || revoked -t || usage =~ a' \
      --export wiktor@metacode.biz
    ```

    At the time of writing, the original certificate consists of 152322 bytes of data. The filtered variant consists of only 3771 bytes, which is about 40 times smaller.

In some contexts, there are hard constraints on size, and minimization is unavoidable, e.g., when embedding certificate data in email headers.

#### Pitfalls of minimization

Disadvantages/risks of minimizing certificates:

- A minimized certificate does not present a full view of how it (and the validity of its components) evolved over time.
- As an OpenPGP instance learns about more certificates, third-party certifications that were previously unusable may become usable. Dropping third-party certifications by unknown issuers as a part of minimization prevents this mechanism.
- Removing component keys that the minimizing implementation can't use means that the receiver does not receive a copy of those, even if *the receiver* supports them.
- Refreshing certificates from key servers may inflate the certificate again, since OpenPGP certificates tend to act as [append-only structures](append-only).
- Carelessly stripping all invalid components may make the certificate unusable. Some libraries, such as [anonaddy-sequoia](https://gitlab.com/willbrowning/anonaddy-sequoia/-/blob/master/src/sequoia.rs?ref_type=heads#L125), strip unusable encryption subkeys. However, at least one subkey is retained, even if all encryption subkeys are unusable. Even though this may leave only an expired encryption subkey in the certificate, this presents a better UX for the end user, who probably is still in possession of the private key for decryption.
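The last pitfall can be sketched in Python as follows. This is an illustration of the "never strip the last encryption subkey" idea only, not the anonaddy-sequoia code: the `Subkey` type and `prune_encryption_subkeys` function are hypothetical, and the choice to retain the most recently created subkey when none is usable is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Subkey:
    """Hypothetical, simplified view of an encryption subkey."""
    fingerprint: str
    created: datetime
    usable: bool  # False if the subkey is expired or revoked


def prune_encryption_subkeys(subkeys: List[Subkey]) -> List[Subkey]:
    """Drop unusable encryption subkeys, but never return an empty list
    when at least one subkey was present."""
    usable = [k for k in subkeys if k.usable]
    if usable or not subkeys:
        return usable
    # All subkeys are unusable: keep one anyway (assumption: the newest),
    # so that the certificate still names an encryption key.
    return [max(subkeys, key=lambda k: k.created)]


old = Subkey("AAAA", datetime(2018, 1, 1, tzinfo=timezone.utc), usable=False)
newer = Subkey("BBBB", datetime(2021, 1, 1, tzinfo=timezone.utc), usable=False)
print([k.fingerprint for k in prune_encryption_subkeys([old, newer])])
```

Here both subkeys are unusable, so the sketch keeps only the newer one instead of producing a certificate without any encryption subkey.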
#### Guidelines

1. Don't minimize certificates unless you have a good reason to.
2. When minimizing a certificate, minimize it in a way that suits your use case. E.g., when minimizing a certificate for distribution alongside a signed software package, make sure to include enough historical self-signatures so as not to break the verification of the signed package.
3. When presenting a minimized certificate view, consider when that view needs to be updated. Ideally, minimized certificates are freshly generated, on demand (e.g., an Autocrypt header is constructed while an email is sent or composed). The receiver is expected to typically merge all data it sees, locally.

### Fingerprints and beyond: "Naming" certificates in user-facing contexts

Certificates in OpenPGP have traditionally often been "named" using hexadecimal strings of varying length.

For example, a business card might have shown the hexadecimal fingerprint of a person's OpenPGP certificate to facilitate secure communication.

Over time, different formats and lengths for these identifiers have been used. This section outlines the various ways in which certificates can be named, and their properties.

#### Fingerprints and Key IDs in Version 4

With OpenPGP version 4 certificates, user-facing software customarily used 20-byte (160-bit) *fingerprints* as an identifier for the certificate, or alternatively the 8-byte (64-bit) *Key ID* variant of the fingerprint. Both were represented in hexadecimal format, sometimes with whitespace to group the identifier into blocks for easier readability.

For example, in workflows to accept a certificate for a communication partner, or during third-party certification of an identity, users were shown hexadecimal representations of a fingerprint. Users were asked to manually verify that the fingerprint corresponds to the expected certificate.
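As a small illustration of these identifiers, the following Python sketch derives the 8-byte Key ID from a version 4 fingerprint (for version 4 keys, the Key ID is the low-order 64 bits of the fingerprint) and groups a fingerprint into blocks for display. The function names are hypothetical, the example fingerprint is made up, and grouping into blocks of four hexadecimal digits is just one common display convention.

```python
def key_id_from_v4_fingerprint(fingerprint_hex: str) -> str:
    """Return the 8-byte Key ID (as hex) of a 20-byte version 4 fingerprint."""
    raw = bytes.fromhex("".join(fingerprint_hex.split()))
    if len(raw) != 20:
        raise ValueError("a version 4 fingerprint is exactly 20 bytes long")
    # The Key ID is the last 8 bytes (low-order 64 bits) of the fingerprint.
    return raw[-8:].hex().upper()


def format_fingerprint(fingerprint_hex: str, group: int = 4) -> str:
    """Group a hexadecimal fingerprint into whitespace-separated blocks."""
    compact = "".join(fingerprint_hex.split()).upper()
    return " ".join(compact[i:i + group] for i in range(0, len(compact), group))


# Made-up fingerprint, for illustration only.
fp = "0123456789abcdef0123456789abcdef01234567"
print(format_fingerprint(fp))
print(key_id_from_v4_fingerprint(fp))
```

For the made-up fingerprint above, the Key ID is `89ABCDEF01234567`.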
#### Fingerprints in Version 6

The OpenPGP version 6 standard uses 32 byte (256 bit) fingerprints, but explicitly defines no format for displaying those fingerprints in a human-readable form.

The standard [recommends strongly against](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-fingerprint-usability) using version 6 fingerprints as identifiers in user-facing workflows.

Instead, "mechanical fingerprint transfer and comparison" should be preferred, wherever possible. The reasoning is that humans tend to be bad at comparing high-entropy data[^schuermann] (in addition, many users are probably put off by being asked to compare long hexadecimal strings).

[^schuermann]: See "An Empirical Study of Textual Key-Fingerprint Representations"

#### Use of Fingerprints and Key IDs in APIs

However, both Fingerprints and Key IDs may (and usually *must*) be used, programmatically, by software that handles OpenPGP data, to address specific certificates. This is equally true for OpenPGP version 6.

Note that regardless of the OpenPGP version, software that relies on 8-byte Key IDs should not assume that Key IDs are unique. It is trivial to generate collisions for 8-byte Key IDs, so applications must be able to handle Key ID collisions gracefully.

The historical 4-byte "short Key IDs" format should not be used anywhere, anymore (finding collisions in a 32-bit keyspace has been [trivial for a long time](https://evil32.com/)).

(email-lookup)=
#### Looking up certificates by email

Searching OpenPGP certificates by email is a use case that often arises.

For example, when composing an email to a new contact, the sender may want to find the OpenPGP certificate for that contact.
Different mechanisms allow certificate lookup by email, for example:

- [Web Key Directory](https://datatracker.ietf.org/doc/draft-koch-openpgp-webkey-service/) (WKD)
- The [keys.openpgp.org](https://keys.openpgp.org/) "verifying keyserver" (also known as ["hagrid"](https://gitlab.com/keys.openpgp.org/hagrid), the name of the server software it runs)
- SKS-style OpenPGP keyservers (today, most of these run the [Hockeypuck](https://github.com/hockeypuck/hockeypuck) software)

Their properties differ; see also {ref}`distribution`.

[^hip1]:

(cert-freshness)=
### Certificate freshness: Triggering updates with an expiration time

For a certificate holder, one problem is that their communication partners may not regularly poll for updates of their certificate.

A certificate holder usually prefers that everyone else regularly obtains updates for their certificate. This way, a third party will, for example, not mistakenly keep using the certificate indefinitely, after it gets revoked. Setting an expiration time on the certificate, ahead of time, limits the worst case scenario: communication partners will at most use a revoked certificate until its expiration time, even if they never learn of the revocation.

Once the expiration time is reached, third parties, or ideally their OpenPGP software, will have to stop using the certificate, and may attempt to obtain an update for it, for example from a keyserver or via WKD.

Ideally, certificate updates are obtained automatically, by the user's OpenPGP software, without any need for human intervention.

After the update, the updated copy of the certificate will usually have a fresh expiration time. The same procedure will repeat once that new expiration time has been reached.

### Metadata leak of Social Graph

Third-party certifications are signatures over identity components made by other certificates.

These certifications form the backbone of the OpenPGP trust model called the Web of Trust.
The name stems from the fact that the collection of certifications forms a directed graph resembling a web. Each edge of the graph connects the signing certificate to the identity component associated with another certificate.

OpenPGP software can inspect that graph. Based on the certification data in the graph and a set of trust anchors, it can infer whether a target certificate is legitimate.

The trust anchor is usually the certificate holder's own key, but a user may designate additional certificates of organizations they are connected to as trust anchors.

Third-party certifications can be published as part of the target certificate to facilitate the process of certificate authentication. Unfortunately, a side effect of this approach is that it's feasible to reconstruct the entire social graph of all people issuing certifications. In addition, the signature creation time of certifications can be used to deduce whether the certificate owner attended a Key Signing Party (and if it was public, where it was held) and whom they interacted with.

So, there is some tension between the goals of

- a decentralized system where every participant can access certification information and perform analysis on it locally,
- privacy-related goals (also see {ref}`email-lookup`, for a comparison of certificate distribution mechanisms, which also touches on this theme).

(unbound_user_ids)=
### Adding unbound, local, User IDs to a certificate

Some OpenPGP subsystems may add User IDs to a certificate, which are not bound to the primary key by the certificate's owner. This can be useful to store local identity information (e.g., Sequoia's public store attaches ["pet-names"][PET] to certificates, in this way).

[PET]: https://sequoia-pgp.org/blog/2023/04/08/sequoia-sq/#an-address-book-style-trust-model

Sequoia additionally certifies these "local, third party, User IDs" with a local trust root to facilitate local authentication decisions.
To prevent accidental publication of these local User IDs (e.g. to public keyservers), Sequoia marks these binding signatures as non-exportable, "local" artifacts using [Exportable Certification](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-exportable-certification) subpackets.

(distribution)=
### Certificate distribution mechanisms

Different mechanisms for discovering certificates, and updating certificate data exist in the OpenPGP space:

- A *Web Key Directory* service is operated by the entity that controls the domain name of the email in question. This means that WKD is decentralized, and the reliability of OpenPGP certificates may vary depending on the organization that operates a particular WKD instance.
- The *keys.openpgp.org* service is a "verifying" keyserver: the keyserver software only publishes identity components (which include email addresses) after sending a verification email to that address, and receiving opt-in consent from the user of the email address. This service makes a different tradeoff: it is centralized, and relying on it to correctly perform the verification step requires trust in the operator. The tradeoff allows the service to only list identity information with the consent of the owner of that identity, and to prevent "enumeration" of the certificates and identities it stores (that is: third parties cannot obtain a list of email addresses in the service's database). By design, this service allows easy publication of revocations without requiring publication of any identity components.
- *SKS-style keyservers* act as a distributed synchronizing database, which accepts certificate information without verification (TODO: does the network handle third party signatures? If so, how?[^hip1]).
One central difference between Hockeypuck and hagrid (the software that runs the *keys.openpgp.org* service) is that Hockeypuck distributes identity packets and third-party certifications that have indeterminate validity, while hagrid does not.

(cert-flooding)=
### Third-party certification flooding

Traditional OpenPGP keyservers are one mechanism for [collection and distribution](distribution) of certificate information. Their model revolves around receiving certificate information from sources that don't identify themselves to the keyserver network. Traditionally, these keyservers have accepted both components bound to certificates by self-signatures, and third-party identity certifications.

While a convenience for consumers, indiscriminately accepting and integrating third-party identity certifications comes with significant risks. Without any restrictions in place, malicious entities can flood a certificate with excessive certifications. Called "certificate flooding," this form of digital vandalism grossly expands the certificate size, making the certificate cumbersome and impractical for users.

It also opens the door to potential denial-of-service attacks, rendering the certificate non-functional or significantly impeding its operation. The popular [SKS keyserver network experienced certificate flooding firsthand](https://dkg.fifthhorseman.net/blog/openpgp-certificate-flooding.html) in 2019, causing significant changes to its operation.
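One mitigation mentioned earlier in this chapter is to filter out third-party certifications before storing or serving a certificate. The following Python sketch shows the idea on a deliberately simplified model. The `Certification` type and the function name are hypothetical, signatures are assumed to be already cryptographically verified, and a certification is treated as a self-signature exactly when its issuer is the certificate's own primary key.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Certification:
    """Hypothetical, simplified view of a certification over a User ID."""
    issuer_fingerprint: str
    user_id: str


def drop_third_party_certifications(
    primary_fingerprint: str, certifications: List[Certification]
) -> List[Certification]:
    """Keep only certifications issued by the certificate's own primary key."""
    return [
        c for c in certifications
        if c.issuer_fingerprint == primary_fingerprint
    ]


alice = "ALICE-PRIMARY-FPR"  # placeholder, not a real fingerprint
certs = [Certification(alice, "Alice <alice@example.org>")]
# A flooding attack appends a large number of third-party certifications.
certs += [
    Certification(f"FLOOD-{i}", "Alice <alice@example.org>")
    for i in range(10_000)
]

filtered = drop_third_party_certifications(alice, certs)
print(len(certs), "->", len(filtered))
```

The tradeoff is the one described for minimization in general: the filtered certificate stays small, but recipients lose any legitimate third-party certifications along with the spam.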