9.9 KiB
Pushdown Automaton for the OpenPGP Message Format
See RFC4880 §11.3. OpenPGP Messages for the formal definition.
A simulation of the automaton can be found here.
RFC4880 defines the grammar of OpenPGP messages as follows:
OpenPGP Message :- Encrypted Message | Signed Message |
Compressed Message | Literal Message.
Compressed Message :- Compressed Data Packet.
Literal Message :- Literal Data Packet.
ESK :- Public-Key Encrypted Session Key Packet |
Symmetric-Key Encrypted Session Key Packet.
ESK Sequence :- ESK | ESK Sequence, ESK.
Encrypted Data :- Symmetrically Encrypted Data Packet |
Symmetrically Encrypted Integrity Protected Data Packet
Encrypted Message :- Encrypted Data | ESK Sequence, Encrypted Data.
One-Pass Signed Message :- One-Pass Signature Packet,
OpenPGP Message, Corresponding Signature Packet.
Signed Message :- Signature Packet, OpenPGP Message |
One-Pass Signed Message.
In addition, decrypting a Symmetrically Encrypted Data packet or a
Symmetrically Encrypted Integrity Protected Data packet as well as
decompressing a Compressed Data packet must yield a valid OpenPGP
Message.
This grammar can be translated into a pushdown automaton with the following graphical representation:
graph LR
start((start)) -- "ε,ε/m#" --> pgpmsg((OpenPGP Message))
pgpmsg -- "Literal Data,m/ε" --> literal((Literal Message))
literal -- "ε,#/ε" --> accept((Valid))
literal -- "Signature,o/ε" --> sig4ops((Corresponding Signature))
sig4ops -- "Signature,o/ε" --> sig4ops
sig4ops -- "ε,#/ε" --> accept
pgpmsg -- "OnePassSignature,m/o" --> ops((One-Pass-Signed Message))
ops -- "ε,ε/m" --> pgpmsg
pgpmsg -- "Signature,m/ε" --> signed((Signed Message))
signed -- "ε,ε/m" --> pgpmsg
pgpmsg -- "Compressed Data,m/ε" --> comp((Compressed Message))
comp -. "ε,ε/m" .-> pgpmsg
comp -- "ε,#/ε" --> accept
comp -- "Signature,o/ε" --> sig4ops
pgpmsg -- "SKESK|PKESK,m/k" --> esks((ESKs))
pgpmsg -- "Sym. Enc. (Int. Prot.) Data,m/ε" --> enc
esks -- "SKESK|PKESK,k/k" --> esks
esks -- "Sym. Enc. (Int. Prot.) Data,k/ε" --> enc((Encrypted Message))
enc -. "ε,ε/m" .-> pgpmsg
enc -- "ε,#/ε" --> accept
enc -- "Signature,o/ε" --> sig4ops
accept -- "ε,ε/ε" --> accept
Formally, the PDA is defined as M = (\mathcal{Q}, \Sigma, \Upgamma, \delta, q_0, Z, F), where
\mathcal{Q}is a finite set of states\Sigmais a finite set which is called the input alphabet\Upgammais a finite set which is called the stack alphabet\deltais a finite set of\mathcal{Q}\times(\Sigma\cup\textbraceleft\epsilon\textbraceright)\times\Upgamma\times\mathcal{Q}\times\Upgamma^*, the transition relationq_0\in\mathcal{Q}is the start stateZ\in\Upgammais the initial stack symbolF\subseteq\mathcal{Q}is the set of accepting states
In our diagram, the initial state is q_0 = \text{start}.
The initial stack symbol is Z=\epsilon (TODO: Make it #?).
The set of accepting states is F=\textbraceleft\text{valid}\textbraceright.
\delta is defined by the transitions shown in the graph diagram.
The input alphabet \Sigma consists of the following OpenPGP packets:
\text{Literal Data}: Literal Data Packet\text{Signature}: Signature Packet\text{OnePassSignature}: One-Pass-Signature Packet\text{Compressed Data}: Compressed Data Packet\text{SKESK}: Symmetric-Key Encrypted Session Key Packet\text{PKESK}: Public-Key Encrypted Session Key Packet\text{Sym. Enc. Data}: Symmetrically Encrypted Data Packet\text{Sym. Enc. Int. Prot. Data}: Symmetrically Encrypted Integrity Protected Data Packet
Additionally, \epsilon is used to transition without reading OpenPGP packets or popping the stack.
The following stack alphabet \Upgamma is used:
m: OpenPGP Messageo: One-Pass-Signature packet.k: Encrypted Session Key#: Terminal for valid OpenPGP messages
Note: The standards document states, that Marker Packets shall be ignored as well. For the sake of readability, those transitions are omitted here.
The dotted line indicates a nested transition.
For example, the transition (\text{Compressed Message}, \epsilon, \epsilon, \text{OpenPGP Message}, m) indicates, that the content of the
Compressed Data packet itself is an OpenPGP Message.