Merkle Proof Encoding Formats#
A Merkle proof is a datum which demonstrates that a Merkle tree has a given value. Typically a Merkle root and a subtree of a Merkle tree are used as a Merkle proof. Verification is done by computing the Merkle root and checking it is the same as the given hash. In Octez, Merkle proofs are used for Optimistic Rollups (see Smart Optimistic Rollups) in the event an invalid hash is submitted from a layer 2 node to layer 1. An honest layer 2 node can then present a Merkle proof to demonstrate that the previously submitted hash is in fact fraudulent.
This document shows the encoding format of the Merkle proof implemented in src/lib_context/merkle_proof_encoding/merkle_proof_encoding.ml.
There are 2 versions of encodings (defined as V1 and V2), each generating 2 types of
Merkle proofs (named tree_proof and stream_proof), for 2 types of Irmin
Trees (32-tree and binary tree). The data structure is defined in
src/lib_context/sigs/context.ml (API ) as below.
The internal structure of Irmin, which is used to manage contexts in Octez, appears in it.
Encoding formats give the conversion between a tree_proof
and a byte sequence,
and between a stream_proof
and a byte sequence.
type index = int
type hash = Context_hash.t (* 32-byte Blake2b hash *)
type kinded_hash =[
| `Value of hash
| `Node of hash
]
type 'a inode = {length : int; proofs : (index * 'a) list}
type 'a inode_extender = {length : int; segment : index list; proof : 'a}
type tree
| Value of value
| Blinded_value of hash
| Node of (step * tree) list
| Blinded_node of hash
| Inode of inode_tree inode
| Extender of inode_tree inode_extender
type inode_tree =
| Blinded_inode of hash
| Inode_values of (step * tree) list
| Inode_tree of inode_tree inode
| Inode_extender of inode_tree inode_extender
type elt =
| Value of value
| Node of (step * kinded_hash) list
| Inode of hash inode
| Inode_extender of hash inode_extender
type stream = elt Seq.t
type tree_proof = { version : int; before : kinded_hash; after : kinded_hash; state : tree }
type stream_proof = { version : int; before : kinded_hash; after : kinded_hash; state : stream }
before
and after
in both proofs are Merkle roots and denote that
the Merkle root of a tree changed from before
to after
by applying the get/set operations for the tree.
tree_proof
is a partial tree containing the hashes of nodes needed for verification and the values the operations refer to.
It omits extra nodes and hashes that are recoverable by re-computing.
stream proof
can produce on demand a partial Merkle tree corresponding to a tree_proof
.
Note: all encoding versions use big endian.
Notation Example#
In order to explain the notation used for describing the encodings,
let us consider the Bytes
format as an example (this format is not actually used).
In our notation, the format is defined as in the table below.
Name |
Size |
Contents |
---|---|---|
tag |
1 byte |
0b000000yy where yy is tag for length (0b00 for p = 1, 0b01 for p = 2, 0b10 for p = 4, 0b11 for p = 8) |
length |
p bytes |
int |
content |
(length) bytes |
bytes |
This format means that given a byte sequence b
, the first 1
byte of b
represents the tag which determines the number of bytes p
of the next record length
; the next p
bytes of b
represent the length of
the content as an integer, and the following (length) bytes represent the
content.
0b000000yy
in the description of tag
means that the first 6 bits of
the tag must be 0
, and we represent “yy” as the last 2 bits of the
tag.
For example, when b
is 0x000c48656c6c6f20576f726c6421
, tag
is 0x00
(so the byte length of length
is 1), length
is 0x0c
(so the
byte length of content
is 12) and content
is
0x48656c6c6f20576f726c6421
See also#
Another implementation of the encoder for the Merkle proofs, written in the Rust language, is available here.