Merkle Proof Encoding Formats
=============================
A Merkle proof is a datum which demonstrates that a `Merkle tree `_ has a given value.
Typically a Merkle root and a subtree of a Merkle tree are used as a Merkle proof.
Verification is done by computing the Merkle root and checking it is the same as the given hash.
In Octez, Merkle proofs are used for Optimistic Rollups (see :doc:`../active/transaction_rollups`) in the event
an invalid hash is submitted from a layer 2 node to layer 1.
An honest layer 2 node can then present a Merkle proof to demonstrate that the previously submitted hash is in fact fraudulent.
This document shows the encoding format of the Merkle proof implemented in :src:`src/lib_context/helpers/merkle_proof_encoding.ml`.
There are 2 versions of encodings (defined as V1 and V2), each generating 2 types of
Merkle proofs (named tree_proof and stream_proof), for 2 types of Irmin
Trees (32-tree and binary tree). The data structure is defined in
:src:`src/lib_context/sigs/context.ml` (`API `_) as below.
The internal structure of Irmin, which is used to manage contexts in Octez, appears in it.
Encoding formats give the conversion between a ``tree_proof`` and a byte sequence,
and between a ``stream_proof`` and a byte sequence.
.. code:: ocaml
type index = int
type hash = Context_hash.t (* 32-byte Blake2b hash *)
type kinded_hash =[
| `Value of hash
| `Node of hash
]
type 'a inode = {length : int; proofs : (index * 'a) list}
type 'a inode_extender = {length : int; segment : index list; proof : 'a}
type tree
| Value of value
| Blinded_value of hash
| Node of (step * tree) list
| Blinded_node of hash
| Inode of inode_tree inode
| Extender of inode_tree inode_extender
type inode_tree =
| Blinded_inode of hash
| Inode_values of (step * tree) list
| Inode_tree of inode_tree inode
| Inode_extender of inode_tree inode_extender
type elt =
| Value of value
| Node of (step * kinded_hash) list
| Inode of hash inode
| Inode_extender of hash inode_extender
type stream = elt Seq.t
type tree_proof = { version : int; before : kinded_hash; after : kinded_hash; state : tree }
type stream_proof = { version : int; before : kinded_hash; after : kinded_hash; state : stream }
``before`` and ``after`` in both proofs are Merkle roots and denote that
the Merkle root of a tree changed from ``before`` to ``after`` by applying the get/set operations for the tree.
``tree_proof`` is a partial tree containing the hashes of nodes needed for verification and the values the operations refer to.
It omits extra nodes and hashes that are recoverable by re-computing.
``stream proof`` can produce on demand a partial Merkle tree corresponding to a ``tree_proof``.
.. toctree::
:maxdepth: 1
:caption: Merkle proof formats
./merkle-proof-formats/v1-tree32
./merkle-proof-formats/v1-tree2
./merkle-proof-formats/v2-tree32
./merkle-proof-formats/v2-tree2
Note: all encoding versions use big endian.
Notation Example
----------------
In order to explain the notation used for describing the encodings,
let us consider the ``Bytes`` format as an example (this format is not actually used).
In our notation, the format is defined as in the table below.
+-----------------------+-----------------------+-----------------------+
| Name | Size | Contents |
+=======================+=======================+=======================+
| tag | 1 byte | 0b000000yy where yy |
| | | is tag for length |
| | | (0b00 for p = 1, 0b01 |
| | | for p = 2, 0b10 for p |
| | | = 4, 0b11 for p = 8) |
+-----------------------+-----------------------+-----------------------+
| length | p bytes | int |
+-----------------------+-----------------------+-----------------------+
| content | (length) bytes | bytes |
+-----------------------+-----------------------+-----------------------+
This format means that given a byte sequence ``b``, the first 1
byte of ``b`` represents the tag which determines the number of bytes ``p``
of the next record ``length``; the next ``p`` bytes of ``b`` represent the length of
the content as an integer, and the following (length) bytes represent the
content.
``0b000000yy`` in the description of ``tag`` means that the first 6 bits of
the tag must be ``0``, and we represent “yy” as the last 2 bits of the
tag.
For example, when ``b`` is ``0x000c48656c6c6f20576f726c6421``, ``tag`` is ``0x00``
(so the byte length of ``length`` is 1), ``length`` is ``0x0c`` (so the
byte length of ``content`` is 12) and ``content`` is
``0x48656c6c6f20576f726c6421``
See also
--------
Another implementation of the encoder for the Merkle proofs, written in the Rust language,
is available `here `__.