The Error Monad

There are multiple ways to deal with errors in OCaml: exceptions, result, option, etc., each approach has strengths and weaknesses. However, in Octez we settle on one, single, uniform way of dealing with errors: the error monad.

This is a tutorial for the error monad. It makes the following assumptions

  • You can read English.

  • You have a basic familiarity with OCaml.

  • You have a basic familiarity with Lwt (if not, this tutorial links to Lwt resources when it becomes necessary).

Other than that, each section of this tutorial can be understood using the information of previous sections, no additional information is necessary. Note, however, that most sections also make forward references along the lines of “more details on this later” or “alternatives are presented later”. During a first read, this foreshadowing is only useful in that it lets you know that the information is coming and that you can move on with a partial understanding. On subsequent reads, it helps putting things in context.

Note that the core of the tutorial focuses on error management in Octez. A short section gives additional information about error management in the protocol.

Part 1: result, Lwt, and Lwt-result

The result type

The type result is part of the standard library of OCaml. It is defined as

type ('a, 'b) result =
| Ok of 'a
| Error of 'b

(See reference manual.)

The constructors of the result type have meaning: Ok is for normal, successful cases that carry a value that is somewhat expected. Error is for abnormal, failure cases that carry information about what went wrong. E.g.,

let get a index =
  if index < 0 then
    Error "negative index in array access"
  else if index >= Array.length a then
    Error "index beyond length in array access"
  else
    Ok a.(index)

You can see the result type as an opinionated either: a left-or-right sum type where the left and right side have distinct roles. Or you can see the result type as a buffed-up option type: a type that either carries a value or doesn’t (but in this case it carries an error instead).

Exercises

  • Write the implementation for

    (** [to_option r] is [Some x] if [r] is [Ok x]. Otherwise it is [None]. *)
    val to_option : ('a, unit) result -> 'a option
    
  • Write the implementation for

    (** [to_option ?log r] is [Some x] if [r] is [Ok x]. Otherwise it calls
        [log e] and returns [None] if [r] is [Error e]. By default [log] is
        [ignore]. *)
    val to_option : ?log:('e -> unit) -> ('a, 'e) result -> 'a option
    
  • Write the implementation for

    (** [catch f] is [Ok (f ())] except if [f] raises an exception [exc] in
        which case it is [Error exc]. *)
    val catch : (unit -> 'a) -> ('a, exn) result
    

The binding operator

Working directly with the result type can quickly become cumbersome. Consider, for example the following code.

(** [compose3 f g h x] is [f (g (h x))] except that it handles errors. *)
let compose3 f g h x
  match h x with
  | Error e -> Error e
  | Ok y ->
    match g y with
    | Error e -> Error e
    | Ok z ->
      f z

The nested match-with constructs lead to further and further indentation. The Error cases are all identical and simply add noise.

To circumvent this, in Octez we use a binding operator: a user-defined let-binding. Specifically, you can open the Result_syntax module which includes the binding operator let* dedicated to result.

(** [compose3 f g h x] is [f (g (h x))] except that it handles errors. *)
let compose3 f g h x
  let open Result_syntax in (* adds [let*] in scope *)
  let* y = h x in
  let* z = g y in
  f z

An expression let* x = e in e' is equivalent to Result.bind e (fun x -> e') and also to match e with | Error err -> Error err | Ok x -> e'. In all of these forms, the expression e' is evaluated if and only if the expression e is successful (i.e., evaluates to Ok).

Exercises

  • Rewrite the following code without match-with

    let apply2 (f, g) (x, y) =
      match f x with
      | Error e -> Error e
      | Ok u ->
        match g y with
        | Error e -> Error e
        | Ok v -> Ok (u, v)
    

    Did you remember to open the syntax module?

  • Write the implementation for

    (** [map f [x1; x2; x3; ..]] is a list [[y1; y2; y3; ..]] where [y1 = f x1],
        [y2 = f x2], etc. except if [f] is fails on one of the inputs, in which
        case it is an [Error] carrying the same error as [f]'s. *)
    val map : ('a -> ('b, 'err) result) -> 'a list -> ('a list, 'err) result
    

    Note that this exercise is for learning only. You won’t need to write this function in Octez. Indeed, a helper function which does exactly that is provided in the extended standard library of Octez.

Aside: the Error_monad module is opened everywhere

In Octez, the Error_monad module provides types and values for error management. It is part of the tezos-error-monad package. And it is opened in most of the source code. Apart from some specific libraries (discussed later), the content of this module is already in scope.

The Error_monad module contains the Result_syntax module. This is why you can use let open Result_syntax in directly in your own functions.

The Error_monad module contains other modules and functions and types which you will learn about later.

Recovering from errors

When given a value of type result, you can inspect its content to determine if it is an Ok or an Error. You can use this feature to recover from the failures.

match e with
| Ok x -> do_something x
| Error e -> do_something_else e

Recovering can mean different things depending on the task that failed and the error with which it failed. Sometimes you just want to retry, sometimes you want to retry with a different input, sometimes you want to propagate the error, sometimes you want to log the error and continue as if it hadn’t happened, etc.

let rec write data =
  match write_to_disk data with
  | Ok () -> ()
  | Error EAGAIN -> write data (* again: try again *)
  | Error ENOSPC -> Stdlib.exit 1 (* no space left on device: escalate to exit *)

There is a correspondence between a match-with around a result and a try-with. Both are for recovering from failures. However, in Octez, you will mostly use a match-with around a result, because we favour result over exceptions. You may use the try-with construct when interfacing with an external library which uses exceptions.

There are several ways to use the match-with recovery with the binding operator from Result_syntax. Depending on the size of the expression you are recovering from, one may be more readable than the other. Choose accordingly.

You can simply place the expression directly inside the match-with.

match
  let* x = get_horizontal point in
  let* y = get_vertical point in
  Ok (x, y)
with
| Ok (x, y) -> ..
| Error e -> ..

Alternatively, if the expression grows too much in size or in complexity, you can move the expression inside a vanilla let-binding: let r = .. in match r with ...

Alternatively, if the expression grows even more, or if the expression may be re-used in other parts of the code, you may move the expression inside a vanilla function which you can call inside the match-with.

You can also use the functions from the standard library’s Result module. Note however, that some of these functions are shadowed in the extended library of Octez, which you will learn more about later.

Mixing different kinds of errors

Occasionally, you may have to call a function which returns a value of type, say, (_, unit) result and one, say, (_, string) result. In this case, you cannot simply bind the two expressions as is. Specifically, the type checker will complain. You can see the constraint you would be breaking in the type of let* where the two error types are the same ('e):

val ( let* ) : ('a, 'e) result -> ('a -> ('b, 'e) result) -> ('b, 'e) result

When you need to mix those function, you have to either handle the errors of each idependently (see the section above about recovering from errors) or you need to convert the errors so they have the same type. You should use Result.map_error to do that.

let* cfg =
  Result.map_error (fun () -> "Error whilst reading configuration")
  @@ read_configuration ()
in
..

The Result_syntax module

You have already learned about the let* binding operator from the Result_syntax. But there are other values you can use from this module.

The following functions form the core of the Result_syntax module.

  • let*: a binding operator to continue with the value in the Ok constructor or interrupt with the error in the Error constructor.

    let* x = e in e' is equivalent to match e with Error err -> Error err | Ok x -> e'.

    (See above for examples and more details.)

  • return : 'a -> ('a, 'e) result: the expression return x is equivalent to Ok x. The function is included for consistency with other syntax modules presented later. You can use either form.

  • fail : 'e -> ('a, 'e) result: the expression fail e is equivalent to Error e. The function is included for consistency with other syntax modules presented later. You can use either form.

The following functions offer additional, less often used functionalities.

  • both : ('a, 'e) result -> ('b, 'e) result -> ('a * 'b, 'e list) result: the expression both e1 e2 is Ok if both expressions evaluate to Ok and Error otherwise.

    Note that the expression both e1 e2 is different from the expression let* x = e1 in let* y = e2 in return (x, y). In the former (both) version, both e1 and e2 are evaluated, regardless of success/failure of e1 and e2. In the latter (let*-let*) version, e2 is evaluated if and only if e1 is successful.

    This distinction is important when the expressions e1 and e2 have side effects: both (f ()) (g ()).

  • all : ('a, 'e) result list -> ('a list, 'e list) result: the function all is a generalisation of both from tuples to lists.

    Note that, as a generalisation of both, in a call to the function all, all the elements of the list are evaluated, regardless of success/failure of the elements: all (List.map f xs).

  • join : (unit, 'e) result list -> (unit, 'e list) result: the function join is a specialisation of all for lists of unit-typed expressions (typically, for side-effects).

    Note that, as a specialisation of all, in a call to the function join, all the elements of the list are evaluated, regardless of success/failure of the elements: join (List.map f xs).

The following functions do not provide new functionalities but they are useful for small performance gains or for shorter syntax.

  • return_unit is equivalent to return () but it avoids one allocation. This is important in parts of the code that are performance critical, and it is a good habit to take otherwise.

    return_nil is equivalent to return [] but it avoids one allocation.
    return_true is equivalent to return true but it avoids one allocation.
    return_false is equivalent to return false but it avoids one allocation.
    return_none is equivalent to return None but it avoids one allocation.
    return_some x is equivalent to return (Some x) and it is provided for completeness with return_none.
  • let+ is a binding operator similar to let* but when the expression which follows the in returns a non-result value. In other words, let+ x = e in e' is equivalent to let* x = e in return (e').

    The let+ is purely for syntactic conciseness (compared to the * variant), use it if it makes your code more readable.

Lwt

In Octez, I/O and concurrency are handled using the Lwt library. With Lwt you use promises to handle I/O and concurrency. You can think of promises as data structures that are empty until they resolve, at which point they hold a value.

A promise for a value of type 'a has type 'a Lwt.t. The function Lwt.bind : 'a t -> ('a -> 'b t) -> 'b t waits for the promise to resolve (i.e., to carry a value of type 'a) before applying the provided function. The expression bind p f is a promise which resolves only once the promise p has resolved and then the promise returned by f has resolved too.

If you are not familiar with Lwt, you should check out the official manual and this introduction before continuing. This is important. Do it.

Unlike is mentioned in those separate resources on Lwt, in Octez, we do not in general use the success/failure mechanism of Lwt. Instead, we mostly rely on result (as mentioned above).

Thus, in the rest of this tutorial we only consider the subset of Lwt without failures. In practice, you might need to take care of exceptions in some cases, but this is discussed in the later, more advanced parts of the tutorial.

The Lwt_syntax module

In Octez, because Lwt is pervasive, you need to bind promises often. To make it easier, you can use the Lwt_syntax module. The Lwt_syntax module is made available everywhere by the Error_monad module. The Lwt_syntax module is similar to the Result_syntax module but for the Lwt monad (more about monads later).

  • let*: a binding operator to wait for the promise to resolve before continuing.

    let* x = e in e' is a promise that resolves after e has resolved to a given value and then e' has resolved with that value carried by x.

    Note that Lwt_syntax and Result_syntax (see above) both use let* for their main binding operator. Consequently, the specific meaning of let* depends on which module is open. This extends to other syntax modules introduced later in this tutorial.

    (What if you need to use both Lwt and result? Which syntax module should you use? You will learn about that in the next section!)

  • return : 'a -> 'a Lwt.t: the expression return x is equivalent to Lwt.return x. It is a promise that is already resolved with the value of x.

  • and*: a binding operator alias for both (see below). You can use it with let* the same way you use and with let.

    let apply_triple f (x, y, z) =
      let open Lwt_syntax in
      let* u = f x
      and* v = f y
      and* w = f z
      in
      return (u, v, w)
    

    When you use and*, the bound promises (f x, f y, and f z) are evaluated concurrently, and the expression which follows the in (return ..) is evaluated once all the bound promises have all resolved.

The following functions offer additional, less often used functionalities.

  • both: 'a Lwt.t -> 'b Lwt.t -> ('a * 'b) Lwt.t: the expression both p q is a promise that resolves only once both promises p and q (which make progress concurrently) have resolved.

    In practice, you will most likely use and* instead of both.

  • all: 'a Lwt.t list -> 'a list Lwt.t: the function all is a generalisation of both from tuples to lists.

    Note that, as a generalisation of both, in a call to the function all, all the promises in the provided list make progress towards resolution concurrently.

  • join : unit Lwt.t list -> unit Lwt.t: the function join is a specialisation of all to lists of units (i.e., side-effects).

The following functions do not provide new functionalities but they are useful for small performance gains or for shorter syntax.

  • return_unit is equivalent to return () but it avoids one allocation. This is important in parts of the code that are performance critical, and it is a good habit to take otherwise.

    return_nil is equivalent to return [] but it avoids one allocation.
    return_true is equivalent to return true but it avoids one allocation.
    return_false is equivalent to return false but it avoids one allocation.
    return_none is equivalent to return None but it avoids one allocation.
    return_some x is equivalent to return (Some x) and it is provided for completeness.
    return_ok x is equivalent to return (Ok x) and it is provided for completeness.
    return_error x is equivalent to return (Error x) and it is provided for completeness.
  • let+ and and+ are binding operators similar to let* and and* but when the expression which follows the in returns a non-promise value. In other words, let+ x = e1 and+ y = e2 in e is equivalent to let* x = e1 and* y = e2 in return e.

    The let+ and and+ are purely for syntactic conciseness (compared to the * variants), use them if it makes your code more readable.

Promises of results: Lwt and result together

In Octez, we have functions that perform I/O and also may fail. In this case, the function returns a promise of a result. This is the topic of this section.

Note that Lwt and result are orthogonal concerns. On the one hand, Lwt is for concurrency, for automatically scheduling code around I/O, for making progress on different parts of the program side-by-side. On the other hand, result is for aborting computations, for handling success/failures. It is because Lwt and result are orthogonal that we can use them together.

'a  --------------> ('a, 'e) result
 |                           |
 |                           |
 V                           V
'a Lwt.t ---------> ('a, 'e) result Lwt.t

When we combine Lwt and result for control-flow purpose we combine both of the orthogonal behaviours. We can achieve this combined behaviour “by hand”. However, doing so requires mixing Lwt_syntax.( let* ) and regular match-with:

let apply2 (f, g) (x, y) =
  let open Lwt_syntax in
  let* r = f x in
  match r with
  | Error e -> return (Error e)
  | Ok u ->
    let* r = g y in
    match r with
    | Error e -> return (Error e)
    | Ok v -> return (Ok (u, v))

This is interesting to consider because it shows the two orthogonal features of control-flow separately: wait for the promise to resolve, and check for errors. However, in practice, this becomes cumbersome even faster than when working with plain result values.

To make this easier, in Octez we use a binding operator. Specifically, you can open the Lwt_result_syntax (instead of the other syntax modules) which includes a binding operator dedicated to promises of result.

let apply2 (f, g) (x, y) =
  let open Lwt_result_syntax in
  let* u = f x in
  let* v = g y in
  return (u, v)

When a promise resolves to Ok we say that it resolves successfully. When it resolves to Error we say that it resolves unsuccessfully or that it fails.

Exercises

  • Rewrite the following code without match-with

    let compose3 f g h x
      let open Lwt_syntax in
      let* r = h x in
      match r with
      | Error e -> return (Error e)
      | Ok y ->
        let* s = g y in
        match s with
        | Error e -> return (Error e)
        | Ok z ->
          f z
    

    Did you remember to change the opened syntax module?

The Lwt_result_syntax module

Octez provides the Lwt_result_syntax module to help handle promises of results.

  • let*: a binding operator to wait for the promise to resolve before continuing with the value in the Ok constructor or interrupting with the error in the Error constructor.

    Note how the let* binding operator combines the behaviour of Lwt_syntax.( let* ) and Result_syntax.( let* ). Also note that the different let*s are differentiated by context; specifically by what syntax module has been opened.

  • return: 'a -> ('a, 'e) result Lwt.t: the expression return x is a promise already successfully resolved to x. More formally, return x is equivalent to Lwt_syntax.return (Result_syntax.return x).

  • fail: 'e -> ('a, 'e) result Lwt.t: the expression fail e is a promise already unsuccessfully resolved with the error e. More formally, fail e is equivalent to Lwt_syntax.return (Result_syntax.fail e).

The following functions offer additional, less often used functionalities.

  • both : ('a, 'e) result Lwt.t -> ('b, 'e) result Lwt.t -> ('a * 'b, 'e list) result Lwt.t: the expression both p1 p2 is a promise that resolves successfully if both p1 and p2 resolve successfully. It resolves unsuccessfully if either p1 or p2 resolve unsuccessfully.

    Note that in the expression both p1 p2, both promises p1 and p2 are evaluated concurrently. Moreover, the returned promise only resolves once both promises have resolved, even if one resolves unsuccessfully.

    Note that this syntax module does not offer and* as a binding operator alias for both. This is because, as with Result_syntax, the type for errors in both is not stable (it is 'e on the argument side and 'e   list on the return side). This hinders practical uses of and*.

  • all : ('a, 'e) result Lwt.t list -> ('a list, 'e list) result Lwt.t: the function all is a generalisation of both from tuples to lists.

    Note that, as a generalisation of both, in a call to the function all, all the promises in the provided list make progress towards resolution concurrently and continue to evaluate until resolution regardless of successes and failures.

  • join : (unit, 'e) result Lwt.t list -> (unit, 'e list) result Lwt.t: the function join is a specialisation of all for lists of unit-type expressions (typically, for side-effects).

The following functions do not provide new functionalities but they are useful for small performance gains or for shorter syntax.

  • return_unit is equivalent to return () but it avoids one allocation. This is important in parts of the code that are performance critical, and it is a good habit to take otherwise.

    return_nil is equivalent to return [] but it avoids one allocation.
    return_true is equivalent to return true but it avoids one allocation.
    return_false is equivalent to return false but it avoids one allocation.
    return_none is equivalent to return None but it avoids one allocation.
    return_some x is equivalent to return (Some x) and it is provided for completeness.

    Note that, like with Result_syntax, this syntax module does not provide return_ok and return_error. This is to avoid nested result types. If you need to nest results you can do so by hand.

  • let+ is a binding operator similar to let* but when the expression which follows the in returns a non-promise value. In other words, let+ x = e in e' is equivalent to let* x = e in return (e').

    The let+ is purely for syntactic conciseness (compared to the * variant), use it if it makes your code more readable.

Exercises

  • Write the implementation for

    (** [map f [x1; x2; ..]] is a promise for a list [[y1; y2; .. ]] where [y1]
        is the value that [f x1] successfully resolves to, etc. except if [f]
        resolves unsuccessfully on one of the input in which case it also
        resolves unsuccessfully with the same error as [f]. *)
    map : ('a -> ('b, 'e) result Lwt.t) -> 'a list -> ('b list, 'e) result Lwt.t
    

    How does your code compare to the one in the result-only variant of this exercise?

    Note that this exercise is for learning only. You won’t need to write this function in Octez. Indeed, a helper function which does exactly that is provided in the extended standard library of Octez.

  • Make your map function tail-recursive.

  • What type error is triggered by the following code?

    open Lwt_result_syntax ;;
    let ( and* ) = both ;;
    let _ =
      let* x = return 0 and* y = return 1 in
      let* z = return 2 in
      return (x + y + z) ;;
    
  • Rewrite the following function without match-with

    let compose3 f g h x =
      let open Lwt_syntax in
      let* y_result = f x in
      match y_result with
      | Error e -> return (Error e)
      | Ok y ->
        let* z_result = g y in
        match z_result with
        | Error e -> return (Error e)
        | Ok z ->
          h z
    

Converting errors of promises

Remember that, with Result_syntax, you cannot mix different types of errors in a single sequence of let*. This also applies to Lwt_result_syntax. Indeed, the type checker will prevent you from doing so.

You can use the same Result.map_error function as for plain results. But when you are working with promises of result, the syntactic cost of doing so is high:

let open Lwt_result_syntax in
let* config =
  let open Lwt_syntax in
  let* config_result = read_configuration () in
  Lwt.return (Result.map_error (fun () -> ..) config_result)
in
..

To avoid this syntactic weight, the Lwt_result_module provides a dedicated function:

lwt_map_error : ('e -> 'f) -> ('a, 'e) result Lwt.t -> ('a, 'f) result Lwt.t

Lifting

Occasionally, whilst you are working with promises of result (i.e., working with values of the type (_, _) result Lwt.t), you will need to call a function that returns a simple promise (a promise that cannot fail, a promise of a value that’s not a result, i.e., a value of type _ Lwt.t) or a simple result (an immediate value of a result, i.e., a value of type (_, _) result). This is common enough that the module Lwt_result_syntax provides helpers dedicated to this.

From Lwt-only into Lwt-result

lwt_ok: 'a Lwt.t -> ('a, 'e) result Lwt.t: the expression lwt_ok p is a promise which waits for the promise p to resolve to a value, after which it resolves successfully with the same value.

The expression lwt_ok p is equivalent to let open Lwt_syntax in let* v = p in return (Ok v)

In more practical terms, the function lwt_ok lifts an Lwt-only value (a promise) into an Lwt-result value (a promise of a result). The function is generally used as follows:

let* x = lwt_ok @@ plain_lwt_function foo bar in
..

From result-only into Lwt-result

bind_from_result : ('a, 'e) result -> ('a -> ('b, 'e) result Lwt.t) -> ('b, 'e) result Lwt.t: the expression bind_from_result r (fun x -> e) is either e with x set to the value carried by the Ok constructor of r, or a promise already unsuccessfully resolved if r is an Error. In less formal terms, bind_from_result r (fun x -> e) either continues with e or interrupts immediately without evaluating e if r is an Error.

The expression bind_from_result r (fun x -> e) is equivalent to match r with | Error err -> fail err | Ok x -> e

The function bind_from_result is useful to use a result-only in an Lwt-result expression. Notice however that the name and type are different from lwt_ok. That is because result and Lwt have different control-flow roles: the former is for aborting a computation the latter is for concurrency. Thus, simply lifting from result-only into Lwt-result is somewhat inefficient because, in case of Error, it doesn’t abort immediately. This efficiency consideration is the motivation behind the bind_from_result function.

The difference of type with lwt_ok makes bind_from_result somewhat less syntactically graceful. Here are examples of practical uses.

(* modify in-place the content of a file, return the new content *)
let map_file_content filename (f : string -> (string, string) result) =
  let open Lwt_result_syntax in
  let* file_handle = open_file filename in
  let* contents = read_file file_handle in
  bind_from_result (f contents) @@ fun new_contents ->
  let* () = write_file file_handle new_contents in
  let* () = close_file file_handle
  return new_contents

For occasional use, this ersatz of an infix binding operator is sufficient. You can help by grouping multiple calls to result-only functions into a single expressions; thus reducing the number of calls to bind_from_result.

Of course, for occasional use, you can also resort to a plain match-with.

In addition, if your code is not performance-critical (as is the case for the example above, because system-calls will dominate the cost of this function), you can use Lwt.return as a simple lifting function.

…
let* new_contents = Lwt.return @@ f contents in
…

Alternatively, if you mix many result-only expressions in your Lwt-result function, you can locally define a dedicated binding-operator.

let ( let*? ) = bind_from_result in
..
let*? new_contents = f contents in
..

Exercises

  • Define a binding operator to replace uses of lwt_ok.

Wait! there is too much! what module am I supposed to open locally and what operators should I use?

If you are feeling overwhelmed by the different syntax modules, here are some simple guidelines.

  • If your function returns (_, _) result Lwt.t values, then you start the function with let open Lwt_result_syntax in. Within the function you use

    • let for vanilla expressions,

    • let* for Lwt-result expressions,

    • let* .. = lwt_ok @@ .. in for Lwt-only expressions,

    • bind_from_result (..) @@ fun .. -> for result-only expressions (or alternatives presented above).

    And you end your function with a call to return.

  • If your function returns (_, _) result values, then you start the function with let open Result_syntax in. Within the function you use

    • let for vanilla expressions,

    • let* for result expressions,

    And you end your function with a call to return.

  • If your function returns _ Lwt.t values, then you start the function with let open Lwt_syntax in.

    • let for vanilla expressions,

    • let* for Lwt expressions,

    And you end your function with a call to return.

These are rules of thumb and there are exceptions to them. Still, they should cover most of your uses and, as such, they are a good starting point.

What’s an error?

So far, this tutorial has considered errors in an abstract way. Most of the types carried by the Error constructors have been parameters ('e). This is a common pattern for higher-order functions that compose multiple result and Lwt-result functions together. But, in practice, not every function is a higher-order abstract combinator and you sometimes need to choose a concrete error. This section explores common choices.

A dedicated algebraic data type

Often, a dedicated algebraic data type is appropriate. A sum type represents the different kinds of failures that might occur. E.g., type hashing_error = Not_enough_data | Invalid_escape_sequence. A product type (typically a record) carries multiple bits of information about a failure. E.g., type parsing_error = { source: string; location: int; token: token; }

This approach works best when a set of functions (say, all the functions of a module) have similar ways to fail. Indeed, when that is the case, you can simply define the error type once and all calls to these functions can match on that error type if need be.

E.g., binary encoding and decoding errors in data-encoding.

Polymorphic variants

In some cases, the different functions of a module may each fail with different subsets of a common set of errors. In such a case, you can use polymorphic variants to represent errors. E.g.,

val connect_to_peer:
  address -> (connection, [ `timeout | `connection_refused ]) result Lwt.t
val send_message:
  connection -> signing_key -> string ->
    (unit, [ `timeout | `connection_closed ]) result Lwt.t
val read_message:
  connection ->
    (string, [ `timeout | `unknown_signing_key | `invalid_signature | `connection_closed ]) result Lwt.t
val close_connection: connection -> (unit, [ `unread_messages of string list ]) result

The benefit of this approach is that the caller can compose the different functions together easily and match only on the union of errors that may actually happen. The type checker keeps track of the variants that can reach any program point.

let handshake conn =
  let open Lwt_result_syntax in
  let* () = send_message conn "ping" in
  let* m = read_message conn in
  if m = "pong" then
    return ()
  else
    `unrecognised_message m

let handshake conn =
  let open Lwt_syntax in
  let* r = handshake conn in
  match r with
  | Ok () -> return_unit
  | Error (`unknown_signing_key | `invalid_signature) ->
    (* we ignore unread messages if the peer had signature issues *)
    let _ = close_connection conn in
    return_unit
  | Error (`timeout | `connection_closed) ->
    match close_connection with
    | Ok () -> return_unit
    | Error (`unread_messages msgs) ->
      let* () = log_unread_messages msgs in
      return_unit

A human-readable string

In some cases, there is nothing to be done about an error but to inform the user. In this case, the error may just as well be the message.

It is important to note that these messages are not generally meant to be matched against. Indeed, such messages may not be stable and even if they are, they probably don’t carry precise enough information to be acted upon.

You should only use string as an error type when the error is not recoverable and you should not try to recover from string errors (or more precisely, your recovery should not depend on the content of the string).

An abstract type

If the error is not meant to be recovered from, it is sometimes ok to use an abstract type. This is generally useful at the interface of a module, specifically when the functions within the module are meant to inspect the errors and possibly attempt recovery, but the callers outside of the modules are not.

If you do use an abstract type for errors, you should also provide a pretty-printing function.

A wrapper around one of the above

Sometimes you want to add context or information to an error. E.g.,

type 'a with_debug_info = {
  payload: 'a;
  timestamp: Time.System.t;
  position: string * int * int * int;
}

let with_debug_info ~position f =
  match f () with
  | Ok _ as ok -> ok
  | Error e -> Error { payload = e; timestamp = Systime_os.now (); position }

This specific example can be useful for debugging, but other wrappers can be useful in other contexts.

Mixing error types

It is difficult to work with different types of errors within the same function. This most commonly happens if you are calling functions from different libraries, which use different types of errors.

This is difficult because the errors on both sides of the binding operator are the same.

val ( let* ) : ('a, 'e) result -> ('a -> ('b, 'e) result) -> ('b, 'e) result

The error monad provides some support to deal with multiple types of errors at once. But this support is limited. It is not generally an issue because error-mixing is somewhat rare: it tends to happen at the boundary between different levels of abstractions.

If you encounter one of these situations, you will need to convert all the errors to a common type.

type error = Configuration_error | Command_line_error

let* config =
  match Config.parse_file config_file with
  | Ok _ as ok -> ok
  | Error Config.File_not_found -> Ok Config.default
  | Error Config.Invalid_file -> Error Configuration_error
in
let* cli_parameters =
  match Cli.parse_parameters () with
  | Ok _ as ok -> ok
  | Error Cli.Invalid_parameter -> Error Command_line_error
in
..

You can also use the Result.map_error and lwt_map_error functions introduced in previous sections.

Wait! it was supposed to be “one single uniform way of dealing with errors”! what is this?

The error management in Octez is through a unified way (syntax modules with regular, predictable interfaces) of handling different types of errors.

The variety of errors is a boon in that it lets you use whatever is the most appropriate for the part of the code that you are working on. However, the variety of errors is also a curse in that stitching together functions which return different errors requires boilerplate conversion functions.

That’s where the global error type comes in: a unified type for errors. And that’s for the next section to introduce.

META COMMENTARY

The previous sections are not Octez-specific. True, the syntax modules are defined within the Octez source tree, but they could be released separately (and they will be) or they could easily be replicated in a separate project.

The next sections are Octez-specific. They introduce types and values that are used within the whole of Octez.

You should take this opportunity to take a break.
Come back in a few minutes.

Part 2: tzresult and Lwt-tzresult

error, trace, and tzresult

The error type is an extensible variant type defined in the Error_monad module.

Each part of the Octez source code can extend it to include some errors relevant to that part. E.g., the p2p layer adds type error += Connection_refused whilst the store layer adds type error += Snapshot_file_not_found of string. More information on errors is presented later.

The 'error trace type is a data structure dedicated to holding errors. It is exported by the Error_monad module. It is only ever instantiated as error trace. It can accumulate multiple errors, which helps present information about the context in which an error occurs. More information on traces is presented later.

The 'a tzresult type is an alias for ('a, error trace) result. It is used in Octez to represent the possibility of failure in a generic way. Using tzresult is a trade-off. You should only use it in situations where the pros outweigh the cons.

Pros:

  • A generic error that is the same everywhere means that the binding operator let* can be used everywhere without having to convert errors to a common type.

  • Traces allow you to easily add context to an existing error (see how later).

Cons:

  • A generic wrapper around generic errors that cannot be meaningfully matched against (although the pattern exists in legacy code).

  • Type information about errors is coarse-grained to the point of being meaningless (e.g., there is no exhaustiveness check when matching).

  • Registration is syntactically heavy and requires care (see below).

In general, the type tzresult is mostly useful at a high-enough level of abstraction and in the outermost interface of a module or even package (i.e., when exposing errors to callers that do not have access to internal implementation details).

Generic failing

The easiest way to return a tzresult is via functions provided by the Error_monad. These functions are located at the top-level of the module, as such they are available, unqualified, everywhere in the code. They do not even require the syntax modules to be open.

  • error_with is for formatting a string and wrapping it inside an error inside a trace inside an Error. E.g.,

    let retry original_limit (f : unit -> unit tzresult) =
      let rec retry limit f
        if limit < 0 then
          error_with "retried %d times" original_limit
        else
          match f () with
          | Error _ -> retry (limit - 1) f
          | Ok () -> Ok ()
      in
      retry original_limit f
    

    You can use all the formating percent-escapes from the Format module. However, you should generally keep the message on a single line so that it can be printed nicely in logs.

    Formally, error_with has type ('a, Format.formatter, unit, 'b tzresult) format4 -> 'a. This is somewhat inscrutable. Suffice to say: it is used with a format string and returns an Error.

  • error_with_exn : exn -> 'a tzresult is for wrapping an exception inside an error inside a trace inside an Error.

    let media_type_kind s =
      match s with
      | "json" | "bson" -> Ok `Json
      | "octet-stream" -> Ok `Binary
      | _ -> error_with_exn Not_found
    
  • failwith is for formatting a string and wrapping it inside an error inside a trace inside an Error inside a promise. E.g.,

    failwith "Cannot read file %s (ot %s)" file_name (Unix.error_message error)
    

    failwith is similar to error_with but for the combined Lwt-tzresult monad. Again the type (('a, Format.formatter, unit, 'b tzresult Lwt.t) format4 -> 'a) is inscrutable, but again usage is as simple as a formatting.

  • fail_with_exn is for wrapping an exception inside an error inside a trace inside an Error inside a promise. E.g.,

    fail_with_exn Not_found
    

    fail_with_exn is similar to error_with_exn but for the combined Lwt-tzresult monad.

These functions are useful as a way to fail with generic errors that carry a simple payload. The next section is about using custom errors with more content.

Exercises

  • Write a function tzresult_of_option : 'a option -> 'a tzresult which replaces None with a generic failure of your choosing.

  • Write a function catch : (unit -> 'a) -> 'a tzresult which wraps any exception raised during the evaluation of the function call.

Declaring and registering errors

On many occasions, error_with and friends (see above) are not sufficient: you may want your error to carry more information than can be conveyed in a simple string or a simple exception. When you do, you need a custom error. You must first declare and then register an error.

To declare a new error you simply extend the error type. Remember that the error type is defined and exported to the rest of the code by the Error_monad module. You extend it with the += syntax:

type error +=
  Invalid_configuration of { expected: string; got: string; line: int }

The registration function register_error_kind is also part of the Error_monad module. You register a new error by calling this function.

let () =
  register_error_kind
    `Temporary
    ~id:"invalid_configuration"
    ~title:"Invalid configuration"
    ~description:"The configuration is invalid."
    ~pp:(fun f (expected, got, line) ->
      Format.fprintf f
        "When parsing configuration expected %s but got %s (at line %d)"
        expected got line
    )
    Data_encoding.(
      obj3
        (req "expected" string)
        (req "got" string)
        (req "line" int31))
    (function
      | Invalid_configuration {expected; got; line} ->
        Some (expected, got, line)
      | _ -> None)
    (fun (expected, got, line) -> Invalid_configuration {expected; got; line})

Note that you MUST register the errors you declare. Failure to do so can lead to serious issues.

The arguments for the register_error_kind function are as follows: - category (`Temporary): the category argument is meaningless in the shell, just use `Temporary.

  • id: a short string containing only characters that do not need escaping ([a-zA-Z0-9._-]), must be unique across the whole program.

  • title: a short human readable string.

  • description: a longer human readable string.

  • pp: a pretty-printing function carrying enough information for a full error message for the user. Note that the function does not receive the error, instead it receives the projected payload of the error (here a 3-uple (expected, got, line).

  • encoding: an encoding for the projected payload of the error.

  • projection: a partial function that matches the specific error (out of all of them) and return its projected payload. This function always has the form function | <the error you are returning> -> Some <projected payload> | _ -> None.

  • injection: a function that takes the projected payload and constructs the error out of it.

For errors that do not carry information (e.g., type error += Size_limit_exceeded), the projected payload of the error is unit.

It is customary to either register the error immediately after the error is declared or to register multiple errors immediately after declaring them all. In some cases, the registration happens in a separate module. Either way, registration of declared error is compulsory.

Exercises

  • Register the following error

    (** [Size_limit_exceeded {limit; current_size; attempted_insertion}] is used
        when an insertion into the global table of known blocks would cause the
        size of the table to exceed the limit. The field [limit] holds the
        maximum allowed size, the field [current_size] holds the current size of
        the table and [attempted_insertion] holds the size of the element that
        was passed to the insertion function. *)
    type error += Size_limit_exceeded {
      limit: int;
      current_size: int;
      attempted_insertion: int
    }
    

The Tzresult_syntax module

Remember that 'a tzresult is a special case of ('a, 'e) result. Specifically, a special case where 'e is error trace. Consequently, you can handle tzresult values using the Result_syntax module. However, a more specialised module Tzresult_syntax is available.

The Tzresult_syntax module is identical to the Result_syntax module but for the following differences.

  • fail: 'e -> ('a, 'e trace) result: the expression fail e wraps e in a trace inside an Error. When e is of type error as is the case throughout Octez, fail e is of type 'a tzresult.

  • and*: a binding operator alias for both (see below). You can use it with let* the same way you use and with let.

    let apply_triple f (x, y, z) =
      let open Tzresult_syntax in
      let* u = f x
      and* v = f y
      and* w = f z
      in
      return (u, v, w)
    

    When you use and*, the bound results (f x, f y, and f z) are all evaluated fully, regardless of the success/failure of the others. The expression which follows the in (return ..) is evaluated if all the bound results are successful.

  • both : ('a, 'e trace) result -> ('b, 'e trace) result -> ('a * 'b, 'e trace) result: the expression both a b is Ok if both a and b are Ok and Error otherwise`.

    Note that unlike Result_syntax.both, the type of errors (error trace) is the same on both the argument and return side of this function: the traces are combined automatically. This remark applies to the all and join (see below) as well.

    The stability of the return type is what allows this syntax module to include an and* binding operator.

  • all : ('a, 'e trace) result list -> ('a list, 'e trace) result: the function all is a generalisation of both from tuples to lists.

  • join : (unit, 'e trace) result list -> (unit, 'e trace) result: the function join is a specialisation of all for list of unit-typed expressions (typically, for side-effects).

  • and+ is a binding operator similar to and* but for use with let+ rather than let*.

Exercises

  • What is the difference between the two following functions?

    let twice f =
      let open Tzresult_syntax in
      let* () = f () in
      let* () = f () in
      return_unit
    
    let twice f =
      let open Tzresult_syntax in
      let* () = f ()
      and* () = f ()
      in
      return_unit
    

The Lwt_tzresult_syntax module

In the same way result can be combined with Lwt, tzresult can also be combined with Lwt. And in the same way that Tzresult_syntax is a small variation of Result_syntax, Lwt_tzresult_syntax is a small variation of Lwt_result_syntax.

There are possibly too many parallels to keep track of, so the diagram below might help.

'a  -----------> ('a, 'e) result ------------> 'a tzresult
 |                        |                         |
 |                        |                         |
 V                        V                         V
'a Lwt.t ------> ('a, 'e) result Lwt.t ------> 'a tzresult Lwt.t

Anyway, the Lwt_tzresult_syntax module is identical to the Lwt_result_syntax module but for the following differences.

  • fail: 'e -> ('a, 'e trace) result Lwt.t: the expression fail e wraps e in a trace inside an Error inside a promise. When e is of type error as is the case throughout Octez, fail e is of type 'a tzresult Lwt.t.

  • and*: a binding operator alias for both. You can use it with let* the same way you use and with let.

    let apply_triple f (x, y, z) =
      let open Lwt_tzresult_syntax in
      let* u = f x
      and* v = f y
      and* w = f z
      in
      return (u, v, w)
    

    When you use and*, the bound promises (f x, f y, and f z) are evaluated concurrently, and the expression which follows the in (return   ..) is evaluated once all the bound promises have all resolved but only if all of them resolve successfully.

    Note how this and* binding operator inherits the properties of both Lwt_syntax.( and* ) and Tzresult_syntax.( and* ). Specifically, the promises are evaluated concurrently and the expression which follows the in is evaluated only if all the bound promises have successfully resolved. These two orthogonal properties are combined. This remark also applies to both, all, join and and+ below.

  • both : ('a, 'e trace) result Lwt.t -> ('b, 'e trace) result Lwt.t -> ('a * 'b, 'e trace) result Lwt.t: the expression both p q is a promise that resolves once both p and q have resolved. It resolves to Ok if both p and q do, and to Error otherwise`.

    Note that unlike Lwt_result_syntax.both, the type of errors (error trace) is the same on both the argument and return side of this function: the trace are combined automatically. This remark applies to the all and join (see below) as well.

    The stability of the return type is what allows this syntax module to include an and* binding operator.

  • all : ('a, 'e trace) result Lwt.t list -> ('a list, 'e trace) result Lwt.t: the function all is a generalisation of both from tuples to lists.

  • join : (unit, 'e trace) result Lwt.t list -> (unit, 'e trace) result Lwt.t: the function join is a specialisation of all for lists of unit-typed expressions (typically, for side-effects).

  • and+ is a binding operator similar to and* but for use with let+ rather than let*.

Exercises

  • Rewrite this function to use the Lwt_tzresult_syntax module and no other syntax module.

    let apply_tuple (f, g) (x, y) =
      let open Lwt_syntax in
      let* u = f x
      and* v = g y
      in
      let r = Tzresult_syntax.both u v in
      return r
    
  • Write the implementation for

    (** [map f [x1; x2; ..]] is [[y1; y2; ..]] where [y1] is the successful
        result of [f x1], [y2] is the successful result of [f x2], etc. If [f]
        fails on any of the inputs, returns an `Error` instead. Either way, all
        the calls to [f] on all the inputs are evaluated concurrently and all
        the calls to [f] have resolved before the whole promise resolves. *)
    val map : ('a -> 'b tzresult Lwt.t) -> 'a list -> 'b list tzresult
    

Are you kidding me?! there is even more! what module am I supposed to open locally and what operators should I use?

You can also use simple guidelines to use these syntax modules effectively.

  • If your function returns _ tzresult Lwt.t values, then you start the function with let open Lwt_tzresult_syntax in. Within the function you use

    • let for vanilla expressions,

    • let* for Lwt-tzresult expressions,

    • let* .. = lwt_ok @@ .. in for Lwt-only expressions,

    • bind_from_result (..) @@ fun .. -> for tzresult-only expressions (or alternatives presented above).

    And you end your function with a call to return.

  • If your function returns _ tzresult values, then you start the function with let open Tzresult_syntax in. Within the function you use

    • let for vanilla expressions,

    • let* for tzresult expressions,

    And you end your function with a call to return.

The rest of the guidelines (for (_, _) result Lwt.t, (_, _) result, and _ Lwt.t) are unaffected.

Tracing

Remember that a trace is a data structure specifically designed for errors.

Traces have two roles:

  • As a programmer you benefit from the traces’ ability to combine automatically. Indeed, this feature of traces makes the and* binding operators possible which can simplify some tasks such as concurrent evaluation of multiple tzresult promises.

  • For the user, traces combine multiple errors, allowing for high-level errors (e.g., Cannot_bootstrap_node) to be paired with low-level errors (e.g., Unix_error EADDRINUSE). When used correctly, this can help create more informative error messages which, in turn, can help debugging.

Tracing primitives are declared in the Error_monad module. As such, they are available almost everywhere in the code. They should be used whenever an error passes from one component (say the p2p layer, or the storage layer) into another (say the shell).

  • record_trace err r leaves r untouched if it evaluates to Ok. Otherwise, it adds the error err to the trace carried in Error.

    let check_hashes head block operation =
      let open Tzresult_syntax in
      let* () =
        record_trace (Invalid_hash { kind: "head"; hash: head}) @@
        check_hash chain
      in
      let* () =
        record_trace (Invalid_hash { kind: "block"; hash: block}) @@
        check_hash block
      in
      let* () =
        record_trace (Invalid_hash { kind: "operation"; hash: operation}) @@
        check_hash operation
      in
      return_unit
    

    In this example a failure from any of the calls to check_hash will be given context that helps with understanding the source of the error.

  • record_trace_eval is a lazy version of record_trace in that the error added to the trace is only evaluated if needed. More formally record_trace_eval make_err r leaves r untouched if it evaluates to Ok. Otherwise it calls make_err and adds the returned error onto the trace.

    You should use the strict record_trace version when the error you are adding to the trace is an immediate value (e.g., Overflow) or a constructor with immediate values (e.g., Invalid_file name). You should use the lazy record_trace_eval version when the error you are adding to the trace requires computation to generate (e.g., if it requires formatting or querying).

  • trace is the Lwt-aware variant of record_trace. More formally, trace err p leaves p untouched if it resolves successfully, otherwise it adds the error err to the trace carried by the unsuccessful resolve.

    let get_data_and_gossip_it () =
      let open Lwt_tzresult_syntax in
      let* data =
        trace Cannot_get_random_data_from_storage @@
        Storage.get_random_data ()
      in
      let* number_of_peers =
        trace Cannot_gossip_data @@
        P2p.gossip data
      in
      return (data, number_of_peers)
    

    In this example, low-level storage errors are given more context by the Cannot_get_random_data_from_storage error. Similarly, low-level p2p errors are given more context by the Cannot_gossip_data error. This is important because both the storage and the p2p layer may suffer from similar system issues (such as file-descriptor exhaustion).

  • trace_eval is the lazy version of trace, or, equivalently, the Lwt-aware version of record_trace_eval.

    You should use the strict trace version when the error you are adding is immediate or a constructor with immediate values. You should use the lazy trace_eval version when the error you are adding requires computation to generate.

Do not hesitate to use the tracing primitives. Too much context is better than too little context. Think of trace (and variants) as a way to document Octez. Specifically, as making the error messages of Octez more informative.

META COMMENTARY

The previous sections had a practical focus: how to handle errors? how to mix different syntaxes? how?! By contrast, the following sections are in-depth discussions of advanced or tangential topics which you should feel free to skim or even to skip.

You should take this opportunity to take a break.
Come back in a few minutes.

Part 3: Advanced topics

Working with standard data-structures: Lwtreslib

Handling values within the Lwt, result, and Lwt-result monads is so common in Octez that you also have access to an extension of the Stdlib dedicated to these monads: the Lwtreslib library.

The tezos-lwt-result-stdlib package exports an Lwtreslib module which is made available, through tezos-error-monad and tezos-base, to the whole of the codebase. Specifically, within the codebase of Octez the following modules of OCaml’s Stdlib are shadowed by Lwtreslib’s:

  • List,

  • Result,

  • Option,

  • Seq,

  • Map,

  • Set, and

  • Hashtbl.

In all those modules, the underlying data structures are compatible with those of the Stdlib and thus with the rest of the OCaml ecosystem. However, the primitives in these modules are extended to support Lwt, result and the combination of the two. Specifically, for each function that traverses the data structure, the module also contains variants that perform the same traversal within each of the monad. E.g., for List.map

(* vanilla map *)
val map : ('a -> 'b) -> 'a list -> 'b list

(* [result]-aware map: stops at the first error *)
val map_e : ('a -> ('b, 'trace) result) -> 'a list -> ('b list, 'trace) result

(* sequential Lwt map: treats each element after the previous one *)
val map_s : ('a -> 'b Lwt.t) -> 'a list -> 'b list Lwt.t

(* sequential Lwt-[result] map:
   - treats each element after the previous one
   - stops at the first error *)
val map_es :
  ('a -> ('b, 'trace) result Lwt.t) ->
  'a list ->
  ('b list, 'trace) result Lwt.t

(* concurrent Lwt map: treats all the elements concurrently *)
val map_p : ('a -> 'b Lwt.t) -> 'a list -> 'b list Lwt.t

(* concurrent Lwt-[result] map:
   - treats all the elements concurrently
   - treats the whole list no matter the success/errors *)
val map_ep :
  ('a -> ('b, 'trace) result Lwt.t) ->
  'a list ->
  ('b list, 'trace list) result Lwt.t

Check out the online documentation of Lwtreslib for a description of the semantic and naming convention.

In addition to shadowing existing modules, Lwtreslib also exports new modules:

  • Seq_e for result-aware variant of Seq,

  • Seq_s for Lwt-aware variant of Seq,

  • Seq_es for Lwt-result-aware variant of Seq, and

  • WithExceptions for unsafe accesses such as Option.get.

Whenever you need to traverse a standard data structure with some result or Lwt or Lwt-result function, Lwtreslib should have that function ready for you. You should never fold over a data structure with a promise or result accumulator. E.g., you should do

List.rev_map_es fetch keys

and you shouldn’t do

List.fold_left
  (fun resources key ->
    let* resources = resources in
    let* resource = fetch key in
    return (resource :: resources))
  (return [])
  keys

If you do not find the traversal function you need, do not hesitate to contribute to Lwtreslib.

Working with traces and errors and such directly

Occasionally, you may need more interaction with traces than the primitives presented thus far.

For error reporting or debugging purpose, you may need to show traces to users. You can do so with the following values.

  • pp_print_trace: a %a-compatible formatter. Note that the trace formatting is unspecified and subject to change. Also be aware that it generally prints the trace over multiple lines.

  • pp_print_top_error_of_trace: a %a-compatible formatter that only shows the most recent error in the trace (or one of the most recent errors if there are several). This is useful to get shorter error messages. Most often used for the declaration of logging events in Internal_event.Simple.

  • trace_encoding: an encoding for traces. Useful to combine into encoding of data structures that contain traces. Most often used for the declaration of logging events in Internal_event.Simple.

If you are working with non-standard data structures and if you need to define monad-aware traversors for these data structures, you may need to build some traces by hand. You can do so with the following values.

  • TzTrace.make : 'e -> 'e trace useful to convert an error into an error   trace. By extension, this is useful to convert an ('a, error) result into an 'a tzresult.

  • TzTrace.cons : 'e -> 'e trace -> 'e trace is the low-level combinators that builds-up traces. In most cases, you’ll want to use trace or record_trace instead, but you might need it when you are defining a low-level traversal function for some data structure.

    let iter_with_bounded_errors bound f xs =
      (* we rely on syntax for Lwt, we handle results by hand *)
      let open Lwt_syntax in
      let rec aux_all_ok = function
        | [] -> return_ok ()
        | x :: xs ->
          let* r = f x in
          match r with
          | Ok () -> aux_all_ok xs
          | Error e -> aux_some_error 1 (TzTrace.make e) xs
      and aux_some_error num_errors trace xs =
        if num_errors > bound then
          return_error (TzTrace.cons (Exceeded_error_limit bound) trace)
        else
          match xs with
          | [] -> return_ok ()
          | x :: xs ->
            let* r = f x in
            match r with
            | Ok () -> aux_some_error num_errors trace xs
            | Error e -> aux_some_error (num_errors + 1) (TzTrace.cons e trace) xs
      in
      aux_all_ok xs
    
  • TzTrace.conp : 'e trace -> 'e trace -> 'e trace is the parallel composition of two traces. Unlike cons, the traces composed by conp are not organised hierarchically. The errors are presented as having happened side-by-side.

    Note that currently there is little difference between cons and conp traces. But the difference will be more marked in the future.

    You should use conp (rather than cons) when you are gathering errors and traces from two or more concurrent processes.

Working within the protocol

If you are working on the protocol, things are slightly different for you. This is because the protocol has a restricted access to external resources and libraries. You can find more details in the dedicated documentation. This section focuses on the error-monad within the protocol.

The protocol environment libraries evolve at a slightly different pace than the underlying library. You need to check the mli files within src/lib_protocol_environment/sigs/.

Note that unlike in the shell, the traces in the protocol are already abstract. As a result there is no matching of traces (and thus errors) within the protocol: you can match Ok and Error, but not the payload of the Error. This part of the legacy code has already been removed.

The main difference between the protocol and the shell is that the category parameter of the register_error_kind function is meaningful. You must pass a category which is appropriate for the error you are registering:

  • Branch: is for branch-specific failures, i.e., failures that happen in the current branch (of the chain) but maybe wouldn’t happen in a different branch. E.g., a reference to an unknown block is invalid, but it might become valid once the head block has changed. This category is then used by the shell to retry after the branch changes.

  • Temporary: is for transient failures, i.e., failures that happen but may not always happen. This category is used by the shell to retry at some later time.

  • Permanent: is for irremediable failures, i.e., failures that happen and will always happen whatever the context. E.g., originating a contract that does not type-check is a permanent error. This is used by the shell to mark the data as invalid.

  • Outdated: is for failures that happen when some data is too old.

Another thing to consider is that errors from the protocol can reach the shell. However, because the error type of the protocol is distinct from that of the shell, the protocol errors are wrapped inside a shell error constructor.

This has no impact within the protocol (where shell errors don’t exist) nor within the shell (where protocol errors are automatically wrapped inside a shell error). However, it can have an impact in the spaces in between. Most typically, this matters in the unit-tests of the protocol (src/proto_alpha/lib_protocol/test/unit/) where you call some protocol functions directly. In this case, you need to wrap the errors yourself, using the wrapping functions provided by the environment: Environment.wrap_tzresult, Environment.wrap_tztrace, and Environment.wrap_tzerror.

Working below the error-monad

If you are working on some low-level libraries (e.g., src/lib_stdlib) or the external dependencies (e.g., data-encoding) you don’t have access to the error monad at all.

In this case, you can still use the result type but you need to define your own let* binding operator: let ( let* ) = Result.bind.

You can also use Lwt which provides its own Lwt.Syntax module.

Finally, the Lwt_result module (provided as part of Lwt) can help you deal with result-Lwt combinations, including via its Lwt_result.Syntax module.

Working with external libraries

This tutorial covers error-management techniques in Octez. However, from within Octez, you may need to call external libraries for cryptography or RPCs or data-encoding or what have you.

The first thing you do is to carefully read the documentation of the external library you are using. You should check the overview documentation with a look out for comments on error management.

Then, you also need to read the documentation of each function that you are calling. This documentation may explain how errors are handled: does the function return a result? does it raise and exception? is it unspecified?

If the function you are calling may raise exceptions, you should catch these exceptions. You can either do so at the level of the call itself or, if you are calling multiple functions that can all raise similar exceptions, around a whole block of calls.

When you catch an exception, the most common thing to do is to translate it or wrap it into a result or a tzresult.

try
  let v1 = Data_encoding.Json.destruct e1 j1 in
  let v2 = Data_encoding.Json.destruct e2 j2 in
  Ok (v1, v2)
with
  | exc -> Error (Cannot_destruct_json_value exc)

Note that if you are calling an Lwt function, you have to use Lwt.catch or Lwt.try_bind rather than try-with.

Lwt.catch
  (fun () ->
    let open Lwt_syntax in
    let* () = Lwt_unix.mkdir d1 perm in
    let* () = Lwt_unix.mkdir d2 perm in
    Lwt_result_syntax.return_unit)
  (function
    | exc -> Lwt_result_syntax.fail (Cannot_destruct_json_value exc))

The error monad provides several helpers functions for catching exceptions.

val catch : ?catch_only:(exn -> bool) -> (unit -> 'a) -> 'a tzresult

If the function you are calling may raise exceptions only under well-defined conditions on the parameters, then you can also check those conditions yourself and ignore the exceptions. When doing so, please add a comment to explain it.

let get_or_defaults low_default high_default array offset =
  if offset < 0 then
    low_default
  else if offset >= Array.length array then
    high_default
  else
    (* This cannot raise because of checks on offset above *)
    Array.get array offset

If the function may fail with result, you can map the error directly or simply continue with it. If it may fail with option, you can translate None into an appropriate error.

match find k kvs with
| None -> Error "cannot find key"
| Some v -> Ok v

If the function’s documentation specifies some pre-conditions but doesn’t explain what happens if those aren’t met, then you must check those pre-conditions.

Appendices

Legacy code

(This section will be removed once the whole code-base has been updated to use the binding operators as recommended. In the meantime, you need to learn some legacy constructs so you can read code that hasn’t been upgraded yet. You should not use the operators introduced in this section to write new code.)

The legacy code is written with infix bindings instead of let-style binding operators. The binding >>? for result and tzresult, >>= for Lwt, and >>=? for Lwt-result and Lwt-tzresult. A full equivalence table follows.

Modern

Legacy

let open Result_syntax in
let* x = e in
e'
e >>? fun x ->
e'
let open Result_syntax in
let+ x = e in
e'
e >|? fun x ->
e'
let open Lwt_syntax in
let* x = e in
e'
e >>= fun x ->
e'
let open Lwt_syntax in
let+ x = e in
e'
e >|= fun x ->
e'
let open Lwt_result_syntax in
let* x = e in
e'
e >>=? fun x ->
e'
let open Lwt_result_syntax in
let+ x = e in
e'
e >|=? fun x ->
e'

and*, and+ (any syntax module)

No equivalent, uses both_e, both_p, or both_ep

let open Lwt_result_syntax in
let* x = lwt_ok @@ e in
e'
(e >>= ok) >>=? fun x ->
e'
let open Lwt_result_syntax in
bind_from_result e @@ fun x ->
e'
e >>?= fun x ->
e'
let open Lwt_result_syntax in
bind_from_result e @@ fun x ->
lwt_ok @@ e'
e >|?= fun x ->
e'

In addition, instead of dedicated return and fail functions from a given syntax module, the legacy code relied on global values.

Modern

Legacy

let open Result_syntax in
return x
ok x
let open Result_syntax in
fail e

No equivalent, uses Error e

let open Lwt_syntax in
return x

No equivalent, uses Lwt.return x

let open Lwt_result_syntax in
return x
return x
let open Lwt_result_syntax in
fail e

No equivalent, uses Lwt.return_ok x

let open Tzresult_syntax in
return x
ok x
let open Tzresult_syntax in
fail e
error e
let open Lwt_tzresult_syntax in
return x
return x
let open Lwt_tzresult_syntax in
fail e
fail e

In addition to these syntactic differences, there are also usage differences. You might encounter the following patterns which you should not repeat:

  • Matching against a trace:

    match f () with
    | Ok .. -> ..
    | Error (Timeout :: _) -> ..
    | Error trace -> ..
    

    This is discouraged because the compiler is unable to warn you if the matching is affected by a change in the code. E.g., if you add context to an error in one place in the code, you may change the result of the matching somewhere else in the code.

In depth discussion: what even is a monad?

This tutorial has been pretty loose with the word monad. It has focused on usage with very little explanations of fundamental concepts. It is focused on the surface syntax instead of the underlying mathematical model. This section goes into a bit more details about the general principles of monads and such. It does not claim to even attempt to give a complete description of monads, it simply gives more details than the previous sections.

For coding purposes, a monad is a parametric type equipped with a set of specific operators.

type 'a t
val bind : 'a t -> ('a -> 'b t) -> 'b t
val return : 'a -> 'a t

The return operator injects a value into the monad and the bind operator continues within the monad.

The set of operators must also follow the monad laws. For example bind (return x) f must be equivalent to f x.

Monads are used as a generic way to encode different abstractions within a programming language: I/O, errors, collections, etc. For example, the option monad is defined as

module OptionMonad = struct
  type 'a t = 'a option
  let bind x f = match x with | None -> None | Some x -> f x
  let return x = Some x
end

And it is useful when dealing with queries that may have no answer. This can be used as a lighter form of error management than the result monad.

Some programming languages also offer syntactic sugar for monads. This is to avoid having to write bind within bind within bind. E.g., Haskell relies heavily on monads and has the dedicated do-notation. In OCaml, you can use one of the following methods:

  • binding operators (since OCaml 4.08.0)

    let add x y =
      let ( let* ) = OptionMonad.bind in
      let* x = int_of_strin_opt x in
      let* y = int_of_strin_opt y in
      Some (strin_of_int (x + y))
    
  • infix operators

    let add x y =
      let ( >>= ) = OptionMonad.bind in
      int_of_strin_opt x >>= fun x ->
      int_of_strin_opt y >>= fun y ->
      Some (strin_of_int (x + y))
    

    Note that mixing multiple infix operators is not always easy because of precedence and associativity.

  • partial application and infix @@

    let add x y =
      OptionMonad.bind (int_of_strin_opt x) @@ fun x ->
      OptionMonad.bind (int_of_strin_opt y) @@ fun y ->
      Some (strin_of_int (x + y))
    

    This is useful for the occasional application: you do not need to declare a dedicated operator nor open a dedicated syntax module.

Monads can have additional operators beside the required core. E.g., you can add OptionMonad.join : 'a option option -> 'a option.

In depth discussion: Error_monad, src/lib_error_monad/, Tezos_base__TzPervasives, etc.

The different parts of the error monad (syntax modules, extended stdlib, tracing promitives, etc.) are defined in separate files. Yet, they are all available to you directly. This section explains where each part is defined and how it reaches the scope of your code.

From your code, working back to the definitions.

In most of Octez, the Error_monad module is available. Specifically, it is available in all the packages that depend on tezos-base. This covers everything except the protocols and a handful of low-level libraries.

In those part of Octez, the build files include -open Tezos_base__TzPervasives.

The module Tezos_base__TzPervasives is defined by the compilation unit src/lib_base/TzPervasives.ml.

This compilation unit gathers multiple low-level modules together. Of interest to us is include Tezos_error_monad.Error_monad (left untouched in the mli) and include Tezos_error_monad.TzLwtreslib (not present in the mli, used to shadow the Stdlib modules List, Option, Result, etc.).

The Error_monad module exports:

  • the error type along with the register_error_kind function,

  • the 'a tzresult type,

  • the TzTrace module,

  • the Tzresult_syntax and Lwt_tzresult_syntax modules (from a different, more generic name),

  • and exports a few more functions.

The rest of the tezos-error-monad package:

  • defines the 'a trace type (in TzTrace.ml), and

  • instantiates TzLwtreslib by applying Lwtreslib’s Traced functor to TzTrace.

The Lwtreslib module exports a Traced (T: TRACE) functor. This functor takes a definition of traces and returns a group of modules intended to shadow the Stdlib.

From the underlying definitions, working all the way up to your code.

At the low-level is Lwtreslib.

  • src/lib_lwt_result_stdlib/bare/sigs: defines interfaces for basic, non-traced syntax modules and Stdlib-replacement modules.

  • src/lib_lwt_result_stdlib/bare/structs: defines implementations basic, non-traced syntax modules and Stdlib-replacement modules.

  • src/lib_lwt_result_stdlib/traced/sigs: defines interfaces for traced syntax modules and Stdlib-replacement modules. These interfaces are built on top of the non-traced interfaces, mostly by addition and occasionally by shadowing.

  • src/lib_lwt_result_stdlib/traced/structs: defines implementations for traced syntax modules and Stdlib-replacement modules. These implementations are built on top of the non-traced implementations, mostly by addition and occasionally by shadowing. These are defined as functors over some abstract tracing primitives.

  • src/lib_lwt_result_stdlib/lwtreslib.mli: puts together the traced implementations into a single functor Traced that takes a trace definition and returns fully instantiated modules to shadow the Stdlib.

Above Lwtreslib is the Error monad.

  • src/lib_error_monad/TzTrace.ml: defines the 'a trace type along with the low-level trace-construction primitives.

  • src/lib_error_monad/TzLwtreslib.ml: instantiates Lwtreslib.Traced with TzTrace.

  • src/lib_error_monad/monad_extension_maker.ml: provides a functor which, given a tracing module, provides some higher level functions for tracing as well as a few other functions.

  • src/lib_error_monad/core_maker.ml: provides a functor which, given a name, provides an error type, a register_error_kind function, and a few other related functions. This is a functor so we can instantiate it separately for the shell and for each of the protocols.

  • src/lib_error_monad/TzCore.ml: instantiates the core_maker functor for the shell.

  • src/lib_error_monad/error_monad.ml: puts together all of the above into a single module.

Above the Error monad is lib-base:

  • src/lib_base/TzPervasives.ml: exports the Error_monad module, includes the Error_monad module, exports each of the TzLwtreslib module.

In depth discussion: result as data and result as control-flow

Note that result (and similarly, tzresult) is a data type. Specifically

type ('a, 'b) result =
| Ok of 'a
| Error of 'b

You can treat values of type result as data of that data-type. In this case, you construct and match the values, you pass them around, etc.

Note however that, in Octez, we also use the result type as a control-flow mechanism. Specifically, in conjunction with the let* binding operator, the result type has a continue/abort meaning.

Within your code, you can go from one use to the other. E.g.,

let xs =
  List.rev_map
    (fun x ->
      (* [result] as control-flow *)
      let open Result_syntax in
      let* .. = .. in
      let* .. = .. in
      return ..)
    ys
in
let successes xs =
  (* [result] as data *)
  List.length (List.rev_filter_ok xs)
in
..

Using result as sometimes data and sometimes control-flow is the main reason to bend the guidelines about which syntax module to open. E.g., if your function returns (_, _) result Lwt.t but the result is data returned by the function rather than control-flow used within the function, then you should open Lwt_syntax (rather then Lwt_result_syntax).

As a significant aside, note that in OCaml you can also use exceptions for control-flow (with raise and try-with and match-with-exception) and as data (the type exn is an extensible variant data-type).

(** [iter_no_raise f xs] applies [f] to all the elements of [xs]. If [f] raises
    an exception, the iteration continues and [f] is still applies to other
    elements. The function returns pairs of the exceptions raised by [f] along
    the elements of [xs] that triggered these exceptions. *)
let iter_no_raise f xs =
  List.fold_left
    (fun excs x ->
      match f x with
      | exception exc -> exc :: excs
      | () -> excs)
    []
    xs

You can find uses of exception as data within the error monad itself. First, the generic failure functions (error_with, error_with_exn, failwith, and fail_with_exn) are just wrapper around an error which carries an exception (as data).

Second, Lwtreslib provides helpers to catch exceptions. E.g., Result.catch : (unit -> 'a) -> ('a, exn) result calls a function and wraps any raised exception inside an Error constructor.

In depth discussion: pros and cons of result compared to other error management techniques

In Octez, we use result and the specialised tzresult. For this reason, this tutorial is focused on result/tzresult. However, there are other techniques for handling errors. This section compares them briefly.

In general you should use result and tzresult but in some specific cases you can deviate from that. The comparisons below may help you decide.

Exceptions

In exception-based error handling, you raise an exception (via raise) when an error occurs and you catch it (via try-with) to recover. Exceptions are fast because the OCaml compiler and runtime provide the necessary mechanisms directly.

Whether a function can raise an exception or not cannot be determined by its type. This means that it is easy to forget to recover from an exception. An external library may change the set of exceptions that a function raises and you need to update calls to this function, but the type-checker cannot warn you about it. This places a heavy burden on the developer who is responsible for checking the documentation of all the functions they call.

Exception-raising functions should be documented as such using the @raise documentation keyword.

Pros: performance is good, used widely in the larger ecosystem.
Cons: you cannot rely on the type-checker to help you at all, you depend on the quality of the documentation of your external and internal dependencies.

Note that within the protocol, you should not use exceptions at all.

tzresult

With tzresult, errors are carried by the Error constructor of a result. In this way an 'a tzresult represents the result of a computation that normally returns an 'a but may fail.

Because the type of errors is an abstract wrapper (trace) around an extensible variant (error), you can only recover from these errors in a generic way.

Pros: the type of a function indicates if it can fail or not, you cannot forget to check for success/failure.
Cons: you cannot check which error was raised, registration is heavy and complicated.

result

With result, errors are carried by the Error constructor. Each function defines its own type of errors.

Pros: the type of a function indicates if and how it can fail, you cannot forget to check for success/failure, you can check the payload of failures.
Cons: different errors from different functions cannot be used together (need conversions), and* is unusable.

option

With option, errors are represented by the None constructor. Errors are completely void of payload.

Because there are no payloads attached to an error, you should generally treat the error directly at the call site. Otherwise you might lose track of the origin of the failure. E.g., what was not found in the following code fragment?

match
  let ( let* ) = Option.bind in
  let* z = find "zero" in
  let* o = find "one" in
  Some (z, o)
with
| None -> ..
| Some (z, o) -> ..
Pros: the type of a function indicates if it can fail, you cannot forget to check for success/failure.
Cons: a single kind of errors means it cannot be very informative.

fallback

Another approach to errors is to have a default or fallback value. In that case, the function returns a default sensible value when it would raise and exception or return an error. Alternatively, it can take this fallback value as parameter.

(** @raise [Not_found] if argument is [None] *)
val get : 'a option -> 'a

(** returns [default] if argument is [None] *)
val value : default:'a -> 'a option -> 'a
Pros: there is no error.
Cons: doesn’t work for every function, works differently on different functions.