Documenting

The documentation is available online at https://tezos.gitlab.io and is always up to date with the master branch on GitLab.

Building the documentation

You can build the documentation locally or in the CI.

Building the documentation in the CI

When reviewing a merge request (MR) in the GitLab interface, you may build the documentation in the CI without checking out the source branch of the MR, and without installing Python locally. Proceed as follows:

  • trigger the CI if needed in the home page of the MR, and make sure that job documentation:build_all under the CI stage build is being executed

  • once the whole CI is finished, check the built documentation in the exposed artifacts on the home page of the MR

If you cannot wait for the whole CI to finish, the artifacts will not yet be exposed on the home page of the MR. Instead, click on the CI job documentation:build_all and, on the job’s page (once the job has finished), click on Browse to retrieve only the doc artifacts.

In both cases, open the file docs/_build/index.html.

Building the documentation locally

To build the documentation locally, you need to install the Python package manager Poetry. For instructions on how to obtain Python and Poetry, see the installation instructions for the Python environment.

Another pre-requisite for building the documentation is making sure that the Octez sources on your branch are compiled, because part of the documentation is generated by Octez executables. This involves executing make in the parent directory (the repository root). If this step results in errors, you usually have to restart the compiling procedure from make clean onwards.

Once this is done, you can do:

make -C docs

The output is generated and available in docs/_build. It is built by Sphinx, and uses the Read The Docs theme.

OCaml documentation

As part of the above procedure, Odoc is used for OCaml API generation. You can install Odoc with:

opam install odoc

Octez generates the API documentation for all libraries in HTML format. The generated HTML pages are put in _build/<context>/_doc. It creates one sub-directory per public library and generates an index.html file in each sub-directory.

The documentation is not installed on the system by Octez. It is meant to be read locally while developing and then published on the web when releasing packages.

Writing documentation

Online documentation is written in reStructuredText format, also known as RST. reStructuredText is the default plaintext markup language used by Sphinx, which is the tool used to compile this format into plain web pages in HTML format.

For the RST syntax, see the Sphinx RST primer and also the Sphinx extensions below.

Sphinx extensions

Some ad-hoc reference kinds are supported.

  • :package:`name` or :package:`text<name>` points to the odoc page of the package, checking that the page exists

  • :package-name:`name` or :package-name:`text<name>` just displays the package name (no link), checking that the package exists

  • :package-api:`path/to/api-page.html` or :package-api:`text<path/to/api-page.html>` points to an API page generated by odoc, checking that the page exists. The path is relative to the root of the odoc-generated pages (normally, _build/api/odoc/_html). It must start with a package name, optionally followed by a library name, then by a series of nested module names, and ended by a page name, usually index.html. It may optionally be suffixed by a section name, using the standard HTML #section suffix. This role is meant to point to APIs that do not correspond to a whole package (for that case, prefer to use the :package: role).

  • :src:`/path/to/file/or/dir` or :src:`text</path/to/file/or/dir>` points to the GitLab source tree viewer. It is not possible to refer to a particular line in a file using a line number suffix of the form #Lnnn, because such links are usually too fragile to be used in documentation.

  • :opam:`package` or :opam:`text<package>` points to the package page on opam.ocaml.org; a version number is supported (package.version)

  • :gl:`[special gitlab reference]` or :gl:`text <[special gitlab reference]>` expands and links GitLab special references, like for merge requests tezos/tezos!123 (:gl:`tezos/tezos!123`), issues tezos/tezos#999 (:gl:`tezos/tezos#999`) and commits 28309c81 (:gl:`28309c81`). The default project and namespace is tezos/tezos. In other words, tezos/tezos#999, tezos#999 and #999 all refer to the same thing. Currently supports usernames, projects, issues, merge requests, snippets, milestone ids, commits and commit ranges. The implementation of this role is in docs/_extensions/gitlab_custom_role.py.
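As an illustration of how the default project and namespace can be resolved, here is a minimal Python sketch covering issues and merge requests only. It is not the code of docs/_extensions/gitlab_custom_role.py: the function names, the constants, and the URL layout (base URL https://gitlab.com and the /-/issues/ and /-/merge_requests/ routes) are assumptions made for the example.

```python
import re

# Defaults taken from the text above: namespace and project are both "tezos".
DEFAULT_NAMESPACE = "tezos"
DEFAULT_PROJECT = "tezos"
# Assumed base URL and route names; the real role may build links differently.
BASE_URL = "https://gitlab.com"
KINDS = {"!": "merge_requests", "#": "issues"}


def split_role_content(content):
    """Split 'text <target>' into (text, target); a bare target is its own text."""
    match = re.fullmatch(r"(.*?)\s*<([^<>]+)>", content, re.DOTALL)
    if match:
        return match.group(1), match.group(2)
    return content, content


def expand_reference(target):
    """Return a URL for an issue or merge request reference such as '#999'."""
    match = re.fullmatch(r"(?:(?:([\w.-]+)/)?([\w.-]+))?([!#])(\d+)", target)
    if match is None:
        raise ValueError(f"unsupported reference: {target!r}")
    namespace, project, separator, number = match.groups()
    # Missing parts fall back to the defaults, so '#999' means 'tezos/tezos#999'.
    namespace = namespace or DEFAULT_NAMESPACE
    project = project or DEFAULT_PROJECT
    return f"{BASE_URL}/{namespace}/{project}/-/{KINDS[separator]}/{number}"
```

With these assumptions, expand_reference gives the same result for tezos/tezos#999, tezos#999 and #999, which is the behavior described above for the default project and namespace.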

Style guidelines

Currently there are no enforced guidelines about the style in writing documentation. In particular, the choice of American, British, Canadian, … English (alphabetical, non-exhaustive list!) is up to each contributor. So is the capitalization convention of section names, and other typesetting aspects. The focus should be on the contents: on the logical structure of documents, on uniform use of terms, on avoiding inconsistencies between pages, and so on.

However, when adding a new page or modifying an existing one, you should check that your text displays correctly and introduces no new problems. For that, you should build the documentation (by running make in the docs directory), address any new error message, and check the generated pages (docs/_build/index.html) in a browser.

Line breaking

When writing documentation in text formats such as RST, it is not required to respect a maximal line width, such as 80 columns. Therefore, you may choose between the different line breaking policies your text editor proposes. However, you should be aware that file comparison tools such as diff tend to output large differences for a paragraph that has been reformatted after only a small change in one phrase. Also, reviewing tools such as the one in the GitLab user interface associate comments and change suggestions to lines, while these comments and suggestions are usually logically associated with whole phrases.

For such reasons:

  • Some contributors use one line per complete phrase, which makes it easier to suggest rephrasings in GitLab, attached to this (possibly long) line, and which allows diff to isolate modified phrases instead of showing the whole containing paragraph as modified.

  • Other contributors, whose editor breaks lines at a fixed width, introduce an extra line break at the end of each phrase. This also allows diff to isolate modified phrases.

Thus, you may choose your own formatting style, while tolerating different styles from other contributors.
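For contributors who prefer the one-phrase-per-line style, the reflowing can be scripted. The following Python sketch is only an illustration: the function name is hypothetical and the sentence detection is a naive heuristic (a break after ".", "!" or "?" when followed by a space and an uppercase letter), so abbreviations followed by a capitalized word would be split incorrectly.

```python
import re


def one_sentence_per_line(paragraph):
    """Reflow a paragraph so that each sentence sits on its own line."""
    # Collapse existing line breaks and runs of spaces into single spaces.
    text = " ".join(paragraph.split())
    # Naive heuristic: break after sentence-ending punctuation that is
    # followed by a space and an uppercase letter.
    return re.sub(r"(?<=[.!?]) (?=[A-Z])", "\n", text)
```

Since single line breaks inside a paragraph do not change the rendered output in RST, such a reflow only affects how diffs and review comments are presented.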

Writing executable documentation

When you are writing documentation containing executable parts, such as sequences of instructions to install, configure, or launch some tool, there is sometimes a better way than copying those instructions from a terminal (where you supposedly tried them before!) to a documentation page. This better way is to write “executable documentation”. The idea is to write such executable scripts separately from the documentation, and to automatically copy them into the documentation whenever it is (re)generated. Executable documentation allows one to test those scripts, e.g. in CI (continuous integration), ensuring they work and are up to date with the code and with its environment.

Typically, Octez installation scripts not only have to evolve with the Octez codebase, but also with various other evolving resources, such as OPAM packages, package managers, Linux distributions, and so on. By continuously testing such installation scripts, executable documentation allows one to detect problems and fix obsolete instructions as early as possible, avoiding headaches and frustration, for new end users and experienced developers alike.

Technically, executable documentation can be created by using the Sphinx directive literalinclude, which may include whole scripts or parts of them. For example, the following directive includes a script fragment detailing a step in compiling the Octez sources:

.. literalinclude:: compile-sources.sh
  :language: shell
  :start-after: [install packages]
  :end-before: [test executables]

Whenever appropriate, in addition to including the script (fragment) in the documentation as above, make sure it is regularly tested, manually and/or within a CI job.
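To preview which lines a pair of :start-after: and :end-before: markers would select, the selection can be approximated in a few lines of Python. This is a sketch, not Sphinx's own code: the function name is hypothetical, the script content below is invented for the example (the real compile-sources.sh may differ), and a marker that is not found is silently ignored here.

```python
def select_fragment(lines, start_after=None, end_before=None):
    """Approximate the line selection done by :start-after: and :end-before:."""
    selected = list(lines)
    if start_after is not None:
        # Keep only what follows the first line containing the start marker.
        for index, line in enumerate(selected):
            if start_after in line:
                selected = selected[index + 1:]
                break
    if end_before is not None:
        # Keep only what precedes the first remaining line containing the end marker.
        for index, line in enumerate(selected):
            if end_before in line:
                selected = selected[:index]
                break
    return selected


# Hypothetical script content, with the markers written as shell comments.
script = """\
# [install packages]
echo "install step"
# [test executables]
echo "test step"
""".splitlines()

fragment = select_fragment(script, "[install packages]", "[test executables]")
```

Writing the markers as comments keeps the script executable as a whole, so the same file can be run in a test and included piecewise in the documentation.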

Writing protocol documentation

Writing protocol documentation is a special case because protocol-related documentation pages are duplicated for several protocol versions (under directories named as the protocols, e.g., “alpha/”), and possibly also in a protocol-independent part (typically under directory shell/).

Besides the need to maintain several versions of these pages, this duplication introduces the need to carefully handle documentation cross-references, in particular to avoid duplicate labels (i.e., multiple labels with the same name in different pages) and wrong references (i.e., escaping from one protocol version into another).

The following rules promote a systematic way of handling documentation cross-references that avoids introducing such errors.

Definitions

First let us introduce the following definitions:

  • A label is an identifier defining a specific position in a documentation page (typically, before a section name). A reference is a link to a label, in the same or another page. In Sphinx, labels are written .. _label: and references are written :ref:`textual description <label>`, or :ref:`label`. Labels and references are case-insensitive.

  • A versioned label is suffixed by the protocol name (e.g. label_alpha); an unversioned label is not (i.e. just label)

  • A local reference is a link from a protocol-specific page to the same page or to another protocol-specific page. An external reference is a reference from a protocol-independent page to a label in a protocol-specific page.

Rules

The following simple rules are proposed for safely managing cross-references:

  1. In all but the current protocol, any defined label must be versioned:

    .. _<label>_<proto>:
    
  2. In the current protocol, labels may be versioned (as targets of local references), unversioned (as targets of external references), or both. The last case is done by defining two labels for such location:

    .. _<label>:
    .. _<label>_<proto>:
    
  3. Any local reference in protocol <proto> must be versioned <proto>. This includes references appearing in the currently active protocol.

  4. External references must be unversioned.

The rationale of the above rules:

  • Labels defined in protocol-specific pages are versioned to avoid name conflicts (as by definition the containing page is duplicated); only the current protocol may additionally define unversioned labels, as targets of external references.

  • External references must be unversioned to avoid modifying protocol-independent pages when the current protocol is changed.

  • Local references in the current protocol could also work if unversioned, but when the protocol is changed, they should be rewritten as versioned. It is much simpler to enforce the rule that all local references in a page for any protocol <proto> must be versioned <proto>.
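The rules above lend themselves to a mechanical check. The following Python sketch lists the labels and references of a page that lack the expected protocol suffix. It is an illustration under simplifying assumptions (labels alone on their line, references written with the :ref: role) and is not the checking script shipped with the documentation; the function name and the sample page are hypothetical.

```python
import re

# A label definition alone on its line, e.g. ".. _baking_alpha:".
LABEL = re.compile(r"^\.\.[ \t]+_([^:`]+):[ \t]*$", re.MULTILINE)
# A reference, either :ref:`label` or :ref:`text <label>`.
REFERENCE = re.compile(r":ref:`(?:[^`<]*<)?([^`<>]+)>?`")


def unversioned_names(page, proto):
    """Return (labels, references) in an RST page that lack the '_<proto>' suffix."""
    # Labels and references are case-insensitive, so compare in lowercase.
    suffix = "_" + proto.lower()
    labels = [name for name in LABEL.findall(page) if not name.lower().endswith(suffix)]
    refs = [name for name in REFERENCE.findall(page) if not name.lower().endswith(suffix)]
    return labels, refs


# Hypothetical page fragment for protocol "alpha".
page = """\
.. _baking:
.. _baking_alpha:

Baking
See :ref:`the glossary <glossary_alpha>` and :ref:`consensus`.
"""
```

In a protocol-specific page, every reported reference is treated as violating rule 3, whereas a reported label violates rule 1 only if the page does not belong to the current protocol.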

Protocol changes

When a new protocol is adopted, its pages must be “linked” with the protocol-independent pages:

  • remove in the old protocol all the unversioned labels (this operation is unnecessary if the pages of the old protocol are removed altogether)

  • add in the new protocol an unversioned label before each versioned label

NB no rewriting of any reference is needed on protocol changes.
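The second step (adding an unversioned label before each versioned one) can be pictured with the following Python sketch. It is only an illustration of the transformation, with a hypothetical function name and a simplified label syntax; it is not the script provided under docs/scripts, and it does not preserve a trailing newline.

```python
import re


def add_unversioned_labels(page, proto):
    """Insert '.. _<label>:' before each '.. _<label>_<proto>:' line that lacks one."""
    versioned = re.compile(r"^\.\. _(.+)_" + re.escape(proto) + r":[ \t]*$")
    result = []
    for line in page.splitlines():
        match = versioned.match(line)
        if match:
            unversioned = f".. _{match.group(1)}:"
            # Skip the insertion when the unversioned label is already right above.
            if not result or result[-1].strip() != unversioned:
                result.append(unversioned)
        result.append(line)
    return "\n".join(result)
```

Applying the function twice gives the same result as applying it once, so it can be rerun safely on pages that are already linked.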

On creating a new protocol proposal version <proto> out of alpha:

  • rename all versioned labels AND references _alpha in its pages to version _<proto>
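As a rough picture of this renaming, the following Python sketch rewrites the _alpha suffix in label definitions and in :ref: references only, leaving other occurrences of the word untouched. The function name and the protocol name "next" used in the example are hypothetical, and the regular expressions assume the simple label and reference forms given in the definitions above.

```python
import re


def rename_protocol_suffix(page, new_proto, old_proto="alpha"):
    """Rename versioned labels and references from _<old_proto> to _<new_proto>."""
    old = re.escape(old_proto)
    # Label definitions: '.. _name_alpha:' alone on a line.
    page = re.sub(
        rf"^(\.\. _.+)_{old}(:[ \t]*)$",
        rf"\g<1>_{new_proto}\g<2>",
        page,
        flags=re.MULTILINE,
    )
    # References: ':ref:`name_alpha`' and ':ref:`text <name_alpha>`'.
    page = re.sub(rf"(:ref:`[^`]*)_{old}(>?`)", rf"\g<1>_{new_proto}\g<2>", page)
    return page


# Hypothetical page fragment and protocol name.
page = ".. _baking_alpha:\n\nSee :ref:`the glossary <glossary_alpha>` and :ref:`baking_alpha`."
renamed = rename_protocol_suffix(page, "next")
```

Restricting the substitution to labels and references avoids touching ordinary prose that happens to mention alpha.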

Rules automation

To help enforce the above cross-referencing rules in protocol-specific pages, the following scripts are provided under docs/scripts:

  • check_proto_xrefs.py: checks the references, and optionally the labels, in all pages of a given protocol version

    • can be used at any time, e.g. when changing a protocol-specific page

  • add_labels_without_proto.py: adds unversioned labels before each versioned label in a protocol-specific page

    • can be used when a new protocol is adopted, to “link” its documentation into protocol-independent pages

  • remove_labels_without_proto.py: removes unversioned labels in a protocol-specific page

    • can be used when a new protocol is adopted for “unlinking” the pages of the old protocol, only if those pages are not removed altogether

Moreover, the script scripts/snapshot_alpha.sh, used to create a new protocol proposal version <proto> out of alpha, integrates the renaming of labels and references.

Documenting protocols

Due to the duplication of the documentation for multiple protocol versions, the following extra guidelines should be observed.

  • In principle, protocol-independent pages should only refer to the currently active protocol. Indeed, until newer protocols are adopted, there is no guarantee that their features will be part of Tezos someday. Note that there is a symbolic link called active within the documentation folder pointing to the currently active protocol directory. Use it whenever appropriate to avoid introducing hardcoded protocol numbers.

  • When modifying the pages of a given protocol version, you might have to also modify them for later versions. Otherwise, when newer protocols are adopted, your changes will vanish! In particular, when fixing a problem in the documentation of the current protocol (e.g. adding a term in the glossary), you might have to fix it also for the candidate protocol (if there is one under the voting procedure) and for the Alpha protocol under development (assuming that the features of the candidate protocol will be inherited by or proposed in another form in Alpha).

  • As there is a considerable overhead for maintaining protocol-specific pages, think twice before duplicating a page as protocol-specific. Does this page really refer to the protocol? If yes, does the whole page refer to the protocol? If the answer to the last question is “no”, consider splitting the page in two parts, respectively protocol-specific and protocol-independent. This kind of splitting is however inadvisable when there are many local cross-references between the parts; in this case, keeping everything on the same page may avoid introducing many labels (this is why the glossary pages are not split into shell and protocol pages).