Supported Open Metrics

The Octez node can produce metrics and serve them in the Open Metrics format, an emerging standard for exposing metrics data that is widely used in cloud-based systems.

The Octez node supports the metrics listed below. Each metric is characterized by its name, its type as defined in the Open Metrics specification, a user-friendly description, and a list of labels that can be used to aggregate or query the metric. When a metric has several labels, they are separated by semicolons.

For more information, see the Open Metrics specification: https://openmetrics.io/
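To make the four attributes concrete, here is a minimal Python sketch that reads a text exposition and recovers each metric's type, description, and labelled samples. The sample text is illustrative only: the values and the chain_id label value are invented, and the parser is a simplification that ignores timestamps, exemplars, and escaped characters in label values.

```python
import re

# Illustrative exposition text. The numbers and the chain_id value are made up.
SAMPLE = """\
# HELP octez_validator_chain_head_level Level of the current node's head
# TYPE octez_validator_chain_head_level gauge
octez_validator_chain_head_level{chain_id="NetXexample"} 1234
# HELP ocaml_gc_compactions Number of heap compactions since the program was started.
# TYPE ocaml_gc_compactions counter
ocaml_gc_compactions 7
"""

# name, optional {labels}, then the value
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return (types, helps, samples) extracted from a text exposition."""
    types, helps, samples = {}, {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# TYPE "):
            name, kind = line[len("# TYPE "):].split(None, 1)
            types[name] = kind
        elif line.startswith("# HELP "):
            parts = line[len("# HELP "):].split(None, 1)
            helps[parts[0]] = parts[1] if len(parts) > 1 else ""
        elif line.startswith("#"):
            continue  # other comment lines are skipped
        else:
            match = SAMPLE_RE.match(line)
            if match:
                labels = dict(LABEL_RE.findall(match.group(2) or ""))
                samples.append((match.group(1), labels, float(match.group(3))))
    return types, helps, samples
```

Calling parse_metrics(SAMPLE) yields one gauge sample carrying a chain_id label and one unlabelled counter sample, which mirrors the Name, Type, Description, and Labels columns of the table.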

Name

Type

Description

Labels

ocaml_gc_allocated_bytes

Counter

Total number of bytes allocated since the program was started.

ocaml_gc_compactions

Counter

Number of heap compactions since the program was started.

ocaml_gc_heap_words

Gauge

Total size of the major heap, in words.

ocaml_gc_major_collections

Counter

Number of major collection cycles completed since the program was started.

ocaml_gc_major_words

Counter

Number of words allocated in the major heap since the program was started.

ocaml_gc_minor_collections

Counter

Number of minor collection cycles completed since the program was started.

ocaml_gc_top_heap_words

Counter

Maximum size reached by the major heap, in words.

octez_distributed_db_message_block_header_messages

Counter

Number of block_header messages

action

octez_distributed_db_message_checkpoint_messages

Counter

Number of checkpoint messages

action

octez_distributed_db_message_current_branch_messages

Counter

Number of current_branch messages

action

octez_distributed_db_message_current_head_messages

Counter

Number of current_head messages

action

octez_distributed_db_message_deactivate_messages

Counter

Number of deactivate messages

action

octez_distributed_db_message_get_block_headers_messages

Counter

Number of get_block_headers messages

action

octez_distributed_db_message_get_checkpoint_messages

Counter

Number of get_checkpoint messages

action

octez_distributed_db_message_get_current_branch_messages

Counter

Number of get_current_branch messages

action

octez_distributed_db_message_get_current_head_messages

Counter

Number of get_current_head messages

action

octez_distributed_db_message_get_operations_for_blocks_messages

Counter

Number of get_operations_for_blocks messages

action

octez_distributed_db_message_get_operations_messages

Counter

Number of get_operations messages

action

octez_distributed_db_message_get_predecessor_header_messages

Counter

Number of get_predecessor_header messages

action

octez_distributed_db_message_get_protocol_branch_messages

Counter

Number of get_protocol_branch messages

action

octez_distributed_db_message_get_protocols_messages

Counter

Number of get_protocols messages

action

octez_distributed_db_message_operation_messages

Counter

Number of operation messages

action

octez_distributed_db_message_operations_for_block_messages

Counter

Number of operations_for_block messages

action

octez_distributed_db_message_predecessor_header_messages

Counter

Number of predecessor_header messages

action

octez_distributed_db_message_protocol_branch_messages

Counter

Number of protocol_branch messages

action

octez_distributed_db_message_protocol_messages

Counter

Number of protocol messages

action

octez_distributed_db_requester_table_length

Gauge

Number of entries (to grab) from the network present

requester_kind;entry_type

octez_external_rpc_process_calls

Summary

External RPC endpoint call counts and sum of execution times.

endpoint;method

octez_mempool_pending_branch_delayed

Gauge

Mempool pending branch delayed operations count

octez_mempool_pending_branch_refused

Gauge

Mempool pending branch refused operations count

octez_mempool_pending_outdated

Gauge

Mempool pending outdated operations count

octez_mempool_pending_refused

Gauge

Mempool pending refused operations count

octez_mempool_pending_unprocessed

Gauge

Mempool pending unprocessed operations count

octez_mempool_pending_validated

Gauge

Mempool pending validated operations count

octez_mempool_worker_completion_count

Counter

Number of requests completed by the block validator worker

octez_mempool_worker_error_count

Counter

Number of errors encountered by the block validator worker

octez_mempool_worker_request_count

Counter

Number of requests received by the block validator worker

octez_p2p_connections_active

Gauge

Number of active connections

octez_p2p_connections_incoming

Gauge

Number of incoming connections

octez_p2p_connections_outgoing

Gauge

Number of outgoing connections

octez_p2p_connections_private

Gauge

Number of private connections

octez_p2p_io_scheduler_current_inflow

Gauge

Current incoming data rate

octez_p2p_io_scheduler_current_outflow

Gauge

Current outgoing data rate

octez_p2p_io_scheduler_total_recv

Gauge

Total amount of received data

octez_p2p_io_scheduler_total_sent

Gauge

Total amount of sent data (in bytes)

octez_p2p_messages_advertise_received

Counter

Number of advertise messages received

octez_p2p_messages_advertise_sent

Counter

Number of advertise messages sent

octez_p2p_messages_bootstrap_received

Counter

Number of bootstrap messages received

octez_p2p_messages_bootstrap_sent

Counter

Number of bootstrap messages sent

octez_p2p_messages_broadcast_message_sent

Counter

Number of user messages sent by broadcasting

octez_p2p_messages_swap_ack_received

Counter

Number of swap acks received

octez_p2p_messages_swap_ack_sent

Counter

Number of swap acks sent

octez_p2p_messages_swap_requests_received

Counter

Number of swap requests received

octez_p2p_messages_swap_requests_sent

Counter

Number of swap requests sent

octez_p2p_messages_user_message_received

Counter

Number of user messages received

octez_p2p_messages_user_message_received_error

Counter

Number of user messages received that resulted in an error

octez_p2p_messages_user_message_sent

Counter

Number of user messages sent

octez_p2p_peers_accepted

Gauge

Number of accepted connections

octez_p2p_peers_disconnected

Gauge

Number of disconnected peers

octez_p2p_peers_running

Gauge

Number of running peers

octez_p2p_points_accepted

Gauge

Number of accepted points

octez_p2p_points_disconnected

Gauge

Number of disconnected points

octez_p2p_points_greylisted

Gauge

Number of greylisted points

octez_p2p_points_running

Gauge

Number of running points

octez_p2p_points_trusted

Gauge

Number of trusted points

octez_p2p_swap_fail

Counter

Number of failed swaps

octez_p2p_swap_ignored

Counter

Number of ignored swaps

octez_p2p_swap_success

Counter

Number of successful swaps

octez_rpc_calls

Summary

RPC endpoint call counts and sum of execution times.

endpoint;method

octez_store_alternate_heads_count

Gauge

Current number of known alternate heads

octez_store_caboose_level

Gauge

Current caboose level

octez_store_checkpoint_level

Gauge

Current checkpoint level

octez_store_invalid_blocks

Gauge

Number of blocks known to be invalid stored on disk

octez_store_last_merge_time

Gauge

Time, in seconds, for the completion of the last store merge

octez_store_last_written_block_size

Gauge

Size, in bytes, of the last block written in store

octez_store_maintenance_target

Gauge

The level at which the storage maintenance is expected to be triggered. Set to -1 if no target is set

octez_store_savepoint_level

Gauge

Current savepoint level

octez_validator_block_already_commited_blocks_count

Counter

Number of requests to validate a block already handled

octez_validator_block_already_known_invalid_blocks_count

Counter

Number of requests to validate a block already known as invalid

octez_validator_block_application_errors_after_validation_count

Counter

Number of requests to validate an inapplicable but validated block

octez_validator_block_commit_block_failed_count

Counter

Number of requests that failed to commit a block

octez_validator_block_last_finished_request_completion_timestamp

Gauge

Timestamp at which the latest request handled by the worker was completed

octez_validator_block_last_finished_request_push_timestamp

Gauge

Reception timestamp of the latest request handled by the worker

octez_validator_block_last_finished_request_treatment_timestamp

Gauge

Timestamp at which the worker started processing the latest request it handled

octez_validator_block_operations_per_pass

Gauge

Number of operations per pass for the last validated block

pass_id

octez_validator_block_preapplication_errors_count

Counter

Number of refused application simulations of blocks

octez_validator_block_preapplied_blocks_count

Counter

Number of successful application simulations of blocks

octez_validator_block_validated_blocks_count

Counter

Number of requests to validate a valid block

octez_validator_block_validation_errors_count

Counter

Number of requests to validate an invalid block

octez_validator_block_validation_failed_count

Counter

Number of block validation requests where the validation of a block failed

octez_validator_block_worker_completion_count

Counter

Number of requests completed by the block validator worker

octez_validator_block_worker_error_count

Counter

Number of errors encountered by the block validator worker

octez_validator_block_worker_request_count

Counter

Number of requests received by the block validator worker

octez_validator_chain_branch_switch_count

Counter

Number of times the chain_validator switched branch

chain_id

octez_validator_chain_head_consumed_gas

Gauge

Gas consumed in the current node’s head

chain_id

octez_validator_chain_head_cycle

Gauge

Cycle of the current node’s head

chain_id

octez_validator_chain_head_increment_count

Counter

Number of times the chain_validator incremented its head for a direct successor

chain_id

octez_validator_chain_head_level

Gauge

Level of the current node’s head

chain_id

octez_validator_chain_head_round

Gauge

Round of the current node’s head

chain_id

octez_validator_chain_ignored_head_count

Counter

Number of requests where the chain validator ignored a new valid block with a lower fitness than its current head

chain_id

octez_validator_chain_is_bootstrapped

Gauge

Returns 1 if the node has bootstrapped, 0 otherwise.

chain_id

octez_validator_chain_last_finished_request_completion_timestamp

Gauge

Timestamp at which the latest request handled by the worker was completed

chain_id

octez_validator_chain_last_finished_request_push_timestamp

Gauge

Reception timestamp of the latest request handled by the worker

chain_id

octez_validator_chain_last_finished_request_treatment_timestamp

Gauge

Timestamp at which the worker started processing the latest request it handled

chain_id

octez_validator_chain_synchronisation_status

Gauge

Returns 0 if the node is unsynchronised, 1 if the node is synchronised, 2 if the node is stuck.

chain_id

octez_validator_chain_worker_completion_count

Counter

Number of requests completed by the block validator worker

chain_id

octez_validator_chain_worker_error_count

Counter

Number of errors encountered by the block validator worker

chain_id

octez_validator_chain_worker_request_count

Counter

Number of requests received by the block validator worker

chain_id

octez_validator_peer_connections

Counter

Number of times we connected to a peer.

octez_validator_peer_invalid_block

Counter

Number of times we received an invalid block from a peer.

octez_validator_peer_invalid_locator

Counter

Number of times we received an invalid locator from a peer.

octez_validator_peer_new_branch_completed

Counter

Number of times we successfully completed a new branch request from a peer.

octez_validator_peer_new_head_completed

Counter

Number of times we successfully completed a new head request from a peer.

octez_validator_peer_on_no_request_count

Counter

Number of times we did not hear new messages from a peer since the last timeout.

octez_validator_peer_operations_fetching_canceled_new_branch

Counter

Number of times we canceled the fetching of operations on a new branch request for a peer.

octez_validator_peer_operations_fetching_canceled_new_known_valid_head

Counter

Number of times we canceled the fetching of operations on a new head request for a peer.

octez_validator_peer_operations_fetching_canceled_new_unknown_head

Counter

Number of times we canceled the fetching of operations on a new head request or an unknown head for a peer.

octez_validator_peer_system_error

Counter

Number of times a request triggered a system error from a peer.

octez_validator_peer_too_short_locator

Counter

Number of times we received a too short locator from a peer.

octez_validator_peer_unavailable_protocol

Counter

Number of times we received an unknown protocol from a peer.

octez_validator_peer_unknown_ancestor

Counter

Number of times we received a locator with an unknown ancestor from a peer.

octez_validator_peer_unknown_error

Counter

Number of times an unknown error happened for a peer.

octez_version

Gauge

Node version

version;chain_name;distributed_db_version;p2p_version;commit_hash;commit_date

process_cpu_seconds_total

Counter

Total user and system CPU time spent in seconds.

process_start_time_seconds

Counter

Start time of the process since the Unix epoch, in seconds.

prometheus_logs_messages_total

Counter

Total number of messages logged

level;src

Example

This section describes a typical monitoring setup for Octez developers. For more details on setting up the node for monitoring, see Monitoring an Octez Node.

To instruct the Octez node to produce metrics, pass the option --metrics-addr=<ADDR>:<PORT>. <PORT> is the port on which the integrated Open Metrics server listens (9932 by default), and <ADDR> defaults to localhost. If the option is not supplied at all, no metrics are produced. For example:

octez-node run --metrics-addr=:9091

To query the Open Metrics server, send an HTTP request to the /metrics path on the address and port given to the node. For example:

curl http://<node_addr>:9091/metrics
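The same request can be made from a script. The following Python sketch fetches the metrics text and summarizes two of the gauges listed above, using the value meanings given in their descriptions (is_bootstrapped: 1 or 0; synchronisation_status: 0, 1, or 2). The function names are hypothetical, the URL in the usage comment assumes the node was started with --metrics-addr=:9091 on the local machine, and the line parsing is deliberately simple (it ignores labels and does not handle exemplars).

```python
import urllib.request

# Meaning of octez_validator_chain_synchronisation_status, as documented above.
SYNC_STATUS = {0: "unsynchronised", 1: "synchronised", 2: "stuck"}


def fetch_metrics(url, timeout=5.0):
    """Download the metrics text from the node's /metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def gauge_value(text, name):
    """Return the first sample value for `name` (labels ignored), or None."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "{" in line:
            metric = line.split("{", 1)[0]
            rest = line[line.rindex("}") + 1:]
        else:
            metric, _, rest = line.partition(" ")
        fields = rest.split()
        if metric == name and fields:
            return float(fields[0])  # a trailing timestamp, if any, is ignored
    return None


def describe_node(text):
    """Summarize bootstrap and synchronisation state from the metrics text."""
    boot = gauge_value(text, "octez_validator_chain_is_bootstrapped")
    sync = gauge_value(text, "octez_validator_chain_synchronisation_status")
    return {
        "bootstrapped": None if boot is None else boot == 1,
        "synchronisation": None if sync is None
        else SYNC_STATUS.get(int(sync), "unknown"),
    }

# Usage, assuming a node serving metrics on localhost:9091:
#   print(describe_node(fetch_metrics("http://localhost:9091/metrics")))
```

Keeping the download separate from the parsing makes it possible to run describe_node on saved output from curl as well.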

Collecting metrics

Various third-party tools can be used to query the Octez node and collect metrics from it. Let us illustrate this with the example of a Prometheus server.

Update the Prometheus configuration file (typically prometheus.yml) to add a “scrape job” to its scrape_configs section, which is how Prometheus is made aware of a new data source, using appropriate values for the following fields:

  • job_name: Use a name that is unique among the scrape jobs. All metrics collected through this job automatically get a ‘job’ label with this value.

  • targets: The address and port of the Octez node's metrics server, written as host:port.

- job_name: 'octez-metrics'
  scheme: http
  static_configs:
    - targets: ['localhost:9091']
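Once Prometheus scrapes the node, the collected values can be read back through the Prometheus HTTP API. The sketch below builds an instant-query URL and decodes the JSON response. It assumes a Prometheus server reachable at http://localhost:9090 (the usual default, not something this page specifies), and the helper names are hypothetical. The job label value matches the job_name used in the configuration above.

```python
import json
import urllib.parse


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query


def instant_values(response_body):
    """Return (labels, value) pairs from an instant-query JSON response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % payload.get("error"))
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]

# Usage, assuming Prometheus listens on localhost:9090:
#   import urllib.request
#   url = build_query_url(
#       "http://localhost:9090",
#       'octez_validator_chain_head_level{job="octez-metrics"}',
#   )
#   with urllib.request.urlopen(url) as response:
#       print(instant_values(response.read().decode("utf-8")))
```

Filtering on the job label is what ties the query to the scrape job defined in the configuration file.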