Essential metrics

This page lists the essential Prometheus metrics exposed by Materialize: the ones we recommend building dashboards and alerts on. This list may evolve as we add observability for new features and refine what’s most useful.

The metrics are grouped by the component of Materialize they describe. A grouping is shown only when it has at least one metric. For the complete list of metrics Materialize exposes, see Appendix: Metrics.

Environment-level metrics

Metrics for the SQL control plane: client connections, availability, and the catalog.

| Metric | Description | Labels |
| --- | --- | --- |
| `mz_active_sessions` | The number of active coordinator sessions. | `session_type` |
| `mz_active_subscribes` | The number of active SUBSCRIBE queries. | `session_type` |
| `mz_adapter_commands` | The total number of adapter commands issued of the given type since process start. | `application_name`, `command_type`, `status` |
| `mz_object_info` | Maps catalog object IDs to the object's name, schema, database, and type. Constant 1. | `database_name`, `global_id`, `name`, `object_id`, `schema_name`, `type` |
| `mz_query_total` | The total number of queries issued of the given type since process start. | `session_type`, `statement_type` |
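
Metrics described as totals "since process start", such as mz_adapter_commands and mz_query_total, start over from zero when the process restarts, so dashboards usually chart their rate of change rather than the raw value. The following is a minimal sketch of that calculation from two scrapes. The function name and the simplified reset handling are illustrative assumptions, not part of Materialize or Prometheus.

```python
def counter_rate(prev_value, curr_value, elapsed_seconds):
    """Return the per-second rate of a cumulative counter between two scrapes.

    Hypothetical helper: assumes the counter only ever increases, except
    when the process restarts and the counter starts again from zero.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    if curr_value < prev_value:
        # The counter went backwards, so treat it as a reset and count
        # only what has accumulated since the restart.
        increase = curr_value
    else:
        increase = curr_value - prev_value
    return increase / elapsed_seconds
```

For example, two scrapes of mz_query_total taken 60 seconds apart with values 100 and 160 give a rate of one query per second. In practice a monitoring system's built-in rate function would normally do this work; the sketch only shows what that function accounts for.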

Compute metrics

Metrics for compute objects, such as indexes and materialized views, running on clusters and their replicas.

| Metric | Description | Labels |
| --- | --- | --- |
| `mz_arrangement_maintenance_seconds_total` | The total time spent maintaining arrangements. | `worker_id` |
| `mz_cluster_info` | Maps cluster IDs to the cluster's name and size. Constant 1. | `cluster_id`, `name`, `size` |
| `mz_compute_commands_total` | The total number of compute commands sent. | `command_type`, `instance_id`, `replica_id` |
| `mz_compute_controller_hydration_queue_size` | The size of the compute hydration queue. | `instance_id`, `replica_id` |
| `mz_compute_peek_duration_seconds` | A histogram of peek durations since restart. | `instance_id`, `result` |
| `mz_compute_replica_history_dataflow_count` | The number of dataflows in the replica's command history. | `worker_id` |
| `mz_dataflow_wallclock_lag_seconds` | A summary of the second-by-second lag of the dataflow frontier relative to wallclock time, aggregated over the last minute. | `collection_id`, `instance_id`, `quantile`, `replica_id` |
| `mz_replica_info` | Maps cluster replica IDs to the replica's name and size. Constant 1. | `cluster_id`, `name`, `replica_id`, `size` |
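
The "Constant 1" metrics (mz_cluster_info, mz_replica_info, and similar) carry no measurement of their own; they exist so that human-readable names can be joined onto other metrics that only carry IDs. As a rough sketch of that join, assuming samples have already been parsed into Python dictionaries, the hypothetical helper below attaches a replica's name and size to mz_dataflow_wallclock_lag_seconds samples by matching on replica_id. The function name, the data shapes, and the output label names are all assumptions made for illustration.

```python
def attach_info(samples, info_samples, join_label, info_labels):
    """Copy selected labels from an info metric onto matching samples.

    samples:      list of (labels_dict, value) pairs, e.g. wallclock lag
    info_samples: list of labels_dict from a "Constant 1" info metric
    join_label:   label shared by both metrics, e.g. "replica_id"
    info_labels:  mapping of info label name -> name to use in the output
    """
    # Index the info metric by the join label for constant-time lookups.
    lookup = {}
    for labels in info_samples:
        lookup[labels[join_label]] = {
            out_name: labels[in_name]
            for in_name, out_name in info_labels.items()
        }

    enriched = []
    for labels, value in samples:
        # Samples with no matching info entry are kept unchanged.
        extra = lookup.get(labels.get(join_label), {})
        enriched.append(({**labels, **extra}, value))
    return enriched
```

Calling it with join_label="replica_id" and info_labels={"name": "replica_name", "size": "replica_size"} yields lag samples that can be grouped or displayed by replica name. Renaming the copied labels avoids collisions with labels the target metric may already have.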

Source metrics

Metrics for data ingestion from external systems.

| Metric | Description | Labels |
| --- | --- | --- |
| `mz_dataflow_wallclock_lag_seconds` | A summary of the second-by-second lag of the dataflow frontier relative to wallclock time, aggregated over the last minute. | `collection_id`, `instance_id`, `quantile`, `replica_id` |
| `mz_source_bytes_received` | The number of bytes worth of messages the worker has received from upstream. The way the bytes are counted is source-specific. | `parent_source_id`, `source_id`, `worker_id` |
| `mz_source_info` | Maps user source IDs to the source's type, envelope type, and cluster. Constant 1. | `cluster_id`, `envelope_type`, `source_id`, `type` |
| `mz_source_messages_received` | The number of raw messages the worker has received from upstream. | `parent_source_id`, `source_id`, `worker_id` |
| `mz_source_offset_commit_failures` | A counter of how many times committing offsets has failed for a source. | `source_id` |
| `mz_source_offset_committed` | The total number of _values_ (source-defined unit) that have been fully processed and committed to storage. | `shard_id`, `source_id`, `worker_id` |
| `mz_source_offset_known` | The total number of _values_ (source-defined unit) present in upstream. | `shard_id`, `source_id`, `worker_id` |
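
Because mz_source_offset_known and mz_source_offset_committed share the same label set, one plausible use is to subtract them and watch how far ingestion trails the upstream system. The sketch below assumes both metrics are reported in the same source-defined unit for a given source and that samples have been collected into dictionaries keyed by (source_id, shard_id, worker_id). The function name and data layout are hypothetical, and how to combine values across workers depends on the source type, so that step is deliberately left out.

```python
def offset_backlog(known, committed):
    """Return known-minus-committed values for each matching label set.

    known, committed: dicts mapping (source_id, shard_id, worker_id)
    tuples to the latest sample value of the corresponding metric.
    """
    backlog = {}
    for key, known_value in known.items():
        if key not in committed:
            # No committed sample for this label set yet; skip it rather
            # than guess at a value.
            continue
        # The two metrics may not be sampled at exactly the same instant,
        # so clamp small negative differences to zero.
        backlog[key] = max(known_value - committed[key], 0.0)
    return backlog
```

A backlog that keeps growing across scrapes suggests the source is falling behind, while a value that stays near zero suggests it is keeping up with upstream.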

Sink metrics

Metrics for data output to external systems.

| Metric | Description | Labels |
| --- | --- | --- |
| `mz_dataflow_wallclock_lag_seconds` | A summary of the second-by-second lag of the dataflow frontier relative to wallclock time, aggregated over the last minute. | `collection_id`, `instance_id`, `quantile`, `replica_id` |
| `mz_sink_bytes_committed` | The number of bytes committed to the sink. | `sink_id`, `worker_id` |
| `mz_sink_bytes_staged` | The number of bytes staged but possibly not committed to the sink. | `sink_id`, `worker_id` |
| `mz_sink_iceberg_commit_conflicts` | The number of commit conflicts in the Iceberg sink. | `sink_id`, `worker_id` |
| `mz_sink_iceberg_commit_duration_seconds` | The time spent committing batches to Iceberg, in seconds. | `sink_id`, `worker_id` |
| `mz_sink_iceberg_commit_failures` | The number of commit failures in the Iceberg sink. | `sink_id`, `worker_id` |
| `mz_sink_iceberg_data_files_written` | The number of data files written by the Iceberg sink. | `sink_id`, `worker_id` |
| `mz_sink_iceberg_delete_files_written` | The number of delete files written by the Iceberg sink. | `sink_id`, `worker_id` |
| `mz_sink_iceberg_snapshots_committed` | The number of snapshots committed by the Iceberg sink. | `sink_id`, `worker_id` |
| `mz_sink_info` | Maps user sink IDs to the sink's type, envelope type, and cluster. Constant 1. | `cluster_id`, `envelope_type`, `sink_id`, `type` |
| `mz_sink_rdkafka_connects` | The number of connection attempts across all brokers, including successful attempts, failed attempts, and name resolution failures. | `sink_id` |
| `mz_sink_rdkafka_disconnects` | The number of disconnections across all brokers, whether triggered by the broker, the network, the load balancer, or something else. | `sink_id` |
| `mz_sink_rdkafka_outbuf_msg_cnt` | The number of messages awaiting transmission across all brokers. | `sink_id` |
| `mz_sink_rdkafka_txerrs` | The total number of transmission errors across all brokers. | `sink_id` |
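
To show how one of these metrics might feed an alert, the sketch below parses scraped output in the Prometheus text exposition format and lists the sinks whose mz_sink_rdkafka_outbuf_msg_cnt exceeds a threshold. The parser is intentionally simplified (it ignores timestamps and escaped characters inside label values), and the function names and the threshold are illustrative choices rather than recommendations from Materialize.

```python
import re

# One sample line looks like:  metric_name{label="value",...} 123
# Simplified: no support for escaped quotes inside label values.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Parse exposition text into (name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def sinks_over_threshold(samples, threshold):
    """Return sorted sink IDs whose outbound message queue exceeds threshold."""
    return sorted(
        labels["sink_id"]
        for name, labels, value in samples
        if name == "mz_sink_rdkafka_outbuf_msg_cnt" and value > threshold
    )
```

A queue that stays above the chosen threshold over several scrapes would be a reasonable signal to check mz_sink_rdkafka_txerrs and mz_sink_rdkafka_disconnects for the same sink_id.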