Monitor Tool

2/21/25About 2 min

Monitor Tool

1. Prometheus Mapping Relationship for Monitoring Metrics

For a monitoring metric with Metric Name as name, Tags as K1=V1, ..., Kn=Vn, the following mapping applies, where value is the specific value.

Monitoring Metric Type	Mapping Relationship
Counter	`name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value`
AutoGauge, Gauge	`name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value`
Histogram	`name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value` `name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value` `name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value` `name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value` `name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value`
Rate	`name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value` `name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m1"} value` `name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m5"} value` `name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="m15"} value` `name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", rate="mean"} value`
Timer	`name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value` `name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value` `name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn"} value` `name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.5"} value` `name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId",k1="V1" , ..., Kn="Vn", quantile="0.99"} value`

2. Modifying Configuration Files

Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows:

dn_metric_reporter_list=PROMETHEUS
dn_metric_level=CORE
dn_metric_prometheus_reporter_port=9091

Start the IoTDB DataNode.
Open a browser or use curl to access http://server_ip:9091/metrics, and you will get metric data as follows:

...
# HELP file_count
# TYPE file_count gauge
file_count{name="wal",} 0.0
file_count{name="unseq",} 0.0
file_count{name="seq",} 2.0
...

3. Prometheus + Grafana

As shown above, IoTDB exposes monitoring metrics in the standard Prometheus format. You can use Prometheus to collect and store these metrics and Grafana to visualize them.

The relationship between IoTDB, Prometheus, and Grafana is illustrated below:

IoTDB continuously collects monitoring metrics during operation.
Prometheus pulls monitoring metrics from IoTDB's HTTP interface at fixed intervals (configurable).
Prometheus stores the pulled monitoring metrics in its TSDB.
Grafana queries monitoring metrics from Prometheus at fixed intervals (configurable) and visualizes them.

From the interaction flow, it is clear that additional work is required to deploy and configure Prometheus and Grafana.

For example, you can configure Prometheus as follows (some parameters can be adjusted as needed) to pull metrics from IoTDB:

job_name: pull-metrics
honor_labels: true
honor_timestamps: true
scrape_interval: 15s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
static_configs:
  - targets:
      - localhost:9091

For more details, refer to the following documents: