Monitor Tool
For instructions on deploying the monitoring tools, see the Monitoring Panel Deployment chapter.
1. Prometheus Mapping Relationship for Monitoring Metrics
For a monitoring metric whose Metric Name is name and whose Tags are K1=V1, ..., Kn=Vn, the following mapping applies, where value is the specific value.
Monitoring Metric Type | Mapping Relationship |
---|---|
Counter | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value |
AutoGauge, Gauge | name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value |
Histogram | name_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.5"} value<br>name{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.99"} value |
Rate | name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m1"} value<br>name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m5"} value<br>name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="m15"} value<br>name_total{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", rate="mean"} value |
Timer | name_seconds_max{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_seconds_sum{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_seconds_count{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn"} value<br>name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.5"} value<br>name_seconds{cluster="clusterName", nodeType="nodeType", nodeId="nodeId", K1="V1", ..., Kn="Vn", quantile="0.99"} value |
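
The mapping in the table can be sketched as a small Python helper that lists the Prometheus series names produced for one metric. This is only an illustration of the table, not IoTDB's implementation; the function names `format_labels` and `prometheus_series` are hypothetical, and the default cluster, nodeType, and nodeId values are the placeholders used in the table.

```python
def format_labels(cluster, node_type, node_id, tags, extra=None):
    """Build the label list: fixed cluster/node labels, then tags, then extras."""
    labels = {"cluster": cluster, "nodeType": node_type, "nodeId": node_id}
    labels.update(tags)
    if extra:
        labels.update(extra)
    return ",".join(f'{key}="{val}"' for key, val in labels.items())


def prometheus_series(metric_type, name, tags,
                      cluster="clusterName", node_type="nodeType", node_id="nodeId"):
    """Return the series (without values) that the table lists for one metric."""

    def series(suffix, extra=None):
        labels = format_labels(cluster, node_type, node_id, tags, extra)
        return f"{name}{suffix}{{{labels}}}"

    kind = metric_type.lower()
    if kind == "counter":
        return [series("_total")]
    if kind in ("gauge", "autogauge"):
        return [series("")]
    if kind in ("histogram", "timer"):
        # Timer follows the Histogram layout with an extra "_seconds" unit suffix.
        base = "_seconds" if kind == "timer" else ""
        return [
            series(base + "_max"),
            series(base + "_sum"),
            series(base + "_count"),
            series(base, {"quantile": "0.5"}),
            series(base, {"quantile": "0.99"}),
        ]
    if kind == "rate":
        rated = [series("_total", {"rate": r}) for r in ("m1", "m5", "m15", "mean")]
        return [series("_total")] + rated
    raise ValueError(f"unknown metric type: {metric_type}")


if __name__ == "__main__":
    for line in prometheus_series("Timer", "name", {"K1": "V1"}):
        print(line)
```

Calling the helper with type Timer yields five series, matching the five entries in the Timer row.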
2. Modifying Configuration Files
- Taking DataNode as an example, modify the iotdb-system.properties configuration file as follows:
dn_metric_reporter_list=PROMETHEUS
dn_metric_level=CORE
dn_metric_prometheus_reporter_port=9091
Start the IoTDB DataNode.
Open a browser or use curl to access http://server_ip:9091/metrics, and you will get metric data as follows:
...
# HELP file_count
# TYPE file_count gauge
file_count{name="wal",} 0.0
file_count{name="unseq",} 0.0
file_count{name="seq",} 2.0
...
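
If you want to inspect this output programmatically rather than in a browser, the following Python sketch parses sample lines like the ones above into (name, labels, value) tuples. It is a minimal parser for the simple lines shown here, not a full implementation of the Prometheus exposition format; the function names `parse_metrics` and `fetch_metrics` are hypothetical, and the URL must be replaced with your own server address and port.

```python
import re
import urllib.request

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# One label pair such as name="wal".
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # skip anything that is not a sample line
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def fetch_metrics(url, timeout=5):
    """Download the metrics page; the URL is an assumption about your deployment."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Replace server_ip with the address of your DataNode.
    for name, labels, value in parse_metrics(fetch_metrics("http://server_ip:9091/metrics")):
        print(name, labels, value)
```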
3. Prometheus + Grafana
As shown above, IoTDB exposes monitoring metrics in the standard Prometheus format. You can use Prometheus to collect and store these metrics and Grafana to visualize them.
The relationship between IoTDB, Prometheus, and Grafana is illustrated below:

- IoTDB continuously collects monitoring metrics during operation.
- Prometheus pulls monitoring metrics from IoTDB's HTTP interface at fixed intervals (configurable).
- Prometheus stores the pulled monitoring metrics in its TSDB.
- Grafana queries monitoring metrics from Prometheus at fixed intervals (configurable) and visualizes them.
As this flow shows, you need to deploy and configure Prometheus and Grafana yourself.
For example, you can configure Prometheus as follows (some parameters can be adjusted as needed) to pull metrics from IoTDB:
job_name: pull-metrics
honor_labels: true
honor_timestamps: true
scrape_interval: 15s
scrape_timeout: 10s
metrics_path: /metrics
scheme: http
follow_redirects: true
static_configs:
  - targets:
      - localhost:9091
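
When several nodes expose metrics, writing the target list by hand becomes repetitive. The Python sketch below generates a scrape job with the same settings as the example above, nested under the top-level `scrape_configs` key of the Prometheus configuration file. The function name `scrape_job` and the host names in the usage section are hypothetical; adjust the interval and timeout to your environment.

```python
def scrape_job(job_name, targets, scrape_interval="15s", scrape_timeout="10s"):
    """Render one Prometheus scrape job as YAML text, one target per node."""
    lines = [
        "scrape_configs:",
        f"  - job_name: {job_name}",
        "    honor_labels: true",
        "    honor_timestamps: true",
        f"    scrape_interval: {scrape_interval}",
        f"    scrape_timeout: {scrape_timeout}",
        "    metrics_path: /metrics",
        "    scheme: http",
        "    follow_redirects: true",
        "    static_configs:",
        "      - targets:",
    ]
    # Each target is host:port of a node's Prometheus reporter.
    lines += [f"          - {target}" for target in targets]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host names; 9091 is the reporter port configured earlier.
    print(scrape_job("pull-metrics", ["datanode-1:9091", "datanode-2:9091"]))
```

The output can be pasted into the Prometheus configuration file or merged with existing jobs under `scrape_configs`.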
For more details, refer to the following documents:
- Prometheus Installation and Usage Documentation
- Prometheus Configuration for Pulling Metrics from HTTP Interface
- Grafana Installation and Usage Documentation
- Grafana Querying Data from Prometheus and Plotting Documentation
4. Apache IoTDB Dashboard
The Apache IoTDB Dashboard is a companion product of IoTDB Enterprise Edition that supports unified, centralized operation and maintenance management. It lets you monitor multiple clusters from a single monitoring panel. You can contact the business team to obtain the Dashboard's JSON file.


4.1 Cluster Overview
You can monitor the following, among other things:
- Total CPU cores, total memory space, total disk space of the cluster.
- Number of ConfigNodes and DataNodes in the cluster.
- Cluster uptime.
- Cluster write speed.
- Current CPU, memory, and disk usage of each node in the cluster.
- Node-specific information.

4.2 Data Writing
You can monitor the following, among other things:
- Average write latency, median latency, 99th percentile latency.
- Number and size of WAL files.
- WAL flush SyncBuffer latency on each node.

4.3 Data Query
You can monitor the following, among other things:
- Per-node query latency for loading time series metadata.
- Per-node query latency for reading time series.
- Per-node query latency for modifying time series metadata.
- Per-node query latency for loading the Chunk metadata list.
- Per-node query latency for modifying Chunk metadata.
- Per-node query latency for filtering by Chunk metadata.
- Per-node average query latency for constructing the Chunk Reader.

4.4 Storage Engine
You can monitor the following, among other things:
- Number and size of files by type.
- Number and size of TsFiles in various stages.
- Number and latency of various tasks.

4.5 System Monitoring
You can monitor the following, among other things:
- System memory, swap memory, process memory.
- Disk space, file count, file size.
- JVM GC time ratio, GC count by type, GC data volume, heap memory usage by generation.
- Network transmission rate, packet sending rate.


