Data Quality
Data Quality
Completeness
Usage
This function is used to calculate the completeness of time series. The input series are divided into several continuous and non overlapping windows. The timestamp of the first data point and the completeness of each window will be output.
Name: COMPLETENESS
Input Series: Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE.
Parameters:
window
: The size of each window. It is a positive integer or a positive number with an unit. The former is the number of data points in each window. The number of data points in the last window may be less than it. The latter is the time of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window.downtime
: Whether the downtime exception is considered in the calculation of completeness. It is 'true' or 'false' (default). When considering the downtime exception, long-term missing data will be considered as downtime exception without any influence on completeness.
Output Series: Output a single series. The type is DOUBLE. The range of each value is [0,1].
Note: Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output.
Examples
Default Parameters
With default parameters, this function will regard all input data as the same window.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s1|
+-----------------------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0|
|2020-01-01T00:00:03.000+08:00| 101.0|
|2020-01-01T00:00:04.000+08:00| 102.0|
|2020-01-01T00:00:06.000+08:00| 104.0|
|2020-01-01T00:00:08.000+08:00| 126.0|
|2020-01-01T00:00:10.000+08:00| 108.0|
|2020-01-01T00:00:14.000+08:00| 112.0|
|2020-01-01T00:00:15.000+08:00| 113.0|
|2020-01-01T00:00:16.000+08:00| 114.0|
|2020-01-01T00:00:18.000+08:00| 116.0|
|2020-01-01T00:00:20.000+08:00| 118.0|
|2020-01-01T00:00:22.000+08:00| 120.0|
|2020-01-01T00:00:26.000+08:00| 124.0|
|2020-01-01T00:00:28.000+08:00| 126.0|
|2020-01-01T00:00:30.000+08:00| NaN|
+-----------------------------+---------------+
SQL for query:
select completeness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30
Output series:
+-----------------------------+-----------------------------+
| Time|completeness(root.test.d1.s1)|
+-----------------------------+-----------------------------+
|2020-01-01T00:00:02.000+08:00| 0.875|
+-----------------------------+-----------------------------+
Specific Window Size
When the window size is given, this function will divide the input data as multiple windows.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s1|
+-----------------------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0|
|2020-01-01T00:00:03.000+08:00| 101.0|
|2020-01-01T00:00:04.000+08:00| 102.0|
|2020-01-01T00:00:06.000+08:00| 104.0|
|2020-01-01T00:00:08.000+08:00| 126.0|
|2020-01-01T00:00:10.000+08:00| 108.0|
|2020-01-01T00:00:14.000+08:00| 112.0|
|2020-01-01T00:00:15.000+08:00| 113.0|
|2020-01-01T00:00:16.000+08:00| 114.0|
|2020-01-01T00:00:18.000+08:00| 116.0|
|2020-01-01T00:00:20.000+08:00| 118.0|
|2020-01-01T00:00:22.000+08:00| 120.0|
|2020-01-01T00:00:26.000+08:00| 124.0|
|2020-01-01T00:00:28.000+08:00| 126.0|
|2020-01-01T00:00:30.000+08:00| NaN|
|2020-01-01T00:00:32.000+08:00| 130.0|
|2020-01-01T00:00:34.000+08:00| 132.0|
|2020-01-01T00:00:36.000+08:00| 134.0|
|2020-01-01T00:00:38.000+08:00| 136.0|
|2020-01-01T00:00:40.000+08:00| 138.0|
|2020-01-01T00:00:42.000+08:00| 140.0|
|2020-01-01T00:00:44.000+08:00| 142.0|
|2020-01-01T00:00:46.000+08:00| 144.0|
|2020-01-01T00:00:48.000+08:00| 146.0|
|2020-01-01T00:00:50.000+08:00| 148.0|
|2020-01-01T00:00:52.000+08:00| 150.0|
|2020-01-01T00:00:54.000+08:00| 152.0|
|2020-01-01T00:00:56.000+08:00| 154.0|
|2020-01-01T00:00:58.000+08:00| 156.0|
|2020-01-01T00:01:00.000+08:00| 158.0|
+-----------------------------+---------------+
SQL for query:
select completeness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00
Output series:
+-----------------------------+--------------------------------------------+
| Time|completeness(root.test.d1.s1, "window"="15")|
+-----------------------------+--------------------------------------------+
|2020-01-01T00:00:02.000+08:00| 0.875|
|2020-01-01T00:00:32.000+08:00| 1.0|
+-----------------------------+--------------------------------------------+
Consistency
Usage
This function is used to calculate the consistency of time series. The input series are divided into several continuous and non overlapping windows. The timestamp of the first data point and the consistency of each window will be output.
Name: CONSISTENCY
Input Series: Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE.
Parameters:
window
: The size of each window. It is a positive integer or a positive number with an unit. The former is the number of data points in each window. The number of data points in the last window may be less than it. The latter is the time of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window.
Output Series: Output a single series. The type is DOUBLE. The range of each value is [0,1].
Note: Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output.
Examples
Default Parameters
With default parameters, this function will regard all input data as the same window.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s1|
+-----------------------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0|
|2020-01-01T00:00:03.000+08:00| 101.0|
|2020-01-01T00:00:04.000+08:00| 102.0|
|2020-01-01T00:00:06.000+08:00| 104.0|
|2020-01-01T00:00:08.000+08:00| 126.0|
|2020-01-01T00:00:10.000+08:00| 108.0|
|2020-01-01T00:00:14.000+08:00| 112.0|
|2020-01-01T00:00:15.000+08:00| 113.0|
|2020-01-01T00:00:16.000+08:00| 114.0|
|2020-01-01T00:00:18.000+08:00| 116.0|
|2020-01-01T00:00:20.000+08:00| 118.0|
|2020-01-01T00:00:22.000+08:00| 120.0|
|2020-01-01T00:00:26.000+08:00| 124.0|
|2020-01-01T00:00:28.000+08:00| 126.0|
|2020-01-01T00:00:30.000+08:00| NaN|
+-----------------------------+---------------+
SQL for query:
select consistency(s1) from root.test.d1 where time <= 2020-01-01 00:00:30
Output series:
+-----------------------------+----------------------------+
| Time|consistency(root.test.d1.s1)|
+-----------------------------+----------------------------+
|2020-01-01T00:00:02.000+08:00| 0.9333333333333333|
+-----------------------------+----------------------------+
Specific Window Size
When the window size is given, this function will divide the input data as multiple windows.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s1|
+-----------------------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0|
|2020-01-01T00:00:03.000+08:00| 101.0|
|2020-01-01T00:00:04.000+08:00| 102.0|
|2020-01-01T00:00:06.000+08:00| 104.0|
|2020-01-01T00:00:08.000+08:00| 126.0|
|2020-01-01T00:00:10.000+08:00| 108.0|
|2020-01-01T00:00:14.000+08:00| 112.0|
|2020-01-01T00:00:15.000+08:00| 113.0|
|2020-01-01T00:00:16.000+08:00| 114.0|
|2020-01-01T00:00:18.000+08:00| 116.0|
|2020-01-01T00:00:20.000+08:00| 118.0|
|2020-01-01T00:00:22.000+08:00| 120.0|
|2020-01-01T00:00:26.000+08:00| 124.0|
|2020-01-01T00:00:28.000+08:00| 126.0|
|2020-01-01T00:00:30.000+08:00| NaN|
|2020-01-01T00:00:32.000+08:00| 130.0|
|2020-01-01T00:00:34.000+08:00| 132.0|
|2020-01-01T00:00:36.000+08:00| 134.0|
|2020-01-01T00:00:38.000+08:00| 136.0|
|2020-01-01T00:00:40.000+08:00| 138.0|
|2020-01-01T00:00:42.000+08:00| 140.0|
|2020-01-01T00:00:44.000+08:00| 142.0|
|2020-01-01T00:00:46.000+08:00| 144.0|
|2020-01-01T00:00:48.000+08:00| 146.0|
|2020-01-01T00:00:50.000+08:00| 148.0|
|2020-01-01T00:00:52.000+08:00| 150.0|
|2020-01-01T00:00:54.000+08:00| 152.0|
|2020-01-01T00:00:56.000+08:00| 154.0|
|2020-01-01T00:00:58.000+08:00| 156.0|
|2020-01-01T00:01:00.000+08:00| 158.0|
+-----------------------------+---------------+
SQL for query:
select consistency(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00
Output series:
+-----------------------------+-------------------------------------------+
| Time|consistency(root.test.d1.s1, "window"="15")|
+-----------------------------+-------------------------------------------+
|2020-01-01T00:00:02.000+08:00| 0.9333333333333333|
|2020-01-01T00:00:32.000+08:00| 1.0|
+-----------------------------+-------------------------------------------+
Timeliness
Usage
This function is used to calculate the timeliness of time series. The input series are divided into several continuous and non overlapping windows. The timestamp of the first data point and the timeliness of each window will be output.
Name: TIMELINESS
Input Series: Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE.
Parameters:
window
: The size of each window. It is a positive integer or a positive number with an unit. The former is the number of data points in each window. The number of data points in the last window may be less than it. The latter is the time of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window.
Output Series: Output a single series. The type is DOUBLE. The range of each value is [0,1].
Note: Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output.
Examples
Default Parameters
With default parameters, this function will regard all input data as the same window.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s1|
+-----------------------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0|
|2020-01-01T00:00:03.000+08:00| 101.0|
|2020-01-01T00:00:04.000+08:00| 102.0|
|2020-01-01T00:00:06.000+08:00| 104.0|
|2020-01-01T00:00:08.000+08:00| 126.0|
|2020-01-01T00:00:10.000+08:00| 108.0|
|2020-01-01T00:00:14.000+08:00| 112.0|
|2020-01-01T00:00:15.000+08:00| 113.0|
|2020-01-01T00:00:16.000+08:00| 114.0|
|2020-01-01T00:00:18.000+08:00| 116.0|
|2020-01-01T00:00:20.000+08:00| 118.0|
|2020-01-01T00:00:22.000+08:00| 120.0|
|2020-01-01T00:00:26.000+08:00| 124.0|
|2020-01-01T00:00:28.000+08:00| 126.0|
|2020-01-01T00:00:30.000+08:00| NaN|
+-----------------------------+---------------+
SQL for query:
select timeliness(s1) from root.test.d1 where time <= 2020-01-01 00:00:30
Output series:
+-----------------------------+---------------------------+
| Time|timeliness(root.test.d1.s1)|
+-----------------------------+---------------------------+
|2020-01-01T00:00:02.000+08:00| 0.9333333333333333|
+-----------------------------+---------------------------+
Specific Window Size
When the window size is given, this function will divide the input data as multiple windows.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s1|
+-----------------------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0|
|2020-01-01T00:00:03.000+08:00| 101.0|
|2020-01-01T00:00:04.000+08:00| 102.0|
|2020-01-01T00:00:06.000+08:00| 104.0|
|2020-01-01T00:00:08.000+08:00| 126.0|
|2020-01-01T00:00:10.000+08:00| 108.0|
|2020-01-01T00:00:14.000+08:00| 112.0|
|2020-01-01T00:00:15.000+08:00| 113.0|
|2020-01-01T00:00:16.000+08:00| 114.0|
|2020-01-01T00:00:18.000+08:00| 116.0|
|2020-01-01T00:00:20.000+08:00| 118.0|
|2020-01-01T00:00:22.000+08:00| 120.0|
|2020-01-01T00:00:26.000+08:00| 124.0|
|2020-01-01T00:00:28.000+08:00| 126.0|
|2020-01-01T00:00:30.000+08:00| NaN|
|2020-01-01T00:00:32.000+08:00| 130.0|
|2020-01-01T00:00:34.000+08:00| 132.0|
|2020-01-01T00:00:36.000+08:00| 134.0|
|2020-01-01T00:00:38.000+08:00| 136.0|
|2020-01-01T00:00:40.000+08:00| 138.0|
|2020-01-01T00:00:42.000+08:00| 140.0|
|2020-01-01T00:00:44.000+08:00| 142.0|
|2020-01-01T00:00:46.000+08:00| 144.0|
|2020-01-01T00:00:48.000+08:00| 146.0|
|2020-01-01T00:00:50.000+08:00| 148.0|
|2020-01-01T00:00:52.000+08:00| 150.0|
|2020-01-01T00:00:54.000+08:00| 152.0|
|2020-01-01T00:00:56.000+08:00| 154.0|
|2020-01-01T00:00:58.000+08:00| 156.0|
|2020-01-01T00:01:00.000+08:00| 158.0|
+-----------------------------+---------------+
SQL for query:
select timeliness(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00
Output series:
+-----------------------------+------------------------------------------+
| Time|timeliness(root.test.d1.s1, "window"="15")|
+-----------------------------+------------------------------------------+
|2020-01-01T00:00:02.000+08:00| 0.9333333333333333|
|2020-01-01T00:00:32.000+08:00| 1.0|
+-----------------------------+------------------------------------------+
Validity
Usage
This function is used to calculate the Validity of time series. The input series are divided into several continuous and non overlapping windows. The timestamp of the first data point and the Validity of each window will be output.
Name: VALIDITY
Input Series: Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE.
Parameters:
window
: The size of each window. It is a positive integer or a positive number with an unit. The former is the number of data points in each window. The number of data points in the last window may be less than it. The latter is the time of the window. The unit is 'ms' for millisecond, 's' for second, 'm' for minute, 'h' for hour and 'd' for day. By default, all input data belongs to the same window.
Output Series: Output a single series. The type is DOUBLE. The range of each value is [0,1].
Note: Only when the number of data points in the window exceeds 10, the calculation will be performed. Otherwise, the window will be ignored and nothing will be output.
Examples
Default Parameters
With default parameters, this function will regard all input data as the same window.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s1|
+-----------------------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0|
|2020-01-01T00:00:03.000+08:00| 101.0|
|2020-01-01T00:00:04.000+08:00| 102.0|
|2020-01-01T00:00:06.000+08:00| 104.0|
|2020-01-01T00:00:08.000+08:00| 126.0|
|2020-01-01T00:00:10.000+08:00| 108.0|
|2020-01-01T00:00:14.000+08:00| 112.0|
|2020-01-01T00:00:15.000+08:00| 113.0|
|2020-01-01T00:00:16.000+08:00| 114.0|
|2020-01-01T00:00:18.000+08:00| 116.0|
|2020-01-01T00:00:20.000+08:00| 118.0|
|2020-01-01T00:00:22.000+08:00| 120.0|
|2020-01-01T00:00:26.000+08:00| 124.0|
|2020-01-01T00:00:28.000+08:00| 126.0|
|2020-01-01T00:00:30.000+08:00| NaN|
+-----------------------------+---------------+
SQL for query:
select Validity(s1) from root.test.d1 where time <= 2020-01-01 00:00:30
Output series:
+-----------------------------+-------------------------+
| Time|validity(root.test.d1.s1)|
+-----------------------------+-------------------------+
|2020-01-01T00:00:02.000+08:00| 0.8833333333333333|
+-----------------------------+-------------------------+
Specific Window Size
When the window size is given, this function will divide the input data as multiple windows.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s1|
+-----------------------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0|
|2020-01-01T00:00:03.000+08:00| 101.0|
|2020-01-01T00:00:04.000+08:00| 102.0|
|2020-01-01T00:00:06.000+08:00| 104.0|
|2020-01-01T00:00:08.000+08:00| 126.0|
|2020-01-01T00:00:10.000+08:00| 108.0|
|2020-01-01T00:00:14.000+08:00| 112.0|
|2020-01-01T00:00:15.000+08:00| 113.0|
|2020-01-01T00:00:16.000+08:00| 114.0|
|2020-01-01T00:00:18.000+08:00| 116.0|
|2020-01-01T00:00:20.000+08:00| 118.0|
|2020-01-01T00:00:22.000+08:00| 120.0|
|2020-01-01T00:00:26.000+08:00| 124.0|
|2020-01-01T00:00:28.000+08:00| 126.0|
|2020-01-01T00:00:30.000+08:00| NaN|
|2020-01-01T00:00:32.000+08:00| 130.0|
|2020-01-01T00:00:34.000+08:00| 132.0|
|2020-01-01T00:00:36.000+08:00| 134.0|
|2020-01-01T00:00:38.000+08:00| 136.0|
|2020-01-01T00:00:40.000+08:00| 138.0|
|2020-01-01T00:00:42.000+08:00| 140.0|
|2020-01-01T00:00:44.000+08:00| 142.0|
|2020-01-01T00:00:46.000+08:00| 144.0|
|2020-01-01T00:00:48.000+08:00| 146.0|
|2020-01-01T00:00:50.000+08:00| 148.0|
|2020-01-01T00:00:52.000+08:00| 150.0|
|2020-01-01T00:00:54.000+08:00| 152.0|
|2020-01-01T00:00:56.000+08:00| 154.0|
|2020-01-01T00:00:58.000+08:00| 156.0|
|2020-01-01T00:01:00.000+08:00| 158.0|
+-----------------------------+---------------+
SQL for query:
select Validity(s1,"window"="15") from root.test.d1 where time <= 2020-01-01 00:01:00
Output series:
+-----------------------------+----------------------------------------+
| Time|validity(root.test.d1.s1, "window"="15")|
+-----------------------------+----------------------------------------+
|2020-01-01T00:00:02.000+08:00| 0.8833333333333333|
|2020-01-01T00:00:32.000+08:00| 1.0|
+-----------------------------+----------------------------------------+