Skip to main content

Function-and-Expression

...About 52 min

Function-and-Expression

Arithmetic Operators and Functions

Arithmetic Operators

Unary Arithmetic Operators

Supported operators: +, -

Supported input data types: INT32, INT64 and FLOAT

Output data type: consistent with the input data type

Binary Arithmetic Operators

Supported operators: +, -, *, /, %

Supported input data types: INT32, INT64, FLOAT and DOUBLE

Output data type: DOUBLE

Note: Only when the left operand and the right operand under a certain timestamp are not null, the binary arithmetic operation will have an output value.

Example

select s1, - s1, s2, + s2, s1 + s2, s1 - s2, s1 * s2, s1 / s2, s1 % s2 from root.sg.d1

Result:

+-----------------------------+-------------+--------------+-------------+-------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+
|                         Time|root.sg.d1.s1|-root.sg.d1.s1|root.sg.d1.s2|root.sg.d1.s2|root.sg.d1.s1 + root.sg.d1.s2|root.sg.d1.s1 - root.sg.d1.s2|root.sg.d1.s1 * root.sg.d1.s2|root.sg.d1.s1 / root.sg.d1.s2|root.sg.d1.s1 % root.sg.d1.s2|
+-----------------------------+-------------+--------------+-------------+-------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+
|1970-01-01T08:00:00.001+08:00|          1.0|          -1.0|          1.0|          1.0|                          2.0|                          0.0|                          1.0|                          1.0|                          0.0|
|1970-01-01T08:00:00.002+08:00|          2.0|          -2.0|          2.0|          2.0|                          4.0|                          0.0|                          4.0|                          1.0|                          0.0|
|1970-01-01T08:00:00.003+08:00|          3.0|          -3.0|          3.0|          3.0|                          6.0|                          0.0|                          9.0|                          1.0|                          0.0|
|1970-01-01T08:00:00.004+08:00|          4.0|          -4.0|          4.0|          4.0|                          8.0|                          0.0|                         16.0|                          1.0|                          0.0|
|1970-01-01T08:00:00.005+08:00|          5.0|          -5.0|          5.0|          5.0|                         10.0|                          0.0|                         25.0|                          1.0|                          0.0|
+-----------------------------+-------------+--------------+-------------+-------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+
Total line number = 5
It costs 0.014s

Arithmetic Functions

Currently, IoTDB supports the following mathematical functions. The behavior of these mathematical functions is consistent with the behavior of these functions in the Java Math standard library.

Function NameAllowed Input Series Data TypesOutput Series Data TypeRequired AttributesCorresponding Implementation in the Java Standard Library
SININT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#sin(double)
COSINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#cos(double)
TANINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#tan(double)
ASININT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#asin(double)
ACOSINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#acos(double)
ATANINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#atan(double)
SINHINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#sinh(double)
COSHINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#cosh(double)
TANHINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#tanh(double)
DEGREESINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#toDegrees(double)
RADIANSINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#toRadians(double)
ABSINT32 / INT64 / FLOAT / DOUBLESame type as the input series/Math#abs(int) / Math#abs(long) /Math#abs(float) /Math#abs(double)
SIGNINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#signum(double)
CEILINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#ceil(double)
FLOORINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#floor(double)
ROUNDINT32 / INT64 / FLOAT / DOUBLEDOUBLE'places' : Round the significant number, positive number is the significant number after the decimal point, negative number is the significant number of whole numberMath#rint(Math#pow(10,places))/Math#pow(10,places)
EXPINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#exp(double)
LNINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#log(double)
LOG10INT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#log10(double)
SQRTINT32 / INT64 / FLOAT / DOUBLEDOUBLE/Math#sqrt(double)

Example:

select s1, sin(s1), cos(s1), tan(s1) from root.sg1.d1 limit 5 offset 1000;

Result:

+-----------------------------+-------------------+-------------------+--------------------+-------------------+
|                         Time|     root.sg1.d1.s1|sin(root.sg1.d1.s1)| cos(root.sg1.d1.s1)|tan(root.sg1.d1.s1)|
+-----------------------------+-------------------+-------------------+--------------------+-------------------+
|2020-12-10T17:11:49.037+08:00|7360723084922759782| 0.8133527237573284|  0.5817708713544664| 1.3980636773094157|
|2020-12-10T17:11:49.038+08:00|4377791063319964531|-0.8938962705202537|  0.4482738644511651| -1.994085181866842|
|2020-12-10T17:11:49.039+08:00|7972485567734642915| 0.9627757585308978|-0.27030138509681073|-3.5618602479083545|
|2020-12-10T17:11:49.040+08:00|2508858212791964081|-0.6073417341629443| -0.7944406950452296| 0.7644897069734913|
|2020-12-10T17:11:49.041+08:00|2817297431185141819|-0.8419358900502509| -0.5395775727782725| 1.5603611649667768|
+-----------------------------+-------------------+-------------------+--------------------+-------------------+
Total line number = 5
It costs 0.008s

ROUND

Example:

select s4,round(s4),round(s4,2),round(s4,-1) from root.sg1.d1 
+-----------------------------+-------------+--------------------+----------------------+-----------------------+
|                         Time|root.db.d1.s4|ROUND(root.db.d1.s4)|ROUND(root.db.d1.s4,2)|ROUND(root.db.d1.s4,-1)|
+-----------------------------+-------------+--------------------+----------------------+-----------------------+
|1970-01-01T08:00:00.001+08:00|    101.14345|               101.0|                101.14|                  100.0|
|1970-01-01T08:00:00.002+08:00|    20.144346|                20.0|                 20.14|                   20.0|
|1970-01-01T08:00:00.003+08:00|    20.614372|                21.0|                 20.61|                   20.0|
|1970-01-01T08:00:00.005+08:00|    20.814346|                21.0|                 20.81|                   20.0|
|1970-01-01T08:00:00.006+08:00|     60.71443|                61.0|                 60.71|                   60.0|
|2023-03-13T16:16:19.764+08:00|    10.143425|                10.0|                 10.14|                   10.0|
+-----------------------------+-------------+--------------------+----------------------+-----------------------+
Total line number = 6
It costs 0.059s

Comparison Operators and Functions

Basic comparison operators

Supported operators >, >=, <, <=, ==, != (or <> )

Supported input data types: INT32, INT64, FLOAT and DOUBLE

Note: It will transform all type to DOUBLE then do computation.

Output data type: BOOLEAN

Example:

select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test;
IoTDB> select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test;
+-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+
|                         Time|root.test.a|root.test.b|root.test.a > 10|root.test.a <= root.test.b|!root.test.a <= root.test.b|(root.test.a > 10) & (root.test.a > root.test.b)|
+-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|         23|       10.0|            true|                     false|                       true|                                            true|
|1970-01-01T08:00:00.002+08:00|         33|       21.0|            true|                     false|                       true|                                            true|
|1970-01-01T08:00:00.004+08:00|         13|       15.0|            true|                      true|                      false|                                           false|
|1970-01-01T08:00:00.005+08:00|         26|        0.0|            true|                     false|                       true|                                            true|
|1970-01-01T08:00:00.008+08:00|          1|       22.0|           false|                      true|                      false|                                           false|
|1970-01-01T08:00:00.010+08:00|         23|       12.0|            true|                     false|                       true|                                            true|
+-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+

BETWEEN ... AND ... operator

operatormeaning
BETWEEN ... AND ...within the specified range
NOT BETWEEN ... AND ...Not within the specified range

Example: Select data within or outside the interval [36.5,40]:

select temperature from root.sg1.d1 where temperature between 36.5 and 40;
select temperature from root.sg1.d1 where temperature not between 36.5 and 40;

Fuzzy matching operator

For TEXT type data, support fuzzy matching of data using Like and Regexp operators.

operatormeaning
LIKEmatches simple patterns
NOT LIKEcannot match simple pattern
REGEXPMatch regular expression
NOT REGEXPCannot match regular expression

Input data type: TEXT

Return type: BOOLEAN

Use Like for fuzzy matching

Matching rules:

  • % means any 0 or more characters.
  • _ means any single character.

Example 1: Query the data under root.sg.d1 that contains 'cc' in value.

IoTDB> select * from root.sg.d1 where value like '%cc%'
+--------------------------+----------------+
| Time|root.sg.d1.value|
+--------------------------+----------------+
|2017-11-01T00:00:00.000+08:00| aabbccdd|
|2017-11-01T00:00:01.000+08:00| cc|
+--------------------------+----------------+
Total line number = 2
It costs 0.002s

Example 2: Query the data under root.sg.d1 with 'b' in the middle of value and any single character before and after.

IoTDB> select * from root.sg.device where value like '_b_'
+--------------------------+----------------+
| Time|root.sg.d1.value|
+--------------------------+----------------+
|2017-11-01T00:00:02.000+08:00|abc|
+--------------------------+----------------+
Total line number = 1
It costs 0.002s

Use Regexp for fuzzy matching

The filter condition that needs to be passed in is Java standard library style regular expression.

Common regular matching examples:

All characters with a length of 3-20: ^.{3,20}$
Uppercase English characters: ^[A-Z]+$
Numbers and English characters: ^[A-Za-z0-9]+$
Starting with a: ^a.*

Example 1: Query the string of 26 English characters for value under root.sg.d1.

IoTDB> select * from root.sg.d1 where value regexp '^[A-Za-z]+$'
+--------------------------+----------------+
| Time|root.sg.d1.value|
+--------------------------+----------------+
|2017-11-01T00:00:00.000+08:00| aabbccdd|
|2017-11-01T00:00:01.000+08:00| cc|
+--------------------------+----------------+
Total line number = 2
It costs 0.002s

Example 2: Query root.sg.d1 where the value is a string consisting of 26 lowercase English characters and the time is greater than 100.

IoTDB> select * from root.sg.d1 where value regexp '^[a-z]+$' and time > 100
+--------------------------+----------------+
| Time|root.sg.d1.value|
+--------------------------+----------------+
|2017-11-01T00:00:00.000+08:00| aabbccdd|
|2017-11-01T00:00:01.000+08:00| cc|
+--------------------------+----------------+
Total line number = 2
It costs 0.002s

Example 3:

select b, b like '1%', b regexp '[0-2]' from root.test;

operation result

+-----------------------------+-----------+------- ------------------+--------------------------+
| Time|root.test.b|root.test.b LIKE '^1.*?$'|root.test.b REGEXP '[0-2]'|
+-----------------------------+-----------+------- ------------------+--------------------------+
|1970-01-01T08:00:00.001+08:00| 111test111| true| true|
|1970-01-01T08:00:00.003+08:00| 333test333| false| false|
+-----------------------------+-----------+------- ------------------+--------------------------+

IS NULL operator

operatormeaning
IS NULLis a null value
IS NOT NULLis not a null value

Example 1: Select data with empty values:

select code from root.sg1.d1 where temperature is null;

Example 2: Select data with non-null values:

select code from root.sg1.d1 where temperature is not null;

IN operator

operatormeaning
IN / CONTAINSare the values ​​in the specified list
NOT IN / NOT CONTAINSnot a value in the specified list

Input data type: All Types

return type BOOLEAN

**Note: Please ensure that the values ​​in the collection can be converted to the type of the input data. **

For example:

s1 in (1, 2, 3, 'test'), the data type of s1 is INT32

We will throw an exception because 'test' cannot be converted to type INT32

Example 1: Select data with values ​​within a certain range:

select code from root.sg1.d1 where code in ('200', '300', '400', '500');

Example 2: Select data with values ​​outside a certain range:

select code from root.sg1.d1 where code not in ('200', '300', '400', '500');

Example 3:

select a, a in (1, 2) from root.test;

Output 2:

+-----------------------------+-----------+------- -------------+
| Time|root.test.a|root.test.a IN (1,2)|
+-----------------------------+-----------+------- -------------+
|1970-01-01T08:00:00.001+08:00| 1| true|
|1970-01-01T08:00:00.003+08:00| 3| false|
+-----------------------------+-----------+------- -------------+

Condition Functions

Condition functions are used to check whether timeseries data points satisfy some specific condition.

They return BOOLEANs.

Currently, IoTDB supports the following condition functions:

Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
ON_OFFINT32 / INT64 / FLOAT / DOUBLEthreshold: a double type variateBOOLEANReturn ts_value >= threshold.
IN_RANGRINT32 / INT64 / FLOAT / DOUBLElower: DOUBLE type
upper: DOUBLE type
BOOLEANReturn ts_value >= lower && value <= upper.

Example Data:

IoTDB> select ts from root.test;
+-----------------------------+------------+
|                         Time|root.test.ts|
+-----------------------------+------------+
|1970-01-01T08:00:00.001+08:00|           1|
|1970-01-01T08:00:00.002+08:00|           2|
|1970-01-01T08:00:00.003+08:00|           3|
|1970-01-01T08:00:00.004+08:00|           4|
+-----------------------------+------------+

Test 1

SQL:

select ts, on_off(ts, 'threshold'='2') from root.test;

Output:

IoTDB> select ts, on_off(ts, 'threshold'='2') from root.test;
+-----------------------------+------------+-------------------------------------+
|                         Time|root.test.ts|on_off(root.test.ts, "threshold"="2")|
+-----------------------------+------------+-------------------------------------+
|1970-01-01T08:00:00.001+08:00|           1|                                false|
|1970-01-01T08:00:00.002+08:00|           2|                                 true|
|1970-01-01T08:00:00.003+08:00|           3|                                 true|
|1970-01-01T08:00:00.004+08:00|           4|                                 true|
+-----------------------------+------------+-------------------------------------+

Test 2

Sql:

select ts, in_range(ts, 'lower'='2', 'upper'='3.1') from root.test;

Output:

IoTDB> select ts, in_range(ts,'lower'='2', 'upper'='3.1') from root.test;
+-----------------------------+------------+--------------------------------------------------+
|                         Time|root.test.ts|in_range(root.test.ts, "lower"="2", "upper"="3.1")|
+-----------------------------+------------+--------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|           1|                                             false|
|1970-01-01T08:00:00.002+08:00|           2|                                              true|
|1970-01-01T08:00:00.003+08:00|           3|                                              true|
|1970-01-01T08:00:00.004+08:00|           4|                                             false|
+-----------------------------+------------+--------------------------------------------------+

Logical Operators

Unary Logical Operators

Supported operator !

Supported input data types: BOOLEAN

Output data type: BOOLEAN

Hint: the priority of ! is the same as -. Remember to use brackets to modify priority.

Binary Logical Operators

Supported operators AND:and,&, &&; OR:or,|,||

Supported input data types: BOOLEAN

Output data type: BOOLEAN

Note: Only when the left operand and the right operand under a certain timestamp are both BOOLEAN type, the binary logic operation will have an output value.

Example:

select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test;

Output:

IoTDB> select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test;
+-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+
|                         Time|root.test.a|root.test.b|root.test.a > 10|root.test.a <= root.test.b|!root.test.a <= root.test.b|(root.test.a > 10) & (root.test.a > root.test.b)|
+-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|         23|       10.0|            true|                     false|                       true|                                            true|
|1970-01-01T08:00:00.002+08:00|         33|       21.0|            true|                     false|                       true|                                            true|
|1970-01-01T08:00:00.004+08:00|         13|       15.0|            true|                      true|                      false|                                           false|
|1970-01-01T08:00:00.005+08:00|         26|        0.0|            true|                     false|                       true|                                            true|
|1970-01-01T08:00:00.008+08:00|          1|       22.0|           false|                      true|                      false|                                           false|
|1970-01-01T08:00:00.010+08:00|         23|       12.0|            true|                     false|                       true|                                            true|
+-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+

Aggregate Functions

Aggregate functions are many-to-one functions. They perform aggregate calculations on a set of values, resulting in a single aggregated result.

All aggregate functions except COUNT(), COUNT_IF() ignore null values and return null when there are no input rows or all values are null. For example, SUM() returns null instead of zero, and AVG() does not include null values in the count.

The aggregate functions supported by IoTDB are as follows:

Function NameDescriptionAllowed Input Series Data TypesRequired AttributesOutput Series Data Type
SUMSummation.INT32 INT64 FLOAT DOUBLE/DOUBLE
COUNTCounts the number of data points.All data types/INT
AVGAverage.INT32 INT64 FLOAT DOUBLE/DOUBLE
EXTREMEFinds the value with the largest absolute value. Returns a positive value if the maximum absolute value of positive and negative values is equal.INT32 INT64 FLOAT DOUBLE/Consistent with the input data type
MAX_VALUEFind the maximum value.INT32 INT64 FLOAT DOUBLE/Consistent with the input data type
MIN_VALUEFind the minimum value.INT32 INT64 FLOAT DOUBLE/Consistent with the input data type
FIRST_VALUEFind the value with the smallest timestamp.All data types/Consistent with input data type
LAST_VALUEFind the value with the largest timestamp.All data types/Consistent with input data type
MAX_TIMEFind the maximum timestamp.All data Types/Timestamp
MIN_TIMEFind the minimum timestamp.All data Types/Timestamp
COUNT_IFFind the number of data points that continuously meet a given condition and the number of data points that meet the condition (represented by keep) meet the specified threshold.BOOLEAN[keep >=/>/=/!=/</<=]threshold:The specified threshold or threshold condition, it is equivalent to keep >= threshold if threshold is used alone, type of threshold is INT64 ignoreNull:Optional, default value is true;If the value is true, null values are ignored, it means that if there is a null value in the middle, the value is ignored without interrupting the continuity. If the value is true, null values are not ignored, it means that if there are null values in the middle, continuity will be brokenINT64
TIME_DURATIONFind the difference between the timestamp of the largest non-null value and the timestamp of the smallest non-null value in a columnAll data Types/INT64
MODEFind the mode. Note: 1.Having too many different values in the input series risks a memory exception; 2.If all the elements have the same number of occurrences, that is no Mode, return the value with earliest time; 3.If there are many Modes, return the Mode with earliest time.All data Types/Consistent with the input data type
COUNT_TIMEThe number of timestamps in the query data set. When used with align by device, the result is the number of timestamps in the data set per device.All data Types, the input parameter can only be */INT64

COUNT

example

select count(status) from root.ln.wf01.wt01;

Result:

+-------------------------------+
|count(root.ln.wf01.wt01.status)|
+-------------------------------+
|                          10080|
+-------------------------------+
Total line number = 1
It costs 0.016s

COUNT_IF

Grammar

count_if(predicate, [keep >=/>/=/!=/</<=]threshold[, 'ignoreNull'='true/false'])

predicate: legal expression with BOOLEAN return type

use of threshold and ignoreNull can see above table

Note: count_if is not supported to use with SlidingWindow in group by time now

example

raw data
+-----------------------------+-------------+-------------+
|                         Time|root.db.d1.s1|root.db.d1.s2|
+-----------------------------+-------------+-------------+
|1970-01-01T08:00:00.001+08:00|            0|            0|
|1970-01-01T08:00:00.002+08:00|         null|            0|
|1970-01-01T08:00:00.003+08:00|            0|            0|
|1970-01-01T08:00:00.004+08:00|            0|            0|
|1970-01-01T08:00:00.005+08:00|            1|            0|
|1970-01-01T08:00:00.006+08:00|            1|            0|
|1970-01-01T08:00:00.007+08:00|            1|            0|
|1970-01-01T08:00:00.008+08:00|            0|            0|
|1970-01-01T08:00:00.009+08:00|            0|            0|
|1970-01-01T08:00:00.010+08:00|            0|            0|
+-----------------------------+-------------+-------------+
Not use ignoreNull attribute (Ignore Null)

SQL:

select count_if(s1=0 & s2=0, 3), count_if(s1=1 & s2=0, 3) from root.db.d1

Result:

+--------------------------------------------------+--------------------------------------------------+
|count_if(root.db.d1.s1 = 0 & root.db.d1.s2 = 0, 3)|count_if(root.db.d1.s1 = 1 & root.db.d1.s2 = 0, 3)|
+--------------------------------------------------+--------------------------------------------------+
|                                                 2|                                                 1|
+--------------------------------------------------+--------------------------------------------------
Use ignoreNull attribute

SQL:

select count_if(s1=0 & s2=0, 3, 'ignoreNull'='false'), count_if(s1=1 & s2=0, 3, 'ignoreNull'='false') from root.db.d1

Result:

+------------------------------------------------------------------------+------------------------------------------------------------------------+
|count_if(root.db.d1.s1 = 0 & root.db.d1.s2 = 0, 3, "ignoreNull"="false")|count_if(root.db.d1.s1 = 1 & root.db.d1.s2 = 0, 3, "ignoreNull"="false")|
+------------------------------------------------------------------------+------------------------------------------------------------------------+
|                                                                       1|                                                                       1|
+------------------------------------------------------------------------+------------------------------------------------------------------------+

TIME_DURATION

Grammar

    time_duration(Path)

Example

raw data
+----------+-------------+
|      Time|root.db.d1.s1|
+----------+-------------+
|         1|           70|
|         3|           10|
|         4|          303|
|         6|          110|
|         7|          302|
|         8|          110|
|         9|           60|
|        10|           70|
|1677570934|           30|
+----------+-------------+
Insert sql
"CREATE DATABASE root.db",
"CREATE TIMESERIES root.db.d1.s1 WITH DATATYPE=INT32, ENCODING=PLAIN tags(city=Beijing)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(1, 2, 10, true)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(2, null, 20, true)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(3, 10, 0, null)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(4, 303, 30, true)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(5, null, 20, true)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(6, 110, 20, true)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(7, 302, 20, true)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(8, 110, null, true)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(9, 60, 20, true)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(10,70, 20, null)",
"INSERT INTO root.db.d1(timestamp,s1,s2,s3) values(1677570934, 30, 0, true)",

SQL:

select time_duration(s1) from root.db.d1

Result:

+----------------------------+
|time_duration(root.db.d1.s1)|
+----------------------------+
|                  1677570933|
+----------------------------+

Note: Returns 0 if there is only one data point, or null if the data point is null.

COUNT_TIME

Grammar

    count_time(*)

Example

raw data
+----------+-------------+-------------+-------------+-------------+
|      Time|root.db.d1.s1|root.db.d1.s2|root.db.d2.s1|root.db.d2.s2|
+----------+-------------+-------------+-------------+-------------+
|         0|            0|         null|         null|            0|
|         1|         null|            1|            1|         null|
|         2|         null|            2|            2|         null|
|         4|            4|         null|         null|            4|
|         5|            5|            5|            5|            5|
|         7|         null|            7|            7|         null|
|         8|            8|            8|            8|            8|
|         9|         null|            9|         null|         null|
+----------+-------------+-------------+-------------+-------------+
Insert sql
CREATE DATABASE root.db;
CREATE TIMESERIES root.db.d1.s1 WITH DATATYPE=INT32, ENCODING=PLAIN;
CREATE TIMESERIES root.db.d1.s2 WITH DATATYPE=INT32, ENCODING=PLAIN;
CREATE TIMESERIES root.db.d2.s1 WITH DATATYPE=INT32, ENCODING=PLAIN;
CREATE TIMESERIES root.db.d2.s2 WITH DATATYPE=INT32, ENCODING=PLAIN;
INSERT INTO root.db.d1(time, s1) VALUES(0, 0), (4,4), (5,5), (8,8);
INSERT INTO root.db.d1(time, s2) VALUES(1, 1), (2,2), (5,5), (7,7), (8,8), (9,9);
INSERT INTO root.db.d2(time, s1) VALUES(1, 1), (2,2), (5,5), (7,7), (8,8);
INSERT INTO root.db.d2(time, s2) VALUES(0, 0), (4,4), (5,5), (8,8);

Query-Example - 1:

select count_time(*) from root.db.**

Result

+-------------+
|count_time(*)|
+-------------+
|            8|
+-------------+

Query-Example - 2:

select count_time(*) from root.db.d1, root.db.d2

Result

+-------------+
|count_time(*)|
+-------------+
|            8|
+-------------+

Query-Example - 3:

select count_time(*) from root.db.** group by([0, 10), 2ms)

Result

+-----------------------------+-------------+
|                         Time|count_time(*)|
+-----------------------------+-------------+
|1970-01-01T08:00:00.000+08:00|            2|            
|1970-01-01T08:00:00.002+08:00|            1|            
|1970-01-01T08:00:00.004+08:00|            2|            
|1970-01-01T08:00:00.006+08:00|            1|            
|1970-01-01T08:00:00.008+08:00|            2|            
+-----------------------------+-------------+

Query-Example - 4:

select count_time(*) from root.db.** group by([0, 10), 2ms) align by device

Result

+-----------------------------+----------+-------------+
|                         Time|    Device|count_time(*)|
+-----------------------------+----------+-------------+
|1970-01-01T08:00:00.000+08:00|root.db.d1|            2|
|1970-01-01T08:00:00.002+08:00|root.db.d1|            1|
|1970-01-01T08:00:00.004+08:00|root.db.d1|            2|
|1970-01-01T08:00:00.006+08:00|root.db.d1|            1|
|1970-01-01T08:00:00.008+08:00|root.db.d1|            2|
|1970-01-01T08:00:00.000+08:00|root.db.d2|            2|
|1970-01-01T08:00:00.002+08:00|root.db.d2|            1|
|1970-01-01T08:00:00.004+08:00|root.db.d2|            2|
|1970-01-01T08:00:00.006+08:00|root.db.d2|            1|
|1970-01-01T08:00:00.008+08:00|root.db.d2|            1|
+-----------------------------+----------+-------------+

Note:

  1. The parameter in count_time can only be *.
  2. Count_time aggregation cannot be used with other aggregation functions.
  3. Count_time aggregation used with having statement is not supported, and count_time aggregation can not appear in the having statement.
  4. Count_time does not support use with group by level, group by tag.

String Processing

STRING_CONTAINS

Function introduction

This function checks whether the substring s exists in the string

Function name: STRING_CONTAINS

Input sequence: Only a single input sequence is supported, the type is TEXT.

parameter:

  • s: The string to search for.

Output Sequence: Output a single sequence, the type is BOOLEAN.

Usage example

select s1, string_contains(s1, 's'='warn') from root.sg1.d4;
+-----------------------------+--------------+-------------------------------------------+
|                         Time|root.sg1.d4.s1|string_contains(root.sg1.d4.s1, "s"="warn")|
+-----------------------------+--------------+-------------------------------------------+
|1970-01-01T08:00:00.001+08:00|    warn:-8721|                                       true|
|1970-01-01T08:00:00.002+08:00|  error:-37229|                                      false|
|1970-01-01T08:00:00.003+08:00|     warn:1731|                                       true|
+-----------------------------+--------------+-------------------------------------------+
Total line number = 3
It costs 0.007s

STRING_MATCHES

Function introduction

This function judges whether a string can be matched by the regular expression regex.

Function name: STRING_MATCHES

Input sequence: Only a single input sequence is supported, the type is TEXT.

parameter:

  • regex: Java standard library-style regular expressions.

Output Sequence: Output a single sequence, the type is BOOLEAN.

Usage example

select s1, string_matches(s1, 'regex'='[^\\s]+37229') from root.sg1.d4;
+-----------------------------+--------------+------------------------------------------------------+
|                         Time|root.sg1.d4.s1|string_matches(root.sg1.d4.s1, "regex"="[^\\s]+37229")|
+-----------------------------+--------------+------------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|    warn:-8721|                                                 false|
|1970-01-01T08:00:00.002+08:00|  error:-37229|                                                  true|
|1970-01-01T08:00:00.003+08:00|     warn:1731|                                                 false|
+-----------------------------+--------------+------------------------------------------------------+
Total line number = 3
It costs 0.007s

Length

Usage

The function is used to get the length of input series.

Name: LENGTH

Input Series: Only support a single input series. The data type is TEXT.

Output Series: Output a single series. The type is INT32.

Note: Returns NULL if input is NULL.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s1|
+-----------------------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|
|1970-01-01T08:00:00.002+08:00|      22test22|
+-----------------------------+--------------+

SQL for query:

select s1, length(s1) from root.sg1.d1

Output series:

+-----------------------------+--------------+----------------------+
|                         Time|root.sg1.d1.s1|length(root.sg1.d1.s1)|
+-----------------------------+--------------+----------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|                     6|
|1970-01-01T08:00:00.002+08:00|      22test22|                     8|
+-----------------------------+--------------+----------------------+

Locate

Usage

The function is used to get the position of the first occurrence of substring target in input series. Returns -1 if there are no target in input.

Name: LOCATE

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • target: The substring to be located.
  • reverse: Indicates whether reverse locate is required. The default value is false, means left-to-right locate.

Output Series: Output a single series. The type is INT32.

Note: The index begins from 0.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s1|
+-----------------------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|
|1970-01-01T08:00:00.002+08:00|      22test22|
+-----------------------------+--------------+

SQL for query:

select s1, locate(s1, "target"="1") from root.sg1.d1

Output series:

+-----------------------------+--------------+------------------------------------+
|                         Time|root.sg1.d1.s1|locate(root.sg1.d1.s1, "target"="1")|
+-----------------------------+--------------+------------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|                                   0|
|1970-01-01T08:00:00.002+08:00|      22test22|                                  -1|
+-----------------------------+--------------+------------------------------------+

Another SQL for query:

select s1, locate(s1, "target"="1", "reverse"="true") from root.sg1.d1

Output series:

+-----------------------------+--------------+------------------------------------------------------+
|                         Time|root.sg1.d1.s1|locate(root.sg1.d1.s1, "target"="1", "reverse"="true")|
+-----------------------------+--------------+------------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|                                                     5|
|1970-01-01T08:00:00.002+08:00|      22test22|                                                    -1|
+-----------------------------+--------------+------------------------------------------------------+

StartsWith

Usage

The function is used to check whether input series starts with the specified prefix.

Name: STARTSWITH

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • target: The prefix to be checked.

Output Series: Output a single series. The type is BOOLEAN.

Note: Returns NULL if input is NULL.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s1|
+-----------------------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|
|1970-01-01T08:00:00.002+08:00|      22test22|
+-----------------------------+--------------+

SQL for query:

select s1, startswith(s1, "target"="1") from root.sg1.d1

Output series:

+-----------------------------+--------------+----------------------------------------+
|                         Time|root.sg1.d1.s1|startswith(root.sg1.d1.s1, "target"="1")|
+-----------------------------+--------------+----------------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|                                    true|
|1970-01-01T08:00:00.002+08:00|      22test22|                                   false|
+-----------------------------+--------------+----------------------------------------+

EndsWith

Usage

The function is used to check whether input series ends with the specified suffix.

Name: ENDSWITH

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • target: The suffix to be checked.

Output Series: Output a single series. The type is BOOLEAN.

Note: Returns NULL if input is NULL.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s1|
+-----------------------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|
|1970-01-01T08:00:00.002+08:00|      22test22|
+-----------------------------+--------------+

SQL for query:

select s1, endswith(s1, "target"="1") from root.sg1.d1

Output series:

+-----------------------------+--------------+--------------------------------------+
|                         Time|root.sg1.d1.s1|endswith(root.sg1.d1.s1, "target"="1")|
+-----------------------------+--------------+--------------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|                                  true|
|1970-01-01T08:00:00.002+08:00|      22test22|                                 false|
+-----------------------------+--------------+--------------------------------------+

Concat

Usage

The function is used to concat input series and target strings.

Name: CONCAT

Input Series: At least one input series. The data type is TEXT.

Parameter:

  • targets: A series of K-V, key needs to start with target and be not duplicated, value is the string you want to concat.
  • series_behind: Indicates whether series behind targets. The default value is false.

Output Series: Output a single series. The type is TEXT.

Note:

  • If value of input series is NULL, it will be skipped.
  • We can only concat input series and targets separately. concat(s1, "target1"="IoT", s2, "target2"="DB") and
    concat(s1, s2, "target1"="IoT", "target2"="DB") gives the same result.

Examples

Input series:

+-----------------------------+--------------+--------------+
|                         Time|root.sg1.d1.s1|root.sg1.d1.s2|
+-----------------------------+--------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|          null|
|1970-01-01T08:00:00.002+08:00|      22test22|      2222test|
+-----------------------------+--------------+--------------+

SQL for query:

select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB") from root.sg1.d1

Output series:

+-----------------------------+--------------+--------------+-----------------------------------------------------------------------+
|                         Time|root.sg1.d1.s1|root.sg1.d1.s2|concat(root.sg1.d1.s1, root.sg1.d1.s2, "target1"="IoT", "target2"="DB")|
+-----------------------------+--------------+--------------+-----------------------------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|          null|                                                            1test1IoTDB|
|1970-01-01T08:00:00.002+08:00|      22test22|      2222test|                                                  22test222222testIoTDB|
+-----------------------------+--------------+--------------+-----------------------------------------------------------------------+

Another SQL for query:

select s1, s2, concat(s1, s2, "target1"="IoT", "target2"="DB", "series_behind"="true") from root.sg1.d1

Output series:

+-----------------------------+--------------+--------------+-----------------------------------------------------------------------------------------------+
|                         Time|root.sg1.d1.s1|root.sg1.d1.s2|concat(root.sg1.d1.s1, root.sg1.d1.s2, "target1"="IoT", "target2"="DB", "series_behind"="true")|
+-----------------------------+--------------+--------------+-----------------------------------------------------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|          null|                                                                                    IoTDB1test1|
|1970-01-01T08:00:00.002+08:00|      22test22|      2222test|                                                                          IoTDB22test222222test|
+-----------------------------+--------------+--------------+-----------------------------------------------------------------------------------------------+

substring

Usage

Extracts a substring of a string, starting with the first specified character and stopping after the specified number of characters.The index start at 1. The value range of from and for is an INT32.

Name: SUBSTRING

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • from: Indicates the start position of substring.
  • for: Indicates how many characters to stop after of substring.

Output Series: Output a single series. The type is TEXT.

Note: Returns NULL if input is NULL.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s1|
+-----------------------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|
|1970-01-01T08:00:00.002+08:00|      22test22|
+-----------------------------+--------------+

SQL for query:

select s1, substring(s1 from 1 for 2) from root.sg1.d1

Output series:

+-----------------------------+--------------+--------------------------------------+
|                         Time|root.sg1.d1.s1|SUBSTRING(root.sg1.d1.s1 FROM 1 FOR 2)|
+-----------------------------+--------------+--------------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|                                    1t|
|1970-01-01T08:00:00.002+08:00|      22test22|                                    22|
+-----------------------------+--------------+--------------------------------------+

replace

Usage

Replace a substring in the input sequence with the target substring.

Name: REPLACE

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • first parameter: The target substring to be replaced.
  • second parameter: The substring to replace with.

Output Series: Output a single series. The type is TEXT.

Note: Returns NULL if input is NULL.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s1|
+-----------------------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|
|1970-01-01T08:00:00.002+08:00|      22test22|
+-----------------------------+--------------+

SQL for query:

select s1, replace(s1, 'es', 'tt') from root.sg1.d1

Output series:

+-----------------------------+--------------+-----------------------------------+
|                         Time|root.sg1.d1.s1|REPLACE(root.sg1.d1.s1, 'es', 'tt')|
+-----------------------------+--------------+-----------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|                             1tttt1|
|1970-01-01T08:00:00.002+08:00|      22test22|                           22tttt22|
+-----------------------------+--------------+-----------------------------------+

Upper

Usage

The function is used to get the string of input series with all characters changed to uppercase.

Name: UPPER

Input Series: Only support a single input series. The data type is TEXT.

Output Series: Output a single series. The type is TEXT.

Note: Returns NULL if input is NULL.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s1|
+-----------------------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|
|1970-01-01T08:00:00.002+08:00|      22test22|
+-----------------------------+--------------+

SQL for query:

select s1, upper(s1) from root.sg1.d1

Output series:

+-----------------------------+--------------+---------------------+
|                         Time|root.sg1.d1.s1|upper(root.sg1.d1.s1)|
+-----------------------------+--------------+---------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|               1TEST1|
|1970-01-01T08:00:00.002+08:00|      22test22|             22TEST22|
+-----------------------------+--------------+---------------------+

Lower

Usage

The function is used to get the string of input series with all characters changed to lowercase.

Name: LOWER

Input Series: Only support a single input series. The data type is TEXT.

Output Series: Output a single series. The type is TEXT.

Note: Returns NULL if input is NULL.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s1|
+-----------------------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1TEST1|
|1970-01-01T08:00:00.002+08:00|      22TEST22|
+-----------------------------+--------------+

SQL for query:

select s1, lower(s1) from root.sg1.d1

Output series:

+-----------------------------+--------------+---------------------+
|                         Time|root.sg1.d1.s1|lower(root.sg1.d1.s1)|
+-----------------------------+--------------+---------------------+
|1970-01-01T08:00:00.001+08:00|        1TEST1|               1test1|
|1970-01-01T08:00:00.002+08:00|      22TEST22|             22test22|
+-----------------------------+--------------+---------------------+

Trim

Usage

The function is used to get the string whose value is same to input series, with all leading and trailing space removed.

Name: TRIM

Input Series: Only support a single input series. The data type is TEXT.

Output Series: Output a single series. The type is TEXT.

Note: Returns NULL if input is NULL.

Examples

Input series:

+-----------------------------+--------------+
|                         Time|root.sg1.d1.s3|
+-----------------------------+--------------+
|1970-01-01T08:00:00.002+08:00|   3querytest3|
|1970-01-01T08:00:00.003+08:00|  3querytest3 |
+-----------------------------+--------------+

SQL for query:

select s3, trim(s3) from root.sg1.d1

Output series:

+-----------------------------+--------------+--------------------+
|                         Time|root.sg1.d1.s3|trim(root.sg1.d1.s3)|
+-----------------------------+--------------+--------------------+
|1970-01-01T08:00:00.002+08:00|   3querytest3|         3querytest3|
|1970-01-01T08:00:00.003+08:00|  3querytest3 |         3querytest3|
+-----------------------------+--------------+--------------------+

StrCmp

Usage

The function is used to get the compare result of two input series. Returns 0 if series value are the same, a negative integer if value of series1 is smaller than series2,
a positive integer if value of series1 is more than series2.

Name: StrCmp

Input Series: Support two input series. Data types are all the TEXT.

Output Series: Output a single series. The type is INT32.

Note: Returns NULL either series value is NULL.

Examples

Input series:

+-----------------------------+--------------+--------------+
|                         Time|root.sg1.d1.s1|root.sg1.d1.s2|
+-----------------------------+--------------+--------------+
|1970-01-01T08:00:00.001+08:00|        1test1|          null|
|1970-01-01T08:00:00.002+08:00|      22test22|      2222test|
+-----------------------------+--------------+--------------+

SQL for query:

select s1, s2, strcmp(s1, s2) from root.sg1.d1

Output series:

+-----------------------------+--------------+--------------+--------------------------------------+
|                         Time|root.sg1.d1.s1|root.sg1.d1.s2|strcmp(root.sg1.d1.s1, root.sg1.d1.s2)|
+-----------------------------+--------------+--------------+--------------------------------------+
|1970-01-01T08:00:00.001+08:00|        1test1|          null|                                  null|
|1970-01-01T08:00:00.002+08:00|      22test22|      2222test|                                    66|
+-----------------------------+--------------+--------------+--------------------------------------+

StrReplace

Usage

This is not a built-in function and can only be used after registering the library-udf. The function is used to replace the specific substring with given string.

Name: STRREPLACE

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • target: The target substring to be replaced.
  • replace: The string to be put on.
  • limit: The number of matches to be replaced which should be an integer no less than -1,
    default to -1 which means all matches will be replaced.
  • offset: The number of matches to be skipped, which means the first offset matches will not be replaced, default to 0.
  • reverse: Whether to count all the matches reversely, default to 'false'.

Output Series: Output a single series. The type is TEXT.

Examples

Input series:

+-----------------------------+---------------+
|                         Time|root.test.d1.s1|
+-----------------------------+---------------+
|2021-01-01T00:00:01.000+08:00|      A,B,A+,B-|
|2021-01-01T00:00:02.000+08:00|      A,A+,A,B+|
|2021-01-01T00:00:03.000+08:00|         B+,B,B|
|2021-01-01T00:00:04.000+08:00|      A+,A,A+,A|
|2021-01-01T00:00:05.000+08:00|       A,B-,B,B|
+-----------------------------+---------------+

SQL for query:

select strreplace(s1, "target"=",", "replace"="/", "limit"="2") from root.test.d1

Output series:

+-----------------------------+-----------------------------------------+
|                         Time|strreplace(root.test.d1.s1, "target"=",",|
|                             |              "replace"="/", "limit"="2")|
+-----------------------------+-----------------------------------------+
|2021-01-01T00:00:01.000+08:00|                                A/B/A+,B-|
|2021-01-01T00:00:02.000+08:00|                                A/A+/A,B+|
|2021-01-01T00:00:03.000+08:00|                                   B+/B/B|
|2021-01-01T00:00:04.000+08:00|                                A+/A/A+,A|
|2021-01-01T00:00:05.000+08:00|                                 A/B-/B,B|
+-----------------------------+-----------------------------------------+

Another SQL for query:

select strreplace(s1, "target"=",", "replace"="/", "limit"="1", "offset"="1", "reverse"="true") from root.test.d1

Output series:

+-----------------------------+-----------------------------------------------------+
|                         Time|strreplace(root.test.d1.s1, "target"=",", "replace"= | 
|                             |    "|", "limit"="1", "offset"="1", "reverse"="true")|
+-----------------------------+-----------------------------------------------------+
|2021-01-01T00:00:01.000+08:00|                                            A,B/A+,B-|
|2021-01-01T00:00:02.000+08:00|                                            A,A+/A,B+|
|2021-01-01T00:00:03.000+08:00|                                               B+/B,B|
|2021-01-01T00:00:04.000+08:00|                                            A+,A/A+,A|
|2021-01-01T00:00:05.000+08:00|                                             A,B-/B,B|
+-----------------------------+-----------------------------------------------------+

RegexMatch

Usage

This is not a built-in function and can only be used after registering the library-udf. The function is used to fetch matched contents from text with given regular expression.

Name: REGEXMATCH

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • regex: The regular expression to match in the text. All grammars supported by Java are acceptable,
    for example, \d+\.\d+\.\d+\.\d+ is expected to match any IPv4 addresses.
  • group: The wanted group index in the matched result.
    Reference to java.util.regex, group 0 is the whole pattern and
    the next ones are numbered with the appearance order of left parentheses.
    For example, the groups in A(B(CD)) are: 0-A(B(CD)), 1-B(CD), 2-CD.

Output Series: Output a single series. The type is TEXT.

Note: Those points with null values or not matched with the given pattern will not return any results.

Examples

Input series:

+-----------------------------+-------------------------------+
|                         Time|                root.test.d1.s1|
+-----------------------------+-------------------------------+
|2021-01-01T00:00:01.000+08:00|        [192.168.0.1] [SUCCESS]|
|2021-01-01T00:00:02.000+08:00|       [192.168.0.24] [SUCCESS]|
|2021-01-01T00:00:03.000+08:00|           [192.168.0.2] [FAIL]|
|2021-01-01T00:00:04.000+08:00|        [192.168.0.5] [SUCCESS]|
|2021-01-01T00:00:05.000+08:00|      [192.168.0.124] [SUCCESS]|
+-----------------------------+-------------------------------+

SQL for query:

select regexmatch(s1, "regex"="\d+\.\d+\.\d+\.\d+", "group"="0") from root.test.d1

Output series:

+-----------------------------+----------------------------------------------------------------------+
|                         Time|regexmatch(root.test.d1.s1, "regex"="\d+\.\d+\.\d+\.\d+", "group"="0")|
+-----------------------------+----------------------------------------------------------------------+
|2021-01-01T00:00:01.000+08:00|                                                           192.168.0.1|
|2021-01-01T00:00:02.000+08:00|                                                          192.168.0.24|
|2021-01-01T00:00:03.000+08:00|                                                           192.168.0.2|
|2021-01-01T00:00:04.000+08:00|                                                           192.168.0.5|
|2021-01-01T00:00:05.000+08:00|                                                         192.168.0.124|
+-----------------------------+----------------------------------------------------------------------+

RegexReplace

Usage

This is not a built-in function and can only be used after registering the library-udf. The function is used to replace the specific regular expression matches with given string.

Name: REGEXREPLACE

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • regex: The target regular expression to be replaced. All grammars supported by Java are acceptable.
  • replace: The string to be put on and back reference notes in Java is also supported,
    for example, '$1' refers to group 1 in the regex which will be filled with corresponding matched results.
  • limit: The number of matches to be replaced which should be an integer no less than -1,
    default to -1 which means all matches will be replaced.
  • offset: The number of matches to be skipped, which means the first offset matches will not be replaced, default to 0.
  • reverse: Whether to count all the matches reversely, default to 'false'.

Output Series: Output a single series. The type is TEXT.

Examples

Input series:

+-----------------------------+-------------------------------+
|                         Time|                root.test.d1.s1|
+-----------------------------+-------------------------------+
|2021-01-01T00:00:01.000+08:00|        [192.168.0.1] [SUCCESS]|
|2021-01-01T00:00:02.000+08:00|       [192.168.0.24] [SUCCESS]|
|2021-01-01T00:00:03.000+08:00|           [192.168.0.2] [FAIL]|
|2021-01-01T00:00:04.000+08:00|        [192.168.0.5] [SUCCESS]|
|2021-01-01T00:00:05.000+08:00|      [192.168.0.124] [SUCCESS]|
+-----------------------------+-------------------------------+

SQL for query:

select regexreplace(s1, "regex"="192\.168\.0\.(\d+)", "replace"="cluster-$1", "limit"="1") from root.test.d1

Output series:

+-----------------------------+-----------------------------------------------------------+
|                         Time|regexreplace(root.test.d1.s1, "regex"="192\.168\.0\.(\d+)",|
|                             |                       "replace"="cluster-$1", "limit"="1")|
+-----------------------------+-----------------------------------------------------------+
|2021-01-01T00:00:01.000+08:00|                                      [cluster-1] [SUCCESS]|
|2021-01-01T00:00:02.000+08:00|                                     [cluster-24] [SUCCESS]|
|2021-01-01T00:00:03.000+08:00|                                         [cluster-2] [FAIL]|
|2021-01-01T00:00:04.000+08:00|                                      [cluster-5] [SUCCESS]|
|2021-01-01T00:00:05.000+08:00|                                    [cluster-124] [SUCCESS]|
+-----------------------------+-----------------------------------------------------------+

RegexSplit

Usage

This is not a built-in function and can only be used after registering the library-udf. The function is used to split text with given regular expression and return specific element.

Name: REGEXSPLIT

Input Series: Only support a single input series. The data type is TEXT.

Parameter:

  • regex: The regular expression used to split the text.
    All grammars supported by Java are acceptable, for example, ['"] is expected to match ' and ".
  • index: The wanted index of elements in the split result.
    It should be an integer no less than -1, default to -1 which means the length of the result array is returned
    and any non-negative integer is used to fetch the text of the specific index starting from 0.

Output Series: Output a single series. The type is INT32 when index is -1 and TEXT when it's an valid index.

Note: When index is out of the range of the result array, for example 0,1,2 split with , and index is set to 3,
no result are returned for that record.

Examples

Input series:

+-----------------------------+---------------+
|                         Time|root.test.d1.s1|
+-----------------------------+---------------+
|2021-01-01T00:00:01.000+08:00|      A,B,A+,B-|
|2021-01-01T00:00:02.000+08:00|      A,A+,A,B+|
|2021-01-01T00:00:03.000+08:00|         B+,B,B|
|2021-01-01T00:00:04.000+08:00|      A+,A,A+,A|
|2021-01-01T00:00:05.000+08:00|       A,B-,B,B|
+-----------------------------+---------------+

SQL for query:

select regexsplit(s1, "regex"=",", "index"="-1") from root.test.d1

Output series:

+-----------------------------+------------------------------------------------------+
|                         Time|regexsplit(root.test.d1.s1, "regex"=",", "index"="-1")|
+-----------------------------+------------------------------------------------------+
|2021-01-01T00:00:01.000+08:00|                                                     4|
|2021-01-01T00:00:02.000+08:00|                                                     4|
|2021-01-01T00:00:03.000+08:00|                                                     3|
|2021-01-01T00:00:04.000+08:00|                                                     4|
|2021-01-01T00:00:05.000+08:00|                                                     4|
+-----------------------------+------------------------------------------------------+

Another SQL for query:

SQL for query:

select regexsplit(s1, "regex"=",", "index"="3") from root.test.d1

Output series:

+-----------------------------+-----------------------------------------------------+
|                         Time|regexsplit(root.test.d1.s1, "regex"=",", "index"="3")|
+-----------------------------+-----------------------------------------------------+
|2021-01-01T00:00:01.000+08:00|                                                   B-|
|2021-01-01T00:00:02.000+08:00|                                                   B+|
|2021-01-01T00:00:04.000+08:00|                                                    A|
|2021-01-01T00:00:05.000+08:00|                                                    B|
+-----------------------------+-----------------------------------------------------+

Data Type Conversion Function

The IoTDB currently supports 6 data types, including INT32, INT64 ,FLOAT, DOUBLE, BOOLEAN, TEXT. When we query or evaluate data, we may need to convert data types, such as TEXT to INT32, or FLOAT to DOUBLE. IoTDB supports cast function to convert data types.

Syntax example:

SELECT cast(s1 as INT32) from root.sg

The syntax of the cast function is consistent with that of PostgreSQL. The data type specified after AS indicates the target type to be converted. Currently, all six data types supported by IoTDB can be used in the cast function. The conversion rules to be followed are shown in the following table. The row represents the original data type, and the column represents the target data type to be converted into:

INT32INT64FLOATDOUBLEBOOLEANTEXT
INT32No need to castCast directlyCast directlyCast directly!=0 : true
==0: false
String.valueOf()
INT64Out of the range of INT32: throw Exception
Otherwise: Cast directly
No need to castCast directlyCast directly!=0L : true
==0: false
String.valueOf()
FLOATOut of the range of INT32: throw Exception
Otherwise: Math.round()
Out of the range of INT64: throw Exception
Otherwise: Math.round()
No need to castCast directly!=0.0f : true
==0: false
String.valueOf()
DOUBLEOut of the range of INT32: throw Exception
Otherwise: Math.round()
Out of the range of INT64: throw Exception
Otherwise: Math.round()
Out of the range of FLOAT:throw Exception
Otherwise: Cast directly
No need to cast!=0.0 : true
==0: false
String.valueOf()
BOOLEANtrue: 1
false: 0
true: 1L
false: 0
true: 1.0f
false: 0
true: 1.0
false: 0
No need to casttrue: "true"
false: "false"
TEXTInteger.parseInt()Long.parseLong()Float.parseFloat()Double.parseDouble()text.toLowerCase =="true" : true
text.toLowerCase =="false" : false
Otherwise: throw Exception
No need to cast

Examples

// timeseries
IoTDB> show timeseries root.sg.d1.**
+-------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+
|   Timeseries|Alias|Database|DataType|Encoding|Compression|Tags|Attributes|Deadband|DeadbandParameters|
+-------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+
|root.sg.d1.s3| null| root.sg|   FLOAT|   PLAIN|     SNAPPY|null|      null|    null|              null|
|root.sg.d1.s4| null| root.sg|  DOUBLE|   PLAIN|     SNAPPY|null|      null|    null|              null|
|root.sg.d1.s5| null| root.sg| BOOLEAN|   PLAIN|     SNAPPY|null|      null|    null|              null|
|root.sg.d1.s6| null| root.sg|    TEXT|   PLAIN|     SNAPPY|null|      null|    null|              null|
|root.sg.d1.s1| null| root.sg|   INT32|   PLAIN|     SNAPPY|null|      null|    null|              null|
|root.sg.d1.s2| null| root.sg|   INT64|   PLAIN|     SNAPPY|null|      null|    null|              null|
+-------------+-----+--------+--------+--------+-----------+----+----------+--------+------------------+

// data of timeseries
IoTDB> select * from root.sg.d1;
+-----------------------------+-------------+-------------+-------------+-------------+-------------+-------------+
|                         Time|root.sg.d1.s3|root.sg.d1.s4|root.sg.d1.s5|root.sg.d1.s6|root.sg.d1.s1|root.sg.d1.s2|
+-----------------------------+-------------+-------------+-------------+-------------+-------------+-------------+
|1970-01-01T08:00:00.000+08:00|          0.0|          0.0|        false|        10000|            0|            0|
|1970-01-01T08:00:00.001+08:00|          1.0|          1.0|        false|            3|            1|            1|
|1970-01-01T08:00:00.002+08:00|          2.7|          2.7|         true|         TRue|            2|            2|
|1970-01-01T08:00:00.003+08:00|         3.33|         3.33|         true|        faLse|            3|            3|
+-----------------------------+-------------+-------------+-------------+-------------+-------------+-------------+

// cast BOOLEAN to other types
IoTDB> select cast(s5 as INT32), cast(s5 as INT64),cast(s5 as FLOAT),cast(s5 as DOUBLE), cast(s5 as TEXT) from root.sg.d1
+-----------------------------+----------------------------+----------------------------+----------------------------+-----------------------------+---------------------------+
|                         Time|CAST(root.sg.d1.s5 AS INT32)|CAST(root.sg.d1.s5 AS INT64)|CAST(root.sg.d1.s5 AS FLOAT)|CAST(root.sg.d1.s5 AS DOUBLE)|CAST(root.sg.d1.s5 AS TEXT)|
+-----------------------------+----------------------------+----------------------------+----------------------------+-----------------------------+---------------------------+
|1970-01-01T08:00:00.000+08:00|                           0|                           0|                         0.0|                          0.0|                      false|
|1970-01-01T08:00:00.001+08:00|                           0|                           0|                         0.0|                          0.0|                      false|
|1970-01-01T08:00:00.002+08:00|                           1|                           1|                         1.0|                          1.0|                       true|
|1970-01-01T08:00:00.003+08:00|                           1|                           1|                         1.0|                          1.0|                       true|
+-----------------------------+----------------------------+----------------------------+----------------------------+-----------------------------+---------------------------+

// cast TEXT to numeric types
IoTDB> select cast(s6 as INT32), cast(s6 as INT64), cast(s6 as FLOAT), cast(s6 as DOUBLE) from root.sg.d1 where time < 2
+-----------------------------+----------------------------+----------------------------+----------------------------+-----------------------------+
|                         Time|CAST(root.sg.d1.s6 AS INT32)|CAST(root.sg.d1.s6 AS INT64)|CAST(root.sg.d1.s6 AS FLOAT)|CAST(root.sg.d1.s6 AS DOUBLE)|
+-----------------------------+----------------------------+----------------------------+----------------------------+-----------------------------+
|1970-01-01T08:00:00.000+08:00|                       10000|                       10000|                     10000.0|                      10000.0|
|1970-01-01T08:00:00.001+08:00|                           3|                           3|                         3.0|                          3.0|
+-----------------------------+----------------------------+----------------------------+----------------------------+-----------------------------+

// cast TEXT to BOOLEAN
IoTDB> select cast(s6 as BOOLEAN) from root.sg.d1 where time >= 2
+-----------------------------+------------------------------+
|                         Time|CAST(root.sg.d1.s6 AS BOOLEAN)|
+-----------------------------+------------------------------+
|1970-01-01T08:00:00.002+08:00|                          true|
|1970-01-01T08:00:00.003+08:00|                         false|
+-----------------------------+------------------------------+

Constant Timeseries Generating Functions

The constant timeseries generating function is used to generate a timeseries in which the values of all data points are the same.

The constant timeseries generating function accepts one or more timeseries inputs, and the timestamp set of the output data points is the union of the timestamp sets of the input timeseries.

Currently, IoTDB supports the following constant timeseries generating functions:

Function NameRequired AttributesOutput Series Data TypeDescription
CONSTvalue: the value of the output data point
type: the type of the output data point, it can only be INT32 / INT64 / FLOAT / DOUBLE / BOOLEAN / TEXT
Determined by the required attribute typeOutput the user-specified constant timeseries according to the attributes value and type.
PINoneDOUBLEData point value: a double value of π, the ratio of the circumference of a circle to its diameter, which is equals to Math.PI in the Java Standard Library.
ENoneDOUBLEData point value: a double value of e, the base of the natural logarithms, which is equals to Math.E in the Java Standard Library.

Example:

select s1, s2, const(s1, 'value'='1024', 'type'='INT64'), pi(s2), e(s1, s2) from root.sg1.d1; 

Result:

select s1, s2, const(s1, 'value'='1024', 'type'='INT64'), pi(s2), e(s1, s2) from root.sg1.d1; 
+-----------------------------+--------------+--------------+-----------------------------------------------------+------------------+---------------------------------+
|                         Time|root.sg1.d1.s1|root.sg1.d1.s2|const(root.sg1.d1.s1, "value"="1024", "type"="INT64")|pi(root.sg1.d1.s2)|e(root.sg1.d1.s1, root.sg1.d1.s2)|
+-----------------------------+--------------+--------------+-----------------------------------------------------+------------------+---------------------------------+
|1970-01-01T08:00:00.000+08:00|           0.0|           0.0|                                                 1024| 3.141592653589793|                2.718281828459045|
|1970-01-01T08:00:00.001+08:00|           1.0|          null|                                                 1024|              null|                2.718281828459045|
|1970-01-01T08:00:00.002+08:00|           2.0|          null|                                                 1024|              null|                2.718281828459045|
|1970-01-01T08:00:00.003+08:00|          null|           3.0|                                                 null| 3.141592653589793|                2.718281828459045|
|1970-01-01T08:00:00.004+08:00|          null|           4.0|                                                 null| 3.141592653589793|                2.718281828459045|
+-----------------------------+--------------+--------------+-----------------------------------------------------+------------------+---------------------------------+
Total line number = 5
It costs 0.005s

Selector Functions

Currently, IoTDB supports the following selector functions:

Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
TOP_KINT32 / INT64 / FLOAT / DOUBLE / TEXTk: the maximum number of selected data points, must be greater than 0 and less than or equal to 1000Same type as the input seriesReturns k data points with the largest values in a time series.
BOTTOM_KINT32 / INT64 / FLOAT / DOUBLE / TEXTk: the maximum number of selected data points, must be greater than 0 and less than or equal to 1000Same type as the input seriesReturns k data points with the smallest values in a time series.

Example:

select s1, top_k(s1, 'k'='2'), bottom_k(s1, 'k'='2') from root.sg1.d2 where time > 2020-12-10T20:36:15.530+08:00;

Result:

+-----------------------------+--------------------+------------------------------+---------------------------------+
|                         Time|      root.sg1.d2.s1|top_k(root.sg1.d2.s1, "k"="2")|bottom_k(root.sg1.d2.s1, "k"="2")|
+-----------------------------+--------------------+------------------------------+---------------------------------+
|2020-12-10T20:36:15.531+08:00| 1531604122307244742|           1531604122307244742|                             null|
|2020-12-10T20:36:15.532+08:00|-7426070874923281101|                          null|                             null|
|2020-12-10T20:36:15.533+08:00|-7162825364312197604|          -7162825364312197604|                             null|
|2020-12-10T20:36:15.534+08:00|-8581625725655917595|                          null|             -8581625725655917595|
|2020-12-10T20:36:15.535+08:00|-7667364751255535391|                          null|             -7667364751255535391|
+-----------------------------+--------------------+------------------------------+---------------------------------+
Total line number = 5
It costs 0.006s

Continuous Interval Functions

The continuous interval functions are used to query all continuous intervals that meet specified conditions.
They can be divided into two categories according to return value:

  1. Returns the start timestamp and time span of the continuous interval that meets the conditions (a time span of 0 means that only the start time point meets the conditions)
  2. Returns the start timestamp of the continuous interval that meets the condition and the number of points in the interval (a number of 1 means that only the start time point meets the conditions)
Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
ZERO_DURATIONINT32/ INT64/ FLOAT/ DOUBLE/ BOOLEANmin:Optional with default value 0L
max:Optional with default value Long.MAX_VALUE
LongReturn intervals' start times and duration times in which the value is always 0(false), and the duration time t satisfy t >= min && t <= max. The unit of t is ms
NON_ZERO_DURATIONINT32/ INT64/ FLOAT/ DOUBLE/ BOOLEANmin:Optional with default value 0L
max:Optional with default value Long.MAX_VALUE
LongReturn intervals' start times and duration times in which the value is always not 0, and the duration time t satisfy t >= min && t <= max. The unit of t is ms
ZERO_COUNTINT32/ INT64/ FLOAT/ DOUBLE/ BOOLEANmin:Optional with default value 1L
max:Optional with default value Long.MAX_VALUE
LongReturn intervals' start times and the number of data points in the interval in which the value is always 0(false). Data points number n satisfy n >= min && n <= max
NON_ZERO_COUNTINT32/ INT64/ FLOAT/ DOUBLE/ BOOLEANmin:Optional with default value 1L
max:Optional with default value Long.MAX_VALUE
LongReturn intervals' start times and the number of data points in the interval in which the value is always not 0(false). Data points number n satisfy n >= min && n <= max

Demonstrate

Example data:

IoTDB> select s1,s2,s3,s4,s5 from root.sg.d2;
+-----------------------------+-------------+-------------+-------------+-------------+-------------+
|                         Time|root.sg.d2.s1|root.sg.d2.s2|root.sg.d2.s3|root.sg.d2.s4|root.sg.d2.s5|
+-----------------------------+-------------+-------------+-------------+-------------+-------------+
|1970-01-01T08:00:00.000+08:00|            0|            0|          0.0|          0.0|        false|
|1970-01-01T08:00:00.001+08:00|            1|            1|          1.0|          1.0|         true|
|1970-01-01T08:00:00.002+08:00|            1|            1|          1.0|          1.0|         true|
|1970-01-01T08:00:00.003+08:00|            0|            0|          0.0|          0.0|        false|
|1970-01-01T08:00:00.004+08:00|            1|            1|          1.0|          1.0|         true|
|1970-01-01T08:00:00.005+08:00|            0|            0|          0.0|          0.0|        false|
|1970-01-01T08:00:00.006+08:00|            0|            0|          0.0|          0.0|        false|
|1970-01-01T08:00:00.007+08:00|            1|            1|          1.0|          1.0|         true|
+-----------------------------+-------------+-------------+-------------+-------------+-------------+

Sql:

select s1, zero_count(s1), non_zero_count(s2), zero_duration(s3), non_zero_duration(s4) from root.sg.d2;

Result:

+-----------------------------+-------------+-------------------------+-----------------------------+----------------------------+--------------------------------+
|                         Time|root.sg.d2.s1|zero_count(root.sg.d2.s1)|non_zero_count(root.sg.d2.s2)|zero_duration(root.sg.d2.s3)|non_zero_duration(root.sg.d2.s4)|
+-----------------------------+-------------+-------------------------+-----------------------------+----------------------------+--------------------------------+
|1970-01-01T08:00:00.000+08:00|            0|                        1|                         null|                           0|                            null|
|1970-01-01T08:00:00.001+08:00|            1|                     null|                            2|                        null|                               1|
|1970-01-01T08:00:00.002+08:00|            1|                     null|                         null|                        null|                            null|
|1970-01-01T08:00:00.003+08:00|            0|                        1|                         null|                           0|                            null|
|1970-01-01T08:00:00.004+08:00|            1|                     null|                            1|                        null|                               0|
|1970-01-01T08:00:00.005+08:00|            0|                        2|                         null|                           1|                            null|
|1970-01-01T08:00:00.006+08:00|            0|                     null|                         null|                        null|                            null|
|1970-01-01T08:00:00.007+08:00|            1|                     null|                            1|                        null|                               0|
+-----------------------------+-------------+-------------------------+-----------------------------+----------------------------+--------------------------------+

Variation Trend Calculation Functions

Currently, IoTDB supports the following variation trend calculation functions:

Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
TIME_DIFFERENCEINT32 / INT64 / FLOAT / DOUBLE / BOOLEAN / TEXT/INT64Calculates the difference between the time stamp of a data point and the time stamp of the previous data point. There is no corresponding output for the first data point.
DIFFERENCEINT32 / INT64 / FLOAT / DOUBLE/Same type as the input seriesCalculates the difference between the value of a data point and the value of the previous data point. There is no corresponding output for the first data point.
NON_NEGATIVE_DIFFERENCEINT32 / INT64 / FLOAT / DOUBLE/Same type as the input seriesCalculates the absolute value of the difference between the value of a data point and the value of the previous data point. There is no corresponding output for the first data point.
DERIVATIVEINT32 / INT64 / FLOAT / DOUBLE/DOUBLECalculates the rate of change of a data point compared to the previous data point, the result is equals to DIFFERENCE / TIME_DIFFERENCE. There is no corresponding output for the first data point.
NON_NEGATIVE_DERIVATIVEINT32 / INT64 / FLOAT / DOUBLE/DOUBLECalculates the absolute value of the rate of change of a data point compared to the previous data point, the result is equals to NON_NEGATIVE_DIFFERENCE / TIME_DIFFERENCE. There is no corresponding output for the first data point.
DIFFINT32 / INT64 / FLOAT / DOUBLEignoreNull:optional,default is true. If is true, the previous data point is ignored when it is null and continues to find the first non-null value forwardly. If the value is false, previous data point is not ignored when it is null, the result is also null because null is used for subtractionDOUBLECalculates the difference between the value of a data point and the value of the previous data point. There is no corresponding output for the first data point, so output is null

Example:

select s1, time_difference(s1), difference(s1), non_negative_difference(s1), derivative(s1), non_negative_derivative(s1) from root.sg1.d1 limit 5 offset 1000; 

Result:

+-----------------------------+-------------------+-------------------------------+--------------------------+---------------------------------------+--------------------------+---------------------------------------+
|                         Time|     root.sg1.d1.s1|time_difference(root.sg1.d1.s1)|difference(root.sg1.d1.s1)|non_negative_difference(root.sg1.d1.s1)|derivative(root.sg1.d1.s1)|non_negative_derivative(root.sg1.d1.s1)|
+-----------------------------+-------------------+-------------------------------+--------------------------+---------------------------------------+--------------------------+---------------------------------------+
|2020-12-10T17:11:49.037+08:00|7360723084922759782|                              1|      -8431715764844238876|                    8431715764844238876|    -8.4317157648442388E18|                  8.4317157648442388E18|
|2020-12-10T17:11:49.038+08:00|4377791063319964531|                              1|      -2982932021602795251|                    2982932021602795251|     -2.982932021602795E18|                   2.982932021602795E18|
|2020-12-10T17:11:49.039+08:00|7972485567734642915|                              1|       3594694504414678384|                    3594694504414678384|     3.5946945044146785E18|                  3.5946945044146785E18|
|2020-12-10T17:11:49.040+08:00|2508858212791964081|                              1|      -5463627354942678834|                    5463627354942678834|     -5.463627354942679E18|                   5.463627354942679E18|
|2020-12-10T17:11:49.041+08:00|2817297431185141819|                              1|        308439218393177738|                     308439218393177738|     3.0843921839317773E17|                  3.0843921839317773E17|
+-----------------------------+-------------------+-------------------------------+--------------------------+---------------------------------------+--------------------------+---------------------------------------+
Total line number = 5
It costs 0.014s

Example

RawData

+-----------------------------+------------+------------+
|                         Time|root.test.s1|root.test.s2|
+-----------------------------+------------+------------+
|1970-01-01T08:00:00.001+08:00|           1|         1.0|
|1970-01-01T08:00:00.002+08:00|           2|        null|
|1970-01-01T08:00:00.003+08:00|        null|         3.0|
|1970-01-01T08:00:00.004+08:00|           4|        null|
|1970-01-01T08:00:00.005+08:00|           5|         5.0|
|1970-01-01T08:00:00.006+08:00|        null|         6.0|
+-----------------------------+------------+------------+

Not use ignoreNull attribute (Ignore Null)

SQL:

SELECT DIFF(s1), DIFF(s2) from root.test;

Result:

+-----------------------------+------------------+------------------+
|                         Time|DIFF(root.test.s1)|DIFF(root.test.s2)|
+-----------------------------+------------------+------------------+
|1970-01-01T08:00:00.001+08:00|              null|              null|
|1970-01-01T08:00:00.002+08:00|               1.0|              null|
|1970-01-01T08:00:00.003+08:00|              null|               2.0|
|1970-01-01T08:00:00.004+08:00|               2.0|              null|
|1970-01-01T08:00:00.005+08:00|               1.0|               2.0|
|1970-01-01T08:00:00.006+08:00|              null|               1.0|
+-----------------------------+------------------+------------------+

Use ignoreNull attribute

SQL:

SELECT DIFF(s1, 'ignoreNull'='false'), DIFF(s2, 'ignoreNull'='false') from root.test;

Result:

+-----------------------------+----------------------------------------+----------------------------------------+
|                         Time|DIFF(root.test.s1, "ignoreNull"="false")|DIFF(root.test.s2, "ignoreNull"="false")|
+-----------------------------+----------------------------------------+----------------------------------------+
|1970-01-01T08:00:00.001+08:00|                                    null|                                    null|
|1970-01-01T08:00:00.002+08:00|                                     1.0|                                    null|
|1970-01-01T08:00:00.003+08:00|                                    null|                                    null|
|1970-01-01T08:00:00.004+08:00|                                    null|                                    null|
|1970-01-01T08:00:00.005+08:00|                                     1.0|                                    null|
|1970-01-01T08:00:00.006+08:00|                                    null|                                     1.0|
+-----------------------------+----------------------------------------+----------------------------------------+

Sample Functions

Equal Size Bucket Sample Function

This function samples the input sequence in equal size buckets, that is, according to the downsampling ratio and downsampling method given by the user, the input sequence is equally divided into several buckets according to a fixed number of points. Sampling by the given sampling method within each bucket.

  • proportion: sample ratio, the value range is (0, 1].

Equal Size Bucket Random Sample

Random sampling is performed on the equally divided buckets.

Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
EQUAL_SIZE_BUCKET_RANDOM_SAMPLEINT32 / INT64 / FLOAT / DOUBLEproportion The value range is (0, 1], the default is 0.1INT32 / INT64 / FLOAT / DOUBLEReturns a random sample of equal buckets that matches the sampling ratio
Example

Example data: root.ln.wf01.wt01.temperature has a total of 100 ordered data from 0.0-99.0.

IoTDB> select temperature from root.ln.wf01.wt01;
+-----------------------------+-----------------------------+
|                         Time|root.ln.wf01.wt01.temperature|
+-----------------------------+-----------------------------+
|1970-01-01T08:00:00.000+08:00|                          0.0|
|1970-01-01T08:00:00.001+08:00|                          1.0|
|1970-01-01T08:00:00.002+08:00|                          2.0|
|1970-01-01T08:00:00.003+08:00|                          3.0|
|1970-01-01T08:00:00.004+08:00|                          4.0|
|1970-01-01T08:00:00.005+08:00|                          5.0|
|1970-01-01T08:00:00.006+08:00|                          6.0|
|1970-01-01T08:00:00.007+08:00|                          7.0|
|1970-01-01T08:00:00.008+08:00|                          8.0|
|1970-01-01T08:00:00.009+08:00|                          9.0|
|1970-01-01T08:00:00.010+08:00|                         10.0|
|1970-01-01T08:00:00.011+08:00|                         11.0|
|1970-01-01T08:00:00.012+08:00|                         12.0|
|.............................|.............................|            
|1970-01-01T08:00:00.089+08:00|                         89.0|
|1970-01-01T08:00:00.090+08:00|                         90.0|
|1970-01-01T08:00:00.091+08:00|                         91.0|
|1970-01-01T08:00:00.092+08:00|                         92.0|
|1970-01-01T08:00:00.093+08:00|                         93.0|
|1970-01-01T08:00:00.094+08:00|                         94.0|
|1970-01-01T08:00:00.095+08:00|                         95.0|
|1970-01-01T08:00:00.096+08:00|                         96.0|
|1970-01-01T08:00:00.097+08:00|                         97.0|
|1970-01-01T08:00:00.098+08:00|                         98.0|
|1970-01-01T08:00:00.099+08:00|                         99.0|
+-----------------------------+-----------------------------+

Sql:

select equal_size_bucket_random_sample(temperature,'proportion'='0.1') as random_sample from root.ln.wf01.wt01;

Result:

+-----------------------------+-------------+
|                         Time|random_sample|
+-----------------------------+-------------+
|1970-01-01T08:00:00.007+08:00|          7.0|
|1970-01-01T08:00:00.014+08:00|         14.0|
|1970-01-01T08:00:00.020+08:00|         20.0|
|1970-01-01T08:00:00.035+08:00|         35.0|
|1970-01-01T08:00:00.047+08:00|         47.0|
|1970-01-01T08:00:00.059+08:00|         59.0|
|1970-01-01T08:00:00.063+08:00|         63.0|
|1970-01-01T08:00:00.079+08:00|         79.0|
|1970-01-01T08:00:00.086+08:00|         86.0|
|1970-01-01T08:00:00.096+08:00|         96.0|
+-----------------------------+-------------+
Total line number = 10
It costs 0.024s

Equal Size Bucket Aggregation Sample

The input sequence is sampled by the aggregation sampling method, and the user needs to provide an additional aggregation function parameter, namely

  • type: Aggregate type, which can be avg or max or min or sum or extreme or variance. By default, avg is used. extreme represents the value with the largest absolute value in the equal bucket. variance represents the variance in the sampling equal buckets.

The timestamp of the sampling output of each bucket is the timestamp of the first point of the bucket.

Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
EQUAL_SIZE_BUCKET_AGG_SAMPLEINT32 / INT64 / FLOAT / DOUBLEproportion The value range is (0, 1], the default is 0.1
type: The value types are avg, max, min, sum, extreme, variance, the default is avg
INT32 / INT64 / FLOAT / DOUBLEReturns equal bucket aggregation samples that match the sampling ratio
Example

Example data: root.ln.wf01.wt01.temperature has a total of 100 ordered data from 0.0-99.0, and the test data is randomly sampled in equal buckets.

Sql:

select equal_size_bucket_agg_sample(temperature, 'type'='avg','proportion'='0.1') as agg_avg, equal_size_bucket_agg_sample(temperature, 'type'='max','proportion'='0.1') as agg_max, equal_size_bucket_agg_sample(temperature,'type'='min','proportion'='0.1') as agg_min, equal_size_bucket_agg_sample(temperature, 'type'='sum','proportion'='0.1') as agg_sum, equal_size_bucket_agg_sample(temperature, 'type'='extreme','proportion'='0.1') as agg_extreme, equal_size_bucket_agg_sample(temperature, 'type'='variance','proportion'='0.1') as agg_variance from root.ln.wf01.wt01;

Result:

+-----------------------------+-----------------+-------+-------+-------+-----------+------------+
|                         Time|          agg_avg|agg_max|agg_min|agg_sum|agg_extreme|agg_variance|
+-----------------------------+-----------------+-------+-------+-------+-----------+------------+
|1970-01-01T08:00:00.000+08:00|              4.5|    9.0|    0.0|   45.0|        9.0|        8.25|
|1970-01-01T08:00:00.010+08:00|             14.5|   19.0|   10.0|  145.0|       19.0|        8.25|
|1970-01-01T08:00:00.020+08:00|             24.5|   29.0|   20.0|  245.0|       29.0|        8.25|
|1970-01-01T08:00:00.030+08:00|             34.5|   39.0|   30.0|  345.0|       39.0|        8.25|
|1970-01-01T08:00:00.040+08:00|             44.5|   49.0|   40.0|  445.0|       49.0|        8.25|
|1970-01-01T08:00:00.050+08:00|             54.5|   59.0|   50.0|  545.0|       59.0|        8.25|
|1970-01-01T08:00:00.060+08:00|             64.5|   69.0|   60.0|  645.0|       69.0|        8.25|
|1970-01-01T08:00:00.070+08:00|74.50000000000001|   79.0|   70.0|  745.0|       79.0|        8.25|
|1970-01-01T08:00:00.080+08:00|             84.5|   89.0|   80.0|  845.0|       89.0|        8.25|
|1970-01-01T08:00:00.090+08:00|             94.5|   99.0|   90.0|  945.0|       99.0|        8.25|
+-----------------------------+-----------------+-------+-------+-------+-----------+------------+
Total line number = 10
It costs 0.044s

Equal Size Bucket M4 Sample

The input sequence is sampled using the M4 sampling method. That is to sample the head, tail, min and max values for each bucket.

Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
EQUAL_SIZE_BUCKET_M4_SAMPLEINT32 / INT64 / FLOAT / DOUBLEproportion The value range is (0, 1], the default is 0.1INT32 / INT64 / FLOAT / DOUBLEReturns equal bucket M4 samples that match the sampling ratio
Example

Example data: root.ln.wf01.wt01.temperature has a total of 100 ordered data from 0.0-99.0, and the test data is randomly sampled in equal buckets.

Sql:

select equal_size_bucket_m4_sample(temperature, 'proportion'='0.1') as M4_sample from root.ln.wf01.wt01;

Result:

+-----------------------------+---------+
|                         Time|M4_sample|
+-----------------------------+---------+
|1970-01-01T08:00:00.000+08:00|      0.0|
|1970-01-01T08:00:00.001+08:00|      1.0|
|1970-01-01T08:00:00.038+08:00|     38.0|
|1970-01-01T08:00:00.039+08:00|     39.0|
|1970-01-01T08:00:00.040+08:00|     40.0|
|1970-01-01T08:00:00.041+08:00|     41.0|
|1970-01-01T08:00:00.078+08:00|     78.0|
|1970-01-01T08:00:00.079+08:00|     79.0|
|1970-01-01T08:00:00.080+08:00|     80.0|
|1970-01-01T08:00:00.081+08:00|     81.0|
|1970-01-01T08:00:00.098+08:00|     98.0|
|1970-01-01T08:00:00.099+08:00|     99.0|
+-----------------------------+---------+
Total line number = 12
It costs 0.065s

Equal Size Bucket Outlier Sample

This function samples the input sequence with equal number of bucket outliers, that is, according to the downsampling ratio given by the user and the number of samples in the bucket, the input sequence is divided into several buckets according to a fixed number of points. Sampling by the given outlier sampling method within each bucket.

Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
EQUAL_SIZE_BUCKET_OUTLIER_SAMPLEINT32 / INT64 / FLOAT / DOUBLEThe value range of proportion is (0, 1], the default is 0.1
The value of type is avg or stendis or cos or prenextdis, the default is avg
The value of number should be greater than 0, the default is 3
INT32 / INT64 / FLOAT / DOUBLEReturns outlier samples in equal buckets that match the sampling ratio and the number of samples in the bucket

Parameter Description

  • proportion: sampling ratio
  • number: the number of samples in each bucket, default 3
  • type: outlier sampling method, the value is
    • avg: Take the average of the data points in the bucket, and find the top number farthest from the average according to the sampling ratio
    • stendis: Take the vertical distance between each data point in the bucket and the first and last data points of the bucket to form a straight line, and according to the sampling ratio, find the top number with the largest distance
    • cos: Set a data point in the bucket as b, the data point on the left of b as a, and the data point on the right of b as c, then take the cosine value of the angle between the ab and bc vectors. The larger the angle, the more likely it is an outlier. Find the top number with the smallest cos value
    • prenextdis: Let a data point in the bucket be b, the data point to the left of b is a, and the data point to the right of b is c, then take the sum of the lengths of ab and bc as the yardstick, the larger the sum, the more likely it is to be an outlier, and find the top number with the largest sum value
Example

Example data: root.ln.wf01.wt01.temperature has a total of 100 ordered data from 0.0-99.0. Among them, in order to add outliers, we make the number modulo 5 equal to 0 increment by 100.

IoTDB> select temperature from root.ln.wf01.wt01;
+-----------------------------+-----------------------------+
|                         Time|root.ln.wf01.wt01.temperature|
+-----------------------------+-----------------------------+
|1970-01-01T08:00:00.000+08:00|                          0.0|
|1970-01-01T08:00:00.001+08:00|                          1.0|
|1970-01-01T08:00:00.002+08:00|                          2.0|
|1970-01-01T08:00:00.003+08:00|                          3.0|
|1970-01-01T08:00:00.004+08:00|                          4.0|
|1970-01-01T08:00:00.005+08:00|                        105.0|
|1970-01-01T08:00:00.006+08:00|                          6.0|
|1970-01-01T08:00:00.007+08:00|                          7.0|
|1970-01-01T08:00:00.008+08:00|                          8.0|
|1970-01-01T08:00:00.009+08:00|                          9.0|
|1970-01-01T08:00:00.010+08:00|                         10.0|
|1970-01-01T08:00:00.011+08:00|                         11.0|
|1970-01-01T08:00:00.012+08:00|                         12.0|
|1970-01-01T08:00:00.013+08:00|                         13.0|
|1970-01-01T08:00:00.014+08:00|                         14.0|
|1970-01-01T08:00:00.015+08:00|                        115.0|
|1970-01-01T08:00:00.016+08:00|                         16.0|
|.............................|.............................|
|1970-01-01T08:00:00.092+08:00|                         92.0|
|1970-01-01T08:00:00.093+08:00|                         93.0|
|1970-01-01T08:00:00.094+08:00|                         94.0|
|1970-01-01T08:00:00.095+08:00|                        195.0|
|1970-01-01T08:00:00.096+08:00|                         96.0|
|1970-01-01T08:00:00.097+08:00|                         97.0|
|1970-01-01T08:00:00.098+08:00|                         98.0|
|1970-01-01T08:00:00.099+08:00|                         99.0|
+-----------------------------+-----------------------------+

Sql:

select equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='avg', 'number'='2') as outlier_avg_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='stendis', 'number'='2') as outlier_stendis_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='cos', 'number'='2') as outlier_cos_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='prenextdis', 'number'='2') as outlier_prenextdis_sample from root.ln.wf01.wt01;

Result:

+-----------------------------+------------------+----------------------+------------------+-------------------------+
|                         Time|outlier_avg_sample|outlier_stendis_sample|outlier_cos_sample|outlier_prenextdis_sample|
+-----------------------------+------------------+----------------------+------------------+-------------------------+
|1970-01-01T08:00:00.005+08:00|             105.0|                 105.0|             105.0|                    105.0|
|1970-01-01T08:00:00.015+08:00|             115.0|                 115.0|             115.0|                    115.0|
|1970-01-01T08:00:00.025+08:00|             125.0|                 125.0|             125.0|                    125.0|
|1970-01-01T08:00:00.035+08:00|             135.0|                 135.0|             135.0|                    135.0|
|1970-01-01T08:00:00.045+08:00|             145.0|                 145.0|             145.0|                    145.0|
|1970-01-01T08:00:00.055+08:00|             155.0|                 155.0|             155.0|                    155.0|
|1970-01-01T08:00:00.065+08:00|             165.0|                 165.0|             165.0|                    165.0|
|1970-01-01T08:00:00.075+08:00|             175.0|                 175.0|             175.0|                    175.0|
|1970-01-01T08:00:00.085+08:00|             185.0|                 185.0|             185.0|                    185.0|
|1970-01-01T08:00:00.095+08:00|             195.0|                 195.0|             195.0|                    195.0|
+-----------------------------+------------------+----------------------+------------------+-------------------------+
Total line number = 10
It costs 0.041s

M4 Function

M4 is used to sample the first, last, bottom, top points for each sliding window:

  • the first point is the point with the minimal time;
  • the last point is the point with the maximal time;
  • the bottom point is the point with the minimal value (if there are multiple such points, M4 returns one of them);
  • the top point is the point with the maximal value (if there are multiple such points, M4 returns one of them).
image
Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeDescription
M4INT32 / INT64 / FLOAT / DOUBLEDifferent attributes used by the size window and the time window. The size window uses attributes windowSize and slidingStep. The time window uses attributes timeInterval, slidingStep, displayWindowBegin, and displayWindowEnd. More details see below.INT32 / INT64 / FLOAT / DOUBLEReturns the first, last, bottom, top points in each sliding window. M4 sorts and deduplicates the aggregated points within the window before outputting them.

Attributes

(1) Attributes for the size window:

  • windowSize: The number of points in a window. Int data type. Required.
  • slidingStep: Slide a window by the number of points. Int data type. Optional. If not set, default to the same as windowSize.
image

(2) Attributes for the time window:

  • timeInterval: The time interval length of a window. Long data type. Required.
  • slidingStep: Slide a window by the time length. Long data type. Optional. If not set, default to the same as timeInterval.
  • displayWindowBegin: The starting position of the window (included). Long data type. Optional. If not set, default to Long.MIN_VALUE, meaning using the time of the first data point of the input time series as the starting position of the window.
  • displayWindowEnd: End time limit (excluded, essentially playing the same role as WHERE time < displayWindowEnd). Long data type. Optional. If not set, default to Long.MAX_VALUE, meaning there is no additional end time limit other than the end of the input time series itself.
groupBy window

Examples

Input series:

+-----------------------------+------------------+
|                         Time|root.vehicle.d1.s1|
+-----------------------------+------------------+
|1970-01-01T08:00:00.001+08:00|               5.0|
|1970-01-01T08:00:00.002+08:00|              15.0|
|1970-01-01T08:00:00.005+08:00|              10.0|
|1970-01-01T08:00:00.008+08:00|               8.0|
|1970-01-01T08:00:00.010+08:00|              30.0|
|1970-01-01T08:00:00.020+08:00|              20.0|
|1970-01-01T08:00:00.025+08:00|               8.0|
|1970-01-01T08:00:00.027+08:00|              20.0|
|1970-01-01T08:00:00.030+08:00|              40.0|
|1970-01-01T08:00:00.033+08:00|               9.0|
|1970-01-01T08:00:00.035+08:00|              10.0|
|1970-01-01T08:00:00.040+08:00|              20.0|
|1970-01-01T08:00:00.045+08:00|              30.0|
|1970-01-01T08:00:00.052+08:00|               8.0|
|1970-01-01T08:00:00.054+08:00|              18.0|
+-----------------------------+------------------+

SQL for query1:

select M4(s1,'timeInterval'='25','displayWindowBegin'='0','displayWindowEnd'='100') from root.vehicle.d1

Output1:

+-----------------------------+-----------------------------------------------------------------------------------------------+
|                         Time|M4(root.vehicle.d1.s1, "timeInterval"="25", "displayWindowBegin"="0", "displayWindowEnd"="100")|
+-----------------------------+-----------------------------------------------------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|                                                                                            5.0|
|1970-01-01T08:00:00.010+08:00|                                                                                           30.0|
|1970-01-01T08:00:00.020+08:00|                                                                                           20.0|
|1970-01-01T08:00:00.025+08:00|                                                                                            8.0|
|1970-01-01T08:00:00.030+08:00|                                                                                           40.0|
|1970-01-01T08:00:00.045+08:00|                                                                                           30.0|
|1970-01-01T08:00:00.052+08:00|                                                                                            8.0|
|1970-01-01T08:00:00.054+08:00|                                                                                           18.0|
+-----------------------------+-----------------------------------------------------------------------------------------------+
Total line number = 8

SQL for query2:

select M4(s1,'windowSize'='10') from root.vehicle.d1

Output2:

+-----------------------------+-----------------------------------------+
|                         Time|M4(root.vehicle.d1.s1, "windowSize"="10")|
+-----------------------------+-----------------------------------------+
|1970-01-01T08:00:00.001+08:00|                                      5.0|
|1970-01-01T08:00:00.030+08:00|                                     40.0|
|1970-01-01T08:00:00.033+08:00|                                      9.0|
|1970-01-01T08:00:00.035+08:00|                                     10.0|
|1970-01-01T08:00:00.045+08:00|                                     30.0|
|1970-01-01T08:00:00.052+08:00|                                      8.0|
|1970-01-01T08:00:00.054+08:00|                                     18.0|
+-----------------------------+-----------------------------------------+
Total line number = 7

Suggested Use Cases

(1) Use Case: Extreme-point-preserving downsampling

As M4 aggregation selects the first, last, bottom, top points for each window, M4 usually preserves extreme points and thus patterns better than other downsampling methods such as Piecewise Aggregate Approximation (PAA). Therefore, if you want to downsample the time series while preserving extreme points, you may give M4 a try.

(2) Use case: Error-free two-color line chart visualization of large-scale time series through M4 downsampling

Referring to paper "M4: A Visualization-Oriented Time Series Data Aggregation"open in new window, M4 is a downsampling method to facilitate large-scale time series visualization without deforming the shape in terms of a two-color line chart.

Given a chart of w*h pixels, suppose that the visualization time range of the time series is [tqs,tqe) and (tqe-tqs) is divisible by w, the points that fall within the i-th time span Ii=[tqs+(tqe-tqs)/w*(i-1),tqs+(tqe-tqs)/w*i) will be drawn on the i-th pixel column, i=1,2,...,w. Therefore, from a visualization-driven perspective, use the sql: "select M4(s1,'timeInterval'='(tqe-tqs)/w','displayWindowBegin'='tqs','displayWindowEnd'='tqe') from root.vehicle.d1" to sample the first, last, bottom, top points for each time span. The resulting downsampled time series has no more than 4*w points, a big reduction compared to the original large-scale time series. Meanwhile, the two-color line chart drawn from the reduced data is identical that to that drawn from the original data (pixel-level consistency).

To eliminate the hassle of hardcoding parameters, we recommend the following usage of Grafana's template variableopen in new window $__interval_ms when Grafana is used for visualization:

select M4(s1,'timeInterval'='$__interval_ms') from root.sg1.d1

where timeInterval is set as (tqe-tqs)/w automatically. Note that the time precision here is assumed to be milliseconds.

Comparison with Other Functions

SQLWhether support M4 aggregationSliding window typeExampleDocs
1. native built-in aggregate functions with Group By clauseNo. Lack BOTTOM_TIME and TOP_TIME, which are respectively the time of the points that have the mininum and maximum value.Time Windowselect count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d)https://iotdb.apache.org/UserGuide/Master/Query-Data/Aggregate-Query.html#built-in-aggregate-functionsopen in new window
https://iotdb.apache.org/UserGuide/Master/Query-Data/Aggregate-Query.html#downsampling-aggregate-queryopen in new window
2. EQUAL_SIZE_BUCKET_M4_SAMPLE (built-in UDF)Yes*Size Window. windowSize = 4*(int)(1/proportion)select equal_size_bucket_m4_sample(temperature, 'proportion'='0.1') as M4_sample from root.ln.wf01.wt01https://iotdb.apache.org/UserGuide/Master/Query-Data/Select-Expression.html#time-series-generating-functionsopen in new window
3. M4 (built-in UDF)Yes*Size Window, Time Window(1) Size Window: select M4(s1,'windowSize'='10') from root.vehicle.d1
(2) Time Window: select M4(s1,'timeInterval'='25','displayWindowBegin'='0','displayWindowEnd'='100') from root.vehicle.d1
refer to this doc
4. extend native built-in aggregate functions with Group By clause to support M4 aggregationnot implementednot implementednot implementednot implemented

Further compare EQUAL_SIZE_BUCKET_M4_SAMPLE and M4:

(1) Different M4 aggregation definition:

For each window, EQUAL_SIZE_BUCKET_M4_SAMPLE extracts the top and bottom points from points EXCLUDING the first and last points.

In contrast, M4 extracts the top and bottom points from points INCLUDING the first and last points, which is more consistent with the semantics of max_value and min_value stored in metadata.

It is worth noting that both functions sort and deduplicate the aggregated points in a window before outputting them to the collectors.

(2) Different sliding windows:

EQUAL_SIZE_BUCKET_M4_SAMPLE uses SlidingSizeWindowAccessStrategy and indirectly controls sliding window size by sampling proportion. The conversion formula is windowSize = 4*(int)(1/proportion).

M4 supports two types of sliding window: SlidingSizeWindowAccessStrategy and SlidingTimeWindowAccessStrategy. M4 directly controls the window point size or time length using corresponding parameters.

Time Series Processing

CHANGE_POINTS

Usage

This function is used to remove consecutive identical values from an input sequence.
For example, input:1,1,2,2,3 output:1,2,3.

Name: CHANGE_POINTS

Input Series: Support only one input series.

Parameters: No parameters.

Example

Raw data:

+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
|                         Time|root.testChangePoints.d1.s1|root.testChangePoints.d1.s2|root.testChangePoints.d1.s3|root.testChangePoints.d1.s4|root.testChangePoints.d1.s5|root.testChangePoints.d1.s6|
+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+
|1970-01-01T08:00:00.001+08:00|                       true|                          1|                          1|                        1.0|                        1.0|                     1test1|
|1970-01-01T08:00:00.002+08:00|                       true|                          2|                          2|                        2.0|                        1.0|                     2test2|
|1970-01-01T08:00:00.003+08:00|                      false|                          1|                          2|                        1.0|                        1.0|                     2test2|
|1970-01-01T08:00:00.004+08:00|                       true|                          1|                          3|                        1.0|                        1.0|                     1test1|
|1970-01-01T08:00:00.005+08:00|                       true|                          1|                          3|                        1.0|                        1.0|                     1test1|
+-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+---------------------------+

SQL for query:

select change_points(s1), change_points(s2), change_points(s3), change_points(s4), change_points(s5), change_points(s6) from root.testChangePoints.d1

Output series:

+-----------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+
|                         Time|change_points(root.testChangePoints.d1.s1)|change_points(root.testChangePoints.d1.s2)|change_points(root.testChangePoints.d1.s3)|change_points(root.testChangePoints.d1.s4)|change_points(root.testChangePoints.d1.s5)|change_points(root.testChangePoints.d1.s6)|
+-----------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+
|1970-01-01T08:00:00.001+08:00|                                      true|                                         1|                                         1|                                       1.0|                                       1.0|                                    1test1|
|1970-01-01T08:00:00.002+08:00|                                      null|                                         2|                                         2|                                       2.0|                                      null|                                    2test2|
|1970-01-01T08:00:00.003+08:00|                                     false|                                         1|                                      null|                                       1.0|                                      null|                                      null|
|1970-01-01T08:00:00.004+08:00|                                      true|                                      null|                                         3|                                      null|                                      null|                                    1test1|
+-----------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+------------------------------------------+

Lambda Expression

JEXL Function

Java Expression Language (JEXL) is an expression language engine. We use JEXL to extend UDFs, which are implemented on the command line with simple lambda expressions. See the link for operators supported in jexl lambda expressionsopen in new window.

Function NameAllowed Input Series Data TypesRequired AttributesOutput Series Data TypeSeries Data Type Description
JEXLINT32 / INT64 / FLOAT / DOUBLE / TEXT / BOOLEANexpr is a lambda expression that supports standard one or multi arguments in the form x -> {...} or (x, y, z) -> {...}, e.g. x -> {x * 2}, (x, y, z) -> {x + y * z}INT32 / INT64 / FLOAT / DOUBLE / TEXT / BOOLEANReturns the input time series transformed by a lambda expression
Demonstrate

Example data: root.ln.wf01.wt01.temperature, root.ln.wf01.wt01.st, root.ln.wf01.wt01.str a total of 11 data.

IoTDB> select * from root.ln.wf01.wt01;
+-----------------------------+---------------------+--------------------+-----------------------------+
|                         Time|root.ln.wf01.wt01.str|root.ln.wf01.wt01.st|root.ln.wf01.wt01.temperature|
+-----------------------------+---------------------+--------------------+-----------------------------+
|1970-01-01T08:00:00.000+08:00|                  str|                10.0|                          0.0|
|1970-01-01T08:00:00.001+08:00|                  str|                20.0|                          1.0|
|1970-01-01T08:00:00.002+08:00|                  str|                30.0|                          2.0|
|1970-01-01T08:00:00.003+08:00|                  str|                40.0|                          3.0|
|1970-01-01T08:00:00.004+08:00|                  str|                50.0|                          4.0|
|1970-01-01T08:00:00.005+08:00|                  str|                60.0|                          5.0|
|1970-01-01T08:00:00.006+08:00|                  str|                70.0|                          6.0|
|1970-01-01T08:00:00.007+08:00|                  str|                80.0|                          7.0|
|1970-01-01T08:00:00.008+08:00|                  str|                90.0|                          8.0|
|1970-01-01T08:00:00.009+08:00|                  str|               100.0|                          9.0|
|1970-01-01T08:00:00.010+08:00|                  str|               110.0|                         10.0|
+-----------------------------+---------------------+--------------------+-----------------------------+

Sql:

select jexl(temperature, 'expr'='x -> {x + x}') as jexl1, jexl(temperature, 'expr'='x -> {x * 3}') as jexl2, jexl(temperature, 'expr'='x -> {x * x}') as jexl3, jexl(temperature, 'expr'='x -> {multiply(x, 100)}') as jexl4, jexl(temperature, st, 'expr'='(x, y) -> {x + y}') as jexl5, jexl(temperature, st, str, 'expr'='(x, y, z) -> {x + y + z}') as jexl6 from root.ln.wf01.wt01;```

Result:

+-----------------------------+-----+-----+-----+------+-----+--------+
|                         Time|jexl1|jexl2|jexl3| jexl4|jexl5|   jexl6|
+-----------------------------+-----+-----+-----+------+-----+--------+
|1970-01-01T08:00:00.000+08:00|  0.0|  0.0|  0.0|   0.0| 10.0| 10.0str|
|1970-01-01T08:00:00.001+08:00|  2.0|  3.0|  1.0| 100.0| 21.0| 21.0str|
|1970-01-01T08:00:00.002+08:00|  4.0|  6.0|  4.0| 200.0| 32.0| 32.0str|
|1970-01-01T08:00:00.003+08:00|  6.0|  9.0|  9.0| 300.0| 43.0| 43.0str|
|1970-01-01T08:00:00.004+08:00|  8.0| 12.0| 16.0| 400.0| 54.0| 54.0str|
|1970-01-01T08:00:00.005+08:00| 10.0| 15.0| 25.0| 500.0| 65.0| 65.0str|
|1970-01-01T08:00:00.006+08:00| 12.0| 18.0| 36.0| 600.0| 76.0| 76.0str|
|1970-01-01T08:00:00.007+08:00| 14.0| 21.0| 49.0| 700.0| 87.0| 87.0str|
|1970-01-01T08:00:00.008+08:00| 16.0| 24.0| 64.0| 800.0| 98.0| 98.0str|
|1970-01-01T08:00:00.009+08:00| 18.0| 27.0| 81.0| 900.0|109.0|109.0str|
|1970-01-01T08:00:00.010+08:00| 20.0| 30.0|100.0|1000.0|120.0|120.0str|
+-----------------------------+-----+-----+-----+------+-----+--------+
Total line number = 11
It costs 0.118s

Conditional Expressions

CASE

The CASE expression is a kind of conditional expression that can be used to return different values based on specific conditions, similar to the if-else statements in other languages.

The CASE expression consists of the following parts:

  • CASE keyword: Indicates the start of the CASE expression.
  • WHEN-THEN clauses: There may be multiple clauses used to define conditions and give results. This clause is divided into two parts, WHEN and THEN. The WHEN part defines the condition, and the THEN part defines the result expression. If the WHEN condition is true, the corresponding THEN result is returned.
  • ELSE clause: If none of the WHEN conditions is true, the result in the ELSE clause will be returned. The ELSE clause can be omitted.
  • END keyword: Indicates the end of the CASE expression.

The CASE expression is a scalar operation that can be used in combination with any other scalar operation or aggregate function.

In the following text, all THEN parts and ELSE clauses will be collectively referred to as result clauses.

Syntax

The CASE expression supports two formats.

  • Format 1:

    CASE 
        WHEN condition1 THEN expression1
        [WHEN condition2 THEN expression2] ...
        [ELSE expression_end]
    END
    

    The conditions will be evaluated one by one.

    The first condition that is true will return the corresponding expression.

  • Format 2:

    CASE caseValue
        WHEN whenValue1 THEN expression1
        [WHEN whenValue2 THEN expression2] ...
        [ELSE expression_end]
    END
    

    The caseValue will be evaluated first, and then the whenValues will be evaluated one by one. The first whenValue that is equal to the caseValue will return the corresponding expression.

    Format 2 will be transformed into an equivalent Format 1 by iotdb.

    For example, the above SQL statement will be transformed into:

    CASE 
        WHEN caseValue=whenValue1 THEN expression1
        [WHEN caseValue=whenValue1 THEN expression1] ...
        [ELSE expression_end]
    END
    

If none of the conditions are true, or if none of the whenValues match the caseValue, the expression_end will be returned.

If there is no ELSE clause, null will be returned.

Notes

  • In format 1, all WHEN clauses must return a BOOLEAN type.
  • In format 2, all WHEN clauses must be able to be compared to the CASE clause.
  • All result clauses in a CASE expression must satisfy certain conditions for their return value types:
    • BOOLEAN types cannot coexist with other types and will cause an error if present.
    • TEXT types cannot coexist with other types and will cause an error if present.
    • The other four numeric types can coexist, and the final result will be of DOUBLE type, with possible precision loss during conversion.
    • If necessary, you can use the CAST function to convert the result to a type that can coexist with others.
  • The CASE expression does not implement lazy evaluation, meaning that all clauses will be evaluated.
  • The CASE expression does not support mixing with UDFs.
  • Aggregate functions cannot be used within a CASE expression, but the result of a CASE expression can be used as input for an aggregate function.
  • When using the CLI, because the CASE expression string can be lengthy, it is recommended to provide an alias for the expression using AS.

Using Examples

Example 1

The CASE expression can be used to analyze data in a visual way. For example:

  • The preparation of a certain chemical product requires that the temperature and pressure be within specific ranges.
  • During the preparation process, sensors will detect the temperature and pressure, forming two time-series T (temperature) and P (pressure) in IoTDB.
    In this application scenario, the CASE expression can indicate which time parameters are appropriate, which are not, and why they are not.

data:

IoTDB> select * from root.test1
+-----------------------------+------------+------------+
|                         Time|root.test1.P|root.test1.T|
+-----------------------------+------------+------------+
|2023-03-29T11:25:54.724+08:00|   1000000.0|      1025.0|
|2023-03-29T11:26:13.445+08:00|   1000094.0|      1040.0|
|2023-03-29T11:27:36.988+08:00|   1000095.0|      1041.0|
|2023-03-29T11:27:56.446+08:00|   1000095.0|      1059.0|
|2023-03-29T11:28:20.838+08:00|   1200000.0|      1040.0|
+-----------------------------+------------+------------+

SQL statements:

select T, P, case
when 1000<T and T<1050 and 1000000<P and P<1100000 then "good!"
when T<=1000 or T>=1050 then "bad temperature"
when P<=1000000 or P>=1100000 then "bad pressure"
end as `result`
from root.test1

output:

+-----------------------------+------------+------------+---------------+
|                         Time|root.test1.T|root.test1.P|         result|
+-----------------------------+------------+------------+---------------+
|2023-03-29T11:25:54.724+08:00|      1025.0|   1000000.0|   bad pressure|
|2023-03-29T11:26:13.445+08:00|      1040.0|   1000094.0|          good!|
|2023-03-29T11:27:36.988+08:00|      1041.0|   1000095.0|          good!|
|2023-03-29T11:27:56.446+08:00|      1059.0|   1000095.0|bad temperature|
|2023-03-29T11:28:20.838+08:00|      1040.0|   1200000.0|   bad pressure|
+-----------------------------+------------+------------+---------------+
Example 2

The CASE expression can achieve flexible result transformation, such as converting strings with a certain pattern to other strings.

data:

IoTDB> select * from root.test2
+-----------------------------+--------------+
|                         Time|root.test2.str|
+-----------------------------+--------------+
|2023-03-27T18:23:33.427+08:00|         abccd|
|2023-03-27T18:23:39.389+08:00|         abcdd|
|2023-03-27T18:23:43.463+08:00|       abcdefg|
+-----------------------------+--------------+

SQL statements:

select str, case
when str like "%cc%" then "has cc"
when str like "%dd%" then "has dd"
else "no cc and dd" end as `result`
from root.test2

output:

+-----------------------------+--------------+------------+
|                         Time|root.test2.str|      result|
+-----------------------------+--------------+------------+
|2023-03-27T18:23:33.427+08:00|         abccd|      has cc|
|2023-03-27T18:23:39.389+08:00|         abcdd|      has dd|
|2023-03-27T18:23:43.463+08:00|       abcdefg|no cc and dd|
+-----------------------------+--------------+------------+
Example 3: work with aggregation functions
Valid: aggregation function ← CASE expression

The CASE expression can be used as a parameter for aggregate functions. For example, used in conjunction with the COUNT function, it can implement statistics based on multiple conditions simultaneously.

data:

IoTDB> select * from root.test3
+-----------------------------+------------+
|                         Time|root.test3.x|
+-----------------------------+------------+
|2023-03-27T18:11:11.300+08:00|         0.0|
|2023-03-27T18:11:14.658+08:00|         1.0|
|2023-03-27T18:11:15.981+08:00|         2.0|
|2023-03-27T18:11:17.668+08:00|         3.0|
|2023-03-27T18:11:19.112+08:00|         4.0|
|2023-03-27T18:11:20.822+08:00|         5.0|
|2023-03-27T18:11:22.462+08:00|         6.0|
|2023-03-27T18:11:24.174+08:00|         7.0|
|2023-03-27T18:11:25.858+08:00|         8.0|
|2023-03-27T18:11:27.979+08:00|         9.0|
+-----------------------------+------------+

SQL statements:

select
count(case when x<=1 then 1 end) as `(-∞,1]`,
count(case when 1<x and x<=3 then 1 end) as `(1,3]`,
count(case when 3<x and x<=7 then 1 end) as `(3,7]`,
count(case when 7<x then 1 end) as `(7,+∞)`
from root.test3

output:

+------+-----+-----+------+
|(-∞,1]|(1,3]|(3,7]|(7,+∞)|
+------+-----+-----+------+
|     2|    2|    4|     2|
+------+-----+-----+------+
Invalid: CASE expression ← aggregation function

Using aggregation function in CASE expression is not supported

SQL statements:

select case when x<=1 then avg(x) else sum(x) end from root.test3

output:

Msg: 701: Raw data and aggregation result hybrid calculation is not supported.
Example 4: kind 2

Here is a simple example that uses the format 2 syntax.
If all conditions are equality tests, it is recommended to use format 2 to simplify SQL statements.

data:

IoTDB> select * from root.test4
+-----------------------------+------------+
|                         Time|root.test4.x|
+-----------------------------+------------+
|1970-01-01T08:00:00.001+08:00|         1.0|
|1970-01-01T08:00:00.002+08:00|         2.0|
|1970-01-01T08:00:00.003+08:00|         3.0|
|1970-01-01T08:00:00.004+08:00|         4.0|
+-----------------------------+------------+

SQL statements:

select x, case x when 1 then "one" when 2 then "two" else "other" end from root.test4

output:

+-----------------------------+------------+-----------------------------------------------------------------------------------+
|                         Time|root.test4.x|CASE WHEN root.test4.x = 1 THEN "one" WHEN root.test4.x = 2 THEN "two" ELSE "other"|
+-----------------------------+------------+-----------------------------------------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|         1.0|                                                                                one|
|1970-01-01T08:00:00.002+08:00|         2.0|                                                                                two|
|1970-01-01T08:00:00.003+08:00|         3.0|                                                                              other|
|1970-01-01T08:00:00.004+08:00|         4.0|                                                                              other|
+-----------------------------+------------+-----------------------------------------------------------------------------------+
Example 5: type of return clauses

The result clause of a CASE expression needs to satisfy certain type restrictions.

In this example, we continue to use the data from Example 4.

Invalid: BOOLEAN cannot coexist with other types

SQL statements:

select x, case x when 1 then true when 2 then 2 end from root.test4

output:

Msg: 701: CASE expression: BOOLEAN and other types cannot exist at same time
Valid: Only BOOLEAN type exists

SQL statements:

select x, case x when 1 then true when 2 then false end as `result` from root.test4

output:

+-----------------------------+------------+------+
|                         Time|root.test4.x|result|
+-----------------------------+------------+------+
|1970-01-01T08:00:00.001+08:00|         1.0|  true|
|1970-01-01T08:00:00.002+08:00|         2.0| false|
|1970-01-01T08:00:00.003+08:00|         3.0|  null|
|1970-01-01T08:00:00.004+08:00|         4.0|  null|
+-----------------------------+------------+------+
Invalid:TEXT cannot coexist with other types

SQL statements:

select x, case x when 1 then 1 when 2 then "str" end from root.test4

output:

Msg: 701: CASE expression: TEXT and other types cannot exist at same time
Valid: Only TEXT type exists

See in Example 1.

Valid: Numerical types coexist

SQL statements:

select x, case x
when 1 then 1
when 2 then 222222222222222
when 3 then 3.3
when 4 then 4.4444444444444
end as `result`
from root.test4

output:

+-----------------------------+------------+-------------------+
|                         Time|root.test4.x|             result|
+-----------------------------+------------+-------------------+
|1970-01-01T08:00:00.001+08:00|         1.0|                1.0|
|1970-01-01T08:00:00.002+08:00|         2.0|2.22222222222222E14|
|1970-01-01T08:00:00.003+08:00|         3.0|  3.299999952316284|
|1970-01-01T08:00:00.004+08:00|         4.0|   4.44444465637207|
+-----------------------------+------------+-------------------+

Copyright © 2024 The Apache Software Foundation.
Apache and the Apache feather logo are trademarks of The Apache Software Foundation

Have a question? Connect with us on QQ, WeChat, or Slack. Join the community now.