Machine Learning

AR

Usage

This function is used to learn the coefficients of an autoregressive (AR) model for a time series.

Name: AR

Input Series: Only a single numeric input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.

Parameters:

  • p: The order of the autoregressive model. Its default value is 1.

Output Series: Outputs a single series of type DOUBLE. The first row is the first-order coefficient, the second row is the second-order coefficient, and so on.
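
To illustrate the ordering of the output rows, the following minimal Python sketch applies the coefficients as a one-step prediction. It assumes the usual zero-mean AR(p) form, in which the k-th order coefficient multiplies the value k steps back; the function name predict_next is hypothetical and not part of IoTDB.

```python
def predict_next(history, coefficients):
    """One-step prediction from a zero-mean AR(p) model (illustrative assumption).

    history: observed values, oldest first.
    coefficients: AR output rows in order, first-order coefficient first.
    """
    p = len(coefficients)
    if len(history) < p:
        raise ValueError("need at least p past values")
    # Pair the k-th order coefficient with the value k steps back.
    return sum(c * history[-k] for k, c in enumerate(coefficients, start=1))


# Coefficients taken from the p = 2 example in this section.
print(predict_next([2.0, 3.0, 4.0], [0.9429, -0.2571]))
```

Here the first-order coefficient 0.9429 is applied to the most recent value (4.0) and the second-order coefficient -0.2571 to the value before it (3.0).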

Note:

  • Parameter p should be a positive integer.
  • Most points in the series should be sampled at a constant time interval.
  • Missing points in the series are filled by linear interpolation.
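
The interpolation mentioned in the notes above can be sketched in Python as follows. This is only an illustration: it assumes the sampling interval is the most frequent gap between consecutive timestamps, which may differ from how IoTDB determines it, and the function name fill_missing_linear is hypothetical.

```python
from collections import Counter


def fill_missing_linear(points):
    """Fill gaps in a mostly regular series by linear interpolation.

    points: list of (timestamp_ms, value) with strictly increasing timestamps.
    """
    if len(points) < 2:
        return list(points)
    gaps = [b[0] - a[0] for a, b in zip(points, points[1:])]
    # Assumption: the most common gap is the intended sampling interval.
    interval = Counter(gaps).most_common(1)[0][0]
    filled = [points[0]]
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        t = t0 + interval
        while t < t1:
            # Value on the straight line between the two neighbouring points.
            filled.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
            t += interval
        filled.append((t1, v1))
    return filled


# The point at 3000 ms is missing and is reconstructed as -2.0.
series = [(1000, -4.0), (2000, -3.0), (4000, -1.0), (5000, 0.0)]
print(fill_missing_linear(series))
```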

Examples

Assigning Model Order

Input Series:

+-----------------------------+---------------+
|                         Time|root.test.d0.s0|
+-----------------------------+---------------+
|2020-01-01T00:00:01.000+08:00|           -4.0|
|2020-01-01T00:00:02.000+08:00|           -3.0|
|2020-01-01T00:00:03.000+08:00|           -2.0|
|2020-01-01T00:00:04.000+08:00|           -1.0|
|2020-01-01T00:00:05.000+08:00|            0.0|
|2020-01-01T00:00:06.000+08:00|            1.0|
|2020-01-01T00:00:07.000+08:00|            2.0|
|2020-01-01T00:00:08.000+08:00|            3.0|
|2020-01-01T00:00:09.000+08:00|            4.0|
+-----------------------------+---------------+

SQL for query:

select ar(s0,"p"="2") from root.test.d0

Output Series:

+-----------------------------+---------------------------+
|                         Time|ar(root.test.d0.s0,"p"="2")|
+-----------------------------+---------------------------+
|1970-01-01T08:00:00.001+08:00|                     0.9429|
|1970-01-01T08:00:00.002+08:00|                    -0.2571|
+-----------------------------+---------------------------+
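
The coefficients in this example can be reproduced with a Yule-Walker estimate that uses the unbiased autocovariance (the lag-k sum divided by n - k). The Python sketch below is an illustration under that assumption, not a description of the IoTDB implementation. The function names are hypothetical, and mean-centering is an assumption (the example series already has zero mean).

```python
def autocovariance(x, k):
    """Unbiased lag-k autocovariance: divide by n - k (assumed estimator)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / (n - k)


def ar_yule_walker(x, p=1):
    """Estimate AR(p) coefficients by solving the Yule-Walker equations
    with the Levinson-Durbin recursion."""
    if not isinstance(p, int) or p < 1 or p >= len(x):
        raise ValueError("p must be a positive integer smaller than len(x)")
    r = [autocovariance(x, k) for k in range(p + 1)]
    phi = []      # coefficients of the current order, first-order first
    err = r[0]    # prediction error variance of the current order
    for k in range(1, p + 1):
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        kappa = acc / err  # reflection coefficient for order k
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        err *= 1 - kappa * kappa
    return phi


series = [-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
print([round(c, 4) for c in ar_yule_walker(series, p=2)])
# [0.9429, -0.2571]
```

With the biased estimator (dividing by n instead of n - k), the same series gives 0.78 and -0.17, so the choice of autocovariance estimator matters when comparing results across tools.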

Representation

Usage

This function is used to compute a fixed-length representation of a time series, based on tb timestamp blocks and vb value blocks.

Name: Representation

Input Series: Only a single numeric input series is supported. The type is INT32 / INT64 / FLOAT / DOUBLE.

Parameters:

  • tb: The number of timestamp blocks. Its default value is 10.
  • vb: The number of value blocks. Its default value is 10.

Output Series: Outputs a single series of type INT32 with length tb*vb. The timestamps, starting from 0, only indicate the order of the values.

Note:

  • Parameters tb and vb should be positive integers.

Examples

Assigning Window Size and Dimension

Input Series:

+-----------------------------+---------------+
|                         Time|root.test.d0.s0|
+-----------------------------+---------------+
|2020-01-01T00:00:01.000+08:00|           -4.0|
|2020-01-01T00:00:02.000+08:00|           -3.0|
|2020-01-01T00:00:03.000+08:00|           -2.0|
|2020-01-01T00:00:04.000+08:00|           -1.0|
|2020-01-01T00:00:05.000+08:00|            0.0|
|2020-01-01T00:00:06.000+08:00|            1.0|
|2020-01-01T00:00:07.000+08:00|            2.0|
|2020-01-01T00:00:08.000+08:00|            3.0|
|2020-01-01T00:00:09.000+08:00|            4.0|
+-----------------------------+---------------+

SQL for query:

select representation(s0,"tb"="3","vb"="2") from root.test.d0

Output Series:

+-----------------------------+-------------------------------------------------+
|                         Time|representation(root.test.d0.s0,"tb"="3","vb"="2")|
+-----------------------------+-------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|                                                1|
|1970-01-01T08:00:00.002+08:00|                                                1|
|1970-01-01T08:00:00.003+08:00|                                                0|
|1970-01-01T08:00:00.004+08:00|                                                0|
|1970-01-01T08:00:00.005+08:00|                                                1|
|1970-01-01T08:00:00.006+08:00|                                                1|
+-----------------------------+-------------------------------------------------+

RM

Usage

This function is used to calculate the matching score of two time series based on their representations.

Name: RM

Input Series: Exactly two numeric input series are supported. The type is INT32 / INT64 / FLOAT / DOUBLE.

Parameters:

  • tb: The number of timestamp blocks. Its default value is 10.
  • vb: The number of value blocks. Its default value is 10.

Output Series: Outputs a single series of type DOUBLE. The series contains a single data point, whose timestamp is 0 and whose value is the matching score.

Note:

  • Parameters tb and vb should be positive integers.

Examples

Assigning Window Size and Dimension

Input Series:

+-----------------------------+---------------+---------------+
|                         Time|root.test.d0.s0|root.test.d0.s1|
+-----------------------------+---------------+---------------+
|2020-01-01T00:00:01.000+08:00|           -4.0|           -4.0|
|2020-01-01T00:00:02.000+08:00|           -3.0|           -3.0|
|2020-01-01T00:00:03.000+08:00|           -3.0|           -3.0|
|2020-01-01T00:00:04.000+08:00|           -1.0|           -1.0|
|2020-01-01T00:00:05.000+08:00|            0.0|            0.0|
|2020-01-01T00:00:06.000+08:00|            1.0|            1.0|
|2020-01-01T00:00:07.000+08:00|            2.0|            2.0|
|2020-01-01T00:00:08.000+08:00|            3.0|            3.0|
|2020-01-01T00:00:09.000+08:00|            4.0|            4.0|
+-----------------------------+---------------+---------------+

SQL for query:

select rm(s0, s1,"tb"="3","vb"="2") from root.test.d0

Output Series:

+-----------------------------+-----------------------------------------------------+
|                         Time|rm(root.test.d0.s0,root.test.d0.s1,"tb"="3","vb"="2")|
+-----------------------------+-----------------------------------------------------+
|1970-01-01T08:00:00.001+08:00|                                                 1.00|
+-----------------------------+-----------------------------------------------------+

Copyright © 2023 The Apache Software Foundation.
Apache and the Apache feather logo are trademarks of The Apache Software Foundation
