<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "JATS-journalpublishing1.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">INFORMATICA</journal-id>
<journal-title-group><journal-title>Informatica</journal-title></journal-title-group>
<issn pub-type="epub">1822-8844</issn><issn pub-type="ppub">0868-4952</issn><issn-l>0868-4952</issn-l>
<publisher>
<publisher-name>Vilnius University</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">INFOR519</article-id>
<article-id pub-id-type="doi">10.15388/23-INFOR519</article-id>
<article-categories><subj-group subj-group-type="heading">
<subject>Research Article</subject></subj-group></article-categories>
<title-group>
<article-title>Multi-Querying: A Subsequence Matching Approach to Support Multiple Queries</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Liu</surname><given-names>Wen</given-names></name><email xlink:href="627952@qq.com">627952@qq.com</email><xref ref-type="aff" rid="j_infor519_aff_001">1</xref><xref ref-type="aff" rid="j_infor519_aff_002">2</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Ma</surname><given-names>Mingrui</given-names></name><email xlink:href="119345263@qq.com">119345263@qq.com</email><xref ref-type="aff" rid="j_infor519_aff_003">3</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Wang</surname><given-names>Peng</given-names></name><email xlink:href="pengwang5@fudan.edu.cn">pengwang5@fudan.edu.cn</email><xref ref-type="aff" rid="j_infor519_aff_004">4</xref><xref ref-type="corresp" rid="cor1">∗</xref>
</contrib>
<aff id="j_infor519_aff_001"><label>1</label>Artificial Intelligence and Smart Mine Engineering Technology Center, <institution>Xinjiang Institute of Engineering</institution>, <country>Urumqi, China</country></aff>
<aff id="j_infor519_aff_002"><label>2</label><institution>Xinjiang Sunshine Diantong Technology Co., Ltd</institution>, Urumqi, <country>China</country></aff>
<aff id="j_infor519_aff_003"><label>3</label>School of Information Science and Engineering, <institution>XinJiang University</institution>, XinJiang, <country>China</country></aff>
<aff id="j_infor519_aff_004"><label>4</label>School of Computer Science, <institution>Fudan University</institution>, Shanghai, <country>China</country></aff>
</contrib-group>
<author-notes>
<corresp id="cor1"><label>∗</label>Corresponding author.</corresp>
</author-notes>
<pub-date pub-type="ppub"><year>2023</year></pub-date><pub-date pub-type="epub"><day>28</day><month>6</month><year>2023</year></pub-date><volume>34</volume><issue>3</issue><fpage>557</fpage><lpage>576</lpage><history><date date-type="received"><month>8</month><year>2022</year></date><date date-type="accepted"><month>5</month><year>2023</year></date></history>
<permissions><copyright-statement>© 2023 Vilnius University</copyright-statement><copyright-year>2023</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>Open access article under the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">CC BY</ext-link> license.</license-p></license></permissions>
<abstract>
<p>The widespread use of sensors has resulted in an unprecedented amount of time series data. Time series mining has experienced a particular surge of interest, among which, subsequence matching is one of the most primary problem that serves as a foundation for many time series data mining techniques, such as anomaly detection and classification. In literature there exist many works to study this problem. However, in many real applications, it is uneasy for users to accurately and clearly elaborate the query intuition with a single query sequence. Consequently, in this paper, we address this issue by allowing users to submit a small query set, instead of a single query. The multiple queries can embody the query intuition better. In particular, we first propose a novel probability-based representation of the query set. A common segmentation is generated which can approximate the queries well, in which each segment is described by some features. For each feature, the corresponding values of multiple queries are represented as a Gaussian distribution. Then, based on the representation, we design a novel distance function to measure the similarity of one subsequence to the multiple queries. Also, we propose a breadth-first search strategy to find out similar subsequences. We have conducted extensive experiments on both synthetic and real datasets, and the results verify the superiority of our approach.</p>
</abstract>
<kwd-group>
<label>Key words</label>
<kwd>multiple queries</kwd>
<kwd>subsequence matching</kwd>
<kwd>time series mining</kwd>
<kwd>fault diagnosis</kwd>
</kwd-group>
<funding-group><funding-statement>This work is supported by NSFC under grant 61962058, the Tianshan Talent of Xinjiang Uygur Autonomous Region – Young Top Talents in Science and Technology (2022TSYCCY0008), Integration of Industry and Education-Joint Laboratory of Data Engineering and Digital Mine (2019QX0035), Bayingolin Mongolian Autonomous Prefecture Science and Technology Research Program (202117), Natural Science Foundation of Xinjiang Uygur Autonomous Region (2019D01A30), and Scientific Research Program of the Higher Education Institution of Xinjiang (XJEDU2018Y056).</funding-statement></funding-group>
</article-meta>
</front>
<body>
<sec id="j_infor519_s_001">
<label>1</label>
<title>Introduction</title>
<p>In recent years, the plummet in the cost of sensors and storage devices has resulted in massive time series data being captured and has substantially driven the need for the analysis of time series data. Among various analysis and applications, the problem of subsequence matching is of primordial importance, which serves as the foundation for many other data mining techniques, such as anomaly detection (Boniol and Palpanas, <xref ref-type="bibr" rid="j_infor519_ref_002">2020</xref>; Boniol <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_003">2020</xref>; Wang <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_019">2022</xref>), and classification (Wang <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_018">2018</xref>; Abanda <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_001">2019</xref>; Iwana and Uchida, <xref ref-type="bibr" rid="j_infor519_ref_009">2020</xref>; Boniol <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_004">2022</xref>).</p>
<p>Specifically, given a long time series <italic>X</italic>, for any query series <italic>Q</italic>, the subsequence matching problem finds the <italic>K</italic> number of subsequences from <italic>X</italic> most similar to <italic>Q</italic> (top-<italic>K</italic> query), or finds subsequences whose distance falls within the threshold <italic>ε</italic> (range query). In the last two decades, plenty of works have been proposed for this problem. Most existing works find results based on the strict distance, like Euclidean distance and Dynamic Time Warping. Among them are scanning-based approaches (Li <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_012">2017</xref>; Rakthanmanon <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_016">2012</xref>) and index-based ones (Linardi and Palpanas, <xref ref-type="bibr" rid="j_infor519_ref_013">2018</xref>; Wu <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_021">2019</xref>).</p>
<p>Another type of works define the query in a more flexible way. Query-by-Sketch (Muthumanickam <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_014">2016</xref>) and SpADe (Chen <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_005">2007</xref>) approximate the query with a sequence of line segments, and find subsequences that can be approximated in a similar way.</p>
<p>Nevertheless, in many real applications, users are unable to accurately and clearly elaborate the query intuition with a <italic>single</italic> query sequence. Specifically, users may have different strictness requirements for different parts of the query. We illustrate it with the following two examples.</p>
<p><italic><bold>Case study 1</bold>.</italic> In the field of wind power generation, Extreme Operating Gust (EOG) (Hu <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_008">2018</xref>) is a typical gust pattern which is a phenomenon of dramatic changes of wind speed in a short period. Early detection of EOG can prevent the damage to the turbine. A typical pattern of EOG, as <italic>Q</italic> and <inline-formula id="j_infor519_ineq_001"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{1}}$]]></tex-math></alternatives></inline-formula> in Fig <xref rid="j_infor519_fig_001">1</xref>, has three physical phases, where its corresponding shape contains a slight decrease (1–100), followed by a steep rise and a steep drop (101–200), and a rise back to the original value (200–300). Users usually emphasize the steep increase/decrease in the second part, which means <inline-formula id="j_infor519_ineq_002"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{1}}$]]></tex-math></alternatives></inline-formula> is more preferred compared to <inline-formula id="j_infor519_ineq_003"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{2}}$]]></tex-math></alternatives></inline-formula> in Fig <xref rid="j_infor519_fig_001">1</xref>. However, if the analyst submits query <italic>Q</italic>, <inline-formula id="j_infor519_ineq_004"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{2}}$]]></tex-math></alternatives></inline-formula> is more similar to <italic>Q</italic> under either ED or DTW distance.</p>
<fig id="j_infor519_fig_001">
<label>Fig. 1</label>
<caption>
<p>EOG pattern.</p>
</caption>
<graphic xlink:href="infor519_g001.jpg"/>
</fig>
<p><italic><bold>Case study 2</bold>.</italic> During the high-speed train’s work time, the sensor will continuously collect vibration data for monitoring. When the train passes by some source of interference, the value of sensors will increase sharply, and return to a normal value after some time, as <italic>Q</italic>, <inline-formula id="j_infor519_ineq_005"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{1}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_006"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{2}}$]]></tex-math></alternatives></inline-formula> in Fig <xref rid="j_infor519_fig_002">2</xref>. However, if <italic>Q</italic> is issued as a query, subsequence <inline-formula id="j_infor519_ineq_007"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{3}}$]]></tex-math></alternatives></inline-formula> is more similar to it, which is an unexpected result. By combining <italic>Q</italic>, <inline-formula id="j_infor519_ineq_008"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{1}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_009"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{2}}$]]></tex-math></alternatives></inline-formula> together, we can learn that the pattern occurrences may have variable durations and distinct amplitudes. The most strict constraint is that the pattern should include an almost upright rise and an almost upright fall.</p>
<fig id="j_infor519_fig_002">
<label>Fig. 2</label>
<caption>
<p>Signal interference pattern.</p>
</caption>
<graphic xlink:href="infor519_g002.jpg"/>
</fig>
<p>The above examples clearly demonstrate the limitation of the single query mechanism. Although a single query can express the shape the user is interested with, it is not enough to express the extent of time shifting and amplitude scaling, as well as the error range.</p>
<p>To solve this problem, in this paper, we propose a multiple query approach. Compared to a single query, submitting a small number of queries together can express the query intuition more accurately. Consider the example in Fig. <xref rid="j_infor519_fig_002">2</xref>: if users take <italic>Q</italic>, <inline-formula id="j_infor519_ineq_010"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{1}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_011"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${S_{2}}$]]></tex-math></alternatives></inline-formula> as the query set, it indicates that the user can show more tolerance in realtion to the subsequence length and the value range, but less in relation to the slope of the increasing and decreasing part. Moreover, submitting multiple queries is also a natural solution in the real-world applications. For example, in the above case of train monitoring, analysts hope to find out all interfered subsequences, and then correct them. The analyst will go through a small part of the long sequence. Once coming across a few interfered subsequences, he/she can submit them together in order to find more instances.</p>
<p>We first propose a <italic>probability-based representation</italic> of the multiple queries. Then, we design a novel distance function to measure the similarity of one subsequence to the multiple queries. In the end, a breadth-first search algorithm is proposed to find out the desired subsequences. To the best of our knowledge, this is the first work to study how to express the query intuition.</p>
<p>Our contributions can be summarized as follows:</p>
<list>
<list-item id="j_infor519_li_001">
<label>•</label>
<p>We analyse the problem of expressing the query intuition, and propose a multi-query mechanism to solve this problem.</p>
</list-item>
<list-item id="j_infor519_li_002">
<label>•</label>
<p>We present a probability-based representation of the multiple queries, as well as the corresponding distance between it and any subsequence.</p>
</list-item>
<list-item id="j_infor519_li_003">
<label>•</label>
<p>We present a breadth-first search algorithm to efficiently search for similar subsequences.</p>
</list-item>
<list-item id="j_infor519_li_004">
<label>•</label>
<p>We conduct extensive experiments on both synthetic and real datasets to verify the effectiveness and efficiency of our approach.</p>
</list-item>
</list>
<p>The rest of the paper is organized as follows. The related works are reviewed in Section <xref rid="j_infor519_s_002">2</xref>. Section <xref rid="j_infor519_s_003">3</xref> introduces definitions and notations. In Section <xref rid="j_infor519_s_005">4</xref> and <xref rid="j_infor519_s_010">5</xref>, we introduce our approach in detail. Section <xref rid="j_infor519_s_018">6</xref> presents an experimental study of our approach using synthetic and real datasets, and we offer conclusions in Section <xref rid="j_infor519_s_026">7</xref>.</p>
</sec>
<sec id="j_infor519_s_002">
<label>2</label>
<title>Related Work</title>
<p>In last two decades, the problem of subsequence matching has been extensively studied. Existing approaches can be classified into two groups:</p>
<p><bold>Fixed Length Queries:</bold> Traditionally, to find out similar subsequences, two representative distance measures are adopted, Euclidean distance (ED) and Dynamic Time Warping (DTW). ED computes the similarity by a one-to-one mapping while DTW allows disalignment and thus supports time shifting. UCR Suite (Rakthanmanon <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_016">2012</xref>) is a well-known approach that supports both ED and DTW for subsequence matching and proposes cascading lower bounds for DTW to accelerate search speed. FAST (Li <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_012">2017</xref>) is based on UCR Suite, and further proposes some lower bounds for the sake of efficiency. Both UCR Suite and FAST have to scan the whole time series to conduct distance computation. EBMS (Papapetrou <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_015">2011</xref>), however, reduces the subsequence matching problem to the vector matching problem, and identifies the candidates of matches by the search of nearest neighbours in the vector space. Also, some index-based approaches have been proposed for similarity search. Most of them build indexes based on summarizations of the data series (e.g. Piecewise Aggregate Approximation (PAA) (Keogh <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_010">2001</xref>), or Symbolic Aggregate approXimation (SAX) (Shieh and Keogh, <xref ref-type="bibr" rid="j_infor519_ref_017">2008</xref>)). Coconut (Kondylakis <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_011">2018</xref>) overcomes the limitation that existing summarizations cannot be sorted while keeping similar data series close to each other and proposes to organize data series based on a <italic>z</italic>-order curve. To further reduce the index creation time, adaptive indexing techniques have been proposed to iteratively refine the initial coarse index, such as ADS (Zoumpatianos <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_022">2016</xref>).</p>
<p><bold>Variable Length Queries:</bold> For variable length queries, SpADe (Chen <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_005">2007</xref>) proposes a continuous distance calculation approach, which is not sensitive to shifting and scaling in both temporal and amplitude dimensions. It scans data series to get local patterns, and dynamically finds the shortest path among all local patterns to be the distance between two sequences. Query-by-Sketch (Muthumanickam <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_014">2016</xref>) proposes an interactive approach to explore user-sketched patterns. It extracts shape grammar, a combination of basic elementary shapes, from the sketched series, and then applies a symbolic approximation based on regular expressions. To better satisfy the user, Eravci and Ferhatosmanoglu (<xref ref-type="bibr" rid="j_infor519_ref_007">2013</xref>) attempts to improve the search results by incorporating diversity in the results for relevance feedback. Relatively speaking, indexing for variable length queries is more intractable. <inline-formula id="j_infor519_ineq_012"><alternatives><mml:math>
<mml:mi mathvariant="normal">KV</mml:mi></mml:math><tex-math><![CDATA[$\mathrm{KV}$]]></tex-math></alternatives></inline-formula>-<inline-formula id="j_infor519_ineq_013"><alternatives><mml:math><mml:mstyle mathvariant="normal">
<mml:mi mathvariant="normal">matc</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:msub></mml:mstyle></mml:math><tex-math><![CDATA[$\mathrm{matc}{h_{DP}}$]]></tex-math></alternatives></inline-formula> (Wu <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_021">2019</xref>) utilizes multiple varied-length indexes to support normalized subsequence matching under either ED or DTW distance. ULISSE (Linardi and Palpanas, <xref ref-type="bibr" rid="j_infor519_ref_013">2018</xref>), by comparison, uses a single index to answer similarity search queries of variable length. It organizes the series and their summaries in a hierarchical tree structure called the <italic>ULISSE</italic> index.</p>
<p>In summary, up to now, no existing work has attempted to express the query intuition via the multi-query mechanism.</p>
</sec>
<sec id="j_infor519_s_003">
<label>3</label>
<title>Preliminaries</title>
<p>In this section, we begin by introducing all the necessary definitions and notations, followed by a formal problem statement.</p>
<sec id="j_infor519_s_004">
<label>3.1</label>
<title>Definition</title>
<p>In this work, we are dealing with time series. A time series <inline-formula id="j_infor519_ineq_014"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$X=({x_{1}},{x_{2}},\dots ,{x_{N}})$]]></tex-math></alternatives></inline-formula> is an ordered sequence of real-valued numbers, where <inline-formula id="j_infor519_ineq_015"><alternatives><mml:math>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo stretchy="false">|</mml:mo></mml:math><tex-math><![CDATA[$N=|X|$]]></tex-math></alternatives></inline-formula> is the length of <italic>X</italic>. A subsequence <italic>S</italic>, <inline-formula id="j_infor519_ineq_016"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${S^{\prime }}$]]></tex-math></alternatives></inline-formula> or <inline-formula id="j_infor519_ineq_017"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$X[i,j]=({x_{i}},{x_{i+1}},\dots ,{x_{j}})$]]></tex-math></alternatives></inline-formula> <inline-formula id="j_infor519_ineq_018"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(1\leqslant i\leqslant j\leqslant n)$]]></tex-math></alternatives></inline-formula> denotes the continuous sequence of length <inline-formula id="j_infor519_ineq_019"><alternatives><mml:math>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$j-i+1$]]></tex-math></alternatives></inline-formula> starting from the <italic>i</italic>-th position in <italic>X</italic>. Note that the subsequence is itself a time series.</p>
<p>Given a time series <italic>X</italic>, a query sequence <italic>Q</italic>, and a distance function <italic>D</italic>, the problem of subsequence matching is to find out the top-<italic>K</italic> subsequences from <italic>X</italic>, denoted as <inline-formula id="j_infor519_ineq_020"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">R</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">K</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\mathbb{R}=\{{S_{1}},{S_{2}},\dots ,{S_{K}}\}$]]></tex-math></alternatives></inline-formula>, which are most similar to <italic>Q</italic>.</p>
<p>The two representative distance measures are <italic>Euclidean Distance</italic> (ED) and <italic>Dynamic Time Warping</italic> (DTW). Formally, given two length-<italic>L</italic> sequences, <italic>S</italic> and <inline-formula id="j_infor519_ineq_021"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${S^{\prime }}$]]></tex-math></alternatives></inline-formula>, their ED and DTW distance can be computed as the following:</p><statement id="j_infor519_stat_001"><label>Definition 1.</label>
<p>Euclidean Distance: <inline-formula id="j_infor519_ineq_022"><alternatives><mml:math>
<mml:mtext mathvariant="italic">ED</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">L</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msqrt></mml:math><tex-math><![CDATA[$\textit{ED}(S,{S^{\prime }})=\sqrt{{\textstyle\sum _{i=1}^{L}}{({s_{i}}-{s^{\prime }_{i}})^{2}}}$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_infor519_ineq_023"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${s_{i}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_024"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${s^{\prime }_{i}}$]]></tex-math></alternatives></inline-formula> is the value at <italic>i</italic>-th <inline-formula id="j_infor519_ineq_025"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">L</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(1\leqslant i\leqslant L)$]]></tex-math></alternatives></inline-formula> time stamp of <italic>S</italic> or <inline-formula id="j_infor519_ineq_026"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${S^{\prime }}$]]></tex-math></alternatives></inline-formula> respectively.</p></statement><statement id="j_infor519_stat_002"><label>Definition 2.</label>
<p>Dynamic Time Warping: 
<disp-formula id="j_infor519_eq_001">
<alternatives><mml:math display="block">
<mml:mtable displaystyle="true" columnalign="right left" columnspacing="0pt">
<mml:mtr>
<mml:mtd class="align-odd"/>
<mml:mtd class="align-even">
<mml:mtext mathvariant="italic">DTW</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mo fence="true" stretchy="false">⟩</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mo fence="true" stretchy="false">⟩</mml:mo>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>;</mml:mo>
<mml:mspace width="2em"/>
<mml:mtext mathvariant="italic">DTW</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mo fence="true" stretchy="false">⟩</mml:mo>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mtext mathvariant="italic">DTW</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mo fence="true" stretchy="false">⟩</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mi>∞</mml:mi>
<mml:mo>;</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="align-odd"/>
<mml:mtd class="align-even">
<mml:mtext mathvariant="italic">DTW</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mo movablelimits="false">min</mml:mo>
<mml:mfenced separators="" open="{" close="">
<mml:mrow>
<mml:mtable displaystyle="true" columnspacing="0pt" columnalign="right left">
<mml:mtr>
<mml:mtd/>
<mml:mtd>
<mml:mtext mathvariant="italic">DTW</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mtext mathvariant="italic">suf</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mtext mathvariant="italic">suf</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd/>
<mml:mtd>
<mml:mtext mathvariant="italic">DTW</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mtext mathvariant="italic">suf</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd/>
<mml:mtd>
<mml:mtext mathvariant="italic">DTW</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mtext mathvariant="italic">suf</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:msqrt>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[\begin{aligned}{}& \textit{DTW}\big(\langle \rangle ,\langle \rangle \big)=0;\hspace{2em}\textit{DTW}\big(S,\langle \rangle \big)=\textit{DTW}\big(\langle \rangle ,{S^{\prime }}\big)=\infty ;\\ {} & \textit{DTW}\big(S,{S^{\prime }}\big)=\sqrt{{\big({s_{1}}-{s^{\prime }_{1}}\big)^{2}}+\min \left\{\begin{aligned}{}& \textit{DTW}\big(\textit{suf}(S),\textit{suf}\big({S^{\prime }}\big)\big),\\ {} & \textit{DTW}\big(S,\textit{suf}\big({S^{\prime }}\big)\big),\\ {} & \textit{DTW}\big(\textit{suf}(S),{S^{\prime }}\big),\end{aligned}\right.}\end{aligned}\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_infor519_ineq_027"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mo fence="true" stretchy="false">⟩</mml:mo></mml:math><tex-math><![CDATA[$\langle \rangle $]]></tex-math></alternatives></inline-formula> indicates empty series and <inline-formula id="j_infor519_ineq_028"><alternatives><mml:math>
<mml:mtext mathvariant="italic">suf</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\textit{suf}(S)=({s_{2}},\dots ,{s_{L}})$]]></tex-math></alternatives></inline-formula> is a suffix subsequence of <italic>S</italic>.</p></statement>
<p>In this paper, instead of processing one single query, we attempt to find out subsequences similar to multiple queries. That is, given a set of queries, <inline-formula id="j_infor519_ineq_029"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[$\mathbb{Q}=\{{Q_{1}},{Q_{2}},\dots ,{Q_{N}}\}$]]></tex-math></alternatives></inline-formula>, our objective is to find out the top-<italic>K</italic> subsequences similar to queries in <inline-formula id="j_infor519_ineq_030"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>, denoted as <inline-formula id="j_infor519_ineq_031"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">R</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{R}$]]></tex-math></alternatives></inline-formula>.</p>
<p>Since each query sequence varies in length, we do not impose a constraint on the length of subsequences in <inline-formula id="j_infor519_ineq_032"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">R</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{R}$]]></tex-math></alternatives></inline-formula>. In this way, we find out variable-length subsequences answering multiple queries, which is worthy of wide use in real time series applications.</p>
</sec>
</sec>
<sec id="j_infor519_s_005">
<label>4</label>
<title>Query Representation and Distance Definition</title>
<p>In this section, we first present the probability-based representation of the query set, and then propose a distance definition based on the representation.</p>
<sec id="j_infor519_s_006">
<label>4.1</label>
<title>Query Representation</title>
<p>In this paper, instead of processing the queries in <inline-formula id="j_infor519_ineq_033"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> independently, we first represent them by a unified formation, which is a multi-dimensional probability distribution. Then we find target subsequences from <italic>X</italic> based on the representation.</p>
<p>In many real world applications, the meaningful query sequence can be approximately represented as a sequence of line segments. Recall the EOG pattern in Fig. <xref rid="j_infor519_fig_001">1</xref>. The query sequence <italic>Q</italic> can be approximated with 4 line segments. These line segments capture the most representative characteristics in <italic>Q</italic>.</p>
<p>In response, we propose a two-step approach to represent the bundle of queries together. In the first step, we represent each single query <inline-formula id="j_infor519_ineq_034"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Q_{i}}$]]></tex-math></alternatives></inline-formula> in <inline-formula id="j_infor519_ineq_035"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> individually by a traditional segmentation in which each segment is described by some features. Then, in the second step, we represent each feature as a Gaussian distribution over the values from multiple queries.</p>
<sec id="j_infor519_s_007">
<label>4.1.1</label>
<title>Step One: Represent Each Single Query</title>
<p>In step one, we perform a traditional segmentation. We use a bottom-up approach to convert the query <inline-formula id="j_infor519_ineq_036"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${Q_{i}}=({q_{1}},{q_{2}},\dots ,{q_{f}})$]]></tex-math></alternatives></inline-formula> into a piecewise linear representation, where <inline-formula id="j_infor519_ineq_037"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${q_{f}}$]]></tex-math></alternatives></inline-formula> is the segment of single query <inline-formula id="j_infor519_ineq_038"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Q_{i}}$]]></tex-math></alternatives></inline-formula> <inline-formula id="j_infor519_ineq_039"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">f</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(1\leqslant f\leqslant |{Q_{i}}|)$]]></tex-math></alternatives></inline-formula>. Initially, we approximate <inline-formula id="j_infor519_ineq_040"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Q_{i}}$]]></tex-math></alternatives></inline-formula> with <inline-formula id="j_infor519_ineq_041"><alternatives><mml:math>
<mml:mo maxsize="1.19em" minsize="1.19em" fence="true">⌊</mml:mo><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo maxsize="1.19em" minsize="1.19em" fence="true">⌋</mml:mo></mml:math><tex-math><![CDATA[$\big\lfloor \frac{|{Q_{i}}|}{2}\big\rfloor $]]></tex-math></alternatives></inline-formula> line segments. The <italic>j</italic>-th line, <inline-formula id="j_infor519_ineq_042"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${H_{j}}$]]></tex-math></alternatives></inline-formula>, connects <inline-formula id="j_infor519_ineq_043"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${q_{2j-1}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_044"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${q_{2j}}$]]></tex-math></alternatives></inline-formula>. Next, we iteratively merge the neighbouring lines. In each iteration, we merge the two neighbouring segments into one new line segment that has the minimal approximation error. The merging process repeats until we have <italic>m</italic> (a pre-set parameter) number of line segments. For each <inline-formula id="j_infor519_ineq_045"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Q_{i}}$]]></tex-math></alternatives></inline-formula>, we obtain its segmentation, denoted as <inline-formula id="j_infor519_ineq_046"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$({Q_{i}^{1}},{Q_{i}^{2}},\dots ,{Q_{i}^{m}})$]]></tex-math></alternatives></inline-formula> and its linear representation, denoted as <inline-formula id="j_infor519_ineq_047"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$({H_{i}^{1}},{H_{i}^{2}},\dots ,{H_{i}^{m}})$]]></tex-math></alternatives></inline-formula>.</p>
<p>For each line segment <inline-formula id="j_infor519_ineq_048"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${H_{i}^{j}}$]]></tex-math></alternatives></inline-formula> (<inline-formula id="j_infor519_ineq_049"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi></mml:math><tex-math><![CDATA[$1\leqslant i\leqslant N$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_050"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi></mml:math><tex-math><![CDATA[$1\leqslant j\leqslant m$]]></tex-math></alternatives></inline-formula>), we represent it as a 4-dimension vector, <inline-formula id="j_infor519_ineq_051"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">ε</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${f_{i}^{j}}=({l_{i}^{j}},{\theta _{i}^{j}},{v_{i}^{j}},{\varepsilon _{i}^{j}})$]]></tex-math></alternatives></inline-formula>, which corresponds to the length, slope, the value of the starting point and MSE error of <inline-formula id="j_infor519_ineq_052"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${H_{i}^{j}}$]]></tex-math></alternatives></inline-formula>, respectively. As a result, the query sequence <inline-formula id="j_infor519_ineq_053"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Q_{i}}$]]></tex-math></alternatives></inline-formula> is represented by a <inline-formula id="j_infor519_ineq_054"><alternatives><mml:math>
<mml:mn>4</mml:mn>
<mml:mi mathvariant="italic">m</mml:mi></mml:math><tex-math><![CDATA[$4m$]]></tex-math></alternatives></inline-formula>-dimension vector, <inline-formula id="j_infor519_ineq_055"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${F_{i}}=({f_{i}^{1}},{f_{i}^{2}},\dots ,{f_{i}^{m}})$]]></tex-math></alternatives></inline-formula>.</p>
</sec>
<sec id="j_infor519_s_008">
<label>4.1.2</label>
<title>Step Two: Represent Multiple Queries</title>
<p>After obtaining <inline-formula id="j_infor519_ineq_056"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${F_{i}}$]]></tex-math></alternatives></inline-formula>’s (<inline-formula id="j_infor519_ineq_057"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi></mml:math><tex-math><![CDATA[$1\leqslant i\leqslant N$]]></tex-math></alternatives></inline-formula>), we can generate the uniform representation of query set <inline-formula id="j_infor519_ineq_058"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>, which is a multi-dimensional probability distribution. We first present the formal distribution, and then give our approach to generate the specific distribution for <inline-formula id="j_infor519_ineq_059"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>.</p>
<p>Specifically, given query set <inline-formula id="j_infor519_ineq_060"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>, its representation, denoted as <inline-formula id="j_infor519_ineq_061"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${P_{\mathbb{Q}}}$]]></tex-math></alternatives></inline-formula>, consists of <inline-formula id="j_infor519_ineq_062"><alternatives><mml:math>
<mml:mn>4</mml:mn>
<mml:mi mathvariant="italic">m</mml:mi></mml:math><tex-math><![CDATA[$4m$]]></tex-math></alternatives></inline-formula> number of individual Gaussian distributions, each of which corresponds to a feature in <inline-formula id="j_infor519_ineq_063"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${f_{i}^{j}}$]]></tex-math></alternatives></inline-formula>. For each feature, we produce a Gaussian distribution to capture latent semantics, which is determined by two parameters: the mean value and the standard deviation. The former encodes the ideal value of the feature, while the latter provides an elastic range.</p>
<p>Formally, we denote the representation of <inline-formula id="j_infor519_ineq_064"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> as <inline-formula id="j_infor519_ineq_065"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${P_{\mathbb{Q}}}=({P^{1}},{P^{2}},\dots ,{P^{m}})$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_infor519_ineq_066"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">ε</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${P^{j}}=({p_{l}^{j}},{p_{\theta }^{j}},{p_{v}^{j}},{p_{\varepsilon }^{j}})$]]></tex-math></alternatives></inline-formula> corresponds to the <italic>j</italic>-th line segments <inline-formula id="j_infor519_ineq_067"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$({H_{1}^{j}},{H_{2}^{j}},\dots ,{H_{N}^{j}})$]]></tex-math></alternatives></inline-formula>. Take the slope feature as an example, that is, <inline-formula id="j_infor519_ineq_068"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${p_{\theta }^{j}}$]]></tex-math></alternatives></inline-formula> is the Gaussian density function of the slope of the <italic>j</italic>-th segment, denoted as 
<disp-formula id="j_infor519_eq_002">
<label>(1)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">π</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo movablelimits="false">exp</mml:mo>
<mml:mo mathvariant="normal" fence="true" maxsize="2.03em" minsize="2.03em">(</mml:mo>
<mml:mo>−</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal" fence="true" maxsize="2.03em" minsize="2.03em">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ {p_{\theta }^{j}}(x)=\frac{1}{\sqrt{2\pi }{\sigma _{\theta }^{j}}}\exp \bigg(-\frac{{(x-{\mu _{\theta }^{j}})^{2}}}{2{({\sigma _{\theta }^{j}})^{2}}}\bigg),\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_infor519_ineq_069"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\mu _{\theta }^{j}}$]]></tex-math></alternatives></inline-formula> (or <inline-formula id="j_infor519_ineq_070"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\sigma _{\theta }^{j}}$]]></tex-math></alternatives></inline-formula>) is the mean value (or standard deviation) of the slope values of <inline-formula id="j_infor519_ineq_071"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$({H_{1}^{j}},{H_{2}^{j}},\dots ,{H_{N}^{j}})$]]></tex-math></alternatives></inline-formula>. Specifically, <inline-formula id="j_infor519_ineq_072"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle></mml:math><tex-math><![CDATA[${\mu _{\theta }^{j}}=\frac{{\textstyle\sum _{i=1}^{N}}{\theta _{i}^{j}}}{N}$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor519_ineq_073"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:msubsup>
<mml:mrow>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>−</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msqrt></mml:math><tex-math><![CDATA[${\sigma _{\theta }^{j}}=\sqrt{\frac{1}{N}{\textstyle\sum _{i=1}^{N}}{({\theta _{i}^{j}}-{\mu _{\theta }^{j}})^{2}}}$]]></tex-math></alternatives></inline-formula>. The mean value <inline-formula id="j_infor519_ineq_074"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\mu _{\theta }^{i}}$]]></tex-math></alternatives></inline-formula> describes the slope the user prefers, and the standard deviation <inline-formula id="j_infor519_ineq_075"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\sigma _{\theta }^{i}}$]]></tex-math></alternatives></inline-formula> represents how strictly the user stresses on this feature. Apparently, the smaller the value of <inline-formula id="j_infor519_ineq_076"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\sigma _{\theta }^{i}}$]]></tex-math></alternatives></inline-formula> is, the stricter the user’s requirement.</p>
<p>Now we introduce our approach to generate <inline-formula id="j_infor519_ineq_077"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${P_{\mathbb{Q}}}$]]></tex-math></alternatives></inline-formula> for the query set <inline-formula id="j_infor519_ineq_078"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>. We first get the representations <inline-formula id="j_infor519_ineq_079"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">ε</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$({l_{i}^{j}},{\theta _{i}^{j}},{v_{i}^{j}},{\varepsilon _{i}^{j}})$]]></tex-math></alternatives></inline-formula> that correspond to the features of the <italic>j</italic>-th line segment in <inline-formula id="j_infor519_ineq_080"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${Q_{i}}$]]></tex-math></alternatives></inline-formula>. To get the specific Gaussian distribution <inline-formula id="j_infor519_ineq_081"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">ε</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${P^{j}}=({p_{l}^{j}},{p_{\theta }^{j}},{p_{v}^{j}},{p_{\varepsilon }^{j}})$]]></tex-math></alternatives></inline-formula>, we directly compute the mean value and the standard deviation of the feature values of <inline-formula id="j_infor519_ineq_082"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$({H_{1}^{j}},{H_{2}^{j}},\dots ,{H_{N}^{j}})$]]></tex-math></alternatives></inline-formula>. Then, <inline-formula id="j_infor519_ineq_083"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${P_{\mathbb{Q}}}=({P^{1}},{P^{2}},\dots ,{P^{m}})$]]></tex-math></alternatives></inline-formula>.</p>
</sec>
</sec>
<sec id="j_infor519_s_009">
<label>4.2</label>
<title>Distance Definition</title>
<p>Given the fact that the query representation consists of <inline-formula id="j_infor519_ineq_084"><alternatives><mml:math>
<mml:mn>4</mml:mn>
<mml:mi mathvariant="italic">m</mml:mi></mml:math><tex-math><![CDATA[$4m$]]></tex-math></alternatives></inline-formula> number of Gaussian distributions rather than a sequence, the existing distance measures, like ED and DTW, are inapplicable. In this paper, we propose a novel distance function <inline-formula id="j_infor519_ineq_085"><alternatives><mml:math>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="double-struck">Q</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$D(S,\mathbb{Q})$]]></tex-math></alternatives></inline-formula> based on the probability distribution.</p>
<p>Formally, to define the distance between subsequence <italic>S</italic> and the query representation <inline-formula id="j_infor519_ineq_086"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">P</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${P_{\mathbb{Q}}}$]]></tex-math></alternatives></inline-formula>, we first approximate <italic>S</italic> with <italic>m</italic> line segments. We indicate the segmentation as <inline-formula id="j_infor519_ineq_087"><alternatives><mml:math>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mi mathvariant="italic">g</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$seg=({S^{1}},{S^{2}},\dots ,{S^{m}})$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_infor519_ineq_088"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${S^{j}}$]]></tex-math></alternatives></inline-formula> (<inline-formula id="j_infor519_ineq_089"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi></mml:math><tex-math><![CDATA[$1\leqslant j\leqslant m$]]></tex-math></alternatives></inline-formula>) denotes the <italic>j</italic>-th segment of <italic>S</italic>. Note that the segment here infers <italic>subsequence</italic> rather than <italic>line segment</italic>. We extract four features, the length <inline-formula id="j_infor519_ineq_090"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${l^{j}}$]]></tex-math></alternatives></inline-formula>, the slope <inline-formula id="j_infor519_ineq_091"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\theta ^{j}}$]]></tex-math></alternatives></inline-formula>, the value of the starting point <inline-formula id="j_infor519_ineq_092"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${v^{j}}$]]></tex-math></alternatives></inline-formula>, and the MSE error <inline-formula id="j_infor519_ineq_093"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">ε</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\varepsilon ^{j}}$]]></tex-math></alternatives></inline-formula> from the linear representation of <inline-formula id="j_infor519_ineq_094"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${S^{j}}$]]></tex-math></alternatives></inline-formula>. For ease of presentation, we represent all the <italic>j</italic>-th segments in <inline-formula id="j_infor519_ineq_095"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> as <inline-formula id="j_infor519_ineq_096"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbb{Q}^{j}}$]]></tex-math></alternatives></inline-formula>. That is, <inline-formula id="j_infor519_ineq_097"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${\mathbb{Q}^{j}}=({Q_{1}^{j}},{Q_{2}^{j}},\dots ,{Q_{N}^{j}})$]]></tex-math></alternatives></inline-formula>. Then the distance between <inline-formula id="j_infor519_ineq_098"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${S^{j}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_099"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbb{Q}^{j}}$]]></tex-math></alternatives></inline-formula> is 
<disp-formula id="j_infor519_eq_003">
<label>(2)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo>−</mml:mo>
<mml:mo movablelimits="false">log</mml:mo>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">ε</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">ε</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \textit{dist}\big({S^{j}},{\mathbb{Q}^{j}}\big)=-\log \big({p_{l}^{j}}\big({l^{j}}\big){p_{\theta }^{j}}\big({\theta ^{j}}\big){p_{v}^{j}}\big({v^{j}}\big){p_{\varepsilon }^{j}}\big({\varepsilon ^{j}}\big)\big),\]]]></tex-math></alternatives>
</disp-formula> 
<inline-formula id="j_infor519_ineq_100"><alternatives><mml:math>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\textit{dist}({S^{j}},{\mathbb{Q}^{j}})$]]></tex-math></alternatives></inline-formula> is the negative logarithm of the probability. The smaller the value, the more similar <inline-formula id="j_infor519_ineq_101"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${S^{j}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_102"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbb{Q}^{j}}$]]></tex-math></alternatives></inline-formula> are. Accordingly, under segmentation <inline-formula id="j_infor519_ineq_103"><alternatives><mml:math>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mi mathvariant="italic">g</mml:mi></mml:math><tex-math><![CDATA[$seg$]]></tex-math></alternatives></inline-formula>, the distance between <italic>S</italic> and the query set <inline-formula id="j_infor519_ineq_104"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> can be computed as 
<disp-formula id="j_infor519_eq_004">
<label>(3)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="double-struck">Q</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mtext mathvariant="italic">seg</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ D(S,\mathbb{Q},\textit{seg})={\sum \limits_{j=1}^{m}}\textit{dist}\big({S^{j}},{\mathbb{Q}^{j}}\big).\]]]></tex-math></alternatives>
</disp-formula> 
Since <italic>S</italic> can be segmented by different segmentations, the value of <inline-formula id="j_infor519_ineq_105"><alternatives><mml:math>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="double-struck">Q</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mtext mathvariant="italic">seg</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$D(S,\mathbb{Q},\textit{seg})$]]></tex-math></alternatives></inline-formula> may be different. In this paper, we define the distance between <italic>S</italic> and <inline-formula id="j_infor519_ineq_106"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> as the minimal one among all possible segmentations, that is, 
<disp-formula id="j_infor519_eq_005">
<label>(4)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="double-struck">Q</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mo movablelimits="false">min</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mtext mathvariant="italic">seg</mml:mtext>
</mml:mrow>
</mml:munder>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">S</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="double-struck">Q</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mtext mathvariant="italic">seg</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ D(S,\mathbb{Q})=\underset{\textit{seg}}{\min }D(S,\mathbb{Q},\textit{seg}).\]]]></tex-math></alternatives>
</disp-formula>
</p>
</sec>
</sec>
<sec id="j_infor519_s_010">
<label>5</label>
<title>Query Processing Approach</title>
<p>In this section, we introduce the search process. Obviously, it is exhaustive to find out the best segmentation of all subsequences in the time series <italic>X</italic>. In response, we divide the search process into two phases: 
<list>
<list-item id="j_infor519_li_005">
<label>1.</label>
<p>Candidate generation. Given the submitted query set <inline-formula id="j_infor519_ineq_107"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>, we utilize a Breadth-First Search (BFS) strategy to find at most <inline-formula id="j_infor519_ineq_108"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> number of candidates from <italic>X</italic>, denoted as <inline-formula id="j_infor519_ineq_109"><alternatives><mml:math>
<mml:mtext mathvariant="italic">CS</mml:mtext></mml:math><tex-math><![CDATA[$\textit{CS}$]]></tex-math></alternatives></inline-formula>.</p>
</list-item>
<list-item id="j_infor519_li_006">
<label>2.</label>
<p>Post-processing. All subsequences in <inline-formula id="j_infor519_ineq_110"><alternatives><mml:math>
<mml:mtext mathvariant="italic">CS</mml:mtext></mml:math><tex-math><![CDATA[$\textit{CS}$]]></tex-math></alternatives></inline-formula> will be verified and re-ordered by computing its actual distance. Moreover, we dismiss the trivial match subsequences.</p>
</list-item>
</list>
</p>
<sec id="j_infor519_s_011">
<label>5.1</label>
<title>BFS-Based Search Process</title>
<p>We first introduce our search strategy to generate the candidate set <inline-formula id="j_infor519_ineq_111"><alternatives><mml:math>
<mml:mtext mathvariant="italic">CS</mml:mtext></mml:math><tex-math><![CDATA[$\textit{CS}$]]></tex-math></alternatives></inline-formula> with size not exceeding the parameter <inline-formula id="j_infor519_ineq_112"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula>. We utilize an iterative approach, and generate the candidates segment-by-segment. In the first round, we generate at most <inline-formula id="j_infor519_ineq_113"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> number of candidates with only one segment. The candidate set is denoted as <inline-formula id="j_infor519_ineq_114"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">{</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">}</mml:mo></mml:math><tex-math><![CDATA[${\textit{CS}_{1}}=\{c{s_{1}},c{s_{2}},\dots ,c{s_{{n_{c}}}}\}$]]></tex-math></alternatives></inline-formula>. Each candidate, <inline-formula id="j_infor519_ineq_115"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[$c{s_{i}}$]]></tex-math></alternatives></inline-formula>, is a triple <inline-formula id="j_infor519_ineq_116"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">⟩</mml:mo></mml:math><tex-math><![CDATA[$\langle {s_{i}},{e_{i}},{d_{i}}\rangle $]]></tex-math></alternatives></inline-formula>, in which <inline-formula id="j_infor519_ineq_117"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${s_{i}}$]]></tex-math></alternatives></inline-formula> is its starting point and <inline-formula id="j_infor519_ineq_118"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${e_{i}}$]]></tex-math></alternatives></inline-formula> is its ending point. So <inline-formula id="j_infor519_ineq_119"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[$c{s_{i}}$]]></tex-math></alternatives></inline-formula> corresponds to the subsequence <inline-formula id="j_infor519_ineq_120"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[{s_{i}},{e_{i}}]$]]></tex-math></alternatives></inline-formula>. The third element <inline-formula id="j_infor519_ineq_121"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${d_{i}}$]]></tex-math></alternatives></inline-formula> is distance <inline-formula id="j_infor519_ineq_122"><alternatives><mml:math>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\textit{dist}(c{s_{1}},{\mathbb{Q}^{1}})$]]></tex-math></alternatives></inline-formula>. All candidates in <inline-formula id="j_infor519_ineq_123"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{1}}$]]></tex-math></alternatives></inline-formula> are ordered based on the values of <inline-formula id="j_infor519_ineq_124"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${d_{i}}$]]></tex-math></alternatives></inline-formula> ascendingly. In other words, <inline-formula id="j_infor519_ineq_125"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{1}}$]]></tex-math></alternatives></inline-formula> contains <inline-formula id="j_infor519_ineq_126"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> number of subsequences with smallest distance with <inline-formula id="j_infor519_ineq_127"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${\mathbb{Q}^{1}}$]]></tex-math></alternatives></inline-formula>. We discuss how to select top-<inline-formula id="j_infor519_ineq_128"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> candidates in the next section.</p>
<p>In the second round, we obtain candidate set <inline-formula id="j_infor519_ineq_129"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{2}}$]]></tex-math></alternatives></inline-formula> by extending the candidates in <inline-formula id="j_infor519_ineq_130"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{1}}$]]></tex-math></alternatives></inline-formula> with the second segment. Specifically, given any candidate subsequence <inline-formula id="j_infor519_ineq_131"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo fence="true" stretchy="false">⟩</mml:mo></mml:math><tex-math><![CDATA[$cs=\langle s,e,d\rangle $]]></tex-math></alternatives></inline-formula> in <inline-formula id="j_infor519_ineq_132"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{1}}$]]></tex-math></alternatives></inline-formula>, if we want to extend <inline-formula id="j_infor519_ineq_133"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi></mml:math><tex-math><![CDATA[$cs$]]></tex-math></alternatives></inline-formula> to <inline-formula id="j_infor519_ineq_134"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$c{s^{\prime }}$]]></tex-math></alternatives></inline-formula> with a length-<italic>L</italic> segment, the new candidate <inline-formula id="j_infor519_ineq_135"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">⟩</mml:mo></mml:math><tex-math><![CDATA[$c{s^{\prime }}=\langle s,{e^{\prime }},{d^{\prime }}\rangle $]]></tex-math></alternatives></inline-formula> contains two segments: one corresponds to <inline-formula id="j_infor519_ineq_136"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[s,e]$]]></tex-math></alternatives></inline-formula> and the other to <inline-formula id="j_infor519_ineq_137"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">L</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[e+1,e+L]$]]></tex-math></alternatives></inline-formula>. Note that the new starting point <italic>s</italic> keeps unchanged, and the new ending point <inline-formula id="j_infor519_ineq_138"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${e^{\prime }}$]]></tex-math></alternatives></inline-formula> changes to <inline-formula id="j_infor519_ineq_139"><alternatives><mml:math>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">L</mml:mi></mml:math><tex-math><![CDATA[$e+L$]]></tex-math></alternatives></inline-formula>. Also, the new distance <inline-formula id="j_infor519_ineq_140"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${d^{\prime }}$]]></tex-math></alternatives></inline-formula> is updated to <inline-formula id="j_infor519_ineq_141"><alternatives><mml:math>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$d+\textit{dist}(X[e+1,{e^{\prime }}],{\mathbb{Q}^{2}})$]]></tex-math></alternatives></inline-formula>. Since from each candidate <inline-formula id="j_infor519_ineq_142"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi></mml:math><tex-math><![CDATA[$cs$]]></tex-math></alternatives></inline-formula> in <inline-formula id="j_infor519_ineq_143"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{1}}$]]></tex-math></alternatives></inline-formula>, we can extend it to multiple candidates by concatenating <inline-formula id="j_infor519_ineq_144"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[s,e]$]]></tex-math></alternatives></inline-formula> with variable length segments, we generate all possible candidates, compute the distance and add top-<inline-formula id="j_infor519_ineq_145"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> candidates into <inline-formula id="j_infor519_ineq_146"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{2}}$]]></tex-math></alternatives></inline-formula>.</p>
<p>After <italic>m</italic> rounds, we obtain the candidate set <inline-formula id="j_infor519_ineq_147"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">m</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{m}}$]]></tex-math></alternatives></inline-formula>, which is the final candidate set <inline-formula id="j_infor519_ineq_148"><alternatives><mml:math>
<mml:mtext mathvariant="italic">CS</mml:mtext></mml:math><tex-math><![CDATA[$\textit{CS}$]]></tex-math></alternatives></inline-formula>. Now each candidate in <inline-formula id="j_infor519_ineq_149"><alternatives><mml:math>
<mml:mtext mathvariant="italic">CS</mml:mtext></mml:math><tex-math><![CDATA[$\textit{CS}$]]></tex-math></alternatives></inline-formula> consists of <italic>m</italic> segments.</p>
</sec>
<sec id="j_infor519_s_012">
<label>5.2</label>
<title>Candidate Generation</title>
<p>Now we introduce our approach to generate possible candidates in each round.</p>
<p>In the first round, we enumerate all subsequences <inline-formula id="j_infor519_ineq_150"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[s,e]$]]></tex-math></alternatives></inline-formula> from all possible starting points, that is, we try all <italic>s</italic>’s within <inline-formula id="j_infor519_ineq_151"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[1,n]$]]></tex-math></alternatives></inline-formula>. To avoid the low-quality candidates, we only select subsequences with length satisfying the <inline-formula id="j_infor519_ineq_152"><alternatives><mml:math>
<mml:mn>3</mml:mn>
<mml:mi mathvariant="italic">σ</mml:mi></mml:math><tex-math><![CDATA[$3\sigma $]]></tex-math></alternatives></inline-formula> standard. Formally, given any starting point <italic>s</italic>, the ending point <italic>e</italic> must satisfy <inline-formula id="j_infor519_ineq_153"><alternatives><mml:math>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo stretchy="false">∈</mml:mo>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>−</mml:mo>
<mml:mn>3</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>+</mml:mo>
<mml:mn>3</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$e\in [{\mu _{l}^{1}}-3{\sigma _{l}^{1}},{\mu _{l}^{1}}+3{\sigma _{l}^{1}}]$]]></tex-math></alternatives></inline-formula>. For each candidate subsequence <inline-formula id="j_infor519_ineq_154"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[s,e]$]]></tex-math></alternatives></inline-formula>, we compute the optimal linear approximation, <inline-formula id="j_infor519_ineq_155"><alternatives><mml:math>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">θ</mml:mi>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi></mml:math><tex-math><![CDATA[$y=\theta \cdot x+b$]]></tex-math></alternatives></inline-formula>, as well as the MSE error <italic>ε</italic> as follows, 
<disp-formula id="j_infor519_eq_006">
<label>(5)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mfenced separators="" open="{" close="">
<mml:mrow>
<mml:mtable equalrows="false" columnlines="none" equalcolumns="false" columnalign="left">
<mml:mtr>
<mml:mtd class="array">
<mml:mi mathvariant="italic">θ</mml:mi>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>12</mml:mn>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>6</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:mi mathvariant="italic">b</mml:mi>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:mn>6</mml:mn>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:mi mathvariant="italic">ε</mml:mi>
<mml:mo>=</mml:mo><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">θ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">θ</mml:mi><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">b</mml:mi><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>2</mml:mn>
<mml:mi mathvariant="italic">θ</mml:mi>
<mml:mi mathvariant="italic">b</mml:mi><mml:mstyle displaystyle="true">
<mml:mo largeop="true" movablelimits="false">∑</mml:mo></mml:mstyle>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ \left\{\begin{array}{l}\theta =\displaystyle \frac{12\textstyle\sum ix-6(l+1)\textstyle\sum x}{l(l+1)(l-1)},\hspace{1em}\\ {} b=\displaystyle \frac{6\textstyle\sum ix-2(2l+1)\textstyle\sum x}{l(1-l)},\hspace{1em}\\ {} \varepsilon =\displaystyle \sum {x^{2}}+{\theta ^{2}}\displaystyle \sum {i^{2}}+l{b^{2}}-2\theta \displaystyle \sum ix-2b\displaystyle \sum x+2\theta b\displaystyle \sum i,\hspace{1em}\end{array}\right.\]]]></tex-math></alternatives>
</disp-formula> 
where <inline-formula id="j_infor519_ineq_156"><alternatives><mml:math>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo>−</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$l=e-s+1$]]></tex-math></alternatives></inline-formula> is the length of <inline-formula id="j_infor519_ineq_157"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[s,e]$]]></tex-math></alternatives></inline-formula>. After that, we compute <inline-formula id="j_infor519_ineq_158"><alternatives><mml:math>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\textit{dist}(X[s,e],{\mathbb{Q}^{1}})$]]></tex-math></alternatives></inline-formula>. Also, if only any other feature of a candidate (<italic>θ</italic>, <italic>v</italic> or <italic>ε</italic>) violates the <inline-formula id="j_infor519_ineq_159"><alternatives><mml:math>
<mml:mn>3</mml:mn>
<mml:mi mathvariant="italic">σ</mml:mi></mml:math><tex-math><![CDATA[$3\sigma $]]></tex-math></alternatives></inline-formula> standard, we ignore this candidate. During enumerating different candidates, we maintain <inline-formula id="j_infor519_ineq_160"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{1}}$]]></tex-math></alternatives></inline-formula> as a priority queue to keep top-<inline-formula id="j_infor519_ineq_161"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> candidates.</p>
<p>In the <italic>j</italic>-th round (<inline-formula id="j_infor519_ineq_162"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi></mml:math><tex-math><![CDATA[$2\leqslant j\leqslant m$]]></tex-math></alternatives></inline-formula>), we generate candidates by extending previous ones in <inline-formula id="j_infor519_ineq_163"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{j-1}}$]]></tex-math></alternatives></inline-formula>. For each candidate <inline-formula id="j_infor519_ineq_164"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo fence="true" stretchy="false">⟩</mml:mo></mml:math><tex-math><![CDATA[$cs=\langle s,e,d\rangle $]]></tex-math></alternatives></inline-formula> in <inline-formula id="j_infor519_ineq_165"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="italic">CS</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${\textit{CS}_{j-1}}$]]></tex-math></alternatives></inline-formula>, we try all possible segments next to <inline-formula id="j_infor519_ineq_166"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[s,e]$]]></tex-math></alternatives></inline-formula> whose length falls within <inline-formula id="j_infor519_ineq_167"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>−</mml:mo>
<mml:mn>3</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">μ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>+</mml:mo>
<mml:mn>3</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[{\mu _{l}^{j}}-3{\sigma _{l}^{j}},{\mu _{l}^{j}}+3{\sigma _{l}^{j}}]$]]></tex-math></alternatives></inline-formula>. Similar with the first round, we also dismiss the candidates violating the <inline-formula id="j_infor519_ineq_168"><alternatives><mml:math>
<mml:mn>3</mml:mn>
<mml:mi mathvariant="italic">σ</mml:mi></mml:math><tex-math><![CDATA[$3\sigma $]]></tex-math></alternatives></inline-formula> standard. When extending candidate <inline-formula id="j_infor519_ineq_169"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo fence="true" stretchy="false">⟩</mml:mo></mml:math><tex-math><![CDATA[$cs=\langle s,e,d\rangle $]]></tex-math></alternatives></inline-formula> to <inline-formula id="j_infor519_ineq_170"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mo fence="true" stretchy="false">⟨</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">⟩</mml:mo></mml:math><tex-math><![CDATA[$c{s^{\prime }}=\langle s,{e^{\prime }},{d^{\prime }}\rangle $]]></tex-math></alternatives></inline-formula>, except the new ending point <inline-formula id="j_infor519_ineq_171"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[${e^{\prime }}$]]></tex-math></alternatives></inline-formula>, we also update the distance as <inline-formula id="j_infor519_ineq_172"><alternatives><mml:math>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[${d^{\prime }}=d+\textit{dist}(X[e+1,{e^{\prime }}],{\mathbb{Q}^{j}})$]]></tex-math></alternatives></inline-formula>.</p>
</sec>
<sec id="j_infor519_s_013">
<label>5.3</label>
<title>Post-Processing</title>
<p>Note that the subsequences in <inline-formula id="j_infor519_ineq_173"><alternatives><mml:math>
<mml:mi mathvariant="italic">C</mml:mi>
<mml:mi mathvariant="italic">S</mml:mi></mml:math><tex-math><![CDATA[$CS$]]></tex-math></alternatives></inline-formula> are not approximated optimally. Here, an additional refinement step has to be performed, where subsequences are verified and re-ordered using the optimal segmentation. Specifically, we fetch the subsequences in <inline-formula id="j_infor519_ineq_174"><alternatives><mml:math>
<mml:mtext mathvariant="italic">CS</mml:mtext></mml:math><tex-math><![CDATA[$\textit{CS}$]]></tex-math></alternatives></inline-formula>, approximate each subsequence <inline-formula id="j_infor519_ineq_175"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi></mml:math><tex-math><![CDATA[$cs$]]></tex-math></alternatives></inline-formula> with <italic>m</italic> line segments via a dynamic programming algorithm, and thus get the actual distance <inline-formula id="j_infor519_ineq_176"><alternatives><mml:math>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="double-struck">Q</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$D(cs,\mathbb{Q})$]]></tex-math></alternatives></inline-formula> under the new segmentation.</p>
<p>The objective of the segmentation is to minimize the distance between <inline-formula id="j_infor519_ineq_177"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi></mml:math><tex-math><![CDATA[$cs$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_178"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>. We search the optimal segmentation from left to right sequentially on <inline-formula id="j_infor519_ineq_179"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi></mml:math><tex-math><![CDATA[$cs$]]></tex-math></alternatives></inline-formula>. We define <inline-formula id="j_infor519_ineq_180"><alternatives><mml:math>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$E(i,j)$]]></tex-math></alternatives></inline-formula> (<inline-formula id="j_infor519_ineq_181"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi></mml:math><tex-math><![CDATA[$1\leqslant i\leqslant m$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_182"><alternatives><mml:math>
<mml:mn>1</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo stretchy="false">|</mml:mo></mml:math><tex-math><![CDATA[$1\leqslant j\leqslant |cs|$]]></tex-math></alternatives></inline-formula>) to be the minimal distance between the prefix of <inline-formula id="j_infor519_ineq_183"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi></mml:math><tex-math><![CDATA[$cs$]]></tex-math></alternatives></inline-formula> (i.e. <inline-formula id="j_infor519_ineq_184"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$cs=X[1,j]$]]></tex-math></alternatives></inline-formula>) and the prefix of <inline-formula id="j_infor519_ineq_185"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> with <italic>i</italic> segments (i.e. <inline-formula id="j_infor519_ineq_186"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[{\mathbb{Q}^{1}},{\mathbb{Q}^{2}},\dots ,{\mathbb{Q}^{i}}]$]]></tex-math></alternatives></inline-formula>). We begin by initializing <inline-formula id="j_infor519_ineq_187"><alternatives><mml:math>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$E(1,j)$]]></tex-math></alternatives></inline-formula> to be <inline-formula id="j_infor519_ineq_188"><alternatives><mml:math>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\textit{dist}(X[1,j],{\mathbb{Q}^{1}})$]]></tex-math></alternatives></inline-formula>. When computing <inline-formula id="j_infor519_ineq_189"><alternatives><mml:math>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$E(i,j)$]]></tex-math></alternatives></inline-formula> <inline-formula id="j_infor519_ineq_190"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(2\leqslant i\leqslant m)$]]></tex-math></alternatives></inline-formula>, we consider all the possible segmentations of <inline-formula id="j_infor519_ineq_191"><alternatives><mml:math>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$X[1,k]$]]></tex-math></alternatives></inline-formula> <inline-formula id="j_infor519_ineq_192"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$(i\leqslant k\leqslant j)$]]></tex-math></alternatives></inline-formula> with <inline-formula id="j_infor519_ineq_193"><alternatives><mml:math>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$i-1$]]></tex-math></alternatives></inline-formula> segments, compare the sum of <inline-formula id="j_infor519_ineq_194"><alternatives><mml:math>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$E(i-1,k)$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_195"><alternatives><mml:math>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\textit{dist}(X[k,j],{\mathbb{Q}^{k}})$]]></tex-math></alternatives></inline-formula>, and define <inline-formula id="j_infor519_ineq_196"><alternatives><mml:math>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$E(i,j)$]]></tex-math></alternatives></inline-formula> to be the minimal one. Formally, the dynamic programming equation is presented as the following 
<disp-formula id="j_infor519_eq_007">
<label>(6)</label><alternatives><mml:math display="block">
<mml:mtable displaystyle="true">
<mml:mtr>
<mml:mtd>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mfenced separators="" open="{" close="">
<mml:mrow>
<mml:mtable columnspacing="4.0pt" equalrows="false" columnlines="none" equalcolumns="false" columnalign="left left">
<mml:mtr>
<mml:mtd class="array">
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd class="array">
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:mo>+</mml:mo>
<mml:mi>∞</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd class="array">
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo mathvariant="normal">&gt;</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd class="array">
<mml:munder>
<mml:mrow>
<mml:mo movablelimits="false">min</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo>⩽</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
</mml:mrow>
</mml:munder>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mi mathvariant="italic">E</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mo>−</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo>+</mml:mo>
<mml:mtext mathvariant="italic">dist</mml:mtext>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">(</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">k</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">j</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">Q</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" maxsize="1.19em" minsize="1.19em">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd class="array">
<mml:mtext>otherwise</mml:mtext>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:mtd>
</mml:mtr>
</mml:mtable></mml:math><tex-math><![CDATA[\[ E(i,j)=\left\{\begin{array}{l@{\hskip4.0pt}l}\textit{dist}(X[1,j],{\mathbb{Q}^{1}}),\hspace{1em}& i=1,\\ {} +\infty ,\hspace{1em}& i\gt j,\\ {} \underset{i\leqslant k\leqslant j}{\min }\big(E(i-1,k)+\textit{dist}\big(X[k,j],{\mathbb{Q}^{k}}\big)\big),\hspace{1em}& \text{otherwise}.\end{array}\right.\]]]></tex-math></alternatives>
</disp-formula> 
We re-compute the distance between each subsequence in <inline-formula id="j_infor519_ineq_197"><alternatives><mml:math>
<mml:mtext mathvariant="italic">CS</mml:mtext></mml:math><tex-math><![CDATA[$\textit{CS}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_198"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>. Before re-ordering, we have to dismiss the trivial match subsequences since candidates can overlap with each other. Formally, given two candidates <inline-formula id="j_infor519_ineq_199"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="italic">e</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$cs=X[s,e]$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_200"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">X</mml:mi>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$c{s^{\prime }}=X[{s^{\prime }},{e^{\prime }}]$]]></tex-math></alternatives></inline-formula>, if their overlapping ratio, <inline-formula id="j_infor519_ineq_201"><alternatives><mml:math><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo>∩</mml:mo>
<mml:mspace width="0.1667em"/>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo>∪</mml:mo>
<mml:mspace width="0.1667em"/>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>′</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mstyle></mml:math><tex-math><![CDATA[$\frac{cs\cap \hspace{0.1667em}c{s^{\prime }}}{cs\cup \hspace{0.1667em}c{s^{\prime }}}$]]></tex-math></alternatives></inline-formula>, exceeds <inline-formula id="j_infor519_ineq_202"><alternatives><mml:math>
<mml:mo movablelimits="false">min</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0.8</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mo movablelimits="false">min</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo movablelimits="false">max</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$\min (0.8,\frac{\min (|Q|)}{\max (|Q|)})$]]></tex-math></alternatives></inline-formula>, where the latter indicates the ratio between the minimum length and the maximum length of queries <inline-formula id="j_infor519_ineq_203"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>, we dismiss the subsequence with the larger distance.</p>
<p>Afterwards, we simply sort the remaining subsequences in <inline-formula id="j_infor519_ineq_204"><alternatives><mml:math>
<mml:mtext mathvariant="italic">CS</mml:mtext></mml:math><tex-math><![CDATA[$\textit{CS}$]]></tex-math></alternatives></inline-formula> according to <inline-formula id="j_infor519_ineq_205"><alternatives><mml:math>
<mml:mi mathvariant="italic">D</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mi mathvariant="double-struck">Q</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$D(cs,\mathbb{Q})$]]></tex-math></alternatives></inline-formula> and select the <italic>K</italic> smallest as the final results <inline-formula id="j_infor519_ineq_206"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">R</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{R}$]]></tex-math></alternatives></inline-formula>.</p>
</sec>
<sec id="j_infor519_s_014">
<label>5.4</label>
<title>Optimization</title>
<p>In this section, we propose two optimization strategies to further accelerate the search process.</p>
<sec id="j_infor519_s_015">
<label>5.4.1</label>
<title>Basic Aggregates Based Linear Representation Computation</title>
<p>In the search process, for each candidate, we adopt linear regression to find out the best line segment <inline-formula id="j_infor519_ineq_207"><alternatives><mml:math>
<mml:mi mathvariant="italic">y</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi mathvariant="italic">θ</mml:mi>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">b</mml:mi></mml:math><tex-math><![CDATA[$y=\theta \cdot x+b$]]></tex-math></alternatives></inline-formula> in a least square sense using Eq. (<xref rid="j_infor519_eq_006">5</xref>). It is noteworthy that <italic>θ</italic>, <italic>b</italic>, and <italic>ε</italic> can be computed by some combination of three basic aggregates: <inline-formula id="j_infor519_ineq_208"><alternatives><mml:math>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
<mml:mi mathvariant="italic">x</mml:mi></mml:math><tex-math><![CDATA[$\textstyle\sum x$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor519_ineq_209"><alternatives><mml:math>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="italic">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup></mml:math><tex-math><![CDATA[$\textstyle\sum {x^{2}}$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_210"><alternatives><mml:math>
<mml:mo largeop="false" movablelimits="false">∑</mml:mo>
<mml:mi mathvariant="italic">i</mml:mi>
<mml:mi mathvariant="italic">x</mml:mi></mml:math><tex-math><![CDATA[$\textstyle\sum ix$]]></tex-math></alternatives></inline-formula>, with cost <inline-formula id="j_infor519_ineq_211"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O(1)$]]></tex-math></alternatives></inline-formula>, as proposed in Wasay <italic>et al.</italic> (<xref ref-type="bibr" rid="j_infor519_ref_020">2017</xref>). As long as we maintain these three in-memory arrays to store these basic aggregates of time series <italic>X</italic>, linear regression can be conducted in <inline-formula id="j_infor519_ineq_212"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O(1)$]]></tex-math></alternatives></inline-formula> time for any subsequence in <italic>X</italic>.</p>
<p>However, to process extremely long sequences, the storage overheads are unaffordable. As a consequence, we propose to split the time series into several blocks and conduct subsequence matching within each block sequentially and summarize the results in the end. Obviously, this approach reduces memory consumption while being accurate and efficient.</p>
</sec>
<sec id="j_infor519_s_016">
<label>5.4.2</label>
<title>Adjusting the Searching Order</title>
<p>Up to now, we have generated the candidates segment-by-segment sequentially. However, different standard deviation of the feature values of different segments in <inline-formula id="j_infor519_ineq_213"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> will result in different search space. For example, in <inline-formula id="j_infor519_ineq_214"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>, one segment has the standard deviation of length <inline-formula id="j_infor519_ineq_215"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>5</mml:mn></mml:math><tex-math><![CDATA[${\sigma _{l}}=5$]]></tex-math></alternatives></inline-formula>, while in other segment, <inline-formula id="j_infor519_ineq_216"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>10</mml:mn></mml:math><tex-math><![CDATA[${\sigma _{l}}=10$]]></tex-math></alternatives></inline-formula>. According to the <inline-formula id="j_infor519_ineq_217"><alternatives><mml:math>
<mml:mn>3</mml:mn>
<mml:mi mathvariant="italic">σ</mml:mi></mml:math><tex-math><![CDATA[$3\sigma $]]></tex-math></alternatives></inline-formula> standard, the former has <inline-formula id="j_infor519_ineq_218"><alternatives><mml:math>
<mml:mn>5</mml:mn>
<mml:mo>·</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo>·</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>=</mml:mo>
<mml:mn>31</mml:mn></mml:math><tex-math><![CDATA[$5\cdot 3\cdot 2+1=31$]]></tex-math></alternatives></inline-formula> candidates, while the latter has 61 candidates. Obviously, we should first search based on the latter. Specifically, we perform the search process in an optimized order, where we first consider the segment whose standard deviation of the value of length feature is the smallest, and then consider its neighbouring segments. Suppose there are 3 sets of line segments in <inline-formula id="j_infor519_ineq_219"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>, i.e. <inline-formula id="j_infor519_ineq_220"><alternatives><mml:math>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>3</mml:mn></mml:math><tex-math><![CDATA[$m=3$]]></tex-math></alternatives></inline-formula>, their standard deviations of the value of length feature are <inline-formula id="j_infor519_ineq_221"><alternatives><mml:math>
<mml:mn>2</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>3</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[$2,3,1$]]></tex-math></alternatives></inline-formula> respectively, that is, <inline-formula id="j_infor519_ineq_222"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn></mml:math><tex-math><![CDATA[${\sigma _{l}^{1}}=2$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor519_ineq_223"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mn>3</mml:mn></mml:math><tex-math><![CDATA[${\sigma _{l}^{2}}=3$]]></tex-math></alternatives></inline-formula>, <inline-formula id="j_infor519_ineq_224"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[${\sigma _{l}^{3}}=1$]]></tex-math></alternatives></inline-formula>. Then, in the search process, we generate candidates from right to left sequentially.</p>
</sec>
</sec>
<sec id="j_infor519_s_017">
<label>5.5</label>
<title>Complexity Analysis</title>
<p>The overall process of our approach can be divided into two phases: query representation and query processing.</p>
<p>Before the two phases, we have to scan the whole time series <italic>X</italic> once to store the basic aggregates. Its time cost is <inline-formula id="j_infor519_ineq_225"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O(n)$]]></tex-math></alternatives></inline-formula>, where <italic>n</italic> is the length of <italic>X</italic>.</p>
<p>Then, in the first phase, we scan all the queries and perform traditional piecewise linear approximations. Thanks to the basic aggregates, the time cost of conducting linear regression is negligible. This phase is then <inline-formula id="j_infor519_ineq_226"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O(N|Q{|^{2}})$]]></tex-math></alternatives></inline-formula> in time complexity, where <italic>N</italic> is the number of queries. For ease of presentation, although multiple queries can vary in length, we denote <inline-formula id="j_infor519_ineq_227"><alternatives><mml:math>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo stretchy="false">|</mml:mo></mml:math><tex-math><![CDATA[$|Q|$]]></tex-math></alternatives></inline-formula> as the length of the queries.</p>
<p>In the second phase, we first assume <inline-formula id="j_infor519_ineq_228"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>6</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn></mml:math><tex-math><![CDATA[${w_{i}}=6{\sigma _{l}^{i}}+1$]]></tex-math></alternatives></inline-formula>, where <inline-formula id="j_infor519_ineq_229"><alternatives><mml:math>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">σ</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msubsup></mml:math><tex-math><![CDATA[${\sigma _{l}^{i}}$]]></tex-math></alternatives></inline-formula> is the standard deviation of the length values of <inline-formula id="j_infor519_ineq_230"><alternatives><mml:math>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mo>…</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="italic">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$({H_{1}^{i}},{H_{2}^{i}},\dots ,{H_{N}^{i}})$]]></tex-math></alternatives></inline-formula>. Suppose we generate candidates from left to right sequentially; when considering the first segments, it requires <inline-formula id="j_infor519_ineq_231"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo movablelimits="false">log</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O({w_{1}}n\log ({n_{c}}))$]]></tex-math></alternatives></inline-formula> time to maintain the candidate set. For the following <italic>i</italic>-th segments, the time complexity is <inline-formula id="j_infor519_ineq_232"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo movablelimits="false">log</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O({w_{i}}{n_{c}}\log ({n_{c}}))$]]></tex-math></alternatives></inline-formula>. In the post-processing process, we have to verify and compute the actual distance between <inline-formula id="j_infor519_ineq_233"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula> and every subsequence <inline-formula id="j_infor519_ineq_234"><alternatives><mml:math>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi></mml:math><tex-math><![CDATA[$cs$]]></tex-math></alternatives></inline-formula> in the candidate set. It takes <inline-formula id="j_infor519_ineq_235"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi mathvariant="italic">m</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">|</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O({n_{c}}m|cs{|^{2}})$]]></tex-math></alternatives></inline-formula> time to conduct the dynamic programming algorithm for the <inline-formula id="j_infor519_ineq_236"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> candidates, where <italic>m</italic> is the number of line segments. Moreover, the re-ordering process is at most <inline-formula id="j_infor519_ineq_237"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo movablelimits="false">log</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O({n_{c}}\log {n_{c}})$]]></tex-math></alternatives></inline-formula> in time complexity.</p>
<p>Since <inline-formula id="j_infor519_ineq_238"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">i</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${w_{i}}$]]></tex-math></alternatives></inline-formula> is a constant, <inline-formula id="j_infor519_ineq_239"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> is proportional to <italic>n</italic>, and <inline-formula id="j_infor519_ineq_240"><alternatives><mml:math>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">c</mml:mi>
<mml:mi mathvariant="italic">s</mml:mi>
<mml:mo stretchy="false">|</mml:mo></mml:math><tex-math><![CDATA[$|cs|$]]></tex-math></alternatives></inline-formula> and <inline-formula id="j_infor519_ineq_241"><alternatives><mml:math>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo stretchy="false">|</mml:mo></mml:math><tex-math><![CDATA[$|Q|$]]></tex-math></alternatives></inline-formula> is much smaller than <italic>n</italic>, we can infer that the time complexity of our approach is <inline-formula id="j_infor519_ineq_242"><alternatives><mml:math>
<mml:mi mathvariant="italic">O</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi>
<mml:mo movablelimits="false">log</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$O(n\log ({n_{c}}))$]]></tex-math></alternatives></inline-formula> theoretically.</p>
</sec>
</sec>
<sec id="j_infor519_s_018">
<label>6</label>
<title>Experiments</title>
<p>In this section, we conduct extensive experiments to verify the effectiveness and efficiency of our approach. All experiments are run on a PC with Intel Core i7-7700K CPU (8 cores @ 4.2 GHz) and 16 GB RAM.</p>
<sec id="j_infor519_s_019">
<label>6.1</label>
<title>Datasets</title>
<sec id="j_infor519_s_020">
<label>6.1.1</label>
<title>Synthetic Datasets</title>
<p>We generate the synthetic sequence as follows. First, we generate a long random walk sequence <italic>T</italic>. Then we embed in <italic>T</italic> some meaningful pattern instances, in which some are taken as queries, and others as target results.</p>
<fig id="j_infor519_fig_003">
<label>Fig. 3</label>
<caption>
<p>Two fundamental patterns.</p>
</caption>
<table-wrap id="j_infor519_tab_001">
<table>
<tbody>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g003.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(a) W-wave</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g004.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(b) Backward wave</td>
</tr>
</tbody>
</table>
</table-wrap>
</fig>
<p>We generate two types of patterns, W-wave from the Adiac dataset in UCR archive (Dau <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_006">2019</xref>), and backward wave, common in stock series. For each pattern, we first generate a seed instance and add some noise following a Gaussian distribution <inline-formula id="j_infor519_ineq_243"><alternatives><mml:math>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>0.05</mml:mn>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo></mml:math><tex-math><![CDATA[$N(0,0.05)$]]></tex-math></alternatives></inline-formula>, as shown in Fig. <xref rid="j_infor519_fig_003">3</xref>. Then we modify them to generate more instances. Specifically, we adopt three types of variations, length, amplitude, and shape as follows. 
<list>
<list-item id="j_infor519_li_007">
<label>∙</label>
<p>Length: It is obvious that the W-wave in Fig. <xref rid="j_infor519_fig_003">3</xref>(a) can be split into four segments. For the middle two segments, we change the segment length with a scaling factor <italic>λ</italic> by inserting or deleting some data points. In other words, for a length-<italic>l</italic> segment, the length of the new segment is <inline-formula id="j_infor519_ineq_244"><alternatives><mml:math>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">λ</mml:mi></mml:math><tex-math><![CDATA[$l\cdot \lambda $]]></tex-math></alternatives></inline-formula>.</p>
</list-item>
<list-item id="j_infor519_li_008">
<label>∙</label>
<p>Amplitude: Similar to the case of length scaling, we increase or decrease the amplitude of the middle two segments in the W-wave. We still use a factor <italic>λ</italic> to control the extent. Specifically, every value in new segment changes to <inline-formula id="j_infor519_ineq_245"><alternatives><mml:math>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">λ</mml:mi></mml:math><tex-math><![CDATA[$v\cdot \lambda $]]></tex-math></alternatives></inline-formula>, where <italic>v</italic> is the original value.</p>
</list-item>
<list-item id="j_infor519_li_009">
<label>∙</label>
<p>Shape: For the backward wave shown in Fig. <xref rid="j_infor519_fig_003">3</xref>(b), we change the global shape of the pattern by modifying the last two segments on both length and amplitude. To be more specific, we change the segment length from <italic>l</italic> to <inline-formula id="j_infor519_ineq_246"><alternatives><mml:math>
<mml:mi mathvariant="italic">l</mml:mi>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">λ</mml:mi></mml:math><tex-math><![CDATA[$l\cdot \lambda $]]></tex-math></alternatives></inline-formula>, and the data values from <italic>v</italic> to <inline-formula id="j_infor519_ineq_247"><alternatives><mml:math>
<mml:mi mathvariant="italic">v</mml:mi>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">λ</mml:mi></mml:math><tex-math><![CDATA[$v\cdot \lambda $]]></tex-math></alternatives></inline-formula>.</p>
</list-item>
</list> 
Note that the three types of variations will influence different features in the line segments. We list them respectively in Table <xref rid="j_infor519_tab_002">1</xref>.</p>
<table-wrap id="j_infor519_tab_002">
<label>Table 1</label>
<caption>
<p>Influence of variations.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"/>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><italic>l</italic></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><italic>θ</italic></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><italic>v</italic></td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin"><italic>ε</italic></td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">Length</td>
<td style="vertical-align: top; text-align: left"><italic>√</italic></td>
<td style="vertical-align: top; text-align: left"/>
<td style="vertical-align: top; text-align: left"/>
<td style="vertical-align: top; text-align: left"><italic>√</italic></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">Amplitude</td>
<td style="vertical-align: top; text-align: left"/>
<td style="vertical-align: top; text-align: left"><italic>√</italic></td>
<td style="vertical-align: top; text-align: left"><italic>√</italic></td>
<td style="vertical-align: top; text-align: left"><italic>√</italic></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">Shape</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><italic>√</italic></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"/>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><italic>√</italic></td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin"><italic>√</italic></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>For each type of variation, we test our approach under different extents. Take the length variation as an example. To generate a dataset, we first set a parameter <italic>r</italic>, which determines the length variation range. Specifically, given <italic>r</italic>, we can only set the length scaling factor <italic>λ</italic> within the range <inline-formula id="j_infor519_ineq_248"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[1/(1+r),1+r]$]]></tex-math></alternatives></inline-formula>. Obviously, the larger the value of <italic>r</italic>, the larger the length scaling extent.</p>
<p>For each variation, given the fixed <italic>r</italic>, we pick out 50 values from the range <inline-formula id="j_infor519_ineq_249"><alternatives><mml:math>
<mml:mo fence="true" stretchy="false">[</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo mathvariant="normal" stretchy="false">/</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
<mml:mo mathvariant="normal">,</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo fence="true" stretchy="false">]</mml:mo></mml:math><tex-math><![CDATA[$[1/(1+r),1+r]$]]></tex-math></alternatives></inline-formula> as the value of <italic>λ</italic>. Then we generate 50 corresponding pattern instances, 40 of which are randomly planted into the random walk long time series. The rest 10 instances form the query set <inline-formula id="j_infor519_ineq_250"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>. A summary of the synthetic datasets is presented in Table <xref rid="j_infor519_tab_003">2</xref>.</p>
<table-wrap id="j_infor519_tab_003">
<label>Table 2</label>
<caption>
<p>A summary of the synthetic datasets.</p>
</caption>
<table>
<thead>
<tr>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Dataset</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Pattern</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Variation</td>
<td style="vertical-align: top; text-align: left; border-top: solid thin; border-bottom: solid thin">Length</td>
</tr>
</thead>
<tbody>
<tr>
<td style="vertical-align: top; text-align: left">1</td>
<td style="vertical-align: top; text-align: left">W-wave</td>
<td style="vertical-align: top; text-align: left">Length</td>
<td style="vertical-align: top; text-align: left">0.7 million</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">2</td>
<td style="vertical-align: top; text-align: left">W-wave</td>
<td style="vertical-align: top; text-align: left">Amplitude</td>
<td style="vertical-align: top; text-align: left">0.7 million</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">3</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">Backward wave</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">Shape</td>
<td style="vertical-align: top; text-align: left; border-bottom: solid thin">0.4 million</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="j_infor519_s_021">
<label>6.1.2</label>
<title>Real Datasets</title>
<p>The real dataset is the train monitoring dataset, which is the time series collected by the vibration sensor. Its total length is 15 million. There exist more than 100 interference subsequences that vary in length and amplitude. The length of these subsequences is within the range of 200 to 2500. Consequently, we maintain a query set of size 15 whose lengths almost distribute uniformly between 200 and 2500. The rest ones are leaved as the target results.</p>
</sec>
</sec>
<sec id="j_infor519_s_022">
<label>6.2</label>
<title>Counterpart Approaches</title>
<p>Note that it is difficult to find reasonable baselines to compare our approach to because the existing methods are only devised for the case of a single query. Since no approach can deal with multiple queries as a whole, we choose two representative subsequence matching algorithms, UCR Suite (Rakthanmanon <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_016">2012</xref>) and SpADe (Chen <italic>et al.</italic>, <xref ref-type="bibr" rid="j_infor519_ref_005">2007</xref>), and then enable them to handle the problem of multiple queries. UCR Suite finds the best normalized subsequences and supports both Euclidean distance and Dynamic Time Warping (UCR-ED and UCR-DTW for short). SpADe finds the shortest path within several local patterns, and is able to handle shifting and scaling both in temporal and amplitude dimensions.</p>
<p>To make UCR-Suite (or SpADe) support multiple queries, we first find out top-<italic>K</italic> similar subsequences for each query based on UCR-Suite (or SpADe). We then sort these <inline-formula id="j_infor519_ineq_251"><alternatives><mml:math>
<mml:mi mathvariant="italic">K</mml:mi>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">N</mml:mi></mml:math><tex-math><![CDATA[$K\cdot N$]]></tex-math></alternatives></inline-formula> number of subsequences in the ascending order of their distances (normalized by the subsequence length) and pick out the top-<italic>K</italic> ones excluding any trivial results as final. For UCR-Suite, we utilize both ED and DTW as the distance, denoted as UCR-ED and UCR-DTW, respectively.</p>
<fig id="j_infor519_fig_004">
<label>Fig. 4</label>
<caption>
<p>Accuracy comparisons under different variations.</p>
</caption>
<table-wrap id="j_infor519_tab_004">
<table>
<tbody>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g005.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(a) Length scaling</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g006.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(b) Amplitude scaling</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g007.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(c) Shape scaling</td>
</tr>
</tbody>
</table>
</table-wrap>
</fig>
<p>Let <italic>N</italic> be the number of queries in <inline-formula id="j_infor519_ineq_252"><alternatives><mml:math>
<mml:mi mathvariant="double-struck">Q</mml:mi></mml:math><tex-math><![CDATA[$\mathbb{Q}$]]></tex-math></alternatives></inline-formula>, we denote our approach and the other three competitors as MQ-<italic>N</italic>, UCR-ED-<italic>N</italic>, UCR-DTW-<italic>N</italic> and SpADe-<italic>N</italic>, respectively. Note that the three rivals have to scan the time series for <italic>N</italic> times. For fairness, we do not count I/O time when comparing efficiency.</p>
</sec>
<sec id="j_infor519_s_023">
<label>6.3</label>
<title>Results on Synthetic Datasets</title>
<fig id="j_infor519_fig_005">
<label>Fig. 5</label>
<caption>
<p>Efficiency comparisons under different variations.</p>
</caption>
<table-wrap id="j_infor519_tab_005">
<table>
<tbody>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g008.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(a) Length scaling</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g009.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(b) Amplitude scaling</td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g010.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(c) Shape scaling</td>
</tr>
</tbody>
</table>
</table-wrap>
</fig>
<p>In the first experiment, we compare our approach MQ with UCR-ED, UCR-DTW, and SpADe on synthetic datasets. Both accuracy and efficiency are tested. To compare these approaches extensively, we vary the parameter <italic>r</italic> for all three variations, length, amplitude and shape. Specifically, for the length variation, we set the maximal scaling factor <italic>r</italic> from 0.25 to 0.7 with step 0.05. For the amplitude variation, we varied <italic>r</italic> in a range of 0.2 to 2 with step 0.2. For the shape variation, we changed <italic>r</italic> in a range of 0.05 to 0.5 with step 0.05.</p>
<p>We set the number of queries, <italic>N</italic>, as 5 and 10, respectively. The experimental results are then indicated by MQ-5, MQ-10, UCR-DTW-5, UCR-DTW-10, SpADe-5, and SpADe-10.<xref ref-type="fn" rid="j_infor519_fn_001">1</xref><fn id="j_infor519_fn_001"><label><sup>1</sup></label>
<p>UCR-ED is omitted for its consistent inferiority to UCR-DTW.</p></fn> For MQ, we set the only parameter, the size of the candidate set <inline-formula id="j_infor519_ineq_253"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula>, to be <inline-formula id="j_infor519_ineq_254"><alternatives><mml:math>
<mml:mn>0.05</mml:mn>
<mml:mo>·</mml:mo>
<mml:mi mathvariant="italic">n</mml:mi></mml:math><tex-math><![CDATA[$0.05\cdot n$]]></tex-math></alternatives></inline-formula>, where <italic>n</italic> is the length of the time series <italic>X</italic>.</p>
<p>In each set of experiments, we attempt to find out the top-40 subsequences in the time series. The accuracy is the ratio between the number of <italic>correct</italic> subsequences and 40. Subsequence <italic>S</italic> is correct, if the overlapping ratio between <italic>S</italic> and certain planted instance exceeds the tolerance parameter, <inline-formula id="j_infor519_ineq_255"><alternatives><mml:math>
<mml:mi mathvariant="italic">ϵ</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo movablelimits="false">min</mml:mo>
<mml:mo maxsize="1.19em" minsize="1.19em" fence="true" mathvariant="normal">(</mml:mo>
<mml:mn>0.8</mml:mn>
<mml:mo mathvariant="normal">,</mml:mo><mml:mstyle displaystyle="false">
<mml:mfrac>
<mml:mrow>
<mml:mo movablelimits="false">min</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo movablelimits="false">max</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">(</mml:mo>
<mml:mo stretchy="false">|</mml:mo>
<mml:mi mathvariant="italic">Q</mml:mi>
<mml:mo stretchy="false">|</mml:mo>
<mml:mo mathvariant="normal" fence="true" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mstyle>
<mml:mo maxsize="1.19em" minsize="1.19em" fence="true" mathvariant="normal">)</mml:mo></mml:math><tex-math><![CDATA[$\epsilon =\min \big(0.8,\frac{\min (|Q|)}{\max (|Q|)}\big)$]]></tex-math></alternatives></inline-formula>.</p>
<p>The results are shown in Fig. <xref rid="j_infor519_fig_004">4</xref> and Fig. <xref rid="j_infor519_fig_005">5</xref>, respectively. It can be seen that MQ outperforms UCR-DTW and SpADe in both accuracy and efficiency under all variations. The reason is that MQ summarizes common characteristics in multiple queries while the other approaches are only able to find out the subsequences that are similar to certain given query. As a consequence, when the number of query sequences <italic>N</italic> increases, UCR-DTW and SpADe yield better results. Nevertheless, UCR-DTW-10 (or SpADe-10) is twice as slow as UCR-DTW-5 (or SpADe-5) on average, which means with more query sequences provided, UCR-DTW and SpADe can find out more satisfying subsequences, but at the cost of efficiency. Instead, MQ captures latent semantics and thus demonstrates its superiority in both accuracy and efficiency under different variations. The only exception is in Fig. <xref rid="j_infor519_fig_004">4</xref>(c), where the accuracy of MQ sharply decreases when <inline-formula id="j_infor519_ineq_256"><alternatives><mml:math>
<mml:mi mathvariant="italic">r</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>0.4</mml:mn></mml:math><tex-math><![CDATA[$r=0.4$]]></tex-math></alternatives></inline-formula>, which is mainly because the undue scaling in shape will result in ambiguity of what users really want.</p>
</sec>
<sec id="j_infor519_s_024">
<label>6.4</label>
<title>Results on Real Datasets</title>
<fig id="j_infor519_fig_006">
<label>Fig. 6</label>
<caption>
<p>Comparison on real dataset.</p>
</caption>
<table-wrap id="j_infor519_tab_006">
<table>
<tbody>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g011.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(a) Accuracy vs. <italic>N</italic></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g012.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(b) Efficiency vs. <italic>N</italic></td>
</tr>
</tbody>
</table>
</table-wrap>
</fig>
<fig id="j_infor519_fig_007">
<label>Fig. 7</label>
<caption>
<p>Influence of parameter <inline-formula id="j_infor519_ineq_257"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula>.</p>
</caption>
<table-wrap id="j_infor519_tab_007">
<table>
<tbody>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g013.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(a) Accuracy vs. <inline-formula id="j_infor519_ineq_258"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center"><graphic xlink:href="infor519_g014.jpg"/></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: center">(b) Efficiency vs. <inline-formula id="j_infor519_ineq_259"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula></td>
</tr>
</tbody>
</table>
</table-wrap>
</fig>
<p>In this experiment, we compare our approach with other ones as the number of query sequences <italic>N</italic> varied on real dataset. We pick out queries from the query set of size 15 so that their lengths distribute uniformly. Note that no matter the value of <italic>N</italic>, we find out top-100 subsequences from the real time series. The results are shown in Fig. <xref rid="j_infor519_fig_006">6</xref>.</p>
<p>It can be seen that in Fig. <xref rid="j_infor519_fig_006">6</xref>(a), the accuracy of MQ, UCR-ED, and UCR-DTW increases when <italic>N</italic> increases while the accuracy of SpADe is consistently low. The reason is that SpADe extracts local patterns by using a fixed size of sliding window, and consequently, it fails to capture the query intuition. Moreover, it is noteworthy that when <inline-formula id="j_infor519_ineq_260"><alternatives><mml:math>
<mml:mi mathvariant="italic">N</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>6</mml:mn></mml:math><tex-math><![CDATA[$N=6$]]></tex-math></alternatives></inline-formula>, the accuracy of MQ has already exceeded 0.9, which means that MQ is able to find out what users really want with only a small query set.</p>
<p>Figure <xref rid="j_infor519_fig_006">6</xref>(b) compares MQ and other approaches on efficiency. Obviously, MQ outperforms all other approaches, and its running time is insensitive to <italic>N</italic> since MQ summarizes all the queries and searches for results in the time series only once. Due to the specific pattern shape and the lack of abundant pruning strategies, UCR-ED is inferior to UCR-DTW in terms of efficiency in this dataset.</p>
</sec>
<sec id="j_infor519_s_025">
<label>6.5</label>
<title>Influence of Parameter <inline-formula id="j_infor519_ineq_261"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula></title>
<p>In this experiment, we investigate the influence of the parameter <inline-formula id="j_infor519_ineq_262"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula>, the size of the candidate set. On the real dataset, we varied <inline-formula id="j_infor519_ineq_263"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> from 0.01 to 0.1, and the number of queries, <italic>N</italic> was set to 3, 6, and 9 respectively.</p>
<p>Results are shown in Fig. <xref rid="j_infor519_fig_007">7</xref>. It can be seen that as <inline-formula id="j_infor519_ineq_264"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> gets larger, the accuracy of MQ increases while the efficiency decreases. It is because once the query set is fixed, we can maintain more candidates by enlarging the candidate set, i.e. increasing the value of <inline-formula id="j_infor519_ineq_265"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> at the cost of more running time. Also, we can find that MQ has already achieved satisfying results when <inline-formula id="j_infor519_ineq_266"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>0.05</mml:mn></mml:math><tex-math><![CDATA[${n_{c}}=0.05$]]></tex-math></alternatives></inline-formula>, so we set the default value of <inline-formula id="j_infor519_ineq_267"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub></mml:math><tex-math><![CDATA[${n_{c}}$]]></tex-math></alternatives></inline-formula> to 0.05 in previous experiments. Generally, as shown in Fig. <xref rid="j_infor519_fig_007">7</xref>(a), the accuracy of MQ increases as the number of queries <italic>N</italic> increases. The only exception is when <inline-formula id="j_infor519_ineq_268"><alternatives><mml:math>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="italic">n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">c</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>0.01</mml:mn></mml:math><tex-math><![CDATA[${n_{c}}=0.01$]]></tex-math></alternatives></inline-formula>. The reason is that the standard deviation of the query feature values experiences an increase as <italic>N</italic> increases, resulting in a larger search space. The small candidate set then fails to maintain enough candidates, and thus achieves poor results. Meanwhile, the changes in the search space cause the running time to increase proportionally, as shown in Fig. <xref rid="j_infor519_fig_007">7</xref>(b).</p>
</sec>
</sec>
<sec id="j_infor519_s_026">
<label>7</label>
<title>Conclusions</title>
<p>In this paper, we have proposed a novel subsequence matching approach, Multi-querying, to reflect the query intuition. Given multiple queries, we use a multi-dimensional probability distribution to represent them. Then, a breadth-first search algorithm is then applied to finding out the top-<italic>K</italic> most similar subsequences. Extensive experiments have demonstrated that Multi-querying outperforms the state-of-the-art algorithms in terms of accuracy and performance. To the best of our knowledge, this is the first study to introduce the concept of multiple queries to express the query intuition.</p>
</sec>
</body>
<back>
<ref-list id="j_infor519_reflist_001">
<title>References</title>
<ref id="j_infor519_ref_001">
<mixed-citation publication-type="journal"><string-name><surname>Abanda</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Mori</surname>, <given-names>U.</given-names></string-name>, <string-name><surname>Lozano</surname>, <given-names>J.A.</given-names></string-name> (<year>2019</year>). <article-title>A review on distance based time series classification</article-title>. <source>Data Mining and Knowledge Discovery</source>, <volume>33</volume>(<issue>2</issue>), <fpage>378</fpage>–<lpage>412</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/3514221.3526183" xlink:type="simple">https://doi.org/10.1145/3514221.3526183</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_002">
<mixed-citation publication-type="journal"><string-name><surname>Boniol</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Palpanas</surname>, <given-names>T.</given-names></string-name> (<year>2020</year>). <article-title>Series2graph: Graph-based subsequence anomaly detection for time series</article-title>. <source>Proceedings of the VLDB Endowment</source>, <volume>13</volume>(<issue>12</issue>), <fpage>1821</fpage>–<lpage>1834</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.14778/3407790.3407792" xlink:type="simple">https://doi.org/10.14778/3407790.3407792</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_003">
<mixed-citation publication-type="chapter"><string-name><surname>Boniol</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Linardi</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Roncallo</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Palpanas</surname>, <given-names>T.</given-names></string-name> (<year>2020</year>). <chapter-title>Automated anomaly detection in large sequences</chapter-title>. In: <source>2020 IEEE 36th International Conference on Data Engineering (ICDE)</source>, <conf-loc>Dallas, TX, USA</conf-loc>, pp. <fpage>1834</fpage>–<lpage>1837</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/ICDE48307.2020.00182" xlink:type="simple">https://doi.org/10.1109/ICDE48307.2020.00182</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_004">
<mixed-citation publication-type="chapter"><string-name><surname>Boniol</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Meftah</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Remy</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Palpanas</surname>, <given-names>T.</given-names></string-name> (<year>2022</year>). <chapter-title>DCAM: dimension-wise class activation map for explaining multivariate data series classification</chapter-title>. In: <source>Proceedings of the 2022 International Conference on Management of Data, SIGMOD ’22</source>. <publisher-name>Association for Computing Machinery</publisher-name>, <publisher-loc>New York, NY, USA</publisher-loc>, pp. <fpage>1175</fpage>–<lpage>1189</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/3514221.3526183" xlink:type="simple">https://doi.org/10.1145/3514221.3526183</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_005">
<mixed-citation publication-type="chapter"><string-name><surname>Chen</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Nascimento</surname>, <given-names>M.A.</given-names></string-name>, <string-name><surname>Ooi</surname>, <given-names>B.C.</given-names></string-name>, <string-name><surname>Tung</surname>, <given-names>A.K.H.</given-names></string-name> (<year>2007</year>). <chapter-title>SpADe: on shape-based pattern detection in streaming time series</chapter-title>. In: <source>Proceedings of the 23rd International Conference on Data Engineering, ICDE 2007</source>, <conf-loc>The Marmara Hotel, Istanbul, Turkey, April 15–20, 2007</conf-loc>, pp. <fpage>786</fpage>–<lpage>795</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/ICDE.2007.367924" xlink:type="simple">https://doi.org/10.1109/ICDE.2007.367924</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_006">
<mixed-citation publication-type="journal"><string-name><surname>Dau</surname>, <given-names>H.A.</given-names></string-name>, <string-name><surname>Bagnall</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Kamgar</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Yeh</surname>, <given-names>C.-C.M.</given-names></string-name>, <string-name><surname>Zhu</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Gharghabi</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Ratanamahatana</surname>, <given-names>C.A.</given-names></string-name>, <string-name><surname>Keogh</surname>, <given-names>E.</given-names></string-name> (<year>2019</year>). <article-title>The UCR time series archive</article-title>. <source>IEEE/CAA Journal of Automatica Sinica</source>, <volume>6</volume>(<issue>6</issue>), <fpage>1293</fpage>–<lpage>1305</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/JAS.2019.1911747" xlink:type="simple">https://doi.org/10.1109/JAS.2019.1911747</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_007">
<mixed-citation publication-type="journal"><string-name><surname>Eravci</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>Ferhatosmanoglu</surname>, <given-names>H.</given-names></string-name> (<year>2013</year>). <article-title>Diversity based relevance feedback for time series search</article-title>. <source>Proceedings of the VLDB Endowment</source>, <volume>7</volume>(<issue>2</issue>), <fpage>109</fpage>–<lpage>120</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.14778/2732228.2732230" xlink:type="simple">https://doi.org/10.14778/2732228.2732230</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_008">
<mixed-citation publication-type="journal"><string-name><surname>Hu</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Letson</surname>, <given-names>F.</given-names></string-name>, <string-name><surname>Barthelmie</surname>, <given-names>R.J.</given-names></string-name>, <string-name><surname>Pryor</surname>, <given-names>S.C.</given-names></string-name> (<year>2018</year>). <article-title>Wind gust characterization at wind turbine relevant heights in moderately complex terrain</article-title>. <source>Journal of Applied Meteorology and Climatology</source>, <volume>57</volume>(<issue>7</issue>), <fpage>1459</fpage>–<lpage>1476</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1175/JAMC-D-18-0040.1" xlink:type="simple">https://doi.org/10.1175/JAMC-D-18-0040.1</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_009">
<mixed-citation publication-type="journal"><string-name><surname>Iwana</surname>, <given-names>B.K.</given-names></string-name>, <string-name><surname>Uchida</surname>, <given-names>S.</given-names></string-name> (<year>2020</year>). <article-title>Time series classification using local distance-based features in multi-modal fusion networks</article-title>. <source>Pattern Recognition</source>, <volume>97</volume>, <fpage>107024</fpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1016/j.patcog.2019.107024" xlink:type="simple">https://doi.org/10.1016/j.patcog.2019.107024</ext-link>. <uri>https://www.sciencedirect.com/science/article/pii/S0031320319303279</uri>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_010">
<mixed-citation publication-type="journal"><string-name><surname>Keogh</surname>, <given-names>E.</given-names></string-name>, <string-name><surname>Chakrabarti</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Pazzani</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Mehrotra</surname>, <given-names>S.</given-names></string-name> (<year>2001</year>). <article-title>Dimensionality reduction for fast similarity search in large time series databases</article-title>. <source>Knowledge and Information Systems</source>, <volume>3</volume>(<issue>3</issue>), <fpage>263</fpage>–<lpage>286</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/PL00011669" xlink:type="simple">https://doi.org/10.1007/PL00011669</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_011">
<mixed-citation publication-type="journal"><string-name><surname>Kondylakis</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Dayan</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Zoumpatianos</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Palpanas</surname>, <given-names>T.</given-names></string-name> (<year>2018</year>). <article-title>Coconut: a scalable bottom-up approach for building data series indexes</article-title>. <source>Proceedings of the VLDB Endowment</source>, <volume>11</volume>(<issue>6</issue>), <fpage>677</fpage>–<lpage>690</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.14778/3184470.3184472" xlink:type="simple">https://doi.org/10.14778/3184470.3184472</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_012">
<mixed-citation publication-type="chapter"><string-name><surname>Li</surname>, <given-names>Y.</given-names></string-name>, <string-name><surname>Tang</surname>, <given-names>B.</given-names></string-name>, <string-name><surname>U</surname>, <given-names>L.H.</given-names></string-name>, <string-name><surname>Yiu</surname>, <given-names>M.L.</given-names></string-name>, <string-name><surname>Gong</surname>, <given-names>Z.</given-names></string-name> (<year>2017</year>). <chapter-title>Fast subsequence search on time series data</chapter-title>. In: <source>Proceedings of the 20th International Conference on Extending Database Technology, EDBT 2017</source>, <conf-loc>Venice, Italy, March 21–24, 2017</conf-loc>, pp. <fpage>514</fpage>–<lpage>517</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.5441/002/edbt.2017.58" xlink:type="simple">https://doi.org/10.5441/002/edbt.2017.58</ext-link>. <uri>https://openproceedings.org/2017/conf/edbt/paper-369.pdf</uri>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_013">
<mixed-citation publication-type="journal"><string-name><surname>Linardi</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Palpanas</surname>, <given-names>T.</given-names></string-name> (<year>2018</year>). <article-title>Scalable, variable-length similarity search in data series: the ULISSE approach</article-title>. <source>Proceedings of the VLDB Endowment</source>, <volume>11</volume>(<issue>13</issue>), <fpage>2236</fpage>–<lpage>2248</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.14778/3275366.3284968" xlink:type="simple">https://doi.org/10.14778/3275366.3284968</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_014">
<mixed-citation publication-type="chapter"><string-name><surname>Muthumanickam</surname>, <given-names>P.K.</given-names></string-name>, <string-name><surname>Vrotsou</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Cooper</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Johansson</surname>, <given-names>J.</given-names></string-name> (<year>2016</year>). <chapter-title>Shape grammar extraction for efficient query-by-sketch pattern matching in long time series</chapter-title>. In: <source>11th IEEE Conference on Visual Analytics Science and Technology, IEEE VAST 2016</source>, <conf-loc>Baltimore, MD, USA, October 23–28, 2016</conf-loc>. <publisher-name>IEEE Computer Society</publisher-name>, pp. <fpage>121</fpage>–<lpage>130</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/VAST.2016.7883518" xlink:type="simple">https://doi.org/10.1109/VAST.2016.7883518</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_015">
<mixed-citation publication-type="journal"><string-name><surname>Papapetrou</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Athitsos</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Potamias</surname>, <given-names>M.</given-names></string-name>, <string-name><surname>Kollios</surname>, <given-names>G.</given-names></string-name>, <string-name><surname>Gunopulos</surname>, <given-names>D.</given-names></string-name> (<year>2011</year>). <article-title>Embedding-based subsequence matching in time-series databases</article-title>. <source>ACM Transactions on Database Systems</source>, <volume>36</volume>(<issue>3</issue>), <fpage>17</fpage>–<lpage>11739</lpage>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_016">
<mixed-citation publication-type="chapter"><string-name><surname>Rakthanmanon</surname>, <given-names>T.</given-names></string-name>, <string-name><surname>Campana</surname>, <given-names>B.J.L.</given-names></string-name>, <string-name><surname>Mueen</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Batista</surname>, <given-names>G.E.A.P.A.</given-names></string-name>, <string-name><surname>Westover</surname>, <given-names>M.B.</given-names></string-name>, <string-name><surname>Zhu</surname>, <given-names>Q.</given-names></string-name>, <string-name><surname>Zakaria</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Keogh</surname>, <given-names>E.J.</given-names></string-name> (<year>2012</year>). <chapter-title>Searching and mining trillions of time series subsequences under dynamic time warping</chapter-title>. In: <source>The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD’12</source>, <conf-loc>Beijing, China, August 12–16, 2012</conf-loc>. <publisher-name>ACM</publisher-name>, pp. <fpage>262</fpage>–<lpage>270</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/2339530.2339576" xlink:type="simple">https://doi.org/10.1145/2339530.2339576</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_017">
<mixed-citation publication-type="chapter"><string-name><surname>Shieh</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Keogh</surname>, <given-names>E.</given-names></string-name> (<year>2008</year>). <chapter-title>ISAX: indexing and mining terabyte sized time series</chapter-title>. In: <source>Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’08</source>. <publisher-name>Association for Computing Machinery</publisher-name>, <publisher-loc>New York, NY, USA</publisher-loc>, pp. <fpage>623</fpage>–<lpage>631</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/1401890.1401966" xlink:type="simple">https://doi.org/10.1145/1401890.1401966</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_018">
<mixed-citation publication-type="journal"><string-name><surname>Wang</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Li</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Sun</surname>, <given-names>H.</given-names></string-name>, <string-name><surname>Guo</surname>, <given-names>Z.</given-names></string-name>, <string-name><surname>Bai</surname>, <given-names>Y.</given-names></string-name> (<year>2018</year>). <article-title>Shapelet classification algorithm based on efficient subsequence matching</article-title>. <source>Data Science Journal</source>, <volume>17</volume>(<issue>6</issue>), <fpage>1</fpage>–<lpage>12</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.5334/dsj-2018-006" xlink:type="simple">https://doi.org/10.5334/dsj-2018-006</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_019">
<mixed-citation publication-type="journal"><string-name><surname>Wang</surname>, <given-names>Q.</given-names></string-name>, <string-name><surname>Whitmarsh</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Navarro</surname>, <given-names>V.</given-names></string-name>, <string-name><surname>Palpanas</surname>, <given-names>T.</given-names></string-name> (<year>2022</year>). <article-title>IEDeaL: a deep learning framework for detecting highly imbalanced interictal epileptiform discharges</article-title>. <source>Proceedings of the VLDB Endowment</source>, <volume>16</volume>(<issue>3</issue>), <fpage>480</fpage>–<lpage>490</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.14778/3570690.3570698" xlink:type="simple">https://doi.org/10.14778/3570690.3570698</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_020">
<mixed-citation publication-type="chapter"><string-name><surname>Wasay</surname>, <given-names>A.</given-names></string-name>, <string-name><surname>Wei</surname>, <given-names>X.</given-names></string-name>, <string-name><surname>Dayan</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Idreos</surname>, <given-names>S.</given-names></string-name> (<year>2017</year>). <chapter-title>Data canopy: accelerating exploratory statistical analysis</chapter-title>. In: <source>Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017</source>, <conf-loc>Chicago, IL, USA, May 14–19, 2017</conf-loc>, <publisher-name>ACM</publisher-name>, pp. <fpage>557</fpage>–<lpage>572</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1145/3035918.3064051" xlink:type="simple">https://doi.org/10.1145/3035918.3064051</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_021">
<mixed-citation publication-type="chapter"><string-name><surname>Wu</surname>, <given-names>J.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>P.</given-names></string-name>, <string-name><surname>Pan</surname>, <given-names>N.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>C.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>W.</given-names></string-name>, <string-name><surname>Wang</surname>, <given-names>J.</given-names></string-name> (<year>2019</year>). <chapter-title>KV-match: a subsequence matching approach supporting normalization and time warping</chapter-title>. In: <source>35th IEEE International Conference on Data Engineering, ICDE 2019</source>, <conf-loc>Macao, China, April 8–11, 2019</conf-loc>. <publisher-name>IEEE</publisher-name>, pp. <fpage>866</fpage>–<lpage>877</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1109/ICDE.2019.00082" xlink:type="simple">https://doi.org/10.1109/ICDE.2019.00082</ext-link>.</mixed-citation>
</ref>
<ref id="j_infor519_ref_022">
<mixed-citation publication-type="journal"><string-name><surname>Zoumpatianos</surname>, <given-names>K.</given-names></string-name>, <string-name><surname>Idreos</surname>, <given-names>S.</given-names></string-name>, <string-name><surname>Palpanas</surname>, <given-names>T.</given-names></string-name> (<year>2016</year>). <article-title>ADS: the adaptive data series index</article-title>. <source>VLDB Endowment</source>, <volume>25</volume>(<issue>6</issue>), <fpage>843</fpage>–<lpage>866</lpage>. <ext-link ext-link-type="doi" xlink:href="https://doi.org/10.1007/s00778-016-0442-5" xlink:type="simple">https://doi.org/10.1007/s00778-016-0442-5</ext-link>.</mixed-citation>
</ref>
</ref-list>
</back>
</article>
