1 Introduction
In recent years, with the popularization and application of GPS-enabled devices, massive volumes of GPS trajectory data are recorded. These data hide interesting and valuable knowledge patterns describing the behaviour of moving objects (Morzy and Rosikiewicz, 2007). Individual trajectories can reflect professional characteristics of the user, trajectories with high degree of spatio-temporal regularities on weekdays imply the user has a fixed full-time job, trajectories with irregular spatio-temporal characteristics in a week imply work unit of the user is not fixed (Li et al., 2014). Group trajectories contain a lot of valuable knowledge patterns. According to these patterns, people can find restaurants and travel routes that interest them. Urban function units (e.g. residential area, commercial area and industrial area) also can be distinguished by using these patterns (Li et al., 2014).
Typically, the size of raw trajectory data recorded by GPS-enabled devices is very large. Consider Beijing with 67,000 taxis, suppose we collect trajectory data in every 2–3 seconds, the size of the trajectories generated by these taxis for just a single day is as high as 60TB (Yuan, 2012). Such massive volumes of data will bring some problems for location-based applications. (1) It will take up a lot of storage spaces. (2) Sending a large amount of GPS trajectory data will cause network overhead. (3) It will bring a heavy burden for web browsers when rendering these trajectories on the client side. It takes approximately 1s for rendering 1000 points on the map (Chen et al., 2012). (4) When the size of trajectory data gets larger, it becomes more difficult to find valuable patterns from the trajectory data. Reducing the size of trajectory data has the potential to make it more easily (Giannotti et al., 2007).
1.1 Existing Trajectory Compression Algorithms
To solve the trajectory compression problem various algorithms are proposed, each offers a different method to compress trajectory data. In this section, we will briefly introduce the related algorithms of trajectory compression.
Douglas-Peucker (DP) algorithm (Douglas and Peucker, 1973) is widely used in computer aided design (CAD) area, it can be employed to compress trajectory data. Given a trajectory T and a parameter of spatial error μ, DP algorithm constructs the approximate trajectory ${T^{\prime }}$ by repeatedly adding points from T until the maximum spatial error of ${T^{\prime }}$ becomes smaller than μ. DP algorithm can effectively compress trajectory data, but it only focuses on spatial information. TD–TR (Meratnia and de By, 2004) is an improvement algorithm of DP, which uses SED error instead of spatial error. Compared to DP algorithm, TD–TR algorithm has the benefit of taking temporal information into account. Optimal algorithms (Perez and Vidal, 1994; Salotti, 2000, 2001) aim at minimizing spatial error, by removing positioning points in searching process they can achieve a minimum spatial error compression. Due to the computational overhead of the optimal algorithms, near-optimal algorithms are proposed to reduce the time complexity. Near-optimal algorithms proposed in papers (Kolesnikov and Franti, 2002, 2003) can achieve a faster search by reducing the search space and using heuristic search. Paper (Tian and Xu, 2011) proposes a trajectory compression method based on turning point judgment method, this algorithm uses turning points to delineate a trajectory, in which the advantages and disadvantages of point-by-point judgment method and combined judgment method are analysed. Trajectory simplification (TS) algorithm is proposed in paper (Chen et al., 2009), which considers both the shape skeleton and the semantic meanings of a GPS trajectory. TS algorithm uses heading change degree of a point and distance between this point and its most adjacent neighbours to weight importance of the point, the points with high weight will be retained in final compressed trajectory. Open Window Time Ratio (OPW-TR) (Meratnia and de By, 2004) is a compression algorithm for trajectory data, given a trajectory T and a parameter of SED error μ, OPW-TR algorithm starts by defining a segment between the first point P1 (the anchor point) and the third point P3 (the float point). Then the algorithm calculates all SED errors of the points between the anchor point and the float point. As long as all SED values are below μ, an attempt is made to move the float point one point forward in trajectory T. When the SED threshold μ is going to be exceeded, a new anchor point will be chosen out. The process will repeat until the entire trajectory has been transformed into a piecewise linear approximation. Based on the different anchor point select strategies, OPW-TR has two modes called Before-OPW-TR and Normal-OPW-TR. Threshold-guided algorithm (Potamias et al., 2006) compresses trajectory by constructing a safe area using moving object’s speed and direction, if an incoming positioning point is in the safe area, then this point contributes little information and hence can be discarded without significant loss in accuracy. Spatial QUalIty Simplification Heuristic – Extended (SQUISH-E) algorithm (Muckell et al., 2014) is an extended version of SQUISH algorithm (Muckell et al., 2011). Compared to SQUISH algorithm, SQUISH-E algorithm has the capability of compressed trajectories while ensuring that SED error is within a user-specified bound. The main idea of SQUISH-E algorithm is to construct a priority queue of positioning points, this algorithm compresses trajectory by repeatedly removing the positioning point of the lowest priority until the termination condition is met. And based on parameters setting, SQUISH-E can be divided into SQUISH-E(μ) and SQUISH-E(λ). A hybrid trajectory compression algorithm based on multiple spatiotemporal characteristics is proposed in paper (Jiagao et al., 2015), this algorithm uses characteristic points to delineate a trajectory, and the characteristic points are chosen by using both distance, direction and speed information. Articles (Qian and Lu, 2017; Sun et al., 2016) are a review of related algorithms and have certain reference value. Document (Cao and Li, 2017) proposes DOTS (Directed acyclic graph based Online Trajectory Simplification) algorithm and constructs a directed acyclic graph on-line simplification.
1.2 The Proposed Algorithm
The positioning points where an object changed moving direction contain rich semantic meanings, these points imply user’s behaviour, such as, staying, taking photos or watching something attractive, etc. Without these positioning points, we will know little about user’s behaviour. In this paper, we propose a heading maintaining oriented trajectory compression algorithm called HMOTC.
The HMOTC algorithm has the benefit of taking into account both the heading change degree of positioning points and the direction change degree between positioning points and trajectory segment. So, this algorithm can accurately capture the direction information of the trajectory. As shown in Fig. 1, a multi transportation mode (walk → bus → walk → bus → walk → train) GPS trajectory from north to south is compressed by DP, TD-TR and our HMOTC. For traditional position-preserving compression algorithms DP and TD-TR, we could know little about how the trajectory’s user walked on walk segment from the compressed trajectories, because they lose too much of the moving object’s heading information. For HMOTC algorithms, because the direction information of the trajectory is taken into account, we can know more details about user’s walking direction and walking route from the compressed trajectories.
Another benefit of HMOTC algorithm is that the algorithm has the ability of controlling compression with respect to spatial error. Without spatial information control it may lead to an inaccuracy compression. To illustrate that, let us take DPTS-SP-Prac (Long et al., 2013) algorithm as an example. DPTS-SP-Prac is a direction-preserving trajectory compression algorithm which focuses on capturing the direction change degree between positioning points and trajectory segment, the algorithm has a theoretical bounded position error, but it is uncontrollable. As shown in Fig. 2, the multi transportation mode trajectory is compressed under a ${40^{\circ }}$ angle threshold. The DPTS-SP-Prac algorithm maintains a good trajectory contour in the walk segment and the bus segment. However, as the train segment has the characteristics of high speed and small direction change, a lot of spatial information is lost, the maximum perpendicular distance error of DPTS-SP-Prac in train segment is as high as 2.2 km (refer to Fig. 2, dashed box area). For the proposed algorithm, because it has the ability of spatial information control, the algorithm can maintain a good trajectory contour in walk, bus and train segments.
The HMOTC has the flexibility of controlling compression with respect to both spatial error and direction error, so the compression process can be precisely controlled according to users’ needs. If only the direction information of the trajectory is taken into account, the SED threshold of the algorithm can be set to positive infinity. If only the spatial information of the trajectory is taken into account, the angle threshold of the algorithm can be set to ${180^{\circ }}$. Because the algorithm can compress trajectory data while retrieving points from trajectory, it has the advantages of supporting both online and offline applications.
The remainder of this paper is organized as follows. In Section 2, some related definitions about trajectory compression are reported. After that our HMOTC algorithm is described in Section 3. An empirical evaluation of trajectory compression algorithms with different error measurements is given in Section 4. Finally, paper conclusions are discussed in Section 5.
2 Related Definitions
According to the loss of information, trajectory compression algorithms can be classified into two categories: lossless and lossy compression. Lossless compression algorithms enable reconstruction of the original trajectory data without information loss. Lossy compression will cause information loss, the trajectory data will be changed and can not be reconstructed. Usually, the compression efficiency of lossless compression algorithms is not obvious. For example, the compression efficiency of a lossless compression algorithm proposed in Zheng et al. (2005) is 25%. In contrast, lossy compression can significantly reduce the data size while maintaining an acceptable error tolerance. Due to the advantage of lossy compression, this paper focuses on lossy compression of trajectory data.
Generally, a GPS trajectory T is a temporally ordered sequence of positioning points $T={P_{1}},{P_{2}},{P_{3}},\dots ,{P_{n-1}},{P_{n}}$, each point ${P_{i}}$ consists of three tuples $\langle {X_{i}},{Y_{i}},{t_{i}}\rangle $, where ${X_{i}}$, ${Y_{i}}$ represent the coordinate at time stamp ${t_{i}}$. The problem of lossy trajectory compression is to seek a set of temporally ordered points ${T^{\prime }}={P_{i1}},{P_{i2}},{P_{i3}},\dots ,{P_{i[m-1]}},{P_{im}}$ (a subset of T) as an approximation of T, where $1={i_{1}}<\cdots <{i_{m}}=n$. And the definition of compression ratio CR in this paper is defined as
2.1 Error Measures
There are many error measures to evaluate the accuracy of trajectory compression algorithms. In this section, we will introduce four error measures called spatial error, Synchronous Euclidean Distance (SED) (Potamias et al., 2006) error, speed error and heading error.
Definition 1.
Spatial error: Given a trajectory T and its compressed representation ${T^{\prime }}$, the spatial error of ${T^{\prime }}$ with respect to a point ${P_{i}}$ in T is defined as the distance between ${P_{i}}$ (${x_{i}},{y_{i}},{t_{i}}$) and its estimation ${P^{\prime }_{i}}$ (${x^{\prime }_{i}},{y^{\prime }_{i}},{t_{i}}$). If ${T^{\prime }}$ contains ${P_{i}}$, then ${P^{\prime }_{i}}={P_{i}}$. Otherwise, the closest point to ${P_{i}}$ is defined as ${P^{\prime }_{i}}$ which is along the line between pred${T^{\prime }}$ (${P_{i}}$) and succT′ (${P_{i}}$), where predT′ (${P_{i}}$) and succT′ (${P_{i}}$) denote ${P_{i}}$’s closest predecessor and successor among the points in ${T^{\prime }}$ (Muckell et al., 2014).
Definition 2.
SED error: Synchronous euclidean distance is an error measure which considers both spatial and temporal information. The definition of SED error is similar to spatial error, the SED error of ${T^{\prime }}$ with respect to a point ${P_{i}}$ in T is defined as the distance between ${P_{i}}$ (${x_{i}},{y_{i}},{t_{i}}$) and its estimation ${P^{\prime }_{i}}$ (${x^{\prime }_{i}},{y^{\prime }_{i}},{t_{i}}$). The difference is the coordinate ${x^{\prime }_{i}},{y^{\prime }_{i}}$ of ${P^{\prime }_{i}}$ (${x^{\prime }_{i}},{y^{\prime }_{i}},{t_{i}}$) are defined using formulas (4) and (5), where ${P_{s}}$ (${x_{s}},{y_{s}},{t_{s}}$) = predT′ (${P_{i}}$) and ${P_{e}}$ (${x_{e}},{y_{e}},{t_{e}}$) = succ T′ (${P_{i}}$).
For instance, in Fig. 3(a), spatial errors of ${P_{1}}$, ${P_{4}}$, ${P_{6}}$ are zero and spatial error of ${P_{2}}$ is the perpendicular distance from ${P_{2}}$ to line ${P_{1}}{P_{4}}$. In Fig. 3(b), SED errors of ${P_{1}},{P_{4}},{P_{6}}$ are zero and SED error of ${P_{2}}$ is the distance between ${P_{2}}$ and ${P^{\prime }_{2}}$.
Fig. 3
Spatial and SED errors are illustrated by using $T={P_{1}},{P_{2}},{P_{3}},{P_{4}},{P_{5}},{P_{6}}$ and ${T^{\prime }}={P_{1}},{P_{4}},{P_{6}}$.
Definition 3.
Speed error: Given a trajectory T and its compressed representation ${T^{\prime }}$, the speed error of ${T^{\prime }}$ with respect to a point ${P_{i}}$ (${x_{i}},{y_{i}},{t_{i}}$) in T is defined as the absolute difference value between Speed (${P_{i}}$) and AverageSpeed (${P_{s}}{P_{e}}$), where ${P_{s}}({x_{s}},{y_{s}},{t_{s}})=\text{pred}{T^{\prime }}({P_{i+1}})$, ${P_{e}}({x_{e}},{y_{e}},{t_{e}})=\text{succ}{T^{\prime }}({P_{i}})$. ${P_{i}}$’s speed and average speed of segment ${P_{s}}{P_{e}}$ are defined as follows:
Definition 4.
Heading error: given a trajectory T and its compressed representation ${T^{\prime }}$, the heading error of ${T^{\prime }}$ with respect to a point ${P_{i}}$ in T is defined as heading change between Heading (${P_{i}}$) and Heading (${P_{s}}{P_{e}}$), where ${P_{s}}=\text{pred}{T^{\prime }}({P_{i}}+1)$, ${P_{e}}=succ{T^{\prime }}({P_{i}})$. ${P_{i}}$’s heading, heading of segment ${P_{s}}{P_{e}}$ and HeadingChange (${h_{1}},{h_{2}}$) are defined as follows:
3 HMOTC Algorithm
A key feature of our HMOTC algorithm is that it takes into account the direction information of GPS trajectories, the algorithm controls both the heading change degree of positioning points and the direction change degree between positioning points and trajectory segment.
3.1 Algorithm Description
Given a trajectory T, a SED threshold μ and an angle threshold θ, HMOTC algorithm starts by defining a segment between the first point ${P_{1}}$ and the third point ${P_{3}}$ in T, where ${P_{1}}$ is called anchor point ${P_{a}}$ and ${P_{3}}$ is called float point ${P_{f}}$. Then the following steps are applied in segment (window) ${P_{a}}{P_{f}}$:
(1) Checking whether predT(${P_{f}}$) is a turning point by calculating heading change α between points predT(${P_{f}}$) and predT(predT(${P_{f}}$)). If $\alpha >\theta $ (i.e. predT(${P_{f}}$) is a turning point), then predT(${P_{f}}$) becomes the new anchor point and the second point behind predT(${P_{f}}$) becomes the new float point.
(2) It is not enough to capture the heading change of a moving object by calculating heading change of a single point, this policy is vulnerable to error propagation, as a moving object could change its heading gradually. In order to address this problem, the heading change of any two points in segment ${P_{a}}{P_{f}}$ (except point ${P_{f}}$) and the heading change between segment ${P_{a}}{P_{f}}$ and each point in segment ${P_{a}}{P_{f}}$ (except point ${P_{f}}$) are calculated. If there is a heading change value greater than the given angle threshold θ, then the point (between ${P_{a}}$ and ${P_{f}}$) of the minimum Accumulated Synchronous Euclidean Distance (ASED) error becomes the new anchor point and the second point behind it becomes the new float point.
(3) For maintaining shape skeleton of the trajectory and minimizing the loss of spatiotemporal information, SED errors of the points between ${P_{a}}$ and ${P_{f}}$ are calculated. If there is a SED error value greater than the given SED threshold μ, then the point (between ${P_{a}}$ and ${P_{f}}$) of the minimum ASED error becomes the new anchor point and the second point behind it becomes the new float point.
As long as the anchor point ${P_{a}}$ is not changed, an attempt is made to move the float point ${P_{f}}$ one point forward in trajectory T, and then apply the above steps in segment ${P_{a}}{P_{f}}$. When there is a new anchor point, the above steps can be applied in segment ${P_{a}}{P_{a+2}}$ (i.e. ${P_{a}}{P_{f}}$), where ${P_{a}}$ is the new anchor point and ${P_{a+2}}$ is the new float point. The above process will be repeated until the entire trajectory T has been transformed into a piecewise linear approximation ${T^{\prime }}$.
For compressing trajectories more accurate, ASED error measure is used for anchor point selection. Given anchor point ${P_{a}}$ and float point ${P_{f}}$, the ASED of ${P_{i}}$ $(f>i>a)$ is defined as
where ${T_{a}}f={P_{a}},{P_{a+1}},\dots ,{P_{f}}$ and ${T^{\prime }_{a}}f={P_{a}},{P_{i}},{P_{f}}$. By using ASED the most appropriate anchor point can be selected. For example, as shown in Fig. 4, ${T_{a}}f={P_{1}},{P_{2}},{P_{3}},{P_{4}},{P_{5}},{P_{6}},{P_{7}},{P_{8}},{P_{9}}$, where ${P_{1}}$ is the anchor point and ${P_{9}}$ is the float point. For selecting a new anchor point between ${P_{a}}$ and ${P_{f}}$, the ASED errors of ${P_{2}},{P_{3}},{P_{4}},{P_{5}},{P_{6}},{P_{7}},{P_{8}}$ are calculated. Due to ${P_{6}}$ has the minimum ASED error, ${P_{6}}$ and ${P_{8}}$ become new anchor point and new float point. And as seen in Fig. 4, the ${P_{6}}$ is the appropriate anchor point. Given a trajectory T of length n, HMOTC algorithm has $O({n^{2}})$ worst-case running time.
(11)
\[ \mathit{ASED}({P_{i}})={\sum \limits_{i=a}^{f}}\mathit{SEDError}\big({T_{a}}f,{T^{\prime }_{a}}f,{P_{i}}\big),\]3.2 Pseudo-Code of HMOTC Algorithm
Figure 5 describes the details of HMOTC algorithm. First ${P_{1}}$ and ${P_{3}}$ are selected as anchor point ${P_{a}}$ and float point ${P_{f}}$ (lines 2 and 3). Then the algorithm will do some checks in segment ${P_{a}}{P_{f}}$ (line 7), as long as the anchor point ${P_{a}}$ is not changed (i.e. index = −1), then move the float point ${P_{f}}$ one point forward (line 13). Otherwise, ${P_{\mathit{index}}}$ becomes the new anchor point and ${P_{\mathit{index}}}+2$ becomes the new float point (lines 10 and 11). When the entire trajectory T has been transformed into a piecewise linear approximation ${T^{\prime }}$, the compressed trajectory ${T^{\prime }}$ will be returned (line 20).
Figure 6 provides a detailed description of findNewAnchorIndex algorithm. This algorithm is used to find the new anchor point according to SED error and heading error, if there is no new anchor point found, then the value −1 will be returned (value −1 means that the segment ${P_{a}}{P_{f}}$ can represent the sub-trajectory ${T_{\mathit{sub}}}={P_{a}},{P_{a}}+1,\dots ,{P_{f}}$ without too much loss of information). First, this algorithm calculates the heading change of predT(${P_{f}}$) (line 3). Then, the algorithm checks whether predT(${P_{f}}$) is a turning point (line 4), if predT(${P_{f}}$) is a turning point then predT(${P_{f}}$) becomes the new anchor point (line 5). After that, the algorithm will calculate three errors. (1) calculate each point’s SED error (line 8). (2) calculate heading changes between segment ${P_{a}}{P_{f}}$ and each point (line 9). (3) calculate heading changes of any two points (line 12). If any errors are greater than the given parameters (line 13), then findNewAnchorIndex algorithm seeking the point which has the minimum ASED error (line 15) and return the point’s index (line 16). When all points in segment ${P_{a}}{P_{f}}$ (expect ${P_{f}}$) are processed, value −1 will be returned (line 19).
4 Evaluations
In order to verify the performance of the proposed algorithm, we implemented HMOTC algorithm and other algorithms by using Scala language. And Geolife dataset (Zheng et al., 2009, 2008, 2010) was used for algorithms evaluating. The Microsoft Geolife dataset was obtained from 182 users over a period of five years (from April 2007 to August 2012), it contains 17,621 trajectories with a total distance of 1,292,951 kilometers and a total duration of 50,176 hours. 91.5 percent of the trajectories are logged in a dense representation, e.g. every 1–5 seconds or every 5–10 meters per point. This dataset can support transportation mode learning, 73 users have labelled their trajectories with transportation mode, such as driving, taking a bus, riding a bike and walking. In our evaluations, three labelled trajectories were used. Trajectory one is a multi-modal trajectory, it contains three transportation modes (walk, bus, train), 5911 positioning points, and a total duration of 3 hours 49 minutes (from 2008-06-18, 12:10:33 to 2008-06-18, 15:59:59). Trajectory two is a bus track, it contains 2045 positioning points, and a total duration of 50 minutes (from 2008-06-18, 12:33:34 to 2008-06-18, 13:23:02). Trajectory three is a taxi track in motorway, it contains 2167 positioning points, and a total duration of 37 minutes (from 2008-04-04, 07:16:50 to 2008-04-04, 07:53:00).
Our evaluation did not include algorithms which do not consider temporal information, because these algorithms will introduce larger temporal errors. And the direction-preserving algorithm DPTS-SP-Prac is also not included, because the algorithm lacks the control ability of the spatial error, the algorithm will introduce a larger spatial error. And the algorithms which use speed as a parameter are also not included, because it is hard to find an appropriate speed value as the parameter for a trajectory which contains multiple transportation modes. Various error metrics including average SED error, average spatial error, average speed error and average heading error were used in our evaluation. Given a trajectory $T={P_{1}},{P_{2}},{P_{3}},\dots ,{P_{n-1}},{P_{n}}$ and its compressed representation ${T^{\prime }}={P_{i1}},{P_{i2}},{P_{i3}},\dots ,{P_{im-1}},{P_{im}}$, these error metrics are defined as follows:
(12)
\[ \mathit{AverageSpatialError}\big(T,{T^{\prime }}\big)=\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathit{SpatialError}\big(T,{T^{\prime }},{P_{i}}\big),\](13)
\[ \mathit{AverageSEDError}\big(T,{T^{\prime }}\big)=\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathit{SEDError}\big(T,{T^{\prime }},{P_{i}}\big),\]4.1 The Effect of Parameters on Compression Ratio
Since the proposed algorithm is a heading oriented algorithm, this algorithm can accurately capture the direction information of the trajectory, and all turning points whose moving direction change degree are greater than the given angle threshold will be retained in compressed trajectory. So the compression rate of this algorithm is greatly influenced by trajectory’s heading. The places where a user stayed, walked or took photos always cause a large heading change (because of the GPS positioning error, even if the user is stationary, it will produce a larger direction change). As shown in Fig. 7, due to the bus stopping at each bus station, the trajectory two has a lot of points whose direction change degree is greater than the given angle threshold at bus stations, thus the compression rate of trajectory two is mainly influenced by angle threshold. Similarly to trajectory two, the compression rate of trajectory one is also mainly influenced by angle threshold, because trajectory one contains both walk and bus transportation modes. Trajectory three is a taxi track in motorway, it contains few points whose direction change degree is greater than the given angle threshold, so the compression rate of trajectory three is influenced by both angle threshold and SED threshold.
When the zoom = 19 is shown in Fig. 8, the image effects under different conditions of the algorithm are used for compression. In order to facilitate the comparison, this paper intercepts a section of track with obvious effect. Figure 8(a) is the original trajectory. In Fig. 8(b) the SED threshold value of 30 is taken and the angle threshold is ${60^{\circ }}$. It can be seen that keeping the contour works well and approaches the original trajectory. Figure 8(c) is a case where no angle threshold is set, and the trajectory compression rate is improved, but the profile is distorted, and there is a large difference to the original trajectory. In Fig. 8(d), the SED threshold is set to the maximum, which is the case considering only the angle threshold, and the effect is good. However, if the SED threshold is too large, trajectory distortion will be caused in other trajectory segments. Figure 8(e) is used to show that if you want to mention the compression ratio, you must increase the angle threshold and the effect of keeping the contour decreases.
4.2 The Effect of Parameters on Compression Ratio
The proposed algorithm has the benefit of taking into account heading information of the trajectory, it can have direction error under control. In the following evaluations, the angle threshold of the proposed algorithm was set to ${60^{\circ }}$ for trajectory one, trajectory two and trajectory three. And compared to the traditional position-preserving compression algorithms which cannot control the direction error, from the experimental results, it can be seen that the proposed algorithms have smaller loss of heading information, while other algorithms will lose a lot of direction information.
Figures 9 and 10 contrast the trajectory compression algorithms in terms of average spatial error and average SED error. The experimental results of the two figures are basically the same, and they are introduced together. For trajectory one, the proposed algorithm outperforms other algorithms in terms of both average spatial error and average SED error. For trajectory two and trajectory three, the error of HMOTC is smaller than Before-OPW-TR and TD-TR, and slightly greater than SQUISH-E (μ).
Figure 11 provides a comparison in terms of average heading error. The result shows that the proposed algorithm is more accurate than other position-preserving compression algorithms. This is because HMOTC algorithm can compress trajectories while ensuring that heading error is within a user-specified bound. In contrast, other algorithms lack the capability of heading error control, so these algorithms will lose a lot of heading information.
Due to the proposed algorithm has the capability of heading error control. So, a lot of turning points are retained in compressed trajectory. Although these points are important in describing semantic meanings of a trajectory, they will cause a low compression ratio. As seen in Fig. 12, the compression ratio of HMOTC algorithm is lower than other algorithms. A high compression ratio can be achieved by setting a 180${^{\circ }}$ angle threshold, if these turning points are not taken into account. In the next section, evaluations are given under the condition of same compression ratio.
4.3 Evaluations Under the Condition of Same Compression Ratio
In this section, we evaluated algorithms under the condition of same compression ratio. In order to achieve the given compression ratio, the value of HMOTC algorithm’s angle threshold parameter may be set to obtuse angle or straight angle. In this case, HMOTC algorithm will lose the advantage of heading maintaining. From the experimental results, it can be seen that even under the condition of the same compression ratio, information loss of the proposed algorithm is not too large. And in terms of average SED error, the proposed algorithm has a better performance than other algorithms.
In Fig. 13, algorithms are evaluated in terms of average spatial error. The average spatial error of the Normal-OPW-TR algorithm in the three trajectory modes is the largest, and the HMOTC algorithm is the smallest in the trajectory 1, but is equivalent to other algorithms in the trajectory 2 and the trajectory 3. The results show that Normal-OPW-TR algorithm introduced larger spatial error than other algorithms, and the spatial error of HMOTC algorithm is smaller than most algorithms.
In Fig. 14, average SED errors are shown for each algorithm over various compression ratios. The track 1 and track 2 can clearly show that the average SED error of the HMOTC algorithm is smaller than that of other algorithms, and the track 3 is not obvious, but it is equivalent to other algorithms. Experiments show that HMOTC and TD-TR are the most accurate algorithms in terms of SED error. Compared to TD-TR algorithm, HMOTC algorithm has the advantage of supporting both offline and online applications.
Figure 15 provides a comparison in terms of average speed error. The three models in the figure show that the average speed error of this algorithm is not very obvious, but the effect is basically the same, no uncontrollable or greater error loss occurs. As shown in the figure, SQUISH-E(μ), HMOTC and TD-TR are the most accurate algorithms in terms of speed error. Compared to HMOTC, SQUISH-E(μ) and TD-TR have the limitation that they only support offline applications.
Figure 16 shows the average heading errors for each algorithm over various compression ratios. The results show that with the increase of compression ratio, HMOTC algorithm gradually lost its advantage of heading maintaining. The reason behind this is that in order to achieve the given compression ratio, the value of HMOTC algorithm’s angle threshold parameter is set to obtuse angle or straight angle. As seen in Fig. 16, the proposed algorithm has better performance in case of low compression ratio (compression ratio $\mathit{CR}\geqslant 15$).
5 Conclusions
In this paper, we proposed a heading maintaining oriented trajectory compression algorithm, called HMOTC. The algorithm can maintain both shape skeleton and semantic meanings of the trajectory, all turning points whose moving direction change degree is greater than the given angle threshold will be retained in compressed trajectory. Compared to traditional position-preserving compression algorithms, the HMOTC algorithm has the benefit of taking into account heading information of the trajectory, so this algorithm can support wider range of applications. The HMOTC has the flexibility of controlling compression with respect to both spatial error and direction error, so the compression process can be precisely controlled according to users’ needs. If only the direction information of the trajectory is taken into account, the SED threshold of the algorithm can be set to positive infinity. If only the spatial information of the trajectory is taken into account, the angle threshold of the algorithm can be set to 180. Because the algorithm can compress trajectory data while retrieving points from trajectory, it has the advantage of supporting both online and offline applications. The results show that the proposed algorithm can achieve more accurate approximation than other algorithms under the condition of same SED threshold, especially in the heading maintaining aspect. And it also has the good performance under the condition of same compression ratio.