Skip to main content
Log in

Trajectory splicing

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

With continued development of location-based systems, large amounts of trajectories become available which record moving objects’ locations across time. If the trajectories collected by different location-based systems come from the same moving object, they are spliceable trajectories, which contribute to representing holistic behaviors of the moving object. In this paper, we consider how to efficiently identify spliceable trajectories. More specifically, we first formalize a spliced model to capture spliceable trajectories where their times are disjoint, and the distances between them are close. Next, to efficiently implement the model, we design three components: a disjoint time index, a directed acyclic graph of sub-trajectory location connections, and two splice algorithms. The disjoint time index saves a disjoint time set of each trajectory for querying disjoint time trajectories efficiently. The directed acyclic graph contributes to discovering groups of spliceable trajectories. Based on the identified groups, the splice algorithm findmaxCTR finds maximal groups containing all spliceable trajectories. Although the splice algorithm is efficient in some practical applications, its running time is exponential. Therefore, an approximate algorithm findApproxMaxCTR is proposed to find trajectories which can be spliced with each other with a certain probability within polynomial run time. Finally, experiments on two datasets demonstrate that the model and its components are effective and efficient.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13

Similar content being viewed by others

Notes

  1. Namely \((ti(\textit{STR}_m^n) \subset gap(\textit{STR}_i^j,\textit{STR}_i^{j+1}))\)\(\cap \)\((ti(\textit{STR}_i^j) \subset gap(\textit{STR}_m^{n-1},\textit{STR}_m^{n}))\)\(\cap \)\( (d(last(\textit{STR}{_i^j}),\)\(first(\textit{STR}{_m^n}))\le \gamma )\).

  2. https://en.wikipedia.org/wiki/Dense_graph.

References

  1. Bakalov P, Hadjieleftheriou M, Tsotras VJ (2005) Time relaxed spatiotemporal trajectory joins. In: Proceedings of the 13th annual ACM international workshop on geographic information systems, ACM, New York, NY, USA, pp 182–191

  2. Dai J, Yang B, Guo C, Ding Z (2015) Personalized route recommendation using big trajectory data. In: 2015 IEEE 31st international conference on data engineering, pp 543–554

  3. Dai J, Yang B, Guo C, Jensen CS, Hu J (2016) Path cost distribution estimation using trajectory data. Proc VLDB Endow 10(3):85–96

    Article  Google Scholar 

  4. Ding Z, Yang B, Chi Y, Guo L (2016) Enabling smart transportation systems: a parallel spatio-temporal database approach. IEEE Trans Comput 65(5):1377–1391

    Article  MathSciNet  Google Scholar 

  5. Emrich T, Kriegel HP, Mamoulis N, Renz M, Züfle A (2012) Querying uncertain spatio-temporal data. In: 2012 IEEE 28th international conference on data engineering, pp 354–365

  6. Eppstein D, Löffler M, Strash D (2010) Listing all maximal cliques in sparse graphs in near-optimal time. In: Algorithms and computation, no. 6506 in lecture notes in computer science, Springer Berlin Heidelberg, pp 403–414

  7. Goh CH, Lu H, Ooi BC, Tan KL (1996) Indexing temporal data using existing B+-trees. Data Knowl Eng 18(2):147–165

    Article  Google Scholar 

  8. Gudmundsson J, van Kreveld M (2006) Computing longest duration flocks in trajectory data. In: Proceedings of the 14th annual ACM international symposium on advances in geographic information systems, ACM, New York, NY, USA, pp 35–42

  9. Gudmundsson J, van Kreveld M, Speckmann B (2004) Efficient detection of motion patterns in spatio-temporal data sets. In: Proceedings of the 12th annual ACM international workshop on geographic information systems, ACM, New York, NY, USA, pp 250–257

  10. Guo C, Jensen CS, Yang B (2014) Towards total traffic awareness. SIGMOD Rec 43(3):18–23

    Article  Google Scholar 

  11. Guo C, Yang B, Andersen O, Jensen CS, Torp K (2015) Ecomark 2.0: empowering eco-routing with vehicular environmental models and actual vehicle fuel consumption data. GeoInformatica 19(3):567–599

    Article  Google Scholar 

  12. Guo C, Yang B, Hu J, Jensen CS (2018) Learning to route with sparse trajectory sets. In: IEEE 34th international conference on data engineering, pp 1073–1084

  13. Güting RH, Valdés F, Damiani ML (2015) Symbolic trajectories. ACM Trans Spat Algorithms Syst 1(2):7:1–7:51

    Google Scholar 

  14. HCormen T, ELeiserson C, LRivest R, Stein C (2009) Introduction to algorithm, 3rd edn. MIT Press, Cambridge

    Google Scholar 

  15. Hu J, Yang B, Jensen CS, Ma Y (2017) Enabling time-dependent uncertain eco-weights for road networks. GeoInformatica 21(1):57–88

    Article  Google Scholar 

  16. Hu J, Guo C, Yang B, Jensen CS (2019) Stochastic weight completion for road networks using graph convolutional networks. In: IEEE 35th international conference on data engineering, pp 1274–1285

  17. Hua L, Chenjuan G, Bin Y, Christian SJ (2016) Finding frequently visited indoor pois using symbolic indoor tracking data. In: Proceedings of the 19th international conference on extending database technology, pp 449–460

  18. Jensen CS, Lu H, Yang B (2009) Indexing the trajectories of moving objects in symbolic indoor space. In: International symposium on spatial and temporal databases, pp 208–227

  19. Jeung H, Yiu ML, Zhou X, Jensen CS, Shen HT (2008) Discovery of convoys in trajectory databases. 1:1068–1080

  20. Kieu T, Yang B, Guo C, Jensen CS (2018a) Distinguishing trajectories from different drivers using incompletely labeled trajectories. In: Proceedings of the 27th ACM international conference on information and knowledge management, pp 863–872

  21. Kieu T, Yang B, Jensen CS (2018b) Outlier detection for multidimensional time series using deep neural networks. In: IEEE 19th international conference on mobile data management, pp 125–134

  22. Kieu T, Yang B, Guo C, Jensen CS (2019) Outlier detection for time series with recurrent autoencoder ensembles. In: 28th international joint conference on artificial intelligence

  23. Korte B, Vygen J (2012) Combinatorial optimization, algorithms and combinatorics, vol 21. Springer, Berlin

    Book  Google Scholar 

  24. Lee JG, Han J, Whang KY (2007) Trajectory clustering: a partition-and-group framework. In: Proceedings of the 2007 ACM SIGMOD international conference on management of data, ACM, New York, NY, USA, pp 593–604

  25. Lee JG, Han J, Li X (2015) A unifying framework of mining trajectory patterns of various temporal tightness. IEEE Trans Knowl Data Eng 27(6):1478–1490

    Article  Google Scholar 

  26. Li X, Ceikute V, Jensen C, Tan KL (2013) Effective online group discovery in trajectory databases. IEEE Trans Knowl Data Eng 25(12):2752–2766

    Article  Google Scholar 

  27. Li Z, Ding B, Han J, Kays R (2010) Swarm: mining relaxed temporal moving object clusters. Proc VLDB Endow 3:723–734

    Article  Google Scholar 

  28. Liao L, Patterson DJ, Fox D, Kautz H (2007) Learning and inferring transportation routines. Artif Intell 171(5–6):311–331

    Article  MathSciNet  Google Scholar 

  29. Sakr MA, Güting RH (2011) Spatiotemporal pattern queries. GeoInformatica 15(3):497–540

    Article  Google Scholar 

  30. Spaccapietra S, Parent C, Damiani ML, de Macedo JA, Porto F, Vangenot C (2008) A conceptual view on trajectories. Data Knowl Eng 65(1):126–146

    Article  Google Scholar 

  31. Su H, Zheng K, Huang J, Wang H, Zhou X (2014) Calibrating trajectory data for spatio-temporal similarity analysis. VLDB J 24(1):93–116

    Article  Google Scholar 

  32. Sun J, Tao Y, Papadias D, Kollios G (2006) Spatio-temporal join selectivity. Inf Syst 31(8):793–813

    Article  Google Scholar 

  33. Tao Y, Papadias D (2001) MV3r-Tree: A spatio-temporal access method for timestamp and interval queries. In: Proceedings of the 27th international conference on very large data bases, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 431–440

  34. Tomita E, Tanaka A, Takahashi H (2006) The worst-case time complexity for generating all maximal cliques and computational experiments. Theor Comput Sci 363(1):28–42

    Article  MathSciNet  Google Scholar 

  35. Valdés F, Güting RH (2014) Index-supported pattern matching on symbolic trajectories. In: Proceedings of the 22Nd ACM SIGSPATIAL international conference on advances in geographic information systems, ACM, New York, NY, USA, pp 53–62

  36. Vieira MR, Bakalov P, Tsotras VJ (2009) On-line discovery of flock patterns in spatio-temporal data. In: Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems, ACM, New York, NY, USA, pp 286–295

  37. Wang L, Zheng Y, Xie X, Ma WY (2008) A flexible spatio-temporal indexing scheme for large-scale GPS track retrieval. In: 9th international conference on mobile data management, IEEE, pp 1–8

  38. Wu H, Xue M, Cao J, Karras P, Ng WS, Koo KK (2016) Fuzzy trajectory linking. In: IEEE 32nd international conference on data engineering, IEEE, pp 859–870

  39. Xie K, Deng K, Zhou X (2009) From trajectories to activities: a spatio-temporal join approach. In: Proceedings of the 2009 international workshop on location based social networks, ACM, New York, NY, USA, pp 25–32

  40. Xu J, Güting RH, Zheng Y (2015) The TM-RTree: an index on generic moving objects for range queries. GeoInformatica 19(3):487–524

    Article  Google Scholar 

  41. Yang B, Ma Q, Qian W, Zhou A (2009) TRUSTER: trajectory data processing on clusters. In: DASFAA, pp 768–771

  42. Yang B, Guo C, Jensen CS, Kaul M, Shang S (2014) Stochastic skyline route planning under time-varying uncertainty. In: IEEE 30th international conference on data engineering, pp 136–147

  43. Yang B, Guo C, Ma Y, Jensen CS (2015) Toward personalized, context-aware routing. VLDB J 24(2):297–318

    Article  Google Scholar 

  44. Yang B, Dai J, Guo C, Jensen CS, Hu J (2018) PACE: a path-centric paradigm for stochastic path finding. VLDB J 27(2):153–178

    Article  Google Scholar 

  45. Zheng K, Zheng Y, Yuan N, Shang S, Zhou X (2014) Online discovery of gathering patterns over trajectories. IEEE Trans Knowl Data Eng 26(8):1974–1988

    Article  Google Scholar 

  46. Zheng Y (2015) Trajectory data mining: an overview. ACM Trans Intell Syst Technol 6(3):1–41

    Article  Google Scholar 

  47. Zheng Y, Zhang L, Xie X, Ma WY (2009) Mining interesting locations and travel sequences from GPS trajectories. In: Proceedings of the 18th international conference on world wide web, ACM, New York, NY, USA, pp 791–800

  48. Zheng Y, Xie X, Ma WY (2010) Geolife: a collaborative social networking service among user, location and trajectory. IEEE Data Eng Bull 33(2):32–39

    Google Scholar 

  49. Zhou P, Zhang D, Salzberg B, Cooperman G, Kollios G (2005) Close pair queries in moving object databases. In: Proceedings of the 13th annual ACM international workshop on geographic information systems, ACM, New York, NY, USA, pp 2–11

Download references

Acknowledgements

We would like to thank Professor Christian S. Jensen for useful discussions and comments. This work was supported by National Science and Technology Major Project (no. 2017ZX05018-005), National Natural Science Foundation of China (no. 61402532), Science Foundation of China University of Petroleum-Beijing (no. 01JB0415), and China Scholarship Council.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Qiang Lu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Computing disjoint time set

Lemma

In the query interval time T, the disjoint time set \(\textit{DT}_i\) of each trajectory \(\textit{TR}_i\) can be computed by Eq. 6.

Proof

Let \(Q_i^{k,d}\) be a trajectory set where each trajectory \(\textit{TR}\) appears in T and its time interval set \(ti(\textit{TR})\) does not overlap with \(\textit{TR}_i\) in the kth time slice. So, \(Q_i^{k,d}=\textit{DT}_i^{k,d}\cup R^{k,d}\), where \(R^{k,d}\) is a trajectory set in which each trajectory appears in T except in the kth time slice. For example, in Fig. 2, assumed \(T=[0,3d]\), \(R_A^{3,d}=\lbrace D,E \rbrace \) and \(\textit{DT}_A^{3,d}=\lbrace B,C\rbrace \). Then, \(Q_A^{3,d}=\lbrace B,C,D,E \rbrace \). Therefore,

$$\begin{aligned} \textit{DT}_i(T)=Q_i^{1,d} \cap Q_i^{2,d} \cap \ldots \cap Q_i^{n,d} \end{aligned}$$
(15)

Due to \(P_i=P_i^{1,d}\cup P_i^{2,d} \cup \ldots \cup P_i^{n,d}\),

$$\begin{aligned} \lnot \textit{DT}_i^{k,d} \cup Q^{k,d}= & {} P_i \end{aligned}$$
(16)
$$\begin{aligned} \lnot \textit{DT}_i^{k,d} \cap Q_i^{k,d}= & {} \phi \end{aligned}$$
(17)

According to Eqs. 5, 16 and 17, Eq. 15 can be deduced as follows.

$$\begin{aligned} \textit{DT}_i(T)&=(P_i-\lnot \textit{DT}_i^{1,d}) \cap \ldots \cap (P_i-\lnot \textit{DT}_i^{n,d})\nonumber \\&=P_i-[\lnot \textit{DT}_i^{1,d} \cup (\lnot \textit{DT}_i^{2,d}-\lnot \textit{DT}_i^{1,d}) \nonumber \\&\quad \cup \ldots \cup (\lnot \textit{DT}_i^{n,d}-\lnot \textit{DT}_i^{n-1,d})] \nonumber \\&=P_i-[(P_i-\textit{DT}_i^{1,d}) \cup DF_i^{2,d} \cup \ldots \cup DF_i^{n,d}] \end{aligned}$$
(18)

\(\square \)

Appendix B Finding spliced trajectory

Lemma 4. Assuming that Algorithm 2 is processing the current pair \(\langle \textit{STR}_k^v,\textit{STR}_i^j\rangle \), the sub-trajectory \(\textit{STR}_i^j\) is a temporary end vertex, and sub-trajectories from the same trajectory before trajectory \(\textit{STR}_k^v\) constitute a temporary trajectory, if a path from \(\textit{STR}_k^v\) to \(\textit{STR}_i^j\) is found by the function existPath, temporary trajectories that the path have passed through can form a spliced path.

Proof

Let \(P_c\) which is found by existPath be a path from \(\textit{STR}_k^v\) to \(\textit{STR}_i^j\). We firstly prove there must exist a path \(P_l\) from \(\textit{STR}_k^v\) to \(\textit{STR}_i^j\) in the current graph STLC-DAG. \(P_l\) is an time-ordered sequence where each \(\textit{STR} \in \)\(\lbrace \textit{STR}_m^n| ti(\textit{STR}_k^v).st< ti(\textit{STR}_m^n).st < ti(\textit{STR}_i^j).st, m\in M(P_c) \rbrace \cup \lbrace \textit{STR}_k^v,\textit{STR}_i^j\rbrace \). And, \(M(P_c)\) is a set of \(\textit{TR}\hbox {s}\) that \(P_c\) have passed through except i and k. We prove the problem according to the following situations.

If \(|M(P_c)|=0\) or \(|M(P_c)|=1\), \(P_c\) must be \(P_l\).

If \(|M(P_c)| \ge 2\), suppose \(P_l\) does not exist in the current STLC-DAG. Let \(P_a\) be the path contains the maximum number of \(\textit{STR}\hbox {s}\) from \(P_l\), where \(M(P_c) \subseteq M(P_a)\). Then, at least one vertex \(\textit{STR}_m^n\) from \(P_l\) is not on \(P_a\). According to time, let \(\textit{STR}_m^n\) be between \(P_a[i]\) and \(P_a[i+1]\), namely \(ti(P_a[i]).st< ti(\textit{STR}_m^n).st < ti(P_a[i+1]).t\), where \(P_a[i](P_a[j])\) is a ith or jth \(\textit{STR}\) in \(P_a\), \(m_i(m_{i+1})\) is the subscript of \(P_a[i](P_a[i+1])\), and \(m_i,m_{i+1}\in m(P_c)\). Therefore, before running the current pair, the algorithm has executed evaluation of the two pair \(\langle P_a[i],\textit{STR}_m^n \rangle \) and \(\langle \textit{STR}_m^n,P_a[i+1] \rangle \). The evaluation generated two following results. One is that, if there does not exist a path between \(\langle P_a[i],\textit{STR}_m^n \rangle \) or \(\langle \textit{STR}_m^n,P_a[i+1] \rangle \), it shows \(TR_m\) and \(TR_{m_i}\)(\(TR_{m_{i+1}}\)) cannot be spliced. So, \(m_i \notin SP_m\) or \(m_{i+1} \notin SP_m\). According to existPath (Algorithm 3), it cannot find that a path contains \(\textit{STR}_{m_i}(\textit{STR}_{m_{i+1}})\) and \(\textit{STR}_m\). It contradicts with \(P_c\). The other is that, if there does exist both above paths, \(\textit{STR}_m^n\) can be added into \(P_a\). It contradicts with \(P_a\) that has the maximum number of \(\textit{STR}\hbox {s}\) from \(P_l\). Therefore, \(P_l\) must exist in the current STLC-DAG.

Then, since \(P_l\) from \(\textit{STR}_k^v\) to \(\textit{STR}_i^j\) exists in STLC-DAG, it implies that there must exists a path \(P_b\) from the start vertex to \(\textit{STR}_k^v\) in the current STLC-DAG. And, \(P_b\) contains all \(\textit{STR}\hbox {s}\) of \(\textit{TR}\hbox {s}\) between the start vertex and \(\textit{STR}_k^v\)(\(P_c\) has passed through these \(\textit{TR}s\)). This is because the algorithm has processed previous pair \(\langle \textit{STR}_t^r,\textit{STR}_k^v\rangle \). And, there must exist a path \(P_t\) similar to \(P_a\) between \(\textit{STR}_t^r\) and \(\textit{STR}_k^v\) owing to the path found by existPath. And so on, these previous pairs form the \(P_b\). Therefore, the \(P_b\) and \(P_l\) can form a spliced path. \(\square \)

Lemma 5 If and only if a path found by algorithm 3 contains sub-trajectories from two different trajectories, the two trajectories can be spliced.

Proof

If there exists a path, which is found by Algorithm 3, between any two sub-trajectories from two trajectories, respectively, according to Lemma 4, the trajectories that the path passed through can be spliced with the two sub-trajectories. So, the two two sub-trajectories can be spliced. According to the definition 6, if two sub-trajectories are spliceable sub-trajectories, there exists a spliced path that can pass through all sub-trajectories of the two trajectories. \(\square \)

Theorem 1

If there exists a directed edge between two trajectories, the two trajectories can be spliced.

Proof

Suppose there is an edge between \(\textit{STR}_i^j\) and \(\textit{STR}_m^n\), which the two STRs belong to \(\textit{TR}_i\) and \(\textit{TR}_j\), respectively, and \(\textit{TR}_i\) cannot be spliced with \(\textit{TR}_m\). According to Lemma 5, at least one pair of \(\textit{STR}\hbox {s}\) from the two \(\textit{TR}\hbox {s}\), respectively, cannot be connected by a path that is found by existPath. But, a Algorithm 2 (Line 10) must have deleted all edges between \(\textit{TR}_i\) and \(\textit{TR}_j\) if it finds that a pair between them cannot be connected by a path. Therefore, there is not an edge between them. It contradicts the assumption that there is an edge between \(\textit{STR}_i^j\) and \(\textit{STR}_m^n\).

Theorem 2

For each \(\textit{SP}_i \in \textit{SP}\), where \(\textit{SP}\) is one of output parameters of Algorithm 2, \(\textit{SP}_i\) is a set of trajectories that can be spliced with the trajectory \(\textit{TR}_i\).

Proof

At initialized phase of Algorithm 2, \(\textit{SP} = \textit{DT}\). Suppose one \(SP_i\) has a subscript m, and its corresponding \(\textit{TR}_m\) cannot be spliced with \(\textit{TR}_i\). According to Lemma 5, there is not a path between one pair \(\langle \textit{STR}_i^j,\textit{STR}_m^n \rangle \). And, \(SP_i=SP_i-m\) (Line 12 in Algorithm 2), has been executed. It contradicts with \(SP_i\) because \(SP_i\) contains m. \(\square \)

Lemma 6. In SP-set graph, a clique is a group of spliceable trajectories, a maximal clique is a complete trajectory.

Proof

A group of spliceable trajectories can be directly or indirectly spliced with each other. Therefore, there exists an edge between any two of them. So, the group of spliceable trajectories is a clique in the graph. If the clique is the maximal clique, the group of spliceable trajectories on the maximal clique cannot be contained by other groups. So, the maximal clique in the graph is a complete trajectory \(\textit{CTR}\). \(\square \)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Lu, Q., Wang, R., Yang, B. et al. Trajectory splicing. Knowl Inf Syst 62, 1279–1312 (2020). https://doi.org/10.1007/s10115-019-01382-x

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-019-01382-x

Keywords

Navigation