
EPF: A General Framework for Supporting Continuous Top-k Queries Over Streaming Data

Cognitive Computation

Abstract

The continuous top-k query over a sliding window is a fundamental problem in streaming data management: it monitors the query window and retrieves the k objects with the highest scores whenever the window slides. The key to supporting this query is maintaining a subset of the objects in the window and trying to retrieve answers from that subset when the window slides. The state-of-the-art approach, called SAP, uses a partition technique to support top-k searches. Its key idea is to support the query with as few high-quality candidates as possible by finding a proper partition. However, it incurs a relatively high computational cost in evaluating whether the partition is proper and in re-scanning the window. In this paper, we propose an extreme learning machine (ELM)-based framework named EPF, which improves SAP by learning the nature of the streaming data. If we learn that the distribution of the streaming data is predictable, we can construct a suitable prediction model for a more efficient partition of the window. Furthermore, we propose a novel algorithm to reduce the re-scanning cost. We conduct a thorough experimental study of this technique on real and synthetic datasets and show a significant performance improvement when the technique is applied to existing algorithms.
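To make the problem setting concrete, the following is a minimal sketch of a continuous top-k query over a count-based sliding window. It is not EPF or SAP. It only contrasts a baseline that re-scans the whole window on every slide with a generic candidate-maintenance scheme that discards an object once k newer objects have strictly higher scores (such an object can never enter a later answer). The function names, the parameters n (window size), s (slide) and k, and the count-based window model are assumptions made for illustration.

```python
import heapq
from collections import deque


def naive_topk(scores, n, s, k):
    """Baseline: re-scan the full window each time it slides."""
    window = deque(maxlen=n)  # holds the n most recent scores
    answers = []
    for t, score in enumerate(scores):
        window.append(score)
        # Report once the window is full, then after every s arrivals.
        if t + 1 >= n and (t + 1 - n) % s == 0:
            answers.append(heapq.nlargest(k, window))
    return answers


def candidate_topk(scores, n, s, k):
    """Keep only objects that could still appear in some top-k answer."""
    # Each entry is [arrival_index, score, dominated_count], oldest first.
    candidates = deque()
    answers = []
    for t, score in enumerate(scores):
        # Expire candidates that have left the window [t - n + 1, t].
        while candidates and candidates[0][0] <= t - n:
            candidates.popleft()
        kept = deque()
        for entry in candidates:
            if entry[1] < score:
                entry[2] += 1  # one more newer object with a higher score
            if entry[2] < k:
                kept.append(entry)  # still a possible future answer
        kept.append([t, score, 0])
        candidates = kept
        if t + 1 >= n and (t + 1 - n) % s == 0:
            answers.append(heapq.nlargest(k, (e[1] for e in candidates)))
    return answers


if __name__ == "__main__":
    stream = [5, 1, 4, 2, 3, 6]
    print(naive_topk(stream, n=3, s=1, k=2))
    print(candidate_topk(stream, n=3, s=1, k=2))
```

Both functions return the same lists of scores. The second one answers each slide from a candidate set that is usually much smaller than the window, which is the general idea that partition-based approaches refine further.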



Funding

This work is partially supported by the NSF of China under grant Nos. 61702344, 61272178, 61502317, U1401256, and the NSF of China for Key Program under grant No. 61532021.

Author information

Corresponding author

Correspondence to Rui Zhu.

Ethics declarations

Conflict of interests

The authors declare that they have no potential conflict of interest. This article does not contain any studies involving human participants and/or animals by any of the authors. Informed consent was obtained from all individual participants.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiang, H., Zhu, R. & Wang, B. EPF: A General Framework for Supporting Continuous Top-k Queries Over Streaming Data. Cogn Comput 12, 176–194 (2020). https://doi.org/10.1007/s12559-019-09661-z

  • DOI: https://doi.org/10.1007/s12559-019-09661-z
