Scalable Decentralized Indexing and Querying of Multi-Streams in the Fog

Journal of Grid Computing

Abstract

NOA-AID (Network Overlays for Adaptive information Aggregation, Indexing and Discovery on the fog) is an approach for the decentralized indexing, aggregation and discovery of data belonging to streams. It is organized in two network layers. The upper layer supports information discovery by providing a distributed index structure. The lower layer performs resource aggregation using epidemic protocols designed for highly dynamic environments, which makes it well suited to stream-oriented scenarios. NOA-AID defines a flexible way to express queries that target highly heterogeneous data, together with a self-organizing dynamic system that allows queries to be resolved optimally on the most suitable stream producers. The paper presents a theoretical study of the costs of information management operations and validates its findings empirically. Finally, it reports an extended experimental evaluation showing that NOA-AID is effective and efficient at retrieving information extracted from streams in highly dynamic, distributed processing architectures.

Author information

Corresponding author

Correspondence to Patrizio Dazzi.

About this article

Cite this article

Dazzi, P., Mordacchini, M. Scalable Decentralized Indexing and Querying of Multi-Streams in the Fog. J Grid Computing 18, 395–418 (2020). https://doi.org/10.1007/s10723-020-09521-3

  • DOI: https://doi.org/10.1007/s10723-020-09521-3
