Abstract
This paper proposes a two-phase approach to high utility itemset mining. In the first phase, potential high utility itemsets are generated from potential high utility maximal supersets, with the transaction weighted utility measure used to identify the potential high utility itemsets. The maximal supersets are obtained from the high utility paths ending in each item of the transaction database, and they are constructed without using any tree structures. The prefix information of an item's path in a transaction is encoded as binary codes and stored in the node containing the item information. The potential high utility itemsets are then generated from the maximal supersets using a modified set enumeration tree. In the second phase, the high utility itemsets are obtained from the set enumeration tree by scanning the transaction database to calculate the actual utility of each candidate. The experiments indicate that the proposed approach outperforms similar approaches in the literature.
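The generic two-phase idea (overestimate with transaction weighted utility, then verify with actual utility) can be illustrated with a minimal Python sketch. It is not the method of this paper: a plain level-wise candidate search stands in for the maximal supersets, binary path codes, and modified set enumeration tree, none of which are specified in the abstract. The toy database, the `profit` table, and all function names are assumptions made only for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def transaction_utility(tx):
    """Total utility of one transaction (quantity times unit profit)."""
    return sum(qty * profit[item] for item, qty in tx.items())


def twu(itemset, db):
    """Transaction weighted utility: sum of the utilities of all
    transactions that contain every item of the itemset."""
    items = set(itemset)
    return sum(transaction_utility(tx) for tx in db if items.issubset(tx))


def utility(itemset, db):
    """Actual utility: only the itemset's own items are counted."""
    items = set(itemset)
    return sum(
        sum(tx[i] * profit[i] for i in items)
        for tx in db
        if items.issubset(tx)
    )


def two_phase(db, min_util):
    # Phase 1: keep candidates whose TWU reaches the threshold. TWU never
    # underestimates utility and cannot grow when items are added, so
    # pruning on it loses no high utility itemset.
    all_items = sorted({i for tx in db for i in tx})
    level = [(i,) for i in all_items if twu((i,), db) >= min_util]
    candidates = []
    while level:
        candidates.extend(level)
        next_level = set()
        for x, y in combinations(level, 2):
            union = tuple(sorted(set(x) | set(y)))
            if len(union) == len(x) + 1 and twu(union, db) >= min_util:
                next_level.add(union)
        level = sorted(next_level)
    # Phase 2: rescan the database to compute actual utilities.
    result = {}
    for cand in candidates:
        u = utility(cand, db)
        if u >= min_util:
            result[cand] = u
    return result


high_utility = two_phase(transactions, min_util=16)
```

On this toy data with a threshold of 16, phase 1 keeps eight candidates and phase 2 retains four of them: `('a',)` with utility 20, `('a', 'b')` with 16, `('a', 'c')` with 19, and `('b', 'd')` with 16. The gap between the two counts is the overestimation that candidate-reduction strategies aim to shrink.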
Additional information
This article is part of the Topical Collection: Special Issue on Network In Box, Architecture, Networking and Applications
Guest Editor: Ching-Hsien Hsu
About this article
Cite this article
Javangula, V., Koneru, S. & Dasari, H. High utility itemset mining using path encoding and constrained subset generation. Peer-to-Peer Netw. Appl. 14, 2410–2418 (2021). https://doi.org/10.1007/s12083-020-00980-9
DOI: https://doi.org/10.1007/s12083-020-00980-9