Mining High-Average Utility Itemsets with Positive and Negative External Utilities

New Generation Computing

Abstract

High-utility itemset mining (HUIM) is an emerging data mining topic. It aims to find high-utility itemsets by considering both the internal (i.e., quantity) and external (i.e., profit) utilities of items. High-average-utility itemset mining (HAUIM) is an extension of HUIM that provides a fairer measure, called average utility, by taking into account the length of itemsets in addition to their utilities. Several algorithms have been introduced in the literature for mining high-average-utility itemsets (HAUIs). However, these algorithms assume that databases contain only positive utilities, whereas databases in some real-world applications may also contain negative utilities. On such databases, the existing HAUIM algorithms may fail to discover the complete set of HAUIs because they are designed only for positive utilities. To discover the correct and complete set of HAUIs in the presence of both positive and negative utilities, this study proposes an algorithm named MHAUIPNU (mining high-average-utility itemsets with positive and negative utilities). MHAUIPNU introduces an upper-bound model, three pruning strategies, and a data structure. Experimental results show that MHAUIPNU effectively reduces the size of the search space and thus mines HAUIs with negative utilities efficiently.
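To make the average-utility measure concrete, the following minimal sketch computes it by brute force on a toy database. It assumes the standard definition from the HAUIM literature: the utility of an item in a transaction is its quantity times its unit profit, and the average utility of an itemset is its total utility over all transactions containing it, divided by the number of items in the itemset. The data, the threshold, and the function names `average_utility` and `brute_force_hauis` are hypothetical and chosen for illustration only. This is not the MHAUIPNU algorithm; it uses no upper bound, pruning strategy, or specialized data structure.

```python
from itertools import combinations

# Hypothetical external utilities (unit profits); item "c" is sold at a loss.
profit = {"a": 5, "b": 2, "c": -3}

# Hypothetical transactions mapping item -> purchased quantity (internal utility).
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 2},
]


def average_utility(itemset, transactions, profit):
    """Total utility of the itemset over the transactions that contain it,
    divided by the number of items in the itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total / len(itemset)


def brute_force_hauis(transactions, profit, min_au):
    """Enumerate every itemset and keep those whose average utility
    reaches min_au. Exponential in the number of items."""
    items = sorted(profit)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            au = average_utility(itemset, transactions, profit)
            if au >= min_au:
                result[itemset] = au
    return result


if __name__ == "__main__":
    for itemset, au in brute_force_hauis(transactions, profit, 4).items():
        print(itemset, au)
```

In this toy data the negative profit of "c" lowers the average utility of every itemset that contains it, so an itemset can fall below the threshold even though each of its positive-profit items passes on its own. Exhaustive enumeration like this is only feasible for a handful of items, which is why practical algorithms rely on upper bounds to prune the search space.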

Author information

Corresponding author

Correspondence to Irfan Yildirim.

About this article

Cite this article

Yildirim, I., Celik, M. Mining High-Average Utility Itemsets with Positive and Negative External Utilities. New Gener. Comput. 38, 153–186 (2020). https://doi.org/10.1007/s00354-019-00078-8

  • DOI: https://doi.org/10.1007/s00354-019-00078-8
