Effective sanitization approaches to protect sensitive knowledge in high-utility itemset mining, Applied Intelligence



Effective sanitization approaches to protect sensitive knowledge in high-utility itemset mining
Applied Intelligence (IF 5.3), Pub Date: 2019-07-15, DOI: 10.1007/s10489-019-01524-2
Xuan Liu, Shiting Wen, Wanli Zuo

Abstract

Business organizations share data for mutual benefit, but sharing can expose them to privacy and security threats. Privacy-preserving data mining addresses this issue by sanitizing the original database so that all sensitive knowledge is hidden. Privacy-preserving utility mining extends privacy-preserving data mining: its objective is to hide all sensitive high-utility itemsets while minimizing the side effects of the sanitization process on non-sensitive knowledge. This paper proposes three heuristic algorithms for privacy-preserving utility mining: Selecting Maximum Utility item first (SMAU), Selecting Minimum Utility item first (SMIU), and Selecting Minimum Side Effects item first (SMSE). Because all three algorithms take the side effects on non-sensitive itemsets into account, the quality of the database is well maintained. Furthermore, two table structures, T-table and HUI-table, are adopted so that the hiding process scans the database only twice instead of many times. The experimental results show that the proposed approaches successfully conceal all sensitive itemsets with fewer distortions of non-sensitive knowledge. The influence of database density on the proposed approaches is also examined.
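To make the idea of hiding a sensitive high-utility itemset concrete, here is a minimal Python sketch. It is not the paper's SMAU, SMIU, or SMSE algorithm, and it uses neither the T-table nor the HUI-table, since the abstract does not describe their details. It assumes the usual utility-mining setting, where an itemset's utility is the sum of quantity times unit profit over the transactions that contain every item of the itemset, and that an itemset counts as hidden once its utility drops below a minimum utility threshold. The names `itemset_utility`, `hide_itemset`, `profit`, `min_util`, and `pick` are hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict, FrozenSet, List

Transaction = Dict[str, int]  # item -> purchased quantity (assumed data model)


def itemset_utility(db: List[Transaction], profit: Dict[str, int],
                    itemset: FrozenSet[str]) -> int:
    """Sum quantity * unit profit over transactions containing the whole itemset."""
    total = 0
    for tx in db:
        if all(tx.get(item, 0) > 0 for item in itemset):
            total += sum(tx[item] * profit[item] for item in itemset)
    return total


def hide_itemset(db: List[Transaction], profit: Dict[str, int],
                 itemset: FrozenSet[str], min_util: int,
                 pick: Callable = max) -> List[Transaction]:
    """Illustrative greedy sanitizer (NOT the published algorithm).

    Repeatedly lowers by one the quantity of a victim item in a supporting
    transaction until the itemset's utility falls below min_util.
    pick=max chooses the item occurrence with the largest utility first;
    pick=min chooses the smallest first.
    """
    if min_util <= 0:
        raise ValueError("min_util must be positive")
    sanitized = [dict(tx) for tx in db]  # work on a copy, keep the original intact
    while itemset_utility(sanitized, profit, itemset) >= min_util:
        supporting = [tx for tx in sanitized
                      if all(tx.get(item, 0) > 0 for item in itemset)]
        candidates = ((tx, item, tx[item] * profit[item])
                      for tx in supporting for item in itemset)
        victim_tx, victim_item, _ = pick(candidates, key=lambda c: c[2])
        victim_tx[victim_item] -= 1
        if victim_tx[victim_item] == 0:
            del victim_tx[victim_item]  # the item is removed from this transaction
    return sanitized
```

Calling `hide_itemset(db, profit, frozenset({"a", "b"}), min_util=15, pick=max)` returns a sanitized copy in which the utility of `{a, b}` is below 15. This sketch ignores side effects on non-sensitive itemsets and rescans the database on every iteration, which are exactly the costs that the paper's heuristics and table structures are said to reduce.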


Updated: 2020-01-04
