VIDPSO: Victim item deletion based PSO inspired sensitive pattern hiding algorithm for dense datasets
Information Processing & Management (IF 7.4), Pub Date: 2020-04-30, DOI: 10.1016/j.ipm.2020.102255
Shalini Jangra, Durga Toshniwal

Collaborative frequent itemset mining analyzes data shared by multiple business entities to find interesting patterns. This comes at the cost of high privacy risk, because some of these patterns may contain business-sensitive information; such patterns are called sensitive patterns, and revealing them can disclose confidential information. Privacy-preserving data mining (PPDM) includes various sensitive pattern hiding (SPH) techniques, which ensure that sensitive patterns are not revealed when data mining models are applied to shared datasets. In the process of hiding sensitive patterns, some non-sensitive patterns also become infrequent, so SPH techniques affect the results of data mining models. Maintaining a balance between data privacy and data utility is an NP-hard problem because it requires selecting the sensitive items to delete and the transactions containing these items such that the side effects of deletion are minimal. Researchers have proposed various algorithms that use evolutionary approaches such as genetic algorithms (GA), particle swarm optimization (PSO), and ant colony optimization (ACO). These evolutionary SPH algorithms mask sensitive patterns by deleting sensitive transactions. Failure to mask sensitive patterns and loss of data have been the biggest challenges for such algorithms, and their performance degrades further on dense datasets. This paper proposes VIDPSO, a PSO-inspired evolutionary algorithm based on victim item deletion, to sanitize dense datasets. In the proposed algorithm, each particle of the population consists of n sub-particles derived from pre-calculated victim items. The proposed algorithm has a high exploration capability to search the solution space for selecting optimal transactions. Experiments conducted on real and synthetic dense datasets show that VIDPSO performs better than GA-, PSO-, and ACO-based SPH algorithms in terms of hiding failure, with minimal loss of data.
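To make the abstract's terms concrete, the following is a minimal Python sketch of victim item deletion and of the two side effects it mentions: hiding failure (sensitive patterns that stay frequent) and lost non-sensitive patterns. It is not the VIDPSO algorithm. The function names, the toy transactions, and the greedy "shortest supporting transactions first" rule are all assumptions made for illustration, and the victim item is simply passed in rather than pre-calculated. In the paper, the choice of transactions is made by the PSO-inspired search.

```python
from itertools import combinations


def support(transactions, pattern):
    """Count the transactions that contain every item of `pattern`."""
    pattern = frozenset(pattern)
    return sum(1 for t in transactions if pattern <= t)


def frequent_itemsets(transactions, min_count, max_size=3):
    """Brute-force frequent itemsets up to `max_size` items (toy data only)."""
    items = sorted(set().union(*transactions))
    found = set()
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            if support(transactions, combo) >= min_count:
                found.add(frozenset(combo))
    return found


def hide_pattern(transactions, sensitive, victim, min_count):
    """Delete `victim` from supporting transactions until `sensitive` is infrequent.

    Assumption: a greedy rule (shortest supporting transactions first) stands in
    for the evolutionary search that selects transactions in the paper.
    """
    sensitive = frozenset(sensitive)
    sanitized = [set(t) for t in transactions]
    supporting = sorted(
        (i for i, t in enumerate(sanitized) if sensitive <= t),
        key=lambda i: len(sanitized[i]),
    )
    # The pattern is hidden once its support drops to min_count - 1 or lower.
    excess = len(supporting) - (min_count - 1)
    for i in supporting[:max(excess, 0)]:
        sanitized[i].discard(victim)
    return [frozenset(t) for t in sanitized]


def side_effects(original, sanitized, sensitive_patterns, min_count):
    """Return (number of sensitive patterns still frequent, lost non-sensitive patterns)."""
    before = frequent_itemsets(original, min_count)
    after = frequent_itemsets(sanitized, min_count)
    sensitive = {frozenset(p) for p in sensitive_patterns}
    hiding_failure = len(sensitive & after)
    missing = (before - sensitive) - after
    return hiding_failure, missing


# Hypothetical toy dataset: six transactions over items a, b, c, d.
toy = [frozenset(t) for t in ("abc", "ab", "abd", "ac", "bcd", "abcd")]
sanitized = hide_pattern(toy, sensitive="ab", victim="b", min_count=3)
failures, lost = side_effects(toy, sanitized, ["ab"], min_count=3)
print(failures, [sorted(p) for p in lost])
```

On this toy data, deleting the victim item b from two transactions brings the support of {a, b} from 4 down to 2, below the threshold of 3, so there is no hiding failure. The non-sensitive pattern {b, c} also becomes infrequent, which is the kind of side effect the abstract describes.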




Updated: 2020-04-30