Journal of Guangdong University of Technology ›› 2012, Vol. 29 ›› Issue (3): 28-34. doi: 10.3969/j.issn.1007-7162.2012.03.005

• Comprehensive Studies •

Partial Rotation Perturbation for Privacy-Preserving Clustering Mining

Liu Hongwei, Shi Yaqiang, Liang Zhouyang, Xiao Yue

  1. School of Management, Guangdong University of Technology, Guangzhou 510520, China
  • Received: 2012-06-01 Online: 2012-09-20 Published: 2012-09-20
  • About the author: Liu Hongwei (b. 1962), male, professor and doctoral supervisor; main research interests include data mining, management information systems, systems engineering, and large-scale systems theory.
  • Funding:

    National Natural Science Foundation of China (70971027)

Abstract: Clustering mining can efficiently and accurately uncover many latent, valuable patterns in data, but it also carries the threat of disclosing users' private data. Several privacy-preserving methods have been developed specifically for clustering mining; among them, multiplicative perturbation (MP) offers both high accuracy and high security. This study finds, however, that known knowledge independent component analysis (KKICA) greatly reduces the security of existing MP methods, because it can approximately estimate the private data from MP-perturbed data. To solve this problem, a partial rotation perturbation (PRP) privacy-preserving algorithm is proposed. An accuracy analysis shows that PRP has zero-loss accuracy, and a security analysis proves that PRP effectively resists independent component analysis attacks and is therefore more secure. When PRP is applied to clustering mining, the results are very close to those of clustering without privacy protection, which demonstrates that PRP is feasible. PRP effectively closes the security vulnerability of existing MP methods, allowing clustering mining to be applied more securely.

Key words: clustering mining; privacy-preserving; multiplicative perturbation; partial rotation perturbation
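
As general background to the zero-loss accuracy claim, a rotation (an orthogonal transform) leaves Euclidean distances between records unchanged, so distance-based clustering sees the same geometry before and after perturbation. The sketch below illustrates plain rotation perturbation only. It is not the authors' PRP algorithm, whose construction is not given in this abstract, and the function names `rotate_pair`, `rotation_perturb`, and `pairwise_distances` are hypothetical.

```python
import math
import random


def rotate_pair(rows, i, j, theta):
    """Rotate attributes i and j of every record by angle theta (a plane rotation)."""
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for row in rows:
        new = list(row)
        new[i] = c * row[i] - s * row[j]
        new[j] = s * row[i] + c * row[j]
        out.append(new)
    return out


def rotation_perturb(rows, rng):
    """Compose random plane rotations over every attribute pair.

    Assumption: this is generic rotation perturbation, chosen for illustration;
    the paper's partial rotation scheme may differ.
    """
    dim = len(rows[0])
    for i in range(dim):
        for j in range(i + 1, dim):
            rows = rotate_pair(rows, i, j, rng.uniform(0.0, 2.0 * math.pi))
    return rows


def pairwise_distances(rows):
    """Euclidean distance between every ordered pair of records."""
    return [math.dist(a, b) for a in rows for b in rows]


rng = random.Random(0)
# Synthetic records: 20 rows, 4 numeric attributes (made-up data).
original = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(20)]
perturbed = rotation_perturb(original, rng)

# Largest change in any pairwise distance; expected to be at rounding-error level.
max_gap = max(
    abs(p - q)
    for p, q in zip(pairwise_distances(original), pairwise_distances(perturbed))
)
print(max_gap < 1e-9)
```

The attribute values change while the pairwise distances do not, which is the property that lets a distance-based clustering algorithm produce the same partition on the perturbed data.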

[1] Agrawal R, Srikant R. Privacypreserving data mining[C]. Proceedings of the ACM SIGMOD Conference on Management of Data, Dallas, TX USA: ACM SIGMOD, 2000: 439-450.

[2] Agrawal D, Aggarwal C C. On the design and quantification of privacy preserving data mining algorithms[C]. Proceedings of the 20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Santa Barbara, California, May 2001: 247-255.

[3] Evfimevski A, Gehrke J, Srikant R. Limiting privacy breaches in privacy preserving data mining[C]. Proceedings of the ACM SIGMOD/PODS Conference San Diego, CA, June 2003.

[4] Warner S L. Randomized Response: A survey technique for eliminating evasive answer bias[J]. Journal of the American Statistical Association, 1965, 60: 63-69.

[5] Chen K, Liu L. A random rotation perturbation approach to privacy preserving data classification[D]. USA:Georgia Institute of Technology, 2005.

[6] Liu K, Kargupta H, Ryan J. Random projectionbased multiplicative data perturbation for privacy preserving distributed data mining[J]. Knowledge and Data Engineering, IEEE Transactions on,2006,18(1): 92-106.

[7] Aapo Hyvrinen, Jarmo Hurri, Patrik O Hoyer. Independent Component Analysis [M]. Computational Imaging and Vision, Springer London, 2009, 39: 151-175.

[8] 李广彪,张剑云,毛云祥.盲源分离的发展及研究现状[J].航天对子对抗,2004(6):13-16.
[9] Guo S,Wu X. Deriving private information from arbitrarily projected data[J]. Advances in Knowledge Discovery and Data Mining, 2007: 84-95.

[10] Kargupta H, Datta S, Wang Q, et al. On the privacy preserving properties of random data perturbation techniques[C]. Proceedings of the 3rd IEEE International Conference on Data Mining, Melbourne, FL, USA, November 2003.

[11] Guo S. Analysis of and techniques for privacy preserving data mining[M]. Ann Arbor: ProQuest Information and Learning Company, 2007.

[12] Liu Kun. Multiplicative data perturbation for privacy preserving data mining[D]. Baltimore, MD, USA: University of Maryland Baltimore County, 2007.

[13]Huang Z, Du W,Chen B. Deriving private information from randomized data[C]. Proceedings of the 2005 ACM 〖JP2〗SIGMOD Conference, Baltimroe, MD, June 2005: 3748.

[14] Aggarwal C C, Yu P S. A condensation based approach to privacy preserving data mining[C]. Proceedings of the 9th International Conference on Extending Database Technology (EDBT’04), Heraklion, Crete, Greece, March 2004: 183-199.

[15] 肖岳.移动数据的智能分析与隐私保护[D].广州:广东工业大学,2011.

[16] 聂跃光.基于密度聚类的空间数据挖掘算法研究[D].太原:太原科技大学,2008.