Journal of Guangdong University of Technology, 2023, Vol. 40, Issue (05): 56-63. doi: 10.12052/gdutxb.220170

• Computer Science and Technology •

Differential Privacy Trajectory Data Publishing Based on Orientation Control

Li Yang, Zhou Ying

  1. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
  • Received: 2022-11-16 Published: 2023-09-26
  • Corresponding author: Zhou Ying (1998-), female, master's student; main research interest: differential privacy protection. E-mail: joying314@163.com
  • About the author: Li Yang (1980-), female, professor, Ph.D.; main research interests: multi-agent systems and differential privacy protection
  • Funding:
    National Natural Science Foundation of China (61972102); Key Research and Development Program of Guangdong Province (2021B0101220004)

Abstract: As research on differential privacy and its applications expands, its use for privacy protection in trajectory data publishing has received wide attention. Most existing methods partition trajectories with K-means clustering; however, because trajectory datasets under differential privacy constraints are perturbed by noise, these clustering methods cannot guarantee final convergence. To address this, this paper proposes an orientation control-based differentially private trajectory data publishing method. First, a trajectory generalization algorithm based on SKmeans|| clustering is proposed: a direction control mechanism is added to the centroid update in each clustering iteration, and a scoring function for the exponential mechanism is designed to control centroid convergence, ensuring the quality of high-dimensional data clustering. Second, a trajectory data publishing algorithm based on a bounded staircase noise mechanism is designed; the mechanism hides the true counts of trajectory points while improving the utility of the published trajectory data. Finally, experiments verify the effectiveness of the proposed method.

Key words: differential privacy, clustering, trajectory data publishing, orientation control, bounded noise

CLC number: TP391
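
The abstract mentions a scoring function used with the exponential mechanism to steer centroid updates. The sketch below illustrates only the generic exponential mechanism, which selects a candidate with probability proportional to exp(epsilon * score / (2 * sensitivity)). It is not the paper's algorithm. The cosine-similarity score, the candidate direction set, the sensitivity bound of 2, and all function names are assumptions made for illustration.

```python
import math
import random


def selection_probabilities(scores, epsilon, sensitivity):
    """Return exponential-mechanism probabilities for the given scores."""
    # Subtracting the max score keeps exp() stable and does not change
    # the normalized probabilities.
    top = max(scores)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def pick_direction(old_centroid, cluster_mean, candidates, epsilon, rng=random):
    """Privately pick one candidate update direction (hypothetical scoring).

    Assumption: a candidate scores higher the better it aligns with the
    vector from the old centroid to the current cluster mean. Cosine
    similarity lies in [-1, 1], so 2 is used as a sensitivity upper bound.
    """
    target = [m - c for m, c in zip(cluster_mean, old_centroid)]
    scores = [cosine(d, target) for d in candidates]
    probs = selection_probabilities(scores, epsilon, sensitivity=2.0)
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    directions = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
    chosen = pick_direction((0.0, 0.0), (2.0, 1.0), directions, epsilon=1.0)
    print("chosen direction:", chosen)
```

With a small epsilon the choice is close to uniform over the candidates; as epsilon grows, the best-aligned direction is selected with probability approaching 1. This is the trade-off between privacy and convergence quality that a scoring function of this kind controls.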