1.湖南科技大学计算机科学与工程学院,湖南湘潭 411201
2.国防科技大学复杂系统软件工程重点实验室,湖南长沙 410073
[ "张少波 男,博士,1979年生于湖南邵东. 现为湖南科技大学副教授、硕士生导师. 主要研究方向为移动社交网络、大数据、人工智能、区块链的安全和隐私保护等.E-mail: shaobozhang@hnust.edu.cn" ]
[ "原刘杰 男,1995年生于河南开封. 现为湖南科技大学硕士研究生. 主要研究方向为大数据隐私保护.E-mail: ljyuan@mail.hnust.edu.cn" ]
[ "毛新军 男, 1970年生于浙江江山.博士, CCF杰出会员, 现为国防科学技术大学计算机学院教授,博士生导师. 主要研究领域为智能软件技术, 多智能体系统等." ]
[ "朱更明 男,1963年生于湖南邵阳. 现为湖南科技大学教授、硕士生导师. 主要研究方向为信息安全、智能设备机器视觉及控制等.E-mail: zhu.gm@163.com" ]
收稿:2020-12-01,
修回:2021-04-12,
纸质出版:2022-09-25
移动端阅览
张少波,原刘杰,毛新军等.基于本地差分隐私的K-modes聚类数据隐私保护方法[J].电子学报,2022,50(09):2181-2188.
ZHANG Shao-bo,YUAN Liu-jie,MAO Xin-jun,et al.Privacy Protection Method for K-modes Clustering Data with Local Differential Privacy[J].ACTA ELECTRONICA SINICA,2022,50(09):2181-2188.
张少波,原刘杰,毛新军等.基于本地差分隐私的K-modes聚类数据隐私保护方法[J].电子学报,2022,50(09):2181-2188. DOI: 10.12263/DZXB.20201374.
ZHANG Shao-bo,YUAN Liu-jie,MAO Xin-jun,et al.Privacy Protection Method for K-modes Clustering Data with Local Differential Privacy[J].ACTA ELECTRONICA SINICA,2022,50(09):2181-2188. DOI: 10.12263/DZXB.20201374.
分类型数据聚类是数据挖掘的重要研究内容,聚类数据中通常包含用户一些敏感信息. 为保护聚类数据中的用户隐私,当前主要采用基于可信第三方隐私保护模型,但现实中第三方也存在隐私泄露风险. 针对此问题,该文引入本地差分隐私技术,提出一种去可信第三方的K-modes聚类数据隐私保护方法. 该方法首先利用随机采样技术对数据进行采样,然后使用本地差分隐私技术对采样数据进行扰动,最后通过聚类服务端与用户的交互迭代完成聚类. 在聚类过程中,无需可信第三方对数据进行隐私预处理,避免了第三方泄露用户隐私的风险. 理论分析证明了该方法的隐私性和可行性,实验结果表明该方法在满足本地差分隐私机制的前提下保证了聚类结果的质量.
Categorical data clustering is an important research content for data mining
and clustering data usually contains some sensitive information of user. In order to protect user privacy in clustering data
the privacy protection model based on trusted third-party is currently mainly adopted. However
in reality
the third-party also has the risk of privacy leakage. In this paper
we propose a privacy protection method for K-modes clustering data without trusted third-party by introducing local differential privacy technology. Our method first uses random sampling technology to sample the data
then perturbs the sampled data by using local differential privacy technology
and finally complete the clustering through the interaction between the server and the user. In the clustering process
our method does not require a trusted third-party to perform privacy preprocessing on the data
which avoids the risk of the third-party disclosing the user's privacy. Theoretical analysis proves the privacy and feasibility of our method. Experimental results show that our method guarantees the quality of the clustering results under the premise of satisfying the local differential privacy mechanism.
COELHO A L V , SANDES N C . Data clustering via cooperative games: A novel approach and comparative study [J]. Information Sciences , 2021 , 545 : 791 - 812 .
SAROJ K . Review: study on simple k mean and modified K mean clustering technique [J]. International Journal of Computer Science Engineering and Technology , 2016 , 6 ( 7 ): 279 - 281 .
XIAO Y Y , HUANG C H , HUANG J Y , et al . Optimal mathematical programming and variable neighborhood search for k-modes categorical data clustering [J]. Pattern Recognition , 2019 , 90 : 183 - 195 .
ZHANG S B , MAO X J , CHOO K K R , et al . A trajectory privacy-preserving scheme based on a dual-K mechanism for continuous location-based services [J]. Information Sciences , 2020 , 527 : 406 - 419 .
ZHANG S B , WANG G J , BHUIYAN M Z A , et al . A dual privacy preserving scheme in continuous location-based services [J]. IEEE Internet of Things Journal , 2018 , 5 ( 5 ): 4191 - 4200 .
CHAVES A , MOURA I , BERNARDINO J , et al . The privacy paradigm: An overview of privacy in Business Analytics and Big Data [C]// 2020 15th Iberian Conference on Information Systems and Technologies(CISTI) . Piscataway : IEEE , 2020 : 1 - 6 .
DWORK C , ROTH A . The algorithmic foundations of differential privacy [J]. Foundations and Trends in Theoretical Computer Science , 2013 , 9 ( 3/4 ): 211 - 407 .
DEWRI R , THURIMELLA R . Exploiting service similarity for privacy in location-based search queries [J]. IEEE Transactions on Parallel and Distributed Systems , 2014 , 25 ( 2 ): 374 - 383 .
MEDKOVÁ J . Composition attack against social network data [J]. Computers & Security , 2018 , 74 : 115 - 129 .
彭慧丽 , 金凯忠 , 付聪聪 , 等 . 基于序列格的隐私时序模式挖掘方法 [J]. 电子学报 , 2020 , 48 ( 1 ): 153 - 163 .
PENG H L , JIN K Z , FU C C , et al . Private time series pattern mining with sequential lattice [J]. Acta Electronica Sinica , 2020 , 48 ( 1 ): 153 - 163 . (in Chinese)
陈思 , 付安民 , 柯海峰 , 等 . MCDP: 基于神经网络的多集群分布式差分隐私数据发布方法 [J]. 电子学报 , 2020 , 48 ( 12 ): 2297 - 2303 .
CHEN S , FU A M , KE H F , et al . MCDP: multi-cluster differential privacy data publishing method based on neural network [J]. Acta Electronica Sinica , 2020 , 48 ( 12 ): 2297 - 2303 . (in Chinese)
郑孝遥 , 罗永龙 , 汪祥舜 , 等 . 基于位置服务的分布式差分隐私推荐方法研究 [J]. 电子学报 , 2021 , 49 ( 1 ): 99 - 110 .
ZHENG X Y , LUO Y L , WANG X S , et al . Research on location-based distributed differential privacy recommendation method [J]. Acta Electronica Sinica , 2021 , 49 ( 1 ): 99 - 110 . (in Chinese)
XIAO X K , TAO Y F , CHEN M H . Optimal random perturbation at multiple privacy levels [J]. Proceedings of the VLDB Endowment , 2009 , 2 ( 1 ): 814 - 825 .
KIFER D . On estimating the swapping rate for categorical data [C]// Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining . New York, USA : ACM , 2015 : 557 - 566 .
SU D , CAO J N , LI N H , et al . Differentially private K-means clustering and a hybrid approach to private optimization [J]. ACM Transactions on Privacy and Security , 2017 , 20 ( 4 ): 1 - 33 .
NGUYEN T D , GUPTA S , RANA S T , et al . Privacy Aware K-Means Clustering with High Utility [C]// Pacific-Asia Conference on Knowledge Discovery and Data Mining . Cham : Springer , 2016 : 388 - 400 .
NGUYEN H H . Privacy-preserving mechanisms for k-modes clustering [J]. Computers & Security , 2018 , 78 : 60 - 75 .
叶青青 , 孟小峰 , 朱敏杰 , 等 . 本地化差分隐私研究综述 [J]. 软件学报 , 2018 , 29 ( 7 ): 1981 - 2005 .
YE Q Q , MENG X F , ZHU M J , et al . Survey on local differential privacy [J]. Journal of Software , 2018 , 29 ( 7 ): 1981 - 2005 . (in Chinese)
ERLINGSSON Ú , PIHUR V , KOROLOVA A . RAPPOR: randomized aggregatable privacy-preserving ordinal response [C]// Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security . New York, USA : ACM , 2014 : 1054 - 1067 .
KAIROUZ P , BONAWITZ K , RAMAGE D . Discrete distribution estimation under local privacy [C]// Proceedings of the 33rd International Conference on International Conference on Machine Learning . New York, USA : ACM , 2016 : 2436 - 2444 .
QIN Z , YANG Y , YU T , et al . Heavy hitter estimation over set-valued data with local differential privacy [C]// CCS'16: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security . New York, USA : ACM , 2016 : 192 - 203 .
WANG T H , LI N H , JHA S . Locally differentially private frequent itemset mining [C]// 2018 IEEE Symposium on Security and Privacy . Piscataway : IEEE , 2018 : 127 - 143 .
DUCHI J C , JORDAN M I , WAINWRIGHT M J . Local privacy and statistical minimax rates [C]// 2013 IEEE 54th Annual Symposium on Foundations of Computer Science . Piscataway : IEEE , 2013 : 429 - 438 .
WANG N , XIAO X K , YANG Y , et al . Collecting and analyzing multidimensional data with local differential privacy [C]// 2019 IEEE 35th International Conference on Data Engineering . Piscataway : IEEE , 2019 : 638 - 649 .
KULKARNI T . Answering range queries under local differential privacy [C]// Proceedings of the 2019 International Conference on Management of Data . New York, USA : ACM , 2019 : 1832 - 1834 .
REN X B , YU C M , YU W R , et al . LoPub: High-dimensional crowdsourced data publication with local differential privacy [J]. IEEE Transactions on Information Forensics and Security , 2018 , 13 ( 9 ): 2151 - 2166 .
AMIR A , AMIT M , LANDAU G M , et al . Period recovery of strings over the Hamming and edit distances [J]. Theoretical Computer Science , 2018 , 710 : 2 - 18 .
WANG T H , BLOCKI J , LI N H , et al . Locally differentially private protocols for frequency estimation [C]// Proceedings of the 26th USENIX Conference on Security Symposium . Berkeley, CA,USA : USENIX Association , 2017 : 729 - 745 .
0
浏览量
9
下载量
3
CSCD
关联资源
相关文章
相关作者
相关机构
京公网安备11010802024621