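1. Department of Computer Science, Eindhoven University of Technology, North Brabant, Netherlands, 5600 MB
2. National Digital Switching System Engineering and Technological R&D Center, Zhengzhou, Henan 450000
3. Department of Computer Science, Eindhoven University of Technology, North Brabant, Netherlands, 5600 MB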
Published online: 2019-08-25
Published in print: 2019
ZHANG Jian-peng, CHEN Hong-chang, WANG Kai, et al. A Sampling-Based Graph Clustering Algorithm for Large-Scale Networks[J]. Acta Electronica Sinica, 2019, 47(8): 1731-1737. DOI: 10.3969/j.issn.0372-2112.2019.08.017.
Existing clustering methods, such as the classic GN algorithm, have computational complexity too high to be applied to large-scale graphs. To address this, this paper first studies sampling algorithms for large-scale graphs and proposes Clustering-structure Representative Sampling (CRS), a graph sampling algorithm that effectively preserves the clustering structure of the original graph. CRS produces high-quality clustering-representative nodes in the sample and expands the sample according to corresponding expansion criteria, so that the sample retains the intrinsic clustering structure of the original graph. Second, the paper proposes a fast Population Clustering Inference (PCI) algorithm, which uses the clustering labels of the sampled subgraph to infer the clustering structure of the whole graph. Experimental results show that, compared with state-of-the-art methods, the proposed algorithm achieves better efficiency and clustering accuracy on large-scale graphs.
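The sample-then-infer workflow described in the abstract can be illustrated with a toy sketch. This is not the paper's CRS or PCI: the abstract does not give their details, so the breadth-first sample expansion, the connected-component labelling of the sample, and the neighbour majority vote used below are simple stand-ins chosen for brevity, and every function name is hypothetical.

```python
from collections import Counter, deque

def build_adj(edges):
    """Build an undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def sample_nodes(adj, seeds, budget):
    """Grow a sample breadth-first from the seeds (stand-in for CRS expansion)."""
    sampled, queue, seen = [], deque(seeds), set(seeds)
    while queue and len(sampled) < budget:
        u = queue.popleft()
        sampled.append(u)
        for v in sorted(adj[u]):  # sorted for a deterministic visiting order
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return set(sampled)

def cluster_sample(adj, sampled):
    """Label the sampled subgraph by connected components
    (stand-in for any real clustering algorithm run on the sample)."""
    labels, next_label = {}, 0
    for start in sorted(sampled):
        if start in labels:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in sampled and v not in labels:
                    labels[v] = next_label
                    stack.append(v)
        next_label += 1
    return labels

def infer_population(adj, sample_labels):
    """Assign each unlabelled node the majority label of its labelled
    neighbours, in synchronous rounds (stand-in for PCI)."""
    labels = dict(sample_labels)
    pending = [u for u in adj if u not in labels]
    while pending:
        new = {}
        for u in pending:
            votes = Counter(labels[v] for v in adj[u] if v in labels)
            if votes:
                new[u] = votes.most_common(1)[0][0]
        if not new:  # remaining nodes are unreachable from any labelled node
            break
        labels.update(new)
        pending = [u for u in pending if u not in labels]
    return labels

# Toy graph: two 4-node cliques joined by the single bridge edge (3, 4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]
adj = build_adj(edges)
sampled = sample_nodes(adj, seeds=[0, 7], budget=4)
full = infer_population(adj, cluster_sample(adj, sampled))
```

In this toy run only four of the eight nodes are clustered directly; the other four receive labels from their sampled neighbours, which is where the cost saving of a sampling-based approach would come from on a large graph.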