电子学报 ›› 2019, Vol. 47 ›› Issue (2): 257-265.DOI: 10.3969/j.issn.0372-2112.2019.02.001

• 学术论文 •    下一篇

面向大规模网络的快速重叠社团挖掘算法

李政廉, 吉立新, 黄瑞阳, 兰巨龙   

  1. 国家数字交换系统工程技术研究中心, 河南郑州 450003
  • 收稿日期:2017-12-12 修回日期:2018-05-15 出版日期:2019-02-25 发布日期:2019-02-25
  • 作者简介:李政廉 男,1988年生于河北三河.国家数字交换系统工程技术研究中心博士生.研究方向为数据可视化、复杂网络分析.E-mail:lizhenglian@yeah.net;吉立新 男,1969年生于江苏淮安.国家数字交换系统工程技术研究中心副总工程师,研究员.研究方向为电信网安全、数据挖掘.
  • 基金资助:
    国家自然科学基金(No.61521003)

Fast Overlapping Communities Detection Algorithms for Large-Scale Social Networks

LI Zheng-lian, JI Li-xin, HUANG Rui-yang, LAN Ju-long   

  1. National Digital Switching System Engineering & Technological R & D Center, Zhengzhou, Henan 450003, China
  • Received:2017-12-12 Revised:2018-05-15 Online:2019-02-25 Published:2019-02-25

摘要: 重叠社团在社交网络大数据中普遍存在.针对现有重叠社团挖掘算法易将重叠区域错误地划分为独立的社团且计算复杂的问题,提出了一种基于局部信息度量的快速重叠社团挖掘算法(Local information based Fast Overlapped Communities Detection,Li-FOCD).首先,为节点定义局部信息度量指标——社团连接度和邻居连接度,建模节点与社团的关系,缩小了计算范围;然后,每次并行地迭代执行缩减、扩展、去重等操作,并更新局部度量指标,通过松弛每次迭代的终止条件,发现近似最优社团集合而不是最优社团,最终算法复杂度为Om+n).基于真实的大规模社交网络数据的试验分析表明:与当前流行的重叠社团挖掘算法相比,Li-FOCD在不损失检测质量的前提下,大幅提升了计算效率.

关键词: 重叠社团挖掘, 局部信息, 社团连接度, 邻居连接度

Abstract: Overlap between community pairs is commonplace in large-scale social networks.The most existing overlapped community detection algorithms may falsely identify overlaps as communities because the overlap area is denser than others,and those algorithms are computationally demanding and cannot scale well with the size of networks.In this paper,we propose a fast overlapping community discovery algorithm based on some locally computed information-Local information based Fast Overlapped Communities Detection (Li-FOCD).Firstly,we introduce two local information metrics for each network node-community connectivity score and neighborhood connectivity score,to model the relationship between nodes and communities;secondly,based on local metrics,we can concurrently execute the iterations of reduction,expansion,and duplication removal to find the approximately optimal communities instead of the optimal community,and achieve a low complexity O(m+n).Experimental analysis based on real large-scale social networking datasets shows that our algorithm outperforms some popular overlapped community finding algorithms in terms of computational time while not compromising with quality.

Key words: overlapping communities detection, Local information, community connectivity, neighborhood connectivit

中图分类号: