摘要 重叠社团在社交网络大数据中普遍存在.针对现有重叠社团挖掘算法易将重叠区域错误地划分为独立的社团且计算复杂的问题,提出了一种基于局部信息度量的快速重叠社团挖掘算法(Local information based Fast Overlapped Communities Detection,Li-FOCD).首先,为节点定义局部信息度量指标——社团连接度和邻居连接度,建模节点与社团的关系,缩小了计算范围;然后,每次并行地迭代执行缩减、扩展、去重等操作,并更新局部度量指标,通过松弛每次迭代的终止条件,发现近似最优社团集合而不是最优社团,最终算法复杂度为O(m+n).基于真实的大规模社交网络数据的试验分析表明:与当前流行的重叠社团挖掘算法相比,Li-FOCD在不损失检测质量的前提下,大幅提升了计算效率.
Abstract:Overlap between community pairs is commonplace in large-scale social networks.The most existing overlapped community detection algorithms may falsely identify overlaps as communities because the overlap area is denser than others,and those algorithms are computationally demanding and cannot scale well with the size of networks.In this paper,we propose a fast overlapping community discovery algorithm based on some locally computed information-Local information based Fast Overlapped Communities Detection (Li-FOCD).Firstly,we introduce two local information metrics for each network node-community connectivity score and neighborhood connectivity score,to model the relationship between nodes and communities;secondly,based on local metrics,we can concurrently execute the iterations of reduction,expansion,and duplication removal to find the approximately optimal communities instead of the optimal community,and achieve a low complexity O(m+n).Experimental analysis based on real large-scale social networking datasets shows that our algorithm outperforms some popular overlapped community finding algorithms in terms of computational time while not compromising with quality.
[1] DASGUPTA K,SINGH R,VISWANATHAN B,et al.Social ties and their relevance to churn in mobile telecom networks[A].Proceedings of 11th International Conference on Extending Database Technology:Advances in Database Technology[C].New York:ACM,2008.668-677.
[2] CAI Q,MA L,GONG M,et al.A survey on network community detection based on evolutionary computation[J].International Journal of Bio-Inspired Computation,2016,8(2):84-98.
[3] SCHAEFFER S E.Graph clustering[J].Computer Science Review,2007,1(1):27-64.
[4] 邓小龙,翟佳羽,尹栾玉.基于矢量影响力聚类系数的高效有向网络社团划分算法[J].电子与信息学报,2017,39(9):2071-2080. Deng Xiaolong,Zhai Jiayu,Yin Luanyu.Vector influence clustering coefficient based efficient directed community detection algorithm[J].Journal of Electronics & Information Technology,2017,39(9):2071-2080.(in Chinese)
[5] BLONDEL V D,GUILLAUME J L,LAMBIOTTE R,et al.Fast unfolding of communities in large networks[J].Journal of Statistical Mechanics:Theory and Experiment,2008,2008(10):10008.
[6] LANCICHINETTI A,FORTUNATO S and KERTESZ J.Detecting the overlapping and hierarchical community structure in complex networks[J].New Journal of Physics,2009,11(3):033015.
[7] LEE C,REID F,MCDAID A,et al.Detecting highly overlapping community structure by greedy clique expansion[J].ArXiv,2010,1002:1827.
[8] MISHRA N,SCHREIBER R,STANTON I,et al.Finding strongly knit clusters in social networks[J].Internet Mathematics,2008,5(1-2):155-174.
[9] YANG J,LESKOVEC J.Structure and overlaps of communities in networks[J].ArXiv,2012,1205:6228.
[10] 乔少杰,韩楠,张凯峰,等.复杂网络大数据中重叠社区检测算法[J].软件学报,2017,28(3):631-647. QIAO Shaojie,HAN Nan,ZHANG Kaifeng,et al.Algorithm for detecting overlapping communities from complex network big data[J].Journal of Software,2017,28(3):631-647.(in Chinese)
[11] CHAKRABORTY T,KUMAR S,GANGULY N,et al.GenPerm:a unified method for detecting non-overlapping and overlapping communities[J].IEEE Transactions on Knowledge and Data Engineering,2016,28(8):2101-2114.
[12] HUANG Faliang,LI Xuelong,et al.Overlapping community detection for multimedia social networks[J].IEEE Transactions on Multimedia,2017,19(8):1881-1893.
[13] AHN Y Y,BAGROW J P,LEHMANN S.Link communities reveal multiscale complexity in networks[J].Nature,2010,466(7307):761-764.
[14] MCDAID A,HURLEY N.Detecting highly overlapping communities with model-based overlapping seed expansion[A].Proceedings of International Conference on Advances in Social Networks Analysis and Mining[C].Piscataway,NJ:IEEE,2010.112-119.
[15] LIU Xiao,WEI Yiming,WANG Jian,et al.Community detection enhancement using no-negative matrix factorization with graph regularization[J].International Journal of Modern Physics B,2016,30(20):1650130.
[16] Yang J,LESKOVEC J.Overlapping community detection at scale:a nonnegative matrix factorization approach[A].Proceedings of the Sixth ACM International Conference on Web Search and Data Mining[C].New York:ACM,2013.587-596.
[17] NARAYANAM R,NARAHARI Y.A game theory inspired,decentralized,local information based algorithm for community detection in social graphs[A].Proceedings of 21st International Conference on Pattern Recognition[C].Piscataway,NJ:IEEE,2012,1072-1075.
[18] GREGORY S.Finding overlapping communities in networks by label propagation[J].New Journal of Physics,2010,12(10):103018.
[19] 刘功申,孟魁,郭弘毅,苏波,李建华.基于贡献函数的重叠社区划分算法[J].电子与信息学报,2017,39(8):1964-1971. LIU Gongshen,MENG Kui,GUO Hongyi,SU Bo,LI Jianhua.Overlapping-communities recognition algorithm based on contribution function[J].Journal of Electronics & Information Technology,2017,39(8):1964-1971.(in Chinese)
[20] CHEN W,LIU Z,SUN X,et al.A game-theoretic framework to identify overlapping communities in social networks[J].Data Mining and Knowledge Discovery,2010,21(2):224-240.
[21] ZHANG Lei,PAN Hebin,et al.A mixed representation-based multi-objective evolutionary algorithm for overlapping community detection[J].IEEE Transactions on Cybernetics,2017,47(9):2703-2716.
[22] WEN X,CHEN W N,LIN Y,et al.A maximal clique based multi-objective evolutionary algorithm for overlapping community detection[J].IEEE Transactions on Evolutionary Computation,2017,21(3):363-377.
[23] BAUMES J,GOLDBERG M,MAGDOM-ISMAIL M.Efficient identification of overlapping communities[A].Proceedings of International Conference on Intelligence and Security Informatics[C].Heidelberg,Berlin:Springer,2005.27-36.
[24] XIE J and SZYMANSKI B K.Community detection using a neighborhood strength driven label propagation algorithm[A].Proceedings of Network Science Workshop[C].Piscataway,NJ:IEEE,2011.188-195.
[25] XIE J,SZYMANSKI B K.Towards linear time overlapping community detection in social networks[A].Proceedings of Pacific-Asia Conference on Knowledge Discovery and Data Mining[C].Heidelberg,Berlin:Springer,2012.25-36.
[26] 刘世超,朱福喜,甘琳.基于标签传播概率的重叠社区发现算法[J].计算机学报,2016,29(4):717-729. LIU Shichao,ZHU Fuxi,GAN Lin.A label-propagation-probability-based algorithm for overlapping community detection[J].Chinese Journal of Computers,2016,29(4):717-729.(in Chinese)
[27] RAGHAVAN U N,ALBERT R and KUMARA S.Near linear time algorithm to detect community structures in large-scale networks[J].Physical review E,2007,76(3):036106.
[28] 马慧芳,谢蒙,何廷年,等.基于核心标签的可重叠微博网络社区划分方法[J].电子学报,2017,45(4):769-776. MA Xiefang,XIE Meng,HE Tingnian,et al.An overlapping microblog community detection algorithm via core tags[J].Acta Electronica Sinica,2017,45(4):769-776.(in Chinese)
[29] COSCIA M,ROSSETTI G,GIANNOTTI F,et al.Demon:a local-first discovery method for overlapping communities[A].Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining[C].New York:ACM,2012.615-623.
[30] YANG B,LIU J,LIU D.An autonomy-oriented computing approach to community mining in distributed and dynamic networks[J].Autonomous Agents and Multi-Agent Systems,2010,20(2):123-157.
[31] BU Z,WU Z,CAO J,et al.Local community mining on distributed and dynamic networks from a multiagent perspective[J].IEEE Transactions on cybernetics,2016,46(4):986-999.
[32] BU Z,CAO J,LI H J,et al.GLEAM:a graph clustering framework based on potential game optimization for large-scale social networks[J].Knowledge and Information Systems,2017:1-30.
[33] 国琳,左万利,彭涛.基于隶属度的社会化网络重叠社区发现及动态集群演化分析[J].电子学报,2016,44(3):587-594. GUO L,ZUO W L,PENG T.Overlapping community detection and dynamic group evolution analysis based on the degree of membership in social network[J].Acta Electronica Sinica,2016,44(3):587-594.(in Chinese)
[34] YANG J,LESKOVEC J.Defining and evaluating network communities based on ground-truth[J].Knowledge and Information Systems,2015,42(1):181-213.
[35] LANCICHINETTI A,RADICCHI F,RAMASCO J J,et al.Finding statistically significant communities in networks[J].PloS One,2011,6(4):e18961.