电子学报 ›› 2017, Vol. 45 ›› Issue (2): 359-367.DOI: 10.3969/j.issn.0372-2112.2017.02.014

• 学术论文 • 上一篇    下一篇

基于改进哈夫曼编码的大规模动态图可达查询方法

丁琳琳, 李正道, 纪婉婷, 宋宝燕   

  1. 辽宁大学信息学院, 辽宁沈阳 110036
  • 收稿日期:2015-09-06 修回日期:2015-12-22 出版日期:2017-02-25 发布日期:2017-02-25
  • 通讯作者: 宋宝燕
  • 作者简介:丁琳琳,女,1983年生于辽宁阜新.辽宁大学信息学院副教授.研究方向为大数据管理、分布式数据管理、图数据管理等.E-mail:dinglinlin@lnu.edu.cn;李正道,男,1989年生于贵州遵义.辽宁大学硕士研究生.研究方向为图数据管理.E-mail:li_zhengdao@163.com;纪婉婷,女,1992年生于辽宁沈阳.辽宁大学硕士研究生.研究方向为图数据管理.E-mail:jwt@escience.cn
  • 基金资助:

    国家自然科学基金(No.61472169,No.61502215,No.61572119,No.61303016,No.61472069,No.11547235);辽宁省教育厅优秀人才项目(No.LR201017);辽宁省教育厅科学研究一般项目(No.L2015193);辽宁省博士科研启动基金(No.201501127);辽宁大学青年科研基金(No.LDQN201438)

Reachability Query of Large Scale Dynamic Graph Based on Improved Huffman Coding

DING Lin-lin, LI Zheng-dao, JI Wan-ting, SONG Bao-yan   

  1. School of Information, Liaoning University, Shenyang, Liaoning 110036, China
  • Received:2015-09-06 Revised:2015-12-22 Online:2017-02-25 Published:2017-02-25

摘要:

随着社交网络分析、生物信息网络分析等新兴应用的涌现和计算机技术的飞速发展,图的规模迅速增长,并且频繁更新,使得对大规模动态图数据的处理需求愈加迫切.现有的面向大规模动态图的可达查询研究成果较少,尚存在索引压缩困难以及图结构待优化等问题.本文提出了一种支持大规模动态图的基于改进哈夫曼编码的可达查询处理方法(Huffman-based Label Reachability,HuffLR).该方法首先对预处理图进行结构上的两次压缩,得到双压缩图;其次,基于双压缩图提出一种前缀label索引,该索引能够有效表达节点间的可达关系;最后,提出双压缩图的演进和可达查询处理及优化算法,主要包括边的插入与删除、节点的插入与删除.实验表明,本文提出的基于改进哈夫曼编码的大规模动态图可达查询处理方法具有良好的可行性和有效性.

关键词: 可达查询, 大规模图, 动态图, 哈夫曼编码, 标签索引

Abstract:

With the emerging of applications such as Social Network Services and biological information network analysis and the rapid development of computer technology,the scale of graph increases quickly and updates frequently,so the demand to deal with large scale dynamic graph becomes more pressing.The existing research works focusing on large scale dynamic graph are rare,which have shortcomings such as difficult index compression and needing optimizing graph structure.Therefore,in this paper,we present a reachability query processing method of dynamic large scale graph based on improved Huffman Coding,named Huffman-based Label Reachability,HuffLR.Firstly,the structure of pre-processing graph is compressed twice in order to gain the double compression graph.Secondly,the prefix-label index is constructed based on the double compression graph,which can express the reachability relations between nodes effectively.Lastly,we present the evolution of the double compression graph and reachability query processing and optimized algorithms,including insertion and deletion of edge,insertion and deletion of node.Experimental results demonstrate that the reachability query processing algorithm of dynamic large scale graph based on improved Huffman Coding has good feasibility and effectiveness.

Key words: reachability query, large scale graph, dynamic graph, Huffman coding, label index

中图分类号: