Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
Published online: 2019-05-25
Published in print: 2019
LI Jian-jiang, MA Zhan-ning, ZHANG Kai. An Optimal Hierarchical Deduplication Strategy Based on Content Defined Chunking[J]. Acta Electronica Sinica, 2019, 47(5): 1094-1100. DOI: 10.3969/j.issn.0372-2112.2019.05.017.
Over the past decades, the volume of data has grown exponentially, and storing and protecting it has become a difficult problem. Cloud storage and data deduplication are the principal technologies for addressing this problem. Deduplication is widely used in cloud storage systems, but mainstream systems suffer from bloated index information and unpredictable chunk boundaries, which waste memory and produce chunks of widely varying length. To address these shortcomings, this paper proposes an optimized hierarchical deduplication strategy based on content-defined chunking, together with the corresponding algorithm, which keeps the index table from growing too large and keeps data chunks from becoming too large or too small. Page content from CNN news is used as the test set. A comparison of deduplication ratio and deduplication time shows that, relative to current mainstream deduplication strategies, the proposed strategy improves the deduplication ratio by about 3% and reduces the deduplication time by about 2%.
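To make the terms in the abstract concrete, the following is a minimal sketch of content-defined chunking with bounded chunk sizes, followed by fingerprint-based deduplication. It is not the authors' algorithm: the abstract does not describe the hierarchical index, so that part is not sketched. The Gear-style rolling hash and the names `cdc_chunks`, `deduplicate`, `restore`, `GEAR`, `mask`, `min_size`, and `max_size` are all assumptions chosen for illustration.

```python
import hashlib

# Hypothetical Gear table: one pseudo-random 32-bit value per byte value.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big")
        for b in range(256)]


def cdc_chunks(data: bytes, mask: int = 0x3F,
               min_size: int = 32, max_size: int = 256) -> list[bytes]:
    """Split data at content-defined boundaries, bounding chunk length.

    A boundary is declared where the rolling hash has its masked bits equal
    to zero. min_size suppresses very small chunks and max_size forces a cut
    so that no chunk grows without limit.
    """
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Gear-style update: older bytes shift out of the 32-bit state.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        length = i - start + 1
        if length < min_size:
            continue
        if (h & mask) == 0 or length >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder, may be short
    return chunks


def deduplicate(blobs: list[bytes]):
    """Store each unique chunk once; keep a fingerprint recipe per blob."""
    store = {}     # fingerprint -> chunk bytes (flat index, not hierarchical)
    recipes = []   # one list of fingerprints per input blob
    for blob in blobs:
        recipe = []
        for chunk in cdc_chunks(blob):
            fp = hashlib.sha256(chunk).hexdigest()
            store.setdefault(fp, chunk)
            recipe.append(fp)
        recipes.append(recipe)
    return store, recipes


def restore(store: dict, recipe: list) -> bytes:
    """Rebuild a blob from its recipe of chunk fingerprints."""
    return b"".join(store[fp] for fp in recipe)
```

In this sketch the `min_size` and `max_size` bounds stand in for the abstract's goal of avoiding chunks that are too small or too large, and the flat `store` dictionary is the index whose growth a hierarchical scheme would aim to limit.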