Acta Electronica Sinica ›› 2020, Vol. 48 ›› Issue (5): 930-936. DOI: 10.3969/j.issn.0372-2112.2020.05.013

• Academic Papers

Load-Balancing-Based Repair Pipelining for Erasure Codes

JIANG Xiao-yu, LI Gui-yang, ZHOU Yue, HU Jin-ping, LI Hui

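  1. School of Computer Science, Sichuan Normal University, Chengdu, Sichuan 610101
  • Received: 2019-07-04 Revised: 2020-02-07 Published: 2020-05-25
    • Corresponding author:
    • LI Gui-yang
    • About the author:
    • JIANG Xiao-yu, female, born in 1994 in Zigong, Sichuan, is a master's student at the School of Computer Science, Sichuan Normal University. Her main research interests are distributed storage and erasure codes. E-mail: jiang_xy0805@163.com
    • Funding:
    • National Natural Science Foundation of China (No.61701331)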

Repair Pipelining for Erasure-Coded Storage Based on Load Balancing

JIANG Xiao-yu, LI Gui-yang, ZHOU Yue, HU Jin-ping, LI Hui

  1. Department of Computer Science, Sichuan Normal University, Chengdu, Sichuan 610101, China
  • Received: 2019-07-04 Revised: 2020-02-07 Online: 2020-05-25 Published: 2020-05-25
    • Corresponding author:
    • LI Gui-yang
    • Supported by:
    • National Natural Science Foundation of China (No.61701331)

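Abstract: In distributed storage systems for big data, repair pipelining (RP) reduces repair time by 90%, effectively addressing the problem that erasure codes are unsuitable for storing hot data because of their high repair-time overhead. However, existing RP suffers from unbalanced load across nodes, which degrades system performance. This paper designs a node-load-balanced repair pipeline for erasure codes (Node Load Balancing-based Repair Pipelining, NLB-RP) and, based on the performance evaluation metrics, proposes one algorithm for computing node load and another for computing repair time. Theoretical analysis and experimental results show that, without introducing extra repair cost, NLB-RP effectively balances and reduces node load from the local to the global level. Compared with RP, the node-load variance of NLB-RP is 0, meaning that every node carries the same load. NLB-RP therefore achieves optimal load balance.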

Key words: big data, distributed storage, erasure code, repair pipelining, load balancing

Abstract: In distributed storage systems for big data, repair pipelining (RP) reduces repair time by 90%, which effectively solves the problem that erasure codes are not suitable for storing hot data because of their heavy repair-time overhead. However, the existing RP suffers from unbalanced load among nodes, which degrades system performance. In this paper, a repair pipelining scheme based on node load balancing (NLB-RP) is designed, and algorithms for calculating node load and repair time are proposed according to the performance evaluation metrics. Theoretical analysis and experimental results both show that, from local to global, NLB-RP effectively balances and reduces node load without introducing extra repair cost. Compared with RP, the node-load variance of NLB-RP is zero; that is, the load of every node is the same. Thus, the proposed NLB-RP achieves optimal load balance.

Key words: big data, distributed storage, erasure code, repair pipelining, load balancing
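
The abstract's balance criterion (a node-load variance of zero means every node carries the same load) can be illustrated with a minimal sketch. The function name `load_variance` and both load vectors are hypothetical, chosen only for illustration; they are not the paper's load algorithm or its measured data, and the paper's exact definition of node load is not given in the abstract.

```python
from statistics import pvariance


def load_variance(node_loads):
    """Population variance of per-node load.

    A result of 0 means every node carries exactly the same load.
    """
    return pvariance(node_loads)


# Hypothetical per-node loads for one repair (arbitrary units).
# These numbers are illustrative assumptions, not results from the paper.
unbalanced = [1, 2, 2, 2, 1]
balanced = [2, 2, 2, 2, 2]

print("unbalanced variance:", load_variance(unbalanced))
print("balanced variance:", load_variance(balanced))
```

Under this metric, any scheme whose nodes differ in load has a strictly positive variance, so a variance of zero is the smallest value the metric can take.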

CLC Number: