Acta Electronica Sinica, 2011, Vol. 39, Issue (11): 2635-2642.

• Reviews and Comments •

Survey of MapReduce Parallel Programming Model

LI Jian-jiang¹, CUI Jian¹, WANG Dan¹, YAN Lin¹, HUANG Yi-shuang²

  1. Department of Computer Science and Technology, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; 2. Research Institute of Exploration Southern Division Company, SINOPEC, Chengdu, Sichuan 610041, China
  • Received: 2011-01-12 Revised: 2011-03-17 Online: 2011-11-25 Published: 2011-11-25

Abstract: Through well-defined interfaces and a runtime support library, the MapReduce parallel programming model automatically executes large-scale computing tasks in parallel, hides the underlying implementation details, and reduces the difficulty of parallel programming. This paper surveys research on MapReduce in China and abroad, describing and analyzing the characteristics and shortcomings of typical research results. It focuses on an in-depth analysis of the state of research on the key technologies involved in MapReduce: model improvement, implementations of the model on different platforms, task scheduling, load balancing, and fault tolerance. Finally, the paper discusses future development trends for MapReduce.

Key words: MapReduce, parallel programming model, runtime support library, massive data processing
