Download: Massive Data Load on Distributed Database Systems over HBase

Translated paper: Massive Data Load on Distributed Database Systems over HBase
Price: 880,000 rials
Product ID: 2008235
Author/Publisher/Journal: 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
Year of publication: 2017
Number of English pages: 4
Attached file types: Pdf+Word
File size: 203 Kb
Password for all files: www.daneshgahi.com
English title: Massive Data Load on Distributed Database Systems over HBase

Abstract

Big Data has become a pervasive technology to manage the ever-increasing volumes of data. Among Big Data solutions, scalable data stores play an important role, especially key-value data stores, due to their large scalability (thousands of nodes). The typical workflow for Big Data applications includes two phases. The first is to load the data into the data store, typically as part of an ETL (Extract-Transform-Load) process. The second is the processing of the data itself. Bigtable and HBase are the preferred key-value solutions based on range-partitioned data stores. However, the loading phase is inefficient and creates a single-node bottleneck. In this paper, we identify and quantify this bottleneck and propose a tool for parallel massive data loading that satisfactorily solves the bottleneck, enabling all the parallelism and throughput of the underlying key-value data store during the loading phase as well. The proposed solution has been implemented as a tool for parallel massive data loading over HBase, the key-value data store of the Hadoop ecosystem.

Keywords: HBase, MapReduce, HDFS
