RC Version 6.0 © Powered By DSPACE, MIT. Enhanced by NTU Library IR team.


    Please use this identifier to cite or link to this item: http://asiair.asia.edu.tw/ir/handle/310904400/95825


    Title: Analyzing to Promote Computational Performance of Hadoop MapReduce via Pattern History Extraction
    Authors: Chen, Yen-Tang
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Hadoop;MapReduce;Cloud Computing;Pattern History
    Date: 2015
    Issue Date: 2015-11-20
    Publisher: Asia University
    Abstract: Hadoop MapReduce is a specialized computational model capable of handling huge amounts of data. However, with limited computational resources, how to adjust environment parameters to improve overall performance is an optimization problem. Taking the extraction of maximal repeats with Hadoop MapReduce as an example, this paper uses a greedy method to determine the Hadoop environment parameters. The greedy method handles the optimization problem of maximal pattern history extraction in three steps: (1) the number of computing nodes, (2) the memory allocation parameters, and (3) the number of reducers. At each step, the parameter value achieving the shortest computation time is selected. First, the number of computing nodes is determined. Second, the better of the two memory allocations provided by "Amazon Task Configuration" and "Hortonworks" is chosen. Finally, a suitable number of reducers is determined based on the parameters selected in the previous two steps. The experimental data consist of the titles and abstracts of PubMed, a medical literature database. Experimental results show that the computation time decreases as the number of computing nodes increases. The best memory allocation in this study is the "Amazon Task Configuration" setting for the "m1.medium" case. However, increasing the number of reducers did not decrease the computation time because the number of computing nodes was fixed. The experimental results can provide hints for improving the performance of other applications adopting the Hadoop MapReduce programming model under limited computational resources.
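    The three-step greedy selection described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the thesis's implementation: `run_job` is a hypothetical stand-in for timing one Hadoop MapReduce execution, and its cost model is invented here purely so the sketch is runnable.

    ```python
    # Illustrative sketch of the three-step greedy parameter search from the
    # abstract: fix later parameters at defaults, choose each parameter in
    # turn by the shortest observed computation time.

    def run_job(nodes, memory_profile, reducers):
        # Hypothetical placeholder for timing one Hadoop MapReduce run of the
        # maximal-repeat extraction job (NOT from the thesis). In a real
        # deployment this would launch the job and return wall-clock seconds.
        base = {"amazon-m1.medium": 100.0, "hortonworks": 120.0}[memory_profile]
        return base / nodes + 5.0 * reducers

    def greedy_tune(node_options, memory_profiles, reducer_options):
        # Step 1: with memory and reducers held at defaults, pick the node
        # count giving the shortest computation time.
        best_nodes = min(
            node_options,
            key=lambda n: run_job(n, memory_profiles[0], reducer_options[0]))
        # Step 2: with the chosen node count fixed, pick the better memory
        # allocation profile.
        best_mem = min(
            memory_profiles,
            key=lambda m: run_job(best_nodes, m, reducer_options[0]))
        # Step 3: with nodes and memory fixed, pick the number of reducers.
        best_red = min(
            reducer_options,
            key=lambda r: run_job(best_nodes, best_mem, r))
        return best_nodes, best_mem, best_red

    print(greedy_tune([2, 4, 8], ["amazon-m1.medium", "hortonworks"], [1, 2, 4]))
    ```

    Because each step conditions only on the choices already made, the search costs the sum of the option counts rather than their product; the trade-off, as with any greedy method, is that it need not find the globally optimal parameter combination.
    
    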
    Appears in Collections: [Department of Computer Science and Information Engineering] Master's and Doctoral Theses

    Files in This Item:

    File: index.html (0 Kb, HTML)


    All items in ASIAIR are protected by copyright, with all rights reserved.

