Big data concerns massive, complex, and growing data sets drawn from multiple autonomous sources. These data sets can be structured, semi-structured, or unstructured, and will typically not fit into memory to be processed. MapReduce is a programming model for processing large datasets distributed across large clusters. With the rapid growth of data in recent times, industry and academia require intelligent data analysis tools to satisfy the need to analyze large amounts of data. The MapReduce framework is designed for data-demanding applications that support effective decision making. Since its introduction, remarkable research effort has been put into making it more familiar to users and into supporting the execution of enormous data-intensive applications. This survey paper highlights and investigates various applications built on recent MapReduce models.
Indonesian Journal of Electrical Engineering and Computer Science, 2016
Nowadays we are all surrounded by big data. The term 'Big Data' itself indicates huge volume, high velocity, variety, and veracity (i.e., uncertainty) of data, which give rise to new difficulties and challenges. The big data generated may be structured, semi-structured, or unstructured. Existing databases and systems face many difficulties in processing, analyzing, storing, and managing such big data. The big data challenges include protection, curation, capture, analysis, search, visualization, storage, transfer, and sharing. MapReduce is a framework with which we can write applications that process huge amounts of data, in parallel, on large clusters of commodity hardware in a reliable manner. Many efforts have been made by different researchers to make it simple, easy, effective, and efficient. In our survey paper we emphasize the working of MapReduce, its challenges, opportunities, and recent trends, so that researchers can pursue further improvements.
International Journal of Advanced Computer Science and Applications, 2015
With the rapid growth of data in recent times, industry and academia require an intelligent data analysis tool to satisfy the need to analyze huge amounts of data. The MapReduce framework is designed for data-intensive applications that support effective decision making. Since its introduction, remarkable research effort has been put into making it more familiar to users and into supporting the execution of massive data-intensive applications. Our survey paper emphasizes the state of the art in improving the performance of various applications using recent MapReduce models, and how they are used to process large-scale datasets. A comparative study of the given models in Apache Hadoop and Phoenix is discussed, based primarily on execution time and fault tolerance. Finally, a high-level discussion covers enhancements of MapReduce computation in specific problem areas such as iterative computation, continuous query processing, and hybrid databases.
International Journal of Electrical and Computer Engineering (IJECE), 2016
2014 International Conference on Computer and Communication Engineering, 2014
Recently, data generated from a variety of sources with massive volume, high rates, and differing structure has come to be called big data. Processing and analyzing big data is a challenge for current systems because they were designed without big data requirements in mind, and most were built on centralized architectures, which are unsuitable for big data processing: they result in high processing cost and low processing performance and quality. The MapReduce framework was built as a parallel, distributed programming model to process such large-scale datasets effectively and efficiently. This paper presents six successful big data analysis solutions implemented on the MapReduce framework, describing their dataset structures and how they were implemented, so as to guide and help other researchers with their own big data solutions.
International Journal of Advanced Trends in Computer Science and Engineering , 2019
Recent years have seen exemplary growth in data generation. This enormous amount of data has brought a new kind of problem. Existing RDBMS systems are unable to process big data, or are not efficient in handling it. The significant problems that arrive with big data are storage and processing. Hadoop provides solutions for storage and processing in the form of HDFS (Hadoop Distributed File System) and MapReduce, respectively. Traditional systems were not built to store big data, and they can process only structured data. The financial sector was one of the first industries to face big data challenges. In this work, unstructured stock data is processed using Hadoop MapReduce. Efficient processing of the unstructured data is analyzed, and all the phases involved in the implementation are explicated.
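As a minimal sketch of how per-symbol stock records might be reduced in the two MapReduce phases, the single-process example below groups prices by symbol and reduces each group to its maximum close. The line format, sample prices, and function names are hypothetical illustrations, not taken from the paper's implementation:

```python
from collections import defaultdict

# Hypothetical record format: "symbol,date,close_price"
raw_lines = [
    "AAPL,2019-01-02,157.92",
    "AAPL,2019-01-03,142.19",
    "IBM,2019-01-02,115.21",
]

def mapper(line):
    # Map phase: parse one raw line and emit an intermediate (symbol, price) pair.
    symbol, _date, close = line.split(",")
    yield symbol, float(close)

# Shuffle phase (modeled in-process): group intermediate values by key.
groups = defaultdict(list)
for line in raw_lines:
    for sym, price in mapper(line):
        groups[sym].append(price)

def reducer(symbol, prices):
    # Reduce phase: collapse all prices for one symbol to the maximum close.
    return symbol, max(prices)

max_close = dict(reducer(s, p) for s, p in groups.items())
```

In a real Hadoop job the shuffle is performed by the framework between the map and reduce tasks; here it is simulated with a dictionary so the two phases can be read in isolation.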
We are in the age of big data, which involves the collection of large datasets. Managing and processing large data sets is difficult with existing traditional database systems. Hadoop and MapReduce have become among the most powerful and popular tools for big data processing. Hadoop MapReduce, a powerful programming model, is used for analyzing large sets of data with parallelization, fault tolerance, and load balancing; it is also elastic, scalable, and efficient. MapReduce is combined with the cloud to form a framework for the storage, processing, and analysis of massive machine maintenance data in a cloud computing environment.
International Journal of Computer Sciences and Engineering (IJCSE), E-ISSN : 2347-2693, Volume-5, Issue-10, Page No. 218-225, 2017
Over the last three or four years, the field of "big data" has emerged as the new frontier in the wide spectrum of IT-enabled innovations afforded by the information revolution. Today, there is a rising need to analyze very large datasets, which have been coined big data, and which demand unique storage and processing infrastructures. MapReduce is a programming model whose goal is processing big data in a parallel and distributed manner. In MapReduce, the user specifies a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. In this paper, we aim to present a close-up view of MapReduce, a famous framework for data-intensive distributed computing of batch jobs. To simplify fault tolerance, many implementations of MapReduce materialize the entire output of every map and reduce task before it can be consumed. Finally, we also compare RDBMS and MapReduce, and review well-known scheduling algorithms in this field.
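The map/reduce contract described above (a map function emitting intermediate key/value pairs, a reduce function merging all values that share a key) can be sketched in-process. `run_mapreduce`, the sample records, and the word-count functions below are illustrative assumptions, not part of any real Hadoop API:

```python
from collections import defaultdict

def map_fn(_, line):
    # Map: emit an intermediate (word, 1) pair for every word in the line.
    for word in line.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    # Reduce: merge all intermediate values associated with the same key.
    yield word, sum(counts)

def run_mapreduce(records, map_fn, reduce_fn):
    # Map phase: apply map_fn to every input key/value record.
    intermediate = defaultdict(list)
    for key, value in records:
        for k, v in map_fn(key, value):
            intermediate[k].append(v)
    # Shuffle is modeled by the dict grouping; reduce merges values per key.
    output = {}
    for k in sorted(intermediate):
        for out_k, out_v in reduce_fn(k, intermediate[k]):
            output[out_k] = out_v
    return output

counts = run_mapreduce(
    [(0, "big data needs big tools"), (1, "map and reduce")],
    map_fn, reduce_fn)
```

The same `map_fn`/`reduce_fn` pair would be handed to a real framework unchanged; only the grouping and distribution machinery differs.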
With the advancement of computer technology, there has been a huge increase in the growth of data. Researchers are overwhelmed by the increasing data-processing needs arising in every field of science. A major issue encountered across fields is making full use of these large-scale data to support decision making. Data mining is the technique that can discover new patterns from huge data sets. It has been studied for years across a wide range of application areas, and many data mining methods have been developed and applied in practice. Recently, however, there has been a huge increase in the amount of data and in the computation and analysis it demands. In such circumstances most classical data mining methods become impractical for handling such massive data. Efficient parallel/concurrent algorithms and implementation techniques are the key to meeting the scalability and performance requirements entailed by such large-scale data mining analyses. A number of parallel algorithms have been implemented using different parallelization techniques, which can be listed as: threads, MPI, MapReduce, and mixed or workflow technologies, each yielding different performance and usability characteristics. The MPI model is found to be efficient for compute-intensive problems, especially in simulation, but it is difficult to use in practice. MapReduce grew out of the data analysis model of the information retrieval field and is a cloud technology. To date, several MapReduce frameworks have been developed for handling big data; the most renowned is Google's.
Another with such features is Hadoop, the most popular open-source MapReduce software, adopted by many large IT companies such as Yahoo, Facebook, and eBay. In this paper, we focus particularly on Hadoop and its implementation of MapReduce for analytical processing.
Big data, a novel phenomenon, refers to the collection and processing of massive data sets and the associated systems and algorithms used to analyze them. Big data architectures span multiple machines and clusters with special-purpose subsystems. The data produced from several sources requires analysis and organization within meager amounts of time. To potentially speed up processing, a unified approach to machine learning is applied on the MapReduce framework. MapReduce, a broadly applicable programming model, is applied to different learning algorithms from the machine learning family to support business decisions. Using ML algorithms with Hadoop for better storage distribution improves processing time and speed. This paper presents parallel implementations of various machine learning algorithms on top of the MapReduce model for time and processing efficiency.
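One common way learning algorithms are expressed on MapReduce is in summation form: each mapper emits partial sufficient statistics for its data split, and a reducer combines them, so no node ever holds the full dataset. The sketch below, computing a mean over partitions, is an illustrative assumption about that pattern, not code from the paper:

```python
# Each "mapper" sees only its own split of the data and emits the
# partial sufficient statistics (sum, count) for that split.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

def map_partial(split):
    # Map phase: reduce one split to its local (sum, count) pair.
    return sum(split), len(split)

def reduce_combine(partials):
    # Reduce phase: combine the per-split statistics into the global mean.
    total = sum(s for s, _ in partials)
    n = sum(c for _, c in partials)
    return total / n

mean = reduce_combine([map_partial(p) for p in partitions])
```

The same decomposition carries over to gradients or covariance terms for regression and clustering, since sums over disjoint splits add exactly.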
International Journal of Engineering and Computer Science, 2020
As a result of the rapid development of cloud computing, it is fundamental to investigate the performance of different Hadoop MapReduce applications and to identify the performance bottlenecks in a cloud cluster that contribute to higher or lower performance. It is also essential to examine the underlying hardware in cloud cluster servers to permit the optimization of software and hardware to achieve the highest performance feasible. Hadoop is founded on MapReduce, which is among the most popular programming models for big data analysis in a parallel computing environment. In this paper, we present a detailed performance analysis, characterization, and evaluation of the Hadoop MapReduce WordCount application. The main aim of this paper is to demonstrate Hadoop MapReduce programming by giving hands-on experience in developing Hadoop-based Word-Count and Apriori applications: the word count problem using the Hadoop MapReduce framework, and the Apriori algorithm, which has been used ...
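A minimal single-process sketch of the first two Apriori passes in map/reduce style follows: count individual items, prune by a minimum support, then count only those candidate pairs built from frequent items. The transactions and support threshold are invented for illustration:

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions and support threshold.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter", "bread"},
    {"milk"},
]
min_support = 2

# Pass 1, map phase: each transaction emits (item, 1); Counter plays the
# role of the shuffle-and-reduce that sums counts per item.
item_counts = Counter(item for t in transactions for item in t)
# Pass 1, reduce/prune: keep only frequent 1-itemsets.
frequent_1 = {item for item, c in item_counts.items() if c >= min_support}

# Pass 2: candidate 2-itemsets come only from frequent items (the Apriori
# property), and are counted the same map/reduce way.
pair_counts = Counter(
    pair for t in transactions
    for pair in combinations(sorted(t & frequent_1), 2))
frequent_2 = {pair for pair, c in pair_counts.items() if c >= min_support}
```

Each pass is one MapReduce job in a Hadoop implementation; the candidate set shrinks between jobs because any superset of an infrequent itemset is itself infrequent.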