MapReduce Algorithm
MapReduce:
Hadoop MapReduce is the core Hadoop ecosystem component that provides
data processing. MapReduce is a software framework for easily writing
applications that process vast amounts of structured and unstructured data
stored in the Hadoop Distributed File System.
The MapReduce framework works on data stored in:
1. Hadoop Distributed File System (HDFS)
2. Google File System (GFS)
MapReduce Analogy:
Consider the problem of counting the number of occurrences of each word
in a large collection of documents.
How would you do it in parallel?
Solution:
1. Divide the documents among workers.
2. Each worker parses its documents to find all words and outputs (word, count) pairs.
3. Partition the (word, count) pairs across workers based on the word.
4. For each word at a worker, locally add up the counts (see the sketch below).
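A minimal single-machine sketch of these four steps in Java (a toy model with illustrative names and data, not the Hadoop API):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.TreeMap;

    public class WordCountSketch {
        // Step 2: a worker parses one document and emits (word, count) pairs.
        static List<Map.Entry<String, Integer>> map(String document) {
            List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
            for (String word : document.split("\\s+")) {
                pairs.add(Map.entry(word, 1));
            }
            return pairs;
        }

        // Step 4: for one word, locally add up all the counts routed to a worker.
        static int reduce(String word, List<Integer> counts) {
            int sum = 0;
            for (int c : counts) {
                sum += c;
            }
            return sum;
        }

        public static void main(String[] args) {
            // Step 1: the "collection of documents" divided among workers.
            List<String> documents = List.of("deer bear river", "car car river", "deer car bear");

            // Step 3: partition the (word, count) pairs into buckets keyed by word.
            // (On a real cluster the map calls run in parallel on different workers.)
            Map<String, List<Integer>> partitions = new TreeMap<>();
            for (String doc : documents) {
                for (Map.Entry<String, Integer> pair : map(doc)) {
                    partitions.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
                }
            }

            partitions.forEach((word, counts) ->
                    System.out.println(word + " " + reduce(word, counts)));
            // Prints: bear 2, car 3, deer 2, river 2 (one pair per line)
        }
    }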
How does MapReduce do it?
Suppose we have 100 files with daily temperatures for two cities, and each file
has 10,000 entries. For example, one file may contain (Toronto 20), (New York 30),
Our goal is to compute the maximum temperature for each of the two cities.
Assign the task to 100 map processors, each working on one file. Each
processor outputs a list of key-value pairs giving the maximum it has seen for
each city in its file, e.g., (Toronto 30), (New York 65), …
Now we have 100 lists, each with two elements. We give these lists to two
reducers: one for Toronto and another for New York.
The reducers produce the final answer: (Toronto 55), (New York 65)
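A sketch of how such a job could look with Hadoop's Java MapReduce API. The class names and the "City Temperature" line format are assumptions for illustration; for simplicity the mapper emits every reading and leaves all the max-taking to the reducer (the per-file maximum described above corresponds to also running the reducer as a combiner):

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MaxTemperature {

        // Assumes each input line looks like "Toronto 20" or "New York 30": the
        // last token is the temperature, everything before it is the city name.
        public static class MaxTempMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            @Override
            public void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString().trim();
                int split = line.lastIndexOf(' ');
                String city = line.substring(0, split);
                int temperature = Integer.parseInt(line.substring(split + 1));
                context.write(new Text(city), new IntWritable(temperature));
            }
        }

        // One reduce call per city: take the maximum over all values for that key.
        public static class MaxTempReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int max = Integer.MIN_VALUE;
                for (IntWritable value : values) {
                    max = Math.max(max, value.get());
                }
                context.write(key, new IntWritable(max));
            }
        }
    }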
Working of MapReduce:
MapReduce works by breaking the data processing into two phases:
1. Map phase
2. Reduce phase
Map Phase − The map or mapper’s job is to process the input data. Generally the
input data is in the form of a file or directory and is stored in the Hadoop file
system (HDFS). The input file is passed to the mapper function line by line. The
mapper processes the data and creates several small chunks of data.
Reduce Phase − The reducer’s job is to process the data that comes from the
mapper. After processing, it produces a new set of output, which will be stored in
the HDFS.
Keys and Values:
The programmer in MapReduce has to specify two functions, the map
function and the reduce function, that implement the Mapper and the
Reducer in a MapReduce program.
In MapReduce, data elements are always structured as key-value (i.e., (K, V))
pairs.
The map and reduce functions receive and emit (K, V) pairs.
[Figure] Input Splits, (K, V) pairs → Map Function → Intermediate Outputs,
(K', V') pairs → Reduce Function → Final Outputs, (K'', V'') pairs
Anatomy of MapReduce:
            Input               Output
    Map     <k1, v1>            list(<k2, v2>)
    Reduce  <k2, list(v2)>      list(<k3, v3>)
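In Hadoop's Java API these signatures show up directly as the type parameters of the Mapper and Reducer base classes; a minimal sketch (the concrete type choices here are illustrative, matching a word-count-style job):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Mapper<k1, v1, k2, v2>: each map(<k1, v1>) call emits list(<k2, v2>)
    // via context.write. Here k1 is the line's byte offset and v1 the line text.
    class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    }

    // Reducer<k2, v2, k3, v3>: each reduce(<k2, list(v2)>) call emits list(<k3, v3>).
    class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    }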
How MapReduce works:
The complete execution process (execution of both Map and Reduce tasks) is
controlled by two types of entities:
JobTracker: acts like a master (responsible for complete execution of the
submitted job)
Multiple TaskTrackers: act like slaves, each of them performing a part of the job
For every job submitted for execution in the system, there is one JobTracker,
which resides on the NameNode, and there are multiple TaskTrackers, which
reside on the DataNodes.
Examples of MapReduce:
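The classic example is the word count from the analogy above. A sketch along the lines of the standard Hadoop WordCount tutorial:

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Map phase: for each input line, emit (word, 1) for every word on the line.
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private final Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, one);
                }
            }
        }

        // Reduce phase: for each word, sum all the counts emitted by the mappers.
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        // Driver: configures and submits the job; input and output HDFS paths
        // come from the command line.
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

Setting the reducer as the combiner makes each mapper pre-sum its own (word, 1) pairs locally, which cuts down the data shuffled across the network.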