Experiment 3
MapReduce Programming Basics: Word Count, Sorting, and Filtering examples in Java/Python
AIM:
To understand and implement the basics of MapReduce programming in Hadoop by developing and
executing simple programs such as Word Count, Sorting, and Filtering using Java
Step 1: Create the WordCount Program in Eclipse
1. Open Eclipse → Create a Java Project.
2. Right-click the project → New → Package → Package name: main.java.com.training → Finish.
3. Right-click the project → New → Class → Class name: WordCount.
4. Type the program (given below).
5. Right-click the project → Build Path → Configure Build Path → Libraries → Add External JARs (add the Hadoop library JARs).
6. Right-click the project → Build Path → Configure Build Path → Java Compiler (set the compiler level to match the cluster's JDK).
7. Right-click the project → Export → Java → JAR file.
Step 2: Open VMware
Step 3: Open WinSCP
1. Type the IP address of the Hadoop machine.
2. Enter the username.
3. Enter the password.
4. Transfer the JAR and input files to the Hadoop cluster by dragging and dropping them.
Step 4: Open PuTTY
1. Open PuTTY on Windows.
2. In the Host Name (or IP address) field → enter the server’s IP or hostname (for
example: 192.168.1.100 or hadoop-master).
3. Port = 22 (default for SSH).
4. Connection type = SSH.
5. Click Open.
6. A terminal will appear → enter your username (e.g., hduser) and password.
Commands:
ls – lists the files and directories in the current directory or a specified path.
hadoop version – displays the installed Hadoop version.
Create a directory in HDFS: hadoop fs -mkdir /<input folder>
Upload a file from the local file system to HDFS: hadoop fs -put sample.txt /<input folder>
List files in HDFS: hadoop fs -ls /<input folder>
View the contents of a file: hadoop fs -cat /<input folder>/sample.txt
Run the JAR file: hadoop jar <jar file>.jar main.java.com.training.WordCount /<input folder>/sample.txt /<output folder>
List the output folder: hadoop fs -ls /<output folder>
View the output: hdfs dfs -cat /<output folder>/part-r-00000
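For reference, one possible end-to-end run is sketched below. The names /input, /output, and wordcount.jar are only assumptions for illustration; replace them with the folder and JAR names actually used.

hadoop version
hadoop fs -mkdir /input
hadoop fs -put sample.txt /input
hadoop fs -ls /input
hadoop fs -cat /input/sample.txt
hadoop jar wordcount.jar main.java.com.training.WordCount /input/sample.txt /output
hadoop fs -ls /output
hdfs dfs -cat /output/part-r-00000

Note that the output folder must not already exist before the job is submitted; the job creates it and writes its results into part-r-00000.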
Program :
package main.java.com.training;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    // Mapper: for every word in an input line, emit (WORD, 1).
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            // Split the line into words on whitespace.
            String[] words = line.split("\\s+");
            for (String word : words) {
                if (word.isEmpty()) {
                    continue; // skip blanks caused by leading/trailing spaces
                }
                Text outputKey = new Text(word.toUpperCase().trim());
                IntWritable outputValue = new IntWritable(1);
                context.write(outputKey, outputValue);
            }
        }
    }

    // Reducer: sum the counts received for each word.
    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: WordCount <input path> <output path>");
            System.exit(-1);
        }
        Configuration conf = new Configuration();
        // Intended to point the job at the cluster. When the JAR is run with
        // "hadoop jar" on a cluster node, the HDFS/YARN addresses are read from
        // core-site.xml and yarn-site.xml, so this line can usually be omitted.
        conf.set("ResourceManager", "hdfs://192.168.14.128:8050");
        Job job = Job.getInstance(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
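The AIM also calls for Sorting and Filtering examples. The two programs below are minimal sketches of one possible approach, built and run the same way as WordCount. The class names LineFilter and WordSort, the configuration key filter.word, and the whitespace-splitting of lines are assumptions chosen for illustration and can be adapted to the actual input data.

Filtering sketch (map-only job that keeps the lines containing a given word, passed as the third command-line argument):

package main.java.com.training;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LineFilter {

    public static class FilterMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        private String searchWord;

        @Override
        protected void setup(Context context) {
            // Read the word to filter on from the job configuration.
            searchWord = context.getConfiguration().get("filter.word", "");
        }

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit the whole line only when it contains the search word.
            if (value.toString().contains(searchWord)) {
                context.write(value, NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("Usage: LineFilter <input path> <output path> <word>");
            System.exit(-1);
        }
        Configuration conf = new Configuration();
        conf.set("filter.word", args[2]);
        Job job = Job.getInstance(conf, "LineFilter");
        job.setJarByClass(LineFilter.class);
        job.setMapperClass(FilterMapper.class);
        job.setNumReduceTasks(0); // map-only job: no reduce phase needed
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Sorting sketch (relies on the shuffle phase, which delivers keys to the reducer in sorted order; with the default single reducer the output file contains the distinct words of the input in sorted order):

package main.java.com.training;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordSort {

    public static class SortMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit each word as the map output key; the framework sorts the keys.
            for (String word : value.toString().split("\\s+")) {
                if (!word.isEmpty()) {
                    context.write(new Text(word), NullWritable.get());
                }
            }
        }
    }

    public static class SortReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
        @Override
        public void reduce(Text key, Iterable<NullWritable> values, Context context)
                throws IOException, InterruptedException {
            // Keys arrive here already sorted; write each distinct word once.
            context.write(key, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: WordSort <input path> <output path>");
            System.exit(-1);
        }
        Job job = Job.getInstance(new Configuration(), "WordSort");
        job.setJarByClass(WordSort.class);
        job.setMapperClass(SortMapper.class);
        job.setReducerClass(SortReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}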
Output :
Result:
Thus the MapReduce programs for Word Count, Sorting, and Filtering were successfully
implemented and executed using Hadoop.