DATA ANALYTICS LABORATORY (21CSL66)
2. IMPLEMENT WORD COUNT / FREQUENCY PROGRAM USING
MAPREDUCE.
Steps to be followed:
• Step-1: Open Eclipse → then select File → New → Java Project → Name it WordCount → then Finish.
• Step-2: Create three Java classes in the project.
File → New → Class
Name them WCDriver (having the main function), WCMapper and WCReducer.
• Step-3: Add the required Hadoop JARs as Reference Libraries.
Right Click on Project → then select Build Path → Click on Configure Build Path → Add External JARs (from the share/hadoop directory of your Hadoop installation). Add the JARs for Client, Common, HDFS, MapReduce and YARN → Click on Apply and Close.
• Step-4: Mapper Code which should be copied and pasted into the
WCMapper Java Class file.
// Importing libraries
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WCMapper extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Map function
    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter rep) throws IOException {
        String line = value.toString();

        // Splitting the line on spaces
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                // Emit (word, 1) for every non-empty token
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}
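The mapper's core logic, tokenizing a line on spaces and emitting a count of 1 per word, can be sanity-checked outside Hadoop. The sketch below is illustrative only (the class and method names are not part of the lab code) and mimics what WCMapper.map emits for a single input line:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map.Entry;

public class MapSketch {
    // Mimics WCMapper.map(): split on spaces, emit (word, 1) for non-empty tokens
    static List<Entry<String, Integer>> map(String line) {
        List<Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Double spaces produce empty tokens, which the length check skips
        System.out.println(map("big data  big ideas"));
    }
}
```

Note that the same word appears once per occurrence at this stage; the framework later groups these pairs by key before handing them to the reducer.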
• Step-5: Reducer Code which should be copied and pasted into the
WCReducer Java Class file.
// Importing libraries
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WCReducer extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {

    // Reduce function
    public void reduce(Text key, Iterator<IntWritable> value, OutputCollector<Text, IntWritable> output, Reporter rep) throws IOException {
        int count = 0;

        // Counting the frequency of each word
        while (value.hasNext()) {
            IntWritable i = value.next();
            count += i.get();
        }

        output.collect(key, new IntWritable(count));
    }
}
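The reducer simply sums the 1s that the shuffle phase grouped under each word. As a standalone sketch of that summation (illustrative names, plain Java instead of Hadoop's Writable types):

```java
import java.util.Arrays;
import java.util.Iterator;

public class ReduceSketch {
    // Mimics WCReducer.reduce(): sum all the 1s grouped under one word
    static int reduce(Iterator<Integer> values) {
        int count = 0;
        while (values.hasNext()) {
            count += values.next();
        }
        return count;
    }

    public static void main(String[] args) {
        // After the shuffle, a word that occurred three times arrives as (1, 1, 1)
        System.out.println(reduce(Arrays.asList(1, 1, 1).iterator())); // prints 3
    }
}
```

Because the values are only iterated once and summed, this same reducer could also serve as a combiner to pre-aggregate counts on the map side.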
• Step-6: Driver Code which should be copied and pasted into the
WCDriver Java Class file.
// Importing libraries
import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WCDriver extends Configured implements Tool {

    public int run(String args[]) throws IOException {
        if (args.length < 2) {
            System.out.println("Please give valid inputs");
            return -1;
        }

        JobConf conf = new JobConf(WCDriver.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        JobClient.runJob(conf);
        return 0;
    }

    // Main Method
    public static void main(String args[]) throws Exception {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.out.println(exitCode);
    }
}
• Step-7: Now you have to make a jar file.
Right Click on Project → Click on Export → Select export destination as Jar File → Name the jar file (WordCount.jar) → Click on Next → at last Click on Finish.
• Step-8: Open the terminal and change the directory to the workspace.
You can do this by using the “cd workspace/” command.
Now, create a text file (WCFile.txt) containing some sample text; it will be copied to HDFS in the next step. Make sure you are in the same directory as the jar file you just created. You can verify the file's contents with,
cat WCFile.txt
• Step-9: Now, run the below command to copy the input file into HDFS,
hadoop fs -put WCFile.txt WCFile.txt
• Step-10: Now, to run the jar file, execute the below code,
hadoop jar WordCount.jar WCDriver WCFile.txt WCOutput
• Step-11: After executing the code, you can see the result in the WCOutput directory or by writing the following command on the terminal,
hadoop fs -cat WCOutput/part-00000
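To know what to expect in part-00000 before running on the cluster, the whole job can be simulated in plain Java. The sketch below is only an approximation (the class name and the sample input are made up for illustration); it reproduces the word<TAB>count output format, with keys sorted as Hadoop sorts them before the reduce phase:

```java
import java.util.Map;
import java.util.TreeMap;

public class WordCountSim {
    // Simulates map + shuffle + reduce for one small input.
    // A TreeMap stands in for Hadoop's sorted, grouped keys.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.split(" ")) {
            if (word.length() > 0) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical contents of WCFile.txt
        Map<String, Integer> counts = countWords("Hadoop counts words Hadoop counts");
        // Print in the "word<TAB>count" format of part-00000
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

For that sample input the output would be one line per distinct word, e.g. "Hadoop" and "counts" each with count 2 and "words" with count 1, tab-separated, which is the same shape hadoop fs -cat shows for the real job.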