
Experiment 3

MapReduce Programming Basics: Word Count, Sorting, and Filtering Examples in Java/Python

AIM:

To understand and implement the basics of MapReduce programming in Hadoop by developing and
executing simple programs such as Word Count, Sorting, and Filtering using Java.

Step 1: Create the WordCount Program in Eclipse

1. Open Eclipse → Create a Java Project.
2. Right-click the project → New → Package → package name: main.java.com.training → Finish.
3. Right-click the project → New → Class → class name: WordCount.
4. Type the program (given below).
5. Right-click the project → Build Path → Configure Build Path → Libraries → Add External JARs
(the Hadoop library JARs from the Hadoop installation, typically hadoop-common and
hadoop-mapreduce-client-core).
6. Right-click the project → Properties → Java Compiler (set the compiler compliance level to
match the cluster's Java version).
7. Right-click the project → Export → Java → JAR file.

Step 2: Open VMware and power on the Hadoop virtual machine

Step 3: Open WinSCP

1. Type the IP address of the Hadoop machine.
2. Enter the username.
3. Enter the password.
4. Transfer the JAR and input files to the Hadoop cluster by dragging and dropping them.

Step 4: Open Putty

1. Open PuTTY on Windows.


2. In the Host Name (or IP address) field → enter the server’s IP or hostname (for
example: 192.168.1.100 or hadoop-master).
3. Port = 22 (default for SSH).
4. Connection type = SSH.
5. Click Open.
6. A terminal will appear → enter your username (e.g., hduser) and password.

Commands:

Type ls to list the files and directories in the current directory or a specified path.

Type hadoop version to check the installed Hadoop version.

Create a directory in HDFS: hadoop fs -mkdir /<input_folder>

Upload a file from the local file system to HDFS: hadoop fs -put sample.txt /<input_folder>

List files in HDFS: hadoop fs -ls /<input_folder>

View the contents of a file: hadoop fs -cat /<input_folder>/sample.txt

Run the JAR file: hadoop jar <jarfile>.jar main.java.com.training.WordCount /<input_folder>/sample.txt /<output_folder>

List the output directory: hadoop fs -ls /<output_folder>

View the output: hdfs dfs -cat /<output_folder>/part-r-00000

Program:

package main.java.com.training;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

import java.io.IOException;

public class WordCount {

    // Mapper: emits (word, 1) for each token on a line.
    // Note: the split is on commas, so the input is expected to be comma-separated;
    // for example, the line "apple,banana,apple" emits (APPLE,1), (BANANA,1), (APPLE,1).
    public static class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String[] words = line.split(",");
            for (String word : words) {
                // Upper-case and trim each token so "apple " and "APPLE" count as the same word.
                Text outputKey = new Text(word.toUpperCase().trim());
                IntWritable outputValue = new IntWritable(1);
                context.write(outputKey, outputValue);
            }
        }
    }

    // Reducer: sums the 1s emitted for each word to produce its total count.
    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {

        if (args.length != 2) {
            System.err.println("Usage: WordCount <input path> <output path>");
            System.exit(-1);
        }
        Configuration conf = new Configuration();

        // fs.defaultFS is the standard key for the default file system (NameNode) URI;
        // the host and port below are cluster-specific and must match your setup.
        conf.set("fs.defaultFS", "hdfs://192.168.14.128:8050");

        // Job.getInstance(...) replaces the deprecated new Job(conf, name) constructor.
        Job job = Job.getInstance(conf, "WordCount");

        job.setJarByClass(WordCount.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Exit with status 0 on success, 1 on failure.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Output:
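The aim also covers sorting. Below is a minimal sketch of a sorting job, not part of the original listing: it exploits the fact that the MapReduce shuffle delivers mapper output keys to the reducer in sorted order, so emitting each line as a key is enough to sort a text file. The class name SortExample and its package placement follow the WordCount conventions above and are illustrative assumptions.

package main.java.com.training;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

// Sorting example: each input line becomes a key; the shuffle phase sorts
// the keys, so the reducer only has to write them back out.
public class SortExample {

    public static class SortMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // The line itself is the key; NullWritable is a placeholder value.
            context.write(value, NullWritable.get());
        }
    }

    public static class SortReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
        @Override
        public void reduce(Text key, Iterable<NullWritable> values, Context context)
                throws IOException, InterruptedException {
            // Keys arrive in sorted order; duplicate lines are written once per group.
            context.write(key, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: SortExample <input path> <output path>");
            System.exit(-1);
        }
        Job job = Job.getInstance(new Configuration(), "SortExample");
        job.setJarByClass(SortExample.class);
        job.setMapperClass(SortMapper.class);
        job.setReducerClass(SortReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        // With the default single reducer the output is globally sorted;
        // multiple reducers would need a TotalOrderPartitioner.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Packaged into the same JAR, it would be run like WordCount, e.g. hadoop jar <jarfile>.jar main.java.com.training.SortExample /<input_folder>/sample.txt /<output_folder_sorted>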
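For the filtering part of the aim, a map-only job is the usual pattern: the mapper writes out only the records that satisfy a condition, and with zero reduce tasks the mapper output goes straight to HDFS. This sketch is likewise an illustration rather than the original experiment's code; the filter condition (keeping lines that contain "ERROR") and the class name FilterExample are assumptions.

package main.java.com.training;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

// Filtering example: a map-only job that keeps lines matching a condition.
public class FilterExample {

    public static class FilterMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
        // Illustrative condition: keep lines that contain the word "ERROR".
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            if (value.toString().contains("ERROR")) {
                context.write(value, NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: FilterExample <input path> <output path>");
            System.exit(-1);
        }
        Job job = Job.getInstance(new Configuration(), "FilterExample");
        job.setJarByClass(FilterExample.class);
        job.setMapperClass(FilterMapper.class);
        // Zero reducers makes this a map-only job; mapper output is written directly.
        job.setNumReduceTasks(0);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}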

Result:

Thus the MapReduce programs for Word Count, Sorting, and Filtering were successfully
implemented and executed using Hadoop.
