Practical 5

The document describes the steps to write a MapReduce program that determines average movie ratings from input lines containing a movie ID, user ID, rating and timestamp. The key steps are: 1. Create a Java project in Eclipse with a MovieRating class containing the Mapper class, the Reducer class and the main method. 2. The Mapper splits each line at the first comma and emits (movie ID, rest of the line) pairs. 3. The Reducer counts the ratings for each movie and sums them, emitting the count, the total and the contributing values; the average is the total divided by the count. 4. The main method configures the MapReduce job and runs it on fixed input and output paths.

Uploaded by

Priya

Business Intelligence & Big Data Analytics - 2 Darpan Naik – 60

Practical 5:

Aim:

Write a map-reduce program to determine the average ratings of movies. The input consists of a series of lines, each containing a movie number, user number, rating and a timestamp.
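As a rough illustration of this input format, the sketch below splits one line into its four fields. It assumes the fields are comma-separated (the delimiter the code in this practical searches for). The class name RatingLine and the sample line are hypothetical and are not part of the original practical.

```java
// Hypothetical helper, not part of the original practical.
// Assumes comma-separated fields: movie number, user number, rating, timestamp.
final class RatingLine {
    final int movieId;
    final int userId;
    final int rating;
    final long timestamp;

    private RatingLine(int movieId, int userId, int rating, long timestamp) {
        this.movieId = movieId;
        this.userId = userId;
        this.rating = rating;
        this.timestamp = timestamp;
    }

    // Parses one input line; throws IllegalArgumentException on a malformed line
    // (NumberFormatException is a subclass of IllegalArgumentException).
    static RatingLine parse(String line) {
        String[] fields = line.split(",");
        if (fields.length != 4) {
            throw new IllegalArgumentException(
                    "Expected 4 fields, got " + fields.length + ": " + line);
        }
        return new RatingLine(
                Integer.parseInt(fields[0].trim()),
                Integer.parseInt(fields[1].trim()),
                Integer.parseInt(fields[2].trim()),
                Long.parseLong(fields[3].trim()));
    }

    public static void main(String[] args) {
        // Made-up sample line, for illustration only.
        RatingLine r = RatingLine.parse("1,7,4,881250949");
        System.out.println("movie=" + r.movieId + " user=" + r.userId
                + " rating=" + r.rating + " timestamp=" + r.timestamp);
    }
}
```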

Steps:

1. Open Eclipse IDE and create a new project with a class file, MovieRating.java.
Open MovieRating.java and paste the following code:

MovieRating.java

package Movie;

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;

public class MovieRating
{
    // Mapper: splits each line at the first comma and emits
    // (first field, rest of the line).
    public static class MovieRatingsMapper extends Mapper<LongWritable, Text, IntWritable, Text>
    {
        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException
        {
            String valueStr = value.toString();
            int index = valueStr.indexOf(',');
            if (index != -1)
            {
                try
                {
                    // The first field is the movie number, per the input format stated in the Aim.
                    IntWritable keyMovieID = new IntWritable(Integer.parseInt(valueStr.substring(0, index)));
                    context.write(keyMovieID, new Text(valueStr.substring(index + 1)));
                }
                catch (NumberFormatException e)
                {
                    // Skip lines whose first field is not an integer.
                }
            }
        }
    }

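    // Reducer: for each key, counts the ratings and adds them up.
    public static class MovieRatingsReducer extends Reducer<IntWritable, Text, IntWritable, Text>
    {
        @Override
        public void reduce(IntWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException
        {
            int movieCount = 0;        // number of ratings seen for this key
            int movieRatingCount = 0;  // sum of those ratings
            String movieValues = "";
            for (Text value : values)
            {
                // Each value is the rest of the line: user number, rating and,
                // if present, a timestamp. Accept two or more fields so that
                // lines carrying a timestamp are not silently skipped.
                String[] tokens = value.toString().split(",");
                if (tokens.length >= 2)
                {
                    // You could get a NumberFormatException
                    movieRatingCount += Integer.parseInt(tokens[1].trim());
                    movieCount++;
                    movieValues = movieValues.concat(value.toString() + " ");
                }
            }

            // Output value: count, sum of ratings, (contributing values)
            context.write(key, new Text(Integer.toString(movieCount) + ","
                    + Integer.toString(movieRatingCount) + ",(" + movieValues.trim() + ")"));
        }
    }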

    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "CompositeKeyExample");
        job.setJarByClass(MovieRating.class);
        job.setMapperClass(MovieRatingsMapper.class);
        job.setReducerClass(MovieRatingsReducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);
        // The input file and output directory are fixed paths.
        // The output directory must not already exist when the job starts.
        FileInputFormat.addInputPath(job, new Path("/MovieRating/Input/Movie.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/MovieRating/Output"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
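
The reducer above writes the number of ratings and their sum for each key, so the average is the sum divided by the count. The following is a minimal plain-Java sketch of that arithmetic, run locally without Hadoop. The class name AverageRatingSketch, the method averageByMovie and the sample lines are hypothetical, and the input is assumed to be comma-separated with the movie number first and the rating third.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local sketch, not part of the original practical; it does not use Hadoop.
class AverageRatingSketch {

    // Groups ratings by movie number (first field) and returns the mean rating per movie.
    static Map<Integer, Double> averageByMovie(String[] lines) {
        Map<Integer, int[]> totals = new TreeMap<>(); // movie -> {count, sum}
        for (String line : lines) {
            String[] f = line.split(",");
            if (f.length < 3) {
                continue; // skip lines without a rating field
            }
            try {
                int movie = Integer.parseInt(f[0].trim());
                int rating = Integer.parseInt(f[2].trim());
                int[] t = totals.computeIfAbsent(movie, k -> new int[2]);
                t[0]++;          // count
                t[1] += rating;  // sum
            } catch (NumberFormatException e) {
                // skip lines with non-numeric fields
            }
        }
        Map<Integer, Double> averages = new TreeMap<>();
        for (Map.Entry<Integer, int[]> e : totals.entrySet()) {
            int[] t = e.getValue();
            // The cast avoids integer division.
            averages.put(e.getKey(), (double) t[1] / t[0]);
        }
        return averages;
    }

    public static void main(String[] args) {
        // Made-up sample lines, for illustration only.
        String[] sample = { "1,7,4,881250949", "1,8,5,881250950", "2,7,3,881250951" };
        System.out.println(averageByMovie(sample)); // prints {1=4.5, 2=3.0}
    }
}
```

In the Hadoop job itself, the same division could be applied to the count and sum that the reducer already accumulates before the output value is written.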

Output: