Big data Analytics LAB
Experiment_no-03
Aim: Write the driver, mapper, and reducer code to count the number of
occurrences of each word in a given file. (Hint: WordCount MapReduce program)
Description:
1) Open Oracle VM VirtualBox -> export cloudera -> Start.
2) Cloudera -> Settings -> System -> set Processors to “2” (the default is “1”).
3) To launch “Cloudera Express”:
(i) Open a terminal in Cloudera and start the server with
“sudo service cloudera-scm-server start”
(ii) After the server starts successfully, click Cloudera Express -> Cloudera Manager.
(iii) The username and password are both “cloudera”; enter them and click Login.
4) In the browser, type “localhost:50070/[Link]”.
5) In Eclipse -> File -> New -> Java Project -> Project name “WordCount” -> Finish.
6) Create three classes:
Right-click WordCount -> New -> Class -> Name “WCDriver”
Right-click WordCount -> New -> Class -> Name “WCMapper”
Right-click WordCount -> New -> Class -> Name “WCReducer”
7) Add the Hadoop libraries:
Right-click WordCount -> Build Path -> Configure Build Path -> Add External
JARs (/usr/lib/hadoop/hadoop-common-2.6.0-cdh5.13.0.jar,
/usr/lib/hadoop/hadoop-core-2.6.0-cdh5.13.0.jar).
Program:
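WCMapper.java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WCMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {
    // Map function: emits (word, 1) for every word in the input line
    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter rep)
            throws IOException {
        String line = value.toString();
        // Splitting the line on spaces
        for (String word : line.split(" ")) {
            // Consecutive spaces produce empty tokens; skip them
            if (word.length() > 0) {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}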
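The length check in the mapper matters because splitting on a single space yields empty tokens wherever spaces repeat. The following is a minimal plain-Java sketch of the same tokenizing logic, runnable without Hadoop; the class name `MapSketch` and the method `tokenize` are hypothetical names used only for illustration and are not part of the lab program.

```java
import java.util.ArrayList;
import java.util.List;

public class MapSketch {
    // Mirrors the loop in WCMapper.map: split on single spaces, skip empty tokens.
    // (Hypothetical helper; the real mapper emits to an OutputCollector instead.)
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // The double space between "hello" and "world" creates one empty token,
        // which tokenize() drops before anything is emitted.
        for (String word : tokenize("hello  world hello")) {
            System.out.println(word + "\t1"); // one (word, 1) pair per word
        }
    }
}
```

Each printed line stands in for one (word, 1) pair that the mapper would pass to `output.collect`.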
WCReducer.java
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WCReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {
    // Reduce function: receives one word together with all of its counts
    public void reduce(Text key, Iterator<IntWritable> value,
            OutputCollector<Text, IntWritable> output, Reporter rep)
            throws IOException {
        int count = 0;
        // Counting the frequency of each word
        while (value.hasNext()) {
            IntWritable i = value.next();
            count += i.get();
        }
        output.collect(key, new IntWritable(count));
    }
}
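For one key, the reducer simply walks an iterator of counts and adds them up. A minimal plain-Java sketch of that step is shown here, using `Integer` in place of `IntWritable` so that it runs without the Hadoop JARs; `ReduceSketch` and `sum` are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;

public class ReduceSketch {
    // Mirrors the loop in WCReducer.reduce: add up every count for one key.
    // (Hypothetical helper; Integer stands in for Hadoop's IntWritable.)
    static int sum(Iterator<Integer> values) {
        int count = 0;
        while (values.hasNext()) {
            count += values.next();
        }
        return count;
    }

    public static void main(String[] args) {
        // Three (hello, 1) pairs from the map phase arrive grouped under "hello".
        Iterator<Integer> ones = Arrays.asList(1, 1, 1).iterator();
        System.out.println("hello\t" + sum(ones));
    }
}
```

Because the reducer sums the values rather than counting them, the same logic would also give correct totals if the incoming values were partial sums instead of individual 1s.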
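WCDriver.java
import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WCDriver extends Configured implements Tool {
    public int run(String[] args) throws IOException {
        // Expect two arguments: the input path and the output path
        if (args.length < 2) {
            System.out.println("Please give valid inputs");
            return -1;
        }
        JobConf conf = new JobConf(WCDriver.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        // Key/value types emitted by the mapper and by the reducer
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        // Submit the job and wait for it to finish
        JobClient.runJob(conf);
        return 0;
    }

    // Main Method
    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.exit(exitCode);
    }
}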
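The driver only wires the mapper and reducer into one job; the grouping of pairs by key between the two phases is done by the framework. As a rough illustration of the whole data flow, the sketch below imitates map, group-by-key, and reduce in plain Java on an in-memory list of lines. It is an assumption-laden stand-in, not how Hadoop runs the job: `LocalWordCount` and `wordCount` are hypothetical names, and a `TreeMap` is used only to approximate the sorted key order of the job output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LocalWordCount {
    // Hypothetical in-memory imitation of the WordCount job.
    static Map<String, Integer> wordCount(List<String> lines) {
        // Map phase plus grouping: collect a 1 under each word.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                if (word.length() > 0) {
                    grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
                }
            }
        }
        // Reduce phase: sum the grouped values for each word.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int count = 0;
            for (int one : entry.getValue()) {
                count += one;
            }
            counts.put(entry.getKey(), count);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "deer bear river", "car car river", "deer car bear");
        // Print one "word<TAB>count" line per key.
        for (Map.Entry<String, Integer> entry : wordCount(lines).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

The sample lines are made up for the sketch; with the real job, the input comes from the path in `args[0]` and the result is written under the path in `args[1]`.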
Right-click WordCount -> Export -> Java -> JAR file -> JAR file: “WordCount” -> Finish
OUTPUT:
In Terminal