
3. Recommended Platform:

• OS: Linux is supported as a development and production platform. You can use Ubuntu 14.04 or later (other Linux flavors such as CentOS and Red Hat also work).
• Hadoop: Cloudera Distribution for Apache Hadoop, CDH 5.x (you can also use Apache Hadoop 2.x).

3.1 Setup Platform


If you are using Windows or Mac OS, you can create a virtual machine and install Ubuntu in it, using either VMware Player or Oracle VirtualBox.

4. Prerequisites:
4.1. Install Java 7 (Recommended Oracle Java)
4.1.1. Install Python Software Properties

$ sudo apt-get install python-software-properties

4.1.2. Add Repository

$ sudo add-apt-repository ppa:webupd8team/java

4.1.3. Update the source list

$ sudo apt-get update

4.1.4. Install Java

$ sudo apt-get install oracle-java7-installer

4.2. Configure SSH


4.2.1. Install OpenSSH Server and Client

$ sudo apt-get install openssh-server openssh-client

4.2.2. Generate Key Pairs

$ ssh-keygen -t rsa -P ""

4.2.3. Configure password-less SSH


$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

4.2.4. Check by SSH to localhost

$ ssh localhost
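If `ssh localhost` still prompts for a password after the steps above, overly permissive file modes are the usual cause: sshd ignores `authorized_keys` unless it and `~/.ssh` are restricted to the owner. A minimal sketch of the required permissions, run here against a throwaway directory standing in for `~/.ssh`:

```shell
# Stand-in for $HOME/.ssh; in practice apply the chmod lines to the real directory
D=$(mktemp -d)
touch "$D/authorized_keys"
chmod 700 "$D"                    # .ssh itself: owner-only access
chmod 600 "$D/authorized_keys"    # key file: owner read/write only
stat -c '%a' "$D" "$D/authorized_keys"   # prints 700 then 600
```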

5. Install Hadoop
5.1. Download Hadoop
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.2.tar.gz

5.2. Untar the Tarball


$ tar xzf hadoop-2.5.0-cdh5.3.2.tar.gz

Note: All the required jars, scripts, configuration files, etc. are available in the HADOOP_HOME directory (hadoop-2.5.0-cdh5.3.2).
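As an illustration of the extract step, the sketch below builds a tiny throwaway archive standing in for the real hadoop-2.5.0-cdh5.3.2.tar.gz, then runs the same `tar xzf` command on it:

```shell
cd "$(mktemp -d)"
mkdir -p hadoop-2.5.0-cdh5.3.2/bin          # fake layout standing in for the real tarball contents
tar czf hadoop.tar.gz hadoop-2.5.0-cdh5.3.2
rm -r hadoop-2.5.0-cdh5.3.2
tar xzf hadoop.tar.gz                       # same command as above, run on the stand-in
ls hadoop-2.5.0-cdh5.3.2                    # prints: bin
```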

5.3. Setup Configuration:


5.3.1. Edit .bashrc:

Edit the .bashrc file located in the user's home directory and add the following entries:

export HADOOP_PREFIX="/home/hdadmin/hadoop-2.5.0-cdh5.3.2"
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}

Note: After the above step, restart the terminal so that the environment variables take effect.
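A quick way to confirm the variables took effect in the new terminal is simply to echo them; the sketch below mirrors the .bashrc entries (in a real session they are loaded automatically when the shell starts):

```shell
# Mirror of the .bashrc entries for this standalone sketch
export HADOOP_PREFIX="/home/hdadmin/hadoop-2.5.0-cdh5.3.2"
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
# Both should print the install path; an empty line means .bashrc was not reloaded
echo "$HADOOP_MAPRED_HOME"
echo "$HADOOP_HDFS_HOME"
```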

5.3.2. Edit hadoop-env.sh:

Edit configuration file hadoop-env.sh (located in HADOOP_HOME/etc/hadoop) and set JAVA_HOME:

export JAVA_HOME=<path-to-the-root-of-your-Java-installation>
(e.g. /usr/lib/jvm/java-7-oracle/)

5.3.3. Edit core-site.xml:


Edit configuration file core-site.xml (located in HADOOP_HOME/etc/hadoop) and add
following entries:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hdadmin/hdata</value>
    </property>
</configuration>

Note: /home/hdadmin/hdata is a sample location; specify a location where you have read/write privileges.
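Whatever location you pick for hadoop.tmp.dir must exist and be writable before the NameNode is formatted. A minimal sketch, using $HOME/hdata as an assumed example path:

```shell
HDATA="$HOME/hdata"        # example path - substitute the value from core-site.xml
mkdir -p "$HDATA"
[ -w "$HDATA" ] && echo "writable"   # prints: writable
```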

5.3.4. Edit hdfs-site.xml:

Edit configuration file hdfs-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

5.3.5. Edit mapred-site.xml:

Edit configuration file mapred-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries (if the file does not exist, first copy mapred-site.xml.template in the same directory to mapred-site.xml):

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

5.3.6. Edit yarn-site.xml:

Edit configuration file yarn-site.xml (located in HADOOP_HOME/etc/hadoop) and add the following entries:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
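A typo in any of these XML files makes the daemons fail at startup with a parse error, so a quick sanity check is worthwhile. The sketch below writes a sample yarn-site.xml fragment to a temp file and confirms the tag counts balance (a crude stand-in for a real XML validator):

```shell
F=$(mktemp)
cat > "$F" <<'EOF'
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
EOF
# Opening and closing counts must match; prints: 1:1
echo "$(grep -c '<property>' "$F"):$(grep -c '</property>' "$F")"
```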

6. Start the Cluster:


6.1. Format the name node:

$ bin/hdfs namenode -format

NOTE: Do this only once, when you first install Hadoop; reformatting an existing cluster deletes all your data from HDFS.
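Because a re-format wipes HDFS, scripts can guard the format step by checking whether the NameNode storage directory already exists. A sketch under the assumption that hadoop.tmp.dir is $HOME/hdata (a formatted NameNode creates a dfs/name/current subdirectory there by default):

```shell
HDATA="$HOME/hdata"    # example value of hadoop.tmp.dir from core-site.xml
if [ -d "$HDATA/dfs/name/current" ]; then
  echo "NameNode already formatted - skipping"
else
  echo "safe to run: bin/hdfs namenode -format"
fi
```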

6.2. Start HDFS Services:


$ sbin/start-dfs.sh

6.3. Start YARN Services:


$ sbin/start-yarn.sh

6.4. Check whether services have been started


$ jps
NameNode
DataNode
ResourceManager
NodeManager
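To make this check scriptable, grep the jps output for each expected daemon. The sketch below runs against a hardcoded sample (the process IDs are made up; in practice, capture the real `jps` output instead):

```shell
# Sample jps output standing in for the real command
JPS_OUT='2101 NameNode
2203 DataNode
2305 ResourceManager
2407 NodeManager'
for d in NameNode DataNode ResourceManager NodeManager; do
  echo "$JPS_OUT" | grep -q "$d" && echo "$d: up" || echo "$d: MISSING"
done
```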

7. Run Map-Reduce Jobs


7.1. Run word count example:

$ bin/hdfs dfs -mkdir /inputwords
$ bin/hdfs dfs -put <data-file> /inputwords
$ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.2.jar wordcount /inputwords /outputwords
$ bin/hdfs dfs -cat /outputwords/*
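To see what the wordcount job computes without a cluster, the same counting can be sketched with standard shell tools on a two-line sample input:

```shell
# Split on spaces, sort, count duplicates, highest count first
printf 'hello world\nhello hadoop\n' \
  | tr ' ' '\n' | sort | uniq -c | sort -rn
# top line shows: 2 hello
```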

Play with HDFS commands and perform various operations; follow the HDFS Command Guide.
8. Stop The Cluster
8.1. Stop HDFS Services:
$ sbin/stop-dfs.sh

8.2. Stop YARN Services:


$ sbin/stop-yarn.sh
