3. Recommended Platform:
OS: Linux is supported as both a development and a production platform. You can use
Ubuntu 14.04 or later (other Linux flavors such as CentOS and Red Hat also work).
Hadoop: Cloudera's Distribution including Apache Hadoop, CDH 5.x (plain Apache
Hadoop 2.x also works).
3.1 Setup Platform
If you are using Windows or Mac OS, you can create a virtual machine and install Ubuntu
in it, using either VMware Player or Oracle VirtualBox.
4. Prerequisites:
4.1. Install Java 7 (Oracle Java recommended)
4.1.1. Install Python Software Properties
$sudo apt-get install python-software-properties
4.1.2. Add Repository
$sudo add-apt-repository ppa:webupd8team/java
4.1.3. Update the source list
$sudo apt-get update
4.1.4. Install Java
$sudo apt-get install oracle-java7-installer
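To verify the installation, you can check the Java version; the output should report version 1.7:
$java -version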
4.2. Configure SSH
4.2.1. Install Open SSH Server-Client
$sudo apt-get install openssh-server openssh-client
4.2.2. Generate Key Pairs
$ssh-keygen -t rsa -P ""
4.2.3. Configure password-less SSH
$cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
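If SSH still prompts for a password later, make sure authorized_keys has restrictive permissions, since sshd ignores keys that are group- or world-writable:
$chmod 0600 $HOME/.ssh/authorized_keys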
4.2.4. Check by SSH to localhost
$ssh localhost
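The first connection may ask you to confirm the host's authenticity; after that, it should log in without asking for a password. Type exit to return to your local shell:
$exit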
5. Install Hadoop
5.1. Download Hadoop
http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.2.tar.gz
5.2. Untar the Tarball
$tar xzf hadoop-2.5.0-cdh5.3.2.tar.gz
Note: All the required jars, scripts, configuration files, etc. are available in the
HADOOP_HOME directory (hadoop-2.5.0-cdh5.3.2).
5.3. Setup Configuration:
5.3.1. Edit .bashrc:
Edit the .bashrc file located in the user's home directory and add the following entries:
export HADOOP_PREFIX="/home/hdadmin/hadoop-2.5.0-cdh5.3.2"
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
Note: After the above step, restart the terminal so that all the environment variables take
effect.
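Alternatively, you can reload the file in the current shell without restarting the terminal:
$source ~/.bashrc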
5.3.2. Edit hadoop-env.sh:
Edit the configuration file hadoop-env.sh (located in HADOOP_HOME/etc/hadoop) and set
JAVA_HOME:
export JAVA_HOME=<path-to-the-root-of-your-Java-installation>
(e.g. /usr/lib/jvm/java-7-oracle/)
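If you are unsure which JVMs are installed, you can list the candidate directories:
$ls /usr/lib/jvm/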
5.3.3. Edit core-site.xml:
Edit the configuration file core-site.xml (located in HADOOP_HOME/etc/hadoop) and add the
following entries:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hdadmin/hdata</value>
  </property>
</configuration>
Note: /home/hdadmin/hdata is a sample location; specify a location where you have
read/write privileges.
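Create the directory beforehand if it does not already exist:
$mkdir -p /home/hdadmin/hdata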
5.3.4. Edit hdfs-site.xml:
Edit the configuration file hdfs-site.xml (located in HADOOP_HOME/etc/hadoop) and add the
following entries:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
5.3.5. Edit mapred-site.xml:
Edit the configuration file mapred-site.xml (located in HADOOP_HOME/etc/hadoop) and add the
following entries (if the file does not exist, you may need to copy mapred-site.xml.template
to mapred-site.xml first):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
5.3.6. Edit yarn-site.xml:
Edit the configuration file yarn-site.xml (located in HADOOP_HOME/etc/hadoop) and add the
following entries:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
6. Start the Cluster:
6.1. Format the NameNode:
$bin/hdfs namenode -format
NOTE: Do this only once, when you first install Hadoop; formatting an existing NameNode
will delete all your data from HDFS.
6.2. Start HDFS Services:
$sbin/start-dfs.sh
6.3. Start YARN Services:
$sbin/start-yarn.sh
6.4. Check whether the services have started
$jps
NameNode
DataNode
ResourceManager
NodeManager
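You can also check the daemons through their web UIs; with the default ports, the NameNode UI is at http://localhost:50070 and the ResourceManager UI is at http://localhost:8088.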
7. Run MapReduce Jobs
7.1. Run the word count example:
$bin/hdfs dfs -mkdir /inputwords
$bin/hdfs dfs -put <data-file> /inputwords
$bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.2.jar wordcount /inputwords /outputwords
$bin/hdfs dfs -cat /outputwords/*
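To copy the result out of HDFS to the local file system, you can use -get (the target directory here is just an example):
$bin/hdfs dfs -get /outputwords ./outputwords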
Play with HDFS commands and perform various operations; follow the HDFS Command Guide.
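A few commands to get started (the paths below are only examples):
$bin/hdfs dfs -ls /
$bin/hdfs dfs -mkdir /mydir
$bin/hdfs dfs -put localfile.txt /mydir
$bin/hdfs dfs -cat /mydir/localfile.txt
$bin/hdfs dfs -rm /mydir/localfile.txt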
8. Stop The Cluster
8.1. Stop HDFS Services:
$sbin/stop-dfs.sh
8.2. Stop YARN Services:
$sbin/stop-yarn.sh