This is the documentation for Cloudera 5.4.x. Documentation for other versions is available at Cloudera Documentation.

Load and Index Data in Search


Execute the script found in a subdirectory of the following locations. The path for the script often includes the product version, such as Cloudera Manager 5.4.x, so path details vary. To address this issue, use wildcards.

Packages: /usr/share/doc. If Search for CDH 5.4.2 is installed to the default location using packages, the Quick Start script is found in /usr/share/doc/search-*/quickstart.
Parcels: /opt/cloudera/parcels/CDH/share/doc. If Search for CDH 5.4.2 is installed to the default location using parcels, the Quick Start script is found in /opt/cloudera/parcels/CDH/share/doc/search-*/quickstart.
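For example, assuming a default parcel-based installation, you can confirm the script's location with a wildcard (the exact versioned directory name varies by release):

$ ls /opt/cloudera/parcels/CDH/share/doc/search-*/quickstart/quickstart.sh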

The script uses several defaults that you might want to modify:

Table 1. Script Parameters and Defaults

Parameter          Default                                Notes
NAMENODE_CONNECT   `hostname`:8020                        For use on an HDFS HA cluster. If you use NAMENODE_CONNECT, do not use NAMENODE_HOST or NAMENODE_PORT.
NAMENODE_HOST      `hostname`                             If you use NAMENODE_HOST and NAMENODE_PORT, do not use NAMENODE_CONNECT.
NAMENODE_PORT      8020                                   If you use NAMENODE_HOST and NAMENODE_PORT, do not use NAMENODE_CONNECT.
ZOOKEEPER_HOST     `hostname`
ZOOKEEPER_PORT     2181
ZOOKEEPER_ROOT     /solr
HDFS_USER          ${HDFS_USER:="${USER}"}
SOLR_HOME          /opt/cloudera/parcels/SOLR/lib/solr

By default, the script is configured to run on the NameNode host, which is also running ZooKeeper. Override these defaults with custom values when you start
quickstart.sh. For example, to use an alternate NameNode and HDFS user ID, you could start the script as follows:

$ NAMENODE_HOST=nnhost HDFS_USER=jsmith ./quickstart.sh
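Similarly, on an HDFS HA cluster you could set NAMENODE_CONNECT instead. This is a sketch, assuming the script accepts the value verbatim as the HDFS authority; nameservice1 is a hypothetical HA nameservice ID, so substitute your own:

$ NAMENODE_CONNECT=nameservice1 ./quickstart.sh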

The first time the script runs, it downloads required files such as the Enron data and configuration files. If you run the script again, it uses the Enron data already downloaded rather than downloading it again; on such subsequent runs, the existing data is used to re-create the enron-email-collection SolrCloud collection.

Note: Downloading the data from its server, expanding the data, and uploading the data can be time-consuming. Although your connection and CPU speed determine the time these processes require, fifteen minutes is typical and longer is not uncommon.

The script also generates a Solr configuration and creates a collection in SolrCloud. The following sections describe what the script does and how you can complete these steps manually, if desired (an illustrative sketch of the equivalent manual commands follows the list). The script completes the following tasks:

1. Set variables such as hostnames and directories.
2. Create a directory to which to copy the Enron data, and then copy that data to this location. This data is about 422 MB; in some tests it took about five minutes to download and two minutes to untar.
3. Create a directory for the current user in HDFS, change ownership of that directory to the current user, create a directory for the Enron data, and load the Enron data to that directory. In some tests, it took about a minute to copy approximately 3 GB of untarred data.
4. Use solrctl to create a template of the instance directory.
5. Use solrctl to create a new Solr collection for the Enron mail collection.
6. Create a directory to which the MapReduceBatchIndexer can write results, and ensure that the directory is empty.
7. Use the MapReduceIndexerTool to index the Enron data and push the result live to enron-email-collection. In some tests, it took about seven minutes to complete this task.
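As a rough illustration of steps 3 through 7, the manual equivalents might look like the following. This is a minimal sketch, not the script's exact logic: the HDFS paths, the local directory name enron_mail, the instance directory $HOME/emailSearchConfig, the shard count of 2, the jar location, and the hosts nnhost and zkhost are all assumptions for illustration.

# Step 3: stage the Enron data in HDFS (paths and the local
# directory name "enron_mail" are illustrative assumptions)
$ sudo -u hdfs hdfs dfs -mkdir -p /user/$USER
$ sudo -u hdfs hdfs dfs -chown $USER /user/$USER
$ hdfs dfs -mkdir -p /user/$USER/enron
$ hdfs dfs -put enron_mail /user/$USER/enron

# Steps 4 and 5: generate an instance directory template, upload
# it to ZooKeeper, and create the collection
$ solrctl instancedir --generate $HOME/emailSearchConfig
$ solrctl instancedir --create enron-email-collection $HOME/emailSearchConfig
$ solrctl collection --create enron-email-collection -s 2

# Step 6: create an empty HDFS directory for the indexer output
$ hdfs dfs -mkdir -p /user/$USER/outdir

# Step 7: batch-index the data and push the result to the live
# collection; morphline.conf is a hypothetical local morphline
# configuration, and nnhost/zkhost stand in for your own hosts
$ hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar \
    org.apache.solr.hadoop.MapReduceIndexerTool \
    --morphline-file morphline.conf \
    --output-dir hdfs://nnhost:8020/user/$USER/outdir \
    --go-live \
    --zk-host zkhost:2181/solr \
    --collection enron-email-collection \
    hdfs://nnhost:8020/user/$USER/enron

The --go-live option merges the freshly built index into the running SolrCloud collection, which is what "push the result live" refers to in step 7.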
