Chapter 9
Introduction to Hive
Big Data and Analytics by Seema Acharya and Subhashini Chellappan
Copyright 2015, WILEY INDIA PVT. LTD.
Learning Objectives and Learning Outcomes
Learning Objectives
1. Introduction to Hive.
2. To study the Hive architecture.
3. To study the Hive file formats.
4. To study the Hive Query Language.
Learning Outcomes
a) To understand the Hive architecture.
b) To create databases and tables, and execute data manipulation language statements on them.
c) To differentiate between static and dynamic partitions.
d) To differentiate between managed and external tables.
Agenda
What is Hive?
Hive Architecture
Hive Data Types
Primitive Data Types
Collection Data Types
Hive File Format
Text File
Sequential File
RCFile (Record Columnar File)
Hive Queries
What is Hive?
Hive is a data warehousing tool, built on top of Hadoop, used to query
structured data. Facebook created the Hive component to manage their
ever-growing volumes of data. Hive makes use of the following:
1. HDFS for storage.
2. MapReduce for execution.
3. An RDBMS to store metadata.
Hive is suitable for data warehousing applications that process batch
jobs on huge data that is immutable.
Examples: analysis of web logs and application logs.
History of Hive
2007: Hive was born at Facebook to analyze their incoming log data.
2008: Hive became an Apache Hadoop sub-project.
Hive provides HQL (Hive Query Language), which is similar to SQL.
Hive compiles HQL queries into MapReduce jobs and then runs the jobs in
the Hadoop cluster.
Hive provides extensive data types, functions, and formats for data
summarization and analysis.
Features of Hive
1. It is similar to SQL.
2. HQL is easy to code.
3. Hive supports rich data types/collection data types such
as structs, lists, and maps.
4. Custom types and custom functions can be defined (see the sketch below).
5. Hive supports SQL filters, group-by, and order-by
clauses.
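As a minimal sketch of point 4, a custom function written in Java can be registered from the Hive shell. The JAR path and class name below are hypothetical, not from the slides:
ADD JAR /home/hduser/udfs/my_udfs.jar;                               -- hypothetical JAR containing the UDF class
CREATE TEMPORARY FUNCTION to_upper AS 'com.example.hive.ToUpperUDF'; -- hypothetical class name
SELECT to_upper(name) FROM student;                                  -- use the custom function like a built-in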
Points to remember
Hive Query Language is similar to SQL; queries get
compiled into MapReduce jobs, which then run in
the Hadoop cluster.
Hive's default metastore database is Derby.
Hive integration and workflow
Hourly log data can be stored directly in HDFS, and
data cleansing is then performed on the log files. Finally, Hive tables
can be created to query the log files.
Hive Data Units
• Databases: The namespace for tables.
• Tables: Sets of records that share a schema.
• Partitions: Logical separations of data based on the classification of given
information as per specific attributes. Once Hive has partitioned the data
based on a specified key, it assembles the records into specific folders
as the records are inserted.
• Buckets (Clusters): Similar to partitions, but a hash function is used to
segregate the data and determine the cluster or bucket into which a record
should be placed.
• In Hive, a table is stored as a folder, a partition as a sub-directory,
and a bucket as a file; the sketch below illustrates this layout.
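As a hedged illustration of that layout, with a made-up table name, partition key, and bucket count:
-- hypothetical table: one folder per table, one sub-directory per partition, one file per bucket
CREATE TABLE sales (id INT, amount FLOAT)
PARTITIONED BY (region STRING)           -- e.g. .../sales/region=APAC/ sub-directories
CLUSTERED BY (id) INTO 4 BUCKETS;        -- 4 files inside each partition directory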
Hive Data Units
[Diagram: data units as arranged in a Hive cluster — Database → Tables → Partitions → Buckets, down to Columns and Data.]
Hive Architecture
Hive Command Line Interface: interface to interact with Hive.
Web interface: a graphical user interface to interact with Hive.
Hive server: used to submit Hive jobs from a client.
JDBC/ODBC: Java code can be written to connect to Hive and submit jobs
to it.
Driver: compiles, optimizes, and executes Hive queries.
Hive metastore: Hive table definitions and mappings to the data are stored
in a metastore.
A metastore consists of the following:
Metastore service: offers an interface to Hive.
Database: stores data definitions, mappings to the data, and other metadata.
HIVE METASTORE
The Hive metastore is a database that stores metadata about your Hive
tables (e.g., table name, column names and types, table location, storage
handler being used, number of buckets in the table, sorting columns if
any, partition columns if any, etc.).
When you create a table, the metastore is updated with the
information related to the new table, and this information is queried when you
issue queries on that table.
Metadata stored in the metastore includes:
IDs of databases, tables, and indexes
Table creation time
Input/output format used for a table
Metadata is updated whenever a table is created in or deleted from Hive.
HIVE METASTORE
Three kinds:
Embedded metastore: both the metastore database and the
metastore service run embedded in the main HiveServer
process.
Local metastore: in this mode the Hive metastore service runs
in the same process as the main HiveServer process, but the
metastore database runs in a separate process, and can be on a
separate host. The metastore service communicates
with the metastore database over JDBC.
Remote metastore: the Hive driver and the metastore interface run in
different JVMs.
[Diagram: the workflow between Hive and Hadoop.]
Hive File Format
• Text File
The default file format is text file. Each record is a line in the file.
In a text file, different control characters are used as delimiters:
^A (octal 001) separates fields (columns).
^B (octal 002) separates elements in an array/struct.
^C (octal 003) separates key-value pairs.
\n is the record delimiter.
The term "field" is used when overriding the default delimiter. Supported
formats: CSV, TSV, XML, JSON.
• Sequential File
Sequential files are flat files that store binary key-value pairs. They include
compression support, which reduces I/O requirements.
• RCFile (Record Columnar File)
RCFile stores data in a column-oriented manner, which ensures that
aggregation operations are not expensive.
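A file format is chosen with the STORED AS clause; a hedged sketch with made-up table names:
-- text file is the default, but it can be stated explicitly
CREATE TABLE logs_txt (line STRING) STORED AS TEXTFILE;
-- binary key-value pairs with compression support
CREATE TABLE logs_seq (line STRING) STORED AS SEQUENCEFILE;
-- column-oriented storage, cheaper aggregations
CREATE TABLE logs_rc (line STRING) STORED AS RCFILE;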
Text file example. If the table schema is:
name STRING,
salary FLOAT,
subordinates ARRAY<STRING>,
deductions MAP<STRING, FLOAT>,
address STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>
then a sample record, with the default ^A/^B/^C delimiters, looks like:
John Doe^A100000.0^AMary Smith^BTodd Jones^AFederal Taxes^C.2^BState
Taxes^C.05^BInsurance^C.1^A1 Michigan Ave.^BChicago^BIL^B60600
The same sample record can be represented as a JSON (JavaScript
Object Notation) document:
{
  "name": "John Doe",
  "salary": 100000.0,
  "subordinates": ["Mary Smith", "Todd Jones"],
  "deductions": {
    "Federal Taxes": 0.2,
    "State Taxes": 0.05,
    "Insurance": 0.1
  },
  "address": {
    "street": "1 Michigan Ave.",
    "city": "Chicago",
    "state": "IL",
    "zip": 60600
  }
}
This shows how complex data types like ARRAY, MAP, and STRUCT are
represented in JSON; the result is readable and easy to work with.
DDL and DML statements
• DDL statements are used to build and modify tables and other
objects. They deal with:
1. Create/Drop/Alter Database
2. Create/Drop/Truncate Table
3. Alter Table/Partition/Column
4. Create/Drop/Alter View
5. Create/Drop/Alter Index
6. Show
7. Describe
• DML statements are used to retrieve, store, modify, delete, and update
data in the database. They deal with (see the sketch below):
1. Loading files into tables.
2. Inserting data into Hive tables from queries.
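A hedged sketch of both DML forms; the paths and the student_copy table name are made up:
-- 1. loading a file into a table
LOAD DATA LOCAL INPATH '/tmp/student_data.csv' INTO TABLE student;
-- 2. inserting into a Hive table from a query (assumes student_copy already exists)
INSERT OVERWRITE TABLE student_copy SELECT rollno, name, gpa FROM student;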
Database
Starting the Hive shell
Go to the Hive installation path and type hive (Hadoop needs to be started in another terminal).
To create a database named "STUDENTS" with a comment and database properties:
CREATE DATABASE IF NOT EXISTS STUDENTS COMMENT
'STUDENT Details' WITH DBPROPERTIES ('creator' = 'JOHN');
hive> show databases;
hive> describe database students; //shows DB name, comment, and directory
hive> describe database extended students; //also shows the properties
Database
To alter the database properties:
hive> alter database students set dbproperties ('edited by'= 'David');
hive> describe database extended students;
hive> use students; //to make it the current database
hive> drop database students;
Tables
Hive provides two kinds of tables: managed and external.
Metadata of a table: when we create a table, the metastore keeps
information about its location, schema, list of partitions, etc.
Managed table
When a managed (internal) table is dropped, both the data and the metadata are dropped.
The table is stored under Hive's warehouse folder.
The life cycle of the table and its data is managed by Hive.
External table
When you drop this table, the data is retained in the underlying location.
The EXTERNAL keyword is used to create an external table.
A location needs to be specified to store the dataset in that particular
location.
hive> describe formatted <tablename>; (to see the table type)
Tables
To create a managed table named 'STUDENT'.
Format:
CREATE [TEMPORARY] [EXTERNAL] TABLE [IF NOT EXISTS] STUDENT(rollno INT, name
STRING, gpa FLOAT) [COMMENT 'table_comment'] ROW FORMAT DELIMITED FIELDS TERMINATED BY
'\t';
CREATE TABLE IF NOT EXISTS STUDENT(rollno INT, name STRING, gpa FLOAT) ROW
FORMAT DELIMITED FIELDS TERMINATED BY '\t';
hive> describe STUDENT; //shows the schema of the table
Tables
To create an external table named 'EXT_STUDENT'.
CREATE EXTERNAL TABLE IF NOT EXISTS EXT_STUDENT(rollno INT, name
STRING, gpa FLOAT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/STUDENT_INFO';
ALTER TABLE name RENAME TO new_name;
hive> ALTER TABLE employee RENAME TO emp;
DROP TABLE [IF EXISTS] table_name;
hive> DROP TABLE IF EXISTS employee;
CREATING A TEXT FILE TABLE
create table olympic(athlete STRING, age INT, country STRING, year STRING, closing STRING,
sport STRING, gold INT, silver INT, bronze INT, total INT) row format delimited fields terminated by '\t' stored as textfile;
Here we are creating a table named "olympic" with the schema specified
above. The data inside the input file is delimited by tabs. As
explained earlier, the file format is specified as TEXTFILE at the end. The schema of
the table can be checked using describe olympic;
Loading data into a table from a file
To load data into the table from the file named [Link]:
LOAD DATA LOCAL INPATH '/root/hivedemos/[Link]' OVERWRITE INTO
TABLE EXT_STUDENT;
LOCAL is used to load data from the local file system; to load data that is
already in HDFS, remove the LOCAL keyword, as in the sketch below.
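A hedged example of the HDFS variant; the path is hypothetical:
-- without LOCAL, the path is on HDFS and the file is moved into the table's location
LOAD DATA INPATH '/user/hduser/student_data' OVERWRITE INTO TABLE EXT_STUDENT;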
To retrieve the student details from the EXT_STUDENT table:
hive> show tables;
hive> SELECT * FROM EXT_STUDENT;
hive> select rollno, name from EXT_STUDENT; //to view selected fields
Collection Data Types
ARRAY: ARRAY<data_type>
MAP: MAP<primitive_type, data_type>
STRUCT: STRUCT<col_name : data_type, ...>
hive> create table student_info(rollno INT, name STRING, sub ARRAY<STRING>, marks
MAP<STRING,INT>, addr struct<city:STRING, state:STRING, pin:BIGINT>) row format delimited
fields terminated by ',' collection items terminated by ':' map keys terminated by '!';
hive> load data local inpath '/home/hduser/hive/collection_demo.csv' into table student_info;
Raw contents of collection_demo.csv, following the delimiters declared above:
1001,John,English:Hindi,Mark1!45:Mark2!65,delhi:delhi:897654
1002,Jill,Physics:Maths,Mark1!43:Mark2!89,bhopal:MP:432024
Collection Data Types - Array
hive> select * from student_info;
1001 John ["English","Hindi"] {"Mark1":45,"Mark2":65} {"city":"delhi","state":"delhi","pin":897654}
1002 Jill ["Physics","Maths"] {"Mark1":43,"Mark2":89} {"city":"bhopal","state":"MP","pin":432024}
//Accessing complex data types
hive> select name, sub from student_info; //select the whole array
John ["English","Hindi"]
Jill ["Physics","Maths"]
hive> select name, sub[1] from student_info where rollno=1001;
hive> select name, sub[0] from student_info; //access a single array element
Collection Data Types - Map
hive> select name, marks from student_info; //display the whole map collection
John {"Mark1":45,"Mark2":65}
Jill {"Mark1":43,"Mark2":89}
hive> select name, marks["Mark1"], marks["Mark2"] from student_info;
John 45 65
Jill 43 89
hive> select name, addr.city from student_info; //access a struct field
John delhi
Jill bhopal
Partitions -> to execute Hive queries faster
Very often users need to filter the data on specific column values.
Example: select the employees whose salary is above 50000 from
a table of 10 lakh entries.
If users understand the domain of the data they are analyzing, they can
identify frequently queried columns, and partitioning can then
be applied on those columns.
When a WHERE clause is applied, Hive reads the entire dataset.
This decreases efficiency and becomes a bottleneck when we
are required to run queries on large tables.
This issue can be overcome by implementing partitions in Hive.
Partitions split a large dataset into more meaningful chunks.
Two kinds of partitions:
1) Static partition (comprises columns whose values are
known at compile time)
2) Dynamic partition (comprises columns whose values are
known only at execution time)
Static partition:
CREATE TABLE IF NOT EXISTS STUDENT(rollno INT, name STRING, gpa FLOAT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
LOAD DATA LOCAL INPATH '/home/hduser/hive/student_data.csv' OVERWRITE
INTO TABLE student;
create table if not exists static_student(rollno int, name string) partitioned by
(gpa float) row format delimited fields terminated by ',';
insert overwrite table static_student partition (gpa=6.5) select rollno, name
from student where gpa=6.5;
To add one more static partition based on the "gpa" column using the alter statement:
alter table static_student add partition (gpa=8.0);
Under the table's warehouse directory (e.g., /user/hive/warehouse/students.db/static_student),
each partition appears as a sub-directory:
gpa=6.5 (dir)
gpa=8.0 (dir)
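A query that filters on the partition column only scans the matching sub-directory; a minimal sketch:
SELECT rollno, name FROM static_student WHERE gpa = 6.5; -- reads only the gpa=6.5 directory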
Dynamic partition:
To use dynamic partitioning in Hive, we need to set the parameters below in the Hive
shell or in the hive-site.xml file:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
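The creation of the dynamic_part_student table is not shown on the slide; a hedged sketch, assuming the same student source table as in the static example:
create table if not exists dynamic_part_student(rollno int, name string) partitioned by
(gpa float) row format delimited fields terminated by ',';
-- the partition column gpa is not given a fixed value; Hive derives it from the last column of the SELECT
insert overwrite table dynamic_part_student partition (gpa)
select rollno, name, gpa from student;
select * from dynamic_part_student;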
Suppose the file tab1/clientdata/file1 contains a client data table:
id, name, dept, yoj
1, sunny, SC, 2009
2, animesh, HR, 2009
3, sumeer, SC, 2010
4, sarthak, TP, 2010
Now, let us partition the above data into two files using the year.
Creating a partitioned table:
CREATE TABLE table_tab1 (id INT, name STRING, dept STRING, yoj INT)
PARTITIONED BY (year STRING);
Loading the per-year files into their partitions:
LOAD DATA LOCAL INPATH 'tab1/clientdata/2009/file2' OVERWRITE INTO TABLE
table_tab1 PARTITION (year='2009');
LOAD DATA LOCAL INPATH 'tab1/clientdata/2010/file3' OVERWRITE INTO TABLE
table_tab1 PARTITION (year='2010');
Contents of tab1/clientdata/2009/file2:
1, sunny, SC, 2009
2, animesh, HR, 2009
Contents of tab1/clientdata/2010/file3:
3, sumeer, SC, 2010
4, sarthak, TP, 2010
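The resulting partitions can then be listed; a hedged sketch of the expected outcome:
SHOW PARTITIONS table_tab1;
-- year=2009
-- year=2010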
Bucketing Features in Hive
Hive partitioning divides a table into a number of partitions, and these
partitions can be further subdivided into more manageable parts
known as buckets or clusters.
The bucketing concept is based on a hash function, which depends
on the type of the bucketing column. Records with the same value of the
bucketing column will always be saved in the same bucket.
Here, the CLUSTERED BY clause is used to divide the table into
buckets.
In Hive partitioning, each partition is created as a directory. In
Hive bucketing, each bucket is created as a file.
Bucketing can also be done even without partitioning Hive tables.
Buckets
To use bucketing in Hive, we need to set the parameter below in the Hive shell or in the
hive-site.xml file:
set hive.enforce.bucketing=true;
• To create a bucketed table having 3 buckets:
CREATE TABLE IF NOT EXISTS STUDENT_BUCKET (rollno INT, name STRING, gpa FLOAT)
CLUSTERED BY (gpa) INTO 3 BUCKETS;
• Load data into the bucketed table:
FROM STUDENT INSERT OVERWRITE TABLE STUDENT_BUCKET SELECT rollno, name, gpa;
• To display the content of the first bucket:
SELECT DISTINCT gpa FROM STUDENT_BUCKET TABLESAMPLE(BUCKET 1 OUT OF 3 ON
gpa);
When you create a table and bucket it using the CLUSTERED BY clause into 32
buckets (as an example), Hive buckets your data into 32 buckets using
deterministic hash functions. Then, when you use TABLESAMPLE(BUCKET x
OUT OF y), Hive divides your buckets into groups of y buckets and then picks the
x'th bucket of each group. For example:
If you use TABLESAMPLE(BUCKET 6 OUT OF 8), Hive divides your 32
buckets into groups of 8, resulting in 4 groups of 8 buckets, and then
picks the 6th bucket of each group, hence picking buckets 6, 14, 22, and 30.
If you use TABLESAMPLE(BUCKET 23 OUT OF 32), Hive divides your 32
buckets into groups of 32, resulting in only 1 group of 32 buckets, and then picks
the 23rd bucket as your result.
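In query form, a hedged sketch; the table name is hypothetical and assumed to be clustered into 32 buckets on column id:
SELECT * FROM big_table TABLESAMPLE(BUCKET 6 OUT OF 8 ON id); -- samples buckets 6, 14, 22, 30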
Partitioning vs. Bucketing
Basically, both partitioning and bucketing slice the data so that queries execute much more
efficiently than on the non-sliced data.
The major difference is that the number of slices keeps changing in the case of partitioning as
data is modified, whereas with bucketing the number of slices is fixed, specified while creating
the table.
Bucketing happens by applying a hash function and then a modulo on the number of buckets, so a row
may be inserted into any of the buckets. Bucketing can be used for sampling data, as well as for
joining two data sets much more effectively, and more.
When we do partitioning, we create a partition for each unique value of the column, but there may be
situations where we end up creating lots of tiny partitions.
With bucketing, you can limit the data to a number of buckets which you choose, and decompose your data
into those buckets. In Hive, a partition is a directory but a bucket is a file.
Now, if we want to perform partitioning on the basis of the department column, then the
information of all the employees belonging to a particular department will be stored
together in that very partition. Physically, a partition in Hive is nothing but a sub-
directory in the table directory.
Thus we will have three partitions in total, one for each of the departments, as shown
in the diagram. For each department we will have all the data regarding that
department residing in a separate sub-directory under the table directory.
So, for example, all the employee data for the Technical department will be
stored in user/hive/warehouse/employee_details/dept=Technical.
Hence, from the diagram, we can see how each partition is bucketed
into 2 buckets. Each partition, say Technical, will therefore have two files,
each of them storing the Technical employees' data; a sketch of a matching DDL follows.
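As a minimal sketch of that layout; the column names are assumptions, not from the slide:
CREATE TABLE employee_details (emp_id INT, name STRING, salary FLOAT)
PARTITIONED BY (dept STRING)           -- one sub-directory per department
CLUSTERED BY (emp_id) INTO 2 BUCKETS;  -- two files inside each partition directory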
Aggregations
Hive supports aggregation functions like avg, count, etc.
To apply the average and count aggregation functions:
SELECT AVG(gpa) FROM STUDENT;
SELECT count(*) FROM STUDENT;
Group By and Having
Data in a column or columns can be grouped on the basis of the values contained therein
by using GROUP BY.
The HAVING clause is used to filter out groups not meeting the specified condition;
it lets you specify conditions that determine which group results appear in the output.
select rollno, name, gpa from student group by rollno, name,
gpa having gpa > 4;
select gpa from student group by gpa having count(gpa) >= 3;
VIEWS: Views in SQL are a kind of virtual table. A view also has rows and columns as
they are in a real table in the database. We can create a view by selecting fields
from one or more tables present in the database.
A view can either have all the rows of a table or specific rows based on a certain condition.
CREATE VIEW DetailsView AS SELECT NAME, ADDRESS FROM StudentDetails
WHERE S_ID < 5;
CREATE VIEW MarksView AS SELECT StudentDetails.NAME, StudentDetails.ADDRESS,
StudentMarks.MARKS FROM StudentDetails, StudentMarks WHERE StudentDetails.NAME =
StudentMarks.NAME;
Querying the View
Select * from marksview limit 4;
Drop the View
Drop View marksview;
Join
Join: similar to the SQL JOIN.
To create a join between the Student and Department tables, we use rollno from both as the join key.
For example:
CREATE TABLE a (k1 string, v1 string);
CREATE TABLE b (k2 string, v2 string);
SELECT k1, v1, k2, v2 FROM a JOIN b ON k1 = k2;
i) Create two tab-separated files [Link] and [Link] with common rollnos.
ii) Create the two tables and load data into them:
hive> create table if not exists Stud(rollno INT, name STRING, gpa FLOAT) row format delimited
fields terminated by '\t';
hive> create table if not exists Dept(rollno INT, name STRING, gpa FLOAT) row format delimited
fields terminated by '\t';
hive> load data local inpath '/home/sangeetha/hivedemo/[Link]' overwrite into table stud;
hive> load data local inpath '/home/sangeetha/hivedemo/[Link]' overwrite into table dept;
iii) Join them on rollno:
SELECT a.rollno, a.name, b.name, b.gpa FROM STUD a JOIN DEPT b ON a.rollno = b.rollno;
SerDe
•SerDe stands for Serializer/Deserializer.
The serializer takes the Java objects Hive has been working with and
translates them into something that Hive can write to HDFS.
The deserializer takes the binary representation of a record and converts it
into the Java objects Hive works with.
•A SerDe contains the logic to convert unstructured data into records.
•Implemented using Java.
•Serializers are used at the time of writing.
•Deserializers are used at query time (SELECT statements).
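A SerDe is attached to a table with the ROW FORMAT SERDE clause; a hedged sketch using the CSV SerDe shipped with Hive (the table name is made up):
-- OpenCSVSerde parses CSV lines into records; it treats all columns as strings
CREATE TABLE csv_students (rollno STRING, name STRING, gpa STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
STORED AS TEXTFILE;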
Sub Query
In Hive, sub queries are supported only in the FROM
clause (Hive 0.12).
A name must be specified for the sub query, because every table
in a FROM clause has a name.
The columns in the sub query select list should have
unique names.
The columns in the sub query select list are available to
the outer query just like the columns of a table.
In Hive 0.13, sub queries are supported in the WHERE
clause as well.
Sub query to count occurrences of similar words in
a file:
CREATE TABLE docs (line STRING);
LOAD DATA LOCAL INPATH '/root/hivedemos/[Link]' OVERWRITE INTO TABLE docs;
CREATE TABLE word_count AS
SELECT word, count(1) AS count FROM
(SELECT explode(split(line, ' ')) AS word FROM docs) w
GROUP BY word
ORDER BY word;
The explode() function takes an array as input and outputs the elements of the array as
separate rows.
SELECT * FROM word_count;
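The same word count can be written without a named sub query using LATERAL VIEW, the usual idiom for explode(); a hedged, equivalent sketch:
SELECT word, count(1) AS count
FROM docs LATERAL VIEW explode(split(line, ' ')) t AS word
GROUP BY word
ORDER BY word;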