Data Modeling and Query Language Basics

Data modeling tells you how data is structured, what operations can be done on the data, and what constraints apply to the data. A query language is declarative, while a database programming language is procedural. SQL queries using LIKE and wildcards can find records where a field starts with a particular string.

Uploaded by

Mo Farhan
Copyright
© All Rights Reserved

1.
Data modeling tells you

i. How your data is structured


ii. What operations can be done on the data
iii. What constraints apply to the data
iv. Where the data is stored
Single choice.

(0.5 Points)

i, ii, iii

ii, iii, iv

i, ii, iv

i, iii, iv
2.
State True/False:
i. A query language is declarative
ii. A database programming language is a procedural programming language
Single choice.
(0.5 Points)

i-true, ii-true

i-true, ii-false

i-false, ii-true

i-false, ii-false
3.
SQL query which prints the records for students whose name starts with ‘De’ is
Single choice.
(0.5 Points)

Select * from students where name = 'De'

Select * from students where name like 'De'

Select * from students where name = '%De'

Select * from students where name like '%De' (best option listed; strictly, '%De' matches names ending with 'De', and a prefix match would use 'De%')
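For question 3, the effect of the wildcard position can be checked with a small sketch. It uses Python's built-in sqlite3 module and a made-up students table (the names are illustrative and not from the quiz). Note that SQLite's LIKE is case-insensitive for ASCII letters by default, which other databases may not share.

```python
import sqlite3

# Hypothetical in-memory table; the names are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")
conn.executemany(
    "INSERT INTO students (name) VALUES (?)",
    [("Deepak",), ("Dev",), ("Jade",), ("De",)],
)


def names_like(pattern):
    """Return the sorted names that match a LIKE pattern."""
    rows = conn.execute(
        "SELECT name FROM students WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [row[0] for row in rows]


starts_with = names_like("De%")  # prefix match: wildcard after the prefix
ends_with = names_like("%De")    # suffix match: wildcard before the suffix
exact = names_like("De")         # no wildcard: only the two-letter name matches

print(starts_with, ends_with, exact)
```

In this sketch only the 'De%' pattern returns the names that start with 'De'; '%De' also returns 'Jade' because SQLite ignores ASCII case in LIKE.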


4.
What is a subquery?
Single choice.
(0.5 Points)

A query statement within another query

An alternate query that acts as a substitute for a given query

A query that requires two tables in order to calculate the values

A shorter query than normal


5.
In MongoDB, the ____ operator matches any of the values specified in an array.
Single choice.
(0.5 Points)

$ne

$nin

$in
$eq
6.
What are the three layers for the Hadoop Ecosystem?
i. Data Management and Storage
ii. Data Manipulation and Integration
iii. Coordination and Workflow Management
iv. Data Integration and Processing
v. Data Creation and Storage
Single choice.
(0.5 Points)

ii, iii, iv

i, iii, iv

i, ii, v

ii, iv, v
7.
Which of the following statements is FALSE with respect to the big data
processing engines supported by the Apache foundation?
Single choice.
(0.5 Points)

The Beam system is a relatively new system for batch and stream processing with a data
flow programming model

Flink has its own execution engine called Nephele

Spark defined input stream interface abstractions called spouts, and computation
abstractions called bolts (Storm has defined…)

Spark was built using an in-memory structure called Resilient Distributed Datasets
8.
____________ software collects and indexes machine data at a very large
scale, regardless of where it is generated.
Single choice.
(0.5 Points)

TurboTax

Splunk (not sure)

OpenXC

None of these
9.
What is the equivalent MongoDB query for the given SQL query- “select * from
ABC”?
Single choice.
(0.5 Points)

db.ABC.find( )

select.ABC.find( )

db.select.ABC( )

ABC.db.select( )
10.
db.collection.find(<query filter>, <projection>).<cursor modifier>
Which part of the above statement is equivalent to WHERE clause in SQL?
Single choice.
(0.5 Points)

<query filter>
<projection>

<collection>

<cursor modifier>
11.
Which of the following is TRUE w.r.t Query Language?
i. Specifies the data items we need.
ii. Database programming language
iii. It is declarative
Single choice.
(0.5 Points)

i, ii only

ii, iii only

i, iii only

i, ii and iii
12.
State True(T) or False(F).
i. MongoDB is a collection of documents.
ii. MongoDB does not have adequate support to perform recursive queries
over nested substructures.
Single choice.
(0.5 Points)

i- T, ii- T

i-T, ii-F

i-F, ii-T
i-F, ii-F
13.
The head(5) command in Pandas data frame is used to
Single choice.
(0.5 Points)

View first five rows

View last five rows

View first five columns

View last five attributes


14.
Considering the following schema, what does the given query return?
Schema: Items (name, manf)
        Likes (user, item)
Query: SELECT *
       FROM Items
       WHERE name NOT IN
           (SELECT item
            FROM Likes
            WHERE user = 'Joe');
Single choice.
(0.5 Points)

Selects the name and manufacturer of each item that Joe doesn’t like

Selects the name of each item liked by Joe

Selects the name and manufacturer of each item liked by everyone except Joe

Selects the manufacturer of each item liked by Joe
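The NOT IN subquery in question 14 can be tried on a tiny data set. The sketch below uses Python's built-in sqlite3 module; the table contents (items, manufacturers, and the second user) are invented for illustration and are not part of the quiz.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Schema from the question; the rows are hypothetical sample data.
    CREATE TABLE Items (name TEXT, manf TEXT);
    CREATE TABLE Likes (user TEXT, item TEXT);
    INSERT INTO Items VALUES ('Cola', 'Acme'), ('Tea', 'Bolt'), ('Juice', 'Acme');
    INSERT INTO Likes VALUES ('Joe', 'Cola'), ('Ann', 'Tea'), ('Joe', 'Juice');
""")

# The inner query lists the items Joe likes; the outer query keeps every
# Items row (name and manufacturer) whose name is not in that list.
not_liked_by_joe = conn.execute("""
    SELECT *
    FROM Items
    WHERE name NOT IN (SELECT item FROM Likes WHERE user = 'Joe')
""").fetchall()

print(not_liked_by_joe)
```

With this sample data, Joe likes Cola and Juice, so the only row returned is Tea with its manufacturer.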


15.
Which of the following are distinct layers of Hadoop?
I. Data management and storage
II. Query Management
III. Data processing
IV. All the above
Single choice.
(0.5 Points)

IV

I & III

I & II

I
16.
In Hadoop, different varieties of data get retrieved, integrated, and analyzed in
the:
Single choice.
(0.5 Points)

Data management and storage Layer

Query Management Layer

Data processing Layer

All of the above


17.
The goal of data fusion is to:
I. find the values of Data Items from multiple sources
II. derive information that has greater benefit than what would have been
derived from each of the contributing parts
III. combine all data in a source
IV. find the true worth of a data set
Single choice.
(0.5 Points)

I & II (not sure)

II

III

IV
18.
MongoDB query to find a document whose 2nd element in tags is "summer"
Single choice.
(0.5 Points)

db.inventory.find(tags.1:"summer")

db.inventory.find(tags.2:"summer")

db.inventory(tags.1:"summer")

db.inventory(tags.2:"summer")
19.
The job of a data integration system is to:
Single choice.
(0.5 Points)

Accumulate all data in one system

Transform the data from the source schema to the schema of the receiving system
Record Customer Interactions

Customer Analytics
20.
Data compression refers to a way of:
I. Compressing the data file
II. Creating an encoded representation of data.
III. Retaining only relevant data
IV. Creating a form smaller than the original representation.
Single choice.
(0.5 Points)

I & II

I & III

III & IV

II & IV
21.
In Hadoop, the YARN Engine is used for:
Single choice.
(0.5 Points)

Batch and Stream Processing

Data Processing

Resource negotiation and scheduling

All of the above


22.
For applications like online gaming and hazards management it is very important
to have a:
I. High latency system
II. Low latency system
III. Batch processing ability
IV. Highly scalable execution
Single choice.
(0.5 Points)

I & II

I & III

III & IV

II & IV
23.
The integration and processing layer includes which of the following tools for
bringing a query interface on top of the storage layer?
I. Spark SQL
II. Vertica
III. Hive
IV. Solr
Single choice.
(0.5 Points)

I & II

II & IV

I & III

III & IV
24.
What does the following line of code do in Postgres?
SELECT count(userid) FROM (SELECT buyclicks.userId, teamLevel, price
FROM buyclicks JOIN gameclicks on buyclicks.userId = gameclicks.userId)
temp WHERE price=3 and teamLevel=5;
Single choice.
(0.5 Points)

Counts the users who exist in both the gameclicks and buyclicks files

This is an invalid line of code, the subquery is not formatted properly

Finds the total number of user ids (repeats allowed) in buy-clicks that have bought items
with prices worth $3 and were in a team with level 5 at some point in time

Displays the users who have bought items worth $3 and have had a team with level 5
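The "repeats allowed" behavior of the query in question 24 can be seen in a small sketch. It runs on Python's built-in sqlite3 module rather than Postgres. The column layout (price in buyclicks, teamLevel in gameclicks) and all rows are assumptions made for illustration, and the subquery alias is renamed from temp to joined to keep the sketch portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout: the quiz does not say which table holds which column.
    CREATE TABLE buyclicks (userId INTEGER, price INTEGER);
    CREATE TABLE gameclicks (userId INTEGER, teamLevel INTEGER);
    INSERT INTO buyclicks VALUES (1, 3), (1, 3), (2, 3), (3, 10);
    INSERT INTO gameclicks VALUES (1, 5), (1, 4), (2, 2), (3, 5);
""")

# Inner query from the question: one row per matching buy/game click pair.
inner = """
    SELECT buyclicks.userId, teamLevel, price
    FROM buyclicks JOIN gameclicks ON buyclicks.userId = gameclicks.userId
"""

# count(userId) counts joined rows, so a user can be counted more than once.
total = conn.execute(
    f"SELECT count(userId) FROM ({inner}) AS joined "
    "WHERE price = 3 AND teamLevel = 5"
).fetchone()[0]

# count(DISTINCT userId) counts each qualifying user once, for comparison.
distinct_users = conn.execute(
    f"SELECT count(DISTINCT userId) FROM ({inner}) AS joined "
    "WHERE price = 3 AND teamLevel = 5"
).fetchone()[0]

print(total, distinct_users)
```

Here user 1 made two $3 purchases and was once at team level 5, so the plain count is 2 while the distinct count is 1.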
25.
________ is the primary form of data in Information Retrieval systems
Single choice.
(0.5 Points)

Image

Text

XML data

HTML
26.
What is the main problem with big data information integration?
Single choice.
(0.5 Points)

Many sources
Mediated Schema

Pay-as-you-go model

Probabilistic schema mapping


27.
With SQL, which of the following queries returns the number of records in the
"Product" table?
Single choice.
(0.5 Points)

Select * from Product

Select count(*) from Product

Select count from Product

Select distinct (count) from Product
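For question 27, a minimal sketch with Python's built-in sqlite3 module shows the difference between fetching the rows and counting them. The Product columns and rows here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns and rows; the quiz only names the table.
conn.execute("CREATE TABLE Product (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO Product VALUES (?, ?)",
    [(1, "pen"), (2, "ink"), (3, "pad")],
)

# SELECT * returns the records themselves.
all_rows = conn.execute("SELECT * FROM Product").fetchall()

# count(*) is an aggregate: it returns a single row holding the row count.
row_count = conn.execute("SELECT count(*) FROM Product").fetchone()[0]

print(len(all_rows), row_count)
```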


28.
Which of the following statements using MongoDB will result in counting the
number of unique jobs of Customers?
Single choice.
(0.5 Points)

db.Customers.count (jobs:{$in: false})

db.Customers(jobs: {$exists: true}).count

db.Customers.count (jobs: {$exists: true})

db.count.Customers exists (jobs)


29.
Any big data integration system should:
I. Not integrate all sources of data
II. Have addressed the record linkage problem
III. Integrate as per the application/business demand
IV. All the above.
Single choice.
(0.5 Points)

IV

III

II & III

I & III
30.
State True(T) or False(F) w.r.t Big Data Management Systems (BDMS).
i. Mainly designed for parallel and distributed processing.
ii. Always guarantees consistency for every update.
Single choice.
(0.5 Points)

i- T, ii- T

i- T, ii- F

i- F, ii- T

i- F, ii- F
31.
Which is the command used to see the databases in MongoDB?
Single choice.
(0.5 Points)
select dbs

show dbs

create dbs

use dbs
32.
Which of the following is an aggregate function?
Single choice.
(0.5 Points)

Select

Project

Count

Join
33.
In Aerospike, which of these dictates namespace behavior, such as how data
is stored, whether replicas exist, and the expiry time for a record?
Single choice.
(0.5 Points)

Indexes

Policies

Bins

None of these
34.
Which of these MongoDB commands is used to look for values where a
particular field is greater than 10?
Single choice.
(0.5 Points)

{$gt=10}

{$gt :=10}

{$gt :10}

{$gt ,10}
35.
The ____ operator in MongoDB matches none of the values specified in an array.
Single choice.
(0.5 Points)

$nin

$ne

$in

$not
36.
What does it mean to have _id:0 within our query statement?
Single choice.
(0.5 Points)

Tell MongoDB not to return a document id.

Grab the first object in the results


Has no effect; it is simply a convention kept for compatibility.

Grab as many objects as possible


37.
What is a data item?
Single choice.
(0.5 Points)

Data found in a mediated schema.

Data found in a customer transaction.

The real worth of a data value.

Data that represents an aspect of a real world entity.


38.
________________ statement in SQL ensures that the result will not have any
duplicates.
Single choice.
(0.5 Points)

SELECT UNIQUE

SELECT DISTINCT

SELECT *

SELECT ALL
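The effect of DISTINCT in question 38 can be sketched with Python's built-in sqlite3 module. The customers table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a repeated city value.
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ann", "Pune"), ("Raj", "Delhi"), ("Lee", "Pune")],
)

# A plain SELECT keeps one output row per input row, duplicates included.
with_duplicates = [
    row[0] for row in conn.execute("SELECT city FROM customers ORDER BY city")
]

# SELECT DISTINCT collapses identical result rows into one.
without_duplicates = [
    row[0]
    for row in conn.execute("SELECT DISTINCT city FROM customers ORDER BY city")
]

print(with_duplicates, without_duplicates)
```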
39.
The “Find friend of a friend” feature on social networks makes use of _________
data models.
Single choice.
(0.5 Points)
Relational

Semi Structured

Graph

Text
40.
The decision tree algorithm is one technique for ___________.
Single choice.
(0.5 Points)

Connectivity Analysis

Classification

Clustering

Path Analysis
