Department of Artificial Intelligence and Data Science
Academic Year: 2024-2025 (Even Semester)
Course Code & Title: CCS368 & Stream Processing
Date: 18.02.2025
Prerequisite Questions with Answers
1. Which SQL function is used to count the number of rows in a SQL query?
a) COUNT()
b) NUMBER()
c) SUM()
d) COUNT(*)
Ans: d) COUNT(*)
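The difference between COUNT(*) and COUNT(column) can be checked with a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical Persons table with made-up rows; other database engines should behave the same way for this query, but that is not verified here.

```python
import sqlite3

# Hypothetical in-memory table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (FirstName TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?)",
    [("anna", "Chennai"), ("ravi", None), ("meena", "Salem")],
)

# COUNT(*) counts every row, including rows that contain NULLs.
row_count = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]

# COUNT(City) counts only the rows where City is not NULL.
city_count = conn.execute("SELECT COUNT(City) FROM Persons").fetchone()[0]

print(row_count, city_count)
```

Here COUNT(*) returns the number of rows, while COUNT(City) skips the row whose City is NULL.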
2. Which of the following SQL clauses is used to DELETE tuples from a database table?
a) DELETE
b) REMOVE
c) DROP
d) CLEAR
Ans: a) DELETE
3. With SQL, how do you select all the records from a table named “Persons” where the
value of the column “FirstName” ends with an “a”?
a) SELECT * FROM Persons WHERE FirstName='a'
b) SELECT * FROM Persons WHERE FirstName LIKE 'a%'
c) SELECT * FROM Persons WHERE FirstName LIKE '%a'
d) SELECT * FROM Persons WHERE FirstName='%a%'
Ans: c) SELECT * FROM Persons WHERE FirstName LIKE '%a'
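The four options can be compared with a minimal sketch. It assumes Python's built-in sqlite3 module and a hypothetical Persons table; the sample names are lowercase so that the case handling of LIKE, which differs between database engines, does not affect the result.

```python
import sqlite3

# Hypothetical in-memory table with illustrative lowercase names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (FirstName TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?)",
    [("anna",), ("arun",), ("meena",), ("ravi",)],
)

def names(where_clause):
    """Return the sorted FirstName values matching a WHERE clause."""
    sql = "SELECT FirstName FROM Persons WHERE " + where_clause
    return sorted(row[0] for row in conn.execute(sql))

equals_a = names("FirstName = 'a'")          # option a: exact match only
starts_with_a = names("FirstName LIKE 'a%'") # option b: begins with a
ends_with_a = names("FirstName LIKE '%a'")   # option c: ends with a
literal_pct = names("FirstName = '%a%'")     # option d: '=' treats % literally

print(equals_a, starts_with_a, ends_with_a, literal_pct)
```

Only the LIKE '%a' form selects the names that end with "a". With the = operator, the % character is compared as an ordinary character rather than as a wildcard.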
4. The UPDATE SQL clause can _____________
a) update only one row at a time
b) update more than one row at a time
c) delete more than one row at a time
d) delete only one row at a time
Ans: b) update more than one row at a time
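A single UPDATE statement changes every row that matches its WHERE clause. The sketch below assumes Python's built-in sqlite3 module and a hypothetical Persons table with made-up city values.

```python
import sqlite3

# Hypothetical in-memory table; two of the three rows share a city.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (FirstName TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?)",
    [("anna", "Salem"), ("arun", "Salem"), ("meena", "Chennai")],
)

# One UPDATE statement modifies all rows that satisfy the WHERE clause.
cursor = conn.execute("UPDATE Persons SET City = 'Erode' WHERE City = 'Salem'")
rows_changed = cursor.rowcount

# UPDATE never removes rows, so the table size is unchanged.
rows_remaining = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]

print(rows_changed, rows_remaining)
```

Two rows are modified by the one statement, and no rows are deleted.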
5. Which of the following is an aggregate function?
a) Average
b) Sum
c) With
d) Minimum
Ans: b) Sum
6. How many join types are there in a join condition?
a) 2
b) 3
c) 4
d) 5
Ans: d) 5
7. Which type of join has a join condition that contains an equality operator?
a) Equijoins
b) Cartesian
c) Both Equijoins and Cartesian
d) None of the mentioned
Ans: a) Equijoins
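As a rough illustration of the difference, the sketch below runs a Cartesian product and an equijoin on the same two tables. It assumes Python's built-in sqlite3 module; the Students and Marks tables and their rows are hypothetical.

```python
import sqlite3

# Hypothetical tables: 2 students and 3 marks rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE Marks (student_id INTEGER, score INTEGER)")
conn.executemany("INSERT INTO Students VALUES (?, ?)", [(1, "anna"), (2, "arun")])
conn.executemany("INSERT INTO Marks VALUES (?, ?)", [(1, 90), (1, 80), (3, 70)])

# Cartesian product: every student paired with every marks row, no condition.
cartesian_rows = conn.execute(
    "SELECT s.name, m.score FROM Students s CROSS JOIN Marks m"
).fetchall()

# Equijoin: the join condition uses the equality operator (=).
equijoin_rows = conn.execute(
    "SELECT s.name, m.score FROM Students s "
    "JOIN Marks m ON s.id = m.student_id ORDER BY m.score"
).fetchall()

print(len(cartesian_rows), equijoin_rows)
```

The Cartesian product returns 2 x 3 = 6 rows, while the equijoin keeps only the pairs whose key values are equal.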
8. In which join are records from the right table that have no matching key in the
left table included in the result set?
a) Left outer join
b) Right outer join
c) Full outer join
d) Half outer join
Ans: b) Right outer join
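The behaviour of a right outer join can be sketched as follows. The example assumes Python's built-in sqlite3 module and hypothetical Customers (left) and Orders (right) tables. Because older SQLite versions do not accept the RIGHT JOIN keyword, the sketch uses the equivalent form with the tables swapped: Customers RIGHT OUTER JOIN Orders returns the same rows as Orders LEFT OUTER JOIN Customers.

```python
import sqlite3

# Hypothetical tables: Customers is the left table, Orders is the right table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE Orders (order_id INTEGER, customer_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?)", [(1, "anna"), (2, "arun")])
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(101, 1, "pen"), (102, 3, "book")],  # order 102 has no matching customer
)

# Equivalent of: Customers c RIGHT OUTER JOIN Orders o ON c.id = o.customer_id
# written as a LEFT OUTER JOIN with the table order swapped (an assumption made
# here for portability across SQLite versions).
result = conn.execute(
    "SELECT c.name, o.item FROM Orders o "
    "LEFT OUTER JOIN Customers c ON c.id = o.customer_id "
    "ORDER BY o.order_id"
).fetchall()

print(result)
```

Every row of the right table (Orders) appears in the result. The order with no matching customer is kept with NULL (None in Python) in the customer column, and the customer with no order is dropped.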
9. Which of the following is a benefit of using NoSQL databases?
a) Strict data modelling
b) Limited scalability
c) Easy schema evolution
d) Limited data storage capacity
Ans: c) Easy schema evolution
10. Which NoSQL database is known for its ability to handle complex queries and data
analysis?
a) HBase
b) Cassandra
c) MongoDB
d) Couchbase
Ans: c) MongoDB
11. Identify the term used to define the multidimensional model of the data warehouse.
a) Table
b) Data cube
c) Tree
d) Data structure
Ans: b) Data cube
12. In which language is Hadoop written?
a) C++
b) Java
c) Rust
d) Python
Ans: b) Java
On which of the following platforms does Hadoop run?
a) Debian
b) Cross-platform
c) Bare metal
d) Unix-like
Ans: b) Cross-platform
13. The output of map tasks is written in ___________.
a) Local disk
b) File system
c) HDFS
d) Secondary storage
Ans: a) Local disk
14. The transaction data of a bank is a type of ___________.
a) Unstructured data
b) Structured data
c) Both a and b
d) None of the above
Ans: b) Structured data
15. Which of the following can be generally used to clean and prepare big data?
a) Pandas
b) Data lake
c) U-SQL
d) Data warehouse
Ans: d) Data warehouse
16. Which of the following are components of Hadoop?
a) HDFS
b) MapReduce
c) YARN
d) All of the above
Ans: d) All of the above
17. Among the following options which component deals with ingesting streaming data
into Hadoop?
a) Oozie
b) Hive
c) Kafka
d) Flume
Ans: d) Flume
18. Data whose size is measured in ____bytes is called big data.
a) Mega
b) Giga
c) Tera
d) Peta
Ans: d) Peta
19. Identify the option that is not a big data technology.
a) Apache PyTorch
b) Apache Kafka
c) Apache Hadoop
d) Apache Spark
Ans: a) Apache PyTorch
20. Kafka is used for
a) Decoupling data streams
b) Image processing
c) Video conferencing
d) Audio streaming
Ans: a) Decoupling data streams
21. What are the two types of message patterns in Kafka?
a) Point to point and broadcast
b) Point to point and publish-subscribe
c) Broadcast and multicast
d) Unicast and multicast
Ans: b) Point to point and publish-subscribe
22. The primary machine learning API for Spark is now the _____-based API.
a) DataFrame
b) Dataset
c) RDD
d) All of the above
Ans: a) DataFrame
23. Which of the following is a Spark module for structured data processing?
a) GraphX
b) MLlib
c) Spark SQL
d) SparkR
Ans: c) Spark SQL
24. Spark SQL plays the main role in the optimization of queries.
a) True
b) False
Ans: a) True
Faculty in-charge HOD