End Exam Only Answers

Uploaded by

subhantls2000

Here are the answers:

1. When irrelevant attributes have been removed from the data
2. Regression
3. B and C (Predicting the number of pages in a document, Predicting the profit of
a company)
4. Alphabet Nest
5. Comfy
6. Previous Experiences
7. 1980s
8. False Positive and True Negative
9. Recall and Precision
10. Store block location
11. YARN
12.
- A1 (Map Phase) → B2 (Parses input into records as key-value pairs)
- A2 (Partition Phase) → B4 (Each mapper must determine which reducer will
receive each of the outputs)
- A3 (Shuffle Phase) → B1 (Fetches input data from all map tasks for the portion
corresponding to the reduce task’s bucket)
- A4 (Sort Phase) → B5 (Sorts all map outputs into a single run)
- A5 (Reduce Phase) → B3 (Writes output to a file in HDFS)
13. 3, 1, 4, 5, 2
14. 2 and 3 (Operations are performed by multiple processors, Handles small-scale
data)
15. ;
16. It is commonly used to analyze social media coverage.
17. Log in to cloud lab Web console.
18. They do not query actual data.
19. /user/hive/warehouse
20. Three
21. Value
22. 1x10^21
23. Virality
24. Vulnerability
25. A situation where one or more clients are unable to access a service.
26. MLlib
27. Queue Elasticity
28. Scheduler
29. Hadoop
30. FIFO scheduler
31. Dominant Resource Fairness
32. yarn-site.xml
33. Top-down
34. Density
35. Whenever Beer is bought, diaper is also bought
36. Binary Classification
37. Maximize the margin
38. Multi-collinearity
39. **128 MB**
40. **Block Replication**
41. **Web GUI**
42. **Gets a directory listing of user's home directory in HDFS**
43. **!**
44. **NULL**
45.
- A1 (Catalog) → B2 (Relays metadata changes to all the Impala daemons in a
cluster.)
- A2 (State Store) → B3 (Provides lookup service for Impala daemons.)
- A3 (Impala Daemon) → B1 (A daemon process that runs on each node of the
cluster.)
46.
- A1 (Text) → B2 (It is delimited by a comma or a tab.)
- A2 (Sequence) → B4 (It is not human readable.)
- A3 (Avro Data) → B1 (It is widely supported inside and outside the Hadoop
ecosystem.)
- A4 (Parquet) → B3 (It uses advanced optimizations described in Google’s
Dremel paper.)
47. Boolean
48. Diagnostics → logs → view
49. .in
50. Full log
51. Refresh stale services
52. dfs.datanode.http.address
53.
- A1 (Host) → B4 (A machine (typically physical) running the CM agent.)
- A2 (Rack) → B1 (Machines in the same rack, typically served by the same
switch.)
- A3 (Service) → B2 (A system, which may be distributed, running on a
cluster.)
- A4 (Config) → B3 (A key-value pair associated with a scope.)
54.
- A1 (Service) → B3 (A category of managed functionality in Cloudera
Manager.)
- A2 (Service Instance) → B5 (An instance of a service running on a
cluster that spans many role instances.)
- A3 (Roles) → B2 (Daemons or processes that take care of a service.)
- A4 (Role Instance) → B1 (An instance of a role running on a host.)
- A5 (Role Group) → B4 (A set of configuration properties for a set of
role instances.)
55. Flume
56. Computation frameworks
57. Presto
58. QJM
59. Rack awareness
60. 3, 5
61. Select()
62. Data Visualization
63. Controls the number of bins
64. When we want to plot between 1 numerical and 1 categorical variable
65. Error
66. Character
67. Convolutional Neural Networks
68. Quality
69.
- A1 (Machine Learning - Product Analytics) → B3 (Movie Recommendations)
- A2 (ML Applications – Accounting) → B1 (Pay-roll management)
- A3 (Sales performance of various entities) → B2 (Statistical Analysis)
- A4 (Major classes of machine learning process) → B5 (Training and
testing)
- A5 (Training data patterns are used to classify test data) → B4 (Learned
Model)
70. Four
71. Raw Data
72. Domain-Specific
73. 2017
74. 1 hour
75. Hadoop
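
As an illustration of the five phases matched in answer 12, here is a minimal single-process sketch in Python. It assumes a word-count job, which the answers do not specify. The names `map_phase`, `partition`, and `run_job` are hypothetical, and the result is returned as a dictionary instead of being written to a file in HDFS.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: parse each input line into (word, 1) key-value pairs."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def partition(key, num_reducers):
    """Partition: decide which reducer receives a given key.

    Toy deterministic partitioner (an assumption for this sketch);
    Python's built-in hash() is salted per process for strings.
    """
    return sum(key.encode("utf-8")) % num_reducers


def run_job(lines, num_reducers=2):
    # Map + partition: each map output lands in one reducer's bucket.
    buckets = defaultdict(list)
    for key, value in map_phase(lines):
        buckets[partition(key, num_reducers)].append((key, value))

    results = {}
    for reducer_id in range(num_reducers):
        # Shuffle: the reducer fetches the map outputs for its bucket.
        fetched = buckets[reducer_id]
        # Sort: order the fetched pairs by key so equal keys are adjacent.
        fetched.sort(key=lambda kv: kv[0])
        # Reduce: sum the values per key (a real job would write to HDFS).
        for key, value in fetched:
            results[key] = results.get(key, 0) + value
    return results


if __name__ == "__main__":
    print(run_job(["big data big cluster", "data node"]))
```

In a real Hadoop job the shuffle moves data across the network between map and reduce tasks; here it is reduced to a dictionary lookup so the order of the phases stays visible.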
