
Data Analytics Exam Solutions Guide

The document outlines key concepts in data analytics, including types of data (structured, unstructured, semi-structured), phases of the data analytics lifecycle, and various analytical techniques such as decision trees and K-means clustering. It also compares different data management systems (RDBMS, NoSQL, Hadoop) and discusses the importance of visualization tools and the differences between supervised and unsupervised learning. Additionally, it covers topics like Bayesian analysis, predictive vs. prescriptive analytics, and the data analysis process.

Uploaded by

luckyrounak2895

Data Analytics Exam Solutions

Q1. Types of Data

Data analytics works with three main types of data: structured, unstructured, and semi-structured.

- Structured Data: Stored in a fixed format, such as rows and columns. Examples: Excel sheets, relational databases.

- Unstructured Data: Free-form data with no predefined schema, such as text, images, and videos. Examples: social media posts, emails, audio recordings.

- Semi-structured Data: Partially organized data that carries its own tags or keys, such as XML and JSON files.

Conclusion: Combining these data types allows more advanced insights to be derived.
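
As a rough illustration of the three types, the sketch below loads a structured CSV table, a semi-structured JSON list, and an unstructured text string. The sample records and field names are made up for this example.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical data).
csv_text = "id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing keys, but fields may differ per record.
json_text = '[{"id": 1, "name": "Asha", "tags": ["new"]}, {"id": 2, "name": "Ravi"}]'
records = json.loads(json_text)

# Unstructured: free text with no schema; any structure has to be extracted.
post = "Loved the service in Pune, will order again!"
words = post.lower().replace(",", "").replace("!", "").split()

print(rows[0]["city"])        # column access works for every row
print(records[1].get("tags")) # this key is simply absent in the second record
print(words[:3])              # tokens pulled out of the raw text
```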

Q2. Phases of Data Analytics Lifecycle

1. Discovery: Define the problem and objectives.

2. Data Preparation: Clean and preprocess the data.

3. Model Planning: Select the algorithms and techniques.

4. Model Building: Train and test the models.

5. Results Communication: Visualize and share the insights.

6. Operationalize: Deploy the final model.

Conclusion: The lifecycle ensures a structured approach to effective data analysis.

Q3. Decision Trees: Working and Importance

A decision tree is a supervised machine learning algorithm used for classification and prediction.

- Working: It starts at the root node, where the data is split based on attribute values. Each internal node splits the data further, and the leaf nodes hold the final decisions or class labels.

- Importance: Decision trees are intuitive and explainable, which makes them useful for real-world decision-making.

Applications: Fraud detection, medical diagnosis, and loan approval.
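
The splitting step can be sketched as a search for the attribute and threshold that give the purest child nodes. The version below uses Gini impurity, which is one common criterion and an assumption here (the notes do not name one); the loan data and the helper names `gini` and `best_split` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = (None, None, gini(labels))  # start from the unsplit node
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue  # a split must send rows to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best[2]:
                best = (f, threshold, score)
    return best

# Hypothetical loan data: (income in thousands, existing loans) -> decision
rows = [(20, 2), (25, 1), (40, 0), (60, 1), (80, 0), (90, 2)]
labels = ["reject", "reject", "reject", "approve", "approve", "approve"]

feature, threshold, score = best_split(rows, labels)
print(feature, threshold, score)
```

A full tree would apply `best_split` recursively to each child until the nodes are pure or a depth limit is reached.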

Q4. Steps in Bayesian Data Analysis

Bayesian data analysis is a statistical approach that quantifies uncertainty:

1. Define Prior Beliefs: State the assumptions about the unknown quantities before seeing the data.

2. Likelihood Function: Calculate the probability of the observed data under each candidate parameter value.

3. Compute Posterior: Combine the prior and the likelihood using Bayes' theorem to obtain updated probabilities.

4. Validate Model: Evaluate the model, for example by checking its predictions against the data.

Conclusion: Bayesian methods suit dynamic problems with real-world uncertainty, because beliefs can be updated as new data arrives.
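
The four steps can be sketched on a small grid of candidate values. This is a minimal illustration, assuming a hypothetical conversion-rate problem with 7 successes in 10 trials and a uniform prior; none of these numbers come from the notes.

```python
# Candidate values for the unknown success rate (a hypothetical grid).
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]

# Step 1: prior beliefs (uniform: every candidate is equally plausible).
prior = [1 / len(thetas)] * len(thetas)

# Step 2: likelihood of the observed data under each candidate.
# The binomial coefficient is omitted because it cancels when normalising.
successes, trials = 7, 10
likelihood = [t ** successes * (1 - t) ** (trials - successes) for t in thetas]

# Step 3: posterior is proportional to prior * likelihood, normalised to sum to 1.
unnormalised = [p * l for p, l in zip(prior, likelihood)]
evidence = sum(unnormalised)
posterior = [u / evidence for u in unnormalised]

# Step 4: a basic sanity check: the posterior should peak near 7/10.
best_theta = thetas[posterior.index(max(posterior))]
print(best_theta, [round(p, 3) for p in posterior])
```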

Q5. K-Means Clustering

K-Means is an unsupervised learning algorithm that groups similar data points into k clusters.

- Working: It starts from randomly initialized centroids, assigns each point to its nearest centroid, and iteratively recomputes each centroid as the mean of its assigned points until the assignments stop changing.

- Example: Customer segmentation in e-commerce.

Applications: Market segmentation, anomaly detection, and image compression.
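
The assign-and-update loop can be sketched in plain Python as follows. The function name `kmeans`, the fixed seed, and the sample points are assumptions for illustration; a production workflow would more likely use a library implementation.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Basic K-Means on tuples of floats; returns (centroids, clusters)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, clusters

# Hypothetical customers described by (visits per month, average basket size).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))
```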

Q6. Comparison: RDBMS, NoSQL, and Hadoop Systems

RDBMS, NoSQL, and Hadoop systems are used in different scenarios:

- RDBMS: For structured data with a fixed schema. Example: MySQL.

- NoSQL: For flexible schemas and unstructured or semi-structured data. Example: MongoDB.

- Hadoop: For distributed storage and big data analytics. Example: HDFS, Hadoop's storage layer.

Comparison: Hadoop is distributed and scalable, RDBMS is transactional, and NoSQL is flexible.

Q7. Multivariate Analysis Techniques with Use Cases

Multivariate analysis is used to understand the relationships among multiple variables:

- PCA (Principal Component Analysis): Dimensionality reduction.

- Clustering: Grouping similar data points.

- Factor Analysis: Identifying hidden (latent) factors.

Applications: Customer segmentation in marketing, risk assessment in finance.

Q8. Components of Hadoop and MapReduce

Hadoop is used for distributed processing of big data:

- Components: HDFS, YARN, MapReduce, and Hadoop Common.

- MapReduce Workflow: Input splitting, mapping, shuffling, and reducing.

Applications: E-commerce recommendations, genomics, and fraud detection.
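
The four MapReduce steps can be imitated in a single process with the classic word-count example. This is only a sketch of the data flow, not Hadoop's API: the function names and the sample text are assumptions, and a real job would run the phases in parallel across a cluster.

```python
from collections import defaultdict

def map_phase(split):
    """Mapping: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]

def shuffle_phase(mapped_pairs):
    """Shuffling: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducing: combine the grouped values for one key."""
    return key, sum(values)

document = "big data needs big tools\nhadoop stores big data"

splits = document.splitlines()                             # input splitting
mapped = [pair for s in splits for pair in map_phase(s)]   # mapping
grouped = shuffle_phase(mapped)                            # shuffling
counts = dict(reduce_phase(k, v) for k, v in grouped.items())  # reducing

print(counts)
```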

Q9. Role of Visualization Tools

Data visualization tools convert raw data into graphs and charts:

- Tools: Tableau, Power BI, matplotlib.

- Applications: Disease tracking in healthcare, trend analysis in finance, and customer insights in marketing.

Importance: They simplify data so that insights are easier to derive.

Q10. Hive Architecture and Features

Hive is a SQL-like data warehousing tool that runs on top of Hadoop:

- Architecture: Components include the Metastore, Driver, and Compiler, with HDFS as the underlying storage.

- Features: Scalable, extensible, and queried with a SQL-like language (HiveQL).

Applications: Transaction analysis and risk management.


Q11. Supervised vs. Unsupervised Learning

Supervised and unsupervised learning are used in different scenarios:

- Supervised Learning: Trains on labeled data. Example: Spam detection.

- Unsupervised Learning: Finds patterns in unlabeled data. Example: Customer segmentation.

Conclusion: The choice depends on the problem and the type of data available.

Q12. Advantages of PCY Algorithm Over Apriori

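The PCY (Park-Chen-Yu) algorithm improves on Apriori in memory use and efficiency:

- PCY uses hashing and bitmaps: in the first pass it hashes item pairs into buckets, then summarizes the frequent buckets as a memory-efficient bitmap.

- Apriori counts every candidate pair of frequent items in its second pass, whereas PCY counts only the pairs that also hash to a frequent bucket.

Applications: Frequent itemset mining in large-scale datasets.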
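
A two-pass sketch of PCY for frequent pairs is shown below. The toy hash function `bucket_of`, the bucket count, and the baskets are assumptions chosen for readability; a real implementation would use a far larger bucket array and a compact bit array.

```python
from collections import Counter
from itertools import combinations

def bucket_of(pair, num_buckets):
    """Deterministic toy hash: sum of character codes modulo the bucket count."""
    return sum(ord(ch) for item in pair for ch in item) % num_buckets

def pcy_frequent_pairs(baskets, min_support, num_buckets=11):
    # Pass 1: count single items and hash every pair into a bucket counter.
    item_counts = Counter()
    bucket_counts = [0] * num_buckets
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, num_buckets)] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    # Between passes: compress the bucket counts into one bit per bucket.
    bitmap = [c >= min_support for c in bucket_counts]

    # Pass 2: count a pair only if both items are frequent AND its bucket is frequent.
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket) & frequent_items)
        for pair in combinations(items, 2):
            if bitmap[bucket_of(pair, num_buckets)]:
                pair_counts[pair] += 1
    return {p: c for p, c in pair_counts.items() if c >= min_support}

# Hypothetical market baskets.
baskets = [
    ["milk", "bread", "butter"],
    ["milk", "bread"],
    ["bread", "butter"],
    ["milk", "bread", "eggs"],
    ["eggs", "butter"],
]
frequent_pairs = pcy_frequent_pairs(baskets, min_support=2)
print(frequent_pairs)
```

The bitmap can only prune pairs, never lose a frequent one, because a frequent pair always pushes its own bucket count to at least the support threshold.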

Q13. Bernoulli Sampling and SON Algorithm

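Bernoulli sampling and the SON algorithm are used in stream data analysis:

- Bernoulli Sampling: Each element is included independently with a fixed probability. Example: Sampling social media data.

- SON Algorithm: Efficient for finding frequent patterns on distributed systems, because the data is split into chunks that are mined separately and the resulting candidates are then verified against the full dataset.

Applications: Fraud detection and web log analysis.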
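
Bernoulli sampling fits a stream naturally because each element is decided on its own, without knowing the stream length. A minimal sketch, assuming a hypothetical generator name `bernoulli_sample` and a stand-in stream of integers:

```python
import random

def bernoulli_sample(stream, p, seed=None):
    """Yield each element of the stream independently with probability p."""
    rng = random.Random(seed)
    for item in stream:
        if rng.random() < p:  # one independent coin flip per element
            yield item

# Stand-in for a stream of 10,000 social media posts, sampled at roughly 10%.
sample = list(bernoulli_sample(range(10_000), p=0.1, seed=42))
print(len(sample))  # close to 1,000, but the exact size varies with the seed
```

Unlike fixed-size reservoir sampling, the sample size here is itself random; only its expected value (stream length times p) is fixed.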

Q14. Predictive vs. Prescriptive Analytics

Predictive and prescriptive analytics both support decision-making:

- Predictive: Forecasts future trends. Example: Sales forecasting.

- Prescriptive: Recommends the best actions. Example: Dynamic pricing.

Conclusion: Combining predictive insights with prescriptive actions is powerful.

Q15. Hierarchical Clustering


Hierarchical clustering organizes data points into a tree-like structure:

- Types: Agglomerative (bottom-up) and Divisive (top-down).

- Applications: Genomics, marketing segmentation.

Advantages: Visual representation using dendrograms.
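
The agglomerative (bottom-up) variant can be sketched as repeated merging of the two closest clusters. The version below assumes single linkage (distance between the closest members) and one-dimensional points, both chosen only to keep the example short; the function name and data are hypothetical.

```python
def agglomerative(points, num_clusters):
    """Single-linkage agglomerative clustering of 1-D points.

    Returns the final clusters and the merge history, which is the
    information a dendrogram would draw.
    """
    clusters = [[p] for p in points]  # start with every point on its own
    merges = []
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return clusters, merges

points = [1.0, 1.2, 5.0, 5.1, 9.0]
clusters, merges = agglomerative(points, num_clusters=3)
print(sorted(sorted(c) for c in clusters))
```

Cutting the merge history at a different level (a different `num_clusters`) gives a coarser or finer grouping without rerunning the distance logic, which is the practical appeal of the dendrogram view.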

Q16. Streaming Data Processing vs. Traditional Data Processing

Data processing approaches are either real-time or batch-based:

- Streaming: Processes continuous data as it arrives. Example: Stock market updates.

- Traditional: Processes data in batches. Example: Monthly reports.

Comparison: Streaming gives real-time results, whereas traditional processing suits periodic analysis.
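
The contrast can be illustrated with an average price: a batch job waits for the full dataset, while a streaming computation keeps a running value that is updated per event. The class name `RunningMean` and the prices are assumptions for this sketch.

```python
def batch_mean(values):
    """Batch style: needs the complete dataset before producing a result."""
    return sum(values) / len(values)

class RunningMean:
    """Streaming style: constant memory, result available after every event."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Incremental update: shift the mean toward the new value.
        self.mean += (value - self.mean) / self.count
        return self.mean

# Hypothetical stock price ticks.
prices = [100.0, 102.0, 101.0, 105.0]

stream = RunningMean()
live = [stream.update(p) for p in prices]  # one result per incoming tick

print(live)                # running averages as the ticks arrive
print(batch_mean(prices))  # single result once all ticks are collected
```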

Q17. Prediction Error in Regression and Classification

Prediction error measures how accurate a model is:

- Regression: Metrics include MAE and MSE (error measures) and R-squared (a goodness-of-fit measure).

- Classification: Metrics include the confusion matrix, precision, and recall.

Example: Misclassification rate for a classifier, and forecast accuracy for a sales model.
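
The metrics named above follow directly from their standard definitions. The sketch below computes them by hand on small made-up prediction lists; the function names are hypothetical, and libraries such as scikit-learn provide equivalent metrics.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one class, from confusion-matrix counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fp), tp / (tp + fn)

# Regression: hypothetical sales figures versus forecasts.
sales = [3.0, 5.0, 7.0, 9.0]
forecast = [2.0, 5.0, 8.0, 9.0]
print(mae(sales, forecast), mse(sales, forecast), r_squared(sales, forecast))

# Classification: hypothetical labels (1 = positive class).
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
precision, recall = precision_recall(actual, predicted)
misclassification_rate = sum(a != p for a, p in zip(actual, predicted)) / len(actual)
print(precision, recall, misclassification_rate)
```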

Q18. Steps in Data Analysis Process

Data analysis is a systematic process:

- Steps: Define the objectives, collect and clean the data, then model and visualize it.

- Applications: Business optimization and trend analysis.

Conclusion: The process converts insights into actionable recommendations.
