Question Bank For DM

Question bank for deceptive matha

Uploaded by

chaitu naidu

Question Bank for DM & DA

Module - 1
1. Define a data warehouse and explain its purpose in modern data management.
2. What are the key differences between operational database systems and data
warehouses?
3. Discuss the characteristics of a data warehouse and why they are essential for
decision-making processes.
4. Can you outline the typical architecture of a data warehouse? Explain each
component briefly.
5. Define data mining and discuss its significance in extracting valuable insights from
large datasets.
6. What is Knowledge Discovery in Databases (KDD)? How does it relate to data
mining?
7. Identify and explain some of the key challenges in data mining.
8. Describe the primary data mining tasks and provide examples of each.
9. Why is data preprocessing crucial in data mining? Discuss its various stages.
10. Explain the following data preprocessing techniques: data cleaning, missing data
handling, dimensionality reduction, feature subset selection, discretization,
binarization, and data transformation.
11. Briefly discuss the importance of measures of similarity and dissimilarity in data
mining. Provide some basic examples.
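For question 11, a few basic measures can be sketched in plain Python. This is a minimal illustration and not a prescribed answer; the function names are hypothetical, and the inputs are assumed to be equal-length numeric vectors (non-zero for cosine similarity) or non-empty sets.

```python
import math

def euclidean_distance(a, b):
    """Dissimilarity: straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Similarity: cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def jaccard_similarity(s, t):
    """Similarity between two sets: size of intersection over size of union."""
    s, t = set(s), set(t)
    return len(s & t) / len(s | t)  # assumes the union is not empty
```

Euclidean distance grows as objects become less alike, while cosine and Jaccard similarity approach 1 as objects become more alike.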

Module - 2

1. Define data analytics and discuss its importance in modern business decision-
making.
2. What are the key components of the data analytics process? Briefly explain each
component.
3. How does descriptive analytics differ from predictive analytics and prescriptive
analytics? Provide examples of each.
4. Discuss the role of data visualization in data analytics. How does it aid in
understanding and interpreting data?
5. Describe some common tools and technologies used in data analytics. How do they
facilitate data processing, analysis, and visualization?
6. Explain how modeling techniques are applied in business decision-making
processes.
7. Differentiate between structured and unstructured data. Provide examples of each
type and discuss their characteristics.
8. What are the different types of variables in statistical analysis? Explain the
distinctions between categorical and numerical variables.
9. Discuss the importance of database management systems (DBMS) in organizing
and accessing structured data. Provide examples of popular DBMS platforms.
10. Explain the concept of data modeling and its role in data management and
analysis.
11. Describe the difference between conceptual, logical, and physical data models.
Provide examples of each.
12. What are missing values, and why are they problematic in data analysis?
13. Discuss various techniques for handling missing data, including mean imputation,
median imputation, and predictive imputation.
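For question 13, mean and median imputation can be sketched as follows. This is only an illustration: the helper name `impute` is hypothetical, and missing values are assumed to be represented by `None` in a list of numbers. Predictive imputation, which fits a model on the observed rows, is not shown.

```python
from statistics import mean, median

def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]
```

The median is less sensitive to extreme values than the mean, which is why it is often preferred for skewed variables.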

Module - 3

1. Explain the concept of regression analysis and its purpose in statistical modeling.
2. What are the key assumptions of linear regression? Discuss the importance of these
assumptions in regression analysis.
3. Define the BLUE (Best Linear Unbiased Estimator) property in the context of regression
analysis. How does it relate to the ordinary least squares (OLS) estimation method?
4. Describe the process of least squares estimation in regression. How does it
determine the best-fitting line or plane for the data?
5. What is variable rationalization in regression modeling? Discuss its significance in
selecting predictor variables for the regression model.
6. Describe the stepwise approach to model building in regression analysis. What are
the advantages and disadvantages of stepwise regression?
7. How do you handle multicollinearity in regression analysis? Discuss some
techniques for detecting and addressing multicollinearity among predictor variables.
8. Explain the concept of logistic regression. How does it differ from linear regression
in terms of the outcome variable and model assumptions?
9. Discuss the logistic function and its role in logistic regression. How does it
transform the linear combination of predictor variables into probabilities?
10. What are some common model fit statistics used in logistic regression? Explain
the significance of metrics such as the likelihood ratio test, deviance, and AIC
(Akaike Information Criterion).
11. Describe the process of constructing a logistic regression model. What steps are
involved in selecting predictor variables, fitting the model, and assessing model
performance?
12. Discuss the importance of model validation in logistic regression. What
techniques can be used to evaluate the performance of a logistic regression model?
13. Provide examples of how logistic regression is used in various business domains,
such as marketing, finance, healthcare, and customer relationship management.
14. How can logistic regression be applied to predict customer churn in a
telecommunications company? Discuss the relevant predictor variables and model
interpretation.
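For question 9, the logistic function and the way it maps a linear combination of predictors to a probability can be sketched as below. The function names and the coefficient values used in any example call are hypothetical, not taken from a fitted model.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function: maps any real z into the interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for large negative z
    return e / (1.0 + e)

def predict_probability(intercept, coefficients, features):
    """P(y = 1 | x) = logistic(b0 + b1*x1 + ... + bk*xk)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return logistic(z)
```

A linear score of 0 corresponds to a probability of 0.5, and each coefficient shifts the log-odds by a fixed amount per unit change in its predictor.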

Module - 4

1. Define association rule mining and its significance in data mining.
2. What are frequent itemsets? How are they used in association rule mining?
3. Discuss the Apriori algorithm for mining frequent itemsets. What are its key steps
and optimization techniques?
4. Explain the concept of support and confidence in association rule mining. How are
these metrics used to evaluate the quality of association rules?
5. Describe various methods for mining association rules, including Apriori, FP-
Growth, and Eclat algorithms. Compare and contrast their strengths and weaknesses.
6. How does correlation analysis differ from association rule mining? Provide
examples of situations where correlation analysis is more appropriate.
7. Discuss the types of association rules that can be mined from transactional datasets,
including single-dimensional, multi-level, and multi-dimensional rules.
8. Define classification and prediction in the context of machine learning. How do
they differ from each other?
9. Describe the process of decision tree induction. What criteria are used to split nodes
in a decision tree?
10. Explain the principles of Bayesian classification and Bayes' theorem. How is the
theorem applied to classify data into different classes?
11. Discuss the advantages of Bayesian classification in handling uncertainty and
incorporating prior knowledge into the classification process.
12. What is a lazy learner in machine learning? How does it differ from eager
learners?
13. Discuss the strengths and weaknesses of lazy learning approaches in classification
tasks, particularly in handling large datasets and non-linear relationships.
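For question 4 of this module, support and confidence can be computed directly from their definitions. The sketch below assumes each transaction is a set of item names and that the antecedent occurs in at least one transaction; the function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)
```

A rule such as {bread} -> {milk} is usually kept only if both values exceed user-chosen minimum thresholds.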

Module - 5

1. Define cluster analysis and its significance in unsupervised learning. How does it
differ from classification?
2. What are the main objectives of cluster analysis? Discuss some common
applications of clustering in real-world scenarios.
3. Explain the process of cluster formation in cluster analysis. How are data points
grouped together based on their similarities or dissimilarities?
4. Describe the different types of data that can be analyzed using cluster analysis,
including numerical, categorical, and mixed data.
5. How does the type of data influence the choice of distance metrics and clustering
algorithms in cluster analysis?
6. Categorize major clustering methods based on their underlying approaches and
characteristics.
7. Compare and contrast partitioning methods, hierarchical methods, density-based
methods, and grid-based methods in terms of their strengths and weaknesses.
8. Explain the concept of partitioning methods in cluster analysis. How do partitioning
algorithms divide the dataset into clusters?
9. Discuss two popular partitioning algorithms, K-means and K-medoids. How do
they work, and what are their differences?
10. Describe the hierarchical clustering approach and how it differs from partitioning
methods.
11. Discuss the agglomerative and divisive hierarchical clustering techniques. How do
they build clusters iteratively based on the proximity of data points?
12. Define density-based clustering methods and their objective in identifying clusters
based on regions of high data density.
13. Explain the concept of grid-based clustering methods and their use in partitioning
the data space into a grid structure.
14. Define outliers and their role in cluster analysis. How do outliers affect the
formation and interpretation of clusters?
15. Discuss methods for outlier analysis, including distance-based approaches,
density-based approaches, and statistical methods.
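For question 9 of this module, the two alternating steps of K-means can be sketched as below. This is a minimal illustration under stated assumptions: points are numeric tuples, the initial centroids are supplied by the caller instead of being chosen at random, and the function names are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math

def assign(points, centroids):
    """Assignment step: index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points]

def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new.append(old)  # an empty cluster keeps its previous centroid
            continue
        new.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new

def k_means(points, centroids, max_iter=100):
    """Alternate the two steps until the centroids stop changing."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        updated = update(points, assign(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids
```

K-medoids follows the same alternating pattern but restricts each cluster center to an actual data point, which makes it less sensitive to outliers.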
