Data Warehousing & Data Mining - Important Q&A Notes
Unit 1: Introduction to Data Warehousing
Q. What is a Data Warehouse? How is it different from DBMS?
Ans: A data warehouse is a central repository that stores large volumes of historical data integrated from
various sources. It is used mainly for analysis and reporting. Unlike an operational DBMS, which handles
real-time transactional data (like ATM systems), a data warehouse is optimized for reading and analyzing
historical data. Data in an operational DBMS is normalized and updated frequently, while data in a warehouse
is usually denormalized, loaded periodically, and optimized for fast queries.
Q. Explain the Multidimensional Data Model.
Ans: The multidimensional data model allows data to be viewed along multiple dimensions, like a cube. For
example, sales data can be analyzed by product, time, and location. This model uses dimensions (like time,
location) and measures (like sales amount) to make analysis faster and easier using OLAP operations such as
roll-up, drill-down, slice, and dice.
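The cube idea can be sketched in a few lines of Python. This is only an illustration, not a real OLAP engine: the fact rows, the DIMENSIONS tuple, and the helper names roll_up and slice_cube are all made up for this example.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, quarter, city, sales_amount)
facts = [
    ("Laptop", "Q1", "Delhi", 120),
    ("Laptop", "Q1", "Mumbai", 80),
    ("Laptop", "Q2", "Delhi", 100),
    ("Phone", "Q1", "Delhi", 60),
    ("Phone", "Q2", "Mumbai", 90),
]

# The first three fields are dimensions; the fourth is the measure.
DIMENSIONS = ("product", "quarter", "city")


def roll_up(rows, keep):
    """Sum the sales measure over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]
    return dict(totals)


def slice_cube(rows, dimension, value):
    """Keep only the rows where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in rows if row[i] == value]


# Roll-up: total sales per product, ignoring quarter and city.
sales_by_product = roll_up(facts, keep=("product",))

# Slice on quarter = Q1, then roll up to city level.
q1_sales_by_city = roll_up(slice_cube(facts, "quarter", "Q1"), keep=("city",))
```

Here sales_by_product gives 300 for Laptop and 150 for Phone, and q1_sales_by_city gives 180 for Delhi and 80 for Mumbai.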
Unit 2: Data Warehouse Architecture
Q. Explain the Three-Tier Data Warehouse Architecture.
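Ans: 1) Bottom Tier: The data warehouse database server, loaded from operational databases and other
sources by ETL (extract, transform, load) tools. 2) Middle Tier: The OLAP server (ROLAP or MOLAP), which
presents the stored data as multidimensional views. 3) Top Tier: Front-end tools like query, reporting, and
dashboard tools for users. This layered design improves performance and manageability.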
Q. What are the advantages of OLAP over OLTP?
Ans: OLAP is used for analysis, while OLTP is used for day-to-day transactions. OLAP answers complex
analytical queries faster because the data is organized for reading and aggregation. OLAP supports
multidimensional views, summaries, and historical data, while OLTP focuses on current data and quick
insert/update/delete operations.
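The difference in workload can be illustrated with Python's built-in sqlite3 module. This is a rough sketch only: the sales table and its values are invented, and a real system would keep OLTP and OLAP data in separate databases rather than in one in-memory table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)

# OLTP-style work: short transactions that insert or update a few rows.
with conn:
    conn.executemany(
        "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)",
        [
            ("North", 2023, 100.0),
            ("North", 2024, 150.0),
            ("South", 2023, 200.0),
            ("South", 2023, 50.0),
        ],
    )
with conn:
    # Correct the amount of a single sale (row id 1).
    conn.execute("UPDATE sales SET amount = ? WHERE id = ?", (120.0, 1))

# OLAP-style work: one read-only query that summarizes many rows.
summary = conn.execute(
    "SELECT region, year, SUM(amount) FROM sales "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()
```

The summary list holds one total per (region, year) pair: 120.0 for North 2023, 150.0 for North 2024, and 250.0 for South 2023.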
Unit 3: Introduction to Data Mining
Q. What is Data Mining? Mention its objectives.
Ans: Data mining is the process of finding useful patterns or trends in large datasets. Its goals are to
discover hidden knowledge, predict future outcomes, and summarize big data. It helps decision making in areas like
customer behavior, fraud detection, etc.
Q. Explain KDD vs Data Mining.
Ans: KDD (Knowledge Discovery in Databases) is a full process: selection, cleaning, transformation, mining,
and interpretation. Data mining is a step within KDD where actual pattern discovery happens. So, data mining
is part of the bigger KDD process.
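The KDD steps can be sketched as a chain of small Python functions. This is a toy illustration under assumed data: the customer records, the age groups, and every function name are hypothetical, and the "mining" step is just an average per group.

```python
# Hypothetical raw records; one has a missing spend value.
raw = [
    {"customer": "A", "age": 23, "spend": 40.0, "notes": "walk-in"},
    {"customer": "B", "age": 35, "spend": None, "notes": ""},
    {"customer": "C", "age": 41, "spend": 90.0, "notes": ""},
    {"customer": "D", "age": 28, "spend": 60.0, "notes": ""},
    {"customer": "E", "age": 47, "spend": 110.0, "notes": ""},
]


def select(records):
    """Selection: keep only the attributes relevant to the task."""
    return [{"age": r["age"], "spend": r["spend"]} for r in records]


def clean(records):
    """Cleaning: drop records with missing values."""
    return [r for r in records if r["age"] is not None and r["spend"] is not None]


def transform(records):
    """Transformation: turn raw age into a coarse age group."""
    return [
        {"age_group": "under_30" if r["age"] < 30 else "30_plus", "spend": r["spend"]}
        for r in records
    ]


def mine(records):
    """Mining: find the average spend per age group."""
    totals, counts = {}, {}
    for r in records:
        g = r["age_group"]
        totals[g] = totals.get(g, 0.0) + r["spend"]
        counts[g] = counts.get(g, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}


def interpret(pattern):
    """Interpretation: state the finding in a readable form."""
    top = max(pattern, key=pattern.get)
    return f"Highest average spend: {top} ({pattern[top]:.2f})"


pattern = mine(transform(clean(select(raw))))
report = interpret(pattern)
```

Only the mine function is data mining in the strict sense. The other functions are the remaining KDD steps around it.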
Unit 4: Data Mining Techniques
Q. What are Association Rules in data mining? Give an example.
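Ans: Association rules find relationships between items in large datasets. Example: {Bread} -> {Butter} means
customers who buy bread tend to also buy butter. The strength of a rule is measured by its support (how often
the items appear together) and its confidence (how often the rule holds when the left-hand side is present).
This helps in product placement and recommendations.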
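A rule like {Bread} -> {Butter} is usually judged by two standard measures: support (the fraction of transactions that contain all items of the rule) and confidence (among transactions containing the left-hand side, the fraction that also contain the right-hand side). The sketch below computes both. The transactions and function names are invented for illustration.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Fraction of transactions with `antecedent` that also have `consequent`.

    Assumes the antecedent occurs at least once (otherwise division by zero).
    """
    both = count(set(antecedent) | set(consequent), transactions)
    return both / count(antecedent, transactions)


rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
```

For this data, bread and butter appear together in 3 of 5 transactions (support 0.6), and 3 of the 4 bread transactions also contain butter (confidence 0.75).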
Q. Explain the Hierarchical Clustering Method.
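Ans: In the agglomerative (bottom-up) form of this method, each data point is initially a separate cluster.
The two most similar clusters are then combined step-by-step to form a hierarchy. The divisive (top-down)
form works the other way: it starts with one cluster and splits it repeatedly. The result is shown as a tree
called a dendrogram. It is useful when the number of clusters is not known in advance.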
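The bottom-up merging can be sketched in plain Python. This is a minimal illustration that assumes 1-D points and single linkage (distance between clusters = distance between their closest members). The function names and sample points are made up, and a real project would normally use a library implementation instead.

```python
def single_link(a, b):
    """Single-linkage distance: gap between the closest members of a and b."""
    return min(abs(x - y) for x in a for y in b)


def agglomerative(points):
    """Merge the two closest clusters repeatedly and return the merge history.

    Each history entry is (cluster_a, cluster_b, distance); read in order,
    the entries describe the dendrogram from the leaves up to the root.
    """
    clusters = [(p,) for p in points]  # every point starts as its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_link(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


merges = agglomerative([1.0, 2.0, 9.0, 10.0, 25.0])
merge_distances = [d for _, _, d in merges]
```

For these points the merge distances are 1.0, 1.0, 7.0, and 15.0. Cutting the dendrogram before the large jump from 1.0 to 7.0 leaves three clusters: (1, 2), (9, 10), and (25).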
Unit 5: Overview of Advanced Features of Data Mining
Q. What is Text Mining? Give its applications.
Ans: Text mining extracts meaningful patterns from text data like reviews or emails. It is used in sentiment
analysis, spam detection, and analyzing customer feedback. It helps convert unstructured text into structured
information.
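One simple way to turn unstructured text into structured information is a bag-of-words table: one row per document, one column per word, and each cell holding a word count. The sketch below assumes two invented reviews and a very basic tokenizer (lowercase letters only). Real text mining would add steps such as stop-word removal.

```python
import re
from collections import Counter

# Hypothetical customer reviews.
reviews = [
    "Great phone, great battery!",
    "Battery died fast. Bad phone.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# Word counts per review.
counts = [Counter(tokenize(r)) for r in reviews]

# Column headers: every distinct word, in alphabetical order.
vocabulary = sorted({word for c in counts for word in c})

# Structured result: one row per review, one column per vocabulary word.
matrix = [[c[word] for word in vocabulary] for c in counts]
```

The vocabulary here is bad, battery, died, fast, great, phone, so the first review becomes the row [0, 1, 0, 0, 2, 1]. Such rows can then be fed to a classifier for spam detection or sentiment analysis.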
Q. Explain Spatial Data Mining with applications.
Ans: Spatial data mining deals with geographic and location-based data. It is used in weather forecasting, city
planning, and traffic management. It helps in discovering patterns based on spatial location, like areas with
high pollution.
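A very simple form of spatial pattern discovery is to divide the map into grid cells and look for cells with a high average reading. The sketch below assumes made-up (x, y, value) pollution readings, and the function name, cell size, and threshold are also invented. It illustrates the idea only and is not a standard algorithm from any library.

```python
import math
from collections import defaultdict

# Hypothetical sensor readings: (x_km, y_km, pollution_level)
readings = [
    (0.5, 0.5, 80),
    (0.8, 0.2, 100),
    (1.5, 0.5, 30),
    (1.2, 1.7, 40),
    (0.1, 0.9, 90),
]


def hotspots(readings, cell_size, threshold):
    """Return grid cells whose average reading is above `threshold`."""
    cells = defaultdict(list)
    for x, y, value in readings:
        # Map each location to the grid cell that contains it.
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[cell].append(value)
    averages = {cell: sum(vs) / len(vs) for cell, vs in cells.items()}
    return {cell: avg for cell, avg in averages.items() if avg > threshold}


high_pollution_cells = hotspots(readings, cell_size=1.0, threshold=50)
```

With a 1 km grid, only cell (0, 0) has an average above 50 (its three readings average 90.0), so it is reported as the high-pollution area.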