18CSE355T – DATA MINING AND ANALYTICS Year & Semester - III & 5
Kinds of data meant for mining
• Data mining can be applied to any kind of data as long as the data are meaningful for a target
application.
• The most basic forms of data for mining applications are database data, data warehouse data
and transactional data.
• Data mining can also be applied to other forms of data (e.g., data streams, ordered/sequence
data, graph or networked data, spatial data, text data, multimedia data, and the WWW).
Database Data
• A database system, also called a database management system (DBMS), consists of a collection
of interrelated data, known as a database, and a set of software programs to manage and access
the data.
• The software programs provide mechanisms for defining database structures and data storage;
for specifying and managing concurrent, shared, or distributed data access; and for ensuring
consistency and security of the information stored despite system crashes or attempts at
unauthorized access.
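As a rough illustration of the points above, the sketch below uses Python's built-in sqlite3 module as a small stand-in for a DBMS. The customer table, its columns, and the sample rows are hypothetical and chosen only for the example. It shows a structure being defined, data being accessed through a query, and a consistency rule (the primary key) being enforced by the system.

```python
import sqlite3

# In-memory database; table name, columns, and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")

# Define the database structure.
conn.execute(
    "CREATE TABLE customer ("
    "cust_id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)

# Store some data.
conn.executemany(
    "INSERT INTO customer (cust_id, name, city) VALUES (?, ?, ?)",
    [(1, "Asha", "Chennai"), (2, "Ravi", "Mumbai"), (3, "Meena", "Chennai")],
)
conn.commit()

# Access the data through a declarative query.
rows = conn.execute(
    "SELECT city, COUNT(*) FROM customer GROUP BY city ORDER BY city"
).fetchall()
print(rows)

# The DBMS enforces consistency: a duplicate primary key is rejected.
duplicate_rejected = False
try:
    conn.execute(
        "INSERT INTO customer (cust_id, name, city) VALUES (1, 'Dup', 'Delhi')"
    )
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```

A production DBMS adds concurrency control, recovery, and access control on top of this; the sketch only hints at the structure-definition and consistency aspects.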
Data Warehouse
• A data warehouse is a repository of information collected from multiple sources, stored under a
unified schema, and usually residing at a single site.
• Data warehouses are constructed via a process of data cleaning, data integration, data
transformation, data loading, and periodic data refreshing.
• A data warehouse is usually modeled by a multidimensional data structure, called a data cube,
in which each dimension corresponds to an attribute or a set of attributes in the schema, and
each cell stores the value of some aggregate measure such as count or sum.
• A data cube provides a multidimensional view of data and allows the precomputation and fast
access of summarized data.
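The data cube idea can be sketched in a few lines of Python. This is a minimal illustration and not how a real warehouse engine stores cubes: the dimensions (city, item, quarter), the measure (units sold), and the sample records are all assumptions made for the example. Each cell key keeps some dimensions and replaces the rest with "*", and the cell value is the precomputed sum for that combination.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact records: (city, item, quarter, units_sold).
sales = [
    ("Chennai", "TV", "Q1", 5),
    ("Chennai", "Phone", "Q1", 8),
    ("Mumbai", "TV", "Q1", 3),
    ("Mumbai", "TV", "Q2", 4),
]
DIMENSIONS = ("city", "item", "quarter")


def build_cube(records):
    """Precompute sum(units_sold) for every subset of the dimensions.

    A "*" in a cell key means that dimension has been aggregated away.
    """
    cube = defaultdict(int)
    n = len(DIMENSIONS)
    for *dims, measure in records:
        # Visit every subset of dimensions to keep (from none to all).
        for k in range(n + 1):
            for kept in combinations(range(n), k):
                key = tuple(dims[i] if i in kept else "*" for i in range(n))
                cube[key] += measure
    return dict(cube)


cube = build_cube(sales)

# Summaries are now simple lookups instead of scans over the raw records.
print(cube[("Chennai", "*", "*")])    # total units for Chennai
print(cube[("*", "TV", "*")])         # total TV units over all cities and quarters
print(cube[("Mumbai", "TV", "Q2")])   # a single base cell
print(cube[("*", "*", "*")])          # grand total
```

The trade-off shown here is the one the notes describe: extra work and storage up front in exchange for fast access to summarized data.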
Unit 1 Page 1
Transactional Data
• In general, each record in a transactional database captures a transaction, such as a customer’s
purchase, a flight booking, or a user’s clicks on a web page.
• A transaction typically includes a unique transaction identity number (trans ID) and a list of the
items making up the transaction, such as the items purchased in the transaction.
• A transactional database may have additional tables, which contain other information related to
the transactions, such as item description, information about the salesperson or the branch, and
so on.
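The structure described above can be pictured with plain Python containers. The transaction IDs, items, and prices below are hypothetical. One mapping holds each transaction as an ID plus its list of items, and a second mapping plays the role of an additional table with item descriptions.

```python
from collections import Counter

# Hypothetical transactions: trans ID -> list of items in the transaction.
transactions = {
    "T100": ["bread", "milk", "eggs"],
    "T200": ["bread", "butter"],
    "T300": ["milk", "bread"],
}

# A separate, hypothetical table holding additional item information.
item_info = {
    "bread": {"description": "whole wheat loaf", "price": 40.0},
    "milk": {"description": "1 litre toned milk", "price": 30.0},
    "eggs": {"description": "pack of six", "price": 45.0},
    "butter": {"description": "100 g salted", "price": 55.0},
}

# Count how many transactions contain each item.
item_counts = Counter(
    item for items in transactions.values() for item in set(items)
)


def transaction_total(trans_id):
    """Join a transaction with the item table to compute its total price."""
    return sum(item_info[item]["price"] for item in transactions[trans_id])


print(item_counts.most_common())
print(transaction_total("T200"))
```

Counting item occurrences across transactions, as done here, is a typical first step when mining this kind of data.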