0% found this document useful (0 votes)
16 views6 pages

Advanced Databases Unit 1

The document provides an overview of advanced database concepts, including the differences between file systems and database management systems (DBMS), as well as the structures used for database storage such as tablespaces, data files, and control files. It also discusses the distinctions between structured and unstructured data, SQL commands (DDL, DML, DCL), and introduces data warehousing and ETL tools for data analysis and reporting. Key points include the organization of data in databases, methods for data manipulation, and the role of ETL tools in preparing data for analysis.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
16 views6 pages

Advanced Databases Unit 1

The document provides an overview of advanced database concepts, including the differences between file systems and database management systems (DBMS), as well as the structures used for database storage such as tablespaces, data files, and control files. It also discusses the distinctions between structured and unstructured data, SQL commands (DDL, DML, DCL), and introduces data warehousing and ETL tools for data analysis and reporting. Key points include the organization of data in databases, methods for data manipulation, and the role of ETL tools in preparing data for analysis.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd

ADVANCED DATABASES

Unit 1 ( Module 1 )

 Database Concepts ( File System & DBMS )


1. File System:
A File System is like a digital file cabinet where you store your documents, pictures,
videos, and other files on your computer or device. The files are stored in folders,
and the file system helps you organize them.
 How it works: When you save a file, the file system keeps track of its name, where
it's stored (in which folder), and how to find it later.
 Example: If you save a Word document on your computer, the file system is
responsible for storing it and finding it when you want to open it.

2. DBMS (Database Management System):


A DBMS is a more advanced system to organize, store, and manage lots of data,
especially when that data is interconnected or needs to be accessed frequently.
Think of it as a more sophisticated version of a file system, designed for handling
large amounts of structured data efficiently. Database Management System (DBMS)
is a software tool that allows users to define, manipulate, retrieve, and manage data
within a database
 How it works: A DBMS helps you store data in tables (like an Excel sheet) with
rows and columns. It allows for complex searches, data retrieval, and
management. It also makes sure the data is consistent and not duplicated.
 Example: If you run a store and want to store information about customers,
products, and sales, you would use a DBMS to keep track of all this data in a well-
organized manner, making it easy to search for a specific product or customer.
In Short ,
File System: Stores files in folders, simpler to use for small amounts of data.
DBMS: Stores structured data in tables, better for large, complex data, allows easy
searching, updating, and managing.

 Database Storage Structures ( Tablespace , Control Files , Data Files )


1. Tablespace:
Think of a Tablespace as a big container or a storage area in a database that holds
data files. It's a way to organize how data is stored in the database.
 What it does: A tablespace doesn't hold data directly but provides an organizational
structure to store data files. You can have multiple tablespaces, each serving
different purposes (for example, storing customer data in one tablespace, sales data
in another)
 A Tablespace is like a folder in your computer where you organize things. It holds all
the data files for a database.
2. Data Files:
A Data File is where the actual data is physically stored on disk. These files contain
the records and all the data used by the database.
 What it does: Data files store all the tables, indexes, and other database objects.
Whenever you insert, update, or delete data, it’s written to the data files.
 Data Files are where the actual data in the database is stored. These are the files
that hold everything you see in the database (like tables, records, and information).

3. Control Files:
A Control File is a small, essential file that keeps track of the structure and state of
the database. It helps the database keep track of where its files are and ensures
everything is working correctly.
 What it does: Control files store important information like:
o The names and locations of data files.
o The current state of the database (whether it’s open or closed).
o Information about backups, transactions, and recovery.
 Control Files are tiny, important files that help the database keep track of everything.
They tell the database where the data files are, how everything is organized, and help
recover the database if something goes wrong.
 Think of it like this: A control file is like a list or index that tells you where your
folders and documents are located, so you can easily find them.

In Short ,
Tablespace = A folder that holds data files.
Data Files = The actual documents (where the data is stored).
Control Files = The list that keeps track of where everything is and ensures the
database is working properly.

Unit 1 ( Module 1 )

 Structured & Unstructured Data


1. Structured Data:
Structured data is organized and stored in a very neat, organized way. It’s like
information neatly arranged in tables, rows, and columns — just like a spreadsheet or
a database table. The data follows a strict structure.
 How it looks: Imagine a table where each row has data like "Customer Name", "Age",
"Address", "Phone Number", etc., and each column has a specific type of data (like
text, numbers, or dates).
Example:
o A customer database where every customer's name, email, and phone
number are stored in a table.
o A bank database with columns like "Account Number", "Account Holder",
"Balance", etc.
Where it’s used: Structured data is commonly used in Relational Databases
(RDBMS) like MySQL, PostgreSQL, and Oracle.

Unstructured data is not organized in a specific way. It’s like random information that
doesn't follow a strict table format. Think of it as the opposite of structured data.
o How it looks: Unstructured data can include things like emails, social media
posts, videos, images, audio files, and even free text written in a document.
It’s harder to organize because it doesn’t fit neatly into tables or rows. It can
be in any format — images, text, audio, etc.
 Example:
o A photo or video file is unstructured data. The database might store it, but
you won’t have columns and rows of neat, organized information about it.
o Tweets on Twitter or comments on Facebook are examples of unstructured
data — they’re free-form text.
Where it’s used: Unstructured data is often stored in NoSQL databases like
MongoDB or Cassandra, which are more flexible and can handle this kind of data.

Key Differences:
 Structured Data:
o Organized in tables with rows and columns.
o Easy to search and analyze.
o Examples: databases of customer info, product inventories.
 Unstructured Data:
o No predefined format or organization.
o Harder to search and analyze directly.
o Examples: images, videos, social media posts, audio files.

 SQL Commands ( DDL , DML , & DCL )


1. DDL (Data Definition Language):
DDL commands are used to define and manage the structure of a database. It deals
with creating, altering, and deleting database objects (like tables, schemas, etc.).
 What it does: It helps you create or change the layout of the database (tables,
columns, etc.).
 Common DDL Commands:
o CREATE: Used to create a new table, database, or other objects.
 Example: CREATE TABLE customers (id INT, name VARCHAR(50));
(Creates a new table for customers).
o ALTER: Used to change an existing database object (like adding or removing a
column).
 Example: ALTER TABLE customers ADD email VARCHAR(100); (Adds a
new column for email).
o DROP: Used to delete a database object (like a table or database).
 Example: DROP TABLE customers; (Deletes the customers table).

2. DML (Data Manipulation Language):


DML commands are used to manipulate the data inside the database. These
commands allow you to insert, update, or delete the actual data in the tables.
 What it does: It helps you work with the data itself, such as adding new records,
updating existing ones, or removing them.
 Common DML Commands:
o INSERT: Used to add new data (rows) into a table.
 Example: INSERT INTO customers (id, name, email) VALUES (1, 'John
Doe', 'john@[Link]'); (Adds a new customer).
o UPDATE: Used to modify existing data in a table.
 Example: UPDATE customers SET name = 'Jane Doe' WHERE id = 1;
(Updates the name of customer with id 1).
o DELETE: Used to remove data from a table.
 Example: DELETE FROM customers WHERE id = 1; (Deletes the
customer with id 1).

3. DCL (Data Control Language):

DCL commands are used to control access and permissions in the database. They
manage who can access or modify the data.
 What it does: It allows administrators to grant or revoke access rights to users.
 Common DCL Commands:
o GRANT: Used to give users permission to access or modify certain parts of the
database.
 Example: GRANT SELECT ON customers TO user1; (Gives "user1"
permission to view the customer table).
o REVOKE: Used to remove permissions from users.
 Example: REVOKE SELECT ON customers FROM user1; (Removes the
ability for "user1" to view the customer table).
In Short ,
 DDL (Data Definition Language): Deals with the structure of the database (create,
alter, delete tables).
 DML (Data Manipulation Language): Deals with the data inside the database (insert,
update, delete data).
 DCL (Data Control Language): Deals with permissions (granting and revoking access).
 Dataware Housing Concepts And Tools ( ETL Tools )
1. Data Warehousing:
A data warehouse is like a big storage area where data from different sources is
brought together and stored for analysis and reporting.
A centralized storage space that brings together data from different sources for
analysis and reporting. It helps you track historical data and analyze trends over time.

 What it is: Imagine you have data spread across different departments or systems
(like sales, customer service, and inventory). A data warehouse gathers all this
information in one place, so it's easier to analyze and make decisions.
 Purpose: The main purpose of a data warehouse is to store large amounts of data
from different sources and make it easy to analyze trends, patterns, and performance
over time.
 Example: Imagine a retail company that collects data from its stores, website, and
customer feedback systems. All this data is moved into a data warehouse, and
analysts can then run reports to understand customer behavior, sales trends, and
inventory needs.

2. ETL Tools:
ETL stands for Extract, Transform, and Load. These are the tools that help get data
from different sources and load it into the data warehouse.
Tools that help you Extract data from various sources, Transform it into the right
format, and Load it into the data warehouse so it can be used for analysis.

 What it does: ETL tools are like helpers that move and clean up the data, ensuring
it’s in the right format for storage and analysis in the data warehouse.
Here’s what each part of ETL means:
 Extract: First, ETL tools extract data from different sources. These sources could be
other databases, files, or external systems.
o Example: Getting customer data from the sales database or product details
from the inventory system.
 Transform: Next, the data is transformed (or cleaned) to make sure it’s in the right
format. It may involve:
o Changing formats (e.g., dates might need to be in a specific format).
o Combining data from different sources.
o Removing any mistakes or duplicates.
o Example: Converting dates from "MM/DD/YYYY" to "YYYY-MM-DD" format or
combining customer first and last names into a full name.
 Load: Finally, the transformed data is loaded into the data warehouse for storage
and analysis.
o Example: Loading the cleaned-up customer and sales data into the data
warehouse.

Popular ETL Tools:


There are many ETL tools available to help with the Extract, Transform, and Load
process. Some of the popular ones include:
 Informatica: A widely-used tool that helps with data integration and ETL processes.
 Talend: An open-source ETL tool that can connect with many types of data sources
and transform the data.
 Apache Nifi: An open-source data integration tool designed for data flow
automation.
 Microsoft SSIS (SQL Server Integration Services): A tool for ETL processes, often used
with Microsoft SQL Server databases.
 Alteryx: A data analytics tool that also helps with ETL, focusing on making the
process easier for users without technical skills.

You might also like