Chapter two
Data Models
Data Models in Data Management and Curation
➢ A data model is a conceptual framework that defines how data is structured,
stored, retrieved, and presented.
➢ It provides a blueprint for organizing and managing data efficiently.
➢ Data modeling is the process of creating a diagram that represents your data system and
defines the structure, attributes, and relationships of your data entities.
➢ Data modeling organizes and simplifies your data in a way that makes it easy to understand,
manage, and query, while also ensuring data integrity and consistency.
➢ Different types of data models exist, each suited to different kinds of data and use cases.
Data Model Types
• There are three types of data models: conceptual, logical, and physical.
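As a rough illustration of how the three model types differ, the sketch below describes the same small domain at each level. The Student/Course domain, the table names, and the choice of SQLite are assumptions made for the example; they do not come from the slides.

```python
import sqlite3
from dataclasses import dataclass

# Conceptual level: which entities exist and how they relate.
# No attributes, types, or storage details yet.
conceptual_model = {
    "entities": ["Student", "Course"],
    "relationships": [("Student", "enrolls in", "Course")],
}

# Logical level: attributes and types for each entity,
# still independent of any particular database product.
@dataclass
class Student:
    student_id: int
    name: str

@dataclass
class Course:
    course_id: int
    title: str

# Physical level: concrete tables, keys, and column types
# for one specific DBMS (SQLite is assumed here).
PHYSICAL_DDL = """
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
"""

def build_physical_model():
    """Create the tables in an in-memory database and return their names."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PHYSICAL_DDL)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Note that the many-to-many "enrolls in" relationship only becomes a separate table (enrollment) at the physical level; the conceptual level just records that the relationship exists.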
Abstractions in Data Modeling
➢ Abstraction simplifies complex data structures by hiding lower-level details and
presenting data at different levels.
Trees/
• The “objects” involved are abstractions
of real-world entities.
• Objects are grouped in class hierarchies,
and have associated features.
• Object-oriented databases can
incorporate tables, but can also support
more complex data relationships.
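A minimal sketch of the object-oriented idea, using plain Python classes rather than any particular object database. The class names (Person, Author, Book) and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    """Abstraction of a real-world person."""
    name: str

@dataclass
class Author(Person):
    """Author is a subclass of Person, so it inherits the name feature."""
    nationality: str = "unknown"

@dataclass
class Book:
    """A book holds direct references to Author objects, not foreign keys."""
    title: str
    authors: list = field(default_factory=list)

    def author_names(self):
        return [author.name for author in self.authors]

# Hypothetical sample objects.
writer = Author(name="A. Writer", nationality="Exampleland")
book = Book(title="An Example Title", authors=[writer])
```

Here the relationship between a book and its authors is a nested list of objects, which is the kind of structure a flat table cannot express without an extra join table.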
6. Text and Documents as Data Models
• Text and document-based data models handle unstructured or semi-
structured data stored in documents rather than tables.
• Characteristics of Text-Based Models:
• Stores text-heavy data, such as books, articles, reports, and emails.
• Uses full-text search and indexing for fast retrieval.
• Often schema-less, allowing flexible structures.
• Common formats include XML and JSON.
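The characteristics above can be sketched with a few JSON-style documents and a tiny inverted index for full-text search. This is a simplified illustration, not how any specific document store works; the documents and the helper names (build_index, search) are made up for the example.

```python
import json
import re
from collections import defaultdict

# Schema-less documents: each one may carry different fields.
documents = [
    {"id": 1, "type": "email", "subject": "Data model review",
     "body": "Please review the logical model."},
    {"id": 2, "type": "report", "title": "Curation report",
     "body": "The data curation plan covers metadata.", "pages": 12},
]

def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc in docs:
        for value in doc.values():
            if isinstance(value, str):
                for token in re.findall(r"[a-z0-9]+", value.lower()):
                    index[token].add(doc["id"])
    return index

def search(index, term):
    """Return the sorted ids of documents that contain the term."""
    return sorted(index.get(term.lower(), set()))

index = build_index(documents)

# Documents serialize to JSON text and back without a predefined schema.
round_tripped = json.loads(json.dumps(documents))
```

Calling `search(index, "metadata")` looks the word up directly in the index instead of scanning every document, which is the basic reason indexing speeds up retrieval.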
7. Ontologies in Data Management
• An ontology defines a formal structure for knowledge representation, establishing
relationships between concepts in a domain.
• Characteristics of Ontologies:
• Defines concepts (classes) and their relationships.
• Uses semantic meaning to describe data.
• Example: Medical Ontology
• A Healthcare Ontology might define:
• Concepts: Patients, Doctors, Diseases, Treatments.
• Relationships: "A doctor treats a patient", "A patient has a disease", etc.
• Use cases:
• AI and machine learning for knowledge graphs.
• Data integration across multiple sources in enterprise systems.
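The healthcare ontology above could be sketched as classes plus subject-predicate-object triples. This is a plain-Python illustration under simplifying assumptions; real ontologies are usually written in dedicated languages such as OWL or RDF. The individuals (dr_a, patient_b, flu) and the helper functions are hypothetical.

```python
# Concepts (classes) named in the healthcare example.
CLASSES = {"Patient", "Doctor", "Disease", "Treatment"}

# Allowed relationships between classes: (subject class, predicate, object class).
RELATIONSHIPS = {
    ("Doctor", "treats", "Patient"),
    ("Patient", "has", "Disease"),
}

# Hypothetical individuals and the class each belongs to.
INSTANCES = {"dr_a": "Doctor", "patient_b": "Patient", "flu": "Disease"}

# Facts about individuals, stored as triples.
FACTS = [
    ("dr_a", "treats", "patient_b"),
    ("patient_b", "has", "flu"),
]

def is_valid(fact):
    """A fact is valid if the ontology allows its predicate between the two classes."""
    subject, predicate, obj = fact
    return (INSTANCES.get(subject), predicate, INSTANCES.get(obj)) in RELATIONSHIPS

def objects_of(subject, predicate):
    """Return everything the subject is linked to through the predicate."""
    return [o for s, p, o in FACTS if s == subject and p == predicate]
```

Because the relationships carry meaning, a statement such as `("flu", "treats", "dr_a")` can be rejected as inconsistent with the ontology even though it is structurally a well-formed triple.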
8. Dimensional Data Models
• Dimensional data models are designed to optimize data retrieval
speed for analytic purposes in a data warehouse.
• While relational and ER models emphasize efficient storage, dimensional
models increase redundancy in order to make it easier to locate information
for reporting and retrieval.
• This approach is typically used in OLAP (online analytical processing) systems.
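One common way to lay out a dimensional model is a star schema: a central fact table of measurements surrounded by descriptive dimension tables. The sketch below assumes a small sales example in SQLite; the table names, columns, and figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold descriptive attributes.
-- The category text is repeated per product (deliberate redundancy).
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- The fact table holds numeric measures plus keys into each dimension.
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    quantity INTEGER,
    amount REAL
);

INSERT INTO dim_date VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO dim_product VALUES
    (1, 'Notebook', 'Stationery'),
    (2, 'Pen', 'Stationery'),
    (3, 'Lamp', 'Furniture');
INSERT INTO fact_sales VALUES
    (1, 1, 10, 50.0),
    (1, 2, 20, 30.0),
    (2, 3, 1, 40.0),
    (2, 1, 4, 20.0);
""")

def sales_by_category(connection):
    """Typical reporting query: one join from facts to a dimension, then aggregate."""
    query = """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """
    return connection.execute(query).fetchall()
```

A fully normalized design would move category into its own table; keeping it inside dim_product costs some storage but lets the report run with a single join.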
Schemas in Data Management
• A schema is a blueprint that defines the structure of a database, including tables, fields, and
data types.
• Characteristics of Schemas:
• Specifies table structures, constraints, and relationships.
• Enforces data consistency and integrity.
• Used in relational databases (SQL) and NoSQL databases.
• Example: Database Schema for a Library
• LibrarySchema
  ├── Books (BookID, Title, AuthorID, Genre)
  ├── Authors (AuthorID, Name, Nationality)
  ├── Borrowers (BorrowerID, Name, Address)
  └── Transactions (TransactionID, BookID, BorrowerID, BorrowDate, ReturnDate)
Use cases:
• Defining relational database models (MySQL, PostgreSQL).
• Ensuring data consistency in structured storage.
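The library schema above could be written as SQL table definitions, for example in SQLite. The table and column names follow the slide; the column types, NOT NULL constraints, and foreign keys are assumptions added for illustration, as is the helper try_insert_orphan_book.

```python
import sqlite3

LIBRARY_DDL = """
CREATE TABLE Authors (
    AuthorID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Nationality TEXT
);
CREATE TABLE Books (
    BookID INTEGER PRIMARY KEY,
    Title TEXT NOT NULL,
    AuthorID INTEGER NOT NULL REFERENCES Authors(AuthorID),
    Genre TEXT
);
CREATE TABLE Borrowers (
    BorrowerID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Address TEXT
);
CREATE TABLE Transactions (
    TransactionID INTEGER PRIMARY KEY,
    BookID INTEGER NOT NULL REFERENCES Books(BookID),
    BorrowerID INTEGER NOT NULL REFERENCES Borrowers(BorrowerID),
    BorrowDate TEXT NOT NULL,
    ReturnDate TEXT
);
"""

def create_library_db():
    """Create the schema in an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.executescript(LIBRARY_DDL)
    return conn

def try_insert_orphan_book(conn):
    """Try to add a book whose author does not exist; return True if accepted."""
    try:
        conn.execute(
            "INSERT INTO Books (Title, AuthorID, Genre) VALUES (?, ?, ?)",
            ("Untitled", 999, "Fiction"),
        )
    except sqlite3.IntegrityError:
        return False
    return True

library = create_library_db()
```

With the constraints in place, the database itself rejects a book that points at a missing author, which is what "enforces data consistency and integrity" means in practice.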
Conclusion
• Different data models cater to different needs in data management and
curation.
• Relational models are well suited to structured, tabular data.
• Trees help with hierarchical data.
• Text and document models manage unstructured data.
• Ontologies provide semantic understanding.
• Schemas define structured data constraints.
• Abstraction layers simplify complexity.