Lec4 Databases

The document discusses several types of biological databases including: 1) Primary databases that contain original experimental data submissions controlled by submitters, as well as derivative databases derived from primary data controlled by third parties like NCBI. 2) Relational databases that organize information into tables with rows and columns to reduce data redundancy. 3) Major database providers like NCBI, EBI, and GenomeNet that host biological data. 4) Specific database types like nucleotide databases that store nucleic acid sequences, protein databases that include Uniprot, PDB, and Pfam, and pathway databases like KEGG.

Uploaded by

Ayesha Khan

Biological Databases

Zoya Khalid
[Link]@[Link]
Data vs. Information

• Information is produced by processing data
• Information is used to reveal the meaning in data
• Accurate, timely and relevant information is the key to good
decision making
• Good decision making is the key to organizational survival
What is a database?

• Structured collection of information.

• Consists of basic units called records or entries.
• Each record consists of fields, which hold pre-defined data
related to the record.
• For example, a protein database would have protein entries as
records and protein properties as fields (e.g., name of protein,
length, amino-acid sequence)
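
To make the record/field idea concrete, here is a minimal Python sketch. The class name `ProteinRecord`, its field names, and the two short sequences are illustrative assumptions, not the schema or content of any real protein database.

```python
from dataclasses import dataclass, fields


@dataclass
class ProteinRecord:
    """One record (entry); each attribute is a field."""
    name: str
    sequence: str  # amino-acid sequence in one-letter codes

    @property
    def length(self) -> int:
        # Derived property: number of residues in the sequence
        return len(self.sequence)


# A tiny "database": a collection of records (made-up example data)
database = [
    ProteinRecord(name="example protein A", sequence="MKTAYIAK"),
    ProteinRecord(name="example protein B", sequence="MVLSPADKTN"),
]

# Names of the stored fields, and the length of every record
field_names = [f.name for f in fields(ProteinRecord)]
lengths = {rec.name: rec.length for rec in database}
```

Here the length is computed from the sequence rather than stored, which is one possible design choice; a real database might store it as its own field.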
Types of databases

• Primary Databases
– Original submissions by experimentalists
– Content controlled by the submitter
• Examples: GenBank, Trace, SRA, SNP, GEO

• Derivative Databases
– Derived from primary data
– Content controlled by a third party (e.g., NCBI)
• Examples: NCBI Protein, RefSeq, TPA, RefSNP, GEO DataSets, UniGene, HomoloGene,
Structure, Conserved Domain
A flat-file database
Why Flat Files?

• Flat files are the universal mechanism for moving data from one
database or system to another.
• There are two common types of flat files: CSV (comma-separated
values) files and files that use another delimiter, such as a tab
or pipe character.
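
As a rough sketch of how a flat file moves data between systems, the snippet below reads two pipe-delimited rows (in the style of the example used with the relational-database slides) and re-exports them as CSV using Python's standard `csv` module. The column names are an assumption, since the flat file itself carries no header line.

```python
import csv
import io

# Pipe-delimited flat file content; io.StringIO stands in for a file on disk
flat_file = io.StringIO(
    "Omar|Farooq|Computer Science|NUCES|Islamabad\n"
    "Hadiya|Ali|Electrical Engineering|FAST|Islamabad\n"
)

# Assumed column names (hypothetical; the file has no header row)
columns = ["first_name", "last_name", "department", "institution", "address"]

# Read: one record per line, fields separated by "|"
reader = csv.reader(flat_file, delimiter="|")
records = [dict(zip(columns, row)) for row in reader]

# Write: the same records as comma-separated values with a header row
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(columns)
for rec in records:
    writer.writerow([rec[c] for c in columns])
csv_text = out.getvalue()
```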
Relational databases
Relational database

• A relational database consists of relations (tables) containing attributes
(fields or columns). Each row in a table is known as a record or tuple.
• Information should be ‘normalized’ so that it is non-redundant. This means
that every row should be unique, although this ideal is not always observed.
Flat file (pipe-delimited):
Omar|Farooq|Computer Science|NUCES|Islamabad
Hadiya|Ali|Electrical Engineering|FAST|Islamabad
Ahmed|Khan|Dept of Computer Science|NUCES|Isb
Ahmed|Khan|Dept of Management|NUST|Islamabad

The same data as a single table:
First Name   Last Name   Institution   Department                 Address
Omar         Farooq      NUCES         Computer Science           Islamabad
Hadiya       Ali         FAST          Electrical Engineering     Islamabad
Ahmed        Khan        NUCES         Dept of Computer Science   Isb
Ahmed        Khan        NUST          Dept of Management         Islamabad

Table Professor (primary key: Prof_id; foreign key: Contact_id)
Prof_id   First_name   Last_name   Contact_id
1         Omar         Farooq      1
2         Hadiya       Ali         2
3         Ahmed        Khan        1
4         Ahmed        Khan        3

Table Contacts (primary key: Contact_id)
Contact_id   Institution   Department               Address
1            NUCES         Computer Science         Islamabad
2            FAST          Electrical Engineering   Islamabad
3            NUST          Management               Islamabad
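
The Professor and Contacts tables above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration under assumptions: the lowercase table and column names and the in-memory database are choices made here, not something the slides specify. The join follows the foreign key `contact_id` from each professor to the matching contact row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce foreign keys

conn.executescript("""
CREATE TABLE contacts (
    contact_id  INTEGER PRIMARY KEY,
    institution TEXT,
    department  TEXT,
    address     TEXT
);
CREATE TABLE professor (
    prof_id    INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name  TEXT,
    contact_id INTEGER REFERENCES contacts(contact_id)
);
""")

# Each institution/department/address combination is stored exactly once
conn.executemany("INSERT INTO contacts VALUES (?, ?, ?, ?)", [
    (1, "NUCES", "Computer Science", "Islamabad"),
    (2, "FAST", "Electrical Engineering", "Islamabad"),
    (3, "NUST", "Management", "Islamabad"),
])
# Professors point at a contact row through the foreign key
conn.executemany("INSERT INTO professor VALUES (?, ?, ?, ?)", [
    (1, "Omar", "Farooq", 1),
    (2, "Hadiya", "Ali", 2),
    (3, "Ahmed", "Khan", 1),
    (4, "Ahmed", "Khan", 3),
])

# Join the two tables to rebuild the original flat view
rows = conn.execute("""
    SELECT p.first_name, p.last_name, c.institution, c.department
    FROM professor AS p
    JOIN contacts AS c ON c.contact_id = p.contact_id
    ORDER BY p.prof_id
""").fetchall()
```

Because professors 1 and 3 share contact row 1, the spelling variants seen in the flat file ("Computer Science" versus "Dept of Computer Science", "Islamabad" versus "Isb") cannot arise: the value exists in only one place.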
Types of databases
Database providers

• The National Center for Biotechnology Information (NCBI)
offers data banks, databases and tools (USA)
• The European Bioinformatics Institute (EBI) serves a similar
function in Europe
• GenomeNet gathers several databases from Japan
Data quality

• How are things entered?
– Step-by-step protocol
• What is the evidence?
– Automatic validation
– Manual curation
• How new is the data?
• Can the data be secret?
• Redundant or non-redundant?
Summary
NCBI
European Bioinformatics Institute
GenomeNet
NAR database issue
Nucleotide databases
• International Nucleotide Sequence Database Collaboration (INSDC)
– GenBank
– EMBL
– DDBJ
• The nucleotide sequence databases are data repositories,
accepting nucleic acid sequence data from the scientific
community and making it freely available.
– The databases strive for completeness, with the aim of recording
every publicly known nucleic acid sequence.
– These data are heterogeneous; they vary with respect to
• the source of the material (e.g. genomic versus cDNA)
• the intended quality (e.g. finished versus single-pass sequences)
• the extent of sequence annotation
• the intended completeness of the sequence relative to its biological target
(e.g. complete versus partial coverage of a gene or a genome).
GenBank entry
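
GenBank entries are distributed as structured flat-file records: keyword lines such as LOCUS, DEFINITION and ACCESSION, then the sequence after an ORIGIN line, ending with `//`. The sketch below parses a deliberately tiny, made-up record. The accession `EXAMPLE01` and its sequence are hypothetical, and the parser ignores continuation lines and the FEATURES table, so treat it as an illustration of the layout rather than a full reader.

```python
# Hypothetical record in GenBank flat-file style (not a real entry)
record_text = """\
LOCUS       EXAMPLE01                 24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical example sequence (not a real GenBank entry).
ACCESSION   EXAMPLE01
ORIGIN
        1 atggccattg taatgggccg ctga
//
"""


def parse_genbank(text):
    """Return top-level keyword fields plus the joined sequence."""
    parsed = {}
    seq_parts = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if in_sequence:
            # Sequence lines: a position number, then blocks of bases
            seq_parts.extend(line.split()[1:])
        elif line.startswith("ORIGIN"):
            in_sequence = True
        elif line and not line.startswith(" "):
            # Keyword line: keyword, whitespace, value
            key, _, value = line.partition(" ")
            parsed[key] = value.strip()
        # Indented continuation lines are skipped in this sketch
    parsed["sequence"] = "".join(seq_parts)
    return parsed


entry = parse_genbank(record_text)
```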
Genome specific databases
Protein Databases

• Sequences are in UniProt
• Structures are in PDB
• Enzyme classifications: EC numbers
• Protein families: Pfam, InterPro, etc.
UniProt

• UniProtKB: Protein knowledgebase, consists of two sections:

– Swiss-Prot, which is manually annotated and reviewed.
– TrEMBL, which is automatically annotated and is not reviewed.
• Includes complete and reference proteome sets.
• UniRef: Sequence clusters, used to speed up sequence
similarity searches.
• UniParc: Sequence archive, used to keep track of sequences
and their identifiers.
• Supporting data
– Literature citations, keywords, subcellular locations, cross-referenced
databases and more.
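
The Swiss-Prot/TrEMBL split is visible in UniProtKB FASTA headers, which begin with `sp` for Swiss-Prot or `tr` for TrEMBL, followed by the accession and entry name. The sketch below reads that prefix. The function name `parse_uniprot_header` is an assumption made here, the first header is shortened for illustration, and the second header is entirely hypothetical.

```python
import re

# >db|ACCESSION|ENTRY_NAME description...
HEADER_RE = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)\s+(.*)$")

SECTION = {
    "sp": "Swiss-Prot (reviewed)",
    "tr": "TrEMBL (unreviewed)",
}


def parse_uniprot_header(header):
    """Split a UniProtKB FASTA header into its main parts."""
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"not a UniProtKB-style header: {header!r}")
    db, accession, entry_name, description = match.groups()
    return {
        "section": SECTION[db],
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
    }


# Shortened header for human hemoglobin subunit alpha
reviewed = parse_uniprot_header(
    ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens"
)
# Hypothetical TrEMBL-style header (made-up accession and names)
unreviewed = parse_uniprot_header(
    ">tr|X0EXAMPLE0|X0EXAMPLE0_EXMPL Uncharacterized protein OS=Example organism"
)
```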
UniProt
PDB
PDB
Pfam

Multiple Sequence Alignment and HMMs

KEGG
[Link]
