
Chapter 1

The document provides an introduction to the Bioconductor project in R, detailing its structure, functionality, and installation process for packages. It highlights the differences between S3 and S4 object systems, emphasizing the advantages of S4 in terms of validation and reusability. Additionally, it discusses genomic datasets, specifically using the yeast genome as an example, and demonstrates how to access and manipulate genomic sequences using Bioconductor packages.

Uploaded by

dora.ogorek
The Bioconductor Project
INTRODUCTION TO BIOCONDUCTOR IN R

Paula Andrea Martinez, PhD.


Data Scientist
Bioconductor

1 Bioconductor (www.bioconductor.org)

What do we measure and why?
Structure: elements, regions, size, order, relationships

Function: expression, levels, regulation, phenotypes

How to install Bioconductor packages?
Bioconductor has its own repository and its own way of installing packages, and each release is designed to work with a specific version of R.
For this course, you'll be using Bioconductor version 3.6.

Bioconductor version 3.7 or earlier uses biocLite():

source("https://bioconductor.org/biocLite.R")
biocLite("packageName")

Bioconductor version 3.8 and later uses BiocManager:

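# Install BiocManager from CRAN if it is missing
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install()
# To install a specific package: BiocManager::install("packageName")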
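
The version rule above can be written down as a small helper. This is a minimal sketch, not part of either installer: the function name bioc_installer_for() is hypothetical, and it only encodes the cut-off stated above (3.7 or earlier uses biocLite, 3.8 and later uses BiocManager).

```r
# Hypothetical helper: return the name of the installer that matches a
# Bioconductor version, following the 3.7 / 3.8 cut-off described above.
bioc_installer_for <- function(version) {
  v <- package_version(version)
  # package_version compares component by component, so "3.10" > "3.8"
  if (v <= package_version("3.7")) "biocLite" else "BiocManager"
}

bioc_installer_for("3.6")   # the version used in this course
bioc_installer_for("3.10")
```

Comparing with package_version() rather than plain strings matters here, because a string comparison would sort "3.10" before "3.8".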

Bioconductor version and package version
BiocInstaller works for Bioconductor version 3.7 or earlier

# Check Bioconductor version (for versions <= 3.7)
BiocInstaller::biocVersion()
# or
biocVersion()
# Load a package
library(packageName)
# Check versions for reproducibility
sessionInfo()
# or
packageVersion("packageName")
# Check package updates (Bioconductor version <= 3.7)
BiocInstaller::biocValid()
# or
biocValid()
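
A version record for reproducibility can be sketched with base R alone. In the example below, the methods package (shipped with R) stands in for "packageName", and the variable names are illustrative. For Bioconductor 3.8 and later, BiocManager::version() and BiocManager::valid() play the roles of biocVersion() and biocValid(); they are shown only as comments because they need BiocManager to be installed.

```r
# "methods" ships with R and stands in for "packageName" here
pkg <- "methods"
v <- packageVersion(pkg)

# packageVersion() returns a version object, so it can be compared directly
v >= "3.0.0"

# Keep a plain-text record alongside the analysis results
version_record <- sprintf("%s %s (%s)", pkg, as.character(v), R.version.string)
print(version_record)

# Counterparts for Bioconductor >= 3.8 (require BiocManager):
# BiocManager::version()
# BiocManager::valid()
```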

The Role of S4 in Bioconductor
S3
Positive

CRAN, simple but powerful

Flexible and interactive

Uses a generic function

Functionality depends on the first argument

Example: plot() and methods(plot)

Negative

Bad at validating types and naming conventions (dot or no dot?)

Inheritance works, but depends on the input
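
The S3 points above can be illustrated with a small sketch in base R. The class name "project" and the helper new_project() are hypothetical, chosen only for this example.

```r
# Constructor for a hypothetical S3 class "project":
# just a list with a class attribute attached
new_project <- function(name, milestone) {
  structure(list(name = name, milestone = milestone), class = "project")
}

# An S3 method is a plain function named generic.class
summary.project <- function(object, ...) {
  paste0("Project ", object$name, ": ", object$milestone)
}

p <- new_project("yeast", "sequencing")
summary(p)  # dispatches on the class of the first argument

# Nothing validates the contents: a numeric milestone is accepted silently
q <- new_project("yeast", 42)
class(q)
```

The method is found only because its name follows the generic.class pattern, which is where the naming ambiguity around dots comes from.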

S4
Positive

Formal definition of classes

Bioconductor reusability

Has validation of types

Naming conventions

Example: mydescriptor <- new("GenomeDescription")

Negative

Complex structure compared to S3
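
Type validation in S4 can be sketched with the methods package that ships with R. The class name "Milestone" and its slots are hypothetical and unrelated to any Bioconductor class.

```r
library(methods)

# Hypothetical class with typed slots and a validity check
setClass("Milestone",
         slots = c(label = "character", due = "Date"),
         validity = function(object) {
           if (length(object@label) != 1) "label must be a single string" else TRUE
         })

m <- new("Milestone", label = "assembly", due = as.Date("2020-01-31"))

# A slot of the wrong type is rejected when the object is created
wrong_type <- tryCatch(new("Milestone", label = 1, due = Sys.Date()),
                       error = function(e) "rejected")

# The validity function is also run by new()
wrong_length <- tryCatch(new("Milestone", label = c("a", "b"), due = Sys.Date()),
                         error = function(e) "rejected")
```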

Is it S4 or not?
Ask if an object is S4

isS4(mydescriptor)

TRUE

The str() output of an S4 object starts with Formal class

str(mydescriptor)

Formal class 'GenomeDescription' [package "GenomeInfoDb"] with 7 slots
  ...
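
Both checks can be tried without any Bioconductor package installed. The sketch below uses a hypothetical class "Sample" and compares it with a data frame, which is an S3 object.

```r
library(methods)

# Hypothetical S4 class used only for this comparison
setClass("Sample", slots = c(id = "character"))
s4_obj <- new("Sample", id = "s1")
s3_obj <- data.frame(id = "s1")   # data frames are S3 objects

isS4(s4_obj)
isS4(s3_obj)

# The first line printed by str() identifies a formal (S4) class
first_line <- capture.output(str(s4_obj))[1]
first_line
```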

S4 class definition
A class describes a representation

name

slots (methods/fields)

contains (inheritance definition)

# Define class name with UpperCamelCase
MyEpicProject <- setClass("MyEpicProject",
                          # Define slots, helpful for validation
                          slots = c(ini = "Date",
                                    end = "Date",
                                    milestone = "character"),
                          # Define inheritance; the parent class MyProject
                          # must already be defined
                          contains = "MyProject")
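
The definition above only runs once the parent class exists. A self-contained sketch follows, assuming a parent class MyProject with a single name slot; that slot is invented for illustration, since the slides do not show the parent definition.

```r
library(methods)

# Hypothetical parent class; contains = "MyProject" assumes it exists
setClass("MyProject", slots = c(name = "character"))

MyEpicProject <- setClass("MyEpicProject",
                          slots = c(ini = "Date",
                                    end = "Date",
                                    milestone = "character"),
                          contains = "MyProject")

# setClass() returns a generator function that can be used instead of new()
epic <- MyEpicProject(name = "yeast",
                      ini = as.Date("2020-01-01"),
                      end = as.Date("2020-12-31"),
                      milestone = "first assembly")

is(epic, "MyProject")   # the child is also an instance of the parent
slotNames(epic)         # own slots plus the inherited name slot
```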

.S4methods(class = "GenomeDescription")

[1] commonName organism provider providerVersion releaseDate releaseName seqinfo
[8] seqnames show toString bsgenomeName

showMethods(classes = "GenomeDescription", where = search())

Object summary

show(mydescriptor)

| organism: ()
| provider:
| provider version:
| release date:
| release name:
| ---
| seqlengths:
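
How accessors and show() attach to a class can be sketched with a toy example. The class "Descriptor", its two slots, and the organism() generic defined here are hypothetical stand-ins, not the GenomeInfoDb implementations.

```r
library(methods)

# Hypothetical class with one accessor and a custom show() method
setClass("Descriptor", slots = c(organism = "character", provider = "character"))

# Accessor: a generic plus a method for the class
setGeneric("organism", function(object) standardGeneric("organism"))
setMethod("organism", "Descriptor", function(object) object@organism)

# show() controls what is printed for the object
setMethod("show", "Descriptor", function(object) {
  cat("| organism:", object@organism, "\n")
  cat("| provider:", object@provider, "\n")
})

d <- new("Descriptor", organism = "Saccharomyces cerevisiae", provider = "UCSC")
d                       # printing at the console calls show()
organism(d)
existsMethod("organism", "Descriptor")
```

Using an accessor such as organism(d) instead of d@organism keeps calling code independent of how the slots are laid out.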

Introducing biology of genomic datasets
Genome elements
Genetic information: DNA alphabet

A set of chromosomes (highly variable number)

Genes (carry heredity instructions)


coding and non-coding

Proteins (responsible for specific functions)


DNA-to-RNA (transcription)

RNA-to-protein (translation)
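
The two steps above can be sketched in base R. This is a toy illustration, not how Bioconductor does it: the function names are hypothetical, the input is assumed to be the coding strand, and the codon table is deliberately partial (four of the 64 codons). For real work, the Biostrings package provides sequence classes and a translate() function.

```r
# Transcription: the mRNA matches the coding strand with T replaced by U
transcribe <- function(dna) chartr("T", "U", toupper(dna))

# Deliberately partial codon table (the full genetic code has 64 codons)
codon_table <- c(AUG = "M", UUU = "F", GGC = "G", UAA = "*")

# Translation: read the RNA three letters at a time and look up each codon
translate_rna <- function(rna) {
  starts <- seq(1, nchar(rna) - 2, by = 3)
  codons <- substring(rna, starts, starts + 2)
  paste(codon_table[codons], collapse = "")
}

rna <- transcribe("ATGTTTGGCTAA")
rna
translate_rna(rna)
```

Codons missing from the partial table would come back as NA, so the sketch is only meaningful for the example sequence shown.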

Yeast
A single-celled microorganism

The fungus that people love ♥

Used for fermentation (beer, bread, kefir, kombucha), bioremediation, etc.

Name: Saccharomyces cerevisiae or S. cerevisiae

BSgenome annotation package

# Load the package and store the genome object in yeast
library(BSgenome.Scerevisiae.UCSC.sacCer3)
yeast <- BSgenome.Scerevisiae.UCSC.sacCer3
# Interested in other genomes?
available.genomes()

Using accessors

# Number of chromosomes
length(yeast)
# Chromosome names
names(yeast)
# Sequence lengths
seqlengths(yeast)
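
Because seqlengths() returns a named vector of lengths, ordinary vector functions apply to it. A minimal sketch follows; the helper longest_seq() and the toy lengths are invented for illustration, and the yeast part runs only if the annotation package is installed.

```r
# Return the name of the longest sequence in a named vector of lengths
longest_seq <- function(lens) names(lens)[which.max(lens)]

# Toy lengths, invented for illustration only
toy_lengths <- c(chrA = 100L, chrB = 250L, chrM = 40L)
longest_seq(toy_lengths)
sort(toy_lengths, decreasing = TRUE)

# With the yeast genome, if the annotation package is installed
if (requireNamespace("BSgenome.Scerevisiae.UCSC.sacCer3", quietly = TRUE)) {
  library(BSgenome.Scerevisiae.UCSC.sacCer3)
  yeast <- BSgenome.Scerevisiae.UCSC.sacCer3
  print(longest_seq(seqlengths(yeast)))
}
```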

Get sequences
S4 method for BSgenome

# S4 method getSeq() requires a BSgenome object
getSeq(yeast)
# Select chromosome sequence by name, one or many
getSeq(yeast, "chrM")
# Select start, end, and/or width
# end = 10 selects the first 10 base pairs of each chromosome
getSeq(yeast, end = 10)
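
The start, end, and width arguments can be combined to pick out a region. The sketch below assumes the sacCer3 annotation package is installed and is skipped otherwise; the coordinates 101 to 110 are arbitrary and the variable names are illustrative.

```r
if (requireNamespace("BSgenome.Scerevisiae.UCSC.sacCer3", quietly = TRUE)) {
  library(BSgenome.Scerevisiae.UCSC.sacCer3)
  yeast <- BSgenome.Scerevisiae.UCSC.sacCer3

  # start and end together: bases 101 to 110 of the mitochondrial chromosome
  by_range <- getSeq(yeast, "chrM", start = 101, end = 110)

  # start and width describe the same region
  by_width <- getSeq(yeast, "chrM", start = 101, width = 10)

  # Both calls should return the same 10 bases
  print(as.character(by_range) == as.character(by_width))
}
```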
