Current Topics in Bioinformatics

(BST520)


What is Bioinformatics?

Hogeweg and Hesper: the study of information processes in biotic systems, in contrast to biochemistry and biophysics


What is Bioinformatics?

Wikipedia: Bioinformatics now entails the creation and advancement of databases, algorithms, computational and statistical techniques and theory to solve formal and practical problems arising from the management and analysis of biological data.


Staying Current

Seminars & Conferences


• TIGR Meeting: Transcriptomics and Integrated Genomics Working Group
  Thurs 9:30-10:30 MRBX 1.11211
• BioC: Bioconductor Conference
  each summer at FHCRC in Seattle


Staying Current

Blogs & Social Media


• R-bloggers (http://www.r-bloggers.com)
• Genomics, Evolution, and Pseudoscience (http://genome.fieldofscience.com)
• Simply Statistics (twitter: @simplystats)


Staying Current
Core Bioinformatics journals:
• Bioinformatics
• Biostatistics
• BMC Bioinformatics
• BMC Systems Biology
• Briefings in Bioinformatics
• PLoS Computational Biology
• Statistical Applications in Genetics and Molecular Biology
• IEEE/ACM Transactions on Computational Biology and Bioinformatics


Staying Current
Journals that publish Bioinformatics research:
• Science
• PNAS
• Nature
• Nature Methods
• Nature Biotechnology
• Nucleic Acids Research
• Genome Research
• Genome Biology
• PLoS Biology


Staying Current

Scientific Literature
• PubMed
• Google Scholar
• RSS feeds


Introduction to R

History of R:
• based on the S programming language developed at Bell Labs
• developed by Ross Ihaka and Robert Gentleman
• Feb 29, 2000: version 1.0.0
• June 22, 2012: version 2.15.1

http://www.r-project.org/


Introduction to R

Advantages of R:
• open source programming language
• functions for most statistical and graphical techniques, e.g. glm()
• interface with C/C++/Perl/MySQL
• contributed add-on packages

http://www.r-project.org/
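
As a brief illustration of the built-in modelling functions, the sketch below fits a logistic regression with glm(). The choice of the mtcars data set (shipped with R) and of the variables am and wt is an assumption made for this example, not part of the course material.

```r
# Logistic regression: transmission type (am: 0 = automatic, 1 = manual)
# modelled as a function of car weight (wt), using the built-in mtcars data.
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Coefficient table, deviance, and related diagnostics.
print(summary(fit))

# Estimated coefficients on the log-odds scale.
print(coef(fit))

# Predicted probability of a manual transmission for a car with wt = 3
# (weight is recorded in units of 1000 lbs).
p_manual <- predict(fit, newdata = data.frame(wt = 3), type = "response")
print(p_manual)
```

Changing the family argument (for example to poisson or gaussian) fits other generalized linear models with the same call.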


Bioconductor

History of Bioconductor:
• collection of R packages for genomic data
• started in 2001
• updated twice per year
• April 2, 2012: version 2.10

http://www.bioconductor.org


Bioconductor

Advantages of Bioconductor:
• open source / open development
• stricter contributed package guidelines
• focus on biostatistics / bioinformatics
• genomic annotation / metadata
• focus on reproducible research

http://www.bioconductor.org


R Integrated Development Environments

Advantages:
• R script formatting
• syntax highlighting
• interactive session reproducibility

A few options:
• Emacs + ESS
• RStudio
• Eclipse


Reproducible Research

Level 0:
• no R script saved: all commands at the R prompt
• raw data: deleted once used
• only final table / figure saved


Reproducible Research

Level 1:
• R script: commented code; begins with loading raw data
• raw data: saved and annotated
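
A Level 1 script might look like the following minimal sketch. The file names, column names, and toy values are hypothetical; the first block only creates a stand-in raw data file so that the example runs on its own.

```r
# analysis.R: hypothetical Level 1 script (all names and values are illustrative)

# Stand-in for the saved, annotated raw data file.
# In a real project this file already exists and is never edited by hand.
raw_file <- file.path(tempdir(), "raw_expression.csv")
write.csv(
  data.frame(
    sample     = c("s1", "s2", "s3", "s4"),
    group      = c("control", "control", "treated", "treated"),
    expression = c(5.1, 4.9, 6.8, 7.2)
  ),
  raw_file,
  row.names = FALSE
)

# Step 1: the script begins by loading the raw data.
raw <- read.csv(raw_file, stringsAsFactors = FALSE)

# Step 2: every processing step is written as commented code,
# so nothing depends on commands typed at the prompt.
group_means <- tapply(raw$expression, raw$group, mean)

# Step 3: results are regenerated by the script rather than saved by hand.
out_file <- file.path(tempdir(), "group_means.csv")
write.csv(
  data.frame(group = names(group_means),
             mean_expression = as.vector(group_means)),
  out_file,
  row.names = FALSE
)

# Record the R version and loaded packages alongside the results.
print(sessionInfo())
```

Rerunning the script from a fresh R session should reproduce the output file from the raw data alone.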


Reproducible Research

Level 2:
• description of data and analysis, R script, and results: Sweave, knitr, etc.
• raw data: saved and annotated
• processed data: saved and annotated


Reproducible Research

Level 3:
• R/Bioconductor software package:
    - documented R functions
    - example data set(s)
    - package vignette(s)
    - versioning using svn
• R/Bioconductor data package:
    - documented raw and processed data
    - S4 class data structure and methods
    - advanced data format, e.g. MySQL database


Finding Help

Being self-sufficient:
• read the manual / vignette
• Google keywords
• search the R/Bioconductor mailing list archives
  http://tolstoy.newcastle.edu.au/R/
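
Much of this lookup can be done from the R prompt. The sketch below uses only functions from base R; the topic glm and the package grid are arbitrary choices for illustration.

```r
# Names of visible objects that contain "glm".
matches <- apropos("glm")
print(matches)

# Locate the manual page for glm (the same page that ?glm opens).
# Printing the returned object displays the page.
help_page <- help("glm")

# List the vignettes shipped with an installed package.
# Printing the returned object shows the listing.
vignettes <- vignette(package = "grid")
```

At an interactive prompt, ?glm and vignette() with no arguments are the usual shortcuts.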


Finding Help

Being reliant:
• email the R/Bioconductor mailing list
  http://www.r-project.org/posting-guide.html
• attend office hours
• email a classmate
• email a professor


Course Project

• analysis of a data set / treatment of a methodological issue
• formal proposal due Oct 15th
• discuss ideas with instructors early
• 50% of your grade
• possibility of publication


Homework

First homework assignment due Monday Sept 10th at 5pm.
Available at http://mnmccall.com/teaching/bst520
