R Tutorial
How to list files in a directory/folder?
> list.files("~/Documents/linuxDatPro")
 [1] "chrs_2L_snps.csv~"           "chrs_all_snps.csv~"          "dgrp"
 [4] "linux_data_manipulation.odt" "perl_scripts.odt"            "regexp.pdf"
 [7] "sed1line.txt"                "shell_script"                "story.txt"
[10] "story.txt~"                  "userlogin.sh~"               "while01.awk~"
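list.files() can also filter by name and return full paths. The sketch below is only an illustration: it builds a throwaway directory under tempdir(), and the directory name and file names are made up for the example.

```r
# Build a throwaway directory so the example does not depend on any real files.
demo_dir <- file.path(tempdir(), "listfiles_demo")   # hypothetical directory name
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("story.txt", "notes.txt", "regexp.pdf")))

# All entries in the directory, sorted alphabetically
all_files <- list.files(demo_dir)

# Only names ending in ".txt"; pattern is a regular expression
txt_files <- list.files(demo_dir, pattern = "\\.txt$")

# full.names = TRUE prepends the directory path to each name
txt_paths <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

print(all_files)
print(txt_files)
```

The full paths in txt_paths can be passed directly to functions such as read.csv() or file.exists().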
Data types
1. Numeric
2. Integer
3. Complex
4. Logical
5. Character
6. Vector
7. Matrix
8. List
9. Data frame
Numeric
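It is the default computational data type. It is used for:
Decimals
Integers (a whole number typed without a suffix is still stored as numeric)
To find out the type of an object, use:
class()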
> x <- 2.5
> class(x)
[1] "numeric"
> y <- 10
> class(y)
[1] "numeric"
> is.integer(y)
[1] FALSE
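To see why y is not an integer even though it holds a whole number, it can help to look at the storage mode with typeof(). This is a small sketch using the same x and y as above; the result variable names are illustrative only.

```r
x <- 2.5
y <- 10

# class() reports "numeric"; typeof() shows the underlying storage mode
type_x <- typeof(x)   # "double"
type_y <- typeof(y)   # "double", even though 10 is a whole number

# is.numeric() is TRUE for doubles and for true integers alike
num_double  <- is.numeric(y)     # TRUE
num_integer <- is.numeric(10L)   # TRUE (10L is an integer literal)

# Equal in value, but not identical, because the storage types differ
same_value  <- y == 10L            # TRUE
same_object <- identical(y, 10L)   # FALSE

print(c(type_x, type_y))
```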
Integer
In order to create an integer variable in R, we invoke:
as.integer()
> k <- as.integer(y)
> k
[1] 10
> class(k)
[1] "integer"
To coerce a decimal to an integer (the fractional part is dropped, not rounded), use:
as.integer()
To test whether a value is stored as an integer, use:
is.integer()
> j <- as.integer(x)
> j
[1] 2
> class(j)
[1] "integer"
> is.integer(j)
[1] TRUE
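A short sketch of how as.integer() behaves at the edges: it truncates toward zero rather than rounding, it can parse numeric strings, and an L suffix creates an integer directly. The variable names are illustrative only.

```r
# The L suffix creates an integer without calling as.integer()
n <- 10L
n_is_int <- is.integer(n)            # TRUE

# as.integer() truncates toward zero; it does not round
pos <- as.integer(2.9)               # 2
neg <- as.integer(-2.9)              # -2

# Round first if the nearest whole number is wanted
nearest <- as.integer(round(2.9))    # 3

# Strings that look like numbers are parsed
from_text <- as.integer("3")         # 3

print(c(pos, neg, nearest, from_text))
```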
Integers and logical values
Often, it is useful to perform arithmetic on logical values. Like the C language, TRUE has the value 1,
while FALSE has value 0.
> as.integer(TRUE)
[1] 1
> as.integer(FALSE)
[1] 0
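Because TRUE counts as 1 and FALSE as 0, sum() and mean() on a logical vector give a count and a proportion. A minimal sketch with made-up data:

```r
scores <- c(12, 7, 25, 3, 18)     # hypothetical data

above_ten <- scores > 10          # TRUE FALSE TRUE FALSE TRUE

# sum() counts the TRUE values; mean() gives their proportion
n_above     <- sum(above_ten)     # 3
share_above <- mean(above_ten)    # 0.6

print(n_above)
print(share_above)
```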
Complex
> z <- 1+2i
> z
[1] 1+2i
> class(z)
[1] "complex"
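The parts of a complex number can be pulled out with base R functions. The sketch below uses the same z as above; note that sqrt(-1) on a plain numeric returns NaN with a warning, so the value is converted to complex first.

```r
z <- 1 + 2i

real_part <- Re(z)      # 1
imag_part <- Im(z)      # 2
modulus   <- Mod(z)     # sqrt(1^2 + 2^2) = sqrt(5)
conjugate <- Conj(z)    # 1-2i

# Convert to complex before taking the square root of a negative number
root <- sqrt(as.complex(-1))   # 0+1i

print(c(real_part, imag_part, modulus))
print(conjugate)
print(root)
```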
Logical
> j <- 2
> m <- j > x
> m
[1] FALSE
> class(m)
[1] "logical"
Standard logical operations are "&" (and), "|" (or), and "!" (negation).
> a <- TRUE
> b <- FALSE
> a & b
[1] FALSE
> a | b
[1] TRUE
> !a
[1] FALSE
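These operators work element by element on vectors, and any() and all() reduce a logical vector to a single value. A minimal sketch with illustrative vectors u and v:

```r
u <- c(TRUE, TRUE, FALSE)
v <- c(TRUE, FALSE, FALSE)

both        <- u & v        # TRUE FALSE FALSE
either      <- u | v        # TRUE TRUE FALSE
neither     <- !(u | v)     # FALSE FALSE TRUE
exactly_one <- xor(u, v)    # FALSE TRUE FALSE

# Reduce a whole vector to one logical value
any_true <- any(u)          # TRUE
all_true <- all(u)          # FALSE

print(both)
print(either)
```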
Character: creation, concatenation
Create a character string with:
quotes
To concatenate, use:
paste()
> fname <- "Mahesh"
> lname <- "Vaishnav"
> fullName <- paste(fname, lname)
> fullName
[1] "Mahesh Vaishnav"
> class(fname)
[1] "character"
> class(lname)
[1] "character"
> class(fullName)
[1] "character"
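paste() inserts a single space by default. The sep and collapse arguments, and the paste0() shortcut, cover the other common cases. A short sketch reusing fname and lname:

```r
fname <- "Mahesh"
lname <- "Vaishnav"

# sep sets the separator placed between the arguments
last_first <- paste(lname, fname, sep = ", ")      # "Vaishnav, Mahesh"

# paste0() is paste() with no separator
joined <- paste0(fname, lname)                     # "MaheshVaishnav"

# paste() is vectorized: the scalar "chr" is recycled
labels <- paste0("chr", 1:3)                       # "chr1" "chr2" "chr3"

# collapse joins the elements of one vector into a single string
one_string <- paste(c("a", "b", "c"), collapse = "-")   # "a-b-c"

print(last_first)
print(labels)
```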
Character: sprintf()
However, it is often more convenient to build a readable string with the sprintf function, which uses
C-style format specifiers such as %s (string) and %d (integer).
> sprintf("%s has %d dollars.", "Joe", 500)
[1] "Joe has 500 dollars."
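A few more format specifiers, as a sketch. sprintf() is vectorized, so passing vectors returns one string per element; the names and amounts below are made up.

```r
# Fixed number of decimal places
two_dp <- sprintf("%.2f", pi)            # "3.14"

# Minimum field width, padded with spaces on the left
padded <- sprintf("%5d", 42L)            # "   42"

# A literal percent sign is written as %%
pct <- sprintf("%.1f%%", 12.5)           # "12.5%"

# Vectorized over its arguments
lines <- sprintf("%s has %d dollars.", c("Joe", "Ann"), c(500L, 750L))

print(lines)
```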
Character: substr()
To extract a substring, we apply the substr function. Here is an example showing how to extract the
substring between the second and twelfth positions in a string.
> substr("Joe has 500 dollars.", start=2, stop=12)
[1] "oe has 500 "
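Positions in substr() are 1-based and both ends are inclusive. The sketch below uses nchar() to find the string length and also shows the replacement form of substr(); the variable names are illustrative.

```r
s <- "Joe has 500 dollars."

n_chars    <- nchar(s)                  # 20
first_word <- substr(s, 1, 3)           # "Joe"
amount     <- substr(s, 9, 11)          # "500"
tail_part  <- substr(s, 13, nchar(s))   # "dollars."

# substr() can also be assigned to, overwriting characters in place
s2 <- s
substr(s2, 1, 3) <- "Ann"               # s2 is now "Ann has 500 dollars."

print(amount)
print(s2)
```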
Character: sub()
To replace the first occurrence of a pattern (here "500") with another string ("1000"), we apply the
sub function.
> sub("500", "1000", "Joe has 500 dollars.")
[1] "Joe has 1000 dollars."
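sub() changes only the first match; gsub() changes every match. The pattern is treated as a regular expression unless fixed = TRUE is given. A minimal sketch with a made-up string:

```r
s <- "500 here, 500 there"

first_only <- sub("500", "1000", s)     # "1000 here, 500 there"
every_one  <- gsub("500", "1000", s)    # "1000 here, 1000 there"

# "." is a regex wildcard, so it matches the first character...
as_regex <- sub(".", "!", "a.b")                  # "!.b"
# ...unless fixed = TRUE asks for a literal match
as_fixed <- sub(".", "!", "a.b", fixed = TRUE)    # "a!b"

print(first_only)
print(every_one)
```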
Membership: Testing and coercion
Membership relates to the class of an object. Asks: is.something
Coercion changes the class of an object. Says: as.something
See p31, Crawley, for Testing and Coercing functions.
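As a quick illustration of the is.something / as.something pairing, the sketch below tests a value, coerces it, and tests again. The variable names are illustrative only.

```r
value <- "42"

# Testing: which class does the object belong to?
was_character <- is.character(value)   # TRUE
was_numeric   <- is.numeric(value)     # FALSE

# Coercion: change the class
as_number   <- as.numeric(value)       # 42
now_numeric <- is.numeric(as_number)   # TRUE

# Coercion works in other directions too
back_to_text <- as.character(as_number)   # "42"
flag         <- as.logical("TRUE")        # TRUE

print(class(value))
print(class(as_number))
```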
Logical variable coerced to factors and numerics
Create a logical variable:
> lv <- c(T,F,T)
> lv
[1]  TRUE FALSE  TRUE
Assess its membership using is.logical() function.
> is.logical(lv)
[1] TRUE
It is not a factor, so does not have levels:
> levels(lv)
NULL
But we can coerce it to a two-level factor, because it takes only two distinct values, TRUE and FALSE:
> fv <- as.factor(lv)
> fv
[1] TRUE  FALSE TRUE 
Levels: FALSE TRUE
> is.factor(fv)
[1] TRUE
We can coerce it to be a numeric variable too:
TRUE evaluates to 1
FALSE evaluates to 0
> nv <- as.numeric(lv)
> nv
[1] 1 0 1
> is.numeric(nv)
[1] TRUE
Significance:
useful as a shortcut when creating new factors with reduced number of levels, as we do in
model simplification.
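One possible reading of this shortcut, sketched with a made-up three-level factor: a logical comparison collapses the levels to two, and the result can be coerced to a factor or to a 0/1 numeric. The treatment data are hypothetical.

```r
# Hypothetical factor with three levels
treatment <- factor(c("control", "low", "high", "control", "high"))
n_before <- nlevels(treatment)                  # 3

# The comparison yields a logical vector, coerced here to a two-level factor
treated <- as.factor(treatment != "control")
new_levels <- levels(treated)                   # "FALSE" "TRUE"

# Or to a 0/1 numeric indicator
treated_num <- as.numeric(treatment != "control")   # 0 1 1 0 1

print(new_levels)
print(treated_num)
```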
Factor levels coerced into numerics
> char <- c("a", "b", "c")
> char
[1] "a" "b" "c"
> class(char)
[1] "character"
> as.numeric(factor(char))
[1] 1 2 3
> as.numeric(char)
[1] NA NA NA
Warning message:
NAs introduced by coercion
However,
> as.numeric(c("a", "4", "c"))
[1] NA  4 NA
Warning message:
NAs introduced by coercion
Here:
the character "4" was coerced to the number 4, but "a" and "c" could not be coerced, so they became NA.
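A related pitfall, sketched below: as.numeric() on a factor returns the internal level codes, not the numbers the labels spell out. Going through as.character() first recovers the values. The factor f is made up for the example.

```r
f <- factor(c("10", "20", "5"))

# Levels are sorted as character strings, so "5" comes last
lev <- levels(f)                           # "10" "20" "5"

# Direct coercion returns the level codes
codes <- as.numeric(f)                     # 1 2 3

# Convert to character first to get the values the labels represent
values <- as.numeric(as.character(f))      # 10 20 5

# Equivalent alternative: convert the levels, then index by the factor
values2 <- as.numeric(levels(f))[f]        # 10 20 5

print(codes)
print(values)
```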