Advanced R: Visualization and Programming
Computational Economics Practice
Winter Term 2015/16
Stefan Feuerriegel
Today’s Lecture
Objectives
1 Visualizing data in R graphically as points, lines, contours or areas
2 Understanding the programming concepts of if-conditions and loops
3 Implementing simple functions in R
4 Measuring execution time
Outline
1 Visualization
2 Control Flow
3 Timing
4 Wrap-Up
1 Visualization
Point Plot
- Create simple point plots (also called scatter plots) via plot(...)
- Relies on vectors giving the x-axis and y-axis locations
- Various options can be added to change the appearance
d <- read.csv("persons.csv", header=TRUE, sep=",",
              stringsAsFactors=FALSE)
plot(d$height, d$age)
[Figure: scatter plot with d$height on the x-axis and d$age on the y-axis]
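Since persons.csv is not part of these notes, the following minimal sketch builds a small stand-in data frame to try out a few appearance options of plot(...). The names are taken from the annotated plot in this lecture, but all heights and ages are invented for illustration.

```r
# Hypothetical stand-in for persons.csv (all numbers are invented)
d <- data.frame(
  name   = c("Julia", "Robin", "Kevin", "Max", "Jerry"),
  height = c(165, 170, 178, 183, 190),
  age    = c(22, 26, 21, 22, 30),
  stringsAsFactors = FALSE
)

# pch selects the point symbol, col the color, cex the relative size
plot(d$height, d$age, pch=19, col="blue", cex=1.5)
```

The options pch, col and cex are standard graphical parameters; other values can be explored in the same way.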
Adding Title, Labels and Annotations
- The title is added through the additional parameter main
- Axis labels are set via xlab and ylab
- Annotations are placed next to points with text(...)
plot(d$height, d$age,
     main="Title",               # title for the plot
     xlab="Height", ylab="Age")  # labels for x and y axis
text(d$height, d$age, d$name)    # d$name are annotations
[Figure: scatter plot titled "Title" with axes Height and Age; the points are annotated with the names Jerry, Robin, Julia, Max and Kevin]
Line Plot
Generate a line plot using the additional option type="l"
x <- seq(0, 4, 0.01)
plot(x, x*x, type="l")
[Figure: line plot of x*x against x for x between 0 and 4]
Exercise: Plotting
x <- seq(-1, +1, 0.01)
[Figure: line plot for x between -1 and 1, without title or axis labels]
Question
- How would you reproduce the above plot?
- plot(x, kink(x), type="l", main="")
- plot(x, kink(x), type="l", lab="")
- plot(x, abs(x), type="l", ylab="", xlab="")
- Visit http://pingo.upb.de with code 1523
3D Plots
- Consider the function $f(x, y) = x^3 + 3y - y^3 - 3x$
f <- function(x, y) x^3+3*y-y^3-3*x
- Create axis ranges for plotting
x <- seq(-5, 5, 0.1)
y <- seq(-5, 5, 0.1)
- Function outer(x,y,f) evaluates f for all combinations of x and y
z <- outer(x, y, f)
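To see what outer(...) returns, it can help to run it on a deliberately tiny grid. The sketch below reuses f from above; the vectors x_small and y_small are hypothetical and chosen only to keep the result readable. Note that outer(...) calls f once with two long vectors, so f has to be vectorized, which holds here because it only uses arithmetic operators.

```r
f <- function(x, y) x^3 + 3*y - y^3 - 3*x   # function from the slide

x_small <- c(-1, 0, 1)   # hypothetical, deliberately tiny grids
y_small <- c(0, 2)

z_small <- outer(x_small, y_small, f)
dim(z_small)      # 3 rows (one per x value), 2 columns (one per y value)
z_small[3, 2]     # same value as f(1, 2)
```

Each entry z_small[i, j] corresponds to f(x_small[i], y_small[j]), which is the layout that persp(...), image(...) and contour(...) expect.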
3D Plots
Function persp(...) plots the surface defined by x, y and z in 3D
persp(x, y, z)
[Figure: 3D surface plot with axes x, y and z]
3D Plots
Turn on ticks on axes via ticktype="detailed"
persp(x, y, z, ticktype="detailed")
[Figure: 3D surface plot with tick marks; x and y range from -4 to 4, z from -200 to 200]
3D Plots
Parameters theta (left/right) and phi (up/down) control the viewing angle
persp(x, y, z, theta=20, phi=0)
persp(x, y, z, theta=20, phi=35)
[Figure: two 3D surface plots with axes x, y and z, one for each viewing angle]
Contour Plots
- A contour line is a curve along which the function has the same value
- image(...) plots a grid of pixels colored according to the z-value
- contour(..., add=TRUE) adds contour lines to an existing plot
image(x, y, z)              # Plot colors
contour(x, y, z, add=TRUE)  # Add contour lines
[Figure: image plot of z over x and y with labeled contour lines]
Contour Plots
f <- function(x, y) sqrt(x^2+y^2)
z <- outer(x, y, f)
image(x, y, z, asp=1) # set aspect ratio, i.e. same scale for x and y
contour(x, y, z, add=TRUE)
Question
- What would the above plot look like?
[Figure: three candidate plots labeled Answer A, Answer B and Answer C, each with axes x and y]
- Visit http://pingo.upb.de with code 1523
Plotting a Regression Plane
library(car) # for dataset Highway1
model <- lm(rate ~ len + slim, data=Highway1)
model
##
## Call:
## lm(formula = rate ~ len + slim, data = Highway1)
##
## Coefficients:
## (Intercept)          len         slim
##    16.61050     -0.09151     -0.20906
# Grid of 30 values across the observed range of each predictor
x1r <- range(Highway1$len)
x1seq <- seq(x1r[1], x1r[2], length.out=30)
x2r <- range(Highway1$slim)
x2seq <- seq(x2r[1], x2r[2], length.out=30)
# Predicted rate for every combination of len and slim
z <- outer(x1seq, x2seq,
           function(a, b) predict(model,
                                  newdata=data.frame(len=a, slim=b)))
Plotting a Regression Plane
res <- persp(x=x1seq, y=x2seq, z=z,
             theta=50, phi=-10)
dp <- trans3d(Highway1$len, Highway1$slim,
              Highway1$rate, pmat=res)
points(dp, pch=20, col="red")
[Figure: fitted regression plane over x1seq and x2seq with the observations drawn as red points]
2 Control Flow
Managing Code Execution
- Control flow specifies the order in which statements are executed
- The concepts covered so far execute R code only in a linear fashion
- Control flow constructs can choose which execution path to follow
Functions: Combine a sequence of statements into a self-contained task
Conditional expressions: Perform different computations according to a specific condition
Loops: Execute a sequence of statements which may be repeated more than once
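As a first impression of how the three constructs interact, here is a minimal sketch that combines a function, a conditional expression and a loop. The function name describe and the variable parity are hypothetical and only serve this illustration.

```r
# Function: wraps a task so that it can be reused
describe <- function(n) {
  # Conditional expression: choose one of two paths
  if (n %% 2 == 0) {
    return("even")
  } else {
    return("odd")
  }
}

# Loop: repeat the call for several values
parity <- character(4)
for (n in 1:4) {
  parity[n] <- describe(n)
}
parity
```

Each construct is introduced in detail on the following slides.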
Functions
- Functions avoid repeating the same code more than once
- Leave the current evaluation context to execute pre-defined commands
[Figure: diagram with the boxes "Main Program" and "Function"]
Functions
- Extend the set of built-in functions with the opportunity for customization
- Functions can consist of the following:
  1 Name to refer to (avoid existing function names in R)
  2 Function body, which is a sequence of statements
  3 Arguments, which define additional parameters passed to the function body
  4 Return value, which can be used after executing the function
- Simple example
f <- function(x,y) {
  return(2*x + y^2)
}
f(-3, 5)
## [1] 19
Functions
- General syntax
functionname <- function(argument1, argument2, ...) {
  function_body
  return(value)
}
- The return value is the last evaluated expression
  → Alternative: set it explicitly with return(...)
- Curly brackets can be omitted if the function contains only one statement (not recommended)
- Be cautious since the order of the arguments matters
- Values computed inside functions are not printed to the console
  → Remedy is print(...)
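The remark on argument order can be made concrete with a short sketch. The function power and its default value are hypothetical and not part of the lecture; the example only shows that arguments are matched by position unless they are named explicitly.

```r
# Hypothetical helper: raise a base to a power, with a default for the power
power <- function(base, exponent=2) {
  base^exponent
}

power(2, 3)                  # matched by position: 2^3
power(3, 2)                  # swapping the positions changes the result: 3^2
power(exponent=3, base=2)    # matched by name: the order no longer matters
power(5)                     # exponent falls back to its default of 2
```

Naming the arguments in the call is a simple way to guard against accidentally swapped positions.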
Examples of Functions
square <- function(x) x*x # last value is return value
square(10)
## [1] 100
cubic <- function(x) {
  # Print value to screen from inside the function
  print(c("Value: ", x, " Cubic: ", x*x*x))
  # no explicit return(...); print(...) returns its argument invisibly
}
cubic(10)
## [1] "Value: " "10" " Cubic: " "1000"
Examples of Functions
hello <- function() { # no arguments
  print("world")
}
hello()
## [1] "world"
my.mean <- function(x) {
  return(sum(x)/length(x))
}
my.mean(1:100)
## [1] 50.5
Scope in Functions
- Variables created inside a function only exist within it → local
- They are thus inaccessible from outside of the function
- Scope denotes where the name binding of a variable is valid
x <- "A"
g <- function(x) {
  x <- "B"
  return(x)
}
x <- "C"
- What are the values?
g(x) # Return value of function g
x # Value of x after function execution
- Solution
## [1] "B"
## [1] "C"
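The scope rules can also be checked directly. In the following minimal sketch, the names counter, increment and local_only are hypothetical; the point is that an assignment inside a function creates a local variable and leaves the global one untouched.

```r
counter <- 0                 # global variable (hypothetical name)

increment <- function() {
  counter <- counter + 1     # creates a local variable named counter
  local_only <- "inside"     # exists only while the function runs
  counter
}

increment()                  # value computed inside the function
counter                      # the global variable is unchanged
exists("local_only")         # the local name is unknown outside
```

To change a value outside the function, the usual approach is to return the result and assign it, for example counter <- increment().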
Unevaluated Expressions
- Expressions can store symbolic mathematical statements for later modifications (e.g. symbolic derivatives)
- Let's define an example via expression(...)
f <- expression(x^3+3*y-y^3-3*x)
f
## expression(x^3 + 3 * y - y^3 - 3 * x)
- To evaluate the expression for specific values of its variables, use eval(...)
x <- 2
y <- 3
eval(f)
## [1] -16
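The symbolic derivatives mentioned above can be sketched with the base function D(...), which differentiates an expression with respect to a named variable. The variable name dfdx is an assumption made for this example.

```r
f <- expression(x^3 + 3*y - y^3 - 3*x)   # expression from the slide

dfdx <- D(f, "x")    # symbolic partial derivative with respect to x
dfdx                 # an unevaluated call, not a number

x <- 2
y <- 3
eval(dfdx)           # derivative evaluated at x = 2
```

The result of D(...) is again unevaluated, so it can be passed to eval(...) in the same way as the original expression.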
If-Else Conditions
- Conditional execution requires a condition to be met
[Figure: flowchart in which a Condition leads to Block 1 if True and to Block 2 if False]
If-Else Conditions
- Keyword if with optional else clause
- General syntax:
if condition
if (condition) {
  statement1
}
If condition is true, then statement1 is executed
if-else condition
if (condition) {
  statement1
} else {
  statement2
}
If condition is true, then statement1 is executed, otherwise statement2
If-Else Conditions
- Example
grade <- 2
if (grade <= 4) {
  print("Passed")
} else {
  print("Failed")
}
## [1] "Passed"
grade <- 5
if (grade <= 4) {
  print("Passed")
} else {
  print("Failed")
}
## [1] "Failed"
- Condition must be of length 1 and evaluate as either TRUE or FALSE
if (c(TRUE, FALSE)) { # don't do this!
  print("something")
}
## Warning in if (c(TRUE, FALSE)) {: the condition has length
## > 1 and only the first element will be used
## [1] "something"
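When a logical vector has to be used as a condition, it can first be reduced to a single value with any(...) or all(...). The following is a minimal sketch with a hypothetical vector values. Note that, as of R 4.2.0, a condition of length greater than one raises an error instead of the warning shown above.

```r
values <- c(TRUE, FALSE)    # hypothetical logical vector

# any(...) is TRUE if at least one element is TRUE
if (any(values)) {
  print("at least one is TRUE")
}

# all(...) is TRUE only if every element is TRUE
if (all(values)) {
  print("all are TRUE")
} else {
  print("not all are TRUE")
}
```

Both functions always return a single logical value, which is what if expects.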
Else-If Clauses
- Multiple conditions can be checked with else if clauses
- The last else clause applies when no other conditions are fulfilled
- The same behavior can also be achieved with nested if-clauses
else-if clause
if (grade == 1) {
  print("very good")
} else if (grade == 2) {
  print("good")
} else {
  print("not a good grade")
}
Nested if-condition
if (grade == 1) {
  print("very good")
} else {
  if (grade == 2) {
    print("good")
  } else {
    print("not a good grade")
  }
}
If-Else Function
- As an alternative, one can also reach the same control flow via the function ifelse(...)
ifelse(condition, statement1, statement2)
# returns statement1 if condition is true,
# otherwise statement2
grade <- 2
ifelse(grade <= 4, "Passed", "Failed")
## [1] "Passed"
- ifelse(...) can also work with vectors as if it were applied to each element separately
grades <- c(1, 2, 3, 4, 5)
ifelse(grades <= 4, "Passed", "Failed")
## [1] "Passed" "Passed" "Passed" "Passed" "Failed"
- This allows for the efficient comparison of vectors
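More than two outcomes can be handled by nesting ifelse(...) calls, mirroring the else-if clauses for single values. This is a sketch only; the variable name verdict is hypothetical and the labels are those used in the else-if example.

```r
grades <- c(1, 2, 3, 4, 5)

# Nested ifelse(...): the inner result is only used where the outer condition is FALSE
verdict <- ifelse(grades == 1, "very good",
                  ifelse(grades == 2, "good", "not a good grade"))
verdict
```

For many categories, deeply nested calls become hard to read, so this pattern is best kept to a few levels.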
For Loop
- for loops execute statements for a fixed number of repetitions
[Figure: loop flowchart with a Condition node; the Conditional Code is executed if the condition is true and skipped if it is false]
For Loop
- General syntax
for (counter in looping_vector){
  # code to be executed for each element in the sequence
}
- In every iteration of the loop, one value in the looping vector is assigned to the counter variable that can be used in the statements of the body of the loop.
- Examples
for (i in 4:7) {
  print(i)
}
## [1] 4
## [1] 5
## [1] 6
## [1] 7
a <- c()
for (i in 1:3){
  a[i] <- sqrt(i)
}
a
## [1] 1.000000 1.414214 1.732051
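A common variation of the second example is to loop over the positions of an existing vector and to allocate the result beforehand. The sketch below assumes a hypothetical input vector words and uses seq_along(...) to generate the index values.

```r
words <- c("plot", "persp", "contour")   # hypothetical input vector

# Preallocate the result instead of growing it inside the loop
n_chars <- numeric(length(words))

# seq_along(words) yields 1, 2, 3 and is also safe for empty vectors,
# where 1:length(words) would wrongly yield c(1, 0)
for (i in seq_along(words)) {
  n_chars[i] <- nchar(words[i])
}
n_chars
```

Preallocation avoids resizing the vector in every iteration, which matters for long loops.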
While Loop
- Loop where the number of iterations is controlled by a condition
- The condition is checked before every iteration
- When the condition is met, the loop body in curly brackets is executed
- General syntax
while (condition) {
  # code to be executed
}
- Examples
z <- 1
# same behavior as for loop
while (z <= 4) {
  print(z)
  z <- z + 1
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
z <- 1
# iterates over odd numbers
while (z <= 5) {
  z <- z + 2
  print(z)
}
## [1] 3
## [1] 5
## [1] 7
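A while loop is most useful when the number of iterations is not known in advance. The following minimal sketch, with the hypothetical variables value and steps, halves a number until it drops below 1 and counts the iterations.

```r
value <- 100       # hypothetical starting value
steps <- 0

# Halve the value until it drops below 1; the number of
# iterations is not known before the loop starts
while (value >= 1) {
  value <- value / 2
  steps <- steps + 1
}
steps
value
```

The loop body must change something the condition depends on, here value, otherwise the loop never terminates.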
3 Timing
Measuring Timings via Stopwatch
- Efficiency is a major issue with larger datasets and complex code
- Timings can help in understanding scalability and bottlenecks
- Use a stopwatch approach measuring the duration between two proc.time() calls
start.time <- proc.time() # Start the clock
g <- rnorm(100000)
h <- rep(NA, 100000)
for (i in 1:100000) { # Loop over vector, always add +1
  h[i] <- g[i] + 1
}
# Stop clock and measure duration
duration <- proc.time() - start.time
Measuring Timings via Stopwatch
- Results of duration have the following format
##    user  system elapsed
##    0.71    0.02    0.72
- Timings are generally grouped into 3 categories
  - User time is the CPU time spent executing the R code itself
  - System time is the CPU time spent by the operating system on behalf of the R process
  - Elapsed time is the wall-clock time since starting the stopwatch (often close to user + system, but not necessarily equal)
- Alternative approach avoiding the loop
start.time <- proc.time() # Start clock
g <- rnorm(100000)
h <- g + 1
proc.time() - start.time # Stop clock
##    user  system elapsed
##    0.08    0.00    0.08
- Rule: vector operations are faster than loops
Measuring Timings of Function Calls
Function system.time(...) can directly time function calls
slowloop <- function(v){
  for (i in v) {
    tmp <- sqrt(i)
  }
}
system.time(slowloop(1:1000000))
##    user  system elapsed
##    2.06    0.05    2.13
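The loop and the vectorized approach can be compared side by side with system.time(...). The sketch below wraps the loop in a hypothetical helper add_one_loop and also checks that both approaches return the same values; the measured times depend on the machine and the R version, so no timings are shown here.

```r
set.seed(1)                 # make the random input reproducible
g <- rnorm(100000)

# Hypothetical helper: element-wise loop, as in the stopwatch example
add_one_loop <- function(v) {
  h <- numeric(length(v))
  for (i in seq_along(v)) {
    h[i] <- v[i] + 1
  }
  h
}

t_loop <- system.time(h_loop <- add_one_loop(g))
t_vec  <- system.time(h_vec  <- g + 1)

identical(h_loop, h_vec)    # both approaches compute the same values
t_loop["elapsed"]           # timings depend on machine and R version
t_vec["elapsed"]
```

Checking the results for equality before comparing timings guards against measuring two computations that are not actually equivalent.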
4 Wrap-Up
Fancy Diagrams with ggplot2
library(ggplot2)
df <- data.frame(Plant=c("Plant1", "Plant1", "Plant1", "Plant2", "Plant2", "Plant2"),
                 Type=c(1, 2, 3, 1, 2, 3),
                 Axis1=c(0.2, -0.4, 0.8, -0.2, -0.7, 0.1),
                 Axis2=c(0.5, 0.3, -0.1, -0.3, -0.1, -0.8))
ggplot(df, aes(x=Axis1, y=Axis2, shape=Plant,
               color=Type)) + geom_point(size=5)
[Figure: scatter plot of Axis2 against Axis1; point shape encodes Plant and point color encodes Type]
Summary: Visualization and Timing
plot()             Simple plot function
text()             Add text to an existing plot
outer()            Apply a function to all combinations of two arrays
persp()            Plot a surface in 3D
image()            Plot a colored image
contour()          Add contour lines to a plot
trans3d()          Project 3D coordinates onto an existing 3D plot
points()           Add points to a plot
proc.time()        Stopwatch for measuring execution time
system.time(expr)  Measures execution time of an expression
Summary: Programming
function(){}   Self-defined function
expression()   Store an expression without evaluating it
eval()         Evaluate an expression
if, else       Conditional statement
for(){}        Loops over a fixed vector
while          Loops while a condition is fulfilled
Outlook
Additional Material
- Further exercises as homework
- Advanced materials beyond our scope
  - Advanced R (CRC Press, 2014, by Wickham), http://adv-r.had.co.nz/
Future Exercises
R will be used to implement optimization algorithms