Name:
Student# :
The University of Western Ontario
Department of Statistical & Actuarial Sciences
Statistical Sciences 2864b, 2018
Final Exam
Instructor: H. Yu Friday 27, 2018 2:00 pm-5:00 pm
Instructions:
• Questions 1 to 4 are to be answered by all students.
• The mark allotment per question is given after each question number. Total mark is 80.
• Write your answers (code) and calculations in the space provided right after each question. The
amount of space given is deemed more than sufficient for writing the correct answer. However, should
you require additional space, you may use the back pages. Also, the last page is a blank page, which
you can detach and use as scrap paper.
• Calculators and formula sheets are not allowed.
1. [20] Answer each question briefly. No need to write a big paragraph.
(a) [4] Why is numerical optimization in R so important? Name two uses in statistical
modeling or estimation.
(b) [4] Please list at least two numerical optimization algorithms you have learned from this course.
Also explain what kind of functions are suitable to be optimized using these algorithms.
(c) [4] What are the similarities and differences between a data.frame object and a list object?
(d) [4] How do you pass results computed in one function to the next function? Do you think it is a good
idea to pass them as global values so that the next function can access them in its function body?
(e) [4] What is a Monte Carlo simulation? Why is it so important?
2. [20]
(a) [10] Please identify and correct the bugs/mistakes in the following code. The corrections can be
written inside the code or in the space provided below.
one.bt.cover <- function(x, R=10000, mu=0.5){
if (all(is.na(x))
stop ("x contains some missing value(s)")
#run bootstrap R times
out <- replicate(10000, mean(sample(x, replace=TRUE)))-mean(x)
#find 2.5% and 97.5% quantiles
bounds <- quantile(out, (0.025,0.975))
#check if mu is inside the coverage
cv <- mean(x)-bounds[2] <= mu && mean(x)-bounds[1] >= 0.5
#return TRUE if it is inside and FALSE otherwise
print(cv)
}
(b) [10] Using the inversion method, write an R function to generate a vector of iid random numbers
from the CDF
F(x) = (e^x − 1)/(e − 1), 0 ≤ x ≤ 1.
Your function should have one argument n (the size of return vector) and one error check on n.
No for, while, or repeat looping is allowed.
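One possible sketch of the inversion step, offered as an illustration rather than as the official solution: setting u = F(x) and solving for x gives x = log(1 + u(e − 1)), so applying this to uniform(0, 1) draws yields samples from F. The function name rexpcdf and the exact form of the check on n are assumptions, not part of the exam text.

```r
# Hypothetical name "rexpcdf"; the exam does not name this function.
rexpcdf <- function(n) {
  # Error check on n: it must be a single positive whole number.
  if (!is.numeric(n) || length(n) != 1 || is.na(n) || n < 1 || n != floor(n))
    stop("n must be a single positive integer")
  u <- runif(n)              # U ~ uniform(0, 1)
  log(1 + u * (exp(1) - 1))  # F^{-1}(u), fully vectorized (no loops)
}

set.seed(1)
x <- rexpcdf(1000)
range(x)  # all values should fall inside [0, 1]
```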
3. [20] In this question, you will implement a Monte Carlo simulation to estimate π. It is based on the
fact that the probability of a point (x, y) with independent uniform[0,1] coordinates landing inside the
quarter circle x^2 + y^2 = 1 within the square with corners (0,0), (0,1), (1,0), (1,1) is π/4.
(a) [5] Write a function called one.tossing with inputs x and y that returns 1 if x^2 + y^2 < 1 and 0
otherwise.
(b) [5] Write a function called many.tossing with input K = 10000 that runs a Monte Carlo simulation
based on the function one.tossing in (a), i.e., you need to run one.tossing K times, each time with
x and y replaced by independent random uniform[0,1] numbers. The return value should be a
vector of 1s (inside the quarter circle) and 0s (outside).
(c) [5] Based on the output of many.tossing, estimate π and give a 95% confidence interval.
(d) [5] The computations in (a) and (b) are not vectorized. Can you find a way to vectorize the
computations (to obtain the same output in (b))?
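A minimal sketch of how parts (a) to (d) might fit together, given as one possible approach rather than the expected answer. The name many.tossing.vec, the seed value, and the normal-approximation interval are assumptions introduced here for illustration. Note that the vectorized version draws its random numbers in a different order, so with the same seed it produces the same kind of output (a 0/1 vector of length K) but not necessarily identical values.

```r
# (a) Indicator for a single point (x, y) falling inside the quarter circle.
one.tossing <- function(x, y) {
  if (x^2 + y^2 < 1) 1 else 0
}

# (b) Run one.tossing K times, each with fresh uniform[0,1] draws.
many.tossing <- function(K = 10000) {
  replicate(K, one.tossing(runif(1), runif(1)))
}

# (d) Vectorized variant (hypothetical name): draw all points at once.
many.tossing.vec <- function(K = 10000) {
  x <- runif(K)
  y <- runif(K)
  as.numeric(x^2 + y^2 < 1)
}

# (c) Estimate pi as 4 times the hit proportion, with a
# normal-approximation 95% interval (an assumed choice of interval).
set.seed(2864)
hits <- many.tossing.vec()
p.hat <- mean(hits)
pi.hat <- 4 * p.hat
se <- 4 * sqrt(p.hat * (1 - p.hat) / length(hits))
ci <- pi.hat + c(-1, 1) * qnorm(0.975) * se
```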
4. [20] Implement a ttt function, where ttt means total time on test. It is used in reliability analysis to
test for an exponential distribution. Write a ttt function with one argument, x, a vector. Your ttt will
compute the ttt vector by the following steps:
(a) [3] You must have an error-checking step at the beginning. You need to check whether all data are
nonnegative. Reliability data must be nonnegative.
(b) [6] Sort the data x from small to large as a vector sorted.x. Then generate a vector w by
w[k] = (n + 1 − k)(sorted.x[k] − sorted.x[k − 1]), k = 1, ..., n, where n is the sample size. Be
careful with the computation at k = 1 since sorted.x[0] does not exist. Just replace it by 0.
(c) [6] Compute the total time on test vector ttt.x as ttt.x[k] = Σ_{i=1}^{k} w[i], k = 1, ..., n, and compute
the scaled total time on test vector as scaled.ttt.x = ttt.x/sum(x).
(d) [3] Add a plot with x coordinates as (1:n)/n and y as scaled.ttt.x. Since the points must lie in the
square with corners (0,0), (0,1), (1,0), (1,1), please add a 45 degree line.
(e) [2] Finally return the vector scaled.ttt.x.
Note: You can write one function ttt in which (a) to (e) are implemented. Please avoid using
for, while, or repeat loops whenever possible.
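Steps (a) to (e) could be sketched in a single loop-free function along the following lines. This is one possible reading of the question, not the official solution: the extra checks for numeric and non-missing input, the plot styling, and the axis labels are assumptions added for illustration.

```r
ttt <- function(x) {
  # (a) Error check: reliability data must be nonnegative.
  # The numeric and missing-value checks are extra assumptions.
  if (!is.numeric(x) || anyNA(x) || any(x < 0))
    stop("x must be numeric, non-missing, and nonnegative")
  n <- length(x)
  # (b) diff(c(0, sorted.x)) gives sorted.x[k] - sorted.x[k - 1],
  # with sorted.x[0] taken as 0; n:1 equals n + 1 - k for k = 1, ..., n.
  sorted.x <- sort(x)
  w <- (n:1) * diff(c(0, sorted.x))
  # (c) Cumulative sums give ttt.x; scale by the total sum(x).
  ttt.x <- cumsum(w)
  scaled.ttt.x <- ttt.x / sum(x)
  # (d) Plot on the unit square and add the 45 degree reference line.
  plot((1:n) / n, scaled.ttt.x, type = "b",
       xlim = c(0, 1), ylim = c(0, 1),
       xlab = "k/n", ylab = "scaled total time on test")
  abline(0, 1, lty = 2)
  # (e) Return the scaled vector.
  scaled.ttt.x
}
```

For example, ttt(c(3, 1, 2)) sorts the data to 1, 2, 3, forms w = 3, 2, 1, and accumulates to 3, 5, 6 before dividing by sum(x) = 6. The last element is always 1 because the weights w sum to sum(x).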