DEDAN KIMATHI UNIVERSITY OF TECHNOLOGY
University Examinations 2021/2022
First Year Special/Supplementary Examination for the Degree of Bachelor of Science in
Actuarial Science
SAS 1250: Fundamentals of Statistical Programming
Date: 29/11/2021    Time: 11 am - 1 pm
INSTRUCTIONS: Answer question ONE (COMPULSORY) and any other TWO questions.
QUESTION ONE [30 MARKS]
(a) Briefly explain the concept of statistical programming. [2 marks]
(b) Let the variable COLOURS contain the colours blue, red, yellow and green. Answer each of
the following parts as it would apply in R.
i. Create the character vector COLOURS given above. [1 mark]
ii. Give the expected output of the following R code:
paste("several ", COLOURS, "s", sep = "") [2 marks]
iii. Give the expected output of the following R code:
paste0("several ", COLOURS, "s") [2 marks]
iv. Give the expected output of the following R code:
paste("I like", COLOURS, collapse = ", ") [2 marks]
(c) Mr. John wishes to take out a loan today of KShs. 2.5 million at a monthly interest rate of
1.5%. The loan is to be repaid in 36 monthly instalments, beginning one month from now.
Write the R code that computes John’s expected monthly repayment amount. [4 marks]
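A minimal sketch of one possible approach to part (c), assuming level repayments made in arrears, so that the loan amount equals the present value of the instalments, P = R(1 - (1 + i)^(-n))/i. The variable names are illustrative and are not prescribed by the question.

```r
# Loan details taken from the question
principal <- 2.5e6   # KShs. 2.5 million borrowed today
rate      <- 0.015   # monthly interest rate (1.5%)
n         <- 36      # number of monthly instalments

# Assumption: equal instalments paid in arrears (first one a month from now),
# so principal = payment * (1 - (1 + rate)^(-n)) / rate
payment <- principal * rate / (1 - (1 + rate)^(-n))
payment
```

Discounting the 36 instalments back to today at 1.5% per month should recover the original loan amount, which gives a quick check on the formula.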
(d) Several high-level plots are used in R programming when dealing with statistical
graphics. Briefly describe any three of these plots. [6 marks]
(e) Explain the concept of relational operators as used in R programming, and give
corresponding examples. [6 marks]
(f) A good programmer should always observe good programming habits. Briefly discuss these
good practices. [5 marks]
QUESTION TWO [20 MARKS]
(a) Consider the following data frame consisting of four observations on the three variables X,
Y and Z;
X Y Z
61 13 4
175 21 8
111 24 14
124 23 18
Suppose that the data set is stored in a file called [Link] in the directory myfiles
on the C: drive, then;
i. Write the R code that reads the data frame into R for analysis. [2 marks]
ii. Give the R code that computes the column means, medians, variances and standard
deviations. [4 marks]
(b) Patterned vectors are commonly used in R. Study the R code below and give the expected
output.
x <- rep(c(1, 4), c(3, 2)); x
names(x) <- paste( "Day_", 1:length(x), sep = ""); x
x[x >= 2]
names(x)[x < 2]
[4 marks]
(c) Explain the concept of approximate storage of numbers as used in data storage in R. [3 marks]
(d) Briefly explain three packages that are used in programming statistical graphics. [3 marks]
(e) Consider the following data set representing the proportion of students at DeKUT pursuing
five different courses: Science = 15%, Business = 35%, Engineering = 20%, IT = 18% and
Nursing = 12%. Write the R code that enters these data into R and plots a pie chart with
the corresponding course names, distinguished by the colours
cols = {red, blue, green, grey, black}. [4 marks]
QUESTION THREE [20 MARKS]
(a) The use of arrays in R programming is very common. Consider the following R code:
MY_ARRAY <- array(1:24, c(2, 4, 3))
Give the expected output when it is executed in R, and hence give the R code that
extracts the second matrix of the array. [4 marks]
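A short sketch of how the extraction in part (a) could be written, assuming the array is stored under the name MY_ARRAY (the identifier is an assumption, since the name in the question is not a valid R name as printed). R fills arrays column by column, so the second 2 x 4 slice holds the values 9 to 16.

```r
# Build the 2 x 4 x 3 array from the question (filled column-major)
MY_ARRAY <- array(1:24, c(2, 4, 3))

# Leave the row and column indices empty to keep them all,
# and select the second slice along the third dimension
second_matrix <- MY_ARRAY[, , 2]
second_matrix
```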
(b) Consider the function defined by
$$f(x) = \begin{cases} 3x + 2, & x \le 3 \\ 2x - 0.5x^{2}, & x > 3 \end{cases}$$
i. Briefly explain the usage of the function curve() in R. [3 marks]
ii. Using the appropriate function, write the R code that plots the graph of f on the
intervals (−3, 3) and (3, 10). [2 marks]
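One possible way to approach part (b)(ii) is sketched below, assuming the piecewise function is coded with ifelse() and passed to curve() by name. The function name f and the axis labels are illustrative choices.

```r
# Piecewise function from part (b): 3x + 2 for x <= 3, 2x - 0.5x^2 for x > 3
f <- function(x) ifelse(x <= 3, 3 * x + 2, 2 * x - 0.5 * x^2)

# Plot each branch on its own interval
curve(f, from = -3, to = 3, xlab = "x", ylab = "f(x)")
curve(f, from = 3, to = 10, xlab = "x", ylab = "f(x)")
```

Using ifelse() keeps f vectorised, which curve() needs because it evaluates the function at a whole vector of x values at once.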
(c) Explain the following functions as used in R-programming.
i. > setwd("c:/mywork") [1 mark]
ii. > dump("usefuldata", "useful.R") [1 mark]
iii. > source("useful.R") [1 mark]
iv. > sink("[Link]") [1 mark]
v. > [Link]("[Link]") [1 mark]
(d) Statistical graphics are essential in data analysis, and the box plot is one of them. Explain
its usage, and hence sketch a box plot showing all its details. [6 marks]
QUESTION FOUR [20 MARKS]
(a) Why are command lines used in programming? [3 marks]
(b) Distinguish between lists and data frames as used in R. [2 marks]
(c) Distinguish between seq() and rep() as used in R and hence give an example in each case
with the expected output. [5 marks]
(d) Suppose that the Dean of Students at DeKUT has granted you a work-study position in his
office. He has given you an Excel file containing the following student details: Name, [Link],
Gender, Age and Course. You are required to produce a statistical summary of the students'
ages using R. Write the appropriate R code that reads the data file named [Link] from
drive D, with the first row containing headings, and hence determine the mean, variance and
standard deviation of the ages and plot their histogram. [5 marks]
(e) Discuss the concept of dates and times as used in R. [5 marks]
QUESTION FIVE [20 MARKS]
(a) Distinguish between a console and script as used in R-programming and hence state the four
main parts of a script. [5 marks]
(b) Consider the following sample: X = {4, 8, 2, 9, 8, 3, 3, 5, 7, 3}. Write the R code
that computes the sample variance and sample standard deviation of X without using the
inbuilt functions var() and sd(). [6 marks]
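A minimal sketch of one way to approach part (b), assuming the usual unbiased definition of the sample variance with divisor n - 1. The names n, x_bar, sample_var and sample_sd are illustrative.

```r
# Sample from the question
X <- c(4, 8, 2, 9, 8, 3, 3, 5, 7, 3)

n     <- length(X)     # sample size
x_bar <- sum(X) / n    # sample mean computed from first principles

# Sample variance: sum of squared deviations from the mean, divided by n - 1
sample_var <- sum((X - x_bar)^2) / (n - 1)

# Sample standard deviation: square root of the sample variance
sample_sd <- sqrt(sample_var)

sample_var
sample_sd
```

The results can be compared against var(X) and sd(X) afterwards as a check, even though the question rules those functions out of the answer itself.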
(c) Briefly discuss the concept of factors as used in data analysis and hence give an example
using an appropriate R-code. [4 marks]
(d) Consider two matrices X and Y, each of dimension 3×3. Explain what each of the following
R operations on these matrices does.
> solve(X)
> X * Y
> X %*% Y
> crossprod(X, Y)
> t(X) + Y
[5 marks]
END OF EXAM