
Balaji-Module - 1-Module - 1

This document provides an introduction to statistical data analysis using R packages. It defines different types of variables like continuous, categorical, count and discrete. It also defines concepts like population, sample, parameters, statistics and introduces basic R code for computing summary statistics like sum, mean, variance and standard deviation from sample data. The document includes examples of computing various summary statistics like mean, variance and standard deviation from given sample data.

Uploaded by

Ashutosh Yadav

Balaji

WORKSHOP ON
STATISTICAL DATA ANALYSIS USING R PACKAGE
Module-1
1) Introduction: Variables such as height, weight, temperature,
rainfall, income, BP level and waiting time are measurable
characteristics. They have a unit of measurement and are known as
continuous variables. In contrast, gender, educational
qualification, a customer's preference for a product, pass or
fail, etc. are categorical variables: they only classify
observations into groups. The number of students who passed a
test out of a class of 60, the number of passengers in a bus and
the number of corona cases per day are counts, and are discrete
variables.
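
As a rough illustration of the three kinds of variable described above, the sketch below stores each one in the R type that usually suits it. All values and the names height_cm, gender and passed are invented for this example and are not part of the workshop data.

```r
# Hypothetical values, invented only to illustrate the three kinds of variable.
height_cm <- c(152.4, 160.0, 171.5, 165.2)   # continuous: measured, has a unit
gender    <- factor(c("F", "M", "M", "F"))   # categorical: classification only
passed    <- c(42L, 55L, 38L)                # count: whole numbers (discrete)

class(height_cm)   # "numeric"
class(gender)      # "factor"
levels(gender)     # "F" "M"
class(passed)      # "integer"
```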
2) Population and sample: A population is the complete (usually
large) set of data of interest, whereas a sample is a
methodically selected part of that population.
3) Parameter and statistic: Statistical measures such as the AM
(arithmetic mean), variance, SD (standard deviation), correlation
coefficient, regression coefficient and probability are called
parameters when they are defined on population data. The same
measures defined on sample data are called statistics.
4) Population data on a variable X are represented as X1, X2, ..., XN.
Sample data are represented as X1, X2, ..., Xn, where n ≤ N.
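
Points 2) to 4) can be illustrated together with a small sketch. It assumes, purely for illustration, that ten values play the role of the whole population and that the sample is a simple random sample drawn with sample(); the sample size n = 4 and the seed are arbitrary choices, and simple random sampling is only one of several possible selection methods.

```r
set.seed(1)   # arbitrary seed so the draw can be repeated (assumption)

# Ten values treated here as the entire population (illustrative only).
population <- c(10, 12, 20, 21, 23, 25, 30, 25, 20, 30)
N <- length(population)   # population size

n   <- 4                                # hypothetical sample size, n <= N
smp <- sample(population, size = n)     # simple random sample, no replacement

mu   <- mean(population)   # parameter: AM computed on the population
xbar <- mean(smp)          # statistic: AM computed on the sample
```

The value of xbar changes from one sample to another, while mu stays fixed for a given population.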
5) Initial introduction to the R package: R is case sensitive; type
all commands in small (lowercase) letters.
i. Data: x <- c(X1, X2, ..., Xn)
ii. Ex: x <- c(1,2,3,4,5,6,7,8,9,10)
iii. Sum: s <- sum(x)
iv. AM: n <- length(x); m <- s/n
v. Deviation: d <- (x-m)
vi. Squared deviation: dd <- d*d
vii. Variance: v <- sum(dd)/(n-1)
viii. SD: sd <- sqrt(v)
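
The manual steps above can be cross-checked against R's built-in functions mean(), var() and sd(), which also use the divisor n-1 for the variance. The sketch below is one possible way to do this; the names v and s are chosen here for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

n <- length(x)                  # sample size
m <- sum(x) / n                 # AM, 5.5 for this data
v <- sum((x - m)^2) / (n - 1)   # sample variance: squared deviations are summed first
s <- sqrt(v)                    # SD

# Each comparison should report TRUE if the manual steps are right.
all.equal(m, mean(x))
all.equal(v, var(x))
all.equal(s, sd(x))
```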
6) Given the data X: 10, 12, 20, 21, 23, 25, 30, 25, 20, 30,
compute
i. Y = X^3, Z = X/(X-5), W = X^2 + 3X - 2
ii. AM, VAR, SD.
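
One possible way to set up this exercise in R is sketched below. It relies on R applying arithmetic element by element to a vector, and it assumes the sample (n-1) form of the variance, as in point 5.

```r
x <- c(10, 12, 20, 21, 23, 25, 30, 25, 20, 30)

# Element-wise transformations: each formula is applied to every value of x.
y <- x^3
z <- x / (x - 5)
w <- x^2 + 3 * x - 2

am <- mean(x)   # AM, 21.6 for this data
v  <- var(x)    # sample variance (divisor n-1)
s  <- sd(x)     # sample SD
```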
7) The following is a frequency distribution of some variable X:
Class: 10-20  20-30  30-40  40-50
Freq:  11     25     17     6
Compute AM = ∑FX/∑F, VAR = ∑F(X-AM)^2/∑F and
SD = sqrt(VAR), where X is the mid-point of each class and F is
its frequency.
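
A minimal sketch of the grouped-data formulas above, assuming each class is represented by its mid-point (15, 25, 35, 45). The names mid, f, am, v and s are chosen here for illustration. Note that these formulas divide by ∑F, not by n-1, so the result differs from what var() would return on ungrouped data.

```r
mid <- c(15, 25, 35, 45)   # class mid-points (assumed to represent each class)
f   <- c(11, 25, 17, 6)    # class frequencies

am <- sum(f * mid) / sum(f)            # AM  = sum(F*X) / sum(F)
v  <- sum(f * (mid - am)^2) / sum(f)   # VAR = sum(F*(X-AM)^2) / sum(F)
s  <- sqrt(v)                          # SD
```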
8) The following are the daily expenditures (in Rs) of 30 students
in a school.
Expenditure (in Rs) No. of students
10−15 3
15−20 5
20−25 10
25−30 6
30−35 4
35−40 2
Total 30
Compute the AM, VAR, SD and coefficient of variation
(CV = SD/AM × 100).
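
The expenditure table can be handled in the same way as any grouped frequency distribution. The sketch below assumes class mid-points represent each class, uses the divisor ∑F for the variance as in point 7, and takes CV as SD/AM expressed as a percentage; the variable names are illustrative.

```r
lower <- c(10, 15, 20, 25, 30, 35)   # lower class limits (Rs)
upper <- c(15, 20, 25, 30, 35, 40)   # upper class limits (Rs)
f     <- c(3, 5, 10, 6, 4, 2)        # number of students per class

mid <- (lower + upper) / 2           # class mid-points: 12.5, 17.5, ..., 37.5

am <- sum(f * mid) / sum(f)            # AM = 24
v  <- sum(f * (mid - am)^2) / sum(f)   # VAR = 45.25 (divisor sum(F))
s  <- sqrt(v)                          # SD
cv <- 100 * s / am                     # coefficient of variation, in percent
```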
