Healthcare Stats Final Exam 2
1. _________ _________ must relate to the data that will be collected by the researcher.
Research questions
2. Every study has ________, ________, and ________, all of which must be stated explicitly.
Assumptions, limitations, and delimitations
3. The ______ and ___________ are characteristics of the population and the sample, respectively.
Parameter (population), statistic (sample)
4. Statistics is a branch of science that deals with ________, ________, and _______ data pertaining to
research studies.
Collecting, organizing, interpreting
5. The descriptive statistics include _________, __________, and measures of _________ __________.
Tables, graphs, data summary
6. The four types of measurement scales for variables are:
Nominal, ordinal, interval, ratio
7. Sometimes, nominal variables could be _______ or __________
Categorical, qualitative
8. Which of the following is not true about frequency tables?
Shows the type of skewness for the distribution
9. The histogram is an appropriate graph for ________, __________, and __________ data.
Interval, ratio, ordinal
10. Which is not a measure of variability?
Mean
The polygon is appropriate for _________ or _________ variables
Interval, ratio
Suppose that the students at Texas State University have averaged 125 on an IQ test. The median of the
scores is 120. The data are considered _____________
Positively skewed
The pie chart, an alternative to the bar chart, is used for ____________ and _________ data
Nominal, ordinal
Which scale is used by a clinical psychologist to place patients in the categories of schizophrenic,
paranoid, and manic-depressive?
Nominal
A researcher transformed data into standardized scores (z scores) and obtained a mean z score of 1.438.
Does this constitute grounds for rechecking the calculations?
Yes
Ms. Sweetwater's statistics class had a standard deviation of 2.4 on a test, while Ms. Quincy's class had a
standard deviation of 1.2 on the same test. What can be said about these two classes?
Ms. Quincy's class is less heterogeneous than Ms. Sweetwater's
The relationship between a parameter and a statistic is the same as the relationship between a
population and a....
Sample
Which of the following is an example of a non-continuous measurement?
Number of brothers or sisters in a family
The statistical procedures that are used to summarize, organize, and simplify data are called
____________
Descriptive statistics
The statistical procedures that are used to generalize about the population are called
Inferential statistics
Ethnic background is an example of what type of data?
Nominal
Which one of the following variables is ordinal in nature?
The ranking of students in a graduating class
The ___________ is simply the number of times the event occurred divided by the total number of times
that it could have occurred
Marginal probability
The standard deviation is determined by ___________
All scores of the data
The range is determined by the
Lowest and highest score
A data analyst describes that his data are positively skewed. The researcher has found the mean of
n=100 scores to be 30 but has not calculated the median. If calculated, the median will be:
Less than 30
The __________ is considered robust to (that is, uninfluenced by) unusually high scores in the data
Median
The measures of central tendency include
Mean, median, and mode
The scores of a class in a test have a mean of 100 and a standard deviation of 15. About 95 percent of
the scores are expected to be between___________.
70 and 130
A health researcher has collected five scores (120, 140, 130, 150, 110) in a study of the number of
persons coming into the hospital's outpatient clinic on five randomly selected days. The sample standard
deviation is ____________.
15.81
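As a quick check of this calculation, the sketch below uses Python's standard `statistics` module. It assumes the sample formula (dividing by n - 1) is intended, since the five days are a random sample; the variable names are illustrative.

```python
import statistics

# Daily outpatient counts from the question.
scores = [120, 140, 130, 150, 110]

mean = statistics.mean(scores)                  # 130
sum_sq = sum((x - mean) ** 2 for x in scores)   # 1000

sample_sd = statistics.stdev(scores)            # sqrt(1000 / 4), divides by n - 1
population_sd = statistics.pstdev(scores)       # sqrt(1000 / 5), divides by n

print(round(sample_sd, 2), round(population_sd, 2))  # 15.81 14.14
```

Neither formula gives 15.71; the sample standard deviation rounds to 15.81.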
Given the scores (2, 0, 4, 2) in the data, what is Σ(x) + 1?
9
The variance of the following sample scores (2, 2, 2, and 2) is...
0
A random sample has n = 4 scores. The sum of squares Σ(x - x̄)^2 equals 12. The sample standard
deviation is ____________
2
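The three short calculations above can be checked with a few lines of Python. This sketch assumes Σ(x) + 1 means "sum the scores, then add one" and that the sample standard deviation divides the sum of squares by n - 1.

```python
import math
import statistics

# Sum of the scores, then add one.
scores = [2, 0, 4, 2]
total_plus_one = sum(scores) + 1                  # 8 + 1 = 9

# Identical scores have no spread, so the variance is zero.
constant_scores = [2, 2, 2, 2]
variance = statistics.variance(constant_scores)   # 0

# Sample standard deviation from a known sum of squares.
n, sum_of_squares = 4, 12
sample_sd = math.sqrt(sum_of_squares / (n - 1))   # sqrt(12 / 3) = 2.0

print(total_plus_one, variance, sample_sd)
```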
The proportion of right tail area under the standard normal curve beyond 2.75 is ________
0.003
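The tail area can be confirmed without a printed z table. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later):

```python
from statistics import NormalDist

z = 2.75

# Area to the right of z under the standard normal curve.
right_tail = 1 - NormalDist().cdf(z)

print(round(right_tail, 3))  # 0.003
```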
The critical boundaries for a two-tailed hypothesis test are z = ±1.96. The sample data yield a z score of
-1.90. What is the correct statistical decision?
Fail to reject the null hypothesis
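The decision rule can be sketched in code. The alpha level of 0.05 is an assumption inferred from the 1.96 boundary; the question does not state it.

```python
from statistics import NormalDist

alpha = 0.05      # assumed: a two-tailed boundary of 1.96 corresponds to alpha = 0.05
z_observed = -1.90

# Two-tailed critical value: alpha / 2 in each tail.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

# Reject only if the observed z falls beyond either boundary.
reject_null = abs(z_observed) > z_critical
decision = "reject H0" if reject_null else "fail to reject H0"

print(round(z_critical, 2), decision)
```

Because |-1.90| is smaller than 1.96, the observed z score falls outside the critical region.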
The first quartile of the length of time (in months) since being placed on the list for these 20 species is
_________
59.5
1. A smaller alpha level would mean ______________
Attempting to make it harder to reject the null hypothesis
2. Increasing the alpha level from 0.01 to 0.05 in hypothesis testing
- Increases the probability of type 1 error
- Increases the size of the critical (that is, rejection) region
- Increases the probability that the test statistic will fall in the critical region
3. In a hypothesis test, the critical region consists of ___________
Statistic values that are very unlikely to occur if the null hypothesis is true
4. In hypothesis testing, an extreme test statistic z score value, like z = 4.5, _____________
Is probably in the critical region
1. The type 1 error means that a researcher has
Falsely concluded that a treatment has an effect
You have been using the public newspaper to spread the word about healthy lifestyle issues as a media
campaign for the hospital. You want to determine if there is a significant association between the
public's awareness (those reading the newspaper on a regular basis and those not reading the
newspaper on a regular basis) and geographic location (north, south, east, and west). Which statistical
procedure is best to answer this question?
Chi-square test of independence
The calculated test statistic is 3.92. Is there a significant finding between the public's awareness of
healthy lifestyle issues in the newspaper and geographic location at an alpha level of 0.05?
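Fail to reject the null hypothesis and conclude that there is no significant association between public
awareness and geographic location (for a chi-square test with (2 - 1)(4 - 1) = 3 degrees of freedom,
3.92 is below the critical value of 7.815)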
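A hedged check of this decision, assuming the 3.92 is a chi-square statistic from a 2 x 4 contingency table (two awareness groups by four locations). The critical value 7.815 is the standard table entry for 3 degrees of freedom at alpha = 0.05, and the p-value uses a closed-form tail formula that holds only for 3 degrees of freedom.

```python
import math

rows, cols = 2, 4                   # awareness (2 levels) x location (4 levels)
df = (rows - 1) * (cols - 1)        # 3

chi_sq = 3.92
critical_value = 7.815              # chi-square table value for df = 3, alpha = 0.05

# Upper-tail probability of a chi-square variable; this closed form is valid only for df = 3.
p_value = (math.erfc(math.sqrt(chi_sq / 2))
           + math.sqrt(2 * chi_sq / math.pi) * math.exp(-chi_sq / 2))

significant = chi_sq > critical_value

print(df, round(p_value, 2), significant)
```

The statistic is below the critical value and the p-value is well above 0.05, so the null hypothesis of no association is not rejected.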
If the variances of the two groups being compared are significantly different, which independent sample
t test should be used?
The independent samples t test for separate (unequal) variances
The Mann-Whitney U Test provides a nonparametric alternative to the __________
Independent t test
With both one-way ANOVA and the Kruskal-Wallis H-test, the independent variable is at the _______
level.
Nominal
Assuming the data violates a parametric procedure, which is the best statistical test to determine if
there is a significant difference between scores collected from six groups of subjects?
Kruskal-Wallis
A significance level of .01 means that there is a ____ chance of making a type 1 error
1 out of 100
The one-way ANOVA test is a ________ while the Kruskal-Wallis H-Test is a ________ test
Parametric version, non-parametric version
The parametric tests are ________ while the non-parametric tests are _________
Powerful, approximate
The Mann Whitney U Test is a ____ while the ______ is a parametric version
Non-parametric version, t test
The non-parametric tests can be applied when the distribution of the variable ____________
Is not a normal type
The disadvantages of non-parametric tests include
Inability to handle multivariate data analysis
The chi-squared test requires that the data are ______
Nominal
The assumptions to be checked out for using chi-squared test are
A. Frequency data, adequate sample size, independent observations, and categorizations
When there are five rows and four columns in a table, the degrees of freedom in the chi-squared test are
________
12
To say that there is a statistically significant difference between the means of two dependent samples,
the computed absolute value of the paired t-statistic must be ______ the critical value
Greater than
The repeated measures t test can be used when the ___
Scores are measured from the same subjects
A smaller alpha level in a hypothesis test would mean that it is _________
Reducing the risk of type 1 error
To determine the sample size for collecting data to use t test, the researcher should determine
- Whether it is one or two tailed test
- What is the alpha level?
- What is the effect size?
1. The _________ is appropriate to analyze normally distributed data from three or more groups
A. Analysis of variance
The F score in an ANOVA is calculated using __________.
Between mean sum of squares over the within mean sum of squares
The mean sum of squares is the _________ in an ANOVA table
Sum of squares over the degrees of freedom
The ______ variation is partitioned into _____ variation and ________ variation
Total, within group, between groups
Suppose that the three groups have equal variance. This is
Homogeneity
The numerator degrees of freedom for F score in an ANOVA table is _________
The number of groups minus one
The denominator degrees of freedom for F score in an ANOVA table are the __________
Total degrees of freedom minus the between-groups degrees of freedom
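The ANOVA quantities in the preceding questions fit together as in the sketch below. The three groups of scores are hypothetical values chosen for easy arithmetic; they do not come from the exam.

```python
# Hypothetical scores for three groups (illustrative only).
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]

k = len(groups)                                   # number of groups
n_total = sum(len(g) for g in groups)             # total number of scores
grand_mean = sum(sum(g) for g in groups) / n_total

def mean(values):
    return sum(values) / len(values)

# Total variation is partitioned into between-groups and within-group variation.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)

df_between = k - 1                 # numerator degrees of freedom
df_within = n_total - k            # denominator degrees of freedom
df_total = n_total - 1

# Mean sum of squares = sum of squares over its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

f_score = ms_between / ms_within

print(df_between, df_within, round(f_score, 2))
```

Here the denominator degrees of freedom (6) equal the total degrees of freedom (8) minus the between-groups degrees of freedom (2).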
The _______ is a multiple comparison test and it is applied when we ______ the null hypothesis in an
ANOVA
Least significant difference test, reject
The assumptions underlying the t test are that __________
a. The independent variable is categorical to make groups
b. The observations are normally distributed
c. The variances of the two groups are equal
The correlation coefficient is used to understand the _______ between variables
Linear relationship
An assumption to be checked before using the Pearson correlation coefficient is that the data is
Normal
The _______ _______ captures the linear relationship between two _____ variables
Pearson correlation, continuous
The correlation coefficient of the values of a variable with themselves is ______
1.00
Another assumption to be checked before using the Pearson correlation coefficient is that the collected
sample data are ___ of the ______
Representative, population
A positive correlation coefficient between two variables means if the value of _______
One variable increases, the value of another variable increases
The Pearson correlation coefficient cannot be less than _______
a. -1.00
The fitted regression equation for the dependent variable in terms of the predictor variable is very
useful if the _______ is closer to the value of one
Adjusted R^2
The multiple correlation coefficient is always between ______ and _____
0, 1
A prerequisite before using the regression is that the _________ ________ between the predictor
variable and the dependent variable is significant
Correlation coefficient
The ________ coefficient is to describe a relationship between two variables after controlling the
influence of the third variable
Partial correlation
The correlation coefficient measures
a. Positive relationships and inverse relationships
When the data are normally distributed without any outliers, most of the scores ought to fall between
_____ and ______
-3, 3
The Pearson correlation coefficient cannot be more than
1.0
The _________ is used when both variables are correlated and are _______
Phi, dichotomous
The formula ______ is recognized as the standardized z score
Z = (x-mean)/sd
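A small sketch of the z score formula in use. The scores are the outpatient counts from earlier in these notes, and the sample standard deviation is assumed. It also shows why a mean z score far from zero is grounds for rechecking the calculations.

```python
import statistics

scores = [120, 140, 130, 150, 110]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)          # sample standard deviation (assumed)

# z = (x - mean) / sd for every score.
z_scores = [(x - mean) / sd for x in scores]

# Standardized scores always have mean 0 and standard deviation 1.
z_mean = statistics.mean(z_scores)
z_sd = statistics.stdev(z_scores)

print([round(z, 2) for z in z_scores])
```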
With a random sample of 62 participants, the Pearson correlation coefficient between the family income
and family saving is 0.548 with a p-value = 0.001 for two tailed testing. It means that_____
With 99.9% confidence level, family income and saving variables are significantly correlated
A negative correlation coefficient between two variables means if the value of one variable ____
Increases, the value of another variable decreases
About 68% of the area under the standard normal curve is between ______ and ___
-1, 1
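The familiar 68-95-99.7 areas can be reproduced with `statistics.NormalDist`. The helper name `area_between` is illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def area_between(low, high):
    """Area under the standard normal curve between two z values."""
    return std_normal.cdf(high) - std_normal.cdf(low)

within_1 = area_between(-1, 1)   # about 0.6827
within_2 = area_between(-2, 2)   # about 0.9545
within_3 = area_between(-3, 3)   # about 0.9973

print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```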
The e in the regression equation Y = b0 + b1 X + e refers to the influence of _______
Non collected variables
The regression equation Y = b0 + b1X1 + b2X2 + b3X3 + b4X4 + e is called ________ ________ with X1,
X2, X3, and X4 as predictors.
Multiple linear regression
In a regression model fitting for the patient's reaction time to the dose level of a medication, if the
patients are selected from three age groups, how many coding vectors are enough?
Two
In a regression model fitting for 50 patients' reaction time to the dose level of a medication, the coding
vector of zero for male and one for female is _____
Dummy
In a regression model fitting for 50 patients' reaction time to the dose level of a medication, if the
patients are selected from both gender groups, the necessary number of coding vectors is
One
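The coding-vector questions follow one rule: a categorical predictor with k groups needs k - 1 dummy (0/1) vectors, with the remaining group as the reference. A minimal sketch, where the function name, group labels, and data are all hypothetical:

```python
def dummy_code(values, reference):
    """Return one 0/1 coding vector per non-reference level."""
    levels = [level for level in sorted(set(values)) if level != reference]
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Three age groups need two coding vectors.
age_groups = ["young", "middle", "old", "young"]
age_vectors = dummy_code(age_groups, reference="young")

# Two gender groups need one coding vector (0 = male, 1 = female).
gender = ["male", "female", "female"]
gender_vectors = dummy_code(gender, reference="male")

print(len(age_vectors), gender_vectors)
```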
The name _____ refers to the variable X in the regression equation Y = b0 + b1 X + e
Predictor
The name _____ refers to the variable Y in the regression equation Y = b0 + b1 X + e
Dependent
The name regression coefficient refers to _______ in the regression equation Y = b0 + b1 X + e
b1
The name initial amount refers to _____ in the regression equation Y = b0 + b1 X + e
b0
In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of a
medication and age, if the predictor with the highest correlation is selected to enter first in the
regression equation, then it is called _______ _______
Forward solution
In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of a
medication and age, the coefficient of determination is denoted as __
R^2
In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of a
medication and age, each predictor is dropped from the regression if the coefficient of determination is
not reducing much. This method is called ________ ________
Backward solution
The stepwise solution combines ______ and _______ ________ in multiple linear regression model
building.
Forward and backward solution
In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of medication
and age, the data should be _______ distributed
Normally
In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of a
medication and age, each one of the patients must be statistically _______ _____ of others
Independent
The _______ is the relationship between two variables with the effect of a third variable removed from
only one of the variables being correlated
Semi partial correlation
Kendall's tau is considered a _______ correlation measure
Non-parametric
If the correlation coefficient is zero, then there is _________ _________ trend between the variables
No visible
The r^2 measures
The variance shared by the two variables
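Several of the correlation facts in these notes (r lies between -1 and 1, a variable correlates 1.00 with itself, and r^2 is the shared variance) can be illustrated with a hand-rolled Pearson coefficient. The paired data are hypothetical.

```python
import math

# Hypothetical paired observations (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def pearson_r(xs, ys):
    """Pearson correlation: co-deviation over the root of the product of squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(x, y)
r_squared = r ** 2            # proportion of variance shared by x and y
r_self = pearson_r(x, x)      # a variable correlated with itself

print(round(r, 3), round(r_squared, 2), round(r_self, 2))
```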