Healthcare Stats Final Exam 2

1. _________ _________ must relate to the data that will be collected by the researcher.

Research questions

2. Every study has ________, ________, and ________, all of which must be stated explicitly.

Assumptions, limitations, and delimitations

3. The ______ and ___________ are characteristics of the population and the sample, respectively.

Parameter (population), statistic (sample)

4. Statistics is a branch of science that deals with ________, ________, _______ data pertaining to
research studies.

Collecting, organizing, interpreting

5. The descriptive statistics include _________, __________, and measures of _________ __________.

Tables, graphs, data summary

6. The four types of measurement scales for variables are:

Nominal, ordinal, interval, ratio

7. Sometimes, nominal variables could be _______ or __________

Categorical, qualitative

8. Which of the following is not true about frequency tables?

Shows the type of skewness for the distribution

9. The histogram is an appropriate graph for ________, __________, and __________ data.

Interval, ratio, ordinal

10. Which is not a measure of variability?

Mean

The polygon is appropriate for _________ or _________ variables

Interval, ratio

Suppose that the students at Texas State University have averaged 125 on an IQ test. The median of the
scores is 120. The data are considered _____________

Positively skewed

The pie chart, an alternative to the bar chart, is used for ____________ and _________ data

Nominal, ordinal

Which scale is used by a clinical psychologist to place patients in the categories of schizophrenic,
paranoid, and manic-depressive?

Nominal

A researcher transformed data into standardized scores (z scores) and obtained a mean z score of 1.438.
Does this constitute grounds for rechecking the calculations?

Yes

Ms. Sweetwater's statistics class had a standard deviation of 2.4 on a test, while Ms. Quincy's class had a
standard deviation of 1.2 on the same test. What can be said about these two classes?

Ms. Quincy's class is less heterogeneous than Ms. Sweetwater's

The relationship between a parameter and a statistic is the same as the relationship between a
population and a....

Sample

Which of the following is an example of a non-continuous measurement?

Number of brothers or sisters in a family

The statistical procedures that are used to summarize, organize, and simplify data are called
____________

Descriptive statistics

The statistical procedures that are used to generalize about the population are called

Inferential statistics

Ethnic background is an example of what type of data?

Nominal

Which one of the following variables is ordinal in nature?

The ranking of students in a graduating class

The ___________ is simply the number of times the event occurred divided by the total number of times
that it could have occurred

Marginal probability

The standard deviation is determined by ___________

All scores of the data

The range is determined by the

Lowest and highest score

A data analyst describes that his data are positively skewed. The researcher has found the mean of
n=100 scores to be 30 but has not calculated the median. If calculated, the median will be:

Less than 30

The __________ is considered robust (that is, uninfluenced) by unusually high scores in the data

Median

The measures of central tendency include

Mean, median, and mode

The scores of a class in a test have a mean of 100 and a standard deviation of 15. About 95 percent of
the scores are expected to be between___________.

70 and 130
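
The interval above follows from the empirical rule: about 95 percent of normally distributed scores lie within two standard deviations of the mean. A minimal Python sketch of that arithmetic, assuming the scores are approximately normal (the variable names are illustrative only):

```python
from statistics import NormalDist

mean, sd = 100, 15

# Empirical rule: roughly 95% of scores fall within mean +/- 2 standard deviations.
lower, upper = mean - 2 * sd, mean + 2 * sd

# Share of a normal distribution that actually lies between those two bounds.
dist = NormalDist(mu=mean, sigma=sd)
coverage = dist.cdf(upper) - dist.cdf(lower)

print(lower, upper)          # bounds of the interval
print(round(coverage, 4))    # proportion of scores inside it
```

The two-standard-deviation bounds cover slightly more than 95 percent; the exact 95 percent bounds use 1.96 standard deviations.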

A health researcher has collected five scores (120, 140, 130, 150, 110) in a study of the number of
persons coming into the hospital's outpatient clinic on five randomly selected days. The sample standard
deviation is ____________.

15.81
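
The standard deviation of the five clinic counts can be checked directly. The sketch below assumes the five days are treated as a sample, so the sum of squared deviations (1000) is divided by n - 1 = 4; the population version (dividing by n) is shown only for comparison.

```python
from statistics import mean, pstdev, stdev

visits = [120, 140, 130, 150, 110]

center = mean(visits)            # 130
sample_sd = stdev(visits)        # sqrt(1000 / 4), the n - 1 denominator
population_sd = pstdev(visits)   # sqrt(1000 / 5), the n denominator

print(center)
print(round(sample_sd, 2))       # 15.81
print(round(population_sd, 2))   # 14.14
```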

Given the scores (2, 0, 4, 2) in the data, what is Σ(x) + 1?

9

The variance of the following sample scores (2, 2, 2, and 2) is...

0

A random sample has n = 4 scores. The sum of squares Σ(x - x̄)^2 equals 12. The sample standard
deviation is ____________

2
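
The three short calculations above can be worked through in a few lines. This is a sketch that reads Σ(x) + 1 as "sum the scores, then add 1 once" and assumes the sample standard deviation uses the n - 1 denominator.

```python
from math import sqrt
from statistics import variance

# Sum of the scores, plus one added once after summing.
scores = [2, 0, 4, 2]
sum_plus_one = sum(scores) + 1

# Identical scores have no spread, so the variance is zero.
constant_scores = [2, 2, 2, 2]
constant_variance = variance(constant_scores)

# Sample standard deviation from a known sum of squared deviations.
n, sum_of_squares = 4, 12
sample_sd = sqrt(sum_of_squares / (n - 1))

print(sum_plus_one)        # 9
print(constant_variance)   # 0
print(sample_sd)           # 2.0
```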

The proportion of right tail area under the standard normal curve beyond 2.75 is ________

0.003
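
The tail area can be read from a z table or computed from the standard normal cumulative distribution function. A small sketch using Python's standard library:

```python
from statistics import NormalDist

z = 2.75

# Right-tail area = 1 minus the cumulative area to the left of z.
right_tail = 1 - NormalDist().cdf(z)

print(round(right_tail, 3))   # 0.003
```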

The critical boundaries for a two-tailed hypothesis test are z = ±1.96. The sample data yield a z score of
-1.90. What is the correct statistical decision?

Fail to reject the null hypothesis
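
The decision rule compares the absolute value of the observed z score with the critical value. A minimal sketch, assuming a two-tailed test at an alpha level of 0.05 (the alpha level is inferred from the 1.96 boundary):

```python
from statistics import NormalDist

alpha = 0.05
z_observed = -1.90

# Two-tailed test: alpha is split evenly between the two tails.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Reject only when the observed z falls beyond either critical boundary.
reject_null = abs(z_observed) > z_critical

print(round(z_critical, 2))   # 1.96
print(reject_null)            # False, so fail to reject the null hypothesis
```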

The first quartile of the length of time (in months) since being placed on the list for these 20 species is
_________

59.5

1. A smaller alpha level would mean ______________

Making it harder to reject the null hypothesis

2. Increasing the alpha level from 0.01 to 0.05 in hypothesis testing

- Increases the probability of type 1 error

- Increases the size of the critical (that is, rejection) region

- Increases the probability that the test statistic will fall in the critical region

3. In a hypothesis test, the critical region consists of ___________

Statistic values that are very unlikely to occur if the null hypothesis is true

4. In hypothesis testing, an extreme test statistic z score value, like z = 4.5, _____________

Is probably in the critical region


1. The type 1 error means that a researcher has

Falsely concluded that a treatment has an effect

You have been using the public newspaper to spread the word about healthy lifestyle issues as a media
campaign for the hospital. You want to determine if there is a significant association between the
public's awareness (those reading the newspaper on a regular basis and those not reading the
newspaper on a regular basis) and geographic location (north, south, east, and west). Which statistical
procedure is best to answer this question?

Chi-square test of independence

The calculated test statistic is 3.92. Is there a significant finding between the public's awareness of
healthy lifestyle issues in the newspaper and geographic location at an alpha level of 0.05?

Fail to reject the null hypothesis and conclude that there is no significant association between public
awareness and geographic location (with 3 degrees of freedom, the critical chi-square value is 7.815)

If the variances of the two groups being compared are significantly different, which independent sample
t test should be used?

The independent samples t test for unequal (separate) variances

The Mann-Whitney U Test provides a nonparametric alternative to the __________

Independent t test

With both one-way ANOVA and the Kruskal-Wallis H-test, the independent variable is at the _______
level.

Nominal

Assuming the data violate the assumptions of a parametric procedure, which is the best statistical test to determine if
there is a significant difference between scores collected from six groups of subjects?

Kruskal-Wallis

A significance level of .01 means that there is ____ chance of making a type 1 error

1 out of 100

The one-way ANOVA test is a ________ while the Kruskal-Wallis H-Test is a ________ test

Parametric version, non-parametric version

The parametric tests are ________ while the non-parametric tests are _________

Powerful, approximate

The Mann Whitney U Test is a ____ while the ______ is a parametric version

Non-parametric version, t test

The non-parametric tests can be applied when the distribution of the variable ____________

Is not a normal type

The disadvantages of non-parametric tests include

Inability to handle multivariate data analysis


The chi-squared test requires that the data are ______

Nominal

The assumptions to be checked before using the chi-squared test are

Frequency data, adequate sample size, independent observations, and categorizations

When there are five rows and four columns in a table, the degrees of freedom in the chi-squared test are
________

12
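
The degrees of freedom for a chi-squared test on a contingency table are (rows - 1) × (columns - 1). A short sketch; the helper name `chi_square_df` is made up for illustration:

```python
def chi_square_df(rows: int, columns: int) -> int:
    """Degrees of freedom for a chi-squared test on a rows x columns table."""
    return (rows - 1) * (columns - 1)


df = chi_square_df(5, 4)
print(df)   # 12
```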

To say that there is a statistically significant difference between the means of two dependent samples,
the computed absolute value of the paired t-statistic must be ______ the critical value

Greater than

The repeated measures t test can be used when the ___

Scores are measured from the same subjects

A smaller alpha level in a hypothesis test would mean that it is _________

Reducing the risk of type 1 error


To determine the sample size for collecting data to use a t test, the researcher should determine

- Whether it is a one- or two-tailed test

- What is the alpha level?

- What is the effect size?

1. The _________ is appropriate to analyze normally distributed data from three or more groups

Analysis of variance

The F score in an ANOVA is calculated using __________.

Between mean sum of squares over the within mean sum of squares

The mean sum of squares is the _________ in an ANOVA table

Sum of squares over the degrees of freedom

The ______ variation is partitioned into _____ variation and ________ variation

Total, within group, between groups

Suppose that the three groups have equal variance. This is

Homogeneity

The numerator degrees of freedom for F score in an ANOVA table is _________

The number of groups minus one
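
The ANOVA quantities named in the preceding questions (sums of squares, mean squares, degrees of freedom, and the F score) can be traced through a small example. The three groups below are hypothetical scores chosen only to keep the arithmetic simple; they do not come from the exam.

```python
from statistics import mean

# Hypothetical scores for three groups (illustration only).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)

# Total variation is partitioned into between-group and within-group variation.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# Numerator df: number of groups minus one. Denominator df: N minus number of groups.
df_between = len(groups) - 1
df_within = len(all_scores) - len(groups)

# Mean sum of squares = sum of squares over its degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F score = between mean square over within mean square.
f_score = ms_between / ms_within

print(df_between, df_within)   # 2 6
print(round(f_score, 2))       # 13.0
```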


The denominator degrees of freedom for F score in an ANOVA table are the __________

Total degrees of freedom minus the between degrees of freedom

The _______ is a multiple comparison test and it is applied when we ______ the null hypothesis in an
ANOVA

Least significant difference test, reject

The assumptions underlying the t test are that __________

a. The independent variable is categorical to make groups

b. The observations are normally distributed

c. The variances of the two groups are equal

The correlation coefficient is used to understand the _______ between variables

Linear relationship

An assumption to be checked before using the Pearson correlation coefficient is that the data is

Normal

The _______ _______ captures the linear relationship between two _____ variables

Pearson correlation, continuous


The correlation coefficient of the values of a variable with themselves is ______

1.00

Another assumption to be checked before using the Pearson correlation coefficient is that the collected
sample data are ___ of the ______

Representative, population

A positive correlation coefficient between two variables means if the value of _______

One variable increases, the value of another variable increases

The Pearson correlation coefficient cannot be less than _______

-1.00

The fitted regression equation for the dependent variable in terms of the predictor variable is very
useful if the _______ is closer to the value of one

Adjusted R^2

The multiple correlation coefficient is always between ______ and _____

0, 1

A prerequisite before using the regression is that the _________ ________ between the predictor
variable and the dependent variable is significant

Correlation coefficient

The ________ coefficient is used to describe a relationship between two variables after controlling the
influence of the third variable

Partial correlation

Correlation coefficients measure

Positive relationships and inverse relationships

When the data are normally distributed without any outliers, most of the scores ought to fall between _____ and ______

-3, 3

The Pearson correlation coefficient cannot be more than

1.0

The _________ is used when both variables are correlated and are _______

Phi, dichotomous

The formula ______ is recognized as the standardized z score

Z = (x-mean)/sd
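
Applying the z score formula to every score in a data set always gives standardized scores with a mean of 0 and a standard deviation of 1. The sketch below uses hypothetical scores and, as an assumption, the population standard deviation for `sd`.

```python
from statistics import mean, pstdev

# Hypothetical raw scores (illustration only).
scores = [110, 120, 130, 140, 150]

m = mean(scores)
sd = pstdev(scores)   # assumption: population standard deviation

# z = (x - mean) / sd for each score.
z_scores = [(x - m) / sd for x in scores]

print([round(z, 2) for z in z_scores])
print(round(mean(z_scores), 10))     # mean of the z scores
print(round(pstdev(z_scores), 10))   # standard deviation of the z scores
```

A mean z score that differs noticeably from 0 therefore signals a calculation error.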

With a random sample of 62 participants, the Pearson correlation coefficient between the family income
and family saving is 0.548 with a p-value = 0.001 for two tailed testing. It means that_____

With 99.9% confidence level, family income and saving variables are significantly correlated

A negative correlation coefficient between two variables means if the value of one variable ____

Increases, the value of another variable decreases

About 68% of the area under the standard normal curve is between ______ and ___

-1, 1
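
The 68 percent figure can be confirmed from the standard normal cumulative distribution function, here with Python's standard library:

```python
from statistics import NormalDist

std_normal = NormalDist()   # mean 0, standard deviation 1

# Area between z = -1 and z = 1.
area = std_normal.cdf(1) - std_normal.cdf(-1)

print(round(area, 4))   # 0.6827
```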

The e in the regression equation Y = b0 + b1 X + e refers to the influence of _______

Non collected variables

The regression equation Y = b0 + b1X1 + b2X2 + b3X3 + b4X4 + e is called ________ ________ with X1,
X2, X3, and X4 as predictors.

Multiple linear regression

In a regression model fitting for the patient's reaction time to the dose level of a medication, if the
patients are selected from three age groups, how many coding vectors are enough?

Two
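
A categorical predictor with k groups needs k - 1 coding vectors, because the remaining group serves as the reference (all zeros). A minimal sketch of dummy coding; the age-group labels and the helper `dummy_code` are hypothetical and used only for illustration.

```python
def dummy_code(labels, reference):
    """Return one 0/1 coding vector for each category except the reference."""
    categories = sorted(set(labels) - {reference})
    return {c: [1 if label == c else 0 for label in labels] for c in categories}


# Hypothetical age groups for five patients.
age_groups = ["young", "middle", "older", "young", "older"]
vectors = dummy_code(age_groups, reference="young")

print(len(vectors))   # 2 coding vectors for 3 groups
print(vectors)
```

With two groups, such as gender, the same rule gives a single coding vector.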

In a regression model fitting for 50 patients' reaction time to the dose level of a medication, the coding
vector of zero for male and one for female is _____

Dummy

In a regression model fitting for 50 patients' reaction time to the dose level of a medication, if the
patients are selected from both gender groups, the necessary number of coding vectors is

One

The name _____ refers to the variable X in the regression equation Y = b0 + b1 X + e

Predictor

The name _____ refers to the variable Y in the regression equation Y = b0 + b1 X + e

Dependent

The name regression coefficient refers to _______ in the regression equation Y = b0 + b1 X + e

b1

The name initial amount refers to _____ in the regression equation Y = b0 + b1 X + e

b0
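
The roles of b0, b1, and e in Y = b0 + b1 X + e can be seen by fitting a line with ordinary least squares. The dose and reaction-time values below are hypothetical and were chosen so the fit is exact; they are not data from the exam.

```python
from statistics import mean

# Hypothetical data lying exactly on Y = 1 + 2X.
dose = [1, 2, 3, 4]            # predictor X
reaction_time = [3, 5, 7, 9]   # dependent variable Y

x_bar, y_bar = mean(dose), mean(reaction_time)

# b1 (regression coefficient): change in Y per one-unit change in X.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(dose, reaction_time)) / sum(
    (x - x_bar) ** 2 for x in dose
)

# b0 (initial amount): predicted Y when X is zero.
b0 = y_bar - b1 * x_bar

# e: what is left of Y after the fitted line is subtracted.
residuals = [y - (b0 + b1 * x) for x, y in zip(dose, reaction_time)]

print(b0, b1)      # 1.0 2.0
print(residuals)   # all zero here because the data are perfectly linear
```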

In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of a
medication and age, if the predictor with the highest correlation is selected to enter first in the
regression equation, then it is called ________ ________

Forward solution

In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of a
medication and age, the coefficient of determination is denoted as __

R^2

In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of a
medication and age, each predictor is dropped from the regression if the coefficient of determination is
not reducing much. This method is called ________ ________

Backward solution

The stepwise solution combines ______ and _______ ________ in multiple linear regression model
building.

Forward and backward solution

In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of medication
and age, the data should be _______ distributed

Normally

In a multiple linear regression model fitting for 50 patients' reaction time to the dose level of a
medication and age, each one of the patients must be statistically _______ _____ of others

Independent

The _______ is the relationship between two variables with the effect of a third variable removed from
only one of the variables being correlated

Semi partial correlation

Kendall's tau is considered a _______ correlation measure

Non-parametric

If the correlation coefficient is zero, then there is _________ _________ trend between the variables

No visible

The r^2 measures

The variance shared by the two variables
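
Several of the correlation answers above (a variable correlates 1.00 with itself, r lies between -1.00 and 1.00, and r^2 is the shared variance) can be checked with a hand-written Pearson coefficient. The income and saving figures are hypothetical, and `pearson_r` is an illustrative helper rather than a library function.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired observations (illustration only).
income = [20, 35, 50, 65, 80]
saving = [2, 3, 7, 8, 12]

r = pearson_r(income, saving)
shared_variance = r ** 2                 # proportion of variance the two variables share
r_self = pearson_r(income, income)       # a variable with itself
r_inverse = pearson_r(income, [-v for v in income])   # perfectly inverse relationship

print(round(r, 3), round(shared_variance, 3))
print(round(r_self, 2), round(r_inverse, 2))   # 1.0 -1.0
```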
