Project 2: Collecting and Analyzing Data
Name: Hamza Ali
All questions must be answered in complete sentences.
Grading: This project is worth a total of 75 points. This is split into 3 parts.
1. The project was completed with thoughtful responses and complete sentences (10 pts)
2. Excel file submitted with work shown (5 pt)
3. The questions were answered correctly based on the Statistics learned so far. Points are
distributed per problem.
Part 1 Average Income by State
The following is an excerpt of data collected by the financial website [Link] 1.
State          Average Individual Income in 2020
Alabama        $54,813.68
Alaska         $58,920.08
Arizona        $57,407.70
Arkansas       $55,897.80
California     $66,461.85
Colorado       $67,241.75
Connecticut    $72,776.72
Delaware       $59,517.97
Florida        $57,369.16
Georgia        $56,713.19
Hawaii         $57,290.63
Idaho          $57,290.63
Illinois       $66,554.70
Indiana        $55,290.18
Iowa           $51,783.06
1 [Link]
1. (3 pt) Is the Average Individual Income Categorical or Quantitative data? Why? Explain your
answer clearly.
The Average Individual Income is quantitative data. This is because it is a numerical
measurement: the average amount of money earned by an individual in a given state.
2. In the U.S., the average individual income in 2019 was $62,518.13 2.
a. (2 pt) How many of the states shown have an average individual income less than the
U.S. average?
What percent of all the data shown does this represent?
Eleven of the 15 states shown have an average individual income less than the U.S.
average (every state except California, Colorado, Connecticut, and Illinois). This
represents 73.33% of all the data shown (11/15 * 100% = 73.33%).
b. (3 pt) Out of the states shown whose average individual income is less than the U.S.
average, what percent of them are states whose name begins with the letter A?
Four of the 11 states with an average individual income less than the U.S. average
(Alabama, Alaska, Arizona, and Arkansas) have a name that begins with the letter A.
Therefore, 36.36% of the states whose average individual income is less than the U.S.
average begin with the letter A (4/11 * 100% = 36.36%).
c. (3 pt) Out of all the states whose name begins with the letter A, what percent of them
have an average individual income less than the U.S. average? (This is NOT the same
question as above!)
All four states shown whose name begins with the letter A have an average individual
income less than the U.S. average, so the answer is 100% (4/4 * 100% = 100%).
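The counts and percents asked for in question 2 can be checked against the table with a short script. This is only a minimal sketch in Python: the variable names are illustrative, and Colorado's entry is assumed to read $67,241.75.

```python
# Average individual income by state, copied from the table in Part 1.
# Assumption: Colorado's entry is read as $67,241.75.
incomes = {
    "Alabama": 54813.68,
    "Alaska": 58920.08,
    "Arizona": 57407.70,
    "Arkansas": 55897.80,
    "California": 66461.85,
    "Colorado": 67241.75,
    "Connecticut": 72776.72,
    "Delaware": 59517.97,
    "Florida": 57369.16,
    "Georgia": 56713.19,
    "Hawaii": 57290.63,
    "Idaho": 57290.63,
    "Illinois": 66554.70,
    "Indiana": 55290.18,
    "Iowa": 51783.06,
}
US_AVERAGE = 62518.13  # U.S. average given in question 2

# States below the U.S. average, states starting with "A", and states in both groups.
below = [state for state, income in incomes.items() if income < US_AVERAGE]
starts_with_a = [state for state in incomes if state.startswith("A")]
below_and_a = [state for state in below if state.startswith("A")]

# 2a: share of all states shown that are below the U.S. average.
pct_below = len(below) * 100 / len(incomes)
# 2b: among the states below the average, share whose name starts with "A".
pct_a_given_below = len(below_and_a) * 100 / len(below)
# 2c: among the states whose name starts with "A", share below the average.
pct_below_given_a = len(below_and_a) * 100 / len(starts_with_a)

print(f"2a: {len(below)} states, {pct_below:.2f}%")
print(f"2b: {pct_a_given_below:.2f}%")
print(f"2c: {pct_below_given_a:.2f}%")
```

Questions 2b and 2c use the same numerator (states that are both below the average and start with A) but different denominators, which is why they are not the same question.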
2 [Link]
Part 2 Yes or No?
For this part of the project, you will be conducting a survey on a yes or no question of your choosing. You
are going to collect information from 20 different people. For each person, you will ask gender and your
yes or no question. From there, you will perform some statistical analysis and answer questions on the
data.
Choose a yes or no question that is fun and interesting for you. Some examples:
● Are you a fan of the Arizona Diamondbacks?
● Do you like Star Trek: Voyager?
● Have you ever had surgery?
Question you will be asking (1 pt) - Do you prefer cats and dogs as pets?
For each person, ask two questions:
1. What is your gender? (male, female, or other/no response)
2. Your question? (yes or no)
Record the responses on a separate sheet of paper, and then enter it into Excel. In column A, number 1
– 20. In column B, record the gender of person 1, then the gender of person 2, etc. In column C, record
each person’s response – yes or no – to the question. (Do not put these results in this document.) You
will submit this Excel file with your project.
Answer the following questions with complete sentences. Review the definitions of population,
sample, and sample statistic.
1. (3 pt) What is the population that is being studied?
The population being studied is the general population of people who have an opinion on the topic of
pets.
2. (3 pt) What is the sample that is taken?
The sample taken is the 20 people who responded to the survey.
3. (4 pt) What sampling method was used to collect the data? Choose from the sampling methods
described in your text and explain why it is this type of sampling. (Hint: You probably did not
use Simple Random Sampling.)
The sampling method used is convenience sampling. I surveyed people who were easily accessible or
available to me, such as friends, family, or acquaintances, rather than selecting a random sample from
the entire population.
For the data you collected, fill out the frequency table. For this table only use the (yes or no) response
to the second question. Ignore gender for this portion of the project.
Your question here: Do you prefer cats and dogs as pets?
Response Frequency
Yes 12
No 8
(3 pt)
Now use Excel to make a pie chart of this data. Make sure the pie chart has a title, a legend, and that
each slice shows the percent of data in that slice. Copy your pie chart from Excel and paste below. You
will also be submitting the Excel file with your assignment, but you must have the pie chart pasted below
as well. (4 pt)
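As a cross-check on the percent labels the pie chart should display, the slice percents can be computed directly from the frequency table. This is a minimal sketch in Python with illustrative variable names; it does not replace the Excel chart.

```python
# Frequencies copied from the frequency table above.
frequencies = {"Yes": 12, "No": 8}

total = sum(frequencies.values())

# Each slice's percent is its frequency divided by the total number of responses.
slice_percents = {
    response: count * 100 / total for response, count in frequencies.items()
}

for response, percent in slice_percents.items():
    print(f"{response}: {percent:.0f}%")
```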
For the data you collected, fill out the two-way table. You will use both the gender response to the first
question and the yes/no response to the second question.
Your question here: Do you prefer cats and dogs as pets?
Yes No
Male 6 3
Female 5 5
Other/NA 1 0
(3 pt)
Based on the two-way table, answer these questions with complete sentences. Type out your
calculations as well (for example: 9/15 *100% = 60%), so that your work can be checked for partial credit if
your answer is incorrect.
1. (2 pt) What percent of all responses were yes?
60% of all responses were yes (12/20 * 100% = 60%).
2. (4 pt) If you repeated this experiment with 20 different people, would you get the same percent
of yes responses? Why or why not?
It is unlikely that I would get exactly the same percent of yes responses if I repeated the experiment
with 20 different people. A sample statistic varies from sample to sample, and with only 20 people a
single different answer changes the result by 5 percentage points. In addition, the sample was not
selected using a random sampling method, so it may not be representative of the entire population. The
sample may have a bias towards people who are more likely to prefer cats and dogs as pets, or towards
people who are more likely to respond to the survey.
3. (3 pt) What percent of all male responses were yes? (out of the male responses, what percent
were yes?)
66.67% of all male responses were yes (6/9 * 100% = 66.67%).
4. (3 pt) Suppose that your entire population had 10,000 males. Using the percent you found
above, how many of these 10,000 males would say yes to your question?
If the entire population had 10,000 males, about 6,667 of these males would say yes to the question
(6/9 * 10,000 = 6,667, rounded to the nearest whole person).
5. (3 pt) What percent of all yes responses were female? (out of the yes responses, what percent
were female?)
41.67% of all yes responses were female (5/12 * 100% = 41.67%).
6. (3 pt) Suppose that your population had 15,000 people who said yes to your question. Based on
the statistic that you collected above, what number of those people would be female?
Out of a population of 15,000 people who said yes, we would expect 6,250 of them to be female
(5/12 * 15,000 = 6,250).
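The calculations for questions 1 and 3 through 6 all come from the same two-way table, so they can be checked together. The following is a minimal sketch in Python; the nested dictionary layout and the variable names are illustrative choices, not part of the assignment.

```python
# Two-way table copied from above: gender (rows) by response (columns).
table = {
    "Male": {"Yes": 6, "No": 3},
    "Female": {"Yes": 5, "No": 5},
    "Other/NA": {"Yes": 1, "No": 0},
}

total = sum(sum(row.values()) for row in table.values())
total_yes = sum(row["Yes"] for row in table.values())
male_total = sum(table["Male"].values())

# Question 1: percent of all responses that were yes.
pct_yes = total_yes * 100 / total

# Question 3: out of the male responses, percent that were yes.
pct_yes_given_male = table["Male"]["Yes"] * 100 / male_total

# Question 4: apply the male yes proportion to 10,000 males.
projected_male_yes = round(10_000 * table["Male"]["Yes"] / male_total)

# Question 5: out of the yes responses, percent that were female.
pct_female_given_yes = table["Female"]["Yes"] * 100 / total_yes

# Question 6: apply the female share of yes responses to 15,000 yes responders.
projected_female_yes = round(15_000 * table["Female"]["Yes"] / total_yes)

print(f"Q1: {pct_yes:.2f}%")
print(f"Q3: {pct_yes_given_male:.2f}%")
print(f"Q4: {projected_male_yes}")
print(f"Q5: {pct_female_given_yes:.2f}%")
print(f"Q6: {projected_female_yes}")
```

Questions 3 and 5 divide by different totals: question 3 divides by the row total for males (9), while question 5 divides by the column total for yes responses (12).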
Part 3
Suppose that in part 2, I asked you to use cluster sampling with any sample size of 20 or greater.
Describe how you would get the survey information using cluster sampling. (Give the full details of your
data collection –where you would go, which cluster you would use, etc.) You do NOT need to actually
collect data this way, merely describe how you would. (10 pt)
I would first define the population of interest for the survey. For example, if I am conducting a survey
on the opinions of residents in a particular city about a new city policy, my population would be all
the residents of the city. I would divide the population into clusters, such as neighborhoods, randomly
select clusters from the population of interest, determine the sample size for each cluster, survey
within the selected clusters until I have at least 20 responses, and then analyze the results to draw
conclusions about the population.
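The selection step can be sketched in code. This is a minimal illustration in Python of one-stage cluster sampling, where whole clusters are chosen at random and everyone in the chosen clusters is surveyed. The neighborhoods, the resident labels, the cluster sizes, and the function name cluster_sample are all hypothetical and exist only for this example.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and return everyone in them.

    clusters: dict mapping a cluster name to the list of people in it.
    n_clusters: number of clusters to select at random.
    rng: a random.Random instance, passed in so the draw can be repeated.
    """
    chosen = rng.sample(sorted(clusters), k=n_clusters)
    people = [person for name in chosen for person in clusters[name]]
    return chosen, people


# Hypothetical population: 8 neighborhoods with 12 residents each.
clusters = {
    f"Neighborhood {i}": [f"resident {i}-{j}" for j in range(1, 13)]
    for i in range(1, 9)
}

rng = random.Random(2024)  # fixed seed so the example is reproducible
chosen, people = cluster_sample(clusters, n_clusters=2, rng=rng)

print("Selected clusters:", chosen)
print("People to survey:", len(people))
```

With 12 residents per hypothetical neighborhood, selecting 2 clusters gives 24 people, which meets the requirement of a sample size of 20 or greater. The randomness is in which clusters are picked, not in which individuals are picked within a cluster.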