Description: For this project, you will collect and compare quantitative data from two populations. You will use statistical methods to determine if there is a difference between the means of the two populations. You will present your results and conclusions for this first part of the project in an essay format that is at least 1000 words (3 pages, double spaced) in length. Do NOT include definitions or state how to calculate anything in your project. State your findings and conclusions in your own words.
Discussion of Topic (Total: 6pts)
• Clearly describe the quantitative variable and two populations studied: What is your project about? What are you studying, and what populations did you select?
• Discuss why the selected topic was chosen: What piqued your interest about this topic? Is it related to your major in some way, or just something you thought was interesting? Discuss any preconceived notions of how the project might turn out.
• Discuss expectations: What do you think will be the outcome of this experiment?
Discussion of Data Collection (Total: 4pts)
• Clearly describe how data was collected: Who did you ask? How did you decide who to ask? How did you record data?
• Decide if the collection method meets the requirements of a random sample: You likely did not meet the requirements of a true random sample. How did your sample either meet the requirements or not meet the requirements of a random sample?
• Discuss how to correct sampling issues: Generally, students don’t have random samples because of a lack of resources. You don’t have the time or ability to truly randomly choose 50 subjects from each population. What are some ways that you could get a true random sample?
Discussion of Histograms (Total 6pts)
• Discuss the shape of both distributions based on the histograms: Is the distribution right skewed, left skewed, or symmetric? How can you tell? Is the distribution multimodal, bimodal, or unimodal? How can you tell?
• Determine visually if there are any outliers present in the histograms and explain how outliers in a histogram are identified: According to your histogram, are there any outliers in the distribution? In a histogram, how do you identify outliers? Do not use the formula for outliers, and do not look at your boxplots.
Discussion of Outliers in Boxplot (Total 4pts)
• Correctly and clearly discuss any outliers present in the boxplots and explain how outliers in a boxplot are identified: This answer may not match your answer to the question regarding outliers in your histogram, and that is fine. Do not go back and change your answer.
Discussion of Visual Sample Comparison (Total 8pts)
• Correctly and clearly discuss if the centers of the two samples are different from each other using the histograms only: Do not use your boxplots or summary statistics for this part. Using your histograms alone, make a rough estimate about where the “center” for each distribution is. Then compare them. Are they basically the same or is one significantly bigger than the other?
• Correctly and clearly discuss if the centers of the two samples are different from each other using the boxplots only: For each graph, what are the “centers” that represent the typical values of your data? Are they basically the same or is one of them significantly bigger than the other?
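For checking your own boxplot reading (not for inclusion in the essay), here is a minimal Python sketch of the common 1.5 × IQR fence convention that most graphing software uses to flag boxplot outliers. This rubric does not state which convention your software follows, so treat the rule itself as an assumption. The function name `boxplot_outliers` and the sample values are hypothetical, and quartile values can differ slightly between tools.

```python
import statistics


def boxplot_outliers(values):
    """Return (lower_fence, upper_fence, outliers) under the 1.5 * IQR convention.

    Assumption: quartiles are computed with the "inclusive" method of
    statistics.quantiles; other software may use a slightly different rule.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Any point beyond either fence is drawn as an individual outlier mark.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers


# Hypothetical sample, made up for illustration only.
sample = [2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 25]
lower, upper, flagged = boxplot_outliers(sample)
print(f"fences: ({lower}, {upper}), outliers: {flagged}")
```

A value that looks isolated in a histogram may still fall inside these fences, which is one reason your histogram and boxplot answers can differ.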
Discussion of Measures of Center and Spread (Total 4pts)
• Describe the two samples using the two measures of center: Talk about what methods you used to identify the “center” of your histograms and boxplots. Did you identify the mean or median when reporting on the measure of center for your histograms? What about for your boxplots? Compare the mean and median for each sample.
• Describe the two samples using the two measures of spread: Compare the standard deviation and IQR for each sample.
Discussion of Appropriate Measures to Use (Total 4pts)
• Determine which measure of center and spread would be appropriate to use when comparing the two samples based on the shapes of the distributions: Use what you know about symmetry vs skewness to decide which is better for your data sets: the mean and the standard deviation, OR the median and the IQR. Decide for each data set individually. However, if one of your data sets uses median and IQR, but the other uses mean and standard deviation, you will move forward with your comparison using the median and IQR for BOTH DATA SETS. If this is the case, please state this in your paper.
• Correctly support reasoning for the decision: Why did you choose what you chose for each data set? Comment on the shapes of the data and which measures of center and spread are appropriate.
Discussion of Samples Using Chosen Center and Spread Measurement (Total 6pts)
• Compare the samples using the chosen measure of center and spread: First, think about the measure of center you chose for comparison. Which group has a higher measure of center? Now, think about the measure of spread that you chose to compare. Which group has a greater measure of spread?
• Discuss what the results suggest about the two underlying populations: What do your results in the previous question imply about the two underlying populations?
Discussion of Expectations and Limitations (Total 4pts)
• Clearly discuss if the analysis of the samples matched initial expectations: What did you think the results would be in the beginning, before collecting your data? Were you right or wrong? What are your main conclusions based on the initial data analysis?
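As a study aid for the "Appropriate Measures to Use" section (again, not something to put in the essay), the rule for choosing one pair of measures for both samples can be sketched in Python. It assumes the usual convention that roughly symmetric data is summarized by the mean and standard deviation and skewed data by the median and IQR. The shape labels are your own visual judgment from the histograms, and the function names and data below are hypothetical.

```python
import statistics


def summarize(values):
    """Compute both measures of center and both measures of spread for one sample."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": median,
        "sd": statistics.stdev(values),  # sample standard deviation
        "iqr": q3 - q1,
    }


def choose_measures(shape_a, shape_b):
    """Pick one (center, spread) pair to use for BOTH samples.

    Each shape is "symmetric" or "skewed", judged visually from the histogram.
    Mean and SD are used only when both samples look symmetric; otherwise
    the median and IQR are used for both, as the rubric requires.
    """
    if shape_a == "symmetric" and shape_b == "symmetric":
        return ("mean", "sd")
    return ("median", "iqr")


# Hypothetical samples, made up for illustration only.
group_a = [4, 5, 5, 6, 6, 7, 7, 8]   # looks roughly symmetric
group_b = [1, 1, 2, 2, 3, 3, 4, 15]  # looks right skewed

center, spread = choose_measures("symmetric", "skewed")
for name, data in (("A", group_a), ("B", group_b)):
    stats = summarize(data)
    print(f"Group {name}: {center} = {stats[center]}, {spread} = {stats[spread]}")
```

Because one of the two hypothetical samples is skewed, the sketch reports the median and IQR for both groups.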