Give some background info here, and list your driving question and main statistical methods you will use later in your project.
Penguins are really cool animals. They have bills, flippers, and there are even different species of penguins! Driving question: Does mean bill length differ between penguin species, and is that difference statistically significant? In this project, we will use a one-way ANOVA to test whether bill lengths differ between species.
Here, explain how you sourced your dataset and provide a variable-description table.
My dataset comes from the palmerpenguins R package, which packages data made available by Dr. Kristen Gorman and the Palmer Station.
Variable | Description |
---|---|
Species | Species of the penguin |
Bill Length (mm) | Bill length in millimeters |
Flipper Length (mm) | Flipper length in millimeters |
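To check that the variables in the table match what is actually in the data, a quick look at the dataset helps. This is a minimal sketch, assuming the palmerpenguins package is installed; the name `penguins_subset` is made up for illustration.

```r
library(palmerpenguins)  # provides the penguins data frame

# Keep only the three variables described in the table above
vars <- c("species", "bill_length_mm", "flipper_length_mm")
penguins_subset <- penguins[, vars]

# Show each column's type and first few values
str(penguins_subset)

# Count missing values per column, since they affect later analysis
colSums(is.na(penguins_subset))
```

Checking for missing values early is useful because functions like `lm()` silently drop incomplete rows by default.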
I chose to only use one plot for this mini-project, but you should include around 3 figures to identify outliers, make inferences, and more.
I sourced this plot from Ethan’s Palmer Penguins Visualization, found here. Ethan goes over many different methods of visualization, which can all be achieved in R. ggplot2 is a versatile library that provides many different ways to visualize your data. You could also use Tableau for your visualization!
I want to use a
Now, make some inferences that motivate your driving question.
From this plot, it looks like bill length and flipper length vary between species.
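A plot that supports this kind of inference could be sketched with ggplot2 as follows. This is not necessarily the figure from Ethan's page; the choice of a scatter plot, the axis assignment, and the name `plot_data` are assumptions made for illustration.

```r
library(palmerpenguins)
library(ggplot2)

# Drop rows with missing measurements so ggplot2 does not warn about them
plot_data <- na.omit(
  penguins[, c("species", "bill_length_mm", "flipper_length_mm")]
)

# Scatter plot of bill length against flipper length, coloured by species
p <- ggplot(plot_data,
            aes(x = flipper_length_mm, y = bill_length_mm, colour = species)) +
  geom_point(alpha = 0.7) +
  labs(x = "Flipper length (mm)",
       y = "Bill length (mm)",
       colour = "Species")

print(p)
```

If the species form visibly separate clusters along the bill-length axis, that motivates testing the difference formally.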
Here, explain the statistical methods you will be using, and how they influence your driving question.
I am going to use an ANOVA (Analysis of Variance) to test if bill lengths differ between species. An ANOVA compares the variation between the group means with the variation within each group, and tests the null hypothesis that all of the group means are equal. To do this, I’ll fit a linear model with lm() and pass it to the anova() function; both are part of R’s built-in stats package.
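library(palmerpenguins)  # provides the penguins data frame
# Fit a linear model of bill length on species, then build the ANOVA table
model <- lm(bill_length_mm ~ species, data = penguins)
anova_result <- anova(model)
print(anova_result)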
## Analysis of Variance Table
##
## Response: bill_length_mm
##            Df Sum Sq Mean Sq F value    Pr(>F)
## species     2 7194.3  3597.2   410.6 < 2.2e-16 ***
## Residuals 339 2969.9     8.8
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
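To see where the numbers in this table come from, the F value can be rebuilt by hand from the group means. This is a sketch under the assumption that rows with a missing bill length are dropped (which is what lm() does by default); all variable names here are made up for illustration.

```r
library(palmerpenguins)

# Drop rows with a missing bill length, mirroring lm()'s default behaviour
d <- penguins[!is.na(penguins$bill_length_mm), ]

grand_mean  <- mean(d$bill_length_mm)
group_means <- tapply(d$bill_length_mm, d$species, mean)
group_sizes <- tapply(d$bill_length_mm, d$species, length)

# Between-group sum of squares: how far each species mean is from the grand mean
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)

# Within-group sum of squares: how far each penguin is from its own species mean
ss_within <- sum((d$bill_length_mm - group_means[as.character(d$species)])^2)

# Degrees of freedom: (groups - 1) and (observations - groups)
df_between <- length(group_means) - 1
df_within  <- nrow(d) - length(group_means)

# F is the ratio of the two mean squares
f_value <- (ss_between / df_between) / (ss_within / df_within)

# Upper-tail probability of an F this large if all group means were equal
p_value <- pf(f_value, df_between, df_within, lower.tail = FALSE)

print(c(F = f_value, df1 = df_between, df2 = df_within))
```

A large F means the species means are spread far apart relative to the scatter within each species, which is what the `species` row of the table reports.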
Now, explain the results of your statistical methods.
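So, with a p-value below 2.2e-16 (F = 410.6 on 2 and 339 degrees of freedom), we reject the null hypothesis that all three species have the same mean bill length. There is a statistically significant difference in bill length between the penguin species, although the ANOVA alone does not tell us which species differ from each other.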
Here, restate the results and methods.