```
# load packages
library(readr)
library(dplyr)
library(ggplot2)
```

Start by reading in the Porg data.

`porg_data <- read_csv("https://itep-r.netlify.com/data/Porg_samples.csv")`

Take a couple of minutes to get an overview of the data. Do you remember some of the functions to do that?

*Show solution*

```
view(porg_data)
glimpse(porg_data)
summary(porg_data)
```

What columns are there? What kinds of questions do you have about the data after looking at it. (Hint, I see some biomonotoring concentrations and some response times. Maybe Corundum is neurotoxic to porgs? Ask Dorian to find out for sure.)

Now create a scatterplot showing the relationship between conncetration and rxn_time.

*Show solution*

```
ggplot(data = porg_data, aes(x = Concentration, y = rxn_time)) +
geom_point()
```

What type of cleaning does this data set need? Let’s take a look at the lowest and highest concentration levels and see if anything looks odd.

*Show solution*

```
porg_data <- arrange(porg_data, Concentration)
porg_data <- arrange(porg_data, desc(Concentration))
```

Hmmm. Looks like there’s some negative numbers. Is that possible for Corundum measurements? We ask our team of Bot measurement analysts and they say NO! So, let’s filter out the negative concentration values.

*Show solution*

`porg_data <- filter(porg_data, Concentration > 0)`

Let’s look at the data set again. I see a units column. Are all the units the same? Looks like the Concentration units are in both ppb and ppm. Let’s add a column with consistent concentration units, and convert the ppm concentrations into ppb.

*Show solution*

`porg_data <- mutate(porg_data, conc_ppb = ifelse(Conc_units == "ppm", Concentration * 1000, Concentration))`

Remember the rule of plotting data? Hint: Plot the data, Plot the data, Plot the data

So, let’s re-plot the data

*Show solution*

```
ggplot(data = porg_data, aes(x = conc_ppb, y = rxn_time)) +
geom_point()
```

Let’s calculate the mean concentration by island to determine if there might be Corundum concentration differences by island. Do you remeber which function to use here?

*Show solution*

```
porg_island_summary <- group_by(porg_data, Island) %>%
summarise(mean_corundum = mean(conc_ppb))
```

Maybe making a box plot would help answer this question more clearly?

*Show solution*

```
ggplot(data = porg_data, aes(x = Island, y = conc_ppb)) +
geom_boxplot()
```

Let’s fill the boxplot with a different color for each island to make it look better.

*Show solution*

```
ggplot(data = porg_data, aes(x = Island, y = conc_ppb, fill = Island)) +
geom_boxplot()
```

Let’s make an individual scatterplot for each island. Remember which function to use?

*Show solution*

```
ggplot(data = porg_data, aes(x = conc_ppb, y = rxn_time)) +
geom_point() +
facet_wrap("Island")
```

Try an x-y scatter plot of conc_ppb and rxn_time and experiment with

`geom_smooth(method = "loess")`

*Show solution*

```
ggplot(data = porg_data, aes(x = conc_ppb, y = rxn_time)) +
geom_point() +
geom_smooth(method = "loess") +
facet_wrap(~Island)
```

Nice work Jedi Master! You are ready to tackle your own data set, maps, images, statistics, the universe!