Author: Niko Hartline

Acknowledgements: BBest for the great .Rmd guide

## Content

I’d love to examine the locations of endangered fish species as listed by the IUCN Red List. In freshwater systems, it would be interesting to explore how nitrogen use may affect endangerment status.

  1. Freshwater Fish
    • Nitrogen Use Effect
  2. Brackish Fish
  3. Ocean Fish

## Techniques

Since the data will be coming from different sources (the IUCN Red List and FAOSTAT), effective data concatenation will be key.
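
One way the combining step might look is a join on a shared country code. This is only a sketch: the two data frames, their values, and the threatened_fish and nitrogen_use_tonnes columns are made up for illustration, and it assumes both sources can be keyed on an ISO3 code.

```r
library(dplyr)

# Hypothetical stand-ins for the two sources; all values are made up for illustration
iucn <- data.frame(
  ISO3 = c("ABW", "AFG", "AGO"),
  threatened_fish = c(2, 5, 9),
  stringsAsFactors = FALSE
)
faostat <- data.frame(
  ISO3 = c("AFG", "AGO", "ALB"),
  nitrogen_use_tonnes = c(100, 250, 80),
  stringsAsFactors = FALSE
)

# Keep every IUCN row and attach FAOSTAT values where the country code matches;
# countries missing from FAOSTAT get NA in the FAOSTAT columns
combined <- left_join(iucn, faostat, by = "ISO3")
combined
```

A left join keeps countries that appear in the IUCN table but have no FAOSTAT record, so gaps show up as NA rather than disappearing silently.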

## Data

IUCN Red List and FAOSTAT will be the two primary data sources for this endeavor.

Singleton_Country_Aggregates = read.csv('./data/zebos1_Singleton_Country_Aggregates.csv')

summary(Singleton_Country_Aggregates)
##       ISO3         All_Species_Table   FAO_Region         FAO_Country 
##  ABW    :  1   Afghanistan  :  1     Africa :58   Afghanistan   :  1  
##  AFG    :  1   Albania      :  1     America:51   Albania       :  1  
##  AGO    :  1   Algeria      :  1     Asia   :51   Algeria       :  1  
##  AIA    :  1   AmericanSamoa:  1     Europe :48   American Samoa:  1  
##  ALA    :  1   Andorra      :  1     Oceania:25   Andorra       :  1  
##  ALB    :  1   Angola       :  1     NA's   :17   (Other)       :228  
##  (Other):244   (Other)      :244                  NA's          : 17  
##  CR_EN_VU_AMPHIBIA_Singleton CR_EN_VU_AVES_Singleton
##  Min.   :  0.000             Min.   : 0.000         
##  1st Qu.:  0.000             1st Qu.: 0.000         
##  Median :  0.000             Median : 0.000         
##  Mean   :  6.096             Mean   : 3.236         
##  3rd Qu.:  1.000             3rd Qu.: 1.000         
##  Max.   :150.000             Max.   :89.000         
##                                                     
##  CR_EN_VU_MAMMALIA_Singleton CR_EN_VU_REPTILIA_Singleton
##  Min.   :  0.00              Min.   :  0.000            
##  1st Qu.:  0.00              1st Qu.:  0.000            
##  Median :  0.00              Median :  0.000            
##  Mean   :  3.14              Mean   :  2.912            
##  3rd Qu.:  1.00              3rd Qu.:  1.000            
##  Max.   :112.00              Max.   :132.000            
##                                                         
##  CR_EN_VU_PLANTAE_Singleton CR_EN_VU_CHORDATA_Singleton
##  Min.   :   0.00            Min.   :  0.00             
##  1st Qu.:   0.00            1st Qu.:  0.00             
##  Median :   1.00            Median :  1.00             
##  Mean   :  35.88            Mean   : 15.38             
##  3rd Qu.:  14.00            3rd Qu.:  6.00             
##  Max.   :1750.00            Max.   :338.00             
##                                                        
##  CR_EN_VU_TOTAL_Singleton
##  Min.   :   0.00         
##  1st Qu.:   0.00         
##  Median :   3.00         
##  Mean   :  51.27         
##  3rd Qu.:  22.75         
##  Max.   :1883.00         
## 

## Data Wrangling

The .. in a path tells R to back up one directory. Extremely useful! A single . refers to the current folder. To back out two directories, chain them as ../.. (three dots, ..., is not a valid shortcut).
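
The relative-path shortcuts can be checked directly in the Console. The sketch below is not part of the project: it builds a throwaway nested folder under tempdir() (the path_demo, a and b names are made up) and resolves each shortcut from inside it.

```r
# Build a throwaway nested folder so the example does not depend on any real project layout
base   <- file.path(tempdir(), "path_demo")
nested <- file.path(base, "a", "b")
dir.create(nested, recursive = TRUE, showWarnings = FALSE)

old_wd <- setwd(nested)           # work from .../path_demo/a/b

here   <- normalizePath(".")      # current folder:  .../path_demo/a/b
up_one <- normalizePath("..")     # one level up:    .../path_demo/a
up_two <- normalizePath("../..")  # two levels up:   .../path_demo

setwd(old_wd)                     # restore the original working directory

basename(c(here, up_one, up_two))
```

The last line should give "b", "a" and "path_demo", in that order.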

# Run this chunk only once in your Console
# Do not evaluate when knitting Rmarkdown

# list of packages
pkgs = c(
  'readr',        # read csv
  'readxl',       # read xls
  'dplyr',        # data frame manipulation
  'tidyr',        # data tidying
  'nycflights13', # test dataset of NYC flights for 2013
  'gapminder')    # test dataset of life expectancy and population

# install any package that is not found, then load it
for (p in pkgs){
  if (!require(p, character.only = TRUE)){
    install.packages(p)
    library(p, character.only = TRUE)
  }
}

read_csv from the readr package differs from base R’s read.csv: for one, printing its result shows the class of each variable. If you see this, Ben/Naomi, what are some other differences between the two functions?
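
A couple of the differences can be checked directly. The sketch below assumes readr is installed and uses a made-up two-row CSV written to a temporary file; the header and values are illustrative only, not taken from the project data.

```r
library(readr)

# Write a tiny throwaway CSV (made-up values) so the comparison is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("species id,year", "AB,1980", "NL,1981"), tmp)

base_df  <- read.csv(tmp)                     # base R
readr_df <- suppressMessages(read_csv(tmp))   # readr; hide the column-spec message

class(base_df)    # a plain data.frame
class(readr_df)   # a tibble, which also inherits from data.frame

names(base_df)    # read.csv rewrites "species id" to the syntactic name "species.id"
names(readr_df)   # read_csv keeps the header as written: "species id"
```

So at least two differences show up here: the class of the returned object, and how column names that are not valid R names are handled.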

library(readr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
read_csv('../data/r-ecology/surveys.csv') %>%
  select(species_id, year) %>%
  #filter(species_id == 'NL') %>%
  group_by(species_id, year) %>%
  count(species_id, year) %>%
  head()
## Source: local data frame [6 x 3]
## Groups: species_id [1]
## 
##   species_id  year     n
##        (chr) (int) (int)
## 1         AB  1980     5
## 2         AB  1981     7
## 3         AB  1982    34
## 4         AB  1983    41
## 5         AB  1984    12
## 6         AB  1985    14