DATA ANALYSIS
R Resources
R is an open-source statistical programming language that you can use to manipulate data and run statistical analyses in an efficient and reproducible way.
Visit this page for resources on downloading and installing R and RStudio and getting started with them.
Useful R Packages
The R user community is vibrant and constantly developing and improving open-source "packages" that help simplify your data analysis efforts! Below are a few resources and "cheat sheets" for some very useful R packages. Additional cheat sheets published by RStudio can be found on this website.
dplyr/tidyr: data wrangling and transformation
dtplyr: dplyr syntax wrappers for the data.table library (which provides some of R's fastest data processing tools; same cheat sheet as dplyr)
lubridate: date and time wrangling
ggplot2: data visualization
Shiny: interactive visualization
viridis: expressive and colorblind-friendly palettes
foreach and doParallel: intuitive parallel processing
geoknife: processing of large, gridded datasets according to their overlap with landscape features (e.g. summarizing watershed data)
dataRetrieval: import USGS and EPA water data into R
Other Resources
Useful Python Libraries
numpy: basic numerical computing (vectorized, unlike base Python)
pandas: data frame operations
scipy: scientific computing
Scikit-Learn: machine learning
Matplotlib: publishable and highly customizable visualization
Seaborn: out-of-the-box plots for common plotting needs
Bokeh: interactive visualization
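To illustrate what "vectorized" means in the numpy entry above, here is a minimal sketch that converts a few streamflow readings from cubic feet per second to cubic meters per second, first with a base Python loop and then with a single numpy expression. The readings and variable names are made up for illustration and do not come from any real dataset.

```python
import numpy as np

# Hypothetical streamflow readings in cubic feet per second (made-up values)
flows_cfs = [120.0, 135.5, 128.25, 140.0]

# Conversion factor: 1 cubic foot per second is about 0.0283168 cubic meters per second
CFS_TO_CMS = 0.0283168

# Base Python: the conversion is applied one element at a time
flows_cms_loop = [q * CFS_TO_CMS for q in flows_cfs]

# numpy: the same arithmetic is applied to the whole array in one expression
flows_cms_vec = np.array(flows_cfs) * CFS_TO_CMS

print(flows_cms_loop)
print(flows_cms_vec)
```

Both approaches give the same numbers. The numpy version tends to be shorter and, on large arrays, usually much faster because the loop runs in compiled code.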
Unix-like command line
Version control
For more information on how to use R for more complex data visualization efforts, peruse