Scraping, wrangling & viz, oh my! Fun with Columbia Basin DART (fish passage data)

As a little side project, I decided to scrape data from Columbia Basin Research's DART (Data Access in Real Time) to explore fish passage and seasonal trends over time.

This project involved:

  • A little crawl over web pages with purrr (and loving its error-handling helpers, like possibly()!)
  • Web scraping with rvest to access fish count tables from more than 1,500 unique URLs
  • Creating a function to access an html table, read it in, and combine with other scraped tables
  • Data wrangling (dplyr)
  • Data visualization (ggplot2)
  • Interactive visualizations (shiny widgets + reactive outputs)
  • Next steps: time series analysis & forecasting!
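
To give a flavor of the scrape-and-combine steps listed above, here is a minimal sketch in R. It is not the actual project code: the helper names (read_count_table, safe_read) and the tiny inline HTML tables are made up for illustration, and the real DART pages and URLs are not shown. The idea is to wrap a table-reading function in possibly() so that one bad page returns NULL instead of stopping the whole crawl, then stack whatever came back.

```r
library(purrr)
library(rvest)
library(dplyr)

# Hypothetical helper: read the first HTML table found in `src`.
# `src` can be a URL, a file path, or a literal HTML string.
read_count_table <- function(src) {
  read_html(src) %>%
    html_element("table") %>%
    html_table()
}

# possibly() wraps the function so an error yields `otherwise` (here NULL)
safe_read <- possibly(read_count_table, otherwise = NULL)

# Stand-in inputs (NOT real DART pages): two tiny tables and one bad source
pages <- list(
  "<table><tr><th>date</th><th>count</th></tr><tr><td>2019-05-01</td><td>12</td></tr></table>",
  "no-such-file.html",
  "<table><tr><th>date</th><th>count</th></tr><tr><td>2019-05-02</td><td>30</td></tr></table>"
)

fish_counts <- pages %>%
  map(safe_read) %>%  # read each source, NULL on failure
  compact() %>%       # drop the NULLs left by failed reads
  bind_rows()         # stack the remaining tables into one data frame
```

With a real crawl, pages would be a vector of URLs instead of inline strings, and it would be worth keeping track of which ones came back NULL so they can be retried or inspected.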

Allison Horst
Assistant Teaching Professor

My teaching interests are data science, statistics, and science communication.