FlowDB 2.0
Goals:
Incorporate snow-pack data into the FlowDB dataset
Utilize better sources of precipitation (e.g. multiple locations or an image of rainfall in the area)
Make the dataset usable for precipitation now-casting as well (include data that an Earthformer-type model could ingest).
Bonus: add soil moisture data
Bonus: add aerial imagery or geospatial data of the basin so that models can learn better representations of hydrology.
Create an end-to-end pipeline to append new data on at least a daily basis (likely using Composer on GCP?)
Add tests for the individual scraping functions that run daily and alert on API changes.
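One way the daily API-change test could look is a schema check on a small scraped sample. This is only a sketch: the column names in `EXPECTED_COLUMNS` and the `fetch_sample` callable are hypothetical placeholders, not the names used by the existing scrapers.

```python
# Hypothetical expected columns; replace with whatever the real scraper returns.
EXPECTED_COLUMNS = {"station", "valid", "tmpf", "p01m"}


def check_schema(rows, expected=EXPECTED_COLUMNS):
    """Return the set of expected columns missing from a scraped sample.

    `rows` is a list of dicts, one per scraped record. An empty result
    means the response still has the shape the pipeline relies on.
    """
    if not rows:
        # An empty response counts as a failure: every column is missing.
        return set(expected)
    missing = set()
    for row in rows:
        missing |= set(expected) - set(row)
    return missing


def assert_schema_unchanged(fetch_sample):
    """Daily check: fail loudly if the upstream API dropped or renamed a column.

    `fetch_sample` is a hypothetical zero-argument callable that scrapes
    a few recent records from the source under test.
    """
    missing = check_schema(fetch_sample())
    assert not missing, f"API response is missing columns: {sorted(missing)}"
```

Running `assert_schema_unchanged` once per source in the daily job would turn a silent upstream change into a failed task that the scheduler can alert on.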
Current data sources
ASOS (precipitation/temperature/humidity)
Hourly [X]
Need to check that the API still works and that the legacy code still scrapes. [X]
Test process_asos [X]
USGS (river flow)
Verify raw data is in fifteen-minute intervals [X]
Need to check that the API still works and that the legacy code still scrapes. [X]
Test USGS function [X]
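The fifteen-minute interval check can be sketched as a small helper that lists every pair of consecutive readings whose spacing is not exactly 15 minutes. The timestamps below are synthetic and only illustrate the idea; the helper name `irregular_gaps` is an assumption, not part of the existing code.

```python
from datetime import datetime, timedelta


def irregular_gaps(timestamps, step=timedelta(minutes=15)):
    """Return (previous, current) pairs whose spacing differs from `step`."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a != step]


# Synthetic example readings, not real USGS data.
readings = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 0, 15),
    datetime(2024, 1, 1, 0, 30),
    datetime(2024, 1, 1, 1, 0),  # the 00:45 reading is missing
]
gaps = irregular_gaps(readings)
```

An empty list means the series is evenly spaced; any entries point at missing or duplicated readings that may need resampling before modelling.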
Data Sources
NOAA Precipitation Maps (Daily)
Look into hourly historical TIFFs from the National Water Prediction Service (NOAA). They seem to have only daily data for historical weather maps.
Use plain wget; there is no real API.
https://water.weather.gov/precip/downloader.php?hourly=true&file_type=nc_file&range=1hour&format=tar
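Since there is no real API, the download could also be done from Python with the standard library instead of wget. The sketch below rebuilds the URL above from its query parameters; the function names are hypothetical, and whether the endpoint accepts values other than the ones shown in the URL is an assumption that has not been checked.

```python
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://water.weather.gov/precip/downloader.php"


def build_download_url(range_="1hour", file_type="nc_file", fmt="tar"):
    """Build the downloader URL; the defaults reproduce the URL in these notes."""
    params = {
        "hourly": "true",
        "file_type": file_type,
        "range": range_,
        "format": fmt,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def download(url, dest):
    """Fetch `url` to the local path `dest` (requires network access)."""
    urllib.request.urlretrieve(url, dest)
    return dest
```

A daily job could then call `download(build_download_url(), "precip_1hour.tar")`, where the output file name is just a placeholder.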
SNOTEL (snow pack)
Added the closest SNOTEL station (by distance) to the river meta-data.
However, this is problematic because the closest SNOTEL station might not align with the river's drainage basin.
Still only have snow-pack data for western rivers, in states such as CA, ID, and CO (the easternmost state covered is SD).
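The closest-station matching described above can be sketched with a great-circle (haversine) distance. This is not necessarily how the current code computes it; the function names and the `(station_id, lat, lon)` tuple layout are assumptions for illustration. Note that it picks by distance only, so it has the same limitation: the chosen station may sit outside the river's drainage basin.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_station(gauge, stations):
    """Return (station_id, distance_km) for the station closest to `gauge`.

    `gauge` is a (lat, lon) pair; `stations` is a non-empty list of
    (station_id, lat, lon) tuples. Both layouts are hypothetical.
    """
    def dist(station):
        return haversine_km(gauge[0], gauge[1], station[1], station[2])

    best = min(stations, key=dist)
    return best[0], dist(best)
```

Storing the returned distance alongside the station ID in the river meta-data would make it possible to filter out matches that are too far away to be meaningful.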
Current Code