Data Preparation

The key areas of data to focus on are snowpack/snowmelt, soil moisture, temperature, precipitation, and streamflow.

Snowpack, Temperature, and Precipitation

Exploratory research shows that the Natural Resources Conservation Service (NRCS) hosts a number of data APIs that include snowpack data for Colorado. The data can be queried by station, data element, start date, and end date. Before pulling data and stressing the NRCS's servers, it is important to identify which stations will give us reliable readings and a true sense of how rivers in Colorado originate. We therefore focus on two stations, Buffalo Lake and McClure Pass, which sit above 8,000 ft in elevation and feed the Colorado River near Glenwood Springs. Conveniently, these stations also record temperature and precipitation data, which we will pull in as well.
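To make the query shape concrete, the sketch below builds a single request URL from a station, a list of data elements, a start date, and an end date. It only constructs the URL and sends nothing to the NRCS servers. The base URL, the parameter names, the element codes (WTEQ, TOBS, PREC), and the placeholder station identifier are all assumptions for illustration and should be checked against the NRCS documentation before use.

```python
from urllib.parse import urlencode

# Assumed endpoint for the NRCS data service; verify before use.
BASE_URL = "https://wcc.sc.egov.usda.gov/awdbRestApi/services/v1/data"


def build_query(station, elements, begin_date, end_date):
    """Return a request URL for one station and a list of data elements.

    The parameter names below are assumptions, not confirmed by this report.
    """
    params = {
        "stationTriplets": station,
        "elements": ",".join(elements),
        "duration": "DAILY",
        "beginDate": begin_date,
        "endDate": end_date,
    }
    return f"{BASE_URL}?{urlencode(params)}"


# "000:CO:SNTL" is a hypothetical placeholder, not a real station identifier.
# WTEQ, TOBS, and PREC are assumed codes for snow water equivalent,
# observed air temperature, and accumulated precipitation.
url = build_query("000:CO:SNTL", ["WTEQ", "TOBS", "PREC"], "2000-01-01", "2020-12-31")
print(url)
```

Building one URL per station with a bounded date range keeps the number of requests small, which fits the goal of not stressing the NRCS's servers.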

Soil Moisture

The NRCS also offers soil moisture as part of its data catalogue. Because we cannot get explicit soil moisture measurements at the Buffalo Lake and McClure Pass stations, we will instead focus on a collection of five stations above 8,000 ft. This collection of data should help us connect snowpack to its effect on soil moisture, and it offers a path to measuring the moisture that is lost to evapotranspiration.

Streamflow

Streamflow data is available from the USGS. We will focus on the Colorado River at Glenwood Springs, which is the river that is fed by the Buffalo Lake and McClure Pass stations. The location is also beneficial in that it is downstream of important tributaries, yet upstream of diversions, man-made dams, and other water management structures. This data will be used to study historical trends and inform our understanding of how streamflow impacts drinking supply and agriculture downstream. It will also serve as the variable we aim to predict in our modeling phase.

Joins

The final step for this historical data is to join the datasets by date to form a complete dataset. Since we have dropped null and unrecorded data, our date range will be limited to the intersection of all datasets.
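The join can be illustrated with a minimal sketch in plain Python. It assumes each dataset is a mapping from an ISO date string to a reading, with None marking an unrecorded value. The function names and the toy values are hypothetical and are not taken from the project's data.

```python
def drop_missing(series):
    """Remove dates whose reading is None (unrecorded)."""
    return {date: value for date, value in series.items() if value is not None}


def join_by_date(**datasets):
    """Inner-join named date->value mappings on the dates they all share."""
    if not datasets:
        return []
    cleaned = {name: drop_missing(series) for name, series in datasets.items()}
    # Only dates present in every cleaned dataset survive the join.
    common_dates = set.intersection(*(set(series) for series in cleaned.values()))
    return [
        {"date": date, **{name: series[date] for name, series in cleaned.items()}}
        for date in sorted(common_dates)
    ]


# Toy values for illustration only.
snowpack = {"2020-01-01": 10.2, "2020-01-02": None, "2020-01-03": 10.9}
streamflow = {"2020-01-02": 1500.0, "2020-01-03": 1480.0, "2020-01-04": 1475.0}

joined = join_by_date(snowpack=snowpack, streamflow=streamflow)
print(joined)
```

Only 2020-01-03 survives here, because it is the one date with a recorded value in both series. With pandas DataFrames, the same result could be obtained by dropping nulls and merging on the date column with how="inner".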

RCP Scenario Data

Representative Concentration Pathways (RCPs) are used to model future climate scenarios. We will use the RCP 4.5 and RCP 8.5 scenarios to model future snowpack, soil moisture, temperature, precipitation, and streamflow. This data will be used to understand how climate change will impact the Colorado River Basin and the water supply for the region.

RCP data is extremely large and spans the entire globe at low resolution. A project called CORDEX aims to take large climate models and downscale them to regional outlooks. The Canadian government has a climate modeling organization that covers the North American region. For this exercise, we have selected the Canadian Centre for Climate Modelling and Analysis CORDEX data with a 0.5-degree latitude/longitude resolution, and have filtered it down to the state of Colorado.
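The filtering step can be sketched as a bounding-box test over grid cell centers. This is an illustration under stated assumptions, not the project's actual pipeline: it assumes a regular 0.5-degree latitude/longitude grid, longitudes expressed in the -180 to 180 convention, and an approximate bounding box for Colorado. The function names and grid extents are hypothetical.

```python
# Approximate Colorado bounding box in degrees (assumed for illustration).
CO_LAT = (37.0, 41.0)
CO_LON = (-109.06, -102.04)


def in_colorado(lat, lon):
    """Return True if a grid point falls inside the bounding box."""
    return CO_LAT[0] <= lat <= CO_LAT[1] and CO_LON[0] <= lon <= CO_LON[1]


def regular_grid(lat_start, lat_count, lon_start, lon_count, step=0.5):
    """Build (lat, lon) points for a regular grid with the given spacing."""
    return [
        (lat_start + i * step, lon_start + j * step)
        for i in range(lat_count)
        for j in range(lon_count)
    ]


# Hypothetical regional grid: 30N to 45N and 115W to 95W at 0.5 degrees.
grid = regular_grid(30.0, 31, -115.0, 41)
colorado_cells = [(lat, lon) for lat, lon in grid if in_colorado(lat, lon)]
print(len(colorado_cells), "of", len(grid), "grid points fall inside the box")
```

If the source files store longitude from 0 to 360, the values would need to be shifted before applying this test. A rectangular box is a reasonable fit for Colorado because the state's borders follow lines of latitude and longitude.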