7Q10 records and basin characteristics for 224 basins in South Carolina, Georgia, and Alabama (2015)
This data release provides the data and R scripts used for the 2018 publication "Improving predictions of hydrological low-flow indices in ungaged basins using machine learning", Environmental Modelling & Software, https://doi.org/10.1016/j.envsoft.2017.12.021. The release includes two .csv files and 14 R scripts. The lowflow_sc_ga_al_gagesII_2015.csv data file contains the annual minimum seven-day mean streamflow with an annual exceedance probability of 90%, that is, a 10-year recurrence interval (7Q10), for 224 basins in South Carolina, Georgia, and Alabama. The data file also contains 231 basin characteristics from the GAGES-II dataset (https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011). The all_preds.csv file contains the leave-one-out cross-validated predictions for all the models. The paper associated with the data release compares the ability of eight machine-learning models (elastic net, gradient boosting, kernel k-nearest neighbors, two variants of support vector machines, M5-cubist, random forest, and a meta-learning ensemble M5-cubist model) and four baseline models (ordinary kriging, a unit-area discharge model, and two variants of censored regression) to generate estimates of the 7Q10 at 224 unregulated sites in South Carolina, Georgia, and Alabama.
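
To make the 7Q10 definition above concrete, the following is a minimal sketch, not the procedure used to produce the values in this release. It assumes daily flows are already grouped by year in a plain dictionary, and it estimates the 10-year low flow as the empirical 10th percentile of the annual seven-day minima. Operational low-flow studies commonly fit a probability distribution (often log-Pearson Type III) to the annual minima instead. The function names `annual_min_7day` and `empirical_7q10` and the toy flow values are hypothetical.

```python
def annual_min_7day(daily_flows):
    """Return the lowest 7-day moving-average flow in one year of daily values."""
    if len(daily_flows) < 7:
        raise ValueError("need at least 7 daily values")
    window_sum = sum(daily_flows[:7])  # sum of the first 7-day window
    lowest = window_sum
    for i in range(7, len(daily_flows)):
        # Slide the window forward one day: add the new day, drop the oldest.
        window_sum += daily_flows[i] - daily_flows[i - 7]
        lowest = min(lowest, window_sum)
    return lowest / 7.0


def empirical_7q10(flows_by_year):
    """Estimate 7Q10 as the 10th percentile of the annual 7-day minima.

    Assumption: a simple empirical quantile with linear interpolation,
    used here in place of a fitted frequency distribution.
    """
    minima = sorted(annual_min_7day(v) for v in flows_by_year.values())
    n = len(minima)
    pos = 0.10 * (n - 1)          # fractional rank of the 10% quantile
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return minima[lo] + frac * (minima[hi] - minima[lo])


if __name__ == "__main__":
    # Toy record (made-up numbers): 11 years, each with a constant flow.
    toy = {2000 + k: [float(k + 1)] * 30 for k in range(11)}
    print(empirical_7q10(toy))
```

An annual exceedance probability of 90% corresponds to a non-exceedance probability of 10%, which is why the sketch takes the 10th percentile of the annual minima.
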
Complete Metadata
| Field | Value |
|---|---|
| accessLevel | public |
| bureauCode | ["010:12"] |
| contactPoint | {"fn": "Scott C Worland", "@type": "vcard:Contact", "hasEmail": "mailto:scworland@usgs.gov"} |
| description | This data release provides the data and R scripts used for the 2018 publication "Improving predictions of hydrological low-flow indices in ungaged basins using machine learning", Environmental Modelling & Software, https://doi.org/10.1016/j.envsoft.2017.12.021. The release includes two .csv files and 14 R scripts. The lowflow_sc_ga_al_gagesII_2015.csv data file contains the annual minimum seven-day mean streamflow with an annual exceedance probability of 90%, that is, a 10-year recurrence interval (7Q10), for 224 basins in South Carolina, Georgia, and Alabama. The data file also contains 231 basin characteristics from the GAGES-II dataset (https://water.usgs.gov/lookup/getspatial?gagesII_Sept2011). The all_preds.csv file contains the leave-one-out cross-validated predictions for all the models. The paper associated with the data release compares the ability of eight machine-learning models (elastic net, gradient boosting, kernel k-nearest neighbors, two variants of support vector machines, M5-cubist, random forest, and a meta-learning ensemble M5-cubist model) and four baseline models (ordinary kriging, a unit-area discharge model, and two variants of censored regression) to generate estimates of the 7Q10 at 224 unregulated sites in South Carolina, Georgia, and Alabama. |
| distribution | [{"@type": "dcat:Distribution", "title": "Digital Data", "format": "XML", "accessURL": "https://doi.org/10.5066/F7CR5S4T", "mediaType": "application/http", "description": "Landing page for access to the data"}, {"@type": "dcat:Distribution", "title": "Original Metadata", "format": "XML", "mediaType": "text/xml", "description": "The metadata original format", "downloadURL": "https://data.usgs.gov/datacatalog/metadata/USGS.594c4fbae4b062508e3857c6.xml"}] |
| identifier | http://datainventory.doi.gov/id/dataset/USGS_594c4fbae4b062508e3857c6 |
| keyword | ["Alabama", "Georgia", "South Carolina", "USGS:594c4fbae4b062508e3857c6", "low-flow", "machine learning", "regionalization", "statistical modeling", "streamflow"] |
| modified | 2020-08-21T00:00:00Z |
| publisher | {"name": "U.S. Geological Survey", "@type": "org:Organization"} |
| spatial | -88.967285156335, 29.87361281758, -77.673339844286, 35.634621048638 |
| theme | ["Geospatial"] |
| title | 7Q10 records and basin characteristics for 224 basins in South Carolina, Georgia, and Alabama (2015) |
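
The description mentions leave-one-out cross-validated predictions and a unit-area discharge baseline. The sketch below illustrates the general idea only, under stated assumptions: each site is a hypothetical `(drainage_area, flow)` pair, and the baseline scales the mean flow per unit area of the remaining sites by the held-out site's drainage area. The exact formulation used in the paper may differ, and the function names and numbers here are made up for illustration.

```python
from math import sqrt


def unit_area_predict(train, area):
    """Predict flow as drainage area times the mean flow-per-area of the training sites.

    Assumption: a deliberately simple unit-area baseline, not the paper's exact model.
    """
    yields = [flow / a for a, flow in train]
    return area * sum(yields) / len(yields)


def leave_one_out(sites, predict):
    """Return one prediction per site, each made without that site's own record."""
    preds = []
    for i, (area, _) in enumerate(sites):
        train = sites[:i] + sites[i + 1:]  # every site except the held-out one
        preds.append(predict(train, area))
    return preds


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


if __name__ == "__main__":
    # Toy sites (made-up numbers): (drainage area, observed low flow).
    sites = [(10.0, 1.0), (20.0, 2.5), (40.0, 3.0), (80.0, 9.0)]
    preds = leave_one_out(sites, unit_area_predict)
    observed = [flow for _, flow in sites]
    print(preds, rmse(observed, preds))
```

Because each prediction is made without the held-out site's record, the resulting errors approximate how a model would perform at an ungaged basin, which is the setting the paper evaluates.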