Groundwater nitrate data and ascii grids of predicted nitrate and model inputs for the Central Valley aquifer, California, USA
This public data release contains two ASCII grids of predicted nitrate concentrations (as NO3-N, mg/L) at two depth zones associated with private and public drinking water supply wells, respectively, in the Central Valley, California; raster files of the 25 predictor variables in the final statistical model; and groundwater nitrate data from the sampled wells used to train and test the model. Both prediction grids are bounded by the alluvial bed boundary that defines the Central Valley. The prediction grids were produced with Boosted Regression Tree (BRT) methods in the statistical software R (R Core Team, https://www.r-project.org/), followed by linear interpolation in Oasis Montaj software (Geosoft, version 9.0.2). The response variable was nitrate concentration measured in wells located within the Central Valley. The database of well nitrate measurements was compiled from private supply and public supply wells, with data from two sources: the University of California at Davis (UC Davis) and the U.S. Geological Survey (USGS). Prior to statistical modeling, wells were spatially declustered using an equal-area grid cell approach to reduce the influence on the model of oversampling in areas of intensive agricultural land use. A total of 5170 wells were selected, of which 3508 were used for training and 1662 were held out for testing. The final BRT model used 25 predictor variables, which included well characteristics, land use, climate, soil properties, aquifer properties, depth to the water table, and estimates of nitrogen loading and groundwater age.
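The declustering step can be illustrated with a minimal sketch. It assumes that wells are binned into square cells of a projected, equal-area coordinate system and that one well is kept at random per cell; the description does not state the cell size or the per-cell selection rule, so `decluster`, `cell_size_m`, and the sample coordinates below are hypothetical.

```python
import random
from collections import defaultdict


def decluster(wells, cell_size_m, seed=0):
    """Keep one randomly chosen well per square grid cell.

    wells: iterable of (well_id, x_m, y_m) in a projected, equal-area
    coordinate system (assumed; the release does not specify one here).
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for well_id, x, y in wells:
        # Floor division maps each coordinate to the index of its grid cell.
        key = (int(x // cell_size_m), int(y // cell_size_m))
        cells[key].append(well_id)
    # Visit cells in sorted order so the result is reproducible for a given seed.
    return [rng.choice(ids) for _, ids in sorted(cells.items())]


# Hypothetical wells: A, B, C share one 1 km cell; D and E share another.
wells = [
    ("A", 100.0, 100.0),
    ("B", 400.0, 250.0),
    ("C", 900.0, 900.0),
    ("D", 1500.0, 200.0),
    ("E", 1600.0, 300.0),
]
kept = decluster(wells, cell_size_m=1000.0)
print(kept)  # two well ids, one per occupied cell
```

Densely sampled cells contribute no more wells than sparsely sampled ones, which is the property that limits the effect of oversampling on the model.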
Based on the gridded predictor variables and the final model, nitrate predictions were made with the R raster package for 17 depth zones spaced throughout the aquifer (at 15.24, 30.48, 45.72, 60.96, 76.20, 91.44, 106.68, 121.92, 152.40, 182.88, 213.36, 243.84, 274.32, 304.80, 365.76, 426.72, and 487.68 m below ground surface) to create input layers for 3D mapping. The prediction grids for the 17 depth zones were imported into Oasis Montaj version 9.0.2 (GeoSoft, Inc., 2016) for 3D interpolation and visualization. Each grid was assigned a vertical thickness of 1 m, and linear interpolation was applied between adjacent layers at a vertical resolution of 1 m to produce a complete representation of predicted nitrate concentration with depth throughout the Central Valley. For visualization purposes, nitrate predictions were extracted from the interpolated model at depths of 54.86 m and 121.92 m. These depths correspond to the average total depths of the private and public supply wells, respectively, in the training data.
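For a single grid cell, the vertical interpolation can be sketched as plain linear interpolation between the two layers that bracket the target depth. This is only an illustration under simplifying assumptions: it ignores the 1 m layer thickness and any other details of how Oasis Montaj performs the interpolation, and the nitrate values in `values` are hypothetical, not taken from the release. Only the 17 layer depths come from the description.

```python
from bisect import bisect_right

# Depths (m below ground surface) of the 17 prediction layers, from the description.
LAYER_DEPTHS_M = [
    15.24, 30.48, 45.72, 60.96, 76.20, 91.44, 106.68, 121.92, 152.40,
    182.88, 213.36, 243.84, 274.32, 304.80, 365.76, 426.72, 487.68,
]


def interpolate_at_depth(depth_m, layer_values):
    """Linearly interpolate one cell's predicted nitrate to depth_m."""
    if len(layer_values) != len(LAYER_DEPTHS_M):
        raise ValueError("expected one value per layer")
    if not LAYER_DEPTHS_M[0] <= depth_m <= LAYER_DEPTHS_M[-1]:
        raise ValueError("depth is outside the modeled range")
    # Index of the deepest layer at or above the target depth.
    i = bisect_right(LAYER_DEPTHS_M, depth_m) - 1
    if i == len(LAYER_DEPTHS_M) - 1:
        return layer_values[-1]
    d0, d1 = LAYER_DEPTHS_M[i], LAYER_DEPTHS_M[i + 1]
    w = (depth_m - d0) / (d1 - d0)  # 0 at the upper layer, 1 at the lower layer
    return (1 - w) * layer_values[i] + w * layer_values[i + 1]


# Hypothetical predictions (mg/L as NO3-N) for one cell, shallowest layer first.
values = [6.0, 5.5, 5.0, 4.0, 3.5, 3.0, 2.8, 2.5, 2.2,
          2.0, 1.8, 1.6, 1.4, 1.2, 1.0, 0.8, 0.6]

shallow = interpolate_at_depth(54.86, values)   # between the 45.72 m and 60.96 m layers
deep = interpolate_at_depth(121.92, values)     # coincides with the 121.92 m layer
print(round(shallow, 2), deep)
```

The 54.86 m depth lies about 60% of the way from the 45.72 m layer to the 60.96 m layer, so its value is a weighted blend of those two layers, whereas 121.92 m coincides with a modeled layer and needs no interpolation.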
Complete Metadata
| accessLevel | public |
|---|---|
| bureauCode | ["010:12"] |
| contactPoint | {"fn": "Bernard T Nolan", "@type": "vcard:Contact", "hasEmail": "mailto:btnolan@usgs.gov"} |
| description | This public data release contains two ASCII grids of predicted nitrate concentrations (as NO3-N, mg/L) at two depth zones associated with private and public drinking water supply wells, respectively, in the Central Valley, California; raster files of the 25 predictor variables in the final statistical model; and groundwater nitrate data from the sampled wells used to train and test the model. Both prediction grids are bounded by the alluvial bed boundary that defines the Central Valley. The prediction grids were produced with Boosted Regression Tree (BRT) methods in the statistical software R (R Core Team, https://www.r-project.org/), followed by linear interpolation in Oasis Montaj software (Geosoft, version 9.0.2). The response variable was nitrate concentration measured in wells located within the Central Valley. The database of well nitrate measurements was compiled from private supply and public supply wells, with data from two sources: the University of California at Davis (UC Davis) and the U.S. Geological Survey (USGS). Prior to statistical modeling, wells were spatially declustered using an equal-area grid cell approach to reduce the influence on the model of oversampling in areas of intensive agricultural land use. A total of 5170 wells were selected, of which 3508 were used for training and 1662 were held out for testing. The final BRT model used 25 predictor variables, which included well characteristics, land use, climate, soil properties, aquifer properties, depth to the water table, and estimates of nitrogen loading and groundwater age. Based on the gridded predictor variables and the final model, nitrate predictions were made with the R raster package for 17 depth zones spaced throughout the aquifer (at 15.24, 30.48, 45.72, 60.96, 76.20, 91.44, 106.68, 121.92, 152.40, 182.88, 213.36, 243.84, 274.32, 304.80, 365.76, 426.72, and 487.68 m below ground surface) to create input layers for 3D mapping. The prediction grids for the 17 depth zones were imported into Oasis Montaj version 9.0.2 (GeoSoft, Inc., 2016) for 3D interpolation and visualization. Each grid was assigned a vertical thickness of 1 m, and linear interpolation was applied between adjacent layers at a vertical resolution of 1 m to produce a complete representation of predicted nitrate concentration with depth throughout the Central Valley. For visualization purposes, nitrate predictions were extracted from the interpolated model at depths of 54.86 m and 121.92 m. These depths correspond to the average total depths of the private and public supply wells, respectively, in the training data. |
| distribution | [{"@type": "dcat:Distribution", "title": "Digital Data", "format": "XML", "accessURL": "https://doi.org/10.5066/F7V40SDN", "mediaType": "application/http", "description": "Landing page for access to the data"}, {"@type": "dcat:Distribution", "title": "Original Metadata", "format": "XML", "mediaType": "text/xml", "description": "The metadata original format", "downloadURL": "https://data.usgs.gov/datacatalog/metadata/USGS.58c1d920e4b014cc3a3d3b63.xml"}] |
| identifier | http://datainventory.doi.gov/id/dataset/USGS_58c1d920e4b014cc3a3d3b63 |
| keyword | ["Boosted regression trees", "California", "Central Valley", "Machine learning", "Nitrate", "USGS:58c1d920e4b014cc3a3d3b63", "United States", "groundwater", "mathematical modeling"] |
| modified | 2020-08-26T00:00:00Z |
| publisher | {"name": "U.S. Geological Survey", "@type": "org:Organization"} |
| spatial | -122.90405272946, 34.957995309646, -118.53149413589, 40.630630082643 |
| theme | ["Geospatial"] |
| title | Groundwater nitrate data and ascii grids of predicted nitrate and model inputs for the Central Valley aquifer, California, USA |
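
The `spatial` value above can be read as a bounding box in decimal degrees. The sketch below assumes the order west, south, east, north, which is consistent with the magnitudes of the four numbers; `parse_bbox` and `contains` are hypothetical helper names, and the test point is an approximate location for Fresno, California. A bounding box is only a rectangle around the dataset, so a point inside it is not necessarily inside the alluvial boundary of the prediction grids.

```python
SPATIAL = "-122.90405272946, 34.957995309646, -118.53149413589, 40.630630082643"


def parse_bbox(text):
    """Parse 'west, south, east, north' (assumed order) into a tuple of floats."""
    west, south, east, north = (float(part) for part in text.split(","))
    return west, south, east, north


def contains(bbox, lon, lat):
    """Return True if the point lies inside the rectangular bounding box."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


bbox = parse_bbox(SPATIAL)
# Approximate coordinates of Fresno, California (about 36.74 N, 119.79 W).
print(contains(bbox, -119.79, 36.74))  # True
```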