Publications and Datasets from Play-Fairway Retrospective Analysis with Emphasis on Developing Improved Hydrothermal Energy Assessments
Previous moderate- and high-temperature geothermal resource assessments of the western United States utilized data-driven methods and expert decisions to estimate resource favorability. Although expert decisions can add confidence to the modeling process by ensuring reasonable models are employed, expert decisions also introduce human and, thereby, model bias. This bias can present a source of error that reduces the predictive performance of the models and confidence in the resulting resource estimates.
This study aims to develop robust data-driven methods with the goals of reducing bias and improving predictive ability. This submission includes a list of papers, data releases, and presentations produced as part of this work.
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | public |
| bureauCode |
[
"019:20"
]
|
| contactPoint |
{
"fn": "Stanley P. Mordensky",
"@type": "vcard:Contact",
"hasEmail": "mailto:smordensky@usgs.gov"
}
|
| dataQuality |
true
|
| description | Previous moderate- and high-temperature geothermal resource assessments of the western United States utilized data-driven methods and expert decisions to estimate resource favorability. Although expert decisions can add confidence to the modeling process by ensuring reasonable models are employed, expert decisions also introduce human and, thereby, model bias. This bias can present a source of error that reduces the predictive performance of the models and confidence in the resulting resource estimates. This study aims to develop robust data-driven methods with the goals of reducing bias and improving predictive ability. This submission includes a list of papers, data releases, and presentations produced as part of this work. |
| distribution |
[
{
"@type": "dcat:Distribution",
"title": "When Less Is More - How Increasing the Complexity of Machine Learning Strategies for Geothermal Energy Assessments May Not Lead toward Better Estimates",
"format": "102662",
"accessURL": "https://doi.org/10.1016/j.geothermics.2023.102662",
"mediaType": "application/octet-stream",
"description": "Our study aims to develop robust data-driven methods with the goals of reducing bias and improving predictive ability. We present and compare nine favorability maps for geothermal resources in the western United States using data from the U.S. Geological Survey's 2008 geothermal resource assessment. Two favorability maps are created using the expert decision-dependent methods from the 2008 assessment (i.e., weight-of-evidence and logistic regression). With the same data, we then create six different favorability maps using logistic regression (without underlying expert decisions), XGBoost, and support-vector machines paired with two training strategies. The training strategies are customized to address the inherent challenges of applying machine learning to the geothermal training data, which have no negative examples and severe class imbalance. We also create another favorability map using an artificial neural network."
},
{
"@type": "dcat:Distribution",
"title": "Applying Data-Driven Machine Learning to Geothermal Favorability in Western United States",
"format": "HTML",
"accessURL": "https://doi.org/10.1130/abs/2021AM-365177",
"mediaType": "text/html",
"description": "This study demonstrates that two foundational machine learning algorithms (logistic regression and XGBoost), implemented using unbiased data analysis strategies, agree with previous studies that relied much more heavily on expert-systems knowledge."
},
{
"@type": "dcat:Distribution",
"title": "What Matters Most - Measuring Feature Importance for Geothermal Resources Using Supervised Learning",
"format": "HTML",
"accessURL": "https://doi.org/10.1130/abs/2022AM-376621",
"mediaType": "text/html",
"description": "Recent evaluation of strategies for conventional hydrothermal resource assessment in the United States has relied upon machine learning methods (i.e., logistic regression, SVMs, XGBoost, and multilayer perceptron neural networks [i.e., MLPs]) to predict resource favorability using features (i.e., heat flow, distance to faults, distance to magma bodies, maximum horizontal stress, and seismic event density) from the U.S. Geological Survey’s 2008 Geothermal Resource Assessment. Two of the machine learning algorithms (i.e., SVMs and MLPs) must rely on model-agnostic measures of feature importance (i.e., measures of feature importance that are applicable regardless of an algorithm’s conceptual framework; e.g., sensitivity analyses and SHapley Additive exPlanations [SHAP] values), while the other two machine learning algorithms also offer straightforward, model-gnostic (i.e., algorithm-specific) measures to interpret the relative contributions of features on favorability predictions (i.e., feature coefficients for logistic regression, weight, gain, cover, and F score for XGBoost). Relative feature importance is measured for all machine learning algorithms using the model-agnostic measures, and, when possible, model-gnostic measures are shown for comparison."
},
{
"@type": "dcat:Distribution",
"title": "Imperfect Data In. Imperfect Model Out - Using Competing Models to Decide If We Have the Right Data",
"format": "HTML",
"accessURL": "https://doi.org/10.1130/abs/2022AM-377146",
"mediaType": "text/html",
"description": "To facilitate comparison of methods, we use the same data from the 2008 geothermal resource assessment (e.g., heat flow, horizontal stress) to train models from modern machine learning algorithms (i.e., logistic regression, eXtreme Gradient Boosting, support vector machines, and multilayer perceptron neural networks), which minimize dependence upon expert decisions. While some algorithms are simple (e.g., logistic regression), other algorithms are highly sophisticated (e.g., the neural network). Despite the contrast in complexity, the results from the very simple and highly complex algorithms are similar."
},
{
"@type": "dcat:Distribution",
"title": "Geothermal resource favorability - select features and predictions for the western United States",
"format": "HTML",
"accessURL": "https://doi.org/10.5066/P9V1Q9XM",
"mediaType": "text/html",
"description": "The data contained herein are five input features (i.e., heat flow, distance to the nearest Quaternary fault, distance to the nearest Quaternary magma body, seismic event density, maximum horizontal stress) and labels (i.e., where known geothermal systems have been identified) from Williams and DeAngelo (2008) and nine favorability maps from Mordensky et al. (2023). The favorability maps are the untransformed predictions from models resulting from the features and labels used with either the methods presented in Williams and DeAngelo (2008) or the machine learning approaches presented in Mordensky et al. (2023)."
},
{
"@type": "dcat:Distribution",
"title": "Predicting Geothermal Favorability in the Western United States by Using Machine Learning - Addressing Challenges and Developing Solutions",
"format": "HTML",
"accessURL": "https://pangea.stanford.edu/ERE/db/IGAstandard/record_detail.php?id=35430",
"mediaType": "text/html",
"description": "This study aims to reduce expert input through robust data-driven analyses and better-suited data science techniques, with the goals of saving time, reducing bias, and improving predictive ability. We present six favorability maps for geothermal resources in the western United States created using two strategies applied to three modern machine learning algorithms (logistic regression, support-vector machines, and XGBoost). To provide a direct comparison to previous assessments, we use the same input data as the 2008 U.S. Geological Survey (USGS) conventional moderate- to high-temperature geothermal resource assessment."
},
{
"@type": "dcat:Distribution",
"title": "What Did They Just Say - Building a Rosetta Stone for Geoscience and Machine Learning",
"format": "HTML",
"accessURL": "https://www.geothermal-library.org/index.php?mode=pubs&action=view&record=1034680",
"mediaType": "text/html",
"description": "Our research group of geoscientists and machine learning experts presents a process to help geoscientists understand the fundamentals of supervised learning by describing the general workflow (i.e., a conceptual pipeline) for supervised learning that must be understood by all the parties involved in a geoscience-machine learning endeavor. Terms critical for machine learning are introduced, defined, and used within the context of an overly simplified mock hydrological study to illustrate their appropriate usage, and then used again in the context of a published geothermal-machine learning study."
}
]
|
| identifier | https://data.openei.org/submissions/7589 |
| issued | 2023-02-07T07:00:00Z |
| keyword |
[
"EGS",
"PFA",
"bias reduction",
"characterization",
"data-driven",
"energy",
"energy assessment",
"favorability",
"geoscience",
"geothermal",
"hydrothermal",
"low temp",
"machine learning",
"mapping",
"processed data",
"resource",
"resource assessment",
"retrospective",
"western US"
]
|
| landingPage | https://gdr.openei.org/submissions/1498 |
| license | https://creativecommons.org/licenses/by/4.0/ |
| modified | 2023-07-25T18:06:00Z |
| programCode |
[
"019:006"
]
|
| projectLead | Mike Weathers |
| projectNumber |
"24996"
|
| projectTitle | Play-Fairway Retrospective Analysis with Emphasis on Developing Improved Hydrothermal Energy Assessments |
| publisher |
{
"name": "United States Geological Survey",
"@type": "org:Organization"
}
|
| spatial |
"{"type":"Polygon","coordinates":[[[-126.2423875,28.97955582535226],[-100.76519375,28.97955582535226],[-100.76519375,49.033730499530634],[-126.2423875,49.033730499530634],[-126.2423875,28.97955582535226]]]}"
|
| title | Publications and Datasets from Play-Fairway Retrospective Analysis with Emphasis on Developing Improved Hydrothermal Energy Assessments |
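
The `spatial` field in the metadata above stores the dataset's footprint as a GeoJSON polygon serialized into a string. As a minimal sketch of how a reader might recover the bounding extent from that value, the Python below parses the string with the standard library only. The helper name `bounding_box` is hypothetical and not part of any catalog API, and the sketch assumes the field always holds a single `Polygon` whose first ring is the exterior ring.

```python
import json

# Value of the "spatial" field, copied from the metadata record above.
spatial = (
    '{"type":"Polygon","coordinates":[[[-126.2423875,28.97955582535226],'
    '[-100.76519375,28.97955582535226],[-100.76519375,49.033730499530634],'
    '[-126.2423875,49.033730499530634],[-126.2423875,28.97955582535226]]]}'
)


def bounding_box(geojson_text):
    """Return (west, south, east, north) for a GeoJSON Polygon string.

    Hypothetical helper for illustration; assumes [longitude, latitude]
    coordinate order, as GeoJSON specifies.
    """
    geometry = json.loads(geojson_text)
    if geometry["type"] != "Polygon":
        raise ValueError("expected a GeoJSON Polygon")
    ring = geometry["coordinates"][0]  # first ring is the exterior ring
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return min(lons), min(lats), max(lons), max(lats)


west, south, east, north = bounding_box(spatial)
print(f"longitude {west} to {east}, latitude {south} to {north}")
```

For this record, the extent spans roughly 126.2° W to 100.8° W and 29.0° N to 49.0° N, which matches the "western US" keyword.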