BuildingsBench: A Large-Scale Dataset of 900K Buildings and Benchmark for Short-Term Load Forecasting
Resources
6 resources available
- The Buildings-900K dataset and the BuildingsBench datasets in S3 (HTML)
- EULP for the U.S. Building Stock (HTML)
- NeurIPS Paper (00142)
- BuildingsBench GitHub Repo (HTML)
- Tutorials (HTML)
- OEDI Data Registry on AWS (HTML)
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | public |
| bureauCode |
[
"019:20"
]
|
| contactPoint |
{
"fn": "Patrick Emami",
"@type": "vcard:Contact",
"hasEmail": "mailto:pemami@nrel.gov"
}
|
| dataQuality |
true
|
| description | The BuildingsBench datasets consist of: (1) Buildings-900K, a large-scale dataset of 900K buildings for pretraining models on the task of short-term load forecasting (STLF), which is statistically representative of the entire U.S. building stock; and (2) 7 real residential and commercial building datasets for benchmarking two downstream tasks that evaluate generalization: zero-shot STLF and transfer learning for STLF. Buildings-900K can be used for pretraining models on day-ahead STLF for residential and commercial buildings. It fills a specific gap: the lack of time series datasets that are large and diverse enough for studying pretraining and finetuning with scalable machine learning models. Buildings-900K consists of synthetically generated energy consumption time series. It is derived from the NREL End-Use Load Profiles (EULP) dataset (linked in the distribution entries below). However, the EULP was not originally developed for STLF. Rather, it was developed to "...help electric utilities, grid operators, manufacturers, government entities, and research organizations make critical decisions about prioritizing research and development, utility resource and distribution system planning, and state and local energy planning and regulation." Like the EULP, Buildings-900K is a collection of Parquet files, and it follows nearly the same Parquet dataset organization as the EULP. Because it contains only a single energy consumption time series per building, it is much smaller than the EULP, at ~110 GB. BuildingsBench also provides an evaluation benchmark: a collection of open source energy consumption datasets from real residential and commercial buildings. The evaluation datasets, which are provided alongside Buildings-900K, are collections of CSV files that contain annual energy consumption. Together, the evaluation datasets are smaller than 1 GB. They are: (1) ElectricityLoadDiagrams20112014; (2) Building Data Genome Project-2; (3) Individual household electric power consumption (Sceaux); (4) Borealis; (5) SMART; (6) IDEAL; (7) Low Carbon London. A README file that describes how the data is stored and how the datasets are organized can be found within each data lake version under BuildingsBench. |
| distribution |
[
{
"@type": "dcat:Distribution",
"title": "The Buildings-900K dataset and the BuildingsBench datasets in S3",
"format": "HTML",
"accessURL": "https://data.openei.org/s3_viewer?bucket=oedi-data-lake&prefix=buildings-bench%2F",
"mediaType": "text/html",
"description": "Link to full dataset in OEDI S3 viewer."
},
{
"@type": "dcat:Distribution",
"title": "EULP for the U.S. Building Stock",
"format": "HTML",
"accessURL": "https://data.openei.org/submissions/4520",
"mediaType": "text/html",
"description": "Buildings-900K is derived from the End-Use Load Profiles (EULP) dataset from this OEDI submission."
},
{
"@type": "dcat:Distribution",
"title": "NeurIPS Paper",
"format": "00142",
"accessURL": "https://doi.org/10.48550/arXiv.2307.00142",
"mediaType": "application/octet-stream",
"description": "Link to our NeurIPS'23 Datasets & Benchmarks paper titled \"BuildingsBench: A Large-Scale Dataset of 900K Buildings and Benchmark for Short-Term Load Forecasting.\" This paper provides additional information on the datasets along with the analyses conducted with BuildingsBench."
},
{
"@type": "dcat:Distribution",
"title": "BuildingsBench GitHub Repo",
"format": "HTML",
"accessURL": "https://github.com/NREL/BuildingsBench",
"mediaType": "text/html",
"description": "Repository that contains code for large-scale pretraining and benchmarking for short-term load forecasting."
},
{
"@type": "dcat:Distribution",
"title": "Tutorials",
"format": "HTML",
"accessURL": "https://nrel.github.io/BuildingsBench/tutorials/",
"mediaType": "text/html",
"description": "Link to Tutorials page for BuildingsBench. This includes tutorials on getting started with the data and pretrained models, registering a new model with BuildingsBench, and computing aggregate statistics from results files."
},
{
"@type": "dcat:Distribution",
"title": "OEDI Data Registry on AWS",
"format": "HTML",
"accessURL": "https://registry.opendata.aws/oedi-data-lake/",
"mediaType": "text/html",
"description": "AWS public dataset program registry page for data released under the Department of Energy's (DOE) Open Energy Data Initiative (OEDI). The registry page contains documentation, access, and contact information for each of the OEDI Data Lake datasets."
}
]
|
| DOI | 10.25984/1986147 |
| identifier | https://data.openei.org/submissions/5859 |
| issued | 2018-12-31T07:00:00Z |
| keyword |
[
"EULP",
"STLF",
"benchmark",
"buildings",
"commercial",
"dataset",
"deep learning",
"end use load profiles",
"energy",
"load forecasting",
"machine learning",
"power",
"pretraining",
"processed data",
"residential",
"short-term",
"transfer learning"
]
|
| landingPage | https://data.openei.org/submissions/5859 |
| license | https://creativecommons.org/licenses/by/4.0/ |
| modified | 2024-01-11T07:00:01Z |
| programCode |
[
"019:000",
"019:002",
"019:023"
]
|
| projectNumber | 08GO28308 |
| projectTitle | Laboratory Directed Research and Development (LDRD) |
| publisher |
{
"name": "National Renewable Energy Laboratory",
"@type": "org:Organization"
}
|
| spatial |
"{\"type\":\"Polygon\",\"coordinates\":[[[-180,-83],[180,-83],[180,83],[-180,83],[-180,-83]]]}"
|
| title | BuildingsBench: A Large-Scale Dataset of 900K Buildings and Benchmark for Short-Term Load Forecasting |
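
The first distribution entry above points at an S3 viewer page, not at the bucket itself. As a minimal sketch, the bucket name and key prefix can be recovered from that URL's query string using only the Python standard library. The helper name `s3_location` is hypothetical and is not part of BuildingsBench or OEDI tooling. The sketch also assumes that the viewer's `bucket` and `prefix` query parameters map directly onto an `s3://` URI.

```python
from urllib.parse import urlparse, parse_qs

# accessURL copied from the first "distribution" entry in the metadata above.
ACCESS_URL = (
    "https://data.openei.org/s3_viewer"
    "?bucket=oedi-data-lake&prefix=buildings-bench%2F"
)


def s3_location(viewer_url: str) -> tuple[str, str]:
    """Return (bucket, prefix) read from an S3 viewer URL's query string.

    Hypothetical helper for illustration; it assumes the URL carries
    exactly one `bucket` and one `prefix` query parameter.
    """
    # parse_qs percent-decodes values, so "buildings-bench%2F"
    # becomes "buildings-bench/".
    query = parse_qs(urlparse(viewer_url).query)
    return query["bucket"][0], query["prefix"][0]


bucket, prefix = s3_location(ACCESS_URL)
s3_uri = f"s3://{bucket}/{prefix}"
print(s3_uri)  # s3://oedi-data-lake/buildings-bench/
```

The resulting URI could then be handed to an S3 client to browse the data lake. The directory layout beneath the prefix is not assumed here; the README files mentioned in the description are the reference for how each data lake version is organized.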