NIST Excerpts Benchmark Data
The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically.
Jan 2025 -- Benchmark Excerpts:
- NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records
- NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records
The data are curated subsets of U.S. Census Bureau products.
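The feature and record counts quoted above can be checked against a local copy of an excerpt. The following is a minimal sketch using only the Python standard library; it assumes the excerpt is available as a CSV file with a header row, and both the function name `summarize_excerpt` and the command-line path are hypothetical and not part of SDNist.

```python
import csv
import sys
from pathlib import Path


def summarize_excerpt(csv_path):
    """Return (record_count, feature_names) for a CSV file with a header row.

    Assumption: the first row holds the feature (column) names and every
    following row is one record.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        feature_names = next(reader, [])        # header row, or [] if the file is empty
        record_count = sum(1 for _ in reader)   # count the remaining rows
    return record_count, feature_names


if __name__ == "__main__":
    # Usage (the path is a placeholder for wherever the excerpt was saved):
    #   python summarize_excerpt.py path/to/excerpt.csv
    if len(sys.argv) == 2:
        records, features = summarize_excerpt(sys.argv[1])
        print(f"{records} records, {len(features)} features")
```

Comparing the printed counts with the figures in the description (for example, 24 features for the ACS excerpts) is a quick sanity check that the expected file was downloaded.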
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | public |
| bureauCode |
[
"006:55"
]
|
| contactPoint |
{
"fn": "Gary Howarth II",
"hasEmail": "mailto:gary.howarth@nist.gov"
}
|
| describedBy | https://github.com/usnistgov/SDNist/tree/main/BenchmarkData |
| description | The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically. Jan 2025 -- Benchmark Excerpts: - NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records; - NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records. The data are curated subsets of U.S. Census Bureau products. |
| distribution |
[
{
"title": "NIST Excerpt Benchmark Data",
"format": "A data repository",
"accessURL": "https://github.com/usnistgov/SDNist/tree/main/BenchmarkData",
"description": "The NIST Data Excerpts are curated subsets of publicly released tabular data sets, drawn from real households and businesses in the U.S. The Excerpts serve as benchmark data for the [SDNist v2: Deidentified Data Report Tool](https://github.com/usnistgov/SDNist/)."
}
]
|
| identifier | ark:/88434/mds2-2895 |
| issued | 2023-06-02 |
| keyword |
[
"American Community Survey",
"SDNist",
"demographic data",
"privacy",
"synthetic data"
]
|
| landingPage | https://data.nist.gov/od/id/mds2-2895 |
| language |
[
"en"
]
|
| license | https://www.nist.gov/open/license |
| modified | 2025-01-31 00:00:00 |
| programCode |
[
"006:045"
]
|
| publisher |
{
"name": "National Institute of Standards and Technology",
"@type": "org:Organization"
}
|
| theme |
[
"Information Technology:Data and informatics",
"Information Technology:Privacy",
"Information Technology:Software research"
]
|
| title | NIST Excerpts Benchmark Data |