SDNist v1.3: Temporal Map Challenge Environment
SDNist (v1.3) is a set of benchmark data and metrics for evaluating synthetic data generators on structured tabular data. This version (1.3) reproduces the challenge environment from Sprints 2 and 3 of the Temporal Map Challenge. The benchmarks are distributed as a simple open-source Python package to allow standardized and reproducible comparison of synthetic generator models on real-world data and use cases. The data and metrics were developed for and vetted through the NIST PSCR Differential Privacy Temporal Map Challenge, where the evaluation tools, k-marginal and Higher Order Conjunction, proved effective in distinguishing competing models in the competition environment.

SDNist is available via `pip` (`pip install sdnist==1.2.8`, for Python >=3.6) or from the [USNIST/GitHub](https://github.com/usnistgov/Differential-Privacy-Temporal-Map-Challenge-assets/) repository. The sdnist Python module downloads data from NIST as necessary, so users are not required to download data manually.
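
To make the k-marginal idea mentioned above more concrete, here is a minimal sketch of a k-marginal style comparison between a real and a synthetic table. It is not SDNist's implementation: the function names, the row-as-dict representation, and the choice to enumerate every k-column subset (rather than sampling subsets or binning values first) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def marginal_density(rows, columns):
    """Normalized histogram of the value tuples seen in the given columns.

    Assumes `rows` is a non-empty list of dicts (hypothetical representation).
    """
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def k_marginal_distance(real, synthetic, columns, k=2):
    """Mean absolute density difference over all k-column marginals.

    0.0 means the marginals match exactly; 2.0 means they never overlap.
    """
    distances = []
    for subset in combinations(columns, k):
        p = marginal_density(real, subset)
        q = marginal_density(synthetic, subset)
        # Sum |p - q| over every cell that appears in either table.
        cells = set(p) | set(q)
        distances.append(sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in cells))
    return sum(distances) / len(distances)
```

A lower distance indicates that the synthetic table reproduces the joint distribution of small groups of columns more faithfully; how SDNist converts such a distance into a reported score is not covered by this sketch.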
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | public |
| bureauCode |
[
"006:55"
]
|
| contactPoint |
{
"fn": "Gary Howarth II",
"hasEmail": "mailto:gary.howarth@nist.gov"
}
|
| description | SDNist (v1.3) is a set of benchmark data and metrics for evaluating synthetic data generators on structured tabular data. This version (1.3) reproduces the challenge environment from Sprints 2 and 3 of the Temporal Map Challenge. The benchmarks are distributed as a simple open-source Python package to allow standardized and reproducible comparison of synthetic generator models on real-world data and use cases. The data and metrics were developed for and vetted through the NIST PSCR Differential Privacy Temporal Map Challenge, where the evaluation tools, k-marginal and Higher Order Conjunction, proved effective in distinguishing competing models in the competition environment. SDNist is available via `pip` (`pip install sdnist==1.2.8`, for Python >=3.6) or from the [USNIST/GitHub](https://github.com/usnistgov/Differential-Privacy-Temporal-Map-Challenge-assets/) repository. The sdnist Python module downloads data from NIST as necessary, so users are not required to download data manually. |
| distribution |
[
{
"title": "DOI Access for SDNist: Benchmark data and evaluation tools for data synthesizers.",
"accessURL": "https://doi.org/10.18434/mds2-2515"
},
{
"title": "SDNist software repository at GitHub",
"format": "Python 3.8 module",
"accessURL": "https://github.com/usnistgov/SDNist/",
"description": "SDNist: Benchmark data and evaluation tools for synthetic data generators"
},
{
"title": "K-marginal report template",
"mediaType": "application/octet-stream",
"description": "A jinja2 report template to help humans read the k-marginal data",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/report2.jinja2"
},
{
"title": "Datasets for 'Census' evaluation in CSV format",
"format": "CSV",
"mediaType": "application/zip",
"description": "Three compressed CSV files to run the 'Census'-related functions in SDNist.",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/census-datasets-CSVs.zip"
},
{
"title": "Taxi datasets in CSV format",
"format": "CSV",
"mediaType": "application/zip",
"description": "Three compressed CSV files to run the 'Taxi'-related functions in SDNist.",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/taxi-datasets-CSVs.zip"
},
{
"title": "Census GA_NC_SC schema",
"mediaType": "application/json",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/GA_NC_SC_10Y_PUMS.json"
},
{
"title": "Census GA_NC_SC data",
"mediaType": "application/octet-stream",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/GA_NC_SC_10Y_PUMS.parquet"
},
{
"title": "Census IL_OH schema",
"mediaType": "application/json",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/IL_OH_10Y_PUMS.json"
},
{
"title": "Census IL_OH data",
"mediaType": "application/octet-stream",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/IL_OH_10Y_PUMS.parquet"
},
{
"title": "Census NY_PA schema",
"mediaType": "application/json",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/NY_PA_10Y_PUMS.json"
},
{
"title": "Census NY_PA data",
"mediaType": "application/octet-stream",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/NY_PA_10Y_PUMS.parquet"
},
{
"title": "Taxi schema",
"mediaType": "application/json",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/taxi.json"
},
{
"title": "Taxi data",
"mediaType": "application/octet-stream",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/taxi.parquet"
},
{
"title": "Taxi 2016 schema",
"mediaType": "application/json",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/taxi2016.json"
},
{
"title": "Taxi 2016 data",
"mediaType": "application/octet-stream",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/taxi2016.parquet"
},
{
"title": "Taxi 2020 schema",
"mediaType": "application/json",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/taxi2020.json"
},
{
"title": "Taxi 2020 data",
"mediaType": "application/octet-stream",
"downloadURL": "https://data.nist.gov/od/ds/mds2-2515/taxi2020.parquet"
}
]
|
| identifier | ark:/88434/mds2-2515 |
| issued | 2021-12-28 |
| keyword |
[
"benchmarks",
"differential privacy",
"privacy",
"private information sharing",
"synthetic data"
]
|
| landingPage | https://data.nist.gov/od/id/mds2-2515 |
| language |
[
"en"
]
|
| license | https://www.nist.gov/open/license |
| modified | 2021-12-06 00:00:00 |
| programCode |
[
"006:045"
]
|
| publisher |
{
"name": "National Institute of Standards and Technology",
"@type": "org:Organization"
}
|
| theme |
[
"Information Technology:Artificial Intelligence",
"Information Technology:Privacy",
"Public Safety:Public safety communications research"
]
|
| title | SDNist v1.3: Temporal Map Challenge Environment |
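
The sdnist module fetches data on its own, but the schema and Parquet files listed under `distribution` can also be retrieved directly. The sketch below builds the download URLs from the file names in the metadata and saves one dataset locally using only the Python standard library. The helper names `dataset_urls` and `fetch_dataset` and the `sdnist_data` target directory are hypothetical and are not part of the sdnist package.

```python
import json
import urllib.request
from pathlib import Path

# Base URL and file stems copied from the distribution entries above.
BASE_URL = "https://data.nist.gov/od/ds/mds2-2515/"
DATASETS = (
    "GA_NC_SC_10Y_PUMS",
    "IL_OH_10Y_PUMS",
    "NY_PA_10Y_PUMS",
    "taxi",
    "taxi2016",
    "taxi2020",
)


def dataset_urls(name):
    """Return the schema (JSON) and data (Parquet) URLs for one dataset."""
    if name not in DATASETS:
        raise ValueError(f"unknown dataset: {name!r}")
    return {
        "schema": f"{BASE_URL}{name}.json",
        "data": f"{BASE_URL}{name}.parquet",
    }


def fetch_dataset(name, target_dir="sdnist_data"):
    """Download both files and return (parsed schema, path to the Parquet file)."""
    urls = dataset_urls(name)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    schema_path = target / f"{name}.json"
    data_path = target / f"{name}.parquet"
    urllib.request.urlretrieve(urls["schema"], str(schema_path))
    urllib.request.urlretrieve(urls["data"], str(data_path))
    with open(schema_path, encoding="utf-8") as fh:
        schema = json.load(fh)
    return schema, data_path
```

Calling `fetch_dataset("taxi")` would download `taxi.json` and `taxi.parquet`; the Parquet file can then be opened with any Parquet reader, for example `pandas.read_parquet` when pandas and a Parquet engine are installed.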