NIST test dataset for assessing baseline nucleic acid sequence screening
This repository contains the dataset used in the manuscript "Inter-tool analysis of a NIST dataset for assessing baseline nucleic acid sequence screening". NIST constructed the test dataset based on the current screening recommendations from HHS. The dataset is a FASTA-formatted file with blinded numerical sequence headers. It was sent to developers of sequence screening tools for initial testing and to obtain feedback on its utility for assessing baseline sequence screening. An additional metadata file provides the NIST-assigned label for each sequence, along with a more detailed description derived from the source database.
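Because the FASTA headers are blinded numerical identifiers, the labels only become available once each sequence is paired with its row in the metadata file. The following is a minimal sketch of that pairing, not a NIST-provided tool. It assumes the metadata TSV has a header row and a column holding the same identifier that appears in the FASTA header; the column names `id` and `label` and the toy records are hypothetical, so check the README for the actual column names before use.

```python
import csv
import io


def read_fasta(handle):
    """Return {header_id: sequence} from a FASTA text stream."""
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            # Keep only the first whitespace-delimited token as the ID.
            tokens = line[1:].split()
            header = tokens[0] if tokens else ""
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def read_metadata(handle, id_column):
    """Return {id: row_dict} from a tab-separated stream with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_column]: row for row in reader}


def join_records(sequences, metadata):
    """Pair each sequence with its metadata row (None when no row matches)."""
    return {
        seq_id: (seq, metadata.get(seq_id))
        for seq_id, seq in sequences.items()
    }


# Toy data for illustration only; not taken from the NIST files.
# The column names "id" and "label" are assumptions.
example_fasta = io.StringIO(">1\nACGT\nACGT\n>2\nTTGA\n")
example_tsv = io.StringIO("id\tlabel\n1\texample_label_a\n2\texample_label_b\n")

sequences = read_fasta(example_fasta)
metadata = read_metadata(example_tsv, id_column="id")
joined = join_records(sequences, metadata)
```

To run this against the real files, replace the `io.StringIO` objects with `open(path, newline="")` handles for the downloaded FASTA and TSV files and pass the identifier column name actually used in the TSV.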
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | public |
| accrualPeriodicity | irregular |
| bureauCode | ["006:55"] |
| contactPoint | {"fn": "Tyler Laird", "hasEmail": "mailto:tyler.laird@nist.gov"} |
| description | This repository contains the dataset used in the manuscript "Inter-tool analysis of a NIST dataset for assessing baseline nucleic acid sequence screening". NIST constructed the test dataset based on the current screening recommendations from HHS. The dataset is a FASTA-formatted file with blinded numerical sequence headers. It was sent to developers of sequence screening tools for initial testing and to obtain feedback on its utility for assessing baseline sequence screening. An additional metadata file provides the NIST-assigned label for each sequence, along with a more detailed description derived from the source database. |
| distribution | [{"title": "NIST_nucleic_acid_synthesis_screening_test_dataset", "format": "FASTA", "mediaType": "text/plain", "description": "A FASTA file of blinded sequences used as a test for assessing baseline sequence screening capabilities of several nucleic acid synthesis screening tools.", "downloadURL": "https://data.nist.gov/od/ds/mds2-3787/NIST_nucleic_acid_synthesis_screening_test_dataset.fasta"}, {"title": "README", "mediaType": "text/markdown", "description": "A README file pertaining to the NIST test dataset for assessing baseline nucleic acid sequence screening.", "downloadURL": "https://data.nist.gov/od/ds/mds2-3787/README.md"}, {"title": "NIST_nucleic_acid_syntheisis_screening_test_dataset_metadata", "mediaType": "text/tab-separated-values", "description": "A file with additional information for each sequence in the associated FASTA file", "downloadURL": "https://data.nist.gov/od/ds/mds2-3787/NIST_nucleic_acid_syntheisis_screening_test_dataset_metadata.tsv"}] |
| identifier | ark:/88434/mds2-3787 |
| issued | 2025-05-21 |
| keyword | ["Biosecurity", "DNA", "Nucleic Acid Synthesis", "Sequence Screening"] |
| landingPage | https://data.nist.gov/od/id/mds2-3787 |
| language | ["en"] |
| license | https://www.nist.gov/open/license |
| modified | 2024-08-09 00:00:00 |
| programCode | ["006:045"] |
| publisher | {"name": "National Institute of Standards and Technology", "@type": "org:Organization"} |
| theme | ["Bioscience:Biomaterials", "Bioscience:Engineering/synthetic biology", "Public Safety:Chemical/Biological/Radiological/Nuclear/Explosives (CBRNE)"] |
| title | NIST test dataset for assessing baseline nucleic acid sequence screening |
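
The `distribution` field in the metadata above is a JSON array with one object per downloadable file. As a rough sketch of how a script might read that record, the snippet below parses an abridged copy of the array (titles and download URLs only, copied from this page, with the original file-name spelling preserved) and maps each title to its `downloadURL`. The helper name `download_urls` is hypothetical and not part of any NIST tooling.

```python
import json

# Abridged copy of the "distribution" array from this record.
distribution_json = """
[
  {
    "title": "NIST_nucleic_acid_synthesis_screening_test_dataset",
    "format": "FASTA",
    "downloadURL": "https://data.nist.gov/od/ds/mds2-3787/NIST_nucleic_acid_synthesis_screening_test_dataset.fasta"
  },
  {
    "title": "README",
    "downloadURL": "https://data.nist.gov/od/ds/mds2-3787/README.md"
  },
  {
    "title": "NIST_nucleic_acid_syntheisis_screening_test_dataset_metadata",
    "downloadURL": "https://data.nist.gov/od/ds/mds2-3787/NIST_nucleic_acid_syntheisis_screening_test_dataset_metadata.tsv"
  }
]
"""


def download_urls(distribution):
    """Map each distribution title to its download URL, skipping entries without one."""
    return {
        entry["title"]: entry["downloadURL"]
        for entry in distribution
        if "downloadURL" in entry
    }


urls = download_urls(json.loads(distribution_json))
for title, url in urls.items():
    print(f"{title}: {url}")
```

The resulting URLs can be passed to any HTTP client to fetch the files; the fetch itself is left out so that the sketch runs without network access.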