PlanktonSet 1.0: Plankton imagery data collected from F.G. Walton Smith in Straits of Florida from 2014-06-03 to 2014-06-06 and used in the 2015 National Data Science Bowl (NCEI Accession 0127422)
Data presented here are a subset of a larger plankton imagery data set collected in the subtropical Straits of Florida from 2014-05-28 to 2014-06-14. Imagery data were collected using the In Situ Ichthyoplankton Imaging System (ISIIS-2) as part of an NSF-funded project to assess the biophysical drivers affecting fine-scale interactions between larval fish, their prey, and predators. This subset of images was used in the inaugural National Data Science Bowl (www.datasciencebowl.com) hosted by Kaggle and sponsored by Booz Allen Hamilton. Data were originally collected to examine the biophysical drivers affecting fine-scale (spatial) interactions between larval fish, their prey, and predators in a subtropical pelagic marine ecosystem. Image segments extracted from the raw data were sorted into 121 plankton classes, split 50:50 into train and test data sets, and provided for a machine learning competition (the National Data Science Bowl). No hierarchical relationships were explicit in the 121 plankton classes, though the class naming convention and a tree-like diagram (see file "Plankton Relationships.pdf") indicated relationships between classes, whether taxonomic or structural (size and shape). We intend for this dataset to be available to the machine learning and computer vision community as a standard machine learning benchmark. This “Plankton 1.0” dataset is a medium-size dataset with a fair amount of complexity where image classification improvements can still be made.
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | non-public |
| contactPoint |
{
"fn": "NOAA National Centers for Environmental Information",
"@type": "vcard:Contact",
"hasEmail": "mailto:ncei.info@noaa.gov"
}
|
| describedByType | application/octet-stream |
| description | Data presented here are a subset of a larger plankton imagery data set collected in the subtropical Straits of Florida from 2014-05-28 to 2014-06-14. Imagery data were collected using the In Situ Ichthyoplankton Imaging System (ISIIS-2) as part of an NSF-funded project to assess the biophysical drivers affecting fine-scale interactions between larval fish, their prey, and predators. This subset of images was used in the inaugural National Data Science Bowl (www.datasciencebowl.com) hosted by Kaggle and sponsored by Booz Allen Hamilton. Data were originally collected to examine the biophysical drivers affecting fine-scale (spatial) interactions between larval fish, their prey, and predators in a subtropical pelagic marine ecosystem. Image segments extracted from the raw data were sorted into 121 plankton classes, split 50:50 into train and test data sets, and provided for a machine learning competition (the National Data Science Bowl). No hierarchical relationships were explicit in the 121 plankton classes, though the class naming convention and a tree-like diagram (see file "Plankton Relationships.pdf") indicated relationships between classes, whether taxonomic or structural (size and shape). We intend for this dataset to be available to the machine learning and computer vision community as a standard machine learning benchmark. This “Plankton 1.0” dataset is a medium-size dataset with a fair amount of complexity where image classification improvements can still be made. |
| distribution |
[
{
"@type": "dcat:Distribution",
"title": "NCEI Dataset Landing Page",
"mediaType": "placeholder/value",
"description": "Navigate directly to the URL for a descriptive web page with download links.",
"downloadURL": "https://doi.org/10.7289/v5d21vjd",
"describedByType": "application/octet-stream"
},
{
"@type": "dcat:Distribution",
"title": "Descriptive Information",
"mediaType": "placeholder/value",
"description": "Navigate directly to the URL for a descriptive web page with download links.",
"downloadURL": "https://www.ncei.noaa.gov/archive/accession/oas/127422",
"describedByType": "application/octet-stream"
},
{
"@type": "dcat:Distribution",
"title": "HTTPS",
"mediaType": "placeholder/value",
"description": "Navigate directly to the URL for data access and direct download.",
"downloadURL": "https://www.ncei.noaa.gov/archive/accession/download/127422",
"describedByType": "application/octet-stream"
},
{
"@type": "dcat:Distribution",
"title": "FTP",
"mediaType": "placeholder/value",
"description": "These data are available through the File Transfer Protocol (FTP). FTP is no longer supported by most internet browsers. You may copy and paste the FTP link to the data into an FTP client (e.g., FileZilla or WinSCP).",
"downloadURL": "ftp://ftp-oceans.ncei.noaa.gov/nodc/archive/arc0075/0127422/",
"describedByType": "application/octet-stream"
},
{
"@type": "dcat:Distribution",
"title": "GCMD Keyword Forum Page",
"mediaType": "placeholder/value",
"description": "Global Change Master Directory (GCMD). 2025. GCMD Keywords, Version 21. Greenbelt, MD: Earth Science Data and Information System, Earth Science Projects Division, Goddard Space Flight Center (GSFC), National Aeronautics and Space Administration (NASA). URL (GCMD Keyword Forum Page): https://forum.earthdata.nasa.gov/app.php/tag/GCMD+Keywords",
"downloadURL": "https://forum.earthdata.nasa.gov/app.php/tag/GCMD%2BKeywords",
"describedByType": "application/octet-stream"
},
{
"@type": "dcat:Distribution",
"title": "NCEI Contact Information",
"mediaType": "placeholder/value",
"description": "Information for contacts at NCEI.",
"downloadURL": "https://www.ncei.noaa.gov/contact",
"describedByType": "application/octet-stream"
}
]
|
| identifier | gov.noaa.nodc:0127422 |
| issued | 2015-04-28T00:00:00.000+00:00 |
| keyword |
[
"0127422",
"biological data",
"images",
"PLANKTON",
"biological",
"in situ",
"R/V F.G. Walton Smith",
"Oregon State University, Hatfield Marine Science Center",
"Oregon State University, Hatfield Marine Science Center",
"Straits of Florida",
"oceanography",
"DOC/NOAA/NESDIS/NODC > National Oceanographic Data Center, NESDIS, NOAA, U.S. Department of Commerce",
"National Data Science Bowl (www.datasciencebowl.com)",
"Spatial variability of larval fish in relation to their prey and predator fields: Patterns and interactions from cm to 10s of km in a subtropical, pelagic environment - NSF Award 1419987",
"EARTH SCIENCE > BIOLOGICAL CLASSIFICATION",
"EARTH SCIENCE > BIOLOGICAL CLASSIFICATION > PROTISTS > PLANKTON",
"EARTH SCIENCE > BIOSPHERE > ECOSYSTEMS > AQUATIC ECOSYSTEMS > PLANKTON",
"In situ Ichthyoplankton Imaging System (ISIIS)",
"F. G. Walton Smith (call sign: WCZ6292, ICES code: 33WA, 1999-)",
"OCEAN > ATLANTIC OCEAN > NORTH ATLANTIC OCEAN",
"MLHFP1"
]
|
| landingPage | https://www.ncei.noaa.gov/contact |
| language |
[]
|
| license | https://creativecommons.org/publicdomain/zero/1.0/ |
| modified | 2015-05-08T00:00:00.000+00:00 |
| publisher |
{
"name": "NOAA National Centers for Environmental Information",
"@type": "org:Organization"
}
|
| rights | otherRestrictions |
| spatial | -79.2,24.3,-81.9,26.0 |
| temporal | 2014-06-03T00:00:00+00:00/2014-06-06T00:00:00+00:00 |
| title | PlanktonSet 1.0: Plankton imagery data collected from F.G. Walton Smith in Straits of Florida from 2014-06-03 to 2014-06-06 and used in the 2015 National Data Science Bowl (NCEI Accession 0127422) |
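
The `temporal` and `spatial` values in the table are plain strings, so a consumer of this record has to parse them before use. The following is a minimal Python sketch of one way to do that, not an official NCEI tool. It assumes that `temporal` is an ISO 8601 `start/end` interval and that `spatial` lists two corner points as longitude,latitude pairs; the record does not state the corner order, so the sketch sorts the values rather than relying on it. The function names `parse_temporal` and `parse_bbox` are hypothetical.

```python
from datetime import datetime

# Values copied from the metadata table above.
TEMPORAL = "2014-06-03T00:00:00+00:00/2014-06-06T00:00:00+00:00"
SPATIAL = "-79.2,24.3,-81.9,26.0"


def parse_temporal(interval):
    """Split a 'start/end' interval into two timezone-aware datetimes."""
    start_text, end_text = interval.split("/")
    return datetime.fromisoformat(start_text), datetime.fromisoformat(end_text)


def parse_bbox(bbox):
    """Return (west, south, east, north).

    Assumption: the four numbers are lon,lat,lon,lat for two opposite
    corners. Sorting makes the result independent of corner order.
    """
    lon_a, lat_a, lon_b, lat_b = (float(value) for value in bbox.split(","))
    return (
        min(lon_a, lon_b),
        min(lat_a, lat_b),
        max(lon_a, lon_b),
        max(lat_a, lat_b),
    )


start, end = parse_temporal(TEMPORAL)
west, south, east, north = parse_bbox(SPATIAL)
print("days between start and end:", (end - start).days)
print("west, south, east, north:", west, south, east, north)
```

Under these assumptions the box spans longitudes -81.9 to -79.2 and latitudes 24.3 to 26.0, which is consistent with the Straits of Florida location given in the title.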