Deep Green Unannotated Protein Structures
The Deep Green list is based on the identification and curation of conserved unannotated proteins in three green lineage (Viridiplantae) model organisms: Arabidopsis thaliana, Chlamydomonas reinhardtii, and Setaria viridis. Preliminary characterization of the Deep Green proteins and genes was carried out using various informatics tools and published data sets, and is presented in Knoshaug, Sun, et al. (2023, submitted). The structures of these unannotated proteins were also predicted using AlphaFold (Jumper et al., 2021). The data deposited here are the AlphaFold structural predictions with the highest pLDDT score, and thus identified as the best folded structure (ranked_0). These data enable others to carry out in-depth structural characterization in support of functional characterization, leading to a deeper understanding of plant biology.

References:

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P., and Hassabis, D. (2021) Highly accurate protein structure prediction with AlphaFold. Nature, 596:583-589.

Knoshaug, E. P., Sun, P., Nag, A., Nguyen, H., Mattoon, E. M., Zhang, N., Liu, J., Chen, C., Cheng, J., Zhang, R., St. John, P., and Umen, J. (submitted) Identification and preliminary characterization of conserved uncharacterized proteins from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Setaria viridis.
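
The description states that the deposited models are those with the highest pLDDT score. As a rough illustration of how a user might check that score for a downloaded structure, the sketch below averages per-residue pLDDT over the C-alpha atoms of a PDB-format file. It assumes the files follow the usual AlphaFold convention of writing pLDDT into the B-factor column; the archive contents have not been inspected here, and the names `mean_plddt` and `synthetic_atom_line` are hypothetical helpers introduced only for this example.

```python
from statistics import mean


def mean_plddt(pdb_text):
    """Average pLDDT over C-alpha atoms, read from the B-factor column.

    Assumption: the file is PDB format and stores pLDDT in columns 61-66,
    as AlphaFold output normally does.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name is 13-16, B-factor is 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return mean(scores)


def synthetic_atom_line(serial, atom_name, res_name, res_seq, plddt):
    """Build one fixed-width ATOM record for demonstration (not real data)."""
    return "ATOM  {:5d} {:<4s} {:3s} A{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        serial, atom_name, res_name, res_seq, 0.0, 0.0, 0.0, 1.0, plddt
    )
```

For a real file, the text could be read with `pathlib.Path(path).read_text()` and passed to `mean_plddt`. Averaging over one atom per residue avoids weighting large side chains more heavily than small ones.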
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | public |
| bureauCode | ["019:20"] |
| contactPoint | {"fn": "Eric Knoshaug", "@type": "vcard:Contact", "hasEmail": "mailto:Eric.Knoshaug@nrel.gov"} |
| dataQuality | true |
| description | The Deep Green list is based on the identification and curation of conserved unannotated proteins in three green lineage (Viridiplantae) model organisms: Arabidopsis thaliana, Chlamydomonas reinhardtii, and Setaria viridis. Preliminary characterization of the Deep Green proteins and genes was carried out using various informatics tools and published data sets, and is presented in Knoshaug, Sun, et al. (2023, submitted). The structures of these unannotated proteins were also predicted using AlphaFold (Jumper et al., 2021). The data deposited here are the AlphaFold structural predictions with the highest pLDDT score, and thus identified as the best folded structure (ranked_0). These data enable others to carry out in-depth structural characterization in support of functional characterization, leading to a deeper understanding of plant biology. References: Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., Back, T., Petersen, S., Reiman, D., Clancy, E., Zielinski, M., Steinegger, M., Pacholska, M., Berghammer, T., Bodenstein, S., Silver, D., Vinyals, O., Senior, A. W., Kavukcuoglu, K., Kohli, P., and Hassabis, D. (2021) Highly accurate protein structure prediction with AlphaFold. Nature, 596:583-589. Knoshaug, E. P., Sun, P., Nag, A., Nguyen, H., Mattoon, E. M., Zhang, N., Liu, J., Chen, C., Cheng, J., Zhang, R., St. John, P., and Umen, J. (submitted) Identification and preliminary characterization of conserved uncharacterized proteins from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Setaria viridis. |
| distribution | [{"@type": "dcat:Distribution", "title": "Predicted structures for Arabidopsis thaliana unannotated proteins", "accessURL": "https://data.nrel.gov/system/files/216/A-1682005105.thaliana_unannotated%20structures.zip", "mediaType": "application/octet-stream", "description": "Predicted structures for Arabidopsis thaliana unannotated proteins"}, {"@type": "dcat:Distribution", "title": "Predicted structures for Chlamydomonas reinhardtii unannotated proteins", "accessURL": "https://data.nrel.gov/system/files/216/C-1682005105.reinhardtii_unannotated%20structures.zip", "mediaType": "application/octet-stream", "description": "Predicted structures for Chlamydomonas reinhardtii unannotated proteins"}, {"@type": "dcat:Distribution", "title": "Predicted structures of the Deep Green protein set", "accessURL": "https://data.nrel.gov/system/files/216/Deep%20Green%20folded%20structures-1682005105.zip", "mediaType": "application/octet-stream", "description": "Predicted structures of the Deep Green protein set"}, {"@type": "dcat:Distribution", "title": "Predicted structures for Setaria viridis unannotated proteins", "accessURL": "https://data.nrel.gov/system/files/216/S-1682005105.viridis_unannotated%20structures_0.zip", "mediaType": "application/octet-stream", "description": "Predicted structures for Setaria viridis unannotated proteins"}] |
| identifier | https://data.openei.org/submissions/8267 |
| issued | 2023-04-20T16:14:18Z |
| keyword | ["AlphaFold", "Arabidopsis thaliana", "Chlamydomonas reinhardtii", "Donald Danforth Plant Science Center", "Setaria viridis", "energy crop", "green lineage", "model species", "protein structure", "unannotated proteins"] |
| landingPage | https://data.nrel.gov/submissions/216 |
| license | https://creativecommons.org/licenses/by/4.0/ |
| modified | 2025-01-17T20:52:58Z |
| programCode | ["019:005"] |
| projectNumber | ERW9098 |
| projectTitle | Deep Green: Structural and Functional Genomic Characterization of Conserved Unannotated Green Lineage Proteins |
| publisher | {"name": "National Renewable Energy Laboratory", "@type": "org:Organization"} |
| title | Deep Green Unannotated Protein Structures |
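
The `distribution` field above lists one archive per organism plus the combined Deep Green set. The following sketch shows one way to pull the titles and human-readable file names out of such a record once it has been loaded as a Python dictionary. It assumes the record has the same key layout as the metadata shown here; `list_distributions` is a hypothetical helper, and nothing is downloaded.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def list_distributions(record):
    """Return (title, decoded file name, accessURL) for each distribution.

    Assumption: `record` is a dict with a "distribution" list whose items
    carry "title" and "accessURL" keys, as in the metadata table.
    """
    rows = []
    for dist in record.get("distribution", []):
        url = dist["accessURL"]
        # Take the URL path, decode percent-escapes such as %20, keep the last segment.
        filename = PurePosixPath(unquote(urlsplit(url).path)).name
        rows.append((dist.get("title", ""), filename, url))
    return rows
```

Decoding the file name separately from the URL keeps the original, percent-encoded `accessURL` intact for any later download step.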