Skip to main content
U.S. flag

An official website of the United States government

This site is currently in beta, and your feedback is helping shape its ongoing development.

Code used to produce terms list in the work "NLP-Driven Electron Microscopy Ontology Development"

Published by National Institute of Standards and Technology | National Institute of Standards and Technology | Metadata Last Checked: August 02, 2025 | Last Modified: 2021-12-31 00:00:00
This is a collection of code written by Maurice Curran that was used to process the Microscopy and Microanalysis conference proceeding corpus into word products described in the publication "NLP-Driven Electron Microscopy Ontology Development". The scripts are written in Python, to be used in the following order:1. SettingUpTextFiles.py and CopyingText.py to get the raw text files; 2. SentenceConversion.py; 3. reference_remover.py; 4. testing.py and testingavg.py; 5. SentenceCreator.py; 6. matscholar_model.py to get matscholar tags; 7. training_model_gensim.py to get gensim model;8. word2vecscript.py and gensim_visual.py;

Find Related Datasets

Click any tag below to search for similar datasets

data.gov

An official website of the GSA's Technology Transformation Services

Looking for U.S. government information and services?
Visit USA.gov