Found 340 datasets matching "text processing".
-
A collection of full-text documents from various sources including the Financial Times Limited (1991, 1992, 1993, 1994), the Congressional Record of the 103rd Congress (1993), the Federal Register...
Search relevance: 49.09 | Views last month: 0 -
Keyword Search in Text Cube: Finding Top-K Relevant Cells. Bolin Ding, Yintao Yu, Bo Zhao, Cindy Xide Lin, Jiawei Han, and ChengXiang Zhai. Abstract. We study the problem of keyword...
Search relevance: 40.74 | Views last month: 1 -
This dataset contains surface sediment grain size distributions derived from automated image processing of in situ seafloor images obtained with an underwater camera system at four sites (SKM,...
Search relevance: 39.14 | Views last month: 0 -
An R-Shiny application has been developed that allows users to import text-based air sensor data, define the format of that data, do basic quality control, and export the data to standard formats....
Search relevance: 36.75 | Views last month: 1 -
The supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/toxics11120951/s1. Text S1: Chemicals; Text S2: Thyroid Hormone Chemicals and Analysis; Text S3: In Vivo...
Search relevance: 36.58 | Views last month: 2 -
Round 6 Test Dataset: This is the test data used to construct and evaluate trojan detection software solutions. This data, generated at NIST, consists of natural language processing (NLP) AIs...
Search relevance: 36.56 | Views last month: 0 -
Round 6 Holdout Dataset: This is the holdout data used to construct and evaluate trojan detection software solutions. This data, generated at NIST, consists of natural language processing (NLP) AIs...
Search relevance: 36.47 | Views last month: 2 -
This dataset contains the reference genome assembly, created from 23 Hawaiian hoary bats collected across four Hawaiian islands: Hawaii, Maui, Oahu, and Kauai. These data were collected in...
Search relevance: 36.39 | Views last month: 1 -
This dataset contains surface sediment grain size distributions derived from manual point counts of in situ seafloor images obtained with an underwater camera system in the lower Columbia River,...
Search relevance: 35.69 | Views last month: 0 -
Round 6 Train Dataset: This is the training data used to construct and evaluate trojan detection software solutions. This data, generated at NIST, consists of natural language processing (NLP) AIs...
Search relevance: 35.62 | Views last month: 0 -
Round 6 Train Dataset part2: This is the training data used to construct and evaluate trojan detection software solutions. This data, generated at NIST, consists of natural language processing (NLP)...
Search relevance: 35.62 | Views last month: 0 -
As the amount of textual information grows explosively in various kinds of business systems, it becomes more and more desirable to analyze both structured data records and unstructured text data...
Search relevance: 34.73 | Views last month: 3 -
This archive volume is one of a set of volumes containing raw and derived data from the Mars Exploration Rover mission. This volume contains "science" data products, which were generated by the...
Search relevance: 33.88 | Views last month: 0 -
nlp-question-answering-aug2023-train: This is the train data used to evaluate trojan detection software solutions. This data, generated at NIST, consists of natural language processing (NLP) AIs...
Search relevance: 33.77 | Views last month: 2 -
This archive volume is one of a set of volumes containing raw and derived data from the Mars Exploration Rover mission. This volume contains 'science' data products, which were generated by...
Search relevance: 33.73 | Views last month: 0 -
This vegetation mapping project of Suisun Marsh blends ground-based classification, aerial photo interpretation, and GIS editing and processing. The method is based on the development of a...
Search relevance: 33.37 | Views last month: 0 -
This vegetation mapping project of Suisun Marsh blends ground-based classification, aerial photo interpretation, and GIS editing and processing. The method is based on the development of a...
Search relevance: 33.37 | Views last month: 0 -
This data release contains a single vector shapefile and two text documents with code used to generate the data product. This vector shapefile contains the locations of 365 “plugged and...
Search relevance: 32.92 | Views last month: 0