Skip to main content
U.S. flag

An official website of the United States government

This site is currently in beta, and your feedback is helping shape its ongoing development.

ARC Code TI: sequenceMiner

Published by Ames Research Center | National Aeronautics and Space Administration | Metadata Last Checked: August 04, 2025 | Last Modified: 2025-07-14
The sequenceMiner was developed to address the problem of detecting and describing anomalies in large sets of high-dimensional symbol sequences. sequenceMiner works by performing unsupervised clustering (grouping) of sequences using the normalized longest common subsequence (LCS) as a similarity measure, followed by a detailed analysis of outliers to detect anomalies. sequenceMiner utilizes a new hybrid algorithm for computing the LCS that has been shown to outperform existing algorithms by a factor of five. sequenceMiner also includes new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence was deemed to be an outlier. This provides analysts with a coherent description of the anomalies identified in the sequence, and why they differ from more 'normal' sequences.

Find Related Datasets

Click any tag below to search for similar datasets

data.gov

An official website of the GSA's Technology Transformation Services

Looking for U.S. government information and services?
Visit USA.gov