sequenceMiner algorithm
Detecting and describing anomalies in large repositories of discrete symbol sequences.
**sequenceMiner has been open-sourced! Download the file below to try it out.**
sequenceMiner was developed to address the problem of detecting and describing anomalies in large sets of high-dimensional symbol sequences. sequenceMiner works by performing unsupervised clustering (grouping) of sequences using the normalized longest common subsequence (LCS) as a similarity measure, followed by a detailed analysis of outliers to detect anomalies. sequenceMiner utilizes a new hybrid algorithm for computing the LCS that has been shown to outperform existing algorithms by a factor of five.
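The similarity measure described above can be illustrated with a short sketch. It uses the textbook dynamic-programming LCS, not sequenceMiner's hybrid algorithm, and it assumes one common normalization: dividing the LCS length by the geometric mean of the two sequence lengths. The function names `lcs_length` and `normalized_lcs` are hypothetical and are not taken from the sequenceMiner distribution.

```python
from math import sqrt
from typing import Hashable, Sequence


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Length of the longest common subsequence (textbook dynamic programming)."""
    # Only the previous row of the DP table is needed, so memory is O(len(b)).
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching symbols extend the best subsequence of both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of dropping one symbol.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def normalized_lcs(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Similarity in [0, 1]; the normalization here is an assumption."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / sqrt(len(a) * len(b))


# Illustrative symbol sequences (made up for this example).
seq_a = ["A", "B", "C", "D"]
seq_b = ["A", "C", "D", "E"]
similarity = normalized_lcs(seq_a, seq_b)
```

A clustering step could then treat `1 - normalized_lcs(a, b)` as a distance between sequences, although the source does not state which clustering method sequenceMiner uses.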
sequenceMiner also includes new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence was deemed to be an outlier. This provides analysts with a coherent description of the anomalies identified in the sequence, and why they differ from more “normal” sequences.
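The source does not say how these indicators are computed. As a rough illustration only, one hypothetical way to explain an outlier is to align it against a reference sequence through their LCS and report the symbols that fall outside the alignment. The names `lcs_table` and `describe_differences` are assumptions made for this sketch and do not come from sequenceMiner.

```python
from typing import List, Sequence, Tuple


def lcs_table(a: Sequence, b: Sequence) -> List[List[int]]:
    """Full DP table: table[i][j] is the LCS length of a[:i] and b[:j]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def describe_differences(outlier: Sequence, reference: Sequence) -> Tuple[list, list]:
    """Return (unexpected, missing) as lists of (position, symbol) pairs.

    unexpected: symbols of the outlier that are not part of the alignment.
    missing: symbols of the reference that the outlier lacks.
    """
    table = lcs_table(outlier, reference)
    unexpected, missing = [], []
    i, j = len(outlier), len(reference)
    # Walk the table backwards from the bottom-right corner.
    while i > 0 and j > 0:
        if outlier[i - 1] == reference[j - 1]:
            i, j = i - 1, j - 1  # aligned symbol, nothing to report
        elif table[i - 1][j] >= table[i][j - 1]:
            unexpected.append((i - 1, outlier[i - 1]))
            i -= 1
        else:
            missing.append((j - 1, reference[j - 1]))
            j -= 1
    # Whatever remains at the front of either sequence is unaligned.
    while i > 0:
        unexpected.append((i - 1, outlier[i - 1]))
        i -= 1
    while j > 0:
        missing.append((j - 1, reference[j - 1]))
        j -= 1
    return unexpected[::-1], missing[::-1]


# Illustrative sequences (made up for this example).
unexpected, missing = describe_differences(["A", "B", "X", "C"], ["A", "B", "C", "D"])
```

Here `unexpected` lists the symbols that appear only in the outlier and `missing` lists those present only in the reference, each with its position, which is the kind of readable explanation the paragraph above describes.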
sequenceMiner was developed with funding from the NASA Aviation Safety Program. In the commercial aviation domain, sequenceMiner can be used to discover atypical behavior in airline performance data that may have operational significance for safety analysts. Because the sequenceMiner approach is general and not restricted to any one domain, these algorithms can also be applied in other fields where anomaly detection and event mining would be useful.
Complete Metadata
| @type | dcat:Dataset |
|---|---|
| accessLevel | public |
| accrualPeriodicity | irregular |
| bureauCode | ["026:00"] |
| contactPoint | {"fn": "Suratna Budalakoti", "@type": "vcard:Contact", "hasEmail": "mailto:suratna_b@yahoo.com"} |
| description | Detecting and describing anomalies in large repositories of discrete symbol sequences. **sequenceMiner has been open-sourced! Download the file below to try it out.** sequenceMiner was developed to address the problem of detecting and describing anomalies in large sets of high-dimensional symbol sequences. sequenceMiner works by performing unsupervised clustering (grouping) of sequences using the normalized longest common subsequence (LCS) as a similarity measure, followed by a detailed analysis of outliers to detect anomalies. sequenceMiner utilizes a new hybrid algorithm for computing the LCS that has been shown to outperform existing algorithms by a factor of five. sequenceMiner also includes new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence was deemed to be an outlier. This provides analysts with a coherent description of the anomalies identified in the sequence, and why they differ from more “normal” sequences. sequenceMiner was developed with funding from the NASA Aviation Safety Program. In the commercial aviation domain, sequenceMiner can be used to discover atypical behavior in airline performance data that may have operational significance for safety analysts. Because the sequenceMiner approach is general and not restricted to any one domain, these algorithms can also be applied in other fields where anomaly detection and event mining would be useful. |
| distribution | [{"@type": "dcat:Distribution", "title": "SequenceMiner1.2.tar.gz", "format": "GZ", "mediaType": "application/x-gzip", "description": "Fix null pointer exception.", "downloadURL": "https://c3.nasa.gov/dashlink/static/media/algorithm/SequenceMiner1.2.tar.gz"}, {"@type": "dcat:Distribution", "title": "UnformattedFiles2Seq.tar.gz", "format": "GZ", "mediaType": "application/x-gzip", "description": "Matlab/Octave scripts to convert files to sequences", "downloadURL": "https://c3.nasa.gov/dashlink/static/media/algorithm/UnformattedFiles2Seq.tar.gz"}, {"@type": "dcat:Distribution", "title": "SequenceMiner1.3.tar.gz", "format": "GZ", "mediaType": "application/x-gzip", "description": "Random seed option and unique ID feature.", "downloadURL": "https://c3.nasa.gov/dashlink/static/media/algorithm/SequenceMiner1.3.tar.gz"}, {"@type": "dcat:Distribution", "title": "SequenceMiner.tar.gz", "format": "GZ", "mediaType": "application/x-gzip", "description": "SequenceMiner.tar.gz", "downloadURL": "https://c3.nasa.gov/dashlink/static/media/algorithm/SequenceMiner.tar.gz"}, {"@type": "dcat:Distribution", "title": "SequenceMiner1.1.tar.gz", "format": "GZ", "mediaType": "application/x-gzip", "description": "Speed increase.", "downloadURL": "https://c3.nasa.gov/dashlink/static/media/algorithm/SequenceMiner1.1.tar.gz"}] |
| identifier | DASHLINK_115 |
| issued | 2010-09-10 |
| keyword | ["ames", "dashlink", "nasa"] |
| landingPage | https://c3.nasa.gov/dashlink/resources/115/ |
| modified | 2025-03-31 |
| programCode | ["026:029"] |
| publisher | {"name": "Dashlink", "@type": "org:Organization"} |
| title | sequenceMiner algorithm |