KEYWORD SEARCH IN TEXT CUBE: FINDING TOP-K RELEVANT CELLS

Published by Dashlink | National Aeronautics and Space Administration | Metadata Last Checked: January 03, 2026 | Last Modified: 2025-04-01

Bolin Ding, Yintao Yu, Bo Zhao, Cindy Xide Lin, Jiawei Han, and ChengXiang Zhai

Abstract. We study the problem of keyword search in a data cube with text-rich dimension(s) (a so-called text cube). The text cube is built on a multidimensional text database, where each row is associated with some text data (e.g., a document) and other structural dimensions (attributes). A cell in the text cube aggregates a set of documents with matching attribute values in a subset of dimensions. A cell document is the concatenation of all documents in a cell. Given a keyword query, our goal is to find the top-k most relevant cells in the text cube, ranked according to the relevance scores of cell documents w.r.t. the given query. We define a keyword-based query language and apply an IR-style relevance model for scoring and ranking cell documents in the text cube. We propose two efficient approaches to find the top-k answers. The proposed approaches support a general class of IR-style relevance scoring formulas that satisfy certain basic and common properties. One approach spends more time on pre-processing and less time answering online queries; the other is more efficient in pre-processing but spends more time on online queries. Experimental studies on the ASRS dataset are conducted to verify the efficiency and effectiveness of the proposed approaches.
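The definitions in the abstract (cell, cell document, top-k relevant cells) can be illustrated with a naive brute-force sketch. This is not either of the two approaches proposed in the paper, which are not described on this page; it simply enumerates every cell and scores its cell document. The function names, the row layout (a dict with a `text` field plus one string field per dimension), and the toy relevance score (query-term occurrences divided by cell-document length) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
import heapq


def build_cells(rows, dims):
    """Map each cell to the tokens of its cell document.

    A cell is identified by a tuple of (dimension, value) pairs over some
    subset of the dimensions; its cell document is the concatenation of the
    text of every row matching those values.
    """
    cells = defaultdict(list)
    for row in rows:
        tokens = row["text"].lower().split()
        # Each subset of dimensions places this row in exactly one cell.
        # The empty subset is the cell that aggregates all rows.
        for size in range(len(dims) + 1):
            for subset in combinations(dims, size):
                key = tuple((d, row[d]) for d in subset)
                cells[key].extend(tokens)
    return cells


def relevance(tokens, query_terms):
    """Toy score (an assumption, not the paper's formula):
    query-term occurrences divided by cell-document length."""
    if not tokens:
        return 0.0
    hits = sum(tokens.count(term) for term in query_terms)
    return hits / len(tokens)


def top_k_cells(rows, dims, query, k):
    """Return the k highest-scoring cells as (score, cell) pairs."""
    cells = build_cells(rows, dims)
    terms = query.lower().split()
    scored = ((relevance(toks, terms), key) for key, toks in cells.items())
    return heapq.nlargest(k, scored)
```

Because the number of cells grows exponentially with the number of dimensions, this exhaustive enumeration is only practical for tiny inputs, which is the kind of cost that motivates trading pre-processing time against online query time.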

Complete Metadata

@type	dcat:Dataset
accessLevel	public
accrualPeriodicity	irregular
bureauCode	[ "026:00" ]
contactPoint	{ "fn": "Elizabeth Foughty", "@type": "vcard:Contact", "hasEmail": "mailto:elizabeth.a.foughty@nasa.gov" }
description	KEYWORD SEARCH IN TEXT CUBE: FINDING TOP-K RELEVANT CELLS BOLIN DING, YINTAO YU, BO ZHAO, CINDY XIDE LIN, JIAWEI HAN, AND CHENGXIANG ZHAI Abstract. We study the problem of keyword search in a data cube with text-rich dimension(s) (so-called text cube). The text cube is built on a multidimensional text database, where each row is associated with some text data (e.g., a document) and other structural dimensions (attributes). A cell in the text cube aggregates a set of documents with matching attribute values in a subset of dimensions. A cell document is the concatenation of all documents in a cell. Given a keyword query, our goal is to find the top-k most relevant cells (ranked according to the relevance scores of cell documents w.r.t. the given query) in the text cube. We define a keyword-based query language and apply IR-style relevance model for scoring and ranking cell documents in the text cube. We propose two efficient approaches to find the top-k answers. The proposed approaches support a general class of IR-style relevance scoring formulas that satisfy certain basic and common properties. One of them uses more time for pre-processing and less time for answering online queries; and the other one is more efficient in pre-processing and consumes more time for online queries. Experimental studies on the ASRS dataset are conducted to verify the efficiency and effectiveness of the proposed approaches.
distribution	[ { "@type": "dcat:Distribution", "title": "Paper 12 .pdf", "format": "PDF", "mediaType": "application/pdf", "description": "KEYWORD SEARCH IN TEXT CUBE: FINDING TOP-K RELEVANT CELLS", "downloadURL": "https://c3.nasa.gov/dashlink/static/media/publication/Paper_12_.pdf" } ]
identifier	DASHLINK_234
issued	2010-10-13
keyword	[ "ames", "dashlink", "nasa" ]
landingPage	https://c3.nasa.gov/dashlink/resources/234/
modified	2025-04-01
programCode	[ "026:029" ]
publisher	{ "name": "Dashlink", "@type": "org:Organization" }
title	KEYWORD SEARCH IN TEXT CUBE: FINDING TOP-K RELEVANT CELLS

1 resource available