OPTIMAL PARTITIONS OF DATA IN HIGHER DIMENSIONS

Published by Dashlink | National Aeronautics and Space Administration | Metadata Last Checked: August 04, 2025 | Last Modified: 2025-03-31

OPTIMAL PARTITIONS OF DATA IN HIGHER DIMENSIONS BRADLEY W. JACKSON*, JEFFREY D. SCARGLE**, AND CHRIS CUSANZA, DAVID BARNES, DENNIS KANYGIN, RUSSELL SARMIENTO, SOWMYA SUBRAMANIAM, TZU-WANG CHUANG*** Abstract. Consider piece-wise constant approximations to a function of several parameters, and the problem of finding the best such approximation from measurements at a set of points in the parameter space. We find good approximate solutions to this problem in two steps: (1) partition the parameter space into cells, one for each of the N data points, and (2) collect these cells into blocks, such that within each block the function is constant to within measurement uncertainty. We describe a branch-and-bound algorithm for finding the optimal partition into connected blocks, as well as an O(N2) dynamic programming algorithm that finds the exact global optimum over this exponentially large search space, in a data space of any dimension. This second solution relaxes the connectivity constraint, and requires additivity and convexity conditions on the block fitness function, but in practice none of these items cause problems. From the wide variety of intelligent data understanding applications (including cluster analysis, classification, and anomaly detection) we demonstrate two: partitioning of the State of California (2D) and the Universe (3D).

Find Related Datasets

Click any tag below to search for similar datasets

Complete Metadata

@type	dcat:Dataset
accessLevel	public
accrualPeriodicity	irregular
bureauCode	[ "026:00" ]
contactPoint	{ "fn": "Elizabeth Foughty", "@type": "vcard:Contact", "hasEmail": "mailto:elizabeth.a.foughty@nasa.gov" }
description	OPTIMAL PARTITIONS OF DATA IN HIGHER DIMENSIONS BRADLEY W. JACKSON, JEFFREY D. SCARGLE, AND CHRIS CUSANZA, DAVID BARNES, DENNIS KANYGIN, RUSSELL SARMIENTO, SOWMYA SUBRAMANIAM, TZU-WANG CHUANG** Abstract. Consider piece-wise constant approximations to a function of several parameters, and the problem of finding the best such approximation from measurements at a set of points in the parameter space. We find good approximate solutions to this problem in two steps: (1) partition the parameter space into cells, one for each of the N data points, and (2) collect these cells into blocks, such that within each block the function is constant to within measurement uncertainty. We describe a branch-and-bound algorithm for finding the optimal partition into connected blocks, as well as an O(N2) dynamic programming algorithm that finds the exact global optimum over this exponentially large search space, in a data space of any dimension. This second solution relaxes the connectivity constraint, and requires additivity and convexity conditions on the block fitness function, but in practice none of these items cause problems. From the wide variety of intelligent data understanding applications (including cluster analysis, classification, and anomaly detection) we demonstrate two: partitioning of the State of California (2D) and the Universe (3D).
distribution	[ { "@type": "dcat:Distribution", "title": "Paper 8 .pdf", "format": "PDF", "mediaType": "application/pdf", "description": "OPTIMAL PARTITIONS OF DATA IN HIGHER DIMENSIONS", "downloadURL": "https://c3.nasa.gov/dashlink/static/media/publication/Paper_8_.pdf" } ]
identifier	DASHLINK_230
issued	2010-10-13
keyword	[ "ames", "dashlink", "nasa" ]
landingPage	https://c3.nasa.gov/dashlink/resources/230/
modified	2025-03-31
programCode	[ "026:029" ]
publisher	{ "name": "Dashlink", "@type": "org:Organization" }
title	OPTIMAL PARTITIONS OF DATA IN HIGHER DIMENSIONS

1 resource available