Over the past decade, the growth of NASA's remote sensing data archives
has kept pace with Moore's law, doubling every 18 months.
Just as importantly, remote sensing data is finding its way into more
and more applications, most of which require data in real-time or
near real-time.
The combination of low latency requirements with increasing data
volumes poses a major challenge for data management.
In order to make the right data available at the right time, a data
system must access and apply knowledge about the content of the data
in its data management decisions.
This particular decision support domain includes aspects such
as automatic quality assessment, feature detection to
support caching decisions, and content-based metadata to support
efficient data selection.
Bayesian Classification of Data Content
In order to be useful for data management decisions, the content
of the data must usually be assessed almost immediately after the data
are created. A number of machine learning
algorithms, such as neural networks and Bayesian classifiers, are
extremely fast in their forward application
(though they may take some time to train). In this project,
we use a simple Bayesian classifier to distinguish cloudy pixels
from other types of pixels (land, water, sun-glint, snow/ice,
desert) in MODIS Calibrated Radiance data (also known as Level 1B).
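To illustrate why the forward application of such a classifier is cheap, here is a minimal sketch of a Gaussian naive Bayes classifier. It is not the project's actual implementation: the Gaussian likelihood, the two features (a visible reflectance and a brightness temperature in kelvin), the three-class subset, and every training value are assumptions made up for this example, and the names `train` and `classify` are hypothetical.

```python
import math
from collections import defaultdict

def train(samples):
    """Estimate a log prior plus per-feature mean and variance for each class.

    samples: list of (features, label) pairs, where features is a tuple of floats.
    """
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    total = len(samples)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        dims = len(rows[0])
        means = [sum(r[d] for r in rows) / n for d in range(dims)]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((r[d] - means[d]) ** 2 for r in rows) / n + 1e-6
            for d in range(dims)
        ]
        model[label] = (math.log(n / total), means, variances)
    return model

def classify(model, features):
    """Return the class with the highest log posterior (up to a constant)."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for x, mu, var in zip(features, means, variances):
            # Log of the Gaussian density for this feature.
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Synthetic training pixels: (visible reflectance, brightness temperature in K).
# These values are invented for illustration and are not MODIS measurements.
TRAINING = [
    ((0.70, 250.0), "cloud"), ((0.65, 245.0), "cloud"), ((0.80, 240.0), "cloud"),
    ((0.05, 290.0), "water"), ((0.04, 288.0), "water"), ((0.06, 292.0), "water"),
    ((0.20, 295.0), "land"), ((0.25, 300.0), "land"), ((0.18, 298.0), "land"),
]

model = train(TRAINING)
bright_cold_pixel = classify(model, (0.72, 247.0))
dark_warm_pixel = classify(model, (0.05, 289.0))
```

Training requires a pass over the labeled samples, but classifying a pixel costs only a few arithmetic operations per class and feature, which is what makes a "quick-look" assessment at data creation time plausible.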
Data Usability and Usefulness
This "quick-look" classification of the data content is used
to decide how usable and useful the data are likely to be to the user
community. This in turn enables the optimization of scarce resources,
such as online storage space and network throughput,
using a variety of techniques such as content-based subscriptions,
subsetting and cache management.
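As a rough sketch of how a per-pixel classification might feed such decisions, the example below summarizes each granule by its cloud fraction and uses that number for a subscription filter and a cache ranking. The function names, the granule identifiers, and the rule that less cloudy granules are more useful are all assumptions for illustration; a real policy would depend on the user community (cloud researchers, for instance, would want the opposite ranking).

```python
def cloud_fraction(labels):
    """Fraction of pixels labeled 'cloud' in one granule (0.0 if empty)."""
    if not labels:
        return 0.0
    return sum(1 for label in labels if label == "cloud") / len(labels)

def matches_subscription(labels, max_cloud_fraction):
    """Hypothetical content-based subscription: accept only clear-enough granules."""
    return cloud_fraction(labels) <= max_cloud_fraction

def select_for_cache(granules, capacity):
    """Keep the `capacity` least cloudy granules (hypothetical cache policy).

    granules: dict mapping a granule name to its list of pixel labels.
    """
    ranked = sorted(granules, key=lambda name: cloud_fraction(granules[name]))
    return ranked[:capacity]

# Made-up granules with ten classified pixels each.
granules = {
    "g1": ["cloud"] * 8 + ["land"] * 2,
    "g2": ["water"] * 9 + ["cloud"],
    "g3": ["land"] * 5 + ["cloud"] * 5,
}

cached = select_for_cache(granules, capacity=2)
delivered = [name for name in granules if matches_subscription(granules[name], 0.2)]
```

The point of the sketch is that a single content-derived number per granule is enough to drive both decisions without reading the full data again.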
For more information, see
"Data services using Bayesian classification for data management."
Feedback
Please let us know what you think about our exploration into content-based
data management.
You may email
us at labs-disc@listserv.gsfc.nasa.gov.
The point-of-contact for this project is
Dr. Christopher Lynnes.
Acknowledgment
This work was funded primarily through
the Computing, Information, and
Communications Technology
program at NASA Ames Research Center.