Image and signal sensors collect data at ever-increasing rates, which makes searching and organizing archived data a major challenge. Work is underway to engage the sensor, data management, and machine learning expertise of LANL and UCSC to tackle adaptive, content-based search in large remote sensing archives. We demonstrate the utility of a new method for extracting features from imagery and signals to aid the archival search problem.
CASCC is a new algorithm for classifying time series. It is highly competitive with many other algorithms in both speed and accuracy. It is inspired by another leading algorithm, DTW-1NN, but does not suffer from the same computational limitations when the model is applied to new time series.
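For context, the DTW-1NN baseline mentioned above can be sketched in a few lines. This is an illustration of the baseline only, not of CASCC, whose details are not given here. The function names and the toy series are assumptions made for the example. It shows where the cost comes from: classifying one new series requires a quadratic-time DTW computation against every training series.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Uses absolute difference as the local cost and no warping window,
    so the cost is O(len(a) * len(b)) per pair of series.
    """
    n, m = len(a), len(b)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]

def classify_1nn(train, query):
    """Return the label of the training series closest to query under DTW.

    train is a list of (series, label) pairs (hypothetical layout).
    Every call scans the whole training set.
    """
    best_series, best_label = min(train, key=lambda pair: dtw_distance(pair[0], query))
    return best_label

# Toy usage with made-up data
train = [([0, 0, 1, 2, 3], "rise"), ([3, 2, 1, 0, 0], "fall")]
label = classify_1nn(train, [0, 1, 2, 2, 3])
```

With N training series of length L, each prediction costs on the order of N * L^2 operations, which is the limitation the abstract refers to.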
RAID systems have traditionally offered increased performance and data security in small storage systems. An opportunity now exists to extend traditional RAID principles into the area of large-scale object-based storage devices in order to offer greater data security and space efficiency. In a system where component failures can be expected on a daily basis, the importance of redundancy mechanisms is obvious, and RAID principles offer an appropriate model. Ceph is an excellent platform with whic
We propose to do fundamental research in the development of self-organizing systems that automatically configure their physical schema in an on-line fashion.
Pseudorandom placement in distributed storage systems offers scalability benefits. However, it also makes load balancing harder, so new techniques are required. We explore different load balancing techniques using Ceph, an object-based storage system developed at UCSC.
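The load balancing difficulty can be illustrated with a minimal sketch. It assumes a plain hash-modulo placement function, which is a stand-in chosen for brevity and is not Ceph's actual placement algorithm. The function names and object identifiers are hypothetical. Every client can compute an object's location without a central lookup table, but the resulting per-device load is only statistically even.

```python
import hashlib
from collections import Counter

def place(object_id, num_devices):
    """Map an object name to a device index using a deterministic hash.

    Assumption: simple hash-modulo placement, used only for illustration.
    """
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_devices

def load_per_device(object_ids, num_devices):
    """Count how many objects land on each device."""
    counts = Counter(place(o, num_devices) for o in object_ids)
    return [counts.get(d, 0) for d in range(num_devices)]

def imbalance(loads):
    """Ratio of the busiest device's load to the mean load (1.0 is perfect)."""
    mean_load = sum(loads) / len(loads)
    return max(loads) / mean_load

# Toy usage: 10,000 made-up object names spread over 16 devices
objects = [f"object-{i}" for i in range(10_000)]
loads = load_per_device(objects, 16)
ratio = imbalance(loads)
```

The ratio is at least 1.0 by construction. How far it rises above 1.0, and how it reacts to skewed object sizes or access patterns, is what a load balancing technique has to address.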
We propose to work on the problem of calibrating the parameters of computer code used for simulation of physical phenomena. We will explore statistical methods based on a Bayesian approach implemented with Sampling Importance Resampling (SIR).
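As a rough illustration of Sampling Importance Resampling for calibration, the sketch below draws candidate parameter values from a prior, weights each by the likelihood of the observations, and resamples in proportion to the weights. Everything specific here is an assumption made for the example: the one-parameter toy simulator, the uniform prior on [0, 5], the Gaussian noise model, and the sample sizes.

```python
import math
import random

def simulator(theta, x):
    """Hypothetical stand-in for an expensive simulation code."""
    return theta * x

def sir_calibrate(xs, ys, n_prior=5000, n_post=1000, noise_sd=0.5, seed=0):
    """Approximate posterior draws of theta via SIR.

    1. Sample theta from the prior (assumed uniform on [0, 5]).
    2. Weight each sample by the Gaussian likelihood of the observations.
    3. Resample with replacement, with probability proportional to weight.
    """
    rng = random.Random(seed)
    thetas = [rng.uniform(0.0, 5.0) for _ in range(n_prior)]
    log_w = [
        sum(-0.5 * ((y - simulator(t, x)) / noise_sd) ** 2 for x, y in zip(xs, ys))
        for t in thetas
    ]
    # Subtract the maximum before exponentiating to avoid underflow
    top = max(log_w)
    weights = [math.exp(lw - top) for lw in log_w]
    return rng.choices(thetas, weights=weights, k=n_post)

# Toy usage: made-up observations generated near theta = 2
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
posterior = sir_calibrate(xs, ys)
posterior_mean = sum(posterior) / len(posterior)
```

The spread of the resampled values gives an uncertainty estimate for the calibrated parameter. In practice, the number of prior samples needed grows quickly with the number of parameters.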
The ISIS team in ISR-2 primarily uses supervised learning techniques to solve classification problems in imagery and therefore has a strong interest in finding linear classification algorithms that are both robust and efficient. Boosting algorithms take a principled approach to finding linear classifiers, and they have been shown to be so effective in practice that they are widely used in a variety of domains. In this proposal we present evidence that smoothing is not necessarily the optimal way
Both the size of the data sets and the uncertainty in them come from the fact that we are dealing with ensemble data sets. These are usually produced by Monte Carlo simulations, where each output (out of many runs) represents a possible solution. The degree of agreement (or disagreement) among the runs provides some indication of certainty (or uncertainty) about the results. Because Monte Carlo simulations can involve a large number of repetitions, the total data size can grow very quickly. This project will explore uncertainty and how visualization can be used as a tool to help deal with it.
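One simple way to quantify agreement across ensemble members, offered here only as an assumed example and not as the project's method, is to compute the mean and standard deviation at each location over all runs. The data layout (one list of values per run) is hypothetical.

```python
from statistics import mean, pstdev

def ensemble_summary(runs):
    """Return (mean, standard deviation) per location across ensemble runs.

    runs is a list of equally long lists, one per simulation run.
    A small standard deviation means the runs agree at that location.
    """
    per_location = zip(*runs)  # regroup values by location
    return [(mean(values), pstdev(values)) for values in per_location]

# Toy usage: two made-up runs over two locations
summary = ensemble_summary([[1.0, 2.0], [1.0, 4.0]])
```

A visualization could then map the mean to color and the standard deviation to, for example, opacity or texture.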
The research objective of this proposal is to measure human body shape and motion without augmenting the subject. The hypothesis is that replacing traditional cameras with high-accuracy 3D shape measurement devices and utilizing a carefully constructed prior model of human surface shape are the critical factors that have been missing from prior attempts to meet this goal. The long-term accuracy targets are shape to 1 mm and motion to 1 degree.
To help users make sense of collaboratively generated information, we are developing algorithmic notions of information trust.
We focus on developing intelligent interactive search and browsing techniques to help users find the information they are looking for among billions of non-relevant files.
In large tightly coupled parallel systems, computation goes as fast as the slowest part. For this reason it is necessary to pursue deterministic behavior of all parts of the system. Quality of Service is one way to assist in providing deterministic behavior. This project will explore providing Quality of Service on networks of interest to high performance computing.
We are planning to develop a web-based system to help 'decision makers' quickly identify and process relevant web-based information in case of a disease outbreak. We will work on identifying the pathogen based on sequence information. We will also develop adaptive information filtering to find, filter, and condense the information available on the web.