Big data means big issues for exascale visualization
Posted August 8, 2012
Visualization from a simulation of a grossly aspherical supernova.
When exascale computers begin calculating at a billion billion operations each second, gaining insights from the massive datasets generated by the simulations they run will be a huge challenge. Scientists may be tempted to pause the simulation, effectively “holding the machine hostage” as they scramble to create meaningful portrayals of the data barrage, says Hank Childs, a Lawrence Berkeley National Laboratory computer systems engineer. Childs, a recent recipient of a $2.5 million Department of Energy Early Career Research Program award, is out to prevent that.
A former chief software architect for a component of DOE’s Visualization and Analytics Center for Enabling Technologies (see sidebar, “VisIt: More than pretty pictures”), Childs backs the idea that running simulations in lieu of doing experiments is the most cost-effective way to advance scientific knowledge in the era of mega-performing machines.
“My piece of the puzzle is helping people know what the simulation is telling them,” he says. “Scientific visualization is taking the data and then producing images that let people see the data. Ideally, this will lead to new science, the new insights, that realize the simulation’s value.”
But he and fellow visualization researchers fear looming logjams in both input/output and data movement as supercomputing advances, over roughly the next five years, from the current petascale level of a million billion floating point operations per second (flops) to the exaflops range.
“If somebody hands you a petabyte or, in the future, an exabyte, how do you load that much data from disk, apply an algorithm to it and produce a result?” Childs asks. “The second challenge is complexity. You only have about a million pixels on the screen – not many more than in your eye – so you have to do a million-to-one reduction of which data points make it onto the screen.”
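The reduction Childs describes can be made concrete with a toy sketch. This hypothetical Python snippet (not from the article) shrinks a one-million-value stand-in dataset to a 1,000-pixel strip by block averaging; averaging is just one of several lossy reduction strategies (subsampling or picking extrema are others), and the function name `block_average` is illustrative only:

```python
# Illustrative sketch: many data points must be collapsed into each
# on-screen pixel. Here a mock 1-D "dataset" of one million values is
# block-averaged down to 1,000 pixel values -- a 1000-to-1 reduction.

def block_average(data, n_pixels):
    """Reduce a 1-D sequence to n_pixels values by averaging
    equal, non-overlapping blocks (a simple lossy reduction)."""
    block = len(data) // n_pixels
    usable = block * n_pixels          # drop any trailing remainder
    return [
        sum(data[i:i + block]) / block
        for i in range(0, usable, block)
    ]

data = list(range(1_000_000))          # stand-in for simulation output
pixels = block_average(data, 1_000)

print(len(data) // len(pixels))        # 1000 data points per pixel
```

At real scale the ratio is far larger, and the averaging step itself is where the data-integrity questions Childs raises begin: every pixel hides the variation within its block.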
Such pare-downs raise questions about the remaining data’s integrity, Childs says. “And even if this data integrity issue is managed, we may still produce images so complex that they overwhelm what our human visual processing system can understand. So we’re not doing anything meaningful.”