Measurement Science for Complex Information Systems

What are complex systems?
What is the problem?
What is the new idea?
What are the technical objectives?
Why is this hard?
Who would care?
Hard Issues & Plausible Approaches
     Spatiotemporal Scale
     Model Validation
     Tractable Analysis
     Causal Analysis
     Controlling Behavior
Publications
Software Tools
Presentations
Demonstrations





What are complex systems?

Large collections of interconnected components whose interactions lead to macroscopic behaviors:

  • Biological systems (e.g., slime molds, ant colonies, embryos)
  • Physical systems (e.g., earthquakes, avalanches, forest fires)
  • Social systems (e.g., transportation networks, cities, economies)
  • Information systems (e.g., Internet and Web services)

What is the problem?

No one understands how to measure, predict or control macroscopic behavior in complex information systems

  • threatening our nation’s security
  • costing billions of dollars

“[Despite] society’s profound dependence on networks, fundamental knowledge about them is primitive. [G]lobal communication … networks have quite advanced technological implementations but their behavior under stress still cannot be predicted reliably. … There is no science today that offers the fundamental knowledge necessary to design large complex networks [so] that their behaviors can be predicted prior to building them.”
                                      Network Science, a 2006 report from the National Research Council

What is the new idea?

Leverage models and mathematics from the physical sciences to define a systematic method to measure, understand, predict and control macroscopic behavior in the Internet and in the distributed software systems built on it.

What are the technical objectives?

Establish models and analysis methods that (1) are computationally tractable, (2) reveal macroscopic behavior and (3) establish causality. Characterize distributed control techniques, including (1) economic mechanisms to elicit desired behaviors and (2) biological mechanisms to organize components.

Why is this hard?

Valid, computationally tractable models that exhibit macroscopic behavior and reveal causality are difficult to devise. Phase transitions are difficult to predict and control.

Who would care?

All designers and users of networks and distributed systems, which have a 25-year history of unexpected failures:

  • ARPAnet congestion collapse of 1980
  • Internet congestion collapse of Oct 1986
  • Cascading failure of AT&T long-distance network in Jan 1990
  • Collapse of AT&T frame-relay network in April 1998 …

Businesses and customers who rely on today's information systems:

  • “Cost of eBay's 22-Hour Outage Put At $2 Million”, Ecommerce, Jun 1999
  • “Last Week’s Internet Outages Cost $1.2 Billion”, Dave Murphy, Yankee Group, Feb 2000
  • “…the Internet ‘basically collapsed’ Monday”, Samuel Kessler, Symantec, Oct 2003
  • “Network crashes…cost medium-sized businesses a full 1% of annual revenues”, Technology News, Mar 2006
  • “costs to the U.S. economy…range…from $65.6 M for a 10-day [Internet] outage at an automobile parts plant to $404.76 M for … failure …at an oil refinery”, Dartmouth study, Jun 2006

Designers and users of tomorrow's information systems that will adopt dynamic adaptation as a design principle:

  • DoD to spend $13 B over the next 5 yrs on Net-Centric Enterprise Services initiative, Government Computer News, 2005 
  • Market derived from Web services to reach $34 billion by 2010, IDC
  • Grid computing market to exceed $12 billion in revenue by 2007, IDC
  • Market for wireless sensor networks to reach $5.3 billion in 2010, ONWorld
  • Revenue in mobile networks market will grow to $28 billion in 2011, Global Information, Inc.
  • Market for service robots to reach $24 billion by 2010, International Federation of Robotics

Hard Issues & Plausible Approaches

Model scale – Systems of interest (e.g., the Internet and compute grids) extend over a large spatiotemporal extent, have global reach, consist of millions of components, and interact through many adaptive mechanisms over various timescales. Which computational models can achieve sufficient spatiotemporal scaling properties? Micro-scale models are not computable at large spatiotemporal scale. Macro-scale models are computable and might exhibit global behavior, but can they reveal causality? Meso-scale models might exhibit global behavior and reveal causality, but are they computable? One plausible approach is to investigate abstract models from the physical sciences, e.g., fluid flows (from hydrodynamics), lattice automata (from gas chemistry), Boolean networks (from biology) and agent automata (from geography). We can apply parallel computing to scale to millions of components and days of simulated time.
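
As a concrete illustration of the meso-scale idea, the following sketch (Python; the lattice size, capacity, load and buffer parameters are assumptions for illustration, not the project's MesoNet model) treats each cell as a router-like element whose overflow spills to its four lattice neighbors, and tracks a single macroscopic observable, the mean backlog.

    import numpy as np

    # Minimal lattice-automaton sketch (illustrative parameters only).  Each cell
    # is a router-like element with a queue; every step it receives offered load,
    # serves up to its capacity, and spills overflow above a buffer threshold
    # evenly to its four lattice neighbors.
    SIZE = 100        # 100 x 100 lattice of elements (assumed size)
    CAPACITY = 5.0    # per-step service capacity of one element
    OFFERED = 4.5     # mean offered load per element per step
    STEPS = 1000

    rng = np.random.default_rng(1)
    queue = np.zeros((SIZE, SIZE))
    mean_backlog = []

    for t in range(STEPS):
        queue += rng.exponential(OFFERED, size=queue.shape)      # arrivals
        queue -= np.minimum(queue, CAPACITY)                     # service
        overflow = np.maximum(queue - 2.0 * CAPACITY, 0.0)       # buffer overflow
        queue -= overflow
        queue += 0.25 * (np.roll(overflow, 1, 0) + np.roll(overflow, -1, 0) +
                         np.roll(overflow, 1, 1) + np.roll(overflow, -1, 1))
        mean_backlog.append(queue.mean())                        # macroscopic observable

    print("mean backlog over final 100 steps:", np.mean(mean_backlog[-100:]))

Raising OFFERED past CAPACITY drives the lattice from a low-backlog regime into runaway congestion, the kind of abrupt macroscopic transition such models are meant to expose.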

Model validation – Scalable models from the physical sciences (e.g., differential equations, cellular automata, nk-Boolean nets) tend to be highly abstract. Can sufficient fidelity be obtained to convince domain experts of the value of insights gained from such abstract models? We can conduct key comparisons along three complementary paths: (1) comparing model data against existing traffic measurements and analyses, (2) comparing results from subsets of macro/meso-scale models against micro-scale models and (3) comparing simulations of distributed control regimes against results from implementations in test facilities, such as the Global Environment for Network Innovations.
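
The sketch below illustrates comparison path (1) under stated assumptions: the two synthetic series are placeholders for real simulation output and a measured trace, and the two statistics chosen (a two-sample Kolmogorov-Smirnov test and Welch spectral slopes, computed with Python and SciPy) stand in for whatever fidelity criteria domain experts would actually require.

    import numpy as np
    from scipy import stats, signal

    # Hedged sketch of comparison path (1): test whether model output reproduces
    # the distribution and timescale structure of a measured trace.  The two
    # synthetic series below are stand-ins; in practice they would be loaded
    # from simulation output and from packet-trace analysis.
    rng = np.random.default_rng(0)
    model = np.cumsum(rng.normal(size=2**16)) * 1e-2   # stand-in for model output
    trace = np.cumsum(rng.normal(size=2**16)) * 1e-2   # stand-in for measured data

    # Distributional agreement: two-sample Kolmogorov-Smirnov test.
    result = stats.ks_2samp(model, trace)
    print(f"KS statistic {result.statistic:.3f}, p-value {result.pvalue:.3g}")

    # Timescale agreement: compare slopes of Welch power-spectral-density estimates.
    f_m, psd_m = signal.welch(model, nperseg=1024)
    f_t, psd_t = signal.welch(trace, nperseg=1024)
    slope_m = np.polyfit(np.log10(f_m[1:]), np.log10(psd_m[1:]), 1)[0]
    slope_t = np.polyfit(np.log10(f_t[1:]), np.log10(psd_t[1:]), 1)[0]
    print(f"spectral slope: model {slope_m:.2f} vs trace {slope_t:.2f}")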

Tractable analysis – The scale of potential measurement data is expected to be very large – O(10^15) data points – with millions of elements, tens of variables, and millions of seconds of simulated time. How can measurement data be analyzed tractably? We could use homogeneous models, which allow one (or a few) elements to be sampled as representative of all. This reduces the data volume to 10^6 – 10^7 points, which is amenable to statistical analyses (e.g., power-spectral density, wavelets, entropy, Kolmogorov complexity) and to visualization.
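
A minimal Python sketch of this reduction, using a synthetic random-walk series in place of real output from one representative element:

    import numpy as np
    from scipy import signal

    # Sketch of the reduction described above: in a homogeneous model, sample one
    # representative element's time series rather than every element, then apply
    # standard analyses.  The synthetic series below stands in for simulation output.
    rng = np.random.default_rng(0)
    series = np.cumsum(rng.normal(size=10**6))       # ~10^6 samples from one element

    # Power-spectral-density estimate of the sampled element.
    freqs, psd = signal.welch(series, nperseg=4096)

    # Shannon entropy of the binned amplitude distribution (a coarse complexity proxy).
    counts, _ = np.histogram(series, bins=64)
    p = counts[counts > 0] / counts.sum()
    entropy_bits = -np.sum(p * np.log2(p))

    print(f"peak PSD frequency: {freqs[np.argmax(psd)]:.5f} cycles/sample")
    print(f"amplitude entropy: {entropy_bits:.2f} bits")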

Causal analysis – Tractable analysis strategies yield coarse data with limited granularity of timescales, variables and spatial extents. Coarseness may reveal macroscopic behavior that cannot be explained from the data alone. For example, an unexpected collapse in the probability density function of job completion times in a computing grid was unexplainable without more detailed data and analysis. Multidimensional analysis can represent system state as a multidimensional space and depict system dynamics through various projections (e.g., slicing, aggregation, scaling). State-space analysis can segment system dynamics into an attractor-basin field and then monitor trajectories through that field.
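
One way to make the state-space idea concrete is sketched below; the three coarse variables, the quantile binning, and the randomly generated data are assumptions chosen purely for illustration.

    import numpy as np

    # Hedged sketch of the state-space view: each time step's vector of coarse
    # variables (three hypothetical ones here: utilization, loss rate, mean queue)
    # is a point in a multidimensional state space.  Coarse-graining that space
    # into cells and counting visits approximates an attractor-basin picture; the
    # cell-to-cell sequence is the trajectory to monitor.
    rng = np.random.default_rng(2)
    T = 50_000
    state = np.column_stack([
        rng.beta(8.0, 2.0, T),        # link utilization in [0, 1]
        rng.beta(1.0, 30.0, T),       # packet-loss rate, mostly near zero
        rng.gamma(2.0, 5.0, T),       # mean queue length
    ])

    # Coarse-grain each variable into 10 quantile bins, giving one discrete cell per step.
    edges = [np.quantile(state[:, j], np.linspace(0.1, 0.9, 9)) for j in range(3)]
    cells = np.stack([np.digitize(state[:, j], edges[j]) for j in range(3)], axis=1)
    labels = cells[:, 0] * 100 + cells[:, 1] * 10 + cells[:, 2]

    # Heavily occupied cells mark candidate attractor regions; the label sequence
    # is the system's trajectory through the attractor-basin field.
    unique, counts = np.unique(labels, return_counts=True)
    top_cells = unique[np.argsort(counts)[::-1][:5]]
    print("five most-occupied state-space cells:", top_cells)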

Controlling Behavior – Large distributed systems and networks cannot be subjected to centralized control regimes because they consist of too many elements, too many parameters, too much change, and too many policies. Can models and analysis methods be used to determine how well decentralized control regimes stimulate desirable system-wide behaviors? Use price feedback (e.g., auctions, present-value analysis or commodity markets) to modulate supply and demand for resources or services. Use biological processes to differentiate function based on environmental feedback, e.g., morphogen gradients, chemotaxis, local and lateral inhibition, polarity inversion, quorum sensing, energy exchange and reinforcement.
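
A minimal sketch of the price-feedback idea, assuming a single provider, linear client demand, and illustrative parameters (none taken from the project's EconoGrid model):

    # Hedged sketch of price feedback as a decentralized control regime: a resource
    # provider adjusts its posted price in proportion to excess demand, and
    # price-sensitive clients scale back requests as the price rises.
    CAPACITY = 100.0       # resource units available per round
    N_CLIENTS = 50
    BASE_DEMAND = 4.0      # each client's demand at price zero
    SENSITIVITY = 2.0      # how strongly a client's demand falls with price
    GAIN = 0.002           # price-adjustment gain

    price = 0.2
    for _ in range(200):
        # aggregate demand falls linearly with price (a deliberately simple model)
        demand = max(BASE_DEMAND - SENSITIVITY * price, 0.0) * N_CLIENTS
        # provider raises the price when demand exceeds capacity, lowers it otherwise
        price = max(price + GAIN * (demand - CAPACITY), 0.0)

    demand = max(BASE_DEMAND - SENSITIVITY * price, 0.0) * N_CLIENTS
    print(f"price settles near {price:.2f}; demand {demand:.1f} vs capacity {CAPACITY:.0f}")

In auction- or commodity-market regimes the posted price would instead be a clearing price computed from bids, but the feedback structure is the same.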

Related Publications

Related Software Tools

  • SLX software for simulated computing grid used in "Investigating Global Behavior in Computing Grids". 
    (see  http://www.wolverinesoftware.com/ for information on the SLX simulation environment)
  • Matlab MFiles used in "Simulating Timescale Dynamics of Network Traffic Using Homogeneous Modeling".
    (see http://www.mathworks.com/ for information on Matlab)
  • Matlab MFiles used in "Monitoring the Macroscopic Effect of DDoS Flooding Attacks".
  • Matlab MFiles used in "A Cross-Correlation Based Method for Spatial-Temporal Traffic Analysis".
  • Matlab MFiles used in "Macroscopic Dynamics in Large-Scale Data Networks".
  • Matlab MFiles used in "Exploring Collective Dynamics in Communication Networks".
  • MesoNet: a Medium-scale Simulation Model of a Router-Level Internet-like Network
  • EconoGrid: a detailed Simulation Model of a Standards-based Grid Compute Economy
  • Flexi-Cluster: a Simulator for a Single Compute Cluster
  • MesoNetHS: A Medium-scale Network Simulation with TCP Congestion-Control Algorithms for High-Speed Networks, including Compound TCP, FAST, H-TCP, HS-TCP and Scalable TCP

Related Presentations

Related Demonstrations

  • Visualization (10 Mbyte .avi) from a Simulation (May 23, 2007) of an Abilene-style Network
  • Visualization (14.4 Mbyte .avi) from a Simulation (July 31, 2007) of a Network Running CTCP