
Network Availability Management Pattern

Description

Availability Management is an Enterprise Systems Management (ESM) discipline. Network Availability Management covers the administrative services involved in monitoring NIHnet and the IC networks: tracking network devices, network topology, and software configuration; monitoring network performance; maintaining network operations; and diagnosing and troubleshooting problems.

CIT and most ICs implement network management systems to manage their own community-of-interest networks. These systems generally reside within the data centers of the IC.

NIH will use Simple Network Management Protocol (SNMP) polling to retrieve device status data, as defined in Management Information Bases (MIBs), and deliver it to the network management software for analysis. If a network availability situation warrants attention, the network management software will automatically generate a notification or alert to the IC, Help Desk, and operations personnel, as appropriate.
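The poll-and-alert flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the NIH implementation: the `poll_status` callable is a hypothetical stand-in for a real SNMP GET (here imagined as reading the standard `ifOperStatus` object from IF-MIB), and the device names, the `Alert` type, and the recipient list are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# IF-MIB ifOperStatus values (RFC 2863): 1 = up, 2 = down.
IF_OPER_UP = 1
IF_OPER_DOWN = 2


@dataclass
class Alert:
    """Hypothetical notification record produced by the management software."""
    device: str
    message: str
    recipients: List[str]


def check_devices(
    devices: List[str],
    poll_status: Callable[[str], int],
    recipients: List[str],
) -> List[Alert]:
    """Poll each device once and return an alert for every device that is not up.

    poll_status is a hypothetical stand-in for an SNMP GET; it may raise
    TimeoutError when a device does not answer.
    """
    alerts = []
    for device in devices:
        try:
            status = poll_status(device)
        except TimeoutError:
            # An unreachable device is itself an availability event.
            alerts.append(Alert(device, "no SNMP response", recipients))
            continue
        if status != IF_OPER_UP:
            alerts.append(Alert(device, f"ifOperStatus={status}", recipients))
    return alerts


# Simulated poll results used in place of real network traffic.
_SIMULATED: Dict[str, int] = {"core-rtr-1": IF_OPER_UP, "ic-sw-7": IF_OPER_DOWN}


def fake_poll(device: str) -> int:
    """Return a canned status, or time out for devices that are not simulated."""
    if device not in _SIMULATED:
        raise TimeoutError(device)
    return _SIMULATED[device]


if __name__ == "__main__":
    found = check_devices(
        ["core-rtr-1", "ic-sw-7", "ic-sw-9"],
        fake_poll,
        ["IC", "Help Desk", "operations"],
    )
    for alert in found:
        print(alert.device, alert.message, alert.recipients)
```

Passing the poll function in as a parameter keeps the alerting logic independent of any particular SNMP library, so the same check could be exercised against simulated data as shown or wired to a real poller.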

The diagram illustrates the centralized CIT network management solution for monitoring all 27 IC subnets and enterprise connections and resources.

Diagram

Benefits

  • Supports the NIH Network Operations Center (NOC) capability and allows performance monitoring and availability management across NIHnet
  • Establishing procedures and criteria for forwarding system performance information and alerts to a central network management capability may prevent critical outages through proactive Fault Prediction and Problem Management
  • Allows ICs to keep their own Logical Network Views while the NIH NOC provides backup management capabilities with 24/7 network monitoring
  • Allows predictive traffic analysis to enhance capacity planning
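As one illustration of what predictive traffic analysis for capacity planning could look like, the sketch below fits a straight line to equally spaced utilization samples and estimates how many more sampling intervals remain before a limit is reached. The linear-trend method, the function names, and the sample figures are assumptions made for this example; the pattern does not specify an analysis technique.

```python
from typing import Optional, Sequence, Tuple


def fit_line(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (0, s0), (1, s1), ...; returns (slope, intercept)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intervals_until(samples: Sequence[float], limit: float) -> Optional[float]:
    """Estimate intervals after the last sample until the trend reaches limit.

    Returns None when utilization is flat or falling, and 0.0 when the
    fitted trend has already reached the limit.
    """
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (len(samples) - 1))


if __name__ == "__main__":
    # Hypothetical weekly link utilization, in percent.
    weekly_utilization = [40.0, 50.0, 60.0, 70.0]
    print(intervals_until(weekly_utilization, 90.0))
```

A straight-line fit is the simplest possible trend model; real traffic is often seasonal or bursty, so a production analysis would likely need something more robust.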

Limitations

  • Development of such a system may be complex
  • This pattern does not yet pass alarms to a Manager of Managers (MoM). As the implementation of ESM progresses, the pattern will be expanded so that selected alarms and notifications are also passed to a MoM that can incorporate network status into problem management and user-support activities
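The idea of passing only selected alarms to a Manager of Managers can be sketched as a simple filter. The severity levels, the `Alarm` record, and the threshold rule below are hypothetical; the pattern does not define the selection criteria.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, List


class Severity(IntEnum):
    """Hypothetical severity scale; higher values are more urgent."""
    INFO = 1
    WARNING = 2
    CRITICAL = 3


@dataclass(frozen=True)
class Alarm:
    source: str
    severity: Severity
    text: str


def select_for_mom(
    alarms: Iterable[Alarm],
    threshold: Severity = Severity.CRITICAL,
) -> List[Alarm]:
    """Return the alarms at or above the threshold, preserving their order."""
    return [alarm for alarm in alarms if alarm.severity >= threshold]


if __name__ == "__main__":
    # Hypothetical alarms raised by an IC network management system.
    raised = [
        Alarm("ic-sw-7", Severity.CRITICAL, "link down"),
        Alarm("ic-sw-9", Severity.INFO, "config saved"),
        Alarm("core-rtr-1", Severity.WARNING, "high utilization"),
    ]
    for alarm in select_for_mom(raised, Severity.WARNING):
        print(alarm.source, alarm.severity.name, alarm.text)
```

Filtering at the IC-level system before forwarding keeps routine events local while still surfacing the alarms that matter to problem management and user support.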

Time Table

This architecture definition was approved on: February 8, 2005

The next review is scheduled for: TBD