[DNFSB LETTERHEAD]
January 20, 2006
The Honorable Samuel W. Bodman
Secretary of Energy
1000 Independence Avenue, SW
Washington, DC 20585-1000
Dear Secretary Bodman:
In its response
to the Defense Nuclear Facilities Safety Board’s (Board) Recommendation 2004-1,
Oversight of Complex, High-Hazard Organizations, the Department of Energy (DOE)
committed to revitalizing Integrated Safety Management (ISM) with “a set of actions
the Department will pursue to re-confirm that ISM will be the foundation of the
Department’s safety management approach and to address identified weaknesses in
implementation.” The enclosed technical report, DNFSB/TECH-36, Integrated Safety
Management: The Foundation for an Effective Safety Culture, provides an assessment of the strengths
and weaknesses of the current state of ISM implementation at the National
Nuclear Security Administration’s (NNSA) production plants and laboratories.
ISM was
established 10 years ago as a new approach to integrating work and safety. The concept was adopted by DOE to enhance
safety awareness, upgrade formality of operations, and improve safety
performance. However, the potential for
this practical safety system to achieve operational excellence and instill a
sustainable safety culture has not been fully realized. From the broadest perspective, requirements
and mechanisms to implement ISM are established, but implementation of safety
management systems varies from site to site. This report examines the current status of the
effectiveness of ISM systems at the seven NNSA weapons sites, summarizes failures
and good practices, and proposes changes to enhance the effectiveness of ISM.
Sincerely,
A. J. Eggenberger
Chairman
c: The Honorable Linton Brooks
   Mr. Thomas P. D’Agostino
   Ms. Patty Wagner
   Mr. Daniel E. Glenn
   Mrs. Camille Yuan-Soo Hoo
   Mr. Edwin L. Wilmot
   Mr. William J. Brumley
   Mr. Richard W. Arkin
   Ms. Karen L. Boardman
   Mr. Mark B. Whitaker, Jr.
Enclosure
DNFSB/TECH-36
INTEGRATED SAFETY MANAGEMENT:
THE FOUNDATION FOR A SUCCESSFUL SAFETY CULTURE
Defense Nuclear Facilities Safety Board
Technical Report
DECEMBER 2005
This technical report was prepared for the Defense Nuclear Facilities Safety Board by:
Board Member R. Bruce Matthews
with assistance from former Board Member Joseph J. DiNunno and the following staff:
Dan L. Burnfield
John S. Contardi
R. Todd Davis
John E. Deplitch
Matthew P. Duncan
Timothy L. Hunt
Albert G. Jordan
Charles H. Keilers
David N. Kupferer
Al J. Matteucci
Michael J. Merritt
Donald F. Owen
EXECUTIVE SUMMARY
Integrated
Safety Management (ISM) was established 10 years ago as a new approach to integrating
work and safety. The concept was adopted
by the Department of Energy (DOE) to enhance safety awareness, upgrade
formality of operations, and improve safety performance. However, the potential for this practical
safety system to achieve operational excellence and instill a sustainable
safety culture has not been realized. In
the broadest sense, system expectations and mechanisms to implement ISM are
established, but execution of effective safety systems varies from site to site.
This report examines the current status
of the effectiveness of safety management systems at the National Nuclear
Security Administration’s (NNSA) seven weapons sites, summarizes failures and
good practices, and proposes changes to enhance the implementation of ISM.
Important
safety improvements have been achieved at the nuclear weapons production plants
and the laboratories since the introduction of ISM. For example, work planning is formal, and
plan-of-the-day meetings and pre-job briefs are frequent; formal identification
of job hazards is regularly practiced; working to standards and using
engineered controls to mitigate hazards are applied at the facility and
activity levels; authorization of hazardous work, work control, and procedure
adherence are common practices; and occurrence reports and incident critiques
are focused on preventing similar occurrences.
Thus, ISM has
clearly initiated many positive changes. The question is whether NNSA’s production
plants and laboratories are safer. The
answer is not clear; even though formality has improved at all sites, and some
safety performance measures have improved, the common-sense approach of the
principles and functions of ISM is not always applied. At some sites, workers and line management
perceive that ISM has deteriorated into a top-down, process-heavy, compliance-driven
system. The perception is that DOE and
NNSA set the ISM expectations, directives, and contract requirements; senior
contract managers who are not familiar with facility operations develop the
implementation processes; and line managers are left to implement a cumbersome
program.
Most problems
with the implementation of ISM occur at the activity level, particularly in research
and development and nonroutine activities. Accidents happen when work is not planned according
to the core functions of ISM. At some
facilities, researchers and workers are operating outside of the activity-level
safety envelope because the institutional-level decision makers do not fully
understand the front-line work and hazards, and workers are motivated and
rewarded for getting work done. Furthermore, managers do not spend enough time
on the floor interacting with workers about ISM. To be more effective, ISM needs to start with
the hazards and the work and should be owned, developed, and executed by line
management and the individuals who do the hazardous work, with the support of
subject matter experts as necessary.
During the
review documented in this report, a number of positive attributes and good practices
were observed that could improve the effectiveness of ISM at all sites. For example, senior managers at the best sites
have an obvious commitment to ISM and actively demonstrate its principles and
functions. Top managers spend
significant and effective time on the floor, as evidenced by their technical
awareness of the work and the hazards. Effective ISM organizations continuously
increase safety performance expectations, recognize the reporting of errors,
and reward outstanding safety achievements. At the best sites, cooperation between the
site office and contractor management is apparent, and effective oversight,
self-assessments, and supporting issues management programs are implemented. Nuclear facility operations at the better
sites are compliant with Title 10, Code of Federal Regulations (CFR), Part 830,
Nuclear Safety Management, and supporting DOE directives. At the activity level, clear operational
boundaries are maintained and supported by change control programs at all
levels. Workers understand the boundaries
of their authorized work and what action is to be taken when an authorized
boundary is approached. Finally, worker
involvement at the activity level in identifying hazards, developing
procedures, and improving safety is common practice at the better-performing organizations.
A number of
important ISM attributes and good practices are discussed throughout this report.
The following are the most important
changes that would improve the effectiveness of ISM:
TABLE OF CONTENTS
1. INTRODUCTION
2. BACKGROUND
   2.1 ISM Guiding Principles and Core Functions
   2.2 ISM Structure
   2.3 Safety Performance
3. RECENT ISM ASSESSMENTS
4. REVIEW PROCESS
5. ANALYSIS OF RESULTS
   5.1 Comments/Observations on ISM Implementation at NNSA Sites
       5.1.1 Lawrence Livermore National Laboratory
       5.1.2 Los Alamos National Laboratory
       5.1.3 Nevada Test Site
       5.1.4 Pantex Plant
       5.1.5 Sandia National Laboratories
       5.1.6 Savannah River Site-Tritium Operations
       5.1.7 Y-12 National Security Complex
   5.2 Status of ISM Implementation at NNSA Sites
6. CONCLUSIONS
   6.1 Observations on the Effectiveness of the Seven Guiding Principles
   6.2 Comments on the Effectiveness of the Five Core Functions
7. SUMMARY
8. SUGGESTIONS FOR IMPROVEMENT
REFERENCES
GLOSSARY
1. INTRODUCTION
In 1995, the
Defense Nuclear Facilities Safety Board (Board) issued Recommendation 95-2,
Integrated Safety Management (Defense Nuclear Facilities Safety Board, 1995a),
which served as the impetus to develop a new approach to doing work safely at
the Department of Energy’s (DOE) nuclear facilities. Since that time, Integrated Safety Management
(ISM), as defined by DOE Policy 450.4, Safety
Management System Policy (U.S. Department of Energy, 1996),
has evolved to a common-sense method for managing hazardous work with realistic
controls. DOE and its contractors have
invested significant resources in defining, developing, and implementing the
five ISM core functions, and the ISM wheel is a commonly used logo throughout
the DOE complex. The real return on this
safety investment comes from employing the guiding principles of ISM at the
management level and putting the core functions into practice at the facility
and activity levels, the levels where workers handle hazardous materials, operate
nuclear facilities, and perform hazardous activities.
The objective
of ISM is succinctly stated in the Implementation Plan for Recommendation 95-2:
“The Department and Contractors must
systematically integrate safety into management and work practices at all
levels so that missions are accomplished while protecting the public, the
worker, and the environment.” That
straightforward tenet created a new focus on work practices throughout the DOE
complex because for the first time, safety and mission were combined. After nearly 10 years of implementation, DOE
contractors have accepted the functions and guiding principles of ISM as the
basis for safety programs that demand standards-based safety management and
excellence in the conduct of hazardous operations.
However,
evidence suggests that execution of ISM at the activity level has fallen short
of expectations for doing work safely. For example, a scan through DOE’s Occurrence
Reporting and Processing System (ORPS) database shows that many reportable
incidents are caused by inappropriate hazard assessments, inadequate controls,
and failure to learn from past mistakes.
Independent ISM reviews at various DOE sites, discussed
later in this report, have
noted the failure of contractors to execute work to ISM principles. As a result, the full potential of ISM has not
been achieved, and the complex is not as safe as one might hope after 10 years
of implementation. Based on the above
assessments and personal interactions, the Board began to sense that ISM had
not progressed beyond compliance-driven paper processes that may appear to meet
requirements, but fail to embrace the essence of ISM, which is to Do Work Safely. The objective of this report is to analyze the
effectiveness of ISM implementation throughout the weapons complex, to identify
best practices, to compare the various sites of the National Nuclear Security
Administration (NNSA) with respect to ISM, and to help guide NNSA toward revitalizing
the effectiveness of ISM.
2. BACKGROUND
2.1 ISM GUIDING PRINCIPLES AND CORE FUNCTIONS
Environment,
safety and health (ES&H) protection requirements have evolved over the years.
The Board’s technical reports
DNFSB/TECH-5 (Defense Nuclear Facilities Safety Board, 1995b) and DNFSB/TECH-16
(Defense Nuclear Facilities Safety Board, 1997) summarize the background of
that evolution and the rationale behind ISM. Requirements captured in various pieces of
legislation and DOE directives are generally directed at (1) protection of the
public, (2) protection of workers, and (3) protection of the environment. Such protection is sought through limiting
human exposure, either directly or indirectly, to the hazardous nature of the materials
involved. The body of protective
legislation was developed as discrete requirements and implemented in parts
because organizations subject to such regulations established separate programs
to comply. The result was a partitioned
and sometimes confusing approach to risk management.
In 1995 the
Board recommended that DOE undertake an integration of safety programs for
managing the nuclear risks of its weapons program. The major thrust of Recommendation 95-2 was:
... to bring the many safety-related directives, implementation
efforts, and new initiatives related thereto into a more cohesive, integrated,
and effective safety management program, with clearer lines of responsibility
and authority for its execution...
DOE’s
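Implementation Plan for Recommendation 95-2 is based upon seven guiding principles:
(1) line management responsibility for safety, (2) clear roles and responsibilities,
(3) competence commensurate with responsibilities, (4) balanced priorities,
(5) identification of safety standards and requirements, (6) hazard controls tailored
to the work being performed, and (7) operations authorization.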
In sum, these
principles articulate the management philosophy underpinning DOE’s safety
program. ISM, however, is more than
philosophy. Its practical implementation
is structured to standardize a flexible “think before doing” approach to
performing hazardous work. This approach
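is captured in five core functions: (1) define the scope of work, (2) analyze the
hazards, (3) develop and implement hazard controls, (4) perform work within controls,
and (5) provide feedback and continuous improvement.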
The approach is
generic. ISM can be applied effectively
to any hazardous operation, whether it is at the site, facility, or activity
level. According to DOE policy, ISM
should be applied to both nuclear and non-nuclear hazards. Its principles and functions can also be
applied effectively to safeguards and security and environmental protection
activities.[1]
DOE established
the ISM program as its reference safety management system through issuance of a
policy statement, DOE Policy 450.4 (U.S. Department of Energy, 1996), and modification
of its acquisition regulation, Title 48, Code of Federal Regulations (CFR), 970.5223-1,
to require implementation of ISM by its contractors. Since then, DOE senior management has been
constant in expecting its contractors to be responsible and accountable for managing
their contracted activities in keeping with the principles and functions of ISM.
The ISM Guide (DOE Guide 450.4-1B) has
clear and rigorous descriptions and suggestions for implementing ISM. In addition, DOE has issued numerous
directives that define ISM expectations for DOE and its contractors (see, for
example, Figure 5 on page 42 of the ISM Guide). The problem with ISM is not in the
requirements; rather, both DOE and its contractors have struggled with the
detailed implementation of ISM, particularly at the activity level. This is the case even though the contractors
have expended significant effort on strengthening self-assessment programs as
the first line of surety, and DOE’s field offices and the Office of Independent
Oversight and Performance Assurance (OA) have periodically assessed ISM system performance.
In addition, DOE has held a number of workshops
that have identified ideas for maintaining and improving ISM. These assessments and workshops have revealed
variability in the robustness of the program from site to site, with most sites
not performing to expectations.
In recognition
of the need to reinvigorate ISM, DOE’s response (U.S. Department of Energy,
2004a) to the Board’s Recommendation 2004-1 commits to revitalizing ISM implementation
with “a set of actions the Department will pursue to re-confirm that ISM will
be the foundation of the Department's safety management approach and to address
identified weaknesses in implementation.” This initiative is a positive development,
but may not be sufficient to address some of the attitudes and perceptions that
are inhibiting the enthusiasm needed to implement ISM effectively.
2.2 ISM STRUCTURE
As noted
earlier, ISM was established by DOE as a new approach to integrating work and safety.
In its original construct, ISM was aimed
at three levels: institutional, facility, and activity.
Figure 1
illustrates the successive organizational levels of ISM. The outer, institutional level provides (1)
safety requirements in the form of DOE regulations and directives, (2) mission and
funding based on NNSA program priorities, and (3) implementing requirements and
cultural values from local site and contractor policies and procedures. The middle, facility level provides a safe
operating platform that protects the public. The inner, activity level provides procedures and
work controls to protect workers and enable hazardous work. Individuals at each level have roles and
responsibilities for integrating work and safety, but line management
responsibilities focus on the inner activity level, where work and safety must
be integrated with other priorities. The
five core functions are more applicable at the activity level, while the
guiding principles are more relevant at the institutional level.
Figure 1. Conceptual illustration of the interactions among the
institutional, facility, and activity levels. The safety requirements at each level should
be used as the basis for the next-lowest level of requirements, such that the
hazard controls at the activity level are traceable to the high-level requirements.
2.3 SAFETY PERFORMANCE
DOE and its
predecessor organizations have a long and improving safety performance record
in nuclear operations. For example, as
shown in Figure 2, the rate of Lost Workday Cases (LWCs)
at DOE sites has dropped by more than 50 percent during the past 12 years. Similarly, as shown in Figure 3, the number of
deaths in the DOE complex has been declining steadily. These safety performance data suggest that DOE
has a good and steadily improving safety record and that implementation of ISM
has had a positive impact on safe operations.
However, a
closer look at some of DOE’s safety performance data reveals some inconsistencies
that could give cause to question that conclusion. For example, DOE’s LWC rates prior to 1990
were nearly 50 percent lower than those of the early 1990s, indicating that the
downward trend after 1990 could be a simple accounting artifact. The apparent downward trend in fatalities
shown in Figure 3 looks quite different when one considers only deaths at
defense nuclear facilities since 1975. In addition, the numbers of Type A and B
Accident Investigations, Price-Anderson Enforcement Actions, and Operational
Emergency Occurrences have not declined, suggesting that the frequency of
serious accidents and near misses has not been reduced by the introduction of
ISM.
Figure 2. LWCs at DOE sites reported by the Computerized Accident and
Incident Reporting System, compared with U.S. industry LWC rates reported by
the Bureau of Labor Statistics’ Survey of Occupational Injuries and Illnesses,
1990 to 2004.
Figure 3.
Annual worker fatalities at sites run by the Atomic Energy Commission,
the Energy Research and Development Administration, and DOE during the past 61
years.
Note:
Fatality data were compiled from: (1) U.S. Department of Energy (a) Annual
Reports to Congress 1979-2003, (b) Operational Accidents and Radiation Exposure
Experience with the Energy Research and Development Administration 1975-1978,
and (c) Operational Accidents and Radiation Exposure Experience within the
United States Atomic Energy Commission 1943-1975; (2) U.S. Department of
Energy, Office of Environment, Safety and Health, Annual Reports, Fiscal Years
1980-1985; (3) U.S. Atomic Energy Commission, Annual Report to Congress
1943-1974; and (4) U.S. Energy Research and Development Administration, Annual
Reports to Congress 1975-1978.
3. RECENT ISM ASSESSMENTS
Full
realization of an effective ISM program can be divided into three logical
phases: (1) establishing the program
expectations; (2) establishing the mechanisms to implement the program; and (3)
applying these mechanisms at the facility and activity levels. In 2000, DOE declared that initial efforts to
implement ISM had been completed at all DOE sites and subsequently initiated
actions to “take ISM to the next level.” Several recent reviews have assessed ISM at
the sites. In 2002, DOE held a workshop
(Idaho National Engineering and Environmental Laboratory, 2002) that provided
some interesting insights into the level of implementation of ISM. The following conclusions are paraphrased from
the workshop report:
The workshop
participants recommended improvements, primarily in the areas of changing requirements,
establishing implementation plans, and performing annual reviews.
DOE’s OA issues
an annual ES&H evaluation that provides valuable insights into the implementation
of some ISM principles and functions. Generally, the reports state that ISM programs
are well established and functioning adequately across the DOE complex. For example, in the 2003 and 2004 reports (U.S.
Department of Energy, 2003 and 2004b), OA concludes that sites with mature ISM
systems have had lower reportable injury rates, that ISM processes are in place
at most sites, that work hazards are being identified, and that all sites have effective
performance on the guiding principles of ISM. However, the evaluations also indicate that
improvement is needed in developing and implementing hazard controls,
performing work within the controls, and providing feedback and improvement. Although the assessments assert that ES&H
trends across the complex are improving, a close reading of the reports
suggests that while the processes are well documented, the effectiveness of ISM
implementation at the activity level is inadequate.
The Board
reviewed ISM work control practices at five NNSA sites in 2003 and 2004 (Defense
Nuclear Facilities Safety Board, 2004a). On the positive side, all sites used formal procedures,
applied processes to identify and analyze hazards, authorized hazardous work,
and conducted pre-job briefings. Negative aspects included deficiencies in work
planning, hazard identification and analysis, implementation of adequate
controls, and feedback and improvement. As
a result of issues identified by OA’s assessments and the Board’s observations,
NNSA held a workshop to review work planning and control practices. The workshop participants identified a culture
problem “where line management has often failed to ensure that work is strictly
conducted in accordance with established ISM system processes and procedures,
and that, in some cases, has included inappropriate practices such as
over-reliance on automated job hazard analysis tools.”
Effective work
control is at the heart of implementing ISM at the activity level; therefore it
is troubling that these assessments suggest there has been no significant
change in processes for analyzing and controlling hazards during the past 5
years. This conclusion is supported by
data from the ORPS database, in which more than 40 percent of reported
occurrences are related to failures to perform adequate hazard analysis and to
develop and implement adequate controls. Further, a scan through the weekly reports of
the Board’s Site Representatives indicates that many of the issues discussed in
those reports are associated with inadequate hazard analysis and definition of
controls. In addition, nearly 50 percent
of the Board’s correspondence between 2000 and 2004 addressed ISM issues;
development and implementation of hazard controls was the core function
mentioned most frequently, and identification of safety standards was the guiding
principle addressed most often.
Overall, these
recent assessments reveal a robust set of ISM expectations and relatively strong
ISM implementing mechanisms[2] at most sites, but their conclusions indicate
poor execution of these mechanisms, particularly with regard to the core
functions of analyzing and controlling hazards and providing feedback and
continuous improvement. In addition,
specific assessments to evaluate the effectiveness of ISM implementation are
not common, and the “think before doing” safety culture that ISM promotes has
not been fully realized. Fortunately,
DOE and NNSA appear to have taken these identified ISM deficiencies seriously
and have outlined steps to revitalize ISM in the Implementation Plan for
Recommendation 2004-1. The remainder of
this report evaluates the effectiveness of ISM in improving safety performance
and safety culture in order to assist DOE in its initiative to revitalize ISM.
4. REVIEW PROCESS
As a follow-up
to the ISM assessments discussed in Section 3, an approach for independently
evaluating the effectiveness of ISM at NNSA’s nuclear sites was developed. The basic plan consisted of examining ISM
practices at the institutional, facility, and activity levels at each site. A vertical slice through the organization
began with interviewing the highest level of contractor and site office
management and ended with observing work activities and talking with operators
and first-line supervisors in their workspaces. Typically, five managers (including directors
and site office managers, facility managers, and technical leads) were
interviewed and three operations (including experiments, facility
modifications, and manufacturing processes) were observed at each site. A total of 2 days was spent at each site over
a 2-month period, so the observations in this report represent a Blink (Gladwell, 2005) type of analysis. However, the rapid vertical slice through the
organization was an effective method for probing underlying practices,
attitudes, and perceptions that may be inhibiting the effective implementation
of ISM.
Lines of
inquiry based on the seven ISM guiding principles were developed and used for interviewing
the managers responsible for instituting ISM expectations and requirements. Not all of the questions were asked of every
manager; the goal was to develop an understanding of the depth of knowledge and
commitment by management to each of the guiding principles. Similarly, lines of inquiry based on the five
core functions of ISM were developed and used during on-the-floor interactions
with the men and women responsible for doing hazardous work. The goal of these questions was to establish
the depth of workers’ understanding of hazards and controls and commitment to
formality in conduct of operations.
5. ANALYSIS OF RESULTS
5.1 COMMENTS/OBSERVATIONS ON ISM
IMPLEMENTATION AT NNSA SITES
Sites in NNSA’s
weapons complex are composed of two basic types of organizations: (1)
multidisciplinary laboratories (Lawrence Livermore National Laboratory [LLNL],
Los Alamos National Laboratory [LANL], and Sandia National Laboratories [SNL],
as well as the Nevada Test Site [NTS][3])
and (2) weapon production plants (Savannah River Site [SRS] Tritium Operations,
Y-12 National Security Complex [Y-12], and Pantex Plant). Weapons program elements, such as
surveillance, assessment and certification, assembly and disassembly, testing, research
and development (R&D), and manufacturing, are interconnected, and the
design laboratories have close partnerships and working relationships with the
production plants. Therefore, one might
expect to see some similarities in operational safety. However, the nature of activities at these two
distinct types of sites differs significantly, as do the composition and type
of staffing.
The
laboratories are essentially NNSA “think tanks.” They serve as the source of new weapon
concepts and the facilities for research, development, and testing on nuclear
weapon materials and components. While
the laboratories have manufactured parts for years, and their production
program work is increasing, the work atmosphere has its cultural roots in
scientific creativity and intellectual freedom. The laboratories operate a number of unique,
state-of-the-art nuclear, chemical, physics, and computer facilities. Even though NNSA owns the laboratories, a significant
portion of their work is supported by agencies other than NNSA and DOE. The staff is heavily populated with
doctoral-level specialists supported by laboratory technicians who are skilled
in the use of sophisticated instruments and testing techniques. Products have been primarily advances in the
physics and chemistry of nuclear materials, advanced engineering and computing
technologies, and their application to nuclear weapon performance.
The production
facilities, on the other hand, are more akin to industrial facilities that turn
out multiple identical products. Uniformity and quality are achieved through
rigor in production, supported by repetitive processes and compliance with
proven production processes and protocols. The skill mix in the production facilities is
quite different from that in the laboratories, with greater dependence on
engineers to design, construct, operate, and maintain complex engineered
systems for reliability and safety. For
the most part, the work at the production plants is devoted to NNSA-supported
weapons activities, and products are weapon components and systems. Operations rely on technicians with skills
typically taught by the trades or acquired from on-site experience and training.
Given these
fundamental differences, it may be informative to consider the reactions of the
organizations at the laboratories and plants to ISM. Generally, management in both types of organizations
has adopted the principles of ISM and worked diligently to implement its
functions at the facility level. However, different perceptions of the
usefulness of ISM at the activity level emerged from this study. Basically, experimental scientists approach
their work differently from production engineers. A perception at the production plants is: “Operators don’t need to understand the
science behind what they are doing. They
simply need to follow procedures and trust that the procedures will keep them
safe.”[4]
In contrast, the laboratory view tends
to be: “For scientists, you have to
construct a flexible system in which the experimenter can operate safely.” Some at the laboratories believe that ISM is
primarily a compliance-driven process, forced on the laboratories by DOE to
transition from an expert-based approach to safety to a standards-based approach.
As a result, workers who understand the
scientific principles behind their work see ISM as an unnecessary documentation
process that does not necessarily benefit safety. On the other hand, some with manufacturing
responsibilities feel forced to comply with a system developed by the
laboratories to do research safely. Not
all plant contractors believe they need ISM or view their previously existing
safety processes as inadequate. When
asked, “How would you have done this job 5 years ago?” some replied: “Exactly the same. The only difference is the paperwork. I consider myself just as safe as I was 5 or
even 20 years ago.”
Interestingly, a
common perception at both laboratories and plants is that ISM at the activity
level is too focused on paper and not focused enough on work. The question implied in several conversations
was: “When does compliance interfere
with productivity and not improve safety?” A more cynical view was: “ISM is good to have because it makes a nice
safety sales package.” Clearly,
important barriers to effective implementation of ISM remain in the minds of some
who have the responsibility for dealing with hazards.
Notwithstanding
these differences between the plants and the laboratories, important safety
improvements have been achieved at both types of sites since the introduction
of ISM nearly 10 years ago. For example:
The following
is a brief synopsis of some of the observations, good practices, and difficulties
with the implementation of ISM at the seven NNSA sites.
5.1.1 Lawrence Livermore National Laboratory
LLNL’s focus is
primarily on R&D experiments involving a variety of hazardous materials and
operations. The contractor and the
Livermore Site Office (LSO) manage a number of facilities where the potential
for high-consequence accidents exists, including plutonium releases that are
complicated by the proximity of the plutonium facility to the public. LLNL runs three Hazard Category 2 and four
Hazard Category 3 nuclear facilities.[5]
The dose to the maximally exposed offsite individual (MEOI) for an unmitigated
accident at LLNL’s plutonium facility is calculated to be approximately 50 rem
total effective dose equivalent (TEDE).[6]
The variety of activities at LLNL, combined with an entrenched expert-based
safety culture, appears to have slowed ISM implementation. In fact, even though ISM has been a DOE
expectation for nearly 10 years, LLNL adopted its basics only recently. On the other hand, LLNL’s safety performance
record, as measured by its LWC rate, has been improving steadily since 1990
(see Figure 4); however, LLNL has had four serious accidents that required Type
A and Type B accident investigations since 1990.
ISM at the Institutional Level: LLNL
management appears to understand the ISM problems that have been pointed out by
independent assessments and is determined to change the laboratory’s safety
culture by driving ISM implementation from the top. In general, LLNL managers did not provide
clear answers to questions about safety functions and responsibilities. Discussion of the seven guiding principles of
ISM was handled better by LSO managers. LLNL managers have developed improved work control
processes that meet external expectations but may not meet the needs of operators.
On-the-floor observations by line
managers, which would facilitate understanding operators' needs, do not occur
as frequently as they should. The LLNL system
for identifying specific Authorizing Organizations, Authorizing Individuals,
and Responsible Individuals appears good in concept. However, LLNL suffers from inconsistent
implementation of work control processes across the site and within the various
directorates.
Figure 4. LWC rates for specific NNSA sites.
ISM at the Facility Level: LLNL’s
plutonium facility does not have an authorization basis compliant with Title 10
CFR Part 830, Nuclear Safety Management (10 CFR Part 830), and the
classification of safety systems has not met expectations. A recent OA assessment identified (LLNL
self-reported) violations of the Configuration Management Control Program, the
Radiation Protection Program, the Unreviewed Safety Question Program, the
Maintenance Program, the Quality Assurance Program, and Occurrence Reporting at
the LLNL plutonium facility. LLNL
management, in conjunction with LSO, is addressing these long-standing problems.
ISM at the Activity Level: Operational
safety at LLNL is still largely experience/expert-based.
Technicians and first-line supervisors
were comfortable with discussing the scope of work they were authorized to
perform; they understood that work must be authorized before being performed;
and they were well-versed in the need to stop work when unexpected situations
were encountered. LLNL's improved work
planning processes (ES&H teams review hazards and controls) appear sound in
concept, but their application appears to workers to be more administrative
than practical. Workers at LLNL have the
perception that the work planning processes are cumbersome and do not emphasize
the necessary safety controls. In
particular, the processes for hazards analysis and control documentation do not
lead to explicit understanding of the hazards and clear identification of
controls.
5.1.2
Los Alamos National Laboratory
LANL is
probably the most complex NNSA site: Its activities range from fundamental R&D to
manufacturing and production. Various
hazardous materials, including plutonium-238 and high explosives, are handled
routinely, and 11 Hazard Category 2[7]
and 4 Hazard Category 3 facilities were operating at the time of this review. In addition, LANL and the Los Alamos Site Office
(LASO) manage a number of facilities with potential high-consequence accidents,
including inadvertent criticality, fire-induced plutonium releases, and
accidental impacts that could breach waste drums. Activities at LANL deal with a broad range of
hazards, from basic industrial hazards to unknown reactions resulting from
chemical research. Finally, the pace of operations
is high, and changes in mission, management, and oversight are frequent. LANL’s LWC rates have been relatively steady
during the past 5 years, but show significant improvement relative to the 1990
rate (see Figure 4). LANL has
experienced too many serious accidents and near misses during the past 15
years, however, as evidenced by 12 Type A and Type B accident investigations.
ISM at the Institutional Level: During
the mid-1990s, LANL was a Defense Programs leader in ISM, had an empowered ISM
champion, and implemented good ISM programs in some nuclear facilities. Since that time, other priorities have diluted
management’s attention to ISM, and attempts to institutionalize ISM have
encountered resistance.
Similar to
LLNL, LANL has had difficulty in implementing hazard analysis and control processes
consistently across the various directorates. Recently, safety (and security) became an
institutional focus, resulting in a 6-month standdown
of all nonessential activities; the subsequent high-level attention to safety
management should improve implementation of ISM. In general, responsible line managers have the
competence to manage work safely, but understanding of the guiding principles
of ISM does not appear to be comprehensive at the senior management level. This apparent lack of understanding can
contribute to inconsistent implementation of ISM at the activity level.
ISM at the Facility Level: Facility-level
work control has been a past focus at LANL and is generally effective. ISM functions are applied to develop facility
safety bases; however, the quality of safety analyses and Technical Safety
Requirements varies from organization to organization. In addition, the safety basis documents for
several Hazard Category 2 nuclear facilities are out of date by 2 to 10 years
and therefore are not compliant with the Nuclear Safety Management Rule.
ISM at the Activity Level: Most
facility work is well planned, and most researchers can identify hazards and
controls; however, lack of formality in operations is a persistent problem at
LANL, particularly in R&D activities. Many of the findings identified during preresumption management self-assessments were related to
work control, and approximately 30 percent of the pre-start findings were ISM
related. Therefore, actions to improve
work control are essential to improving safety at the activity level. Implementation
of LANL’s Integrated Work Management program, which focuses on the functions of
ISM, builds on previous work control initiatives, and clarifies expectations, appears
to be a good path forward.
5.1.3
Nevada Test Site
Little
hazardous nuclear work is currently being done for NNSA at NTS. The Device Assembly Facility (DAF) is being
refurbished to accommodate some experimental and subcritical plutonium
assemblies and activities, and shipments of special nuclear material for criticality
experiments are being received and stored. One or two subcritical assemblies are tested
annually in the U1a Complex, and plutonium dynamic experiments are being
conducted at the Joint Actinide Shock Physics Experimental Research facility. The workload is deliberately paced, and
hazardous activities are far removed from the public. However, the scope of operations will likely
increase when the criticality experimental facility comes on line, when and if
weapon dismantlement activities increase, and when tunnel upgrades to permit
disposal of damaged nuclear weapons are funded. For an unmitigated accident at DAF, the MEOI
is calculated to receive 325 rem TEDE; for subcritical
tests, MEOIs are calculated to receive doses less
than the evaluation guideline of 25 rem TEDE at the site boundary. Workers must deal with radiological hazards
and a variety of industrial and transportation hazards. The major difficulty at the site is
integrating the activities of three “equal” users: Bechtel Nevada Incorporated, LANL, and LLNL. Other agencies also conduct hazardous
activities at the site, thereby making the integration of safety management
practices a challenge for the site office. NTS had a high LWC rate in the early 1990s,
but recent performance has been steadily improving (see Figure 4). Since 1990, NTS has had four serious accidents
that required Type A and Type B accident investigations.
ISM at the Institutional Level: Because
of the hiatus in activity after the 1992 moratorium on nuclear testing, NTS did
not fully employ ISM practices until the startup of subcritical testing. The Nevada Site Office (NSO) used expert
consultants to develop safety management processes, and the site user
organizations are working together to implement ISM at the facility and
activity levels. NSO’s
expectations encourage the user organizations to improve safety practices by
insisting on compliance with a comprehensive set of site operating standards to
which all contractors must adhere. The issues
management system implemented by NSO appears to be effective, and overall management
oversight, facility safety analyses, and formality of operations are improving.
In general, NNSA and contractor managers
are committed to ISM, but lack experience in implementing rigorous safety
programs in a high-productivity environment.
ISM at the Facility Level: NSO
has developed a Real Estate/Operations Permit (REOP) process to coordinate
facility interface issues among the site user organizations. For example, LLNL is the primary REOP holder
for DAF and is responsible to the NSO manager; LANL is a secondary REOP holder
and is responsible to the LLNL REOP holder. The system appears confusing on paper but
apparently functions to ensure that work is properly reviewed and authorized. Prior to 2000, facility authorization bases at
the test site were basically nonexistent, primarily because little nuclear
facility work was being conducted. The
nuclear facility safety culture changed, and DAF has a rule-compliant Documented
Safety Analysis. Overall, the
understanding of nuclear facility requirements and operations is limited―the
site operators rely on safety basis consultants to write Documented Safety
Analyses and Technical Safety Requirements―but is improving
with experience.
ISM at the Activity Level: NTS
has a rich history of formality of operations based on mining safety and
control and oversight of underground tests. The site is building on that experience to
bring activity-level work control up to modern standards. Most of the “hands-on” work at the site is
done by Bechtel Nevada employees. Employees from both laboratories stated that
safety operations were better at NTS than at their home sites. A hazards-based, graded approach to work
control is used. All levels of work must
be authorized, but the complexity of reviews and oversight increases with risk.
The work authorization for moderate- and
high-risk activities is intensive and may be too cumbersome if the work pace
increases.
5.1.4
Pantex Plant
The Pantex Site
Office and contractors manage activities with the highest accident consequences
in the NNSA complex: nuclear weapons, nuclear materials, and high explosives are
assembled, disassembled, and stored at the site. About 17 Hazard Category 2 buildings support
nuclear operations at the Pantex Plant. The safety of nuclear operations at Pantex has
improved over the years, and the likelihood of a high-consequence nuclear
explosive accident is extremely low. Individual weapon systems vary in complexity,
but work is carefully planned for each campaign. While the pace of assembly/disassembly
activities is fairly constant, it could increase with potential new dismantlement
and life extension missions. The conduct
of operations at Pantex must meet the highest standards achievable, and the
acceptance of equipment and operational errors should be exceedingly low. Significant improvements in some elements of
formality of operations and safety have occurred during the past several years,
and progress is expected to continue. For example, NNSA has expended significant
resources at the Pantex Plant on developing Seamless Safety for the 21st
Century (SS-21) processes that rely on specially designed tools and a minimal
number of hoisting operations to eliminate or minimize potential hazards of
nuclear explosive operations.
ISM at the Institutional Level: ISM
works on a macro scale at Pantex with multiple interfaces among different
organizations. The Pantex Site Office
approves operations after multiple levels of review and analysis:
Because
interfaces can be weak points in any system, coordinating these multiple interfaces
into an effective safety management program is a challenge. Site office personnel and contractors are
well-versed in the guiding principles of ISM and appear to believe that current
practices are fully consistent with the spirit of ISM. LWC rates have dropped during the past 10
years (see Figure 4); one accident requiring investigation was reported in 1994.
ISM at the Facility Level: Pantex
has a complex facility safety basis process; the site is transitioning to 10 CFR Part 830 compliance, and the method for developing
and implementing new Technical Safety Requirements while old requirements are
still active is slow and cumbersome. Safety basis documents at Pantex fall into
three categories: site-wide,
facility-specific, and weapon-specific. In the site-wide safety basis, the dose to the
MEOI from an unmitigated tritium release is calculated to be 325
rem TEDE; weapon-specific calculations for assembly/disassembly
operations indicate the MEOI will receive in excess of the 25 rem TEDE
evaluation guideline. Some safety basis
documents are approved and in the process of being implemented, and some are
still in the process of being developed. In addition, the practice of having nested
safety bases, with a weapon-specific safety basis within a facility-specific safety
basis within a site-wide safety basis, can lead to confusion as to who is the
integrated owner and who authorizes work.
ISM at the Activity Level: Rigorous
compliance with procedures is a strong point of safety at Pantex. Assembly and disassembly campaigns are
thoroughly planned to identify hazards, develop controls, and write procedures;
2 years from start of planning to start of work is typical. Manufacturing personnel are heavily reliant on
the design laboratories and safety analysts for technical knowledge, planning,
hazard identification, and safety controls. Pantex relies on rigid procedural compliance
for nuclear explosive operations, and as a result, ISM concepts do not appear
to be an integral part of the culture at the activity level. Some operators could not identify hazards and
safety controls, even though Pantex relies on the technicians to stop work as
the last line of defense.[8]
In addition, many procedures rely
heavily on administrative controls. Verbatim compliance with procedures works well
as long as activities are within analyzed hazards and defined boundaries. However, rote compliance could weaken the
understanding of knowledge-built and experience-based safety; the risk is that
lack of understanding of hazards and experience could result in an
inappropriate reaction during an unexpected event.
5.1.5
Sandia National Laboratories
SNL’s nuclear
operations are relatively straightforward and of low consequence to the public.
Two Hazard Category 2 and two Hazard
Category 3 nuclear facilities are housed in Technical Area V, but calculations
show that the MEOI will receive considerably less than the evaluation guideline
of 25 rem TEDE at the site boundary. Workers deal with radiological and standard
industrial hazards. Safety practices are
largely expert-based, and SNL management has only recently been awakened to the
importance of implementing ISM principles and functions at the facility and
activity levels. The Sandia Site Office
(SSO) compiled a list of safety incidents that occurred during the past several
years to impress the need for change on SNL management. However, SNL’s LWC rate is declining (see
Figure 4), and DOE reports only one accident investigation since 1990.
ISM at the Institutional Level: Even
though SNL partnered with LLNL and LANL in 1996 to develop safety management
principles for the three weapons laboratories, ISM has not taken root and
become part of the way the site is operated. When SSO pointed out safety issues at SNL,
laboratory management acted deliberately to improve facility safety analyses
and formality of operations. Throughout
the interviews conducted for this study, a consistent theme emerged: several federal and contractor personnel
believe that SNL’s nuclear operations are safe―despite
management’s self-identified problems with ISM implementation (one manager’s
perception was that one in five SNL employees can expect to be hurt on the job
at some point in their career). Given
this fundamental belief in the safety of the status quo, it is not difficult to
see why the principles and functions of ISM have not been translated into tools
that are used consistently to ensure that work is safe.
ISM at the Facility Level: SNL has had fundamental problems in the approach used to analyze the
safety of nuclear facilities in Technical Area V. For example, the approved Documented Safety
Analysis for the Auxiliary Hot Cell Facility had significant inadequacies,
including deficiencies in the analysis of hazards to members of the public, hazards
not adequately identified or controlled, and inadequate design requirements. The site office and the contractor have
committed to improving all Documented Safety Analyses to make them consistent
with the safe harbor methodologies of the Nuclear Safety Management Rule, and
to provide adequate assurance that operational hazards have been identified
through a comprehensive hazard and accident analysis.
ISM at the Activity Level: Workers
and first-line managers are technically skilled, and for the most part,
operations are executed safely. Even on
this brief site visit, however, informal and improper safety practices were
observed. In most cases, experts are performing
work safely, but often without an approved process, contrary to good formality
of operations. Many at the working level
apparently believe that change is necessary to keep pace with current
requirements and expectations, but that current expectations do not really add
to safety.
5.1.6
Savannah River Site-Tritium Operations
Four Hazard
Category 2 and two Hazard Category 3 facilities at SRS house weapon system
tritium production operations that are designed to load and test tritium
reservoirs. Operations are
relatively uncomplicated and present a low hazard to the public and workers. For unmitigated accidents, MEOIs
are calculated to receive less than the 25 rem TEDE guideline, partly because
of the distance to the site boundary. The new Tritium Extraction Facility is
designed to extract tritium from irradiated target fuel rods. When this facility becomes operational, the radiological
hazards from processing the irradiated targets will increase significantly. The tritium operations at SRS are an NNSA
operation embedded in the much larger and more complex Environmental Management
(EM) site. The NNSA Site Office is
separate from the EM site office, but the operations are managed by the same
contractor organization in accordance with many common procedures. LWC rates for tritium operations at SRS have
been consistently low (see Figure 4), and only one accident investigation has
been reported since 1990.
ISM at the Institutional Level: Safety
practices at the tritium facilities benefit from the long history of
application of ISM at SRS. In fact, the
tritium operations at SRS are probably the best example of the implementation
of ISM and formality of operations in the NNSA complex. Contractor managers gave crisp and clear
answers to the questions raised, indicating in-depth understanding of the
guiding principles of ISM. Most, though not
all, in the NNSA Site Office also demonstrated a clear understanding of ISM.
Managers
gave consistent answers to questions about risks at the site and clearly made safety
their first priority. Notably, all
managers spend significant time on the floor as part of their weekly walkaround requirements. Management presence at shift changes, rotating
presence during weekend shifts, and informal time interacting with operators
are common management practices. When a
midlevel manager was asked about accountability for a safety problem, the
answer was, “I get lots of help,” suggesting that at the management level, the
practice is assistance rather than reprimand.
ISM at the Facility Level: The
tritium facilities are compliant with 10 CFR Part 830. Facility
operations are managed with ISM principles and functions as fundamental guidelines.
In addition, ISM principles are
integrated into the design and construction of the new Tritium Extraction
Facility.
ISM at the Activity Level: Workers
engage in good safety practices and participate in behavioral-based safety
performance observations. Hazards are
understood, and controls are implemented and maintained. Pre-job briefings are thorough and focus on
safety discussions with the operators. Operators stated that constant management attention
was the key to their receptivity toward ISM. Job hazard analysis is a routine process, but some
off-normal scenarios have not been considered.
It is worth
considering why SRS has been more successful than other sites in implementing
ISM. The following attributes may have
contributed: (1) a strong industrial
safety culture and centralized control of safety were established by DuPont; (2) a strong nuclear safety culture was developed
by Westinghouse as a result of focus, urgency, and significant resources applied
to the restart of K-Reactor; (3) continuing and effective management presence
on the floor has resulted in technical understanding and awareness of the work
and the hazards; and (4) management is always increasing the expectations for
safety performance. A potential downside
to SRS’s safety philosophy is that, with so much emphasis on safety, a
culture of safety arrogance could create a false sense of security.
5.1.7 Y-12
National Security Complex
Y-12 is
basically a uranium and weapons parts manufacturing facility. The site office and contractor manage a wide
variety of hazardous operations, including storage of significant quantities of
nuclear materials, testing of weapon components, dismantlement of weapon components,
manufacturing of highly enriched uranium parts, and chemical recovery of highly
enriched uranium. Operations are
complex, and major hazards include inadvertent criticality and facility fires. The pace of work at Y-12 has been increasing
slowly since the 1994 standdown of operations; in
fact, wet chemistry operations were recently restarted after a long hiatus. Y-12’s LWC rate is slowly improving (see
Figure 4); five accidents have required investigation since 1990.
ISM at the Institutional Level: As
recently as 2 years ago, Y-12 was having significant problems with safety;
reportable incidents were high, near-miss accidents were occurring, and conduct
of operations was deficient. However,
the Y-12 Site Office credits clear expectations and improvement initiatives by
the contractor with significantly enhancing the implementation of ISM at the
site. Managers are walking down their
spaces frequently and apparently being effective in transmitting the importance
of safe operations to the working level. As a good management practice, the Y-12
contractor is subject to an annual corporate ISM review. This turnaround in ISM at Y-12 could serve as
a model for other sites. However, recent
improvements have been accomplished during a time of relatively low workload;
management will need to maintain safety awareness as the workload begins to
increase.
ISM at the Facility Level: The
nuclear facilities at Y-12 are old; some have been operating since the 1950s,
and even though all but one have 10 CFR Part 830-compliant safety bases, some systems (e.g.,
seismic and fire protection safety systems) do not meet modern standards. For the worst case at the Y-12 facility, the
MEOI is calculated to receive an unmitigated dose of about 30
rem TEDE; for the remaining nuclear facilities, MEOIs
are calculated to receive less than the 25 rem TEDE guideline. Construction of a new storage facility for
special nuclear material has begun, and a new enriched uranium processing
facility is in the planning stage. Timely completion of these two facilities is important
to replace several aging facilities at the site.
ISM at the Activity Level: Y-12
has made significant improvements in safety at the activity level during the
previous 3 years. Pre-job briefings are
effective; first-line supervisors ask operators questions about safety
requirements and off-normal responses; lessons learned from previous similar
jobs are discussed; job hazards are reviewed; and operators participate by
asking sensible questions. In addition,
the contractor has plans to strengthen post-job briefings to focus on feedback
and improvement. Operators were able to
identify hazards, controls, and expected responses to off-normal events clearly
and accurately. However, some operators
saw little value in formal conduct-of-operations processes; some personnel
expressed their belief that the new work control processes do not make work
safer, but only add paper. Criticality
safety violations still occur at an unacceptable rate, but Y-12 has an
opportunity to improve criticality safety by integrating its
conduct-of-operations initiatives with conventional criticality safety analyses.
5.2 STATUS
OF ISM IMPLEMENTATION AT NNSA SITES
Using the
information presented above, results of the previous ISM assessments discussed
in Section 3, and the Board’s expertise, a relative ranking of ISM
implementation at the seven NNSA sites was performed with a pairwise
comparison technique (Saaty, 1990). Each site was evaluated using the guiding
principles and core functions of ISM as major criteria; subcriteria
were based on the lines of inquiry. The
major criteria and subcriteria are shown in Boxes 1
and 2. The outcome of the ranking shows
that NNSA’s production plants have significantly more effective ISM systems
than those of the laboratories. While it
is not important to dwell on the individual rankings, it may be informative to
consider why the sites differ in the effectiveness of their ISM systems.
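The report does not reproduce the comparison matrices or scores behind this ranking, so the pairwise comparison step can only be illustrated with invented numbers. The sketch below assumes three hypothetical sites ("Site A", "Site B", "Site C") and a hypothetical judgment matrix on Saaty's 1-to-9 scale; it approximates the priority weights with the row geometric-mean method rather than the principal eigenvector, and it checks the judgments with a consistency ratio. None of the names or values come from the Board's assessment.

```python
import math

# Saaty's commonly cited random consistency index values, keyed by matrix size.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32}


def priority_weights(matrix):
    """Approximate priority weights using the row geometric-mean method."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            # A pairwise comparison matrix must be reciprocal: a[j][i] == 1 / a[i][j].
            if not math.isclose(matrix[i][j] * matrix[j][i], 1.0, rel_tol=1e-9):
                raise ValueError(f"entries ({i},{j}) and ({j},{i}) are not reciprocal")
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Estimate how far the judgments are from perfectly consistent."""
    n = len(matrix)
    weighted = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    lambda_max = sum(weighted[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical judgments for a single criterion: entry [i][j] states how much
# more effective site i is judged to be than site j (illustrative values only).
sites = ["Site A", "Site B", "Site C"]
judgments = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = priority_weights(judgments)
ranking = sorted(zip(sites, weights), key=lambda pair: pair[1], reverse=True)
cr = consistency_ratio(judgments, weights)

for name, weight in ranking:
    print(f"{name}: {weight:.3f}")
print(f"consistency ratio: {cr:.3f}")
```

In a full application of the technique, a matrix like this would be built for each criterion and subcriterion, and the resulting weights would be combined into an overall score per site. A consistency ratio below about 0.1 is the conventional threshold for accepting a set of judgments.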
Box 1. Criteria Used to Rank NNSA Sites Relative to the Guiding Principles of Integrated Safety Management
- Roles, responsibilities, and authorities are defined and understood.
- Line management is aware of the status of work and hazards (operational awareness).
- Communication is frequent and effective.
- Senior leaders drive safety.
- Safety ownership boundaries are defined and managed properly.
- The organization has an empowered Integrated Safety Management champion.
- Authorities and accountabilities are clear.
- Technical excellence exists throughout the organization.
- The organization has adequate numbers of safety staff.
- Operators understand the fundamentals of hazards.
- Safety decisions are made on the basis of sound technical principles.
- Decision makers are aware of and understand risks.
- Safety, security, and environmental requirements are balanced.
- Safety and productivity have equal priority.
- Nuclear facilities are operated in accordance with approved directives.
- Exceptions and exemptions from directives are rare.
- Resources are available to make facility upgrades as necessary.
- Formal job and facility hazard analyses are conducted.
- All work is performed within approved controls.
- Operators and first-line managers respond properly to off-normal events.
Box 2. Criteria Used to Rank NNSA Sites Relative to the Core Functions of Integrated Safety Management
- Operators are identified and involved in work planning.
- Physical and organizational boundaries are clearly defined.
- Line managers are identified and involved in work planning.
- The work site is reviewed prior to the start of work.
- All important hazards are systematically identified.
- New or emerging hazards are identified.
- Operators can identify hazards.
- Controls are developed using an appropriate hierarchy.
- Operators can identify controls.
- Controls are implemented.
· Perform work within controls
- All hazardous work activities are formally authorized.
- Work instructions and procedures are explicitly followed.
- Operators and first-line managers have a questioning attitude.
- Pre-job briefings address hazards and controls.
- Safety issues are addressed in a timely manner.
- Self-assessment and issue management programs are effective.
- Post-job briefings address lessons learned.
- Reporting of errors is rewarded.
First, the
production plants are not as complex as the laboratories:
A conclusion
might be that ISM is easier to implement and more effective in a consistent environment.
The fact is that some organizations at
the laboratories have very good ISM programs, and the important lesson for the
laboratories is that ISM may be more effective if implementation is designed to
meet the needs of discrete/uniform units, applying the principle of hazard
controls tailored to the work being performed. Note that tailoring does not mean relaxing controls;
it means implementing appropriate controls that are necessary and sufficient to
protect the public and workers.
Second, the
production plants have been managed by contractors with long-standing corporate
safety cultures. Safety tends to be
built into productivity goals because accidents have a negative impact on
profits. The laboratories have been
managed by organizations with long-standing science cultures and deep-seated
values of intellectual freedom. This is
not to say that safety and science are incompatible, but it is important to
recognize the difference when attempting to modify the values and culture of an
institution. Culture can be defined as a
behavior that is acquired by imitation and passed on to a population;
therefore, a culture change can take decades. ISM is the foundation for a shift in safety
culture; 10 years is not enough time to effect the desired culture change. Senior leaders at DOE and the laboratories
need to persist until that culture shift is passed on to the next generation of
scientists. An apparent anomaly in the
ISM ranking was discovered at NTS, which, although a multilaboratory
site, equals the production sites in the effectiveness of ISM. The anomaly may be explainable by the rigorous
safety culture developed during the era of underground nuclear weapons testing.
Third, the
nuclear facility Documented Safety Analyses and Technical Safety Requirements
are generally more up to date and compliant at the production plants than at
the laboratories, which tend to have difficulties with meeting nuclear facility
safety requirements. A nuclear facility
safety basis that is compliant with 10 CFR Part 830 forms the foundation for protection
of the public and workers and implementation of the principles and functions of
ISM. Conversely, a site that has not
adequately updated its Documented Safety Analyses and Technical Safety Requirements
has clearly not fully implemented an ISM system.
Finally, it
appears that the interaction between the NNSA Site Office and the contractor at
the production plants is collaborative, roles and responsibilities are clear,
and DOE oversight is accepted and effective. By contrast, the interaction at the
laboratories appears to lack mutual trust and confidence; oversight is viewed
as an unnecessary burden, and as a result, safety improvements are sometimes
stalled.
One might ask: “So what? Safety statistics show improvement. Major accidents are rare, and all sites are
implementing ISM.” However, it should be
clear from the above discussion that the laboratories need to improve their
implementation of ISM not for the sake of compliance, but for the compelling
purpose of reducing risks. The “think
before doing” concept is so fundamental to safety that it should be embraced
willingly and enthusiastically by all who do hazardous work. One could argue subjectively that the safety
risks at the laboratories are greater than those at the production plants. Numerous hazards, complex operations,
significant quantities of nuclear materials, proximity to the public, and weak
authorization bases translate into greater risk to the public and workers, and
therefore greater urgency to implementing an effective safety management system.
6. CONCLUSIONS
In his book
Human Error, James Reason (1990) introduces the concept of latent and active
safety errors. Latent errors are
typically made by managers, designers, safety analysts, and other decision
makers and can lie dormant in a system or operation before they surface as an accident.
Active errors are those made by
operators, technicians, and scientists who handle hazardous materials or
perform hazardous operations, and their impacts are immediate. Operators are often left to deal with the
latent errors made by decision makers far removed from the time of an accident.
Reason argues that latent errors pose
the greater threat to safety, and the more removed decision makers are from
front-line activities, the greater is the danger of introducing latent errors.
Application of
the functions and principles of ISM should reduce both active and latent errors.
The five core functions of ISM are well
known, probably because they are relatively easy to apply and can have an
immediate positive impact by reducing active errors, while the successful
implementation of the seven guiding principles should reduce the introduction
of latent errors into safety management systems. The following is a summary of observations on the
effectiveness of implementation of the seven guiding principles and five core
functions of ISM at the NNSA sites. Recommendations for improving the
implementation of the core functions are also presented.
6.1 OBSERVATIONS
ON THE EFFECTIVENESS OF THE SEVEN GUIDING
PRINCIPLES
Line management
responsibility for safety is probably the most commonly stated and least
understood guiding principle. Inconsistent definitions of line management
were heard during many of the site interviews. Answers ranged from “Line managers follow the
money,” to “Line managers are responsible for executing work,” to “Everybody
above operators is a line manager.” The
guidance given in DOE Policy 450.4―“Line management is directly responsible for the protection of the
public, the workers, and the environment”―does not
provide sufficient clarity. Clearly, the
buck stops with line managers, who have the responsibility to direct, authorize,
and supervise hazardous activities. However, line managers are only one group responsible
for integrating safety; in fact, everyone has a role in this endeavor. Perhaps the guiding principle should be, “Line
management owns safety.” Ownership of
safety means that hazardous work, whether at the institutional, facility, or
individual activity level, is planned, analyzed, controlled, and authorized by
an accountable line manager. The owning line manager must understand the
technical basis and associated hazards of the work, be aware of all activities,
be knowledgeable of institutional and facility safe operating requirements, and
manage change control.
Clear roles and responsibilities are generally defined and documented by
NNSA Site Offices and contractors. The
principle is adequately implemented, but existing definitions of responsibility
fail to deal directly with two important elements:
accountability and ownership. Weaknesses in ownership of boundaries between
organizations and functions are not uncommon. For example, the management of interfaces such
as work by other organizations, facility/tenant agreements, line versus program
management, line organizations versus support organizations, and even
laboratories versus plants is a common source of confusion and potential
accidents.
While line
managers may have the most important responsibility for ensuring that hazardous
work is done safely, they are only one element in a hierarchy of
responsibilities for safety. An example
of how this hierarchy might work is outlined below:
Various
approaches to accountability for poor performance, ranging from termination to no
accountability, are employed by the contractors and site offices. Line managers and operators are often held
accountable for active errors that result in safety incidents or failure to
comply with safety requirements. But
processes for holding individuals accountable for latent errors were not found.
For example: What about the ES&H support personnel who
write safety requirements and procedures that cannot be followed? What about
program managers who do not allocate adequate funds to working safely, thereby
encouraging workers to take unnecessary risks to meet programmatic
requirements? What about safety analysts
and designers who miscalculate conditions for safety components? These latent errors might be avoided if individuals
understood that they were to be held accountable. DOE’s Human Performance Improvement initiative
(Institute of Nuclear Power Operations, 2002), if fully implemented, could
result in significant improvement in this area.
Competence commensurate with
responsibility is the
foundation of performing work safely. In
general, the technical capabilities of operators, technicians, scientists, and
engineers are exceptional, and the formal technical qualifications of NNSA
facility representatives, system engineers, and safety experts are excellent. As a result, contractor employees understand
the technical basis of their work, and federal oversight employees understand
safety requirements. At the same time,
however, more in-depth scientific education for NNSA employees and more detailed
nuclear safety certification of contractor workers might strengthen the
implementation of ISM practices. Federal
oversight appears to be driving contractors to standards-based safety management,
and while that may be a desirable goal, some
bureaucrats may be losing sight of the importance of technical competence to
doing work safely. The misperception
that “high-level direction comes without understanding the subject” is commonly
held.
Balanced priorities are reflected in words but not always in
actions. “Safety is the first priority”
and “Our first priority is not to hurt anyone” are common statements of NNSA
leaders and contractor managers. Unquestionably, no manager wants to hurt
anybody, but a different message can be delivered when:
Actions can
deliver a different message about priorities to workers who sometimes feel that
“managers talk about ISM, but don’t do it.” In point of fact, productivity should be the
first priority; after all, this is why the production plants and laboratories
exist in the first place. However, the
end result of good safety practices is productivity. Leading companies in the private sector have found
that implementing strong process safety programs benefits company performance
and improves the bottom line (Center for Chemical Process Safety, 2003). Similarly, productivity gains in mission
output were realized at the LANL Plutonium Facility after formality of
operations was implemented using ISM as the basis (Los Alamos National Laboratory,
2000). One scientist said, “If you want
to do world class science, you need world class safety.”
Decision makers
must always balance priorities. The
problem is that decisions are often based on external influences such as
pressures from programs, complaints from stakeholders, issues raised by
Congress, and even letters from the Board. At the institutional level, the guiding principle
of balanced priorities might be better realized if priority decisions were made
using a risk-informed approach and consistently communicated to workers,
regulators, and customers. In addition,
the sometimes conflicting priorities of safety and security could be better
balanced using risk-informed methods.
Identification of safety standards and
requirements appears to
be working at the institutional (site office and contractor) and facility
levels. However, national engineering
safety standards need to be applied more formally and effectively at the
activity level. Use of standards can be
especially valuable when planning and designing R&D processes involving
hazardous operations. In addition, a
perception that DOE directives lack consistency and are vague and sometimes
confusing is still found among some contractors.
Hazard controls tailored to the work being
performed is a principle
that can be unevenly applied. Generally
speaking, facility-level controls are appropriately graded, and oversight
systems (such as readiness reviews) are in place to ensure compliance with requirements.
However, considerable lag time in
implementing Technical Safety Requirements defined in Documented Safety
Analysis upgrades is common. Tailoring
is often accomplished at the activity level by identifying hazards and defining
the appropriate levels of work control needed to protect workers. For example, low-level hazardous work may rely
on the training and experience of the operators and supervisors to meet
requirements. Medium-hazard activities may
require standard work procedures, job-specific hazard identification, and
controls. High-hazard activities may
require observance of Technical Safety Requirements, mockup training, and engineered
design features. This type of grading
can be sound in concept as long as the screening criteria are set
appropriately, and independent reviews ensure that all hazardous work is controlled.
There have been a number of unfortunate
accidents, especially in R&D activities, in which more rigorous work
controls would have avoided injuries. For example, recently two researchers were
injured while attempting a new method of chemical synthesis; contamination was
spread in another event involving improper handling of radiological materials. SRS has a particularly effective ISM process
for conducting research (Westinghouse Savannah River Company, 2004). One of the keys to the success of its work
control is that researchers developed the process and own the activity.
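The grading described above can be pictured as a simple lookup from hazard tier to work controls. The following Python sketch is illustrative only. The tier names, the `required_controls` function, and the rule that ungraded work is rejected until its screening has been independently reviewed are assumptions made for this example and do not come from any DOE directive or site procedure. The control lists repeat the examples given in the text, and the sketch assumes the tiers are not cumulative.

```python
from enum import Enum


class HazardLevel(Enum):
    """Hypothetical tier names for the three grades described in the text."""
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Example controls per tier, copied from the paragraph above.
# Assumption: tiers are listed independently, not cumulatively.
CONTROLS_BY_LEVEL = {
    HazardLevel.LOW: [
        "training and experience of operators and supervisors",
    ],
    HazardLevel.MEDIUM: [
        "standard work procedures",
        "job-specific hazard identification",
        "controls",
    ],
    HazardLevel.HIGH: [
        "Technical Safety Requirements",
        "mockup training",
        "engineered design features",
    ],
}


def required_controls(level, independently_reviewed):
    """Return the example controls for a tier.

    Assumption: a screening result that has not been independently
    reviewed is rejected rather than graded, reflecting the text's
    condition that independent reviews confirm all hazardous work
    is controlled.
    """
    if not independently_reviewed:
        raise ValueError("screening result has not been independently reviewed")
    return list(CONTROLS_BY_LEVEL[level])
```

A real screening process would also have to define how an activity is assigned to a tier in the first place, which is where the text notes that screening criteria must be set appropriately.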
Operations authorization for facilities is accomplished formally
through Safety Evaluation Reports and Authorization Agreements between the
contractor and the site office. The
process generally works; however, the Authorization Agreements vary in content
from site to site, and are not always kept current. Hazardous work is authorized after hazard
assessments have been completed and controls have been implemented. An authorizing individual, person in charge,
shift supervisor, or some other responsible individual is found at most sites. These first-line managers can have vast safety
responsibilities and a wide span of control. While the concept is sound and the persons
interviewed take their responsibilities seriously, senior managers must back
this up with moral, financial, oversight, and training support.
6.2 COMMENTS ON THE EFFECTIVENESS OF THE FIVE
CORE FUNCTIONS
ISM defines a
rational process for systematically analyzing hazards and identifying safety measures.
Rigorous application of the five core
functions of ISM at the facility and activity levels is a practical approach to
avoiding active errors. This
common-sense safety logic can be effective in reducing accidents, near misses, LWCs, and reportable incidents, and thereby improve the
safety culture. Safety culture
initiatives attempt to get at the human causes of poor safety performance, and
a number of programs have been developed to engage workers and management in
improving safety culture and human performance (Institute of Nuclear Power Operations,
2002; Krause, 1995). Efforts to improve
safety culture are important, but if they are to be successful, organizations
must first lay the foundations for safety. The Board’s technical report DNFSB/TECH-35
(Defense Nuclear Facilities Safety Board, 2004b) sets forth the attributes of
high-reliability organizations that are necessary to establish the foundations
for an effective institutional safety culture, while the five core functions of
ISM provide the essential foundation for improving safety culture at the
facility and activity levels. In many cases,
the safety programs that workers are expected to
implement were not developed by current managers, and that separation between
expectation and execution may be a reason for poor implementation. If safety programs are to be successful,
workers[9]
must be involved in their development, design, execution, and continuous
improvement, and essentially own the five ISM core functions. The following is a summary of observations on
and recommendations for improving the implementation of the ISM core functions.
Define the Scope of Work. Work planning that integrates safety and work
is a common practice at all NNSA sites. The level of rigor may vary, but planning is
done at multiple levels, from project plans to pre-job briefs. What appears to be lacking is integration from
top to bottom and integration among the safety disciplines before work is
actually planned. Senior managers should
understand the working environment at the front line; workers need to
understand mission and safety goals. If
the hazardous work environment is disconnected from top-level planning of the
work scope, misunderstandings can develop that in turn can lead to
inappropriate pressures to attempt shortcuts that can result in safety
incidents. Such disconnection is
unnecessary and can be avoided if managers spend more time on the floor
discussing programmatic and safety issues with front-line workers.
Analyze the Hazards. Identification of job hazards is common
practice. However, the inappropriate use
of automated hazard analysis tools is growing, and errors occur in the classification
of jobs because of the misapplication of screening tools. Line managers must be cautious of
overdependence on these tools; hazard analysis programs can be an excellent confirmation
step, but cannot substitute for technical knowledge and experienced teams in identifying
and potentially eliminating hazards. As one respondent noted, “If not careful, the automated hazard
analysis process can allow one to suspend rational thought.” Answers during this inquiry sometimes revealed
disconnects between the perceptions of managers and workers with regard to
hazards and risks. Again, opposing views
can be corrected simply by managers spending more time on the floor, seeking a
better understanding of the issues facing those doing the work.
Develop and Implement Hazard Controls. Safety controls are generally well defined, implemented,
and maintained at NNSA’s nuclear facilities. Safety controls are more informal at the
activity level, especially in R&D activities. Some workers incorrectly assume that the facility-level
controls are adequate to protect them from hazards encountered in their
specific work activities. Work
procedures need to state clearly both the controls and the steps necessary to ensure that
those controls are operating as designed.
Perform the Work within the Controls. Authorization of hazardous work by responsible
line managers has become common practice at all of the weapons sites. Similarly, work control and adherence to
procedure have been implemented at all sites. Many procedures at the production plants are
followed to the letter, and step-by-step reader/worker controls are common. While effective, however, rote compliance can
lead to a “stop thinking” approach to work that can be dangerous when
responding to off-normal events. Managers need to use innovative methods to
avoid operator complacency when following detailed procedures. The procedures for R&D activities are
necessarily less constraining than those used for routine, repetitive operations.
However, R&D procedures need to
define operational boundaries for authorized activities and controls. Operational boundaries are necessary for
workers to recognize when their control space has been exceeded and understand
that work outside of that space is unauthorized. Such a stop-work control should force workers
to reanalyze the hazards and help eliminate unanticipated accidents.
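The idea of an operational boundary that triggers a stop-work can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Boundary` record, the `StopWork` exception, the `check_within_controls` function, and the parameter name used in the usage note are all hypothetical and are not drawn from any site's work-control system. The sketch treats a missing reading the same as an out-of-range reading, on the assumption that work cannot be shown to be inside its control space without one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    """Hypothetical authorized range for one monitored parameter."""
    parameter: str
    low: float
    high: float


class StopWork(Exception):
    """Raised when work cannot be shown to be inside the authorized control space."""


def check_within_controls(readings, boundaries):
    """Raise StopWork if any boundary is exceeded or has no reading.

    readings: dict mapping parameter name to its current value.
    boundaries: iterable of Boundary records for the authorized activity.
    """
    for b in boundaries:
        if b.parameter not in readings:
            raise StopWork(f"{b.parameter}: no reading available")
        value = readings[b.parameter]
        # Written as a negated range test so a NaN reading also stops work.
        if not (b.low <= value <= b.high):
            raise StopWork(
                f"{b.parameter}={value} is outside authorized range "
                f"[{b.low}, {b.high}]"
            )
```

For example, with a single hypothetical boundary `Boundary("temperature_C", 15.0, 40.0)`, a reading of 22.0 passes silently, while a reading of 55.0 raises `StopWork`. At that point the hazards would be reanalyzed before work resumes, as the text describes.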
Provide Feedback and Continuous
Improvement. DOE’s Office of ES&H publishes a series of
Just-In-Time reports and a periodic Operating Experience Summary to promote the
exchange of lessons-learned information among DOE facilities; nevertheless,
feedback and improvement remains an ineffective ISM core function. DOE is well aware of the problem and has
undertaken initiatives to strengthen this important function. Accurate reporting of issues is the first step
in feedback and improvement; DOE and contractor management must remain actively
on guard against individuals not reporting injuries for fear of retaliation and
managers downgrading occurrence levels to avoid losing performance incentives. Some NNSA organizations reward safety
performance and self-reporting of errors, an important attribute of highly
reliable organizations that can help avoid accidents. Workers and managers alike need to be held
accountable for blatant errors and willful unsafe acts, and critiques and
accident investigations are essential components of continuous improvement. But a system of recognition and rewards is
equally important if the complex is going to improve its safety performance.
7. SUMMARY
Figure 5, based
on Figure 1, illustrates the three levels of ISM (institutional, facility, and activity);
high-hazard nuclear facilities and production operations generally adhere to
this conceptual model. The key is to
start with the hazards and work at the activity level,
and envelop those activities within a safe operating platform that in turn is
enveloped by safe contractor management practices and DOE/NNSA requirements. The important point is that institutional
requirements and facility safety systems surround the hazards at the activity
level and thereby protect workers and the public. The nested levels of protection in this model represent
a common-sense approach to doing work safely. ISM basically defines a safety culture that is
practical in form and function.
Clearly, ISM
has initiated many positive changes. The
question is whether NNSA’s production plants and laboratories are safer. The answer is not clear; even though formality
has categorically improved at all sites, the safety performance measures
employed are somewhat ambiguous. Managers sometimes have a compliance-driven
attitude toward the implementation of ISM; workers, for the most part,
understand the importance of working safely. However, comments such as “We have done this
safely for years,” “ISM is a
compliance-driven paper process,” and “As a researcher I think work control
stinks” suggest that safety attitudes still need adjustment. At some sites, workers and line management
perceive that ISM has deteriorated into a top-down, process-heavy,
compliance-driven system. Their
perception is that DOE and NNSA set the ISM expectations, directives, and
contract requirements; senior contract managers who are not familiar with
facility operations develop the implementation processes; and line managers are
left to implement an unworkable program.
Figure 5.
Concentric layers of ISM protection at the institutional, facility, and
activity levels.
Note:
DSA = Documented Safety Analysis; TSR =
Technical Safety Requirements.
Problems with
the implementation of ISM occur at the activity level, particularly in R&D and
nonroutine activities. A common problem appears to start with
managers and ES&H support personnel who are responsible for providing input
to the work planning and control processes, but do not fully understand the
work, do not participate in job walkdowns, and do not fully understand how to
implement controls efficiently. As a
result, front-line managers and workers do not have faith in a system that to
them appears to be a paper-intensive, bureaucratic process that does little to
improve safety or productivity. The
consequence can be a grudging acceptance by technical experts and a slow
degradation of the model shown in Figure 5. Accidents happen when work is not planned
according to the core functions of ISM. At some facilities, researchers and workers
are operating outside of the activity-level safety envelope because the
institutional-level decision makers do not fully understand the front-line work
and hazards, and workers are motivated and rewarded for getting work done. The result is illustrated in Figure 6, in
which some of the hazardous work―often though not always R&D activities―is
performed outside of the institutional boundary and facility safety envelope,
thereby increasing the likelihood of an accident. To be more effective, ISM needs to start with
the hazards and the work and should be owned, developed, and executed by line
management and the individuals who do the hazardous work, with the support of
subject matter experts as necessary. Leaders must be diligent and not allow a disconnect to develop between requirements and the activities being
performed.
Figure 6. ISM at the institutional,
facility, and activity levels.
During this
review, a number of positive attributes and good practices were observed that could
enhance the effectiveness of ISM at all sites. For example, senior managers at the best sites
have an obvious commitment to ISM and actively demonstrate its principles and
functions. Top managers spend
significant and effective time on the floor, as evidenced by their technical awareness
of the work and the hazards. Effective
ISM organizations continuously increase safety performance expectations,
recognize the reporting of
errors, and reward outstanding safety achievements. At the best sites, cooperation between the
site office and contractor management is apparent, and effective oversight,
self-assessments, and supporting issues management programs are implemented.
Nuclear
facility operations at the better sites are compliant with 10 CFR Part 830 and supporting
DOE directives. At the activity level,
clear operational boundaries are maintained and supported by a change control
program at all levels. Workers
understand the boundaries of their authorized work and what action is to be
taken when an authorized boundary is approached. Finally, worker involvement at the activity
level in identifying hazards, developing procedures, and improving safety is
common practice at the better-performing organizations.
8. SUGGESTIONS FOR IMPROVEMENT
A number of
important ISM attributes and good practices have been discussed throughout this
report. The following are the most
important changes that would improve the effectiveness of ISM:
REFERENCES
Center for
Chemical Process Safety, 2003, The Business Case for
Process Safety.
Defense Nuclear
Facilities Safety Board, 1995a, Integrated
Safety Management, Recommendation 95-2, October 11.
Defense Nuclear
Facilities Safety Board, 1995b, Fundamentals
for Understanding Standards-Based Safety Management of Defense Nuclear
Facilities, Technical Report DNFSB/TECH-5, May 31.
Defense Nuclear
Facilities Safety Board, 1997, Integrated
Safety Management, Technical Report DNFSB/TECH-16, June.
Defense Nuclear
Facilities Safety Board, 2004a, Summary
of Reviews of Documentation and Practices Associated with Activity-Level Work
Planning at NNSA Sites, Staff Issue Report, May 4.
Defense Nuclear
Facilities Safety Board, 2004b, Safety
Management of Complex, High-Hazard Organizations, Technical Report
DNFSB/TECH-35, December.
Gladwell, M., 2005, Blink: The
Power of Thinking without Thinking, Little, Brown and Company.
Idaho National
Engineering and Environmental Laboratory, 2002, Workshop on Maintaining and Improving Established Integrated Safety
Management Systems, November.
Institute
of Nuclear Power Operations, 2002, Human
Performance Fundamentals Course Reference, December.
Krause,
T. R., 1995, Employee-Driven Systems for
Safe Behavior, Van Nostrand Reinhold.
Los Alamos
National Laboratory, 2000, Integrated
Safety Management at the Los Alamos Plutonium Facility, Unclassified Report
LA-UR-00-1876, May.
Reason,
J., 1990, Human Error, Cambridge
University Press.
Saaty, T. L., 1990, Multicriteria Decision Making: The Analytic
Hierarchy Process, Vol. 1, RWS Publications.
U.S. Department
of Energy, 1996, Safety Management System
Policy, DOE Policy 450.4, October 15.
U.S. Department
of Energy, 2003, Independent Oversight
Lessons Learned Report―
Environment,
Safety, and Health Evaluations, Office of Independent Oversight and Performance
Assurance, March.
U.S. Department
of Energy, 2004a, Oversight of Complex,
High-Hazard Organizations, Implementation Plan for Recommendation 2004-1,
December 23.
U.S. Department
of Energy, 2004b, Independent Oversight
Lessons Learned Report―
Environment, Safety, and Health
Evaluations, Office of
Independent Oversight and Performance Assurance, July.
Westinghouse
Savannah River Company, 2004, Conduct of
Research & Development:
Integrated Safety Management for the
R&D Environment,
WSRC-IM-97-00024, Rev. 3.
GLOSSARY
Abbreviation | Definition
Board | Defense Nuclear Facilities Safety Board
CFR | Code of Federal Regulations
DAF | Device Assembly Facility
DOE | Department of Energy
DSA | Documented Safety Analysis
EM | Office of Environmental Management
ES&H | Environment, Safety and Health
ISM | Integrated Safety Management
LANL | Los Alamos National Laboratory
LASO | Los Alamos Site Office
LLNL | Lawrence Livermore National Laboratory
LSO | Livermore Site Office
LWC | Lost Workday Case
MEOI | Maximally Exposed Offsite Individual
NNSA | National Nuclear Security Administration
NSO | Nevada Site Office
NTS | Nevada Test Site
OA | Office of Independent Oversight and Performance Assurance
ORPS | Occurrence Reporting and Processing System
R&D | Research and Development
REOP | Real Estate/Operations Permit
SNL | Sandia National Laboratories
SRS | Savannah River Site
SS-21 | Seamless Safety for the 21st Century
SSO | Sandia Site Office
TEDE | Total Effective Dose Equivalent
TSR | Technical Safety Requirement
Y-12 | Y-12 National Security Complex
[1] DOE Policy 450.4 currently specifies that environmental protection is included within ISM. Yet many sites have not fully integrated their management programs to include environmental management.
[2] DOE Policy 450.4 states that a safety mechanism defines “how the core safety functions are performed.”
[3] NTS is a unique site that combines production-like activities, such as mining, with laboratory experimental activities.
[4] Note that quotations such as these are paraphrased from comments made during the interviews conducted for this study.
[5] The hazard analysis for a Hazard Category 2 nuclear facility shows the potential for significant onsite consequences. The analysis for a Hazard Category 3 nuclear facility shows the potential for only significant localized consequences.
[6] The number of nuclear facilities and MEOI data quoted throughout this report are based on Safety Analysis Reports and Documented Safety Analysis reports published as of the time of the site visits.
[7] Three facilities have unmitigated MEOIs with calculated doses that exceed 200 rem TEDE.
[8] Note that potential accidents have been avoided when Pantex technicians have called a “stop work” as soon as off-normal operations were discovered.
[9] Workers include operators, engineers, technicians, and scientists who perform, design, authorize, oversee, and manage hazardous operations and facilities.