This draft handbook discusses how verification, validation and evaluation (VV&E) should be incorporated into the expert system life cycle, shows how to partition knowledge bases with or without expert domain knowledge, presents knowledge models, presents methods of validating domain (the experts') knowledge, and discusses management issues related to expert systems development and testing. Mathematical proofs for partitioning and consistency and visualization of concepts are also presented.
I would have considered the draft handbook to be of little use if we were unable to do a pen and paper analysis using its procedures on a real-world expert system of reasonable size and complexity (with computer support for matrix manipulation, solving differential equations, etc.). The expert system PAMEX: Expert System for Maintenance Management of Flexible Pavements, with 327 rules, 20 input variables and 59 qualifiers, was selected for the in-house pen and paper analysis. The results of this analysis provided insights into PAMEX and identified errors in programming that the developers and users were unaware of.
This draft handbook will also be field tested in a number of States using operational expert systems as test cases. At the end of this testing (probably in about one year), the handbook will be updated based on the results of the testing.
It will be impossible to respond directly to every comment, but be assured that every comment will be reviewed and given the consideration it deserves. Thank you in advance for your assistance.
Sincerely yours,
James A. Wentworth
EXECUTIVE SUMMARY
This handbook (MS Word 6.0 document downloadable as a pkzip file) has been prepared for the Federal Highway Administration to cover the subject of verification, validation, and evaluation (VV&E) of expert systems. The difficulty of performing VV&E on expert systems is one of the major factors slowing the development and acceptance of expert systems in the transportation community. There is little agreement among experts on how to accomplish the VV&E of expert systems. The complexity and uncertainty related to these tasks have led to a situation where most expert systems are not adequately tested. In some cases testing is ignored until late in the development cycle, with predictably disastrous results.
This guide discusses how VV&E should be incorporated into the expert system lifecycle, shows how to partition knowledge bases with and without expert domain knowledge, presents knowledge models, presents methods for validating the experts' underlying knowledge, and presents management issues related to expert systems development and testing. Mathematical proofs for partitioning, consistency, and completeness are presented, along with visualizations of the concepts. The relevant information for this handbook came from the research efforts of a three-entity task force represented by James A. Wentworth of the Federal Highway Administration, Rodger Knaus of MiTech, and Hamid Aougab of RAM.
In traditional software engineering, testing [verification, validation, and evaluation (VV&E)] is claimed to be an integral part of the design and development process. However, in the field of expert systems, there is little consensus on what testing is necessary or how to perform it. Furthermore, many of the procedures that have been developed are so poorly documented that it is difficult, if not impossible, for the procedures to be reproduced by anyone other than the originator. Also, many procedures used for VV&E were designed to be specific to the particular domain in which they were introduced.
Prompted by this lack of consensus among experts and by inadequate procedures and tools, the Federal Highway Administration (FHWA) developed this guideline for expert system verification, validation, and evaluation. The guideline is needed because knowledge engineers today seldom design and carry out rigorous test plans for expert systems.
Verification of an expert system is the task of determining that the system is built according to its specifications. Validation is the process of determining that the system actually fulfills the purpose for which it was intended. Evaluation reflects the acceptance of the system by the end users and its performance in the field. In other words the VV&E elements of the expert system are designed to:
a) Verify to show the system is built right.
b) Validate to show the right system was built.
c) Evaluate to show the usefulness of the system.
Once the system is validated, the next step is to verify it. This involves completeness and consistency checks and examining for technical correctness using techniques such as are described in this handbook. The final step is evaluation. For the serviceability program, this means giving the system to engineers to use in computing the coefficient. Although the system is known to produce the correct result, it could fail the evaluation because it is too cumbersome to use, requires data that is not readily available, does not really save any effort, does something that can be estimated accurately enough without a computer, solves a problem rarely needed in practice, or produces a result not universally accepted because different people define the coefficient in different ways.
Expert systems use computational techniques that involve making guesses, just as human experts do. Like human experts, the expert system will be wrong some of the time, even if the expert system contains no errors. The knowledge on which the expert system is based, even if it is the best available, does not completely predict what will happen. For this reason alone, it is important for the human expert to validate that the advice being given by the expert system is sound. This is especially critical when the expert system will be used by persons with less expertise than the expert, who cannot themselves judge the accuracy of the advice from the expert system.
In addition to the mistakes an expert system will make because the available knowledge is not sufficient for prediction in every case, expert systems contain only a limited amount of knowledge concentrated in carefully defined knowledge areas. Today's expert systems have no common sense knowledge. They only "know" exactly what has been input into their knowledge bases. There is no underlying truth or fact structure to which they can turn in cases of ambiguity. If the expert system does not realize its mistake, and it is being used by a person with limited expertise, there is nobody to detect the error. Therefore, where the expert system is going to be used by someone without expertise, and the decisions made have the potential for harm if made badly, the very best effort at verification and validation is required.
Another problem in VV&E for expert systems is that expert system languages are unstructured in order to accommodate relatively unstructured applications. However, rigid structure in implementing code is a key technique for writing verifiable code, as in the Cleanroom approach.
Selected for the FHWA Handbook are those techniques which seem the most straightforward, precise and powerful in practice. Included are particular variations of partitioning, incidence matrices, and the use of metaknowledge (i.e., knowledge models).
This handbook will provide guidance on planning and decision making early in an expert systems project. This guidance applies not only to new developments but also to improved decision making at any stage from development through implementation, including planning the verification, validation, and evaluation of an already developed system. The advice given here should aid in developing a clear problem definition and thorough system requirements, reflecting realism from both technical and organizational viewpoints. Risk identification information is also provided.
This handbook will discuss how VV&E should be incorporated into the expert system lifecycle. Although some ideas may be used for revising and/or reengineering existing systems, the aim is to design new systems and to ensure that enough VV&E operations are done during the lifecycle so that these systems are verifiable. This includes decisions that should be made during system specification and verification/validation during stepwise development of an expert system.
An overview of the basic method for formal proofs is provided. The method is to prove the correctness of small systems by non-recursive means, to partition larger systems into smaller systems, and to ensure that the component systems are proved to possess the correct relations as required by the partitioning theorems. The basic proof method also ensures that the components agree among themselves. In addition, this handbook will cover selected techniques for partitioning large expert systems when expert knowledge is unavailable.
Generally, it is best to partition a knowledge base using expert knowledge. This results in a knowledge base that reflects the expert's conception of the knowledge domain. This in turn facilitates communication with the expert, and later maintenance of the knowledge base. However, sometimes it is not possible to obtain expert insight into a knowledge base. In this case functions and incidence matrices can be extracted from the knowledge base, and the information contained therein used to partition the knowledge base.
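As a rough illustration of partitioning without expert insight, the sketch below extracts an immediate dependency matrix from a toy rule base, closes it transitively, and groups the variables into candidate partitions. The rule format, the variable names, the use of Warshall's algorithm for the closure, and the connected-component grouping are all assumptions made for this example; they are not taken from the handbook's own procedure.

```python
from itertools import product

# Hypothetical toy rule base: each rule maps a set of premise variables
# to one concluded variable. All names are illustrative only.
RULES = [
    ({"cracking", "rutting"}, "distress"),
    ({"distress", "traffic"}, "treatment"),
    ({"budget"}, "funding"),
    ({"funding", "season"}, "schedule"),
]

def immediate_dependency(rules):
    """Return (variables, matrix); matrix[i][j] is True when variable i is
    concluded by a rule whose premises mention variable j."""
    variables = sorted({v for prem, concl in rules for v in prem | {concl}})
    index = {v: i for i, v in enumerate(variables)}
    n = len(variables)
    matrix = [[False] * n for _ in range(n)]
    for prem, concl in rules:
        for p in prem:
            matrix[index[concl]][index[p]] = True
    return variables, matrix

def transitive_closure(matrix):
    """Warshall's algorithm: closure[i][j] is True when variable i depends
    on variable j through a chain of one or more rules."""
    n = len(matrix)
    closure = [row[:] for row in matrix]
    for k, i, j in product(range(n), repeat=3):  # k is the outermost loop
        if closure[i][k] and closure[k][j]:
            closure[i][j] = True
    return closure

def variable_clusters(variables, matrix):
    """Group variables linked by any dependency, ignoring direction.
    Each group is a candidate partition of the knowledge base."""
    n = len(variables)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            i = stack.pop()
            if i in group:
                continue
            group.add(i)
            stack.extend(j for j in range(n) if matrix[i][j] or matrix[j][i])
        seen |= group
        clusters.append(sorted(variables[i] for i in group))
    return clusters

variables, matrix = immediate_dependency(RULES)
closure = transitive_closure(matrix)
clusters = variable_clusters(variables, matrix)
for cluster in clusters:
    print(cluster)
```

In this toy rule base the pavement-condition variables and the funding variables never influence each other, so they fall into two separate clusters that could be verified independently.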
Knowledge models are high level templates for expert knowledge. These templates express the high level structure of the expert knowledge. Examples of knowledge models are decision trees, flowcharts and state diagrams. By organizing the knowledge, a knowledge model helps with VV&E by suggesting strategies for proofs and partitions; in addition, some knowledge models have mathematical properties that help establish completeness, consistency or specification satisfaction.
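To make the link between a knowledge model and completeness concrete, the following sketch checks one structural property of a decision tree: every decision node should have a branch for each value in the domain of the variable it tests. The tree encoding, the variable domains, and the treatment names are hypothetical and chosen only for illustration.

```python
# Hypothetical finite domains for the variables tested by the tree.
DOMAINS = {
    "cracking": ["none", "moderate", "severe"],
    "traffic": ["light", "heavy"],
}

# A node is either a leaf (a string conclusion) or a pair
# (variable, {value: subtree}). This encoding is an assumption.
TREE = ("cracking", {
    "none": "do nothing",
    "moderate": ("traffic", {"light": "seal cracks", "heavy": "overlay"}),
    "severe": "reconstruct",
})

def missing_branches(tree, domains, path=()):
    """Return a list of (path, variable, missing_values) entries, one for
    each decision node that lacks a branch for some value in the domain."""
    if isinstance(tree, str):  # a leaf has nothing to check
        return []
    variable, branches = tree
    gaps = []
    missing = [v for v in domains[variable] if v not in branches]
    if missing:
        gaps.append((path, variable, missing))
    for value, subtree in branches.items():
        gaps.extend(missing_branches(subtree, domains, path + ((variable, value),)))
    return gaps

gaps = missing_branches(TREE, DOMAINS)
print("complete:", not gaps)
```

Under this encoding each input follows exactly one path to a leaf, so a tree with no missing branches assigns exactly one conclusion to every input. That is the kind of mathematical property a knowledge model can contribute to a completeness or consistency argument.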
Small expert systems are those for which direct proof of completeness, consistency and specification satisfaction is feasible without partitioning the knowledge base. This handbook discusses techniques for these proofs.
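For a system small enough to examine directly, one simple way to check completeness and consistency is to enumerate every combination of input values. The sketch below does this for a toy rule base. The variables, domains, and rules are hypothetical. It also assumes a single-valued output, so an input case that yields two different conclusions counts as an inconsistency. It illustrates the idea and is not the handbook's proof technique.

```python
from itertools import product

# Hypothetical input variables and their finite domains.
DOMAINS = {
    "risk_tolerance": ["low", "high"],
    "horizon": ["short", "long"],
}

# Hypothetical rules: (premise as a partial assignment, conclusion).
RULES = [
    ({"risk_tolerance": "low"}, "savings"),
    ({"risk_tolerance": "high", "horizon": "long"}, "stocks"),
    ({"risk_tolerance": "high", "horizon": "short"}, "bonds"),
]

def fired_conclusions(rules, case):
    """Conclusions of every rule whose premise matches the input case."""
    return {concl for prem, concl in rules
            if all(case[var] == val for var, val in prem.items())}

def check_rule_base(domains, rules):
    """Enumerate every input case. Cases with no conclusion show
    incompleteness; cases with several conclusions show inconsistency."""
    names = list(domains)
    uncovered, conflicting = [], []
    for values in product(*(domains[n] for n in names)):
        case = dict(zip(names, values))
        conclusions = fired_conclusions(rules, case)
        if not conclusions:
            uncovered.append(case)
        elif len(conclusions) > 1:
            conflicting.append((case, sorted(conclusions)))
    return uncovered, conflicting

uncovered, conflicting = check_rule_base(DOMAINS, RULES)
print("complete:", not uncovered)
print("consistent:", not conflicting)
```

The number of cases is the product of the domain sizes, so exhaustive enumeration is only practical for small systems. This is one reason larger knowledge bases are partitioned before proofs are attempted.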
Finally, evaluation, which includes field testing, addresses the question "Is the system valuable?" The answer is reflected in the acceptance of the system by its end users and the performance of the system in application. This handbook addresses this issue and offers some general guidelines to help in the distribution and maintenance of expert systems.
Audience | Task to be Performed | Part of Handbook |
Managers | Manage expert system project | Introduction |
Knowledge Engineers | Build new expert systems | Techniques VV&E on New Systems |
Knowledge Engineers | Perform VV&E on existing systems | Techniques VV&E on Existing Systems |
Highway Engineers | Ensure that a correct new expert system is built | VV&E on New Systems |
Highway Engineers | Ensure that an existing expert system has been validated | VV&E on Existing Systems |
Software Researchers | Critique and extend VV&E methods | Techniques VV&E on Existing Systems VV&E on New Systems |
1. Introduction
Basic Definitions
Need for V&V
Problems in Implementing Verification, Validation, and Evaluation for Expert Systems
Intended Audiences for the Handbook
2. Verification and Validation: Past Practices
3. Planning and Management
Introduction
Identify the Need for an Expert System
The Development Team
The T / E Team
4. Developing a Verifiable System
Introduction
Specification
The Importance of Specifications
The General Form of Specifications
Defining Specifications
Gather Informal Descriptions of Specifications
Obtain Expert Certification of the Specifications
Validating Informal Descriptions of Specifications
Validating the Translation of Informal Descriptions
Validation of Formalized Requirements
Step-Wise Refinement Development
Design
Implementation
Correctness Verification
5. The Basic Proof Method
Introduction
Overview of Proofs Using Partitions
A Simple Example
6. Finding Partitions without Expert Knowledge
Introduction
Functions
Expert Systems are Mathematical Functions
Partitioning Functions into Compositions of Simpler Functions
Cartesian Product
Function Composition
Dependency Relations
Immediate Dependency Relation
Operations on Relations
Finding Functions in a Knowledge Base
Choosing the Output and Input Variables of a Function
Finding the Knowledge Base that Computes a Function
Hoffman Regions
The Hoffman Regions of KB1
When is a Partitioning Advantageous
Hoffman Regions of Partitioned KB1
7. Knowledge Modeling
Introduction
An Example of a Knowledge Model
Using Knowledge Models in VV&E
Decision Trees
Introduction
Definition
Example
Use During Development
Use During VV&E
Ripple Down Rules
Introduction
Definition
Example
Use During Development
Changing a Ripple Down Rule System
Use During VV&E
A Ripple-Down-Rule System is Complete.
State Diagrams
Introduction
Definition
Example
Use During Development
Use During VV&E
Flowcharts
Use During Development
Use During VV&E
Functionally Modeled Expert Systems
Introduction
Use During Development
8. VV&E for Small Expert Systems
Completeness
Consistency
Specification Satisfaction
Specification Based on Domain Subsets
Effect of the Inference Engine
Inference Engines for Very High Reliability Applications
9. Validating Underlying Knowledge
Introduction
Validating Knowledge Models
Validating the Semantic Consistency of Underlying Knowledge Items
Creating a TRUE/FALSE Test
Giving the Test
Formulating the Experiment
Analyzing the Test Results
Overall Agreement Among Experts
Approaches to Disagreement Among Experts
Clues of Incompleteness
Variable Completeness
Semantic Rule Completeness and Consistency
Validating Important Rules
Validating Confidence Factors
10. Testing
Simple Experiments for the Rate of Success
Selecting a Data Sample
Estimating a Proportion (Fraction) of a Population
The Confidence Interval of a Proportion
Choosing Sample Size
Estimating Very Reliable Systems
How a Proof Increases Reliability
11. Evaluation and Other Management Issues
Evaluation
Distributing And Maintaining Expert Systems
Distribution
Maintenance
Appendix
Symbolic Evaluation of Atomic Formulas
General Regression Neural Nets
References
Figure 1.1: The V&V Process
Figure 3.1: Initial Project Planning
Figure 3.1.1: KB1 Initial Project Planning
Figure 4.1: Developing a Verifiable System
Figure 4.2: Specification
Figure 4.2.1: KB1 Specification
Figure 4.2.2: KB1 Design
Figure 4.3: Correctness Verification
Figure 4.3.1: KB1 Implementation
Figure 5.1: Knowledge Base 1
Figure 5.2: An Example of Knowledge Base Partitioning
Figure 6.1: Immediate Dependency Relation as Ordered Pairs
Figure 6.2: Examples of domains
Figure 7.1: Pamex DT
Figure 7.2: Example ES
Figure 8.1: Completeness of Investment Subsystem
Figure 8.2: Consistency of I Subsystem
Figure 8.3: Example Specification for KB1
Figure 8.4: Symbolic Evaluation
Figure 8.5: Symbolic Inference Engine
Table 1.1: Intended Audiences for the Handbook
Table 2.1: Validation Methods
Table 2.2: Verification Methods
Table 2.3: V&V Software
Table 4.1: Level of Effort for the Correctness Verification Stage
Table 6.1: Immediate Dependency Relation for KB1
Table 6.2: Matrix Product of the DR by Itself
Table 6.3: Immediate DR of KB1
Table 6.4: Variable Clusters of the DR of KB1
Table 6.5: How Variables Influence Rules
Table 6.6: How Rules Influence Variables
Table 6.7: Immediate Dependency Matrix for KB1
Table 6.8: Hoffman Regions for KB1
Table 9.1: Confidence Level
Table 9.2: Confidence Level with One Expert Disagreeing