
HSR&D Study



SHP 08-191
 
 
Refinement of an Automated Text Abstraction Informatics Tool
Jennifer H. Garvin PhD MBA
VA Medical Center, Philadelphia
Philadelphia, PA
Funding Period: May 2008 - September 2008

BACKGROUND/RATIONALE:
The VA redesigned its health care system, the Veterans Health Administration (VHA), through better use of information technology, measurement and reporting of performance, reorganization of health care delivery, and realignment of payment policies [2-3]. However, collecting data for quality measures through retrospective chart review is time consuming and expensive [1]. Automating the abstraction process with software would benefit the VA because data could be obtained more quickly and at lower cost. The use of such software for this purpose would also be a step toward wider use of automated text extraction in VA text documents. Informatics tools have been used to extract data from several document types in non-VA settings, including imaging reports [4-5], discharge summaries [6-7], and problem lists [8]. Within the VA, informatics tools have been used to extract data with high accuracy from documents such as physical examinations for disability claims [9]. Another study used text processing software to detect central venous catheter adverse events using extracts of VA records [1]. However, text extraction or natural language processing software has not been studied with VA discharge instructions to evaluate whether performance measurement criteria are met. Because it is important to use an iterative process to refine the accuracy of the Multithreaded Clinical Vocabulary Server (MCVS), i.e., to train the software, within the clinical context in which it will be used [10], it is important to pilot the use of the MCVS to extract data from VA discharge instructions.

OBJECTIVE(S):
Research Objectives
Hypothesis: An automated text-abstraction process will be as accurate as human abstractors in evaluating whether the required elements are complete in the document used at the PVAMC to record discharge instructions for inpatients with congestive heart failure (CHF).
Aim #1: To evaluate methods for comparing the accuracy of the MCVS and of human abstractors in identifying the presence of complete discharge instructions for inpatients at the PVAMC with congestive heart failure, based on the External Peer Review Program (EPRP) required data elements.
Subaim #1: To evaluate accuracy assessment methods related to the MCVS's ability to extract data, using a comparison with manual data abstraction based on clinician review of PVAMC discharge instruction documents.
Subaim #2: To evaluate accuracy assessment methods related to the MCVS's ability to extract data, using a comparison with External Peer Review Program (EPRP) data for CHF inpatient cases in 2003.
Aim #2: To develop recommendations for the use of the MCVS with PVAMC discharge instructions so that the use of the tool can be improved.
Subaim #1: To determine what contributes to the accuracy of the MCVS.
Subaim #2: To determine the barriers to the accuracy of the MCVS.
Subaim #3: To determine methods by which the MCVS can demonstrate improved performance.

METHODS:
Study Design and Approach:
This is a descriptive study that will quantify the test characteristics of the Multithreaded Clinical Vocabulary Server's (MCVS) ability to identify the presence of complete discharge instructions, using a training dataset of 180 inpatients discharged from the PVAMC in 2003 with congestive heart failure. The criteria used to determine discharge instruction completion will be the same as the EPRP required elements. This study will provide a preliminary comparison of an existing manual method of data abstraction with a new method of automated data abstraction using the MCVS. It will also pilot the MCVS so that it can be trained in the clinical context in which it will be used in further studies.
Study Population and Sample: EPRP Data: There were 180 inpatients with CHF discharged in 2003. These cases will be used as a training set to improve the MCVS's ability to detect what constitutes a completed discharge instruction document. The criteria used by the EPRP to determine the presence of a completed discharge instruction are listed in a subsection below. In a subsequent study we will evaluate the discharge instructions of 100% of the inpatients discharged from the PVAMC with CHF in 2004-2007. A total of 635 cases with abstracted EPRP data are available for 2004-2007.
Statistical Power for Subsequent Research to Test the MCVS: The proposed research will prepare the MCVS for the next study, the test phase. During the test phase the MCVS will need to be used with 625 discharge instructions for PVAMC inpatients discharged with CHF in order to detect a difference of 2% from the mean rate of documentation with 95% confidence (2-sided alpha) and 80% power. The 635 cases from the EPRP analysis will therefore provide adequate power for the subsequent test phase. Considering all CHF discharges from 2003-2007, there are 815 cases. Because 180 inpatients with CHF were discharged in 2003, the training set will represent 22% of the 815 cases from EPRP data for CHF inpatients from 2003-2007. After the MCVS is trained, the remaining 635 cases (78%) will be used as a test set in a subsequent study that will evaluate the use of the MCVS in a larger set of discharge instruction documents. This research methodology has been established in prior studies by Brown et al. [1]
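The training and test proportions quoted above follow directly from the case counts. The short Python check below is only an illustration; the variable names are ours and are not part of the study protocol.

```python
# Case counts taken from the study description above.
training_cases = 180  # CHF inpatients discharged in 2003
test_cases = 635      # CHF inpatients discharged in 2004-2007 with EPRP data

total_cases = training_cases + test_cases
training_share = training_cases / total_cases
test_share = test_cases / total_cases

# Print the split as whole percentages.
print(f"total={total_cases}, training={training_share:.0%}, test={test_share:.0%}")
```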
Data Collection
Preparation of Documents: The discharge instructions for CHF inpatients will be identified for 2003. A research assistant will copy the discharge instructions for each case and paste them into a Word document. Each document will be de-identified so that all patient and provider identifiers are removed. The document will be saved with a pseudo-identifier on the secure VA CHERP server. Prior to using the MCVS, the documents will be uploaded directly from the secure VA CHERP server to the secure VA TVHS server, both of which are behind the VA firewall. The documents will then be available for use with the MCVS. The MCVS will be used via a VPN from Mayo Clinic to the TVHS server to extract required data elements. All records and data will remain behind the firewall on the TVHS server. This process was used in a prior VA study by Brown et al. [1]
Congestive Heart Failure Performance Measurement Criteria for Complete Discharge Instructions: According to the EPRP technical manual, each discharge instruction should contain the following 6 elements in the written instructions [11]:
1. Activity level
2. Diet
3. Discharge medications
4. Follow-up appointment with MD/NP/PA
5. Weight monitoring after discharge, including documentation that patients should weigh themselves daily, keep a record of their weight, be instructed about what weight change indicates a significant weight gain, and be told when to contact their health care provider if significant weight change occurs (for example, call provider if you gain >2-3 lbs overnight or call provider if you gain >3-5 lbs in the course of a week)
6. What to do if symptoms worsen
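The completeness criterion can be read as an all-or-nothing check over the six required elements. The following Python sketch illustrates that reading only; the element labels and function names are hypothetical and do not come from the EPRP technical manual.

```python
from typing import Iterable, List

# Hypothetical labels for the six EPRP-required elements named above.
REQUIRED_ELEMENTS = (
    "activity_level",
    "diet",
    "discharge_medications",
    "follow_up_appointment",
    "weight_monitoring",
    "worsening_symptoms",
)


def missing_elements(elements_found: Iterable[str]) -> List[str]:
    """Return the required elements that were not found, in checklist order."""
    found = set(elements_found)
    return [name for name in REQUIRED_ELEMENTS if name not in found]


def is_complete(elements_found: Iterable[str]) -> bool:
    """A discharge instruction is complete only if no required element is missing."""
    return not missing_elements(elements_found)


# Example: an instruction that lacks weight-monitoring guidance is incomplete.
example = ["activity_level", "diet", "discharge_medications",
           "follow_up_appointment", "worsening_symptoms"]
print(is_complete(example), missing_elements(example))
```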
Under EPRP data abstraction, the measure is designated as complete only if all elements of the discharge instructions are present. These data are obtained by one abstractor with assistance from medical center personnel. For the purposes of our study we will determine a gold standard for the presence of complete discharge instructions. Two trained chart abstractors will review the discharge instructions for the presence of the required elements. If the two reviewers agree that all required instructions are present, the discharge instructions will be recorded as being complete. If there is disagreement between the two reviewers, a third person will adjudicate to complete the gold standard. For example, if the third reviewer determines that the discharge instructions are incomplete, the final determination recorded in the database will be that the discharge instructions are incomplete. In contrast to the EPRP reviewers, who can access the entire medical record, the reviewers in our study will have access only to the discharge instructions. We will limit the data available to our human abstractors because the discharge instructions will be the only document available to the MCVS for analysis.
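The adjudication logic for the gold standard can be sketched in a few lines of Python. This is a minimal illustration, not the study's data entry procedure: the function name is hypothetical, and it assumes that agreement on "incomplete" is recorded as incomplete in the same way that agreement on "complete" is recorded as complete.

```python
from typing import Optional


def gold_standard(reviewer_a: bool, reviewer_b: bool,
                  adjudicator: Optional[bool] = None) -> bool:
    """Return True if the discharge instructions are recorded as complete.

    When the two reviewers agree, their shared determination is recorded.
    When they disagree, the third reviewer's determination is final.
    """
    if reviewer_a == reviewer_b:
        return reviewer_a
    if adjudicator is None:
        raise ValueError("reviewers disagree; an adjudicator decision is required")
    return adjudicator


# Reviewers disagree and the third reviewer finds the instructions incomplete.
print(gold_standard(True, False, adjudicator=False))
```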
Following the determination of the gold standard based on manual chart abstraction, the results of the human abstractors will be compared to the EPRP data for the same records. Tests of significance will be used to determine whether there is a statistically significant difference between the results of the gold standard development and the EPRP data abstraction. In addition, when the determinations of complete or incomplete instructions differ, the medical record and the EPRP data sheet will be reviewed to determine the cause of the difference. This will help characterize the limitations of the performance of the MCVS. For example, if EPRP reviewers use information from other parts of the medical record in addition to the discharge instructions, that information will not be available to the MCVS.
Process of Training the MCVS
We will reformulate all 6 human-readable discharge instruction criteria into a format suitable for computer implementation, based on the methodology developed by Brown et al. [9] First, we will manually identify the concepts contained in each criterion and map them into SNOMED CT using a terminology browser. We will map concepts to either single SNOMED CT concepts or to explosions of SNOMED CT concepts. An exploded concept is linked to more specific subconcepts within the terminology. For example, the explosion of myocardial infarction includes anterior myocardial infarction, inferior myocardial infarction, lateral myocardial infarction, and several other subtypes of myocardial infarction from the SNOMED CT hierarchies. We will represent concepts that cannot be mapped to SNOMED CT using simple strings. Second, we will build complex rules by combining mapped concepts and unmapped strings with the Boolean operators and, or, and not. Finally, we will specify the section of the discharge instruction (e.g., activity level, diet, etc.) to which each rule is to be applied. This approach will allow the specification of computer-usable rules that can be applied to each discharge instruction via 3 steps.
A standard process has been established to prepare the Multithreaded (Mayo) Clinical Vocabulary Server (MCVS) for use in a new dataset [9]. Applying this method in our study, in the first step the MCVS separates discharge instructions into report sections (e.g., activity level, diet, etc.). In the second step, the MCVS indexes the document. Before indexing, words are normalized using a variation of the National Library of Medicine's public domain software program NORM from the Unified Medical Language System's Knowledge Source Server, and then sentences are broken into single-word and multiword phrases.
The MCVS indexes the phrases from each discharge instruction using SNOMED CT; in so doing, it separately identifies phrases that indicate positive, negative, and uncertain assertions and constructs compositional expressions from combinations of simpler concepts (e.g., left foot is represented as the body part foot with laterality left). The MCVS has been extensively tested in Mayo's usability laboratory and has been described in the medical literature. In the third step, a rules evaluation engine sequentially applies the rules expressed in health assessment language against the indices created for each discharge instruction.
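To make the rule-building idea concrete, the Python sketch below combines concepts with and, or, and not and applies a rule to the concepts indexed in one section. It is a toy illustration under stated assumptions: it is not the MCVS or its health assessment language, the concept labels are plain strings rather than SNOMED CT identifiers, the helper names are hypothetical, and the explosion covers only the one level of subconcepts supplied in the example hierarchy.

```python
from typing import Callable, Dict, Set

# A rule is a predicate over the set of positively asserted concepts
# that were indexed in one section of a discharge instruction.
Rule = Callable[[Set[str]], bool]


def concept(name: str) -> Rule:
    """Match a single mapped concept or an unmapped string."""
    return lambda found: name in found


def explode(parent: str, hierarchy: Dict[str, Set[str]]) -> Rule:
    """Match the parent concept or any of its listed subconcepts."""
    names = {parent} | hierarchy.get(parent, set())
    return lambda found: bool(names & found)


def all_of(*rules: Rule) -> Rule:
    """Boolean AND of the given rules."""
    return lambda found: all(rule(found) for rule in rules)


def any_of(*rules: Rule) -> Rule:
    """Boolean OR of the given rules."""
    return lambda found: any(rule(found) for rule in rules)


def negate(rule: Rule) -> Rule:
    """Boolean NOT of the given rule."""
    return lambda found: not rule(found)


# Toy hierarchy based on the myocardial infarction example in the text.
HIERARCHY = {
    "myocardial infarction": {
        "anterior myocardial infarction",
        "inferior myocardial infarction",
        "lateral myocardial infarction",
    }
}
mi_rule = explode("myocardial infarction", HIERARCHY)

# Hypothetical rule for the weight monitoring section.
weight_rule = all_of(
    concept("weigh daily"),
    any_of(concept("call provider"), concept("contact clinic")),
)

# Each rule is applied only to the section it was specified for.
indexed_sections = {"weight_monitoring": {"weigh daily", "call provider"}}
print(weight_rule(indexed_sections["weight_monitoring"]))
print(mi_rule({"inferior myocardial infarction"}))
```

Representing rules as composable predicates keeps the failure analysis simple: adding or deleting a concept from a rule changes one argument in one expression.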
Multiple cycles of rule improvement will be conducted using the training set. In the first step of each cycle, the computer-usable rules will be applied to each discharge instruction included in the training set using concept-based indexing software. The results of the algorithmic approach will be compared to the gold standard of human expert review. The true-positive rate, false-positive rate, true-negative rate, false-negative rate, sensitivity, and specificity will be calculated for each quality rule. The second step of the rule improvement cycle will be a manual failure analysis of the false-positive and false-negative results. Rule modification based on the failure analysis will be the final step of the cycle. Mapped concepts or strings will be either added to or deleted from existing rules in an attempt to improve performance. When the improvement cycles reach a preset level of accuracy (90% sensitivity and 75% specificity) on the training set, we will evaluate the resulting final rule set on the test set of discharge instructions.
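The per-rule accuracy measures and the stopping criterion can be computed as in the Python sketch below. The function names and the example data are illustrative assumptions; only the 90% sensitivity and 75% specificity targets come from the text.

```python
from typing import Sequence, Tuple


def confusion_counts(predicted: Sequence[bool],
                     gold: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Count (TP, FP, TN, FN) for rule output against the gold standard."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fp = sum(1 for p, g in pairs if p and not g)
    tn = sum(1 for p, g in pairs if not p and not g)
    fn = sum(1 for p, g in pairs if not p and g)
    return tp, fp, tn, fn


def sensitivity_specificity(predicted: Sequence[bool],
                            gold: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(predicted, gold)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def meets_target(sensitivity: float, specificity: float,
                 min_sensitivity: float = 0.90,
                 min_specificity: float = 0.75) -> bool:
    """Check the preset accuracy level that ends the improvement cycles."""
    return sensitivity >= min_sensitivity and specificity >= min_specificity


# Made-up example: 10 complete and 4 incomplete documents per the gold standard.
gold = [True] * 10 + [False] * 4
predicted = [True] * 9 + [False] + [False] * 3 + [True]
sens, spec = sensitivity_specificity(predicted, gold)
print(sens, spec, meets_target(sens, spec))
```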
Data Analysis Plan
During development, the evolving rules will be applied iteratively to the training set (n=180) of discharge instructions. The results of the algorithmic review will be compared with the gold standard results generated by human expert review. Sensitivity, specificity, and percentage of agreement with the consensus gold standard will be generated for each rule. The results of the algorithmic review will also be compared to the selected EPRP data. In all instances where the determination of discharge instruction completion differs between the manual review and the EPRP data, the reasons for the discrepancies will be examined. We will calculate inter-rater reliability and test for significant differences using the kappa statistic. These statistics will be calculated between the following pairs: the text extractor and the EPRP data, the gold standard based on manual review and the text extractor, and the gold standard based on manual review and the EPRP data.
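For reference, Cohen's kappa for two raters is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The Python sketch below computes it for the three pairings named above. The determinations are made-up example data, the function name is ours, and the sketch does not include a significance test.

```python
from itertools import combinations
from typing import Sequence


def cohen_kappa(rater_a: Sequence[bool], rater_b: Sequence[bool]) -> float:
    """Cohen's kappa for two raters who labelled the same documents."""
    a, b = list(rater_a), list(rater_b)
    if len(a) != len(b) or not a:
        raise ValueError("raters must label the same non-empty set of documents")
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    categories = set(a) | set(b)
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Made-up complete/incomplete determinations for eight documents.
determinations = {
    "gold_standard": [True, True, True, False, False, False, True, False],
    "text_extractor": [True, True, False, False, False, True, True, False],
    "eprp": [True, True, True, False, False, False, True, True],
}

# Kappa for each of the three pairings.
for first, second in combinations(determinations, 2):
    kappa = cohen_kappa(determinations[first], determinations[second])
    print(f"{first} vs {second}: kappa={kappa:.2f}")
```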

FINDINGS/RESULTS:
No results at this time.

IMPACT:

PUBLICATIONS:
None at this time.


DRA: none
DRE: none
Keywords: none
MeSH Terms: none