Data Quality User Group Meeting Minutes
May 16, 2001


eRA Data Quality Group Meeting
Date: May 16, 2001
Time: 11:00 am
Location: Rockledge II, Room 6201
Next Meeting: June 1, 2001, 9:00 am, Rockledge II, Room 6201

Action Items

1. QRC Statement of Work (SOW) - Read final version provided by Bob. (All)

2. Prioritization of SOW Tasks - Provide Belinda with input. (All)

3. Develop Evaluation Plan to Measure QRC Progress - Provide Belinda with input. (All)

Discussion Items

1. Data Quality Improvement Contract

The contract to address personal identifier data problems in IMPAC II and the NIH Commons has been awarded to QRC, a division of the Macro International Corporation (the only bidder). Funding has been approved for three years at $2 million per year; the contract, however, includes an optional fourth year. Bob Moore will be the Project Officer.

The SOW (copies provided) contains three main tasks. The first is to develop and execute software to find redundant or incorrectly collapsed personal profiles. Existing IMPAC II discrepancy tools are not satisfactory for identifying errors. The second task entails the "brute force" data cleanup. This effort would be much easier if original applications had been scanned. Planning and implementing an environment that reduces the risk of future errors is the contractor's third task. Data improvement will require policy decisions regarding profile definition, ownership and update authority.
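
As an illustration only of the first task, the sketch below flags possibly redundant profiles by grouping records that share a normalized last name, first initial, and birth date. The field names, the matching key, and the function names are all assumptions made for the example; the minutes do not describe the IMPAC II person tables or QRC's actual approach. The sketch does not address incorrectly collapsed profiles, which would need a separate check.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    # Hypothetical fields; the real IMPAC II schema is not given in these minutes.
    person_id: int
    last_name: str
    first_name: str
    birth_date: str  # ISO date string, e.g. "1950-03-14"


def normalize(name: str) -> str:
    """Lower-case a name and keep only its letters."""
    return "".join(ch for ch in name.lower() if ch.isalpha())


def candidate_duplicates(profiles):
    """Return lists of person_ids that share the (assumed) matching key."""
    groups = defaultdict(list)
    for p in profiles:
        # Assumed key: normalized last name, first initial, birth date.
        key = (normalize(p.last_name), normalize(p.first_name)[:1], p.birth_date)
        groups[key].append(p.person_id)
    # Only groups with more than one profile are candidates for manual review.
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


profiles = [
    Profile(1, "O'Brien", "Mary", "1950-03-14"),
    Profile(2, "OBrien", "M.", "1950-03-14"),
    Profile(3, "Smith", "John", "1960-01-01"),
]
duplicates = candidate_duplicates(profiles)
```

A key this loose will produce false matches, so its output is best treated as a review queue rather than a list of confirmed duplicates.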

A kickoff meeting with QRC is scheduled for June 1. Jim Cain will attend. The purpose of the meeting is to present our perception of the data problems. In preparation for the meeting, team members were requested to read the SOW and to provide Belinda with input on prioritization of the tasks.

2. eRA Background

Belinda explained the role of the Advocates in the eRA decision-making process. These individuals have surrendered their "turf" and committed to supporting what's best for NIH as a whole. At the outset of the project, the Advocates reviewed and prioritized all software requirements. Data cleanup ranked third on the list of top priorities. The Steering Committee then approved the funding of these priorities. Expenditures/accomplishments will be closely tracked. Therefore, the team was asked to provide Belinda with input on developing a plan to monitor contractor performance in achieving data quality objectives.

The QRC award was preceded by a small contract to ROW/Logicon. ROW's expertise, however, is systems and not data. ROW will continue to perform the Oracle DBA function. It is expected that ROW and QRC will interact cooperatively.

3. Data Problems in IMPAC II

a. Maria has taken the lead in identifying the problems. She is most concerned about data problems in the affiliated tables, as well as corrections being extended to the IRDB. We now have a good audit trail of major person table changes. However, we may not be seeing all changes for all affiliated tables. Carl believes that logging all changes should be a top priority.
b. Data problems may not be limited to extramural grantees. We may need to clean up internal staff records as well.
c. Bob expressed concern that the number of first-time NCI PIs appears to have dropped. Are the statistics from the database accurate? We need to report these numbers to Congress.
d. We need quality control for the inputting of PI degrees by CSR contractor staff.

4. Data Correction and Improvement Considerations

a. In some cases, it may be necessary to go back to the original application in order to resolve personal identifier problems. Therefore, the contractors may need access to the IC grant files and archives. It was suggested that EPMC serve as the venue for informing the ICs of this requirement.
b. Unless restrictive business rules coincide with cleanup, we will lose ground. Of approximately 400 distinct SSNs corrected in a previous cleanup, 6% were subsequently changed within 18 months.
c. For corroborating information on PIs, QRC may need to check other databases such as those maintained by professional organizations (e.g., the AMA and ADA) and other grantors (e.g., the American Cancer Society). The need to go outside of the NIH will be brought up at the next meeting with QRC.
d. We need to establish a consistent, accurate database before allowing access via the Commons. It is essential for QRC to coordinate with Commons developers so that consistent business rules are implemented to minimize incoming errors. Ultimately, providing PIs access to (and/or ownership of) their records should serve to improve data quality.
e. We need a mechanism for measuring the quality of IMPAC II data.
f. Bob believes we need a focal point that has primary responsibility for data integrity. At present, the responsibility for data quality is undefined.
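
One possible quality measure, sketched here under stated assumptions, is the share of corrected records that are changed again within a fixed window. This is the kind of figure cited in item b, where 6% of approximately 400 corrected SSNs (roughly 24) were changed again within 18 months. The data layout, the function name, and the 548-day approximation of 18 months are hypothetical choices for illustration, not an agreed metric.

```python
from datetime import date


def rechange_rate(corrections, later_changes, window_days=548):
    """Fraction of corrected records that were changed again within the window.

    corrections: dict mapping record id -> date the record was corrected
    later_changes: dict mapping record id -> list of dates the record changed
    window_days: assumed approximation of 18 months (about 548 days)
    """
    if not corrections:
        return 0.0
    rechanged = 0
    for record_id, corrected_on in corrections.items():
        for changed_on in later_changes.get(record_id, []):
            elapsed = (changed_on - corrected_on).days
            # Count the record once if any change falls inside the window.
            if 0 < elapsed <= window_days:
                rechanged += 1
                break
    return rechanged / len(corrections)


# Made-up sample data: four corrected records, one re-changed within the window.
corrections = {
    1: date(1999, 1, 1),
    2: date(1999, 1, 1),
    3: date(1999, 1, 1),
    4: date(1999, 1, 1),
}
later_changes = {
    1: [date(1999, 6, 1)],   # changed again after 151 days
    2: [date(2001, 1, 1)],   # changed again after 731 days, outside the window
}
rate = rechange_rate(corrections, later_changes)
```

Tracking such a rate before and after new business rules take effect would give one indication of whether cleanup gains are being retained.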

Belinda intends to communicate with the team via email. Meetings are difficult to coordinate. As work on the contract proceeds, we may need to bring in additional staff for expertise and assistance.

Attendees
Carol Bleakley (OD), Madeline Monheit (OD), Carl Roth (NHLBI), Maria Bukowski (OD), Bob Moore (OD), Walter Schaffer (OD), Jim Lipton (NIDCR), Jim Onken (NIGMS), Belinda Seto (OD)
