Last updated: June 18, 2008

CEB Projects




Automating the production of bibliographic records for MEDLINE


11 System performance evaluation

Assessing the performance of the MARS system is an important goal, not only to evaluate the efficiency of its constituent modules, but also to locate potential bottlenecks. In addition, since we seek the best way to create MEDLINE bibliographic records, it is important to compare the productivity (e.g., labor hours per unit record) of the MARS systems, both versions 1 and 2, against each other, as well as with that of the manual keyboarding operation done under contract. Key questions posed as a starting point for performance evaluation are listed as follows:

  1. How long does it take for a bibliographic record to be completed?
  2. What is the time taken by each manual and automatic process for one record?
  3. What is the time taken by each manual and automatic process for one day's workload?
  4. What is the time taken to enter each field in Edit?
  5. What is the error rate of the zoning, labeling and reformat modules?
  6. What is the utilization rate of MARS-2 server processes and workstations?
  7. How long does data wait before each of the daemon processes begins work on it, i.e., how long does it sit in a queue?
  8. How often is a citation re-processed? What are the most common reasons?
  9. What is the overall cost (in labor-hours) for the MARS-2 operation as compared to MARS-1 and the keyboard operation?

These questions are addressed quantitatively by instrumenting the system and analyzing the recorded data, mainly event counts and timing data. The instrumentation is implemented by two C++ classes: ProcessTime, which records times, and PerformanceData, which records statistics generated in a MARS process.
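The kind of data such instrumentation collects can be illustrated with a minimal sketch. The original classes are written in C++ and their interfaces are not shown in this section, so the Python below is only an illustration under stated assumptions: the `Recorder` class, its method names, and the event name "reprocessed" are hypothetical and are not the MARS API.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Recorder:
    """Hypothetical stand-in: collects elapsed times and event counts."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock              # injectable clock (assumption, eases testing)
        self.times = defaultdict(list)   # process name -> list of elapsed seconds
        self.counts = defaultdict(int)   # event name -> number of occurrences

    @contextmanager
    def timed(self, process):
        """Record how long the enclosed block takes under a process name."""
        start = self._clock()
        try:
            yield
        finally:
            self.times[process].append(self._clock() - start)

    def count(self, event, n=1):
        """Increment an event counter, e.g. for a re-processed citation."""
        self.counts[event] += n

    def average(self, process):
        """Average elapsed time per recorded run, or None if none recorded."""
        samples = self.times.get(process)
        return sum(samples) / len(samples) if samples else None


if __name__ == "__main__":
    rec = Recorder()
    with rec.timed("ZoneCzar"):
        pass  # the work for one citation would go here
    rec.count("reprocessed")
    print(rec.average("ZoneCzar"), dict(rec.counts))
```

Averaging the recorded times per process name gives per-record figures of the kind plotted in the figures of this section; the counters correspond to the event counts mentioned above.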

11.1 Process performance analysis

The instrumentation data yields information on the processes, both automatic and manual, at different levels of granularity. Figure 11.1 shows the average time taken by each process to complete its task for one bibliographic record (citation) in July 2001. Predictably, the manual processes of scanning, editing and reconciling take much longer than the automated ones. An explanation of the terminology: Edit_First and Edit_Second stand for the first and second Edit operators; Prod is the daemon, developed in-house, that controls the OCR system, and is hence equivalent to the OCR action; ZoneCzar combines the actions of automated zoning and labeling.


Figure 11.1

A breakdown of some of these processes into their constituents appears in Figures 11.2 and 11.3. Figure 11.2 shows that the actual scanning of a page ("append") takes a relatively short time, whereas inserting a missing page after the fact and entering a new journal issue ("New MRI") take much longer. The time consumed by the latter task was the rationale behind the development of the new CheckIn module, which eliminates this function from the scanning operation.

The actual workload for these time consuming processes is not high, because they do not occur frequently. For example, as shown in Figure 11.3, the actual burden of inserting pages is very low, since this operation is performed rarely.

Figure 11.4 shows the average time taken by the Edit operator to enter the fields that are not automatically extracted. Only the data for the first Edit operator is shown, since the data for the second operator is approximately the same. The figure also includes fields that are automatically extracted in compliant journals, because the system accommodates non-compliant journals as well. Furthermore, even in compliant journals, some pages (e.g., letters to the editor, editorials) are not processed by the automatic modules, so the Edit operator must key in the relevant data. The figure thus also indicates opportunities for further automation.


Figure 11.2




Figure 11.3




Figure 11.4



Since we keep track of operator names in the database, we also offer the supervisor the option of comparing the relative effectiveness of individual operators, as shown in Figure 11.5 for scanning and Figure 11.6 for editing.


Figure 11.5




Figure 11.6



11.2 Comparison of the three data entry systems

Here we compare the two systems, MARS-1 and MARS-2, and the manual keyboarding operation on the basis of a workload of 600 completed bibliographic records per day, the average workday load for all of these approaches. The table lists the average number of seconds per page for each system and the number of minutes per 600 records; the latter is also charted in Figure 11.7. MARS-2, by eliminating many of the manual functions in MARS-1, is a considerable improvement, and both are far more efficient than the manual keyboarding operation. To produce 600 records, MARS-2 requires about 61 hours of labor per day (3650 minutes), while keyboarding requires about 246 hours (14750 minutes). In comparison with the keyboarding operation, MARS-2 therefore saves 185 direct labor-hours per day, or 51,800 labor-hours per year (based on a year of 280 work days).

  Category     Sec/Page                     Min/600
               Keybd   MARS I   MARS II     Keybd   MARS I   MARS II
  Scan         NA      71       30          NA      706      300
  Edit         NA      178      133         NA      1784     1330
  Reconcile    NA      388      202         NA      3885     2020
  Total        1475    637      365         14750   6374     3650
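The labor-hour figures quoted in the text follow from the Total row of the table. The short sketch below reproduces that arithmetic; it assumes the Sec/Page totals can be read as seconds per completed record, which is consistent with the Min/600 column (seconds × 600 / 60). The variable and function names are illustrative only.

```python
WORK_DAYS_PER_YEAR = 280   # stated in the text
RECORDS_PER_DAY = 600      # stated in the text

# Total seconds per record, taken from the Total row of the table.
TOTAL_SECONDS = {"Keyboard": 1475, "MARS I": 637, "MARS II": 365}


def hours_per_day(seconds_per_record, records=RECORDS_PER_DAY):
    """Labor-hours needed to complete one day's workload."""
    return seconds_per_record * records / 3600


daily_hours = {name: hours_per_day(sec) for name, sec in TOTAL_SECONDS.items()}

# Savings of MARS II over keyboarding, per day and per year.
saved_per_day = hours_per_day(TOTAL_SECONDS["Keyboard"] - TOTAL_SECONDS["MARS II"])
saved_per_year = saved_per_day * WORK_DAYS_PER_YEAR

if __name__ == "__main__":
    for name, hours in daily_hours.items():
        print(f"{name}: {hours:.1f} labor-hours per day")
    print(f"Saved per day: {saved_per_day:.0f}, per year: {saved_per_year:.0f}")
```

The same calculation for MARS I gives roughly 106 labor-hours per day, which places it between the other two approaches, as the chart indicates.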


This graph compares the efficiency of data capture methods and shows: keyboard required 15000 minutes per 600 citations, MARS I required 6200, and MARS II required less than 4000.
Figure 11.7



National Institutes of Health (NIH)
9000 Rockville Pike
Bethesda, Maryland 20892

U.S. Dept. of Health and Human Services
