Scanning: An Essential Step Toward Handling e-Grants

NIH currently scans into digital format millions of pieces of paper associated with the grant application and funding process. More scanning prototypes are in the works. One impetus driving the use of scanning technology is to reduce storage space and filing costs. Another is to work through the issues of handling and storing digital files in anticipation of the mandate to begin receiving e-grants by FY2003. Taking advantage of scanning technology now will enable NIH to develop a phased approach and facilitate the development of business practices for receiving e-grants in the future.

What is happening now?

The NIH community has been scanning documents into digital formats for several years.

  • One of the first successful production scanning activities was CRISP (Computer Retrieval of Information on Scientific Projects), a project of the Office of Extramural Research (OER). CRISP is a searchable database of federally funded biomedical research projects conducted at universities, hospitals, and other research institutions. Integral parts of the grant applications are scanned into a digital format that populates the CRISP database.
  • The Extramural Inventions and Technology Resources Branch of OPeRA within OER has been scanning invention reports since 1997. As a result, all paper-based reports are now available in electronic form. The scanned images are linked to information entered into an invention reporting database fed by the Interagency Edison web-based interfaces, which are used by nearly 300 extramural grantee and contractor organizations. Since its rollout, over 15,000 invention reports have been scanned using this system.
  • The National Cancer Institute (NCI) has scanned over three and a half million pages of funded grant applications into digital format and populated a searchable web-enabled database covering the past two years. The entire grant application is scanned into the database. Grants Management and Program staff have real-time access to this database. This application has allowed NCI to reduce file room size and speed information to those in need. The response from all participants has been extremely positive.
  • The NIH Biomedical Engineering Consortium (BECON), in conjunction with OER, launched a scanning pilot project in March 1999. To date, over three hundred grant applications from two Program Announcements have been scanned into digital format and placed on compact discs (CDs). An optical character recognition (OCR) technique was applied to the scanned images, allowing for basic text searching capabilities. The CDs have been made available to BECON members and to peer reviewers. Thus far the response to this pilot has been positive.
  • The National Institute on Aging (NIA) began a limited pilot in September 1999 to scan all of its Small Grant applications into digital format and place them on CDs. NIA followed the BECON model in applying OCR/searchable text format to the scanned image on each CD. As in the BECON pilot, responses from internal staff and peer reviewers have been very positive. NIA would like to expand this process to include hosting all grant applications on a central secure web-server to reduce file room storage and make more efficient use of human resources.
  • The Center for Scientific Review (CSR) began scanning the second page (the application abstract) of each application received by the Center in June 2000. These scanned images are placed in an internal database accessible by CSR staff only. The CSR process includes OCR to generate ASCII text so that the abstracts can be edited for Summary Statement preparation purposes. This highly successful pilot is now in production stage. The use of scanning has saved re-keying over ten thousand pieces of paper to date.

Lessons Learned

The main lesson learned from NIH's scanning efforts thus far is that scanning is very doable. The technology is mature and inexpensive. Inserting scanning into the workflow is manageable and can bring immediate, palpable benefits. Other lessons:

  • Scanning conveys different benefits to different people, which leads to a desire to scan the entire application.
  • A contractor can do scanning at 8 cents per page. OCR and indexing are more expensive but are attractive because they add functionality that mere scanning lacks.
  • The PHS 398 form needs to be designed to be scanner-friendly. Two desiderata are: last name first (for identification) and gray-scale images. It is also important to ensure that applicants do not alter the forms.
  • Accuracy rates are estimated at 95 to 99 percent. It is not clear where the errors are occurring. They might be in places such as special text characters, i.e., errors of the kind we can live with. CDs can provide excellent quality in PDF and are searchable.
  • We need to distinguish between scanning, OCRing, and indexing. Varying levels of value-added processing can be performed at different times, depending on the preference of the ICs.
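
The per-page rate quoted above makes rough budgeting straightforward. The sketch below is only an illustration, not a costing method used by any IC: the 8-cent rate comes from the lessons above, the page count echoes the NCI figure reported earlier, and pairing the two is an assumption. The function name is hypothetical, and OCR and indexing are left out because no rates are given for them.

```python
# Back-of-the-envelope scanning cost estimate.
# Assumption: the contractor rate of 8 cents per page applies uniformly.
SCAN_RATE_CENTS_PER_PAGE = 8


def scanning_cost_dollars(pages: int,
                          cents_per_page: int = SCAN_RATE_CENTS_PER_PAGE) -> float:
    """Return the cost in dollars of scanning `pages` pages (scanning only)."""
    if pages < 0 or cents_per_page < 0:
        raise ValueError("pages and cents_per_page must be non-negative")
    # Multiply in whole cents first, then convert to dollars.
    return pages * cents_per_page / 100


# Illustrative volume: roughly the three and a half million pages NCI reports.
cost = scanning_cost_dollars(3_500_000)
print(f"${cost:,.0f}")  # prints "$280,000"
```

At this rate, scanning alone for a collection of that size would run to a few hundred thousand dollars, which is the figure to weigh against file room space and filing labor.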

Scanning can convey immediate benefits: instant availability of the application, reduction in paper volume, cost savings, and fostering of collaborative work. The knowledge gained from pilots can lead to reengineering business practices and modifying eRA modules accordingly.

What is next?

At three large ICs, new scanning projects are in preparation.

  • NCI is preparing to increase its use of scanning to include all of its grant applications (approximately 9,000 directly received applications annually). NCI will scan these applications into the existing web-enabled database now housing the Institute's funded grants. Lessons learned from the first production database will be applied to the new effort.
  • The National Institute of Allergy and Infectious Diseases (NIAID) is designing a pilot that will feature electronic receipt of responses to a Small Business Innovation Research (SBIR) Program Announcement. NIAID will scan all the applications not received in electronic format. The electronic form will be received in interactive portable document format (IPDF). The scanned images will be stored in the same IPDF format. The digital forms of these applications will be housed in an NIAID web-enabled database.
  • The National Heart, Lung, and Blood Institute (NHLBI) is planning to scan all of its funded grants and current grant applications, following the NCI model. It is not clear at this time whether NHLBI will OCR the documents as well. Like NCI, NHLBI is concerned with file room space and the labor-intensive processes of filing and retrieving documents and records. Work could begin later this fall.

What is missing?

Though nothing has been done at the NIH enterprise level, change is on the way. None of the pilots or production processes has been integrated into the CSR role in NIH, nor have any of the data populated the IMPAC II databases. All of the pilot and production programs are in ICs.

The first pilot project designed to shift to an NIH-wide approach to scanning is on the drawing board. NIA and OER have agreed to enter into a pilot project that will scan the next receipt of NIA's Small Grant applications into digital form. The same criteria NIA has used in the past will be applied to this pilot. The Institute will continue to receive CDs to be distributed to internal staff and peer reviewers. The major paradigm shift will be that the image of the entire grant application will be stored as a binary large object (BLOB) electronically linked to the IMPAC II database. Lessons learned from this pilot will fuel other pilots and prototypes in this area.
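
To make the BLOB idea concrete, here is a minimal sketch of storing a scanned application as a binary large object linked to an application record. Everything in it is an assumption made for illustration: SQLite stands in for the real database, and the table names, column names, and sample identifier are hypothetical and do not reflect the actual IMPAC II schema.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create a hypothetical application table and a linked image table."""
    conn.executescript("""
        CREATE TABLE application (
            application_id TEXT PRIMARY KEY,
            pi_last_name   TEXT NOT NULL
        );
        CREATE TABLE scanned_image (
            application_id TEXT NOT NULL
                REFERENCES application(application_id),
            mime_type      TEXT NOT NULL,
            image          BLOB NOT NULL
        );
    """)


def store_scan(conn, application_id, pi_last_name, image_bytes,
               mime_type="application/pdf"):
    """Insert the application record (if new) and its scanned image."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO application VALUES (?, ?)",
            (application_id, pi_last_name),
        )
        conn.execute(
            "INSERT INTO scanned_image VALUES (?, ?, ?)",
            (application_id, mime_type, image_bytes),
        )


def load_scan(conn, application_id):
    """Return the stored image bytes, or None if nothing was scanned."""
    row = conn.execute(
        "SELECT image FROM scanned_image WHERE application_id = ?",
        (application_id,),
    ).fetchone()
    return None if row is None else bytes(row[0])


conn = sqlite3.connect(":memory:")
create_schema(conn)
# Placeholder bytes stand in for a real scanned file.
store_scan(conn, "APP-0001", "Doe", b"%PDF-1.3 placeholder scan")
print(len(load_scan(conn, "APP-0001")))
```

Keeping the image in its own table, keyed by the application identifier, lets the descriptive record be queried without loading the image itself, which is one reasonable way to link large scans to existing grant data.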

It is crucial to help set NIH-wide standards for scanning and e-grants soon so that ICs will have a framework for establishing procedures for their specific needs. One example of NIH-wide standard setting is the effort to ensure, as part of the redesign of the PHS 398, that the form will be scanner-friendly.

The long run

Cradle-to-grave electronic processing of all grant applications by NIH is the ultimate goal. Prototyping is not a panacea for all the issues e-grants will raise since prototyping requires the scarce resources of time and money. Some of the lessons prototypes teach never translate into the reality of the final system. However, scanning affords very realistic prototypes for e-grants, and learning by doing in this way is vastly preferable to mere theoretical musings or to simply ignoring complex issues of technology and organizational change until the fateful day when the first e-grant applications show up.

Even after e-grants become a reality, for many years to come NIH will continue to receive a diminishing number of grant applications in paper form. Those grants will have to be scanned so that they can be merged into the digital grant application document repository. So, quite aside from its great utility for prototyping, scanning will be with NIH for a long time.