OSTIblog: Articles and comments about accelerated science discovery

OSTI accelerates the pace of discovery by making R&D findings available to researchers and the American people

The OSTI diffusion revolution, a problem solving perspective

OSTI has a deep interest in how researchers use the Web, because the Web is the key to speeding up the diffusion of scientific knowledge and accelerating science. We call it the diffusion revolution.

In order to better understand our Web users, and the obstacles they face, we are looking at their needs from the perspective of problem solving. Problem solving is a research discipline that looks at how people solve problems, step by step. The goal is to help people do a better job of solving their problems, including getting computers to help them. The basic approach is to define the problem solving process as a starting point, a goal, and all the possible paths in between. Research then studies how we pick the best path from start to goal, especially when the number of steps and possible paths is very large.

In many cases simply defining this basic problem solving situation, with the possible paths and goals, is very useful. This is what we are doing at OSTI, defining how scientists use the Web to solve their problems by finding information. (Note: OSTI's Thurman Whitson, retired, was the originator of this research effort.)

The starting point for Web based problem solving is some Web page or other. The possible paths are all the sequences of Web pages the user may follow, as they search and navigate their way to what they need. The goal is the page or pages that provide whatever the researcher is seeking. This sounds pretty abstract but it quickly leads us to useful results.
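To make the model concrete, here is a minimal sketch that treats Web pages as nodes and links as edges, then finds the shortest sequence of clicks from a starting page to a goal page using breadth-first search. The page names and link structure are hypothetical and purely illustrative. Real users do not see the whole graph in advance, which is exactly why their actual paths tend to be much longer.

```python
from collections import deque


def shortest_click_path(links, start, goal):
    """Breadth-first search over a link graph.

    Returns the shortest list of pages from start to goal,
    or None if the goal cannot be reached by following links.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        page = path[-1]
        if page == goal:
            return path
        for next_page in links.get(page, []):
            if next_page not in visited:
                visited.add(next_page)
                queue.append(path + [next_page])
    return None


# Hypothetical link graph: each page maps to the pages it links to.
links = {
    "agency_home": ["about", "research"],
    "about": ["contact"],
    "research": ["publications", "labs"],
    "publications": ["report_database"],
    "labs": [],
}

path = shortest_click_path(links, "agency_home", "report_database")
print(path)
print(len(path) - 1, "steps")
```

Here the start is "agency_home", the goal is "report_database", and the number of steps is the number of links followed along the path.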

For example, we can distinguish different kinds of goals, then ask how well we are serving each. Sometimes the goal is very specific. A scientist may be seeking a specific document, or a specific fact like the boiling point of sodium. But often the goal is quite broad, like who is doing research in a specific area (like the diffusion of knowledge), or what are the main research issues, or what are the latest results? These are very different goals, and there are others, all with different search strategies. OSTI may be the best place to look for some of these results, but not for others.

Looking at the paths the researcher must follow is also very important. When scientists use the Web to search for information, the process is often long and complex, with many paths being followed. (See drawing.) The simple search is the exception. When understanding is the goal, searches may take hours. OSTI serves many needs, but reducing, or even eliminating, these long winding paths is central to the diffusion revolution.

Then there is the starting point. Typically a researcher will start a Web search from one of several different pages. There are basically two ways to use the Web to find information -- a search engine and link by link navigation. Which is used first determines the starting page. Often a researcher will begin with a bookmarked page, or a URL found in an article or an email. In this case they may well be within the networked community of interest and link based navigation or browsing will be the best strategy. On the other hand, search is often used first, to find these navigable networks. Search finds clusters while navigation explores them.

OSTI is mostly targeting search problems of the more global nature. In fact the portals OSTI owns or operates are by far the most efficient available for finding scientific content over a broad array of scientific fields. From a problem solving perspective the OSTI diffusion revolution amounts to creating new starting points and eliminating complex paths.

Problem solving example: How Science.gov eliminates 1000 or more steps.

The problem is to do what Science.gov does, but without using Science.gov. The question is how many steps it takes to do by hand what Science.gov does in just one step. That number is how many steps Science.gov saves. It is probably well over 1000 steps.

For simplicity, assume there are 30 databases in Science.gov. The first problem is to find them. The number of steps required will depend heavily on how much the user knows about the organization of the federal government. Let us assume, optimistically, that they know about the numerous departments and agencies and can go to their home pages. They also know which ones are likely to have R&D products.

We estimate that it will take an average of 10 steps to find each database, beginning at the department or agency home page. (This estimate may well be too low; the true figure may be 20 steps or more. It is a research question.)

It thus takes an estimated 300 steps (30 databases at 10 steps each) to find all the databases. Note that this effort is a one time occurrence for each user.

The next problem is to relevance rank search results from all 30 databases. Note that this problem does not occur if the user is simply looking for a single item. In that case they would simply look at the search results from each database until they found the item. This task would involve relatively few steps. However, if the user is looking for a broader comprehension of their search topic, then they must undertake a process that is analogous to the ranking that Science.gov provides.

For simplicity, let us assume that they must find the 20 most relevant hits from the combined databases. The easiest way to do this is to first go through the returns from each database to find the most important hit. That requires 30 steps. This process is then repeated for the second most important hit, then the third, etc.

Using this procedure it will take 600 steps (20 passes of 30 steps each) to find the 20 top hits. Note that this effort recurs for each search. There may be procedures that require fewer steps but it is not obvious what they are.
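The manual procedure can be sketched as a small simulation. This is only an illustration under stated assumptions: each hit is given a made-up numeric relevance score (in practice the user would be judging relevance by eye), and each look at one database's results counts as one step. The function name and the random scores are hypothetical.

```python
import random


def manual_top_k(result_lists, k):
    """Repeatedly scan every database for its best remaining hit.

    Each look at one database's results counts as one step.
    Returns the k best scores found and the number of steps taken.
    """
    # Each database presents its own results best-first.
    remaining = [sorted(hits, reverse=True) for hits in result_lists]
    top, steps = [], 0
    for _ in range(k):
        best_db = None
        for db, hits in enumerate(remaining):
            steps += 1  # one step per database examined
            if hits and (best_db is None or hits[0] > remaining[best_db][0]):
                best_db = db
        if best_db is None:
            break  # no hits left anywhere
        top.append(remaining[best_db].pop(0))
    return top, steps


# Hypothetical data: 30 databases, 50 hits each, random relevance scores.
rng = random.Random(0)
results = [[rng.random() for _ in range(50)] for _ in range(30)]

top, steps = manual_top_k(results, 20)
print(steps)
```

With 30 databases and 20 passes, the step counter comes out to 30 x 20 = 600, matching the estimate in the text.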

The user also has to find and search, then rank, the 1800 or so websites searched by Science.gov. It is far from clear how this would be done. Let us simply assume that at least 100 steps will be required, probably many more.

We thus arrive at 1000 steps saved by Science.gov (300 to find the databases, 600 to rank the results, and at least 100 for the websites), probably a great many more. This is certainly a case of a computer program making very easy what is not humanly doable without it.
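The arithmetic behind this estimate is simple enough to write down as a short sketch, which also makes it easy to see how the total moves when an assumption changes. The function and parameter names are hypothetical; the default values are the figures assumed in the text.

```python
def steps_saved(n_databases=30, steps_to_find_each=10,
                top_hits=20, website_steps=100):
    """Rough count of manual steps replaced by one federated search."""
    find_steps = n_databases * steps_to_find_each  # one-time effort per user
    rank_steps = n_databases * top_hits            # recurs for every search
    return find_steps + rank_steps + website_steps


print(steps_saved())                       # 300 + 600 + 100
print(steps_saved(steps_to_find_each=20))  # if finding a database takes 20 steps
```

With the figures used above the total is 1000 steps. If finding each database takes 20 steps rather than 10, the total rises to 1300.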

How the OSTI diffusion revolution will change science.

Given the problem solving model we can ask what science will look like when the OSTI diffusion revolution takes hold. Revolutions change the way science works and this one is no exception. One simple example is in the content of research proposals. Today a proposal is required to demonstrate a knowledge of related work on the same topic. But there are many closely related elements that are also pursued in distant communities, where they are studying very different topics. These elements include instrumentation, methods, mathematics and fundamental concepts.

OSTI is making it possible for researchers to know about these distant activities, which has the potential to revolutionize science. Some day such knowledge will be expected as a matter of course. For example, nuclear physicists are not now expected to know about what is going on in forest management. But a recent paper in a forestry journal presented a breakthrough in Monte Carlo analysis, which is widely used in nuclear physics. Normally it would take years or decades for this new knowledge to migrate from forestry to physics, but the OSTI diffusion revolution is eliminating the delay time. It is here that full text search is critically important, for these cross-topical elements are often not mentioned in abstracts.

Today science is organized around topics and problems. Some day it may be equally well organized around methods and mathematics. Eliminating today's community-to-community time delays is how the OSTI diffusion revolution will speed up science.

There is also the issue of computer assistance. The discipline of problem solving has made major contributions to the technology of artificial intelligence, or getting computers to act like humans. The chess playing computer is a great example. Relevance ranking and clustering are two examples of artificial intelligence that are central to the OSTI revolution, and OSTI is actively working in both. Visualization as an aid to broad understanding, as well as search, is also an active research area.

Another interesting mode of problem solving involves the intersection of science and educational content, where OSTI is also working. Educational content differs dramatically from one grade level to another, almost like different languages. There is presently no way to search by grade level, so finding grade specific content is a very difficult problem, especially since content is scattered among a myriad of small sites. Moreover, working scientists often have a need for college level educational content, especially when they are exploring distant research communities, where they are not experts. This need will grow as the diffusion revolution progresses.

The basic point is that researchers use the Web to solve a variety of different problems. OSTI is targeting some of these uses, especially the global, community-to-community diffusion needs. We are looking at specific Web based problem solving situations in order to speed up the diffusion of scientific knowledge and accelerate science. OSTI is leading the diffusion revolution.

David Wojick

Senior consultant for innovation

OSTI

 
