U.S. Department of Energy
IN-SPIRE™ Visual Document Analysis

Video Transcript

Today's business environment is fast-paced and often unpredictable. Businesses must make tough decisions quickly, yet doing so involves sorting through and digesting large volumes of information. Those who face this challenge include:

  • consumer product developers searching for the next big thing in a particular product range;
  • quality control managers trying to understand what thousands of customers are saying;
  • attorneys sifting through mountains of evidence such as e-mail or documents in the discovery process;
  • or marketing experts exploring relationships between product sales and regional demographics.

How can businesses efficiently make use of this information to discover trends, find marketing opportunities, prevent problems, understand demographics, or respond to concerns?

Researchers at the Pacific Northwest National Laboratory developed IN-SPIRE™, a powerful way of presenting large amounts of data in a visually compelling display.

The Theme View™ presents a terrain map to show theme relationships in mountains and valleys. Stronger themes are displayed as taller mountains while less significant themes are low-elevation hills. Peaks that are closer together suggest related themes.

Behind the scenes, IN-SPIRE™ uses complex statistical and mathematical algorithms. A unique technology automatically turns each document into a mathematical signature based on its detected themes. All the signatures are compared and clustered so similar documents are placed near one another in the visual display.
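The general idea of signatures and clustering can be illustrated with a minimal sketch. It does not show IN-SPIRE™'s actual algorithms, which the transcript does not describe. It assumes a plain word-count vector as the "signature", cosine similarity as the comparison, and a simple greedy grouping; the function names and the threshold value are hypothetical.

```python
import math
from collections import Counter

def signature(text):
    """Word-count vector used here as a stand-in for a document signature."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster(docs, threshold=0.3):
    """Greedy grouping: a document joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    sigs = [signature(d) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, sig in enumerate(sigs):
        for members in clusters:
            if cosine(sig, sigs[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

docs = [
    "battery overheats while charging",
    "charging causes battery to overheat",
    "shipping box arrived damaged",
    "box damaged during shipping",
]
print(cluster(docs))
```

In this toy data the two battery complaints end up in one group and the two shipping complaints in another, which is the kind of proximity the visual display is described as conveying.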

The software analyzes tens of thousands of text documents, such as marketing reports, patent claims, web search results, accident reports, computer logs, network data, newswire feeds, or message traffic, and then turns them into a visual representation.

The Galaxy View shows each document as a dot in the IN-SPIRE™ galaxy. Related documents are grouped together and common themes are highlighted. To extract more meaning from the data, IN-SPIRE™ offers robust QUERY capabilities that support Boolean, word proximity, phrase, or example-based searches.
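To make the query types concrete, here is a rough sketch of Boolean, word proximity, and phrase matching over a single document. It is only an illustration of the concepts, not IN-SPIRE™'s query engine or syntax; the helper names are hypothetical, and matching assumes simple whitespace tokenization.

```python
def tokens(text):
    """Lowercase, whitespace-separated words."""
    return text.lower().split()

def matches_all(text, terms):
    """Boolean AND: every term occurs somewhere in the document."""
    words = set(tokens(text))
    return all(t.lower() in words for t in terms)

def matches_any(text, terms):
    """Boolean OR: at least one term occurs in the document."""
    words = set(tokens(text))
    return any(t.lower() in words for t in terms)

def within(text, a, b, distance):
    """Proximity: a and b occur at most `distance` word positions apart."""
    words = tokens(text)
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

def has_phrase(text, phrase):
    """Phrase: the words of `phrase` occur consecutively, in order."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return n > 0 and any(words[i:i + n] == target
                         for i in range(len(words) - n + 1))

doc = "the battery pack overheats while the charger is connected"
print(matches_all(doc, ["battery", "charger"]))
print(within(doc, "battery", "overheats", 2))
print(has_phrase(doc, "battery pack"))
```

An example-based search could reuse a similarity measure between a sample document and each candidate instead of matching individual words.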

Non-English document sets can be analyzed using IN-SPIRE™'s support for Machine Translation. Queries submitted in English are translated into the native language and the results are returned to the document viewer in both English and the original language.

Groups of documents can be analyzed with the Correlation Tool to see which documents the groups have in common and how similar the groups are. This can reveal relationships that might otherwise have gone unnoticed.

With the TIME SLICER, a dataset may be examined in specific time intervals, such as years, months, days, or minutes. This provides a more detailed look at the trends and relationships between themes or events over time.
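Examining a dataset in time intervals amounts to bucketing documents by timestamp. The sketch below shows one way to do that; it assumes each document carries a `datetime` timestamp, and the `time_slices` function and `FORMATS` table are hypothetical names, not part of IN-SPIRE™.

```python
from collections import defaultdict
from datetime import datetime

# Key format for each supported interval; zero-padded so keys sort chronologically.
FORMATS = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "minute": "%Y-%m-%d %H:%M",
}

def time_slices(docs, interval="month"):
    """Group (timestamp, text) pairs into slices keyed by the chosen interval."""
    fmt = FORMATS[interval]
    slices = defaultdict(list)
    for timestamp, text in docs:
        slices[timestamp.strftime(fmt)].append(text)
    return dict(sorted(slices.items()))

reports = [
    (datetime(2006, 1, 5, 9, 30), "battery overheats while charging"),
    (datetime(2006, 1, 20, 14, 0), "charger cable frays"),
    (datetime(2006, 2, 1, 8, 15), "shipping box arrived damaged"),
]
for key, texts in time_slices(reports, "month").items():
    print(key, len(texts))
```

Each slice can then be analyzed on its own, for example by counting how often a theme's words appear per interval to follow a trend over time.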

Paired with the EVIDENCE tool, each document in a group can be characterized in terms of whether it supports or refutes theories. The results can then be viewed in the Evidence Summary Tool.

IN-SPIRE™ lets businesses spend more time doing business and less time analyzing irrelevant data. For more information visit the IN-SPIRE website at in-spire.pnl.gov.
