
Digital content tool analyzes "gist" of text-based documents

LANL’s Digital Knowledge Discovery Team has created a suite of digital-content-analysis tools to gather, reduce, annotate, organize, synthesize, and visualize digital content.
April 3, 2012

The Digital Knowledge Management (DKM) process consists of three discrete steps: first, the information is collected; next, the information is prepared; and finally, the information is analyzed. The DKM process is controlled by an intuitive graphical user interface (GUI).

Contact  

  • Aaron Sauers
  • Technology Transfer
  • (505) 665-0132
  • Email

Applications:

  • Legal e-Discovery
  • Fraudulent insurance claim reduction based on analysis of communications patterns, claims data, and patient records
  • SEC report mining to predict poor corporate governance
  • Reputation analysis to rapidly adjust marketing campaigns
  • Voice of the Customer (VOC)
  • Law enforcement
  • Content management / publishing
  • Subject Matter Expert (SME) identification
  • Taxonomy comparison

Benefits:

  • Saves time
  • Reduces information overload
  • Facilitates insights regarding trends and themes across documents

Summary:

As electronic content proliferates, it becomes nearly impossible to fully consume and assess all of the available information. Over the past eight years, LANL’s Digital Knowledge Discovery Team has created a suite of digital-content-analysis tools to gather, reduce, annotate, organize, synthesize, and visualize digital content. The tools can be applied to collections of text-based documents from virtually any source.

The DKM algorithms go beyond traditional natural language processing and statistical analysis; word-location algorithms automatically extract the “gist” of the content, while others annotate targeted concepts, organize documents, and calculate goodness-of-fit with respect to a specified conceptual area. Additional modules extract features, such as dates and locations, and group documents for comparative analysis. In some cases, the software suite can compute the trustworthiness and mood of the author.
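To make two of these ideas concrete, here is a minimal Python sketch of scoring a document's fit to a conceptual area and extracting dates from it. It is not the DKM implementation, whose algorithms are not described on this page. The vocabulary `CONCEPT_TERMS`, the functions `concept_fit` and `extract_dates`, and the sample memos are all hypothetical, and the fit score is a simple fraction of matching words rather than a statistical goodness-of-fit measure.

```python
import re

# Hypothetical concept vocabulary; not taken from the DKM software.
CONCEPT_TERMS = {"fraud", "claim", "insurance", "audit"}

# Matches ISO-style dates only (YYYY-MM-DD); a real extractor would cover more formats.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def concept_fit(text, concept_terms=CONCEPT_TERMS):
    """Return the fraction of tokens that belong to the concept vocabulary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in concept_terms)
    return hits / len(tokens)


def extract_dates(text):
    """Return every ISO-style date string found in the text, in order."""
    return DATE_PATTERN.findall(text)


# Made-up sample documents, used only to exercise the functions.
documents = {
    "memo_a": "The insurance claim filed on 2011-03-14 was flagged for fraud.",
    "memo_b": "Lunch menu for 2011-03-15: soup and salad.",
}

for name, text in documents.items():
    print(name, round(concept_fit(text), 2), extract_dates(text))
```

Ranking documents by a score like this lets an analyst start with those closest to the conceptual area of interest, while the extracted dates can feed the kind of comparative grouping described above.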

When examining larger collections, the DKM software can categorize subject-matter expertise, identify emerging and fading trends, and distill entire collections into a variety of single-page graphical representations. Structured information (i.e., metadata) can augment the digital knowledge to facilitate analyses of time trends, geographical co-location, and authorship, among others. Through information reduction, annotation, and organization, the analyst is able to assimilate content and form hypotheses more quickly.
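One simple way to picture emerging and fading trends is to compare how often each term appears before and after a cutoff year. The Python sketch below does only that. It is an illustration under stated assumptions, not the DKM method: the function `term_trends`, the `split_year` parameter, and the sample collection are hypothetical, and the publication year stands in for the metadata mentioned above.

```python
from collections import Counter


def term_trends(dated_docs, split_year):
    """Return, per term, its change in relative frequency across split_year.

    dated_docs is an iterable of (year, text) pairs. Positive values mark
    terms that gained share in the later period; negative values mark terms
    that lost share.
    """
    early, late = Counter(), Counter()
    for year, text in dated_docs:
        target = early if year < split_year else late
        target.update(text.lower().split())
    # Guard against division by zero when one period has no documents.
    early_total = sum(early.values()) or 1
    late_total = sum(late.values()) or 1
    return {
        term: late[term] / late_total - early[term] / early_total
        for term in set(early) | set(late)
    }


# Made-up collection: (publication year, text).
docs = [
    (2008, "laser optics laser"),
    (2009, "laser sensors"),
    (2011, "sensors networks sensors"),
    (2012, "networks sensors"),
]

trends = term_trends(docs, split_year=2010)
emerging = sorted(trends, key=trends.get, reverse=True)[:2]
fading = sorted(trends, key=trends.get)[:1]
print("emerging:", emerging)
print("fading:", fading)
```

A two-period comparison is the coarsest possible trend measure; finer time bins or smoothing would be natural refinements for a larger collection.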

DKM expedites knowledge assimilation by synthesizing digital content into a set of knowledge-visualization schemes. The ultimate goal is to allow an information consumer to look first at a few graphical representations of the concepts contained within thousands of pages of text, draw conclusions about the documents in aggregate, formulate hypotheses, and then focus attention on the particular documents that are relevant to those conclusions and hypotheses. Through a reduction process, the DKM software focuses the analyst on the important concepts and the relationships among them.

Development stage: Alpha testing

Patent status: Patent pending

Licensing status: Available for exclusive or non-exclusive licensing

