Method of Summarizing Text Using Just the Text

Aliases: KODA text-similarity measure

Technical Challenge: To quickly and effectively process overwhelmingly large textual data sets so that they are easy to study and understand.

Description: The KODA similarity measure may be used to extract several short passages from a document that are indicative of the content of the whole document. Since KODA does not rely on document formatting or linguistic information and requires no training, it may be used on large, diverse data sets composed of text documents in various forms and languages. Furthermore, it is fast and easily adapted to data sets in varying character sets, languages, and media.

KODA may be used in the following way: the code reads in thousands of documents and publishes two or three sentences or passages from each document (the number may be user-defined). The content of each of the thousands of documents may then be determined from its representative passages. KODA also includes an option to identify the longer passages to which each published passage is most closely related, leading to a rudimentary outline of each document.

Demonstration Capability: The software can be easily demonstrated.

Potential Commercial Application(s): The KODA text-similarity measure has implications for search engines as well as for single-document and multi-document summarization technologies. It may prove to be a powerful aid in Knowledge Management as well as in Data Mining.

Patent Status: Issued: United States Patent Number 6904564

Reference Number: 1199

If you are interested in exploring this technology further, please call 443-445-7159 or express your interest in writing to the National Security Agency.
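The workflow in the Description (read a document, then publish the few passages most indicative of the whole) can be illustrated with a minimal sketch. The fact sheet does not define the KODA measure itself, so the code below swaps in a generic stand-in: cosine similarity over character trigram counts, which likewise needs no formatting cues, linguistic resources, or training. The function names (`char_ngrams`, `cosine`, `representative_passages`) and the punctuation-based passage splitter are assumptions made for illustration, not part of KODA.

```python
import math
import re
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams; no tokenizer or dictionary is needed."""
    text = " ".join(text.lower().split())  # normalize case and whitespace
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (a stand-in, not KODA)."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def representative_passages(document, k=2):
    """Return the k passages most similar to the whole document, in original order."""
    # Hypothetical splitter: break after sentence-ending punctuation.
    # Fixed-size character windows would avoid even this assumption.
    passages = [p.strip() for p in re.split(r"(?<=[.!?])\s+", document) if p.strip()]
    doc_profile = char_ngrams(document)
    ranked = sorted(range(len(passages)),
                    key=lambda i: cosine(char_ngrams(passages[i]), doc_profile),
                    reverse=True)
    return [passages[i] for i in sorted(ranked[:k])]
```

Calling `representative_passages(text, k=3)` on each document in a collection would give the "two or three passages per document" output described above, with `k` playing the role of the user-defined count.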
Date Posted: Jan 15, 2009 | Last Modified: Jan 15, 2009 | Last Reviewed: Jan 15, 2009