National Science Foundation


Award Abstract #0420847
MRI: International Data Mining Grid Testbed for Research in High Performance Data Transport, Data Integration, and Data Exploration -- Instrument Development Proposal


NSF Org: CNS
Division of Computer and Network Systems
Initial Amendment Date: August 17, 2004
Latest Amendment Date: August 17, 2004
Award Number: 0420847
Award Instrument: Standard Grant
Program Manager: Rita V. Rodriguez
CNS Division of Computer and Network Systems
CSE Directorate for Computer & Information Science & Engineering
Start Date: September 1, 2004
Expires: August 31, 2007 (Estimated)
Awarded Amount to Date: $237,000
Investigator(s): Robert Grossman grossman@uic.edu (Principal Investigator)
Ashfaq Khokhar (Co-Principal Investigator)
Stephen Eick (Co-Principal Investigator)
Sponsor: University of Illinois at Chicago
809 S MARSHFIELD RM 608
CHICAGO, IL 60612 312/996-9406
NSF Program(s): MAJOR RESEARCH INSTRUMENTATION
Field Application(s): 0000912 Computer Science
Program Reference Code(s): HPCC, 9218, 9135
Program Element Code(s): 1189

ABSTRACT

This project develops the Teraflow Testbed (TFT), an instrument designed to test new network protocols and data services for long-haul, high-performance networks. The TFT integrates distributed clusters of workstations at four locations (Amsterdam, Geneva, Chicago-StarLight, and Chicago-UIC) using advanced 10 Gb/s photonic networks, relying on both layer 2 optical switching and layer 3 routers. The project aims to support the development of key network technologies important for the next step in high-performance, data-intensive computing. Research ranges from low-level network protocols to high-level data services. The TFT will consist of distributed nodes on two continents that can transmit, process, and mine very high volume data flows (teraflows). The testbed will enable the development of new network protocols and innovative data integration and data mining services for teraflows. The work involves the design of a new class of applications that move not only the queries and computation but also the data when required. The protocols and services will then be tested on traditional routed networks as well as on lambda grids. Lambda grids posit collections of plentiful computing and storage resources richly interconnected by dedicated dense wavelength division multiplexing (DWDM) optical paths. The following three specific research activities in the general area of lambda grids will be conducted:

High Performance Network Transport Protocols for Teraflows,

High End-to-End Performance for Teraflows, and

High Performance Data Services for Teraflows.

The first supports the development of new network protocols that provide higher bandwidth utilization and good transport performance over optical network links with high bandwidth-delay products. Building on previous work on SABUL, a rate-based reliable transport protocol for networks with high bandwidth-delay products that uses UDP for data and TCP for control flow, a UDT protocol is proposed to achieve high effective throughput while still providing fairness among competing teraflows. The second supports the integration of local input-output systems with the very high data rate network protocols. An experimental system of intelligent high-speed disks connected to a cluster will be integrated with the TFT, and methods for relaying control information directly from the parallel I/O system to the rate control algorithm in the new protocol will be investigated to maximize overall performance between remote parallel disks. The third develops high-performance data services for mining teraflows that use a SOAP/XML-based control channel and a separate data channel. This activity builds a distributed peer-to-peer storage system for the data that teraflows can easily access, and identifies data mining primitives to filter and process the teraflow.
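
To make the bandwidth-delay product in the first activity concrete, the short Python sketch below works through the arithmetic for a 10 Gb/s link. It is illustrative only: the 100 ms round-trip time and the 1500-byte packet size are assumptions and do not come from the award record, and the function names are hypothetical. The 65535-byte figure is the largest TCP window available without window scaling.

```python
# Illustrative arithmetic only. RTT_S and PACKET_BYTES are assumed values,
# not measurements from the Teraflow Testbed.

LINK_BPS = 10e9            # 10 Gb/s link rate, as stated in the abstract
RTT_S = 0.100              # ASSUMPTION: 100 ms round-trip time
TCP_WINDOW_BYTES = 65535   # largest TCP window without window scaling
PACKET_BYTES = 1500        # ASSUMPTION: packet size in bytes


def bdp_bytes(link_bps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_bps * rtt_s / 8.0


def window_limited_bps(window_bytes, rtt_s):
    """Throughput ceiling for a sender that sends one window per round trip."""
    return window_bytes * 8.0 / rtt_s


def pacing_interval_s(rate_bps, packet_bytes):
    """Gap between packets for a rate-based sender pacing at rate_bps."""
    return packet_bytes * 8.0 / rate_bps


if __name__ == "__main__":
    print("BDP (bytes):", bdp_bytes(LINK_BPS, RTT_S))
    print("Window-limited rate (b/s):", window_limited_bps(TCP_WINDOW_BYTES, RTT_S))
    print("Pacing interval (s):", pacing_interval_s(LINK_BPS, PACKET_BYTES))
```

Under these assumed values, about 125 MB must be in flight to keep the link full, a sender limited to a 65535-byte window tops out near 5.2 Mb/s, and a rate-based sender would space 1500-byte packets roughly 1.2 microseconds apart. This gap between window-limited throughput and link capacity is the kind of problem that rate-based protocols for high bandwidth-delay product links aim to address.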

Broader Impacts: The TFT is expected to impact homeland defense, business continuity, and disaster recovery technologies. Postdocs, graduate students, and undergraduate students will take part in the research. Tutorials on high-performance data transport will be given at technical conferences.


PUBLICATIONS PRODUCED AS A RESULT OF THIS RESEARCH

(Showing: 1 - 16 of 16).

Chetan Gupta and Robert L. Grossman.  "GenIc: A Single Pass Generalized Incremental Algorithm for Clustering,"  2004 SIAM International Conference on Data Mining (SDM 04).,  2004,  p. ..

L. Wilkinson, A. Anand and R. Grossman.  "High-dimensional Visual Analytics: Interactive Exploration Guided by Pairwise Views of Point Distribution,"  IEEE Transactions on Visualization and Computer Graphics,  2006, 

Parthasarathy Krishnaswamy, Stephen G. Eick, Robert L Grossman.  "Visual Browsing of Remote and Distributed Data,"  Proceedings of InfoVis 2004,  2004,  p. ..

Robert L. Grossman, Dave Hanley, Xinwei Hong and Parthasarathy Krishnaswamy.  "Using DataSpace to Support Long-Term Stewardship of Remote and Distributed Data,"  NASA/IEEE MSST 2004, 12th NASA Goddard/21st IEEE Conference on Mass Storage Systems and Technologies,  2004,  p. 239-.

Robert L. Grossman, Yunhong Gu, Chetan Gupta, David Hanley, Xinwei Hong, and Parthasarathy Krishnaswamy.  "Open DMIX: High Performance Web Services for Distributed Data Mining,"  7th International Workshop on High Performance and Distributed Mining, in association with the Fourth International SIAM Conference on Data Mining.,  2004,  p. ..

Robert L. Grossman, Yunhong Gu, Dave Hanley, Xinwei Hong and Parthasarathy Krishnaswamy.  "Experimental Studies of Data Transport and Data Access of Earth Science Data over Networks with High Bandwidth Delay Products,"  Computer Networks,  v.46,  2004,  p. 411.

Robert L. Grossman, Yunhong Gu, Dave Hanley, Xinwei Hong, Dave Lillethun, Jorge Levera, Joe Mambretti, Marco Mazzucco, and Jeremy Weinberger.  "Photonic Data Services: Integrating Path, Network and Data Services to Support Next Generation Data Mining Applications,"  Data Mining: Next Generation Challenges and Future Directions, H. Kargupta, A. Joshi, K. Sivakumar, and Y. Yesha, editors, AAAI Press,  2004,  p. ..

Robert L. Grossman, Yunhong Gu, David Hanley, Michal Sabala, Joe Mambretti, Alex Szalay, Ani Thakar, Kazumi Kumazoe, Oie Yuji, Minsun Lee, Yoonjoo Kwon, and Woojin Seok.  "Data Mining Middleware for Wide Area High Performance Networks,"  Journal of Future Generation Computer Systems (FGCS),  v.22,  2006,  p. 940.

Robert L. Grossman, Yunhong Gu, Xinwei Hong, Antony Antony, Johan Blom, Freek Dijkstra, and Cees de Laat.  "Teraflows over Gigabit WANs with UDT,"  Journal of Future Generation Computer Systems (FGCS),  2004,  p. 501.

Yong Mao, Yunhong Gu, Jia Chen and Robert L. Grossman.  "SDCS: Simplified Data Communications in Parallel/Distributed Applications,"  IEEE International Symposium on Cluster Computing and the Grid (CCGrid06),  2006,  p. 292.

Yunhong Gu and Robert Grossman.  "Supporting Configurable Congestion Control in Data Transport Services,"  ACM/IEEE SC 2005 Conference (SC'05), Seattle, WA, Nov. 12 - 18, 2005,  2005, 

Yunhong Gu and Robert L. Grossman.  "Optimizing UDP-Based Protocol Implementations,"  Proceedings of the Third International Workshop on Protocols for Fast Long-Distance Networks PFLDnet 2005,  2005,  p. ..

Yunhong Gu and Robert L. Grossman.  "UDT: UDP-based Data Transfer for High-Speed Wide Area Networks,"  Computer Networks,  v.51,  2007,  p. 1777.

Yunhong Gu, Robert L. Grossman and Joe Mambretti.  "A Peer-to-Peer Infrastructure for Distributing Large Scientific Data Sets over Wide Area High-Performance Networks: Experimental Studies Using Wide Area Layer 2 Services,"  Proceedings of the First International Conference on Networks for Grid Applications (GridNets 2007),  2007, 

Yunhong Gu, Xinwei Hong and Robert Grossman.  "An Analysis of AIMD Algorithms with Decreasing Increases,"  Proceedings of GridNets 2004,  2004,  p. ..

Yunhong Gu, Xinwei Hong, and Robert Grossman.  "Experiences in Design and Implementation of a High Performance Transport Protocol,"  Proceedings of the SC04 Conference,  2004,  p. ..



 

Please report errors in award information by writing to: awardsearch@nsf.gov.

 

 

The National Science Foundation, 4201 Wilson Boulevard, Arlington, Virginia 22230, USA
Tel: (703) 292-5111, FIRS: (800) 877-8339 | TDD: (800) 281-8749
Last Updated: April 2, 2007