Data-intensive Computing

CONTACTS

See program guidelines for contact information.

SYNOPSIS

Enormous digital datasets abound in all facets of our lives - in e-commerce, in World Wide Web information resources, and in many realms of science and engineering.  Looking ahead, the pace of data production will only accelerate with the increasing digitization of communication and entertainment and the continuing assimilation of computing into everyday life.  Data will arise from many sources, will require complex processing, may be highly dynamic, may be subject to high demand, and will matter in a wide range of end-use tasks.  The broad availability of data, coupled with increased capabilities and decreased costs of both storage and computing technologies, has led to a rethinking of how we approach problems that were previously impractical or, in some cases, impossible to solve.  Yet despite these continuing advances and decreasing costs, data production and collection are outstripping our ability to process and store data.  This compels us to rethink how we will manage - store, retrieve, explore, analyze, and communicate - this abundance of data.

These technical and social drivers have made it urgent to support computation on data of far larger scales than previously contemplated.  Data-intensive computing is at the forefront of ultra-large-scale commercial data processing, and industry has taken the lead in creating data centers composed of myriad servers storing petabytes of data to support its business objectives and to provide services at Internet scale.  Such data centers are instances of data-intensive computing environments, the target of this solicitation.  In data-intensive computing, the massive scale of the data is the dominant issue, with emphasis placed on the data-intensive nature of the computation.

Data-intensive computing demands a fundamentally different set of principles than mainstream computing.  Many data-intensive applications admit large-scale parallelism over the data and are well suited to specification via high-level programming primitives in which the run-time system manages parallelism and data access.  Increasingly capacious and economical storage technologies greatly change the role that storage plays in such large-scale computing.  Many data-intensive applications also require extremely high degrees of fault tolerance, reliability, and availability.  Applications often face real-time responsiveness requirements as well, and must confront heterogeneous data types and noise and uncertainty in the data.  Scale will affect a system's ability to retrieve new and updated data and to provide, whenever appropriate, guarantees of integrity and availability as part of the system's basic functionality in the face of varying levels of uncertainty.
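
To make the flavor of such high-level primitives concrete, the following is a minimal sketch of a MapReduce-style word count in Python.  It is illustrative only and not part of the program text; a multiprocessing pool stands in for the run-time system that manages parallelism and data access, while the programmer supplies only the per-document (map) and merging (reduce) logic.

    # Illustrative sketch only: a MapReduce-style word count in which the
    # "run-time" (here, a multiprocessing pool) decides how the map work is
    # partitioned across workers; the programmer states only what to compute.
    from collections import Counter
    from multiprocessing import Pool

    def map_phase(document: str) -> Counter:
        """User-supplied map logic: count the words in one document."""
        return Counter(document.split())

    def reduce_phase(partials: list) -> Counter:
        """User-supplied reduce logic: merge the per-document counts."""
        total = Counter()
        for partial in partials:
            total.update(partial)
        return total

    if __name__ == "__main__":
        documents = [
            "data intensive computing",
            "computing at internet scale",
            "data parallel programming primitives",
        ]
        with Pool() as pool:  # the run-time manages the worker processes
            partial_counts = pool.map(map_phase, documents)
        print(reduce_phase(partial_counts))  # Counter({'data': 2, 'computing': 2, ...})

Production data-intensive run-times apply the same division of labor at far larger scale, additionally handling data placement, fault tolerance, and re-execution of failed tasks.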

The Data-intensive Computing program seeks to increase our understanding of the capabilities and limitations of data-intensive computing.  How can we best program data-intensive computing platforms to exploit massive parallelism and to best serve the varied tasks that may be executed on them?  How can we express high-level parallelism at this scale in a way that is natural for users?  What new programming abstractions (including models, languages, and algorithms) can accentuate these fundamental capabilities?  How can data-intensive computing platforms be designed to support extremely high levels of reliability, efficiency, and availability?  How can they be designed to reflect desirable resource sensibilities, such as power consumption, human maintainability, environmental footprint, and economic feasibility?  What new applications can best exploit this computing paradigm, and how must the paradigm evolve to best support the data-intensive applications we may seek?  These are examples of questions that at their core ask how we can support data-intensive computing when the volume of data surpasses the capabilities of the underlying computing and storage technologies.

The program will fund projects in all areas of computer and information science and engineering that increase our ability to build and use data-intensive computing systems and applications, help us understand their limitations, and create a knowledgeable workforce capable of operating and using these systems as they increasingly become a major force in our economy and society.

This program also supports research previously supported separately by the Cluster Exploratory (CluE) program, which made available to data-intensive computing projects a massively scaled, highly distributed computing resource supported by Google and IBM, as well as a similar resource at the University of Illinois in partnership with Hewlett-Packard, Intel, and Yahoo!.  The Data-intensive Computing program welcomes proposals that request and use any such resources available to or accessible by the proposer(s), in order to pursue innovative research ideas in data-intensive computing and to explore the potential benefits this technology may have for science and engineering research as well as for applications that benefit society more broadly.

Proposals requesting or intending to use such resources must include, in a separate section of the Project Description, a description of the computing resources needed to test and evaluate the research ideas.  This description should state which facility or facilities the proposers plan to access and how, with as much detail as possible (e.g., schedule of use, time, space, data upload) to show the viability of the project.

Data-intensive Computing Point of Contact:  Jim French, Data-intensive Computing Program, 1125S, telephone: (703) 292-8930, fax: (703) 292-9073, email: jfrench@nsf.gov

Funding Opportunities for Data-intensive Computing:

CISE Cross-Cutting Programs: FY 2010 (NSF 09-558)

THIS PROGRAM IS PART OF

CISE Cross-Cutting Programs: FY 2010



