Overview

The objective of the Center for Economic Studies (CES) and the Research Data Centers (RDCs) is to increase the utility and quality of Census Bureau data products. The external research program supported by CES and the RDCs increases the quality and utility of Census data in several ways. First, access to microdata encourages knowledgeable researchers to become familiar with Census data products and Census collection methods. More importantly, providing qualified researchers access to confidential microdata enables research projects that would not be possible without access to respondent-level information. This increases the value of data that have already been collected. Access to the microdata also allows for data linking not possible with aggregates, including both cross-survey linkages and longitudinal linkages. These linkages leverage the value of preexisting data. Creative use of microdata can address important policy questions without the need for additional data collections.

In addition, the best means by which the Census Bureau can check on the quality of the data it collects, edits, and tabulates is to make its micro records available in a controlled, secure environment to sophisticated users who, by employing the micro records in the course of rigorous analysis, will uncover the strengths and weaknesses of the micro records. Each set of observations is the end result of dozens upon dozens of decision rules covering definitions, classifications, coding procedures, processing rules, editing rules, disclosure rules, and so on. The validity and consequences of all these decision rules only become evident when the Census Bureau's micro databases are tested in the course of analysis. Exposing to the light of research the conceptual and processing assumptions that are embedded in the Census Bureau's micro databases constitutes a core element in the Census Bureau's commitment to quality. CES and the RDCs conduct, facilitate, and support microdata research.

Research Data Center Program Requirements

Research Data Centers afford researchers opportunities to carry out unique research arising from the ability to access and explore confidential micro records.

But the opportunities come at a price. Research conducted at RDCs will take place under a set of rules and limitations that will be considerably more constraining than those prevailing in typical research environments:

  • All proposals to carry out research at an RDC must be approved by both the RDC and the Census Bureau. If data are provided by other agencies (e.g., IRS), the other agencies must approve of the project as well.
  • All projects must provide a benefit to Census Bureau programs. The benefit requirement is an explicit proposal criterion and is required by law (Title 13, Sec. 23, U.S.C.).
  • Researchers using the facilities and databases at RDCs will be required to obtain Special Sworn Status from the Census Bureau. To obtain this status, researchers will be required to undergo a security check, including fingerprinting.
  • Researchers holding Special Sworn Status will be subject to the same legal penalties as regular Census Bureau employees for disclosure of confidential information. The penalties are a fine of up to $250,000, imprisonment for up to five years, or both.
  • RDCs are secure research facilities. Access is strictly limited to researchers and staff authorized by the Bureau of the Census. The computers within the RDCs are not linked to the outside world. Researchers do not have email or world wide web access from within RDCs.
  • All analysis must be done within the RDC.
  • Researchers at the RDC may use confidential data only for the purpose for which the data are supplied; i.e., for their approved research project.
  • Researchers may not remove confidential data -- whether recorded on any medium or merged with non-confidential data -- from the RDC office. All output must be submitted to Census Bureau personnel for disclosure review prior to removal from the RDC.

The Proposal Process

CES RDC Research Proposal Guidelines

Persons wishing to conduct research at a Research Data Center must submit a research proposal using this website (www.ces.census.gov). The following guidelines describe the research proposal submission process. This is the only procedure that can be employed to submit a research proposal.

Submitting a research proposal requires two distinct steps. The first step is the development of a preliminary proposal. The second step is the submission of a final proposal.

Preliminary Proposal Development

Researchers who wish to develop a proposal to conduct research at one of the Census Bureau's Research Data Centers (RDCs) should first contact the RDC administrator at the center where the research will be conducted. The researcher should discuss the proposed project with the administrator to determine whether the research fits with the Bureau's mandate, is feasible, and is likely to provide benefits to Census Bureau programs under Title 13 of the U.S. Code.

The first step in the proposal process is for the researcher to register as a user with CES by opening an account through the Center's website (www.ces.census.gov). Once an account has been opened, the user receives a system-generated email message containing an initial password. The user can change this password at the first login session. All researchers must have a user account in order to submit preliminary and final proposals to CES.

Working closely with the RDC administrator, researchers develop a preliminary research proposal that includes information about the researcher(s), site where the research will be carried out, purpose of the research, funding source, requested datasets, desired software, a brief narrative description of the research project and proposed benefits to the Census Bureau. The researcher enters this information via the CES on-line proposal management system accessible on the CES website.

Once a preliminary proposal has been submitted, the RDC administrator reviews it and advises the researcher of any suggestions for improvement or refinement. The administrator must approve the preliminary proposal before the researcher can submit a final proposal to CES.

Both CES and the RDCs entertain proposals from doctoral students who seek access to confidential data for dissertation research. Proposals that list dissertation research as the motivation must include the student's primary advisor as a co-principal investigator. CES recommends that the advisor also apply for Special Sworn Status if he or she expects to view any intermediate output.

Final Proposal Submission

Researchers should consult with the RDC administrator about the content and form of a final research proposal before submitting the proposal through the CES on-line management system. The final proposal consists of four separate documents in Adobe Acrobat Portable Document Format (PDF): (1) Curriculum vitae of all investigators on the proposed project, (2) Abstract of the proposal, (3) Project description (full proposal), and (4) Statement of benefits to the Census Bureau. Failure on the part of researchers to consult fully with the RDC administrator before submission of proposal files can result in a decision by CES to decline to review the proposal.

The four document files should conform to the following requirements:

  • Curriculum vitae. This single file should contain vitae for all researchers on the proposed project. Each person's vita should be limited to two single-spaced pages in length, and should contain only the following information:
    • Name and contact information, including email address
    • Education history and employment history (primary employment)
    • Five most recent publications
    • Five publications most relevant to the proposal
    • Ph.D. advisor(s) if applicable
    • Names of Ph.D. students advised to completion, if applicable
    • Names of all recent collaborators (last two years)
  • Proposal Abstract. This document should be no longer than one single-spaced page or two double-spaced pages, and it should capture the essence of the project proposal. The data sets, and years of data, that will be used must be stated. Within the abstract, one or two sentences should succinctly state what the project would do and the data it will use. The abstract must also address the proposed benefits to the Census Bureau. The abstract must include, at the top of the first page, a project title and the names of all researchers.
  • Project Description (full proposal). This document should describe in as much detail as possible the nature of the research question(s), description of the methodology (including models to be estimated, how model variables will be measured and hypotheses to be tested), Census and non-Census data sets to be used, expected outcomes, and should contain a list of references cited. The proposal should be aimed at a competent social scientist who is not necessarily a specialist in the field or on the topic within the field. The proposal should:
    • Be limited to no more than fifteen (15) single-spaced pages or thirty (30) double-spaced pages (inclusive of references).
    • Contain a title and the names of all researchers at the top of the first page.
    • Include appropriate headings and sub-headings throughout the document to assist reviewers in following the proposal narrative.
    • Use a font size of at least 11 points and margins of at least one-half inch all around. Twelve-point font and one-inch margins are preferred.
    • Number all pages.
    • Contain a separate section identifying all data sets, Census Bureau and other, the project will use. Public-use Census Bureau data that will be used in the project must be included in this section. Years of data needed must also be stated. If the project would make links among data sets, the links must be indicated, and the method for making the links must be specified.
    • Contain a separate section stating the proposed duration of the project in absolute amounts of time (e.g., 14 months) and a desired starting date. This section should also state the intensity of RDC lab use (e.g., 15 hours per week).
    • Not include a separate title page, which will be counted against the page limit.
    • Not include any appendices unless approved in advance by the RDC administrator. Unapproved additional pages will count against the page limit, and may be sufficient cause for CES to decline to review the proposal.
  • Benefits Statement. This document has no length limitation, although brevity and concise presentation are encouraged. The statement should address clearly how the project would provide one or more of the nine (9) Title 13 benefits listed on the CES website.

Researchers must submit all proposal-related documents through the CES website. Experience shows that PDF files can take several minutes to upload successfully to the proposal management system. The person uploading the files should wait until he or she receives a message that the upload process has completed successfully before exiting the management system and closing the web browser.

Proposal Review Process

Research proposals submitted to CES are reviewed on the basis of five major criteria:

  • Benefit to Census Bureau programs. Proposals must demonstrate that the research is likely to provide one or more Title 13 benefits to the Bureau. A research project must demonstrate that its predominant purpose is to benefit Census Bureau programs. If a project has as its predominant purpose one, or any combination, of the following criteria, it will be considered to have as its predominant purpose increasing the utility of Title 13, Chapter 5 data (researchers should consult an RDC administrator for more information):
    • Understanding and/or improving the quality of data produced through a Title 13, Chapter 5 survey, census, or estimate;
    • Leading to new or improved methodology to collect, measure, or tabulate a Title 13, Chapter 5 survey, census, or estimate;
    • Enhancing the data collected in a Title 13, Chapter 5 survey or census. For example:
      • Improving imputations for non-response;
      • Developing links across time or entities for data gathered in censuses and surveys authorized by Title 13, Chapter 5;
    • Identifying the limitations of, or improving, the underlying Business Register, Household Master Address File, and industrial and geographical classification schemes used to collect the data;
    • Identifying shortcomings of current data collection programs and/or documenting new data collection needs;
    • Constructing, verifying, or improving the sampling frame for a census or survey authorized under Title 13, Chapter 5;
    • Preparing estimates of population and characteristics of population as authorized under Title 13, Chapter 5;
    • Developing a methodology for estimating non-response to a census or survey authorized under Title 13, Chapter 5; and
    • Developing statistical weights for a survey authorized under Title 13, Chapter 5.
  • Scientific merit. This criterion relates to the project's likelihood of contributing to existing knowledge. Evidence that a Federal funding agency such as NSF or NIH has approved the proposal for support constitutes one indication of scientific merit.
  • Clear need for non-public data. The proposal should demonstrate the need for and importance of non-public data. The proposal should explain why publicly available data sources are not sufficient to meet the proposal's objectives.
  • Feasibility. The proposal must show that the research can be conducted successfully with the methodology and requested data.
  • Risk of disclosure. Output from all research projects must undergo and pass disclosure review.
    • Tabular and graphical output presents a higher risk to disclosure of confidential information than do coefficients from statistical models.
    • The Census Bureau is required by law to protect the confidentiality of data collected under its authorizing legislation, Title 13, U.S. Code.
    • Some data files are collected under the sponsorship of other agencies. In providing restricted access to these data CES must adhere to all applicable laws and regulations.
    • Researchers may be required to sign non-disclosure documents of survey sponsors or other agencies that provide data for their research projects.

Both Census Bureau and external experts on subject matter, datasets and disclosure risk review all proposals. Relevant data sponsors and data custodians also review proposals that request certain datasets.

Any proposals seeking to use datasets that contain Federal Tax Information (FTI) must also be reviewed for approval by the Internal Revenue Service. Researchers must consult the relevant RDC administrator to determine whether their proposal would use data that contain FTI. The review process is both lengthy and rigorous, and researchers should be prepared to exercise patience throughout. Failure on the part of researchers to consult fully with the RDC administrator on this point before submission of proposal files can result in a decision by CES to decline to review the proposal.

Reviewed proposals receive one of four ratings:

  • Approve. The proposal successfully addresses all of the review criteria mentioned above.
  • Clarification Required. The proposal meets most or all of the review criteria but lacks one or more minor requirements that must be addressed in a revised version. The proposal will not require resubmission in a subsequent review cycle.
  • Revise and Resubmit. The proposal fails to meet one or more of the major criteria, but is deemed to be of sufficient potential merit to encourage a resubmission in a subsequent review cycle.
  • Reject. The proposal fails to meet most or all review criteria, and may be resubmitted as a new preliminary proposal only after substantial revision and approval by the RDC administrator.

Research Data Center administrators communicate the outcome of the review process to the contact researcher. This communication includes both an explanation for the decision and copies of the expert reviews. Researchers should expect to hear from the RDC administrator no sooner than two months after the deadline for proposal submission. Proposals that receive Clarification Required may need an additional month or two to achieve final approval once the clarifications are received. Projects that seek Federal Tax Information (FTI) normally require an additional two to three months to gain final approval. Recent experience demonstrates that proposals have about a fifty percent chance of receiving either an Approve or Clarification Required rating during any review cycle. Some of these proposals may have been resubmissions after having received a Revise and Resubmit rating during an earlier cycle.

Projects that request data containing FTI must be reviewed by the Internal Revenue Service to ensure that the predominant purpose of the research is to contribute to Census Bureau programs under Title 13 of the U.S. Code (See the IRS Criteria Document and above for a list and description of approved Title 13 benefits). No proposal will gain approval from both Census and the IRS if its predominant purpose is not to deliver Title 13 benefits.

Post Approval Process

Approval of research proposals by CES, and the IRS if FTI is requested, is merely the first step in a multi-step process before research can actually commence. In many instances, CES must obtain permissions to access certain data from the survey sponsors, data custodians or the Census Bureau program areas who control such access. This process can range from a few weeks to many months depending upon the nature and status of data sharing agreements between Census and sponsoring agencies, whether Federal or State.

Once a project has been approved, all researchers who expect to access confidential data must undergo a background investigation, including fingerprinting. After completion of the background check, the Census Bureau grants Special Sworn Status (SSS) to each researcher, which subjects them to incarceration of up to five years and/or fines of up to $250,000 if they knowingly or inadvertently disclose confidential information on individuals, households or businesses. All SSS individuals must take annual training in the use and protection of Title 13 data, and Title 26 data if FTI are to be used in the project. RDC administrators deliver this training.

All researchers on the project must register with CES by opening a user account through the Center's website (www.ces.census.gov).

All approved research projects are governed by a written agreement between the researcher(s) and the Census Bureau. The agreement stipulates the start and end dates for the project, responsibilities of both parties with respect to procedures and practices, and, if the research project is conducted at the CES RDC, fee payment. All researchers on the project must sign this agreement with the Census Bureau, or if added to the project after the agreement is signed, an addendum to the agreement. If the research project is conducted at an RDC partner institution, an agreement with the RDC partner institution is required as well.

The Center for Economic Studies encourages researchers to assess carefully the time period over which they request access and to make efficient use of their lab time, but also to anticipate that disclosure review may require modification to computer output before it can be released. Requests for time extensions beyond the agreement end date undergo careful evaluation and rarely gain approval. All access to confidential data and facilities associated with a given research project will end at midnight on the end dates specified in the written agreement for that research project. These dates must be consistent with the dates specified in the approved proposal.

Timing

Researchers should expect a minimum of six months to elapse between the deadline for final proposal submission and the actual commencement of research. This duration can vary greatly by individual proposal depending upon data permissions required, IRS review, background checks, software and datasets requested, and the number of proposals under consideration. Researchers can help speed up the process by the following:

  • Adhere closely to all practices and procedures for proposal submission as given on the CES website.
  • Work closely with their RDC administrator on proposal development and on any requested revisions or clarifications to proposals or predominant purpose statements.
  • Provide CES with the terms of use for any datasets they wish to bring to the lab.
  • Process their SSS paperwork quickly.
  • Take Title 13 and Title 26 training as early as possible before beginning work.

The Research Environment at an RDC

Research Data Centers (RDCs) are secure research facilities. Access is strictly limited to researchers and staff authorized by the Bureau of the Census. All analysis must be performed within the secure RDC research facility. Security measures that protect the confidentiality of data are extremely important at RDCs. The Census Bureau and its RDC partners take security very seriously. Ensuring security has several aspects: a physically secure facility, personnel security, a secure computing environment, an on-site Census employee, and disclosure avoidance.

Physically Secure Facility

Each RDC is in a secure physical location, as certified by the Office of Security of the Department of Commerce. Access to these facilities is tightly controlled and monitored.

Personnel Security

Access to the RDC is reserved for researchers carrying out approved projects at the RDC, to Census Bureau employees, and to carefully designated persons who have a need to enter. Those who are not Census Bureau employees must have obtained Special Sworn Status (SSS). Individuals obtaining SSS must pass a background check and must sign and make a sworn statement about preserving the confidentiality of the data. Individuals who violate this agreement are subject to the same criminal penalties as Census Bureau employees, as well as the denial of future access to the data, and the potential for loss of professional credibility. It is essential that researchers share the same "culture of confidentiality" held by Census Bureau employees regarding the preservation of the confidentiality of data.

Secure Computing Environment

Access to the computer facilities within the RDCs is limited to researchers with SSS carrying out approved projects at the RDC and to Census Bureau employees. Researchers are permitted access only to the data files necessary - as specified in their proposals - to perform their research. All personnel using RDC computer facilities are subject to monitoring by Census Bureau personnel at all times and are held responsible for the use of the computers and the data they contain. All feasible steps are taken to ensure the highest degree of computer security.

All research is conducted on the secure computing equipment located within the RDC. This equipment is provided by the Census Bureau and is configured, managed, and monitored by Census Bureau staff. Researchers cannot bring any other computing equipment, including laptops or other portable computing devices, into the RDC.

Presence of CES Employees (the Lab Administrator) at RDCs

At least one Census Bureau employee (Lab Administrator) is assigned to each RDC. This on-site Census Bureau employee is essential, for several reasons. The employee functions much like CES headquarters staff members do in providing guidance on using the data or references to the relevant people at CES headquarters. Particularly crucial at RDCs is the need to provide guidance to the researchers regarding confidentiality restrictions. By working closely with the researchers and becoming familiar with the details of the research projects, the Lab Administrator can instill the Census Bureau's "culture of confidentiality" into the researchers and can provide effective and timely disclosure analysis of the research output.

Disclosure Avoidance

All research output must be reviewed for disclosure before it can be released for use outside Census Bureau facilities. Census Bureau employees at RDCs protect data confidentiality by performing "disclosure analysis" on all materials the researcher requests to remove from the secure facilities - whether they are intended for publication or not. This analysis ensures that Census Bureau policies are followed to prevent the inadvertent release of confidential data. The range of information that can be released without violation of confidentiality is much greater for analytic results (e.g., regression coefficients) than for tabular data. Indeed, we emphasize that researchers should minimize tabular output because secondary disclosure analysis is very difficult: the operating divisions release as much tabular data as possible, which limits the number of additional tabulations that can be released.

Release of Materials

All materials must be examined by the on-site Census Bureau employee before they can be removed from the RDC. Absolutely no papers, printouts, computer media or other materials can leave the RDC facilities without being first examined and approved for release by the on-site Census Bureau employee.

Research Output

If your research proposal is approved, the Census Bureau will grant you restricted access to non-public micro data files to carry out unique research projects that will benefit both the Census Bureau and your scientific field. But the opportunities come at a price: your research project will take place under a set of rules and limitations that are considerably more constraining than those prevailing in typical research environments.

The constraints stem from the legal requirements to maintain confidentiality of the underlying microdata files to which you may be granted access, the requirement that researchers' projects benefit Census Bureau data programs, and (not least) the mission of CES itself.

To maintain confidentiality of the data, we place limitations on the types and quantity of output that researchers may remove from our secure facilities. Here, we try to explain these limitations. In this way, we hope to ensure that our expectations match yours regarding the kinds of research output you can expect us to release. If you have questions about these policies, please contact the lab administrator of your prospective RDC research site.

The lab administrator must clear all research output for release.

Your RDC lab administrator (or other designated Census Bureau employee) must clear any listings, output, or other research results you wish to remove from the secure RDC facilities. Researchers must allow time for this "disclosure analysis" to take place, and be prepared to work with the RDC administrator in clearing output for release. You will be required to submit certain supplementary documentation to the administrator along with the research output. Typically, the RDC administrator will release the output by e-mailing it to you.

Projects must emphasize model output.

Research projects must emphasize output from statistical models (e.g., regression models) rather than descriptive tabular output. Typically, the descriptive tabulations you should expect to remove will consist of small one- or two-dimensional tables of variables that describe the samples that appear in the models. Such descriptive tables usually appear in journal articles where the main point is the results from the models. Projects that emphasize tabular output (as opposed to statistical models) will not be approved.

We have these policies for the following reasons: First, tabulations present significant risk of what is called "complementary disclosure." By combining information from the released table with other sources of information, it may be possible to infer information on an individual survey respondent. This risk is almost always much greater for tabular descriptive output than for output from statistical models. Moreover, preventing complementary disclosure in clearing large tabulations for release virtually always imposes significant costs - both on you and on the RDC lab administrator.

Second, it is important to remember that the mission of CES is not to supplement the Census Bureau's tabular data programs, but to allow researchers to probe the data in unique ways, using statistical models, to gain knowledge that will enable the Census Bureau to improve its programs. Therefore, clearing large tabulations for release presents a distraction from the primary mission of CES.

All other output is an exception.

We will treat as exceptions requests for output other than models and related simple descriptive tables, as described above. In deciding whether to release the output, we will consider how the output furthers the goals of the project and the various costs of carrying out the disclosure analysis. Moreover, you should assume that the decision to release the output will take considerably longer than the decision to release our typical model-based output. We probably will bring the output before the Census Bureau's Disclosure Review Board, which has final say on all output the Census Bureau releases.

If you really want more tabular output?

If you wish to produce a tabular "data product" as part of your project, you must specify that fact in your proposal. We are very unlikely to approve tabulations that provide more detail than Census Bureau publications. The Census Bureau strives to release as much output as possible in its regular data production programs.

The comments above apply here even more strongly and bear repeating: For this type of request, we will need to bring the output before the Census Bureau's Disclosure Review Board, which has final say on all output the Census Bureau releases. Approval is not certain and may take considerable time.

What about "intermediate output"?

We must strongly discourage the release of "intermediate output." Intermediate output consists of detailed tables of preliminary descriptive statistics, large numbers of similarly specified regression models (often based on "thin" samples) and the like. Put another way, intermediate output is output that will not appear in a publication (e.g., working paper, journal article) or conference presentation. (Note: we do not limit the production of such output for examination in the RDC, only its release.)

Intermediate output increases disclosure risk for several reasons: It is difficult to track, and it can be difficult to associate with particular project publications (or even projects). Moreover, releasing similar tables based on changing samples (adding or dropping small numbers of observations) usually poses significant risk of complementary disclosure.
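
The risk from changing samples can be illustrated with a minimal sketch. It is an illustration only, not a description of the Census Bureau's actual disclosure-review procedure. The firm names, the values, and the helper functions `summarize` and `difference` are all invented for this example. Two summary tables that each look harmless differ by one respondent, so subtracting one from the other recovers that respondent's exact value.

```python
# Illustration of complementary disclosure by differencing two released tables.
# All names and values below are hypothetical; none are real respondent data.

# Hypothetical respondent-level values (e.g., payroll in $1,000s).
sample_a = {"firm_1": 120, "firm_2": 340, "firm_3": 95, "firm_4": 410}

# A second run of the same table after dropping a single observation.
sample_b = {name: value for name, value in sample_a.items() if name != "firm_4"}


def summarize(sample):
    """Return the aggregate cells a researcher might ask to release."""
    return {"n": len(sample), "total": sum(sample.values())}


def difference(table_a, table_b):
    """Subtract one released table from another, cell by cell."""
    return {cell: table_a[cell] - table_b[cell] for cell in table_a}


table_a = summarize(sample_a)   # aggregate over four firms
table_b = summarize(sample_b)   # aggregate over three firms

# Anyone holding both tables can isolate the dropped respondent.
leak = difference(table_a, table_b)
print(leak)
```

In this sketch the difference has a count of one, so its total is the dropped firm's own value even though neither table shows any individual record. This is why a sequence of similar tables is harder to clear than a single table.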

Projects that have more than one researcher.

We recognize that for projects with more than one researcher, the policies stated here may pose difficulties. All project researchers, even those who do not carry out the data analysis at the RDC, should expect to spend time at the RDC facility.

The most common case is a graduate student working intensively with a faculty member, either as a research assistant or perhaps working on a dissertation. Many faculty members, for very good reasons, expect their students to show them large amounts of preliminary tabular output and model specifications. For us, this is intermediate output, which we cannot release. Students may, of course, still generate this output. To view it, the faculty member should expect to come to an RDC facility.

The same situation occurs with any multiple-researcher project in which at least one researcher cannot spend much time at an RDC facility and therefore might expect to see large amounts of intermediate output.

We will work with you

Please keep the policies described here in mind when planning your project. If you have questions or concerns about the policies, we encourage you to contact the RDC lab administrator for your research site. Please work with your RDC lab administrator in planning your output and in clearing it for release. If you do, our experience shows that it should not be difficult for us to clear a set of research output that serves your project's research purposes, while protecting the confidentiality of the underlying microdata.
