National Sample Survey of Registered Nurses -- About the Survey and Data
The Survey has been conducted approximately every four years since 1977. For each
survey year, HRSA has prepared two Public Use data files in flat ASCII file format,
without delimiters. In addition, for 2008, SAS-encoded and SPSS-encoded data files
are available for download. Public use data for the nine NSSRN surveys to date have
been made available to researchers.
About NSSRN Data
The objective of the Survey is to sample and estimate the characteristics of the
registered nurses in the workforce. Nurses may hold licenses in more than one State.
Registered nurses who were sampled answer questions on their education and training in
nursing, professional nursing certifications, education and workforce participation
prior to becoming a registered nurse, current and recent workforce participation, income,
demographic characteristics, and States in which they hold current licenses.
In 2008, the design was modified to allow for stratified systematic sampling in each
State, with multiple strata developed for age level, dual license, and employment commuting
effects. This contrasted with the sample design used from 1977 to 2004, which incorporated
a complex, nested sample frame, with equal probabilities of selection of nurses sampled in
each State. Probabilities of selection were developed for each record. The samples are
selected from the universe of current licensure lists in each State. Sampling weights for each
State have been calculated and added to the record of each nurse in the respective data files,
with adjustments being made in these weights for nurses who have multiple licenses. Even though
some nurses may be sampled in sequential surveys, this is a cross-sectional set of survey
response files and no attempt is made to track the same nurse's career over time.
Links to other reports on the NSSRN surveys of 1992, 1996, 2000, and 2004 may be
found at the Health Professions Workforce Nursing Reports page.
About NSSRN Public Use Files (PUFs)
NSSRN data made available to the public are to be used for research purposes only
and may not be used in any manner to identify individual respondents.
Most of the respondent information collected from the survey is made available in public
use data files, as described below:
State-based Public Use Files (previously called "General Public Use" files)
provide information on nurses, without identifying the County and Metropolitan Area in
which they live or work. Most users will prefer these files for general use applications
that are national or State-level research. Some information from the survey respondent data
has been withheld where there is greater possibility of pointing to an individual.
County Public Use Files provide most, but not all, the same information on
the nurse from the State Public Use File. Whereas the State Public Use File contains little
geographic information below the State level, the County Public Use Files also identify the
County and Metropolitan Areas in which the nurses live or work. Fields likely to point to an
individual in a less-populated county have been withheld.
Tables have been prepared which crosswalk the various items from the NSSRN surveys
against the variable name used in each survey and the respective data set files (i.e.,
In-House, State Public Use, or County Public Use) in which that variable is located for
the respective survey year. This crosswalk can be found as Appendix C in the State and
County data documentation.
For each NSSRN survey cycle and dataset type, there are survey response data and documentation
files. These are separated into two complementary sets of zipped files. For ease in downloading,
all of the documentation information is zipped separately from the data files.
For the years 1977 to 2004, two pairs of zipped directories are created for each NSSRN year:
StatePUFxxdata.zip, StatePUFxxdoc.zip, CountyPUFxxdata.zip, and CountyPUFxxdoc.zip (where "xx"
represents the last two digits of the year of the survey). These subdirectories range in size
from 3 to 20 MB. When unzipped, files will be substantially larger. Within the documentation
package, two SAS and two SPSS syntax auxiliary files are included for all years for each of the
respective 1977-2004 NSSRN survey public use file groups. This will give users the ability to
generate their own SAS- and SPSS-encoded data files.
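The 1977-2004 naming pattern described above can be sketched in a few lines of Python. This is only an illustration: the function name puf_zip_names is hypothetical, and the check covers the 1977-2004 range only, not whether a survey was actually conducted in the given year.

```python
def puf_zip_names(year: int) -> list[str]:
    """Build the four zipped-directory names for one 1977-2004 survey year.

    Follows the pattern StatePUFxxdata.zip, StatePUFxxdoc.zip,
    CountyPUFxxdata.zip, CountyPUFxxdoc.zip, where "xx" is the last
    two digits of the survey year.
    """
    if not 1977 <= year <= 2004:
        # The 2008 packages use a different naming scheme.
        raise ValueError("pattern applies to survey years 1977-2004 only")
    xx = f"{year % 100:02d}"  # e.g. 2004 -> "04", 2000 -> "00"
    return [
        f"{kind}PUF{xx}{part}.zip"
        for kind in ("State", "County")
        for part in ("data", "doc")
    ]


if __name__ == "__main__":
    for name in puf_zip_names(2004):
        print(name)
```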
For 2008 only, there are additional data and format/syntax files that may be downloaded.
As a courtesy to SAS and SPSS users, HRSA is providing SAS-encoded and SPSS-encoded data
files in the zipped packages NSSRN2008_State_SAS_encoded_package.zip (31 MB),
NSSRN2008_CNTY_SAS_encoded_package.zip (32 MB), NSSRN2008_State_SPSS_encoded_package.zip
(25 MB), and NSSRN2008_CNTY_SPSS_encoded_package.zip (27 MB). These zipped files contain
data files that are encoded and ready to use within the respective SAS or SPSS program,
without needing the supplemental ASCII text data files for 2008.
The ASCII formatted data are provided in the NSSRN2008_STATE_ASCII_package (18 MB) and the
NSSRN2008_CNTY_ASCII_package (20 MB). The encoded files, however, are much larger than the
ASCII data files. The ASCII data files for 2008 are generally larger than the files from
earlier surveys.
The complete documentation includes the PDF file that describes in detail how to use and
understand the survey data, as well as copies of SAS and SPSS data description files used for
loading the data into SAS or SPSS; some of these files may also be useful once the ASCII data
has been loaded into SAS or SPSS. The County documentation file is
NSSRN2008_CNTY_Documentation_package.zip (13 MB) and the State documentation file is
NSSRN2008_State_Documentation_package.zip (14 MB).
To keep track of the various survey cycle years, users should keep the subdirectory names as provided rather than renaming them.
The General and County data files cannot be merged into one aggregate database covering all attributes together with extensive geographic information, because there are no common, unique identifiers for each surveyed nurse across these two database files.
Files named RNxxPUBL.dat and RNxxCNTY.dat (1977-2004) and RN08_State_data.dat and RN08_CNTY_data.dat (2008) are the public use ASCII database files for each survey. The .txt (or .dat) format for the survey response data consists of ASCII flat file data records which are formatted without delimiters.
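Because the records have no delimiters, each field must be cut out of the line by its column position. The Python sketch below shows one way to do this. The field names and column positions in LAYOUT are invented for illustration and are NOT the real NSSRN layout; the actual positions must be taken from the Codebook for the survey year in question. The sketch also assumes that positions are given as 1-based, inclusive start and end columns.

```python
# Hypothetical layout: (field name, first column, last column), 1-based and
# inclusive. These are placeholders, not actual NSSRN field positions.
LAYOUT = [
    ("state", 1, 2),
    ("age", 3, 4),
    ("weight", 5, 12),
]


def parse_record(line, layout=LAYOUT):
    """Split one fixed-length record into a dict of field name -> text value."""
    line = line.rstrip("\r\n")
    # Convert 1-based inclusive columns to Python's 0-based, end-exclusive slices.
    return {name: line[start - 1:end].strip() for name, start, end in layout}


def read_fixed_width(path, layout=LAYOUT):
    """Read every non-blank record of an ASCII flat file into a list of dicts."""
    with open(path, encoding="ascii") as handle:
        return [parse_record(line, layout) for line in handle if line.strip()]


if __name__ == "__main__":
    # A made-up 12-character record matching the hypothetical layout above.
    print(parse_record("0645 12.5000\n"))
```

Values are kept as text so that leading zeros in coded fields are preserved; convert individual fields to numbers only where the Codebook indicates a numeric variable.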
For each survey from 1977 to 2008, SAS and SPSS auxiliary syntax files are included with the documentation package. The SAS auxiliary syntax files are in the form of .txt files. The SPSS auxiliary syntax files for 1977-2004 are saved with the .sps extension even though they are plain text files. Both SAS and SPSS users will need to pay attention to the second program line of each of the 'LOADNLABELS' files (‘RecFmt’ for SAS and SPSS for 2008), which contains an ‘Infile’ statement (SAS) or a '/FILE' statement (SPSS) with a default file name and drive location. The user must substitute the file name and location of their own copy of the raw ASCII (text) data for each respective public use file. For SPSS in 2008, there is a VarCategories text file which identifies the data value categories for each variable. For SAS in 2008, the user should use the SASFormats sas7bcat file, which is a catalog of the variables and their category values already encoded into SAS.
Users may attempt to import the ASCII versions of the database files into EXCEL; however, because the data files are fixed-length records and are not delimited, extreme caution must be taken in the use of the EXCEL Import Wizard to ensure proper location of the boundaries between fields.
In addition, the underlying data contain more columns than some versions of EXCEL can support. It is therefore necessary to count the number of data columns you have defined and to discard unneeded sections of the data record. One way to stay within the maximum number of columns in an EXCEL spreadsheet is to mark off one or more blocks of up to 255 characters of the ASCII file, covering fields that you are sure you will not analyze further, as a single text field. The Import Wizard will allow you to skip these blocks of data (using the “Do not import” radio button) in the final step of the process. Alternatively, you may import them and then delete each such text field that is not of further analytical interest.
To use the published weights for each nurse in EXCEL, the user must add new spreadsheet columns that hold the cross-products (each value multiplied by its weight) needed to obtain properly weighted sums and averages.
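The same cross-product arithmetic can be sketched outside a spreadsheet. In the Python example below, the function names and the sample numbers are made up for illustration. The code produces point estimates only; it does not implement the sample variance estimation described in the survey documentation.

```python
def weighted_total(values, weights):
    """Sum of value * weight over all records (the cross-product column total)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    """Weighted average: cross-product total divided by the sum of the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights is zero")
    return weighted_total(values, weights) / total_weight


if __name__ == "__main__":
    # Made-up example: a 0/1 indicator for three respondents and their weights.
    indicator = [1, 0, 1]
    weights = [10.0, 30.0, 60.0]
    print(weighted_total(indicator, weights))  # estimated count with indicator = 1
    print(weighted_mean(indicator, weights))   # estimated share with indicator = 1
```

With a 0/1 indicator, the weighted total estimates how many nurses in the population have the characteristic and the weighted mean estimates their share.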
We believe that users who only possess EXCEL can successfully perform simple and meaningful analyses of the data if the above steps are undertaken, although manipulating the data manually in a spreadsheet is tedious. We recommend that users employ statistical analysis software such as SPSS, SAS, or Stata to perform complex analyses or compute weighted estimates.
For 2008, the documentation and codebook information are contained in the RN08_State_Documentation.pdf and RN08_CNTY_Documentation.pdf files. This is a more streamlined approach than the one used for 1977-2004, where a main documentation file (files such as RN04CDOC.pdf and RN04PDOC.pdf, for the County and State documentation respectively) referenced separate appendix files. For 1977-2004, accompanying files included the Readme files and Appendices A-I. The appendices within the 2008 State and County Documentation Files roughly correspond to the separate appendix files included for 1977-2004, except as noted below.
The Readme files for 1977-2004 serve as the central listing that summarizes the various files and documentation in each respective zipped directory. In 2008, this information is found directly in the consolidated Documentation file.
For survey years 1977 to 2004, files such as RN04CDOC.pdf and RN04PDOC.pdf constitute the main documentation manuals for the County and General public use data sets, respectively.
The documentation packages in all years include:
- background of the survey
- layout of the documentation manual
- technical and programmer's information
- naming conventions for variables in the questionnaire
- constructed (derived) variables based on formulae using the responses to the original questions of the survey
- definitions of the derived variables
- sample variance estimation and design notes, and
- a Codebook, which includes
  - documentation identifying locations of each field/variable on the data file
  - category levels for each field/variable, and
  - marginal distribution information for the response categories used in that survey.
The appendices cover the following material:
- Appendix A (or appendxa.pdf for 1977-2004) contains a scan of the original questionnaire survey instrument
- Appendix B (or appendxb.pdf for 1977-2004) contains a description of the statistical sampling methodology
- Appendix C (or appendxc.pdf, appCGuid.doc, appCGuid.pdf, appCXwlk.doc, and/or appCXwlk.pdf over 1977-2004), a crosswalk spreadsheet showing the evolution of the various questions by topic, tracks which variables have been available in the General (State) File, the County File, or only in HRSA's in-house file.
Accompanying the spreadsheet is text (or a separate file for 1977-2004) that explains how these variables are coded in the columns of the crosswalk. The codes distinguish four cases: available in-house and on both public use files; available on the in-house and General Public Use Files; available on the in-house and County Public Use Files; and available only on the in-house file.
- Appendices D through G (or appendxd.pdf to appendxg.pdf over 1977-2004) identify the numerical codes for various geographic entities, such as State, Foreign Country, Federal Region, County, or Metropolitan Area. In 2004 only, appendxh.pdf also provides information on Metropolitan Areas. Information that relates only to the county or metropolitan data is not included in the documentation for the General data files.
- Appendix I.pdf (in 2004, only) provides a list of the Priority State orderings which are used in the sampling and weighting processes of the survey. The information is not applicable to the 2008 NSSRN revised systematic stratified sampling design.
For all survey years, two auxiliary text files for use with SAS (with “SAS” in the filename) and two auxiliary files for use with SPSS (with ".sps" or "SPSS" in the filename) provide formatting, inputting, and labeling information for the variables and categories found on the ASCII data files. Each pair of files, for SAS or SPSS respectively, can be used to identify the input variables on the data file records, the labels for each variable and variable value, and the data format for each field. For 2008 only, the documentation file grouping contains two additional files for SAS, a ‘format match’ file and a format ‘sas7bcat’ file; each of these two files covers all variables from the survey, including those withheld from the public use data files. These sets of files are to be used with the public use data from the survey in the respective SAS or SPSS statistical application program.
NSSRN Data Files and Documentation Available Online
Public Use data files and documentation for any of the NSSRN surveys may be downloaded from this web site. To download, visit the Data Download page. If you experience any technical difficulties or have questions on the use of the files, please send your inquiry to comments@hrsa.gov.