HRSA - U.S Department of Health and Human Services, Health Resources and Service Administration U.S. Department of Health and Human Services
Methods for Identifying Facilities and Communities with Shortages of Nurses, Technical Report
 

Methods and Models Using Geographic Data Only

The rationale for using a geography-based method to identify facilities with critical shortages of RNs is that recruiting and retention difficulties at the facility level will be strongly influenced by geographic context (e.g., availability of RNs in the immediate geographic area). Certain types of facilities (e.g., long-term care facilities, publicly sponsored facilities) will have greater relative difficulty in obtaining and retaining adequate numbers of RNs in the presence of geographic shortages, but when numbers of RNs available at the local level are adequate to meet the needs of all facilities, inter-facility competition should be a less important factor. Facilities in communities with an adequate supply of RNs may face difficulties in attracting and retaining RNs due to issues related to organizational culture and management practices, but the NELRP program is not intended to address these difficulties.

Potential shortage areas were primarily analyzed at the county level due in large part to data constraints. An obvious shortcoming of county-based analysis was that people often cross county (or even state) lines to seek health care. There were many counties that had no hospitals, for example, but their residents presumably obtained care in other counties.

On the other hand, facilities were likely to draw RNs from the same geographic areas from which they draw patients, and so a shortage of RNs in residence relative to the estimated needs of the population may indicate problems even if both RNs and patients commute to an adjacent county to give or receive care. Clearly, however, the use of counties was inferior to the use of service areas based on actual patterns of health care access, but existing service area designations were badly dated or based on zip codes (to which the necessary data at county or census tract levels did not easily correspond).

Another shortcoming of counties as the unit of analysis relates to shortages in large metropolitan areas. In New York City, for example, all of Manhattan is included in a single county, but neighborhoods within Manhattan vary widely in their economic and demographic characteristics and in their health care infrastructure. Although neighborhoods with high and low levels of resources may even be contiguous to one another, physical and social barriers can prevent both RNs and patients from traveling into other neighborhoods to give or receive care. Therefore, in the largest metropolitan areas, we attempted to replicate the county-level methodology at the level of census tracts. Most of the necessary data were available at the census tract level.

The methodology for defining geographic areas with shortages of RNs was inspired in large part by the Nurse Demand Model (NDM) and Nurse Supply Model (NSM) used by HRSA to project nursing supply and demand. Facilities within shortage areas were then prioritized based upon facility characteristics.

A. National Models Based on County Data

Several methods are used in the literature to estimate the demand or need for RNs. The most commonly used measure is the ratio of RNs to population; the ratio of RNs to MDs is also common. In this study, we focused on the ratio of RNs to population as a measure of the need for RNs. When calculating this measure, we needed to adjust population size for the gender and age composition of the population. The next section describes how gender- and age-adjusted population estimates were calculated.

The purpose of this component of the research was to estimate the relative need for RNs across U.S. counties based on RNs per gender- and age-adjusted population. As a comparison, the ratio of RNs to MDs was also estimated. Each of these measures was used as the dependent variable in a separate OLS regression analysis. Data used in this study came from the Area Resource File (ARF) of 2005 and the National Health Interview Survey (NHIS) of 2003-2004.

1. Model Based on RN to Age-Adjusted Population Ratios

Assumption: RNs should be evenly distributed across the U.S. population adjusting for age.

Assumption: Age-specific patterns of health care utilization do not vary substantially across counties.

Assumption: Need for RNs (as distinct from demand for RNs) is based on population characteristics rather than existing health infrastructure.

Assumption: RN commuting patterns are similar to the commuting patterns of other workers in terms of county inflow and outflow.

This method was an effort to improve upon the basic RN-to-population ratio by applying weights to adjust for the age distribution of the population. The essential idea was similar to that employed in Method 1, but it accounted for fewer factors so that the same methodology could be applied to different units of geography so long as basic population data were available. Because older Americans use more health care than the general population, and younger adults use less, this methodology applied a greater weight to older adults than to those ages 18-44. However, the differences in utilization rates by age differed by type of health care, and types of health care differed in their demand for RNs (e.g., about 45% of the demand for RNs was inpatient hospital demand, while only 6% of RN demand was nursing home demand). The weights took into account both estimated use of various forms of health care and the influence of these types of health care on the national demand for RNs.

Table 29. National Estimates of RNs per Unit of Service

| Setting | RNs | Units of Care | RNs per Unit | Total RNs | % RNs |
|---|---|---|---|---|---|
| Inpatient units | 1,058,242 | 168,846,928 | 0.0062675 | 2,375,792 | 45% |
| Outpatient units | 145,118 | 83,715 | 1.7334767 | 2,375,792 | 6% |
| Physician offices | 278,093 | 951,214 | 0.2923559 | 2,375,792 | 12% |
| Emergency Department | 117,381 | 107,490 | 1.0920179 | 2,375,792 | 5% |
| Long-term hospitals | 139,091 | 22,402,741 | 0.0062087 | 2,375,792 | 6% |
| Extended care | 153,366 | 1,469,500 | 0.1043661 | 2,375,792 | 6% |
| Home health agency | 106,690 | 1,355,290 | 0.0787212 | 2,375,792 | 4% |
| Nursing education | 63,833 | 2,375,792 | 0.0268681 | 2,375,792 | 3% |
| Public/community health | 87,952 | 280,836,834 | 0.0003132 | 2,375,792 | 4% |
| School health | 78,539 | 49,036,764 | 0.0016016 | 2,375,792 | 3% |
| Occupational health | 22,569 | 173,907,572 | 0.0001298 | 2,375,792 | 1% |
| Other | 124,918 | 280,836,834 | 0.0004448 | 2,375,792 | 5% |

Source: 2004 NSSRN

Utilization rates [1] for specific age groups were then standardized against the average for the population overall to obtain ratios of how many persons an individual from a specific age group should count as in calculating utilization. For example, overall use of inpatient days in the U.S. in 2002[2] was 541.3 per 1,000 population, while those ages 0-5 averaged 601.0 days of care (a ratio of 1.11). Therefore, each person ages 0-5 would be weighted as 1.11 persons for the purposes of determining demand for inpatient care. In contrast, those ages 6-17 averaged 110.3 days of inpatient care per 1,000 (a ratio of 0.20). Therefore, each person ages 6-17 would be weighted as 0.20 persons in determining demand for inpatient care.

Utilization of each type of care was further adjusted for the relative influence of that type of care on demand for RNs overall. For example, the greatest driver of demand for RNs was demand for inpatient services (where 45% of RNs are employed), while non-emergency hospital outpatient services influenced overall demand for RNs much less (as they only employ 6% of RNs). Each age group’s weight for inpatient services was then multiplied by 0.45; each age group’s weight for hospital outpatient services was multiplied by 0.06, etc. Adjusted weights for each type of care for each group were then summed to produce the group’s total weight, which should be a reflection of how many people each individual “counts as” in determining overall RN demand. A constant adjustment factor was then applied to the adjusted weight for each group so that the weighted population totals equaled the actual U.S. population.
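The weighting steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the report's actual computation: only the inpatient rates for ages 0-5 and 6-17, the overall inpatient rate, and the 45% and 6% employment shares come from the text. The outpatient rates, the restriction to two care settings and two age groups, and the population counts are hypothetical placeholders.

```python
# Sketch of the age-weighting steps. Inpatient figures are from the text;
# everything marked "hypothetical" is an illustrative assumption.

# Utilization per 1,000 population, by care type and age group.
utilization = {
    "inpatient": {"0-5": 601.0, "6-17": 110.3},    # from the text (2002)
    "outpatient": {"0-5": 300.0, "6-17": 200.0},   # hypothetical
}
overall_rate = {"inpatient": 541.3, "outpatient": 250.0}  # outpatient hypothetical
rn_share = {"inpatient": 0.45, "outpatient": 0.06}        # share of RNs employed
population = {"0-5": 1_000, "6-17": 2_000}                # hypothetical counts


def standardized_ratio(care_type, group):
    """Group utilization relative to the overall population average."""
    return utilization[care_type][group] / overall_rate[care_type]


def raw_weight(group):
    """Sum of standardized ratios, each scaled by the setting's share of RNs."""
    return sum(rn_share[c] * standardized_ratio(c, group) for c in utilization)


def final_weights():
    """Rescale raw weights so the weighted population equals the actual one."""
    raw = {g: raw_weight(g) for g in population}
    weighted_total = sum(raw[g] * population[g] for g in population)
    factor = sum(population.values()) / weighted_total
    return {g: raw[g] * factor for g in population}


weights = final_weights()
```

With only two settings included, the resulting weights are not comparable to the published ones; the point of the sketch is the order of operations (standardize, scale by RN share, sum, then apply a constant factor).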

Final weights for each group are shown below in Table 30. As shown, the age group that exerted the least influence on demand for RNs (ages 6-17) was weighted at about half a person, while the group that exerted the most influence (ages 85 and up) was weighted at about five persons.

Table 30. Final Population Weights by Age Group

| Age group | Final Weight |
|---|---|
| 0 to 5 | 0.890 |
| 6 to 17 | 0.511 |
| 18 to 44 | 0.690 |
| 45 to 54 | 0.897 |
| 55 to 64 | 1.078 |
| 65 to 74 | 1.947 |
| 75 to 84 | 3.367 |
| 85 and up | 5.024 |

These weights were applied to the population of each county to produce a weighted population count that reflected demand for RNs more accurately than simply an unweighted population ratio.
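As a rough illustration of how the final weights translate into a weighted population count, the sketch below applies the Table 30 weights to a single county. The county's population counts and RN count are hypothetical, and the function names are introduced here only for illustration.

```python
# Final population weights by age group, as published in Table 30.
TABLE_30_WEIGHTS = {
    "0-5": 0.890,
    "6-17": 0.511,
    "18-44": 0.690,
    "45-54": 0.897,
    "55-64": 1.078,
    "65-74": 1.947,
    "75-84": 3.367,   # the band immediately below "85 and up"
    "85+": 5.024,
}


def weighted_population(counts):
    """Sum of age-group counts, each multiplied by its Table 30 weight."""
    return sum(TABLE_30_WEIGHTS[group] * n for group, n in counts.items())


def rns_per_100k(rn_count, counts):
    """RNs per 100,000 age-adjusted population."""
    return rn_count / weighted_population(counts) * 100_000


# Hypothetical county: population counts by age group and an RN count.
county = {
    "0-5": 7_000, "6-17": 16_000, "18-44": 36_000, "45-54": 14_000,
    "55-64": 11_000, "65-74": 8_000, "75-84": 5_000, "85+": 3_000,
}
county_rns = 800

adjusted = weighted_population(county)
ratio = rns_per_100k(county_rns, county)
```

A county with a relatively old population will have a weighted count larger than its actual count, which lowers its RN ratio relative to an unweighted calculation.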

Figure 17 below shows the projected number of RNs per 100,000 age-adjusted population (see Table 44 for the weights used to adjust the population for patterns of health care use by age).

As the figure illustrates, the supply of RNs relative to the age-adjusted population will peak in 2008, decline very slightly by 2012, and decline further by 2016. By 2024, the relative supply of RNs is estimated to be 15% less than the 2004 level.

Figure 17. Projected RNs per 100,000 Age-Adjusted Population


2. Age-Gender Adjusted Population

When comparing RNs per population across counties, it is important to consider the age-gender distribution in each county, because this is an important determinant of health care utilization. A county with a higher proportion of older adults needs more RNs than a county with a lower proportion of older adults, even if the counties are otherwise alike. However, numerous other factors could affect the need for RNs. For example, a county with a higher morbidity rate needs more RNs than a county with a lower rate.

Health care utilization rates are commonly used to adjust population when calculating the ratio of physicians to population. For example, the Sheps Center for Health Services Research at the University of North Carolina at Chapel Hill has used adjusted population to estimate physician-to-population ratios. A similar procedure was used in this study, but it focused on RNs per population rather than physicians per population.

Health care utilization rates were estimated from the number of nights in hospital (inpatient days) and the number of visits to health care professionals, including emergency department visits (outpatient visits). The rates were estimated from a sample, drawn from the NHIS 2003-2004, of non-Hispanic Whites who had health plans. Table 31 presents the health care utilization rates by age and gender categories for inpatient days and outpatient visits.

Figure 18 shows the distribution of RN hours spent in direct patient care consisting of physician office, hospital inpatient, and emergency department (ED), and in non-direct patient care. These proportions were obtained from the 2000 NSSRN. The distribution presented in the figure was used to aggregate inpatient days, outpatient visits, and ED visits estimated from NHIS data. Please note that in the NHIS dataset physician office visits and emergency room visits were combined into one variable. Thus, in this study the percentage of RN hours spent in physician office and emergency room was consolidated into one value, 17.6%.

Table 31. Inpatient and Outpatient Health Care Utilization by Age and Gender, 2003-04

| Age Group | Inpatient Days / Year, Male | Inpatient Days / Year, Female | Outpatient Visits / Year, Male | Outpatient Visits / Year, Female |
|---|---|---|---|---|
| 0 - 4 | 0.741 | 0.759 | 0.255 | 0.240 |
| 5 - 9 | 0.054 | 0.053 | 0.153 | 0.132 |
| 10 - 14 | 0.064 | 0.040 | 0.124 | 0.119 |
| 15 - 19 | 0.127 | 0.225 | 0.125 | 0.178 |
| 20 - 24 | 0.296 | 0.634 | 0.120 | 0.238 |
| 25 - 29 | 0.344 | 0.606 | 0.121 | 0.283 |
| 30 - 34 | 0.173 | 0.581 | 0.134 | 0.273 |
| 35 - 44 | 0.207 | 0.424 | 0.156 | 0.279 |
| 45 - 54 | 0.477 | 0.387 | 0.209 | 0.314 |
| 55 - 59 | 0.762 | 0.685 | 0.298 | 0.369 |
| 60 - 64 | 0.880 | 0.843 | 0.347 | 0.389 |
| 65 - 74 | 1.152 | 1.112 | 0.354 | 0.405 |
| 75 - 84 | 1.592 | 1.963 | 0.483 | 0.435 |
| ≥ 85 | 3.567 | 2.159 | 0.512 | 0.398 |
| Average | 0.746 | 0.748 | 0.242 | 0.289 |

Source: NHIS 2003-2004

Figure 18. Distribution of RN Hours Spent in Direct Patient Care (Physician Office, Inpatient, ER/ED) and Other Activities


Source: Calculated based on data from the National Sample Survey of RNs, March 2000

The next step was to calculate the weight corresponding to each age-gender group. To illustrate, the weight for males ages 5-9 years was computed as follows:

Weight = 0.317 x (0.054/0.746) + 0.176 x (0.153/0.242) + 0.507 = 0.6410
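The worked example above can be reproduced with a short function. The sketch below takes the three shares of RN hours (0.317 inpatient, 0.176 physician office and ED combined, 0.507 other) and the utilization rates directly from the formula; the function and constant names are introduced here for illustration. Because the published inputs are rounded to three decimals, the result agrees with the published weight only approximately.

```python
# Shares of RN hours used in the worked example.
INPATIENT_SHARE = 0.317
OUTPATIENT_SHARE = 0.176   # physician office and ED combined (17.6%)
OTHER_SHARE = 0.507        # non-direct care; not scaled by utilization


def age_gender_weight(inpatient, outpatient, avg_inpatient, avg_outpatient):
    """Weight for one age-gender group.

    Each utilization rate is divided by the gender-specific average,
    scaled by the matching share of RN hours, and the unscaled
    'other' share is added as a constant.
    """
    return (INPATIENT_SHARE * inpatient / avg_inpatient
            + OUTPATIENT_SHARE * outpatient / avg_outpatient
            + OTHER_SHARE)


# Males ages 5-9: 0.054 inpatient days and 0.153 outpatient visits per year,
# against male averages of 0.746 and 0.242.
weight_male_5_9 = age_gender_weight(0.054, 0.153, 0.746, 0.242)
```

A group whose utilization equals the average in both settings receives a weight of exactly 1 (0.317 + 0.176 + 0.507), which is why the weights can be read as "persons counted per person."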

The final weights for all age-gender groups are presented in Table 32.

Table 32. Weights for Age-Gender Adjusted Population

| Age group (year) | Weight, Male | Weight, Female |
|---|---|---|
| 0 - 4 | 1.0072 | 0.9746 |
| 5 - 9 | 0.6410 | 0.6098 |
| 10 - 14 | 0.6244 | 0.5962 |
| 15 - 19 | 0.6516 | 0.7106 |
| 20 - 24 | 0.7199 | 0.9208 |
| 25 - 29 | 0.7410 | 0.9356 |
| 30 - 34 | 0.6783 | 0.9195 |
| 35 - 44 | 0.7083 | 0.8563 |
| 45 - 54 | 0.8621 | 0.8619 |
| 55 - 59 | 1.0474 | 1.0216 |
| 60 - 64 | 1.1333 | 1.1010 |
| 65 - 74 | 1.2541 | 1.2248 |
| 75 - 84 | 1.5354 | 1.6036 |
| ≥ 85 | 2.3960 | 1.6637 |

Note: Estimated based on NHIS 2003-2004 data and the National Sample Survey of RNs of 2000

The final step was calculating age-gender adjusted population using the weights presented in Table 32. The age-gender adjusted population for County C was calculated as the weighted sum of populations of all age-gender groups, formulated as follows:

Adjusted Pop = 1.0072 x (# Males 0-4) + … + 2.3960 x (# Males ≥ 85) +
             0.9746 x (# Females 0-4) + … + 1.6637 x (# Females ≥ 85)
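A minimal sketch of this weighted sum follows. The weights are those published in Table 32, with the male weight applied to male counts and the female weight applied to female counts in each age band. The function names and the band labels used as dictionary keys are introduced here for illustration only.

```python
# Age-gender weights from Table 32, keyed by age band: (male, female).
TABLE_32_WEIGHTS = {
    "0-4": (1.0072, 0.9746),
    "5-9": (0.6410, 0.6098),
    "10-14": (0.6244, 0.5962),
    "15-19": (0.6516, 0.7106),
    "20-24": (0.7199, 0.9208),
    "25-29": (0.7410, 0.9356),
    "30-34": (0.6783, 0.9195),
    "35-44": (0.7083, 0.8563),
    "45-54": (0.8621, 0.8619),
    "55-59": (1.0474, 1.0216),
    "60-64": (1.1333, 1.1010),
    "65-74": (1.2541, 1.2248),
    "75-84": (1.5354, 1.6036),
    "85+": (2.3960, 1.6637),
}


def adjusted_population(males, females):
    """Weighted sum of male and female counts over all age bands.

    `males` and `females` map age-band labels to head counts;
    bands missing from either mapping are treated as zero.
    """
    total = 0.0
    for band, (w_male, w_female) in TABLE_32_WEIGHTS.items():
        total += w_male * males.get(band, 0) + w_female * females.get(band, 0)
    return total


def rns_per_1000_adjusted(rn_count, males, females):
    """RNs per 1,000 age-gender adjusted population."""
    return 1000 * rn_count / adjusted_population(males, females)
```

For example, `adjusted_population({"0-4": 100}, {"85+": 100})` counts 100 boys under five and 100 women ages 85 and older as roughly 267 adjusted persons.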

This method is similar to the method commonly used to calculate the base population to estimate the need for physicians in a specific county, state, region, or other geographic area.

In the first model specification, the dependent variable was the ratio of RNs per 1,000 age-gender adjusted population. As a comparison, a model with RNs per MD as the dependent variable was also developed. The distributions of these two dependent variables are presented in Tables A-1 to A-3 in Appendix A for states, regions, and rural and urban areas, respectively.

3. OLS Regression Analysis

RNs per Age-Gender Adjusted Population as the Dependent Variable

The dependent variable in the first specification model was the ratio of RNs to age-gender adjusted population. Explanatory variables used in the analysis are as follows:

  1. Dummies for 9 census divisions with the Pacific region used as reference
    (8 dummies: dr1 - dr8)
  2. Dummies for metropolitan counties, with Non-Metropolitan County used as reference
    (3 dummies: dm1 – dm3)
  3. Percentage of population ages 5 years or younger (pp_5)
  4. Percentage of population ages 65 years or older (pp65_)
  5. Percentage of Black and Hispanic population (blck_hsp)
  6. Percentage of American Indian and Alaska Native population (AIAN)
  7. Percentage of population in poverty (pvrtypct)
  8. Infant mortality rate (infmortr)
  9. Percentage of agriculture/forest/fish/hunt/mine workers (agricpct)
  10. Percentage of manufacturing workers (manufpct)
  11. Percentage of health and social service workers (healthpct)
  12. Percentage of white collar workers (whcollar)
  13. Dummy indicating that the county has more than one hospital (dhsp2)
  14. MDs per 1,000 individuals (md_pop)
  15. Medicare inpatient days per 100 individuals (mdicr_pop)

The descriptive statistics for each of these variables are presented in Tables A-4 to A-8 in Appendix A.

Table 33 presents the coefficient estimates for the first model based on county data from the ARF of 2005. The coefficient estimates of the regional dummies were all significant and positive, indicating that the regions represented by the eight dummy variables had significantly higher RNs per age-gender adjusted population than the Pacific region. The regional estimates varied considerably, ranging from 0.515 for Mountain to 2.378 for East South Central. The coefficient estimates of the dummies for metropolitan counties were also positive and significant, indicating that counties in metropolitan areas had higher RNs per age-gender adjusted population than non-metropolitan counties with otherwise similar characteristics.

Table 33. Estimates of Impact of Selected Factors on RNs per Age-Gender Adjusted Population

| Independent Variable | Coefficient | Std Err | t-stat | p-value |
|---|---|---|---|---|
| Intercept | -0.137 | 0.841 | -0.163 | 0.870 |
| New England | 1.682 | 0.353 | 4.765 | 0.000 |
| Middle Atlantic | 1.367 | 0.281 | 4.867 | 0.000 |
| East North Central | 1.770 | 0.235 | 7.536 | 0.000 |
| West North Central | 1.708 | 0.228 | 7.505 | 0.000 |
| South Atlantic | 1.617 | 0.226 | 7.164 | 0.000 |
| East South Central | 2.378 | 0.251 | 9.456 | 0.000 |
| West South Central | 1.024 | 0.228 | 4.482 | 0.000 |
| Mountain | 0.515 | 0.239 | 2.156 | 0.031 |
| Counties of metro areas of 1 million population or more | 0.369 | 0.172 | 2.143 | 0.032 |
| Counties in metro areas of 250,000 - 1,000,000 pop. | 0.864 | 0.161 | 5.368 | 0.000 |
| Counties in metro areas of fewer than 250,000 pop. | 0.697 | 0.149 | 4.678 | 0.000 |
| Percentage of population age 5 or younger | 0.139 | 0.058 | 2.376 | 0.018 |
| Percentage of population age 65 years or older | 0.032 | 0.015 | 2.095 | 0.036 |
| Percentage of Black and Hispanic population | -0.019 | 0.004 | -5.335 | 0.000 |
| Percentage of AIAN population | -0.035 | 0.007 | -5.032 | 0.000 |
| Percentage of population in poverty | -0.158 | 0.013 | -11.835 | 0.000 |
| Infant mortality rate | -0.013 | 0.011 | -1.214 | 0.225 |
| Percentage of agriculture/forest/fish/hunt/mine workers | -0.041 | 0.008 | -5.011 | 0.000 |
| Percentage of manufacturing workers | 0.020 | 0.008 | 2.481 | 0.013 |
| Percentage of health and social service workers | 0.196 | 0.013 | 15.019 | 0.000 |
| Percentage of white collar workers | 0.064 | 0.010 | 6.356 | 0.000 |
| Dummy for county having 2 or more hospitals | 0.221 | 0.108 | 2.043 | 0.041 |
| Number of MDs per 1,000 individuals | 0.203 | 0.040 | 5.107 | 0.000 |
| Medicare inpatient days per 100 individuals | 0.263 | 0.128 | 2.060 | 0.039 |

Note: Estimated using OLS regression based on data from Area Resource File of 2005
R2 = 0.43
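To show how the Table 33 coefficients combine into a predicted value for one county, the following sketch sums the intercept, the applicable region and metropolitan dummies, and each slope times its variable. The coefficients are copied from Table 33 and the short variable names from the list of explanatory variables; the category labels used as dictionary keys and the `predict` function are hypothetical, and the units of each input (for example, percentages on a 0-100 scale) are an assumption that the report does not state explicitly here.

```python
# Coefficients from Table 33 (dependent variable: RNs per 1,000
# age-gender adjusted population). Reference categories carry 0.
INTERCEPT = -0.137

DIVISION = {
    "New England": 1.682, "Middle Atlantic": 1.367,
    "East North Central": 1.770, "West North Central": 1.708,
    "South Atlantic": 1.617, "East South Central": 2.378,
    "West South Central": 1.024, "Mountain": 0.515,
    "Pacific": 0.0,                      # reference region
}

METRO = {                                # key names are hypothetical labels
    "metro_1m_plus": 0.369,
    "metro_250k_1m": 0.864,
    "metro_under_250k": 0.697,
    "non_metro": 0.0,                    # reference category
}

SLOPES = {
    "pp_5": 0.139, "pp65_": 0.032, "blck_hsp": -0.019, "AIAN": -0.035,
    "pvrtypct": -0.158, "infmortr": -0.013, "agricpct": -0.041,
    "manufpct": 0.020, "healthpct": 0.196, "whcollar": 0.064,
    "dhsp2": 0.221, "md_pop": 0.203, "mdicr_pop": 0.263,
}


def predict(division, metro, values):
    """Predicted RNs per 1,000 adjusted population for one county.

    `values` maps variable names in SLOPES to the county's values;
    variables that are omitted are treated as zero.
    """
    y = INTERCEPT + DIVISION[division] + METRO[metro]
    for name, coef in SLOPES.items():
        y += coef * values.get(name, 0.0)
    return y
```

Comparing such a prediction with a county's observed ratio gives the residual (actual minus predicted) for that county.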

The coefficient estimate of the proportion of population age 5 years or younger was positive and significant: the higher the proportion of population age 5 or younger, the higher was the RNs per age-gender adjusted population. The coefficient estimate of the proportion of population age 65 or older was also positive, indicating that the higher the proportion of population age 65 years or older, the higher was the RNs per age-gender adjusted population. The coefficient estimate of the proportion of Black and Hispanic populations was negative and significant, indicating that the ratio of RNs to age-gender adjusted population was lower in counties with a higher proportion of Black and Hispanic populations. Similarly, the proportion of AIAN population was negatively associated with the RNs per age-gender adjusted population: the higher the proportion of AIAN population in a county, the lower was the RNs per age-gender adjusted population.

The economic condition of a county was represented by the percentage of population in poverty. Its coefficient estimate was negative and significant, which indicated that the weaker the economic condition of a county, the lower was the RNs per age-gender adjusted population. It was noteworthy that economic condition was negatively correlated with the percentage of minority populations (Black, Hispanic, and AIAN); that is, counties with larger minority populations tended to have weaker economic conditions. The other variable that had a high negative correlation with economic condition was infant mortality rate. The coefficient estimate of infant mortality rate was negative, but not significant.

The variables representing the structure of labor markets were the percentages of agriculture/forest/fish/hunt/mine workers, manufacturing workers, health and social service workers, and white collar workers. Table 33 shows that all of these coefficient estimates were statistically significant; among the explanatory variables, only the coefficient estimate for infant mortality rate was not. The coefficient estimate for the percentage of health and social service workers was the highest of the four. This indicated that the percentage of health and social service workers in a county was the most influential labor market factor in attracting people to enter nursing as a profession.

The number of hospitals in a county also affected the RNs per age-gender adjusted population. When the number of hospitals was included in the model, the coefficient was insignificant. The dummy for a county having at least one hospital was also insignificant. When a dummy variable for a county having two or more hospitals was included in the model, its coefficient was significant and positive. A county with one hospital and a county without a hospital, with otherwise similar characteristics, tended to have the same RNs per age-gender adjusted population. But if a county had two or more hospitals, the number of RNs per age-gender adjusted population was higher than in a county with no hospital or only one. This indicated that in a county with more than one hospital, demand for RNs was more competitive than in a county with no hospital or only one. The more competitive the market on the demand side, the higher was the salary, which in turn would attract more people to enter the nursing profession.

The MDs per 1,000 individuals and Medicare inpatient days per 100 individuals were also included as explanatory variables. Both variables had positive coefficient estimates and were statistically significant. The more MDs per individual in a county, the greater the RNs per age-gender adjusted population. In addition, the more Medicare inpatient days per individual, the greater the RNs per age-gender adjusted population.

4. RNs per MD as the Dependent Variable

Assumption: RNs should be evenly distributed according to locations of physicians.

Assumption: RN commuting patterns are similar to the commuting patterns of other workers in terms of county inflow and outflow.

The RN to physician ratio was expected to produce an estimate that was closer to that based on actual utilization data, as physician counts were likely to be in part a proxy for health care infrastructure. This was more useful than utilization rates in that it could be adapted to geographies or time periods where utilization rates were not available, but was a less precise measure and could bias RN shortage estimates against areas that were physician-short.

Example: Albany County

It was estimated that 4,942 RNs and 1,578 physicians were working in Albany County. If we assumed that national RN to physician ratios should be distributed evenly throughout the country, we would expect 3.11 RNs to every physician in Albany County. This would require 4,907 RNs in Albany County.

Although this method still resulted in a slight estimated oversupply of RNs in Albany County, this was only 1% more RNs than needed, rather than the 110% oversupply indicated by Method #2. This method accounted for the greater health care infrastructure in Albany relative to surrounding areas, which demanded more RNs per capita than a simple RN to population ratio would indicate.
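The arithmetic behind the Albany County example can be checked with a short sketch. The inputs (1,578 physicians, 4,942 RNs, and a national ratio of 3.11 RNs per physician) come from the text; the function names are introduced here for illustration. With the rounded ratio of 3.11 the expected count comes out at roughly 4,908 rather than the reported 4,907, presumably because the report used an unrounded ratio.

```python
def expected_rns_from_physicians(physicians, national_rn_per_md):
    """RNs expected if the national RN-to-physician ratio held locally."""
    return physicians * national_rn_per_md


def percent_surplus(actual, expected):
    """Supply above (positive) or below (negative) expectation, in percent."""
    return 100 * (actual - expected) / expected


albany_expected = expected_rns_from_physicians(1_578, 3.11)
albany_surplus = percent_surplus(4_942, albany_expected)
```

The resulting surplus is a little under 1%, consistent with the "1% more RNs than needed" reported above.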

As a comparison, we also estimated a model with RNs per MD as the dependent variable. The explanatory variables for this specification were the same as those for the first specification (RNs per age-gender adjusted population as the dependent variable). Table 34 presents the coefficient estimates for this specification. The R2 was 0.25 for this model, which was much lower than that of the first specification (R2 = 0.43). In addition, some of the coefficients were not significant. Together, these results indicated that the specification with RNs per age-gender adjusted population as the dependent variable was better than the specification with RNs per MD as the dependent variable.

It should be noted that a high number of RNs per MD did not mean that a county had enough RNs; it could instead reflect a low number of MDs in the county. For example, HPSA counties, which had low numbers of MDs, had higher RNs per MD. Specifically, the average RNs per MD among HPSA counties was 16.9, about twice the averages for non-HPSA and partial-HPSA counties (8.9 and 8.3, respectively). In contrast, the average RNs per age-gender adjusted population among HPSA counties was 6.6, compared to 8.8 for non-HPSA counties and 8.2 for partial-HPSA counties. Thus, one must be careful when interpreting and using RNs per MD as an indicator of nursing shortage.

Table 34. Estimate of Impact of Selected Factors on RNs per MD

| Independent Variable | Coefficient | Std Err | t-stat | p-value |
|---|---|---|---|---|
| Intercept | 18.502 | 3.901 | 4.743 | 0.000 |
| New England | 1.311 | 1.512 | 0.867 | 0.386 |
| Middle Atlantic | 1.892 | 1.215 | 1.558 | 0.119 |
| East North Central | 3.438 | 1.021 | 3.368 | 0.001 |
| West North Central | 6.760 | 0.995 | 6.797 | 0.000 |
| South Atlantic | 2.881 | 0.981 | 2.938 | 0.003 |
| East South Central | 4.100 | 1.085 | 3.779 | 0.000 |
| West South Central | 2.746 | 0.994 | 2.762 | 0.006 |
| Mountain | 0.734 | 1.044 | 0.702 | 0.482 |
| Counties of metro areas of 1 million pop. or more | 4.811 | 0.743 | 6.478 | 0.000 |
| Counties in metro areas of 250,000 - 1,000,000 pop. | 4.302 | 0.689 | 6.242 | 0.000 |
| Counties in metro areas of fewer than 250,000 pop. | 4.557 | 0.644 | 7.077 | 0.000 |
| Percentage of population age 5 years or younger | -0.661 | 0.266 | -2.485 | 0.013 |
| Percentage of population age 65 years or older | 0.059 | 0.069 | 0.854 | 0.393 |
| Percentage of Black and Hispanic population | -0.012 | 0.016 | -0.754 | 0.451 |
| Percentage of AIAN population | -0.015 | 0.032 | -0.472 | 0.637 |
| Percentage of population in poverty | -0.085 | 0.061 | -1.391 | 0.164 |
| Infant mortality rate | -0.003 | 0.056 | -0.060 | 0.952 |
| Percentage of agriculture/forest/fish/hunt/mine workers | 0.182 | 0.043 | 4.256 | 0.000 |
| Percentage of manufacturing workers | 0.079 | 0.035 | 2.221 | 0.026 |
| Percentage of health and social service workers | 0.252 | 0.058 | 4.322 | 0.000 |
| Percentage of white collar workers | -0.226 | 0.046 | -4.911 | 0.000 |
| Dummy for county having 2 or more hospitals | -3.006 | 0.463 | -6.494 | 0.000 |
| Number of MDs per 1,000 individuals | -2.064 | 0.172 | -12.009 | 0.000 |
| Medicare inpatient days per 100 individuals | -1.630 | 0.569 | -2.862 | 0.004 |

 Note: Estimated using OLS regression based on data from Area Resource File of 2005
 R2 = 0.25

5. Distribution of Residuals

Table 35 presents the percentage of counties with negative residuals, by state. (The residual was defined as the actual value of the dependent variable less its predicted value, so that a negative value indicated that a county had fewer RNs than the model predicted.) Based on the first specification, the table shows that, apart from the District of Columbia, Utah had the highest percentage of counties with negative residuals (83%). In other words, 83% of counties in Utah had lower RNs per age-gender adjusted population than predicted by the model. In contrast, Hawaii and Montana had the lowest percentages of counties with negative residuals (25% each).
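The tabulation of negative residuals by state can be sketched as follows. The residual definition (actual minus predicted) is taken from the text; the function name, the input layout, and the sample rows are hypothetical and serve only to illustrate the calculation.

```python
from collections import defaultdict


def percent_negative_by_state(rows):
    """Percentage of counties with a negative residual, by state.

    `rows` is an iterable of (state, actual, predicted) tuples, one per
    county, where actual and predicted are values of the dependent variable.
    """
    negative = defaultdict(int)
    total = defaultdict(int)
    for state, actual, predicted in rows:
        residual = actual - predicted      # negative: fewer RNs than predicted
        total[state] += 1
        if residual < 0:
            negative[state] += 1
    return {state: 100 * negative[state] / total[state] for state in total}


# Hypothetical county-level values, for illustration only.
sample_rows = [
    ("State X", 6.1, 7.0),
    ("State X", 8.4, 7.9),
    ("State Y", 5.0, 6.2),
    ("State Y", 4.8, 5.5),
]
shares = percent_negative_by_state(sample_rows)
```

In this made-up sample, half of the counties in "State X" and all of the counties in "State Y" fall below their predicted values.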

Table 35. Percentages of Counties in the U.S. with Negative Residuals

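| FIPS | State | % of Counties with Negative Residuals, Model 1 | % of Counties with Negative Residuals, Model 2 |
|---|---|---|---|
| 1 | Alabama | 52% | 63% |
| 2 | Alaska | 59% | 33% |
| 4 | Arizona | 53% | 40% |
| 5 | Arkansas | 33% | 66% |
| 6 | California | 57% | 54% |
| 8 | Colorado | 38% | 49% |
| 9 | Connecticut | 63% | 13% |
| 10 | Delaware | 33% | 33% |
| 11 | District of Columbia | 100% | 0% |
| 12 | Florida | 63% | 73% |
| 13 | Georgia | 50% | 67% |
| 15 | Hawaii | 25% | 0% |
| 16 | Idaho | 73% | 61% |
| 17 | Illinois | 30% | 49% |
| 18 | Indiana | 58% | 79% |
| 19 | Iowa | 36% | 48% |
| 20 | Kansas | 44% | 69% |
| 21 | Kentucky | 57% | 70% |
| 22 | Louisiana | 44% | 55% |
| 23 | Maine | 56% | 81% |
| 24 | Maryland | 38% | 50% |
| 25 | Massachusetts | 36% | 50% |
| 26 | Michigan | 75% | 79% |
| 27 | Minnesota | 72% | 87% |
| 28 | Mississippi | 28% | 52% |
| 29 | Missouri | 67% | 59% |
| 30 | Montana | 25% | 60% |
| 31 | Nebraska | 58% | 75% |
| 32 | Nevada | 77% | 67% |
| 33 | New Hampshire | 50% | 50% |
| 34 | New Jersey | 67% | 43% |
| 35 | New Mexico | 42% | 53% |
| 36 | New York | 55% | 50% |
| 37 | North Carolina | 31% | 65% |
| 38 | North Dakota | 62% | 81% |
| 39 | Ohio | 46% | 78% |
| 40 | Oklahoma | 57% | 45% |
| 41 | Oregon | 56% | 65% |
| 42 | Pennsylvania | 45% | 67% |
| 44 | Rhode Island | 40% | 80% |
| 45 | South Carolina | 44% | 83% |
| 46 | South Dakota | 41% | 89% |
| 47 | Tennessee | 70% | 79% |
| 48 | Texas | 62% | 70% |
| 49 | Utah | 83% | 62% |
| 50 | Vermont | 64% | 71% |
| 51 | Virginia | 55% | 59% |
| 53 | Washington | 46% | 46% |
| 54 | West Virginia | 60% | 60% |
| 55 | Wisconsin | 71% | 86% |
| 56 | Wyoming | 52% | 74% |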

Figure 19. Estimated Extent of Nursing Shortages in Counties in the U.S.


B. Model Based on RN to Population Ratios

Assumption: RNs should be evenly distributed across the U.S. population.

Assumption: Need for RNs (as distinct from demand for RNs) is based on population characteristics rather than existing health infrastructure.

Assumption: RN commuting patterns are similar to the commuting patterns of other workers in terms of county inflow and outflow.

Method 2 uses a simple RN-to-population ratio and is based upon the assumption that RNs should be evenly distributed across the U.S. population. Method 2 is a very crude measure because it does not take into account either the age structure of the population at the county level or the health care infrastructure in the county. Like Method 1, it adjusts RN supply based on inter-county commuting patterns.

Example: Albany County, New York

As calculated in Step 4 of Methodology #1, 4,942 RNs were estimated to work in Albany County in 2000. The population of Albany County in 2000 was estimated to be 294,565. Applying the national ratio of 0.0080 RNs per person, we would expect Albany County to need a total of 2,357 RNs (294,565 x 0.0080). The actual supply of RNs was therefore estimated to be 110% more than what the population of the county required.
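The Method 2 calculation for Albany County can be written out as a brief sketch. The population, the RN count, and the national ratio of 0.0080 are taken from the example above; the variable names are introduced here for illustration.

```python
NATIONAL_RNS_PER_PERSON = 0.0080   # national ratio used in the example

albany_population = 294_565        # estimated 2000 population
albany_rns = 4_942                 # RNs estimated to work in the county

# RNs the county would need if RNs were spread evenly over the population.
needed_rns = albany_population * NATIONAL_RNS_PER_PERSON

# Supply relative to need, in percent (positive means oversupply).
oversupply_pct = 100 * (albany_rns - needed_rns) / needed_rns
```

Rounded to whole numbers, this gives 2,357 RNs needed and an oversupply of about 110%, matching the figures in the example.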

Albany County is a good illustration of the shortcomings of this method. Because it is an urban center with many hospitals and other health care facilities, many residents of surrounding counties come to Albany County for care. Even though there are facilities in most of the surrounding counties, Albany Medical Center is a Level I trauma center and a teaching hospital, and both Albany Medical Center and St. Peter’s Hospital (also in Albany) score highly on national rankings of patient care.

C. Models Based on County Clusters

One of the obvious biases when Methods 1 to 4 were compared was that a county whose health care facilities drew many patients from outside the county was shown to have less severe shortages than counties whose residents presumably traveled to other counties for health care. This was a clear problem in any methodology based solely on population. In an attempt to assess the impact of cross-county patient flow, Methods 1 to 4 were recalculated at the level of “county clusters,” where population counts, nurse counts, and demand estimates at the county level were summed for a core county and its contiguous counties. This was an imperfect measure, as contiguous counties will have a patient flow to and from the core county in the cluster, but also to and from their own other contiguous counties. For example, if County A has a contiguous County B to the west, County B’s population is considered part of County A’s county cluster. However, if County B is bordered on the west by County C, which is part of a major metropolitan area, County B’s population may be primarily going to County C for health care with very little flow to County A. Counting the population of County B as part of County A’s county cluster will therefore result in an overestimate of the pool of people who may be using health services in County A.
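The county-cluster aggregation can be illustrated with a small sketch that sums population, RN supply, and estimated RN demand over a core county and its contiguous counties. The data structure, function names, adjacency list, and all figures below are hypothetical; only the idea of summing a core county with its neighbors comes from the text.

```python
from dataclasses import dataclass


@dataclass
class County:
    population: float
    rn_supply: float
    rn_demand: float


def cluster_totals(core, counties, neighbors):
    """Sum the core county and its contiguous counties into one cluster."""
    members = [core] + [n for n in neighbors.get(core, []) if n != core]
    return County(
        population=sum(counties[m].population for m in members),
        rn_supply=sum(counties[m].rn_supply for m in members),
        rn_demand=sum(counties[m].rn_demand for m in members),
    )


def percent_gap(county):
    """Supply relative to demand, in percent (negative means shortage)."""
    return 100 * (county.rn_supply - county.rn_demand) / county.rn_demand


# Hypothetical counties: A is a receiver county, B a rural feeder county,
# and C a metropolitan county on the far side of B.
counties = {
    "A": County(300_000, 5_000, 2_400),
    "B": County(30_000, 100, 240),
    "C": County(500_000, 4_500, 4_000),
}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

cluster_a = cluster_totals("A", counties, neighbors)
cluster_b = cluster_totals("B", counties, neighbors)
```

In this made-up case, County B's population is counted in full in both A's and C's clusters, which is the double-counting limitation described above.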

On the other hand, the use of county clusters was expected to have a smoothing effect across the various types of estimates, which was generally observed. For example, in Albany County, estimates of RN supply ranged from 1% greater than demand to 110% greater than demand. In the Albany county cluster, however, estimates ranged from a 3% shortage to a supply that was 39% more than estimated demand.

Example: Albany and Schoharie Counties in Upstate New York

Figure 20 below summarizes how the various measures of shortage differ for a feeder county and a receiver county that are contiguous to one another. Schoharie County was a rural county adjacent to Albany County. None of its other contiguous counties hosted major medical centers comparable to those in Albany County, so persons in Schoharie County were more likely to go to Albany County than to any other contiguous county for care. In the first four measures of shortage, at the individual county level, Albany County was seen as having a surplus while Schoharie County was seen as having a shortage. When county clusters were used, however, estimates for the two neighboring counties were much more similar.

Figure 20. Comparison of Selected Measures of Nursing Shortage in Adjacent Counties


D. Models Based on Adjusting for Cross-County Patient Flow

Another method to adjust for cross-county patient flow more precisely than using county clusters was to adjust population figures based upon commuting flow. In one respect it made sense that the distances and directions in which it was convenient for people to travel to work would also be convenient for them to travel for health care, and that counties with more job opportunities relative to their neighbors would also have more health care facilities. On the other hand, the nature of health care need dictates that some areas may have health facilities but few other major employers.

Furthermore, there were sometimes additional inducements to commute out for work that did not exist in commuting for health care. For example, in Monroe County, Pennsylvania, 7% of workers who lived in the county commuted to one of the five counties of New York City for work (a distance of approximately 80 miles that cannot be traveled without crossing through at least three other counties) due to the great differences in salaries (favoring working in New York City) and the great differences in cost of living (favoring living in Pennsylvania). Yet Monroe County has a medical center, and is contiguous to several other counties with major medical centers (including some with trauma centers) that are not nearly as far as New York City. Therefore, it was doubtful that 7% of the population of Monroe County traveled to New York City for health care, despite commuting patterns for work. Areas with such extreme commuting patterns to counties that were not contiguous were certainly the exception rather than the rule, but may be more common than believed, especially near major metropolitan areas with very high costs of living.

Adjustments for patient flow were similar to those made to RN supply. This produced the same RN-to-population ratio as using the unadjusted RN numbers and unadjusted population together, but produced different raw estimates.

Using this methodology, we found that Albany County, New York was estimated to need 3,634 RNs to treat its own population and incoming patients from other areas, while 4,942 RNs were estimated to work there. This estimated a supply that exceeded demand by 36%, which was a more moderate oversupply estimate than most others using Albany County as a single county, and somewhat comparable to those using the county cluster.

Schoharie County, using this methodology, was estimated to need 191 RNs, and had 197 (a shortage of 3%). This also appeared to be a moderate number compared to other estimates.

Monroe County, Pennsylvania was found to require 922 RNs, and had an estimated 834 working (a shortage of 9.5%). It was not surprising that this was lower than the other shortage estimates based on population ratios (25% and 23%), but it was surprising that it was so close to estimates based on actual health care use (11%).

New York County was found to need 34,126 RNs while there were an estimated 22,711 working there. This shortage (33%) was also very close to that based on actual health care use (29%).

Except for Albany County, adjustments of both population and RN supply based on commuting patterns to produce a ratio seemed to offer close approximations of estimates based on actual health care use in each of the test counties, including New York County and Monroe County, Pennsylvania, both of which experienced unusual levels of commuter flow.
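The percentages quoted above are consistent with expressing the gap between working and needed RNs as a share of estimated need. The short Python sketch below reproduces them from the reported counts; the formula is inferred from the figures in the text rather than stated by the study, and the function name is hypothetical.

```python
def supply_gap_pct(rns_needed, rns_working):
    """Gap between working and needed RNs as a percent of need.

    Positive values indicate a surplus, negative values a shortage.
    """
    return 100.0 * (rns_working - rns_needed) / rns_needed


# (RNs needed, RNs working) as reported in the text for three test counties.
ESTIMATES = {
    "Albany County, NY": (3_634, 4_942),
    "Monroe County, PA": (922, 834),
    "New York County, NY": (34_126, 22_711),
}

for county, (needed, working) in ESTIMATES.items():
    gap = supply_gap_pct(needed, working)
    label = "surplus" if gap > 0 else "shortage"
    print(f"{county}: {abs(gap):.1f}% {label}")
```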

E. Models Based on Sub-County Analyses

As the work on the county-level analyses described above progressed, concerns arose that counties were too large to study and understand the nursing needs of communities in the largest metropolitan areas, where very disadvantaged communities may exist in close proximity to very advantaged communities. Disadvantaged communities in urban areas may have a more difficult time recruiting RNs for two reasons:

  1. RNs may be reluctant to work in communities where there is a perceived fear of crime or a large population with which they do not feel culturally competent, and
  2. a large percentage of the services offered in disadvantaged urban communities are provided by publicly operated facilities, which may not be able to offer salaries and benefits competitive with nearby non-public facilities that tend to serve more advantaged communities.

For this reason, some sub-county analyses were performed at the Census tract level using New York County (Manhattan) as a test case. These analyses were largely exploratory in nature, to try to determine what data might be available and what methods might be appropriate for sub-county analyses in the largest metropolitan areas across the U.S.

Census tract-level analyses posed many challenges. Demographic and economic population data were available at the Census tract level, and some RN supply data were available from the 2000 Census as well. Utilization data, however, were not available, nor were data on commuting patterns between Census tracts. There was also a question of the accuracy of RN supply estimates from sample data at such a small level of analysis. Ultimately, utilization rates were imputed based on the demographic characteristics of the tract population, the utilization data for the county, and the distribution of the county population between Census tracts.
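The report does not spell out the imputation procedure. One plausible approach, sketched here purely as an assumption and not as the study's actual method, is to apply county-level utilization rates by age group to each tract's age distribution. The age groups, rates, and tract populations below are all hypothetical.

```python
# Hypothetical county-level utilization rates (inpatient days per person per year).
COUNTY_RATES = {"under_5": 0.3, "5_to_64": 0.4, "65_plus": 2.0}

# Hypothetical tract populations by age group (not study data).
TRACTS = {
    "tract_1": {"under_5": 500, "5_to_64": 3_000, "65_plus": 500},
    "tract_2": {"under_5": 200, "5_to_64": 3_500, "65_plus": 1_300},
}


def imputed_utilization(tract_population, county_rates):
    """Estimate tract utilization by applying county age-specific rates to tract counts."""
    return sum(county_rates[group] * count for group, count in tract_population.items())


for tract, population in TRACTS.items():
    days = imputed_utilization(population, COUNTY_RATES)
    print(f"{tract}: about {days:.0f} imputed inpatient days")
```

A tract with an older population receives a higher imputed utilization than a tract of similar size with a younger population, which is the only behavior this sketch is meant to show.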

It became very clear, however, that RNs in the population were not an adequate measure of available supply at the Census tract level. Some of the poorer neighborhoods had relatively high numbers of RNs per capita, but there was no basis for estimating how many of them worked in the neighborhoods where they lived. Similarly, many wealthy neighborhoods had relatively few RNs per capita (who presumably would not be able to afford to live in the most expensive neighborhoods of Manhattan), but there was no basis for assuming that the residents of these neighborhoods necessarily had difficulty obtaining nursing care. Estimates of service use in the population were also deemed suspect because it was impossible to estimate how many residents obtained health care within their own Census tract.

Subsequent reflection on the nature of labor markets and discussions with providers in New York County led study staff to believe that RN supply was not necessarily a correlate of difficulty recruiting at the local level. In large metropolitan areas, the pool of available labor tended to be geographically very broad, as illustrated by the fact that 70% of workers employed in New York County did not reside in New York County and that 16% of the employed residents of New York County did not work in New York County. It would thus appear that the supply of RNs within the Census tract where a health care facility was located was of limited relevance to the overall supply of RNs from which that facility may draw. Factors such as the ability to offer competitive compensation packages and the perceived environment of the neighborhood were likely to be much greater predictors of difficulty recruiting RNs in large metropolitan counties.

It may be best to establish guidelines specific to facilities in the largest metropolitan counties that would address the specialized problems of high-needs facilities. Possibilities include giving automatic eligibility to facilities in HPSA-designated areas or to those meeting certain criteria (e.g., public facilities), regardless of the eligibility of the overall county. To implement such a policy, it would be necessary to define a threshold for counties in which these automatic qualifications would apply (perhaps counties with populations of more than 1 million).

F. Factor Analysis of Nursing Shortage Indicators

The purpose of the factor analysis was to construct a smaller number of underlying common factors that could explain a large number of observed variables. This analysis was performed mainly due to the lack of a single independent variable that could be used to measure nursing shortage and that was comparable for all U.S. counties. The data used in this analysis came from the ARF 2005 release. In this analysis we chose three factors to describe the characteristics of counties in the U.S. based on 20 observed variables. The three factors explained 50.3% of the total variation in the observed variables.

The observed variables and the corresponding standardized scoring coefficients for each factor are presented in Table 36. Shaded numbers were the highest coefficients for the corresponding variables and revealed which variables were the primary bases for each factor. Note that 21.54% of U.S. counties were excluded from the analysis, mainly because of missing values or because the county had no hospital. Counties without hospitals were excluded because hospitals were an important factor in analyzing nursing shortages: most RNs were employed in hospital settings, which implied that hospitals drove the market for RNs. The counties without a hospital could be analyzed separately, but this had not been done at the time of this writing.

Table 36. Standardized Scoring Coefficients

| Variable | Factor 1 | Factor 2 | Factor 3 |
|---|---|---|---|
| Metropolitan dummy | -0.025 | -0.044 | 0.188 |
| RNs per 1,000 individuals | 0.003 | 0.256 | 0.012 |
| RNs per 1,000 individuals < 5 years | -0.007 | 0.259 | 0.002 |
| RNs per 1,000 individuals >=65 years | 0.005 | 0.109 | 0.122 |
| RNs per hospital bed | 0.213 | -0.052 | 0.048 |
| RNs per MD | 0.136 | 0.096 | -0.132 |
| RNs per 1,000 civilian labor force | 0.020 | 0.274 | -0.045 |
| RNs per 1,000 inpatient days | 0.272 | -0.059 | -0.058 |
| RNs per 1,000 outpatient visits | 0.158 | 0.007 | -0.016 |
| RNs per 1,000 emergency room visits | 0.134 | 0.066 | 0.016 |
| Infant mortality rate | 0.028 | 0.019 | -0.140 |
| RNs per 100 Medicare inpatient days | 0.278 | -0.053 | -0.038 |
| RNs per 100 Medicaid inpatient days | 0.220 | -0.018 | -0.069 |
| Median household income ($10,000) | -0.027 | -0.091 | 0.310 |
| Percent persons in poverty | 0.037 | 0.052 | -0.297 |
| Unemployment rate | 0.064 | -0.037 | -0.151 |
| Percentage of manufacturing workers | 0.057 | -0.102 | 0.036 |
| Percentage of health service workers | -0.041 | 0.232 | -0.168 |
| Percentage of Blacks and Hispanics | 0.010 | -0.053 | -0.098 |
| Percentage of AIAN | 0.020 | 0.061 | -0.119 |

Note: The three factors can explain 50.3 percent of total variation of all variables
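Standardized scoring coefficients are conventionally applied to standardized (z-scored) variables, so that a county's score on each factor is the coefficient-weighted sum of its z-scores. The Python sketch below illustrates that calculation with three of the 20 coefficient rows from Table 36. The county names and raw values are hypothetical, the use of the population standard deviation is an assumption, and a real score would sum over all 20 variables.

```python
from statistics import mean, pstdev

# Scoring coefficients (Factor 1, Factor 2, Factor 3) for three of the 20
# variables, copied from Table 36. A real score would sum over all 20.
COEFFICIENTS = {
    "rns_per_1000_pop": (0.003, 0.256, 0.012),
    "rns_per_hospital_bed": (0.213, -0.052, 0.048),
    "median_income_10k": (-0.027, -0.091, 0.310),
}

# Hypothetical raw values for three made-up counties (not study data).
COUNTIES = {
    "County A": {"rns_per_1000_pop": 10.0, "rns_per_hospital_bed": 1.5, "median_income_10k": 5.0},
    "County B": {"rns_per_1000_pop": 8.0, "rns_per_hospital_bed": 1.0, "median_income_10k": 4.0},
    "County C": {"rns_per_1000_pop": 6.0, "rns_per_hospital_bed": 0.5, "median_income_10k": 3.0},
}


def standardize(values):
    """Convert raw values to z-scores (population standard deviation assumed)."""
    center, spread = mean(values), pstdev(values)
    return [(value - center) / spread for value in values]


def factor_scores(counties, coefficients):
    """Return {county: (F1, F2, F3)} as coefficient-weighted sums of z-scores."""
    names = list(counties)
    scores = {name: [0.0, 0.0, 0.0] for name in names}
    for variable, weights in coefficients.items():
        z_values = standardize([counties[name][variable] for name in names])
        for name, z in zip(names, z_values):
            for k, weight in enumerate(weights):
                scores[name][k] += weight * z
    return {name: tuple(values) for name, values in scores.items()}


for county, (f1, f2, f3) in factor_scores(COUNTIES, COEFFICIENTS).items():
    print(f"{county}: F1={f1:+.3f} F2={f2:+.3f} F3={f3:+.3f}")
```

Because the scores are built from z-scores, a county at the average on every variable scores zero on every factor, and counties above or below average move away from zero in the direction given by the coefficient signs.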

The standardized scoring coefficients suggested that Factor 1 consisted of high positive loadings on RNs per hospital bed, RNs per MD, RNs per 1,000 inpatient days, RNs per 1,000 outpatient visits, RNs per 1,000 emergency room visits, RNs per 100 Medicare inpatient days, and RNs per 100 Medicaid inpatient days. These loadings indicated that Factor 1 represented the ratio of RNs to health care utilization, especially in hospitals. A high value for Factor 1 indicated that a county had a high number of RNs relative to health care utilization compared to other counties. On the other hand, a low value for Factor 1 indicated that a county faced a nursing shortage problem, especially a shortage related to health care utilization in hospitals. Note that a county might score high on Factor 1 simply because it has low health care utilization due to underdeveloped health care infrastructure. Conversely, a county might score low on Factor 1 simply because it has a high number of health care facilities that attract many people from other counties for health care services. One must therefore be cautious when interpreting Factor 1; in particular, it should be interpreted in the context of the other two factors.

Factor 2 consisted of high positive loadings on RNs per 1,000 population, RNs per 1,000 individuals younger than 5 years, RNs per 1,000 individuals age 65 years or older, RNs per 1,000 civilian labor force, and the percentage of health service workers; and a high negative loading on the percentage of manufacturing workers. These patterns suggested that Factor 2 represented the ratio of RNs to age-adjusted population. In addition, this factor also represented the supply of RNs: the lower the percentage of manufacturing workers in a county, the more likely an individual was to enter the health care industry, including the nursing profession. A county with a high value for Factor 2 would generally have more RNs per capita than other counties. This factor was clearer in describing the nursing shortage than was Factor 1.

Factor 3 consisted of high positive loadings on the metropolitan dummy variable, RNs per 1,000 individuals age 65 years and older, and median household income (x $10,000); and high negative loadings on RNs per MD, infant mortality rate, unemployment rate, the percentage of individuals in poverty, the percentage of Black and Hispanic populations, and the percentage of American Indian and Alaska Native population. These patterns suggested that Factor 3 represented the economic condition of a county, including the percentage of minority populations. The percentage of minority population and quality of health were highly correlated with economic condition. A high value for Factor 3 indicated that a county was in a metropolitan area with good economic conditions and a lower percentage of minority populations compared to other counties.

The three factors above can be combined to describe the nursing shortage condition of each county in the U.S. To illustrate how this might work, suppose we divide each of the factors into two categories based on its median: lower than median (coded 1) and higher than median (coded 2). (The threshold is arbitrarily chosen and could be replaced with other values, e.g., using the first quartile or other statistics.) Based on the three factors, each divided into two categories, all counties in the U.S. can be grouped into eight categories. Nursing shortage was very commonly measured using the ratio of RNs to population (or age-adjusted population). As described before, Factor 2 represented the ratio of RNs to population. Based on this common criterion, Factor 2 was considered to be the most obvious factor in characterizing the nursing shortage condition. The categories were therefore constructed from the combinations of Factor 2, Factor 1, and Factor 3, in that order, which resulted in 8 categories: “111,” “112,” “121,” “122,” “211,” “212,” “221,” and “222”. The interpretations of these categories are described as follows.

  • Category 111. Counties in this category had low values of the three factors. Intuitively, they were counties with a low number of RNs relative to population, a low number of RNs relative to health care utilization, and low economic conditions, including a high proportion of minority populations and low quality of health. In general, counties in this category had a nursing shortage problem and low economic conditions, so they needed government support to increase the number of RNs.

  • Category 112. Counties in this category had a low number of RNs relative to population, low number of RNs relative to health care utilization, and good economic conditions. Also, they were counties with a relatively affluent population. In addition, the health care industry in these counties was less attractive compared to other industries, suggesting that not many people in these counties were interested in entering the nursing profession.

  • Category 121. Counties in this category had a low number of RNs relative to population, high number of RNs relative to health care utilization, and low economic conditions. The high number of RNs relative to health care utilization may have been due to limited health care infrastructure (e.g., one hospital). People from these counties may have gone to other counties for health care services, so the amount of health care utilization within these counties was low. Consequently, the ratio of RNs to health care utilization was high. Therefore, the high value of Factor 1 was not necessarily because of a high number of RNs but probably because of the limited health care infrastructure.

  • Category 122. Counties in this category had a low number of RNs relative to population, high number of RNs relative to health care utilization, and good economic conditions. The high number of RNs relative to health care utilization may have been due to limited health care infrastructure, which led people in these counties to go to other counties for health care services. These counties were similar to those in category 121, except for the economic conditions.

  • Category 211. Counties in this category had a high number of RNs relative to population, low number of RNs relative to health care utilization, and low economic conditions. One possible reason for the low number of RNs relative to health care utilization was a highly developed health care infrastructure, which drew people from other counties to these counties for health care services.

  • Category 212. Counties in this category had a high number of RNs relative to population, low number of RNs relative to health care utilization, and good economic conditions. Counties in this category were similar to counties in category 211, except for the economic condition. They may not have had a nursing shortage problem, because people from other counties came to these counties for health care, which produced a low ratio of RNs to health care utilization. In addition, the counties in this category did not have economic problems.

  • Category 221. Counties in this category had a high number of RNs relative to population, high number of RNs relative to health care utilization, and low economic conditions. These counties did not have nursing shortage problems, but had economic problems which included a high proportion of minority populations and a low quality of health.

  • Category 222. Counties in this category had a high number of RNs relative to population, high number of RNs relative to health care utilization, and good economic conditions. These counties did not have nursing shortage problems and were without economic problems.
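The median-split categorization described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the factor scores are made-up values, the function name is hypothetical, and scores exactly equal to the median are coded 1, a tie-breaking choice the report does not specify. Digits are ordered Factor 2, Factor 1, Factor 3, matching the category descriptions.

```python
from statistics import median


def assign_categories(scores):
    """Map {county: (f1, f2, f3)} to category codes with digits ordered F2, F1, F3.

    Each digit is "2" if the county's score is above that factor's median
    and "1" otherwise (ties at the median are coded "1" by assumption).
    """
    medians = [median(values[k] for values in scores.values()) for k in range(3)]

    def digit(value, cutoff):
        return "2" if value > cutoff else "1"

    return {
        county: digit(f2, medians[1]) + digit(f1, medians[0]) + digit(f3, medians[2])
        for county, (f1, f2, f3) in scores.items()
    }


# Hypothetical factor scores (F1, F2, F3) for four made-up counties.
EXAMPLE_SCORES = {
    "County A": (-1.0, -1.0, -1.0),
    "County B": (1.0, 1.0, 1.0),
    "County C": (1.0, -1.0, -1.0),
    "County D": (-1.0, 1.0, 1.0),
}

for county, category in assign_categories(EXAMPLE_SCORES).items():
    print(county, category)
```

Here County C, with a high Factor 1 score but low Factor 2 and Factor 3 scores, falls in category 121: few RNs relative to population, many RNs relative to utilization, and low economic conditions.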

Table 37 presents the distribution of counties by category for each Census division. Among the nine divisions, West South Central had the highest percentage of counties in category “111” (22% of the counties in the division), followed by Mountain (18%), South Atlantic (11%), and East South Central (10%). On the other hand, New England had the highest percentage of counties in category “222” (39% of the counties in the division); those counties did not have a nursing shortage problem and had good economic conditions. It was followed by Middle Atlantic (19%), East North Central (17%), and West North Central (13%).

Table 37. Distribution of Counties by Categories for each Census Division

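| Census Division / Category(a) | Missing | 111 | 112 | 121 | 122 | 211 | 212 | 221 | 222 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| East North Central | 70 | 14 | 40 | 23 | 50 | 20 | 111 | 34 | 75 | 437 |
| | 16.0% | 3.2% | 9.2% | 5.3% | 11.4% | 4.6% | 25.4% | 7.8% | 17.2% | 100% |
| East South Central | 80 | 38 | 22 | 32 | 20 | 71 | 35 | 46 | 20 | 364 |
| | 22.0% | 10.4% | 6.0% | 8.8% | 5.5% | 19.5% | 9.6% | 12.6% | 5.5% | 100% |
| Middle Atlantic | 13 | 3 | 13 | 26 | 48 | 2 | 8 | 9 | 28 | 150 |
| | 8.7% | 2.0% | 8.7% | 17.3% | 32.0% | 1.3% | 5.3% | 6.0% | 18.7% | 100% |
| Mountain | 67 | 49 | 55 | 29 | 18 | 13 | 22 | 22 | 5 | 280 |
| | 23.9% | 17.5% | 19.6% | 10.4% | 6.4% | 4.6% | 7.9% | 7.9% | 1.8% | 100% |
| New England | 4 | 1 | 2 | 2 | 25 | 1 | 2 | 4 | 26 | 67 |
| | 6.0% | 1.5% | 3.0% | 3.0% | 37.3% | 1.5% | 3.0% | 6.0% | 38.8% | 100% |
| Pacific | 25 | 20 | 30 | 10 | 12 | 18 | 29 | 15 | 5 | 164 |
| | 15.2% | 12.2% | 18.3% | 6.1% | 7.3% | 11.0% | 17.7% | 9.2% | 3.0% | 100% |
| South Atlantic | 167 | 66 | 58 | 55 | 41 | 66 | 56 | 44 | 36 | 589 |
| | 28.4% | 11.2% | 9.8% | 9.3% | 7.0% | 11.2% | 9.5% | 7.5% | 6.1% | 100% |
| West North Central | 147 | 34 | 50 | 62 | 89 | 21 | 35 | 97 | 83 | 618 |
| | 23.8% | 5.5% | 8.1% | 10.0% | 14.4% | 3.4% | 5.7% | 15.7% | 13.4% | 100% |
| West South Central | 103 | 102 | 22 | 52 | 18 | 72 | 30 | 58 | 12 | 469 |
| | 22.0% | 21.8% | 4.7% | 11.1% | 3.8% | 15.4% | 6.4% | 12.4% | 2.6% | 100% |
| Total | 676 | 327 | 292 | 291 | 321 | 284 | 328 | 329 | 290 | 3138 |
| | 21.5% | 10.4% | 9.3% | 9.3% | 10.2% | 9.0% | 10.4% | 10.5% | 9.2% | 100% |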

Note: (a) An example of how to interpret the category: 121 means F2<median, F1>median, F3<median

Table 38 presents the distribution of counties by category for each rural/urban code. More than 50% of counties in the completely rural areas (Codes 8 and 9) had missing values or did not have a hospital. Apart from these two codes, the higher the code (the more rural the county), the higher the percentage of counties in category “111.” Less than 5% of counties in metro areas were categorized as “111.” In contrast, more than 14% of counties in the non-metro areas were categorized as “111.” On the other hand, the percentage of counties categorized as “222” decreased as the code increased (the more rural the county). Twenty-two percent of counties in metro areas of 1 million population or more (Code 1) were categorized as “222.” In contrast, only 2.5% of counties in completely rural areas were categorized as “222.”

Table 38. Distribution of Counties by Categories for each Rural/Urban Code

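| Rural/Urban Codes(b) / Category | Missing | 111 | 112 | 121 | 122 | 211 | 212 | 221 | 222 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 72 | 17 | 72 | 3 | 47 | 8 | 100 | 2 | 92 | 413 |
| | 17.4% | 4.1% | 17.4% | 0.7% | 11.4% | 1.9% | 24.2% | 0.5% | 22.3% | 100% |
| 2 | 63 | 10 | 37 | 11 | 72 | 14 | 52 | 10 | 56 | 325 |
| | 19.4% | 3.1% | 11.4% | 3.4% | 22.2% | 4.3% | 16.0% | 3.1% | 17.2% | 100% |
| 3 | 82 | 13 | 46 | 33 | 85 | 14 | 30 | 13 | 35 | 351 |
| | 23.4% | 3.7% | 13.1% | 9.4% | 24.2% | 4.0% | 8.6% | 3.7% | 10.0% | 100% |
| 4 | 3 | 32 | 18 | 38 | 15 | 28 | 41 | 23 | 20 | 218 |
| | 1.4% | 14.7% | 8.3% | 17.4% | 6.9% | 12.8% | 18.8% | 10.6% | 9.2% | 100% |
| 5 | 1 | 17 | 12 | 26 | 18 | 8 | 10 | 7 | 6 | 105 |
| | 1.0% | 16.2% | 11.4% | 24.8% | 17.1% | 7.6% | 9.5% | 6.7% | 5.7% | 100% |
| 6 | 73 | 101 | 30 | 56 | 20 | 123 | 68 | 94 | 43 | 608 |
| | 12.0% | 16.6% | 4.9% | 9.2% | 3.3% | 20.2% | 11.2% | 15.5% | 7.1% | 100% |
| 7 | 39 | 84 | 52 | 71 | 36 | 48 | 21 | 78 | 21 | 450 |
| | 8.7% | 18.7% | 11.6% | 15.8% | 8.0% | 10.7% | 4.7% | 17.3% | 4.7% | 100% |
| 8 | 124 | 18 | 7 | 13 | 5 | 23 | 4 | 35 | 6 | 235 |
| | 52.8% | 7.7% | 3.0% | 5.5% | 2.1% | 9.8% | 1.7% | 14.9% | 2.6% | 100% |
| 9 | 219 | 35 | 18 | 40 | 23 | 18 | 2 | 67 | 11 | 433 |
| | 50.6% | 8.1% | 4.2% | 9.2% | 5.3% | 4.2% | 0.5% | 15.5% | 2.5% | 100% |
| Total | 676 | 327 | 292 | 291 | 321 | 284 | 328 | 329 | 290 | 3138 |
| | 21.5% | 10.4% | 9.3% | 9.3% | 10.2% | 9.0% | 10.5% | 10.5% | 9.2% | 100% |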

Notes: (b)

  1. Counties of metro areas of 1 million population or more
  2. Counties in metro areas of 250,000 ‑ 1,000,000 population
  3. Counties in metro areas of fewer than 250,000 population
  4. Urban population of 20,000 or more, adjacent to a metro area
  5. Urban population of 20,000 or more, not adjacent to a metro area
  6. Urban population of 2,500‑19,999, adjacent to a metro area
  7. Urban population of 2,500‑19,999, not adjacent to a metro area
  8. Completely rural or less than 2,500 urban population, adjacent to a metro area
  9. Completely rural or less than 2,500 urban population, not adjacent to a metro area   

Table 39 presents the distribution of counties by the categories for each HPSA designation code. Almost 50% of whole-HPSA counties had missing values or did not have a hospital. The percentage of counties categorized as “111” was almost equal in non-HPSA counties and whole-HPSA counties, at 9% each. On the other hand, 14% of non-HPSA counties were categorized as “222,” in contrast to 3% of whole-HPSA counties, and 10% of partial-HPSA counties.

Table 39. Distribution of Counties by Categories for each HPSA Code (Primary Care)

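| HPSA / Category | Missing | 111 | 112 | 121 | 122 | 211 | 212 | 221 | 222 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| None | 105 | 74 | 78 | 84 | 93 | 44 | 129 | 85 | 110 | 802 |
| | 13.1% | 9.2% | 9.7% | 10.5% | 11.6% | 5.5% | 16.1% | 10.6% | 13.7% | 100% |
| Whole County | 392 | 74 | 31 | 43 | 4 | 101 | 37 | 100 | 23 | 805 |
| | 48.7% | 9.2% | 3.8% | 5.3% | 0.5% | 12.6% | 4.6% | 12.4% | 2.9% | 100% |
| Part County | 179 | 179 | 183 | 164 | 224 | 139 | 162 | 144 | 157 | 1531 |
| | 11.7% | 11.7% | 12.0% | 10.7% | 14.6% | 9.1% | 10.6% | 9.4% | 10.2% | 100% |
| Total | 676 | 327 | 292 | 291 | 321 | 284 | 328 | 329 | 290 | 3138 |
| | 21.5% | 10.4% | 9.3% | 9.3% | 10.2% | 9.0% | 10.4% | 10.5% | 9.2% | 100% |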

 


[1] Taken from Health, United States, 2005
[2] Data were not published for 2000.