[This Transcript is Unedited]

DEPARTMENT OF HEALTH AND HUMAN SERVICES

NATIONAL COMMITTEE ON VITAL AND HEALTH STATISTICS

SUBCOMMITTEE ON POPULATIONS

February 12, 2002

Hubert H. Humphrey Building, Room 800
200 Independence Avenue, S.W.
Washington, D.C.

Proceedings by:
CASET Associates, Ltd.
10201 Lee Highway
Fairfax, Virginia 22030
(703)352-0091


P R O C E E D I N G S [8:35 a.m.]

Agenda Item: Call to Order and Introductions

DR. MAYS: Good morning. We are going to have a change in the agenda.

Dr. Tom Smith and Dr. Kington will be exchanging their times. So, Dr. Kington will actually be talking to us from 11:05 to 11:35 on a policy overview discussion, and Dr. Smith will then assume Raynard's time, which was 11:50 to 12:20.

So, other than that, we will follow our schedule and, again, for those of you who were here yesterday -- part of what we talked about as an introduction to the hearing is that our speakers have been given a set of questions, and those questions really form the nucleus of the committee's examination around the issue of health disparities and the measurement of health disparities in racial and ethnic groups.

We will continue in that vein today.

Let us call the meeting to order by starting with introductions. I am Vickie Mays. I am the chair of the Population Subcommittee. I am at UCLA.

MS. QUEEN: I am Susan Queen from the Health Resources and Services Administration, lead staff to the subcommittee.

MS. GREENBERG: I am Marjorie Greenberg from the National Center for Health Statistics, CDC and executive secretary to the committee.

DR. NEWACHECK: Paul Newacheck from the University of California and a member of the committee.

MR. HANDLER: Aaron Handler, chief of the Demographics Statistics Branch, Indian Health Service and I am a staff member to the subcommittee.

DR. CARTER-POKRAS: Olivia Carter-Pokras, director of the Division of Policy and Data in the HHS Office of Minority Health and I am also staff to the subcommittee.

MS. BREEN: Nancy Breen. I am an economist in the Division of Cancer Control and Population Sciences at the National Cancer Institute and I am here for Brenda Edwards, who is a member of the committee.

MR. HUMMER: I am Bob Hummer from the Population Research Center at the University of Texas at Austin.

MS. LUCAS: I am Jacqueline Lucas. I am a health statistician with the National Health Interview Survey, NCHS.

MS. COLTIN: I am Kathryn Coltin. I am with Harvard Pilgrim Health Care in Boston and I am a member of the subcommittee.

MR. HITCHCOCK: I am Dale Hitchcock from the Office of the Assistant Secretary for Planning and Evaluation.

[Further introductions.]

DR. MAYS: Great. Let us begin with our speakers, both of whom have introduced themselves. Ms. Lucas will start.

Agenda Item: National Health Interview Survey

MS. LUCAS: Thanks, Vickie.

This morning, my talk is entitled "Measuring Racial and Ethnic Disparities in Health, Using Data from the National Health Interview Survey." I have two basic objectives this morning.

First, I will give an overview of the NHIS, including examples of HIS data and a summary of issues surrounding the assessment of racial and ethnic disparities, using HIS data. Then I will provide some background on the measurement of race and ethnicity under the new OMB standards, illustrate how the new standards operate with examples of HIS data, provide a summary of issues surrounding the collection of racial and ethnic data under the new standards and briefly discuss how these affect the assessment of racial and ethnic disparities.

Finally, I will briefly discuss some of the work the HIS plans to implement to move towards the improvement of racial and ethnic data in the survey.

As many of you know, the National Health Interview Survey is one of the largest health surveys in the United States and is conducted annually by the Census Bureau for the National Center for Health Statistics. The multi-stage probability sample is drawn to be nationally representative of the non-institutionalized civilian population, meaning that no active duty military personnel or persons in prisons or other institutions are included in the survey.

The sample has a stratified cluster design and includes approximately 40,000 households containing about 100,000 persons each year. The HIS sample is redesigned every ten years following the decennial census.

We are currently utilizing the 1995 through 2004 sample design and will implement the next sample design beginning in 2005. The most recent sample redesign included an over sample of both black and Hispanic households in high density areas. The previous sample design from 1985 through 1994 included an over sample of black households only.

The interview is conducted face to face in the home via computer-assisted personal interview, also known as CAPI, with as many household members as are present and available at the time of the interview. A full Spanish translation of the interview accessible via a toggle key by the field representative on a question by question basis has been available in HIS since 1998.

The questionnaire was completely redesigned for the fielding of the 1997 instrument. As part of the NHIS instrument redesign, the survey was divided into two major sections; a basic module administered to all family members, plus basic modules for one randomly selected sample adult and child in the household and topical modules, which would vary from year to year.

The HIS basic module collects data on the topics listed here, such as activity limitations, injuries, health behavior, income and assets and family composition. The topical modules under redesigned HIS are somewhat analogous to the old HIS supplements that many of you may be familiar with.

The topical modules provide the survey with the flexibility of adding new public health topics as the need for data arises. For example, the 1998 prevention module provided data for assessing the Healthy People 2000 Objectives and the 2001 prevention module is being used to set baseline measures for the 2010 health objectives for the nation.

The HIS also contains a number of non-health measures that are of great interest to analysts using the data, including household composition, which includes detailed information on relationships of household members to the reference person and to one another, demographic information, including Hispanic origin and single and multiple race data, measures of socioeconomic status, including sources of income, education and occupation, geography, proxy measures of acculturation, including length of time living in the United States, nativity and contextual data at the census tract and block group levels.

Now I would like to share with you some examples of some bivariate analyses of HIS data that can be used to assess racial and ethnic disparities in health. This slide shows two asthma measures for children 17 and under from the 1997 HIS. Overall, asthma prevalence each year on the left -- and I am sorry for the colors in this -- and on the right here has an asthma attack in the past 12 months.

The data illustrate that non-Hispanic black children, which is the group here in white, were more likely than children of other racial and ethnic groups to ever have had an asthma attack and to have had one in the past 12 months.

This next slide shows selected health care risk factors for children age 17 and under, such as being uninsured, which is this group here; having unmet medical needs, which is defined as needing health care, but being unable to get it because of the cost, which is this group here; delaying care because of the cost; having no usual source of care; or having two or more emergency room visits in the past 12 months.

The data show that Hispanic children were significantly more likely than their counterparts -- and that is the group in the bright green here in each of these groupings -- to be uninsured, have unmet medical needs and to have delayed care because of the cost and also to have no usual source of care.

Now, typically, one year of HIS data can be used to assess health characteristics for only the largest racial and ethnic groups, as has been shown on the previous two slides because of sample size limitations. This slide illustrates how data from more than one survey year can be combined to increase sample sizes and create estimates for smaller population groups.
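
[Illustrative sketch, not part of the testimony: one common way to combine survey years is to pool the records and divide each annual weight by the number of years pooled, so the weighted total stays close to one year's population. The records, weights and function name below are hypothetical, and the sketch ignores the design-based variance estimation that a real analysis would need.]

```python
def pooled_weighted_proportion(records, n_years):
    """Estimate a proportion from records pooled across survey years.

    records: iterable of (annual_weight, has_outcome) pairs.
    n_years: number of survey years pooled together.
    """
    # Scale each annual weight by the number of pooled years so the
    # weighted total approximates a single year's population.
    total = sum(weight / n_years for weight, _ in records)
    with_outcome = sum(weight / n_years for weight, flag in records if flag)
    return with_outcome / total, total


# Made-up records for a small population group in two survey years.
year_1997 = [(1000.0, True), (1500.0, False)]
year_1998 = [(1200.0, False), (1300.0, True)]

proportion, population = pooled_weighted_proportion(year_1997 + year_1998, n_years=2)
```

[Pooling doubles the number of respondents behind the estimate while the scaled weights keep the population total at the level of a single year.]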

This is also quite difficult to see, but data here are shown for respondent assessed health status for Asian and Pacific Islander and non-Hispanic white population groups. We have Chinese, Filipino, Asian Indian, Japanese, Vietnamese, Korean and other Asian Pacific Islanders and non-Hispanic whites.

This slide shows that the Japanese respondents here in yellow were more likely to report their health as excellent than other API groups, while Vietnamese respondents down here on the end in royal blue were more likely to rate their health as fair or poor than other API groups. This slide shows another example of aggregating data years to assess health status, this time in Hispanic population subgroups.

Data are shown here for the current smoking status of adults 18 and older by sex, with four Hispanic population subgroups; that is, Puerto Rican, Cuban, Mexican or Mexican American and other Hispanics and non-Hispanic blacks and non-Hispanic white population groups. Among the Hispanic subgroups, Mexican or Mexican American men were more likely to be current smokers than any other Hispanic group. That is among these four groups right here.

But the proportion of non-Hispanic blacks and non-Hispanic white men, who were current smokers, exceeded that of any Hispanic subpopulation group. Among women, Puerto Rican women over here, were more likely to be current smokers than other Latino women, but were less likely to be current smokers than either non-Hispanic blacks or non-Hispanic white women.

Finally, this slide shows an example of using contextual data to examine characteristics of the survey population. This slide shows the proportions of U.S. and foreign born black and white persons living in areas with varying concentrations of the black population: the lowest concentration, zero to 24 percent, then 25 to 49 percent, 50 to 74 percent, and 75 percent to 99 percent black in the census tract. It shows that both U.S. and foreign born whites are far more likely to live in areas with the lowest concentration of black persons at this end here.

On the other hand, U.S. and foreign born blacks are most likely to live in areas with either a high concentration of blacks, that is here, or a low concentration of blacks and that is here, with some differentiation.

Foreign born blacks are more likely to live in areas with the lowest concentration of black persons than U.S. born blacks. This orange bar here is foreign born blacks compared to U.S. born blacks in green. Then at the highest concentration in the black population, U.S. born blacks are more likely to live there than foreign born blacks here.

Although I don't have time to present it here, I wanted to mention that in addition to the types of bivariate analyses shown here, other types of analyses can and have been done using HIS data. The HIS has been linked to both the National Death Index and the National Survey of Family Growth to maximize the analytic potential of HIS data.

Additionally, multivariate analyses of HIS data have been conducted to examine such issues as whether or not family structure and characteristics predict child health status and whether or not socioeconomic and demographic factors are associated with differential health status among U.S. and foreign born black and white persons.

The issues that one needs to keep in mind when using HIS data to assess racial and ethnic disparities in health are these: Usually a single year of data can be used to assess health measures for only the largest population groups because of sample size limitations. Two or more years of data need to be aggregated to produce reliable estimates for smaller population groups and in some cases, for example, the American Indian population, even several years of data may not produce enough numbers to get viable estimates.

Additionally, confidentiality requirements restrict the amount of information available on our public use data files. In some cases, categories that are too small may have to be suppressed, in which case analysts will have to apply to the Research Data Center at NCHS to access the data.

Another issue to keep in mind is that the HIS has limited ability at this time to over sample smaller racial and ethnic populations, especially those that are not found in heavily concentrated areas in the United States. The major limitation there is cost of doing the sampling.

Right now the HIS does not assess the cultural competency of its health measures and does not collect detailed information on languages spoken in the home or on English proficiency. Furthermore, the measures of acculturation that could be used in analyses are limited.

Also, while the HIS is now regularly translated into Spanish, there are no immediate plans to translate the instrument into other languages. Respondents who speak languages other than Spanish usually have a family member interpret during the interview or if the field representative conducting the interview speaks the language, he or she can translate on the fly.

Now I would like to turn my attention to the measurement of race and ethnicity, which is really central to being able to assess racial and ethnic disparities in health. In 1997, the Office of Management and Budget issued new standards for the collection of racial and ethnic data in federal statistical systems and these standards have implications for how we measure health outcomes, given that there are new population groups for whom we must gather data.

They have also had implications for how we maintain trends in our data systems, which we use to monitor overall changes in health outcomes over time and most importantly, we need to be able to assess in light of the changes in the new standards, whether observed population changes are the result of changes in the way we classify people or actual behavior changes in the population.

This slide summarizes the new OMB race standards, which split the group, Asian Pacific Islander, into the groups Asian and Native Hawaiian and Other Pacific Islander, require that Hispanic origin -- items on Hispanic origin be asked prior to race questions and separately from the race question and allow respondents to federal surveys and the census to report more than one race.

Now, the most important changes that we can expect in our data systems from the new standards are that data tabulated and shown in NCHS's publications, like Health U.S. and Summary Health Statistics and the like will begin to show data for new population groups, such as Native Hawaiian and Other Pacific Islander and individual multiracial groups. We can also expect that when compared with data collected under the old standards, we may see shifts in people from one race group to another.

We can also expect that having to monitor health outcomes for new groups will necessarily create breaks in the data and force us to develop methods for making old and new data comparable. Perhaps most challenging is that we can expect our interpretation of health data by race will have to change, given that we will be analyzing and interpreting data for groups, whose composition may be changing over time.

Quickly, I would like to give you an overall snapshot of how multiple race groups fit into the overall NHIS, shown here for 1997, 1998 and 1999. Just as background, I wanted to mention that the HIS has allowed respondents to the survey to report more than one race since 1982.

It is very difficult to see here, but this bar all the way out here to the end indicates the proportion of the sample in each year that reported more than one race and it averaged at about 1.4 percent in all three survey years. This is consistent with estimates of the multi-race population from other sources that have put the number at about 2 percent or less of the population.

Of this group, the largest multiple race group in the HIS was American Indian and Alaska Native and white, followed by Asian Pacific Islander and white and black and white.

To get a handle on some of the demographic characteristics of the multiple race population, consider this slide and the next showing the age distributions for the single and multiple race groups. As you can see here, the age distributions for the four single race groups, white, black, American Indian, Alaska Native, and Asian Pacific Islander are all fairly similar to one another.

By sharp comparison, over 60 percent of the Asian Pacific Islander and white group here and almost 80 percent of the black and white group here are persons under the age of 17. The age distribution of the American Indian, Alaska Native and white group here, more closely resembles the age distribution of the single race groups that I showed on a previous slide.

This slide shows private health insurance coverage for four race groups and the three largest multiple race groups, here on the left tabulated under the new standard and here on the right under the old standards. If you compare the estimates of private health insurance coverage right across here, you can see that there is very little difference in the largest race groups, white and black, when tabulated under the old and new standards. The only place that you start to see that there may be a difference is here for the American Indian, Alaska Native group. The differences between these numbers are not statistically significant because of large standard errors, which is a reflection of the small sample size of these populations and the fact that this is a single year's worth of data.
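
[Illustrative sketch, not part of the testimony: the point about large standard errors can be seen with a simple z statistic for the difference between two estimates. The estimates and standard errors below are made up, not values from the slide, and the calculation treats the two estimates as independent, which is a simplification when both tabulations draw on the same respondents.]

```python
import math


def difference_z(est_a, se_a, est_b, se_b):
    """Return the z statistic for the difference of two estimates.

    Assumes the two estimates are independent (a simplifying assumption).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (est_a - est_b) / se_diff


# Hypothetical coverage estimates for a small group under two tabulations.
z = difference_z(0.45, 0.06, 0.38, 0.07)

# Two-sided test at the conventional 5 percent level.
significant = abs(z) > 1.96
```

[With standard errors this large, a seven-point gap between the two estimates does not reach significance.]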

There is no difference here for the Asian Pacific Islander groups. But what you do see is that the numbers for the multiracial groups, black and white, American Indian, Alaska Native and white and Asian Pacific Islander and white are not only different from one another, but different from at least one of their single race counterparts above here.

The other thing that is worthy of noting here, again, is the large standard errors around these estimates, which is a reflection of the small sample size for these groups. This slide shows the unique feature of the NHIS, which is that respondents who report more than one race are asked to indicate which race group of those they mentioned best describes them.

We refer to this as the respondent's primary race. In 1997 and 1998, a majority of respondents selected a single race group in both survey years. Here we have the multiple race groups across the top, American Indian, Alaska Native and white, Asian Pacific Islander, white, and black and white and the single race group that they chose.

This bottom line here, multiple race, refers to those persons who did not select the primary race group; that is, they either said don't know or refused to answer the question. A large majority of persons reporting as American Indian and Alaska Native and white selected white as their primary racial identity in both 1997 and 1998, although there was a decline here and this difference is not statistically significant.

The situation is different for persons who are identified as Asian Pacific Islander and white. As you can see, about 40 percent selected white as their primary race in 1997 and almost 50 percent selected Asian Pacific Islander. Those proportions are a little different in 1998, but, again, the differences are not significant.

For the black and white group, about one-quarter in both years, a little over one-quarter in 1997, a little less in 1998, selected white as their primary race group. About half in 1997 and close to half in 1998 selected black as their primary race group. But what is most striking is that of all the multiracial groups, persons who are identified as black and white were least likely to identify with a primary race group and most likely to not select any group at all to identify themselves.

Use of questionnaire items such as this may help data systems that need to place multiracial persons in a single race category for the purposes of maintaining trends in their data to do so. This process you have probably heard referred to as bridging.
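
[Illustrative sketch, not part of the testimony: a primary-race item can be used for bridging by placing each multiple race respondent in the single group he or she selected, and leaving those who did not select one in a residual multiple race category. The function name and category labels are hypothetical and do not reflect actual NHIS variable names or NCHS bridging procedures.]

```python
def bridge_race(reported_races, primary_race=None):
    """Assign a respondent to one race category for trend tabulations.

    reported_races: list of race groups the respondent reported.
    primary_race: the group the respondent said best describes them,
        or None for "don't know" or refused.
    """
    races = sorted(set(reported_races))
    if not races:
        raise ValueError("at least one reported race is required")
    if len(races) == 1:
        # Single race respondents keep the race they reported.
        return races[0]
    if primary_race in races:
        # Multiple race respondents are bridged by self-identification.
        return primary_race
    # No primary race selected: keep the respondent as multiple race.
    return "multiple race"


single = bridge_race(["white"])
bridged = bridge_race(["black", "white"], primary_race="black")
unbridged = bridge_race(["black", "white"])
```

[The residual category matters here because, as the slide showed, some respondents do not select any primary race.]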

Some of the issues that we need to keep in mind related to the measurement of race and ethnicity in the HIS is that in maintaining trends in our health data -- excuse me -- we are going to have to use some sort of bridging method, which I just referred to and in the case of the HIS, the question that we have that allows multiracial persons to select a primary racial identity for themselves may work best primarily because it allows them to self-identify in a single race group as opposed to using some of the other statistical methods for allocating them that have been suggested.

However, even with this HIS method, self-identity and group size changes will make allocation of multiple race groups to a single race group increasingly difficult over time. As these groups get larger and larger, there are going to be changes in how people identify, which is probably going to be due to increasing awareness of multiracial heritage or increasing desire to report multiracial heritage and then we are going to face further challenges in bridging the data to maintain trends or we may simply just have to start new trends in our data.

In addition, there are several issues related to interpretation of race and ethnicity data under the new standards. Among those is that we are measuring new population groups, such as the Native Hawaiian and Other Pacific Islander and multiracial groups, whose characteristics and patterns of illness and disease appear to be distinct and have to be studied further.

We also have to more fully acknowledge the fluidity of racial and ethnic identities. They are not fixed and this may begin to change our fundamental concept of race. Furthermore, we have to consider whether or not there is a substantive meaning of primary racial identity for multiple race persons and what role this plays in understanding their health behaviors and outcome.

Finally, what we knew to be an already complex relationship between race and health has become that much more so in light of considering health characteristics of single and multiple race groups.

Just to show some examples of other kinds of analyses that have been done looking at race and ethnicity reporting using HIS data, there have been some linked file analyses looking at the consistency of race reporting in the linked NHIS and National Survey of Family Growth, that is, looking at consistency of race reporting among respondents to both surveys, and also using HIS data to develop a bridging method for vital statistics data, which does not have an item that allows multiracial persons to select a primary racial identity.

In terms of multivariate analyses that have been done, there are two analyses that are being developed, one looking at developing a mortality profile of multiple race persons in the U.S., using the linked HIS NDI and that is persons who are identified as multiple race in the HIS and looking at their racial classification in the NDI and then developing a demographic and health profile of multi-race persons in the U.S., using HIS data from 1997 to 2000.

Now, some of the future directions that we are hoping to go into for the HIS are in examining the possibility of over sampling Asian population subgroups, which is something that is actually -- research that is ongoing right now.

We are also considering targeted over sampling to study smaller groups, like American Indians and Alaska Natives and Native Hawaiians and Other Pacific Islander, for whom health data is badly needed but cannot be gotten through the standard over sampling procedures in the survey. There is also work that is planned, cognitive work at NCHS, to examine the issue of committing to a racial identity that is having a primary race for multi-race persons or wanting to keep a multiracial identity and also with experiences in discrimination in seeking and receiving health care.

Just for your further information, these are the web sites where you can get information on the National Health Interview Survey and more information on the new standards for the collection of racial and ethnic data on the OMB web site.

Thank you very much.

DR. MAYS: Great.

Dr. Hummer.

Agenda Item: National Health Interview Survey User

DR. HUMMER: Thanks for having me. Thanks, Vickie, for inviting me to come.

My comments here are -- it is nice that I am going to follow Jacqueline's talk because being at the data collection site, she has provided a lot of insights in terms of how the data is collected and some of the basics in terms of how they are used.

What I was going to do is really provide two things in this talk; some of the key features of the Health Interview Survey that make it very useful for analysts like myself and many others to use in an academic setting and to publish in journals; and second, some of the key limitations that I see with the Health Interview Survey that, as we discuss these issues and think about race and ethnic health disparities, I really think, after using these data sets for a number of years, need to be addressed with these data.

I am also coming to this, again, as a data user from someone who has used the Health Interview Survey extensively from the 1986 to 1994 periods, prior to the redesign that really got fully implemented, I think in 1997. Part of that reason is that, as Jacqueline mentioned, that earlier data, those nine years of data, has been linked with the National Death Index, which allows for follow-up mortality analysis, which for my work and I think for many in the field has provided a really great data source to use.

So, much of my comments are thinking back on how the Health Interview Survey was collected at that time. Although I have done cramming in the last couple of weeks to think about, you know, the newer versions of the data set. But most of my comments are stemming from this perspective.

Some of the strengths of this particular data set: First of all, it has been around a long time. This is a data set that is now in its 45th year or something like that. Because of that, there can be great comparisons made across time, obviously, with some caution, depending on the exact questions that were used and the health outcomes that were looked at and so forth.

But there is a tremendous strength there. The second strength is the very large sample size. On average, 40,000 households, a hundred thousand individuals per year provide us with one of the largest health data sets to use in the United States. There is tremendous strength there for power of analyses, for looking at relatively rare health outcomes, for looking at relatively rare health behaviors, things like that. Although I will get to some of my comments later, too, about how I think we are losing something despite this large sample size.

There are things that -- sticking with No. 2 still in terms of strengths, you can look at several race and ethnic groups. You can look at health outcomes, health behaviors by age, by sex. You can look at immigrants. I have done a number of papers with my colleagues that I have been able to look at among Mexican origin individuals, for example, splitting out the native born versus the Mexican born and so forth. So, you can do some nice breakdowns given the large sample size with a number of the variables that are available.

The third thing is data on children and adults in the same households. A household-based survey allows for linkage between individuals in the same household. You can look at patterns of health and health behavior and so forth within households.

It is nice in No. 4 that the data are available in separate files as well. So, it is easily accessible and usable by people who are not great with data, I mean, that have skills but aren't, you know, computer whizzes, I guess, as you can tell by my overheads.

The fifth thing is that there is very high response rates here. We got response rates up in the 90 percents over time and that attests to the trust that the interviewers have, the dedicated professionality, the longstanding running of this survey and so forth. There is tremendous response rates here, better than most, if not all, surveys I have ever worked with, a great strength of this data.

A few more -- and some of the reasons why I have used these data in the past. There are generally the same basic questions year to year. There are some strengths and weaknesses to that, but there are several strengths and one of those is that Jacqueline readily pointed out that in terms of analyzing health disparities sometimes pooling is necessary across years. When you have the same questions from year to year, this allows for that.

The shifts in topical questions from year to year allows flexibility, though, to be able to get at health problems and so forth that are new or shifting topics, based on perceived need and so forth. To date, though, those have had what I have seen as less to do with race/ethnicity or immigration for that matter and more so to do with specific diseases and behaviors.

The seventh thing is that there are good basic measures of health, self-reports and parent reports for children. There are many objectives of Healthy People 2000, 2010 that have been able to be tracked through the use of these data. There is a strength there.

No. 8, the over sampling for black and more recently for Hispanic persons allows for some good subgroup analyses, but I am going to get to that further below.

Ninth, there are weights, obviously, to allow for inflating to the population.
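
[Illustrative sketch, not part of the testimony: with over sampling, the weights are what keep an over sampled group from counting for too much in an overall estimate. The groups, weights and outcomes below are made up to show the difference between an unweighted and a weighted proportion.]

```python
# Each record is (group, weight, has_outcome). The over sampled group has
# more respondents, so each of its respondents carries a smaller weight.
sample = [
    ("oversampled", 500.0, True),
    ("oversampled", 500.0, True),
    ("oversampled", 500.0, False),
    ("other", 3000.0, False),
    ("other", 3000.0, True),
]

# Unweighted: every respondent counts the same, so the over sampled
# group dominates the estimate.
unweighted = sum(1 for _, _, flag in sample if flag) / len(sample)

# Weighted: each respondent counts for the number of people represented.
weighted = (
    sum(weight for _, weight, flag in sample if flag)
    / sum(weight for _, weight, _ in sample)
)
```

[Here the unweighted figure is 3 of 5 respondents, while the weighted figure is 4,000 of 7,500 persons represented.]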

The tenth thing, I am going to hit on this both because of my own work and I think because of the field's demand for mortality data sets outside of vital statistics. The links to the mortality data for adults, those 18 and over, through the National Death Index, have been a real important addition to this data set.

Here you can pool the years, 1986 through 1994. You have got nine years of data for the core surveys. You can get some solid estimates of minority group mortality. You can look at some of the migration variables having to do with race and ethnicity as well, although there are some issues revolving around how well the matches are made to the NDI. There are very few comparable data sets of this kind of size to analyze mortality in the United States.

There is the National Longitudinal Mortality Study, based on the CPS surveys, but those are very limited in content. They are not health-related data sets. So, this provides us with one of the largest and best data sets that we have for looking at mortality patterns throughout the United States.

Then when you have these supplemental surveys -- I guess they are called the topical surveys now -- available, you can look at those special topics, too, and follow individuals over time, and hopefully you will be able to if they are matched in future years, which I am putting in a big plug for here, because there are very few data sets like it that are able to analyze the types of data that you can look at here and get the number of deaths that you can in the way that they have done this from the 1986 through 1994 surveys.

I have analyzed variables off the earlier supplements, things like religious involvement that was asked in one survey in 1987, health behaviors, like cigarette smoking and alcohol use, some of the health status variables to look at self-reported health and follow up mortality. There is a number of interesting variables that can be looked at and that are related to race and ethnic health disparities as well. So, that is a major strength of this data set.

I am really going to hit on a few key limitations, though, too, in terms of really trying to get at health disparities using this data set. These aren't large in number, but I think these are also very key limitations and some of these Jacqueline brought up and I will probably be a little bit harder because I am not in house. I don't have to worry about anything.

The first thing is I think the most critical thing that we need to think about in terms of the Health Interview Survey is that despite the over sampling, sample sizes for most minority groups remain very limited for many purposes. I think some of the slides and some of the reports that are issued show that we can look at white, black and Hispanic populations and, in more recent years, Mexican Americans or those persons of Mexican origin, but there is a real need here to recognize tremendous diversity, even within these umbrella groups.

You know, key examples, Puerto Ricans are far different from Cubans, who are far different from the Mexican origin population. We know very little about the health of other national origin groups from Latin America. The Health Interview Survey is not going to help us with understanding a lot about health patterns among these racial and ethnic groups.

Asian subgroup diversity, there is a slide that Jacqueline put up there. They are wrestling now with over sampling in the next round of the survey, starting, I guess, in 2005. But there are a number of Asian subpopulations that really the Health Interview Survey is not going to help us understand either in terms of health patterns in the United States.

There is some limited work out there about how a number of these groups might vary tremendously in terms of health status, but, again, using these data here, we are not going to be able to get at that very well.

Native American populations are also very difficult to learn anything much about using the Health Interview Survey. Simply, the numbers aren't there to support analyses based on geographic diversity, socioeconomic diversity and so forth.

I pulled an example from the 2000 survey. I think these numbers are right; if not, they are very close. But the example here is that 59 percent of the Health Interview Survey respondents or people in these households are non-Hispanic white. This is down from 1990. I traced this and it was in the mid-70 percents in 1990 and with the over sampling changes in the mid-1990s, this figure has gone down.

But still, compared to the non-Hispanic black and Mexican origin population, non-Hispanic whites still outnumber those two populations by about 4 to 1 in the data set. I recognize the limitations of costs and things like that, but my big question here is why do we need 60,000 whites when there are at most 14,965 persons of Mexican origin. I think that is something that really needs to be addressed in a major -- what we are doing here is losing sample size power. It is just something I am very critical of in this data set, as well as many others.

To further hit on that a little bit, in that one particular year, there were fewer than 2,000 Puerto Ricans, about 1,000 Cubans, less than 600 Native Americans and about 600 each of Chinese, Filipino and Asian Indian subgroups and other smaller subpopulations were not identified, I am presuming because of confidentiality reasons.

So that on a single year basis, there is limited use of the data set for looking at health behavior, health status and so forth among many of our race and ethnic groups throughout the United States. That is really a tremendous limitation, given that my own sense and, I think, the field's sense in social demography, where I am coming from, is that diversity in race and ethnicity goes far beyond the umbrella subgroups that we often report on and discuss in the literature.

A related point is that considering these sample sizes are often subdivided by age, sex, health outcome and other variables, when we break these things down, addressing health disparities across a wide range of race/ethnic groups becomes even more difficult. We can do some nice cross tabulations of health outcome by race and ethnic groups. They standardize for age and so forth, but that just doesn't -- you can't do a heck of a lot more than that when considering the majority of race and ethnic groups with these data.
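As a rough illustration of how quickly a subgroup thins out under cross tabulation, the sketch below spreads a subgroup sample evenly across the cells of a table. The figure of 600 echoes the single-year subgroup counts cited above; the four age bands, two sexes and two outcome categories are hypothetical choices, and the function name is made up for this example. Real cells would be uneven, so the smallest would be smaller still.

```python
def average_cell_size(subgroup_n, levels_per_variable):
    """Average respondents per cell if a subgroup were spread evenly
    across every combination of the cross-classifying variables."""
    cells = 1
    for levels in levels_per_variable:
        cells *= levels  # each added variable multiplies the number of cells
    return subgroup_n / cells


# Hypothetical breakdown: a subgroup of 600 split by age band and sex,
# then additionally by a two-category health outcome.
by_age_sex = average_cell_size(600, [4, 2])
by_age_sex_outcome = average_cell_size(600, [4, 2, 2])
print(by_age_sex, by_age_sex_outcome)
```

Each added variable divides the average cell again, which is why even simple standardized tables exhaust a subgroup of a few hundred people.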

In terms of the multiple race identifiers as well, I think Jacqueline reported that there were 1,400 or something persons of multiple race in a recent Health Interview Survey. When you think about the possible combinations of multiple race -- and, again, age, sex, health outcomes and things like that, I really seriously question whether or not we are going to be able to learn a whole lot about the health disparities of multiple race groups among subpopulations.

So, I hate to put a pessimistic picture on it, but I think that is what we have here in terms of actually getting at not only descriptive analysis of health disparities using these data across a wide range of race and ethnic groups, but also I think going beyond that and truly understanding why such disparities exist.

To add to that, though, I just wanted to say that we are at a point where we can really do something about that, too. The census, you know, was just two years ago. We have a redesign of the Health Interview Survey coming in 2005. Not knowing how these things work, I am sure that work is well underway already and I am sure a number of people have already thought about this, but this -- in front of a meeting like this and the people that are here, I hope truly reinforces that among kind of a typical lay user of these data.

Points 2 and 3 here, I wanted to get at a little bit about the content of things, of variables on the data set. Socioeconomic variables have been enhanced a bit in the more recent years, but they are still largely basic indicators, education, income. I think there are some assets data on there now, occupation. But in terms again of really getting at race and ethnic disparities in adolescent health, young adult health, middle-aged health, older adult health, there are few indicators of long term hardship or wealth or childhood experiences; this sense of accumulation across the life course.

Now, the big disadvantage here is that we have a cross sectional survey. But the big advantage is that we have topical questionnaires available and so forth, that we can do something about that as well, if we really wanted to ask and get at the accumulation of disadvantages, advantages and so forth over time, at least looking retrospectively.

It is also difficult for researchers, but not impossible -- I have done it and a number of people have -- to link with neighborhood characteristics, given that the data come from a census-based sampling frame. Wells and Horme(?) published a piece in AJPH(?) a few years ago describing a method of how to use the very small areas, as they termed them, to get at neighborhood characteristics.

There has been in-house work done at NCHS by Felicia LeClere and her colleagues to look at census tracts, measures that help to get at health disparities, but by and large -- and, again, I submit this is largely because of confidentiality reasons, but could be eased through greater numbers of racial and ethnic minority groups on the surveys, that these tract level and geographic level data are important for us in understanding racial and ethnic health disparities.

Finally, and I know we want to open this up for questions, following again on some of the things that Jacqueline talked about as well, other social and cultural variables remain largely absent from the Health Interview Survey, even in the special modules. Here I draw from a literature that strongly argues that racial and ethnic health disparities are social, historical and cultural in origin.

With that in mind, the relative lack of social and cultural variables, I think, seriously impedes understanding what underlies the disparities in question. The Health Interview Survey gives us a real sense, at least for some groups, of the disparities, very important publications on the disparities themselves, what they look like, and I think there have been a number of works in the public health, epidemiology and demography literatures and so forth that try to get at what underlies these disparities using the Health Interview Survey data. Here I am critiquing my own work, as well as others, but our sense of some of the key mechanisms, what underlies these disparities, I don't think we can seriously get at with the Health Interview Survey as it is currently constructed because of the sense that these health disparities are largely social, cultural and historical in origin.

So, for example, variables that tap things like social support, I and my colleagues have worked with the one question that the Health Interview Survey included on religious involvement in the 1987 supplement and showing that, for example, among African Americans, being religiously involved as proxied by attendance was associated with lower follow-up mortality risks than those who were not religiously involved and more so, a stronger relationship than among whites.

But those types of variables are really few and far between on the Health Interview Survey and so a sense of -- in that particular case, that is just one example of how religious involvement might give us a better understanding of why health disparities aren't even larger in some cases than what they are, in that case, mortality. But variables tapping social networks, family supports, community involvement and so forth are largely unavailable.

I know given the size and the yearly construction of this data set and so forth, we can't get at everything here. Data that have smaller sample sizes are often the way to do this. But at the same time there has been a number of special topic modules and so forth that have gone far into depth on other types of topics. I don't see that here with the Health Interview Survey getting at racial and ethnic disparities in the way that it has been able to get at some other topics that have been looked at in more depth.

Variables related to stress, too, financial burdens, crime, safety and violence fears, things like that have not been typically available in the Health Interview Survey, too, to tap some of those things. I am not an expert on any one of these particular domains, but at the same time, I think there are people that are that are out there and that if such variables were thought to be worthy of consideration, as I think they are, then I think that expert groups need to be brought to help put the topical things like this together.

There has also been relatively little attention to immigration issues and Jacqueline pointed this out, too. My recollection is that nativity only started to be collected, I think, in 1989 in the Health Interview Survey. And if we are thinking about the growth of racial and ethnic minority populations largely due to immigration, our sense of going beyond whether a person is born in the U.S. or not and how long they have been here, I think, really needs to seriously be thought through because there is a lot more to patterns of migration and health than simply nativity and one question on duration.

So, those, I think, are some summary comments on things that I have thought about over the years, using the Health Interview Survey and I hope are helpful for the committee in thinking about this data set as well.

DR. MAYS: Great. Thank you.

[Applause.]

Agenda Item: Questions and Answers, NHIS

So, if you will both join us in the front -- Jackie, if you just want to move around so you can also answer questions -- we will take questions from the committee first and then we will open it up to others in the audience.

Paul.

DR. NEWACHECK: We spent a lot of time talking about sample size issues and statistical power. I would like to change a little bit -- change the course of the discussion a little bit to the quality of data that we are collecting and, in particular, I wondered about households that are non-English speaking and non-Spanish speaking; have there ever been any validity studies to look at the kind of -- the quality of the data that we get from those households in the HIS?

MS. LUCAS: As far as I am aware, there have not been. That is not to say that -- I was just going to say as far as I am aware there haven't been. What I do know is that with the redesigned HIS being done on CAPI, many times the interviewer, while they are in the household, will record in a note section some particular aspect of the interview; for example, that no one in the household speaks English or Spanish. They try to in advance, if they know, for example, it is a Spanish speaking household, they will try to send a Spanish speaking field representative to conduct the interview.

As I mentioned, many times what happens is that there is an English speaking person in the household and that person ends up doing a fair amount of the translation. The degree to which that affects the quality of the information that is obtained, I really don't know and I don't know that we have adequately explored that.

DR. NEWACHECK: You know, this, I think, is really an important issue as we move to think more about some of these smaller subpopulations. I had the opportunity to go on a Health Interview Survey interview several years ago and have been a long term user of the survey and it almost changed my mind after going on that because in California we have all kinds of different populations where English is not the customary language or at least it is not the preferred language. Watching the interviewers struggle to try to translate concepts like limitation of activity, not in native tongue, but just trying to explain what that kind of concept means to someone who is not a native English speaker is just so incredibly difficult.

I just wonder about the quality of the data that comes out of those kinds of situations. Now that we are focusing on that, we have made this a national priority, I think it is important for NCHS and other data collection agencies to consider what kind of quality of data we are getting from those households and whether there are ways we can improve that.

MS. LUCAS: Certainly that has been an issue with a fairly robust measure like responding to health status, which has been shown to be robust across many different groups. But there is a question about what does good health mean for different groups and how does one population that says their health is good, they may mean excellent, but the term "excellent" does not translate so -- so, yes, we are aware of it and it is certainly something we have to address.

DR. HUMMER: Just to follow up something on that -- I have a forthcoming paper coming out looking at the non-U.S. born population in the Health Interview Survey and mortality follow-up on that particular question, self-reported health question and the correlation of self-reported health with follow-up mortality is much weaker among the foreign born than among the U.S. born.

I don't know exactly what that means, but something about that variable, either because of the language that the interview is being done in or how it is interpreted, whatever, is not -- does not mean the same thing.

MR. HANDLER: Our presenters mentioned that the Health Interview Survey findings have been matched with the National Death Index. I was wondering, have any results from the Health Interview Survey been matched with the National Death Index Plus? The National Death Index Plus includes data on cause of death and I would think that, you know, life style in the home might be related to the actual cause of death and maybe it is too soon to do that. I think that the "Plus" was added in 1997 or 1998 or something like that.

DR. HUMMER: I am not sure about the technical term of NDI versus NDI-Plus, but I have analyzed cause of death with the links; the data set has a 282 cause listing and a 72 cause listing and so forth. So, you can look at differences among racial and ethnic populations to some extent by cause of death. You can do it for whites the best because of the numbers.

MR. HANDLER: Did you come up with any startling results or interesting results or --

DR. HUMMER: Interesting, yes; startling, no. It depends on -- there are a number of different disparities I have looked at and so forth. I can send you papers if you are interested.

DR. LENGERICH: This is a follow-up or spurred on by Dr. Hummer's comments here about -- on your last page about some suggestions for additional, I guess, modules, sorts of things that may be added on to NHIS. You had said that in 1987 there was the one -- the supplement that asked the one question about religious attendance, but give other options here.

I guess I wonder are those sorts of modules existing for NHIS or within the plans for NHIS because I think that you are indicating that that could shed a lot of light on some of these disparities that we see. I guess I am wondering about the priority there for NHIS in these sorts of modules.

MS. LUCAS: Let me say that the interest is certainly there. One of the things that is happening in the next couple of years is that there is a sample redesign that is going to be implemented in 2005 and then they are changing the entire computer system that the interview is being conducted on from the CASES software that Census uses to Blaise, which is a different CAPI system for administering the survey.

My understanding is that in the process of making that turnover, we are going back and reevaluating some parts of the questionnaire and other items that should be -- that we would like to see added to the questionnaire. Because of the composition of the staff in the Illness and Disability Statistics Branch, which is where I work -- and that is the branch that does the primary analysis of the survey data, there are more demographers, sociologists there, as well as epidemiologists. The interest is in seeing modules, as Paul suggested, that are not so much oriented to sort of health issues, but more about people themselves, about the communities they live in and their experiences with different kinds of things.

I don't know how that is going to fly, though. I don't know how well -- it is all going to depend on funding and on the timing and length of the questionnaire and all those other factors. I know it sounds kind of like a wuzzie(?) answer, but it is the truth. I don't know.

I can say that the interest is definitely there, the desire is there and we are pushing for doing some cognitive testing on adding those kinds of questions to the HIS. Let's see whether it happens.

MS. GREENBERG: Going back to this issue of how people from different countries, different cultures, ethnicities, et cetera, interpret questions, even if the translation is good -- I know that the World Health Organization has done a fair amount of work on this and seen the huge differences in people's rating their health good or excellent or poor, which don't really correlate with such huge differences in actual health status, so it appears.

Some of the things that they have been looking at doing and doing to a lesser extent is some objective measures. You can correlate what people say with actually -- and this particularly in areas of limitations or, you know, vision or other things of that nature, but actually look at some -- you know, what they say and then what they can actually do, as well as scenarios of describing certain scenarios. Well, would you say someone in this situation is in good health or poor health or whatever, to try to get some, you know, normalizing there.

Are those things that the Health Interview Survey has considered doing or might do in the redesign?

MS. LUCAS: They are things that have been suggested in the past, I know. I don't know whether or not they would be implemented for the next redesign because even with the experience that we had with the Spanish translation that was done for the 1998 HIS, we had the experience in the field that, you know, there was sort of a standard translation that was done and several people contributed to it, but the practical application of it in the field was that in different parts of the country, the word that you would use to describe something if you were Ecuadorian would be different from the word that you would use if you were Nicaraguan or something like that.

Sometimes even with the translation being available, the field representative had to sort of translate on the fly to find the right -- on the fly, to find the right word that was appropriate in that particular setting. So, I mean, I know that these kinds of things have been brought up in the past. I don't know.

DR. BREEN: First I want to thank the presenters, both of them, for laying a really substantial groundwork for us to discuss this important survey. You both did a really good job in terms of helping us think through the advantage of the survey, the problems of the survey and some new directions of the survey. I want to thank you for that.

This is a survey that I have had a lot of experience with. As I said, I work at NCI. I am an economist there and have been using this survey since the 1987 Cancer Control Topical Module, which was the first, well, supplement at the time, the first time NCI helped support this survey.

I coordinated the Cancer Control Topical Modules for 2000 and I mention that because a new survey called the California Health Interview Survey, which you may know, Paul, was in the field. It just came out of the field this year and it was done for the first time and this is a survey -- it is a telephone survey, not an in-person survey, which is modeled on the National Health Interview Survey and is designed to be comparable with it so that the two surveys can be compared.

I just wanted to mention something that that survey is, I think, also maybe going to help us work through some of these problems with the National Health Interview Survey. That is why I bring it up because it is being administered in six languages and it is trying to capture information at the county level. So, it is a lot more locally oriented, of course, than the NHIS is or could be.

One of the things that NCI is hoping to do, we hired a cognitive psychologist, who, in fact, used to run the lab at NCHS, and he is going to help lead a study in which we look at the cognitive understanding in different languages of these questions because it is certainly a problem. It has been an issue, which I have discussed again and again with colleagues at NCHS in the course of developing the Cancer Control Topical Module.

According to him, it is a very hot topic in cognitive survey development methods. So, hopefully, we will be getting some answers to that, maybe using the California data and the California population, which is so important because of its diversity, its naturally occurring diversity.

Back to the NHIS, I wanted to mention first or reiterate that this survey, according to an IOM report on data systems, is the premier source of data on health, health status and health services use in the country, partly because of its size, partly because of its longevity and I want to reemphasize those points because I think it really needs high priority and it justifies giving it high priority in terms of trying to get these data right.

It is also, as people noted yesterday, the survey from which some subset is picked for the MEPS. So, it is the same frame for the MEPS, as well as being its own survey, which makes it doubly important and as was pointed out yesterday, the MEPS is the only survey that provides financial information on health service use.

So, this is a really important survey and I think that is one we want to try to get as right as possible. In the course of analyzing the NHIS, I just wanted to bring up a couple of issues -- and maybe Jackie has some suggestions on how we can deal with these, but the problems that I have had, certainly you can't look at Asians, Asian-Americans or Pacific Islanders, Native Americans, Alaska Natives with this survey in any meaningful way.

I should say that I specialize in screening for early cancer detection. So, I am looking at the whole population. I am not just looking at people with cancer, which would be a pretty small group to look at. This is a general population survey specific to the age groups for which the screening is recommended.

In the course of trying to analyze even black and Hispanic, who are age specific for screening, say, 40 and older or 50 and older, in some of the older surveys, for the adult sample population, which is anywhere from 20 to 40,000, depending on how that survey was designed, I have not been able to look at those populations by income and education and get reasonable confidence intervals.

For the high income, high education Hispanics and blacks, I have not been able to get reasonable confidence intervals, and by unreasonable, I mean something like 20 percentage points, 30 percentage points, 50 percentage points have been the confidence interval size around the point estimates.

So, this is not a point estimate by any stretch of the imagination. So, even that basic data, which I think this survey should deliver is not really being delivered with the survey. I think that should give us pause and concern and we should look into correcting that.
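To put rough numbers on interval widths of that size, the sketch below uses the textbook normal approximation for a proportion. It is not the NHIS variance estimation method; the function name, the design effect parameter `deff` and the example cell sizes are assumptions chosen only for illustration.

```python
import math


def ci_width_points(p, n, deff=1.0, z=1.96):
    """Full width, in percentage points, of a normal-approximation
    confidence interval for a proportion p estimated from n respondents.

    deff is a hypothetical design effect that inflates the variance to
    reflect clustering in a complex survey (1.0 = simple random sample).
    """
    standard_error = math.sqrt(deff * p * (1.0 - p) / n)
    return 2.0 * z * standard_error * 100.0


# Hypothetical cell sizes for one income-by-education cell of a subgroup.
for n in (25, 50, 100, 400):
    print(n, round(ci_width_points(0.5, n), 1))
```

Under these assumptions, a cell of about 100 respondents with an estimate near 50 percent already spans roughly 20 percentage points, and quadrupling the cell size only halves the width.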

The other thing that came up more recently, we, NCI, did an atlas of cancer mortality a few years ago and found that there were areas in the United States where there were high and persistent -- and by "persistent," I mean over the last 50 years -- rates of cervical cancer mortality. So, we took the BRFSS and we also took the NHIS to try to look at these areas to see if screening rates were lower in those areas.

This is PAP smear screening, women 18 and older, who should be getting PAP smears. We were using the entire adult population and moreover, PAP smear use is in the range of 60 to 90 percent in the population. So, it is not a rare occurrence by any means. We were completely unable to look at rural distressed areas in this country. And Jackie brought that up as -- well, both of you did in terms of the survey is not designed to look at people living in non-densely populated areas of the country. The PSUs are not there.

So, it doesn't pick up Native Americans, Alaska Natives. It is not likely to with the current sample frame design and it is not going to give us information on rural distressed populations either. So, those are all concerns that I just wanted to reiterate or state for the record that I think we really need to be thinking about in order to be able to get just this basic data.

Thank you.

MS. LUCAS: I just wanted to address that; you are right. I mean, the issue of being able to look at health measures or SES or other demographic measures for high income and high education black and Hispanic persons is an issue that we have come up against in doing some of our analyses. Dr. Kington and I are working on a project right now, where that is of some interest to us.

I think it is partially a function of the way the sample is drawn, that the over sampling is done in high density urban areas, where the composition of those groups might be different from other groups. One of the things that we need to look at more carefully is what is the distribution of people by their economic status in different levels of geography because the over sampling tends to happen where they are most likely to find people. That is the cheapest and most effective way to go about doing it.

So, it makes sense in theory that that is the way it is done, but it does definitely have an impact on the kinds of people you get in the sample and what you are able to do with the data. If you want to be able to look at certain kinds of things, you have very limited ability to do that.

What I don't know is whether or not in the -- and it is just a function of the fact that I am just not privy to all that information -- whether or not as part of the sample redesign for 2005, that is something that is being reexamined, how the over sampling is done for black and Hispanic persons. One thing that I can tell you is that even for the existing sample design, what happens is -- and this is one of the things that is an issue -- is that the sample redesign is done sort of on a five year track after the decennial census data become available. So, for the 1990 census, right after the 1990 census, they started working on the sample design that was implemented in 1995 through 2004. But by 2004, you are working with a sample design that was based on 1990 census data.

So, there are a series of population changes that have happened over time. As you are drawing the sample, you think you know certain things about neighborhood characteristics where you are drawing the sample, but they may, in fact, have changed. So, that is one of the issues that we have to deal with. One of the things, I think, that is hoped is that the American Community Survey, which is supposed to provide information in between census years, will provide more up-to-date information for surveys like the NHIS that have to do a sample design and we may not in the future have to continue this sort of ten year span to redesign it.

Because once you design it, it is fixed and that is one of the things that you have to -- one of the issues that sort of underpins why you are unable to do certain kinds of things. It would be ideal if you could say, you know, well, maybe do it every five years or every three years or every four years, you start to do something different, but we just haven't had the ability to do that thus far. But everything you said, all your points are well taken.

DR. MAYS: I am going to try and keep us to our time as much as possible here.

Dale has a short one. Dr. Coleman-Miller and then I am going -- can you try for the next one?

MR. HITCHCOCK: Mine is real short. I usually plant this question. I forgot to today. It is about translations. Do you have it up on the web?

MS. LUCAS: I don't believe that the Spanish translation is up on the web and one of the issues with that is that it was available as a toggle so that if, for example, the interviewer wanted to switch back and forth between English and Spanish, it was on a toggle key on the computer.

I don't know that the actual instrument in Spanish is on the web site. But that is something I can find out. It probably could be put up there if it isn't already there.

DR. MAYS: Great.

Dr. Coleman-Miller.

DR. COLEMAN-MILLER: Thank you.

I am looking at surveys and this morning happens to be a very difficult view of surveys, mainly because they are extremely expensive and they are not giving us the information as health disparity experts that we need. So, I have a few questions. Dr. Hummer, I heard your frustration and I really want to acknowledge that and ask you a couple of questions.

One is can we change some of the forms you are using, for instance, the death certificate in different states? Can that be changed to give you some more information that would help you in your link process? We tried to do that here in the District, to add the education level of the person onto the death certificate, which really would have made quite a difference and began to and -- although we had to change it.

Another question is whether -- you mentioned that the reason this survey is so difficult for a minority population is because it would get so big with all those considerations you have. And, yet, you mentioned that there are some situations where you have allowed it to get that big and that has been oppressive, but it got you your information.

I was just wondering now how we measure big when we come up with absolutely minimal statistics for the minority population. There may be a way that it needs to get bigger and what happens if that happens. The other thing is that years ago I spent time with methodologists in the GAO, methodologists, who do nothing but surveys and have written texts on surveys.

I am just wondering whether you have looked at some of those texts, which analyze how to do surveys and the methodology behind those surveys. Then maybe there is like one or two sentences in those texts that could help. Another is that the government persistently gives RFPs out to not-for-profit agencies that deal with minority populations across this country.

Has there ever been a link between that RFP and one grain of information that that not-for-profit will be required to do with your survey in order to make that RFP a successful one? We have requirements that are in RFPs and I wonder if we just can't add that survey dimension to it in some way that would say we know you are with this population. Maybe we can use you because you are getting a proposal or you are -- can you include part of this into your proposal?

The other is whether there is interrelated information. I mean, yesterday we sat and listened to people talk to us about surveys they were doing. So, the minority population needs to know whether you know about theirs and whether it would be dangerous and explosive for you to interrelate and if not, then can we do it and if so, can it be done on a sophisticated level so no one is uncomfortable with the fertility data that is being collected by that agency.

In other words, there needs -- something cohesive needs to happen and that certainly could begin the process if it could be done correctly.

DR. HUMMER: Firstly, I think your questions are fantastic, but as someone who is not involved in the data collection, I am not going to be able to address most of these.

To go to your second point first, you talked about this is a big survey and, yet, we are not getting a lot of the information that we need. Again, from a user, my -- you know, the thing I keep coming back to is that we don't need this survey to be 60 percent white.

I think that if there are fixed costs, then we -- you know, my own sense is -- not being a survey methodologist either -- is that we cut back to whatever, 13, 14, 15 thousand whites and that cuts out 45,000 people and even if you can only do half or one-third of those, then you add others. Now, that is a simple answer to a much more complex issue in terms of sampling design and so forth, but I think, again, that is my biggest concern about this and many other demographic and health surveys I have worked with is that we have got to find the people we need to find.

In terms of your first one, let me just hit on that change in death certificates. The biggest problem that I think I have faced with the Health Interview Survey and the links to the NDI is that the match quality seems to be less good -- that is the wrong word, but you know what I mean -- for most minority groups than it is for whites and blacks.

To the extent that we can do better with that -- that is a hard issue because there is out-migration to Mexico or wherever. So, death matches to the NDI are reasonably good for some populations and less so for others.

For a number of your questions, I really can't address those because of not being an insider to the survey.

DR. MAYS: Okay. We would like to thank you both for your presentations and we thank you for taking your time and the quality of your presentation has helped both, I think, the committee and the audience to get better insight into the issues on our plate today. So, thank you.

Dr. Curtin, Dr. Sempos.

Dr. Curtin is joining us from the National Center for Health Statistics also and will be talking about the National Health and Nutrition Examination Survey and he will be our survey person. Dr. Sempos is joining us from SUNY at Buffalo and he is our user for the NHANES.

We will start with Dr. Curtin.

Agenda Item: National Health and Nutrition Examination Survey

DR. CURTIN: I will probably take a slightly different tack in this presentation because I am a survey methodologist as opposed to a subject matter expert or an analyst.

A quick background, the NHANES survey actually goes back well over 40 years. It used to be a health examination survey. In the early seventies, a nutrition component was added. There has been a series of national surveys done over the past 30 years. The middle one you will notice is called Hispanic HANES. This was a special effort to get at estimates for the Hispanic population.

This was the only sample which was not a nationally representative sample in the strictest form. The last data that has actually been released by NCHS is for the NHANES III survey, which finished off in 1994. However, I will talk a lot today about the upcoming plans for release of the current survey NHANES and we used to call it NHANES IV, NHANES forever, NHANES continuous, but we will probably release data for 1999 to 2000.

NHANES is -- first of all, it stands for National Health and Nutrition Examination Survey. It is a little bit different beast than just a typical interview survey. It starts with a screener interview for our over sampling basically. It has a household interview. And the household interview is actually directly relating to the Health Interview Survey. We use some of the same questions. Everything Jackie said about race and ethnicity and about basic questionnaire format also applies to NHANES. So, that is why I don't have to discuss it again.

But the real key to what makes this a very rich and invigorating data source is the mobile exam center because you bring sample people into what is a series of four trailers that are set up usually in a K-Mart parking lot. You then bring them through for a series of medical examinations, a cardiovascular fitness test, bone densitometry, blood draws and any number of different pokes and pryings that are done on the American population.

This is strictly bureaucratic, directly out of OMB type clearance documents, but basically it is to estimate the number and prevalence of people with disease factors, disease and risk factors, monitor trends and prevalence, analyze risk factors, study relationships in diet health, explore public health issues and establish baseline information.

That doesn't tell you too much, but if I spent the whole time discussing the components of NHANES and everything it does, we would probably take up the whole 20 minutes. The Healthy People 2010 Objectives give you a flavor of some of the content of what goes on here in terms of osteoporosis, diabetes, elevated blood levels. A new initiative is dealing with some of the environmental exposures in the data, exposure to pesticides and heavy metals, which are assessed through the blood sera; high blood pressure, cholesterol, folic acid, which is very important in terms of preventing various birth defects, of course, obesity and overweight, a favorite subject these days in the health field, as well as some dietary things, such as sodium and calcium intake.

As I said, the last data that was actually released to the public is the NHANES III data set. The Health Interview Survey has already been noted as having some sample design issues related to how it is constructed. Well, these design issues become even more apparent in a six year survey, which only has 81 primary sampling units and only 30,000 examined people. This, however, is a highly screened population.

After doing the Hispanic HANES survey, our constituent groups -- and this may be a change in today's culture, but back then they were driving us to say if you can't do all Hispanic groups, the total Hispanic really doesn't do us that much, so just do what you can do, which in our case was the Mexican Americans. So, NHANES III and HANES current are really designed to get at three population groups, Mexican Americans, essentially non-Hispanic blacks and non-Hispanic whites.

As you can see, the NHANES III was about 30 percent Mexican Americans, 30 percent black and the remainder was non-Hispanic white, some single race categories and other categories. I felt the need to present some data and this is actually out of the NCHS/Department of Health and Human Services publication, "Health U.S."

This is just some basic risk factor information that NHANES provides; percent of high blood pressure, percent high serum cholesterol, over 240 milligrams, and the percent obese where obese is defined as a BMI greater than 30. This is just to show some differences among the three race/ethnic groups and by male and female. You have a copy that you can go over yourself at your leisure.

What I really want to discuss today for the most part is where we are going with the NHANES survey, and I guess it is sort of a negative presentation, the problems that arise because of this. As I said, the NHANES survey is really designed to be a six year survey, but that takes a long time. By the time you plan a protocol, get it in the field, have six years of data collection, do data clean up, you are ten years down the road.

Most researchers don't want to wait ten years for their data. The desire was to do it on more of a flow basis and maybe release it every three years. However, there are other people who wanted it on an annual basis. The survey is actually designed to be a, quote, representative sample, unquote, on an annual basis, but there are severe limits to that.

We, euphemistically, NCHS internally have discussed the possibility of one year versus two year data release and right now we are being led down the path of a two year data release. We anticipate that a public use file for HANES 1999-2000 will be released in July of 2002 and it will be an Internet data release.

As opposed to a single data file with five or six thousand variables on it, we are going to release multiple files with a sequence number that the user is going to have to link on, and that is because these files come in on a flow basis and sometimes need to be edited and updated. It is easier to keep track of them this way.

Certainly, the issues that we are involved with in this current data release involve the new OMB guidelines on race and ethnicity, plus a severe confidentiality concern. As the sample size gets smaller in a two year data release, new foibles come up in confidentiality that restrict even further the release of the data.

There are some severe analysis and estimation issues that arise in dealing with the two year data set. To get on my soapbox for a minute, there are some strengths to the survey design and the strengths to the survey design have to do with your ability to control the selection. You can control the selection to over sample minority populations, do stratification in screening.

You are also protected from a statistical standpoint largely through the aspect of randomization. Every stage of a multi-stage sample is selected at random and then the other thing that protects you in your analysis is sample size. Now, these were all the strengths of the survey design. They are also weaknesses of survey design.

If you don't have a large sample, you have a problem. If you don't have a large sample, the randomization can actually cause a problem. NHANES III we sort of refer to as the U-shaped design because if you had a map, which we can no longer release for confidentiality purposes, but if you had a map, you would see that the PSUs were located down around the country like this and in the middle of the country was basically underrepresented.

This is because the desire to over sample Mexican Americans and African Americans drove the sample in a certain geographic area and left the remainder of the things largely -- well, let's say largely undisturbed by our mix coming through there.

This causes a real problem if, in fact, you are dealing with a variable that is related to rural health or something like this, where that sample wasn't there. In theory, every county in the United States has a probability of selection into the sample. But when you are only selecting 15 out of 3,000 counties a year, you have a great tendency for selection bias going on there.

NHANES is actually designed with very specific analytic requirements. I think it is useful to keep that in mind when I show you the sample sizes later. The design effect, for those not familiar with survey work, is a measure of the inflation you get in the sampling error due to the clustering, the over sampling and differential weighting. It is usually greater than 1. The Health Interview Survey design effects may be 1.5, 1.2.

NHANES we design around a design effect of 1.5, but we are cheating because a lot of the design effects actually go up around 2 and greater than 2. To estimate a 10 percent statistic, a 10 percent prevalence, with a 30 percent relative standard error, under a design effect of 1.5, you need about 150 people per sub-domain. However, most of the interest is in looking at differences between sub-domains.
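The figure of about 150 people per sub-domain can be checked with a short sketch. It assumes the textbook variance of a proportion under simple random sampling, inflated by the design effect; the function names are illustrative and this is not NCHS code.

```python
import math

def required_sample_size(p, rse, deff):
    """Sample size at which a proportion p has relative standard error rse.

    Assumes Var(p_hat) = deff * p * (1 - p) / n, so that
    rse = sqrt(deff * (1 - p) / (n * p)) and n = deff * (1 - p) / (p * rse**2).
    """
    return deff * (1 - p) / (p * rse ** 2)

def relative_standard_error(p, n, deff):
    """Relative standard error of a proportion p estimated from n people."""
    return math.sqrt(deff * p * (1 - p) / n) / p

# 10 percent prevalence, 30 percent relative standard error, design effect 1.5
n = required_sample_size(p=0.10, rse=0.30, deff=1.5)
print(round(n))  # 150
```

Under the same assumptions, a design effect of 2 rather than 1.5 raises the requirement to about 200 people per sub-domain.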

In order to get an absolute 10 percent difference, for example, a difference of 30 percent versus 20 percent, with 95 percent significance level, 90 percent power, you need a sample size of about 420. Even though the NHANES sample is a very small sample relative to other national samples, it is highly stratified and highly selected, selected for Mexican American, non-Hispanic black and a residual of white. In 1999, what happened was we were concerned about the number of low income whites in the sample because of the way the sample was drawn.

We added in another selection strata for low income whites in the year 2000. Then there are some very specific age groups by sex that we ended up looking at and it is really targeted to the very young for various reasons. NHANES is done under contract to Westat. It is linked to the Health Interview Survey PSUs for 1999 through 2001. An independent design has been drawn for 2002 and we will no longer link at the HIS PSU level.

This is done for a couple of reasons. One, the HIS was designed on basically 1990 census information and when Westat got onto the field for 1999, the population composition was much different than we expected and we weren't doing as well on our screen as we had hoped.

We went ahead and did an independent design for 2002 and beyond. In addition, as I said, we are linked at the questionnaire level to HIS as well. The measure of size that we use in selecting the sample has to do with the percent Mexican American and percent black. Again, we are limited to only 15 PSUs per year because of the operational considerations of taking these MECs around the country, setting them up, having them there and then packing them up and hauling them away again.

There are basically two different MEC teams that go around the country and the operational constraint is that, given the size of the field staff, you can only do about 15 primary sampling units per year. We do screen for race and ethnicity and we sample more than one person per household and, in fact, the sample selection results for 1999 are still in here.

It is the nature of the beast that when you use a measure of size, rates of percent Mexican Americans, you are going to end up with Los Angeles in your sample every year. Okay? So, that is why you see 27 stands and 26 PSUs. L.A. was in there for both 1999 and 2000 for different sections of the city.
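The reason a very large area turns up every year can be illustrated with a sketch of systematic selection with probability proportional to size. This is a generic textbook procedure, not the actual Westat design, and the area names and measures of size below are hypothetical.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes maps unit name -> measure of size. A unit whose size is at least
    the sampling interval (total / n) is hit on every draw, so it is a
    certainty selection and can be hit more than once.
    """
    total = sum(sizes.values())
    interval = total / n
    start = rng.uniform(0, interval)
    points = iter(start + k * interval for k in range(n))
    point = next(points, None)
    selected = []
    cumulative = 0.0
    for name, size in sizes.items():
        cumulative += size
        # Take every selection point that falls inside this unit's range.
        while point is not None and point < cumulative:
            selected.append(name)
            point = next(points, None)
    return selected

# Hypothetical measures of size: one very large area and five small ones.
sizes = {"big_metro": 50, "a": 10, "b": 10, "c": 10, "d": 10, "e": 10}
print(pps_systematic(sizes, n=4, rng=random.Random(0)))
```

With these made-up sizes the sampling interval is 25, so the large area spans two intervals and is selected twice on every draw, which parallels one city supplying more than one stand.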

The segment level, we screened 600 -- we selected 681 segments, approximately 23,000 households, identified about 6,000 eligible households and interviewed about 12,000. The response rates are not great here, but that is largely due to 1999. I am not sure who was running the survey in 1999.

DR. KINGTON: Your assessment of that has not been great.

DR. CURTIN: But there are some start up problems and actually what really went on here is some of the PSUs that were selected early on in 1999 were some that had historically low response rates anyway. The response rates for 2001 are looking better. So, as we combine these over years, we will actually improve these response rates as time goes on.

It does create a problem, though, because in order to increase response rates, you do a lot of outreach and because we do a lot of outreach, it is pretty easy to determine where we have been. We can go in the red and scan through and find out where the 15 PSUs are. So, if you have a data tape that has pseudo-PSUs on it and you know the characteristics of those pseudo-PSUs, you can probably identify where they are. Once you do that and identify the geographic unit, and you only have 400 people per sample, our Disclosure Review Board gets very upset about releasing any data on those people from the standpoint of being able to identify them.

So, for HANES 1999-2000, we will not release pseudo-PSUs, which, of course, you say, well, how do you do your sampling errors and we are working on that, even as we speak. We are actually doing -- I can't get into details because it is a disclosure risk avoidance thing. If I tell you how we did it, you might be able to undo it.

We are basically going to deal with a jackknife estimate of variance, using 52 replicates that are somehow related to the sample design and give approximate standard errors for what you would get under the PSU design. So, that is a little bit of trust needed, but we are presenting a scientific paper at some point, probably the July Data Users Conference that NCHS will put on.
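The general idea of a jackknife variance estimate from replicates can be sketched as follows. This is the generic delete-one form only; how the 52 NHANES replicates are actually constructed is not described here, so nothing below should be read as the NCHS method, and the function names are illustrative.

```python
from statistics import mean

def jackknife_variance(replicate_estimates):
    """Delete-one jackknife variance from a list of replicate estimates.

    Each replicate estimate is the statistic recomputed with one group
    left out. Variance = (R - 1) / R * sum((theta_r - theta_bar)**2).
    """
    r = len(replicate_estimates)
    center = mean(replicate_estimates)
    return (r - 1) / r * sum((e - center) ** 2 for e in replicate_estimates)

def delete_one_means(values):
    """Replicate estimates of a simple mean, leaving out one value at a time."""
    total = sum(values)
    n = len(values)
    return [(total - v) / (n - 1) for v in values]

values = [1, 2, 3, 4, 5]
print(jackknife_variance(delete_one_means(values)))
```

For a simple mean the delete-one jackknife reproduces the usual variance of the mean, the sample variance divided by n, which is a convenient check that the formula is coded correctly.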

In addition, because the sample size is too small and we have this rule of disclosure avoidance, that you can't have more than three people in a cross tabulation, we are probably going to have to limit the type of socioeconomic status variables that go on in this in terms of education, income and race/ethnicity.

Right now we have a real problem in dealing with our Disclosure Review Board about households and family links. If a person is selected into the sample as an adult and their child is also selected into the sample and the data set is on the Internet, they can go find their own informational unit and identify themselves, but they can also identify who their child is.

Since we do some very sensitive interviewing in a CASI mode, audio-CASI, which is the headphones on sexual history and things like this and we do blood tests on STDs and herpes and other things, we have a real problem right now in disclosure avoidance on this variable. These are still under consideration but this is a tendency that we are driving towards at the present time.

Of course, to make it even more complicated, there were some field start-up costs for 1999. So, we actually went to 12 PSUs and the nature of the way they were segmented was the fact that these were mostly white and black PSUs that were not selected, so we underrepresented blacks in 1999. Although they are weighted properly, they are still underrepresented relative to what the sample should be. The sampling practices were changed in 2000 to try and alleviate that.

These are the actual sample sizes for Mexican American, non-Hispanic blacks, white and others. Combining the male/female group under age 6 and very broad age group, 6 to 19 and 20 plus for males and females, keeping in mind that 420 that is needed with the design effect of 1.5 to detect a 10 percent difference, you are already in problems here in analyzing this data.

Of course, we are also dealing with the issue of how to handle the OMB guidelines, it is -- since we use the HIS questions and ask ethnicity first, Mexican Americans, as is known in some other surveys, tend not to report a race. About 20 to 25 percent of the HANES sample has unknown race associated with it right now. If you use a census type coding scheme that they used in their public law file, which would assign a person with no known race, but Mexican American, to some other single race, then 1,892 would be put into the some other race category and we would severely underestimate the number of whites and blacks in the country.

We are still trying to deal with this issue. In addition, there are 441 multiple race persons, a number that will probably decrease because under the current coding scheme if you put white and Mexican American on your record, that is then coded as white and other single race and then it is coded as multiple race. So, we are still in the process of working out our coding rules for release of this data.

I can't escape without doing just a couple of foibles of analysis when you come to this data. Because you are dealing with small sample size, the typical question I get over the phone is this number doesn't look right. It is just not what I have gotten in the past. Well, you are going to have to keep in mind the two considerations in the tradeoff of bias versus variance.

Small sample numbers, small randomization issues could lead to large sampling errors and you can get outliers very easily in this data. By the same token, these really are covering 12 to 15 geographic areas. You could have an unforeseen selection bias. It is very important if you have external data sets to compare them with what you have got out of the NHANES survey to see if what you are dealing with is bias or sampling error.

Furthermore, in terms of estimation, when you estimate sampling errors in the fashion that we have here, we have a small number of degrees of freedom to really estimate those and you are moving away from using standard normals toward T tests. You are using T tests, which give you larger confidence intervals and reduce your effective sample size.
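The effect of a small number of degrees of freedom on interval width can be sketched by comparing critical values. The t values below are the standard two-sided 95 percent table entries; the degrees of freedom shown are illustrative and are not the actual NHANES figures (a common convention, assumed here, is the number of PSUs minus the number of strata).

```python
from statistics import NormalDist

# Two-sided 95 percent critical values from a standard t table.
T_CRIT_95 = {5: 2.571, 10: 2.228, 15: 2.131, 30: 2.042}
Z_CRIT_95 = NormalDist().inv_cdf(0.975)  # about 1.96

def interval_inflation(df):
    """How much wider a t-based interval is than a normal-based one."""
    return T_CRIT_95[df] / Z_CRIT_95

def effective_sample_fraction(df):
    """Share of the nominal sample that yields the same width under the normal.

    Interval width scales with 1 / sqrt(n), so a width inflated by a
    factor k corresponds to a sample size of n / k**2.
    """
    return 1 / interval_inflation(df) ** 2

for df in sorted(T_CRIT_95):
    print(df, round(interval_inflation(df), 3), round(effective_sample_fraction(df), 3))
```

With 10 degrees of freedom, for example, the interval is roughly 14 percent wider than the normal-based one, which is equivalent to losing a bit more than a fifth of the sample.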

In fact, in the HANES survey, if you look for PSUs that have at least, for example, a hundred Mexican Americans, there are only six in 1999 and five in 2000. Even though you have maybe several thousand sample persons over 15 PSUs, they are all clustered in just a very few primary sampling units.

The conclusions that we draw from this is to analyze the data, but analyze it carefully. There is a very small sample size. You may have problems because of the limited geography.

You have to be extremely careful with the design-based estimation in terms of influential values and influential sample weights and in terms of dealing with the degrees of freedom for the variance estimation. Finally, just one plug for HANES future, it is well understood that given the limitations of the NHANES design, it is very difficult to get at specific sub-domains.

There is a plan under foot to -- we used to call it the RV or Winnebago HANES. It is now changed to Defined Population HANES and it is now Community HANES. What this does is it takes basically a large recreational vehicle or an 18 wheeler and just moves that from community to community in order to target the population defined in terms of their race and ethnic status or in terms of some health outcome.

Hopefully, it is going to be a more flexible data collection approach if it can ever get funded.

That is that.

DR. MAYS: Thank you.

Dr. Sempos.

Agenda Item: National Health and Nutrition Examination Survey User

DR. SEMPOS: Thank you, Dr. Mays.

I would like to thank Dr. Mays and the Subcommittee on Populations for inviting me to the meeting. It is really a privilege and a pleasure to be here. I would also like to thank Dr. Susan Queen and Ms. Gracie White for helping me with all the preparations for the meeting.

Dr. Curtin has given a very nice overview of the HANES system, its history and its future. What I would like to do from the standpoint of a user is describe over all some of what I think the assets and liabilities of the survey are and possibly what some of the changes for the future might be that might improve its applicability.

Now, I don't come as a totally unbiased user. I worked for 13 years with the National Health and Nutrition Examination Survey and only three years ago moved to Buffalo, New York. They told me it was the Miami of the North. After seven feet of snow in December, I began to wonder whether they were telling me the truth.

But to begin, the U.S. Health Examination Surveys consist of a series of periodic cross sectional studies, which extend back over 40 years from the present to the first survey in 1960. From the very beginning a key feature of the NHANES has been the assessment and monitoring of minority health status.

Presently, the NHANES produces estimates for Mexican Americans, non-Hispanic blacks and non-Hispanic whites. Those estimates cover approximately 85 to 90 percent of the U.S. population. The principal reason for the survey's existence is to assist federal agencies in the development and monitoring of public health policy.

The goals of the NHANES are to produce national estimates and to establish a series of health status indicators, which can be used to assess and monitor national health status, develop, monitor and help modify federal regulatory policy and examine the association between health status indicators and certain selected conditions and sometimes, and to a much lesser degree, diseases.

Two types of data are collected as part of the health surveys; self-reported aspects of health, including race and ethnicity, income, nutrition, health behaviors, medical history and most importantly the second set of data collected are those physical attributes of individuals, which can be measured, such as height, weight, infectious disease exposure, chronic disease risk factors, such as blood pressure, cholesterol and obesity, as Dr. Curtin mentioned; environmental exposures and the presence and absence of a very few selected set of diseases, for example, arthritis, periodontal disease, diabetes, coronary heart disease and, most recently, with the publication of data on the prevalence of different mutations of the hemochromatosis gene, gene frequency data.

It is the measurement of those physical characteristics, which is the strength of the HANES program and which quite honestly makes it the premiere health survey in the world. It is also that same characteristic, which prompts the involvement in the design and support of the survey by virtually every federal agency in the Public Health Service.

What does the HANES do well in my estimation? The HANES program assesses national mean levels and distributions of health status indicators. It gets us national prevalence estimates or percents of the U.S. population with certain often unhealthy characteristics and health status indicators.

By using successive NHANES, you can produce national trends in health status indicators. Until Hispanic HANES in 1982 to 1984, the surveys were able to document health disparities between black and white Americans. Since that time, they have been used to document health disparities between African Americans, Mexican Americans and non-Hispanic white Americans.

It has been able to document potential areas of unmet medical need associated with those health disparities; for example, in the areas of diabetes care, infectious disease exposure, environmental exposure, dental health and, as Dr. Curtin mentioned, prevalence of high blood pressure, of high blood cholesterol and obesity.

Let me emphasize again that from the beginning the goal of the NHANES program has been to describe current levels and trends in health-based status indicators by age, sex, race and ethnicity and it describes those national levels extremely well and as I said, it does it for 85 to 90 percent of the population when you look at subgroups.

What doesn't the NHANES program do so well in my estimation? It is not very good except in very selected cases in disease diagnosis. It is not very good at producing subnational, state or local estimates. It wasn't designed to do that. It does not produce estimates for that 10 to 15 percent of the population not covered and, possibly most importantly, it does not explain why health disparities came about or even how they can be reduced.

With the tremendous improvements in health, especially the decline in cardiovascular diseases since the late 1960s and the increase in life expectancy and with the realization and to a certain extent acceptance of racial and ethnic diversity, there has been an increasing interest in finding and eliminating illness and disease in racial and ethnic minorities and in the medically underserved populations of the U.S.

There has been a desire to understand the causes and solutions necessary for eliminating racial and ethnic disparities. This, along with the tremendous advancement in computer technology, has resulted in an increased interest in and desire for health status data at the state and local level and for coverage of those Americans not specifically over sampled and studied in the NHANES survey.

What do I think can be done to help reduce these data deficiencies? Well, first of all, I would like to recommend to the subcommittee -- I am sure it has already been recommended that you take a look at the report chaired by Anthony DeAngelo and Dr. Carter-Pokras, which looked at the data deficiencies of the U.S. data collection system. I think it was an outstanding report and in detail set out what the deficiencies are in many of the data collection systems, including the NHANES.

But I want to go back to a suggestion that Dr. Curtin brought up and which Dr. Kington originally proposed and that is the Defined Population HANES and the Community HANES, as it is now called. I think that is an outstanding way to supplement the data deficiencies of the NHANES. I would like to hope that the Subcommittee on Populations can encourage federal agencies to support the actual setting out of a schedule in which different communities throughout the U.S. and in cooperation with other federal agencies, specifically the Indian Health Service, in trying to identify a series of communities and populations which could be examined on a regular cycle basis, including such communities as the Commonwealth of Puerto Rico and different Asian American communities within the U.S.

I think that would help a lot to supplement those data deficiencies. A second recommendation that I would make is increasing the efforts to make the NHANES surveys into cohort or follow-up studies. Starting with the NHANES I study, which was 1971 to 1975, there was a direct follow-up of all the participants who were examined as part of the study.

Unfortunately, the NHANES I survey, although it is an excellent cohort study, does not include at baseline a lot of the risk factor measurements in disease exposure, which are really pertinent to developing public health policy in the 21st Century. So, I would like to encourage the Subcommittee on Populations to consider making a recommendation that the National Center for Health Statistics increase its activities in the area of developing follow-up studies as a regular ongoing product of the NHANES program.

Finally, and although not directly related to the NHANES program, I would like to mention another aspect of federal data collection, which I think could be improved. The vital statistics data, the birth and death data of the U.S., are the most fundamental health data that exist. Right now we have an outstanding resource that is available through the Internet and CDC called CDC Wonder. Through CDC Wonder, you are able to obtain mortality data at the state and local level.

Unfortunately, unless it has been changed recently, those data are only available for black, white and other and although the capacity exists, data for the Commonwealth of Puerto Rico and other trust territories and protectorates are not necessarily included on the CDC Wonder data. I would like to suggest that if at all possible a recommendation be made to CDC to see what can be done to work out the problems so that that data might be included.

So, in the end, I think the NHANES survey does extremely well for what it was designed to do and it does extremely poorly at what it wasn't designed to do and, hopefully, the effort will be not to change NHANES dramatically because it does serve a need very well, but to develop auxiliary or ancillary methods to supplement its deficiencies.

Thank you very much.

DR. MAYS: Great. Thank you. Thank you to both of you for your presentations.

[Applause.]

Agenda Item: Questions and Answers, NHANES

Okay. Let me start with questions from the committee first and then we will open it up.

Dr. Handler.

MR. HANDLER: I have one question. It all gets down to money.

If national data is only available from NHANES, how much does it cost to run an NHANES?

DR. CURTIN: Am I allowed to answer that question?

MR. HANDLER: I don't know. It must be in some budget document somewhere.

DR. CURTIN: I am sure you can get some budget document somewhere. I will say something, but, again, take it with a grain of salt, as I am a statistician, not a budget person. When you ask how much it costs, there is the cost of data collection through the data collection contract. There is the cost of staffing at the NCHS level. There is the cost of staffing at the NIH level to support this.

There are a large number of costs that I am not taking into account, but you could come up with a ball park estimate of 30 million?

DR. KINGTON: 30 to 35.

MR. HANDLER: If you increase the sample size to get more minority representation, you might have to double that number. That is a reasonable --

DR. KINGTON: That number is an annual number, so, 35 -- I think it is closer to 35.

MR. HANDLER: So, again, more specificity, minorities that are not represented now, you might have to double that amount. That is -- something like that.

DR. CURTIN: That is the fixed cost versus variable cost as well. Right now you have two mobile exam teams that go throughout the country. It is not feasible just to add 10 percent of the sample or 20 percent of the sample. You have to do it in a chunk and the chunk that you do is to increase the sample size by 50 percent by adding another MEC team or by double, by adding two more MEC teams.

So, it is roughly a 50 percent increase in cost to get a roughly 50 percent increase in sample. That is why I believe a lot of the concern or interest is in the Community HANES survey. It is more limited in the type of data and the scope of data you collect, but it is cheaper and you go to more places on a more frequent basis.

DR. SEMPOS: In fact, in NHANES III -- and Dale, who has forgotten more about NHANES than I ever knew, can tell you originally NHANES was designed to have a sample size of around 60,000. There was going to be -- there was a discussion of either having a segment in Puerto Rico or a Puerto Rican section of NHANES. That was eliminated for budgetary reasons, not because it wasn't requested.

DR. LENGERICH: I have forgotten which one of you said it, but I am interested to hear more about the -- I didn't realize the hemochromatosis gene had -- the prevalence of that had mapped through NHANES and that that has been released. I guess I would like to hear a little bit more about the plans and the considerations and particularly the confidentiality issues around NHANES and genetic issues.

DR. SEMPOS: It was recently published in JAMA and saying that, I will let everybody else answer the rest of that question.

DR. CURTIN: The NHANES III, when it was collected, there is nothing in the informed consent document that stated that these samples may be used for genetic testing later on.

DR. KINGTON: Which was the standard at the time.

DR. CURTIN: Which was the standard at the time, okay, because again the informed consent document was done at the beginning of the survey. So, it was decided through the institutional review board that the only -- even though there was excess sera available and you could spin off the white blood cells, immortalize and take DNA samples, it was decided that could only be done in an anonymous manner through a very prescribed protocol procedure whereby you would have to submit a protocol and funding and go through a very lengthy process to get at that data.

There is a process for doing that in the old NHANES III samples and there are other studies ongoing even as we speak that are using that baseline material. The issue of current samples and use of current samples for genetic study -- actually, I will know more about it tomorrow because I am headed down to Atlanta tomorrow with staff from the division to discuss this problem with the Center for Environmental Health in Atlanta.

We use their lab to do some of our lab work. The informed consent process was actually changed and genetic testing is now part of the informed consent process. You might not have to do it anonymously, but there are severe data disclosure issues involved in any public release of such data. I guess I could summarize this -- and Raynard can correct me -- there is ongoing discussion about how to best use this. Some of the samples were actually used and given to Francis Collins at one time in the Human Genome Project to help in the mapping project. About a thousand samples, I believe, were sent over to NIH.

But, again, it was anonymous and there are some severe data disclosure problems and a number of policy issues that revolved around this. I am not the best person to address it.

DR. CARTER-POKRAS: Hi and certainly welcome to my colleagues, many who worked closely together on this survey.

I also -- since we did talk about budget yesterday and budget also today, I also wanted to make sure that folks understood what the budget was for HIS. That was one question I had and the other one is what approaches is NHANES and HIS doing to help minority researchers in how to access and use these data.

MS. LUCAS: One of the benefits of being a subject matter expert, which may not help you very much is that I know absolutely nothing about the budget and I could find out and tell you, but it is just not my area. So, I don't really know anything about the HIS budget. But I could certainly get some of that information and send it to you by e-mail or to the committee if they are interested.

12 to 18 million a year for the core questionnaire and then additional monies for the supplements for the topical modules.

But about training, you asked about training minority researchers to access and use the data. I guess to the extent that universities or schools of public health request demonstrations of survey data through the university visitation program and to the extent that those are historically black colleges and universities or Hispanic colleges and universities, we do go out and do presentations of the data.

I also know that several people have requested help from NCHS -- from HIS to use the data for doctoral dissertations and master's theses and things like that and sort of a case by case basis. But that is not any special outreach effort directed towards minority researchers. Years ago, there used to be a minority research grant program at NCHS, which I don't know if it is still in existence. It seems to be defunct at this point, I guess, maybe for lack of funding.

That was a program that was targeted specifically at getting monies available to minority researchers from NCHS to do either research of their own or on NCHS data sets. But to the best of my knowledge there isn't a particular effort.

I do know that the California Health Interview Survey is having a conference coming up in May that is dealing specifically with public health issues related to the African American population and that an HIS representative will be going out there to demonstrate the use of the data set in examining public health issues for African Americans.

MS. HEURTIN-ROBERTS: This is a question for Dr. Sempos and Dr. Curtin both. You both mentioned -- and this is a theme we have heard repeated yesterday and Dr. Hummer commented as well on NHIS, that NHANES in this case is able to document health disparities, but you can't elucidate the causes of health disparities or the processes leading to health disparities. Is that right?

What would be needed to rectify this and is that feasible to somehow incorporate items that would allow us to do so, items relating to, let's say, social, cultural economic contexts and is that an appropriate use of NHANES or would we be better off trying to link NHANES with some other data that is supplemented?

DR. SEMPOS: That is, as they say, the $64,000 question and it is difficult, if not an impossible one to answer. To a certain extent the social and cultural factors, which determine health are at a different plane above the physical characteristics, which we measure in individuals and moving from those social cultural factors, which have enormous impact on all aspects of health and which in large extent help to define race and ethnicity.

It is really very difficult listening to some of the presenters yesterday, to try to include variables to model, tease out what are the social, economic factors, which lead to disease causation.

Another problem is that the HANES surveys are cross sectional. So, you have exposure being social, cultural status measured at the same time as the health status indicators.

The development of more follow-up surveys, you may not be able to elucidate so well why those disparities came about, but it may -- cohort studies may help you to better identify the health ramifications of those disparities and the need to eliminate them.

DR. CURTIN: There is also another budget involved here. That is the budget involved with respondent burden. If we put together all the questions that people would like to ask, you are probably talking about an eight to ten hour interview. There are not many people who would sit down to do an eight to ten hour interview. That is probably an exaggeration.

But, nevertheless, it is true that we could fill out not just four trailers in the MEC, but we could probably fill up eight or ten trailers in the MEC for examination components. The real issue is tradeoffs, such as in sample design, you trade off geographic and state estimates versus racial and ethnic disparities because they tradeoff in survey design, so does content.

NCHS has, believe it or not a process by which we gather information from people and try to set the agenda and there is an ongoing process for survey content and, ultimately, though, it does come down to the tradeoff on what the public health researchers feel is important because, again, I think it is up to us not to determine the content, but to react to people who are using the data in terms of what is of interest to them and what is of interest to the American people.

DR. SEMPOS: Right now, there exists a great deal of socioeconomic status data as part of the data collection with NHANES. But part of the problem also in trying to relate those socioeconomic status indicators at a national level is that when you look nationally, a lot of the cross-cutting trends that occur regionally and locally tend to cancel out or can cancel out at the national level.

For example, cardiovascular disease mortality overall is declining in the U.S. and it is declining for black Americans as well, but if you look at the State of Mississippi, cardiovascular disease mortality for black men is on the increase. So, even if you are trying to use the data to document unmet medical need nationally, it is that aggregate look that doesn't -- it sometimes obscures problems that are existing at state and local levels.

DR. MAYS: I want to follow up with a question. Let's just say that users said that they wanted social support. That was just one of the things you put there. If one had the funding, say, from a foundation or NIH or some place like that, is it possible, for example, to buy time? Can individuals buy time on a survey? Can foundations buy time on a survey?

I am going to couple that with what are the barriers? Is that really like, for example, that it becomes a human subject IRB issue, a burden to the person? Is it OMB clearance? Is it -- what are the actual barriers? If I had money and I came to you -- I hit the lotto and I have hundreds of thousands of dollars and I said I have five variables. I am really interested, and they are reasonable. I mean, there is a scientific basis for it. Could I buy time?

DR. CURTIN: Well, for you, of course.

When we discussed the HANES budget -- one thing you have to keep in mind is that is not all in the CDC NCHS budget. A good portion of that comes from NIH and other people as well and it is done through reimbursable agreement because there is already a mechanism in place, more or less, for supporting various aspects of health through the reimbursable agreements.

So, yes, we -- in a very sarcastic sense, yes, we sell part of the survey, okay, but who has the money? It is usually those that have been appropriated through some sort of need and Congress has determined there is a need to study a problem. They give the money into somebody's budget and they then turn it over to us to help study the problem.

Now, as an individual, it is a little bit harder as an individual, of course, but certainly the academic community, the public health community, there are public health associations, there are various ways to get the point across that certain variables are very important to the study of health. It helps to have money attached to it, but at the same time, something else would probably have to go out of the survey at the same time, unless it was just a three or four minute bank of questions that you could just kind of slide in.

There would be a whole review process. So, if you gave me the check today, we might be able to get the questionnaire changed for January 2003. But if you gave me the check in May, it would be January 2004 before you could get it in.

DR. KINGTON: As an aside, there is one precedent now of an external group paying for an additional question, which was justified on scientific grounds. It was a combination actually of a drug company and a foundation. Data are going to be released to the public. They will not get it before anyone else. They thought it was important. It was added. There was justification for it and they paid for it.

There is a process now through the CDC foundation of actually supporting -- I mean, buying is a crude term -- supporting the efforts to improve the health of the public.

DR. CURTIN: Friends of NHANES.

MR. MITCHELL: There used to be, too, sort of a philosophical constraint, where if the various review processes along the way thought that this question really didn't relate to a particular examination component or something that HANES was really equipped and able to do in the exam center, then perhaps HANES wasn't really the right survey to be asking --

DR. CURTIN: Yes, I think Raynard's point about there still has to be scientifically valid and of interest to the general American public.

DR. NEWACHECK: I think there are also other examples outside of HANES where this has taken place. The Robert Wood Johnson Foundation has paid for an access supplement to the National Health Interview Survey. The Gerber Foundation paid for an early childhood survey as a part of SLATE(?). So, there are lots of examples of that process.

DR. SEMPOS: And I think one of the things that Dr. Curtin could talk about in detail is that if you would contact Dr. Jerry McQuellan(?), National Center for Health Statistics, right now there is a way for outsiders to propose studies using the excess sera from NHANES III. So, if there is something you want to measure in blood and sera, do some kind of study, that can actually be done now.

DR. KINGTON: And the web site, NHANES has the procedure for proposing content changes. It is well described on the -- for anyone in the public to propose.

DR. MAYS: Let me just take my second question and then we will do this and then we will go on a break.

That is about the moving now to the samples that are being collected. One of the things I know is that -- and I know I am not going to get the acronym right but there is some part of NIH that has been thinking through -- it is called ELSRI(?) or something -- Ethical, Legal Social Responsibility Issues. So, they have begun to think about what that means within the context of research. I guess my question is what has been the thinking for NHANES about -- particularly if you now move to these targeted communities and that is the community in which you now are going to select samples and those communities are the very ones that we are talking about in terms of health disparities, are you prepared for the type of discussion that I think is going to ensue about that is the point at which now in order to get this health disparity -- oh, and you just happen to have, you know, a sample that you want to collect, too.

Anyone can answer this.

DR. CURTIN: Well, there is certainly an institutional review board process at NCHS to review such things and they look at the cultural diversity and then the sensitivity of the thing relative to the sub-domain of interest. But before it even gets to that level, it should not even be sent to the IRB unless the NCHS and CDC management agree that it is ethical and there are no problems with it.

So, actually I served a period of time as chair of the IRB and it was always my position that they shouldn't bring anything to us that they weren't willing for us to put into the field. I think that kind of follows along here as well and the other thing that I did not like the IRB doing was breaking new ground.

It sort of came up when we were talking about the DNA stuff for the first time. There were at that time no real ethical standards generated. There was no national bioethics advisory commission back then. I really don't think that NCHS is the appropriate place to break new ground on the ethical and research concerns.

My personal preference would be to follow the research community and what guidelines they take care of in the normal mechanisms.

DR. BREEN: I just wanted to ask the defined population or the community HANES had come up -- and I think this is something that Raynard had proposed when you were directing NHANES and I wondered is it dead? Is it alive? I know NCI was actually interested in helping to support it, but then there didn't seem to be anything to support.

DR. CURTIN: Well, NCHS, nothing is ever dead, other than the mortality statistics, I guess, but the real issue there is what can be done. If you do a community NHANES, you have to take the cross sectional HANES out of the field for a period of time. But can they both be done together or can some sort of mixture of the two be done?

At the same time, Chris mentioned that there is a need for longitudinal studies. So, you actually have three different types of studies that are under consideration; repeated cross sectional, longitudinal follow-up and community HANES. To be crass about it, it is somewhat who is going to support it because if 25 percent of the NHANES budget comes from external sources, then that becomes a driving factor.

I think there is a lot of -- and there has been a lot of interest in community HANES for 10 or 15 years as a means of getting at special population groups. When there was a health initiative along the U.S.-Mexican border, it was considered at that time. After September 11th, it was considered as going up to Manhattan with something. There have been any number of occasions where we have said, gee, if we just had a community HANES in place, we would be able to turn out some very significant public health results.

But it is a matter of the start up and getting it done first, but you almost have to be in the field and prove its worth to get other people to support it from then on. So, I guess it is a roundabout way of saying that the issue is still alive and well and we will probably discuss in the future planning for NHANES cycles as to how to best mix and match among the different modes of data collection.

MS. HEURTIN-ROBERTS: I just wanted to comment and I guess go back to my original question. I guess I am concerned that the surveys that we have discussed yesterday and today, over and over again we have seen that we can document health disparities, but we cannot explain them and if we can't explain them, how can we possibly address them.

So, I think this is a glaring need that we will somehow have to remedy and I think that, you know, you are all wonderful, creative survey methodologists and --

MR. HITCHCOCK: You are assuming that the survey is the appropriate mechanism to discover --

MS. HEURTIN-ROBERTS: I don't think it is the only appropriate mechanism, but I think that we can go a lot farther if we have the contextual data, that we can do a lot more than just describe. I think we need the sort of broad scale data that a survey can provide. I think that is necessary to move things forward.

We can use other means to get contextual data, I agree, but I think we do need the sort of broad scale investigations that a survey can do, that a survey can produce.

DR. MAYS: I think what is going to have to happen is I think that the subcommittee is really going to have to really think about what recommendations we need to make about this because I think that there are, you know, some issues about putting it all on the back of these surveys at the same time, the issue of a different perspective about what the surveys can do that we might comment about.

Let's make this the last comment and then we will go for a break.

MS. GREENBERG: I am sure there are all sorts of sampling and estimation implications of this, et cetera, but I have heard some interesting reports of some interesting survey work on questionnaire development and I think John Ware(?) is one of them, that very much tailors the questions to the individual so that -- I mean, I know that we would have skip patterns and all of that, but that really takes it another level further where depending upon your particular characteristics, you might get a certain set of questions and et cetera, and that technology kind of allows us to do that now and it kind of triggers -- different responses trigger different sets of questions.

It seems that that might be, you know, one possibility in the future, when we are talking about all these different types of variables that we would like to get. We know we can't put all of them or even most of them on -- you know, ask everybody, but it may be that you only want to ask certain ones to certain people after they have responded in a certain way.

I wondered whether either HIS or NHANES is looking at this type of approach.

DR. CURTIN: Not really for a very -- well, I think it has to get worked out a little bit more because as you say, the standard in the box, thinking of a data set is that you ask the same questions of the same people and then you have variables on a data tape and you properly weight them and analyze them.

But if you are talking about then using different flexible means of collecting information to summarize it in a consistent manner, then you have alleviated measurement error problems and that, then it has possibilities.

There are also sampling techniques that are under development. There are some called adaptive sampling, which is a variation of the NCI friends and family type situation, where you have located individuals with a rare characteristic and from them you get identification of other people with that rare characteristic and add them into the sample.

So, there is a lot of look at how to sample rare and elusive populations. There is a lot of work on the cognitive aspects of survey and what does it really mean in terms of what you are getting out of it. There is a lot of survey work now in translation of instruments in the different concepts that go on in translating and then there is a lot of concern in the survey community about what we call interviewer imputation, where when you actually get into the field, you see that they are doing something totally different from what was in your protocol.

So, there is that other aspect of it and I would be very concerned in the type of situation that you said that the interviewer might go through that instrument a little bit differently than they should under a protocol. So, I think all those would have to be worked out before it could be implemented on a national basis.

DR. MAYS: We are going to take a short break and then we are going to proceed with Dr. Kington. Thank you.

[Brief recess.]

DR. MAYS: If you remember, one of the reasons we made a change to our schedule is that we have to get Dr. Kington to the White House. While it is wonderful that he spent some time talking to us, we want to make sure that he goes and talks where we can get some change and action. So, we want our representative at whatever meeting it is you are going to. We want you there, too.

So, Dr. Kington is -- as you heard earlier, has a history also with NCHS. So, clearly he is coming to us with a lot of expertise in this topic. Currently he is the director of the Office of Behavioral and Social Science Research, which we behavioral scientists really champion, and he is also currently the acting director of NIAAA.

Dr. Kington, thank you for taking time to be with us today.

Agenda Item: Policy Discussion

DR. KINGTON: Thank you. I apologize for not being able to attend the entire meeting. Up until Friday, it was all blocked off but something else happened. Also thank you for graciously allowing me to reschedule so that I can attend another meeting and really to have the opportunity to say a few words on this topic, which is personally of great interest to me and of great relevance to the mission of the NIH.

As you know, NIH as a whole and each institute and center has developed a research plan to address racial and ethnic disparities and the new National Center for Minority Health and Health Disparities, under the direction of Dr. John Ruffin(?), now has the primary responsibility for pulling together the final version of the plan, as well as for monitoring progress towards meeting its objectives.

Data, essentially those sources described at this meeting, are important to NIH really for two reasons. One, they help us understand the magnitude of the problem of disparities in health and health care and that helps us set priorities in terms of our research agenda. These data also are important because they give us insights into fundamental causes of these differences and ultimately can help sort of point the way toward likely places for intervention, although I recognize that the primary purpose of these data sets is not to look at causal pathways, but in spite of their more policy oriented purpose, they really have in many cases sort of pointed us in directions to help us understand causal pathways.

When Vickie and I spoke about what I was expected to do, Vickie made the mistake of more or less saying that I was free to give my thoughts on the broader perspectives of this issue. Of course, she might not have interpreted those comments that way, but that is the way I chose to interpret them.

So, let me just run through several issues that I believe are important, starting with the issue that has already been discussed a little bit and I just want to expand it a bit. I know this term is overused, but I will use it nevertheless. It really is time for a paradigm shift in how we think about how we obtain data on our increasingly diverse population.

The single most important problem -- and this has been echoed over and over again this morning -- was how do we obtain data on more, particularly smaller subgroups of the American population. The idea that all data needs can be solved by simply oversampling really reflects a poor understanding of how diversity is playing itself out in the country. To be honest, I don't know if it has even served the larger subgroups, such as African Americans, very well these days.

Let me just expand upon that a bit. Clearly, there is increasing data on -- increasing interest in data on smaller subgroups that are often very geographically concentrated, in addition to being very small, relative to the population of the United States. This was alluded to by Jackie, particularly for national surveys: it is simply too costly to screen the entire country for a small population and a population that is highly geographically concentrated.

I really think that we need to take this fundamental idea that was alluded to with the community HANES and really bring it to a whole other level. What we need are two parallel tracks of data collection; on the one hand a core data collection track that is able to get at the large population, racial and ethnic population subgroups, non-Hispanic whites, African Americans and Mexican Americans as one combination. It could be others and then a separate track that runs in parallel in which data are collected in a very similar way, allowing comparisons with the national data, but are concentrated that allow us to get information on this what is truly a mosaic of smaller groups; Native American tribes, specially Native American tribes, Asian and Hispanic -- smaller Hispanic subgroups.

I think that particularly the American Community Survey really offers a possibility of giving us more information to allow us to run these two tracks. But I think the idea that we are going to have this sort of one survey or one or two or three, even five surveys, that will serve our needs and really describe the complexity of this country is just ridiculous. It is just not the way the country is demographically composed these days and for the foreseeable future.

So, I think in general, we need to move away from this simplistic thinking about data sources and really move toward a different model. I think this sort of parallel track model is the way to go, one large apparatus that gets at the large groups and then smaller, parallel track or tracks of data collection that get at smaller groups that are -- in themselves may be heterogeneous and are highly concentrated in specific regions of the country. That is the main point I want to make.

Now everything else is sort of less important. First of all, there is a -- also on the second point I am making, there is also -- there were a lot of comments about the need to get -- to hone down to get more complex data on different dimensions of life for racial and ethnic minorities in particular and the idea being that really you can't really get at the problems of whether it is access to health care or mortality associated with diabetes or screening rates, that you really need to sort of get more complex pictures of the population and that means getting more information on a range of different dimensions, social and behavioral factors in particular.

I think what this all means is moving to a more rational planned system of cycling through supplements. We have supplements now for lots of different surveys in various ways, but in my mind, at least, and from what I have seen both inside and outside of the system of data collection, there really isn't this rational thought about the long term perspective and how you cycle through and collect different types of supplemental data.

I will give you an example. The National Center for Health Statistics and NIH recently had a workshop specifically at getting at developing a series of measures from the very core, essential measures, expanded measures, looking at how we measure economic status. I learned that this issue has been talked about for 20 years in the data collection system of this country.

How do we get at something -- more complex data and better data on economic status. Just as an example, increasing evidence suggests that one's assets, one's net worth is -- in many cases it may be much more predictive of both of your true financial resources and your health status than what your income is in a particular year. This is particularly true for elderly where income often drops, but assets may be quite enormous in some smaller populations of elderly.

So, there has been this struggle to try to get data sources that -- in the health collection system better information on financial status and we had a workshop where we pulled together leading economists, leading epidemiologists more or less in different rooms and we weren't thrilled about the idea of talking to each other before this meeting and we really tried to get at some core measures that we thought all health data sources should begin to collect that would be used to guide data collection for all major health surveys and then a series of supplemental data sets.

We really sort of teased out all the different ways that we have asked income and wealth questions and how you might go through and periodically expand data collections in various -- along various dimensions so that over time you get a more complex picture of what is going on.

So, the second point is that we definitely need to continue to have supplements, but we also need a more thoughtful process for working through what types of information we need to collect periodically and how we measure economic status is just one example. You can have similar discussions on community context, stress, social support and go down the list.

What is happening now is there are lots of people thinking about all these things and no one is sort of putting it all together into a comprehensive, coherent plan. The third problem -- and I have to be really careful in discussing this in this building on this floor -- there are serious problems with funding and we know that data collection is not sexy, you know, and believe me, I have been on the Hill trying to make it sexy and it is just hard to do.

But we have brains and we can sort of think about the real value of these data sets and I think we have to do a better job of really demonstrating in very concrete terms how extraordinarily important having good data on health across a wide array of racial and ethnic groups is to really solving some of the public health problems that we have. I also think we can do a better job in trying to do a better -- get farther with the money that we have.

I think there are major sort of efficiency gains that can be made in how we collect data. I don't think it is any surprise to anyone in this room that the cost of collecting data is going up dramatically. NIH funds a number of large surveys and quite a few of them have had large cost overruns.

What is happening in the broader community is that response rates are dropping and it is harder and harder and harder to get those high response rates and the idea that we can sort of all sort of reinvent the wheel on how to have reasonable rates and how we reach out to communities is sort of silly. I mean, we need to sort of expand our dissemination of information on how to do this in a better way and I think ultimately there are some efficiency gains that can be made across data collection efforts if we are to go in that direction.

Let me see if I have any -- oh, yes, race and ethnicity questions, obviously, a topic of great interest to me and some of what I think we need to do on that front was alluded to also by Jackie. That is, we have a lot of discussion about the OMB directive, which was really an important change. I mean, I don't mean to downplay that but people forget sort of one word that is in that directive and it is "minimum." It is the minimum data collection on race and ethnicity. Minimum.

I think we all know what that means, but it doesn't seem to affect what we do because everyone automatically goes and collects the minimum. Well, you can do more than the minimum. The HIS question on primary racial identity is just an example. I think there are lots of other ways that we can try to really take this notion of race and ethnicity and expand it beyond the OMB core data set.

Core questions and examples are, for example, growing interest in asking people not only what you think you are, but what you think most people -- what other people think you are because in terms of discrimination in the health care systems, you know, they aren't going to ask you what you think you are. That is not going to be the way that bias manifests itself. It manifests itself based on someone else's judgment of what you are and that might be very different. I think in particular in the context of multi-racial categories, we might get some play from really beginning to ask people, in addition to what they think they are, what in general other people think they are.

Now I am going to expand even more to a sort of -- I think an area that some people might think is ludicrous, but I don't, and that is getting at the notion of an expanded and dynamic notion of race. We say over and over again that race and ethnicity are social constructs. That means that depending upon the social context in which you are, your definition might change and we always hear the examples of, okay, I can be in the United States here and get on a plane and go to Brazil and I personally might be in a different category in Brazil than I am in the United States.

But I would carry that even a farther step. As a friend of mine sometimes says, sometimes I feel blacker than other times. Depending upon the context, whether they are a majority in your world, in your sub-world that you are dealing in of your same race and ethnic groups, how much of a minority it is, it might have important implications for how you experience race.

This suggests that we need to get, I think, or at least explore the idea of race as an exposure variable and that you might -- if you spend your entire life in a segregated community in which you basically never have contact with whites, you might have a very different experience of race and that might have very different implications for what race means for your health status, than if you are in a situation when you are constantly going back and forth between a situation in which you are in the majority and a situation in which you are in a small, small minority.

I can relate to this experience in my day-to-day life as well. So, all of this is to point people toward thinking in a more creative way about these narrow dimensions of race and ethnicity and really, you know, just saying that race is a social construct is not good enough and we have to think about what the implications of that -- what the implications are for understanding how race plays itself out and that means going beyond the simple questions and I think both exploring the idea of perceptions of race, in addition to self identification and getting at more -- exploring this issue of maybe there is a way we can think about race as an exposure variable and adding a dynamic dimension to race and ethnicity that really tries to go to the next step of understanding race within the context of a particular social setting. I think that might also offer some interesting scientific opportunities as we move on down the road.

Finally, I think there are ways that we can think more long term about what we need to do and I think meetings like this are extraordinarily important. I was recently at a really interesting meeting sponsored by the Human Genome Institute in which they drew together the leading molecular biologists, the leading computational mathematicians, computational biologists, leading ethicists, social scientists, behavioral scientists, who deal with genetics. The question was, okay, one of these days we are going to finish this map. What do we need to do 10, 20 years from now to really move this science on to the next stage?

It was a really serious effort to bring together an extraordinary group of scientists to really think out -- think 10 and 20 years down the road in terms of what we need to do on this broad area of understanding the role that genes play in health. Incidentally, one of the things that was discussed a lot was this idea of really trying to get at the interaction between genetic risk and social and behavioral environmental risk.

But I just use that as an example of the type of long term parade of thinking that needs to occur and I think meetings like this sort of move us closer to that point where we can seriously sit down and really think about what we are going to need 20 years from now, how we can lay the foundation now for having the scientific resources both in terms of human resources and in terms of scientific data and information down the road that really allow us to be at a different place 20 years from now than we are now.

I will stop there.

DR. MAYS: Thank you.

[Applause.]

Open it up to questions. Dr. Handler.

MR. HANDLER: Very appropriate you are going to the White House after I ask you this question.

After September 11th, the nation is aware of homeland security more than it ever was and if you look at the front page of today's Washington Post, there is a threat sitting out there. Nobody knows where, how, when, but it is out there on the front page of the Post.

What I was thinking was what would happen if there is another event, biological, chemical or nuclear, how would our data system provide information on the health status of the population, not just immediate. The police do that. The long term care. What impact does it have on the environment, on the water you drink, on the air you are breathing and what data systems do we have in place to address that? We are really fragmented right now.

DR. KINGTON: Let me reshape that a bit and also the implications for understanding racial and ethnic differences, which is really the topic of the meeting. I think it is an important question. There are a series of working groups that are being formed now. There is an overall anti-terrorism task force out of the Office of Science and Technology Policy, a subgroup of the National Council on Science and Technology. I think that is it.

There are a series of subgroups and one of them I am cochairing on looking at the behavioral and social and educational and scientific needs to deal with terrorism. One of the few series I am on where there is a representative from the CIA. I think those are precisely the types of questions that efforts like that are moving toward. I think there are a total of six working groups altogether and they are all trying to get at what the real scientific needs are for responding to terrorism in the long haul.

I think getting better data quickly is one of those issues and that is reflected to some extent in the President's budget as well.

MR. HITCHCOCK: Just as a quick follow-up, tomorrow afternoon at 1 o'clock, the HHS Data Council is going to meet and I think the main item on the agenda is going to be Dr. Claire Broome(?) from CDC, who is one of the associate directors, is going to present on what was learned as far as the public health data goes from September 11th to where we are now. If you have a chance, you might want to come and listen.

DR. KINGTON: I also think that the RAND Corporation has also been retained by the Department of Defense, I think, to look at the big picture in terms of planning issues like this as well. So, there are a number of activities out there.

DR. MAYS: Our committee, I think, is going to take up this issue at the February 26, 27 -- is that when we meet? So, again, we are an open group.

PARTICIPANT: The Data Council Meeting is in 705A in this building on the 7th floor.

DR. MAYS: Dr. Breen.

DR. BREEN: I wanted to kind of put you on the spot a little bit, Raynard, in terms of you said that efficiency gains are possible in the surveys and while I think there is probably a little more orchestration we could do with our surveys and I just wondered -- it is probably something you have thought about and so if you could maybe share some of your thoughts with us.

DR. KINGTON: I am less concerned about groups like HIS and HANES not just because I used to work there, but because -- I mean, they have been doing it for a long time. So, they know how to do it well and relatively efficiently. But there are lots of data collection efforts that are going on out there that are not necessarily represented at this meeting. As you know, NIH funds a large number of large surveys on a regular basis.

It is a little bit different relationship we have with our grantees than these relationships, but still the point is that I think that there are ways to translate the knowledge that has been gained from large surveys that have been doing this for many years to the broader scientific community and we can think of all sorts of ways.

For example, we might -- one possibility might be to develop regional sort of clusters of survey research methodologists and sort of teams so that there can be a reasonably efficient way of translating gains and efficiencies across the whole country and those might be coalitions of some sort that might be tapped into by grantees and by federal agencies as mechanisms for conducting research in a way -- and if you do it -- appropriately plan it so that you could cover the entire country without having to reinvent the wheel every time or for that matter you might be able to hone down on a very narrow area and get expertise in a particular region on efforts that could be useful for smaller studies.

But I think we need to just think creatively. It is clear that it is getting harder and harder to do these surveys. It is costlier and costlier. I really worry that we aren't really taking the opportunity to make sure that when there are gains in methods, particularly in terms of retention and recruitment, that those methods can be disseminated more broadly and that as you well know, I mean, one of the most important determinants of response rates are really the skills of the interviewer, the skills of the people who knock on the doors.

There is a small cadre of those people in this country and I think that is something that also has to be dealt with. Maybe there is some way -- I mean, there are organizations like NORC(?) and others and NCHS and Census that sort of have locks on certain subgroups for those persons, but maybe we need to professionalize in a more standard way and have some type of way of making sure that researchers can gain access to experienced household interviewers, for example.

I don't claim to have all the answers, but I think to at least have the discussion and think seriously about how we can do a better job with the resources that we have so that when we do learn how to do this, when there are gains in methods or when barriers seem to increase, such as more gated communities. I mean, you just run down the list. I mean, the telephone surveys are increasingly difficult with the large number of cell phones. In Europe, it is a real problem because many people only have cell phones and the whole notion of a household survey is sort of -- is going away when you have everyone with their own little phone.

So, I mean, I think there are serious implications for the ways that the world is changing, that we need to have a mechanism for figuring out the implications for the surveys or studies that we do and then trying to reach some economies of scale in disseminating that knowledge more broadly.

DR. NEWACHECK: In your remarks you briefly alluded to the potential importance of contextual factors of influencing population health and we had an e-mail exchange about that, yes. I know you are developing some work in that area and this is a little bit off the target of our discussion today, but I think it is of importance to the Subcommittee on Populations.

Could you tell us a little bit more about what you are doing in NIH in the area of contextual variables and what we are really talking about here is moving away from a sole focus on individual level characteristics, those determinants of health like we do in the National Health Interview Survey but potentially adding contextual information to something like the NHIS, where we would collect information. The interviewer might collect information on neighborhood conditions, things like that. So, we would have a richer source of information to look at the characteristics of the neighborhood and the community, as well as the individual as they determine health and access.

DR. KINGTON: As you know, there is an increasing amount of evidence suggesting that community characteristics, measured in all sorts of different ways, seem to be predictive of health outcomes in a way that is above and beyond individuals' characteristics. So, I mean, we are even getting articles in The New England Journal of Medicine on this topic. So, that is some indication of reaching a certain level of consciousness in the medical and public health community.

The problem, though, is that, one, the broad area of so-called ecometrics, sort of how you go about measuring these larger characteristics is really poorly developed. We are nowhere near the level of the science in psychometrics and statistical characteristics of measures at the individual level. We have everything from exploiting secondary data sets to primary data collection where you count the number of broken windows and look at the number of conflicts in the street.

At the University of Chicago, Rob Sampson(?) has been a leader in this, among others. The problem is that we really haven't pushed the scientific methodology on those methods for gaining those dimensions. That is a real problem. So, the first thing is sort of we need the advance of science on the method of the measurement.

The second issue is that there are major statistical problems that have been completely ignored in the analyses. The way it happens now, you just sort of add a contextual -- and I have done this myself. I am guilty. I am here to testify to that, but it is -- I mean, clearly people aren't randomly distributed in these communities. Often those communities because they have certain characteristics -- and statistically that is a difficult problem and to be honest the epidemiological community has been sort of reluctant to delve into that front.

Economists and demographers have done a little bit more in trying to tease out statistically how to do that and that is the second big issue. So, what we are trying to do is improve the methods by having -- we are going to have a couple of workshops to try to push ahead the methodologic work on the measures, particularly because many of the measures were developed in urban settings.

I mean, what is going -- I mean, what do you do in a rural community where there are no street lamps, you know? I mean, what does it mean to sort of count broken windows in that context? It means nothing. So, we really have to sort of -- we really want to understand what is happening across the country. We need measures that are dynamic enough to really get at a wide cross section of the lived experience in America.

The second point is the statistical issue, which we are also trying to develop a workshop on as well. I am really trying to get at sort of taking this statistical method to the next level so we can really deal with this simultaneity bias issue and we haven't been very good -- more and more papers are flooding the journals on using these methods and I haven't seen one, not one that has really dealt with this -- or even recognized that it is a statistical issue. So, I think we are sort of pushing on that front as well.

We will push until we see movement. I have to head out.

DR. LENGERICH: I have just a couple of comments before you do go and they are not questions.

I am an epidemiologist and I teach that at Penn State University and related to your -- both of these comments come from the class that I teach. We do talk a little bit about ecological variables and the references that I have to go back are five and ten years old to even come close to any kinds of methods there.

So, keep moving forward. It needs to happen.

The second thing is I think it was the second class I used the term of "race" as exposure and about half the class was ready to walk out on me. Now, these people were behavioral psychologist sorts of people and they were really reacting strongly to me, using those two in the same sentence, so I think there is a real -- there is something -- I guess in light of your earlier comment, there is a real culture there that we also need to --

DR. KINGTON: That tells me we are hitting on something that is interesting. And it is not -- I don't think it is a substitute for this more narrow static measure. It is sort of trying to take it to the next level to say that there are differences. My experience in being a black person here right now is probably going to be different than when I am going down the road to the White House. My guess. It will just be different. I will let it go at that.

Oh, God, I am going to get myself into big trouble.

DR. MAYS: No, I am going to help you. I know you have to go. We are not going to let you get in trouble on that one.

Thank you very much.

I am going to ask Dr. Smith to join us at the table.

The word "NORC" was thrown out. For those of you who don't know, there are other places that conduct surveys, other than NCHS and we have someone here who is not from NCHS today to talk about it. The National Opinion Research Center, which is housed at the University of Chicago, has for several years conducted the General Social Survey, which Dr. Smith is the director of.

He is here to talk about a topic, which keeps coming up, which is the use of multiple race data.

Dr. Smith, thank you and thank you for graciously exchanging positions with Raynard so that we could get the best of both of you.

Agenda Item: Multiple Race Data Use

DR. SMITH: Thank you. It is my pleasure to be here today.

Twenty years ago when I was serving on a panel of the National Academy of Sciences, I contributed a paper called "The Subjectivity of Ethnicity" to the report of that panel. That report had two purposes. One was to discuss some of the complexities in measuring race and ethnicity and the second was to illustrate, along with several other papers that demographics were often as difficult to measure reliably and validly as attitudes were.

Today, the discussion of multiple race measures brings up new technical specifics but the same basic themes. One, that race and ethnicity are social constructs; two, how they are conceptualized, defined and measured depends upon the social and legal conventions that exist and the specific scientific and policy purposes that they are supposed to collect information for.

Three, that race and ethnicity are complex variables and, four, that the information that one collects on race and ethnicity will depend to a notable degree on the way that one measures them.

First, I would like to make some comments about multiple race versus one race measures. Multiple race measures differ from the traditional one race measures in several ways. First, multiple race measures record more members of all racial groups. There is a table in my prepared remarks, which illustrates the kind of theoretical basis for which -- under which this occurs, but basically it is pretty simple, simply secondary, tertiary and other non-primary racial identifications are drawn in by multiple race questions.

Whether you are primarily white or black or Asian or any other particular group, if you have secondary or tertiary identifications, multiple race questions allow you to express those, while a single race question forces you to choose between them. We see, in fact, that this is borne out when we look at the difference in the distribution of single race identifiers versus single plus multiple race identifiers on the 2000 census.

For example, whites are 75 percent on one race identifiers but when we add in multiple race identifiers who have white as one of their races, whites become 77 percent of the population, a gain of 2 percentage points. Blacks move from 12 to 13 percent. Asians from a little less than 3 percent to a little more than -- a little less than 4 percent to a little more than 4 percent and all other groups gain.

The gains are not all of the same magnitude. The number of Native Hawaiians more than doubles when one counts multiple race dimensions and the number of American Indians almost doubles. Other groups all gain considerably smaller rates. But all gain.

The second difference between one race and multiple race is that -- it was already alluded to -- that multiple race questions expand the number of racial identifiers, mostly by drawing in marginal members. The sole identifiers are already identified by single race measures. So, what multiple race does is bring in people who have more than one identification and allow them to be enumerated under each of their identifications.
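[Editorial illustration, not part of the testimony: the counting rule Dr. Smith describes, in which a person with more than one identification is enumerated under each of them, can be sketched as follows. The respondent records are hypothetical and the function name is an assumption made for this sketch. It is not the Census Bureau's tabulation procedure.]

```python
from collections import Counter

def tally(respondents):
    """Return (alone, alone_or_in_combination) counts by race.

    `respondents` is a list of lists; each inner list holds the races
    one person reported. All data passed in here is hypothetical.
    """
    alone = Counter()
    alone_or_combo = Counter()
    for races in respondents:
        unique = set(races)
        if len(unique) == 1:
            # Sole identifiers: already captured by a one race question.
            alone.update(unique)
        # Every reported race is counted, so multiple race persons
        # appear once under each of their identifications.
        alone_or_combo.update(unique)
    return alone, alone_or_combo

# Hypothetical sample of six respondents.
sample = [
    ["White"], ["White"], ["Black"], ["Asian"],
    ["White", "American Indian"], ["Black", "Asian"],
]
alone, combo = tally(sample)
for race in sorted(combo):
    print(race, alone[race], combo[race])
```

[Because multiple race persons are counted under each identification, the "alone or in combination" counts add up to more than the number of respondents, and every group's count is at least as large as its "alone" count.]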

Third, except for the census itself, many individual multiple race categories, for example, a person who is black and Asian or someone who is Asian and white and American Indian will be too small for useful analysis.

There is nothing particularly unique about this situation, however. Many of the existing racial and ethnic groups, such as American Indians or Native Hawaiians or many of the ethnicities within racial groups, such as the Lithuanians or Asian Hispanics are already smaller than some of the multi-racial groups.

Alluding to the remarks that were made earlier, the United States is made up of many different diverse groups, many of them quite small in size and multi-racial measures do not fundamentally change the fact that many are small, that they are difficult to measure with precision.

Now, there are a couple of things that one can do to deal with this problem. One is one can pool data across surveys. The Current Population Survey interviews 50, 60 thousand households per month, a large number, but for measuring some groups, small; similarly the National Health Interview Survey is a large sample, but if different rounds of the CPS, different rounds of the Health Interview Survey and other surveys were pooled together building up cases with the new observations, one would be able to take and get better, more precise measurements of racial and ethnic groups and examine groups that would otherwise be too small to be usefully examined.
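[Editorial illustration, not part of the testimony: a rough sketch of why pooling rounds helps with small groups. It assumes simple random sampling and independent rounds, which real surveys such as the CPS and the Health Interview Survey do not satisfy because of their complex designs, so actual gains would differ. The prevalence and sample sizes are hypothetical.]

```python
import math

def se_proportion(p, n):
    """Standard error of an estimated proportion p from n cases,
    under the simple random sampling assumption stated above."""
    return math.sqrt(p * (1 - p) / n)

def pooled_se(p, cases_per_round, rounds):
    """Standard error after pooling `rounds` independent rounds,
    each contributing `cases_per_round` members of the small group."""
    return se_proportion(p, cases_per_round * rounds)

# Hypothetical: 100 group members per round, true prevalence 30 percent.
single = se_proportion(0.30, 100)
pooled = pooled_se(0.30, 100, rounds=4)
print(f"one round:   {single:.4f}")
print(f"four rounds: {pooled:.4f}")
```

[Under these assumptions the standard error falls with the square root of the number of cases, so pooling four rounds halves it.]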

Another thing that one can do is that they can use -- approach the measurement of groups as -- in multi-racial measurements as -- consider them as alternative ways of measuring. For example, in a model, one can look at one race, blacks, versus all blacks; in other words, specify two different models to explain a health condition. If in implementing it one way, just looking at people who are one race, blacks, you get the same results as including one race plus multiple race, blacks, in that, then you have demonstrated that your results are robust and you have more reason to believe that the racial differences are real and compelling.

Similarly, if one looks again, taking the example of -- looks at people who are identified only as white, only as black and then those who identify as both black and white, if one looks at those models and finds that the white/black multi-race group is intermediate in where they come out on results, say, for example, on levels of hypertension, then, again, one has added evidence that they can believe that the racial differences are robust and real.
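[Editorial illustration, not part of the testimony: a minimal sketch of the two robustness checks just described, using simple group rates in place of a fitted model. The records, the outcome coding (1 = has the condition) and the function names are all hypothetical.]

```python
from statistics import mean

def group_rate(records, include):
    """Mean outcome among records whose set of races satisfies `include`."""
    return mean(r["outcome"] for r in records if include(set(r["races"])))

def robustness_summary(records):
    white_only = group_rate(records, lambda s: s == {"White"})
    black_only = group_rate(records, lambda s: s == {"Black"})
    black_any = group_rate(records, lambda s: "Black" in s)
    mixed = group_rate(records, lambda s: s == {"Black", "White"})
    low, high = sorted((white_only, black_only))
    return {
        # Check 1: does the gap hold under both definitions of "black"?
        "gap_black_only": black_only - white_only,
        "gap_black_any": black_any - white_only,
        # Check 2: does the black/white group fall between the two?
        "mixed_rate": mixed,
        "mixed_is_intermediate": low <= mixed <= high,
    }

# Hypothetical records, invented only to exercise the functions.
records = (
    [{"races": ["White"], "outcome": o} for o in (0, 0, 1, 0)]
    + [{"races": ["Black"], "outcome": o} for o in (1, 0, 1, 1)]
    + [{"races": ["Black", "White"], "outcome": o} for o in (1, 0)]
)
summary = robustness_summary(records)
print(summary)
```

[In practice the comparison would be made on model coefficients with appropriate survey weights and standard errors. The sketch only shows the logic of specifying the group two ways and comparing the results.]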

Fourth, while multiple race measures capture a richer and more accurate picture of people's ancestry, they do not necessarily get full and complete information. I will mention just several reasons why this is so.

First, people tend to omit racial ancestries when their background becomes too complex. If someone has three or four or five racial ethnic backgrounds, they usually do not take and report all of those. They simplify their presentation.

Even Tiger Woods is usually known as being black and Asian and that does not fully describe his racial and ethnic background.

Second, even moderately complex racial and ethnic backgrounds are often not fully reported by informants, even spouses. So, people are much better and more detailed in their reporting of their own racial and ethnic complexity than even well-informed members of families, such as spouses. So, when, because of the medical condition of people or because, as in the census, much information is being reported by informants within the household, not by each individual member, self-reports are not used, one will get a less complete picture, even if one has a multiple race question.

Finally, ancestry in lower status racial and ethnic groups will be underreported by people.

Now I would like to touch upon the issue as to how many multi-racial people there are in the United States. The census found that 1.9 percent of adults or 2.4 of the population as a whole reported multiple race. In our 2000 General Social Survey, we got a considerably higher number. We found that 5.5 percent of adults mentioned more than one race. If one excludes from this figure, people whose multiple race included Hispanics as one of the dimensions, a group which the census does not consider as a racial group, then one finds that 3.3 percent of adults gave multiple racial mentions compared to, as I said before, 1.9 of adults on the census.

The GSS figure is higher than the census figure, in part because, as I alluded to earlier, all the reports are self-reports and, therefore, there is less underreporting on the GSS and also because the form and format of the GSS question makes it more facilitating to mention multiple races than the check or mark all that apply approach of the census.

Now, while the census and the GSS find only a relatively small number of multi-racial groups, as I alluded to earlier, this is already larger than many other ethnic and racial groups that are being measured. And it is a number that is going to grow. First, there is more intermarriage across racial and ethnic lines than in the past. Both the declining social barriers to intermarriage and the changing immigrant mix will tend to further increase the number of people with multiple racial ancestry in the future.

Second, as people become more exposed to censuses, administrative records and other data sources that use multiple racial questions, they will become more used to reporting the full complexity of their racial and ethnic background.

Third, as the old social conventions, thinking of self as only belonging to one race, further erode, there will be -- and as the new multiple racial standard gains wider social acceptance, we will get more multiple racial mentions.

Now, one of the problems in adopting the new multiple racial standard that the census and OMB have adopted is the difficulty of comparing figures following this new standard with figures using the old standard. Now, in surveys moving forward, the problem will be minimized because both federal and non-federal surveys will probably adopt a standard designed to be equivalent to a census standard because generally this is what has happened both in racial and in non-racial measures in the past. There has generally been a movement towards census standards there.

However this will not help at all in terms of historical records, past censuses themselves, statistics collected, death and birth certificates and the vital statistics system, for example. There, the challenge is to collect data that both captures the accuracy and richness that is provided for by multi-racial data, but also allows a straightforward comparison to data collected under the old standard.

Essentially what one needs to do is try and determine not only all the racial and ethnic identifications that people have, but try and get them to express a primary identification. This is the closest we can come to duplicating the old standards without simply having two sets of questions, which is burdensome and confusing to people, one following the old standard and one following the new.

The census form, unfortunately, is not well-adapted for this. Because it is a self-completion form, on which they ask you to mark all that apply, there is no indication of primary identification. Other techniques can be used, which capture the richness of multi-race and which do provide that information.

The General Social Survey, for example, in its multi-racial question adapted to match all the categories that the U.S. Census used, did do one thing different because it was an interviewer-administered survey. It recorded multiple racial mentions as first, second and third mentions, recording them in that order.

Past research indicates that people tend to mention their primary racial and ethnic identifications first. So, this gives you some basis and in some analysis we have done, we have taken as a default the first mention as the primary mention. A technique that goes beyond this and is even stronger in this regard would be one based upon the way the General Social Survey has since the early seventies measured ethnic identification.

Here we record up to three ethnic identifications, again, designated as first, second or third identification and then we go beyond that by asking which of these identifications do you feel closest to; therefore, explicitly asking people to give a primary identification. Now, not everyone is able to do that. Some people say I cannot give you one. These are my identifications. There is no primary one.

But a substantial number of people can and this does help to identify primary ethnicity or race and, therefore, using techniques like these would tend to enable a better comparison of the new standards to data collected under the old standards and, therefore, more reliable comparison, especially when one is talking numerators and denominators from different data sets.
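As a rough illustration of the two techniques just described, the following Python sketch picks one primary identification per respondent. It is only a sketch under stated assumptions: the function name, the argument names and the category labels are hypothetical and are not taken from the General Social Survey itself. It uses the explicit "feel closest to" answer when one was given and otherwise falls back on the first mention as a default.

```python
from typing import Optional, Sequence


def primary_identification(
    mentions: Sequence[str], closest: Optional[str] = None
) -> Optional[str]:
    """Pick one primary identification for a respondent, or None.

    mentions: identifications in the order the respondent gave them;
              only the first three are kept, as in the testimony.
    closest:  the answer to a follow-up "which do you feel closest to"
              question (hypothetical field); None when the respondent
              could not or did not name a primary identification.
    """
    recorded = list(mentions[:3])
    if not recorded:
        # No identification was recorded at all.
        return None
    if closest is not None and closest in recorded:
        # An explicit primary identification takes precedence.
        return closest
    # Default assumption: treat the first mention as the primary one.
    return recorded[0]


# Hypothetical respondents, for illustration only.
first_default = primary_identification(["White", "American Indian"])
explicit_choice = primary_identification(
    ["White", "American Indian"], closest="American Indian"
)
```

Here the first respondent is assigned the first mention and the second is assigned the explicitly named identification. An analyst who does not want to impute a primary identification for people who say there is no primary one would return None in place of the first-mention default.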

Race and ethnicity are difficult variables to measure, given their changing social definition and the complexity of people's ancestry. Multiple race questions improve measurement by providing responses that can capture people's ancestry more accurately and by measuring the socially recognized and growing segment of the population. But the new multiple race standard also causes problems. The census item is ill-suited for comparisons with other data sources using the traditional one race standard.

Multiple race questions add even more small groups to the many racial and ethnic categories that already exist and multiple race measurement is sensitive to the precise way in which items are worded and administered. Still, the measurement challenges are not reasons for avoiding collecting multi-race identifications, merely indicators that race and ethnicity must be measured carefully.

Thank you.

DR. MAYS: Great. Thank you.

Agenda Item: Questions and Answers, Multiple Race

Let's open it up to questions. Start with the committee.

MR. HANDLER: I deal on a daily basis with the problems that you identified, but there is no solution to them. I deal with birth and death statistics for the American Indian population and the problem I have with the multi-racial identifier on the 2000 census is that it can't be projected outward for the Indian population because births and deaths are collected the old ways. How do you project out a multi-race identifier? Using a single race identifier -- well, I shouldn't say single race, but if one race was reported as American Indian, you get a count in the year 2000 of 2.5 million people. If you use a multi-racial identifier with one of the races being American Indian, you get 4.1 million people.

That is at the national level, but when you look at that at the state level, variability changes from state to state. So, you don't have the same thing happening in the same proportions in each of the states. Of course, you know, there is no denominator now in existence to calculate birth and death rates when the 2000 birth and death data become available for the American Indian population.
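The denominator problem described above can be made concrete with a little arithmetic. The Python sketch below uses the two national counts quoted in the testimony (2.5 million and 4.1 million); the number of deaths is invented purely for illustration and is not a real statistic, and the function name is hypothetical.

```python
def rate_per_1000(events: int, population: int) -> float:
    """Crude rate per 1,000 population."""
    return 1000.0 * events / population


# National 2000 census counts quoted in the testimony.
single_race_count = 2_500_000   # American Indian reported as the one race
any_mention_count = 4_100_000   # American Indian as one of the races reported

# How much larger the multi-race count is than the single-race count.
ratio = any_mention_count / single_race_count

# Invented numerator, for illustration only (not a real death count).
hypothetical_deaths = 10_000

rate_single = rate_per_1000(hypothetical_deaths, single_race_count)
rate_any = rate_per_1000(hypothetical_deaths, any_mention_count)
```

With the same numerator, the rate computed on the larger denominator is only about 61 percent of the rate computed on the smaller one, and because the ratio of the two counts varies from state to state, no single adjustment factor would carry over to state-level rates.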

Now what you addressed in your presentation was mainly survey data and I work with birth and death data. So, that is the other side of the coin. We have always had a problem with including American Indians in survey data because they are 3/10ths of 1 percent of the national population and this makes it even harder to be included in the sample survey data now.

DR. SMITH: Right. American Indians are a particularly difficult group, first of all, because you get considerably different levels of reporting when you ask a race question and American Indian is one category, versus when you ask an ethnicity question. You get many more people who will say they are ethnically American Indian than who will say they are racially American Indian.

It is also the most common race or group to be mentioned among -- for both whites and blacks, for someone who says first of all I am white or first of all I am black, the single most common group to be mentioned as their second mention is American Indian. So, there is a lot of very complex interracial measurement in that, as well as, of course, the issue of if you think it is important to capture tribal distinctions, which takes a small group and fractures it into dozens of even legally recognized groups, besides some which are only socially recognized, like the Lumbees of North Carolina.

MR. HANDLER: There are about a hundred thousand of them and they are not even federally recognized.

DR. SMITH: Unfortunately, as I alluded to, the census does not give a good basis. The closest that one can try to do with the census is to put together the different pieces of the census, to see if one -- you know, one looks at the Hispanic ethnicity question, the race question itself, the race composition of the family members and the ancestry question. Putting them all together, one can sometimes figure out -- have a better sense of the actual nature of individuals in it than any one of those questions provides alone. But it still, as you have indicated, it still is virtually -- there is no perfect solution. It is very complex. It is very messy.

MS. LUCAS: I am just curious. Have you ever shared any of your work, your previous work, with the Office of Management and Budget, either at the time when the race and ethnicity standards were being revised or since they have been revised and the guidelines for collecting and publishing the data have been issued?

The reason why I am asking this question is because I am with the National Health Interview Survey and our survey has asked a question along the lines of the one you asked on the General Social Survey about the existence of a primary identity for people who report more than one identity. There has been significant resistance to the idea of asking such a question and, in fact, the ability of the HIS to keep that on the survey is actually up in the air on a long term basis.

The idea, I think, that OMB put forward was that having now revised the standards to allow people to report more than one race on the census, it seemed to them almost a violation of those standards, to then turn around and ask people -- well, to pick one of them. I think that that is a somewhat simplistic view of the complexity of racial and ethnic identity and actually there is a fair amount of literature out there written by multi-racial interest groups that indicates that there are circumstances under which multi-racial people do have a primary identity and can select one.

So, I am just curious about -- have you ever shared some of this information or -- with them because the prevailing viewpoint right now is that surveys are not going to be allowed to ask this kind of question on a long term basis; whereas, I think most of the people using the data feel that it is an almost essential tool in helping to tease out what is going on with multi-racial reporting.

DR. SMITH: Yes. As you allude to, it does not surprise me too much that there is some resistance in OMB because of the sense that this is counter to the new philosophy, but it really isn't. I mean, simply they are not thinking it through. Everyone would still be asked to say tell us all of your racial and ethnic identifications and so we capture all that and then through some mechanism, like is there a primary one or the rank ordering of them or something, one gets additional information and information that is very important to help deal with this issue of comparability to previous sources that did not allow the multi-racial mention.

In terms of OMB on the issue of racial identification, no one on the General Social Survey has ever been in touch with them. All of our data, as well as all of our reports are in the public domain, but we have not had any direct communications with them on this issue.

DR. MAYS: Okay. Thank you very much.

What I would like to do is -- we are at almost the end of our time -- is to be able to spend a little bit of time trying to bring us to a point of closure. The hearing over the last two days has been one in which there has been a lot of information that has been thrown at us, but I think many have begun to identify what the real challenge is and that is what the committee will do henceforth in terms of thinking through these issues.

So, with a little bit of time left, what I would like to do is to spend only about ten minutes hearing from individuals as to suggest directions for us to consider. We will be meeting as a subcommittee February 26th, I think it is -- 26th, I guess it is, and at that time what we will do is review what has been presented to us and decide a little bit about what we heard as being some of the themes, some of the really repetitive issues, some of the needs, some of the challenges so that we can kind of decide where we want to go.

Agenda Item: Discussion

So, I would like to take about ten minutes, open it up to the committee and then to the broader group for any suggestions or recommendations that you want to give to the committee for us to consider in our deliberations. The committee will be at the meeting. So, it is like yours can be short and I will give a little time to the audience, but just so that we are all thinking before we go to that meeting since it is such a short amount of time.

Paul.

DR. NEWACHECK: I think we got a tremendous amount of food for thought in the last couple of days and some wonderful ideas that could serve as the grist for a report and also some excellent areas for potential recommendations. I think what we need to think about as a subcommittee now is what is going to be the framework for our report, how we are going to structure it, how we are going to cull through all of the information that we have learned so far and what we will get in the future to fit into that report.

I think that is right now a little bit vague to me and I think we need to spend some time going through kind of thinking about this process forward. I would hope that each of us could over the next two weeks before we meet again think about what things came up in the last few days that might serve as the basis for recommendations; that is, what are the priority areas.

As you mentioned, many themes came up repeatedly, but there are also things that probably piqued each of our individual interests as well, that might be important to include in the report. So, maybe just kind of culling through that and kind of coming up with a list of things that we think are key themes that we want to consider.

DR. MAYS: I agree. I think that is a good point.

MS. HEURTIN-ROBERTS: I just want to say this because I was going to say it when Raynard was here, but he left and I don't want to forget it, although I wrote it down, but it won't mean the same thing when I go back and look at it.

I really liked his idea about the sort of parallel tracks and focusing -- having one track focusing on smaller groups, on smaller studies, more regional studies.

He was saying that in the context -- he was suggesting it in the context of getting adequate samples for the smaller groups. But I would suggest that it also might serve the purpose for getting the better contextual data that Paul requested because when we put everything into this big global pot, you lose an awful lot of information about context, even if we did get it or could have gotten it.

So, I think we need to think it is useful for both.

DR. MAYS: I agree. Anybody else on the committee?

DR. LENGERICH: I started to kind of categorize some things and what I was thinking I heard was that there are issues around sampling more or differently to capture more populations. That is one kind of a common theme that we heard. Another was kind of what data do we collect to identify individuals.

Then a third area is adding or considering what kind of contextual variables should be considered in the different surveys. Now, there are some subpieces that might be looked at, too; things like training and education in statistical methods. But I saw those first three as kind of three broad themes to look at.

DR. MAYS: Good. Anybody else?

MS. COLTIN: A fourth that I heard was tradeoffs, that in a zero sum game, are we surveying too many non-Hispanic whites or not enough?

DR. LENGERICH: I think the tradeoffs are going to be very important and that may be the very bottom line, but that would be -- to me that would be an issue related to how to do the sampling. Exactly.

DR. CARTER-POKRAS: One thing that kind of came up, also in sideline conversations, is in addition to focusing on how the data are disseminated to researchers, as well as policy makers and the community in which the data were collected, we also need to think along the lines of how are we sharing the instruments that are used to collect the information because significant money has gone into this.

I know Dale did mention that. Tim mentioned at the outset of this particular session, he talked about the new effort that is going to be done by the Department to sort of access the data from the data system. Well, we also should take a look and see what can be done to access the instruments not only in English but also in other languages in which they are translated because it can save a lot of money.

One of the students who wasn't able to be here today from George Mason University just wanted to make sure that her positive feelings about the surveys were also shared because she felt that this was extremely useful for a long doctorate student in accessing information for her dissertation, being able to access it.

DR. MAYS: I wanted to add a piece that I think is not necessarily a piece that came up in the hearing, but a piece that comes in putting this hearing together and that is I think the paucity of people who have expertise in actually using these data sets and who are focusing on the content of the groups that we are talking about, not people who can do a cross tab using race and ethnicity.

I quite often would get to someone that someone had recommended and it was like, oh, yes, I used it. I said, no, but what about the population. Tell me about the population slide. Tell me, you know, a little bit more. What you would find is that currently even though we have a focus on health disparities, I do have a concern about the limited number of people who have first of all the background, the technical skills and the commitment -- I mean, you know, it is like not the one paper person type thing, but the commitment to continue to use this.

So, there really is for me, clearly, from doing this, a training need that is there. I think it was alluded to in the sense of some of the questions that you raised while we were here today.

MS. LUCAS: I just wanted to raise the issue that when we look at racial and ethnic disparities among groups and a lot of the talk that we have had here is focused on, you know, looking at smaller population groups or the ability to look at smaller population groups that we don't have a good handle on right now, but I just wanted to mention in light of the discussion about, you know, sample size issue and maybe pulling away from the large, white population to be able to sample in other groups, that there is actually a fair amount of diversity within the white population itself that shouldn't be ignored from a health standpoint and the paper that Raynard and I are working on, looking at health characteristics of U.S. and foreign-born blacks and whites is illustrating that there is a population out there and we are not really sure exactly who they are, but they are foreign born persons who self-identify as white, who have health characteristics that are quite different from the total U.S. white population.

We tend to think of the terms "white" and "black" as generic, all-encompassing terms, but they do, in fact, contain a fair amount of diversity within those groups. I think when we are making decisions about changing how we sample in order to look at other population groups, we shouldn't ignore that, ignore that that is there.

We would have to consider whether or not it is worth trading off the ability to examine that in order to look at some other groups. I just wanted to mention that.

MR. HITCHCOCK: My point, to sort of tease out what Jackie is saying and what you were saying, too, Vickie, in terms of training, the same time we are encouraging training, we probably should also be encouraging more collaborative efforts between folks like you and Raynard and applying the few skills that we had out there, along with other researchers to get a better handle on what is happening here.

I was thinking the same thing when we had a presentation yesterday from the National Survey on Family Growth, when they were talking about various characteristics and their effect on welfare policies, we ought to pull in the welfare side of the department to work up in these reports, too, to show where we could take the next steps.

MR. HANDLER: I have a practical type of comment or overview. I don't even know if it is in the purview of this organization, this subcommittee, to even consider, but it should be, I think. That is how much are we spending on data collection activities of various kinds. If we make recommendations to expand, it is going to cost even more than we are spending now. So, the thing to also keep in mind is are there possibilities of cutting back certain things?

Nobody wants to cut back anything. Another alternative is are there additional funding sources besides --

MR. HITCHCOCK: One percent of the Department's discretionary budget we are spending on data collection.

MR. HANDLER: Well, there might be outside funding sources, outside of the Department, that could be accessed. I know when the Blackfeet tribe in Montana thought that the census in 1990 didn't count them properly, the Census Bureau said we will conduct another survey for you, but you have to pay for it. The Blackfeet tribe realized that if they got more of their people counted officially by the Census Bureau, it would bring more money to the tribe. So, they actually paid money to the Census Bureau and they did come out with more people enumerated the second time around.

Now, there was a representative here earlier from Montgomery County, Maryland. Possibly, Montgomery County, Maryland would like to have a Health Interview Survey conducted among all the people in Montgomery County and if they were willing to pay something towards that, maybe we would do it. That is another example.

But there might be other sources besides just the Federal Government to get the Federal Government's work done.

DR. MAYS: Good point.

What I am going to do is I am going to take some comments. I am going to ask each of you to keep your comments short so that I can get everyone in so that we can also get out at a reasonable time.

MR. PARK: My name is John Park. I am with Montgomery County.

Sure, I mean if we had the resources, we would like to even pay to get the kind of information that we need, but we felt that given all the resources that are available, we shouldn't have to.

Well, anyway, what I am here to talk about is hidden below all these racial and ethnic issues are the issues of migrant workers and the refugees. Now, the refugee characteristics differ as the world events go by, but the refugees of one country, for instance, El Salvador, are very different from the people who are actually living in El Salvador, like in Vietnam.

I am a Korean by birth. The people who have come from Korea, during the Korean War, to the United States, are different from the people who are coming now. So, when you put all that kind of information in there, I think these type of differences, disparities in the major groups of people in certain ethnicities also differ.

So, I wonder if the issues with the migrants and the refugees are not being considered enough as we consider the racial and ethnic issues.

DR. MAYS: Thank you.

DR. COLEMAN-MILLER: I attended the racial disparity conference at NIH in 1985 every day and I look today and see that I still have a little bit of hope that things are going to change, more hope than I did at the end of 1985.

I am going to ask that as you make your transformations and hopefully that is what they will be, that you call the historically black colleges and ask those students to come in and watch you do it. Let them actually take class time. Work it out with the HBCUs so that in class time they can be sitting and learning as you try to switch this paradigm, as was suggested by Dr. Kington. Just observing will teach them much of what they need to know so that they can be sitting at this table with you in 2010 and it won't all be only one minority presenter.

The second thing is when we talk about the cost of getting this data, clearly, CMS could talk about the cost of the disparity and we would all lose our breath over it. So, it is a major tradeoff in cost.

The third is I think you are all doing your work. I just don't think right now that you are getting the job done and there is a difference.

MS. LATHAN: Monica Lathan from the American Public Health Association.

I have three observations. I really liked the idea of a collaboration or regionalization and standardization of data collection so that you can obtain a more comprehensive data set that can be used in multi-organizations and in multifaceted ways.

The part about data not sexy, that kind of appealed to me in the sense that I saw that there is an emphasis on the data collection and the descriptive and the identification of problems. That is where the data part comes in and it is actually the first step or the first part; whereas, you kind of need to present as well what you do with this data in terms of policy, in terms of program development and that is actually the sexy or appealing part.

However, the data collection and the descriptive and the demographics are the actual crux of identifying the problem.

In terms of -- I am glad that the skills of the interviewer in surveys was mentioned as an important part. Being a former field epidemiologist, I thought that was really good to observe that. Perhaps to train people to become more skilled in interviewing, become more receptive to the communities themselves.

Lastly, I had a statement about the multiple race data. It is a nice idea. However, it seems like it is not such a priority in terms of the fact that we have so much more to do in terms of things that are done with the data. It can be somewhat tedious to a person answering surveys. For instance, I consider myself black. I have a heritage in American Indian, white, but I am not naming all of that.

I think that the important thing that Raynard mentioned was what do you think you are and what do others think you are. I think it really boils down to that, particularly in health systems. When they see me, they see an African American woman. They don't see all of the other things that make up my culture or heritage.

When I present, I present as an African American woman. So, I think that you need to take that into consideration and I think that that is something that would be good to do on the long term to find out every specific part of one's being. However, I don't think that that is a big priority at this time when there is so much more work to be done.

DR. MAYS: Great. Thank you.

MR. DAWES: I am Jim Dawes(?) with the Asian and Pacific Islander American Health Forum.

I want to relay one piece of information and make one recommendation. The information is that at a conference, the ICCC, Intercultural Cancer Council Conference, on Sunday, an official from NCHS, from the mortality section, indicated that for the purposes of calculating national mortality rates, they would create the denominator for the multi-race people. In other words, they would reassign people who checked more than one race category to one of them, so that they could create a standard denominator.

Obviously, that is going to have a lot of implications, especially for Native Hawaiians, American Indians. I am going to be very curious to see what that looks like.

Especially also curious to see how the states -- whether the states follow that and whether other agencies for other purposes follow that.

The recommendation I have is that I did like the recommendation of the parallel track of using other studies to learn about and to measure health disparities. What I would recommend further is that Healthy People 2010 then accept those as valid data upon which to measure and create goals on health disparities because right now they are only using these federal data sets. If you have looked through the Healthy People 2010 document, you see a lot of stars, asterisks, in the American Indian and Asian sections because the data are not there.

So, if we are to accept those studies, let's accept them within the systems that we have.

Thank you.

DR. MAYS: Thank you.

MR. FAYE: I am Walt Faye(?), D.C. Department of Health. I am so glad Beverly called me this morning to come over. She said come right now. I am just sorry I didn't know about it yesterday because we in the Department of Health are very, very interested in getting good data. As we now move to get evidence-based approaches to some of the disparity problems in the District, we need this kind of input.

One question I have is probably an agenda item for your next meeting. As we move into HIPAA compliance, the D.C. Department of Health, in looking at the standards for privacy of individually identifiable health information, there is concern. Will this be a constraint in terms of collecting the data, which we all need to make a lot of other decisions.

So, I looked over this agenda and I didn't see this discussed. I think that would be of use to us, especially the D.C. Department of Health. So, I just leave that as a comment.

DR. MAYS: Thank you.

MR. KATZ: Michael Katz(?) from CMS.

I think wherever we collect data, we need to collect language preference data. We are starting to get more of that, but it is really way down on reliability and even collection. So, it should be routine. We need language preference data. Language preference data absent subpopulation breakdowns, for example, Asian American and Pacific Islander, language preference data can also be a proxy if for nothing else for the breakdowns of the subpopulation.

The other part is linkage has to be made between disparities and culturally and linguistically appropriate services. There is an article by Cindy Brock and Irene Frazier, I think it was December 2000, that talked about how to set up studies that would basically look at disparities and culturally and linguistically appropriate interventions. We talk about both things. We don't really link them. We talk about like, for example, if you are looking at health promotion and disease prevention information, you can't just do translations. They have to be culturally appropriate translations. Verbatim translations don't work. Yet many people in government organizations still use verbatim translations. It doesn't cut it.

We have a Horizons Project at CMS that is totally dedicated to looking at culturally and linguistically appropriate services and yet even some of their recommendations can't make it through the bureaucracy because of the low percentages of population. When you look at budget, you say, oh, that is a very small percentage of the population. We can't expend that much money there when we have all these other things to do.

I think Dr. Kington's point is well taken. We have to look at the breakdown of populations in smaller parts of the country, not a national figure, but local figures. I think something like 70 percent of Asian populations are in like ten American cities. We have to have better population breakouts and to do that, again, the data has to support all of this.

It has to be data driven so that we can look at and evaluate the effects that the interventions can have. So, I think in terms of disparities we need a little broader look. We have to include language for sure and we have to account for how class can affect disparities.

DR. MAYS: Thank you.

One of the things I would like to say is I want to thank people for their comments. I am going to try and take up Dr. Coleman-Miller's charge to us. I appreciate her comment that we are working. I am going to see if we can do the job, but I have to be honest about what that means. It means that we only have a piece of it and that part of our job is to turn some of this back over to other people in terms of our professional organizations to come into partnership with us, in terms of things like the HBCUs to rise to the occasion.

So, part of what is going to be important is not just what we do but who else we give it to because I want to be very clear about this committee's ability to do all and be all to all people, which we cannot. So, I want to be very honest about that.

But I do think that when it comes to in particular looking at health data and health statistics, this committee has an expertise in terms of both who sits on it and its ability to access individuals involved in it.

So, I think in terms of the strength of that piece, that is where I think our expertise and our efforts will go. But in terms of some of the other -- and I dare say in this building, the word "advocacy" -- and I am not sure if we are supposed to or whatever the rule is, you know, a blank moment on the paper or whatever, but that is where, you know, we can do the selection. We can do some analysis and we have to do some give away.

So, I really appreciate that people are here, but I am also going to challenge you to the same thing that Dr. Coleman-Miller challenged us to, which is besides the work, you have to do the job and the job is going to be we are going to try and give you back some of what you asked for and ask you to help us in this process.

One of the other things I want to do before we leave today is I think one of the things you have to understand is that while I sit here as the chair of this group, that there is no way that I was able to do, you know, even an eighth of this alone. One of the things is that this really is a team effort. The committee, of course, helps in terms of, you know, what the content is, but we have some very clear individuals who, you know, some of you knew before you knew me in the sense that they were sending you e-mails begging you, better than I begged you obviously, to get you here and what have you.

I just want to take a moment at least to acknowledge their contributions because some of you may continue to contact them in the sense of sending them information, if you continue to have thoughts, et cetera. So, it is not just me you can send these things to, but you can also interact with other people.

When we began this process, many of you got e-mails from Susan Queen. Without Susan, we wouldn't have had a half of our individuals here. Susan worked pretty hard at getting them and making sure that, you know, they understood the task.

Dale was also instrumental in -- I think he knows the who's who in the federal agencies. So, he was also very instrumental in helping us to bring people together.

Debbie Jackson, who is sitting there quietly blending in with the audience, is actually a part of our staff and has been quite helpful in helping us to understand what we needed to do.

And you all knew Gracie before you knew me. Gracie, you know, greeted you as you made your way here and made sure, from all that I understand, that you think very nicely of the National Committee on Vital and Health Statistics. So, I appreciate that because I know for some of you we are a new entity to you.

So, my hope is that you will continue to stay in touch with us and we with you as we continue to have our hearings.

Olivia, the many times, you know, knows the other half of the world that Dale didn't know. So, you know, Olivia was -- has been good about also keeping us in touch with many other aspects.

Our new people came right to the cause in terms of the sense that they are brand new to this group and joined us and have been quite good at also trying to help us to get -- hit the ground running.

Of course, to the committee, I thank you for, you know, all that you have contributed to today. So, this was a team effort. So, thank you, everybody and we thank all of you for attending. We will let you know what we do next.

Thank you.

[Applause.]

[Whereupon, at 12:35 p.m., the meeting was concluded.]