[This Transcript Is Unedited]

NATIONAL COMMITTEE ON VITAL AND HEALTH STATISTICS

AD HOC WORK GROUP FOR SECONDARY USES OF HEALTH DATA

July 18, 2007

National Center for Health Statistics
3322 Toledo Road
Hyattsville, Maryland

Proceedings By:
CASET Associates, Ltd.
10201 Lee Highway, Suite 180
Fairfax, Virginia 22030
(703)352-0091

Table of Contents


P R O C E E D I N G S [9:05 a.m.]

DR. COHN: Please be seated. Good morning. I want to call this meeting to order. This is the second of three days of hearings of the Ad Hoc Work Group on Secondary Uses of Health Information of the National Committee on Vital and Health Statistics. The National Committee is a statutory public advisory committee to the U.S. Department of Health and Human Services on health information policy.

I am Simon Cohn. I’m Associate Executive Director for Kaiser Permanente and Chair of the Committee. Now I want to welcome committee members, HHS staff and others here in person and do want to welcome those listening in on the Internet. And for the benefit of those with cable, I do want to tell you that we actually are currently on the Internet. So as always, I would advise you to speak clearly and into the microphone so those on the Internet can appreciate our deliberations and testimony.

I want to again thank the National Center for Health Statistics for their hospitality and for hosting us for this meeting.

Let’s now have introductions around the table and then around the room. For those on the National Committee, I would ask if you have any conflicts of interest related to any of the issues coming before us today, would you so publicly indicate during your introduction. I would also expand that to even if there are no outright conflicts of interest, if there are interests or participation in the various bodies that will be testifying today, you probably should publicly indicate even though it is not precisely a conflict.

I want to begin by observing that I have no conflicts of interest although I am an active member of the American Medical Informatics Association which will be one of the testifiers today. Harry?

MR. REYNOLDS: Harry Reynolds from Blue Cross Blue Shield of North Carolina. I’m a member of the committee and no conflicts.

DR. CARR: Justine Carr, Beth Israel Deaconess Medical Center, a member of the committee, no conflicts.

MS. GREENBERG: Marjorie Greenberg, National Center for Health Statistics, CDC and Executive Secretary to the Committee.

MS. BRITT: Myra Britt, I’m a contractor to the Work Group.

DR. DEERING: Mary Jo Deering, National Cancer Institute, staff to the NHII Work Group of the NCVHS.

DR. LOONSK: This is John Loonsk with the Office of the National Coordinator for Health Information Technology.

MS. GRANT: Sharon Grant, Booz-Allen & Hamilton, Contract Support.

MS. MARTIN ALLISON: Christine Martin Allison, Booz-Allen & Hamilton, Contract Support.

DR. LABKOFF: Steve Labkoff, Director of Healthcare Informatics, Pfizer Pharmaceuticals.

DR. BLOOMROSEN: Meryl Bloomrosen, American Medical Informatics Association.

DR. OVERHAGE: Mark Overhage, member of the Committee and the Work Group, and while a member of AMIA, I have no conflicts.

MR. VIGILANTE: Kevin Vigilante, member of the Committee, Booz-Allen & Hamilton. No conflicts.

DR. SCANLON: Bill Scanlon, Health Policy. I’m a member of the Committee and no conflicts.

DR. STEINDEL: Steve Steindel, Centers for Disease Control and Prevention, staff of the Ad Hoc Work Group and Liaison to the Full Committee.

DR. TANG: Paul Tang, Palo Alto Medical Foundation, member of the committee, no conflicts, current Chair of AMIA.

MR. ROTHSTEIN: Mark Rothstein, University of Louisville School of Medicine, member of the Committee and no conflicts.

MS. JACKSON: Debbie Jackson, National Center for Health Statistics Committee staff.

MS. HAZENER: Beth Hazener, America’s Health Insurance Plans.

MR. CONNORS: Chuck Connors, Geographic Information Systems Development at CDC.

DR. EISENBERG: Floyd Eisenberg with Siemens Medical Solutions, also with HITSP Population Health Technical Committee and the IAG Quality Domain.

MS. SISK: Jane Sisk, Director of the Health Care Statistics Division at the National Center for Health Statistics.

MS. CONN: Heather Conn, Informatics Specialist, National Center for Health Statistics, CDC.

MR. LANDAHL: Morris Landahl, Office of the National Coordinator.

DR. COHN: Welcome all, as well as those listening on the Internet.

Before we move into the agenda review and before I actually turn over the microphone to Harry Reynolds, one of our Co-Vice Chairs to actually run the meeting, I do want to just make a couple of comments and go through agenda review, and some of this will be a little repetitious of what we discussed yesterday. But since we have actually everybody here this morning, it’s useful to sort of remind everyone a little bit about what we’re doing, the charge and, I think, the plan.

As commented, this is the second day of this first set of hearings. We intend, before we’re finished, to have somewhere between six and eight full days of open hearings and to hear from a variety of stakeholders and interested parties on the whole issue of secondary uses.

Now just to remind you, we have been asked by the U.S. Department of Health and Human Services and the Office of the National Coordinator to develop what I would describe as the overall conceptual and policy framework that begins to address some of the secondary uses of health information both for a way of thinking about it and certainly a taxonomy, and, I think as we began to talk yesterday, certainly a definition of terms. And there are a lot of terms out there that get bandied about, and I think if we come away as part of our report providing some clarity for HHS and the country in terms of all of this, this will be one very good outcome of the deliberations and work.

Now beyond that, and once again, probably the reason that we’re really here talking is we’ve been asked to develop recommendations to the U.S. Department of Health and Human Services on needs for additional policy, guidance, regulation and public education related to expanded uses of health data in the context of developing a nationwide health information network. And obviously, this is a broad area. I think our intent and our initial area, initial emphasis is on uses of data for quality measurement, reporting and improvement.

Now I should add that beyond these almost high-level recommendations and frameworks and all of that, I would also interpret our charge to include coming up with what I would describe as more specific, actionable recommendations for HHS in relationship to tools, technologies and other approaches that may be able to minimize any risks that we identify in the course of our work. So it isn’t just telling people to be educated better, or to enforce regulation better; if there are things that we can identify that really will minimize identified risks, I think that will once again be a very valuable outcome, and it gets us relatively concrete as well as makes our recommendations potentially actionable both in the near term and the longer term. So we actually do have a relatively wide variety of work products that we need to come up with, and we’ll talk about some of the activities and how the agenda is structured, moving from the general to the specific and then back again.

Now, as yesterday, I do want to thank all of the members of the Work Group for donating their summers to this activity. Paul Tang, Bill Scanlon, Mark Overhage, Mark Rothstein, Kevin Vigilante for being willing to serve. I do want to particularly thank Harry Reynolds and Justine Carr once again for donating their summers to be co-vice chairs of this activity. And then, of course, key liaisons and staff which includes John Loonsk from the Office of the National Coordinator, Steve Steindel, Mary Jo Deering, Marjorie Greenberg, Mary Beth Farquhar who hopefully we will see and John White who I know called in yesterday and may be calling in again today.

Obviously, we also have staff, Debbie Jackson and Cynthia Baur, who is here. I want to thank Erin Grant, who isn’t exactly staff but I guess is a consultant, as well as Christine Martin Allison, and of course Margaret Mosiaka, who is now sort of lead staff on this project. So obviously, thank you all for your participation and contribution, and I think it will be a very interesting set of hearings this summer, in case you were wondering what else you were going to be doing this summer.

Now the intent, as I commented, is structured in a way that we’ll be talking almost simultaneously about – you know, I apologize. I actually didn’t thank Mike Fitzmaurice, who I’m sitting here looking at, for his involvement and participation, and welcome this morning, representing the Agency for Healthcare Research and Quality. I’m sitting here looking at him and I didn’t really specifically acknowledge him as one of our liaisons. So thank you. I know I’m going to miss somebody in these comments.

Anyway, the agenda is really meant to be sort of this combination of looking at broad framework issues as well as more specific issues that involve quality and, as I was describing, tools and technologies and approaches to help minimize risk. And we saw some of that conversation yesterday as we started out with framework conversations and then dug down deeper into areas. Obviously, this morning we again are talking at sort of a high level, then we begin hopefully to talk about taxonomy and some definitions, and then hopefully drilling down as the day goes on also.

This is meant to sort of create a dynamic tension between sort of high level views as well as like let’s get real and stay grounded. And we all have to judge, as the hearings go on, about how successful we’re being on this. Certainly, at the end of today and into tomorrow morning, we will be talking about some of the structure of the August hearings so that we can fine tune whatever we need to do to make them more effective and useful based on the information we’re hearing today.

Now, the hearing today starts out, obviously, again at sort of the framework level. The topic is considering a framework for optimizing uses of data, and we’re very pleased to have Steve Labkoff as well as Meryl Bloomrosen joining us from the American Medical Informatics Association. They had what was a very successful symposium and session looking at secondary uses of data at the framework level, and we’re very interested in hearing, obviously in open session, some of your thoughts about that, as well as your bringing forward sort of concepts, approaches, frameworks and tools, and trying to think about how they might be able to help us in all the work that we need to be doing. So we’re obviously delighted that you could join us this morning.

Following our break, we have John Halamka calling in, and then we have Floyd Eisenberg joining us in person – and thank you – talking about the issue of standards and how that plays into our conversations. And then this afternoon we move into issues of definitions – anonymization, pseudonymization, de-identification and all of this – and the overall issues of security and how that plays in.

And then finally late in the afternoon, we’ll have a session talking about providers who capture use and share data and how this all begins to play out in that environment.

Late in the day, we’ll have some work group conversations. We really do want to make sure that we’ve gotten everybody’s thoughts and insights from the day recorded as we begin to move into the third day of our deliberations.

With that, I will turn this over to Harry. Harry, thank you for being willing to facilitate the session.

Agenda Item: Considering a Framework for Optimizing Uses of Health Data

MR. REYNOLDS: Okay. Our first panel’s in place, so that’s good. We’ll just go in – do you want to go in the order that you’re listed on the presentation, Steve and then Meryl.

DR. LABKOFF: Sure. Meryl and I are going to hand off side to side during the presentation. So I want to thank the Committee for inviting us to participate and to provide our thoughts about secondary uses of health data. AMIA has for several years now been trying to examine this issue and surface the major concerns and issues in the arena. In the course of today – I guess I’d better do the slides – I generally will be going over some definitions that we came up with and an overview of things that we did in 2006, and then we’ll get into the conference that we just convened back in June, go over some pre-conference working groups and what their findings were, and then get through some of the preliminary themes that have emerged and some of the thinking around those issues.

I do want to stress that the work we’ll be presenting is not conclusive work. The meeting convened only a month ago, and most of the work in distilling and coming up with final conclusions is still ongoing. So we’re just sort of giving you a preliminary look at what our thinking is and what the meeting came out with. We would be very pleased to come back in a few months’ time when papers are written and presentations are more honed to give you sort of final thoughts on the subject.

So with that, I’ll hand it off to Meryl.

DR. BLOOMROSEN: Thanks, Steve, and I want to just echo Steve in thanking you all for inviting us to be here. We actually did testify, in front of perhaps the full committee, last year – November, I believe it was – and that was to give you a summary of our work from last year. That testimony actually had the benefit of six or seven months having passed since we first convened the meeting.

But to Steve’s point, our 2007 meeting occurred just about a month ago, and we’re very delighted to be here and would be pleased to answer questions, but would caution. I hope all the answers aren’t, well, we’ll get back to you. But there are quite a few people thinking about this from a multi-disciplinary approach, and we’d be happy to come back, as Steve said, to augment or clarify any additional information that might be necessary.

Let me just digress for a minute and make sure, as we did before, that we thank the leadership of AMIA of Paul as our Chair and to extend thanks to the Board of Directors, Don Detmer, our President and CEO as well as David Bates, our incoming Chair, Charlie Safran who may be on the phone listening in and also available to participate in this presentation if he can. But very much this is a concerted team effort, and we’re very much wanting to express our thanks to those folks.

So to take a little bit of a step back: a year ago or so, we took our first look at secondary use of data. And in order to do that, we came up with the definition that you see in front of you – well, what is primary use of health data. And I will say, and I understand from just sidebar conversations, that there are quite a lot of discussions happening about the terms primary and secondary in and of themselves.

We in fact had significant conversations and discussions both last year and this year about the use of these terms. But for starting purposes, we have made a distinction between primary use and secondary use. And as you can see here, in our thinking, primary use of health data refers to data that are collected about and used for the direct care of an individual patient. Most of the time, in our way of thinking about this, it’s occurring in real time as the clinicians and the providers are in fact taking care of the patient.

And we recognized that there might be some connotations to this definition, but we felt like we had to start somewhere. And in fact, when we looked at where existing policies, regulations and guidance concentrate, it is in our minds mostly concentrating on the primary use of data. And we found that there was reason to elevate the level of thinking and thought that was going into other uses of data. Next slide.

So we began with some definitions that are, as Steve mentioned, in process. We’re honing them, we’re refining them, and we’re trying to distill them. And at this point in time, the secondary use of health data is, in our minds, defined as non-direct care use of personal health information that includes, but is not limited to, analysis, research, quality and safety measurement, public health reporting and monitoring and surveillance, payment, provider certification, accreditation and marketing and other business activities that include strictly commercial activities. And as you can see on the slide parenthetically we kind of say a/k/a everything else.

I will point out, and I think Steve will go into a little more detail, that as we looked at this definition, some of the terms within our definition become subjects of discussion. And as those of you and several of you around the table were either at our meeting last year or even at our meeting this year, Simon and Justine and Mary Jo, you probably remember that the use of the word commercial became fairly contentious and subject to a lot of discussion. So that might be something that we’ll suggest and continue to look at ourselves.

We also want to point out that we’re identifying the second bullet which is a secondary use of health data occurs when data are used for purposes other than those for which they were originally collected or obtained. So we think between the two statements here that it’s a good reflection of trying to make this distinction that we thought was important between primary and secondary use.

So last year in April of 2006, AMIA convened an invitational meeting, and we published the findings of that meeting in January of this year in the Journal of the American Medical Informatics Association. The 2006 meeting was – we had about 30 individuals representing various stakeholders’ viewpoints and constituencies. And in our report, we identified a couple of key things from that initial convening.

We found that there were widespread and incredibly diverse secondary uses of health care data. We were sort of touching the tip of the iceberg when people started to talk about what uses they were making, or could make if they wanted, of data. And that public trust issues – those of the consumer, the patient, the patient’s caregivers – dominated.

In fact, a lot of the reason for that is there seemed to be a gap in an understanding or an awareness of what uses there were of data along the continuum. There seemed to be multiple people along this chain that are obtaining and using data and then re-obtaining and using data, and there seemed to be a consensus that the general public was not aware of all of these uses.

We found last year that technological and technical capabilities – and to your point, Simon, if the committee is going to be looking at tools and techniques and technologies – were outpacing policies, procedures, guidelines, et cetera, in the sense of what could and should be done with data.

So the ability of technology to allow people to de-identify but then re-identify patients, or to merge different data sets even if they’re disparate, or to link health data sets with potentially non-health data sets – for example, employment types of data – became of concern and of importance to our group last year, because policies on the books, national and even state, did not address these futuristic uses of data.

And then also that we felt there was considerable need for attention and leadership at the national and state level to explore these issues, and we’re obviously thrilled to see that the Secretary and NCVHS and AHIC and others are beginning to look at this. Next slide.

So last year, we proposed a framework for secondary use of health data that’s identified on this slide, and I’d like to mostly identify a couple of the key words. You’ll see them again as Steve outlines what we continued in 2007. The transparency issue – that people need to know, all of us need to know what is going on if there are people collecting our data, the transparency of that, the fact that we should know how and what is going on is very important.

We early on last year acknowledged that data ownership was going to get us into a circular conversation, and we didn’t want to simply restrict our conversations to ownership. The group felt that conversations and issues relating to privacy and security of data were able to address ownership issues. But in fact we felt that data stewardship and the control of the data and the use of the data sort of surpassed issues relating to ownership.

There was a general consensus that privacy policy and security still needed attention, that public awareness and education were important, and that uses and users of data were in fact a complex issue. And what you’ll see is that we started last year with the recommendation that AMIA could or should, or someone should, look at developing a taxonomy of users and uses and, again, national leadership. Next slide.

So with that as a backdrop, Steve is going to dive into what occurred about a month ago when we convened our 2007 conference on secondary data. And I would like to mention that this year we actually wound up having a hundred-plus people at the meeting. We broadened the range of participants and continued the dialogue. Steve.

DR. LABKOFF: Thank you, Meryl. In addition to that, one of the things that we undertook between ’06 and ’07 was to tease out three specific areas that we felt needed more work than could be accomplished in just a couple of days in a meeting, and we convened three working groups, which we’ll talk about in turn: a working group around taxonomy that was led by Chris Chute and Stan Huff, a working group on data stewardship which was led by Melissa Goldstein, and an identification/de-identification working group – I forgot who led that off –

DR. BLOOMROSEN: Doug Barton.

DR. LABKOFF: Sorry, Doug Barton. And we’ll go through what those working groups found. Then we decided that the nuts and bolts of the meeting was going to be to attempt to tease apart a framework for how to consider secondary use of health care data in four different domains, and those domains were the research domain, public health, quality, and the commercial space. As Meryl indicated, the commercial space became somewhat cantankerously debated during the course of the meeting.

We also came out with some principles around data stewardship. And for anyone who’s gone on the Internet, you can get all of the pre-meeting work and preliminary materials at the link that’s included on our presentation here.

So to start off, we’ll talk about the data stewardship area. And I guess the first thing was the definition of data stewardship that we used. We decided that data stewardship would encompass the responsibilities and accountabilities associated with managing, collecting, viewing, storing, sharing, disclosing or otherwise making use of personal health information; that principles of data stewardship would apply to any and all personnel, systems and processes that engage in health information storage and exchange within and across organizations; and that they would provide guidance for all of the discussions about secondary uses and lay the groundwork for principles. This became sort of the starting point for that working group.

Now one of the tasks for the working group was to come up with some preliminary principles, and the answers to the question of why we would establish them were these: They provide a rationale and safeguards for legitimate secondary uses of health care information. The principles could be used to describe enforcement mechanisms that provide reassurance of appropriate usage. And they describe the benefit to the field of having trusted data stewards who adhere to these principles.

And the issue here of trusted stewards is really an important one because effectively if you have two different organizations both of whom adhere to a given set of principles and are considered trusted stewards, then the ability for them to share information among each other without having to go for independent transactional guidance or transactional approval is facilitated. So this concept of having a series of principles or a certification, perhaps, among organizations to allow sharing could facilitate things such as the NHIN or local health information exchanges.

The principles that came out of the meeting included those around accountability – including governance, oversight, and the extent and level of acceptable regulation – and openness and transparency. This issue of transparency is one that continues to come up, around the structure, processing and delivery of data, and around business processes and practices.

Notice to patients was also exceptionally important, so that patients are informed of how their data, whether used in a blinded or non-blinded fashion, are going to be utilized outside of the health care domain – outside of their receiving health care.

Privacy and security, including data quality, de-identification, the cost of re-identification, granularity of patient consent, permitted uses and disclosures including data aggregation and analysis, and enforcement remedies.

And the enforcement remedies are important basically because if you don’t have some series of remedies in place and someone does violate these principles in practice, there’s really no recourse for the patients or for the organizations at hand.

Another distillate that came out of the working group was what was called a consumer guide regarding personal health information, and this is, on the next two slides, a series of questions that one could put to organizations that are meant to steward that information. I won’t go through every single bullet of these at this point, but will just highlight a few.

Does the organization have privacy policies that are written in clear, understandable language, with definitions and terms for patients and their organizations?

Are patients notified when there are changes to these privacy policies? Does the privacy policy include a list of all of the uses of the data, whether or not the data can be identified? Does it identify how data are protected? What happens when an organization is sold, or it merges, or another organization receives the data through some business transaction, or the company goes into bankruptcy?

What happens in the case that the individual terminates their agreement with the health organization? Will the data that they’ve generated be purged from the system? Will it remain on the system in a blinded form, or in an identified form?

The answers to these questions helped generate a series of levels of stewardship that individuals can use as an assessment of the organizations with which they’re making choices about whether and how their information will be stored.

The next working group that convened was the one around taxonomy, and it was agreed early on after the first meeting that a taxonomy was absolutely necessary to have people talking on the same page about terms and processes. That work was convened prior to the June meeting and was vetted at the June meeting. The taxonomy was meant to identify possible non-political uses of personal health information and to clarify societal, public policy, legal and technical issues.

The taxonomy supports more focused and productive discussions regarding health information, health data and their uses.

DR. BLOOMROSEN: Excuse me, I just want to mention that we don’t have copies with us of the draft taxonomy, but they are among the many documents that are posted on our website, and they’re available to the public. So if you wanted to start looking at that, that’s a work in process, as Steve mentioned, but that is available on the web.

DR. LABKOFF: The axes of the taxonomy at the moment include what the categories or classes of secondary usage are, how the data are meant to be used, what the existing and potential sources of secondary data are, and who the users are. In short, it’s basically attempting to identify all the different ways of talking about the subject in a consistent and common framework.
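
For illustration only – the actual draft taxonomy is posted on AMIA’s website – a request for secondary use expressed along axes like these might be modeled roughly as follows. This is a minimal sketch; all vocabulary values below are hypothetical placeholders, not AMIA’s terms.

```python
# A minimal sketch of how a taxonomy with these axes might be represented.
# The controlled vocabularies here are invented placeholders, not AMIA's.
from dataclasses import dataclass

USE_CATEGORIES = {"research", "public_health", "quality", "commercial"}
DATA_SOURCES = {"ehr", "claims", "registry", "lab_feed"}
USER_TYPES = {"provider", "payer", "researcher", "vendor", "public_health_agency"}

@dataclass
class SecondaryUseRequest:
    """Describes a proposed secondary use along the taxonomy's axes."""
    category: str      # class of secondary usage
    intended_use: str  # free-text statement of how the data will be used
    source: str        # where the data originates
    user: str          # who will use the data

    def validate(self) -> None:
        # Reject requests outside the controlled vocabularies, so every
        # request is described in consistent, comparable terms.
        if self.category not in USE_CATEGORIES:
            raise ValueError(f"unknown use category: {self.category}")
        if self.source not in DATA_SOURCES:
            raise ValueError(f"unknown data source: {self.source}")
        if self.user not in USER_TYPES:
            raise ValueError(f"unknown user type: {self.user}")

# Example: a quality-measurement request expressed in taxonomy terms.
request = SecondaryUseRequest(
    category="quality",
    intended_use="hospital readmission measure reporting",
    source="ehr",
    user="provider",
)
request.validate()
```

The point is simply that a shared, controlled vocabulary lets a requester state explicitly what use is being made of the data, echoing the goal of a consistent and common framework.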

Now you’ll notice that we’re not going to be talking about the identification/de-identification working group, and the reason is that over the course of the pre-work it became very clear to us that the issue was very difficult to work through, and issues around it were instead distilled and worked through in the rest of the June meeting. So that working group did not come up with policies or recommendations or a final work product to vet at that point. But we caught that early and were able to bring it to the whole group a little later.

The next portion of the meeting that took place at the June meeting was vetting what we were calling our framework, and it was this framework actually that caught the attention of some folks and I think generated some of this interest. And we figured out early on that this is such a complex issue in many different dimensions that we tried to figure out a way of dissecting the dimensions in such a way that you could address different axes, you know, kind of concurrently, but in different domains.

And the four domains that we identified were those of public health, research, quality and commercial uses. An exercise that took place at the June meeting was trying to figure out, in each of those domains, where along a continuum the framework would sit. So within these six different areas – accountability, transparency, patient consent, the cost of re-identification, oversight and regulation – where would be the right place for these sliders to sit in each of the four domains? And we’re not imagining that these sliders sit independently at 25 or some such point, but rather in a band; given different situations, the band would expand or contract, but within a given range.

And we wanted to give you the definitions of where these end points were in the framework because we do feel that this is a workable framework. And by the way, if there are other domains that need to be addressed, the framework can accommodate that simply by adding another domain but still using the same series of points.

So for accountability, we said that the accountability area was that around levying sanctions or penalties for inappropriate disclosures or uses of patient health care information. One end point, at zero, was that there was no accountability, that there would be no penalties; the other extreme, at 100, would be criminal sanctions. So if something were to occur, there would be, you know, absolute fines or even potentially prison time for misuse of information.

DR. BLOOMROSEN: At the risk of – what we did as part of the exercise was ask people to break into small groups and to tackle one of the four domains at two points in time: the current state and potentially a future desired state.

So in an effort to inform, we asked people to sort of take a Polaroid snapshot of where the participants thought reality was today in each of the domains across each of the dimensions, but not necessarily thinking that everyone felt that the way things work today was the way they could or should be, and then to take a snapshot of where we thought the future state should be that we should be aiming for.

And one of the reasons why we’re not ready yet to distill all the findings from each of these small groups is that these are still very much a work in process, as much as the domains themselves and the attributes themselves are. So we’re still distilling all of the commentary we received during our meeting to try to flesh it all out, because we were asked at one point to try to tell you where the sliders fell, and we’re not in a position yet to make a definitive suggestion that the slider should or could be at any particular point on any of the continuums.

DR. LABKOFF: But as they say, stay tuned. The issue of transparency – the measure of transparency was the extent to which practices governing the use of patient health data are known and understood by those who disclose or use the data and by the patients whose data are the subject of use.

And the end points in this instance were zero where the patient was completely unaware of the secondary uses of the data, or 100 where the patient is informed of every use of their data and at the time of its occurrence. So if their data was meant to be used for a quality measure, they’d be notified for that particular instance each time it would be used.

So you can see we strove to find really disparate end points for each of these areas. In terms of patient consent and notification, that dimension was around the opportunity offered to patients to allow or permit the use of their health care data; notification refers to the mechanism by which patients are informed of their right to consent. And the end points in this instance were no choice at zero, or opt-in at 100.

Now I mentioned earlier that the de-identification working group had a lot of difficulties, and it very clearly became the issue of, well, maybe the issue isn’t around de-identification so much as it is, from a practical perspective, the cost of re-identification, because there are issues or questions as to how completely one can actually blind information. We won’t even go into whether or not that’s possible. But if you go to the point of asking how expensive it is, that puts a metric around how hard it would be for end users to really decode that information.

And in this case, we said the end point of zero was where the decoding would be actually very easy, and 100 where the decoding would be very, very expensive and difficult – sort of akin, I guess, to something like cracking RSA encryption.
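
As one concrete illustration of putting a metric on re-identification difficulty, here is a minimal sketch using k-anonymity, a standard measure from the privacy literature that the speakers do not name, with invented example records:

```python
# A minimal sketch of one standard proxy for re-identification difficulty:
# k-anonymity over quasi-identifiers. This technique is offered as an
# illustration only; the records below are invented.
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Smallest group size sharing the same quasi-identifier values.
    A low k means some individuals are easy to single out (cheap to
    re-identify); a high k makes linkage attacks more expensive."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

records = [
    {"zip": "20782", "age_band": "40-49", "sex": "F", "dx": "J45"},
    {"zip": "20782", "age_band": "40-49", "sex": "F", "dx": "E11"},
    {"zip": "20783", "age_band": "60-69", "sex": "M", "dx": "I10"},
]
# -> 1: the lone 60-69 male in 20783 is unique, so this release is
# only 1-anonymous and that record is cheap to re-identify.
print(k_anonymity(records, ["zip", "age_band", "sex"]))
```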

In the last two areas – the last two dimensions – oversight was the extent to which an entity is subject to governance or supervision, including the ability to impose remedies for breaches. The end points here were zero for internal, residing with the entity that has the data, or 100 for external, residing with a public governing board, so that at 100 a governing board or an external body would be the one vetting the uses on a per-use basis.

And lastly, regulation or law: this dimension was around the regulations and laws that govern secondary uses of health care data, including penalties and enforcement guidelines. Zero in this case meant no regulation; 100 meant fully regulated and, again, with penalties.
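
For illustration, the six dimensions and their banded sliders might be modeled as simple data. This is a minimal sketch; all numeric bands below are invented, since the speakers stress they are not yet ready to say where the sliders should sit.

```python
# A minimal sketch of the slider framework as a data model. The bands are
# illustrative only; AMIA has not published recommended settings.
DIMENSIONS = [
    "accountability",            # 0 = no penalties .. 100 = criminal sanctions
    "transparency",              # 0 = patient unaware .. 100 = informed of every use
    "patient_consent",           # 0 = no choice .. 100 = opt in
    "cost_of_reidentification",  # 0 = decoding easy .. 100 = prohibitively expensive
    "oversight",                 # 0 = internal to data holder .. 100 = external public board
    "regulation",                # 0 = unregulated .. 100 = fully regulated with penalties
]

def make_profile(**bands):
    """Build a domain profile; each dimension is a (low, high) band, not a point."""
    profile = {}
    for name, (low, high) in bands.items():
        assert name in DIMENSIONS, f"unknown dimension: {name}"
        assert 0 <= low <= high <= 100, "bands must lie within 0..100"
        profile[name] = (low, high)
    return profile

# Hypothetical current-state snapshot for the public health domain only,
# echoing the later remark that consent there sits near zero because
# reporting is compelled by law. Every number is invented.
public_health_current = make_profile(
    accountability=(40, 70),
    transparency=(10, 30),
    patient_consent=(0, 5),
    cost_of_reidentification=(50, 80),
    oversight=(60, 90),
    regulation=(70, 95),
)
```

A "future desired state" would simply be a second profile per domain, so the exercise of comparing snapshots reduces to comparing bands dimension by dimension.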

An area that sort of came into the meeting and was presented there was one that wasn’t really a working group but looks like it will become a new working group, and that is analytic principles. And you know, I’ll put my Pfizer hat on for a second, because this one grew out of some of the work we did internally at Pfizer in terms of thinking about the fact that if people are going to start using health care data for secondary uses, there really ought to be some basis for guidelines or principles around how these data will be reported, especially in analytics.

You know, if someone is going to make a claim that something works better than something else, or one treatment is better than another, it would be helpful if we were able to use a framework for which the analytics were actually done in a consistent fashion. That sort of led to this concept of data analytic principles, and we presented them at the meeting as suggested principles or to start the conversation, and it kind of took.

So the rationale around this is that a statistically sound approach is necessary for secondary data analysis of large clinical practice data sets, and that random analyses or unstructured data mining could yield merely associative conclusions or potentially introduce false positive associations. Standard data analytic principles provide a framework for sound studies with credible, reproducible results and for minimizing errors that are possibly introduced during analyses. Data analytic principles mitigate the risk of false positives, and finally they provide common ground for multiple parties so that analyses can be more readily compared.

Now the proposed principles are fairly simple. They include, number one, an agreement that people who are going to be reporting analyses of these data sets would all have agreed to use the principles – that’s sort of the first principle.

Then, establishing local data access committees, or data guardians or gatekeepers, in institutions; these gatekeepers would be meant to provide governance and guidance for running data sets and analytic projects. This is similar to how IRBs work in university settings. But in other settings there is not necessarily the same type of rigor, and we felt very strongly that having that kind of watchdog over these large data sets is actually very important.

A hypothesis is actually needed to start the analyses: a prospectively defined scientific hypothesis or purpose should be required prior to doing any hypothesis-testing analysis. And the corollary to that is that if hypothesis generation is actually the goal, that’s fine, but no changes to medical practice would be made until hypotheses that were generated in that process were validated in a second experiment or second study.

A sound experimental design would also be needed – a clear statistical experimental design necessary to undertake the study. And lastly, and most importantly, is to close the loop, so that these data gatekeepers are informed about the results of what’s been done in the course of the studies. Having this helps mitigate many of the risks involved with false positives – Type I and Type II statistical errors. And, again, these are not prescriptive; these are to start the discussion. There may be other principles that are needed. In fact, one other principle that was suggested was around being able to show an audit trail of how you arrive at various conclusions in the course of a study, and I would posit that that might make a good sixth principle.
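
As one concrete illustration of the false-positive concern behind these principles, here is a minimal sketch using the standard Benjamini-Hochberg procedure; the AMIA group does not prescribe any particular method, and the p-values below are invented.

```python
# A minimal sketch of controlling false positives when many associations
# are tested against a large practice data set, via Benjamini-Hochberg.
# Illustrative only; this is not a procedure the AMIA group prescribed.
def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of hypotheses rejected at the given false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff = rank  # largest rank still under the BH threshold
    return sorted(order[:cutoff])

# Example: p-values from an unstructured mining run over many candidate
# associations. Naively calling everything under 0.05 "significant" would
# flag four findings; BH at a 5% FDR keeps only the two strongest.
p = [0.001, 0.008, 0.020, 0.041, 0.120, 0.380, 0.540, 0.760]
print(benjamini_hochberg(p))  # -> [0, 1] with these illustrative numbers
```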

So some of the key takeaways from this 2007 conference include the following: Secondary uses of data are important and valuable, and although the value may be in the eye of the beholder, there’s a need to broadly educate various audiences about the value of secondary uses of these data. The issues are very complex and ongoing work is needed, and the environment continues to be dynamic and fluid.

Consumers have an important role, although there were various opinions about what that role might be. The taxonomy is an important tool to help inform the greater community, and it will need expansion, maintenance and a process built to keep it current.

There’s still some confusion around various rules – specifically HIPAA and privacy, FDA’s human subject protection regulations – chuckles out there – and the Common Rule, none of which may be applicable or adequate to address the secondary uses of health care data.

Data stewardship principles need to be refined, and they must address an implicit chain of trust as data change hands. Data must be of a minimum quality: accurate, reproducible, complete, timely and credible. Data limitations should be acknowledged and described, and data analytic principles adhered to whenever possible.

Some of the plans, work products and outputs that ought to be expected from the meeting later this year include white papers, commentaries and recommendations on the value of secondary data, health data stewardship, a framework and principles for secondary uses of data, and refinement of the framework instrument that we described here today. Data stewardship definitions and principles are being worked on, and the taxonomy of users and uses continues to be refined.

Some next steps that we’ll be working on are to synthesize and distill the conference discussions; to reconvene the working groups and introduce a new working group for the analytics principles; to refine interim work products from the meeting; and to continue the public discussion at the AMIA 2007 annual meeting this November and participate in ongoing discussions and forums such as this one, AHIC, Connecting for Health, eHI and several others.

And lastly, we would like to acknowledge the Steering Committee for the Conference. As you can see, there is a wide array of stakeholders who came together to help work on this, including Doug Barton from Lockheed, Meryl and Bob Browne from Eli Lilly, Bart Harmon from Tricare and the DOD, Mike Lieberman from GE Health Care, Susanne Markel-Fox from GlaxoSmithKline, Charlie Safran, Sam Scott of Lockheed and Bill Tierney from Indiana University.

I’ll flip through the other slides to show who the other sponsors were – again, you can see there’s a wide array of corporations that care about this subject and participated – and to show the lengthy list of folks who participated in these working groups. Again, taxonomy was led by Chris Chute and Stan Huff, stewardship was led by Melissa Goldstein, and the de-identification working group was led by Doug Barton.

And with that, I think we will conclude at this point and open it up to the floor for questions. Thank you very much.

MR. REYNOLDS: Thank you to both of you. I’d like to thank you in a couple of ways, first for an excellent presentation, but also because the work that you did helped a number of us that aren’t as astute in this space as others around the table really get a jump start on how to think about it. So we do thank you very much for what you did and what you delivered even prior to this session. So thank you for that.

Questions, Mark and I have Justine, Simon.

MR. ROTHSTEIN: I have a question and a comment. The question is, what do you expect to come out of this work? Is it a way of trying to control the development of public policy based on the expertise of your members, or is it best practices or some sort of professional guidance?

DR. LABKOFF: I think there are multiple expectations for this work. One of them in fact is to help guide policy creation. It became clear to us in the very beginning of this effort that many of the policies that exist today just didn’t address some of these issues, because they hadn’t really been discussed in open forum up until about two years ago. And I would say that would be one of the major hopes for this work. Also to inform corporations and businesses around the country in terms of what the rules of engagement are for how to work with data in a transparent and trusted fashion.

The issue of commercial uses, again, was hotly contested during the course –- just even the word commercial actually was hotly contested during the course of the meeting, and being able to help distill out what that means and provide a series of rules of how to engage in a consistent fashion is, I think, quite important. Meryl?

DR. BLOOMROSEN: Yes, I think to augment that, we’d like to inform and provide assistance as much as possible by providing and leveraging the thought leadership that’s available through the organization. As an example, the folks who have been working very hard on the taxonomy – and you saw their names – are people who work and immerse themselves in issues relating to taxonomy on a very regular basis. And what we would like to do is continue to bring their expertise to bear as we refine, for example, the taxonomy. It was conceived of a year ago, and then we’ve been working quite hard to get it vetted as much as possible with as wide an audience as possible.

As an example, to Steve’s point, at the Spring Congress of AMIA, we had a town hall meeting at like six o’clock in the evening to vet the draft taxonomy thinking, oh, no one’s going to show up for this. In fact, we had standing room only, and we recognized that by continuing to work on this, we will hopefully be contributing to, you know, the public good as well as bringing taxonomy expertise to helping classify secondary uses and users. So I’m hoping that answers the question. That’s an example.

As Steve mentioned, these work products are a work in process, and I’m not sure that we envision that there’s a start and stop to necessarily any of them because of the way the environment is changing.

For example, we are looking at and re-looking at – I’ll use the taxonomy again as an example. It started out with three axes and, as Steve mentioned, it’s sort of evolving or morphing based on a lot of the vetting that took place, to the point that it looks like we may need five axes. But as for how that taxonomy might ultimately come to bear on policy making and other venues, we would hope we can help inform what kinds of uses there could be.

So, for example, if someone were to make a request for a secondary use of data, perhaps they could use the taxonomy to explicitly state what uses they’re making of the data, so that we’re all, as Steve mentioned earlier, on the same page about the uses and users. What we then sort of found is that this is a very complicated question. There were no easy answers; although we honed in on the commercial use conversation, there were no easy answers in many of the breakout sessions as we fleshed out these issues.

MR. ROTHSTEIN: Okay, and now my comment. Having read your material and heard your talk, I’m concerned about the primacy that you give to consumer choice. In my experience, consumer choice is often set up as an alternative to any regulatory activity. And under the current regime, the fact of the matter is that consumers have virtually no choice with regard to secondary uses.

So, for example, disclosures for public health purposes do not require any sort of consent or authorization. Disclosures for health care operations, one of your secondary categories, do not require any consumer input. And when consumers are required to execute an authorization, they are often in a position where the authorization is compelled, so that if you apply for a job or life insurance or disability or long-term care insurance – and the list goes on and on – as a condition of applying, you sign an authorization that releases all your information.

So I value the concept of consumer choice. But in reality, there is no consumer choice or very little consumer choice. And if you put too much reliance on consumer choice as a principle as opposed to regulating the substantive aspects of what can be disclosed and what can be used, I think that undermines the situation.

And finally, I was interested that you did not include – perhaps you did, but I did not see – any discussion of the concept of contextual access criteria. This was a part of our June 2006 letter to the Secretary, where we recommended that contextual access criteria should be explored, researched and, if feasible, implemented to limit the disclosure of information for secondary use whenever possible, so that when disclosure is made to an employer, only job-related information goes. And unless we have the capability to do that in an informatics sense, then even a law requiring that only certain information goes is basically worthless.

So was there any thought to including discussion of contextual access criteria?

DR. LABKOFF: So the answer is yes, there was a great deal of thought. I’ve put back up the slide on the screen about the framework. And to your point, we’re not trying to be prescriptive in the course of this meeting; we’re trying to actually elicit where these sliders need to be set. With respect to consumer consent, if you look at it – if you sketch this in your head, you know, put a little circle over public health – well, I’m not going to report absolute findings, but I will tell you just from memory that, of course, in the public health domain consumer consent really was way down at zero, because patients don’t really have consent, even by law, I believe, for their data to be used for public health reasons.

But in other domains, there might well be a need for consumer consent, and it might be very appropriate. Where in these domains those sliders fit really is what we’re trying to figure out. So, for example, in research, I do believe you do need to ask patients’ permission to use their data to do research on it. And so in that case, the slider might be set much higher than it would be in the public health setting.

So the point of this exercise wasn’t to get prescriptive and say this is where it must be. The point of the exercise is to say where is it now and where do we think it ought to be in the given dimensions.

And that particular one that you mentioned was one that was absolutely discussed, and we’ll have more to say about that as the meeting materials get distilled down. With respect to contextual access, again, we couldn’t present every single thing that was presented at the meeting in this short period. But that was an issue that was discussed in the meeting in a variety of the break out sessions and will be addressed again in the distilling of this work. It wasn’t addressed specifically in this presentation today.

DR. BLOOMROSEN: Mark, what I would suggest the Committee consider is that when we broke up into the four domains, it became quickly apparent that even within a domain we couldn’t over-generalize. So on our website are some discussion questions and scenarios that we used. And in terms of potentially setting the slider or having conversations about these issues, it became clear that we might need to do it on a scenario-by-scenario basis, which I would believe is somewhat analogous to your recommendations relating to contextual access. We felt like we tried to divide the health world into these four domains just for purposes of getting moving on distilling it, so that secondary data use was compartmentalized.

And then once we got into those domains, it quickly became clear that we couldn’t over-generalize even for public health or for research. We had to probably go down a little more detail. So I would invite you to look at our scenarios as another way to address some of that.

DR. COHN: Can I just follow on only just because I was there and just to maybe be a little more gentle on the conversation.

At least in my memory or my sense of the conference – gentle, gentle, not general – I think a lot of the conversation, and I think a lot of the value of the discussion, was, as you commented, really trying to figure out where things are here versus some sort of a future state, at least along these axes.

And at least my memory was – and obviously this will be additional work by AMIA and by the Board – that this issue which I think Mark was referencing, more along the lines of tools, technologies and approaches to minimize risk in the context of all of this, was really not the focus of this session, but might likely be a focus of another expert panel or separate meeting.

And if you think I’m wrong in my characterization of that –

DR. LABKOFF: No, I think you’re right on, Simon. I think that’s precisely how we tried to frame this up. I mean, we’re not claiming this as the end-all and be-all of this discussion. This is the starting point of the discussion, and it’s meant to provoke conversation and thought – you know, discussions on policies and potential regulations if necessary. But it’s meant to start those conversations, not end them.

DR. BLOOMROSEN: And I think you’ll see that the taxonomy probably – at least our hope is the taxonomy will help us speak to the different uses in a context within these domains.

MR. REYNOLDS: Okay. Justine.

DR. CARR: Well, I think some of my questions have been already touched on, but thank you for excellent work and an excellent presentation and ongoing work.

So a question I have is – again, this slider was actually a very effective tool in helping clarify where things were. In a way, we heard a bit of this yesterday as different speakers came through: where they talked about a project where there was a lot of trust, there was less concern for some of the regulatory aspects and so on, and where there was confusion, there was reaching for a lot of regulation.

But is it the expected outcome that with this there’ll be hundreds of scenarios, each of which would have a different solution?

DR. LABKOFF: That’s honestly hard to say. I would hope it wouldn’t run into the hundreds, but certainly I think there might be a series of scenarios from which you would then have to extrapolate. I would hope we wouldn’t be vetting six or seven hundred scenarios at the end of the day; that would be a fairly onerous task. But, you know, I guess if you want someone who tends to be a lumper rather than a splitter, I think that someone’s to my left over here. I think the objective here is to figure out how to lump some of this together in ways that make it easier rather than more complex for general use and utilization – to lump many of these scenarios together and try as hard as we can to provide guidance. But at the end of the day, it’s how people interact with people that is going to make these things work or not work. No matter what regulations or laws are intended, we have to trust that if you put out policies, people will adhere to those policies. I don’t think we can hope there’ll be some balance that will stop all misuse.

DR. CARR: Well, my recollection of the meeting, which was great, is also that you’d ultimately be able to move toward lumping with three of the domains, and that what was called commercial represented a vast opportunity for splitting. Will you be doing a deeper drill-down on that area?

DR. LABKOFF: That is the expectation, and, again, we’re just beginning to distill out the materials here. But the expectation is to figure out the different areas of the commercial sector – some people framed it as data for dollars, if you will. There does need to be a clarification, because it turns out that corporations are not the only organizations that pay for data. Research institutions do it as well, and for the data to be collected and aggregated in the first place, somebody actually has to have a business model around doing that; it doesn’t happen for free.

So whether it’s a corporation buying data for some research project or for some commercial venture or some marketing campaign, or a research institution purchasing data to advance its own research agenda, there’s a commercial transaction that takes place. And we need to understand better how those transactions are going to be framed in the future in a way that is respectful and transparent and that basically, you know, keeps the information safe.

DR. BLOOMROSEN: Justine, I think a point that sounds to me like it’s implicit in your comment, because you were at our meeting, is that there may be some interrelationships or even interdependencies among some of the slider categories. As you said, as people were studying the sliders during the exercise, if transparency was high, then there might be an implicit patient consent that goes along with it – as long as I know what you’re doing with my data, I’m more comfortable, or we would think the patient should be more comfortable. So the level of consent wouldn’t necessarily have to be that you tell the patient every single time, in real time, what’s happening with the data. I would like to emphasize that as an implicit finding. We haven’t really distilled it all completely yet, but that seems to be where people are headed.

DR. CARR: Yes, that’s how it seemed at the meeting as well.

MR. REYNOLDS: Simon, and then Mark and then Kevin.

DR. COHN: Well, I guess I sort of have two questions, or maybe a comment and a question. The comment goes along with Mark Rothstein’s, which is that obviously we’ve heard a lot about HIPAA and had conversations about that yesterday.

I’m always struck that, and I think Mark said this already, patient consent is one dimension of activity, but the other is fair information practices. I would probably have to apologize in open session if I described what I usually do with the information policies and consent forms that I get in the mail. But at least in some idealized fashion, you want to make sure that people are protected by information policies and practices as much as possible, so that a busy person in a busy environment doesn’t have to be consenting or not consenting at every turn of the day. I’m not sure I saw that well represented in any of this; patient consent is a subset of that overall issue, but it doesn’t really cover it.

But let me get to the basic point I have. I’ve been listening, and I did participate in the meeting. One of the things I came away a little confused by, and maybe you can clarify this for me, is how this overlays, or doesn’t overlay, or interdigitates with the current world of HIPAA as we know it.

And I can see touch points, but I can’t decide whether this is meant to deal with things beyond HIPAA, or is really meant to be a different way of architecting, or at least overlaying, all uses. And if we’re indeed talking about all uses of data, then I’m going, well, what about the payer, what about the O part of PPO, and where does the employer fit into all of this.

And I guess I’m just trying to figure out where it fits; it sort of fits but doesn’t quite, and obviously I’m sitting here grasping for words. But I need some guidance in terms of how we could leverage this if we’re thinking in a world where HIPAA exists.

DR. BLOOMROSEN: I’ll take a stab at that. The best way to answer is that we’ll be doing more work through the data stewardship working group. But to answer the question regarding HIPAA: if you look at who participated in the data stewardship working group, there were obviously many attorneys present, and folks who have experience with HIPAA.

I think our conclusion is that these discussions were outside the scope of HIPAA, though, having said that, we haven’t definitively reached conclusions. But I think we’re saying that there is confusion about where HIPAA fits in the use of secondary data, and the same is true of the FDA’s regulations and rules and the Common Rule. There’s a slide that actually identified for you that there were sufficient differences of opinion and confusion that we believe it’s an issue that could be addressed via policy or further clarification. We weren’t necessarily looking to solve any HIPAA deficiencies or address HIPAA exclusively. It came out as a byproduct of the discussion: these conversations raised questions about whether HIPAA addresses this and, if it does, whether it is sufficient.

There was, I think, and this is Meryl talking, a conclusion that these probably fall outside the scope of HIPAA, and maybe someone needs to be thinking about how to bring them back into some policy framework to the extent that they’re not currently addressed.

DR. COHN: I think that’s a very reasonable issue and certainly one that we’re pondering. And the first part of the question is, knowing that HIPAA is hundreds of pages long and has additional guidance on top of it, where are the things that are not clear, and where are the things that are just not known to people very well. I think that’s probably in some of these borderline areas, and these are issues we certainly need to dig down into and figure out.

DR. BLOOMROSEN: And Susan McAndrew is here and was certainly at our meeting and participated in our working group. So I don’t want to put words in anyone’s mouth in particular, but I do think the conversations were complex enough that at least it’s an area that does require further study, and that may be something your group may want to look at.

Having said that, I’m not sure we would want anybody to connote that we’re advocating a redo of HIPAA. That is not what we’re saying. We’re on the Internet, and we are not suggesting that HIPAA be redone. What we believe we’ve identified potentially are some nuances or gaps that HIPAA may not address or maybe did not intend to address because we believe we’ve been talking about uses of data that are beyond HIPAA.

MR. REYNOLDS: I always like it when we get consensus from our speakers. Mark Overhage.

DR. OVERHAGE: I guess one area I’d like to drill down on a little bit: I heard you talking about the potential for a fourth working group focused on the analytic principles, and I understand the notion of rolling that forward. I think it gets back, in some ways, to the value proposition and the transparency that go along with the research. If I know flaky research is being done with my data – research that puts my data at risk and is going to produce shaky conclusions once done – I may not want it done, and I may make different trade-offs.

It struck me that a lot of the principles you list in this whole area are fractal. Every time you drill down, there’s another whole set of things that you have to try to sort through, and research is one area that feels that way to me.

The principles you listed – recognizing that they’re early in their evolution and so on, understood – seem to be focused on what we might call big formal studies, as opposed to, for example, one of the most common secondary uses of this kind of data that we see, which is covered by HIPAA: the use of data sets like these for hypothesis generation and the preparation of proposals for studies, exploratory sorts of analyses that obviously would be held to a different degree of rigor.

So one of the fears I have, I guess, is that this is yet another ballooning area that’s going to be difficult to bound. I’d be curious about your comments on (a) in the theory of analytics as you think about it, do you have thoughts about how it ends up being bounded and scoped, and whether these other uses fit into it or have to be addressed separately. If so, the challenge is that the sort of heavyweight process of a board hearing about it and reporting back probably becomes infeasible pretty quickly.

DR. LABKOFF: I guess the answer is that, just like most of the things we’re discussing here today, there’s a spectrum of ways we can approach this. And you’re right: when we started thinking about this in my company, it was around groups that do large data analyses, the epidemiology group and the outcomes research groups. These are folks who care deeply about the fact that their studies are done in a valid way and are perceived as valid. That’s their bread and butter. They need to make sure it’s done that way.

And when you get down to smaller efforts – actually, one of the corollary principles was that in the case of smaller things, when you’re trying to do hypothesis generation, there’s a provision for that; we’re suggesting that hypothesis generation itself can be the purpose.

Now with respect to the question of overly complicating the scenario, where everything has to go through an IRB or an IRB-like body, and that can slow things down and become too bureaucratic: again, just as I answered the earlier question, there has to be guidance here around how things are done. Perhaps I don’t know exactly where the line is drawn, but when you’re doing small stuff, you abide by principles that are relevant to small things.

But when you start doing large data dives over millions of patients, Type 1 and Type 2 errors can come in. If more than one group or organization is using the same data set for research purposes, that in and of itself can introduce error into everyone’s studies. Those types of things ought to be controlled, I believe, and I think that’s what we’re shooting for.
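
To make the statistical point concrete: a minimal sketch, in Python, of why many groups testing hypotheses against the same data set inflates the chance of a Type 1 error. The numbers are illustrative only.

    # Chance of at least one false positive when n independent tests are
    # each run at significance level alpha and every null is actually true.
    alpha = 0.05

    for n_tests in (1, 5, 20, 100):
        p_any = 1 - (1 - alpha) ** n_tests
        print(f"{n_tests:>4} tests -> P(at least one false positive) = {p_any:.2f}")

With 20 independent looks at the same data, the odds of at least one spurious finding are already about 64 percent, which is the kind of error Dr. Labkoff describes wanting controlled.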

But please don’t misinterpret the suggested principles as being prescriptive or mandatory. They are just guidance for certain domains, certain types of environments where they really need to apply or ought to apply. Does that get to your question, Mark?

DR. BLOOMROSEN: I’d just like to augment that to say that I think we envision that the data analytic principles, Mark, might in fact become part of the data stewardship principles. In other words, that data analytic principles need to be part of being a data steward. And we’re not really sure where we’re going, where the outcome would be on that. But I think we welcome your comments and insights on that as well.

We also recognize that it overlapped a lot with health services research processes and disciplines and things like that.

MR. REYNOLDS: Kevin.

MR. VIGILANTE: Thanks for a great presentation. As Paul was alluding to yesterday, when we think about our charge, it’s nice to be able to triage things into areas of high yield or high importance so we can focus our activities. And I’m intrigued by the exercise you conducted, where you had people using this tool in terms of what is and then what ought to be.

I know you told us you don’t have any conclusive evidence. But are there early indicators of areas that might be more productive for us to focus on, either because they’re controversial, or because there’s a lack of consensus, or because in comparing what is and what ought to be there is a paucity of guidance, policies and regulations – a huge chasm that really needs to be addressed? And if the answer is that you can’t tell us now, when could you give us some early indications? I think that kind of information would be useful as we think about this triage.

DR. LABKOFF: I think one of the early areas that absolutely needs to be teased out some more is the area around the commercial emphasis, and, to Justine’s point, that may not be a lumpable activity.

We need to dissect deeper. It was very clear in the time spent with the subgroup in that space – there was fractured conversation left, right and center, trying to figure out what it looks like in this scenario versus that scenario. And then there were the points about how the research community is actually a commercial entity in and of itself, although under a different umbrella.

So all those issues need to be teased out, and that is an area where we anticipate additional work. But in terms of anticipating outcomes from this, we’re hoping that we’ll have some material out, I would guess, in the November/December time frame, something like six months out from now.

MR. VIGILANTE: That’s kind of late for us. Is there any sort of early – you know, because we get the insight into some of these thoughts in terms of which – and even the principles which were the ones that –

DR. LABKOFF: Well, let me back up and ask – I wasn’t aware that there was a time component to this work. If there is, please let us know, and maybe we can re-triage our priorities and re-organize in such a way that we can be – we want this work to be influential. We want it to be able to be useful. And if it’s going to come out too late, then that’s clearly not going to be helpful.

But if you tell us that you need stuff done, please don’t tell me it’s due by the second week of August.

DR. BLOOMROSEN: Well, Kevin, I would again suggest that perhaps Margaret and others follow up with us at the staff-to-staff level. But certainly the taxonomy working draft is available. The sliders are a work in process. The principles – all of this is, again, a work in process, and we’ll try our best to be as informative as we can for your processes.

DR. COHN: Yes, and I should clarify the time line, because this is based on the request from HHS. I think the intent was to have our work substantially done in the October time frame; that was, as I said, the initial agreement. And I just want to say this to our work group: obviously, our greatest delight would be to identify pieces that we can endorse as opposed to having to reinvent. So for things like the taxonomy, we obviously need to review it further. We may need guidance from you in terms of who we should talk to to better understand it, and to understand when it moves from draft to, say, an executive version 1.0.

DR. BLOOMROSEN: Yes, the taxonomy is a good example because it’s pretty concrete, and yet it’s pretty in-depth at the same time. As Steve mentioned, it’s been vetted formally twice so far: it was vetted at the Spring Congress, and then it was vetted at this meeting, and we’re still accepting comments and are in the process of reworking it.

So there’ll be another version of it that we were planning to take to the AMIA meeting in November. But certainly we’ll be posting it, whether it’s 1.1 or 1.0 or whatever –

DR. LABKOFF: Let me turn the question back to the committee: of all the things we’ve presented here, which are the ones that you prioritize as being needed sooner rather than later? If you can tell us your priorities, I think we can try to rejigger some of ours and get that work done in your time frame.

MR. REYNOLDS: Paul, do you want to make a comment?

DR. TANG: So first, something I wanted to do upfront because I didn’t get in line early enough: I just wanted to disclose again that I am the current Chair of AMIA. So I won’t comment on things that they might recommend.

I do think it would be helpful, Simon, if you specified a time line. In October, I think, we’re to be turning in our final report, so, as an example, October would be way too late for input. And I think it’s acceptable for me to say that Kevin’s question was excellent in terms of how that work would help, because he drilled right down to let’s get to the meat; there are so many things to discuss, so let’s find out where there’s the most controversy and where the most help can be. I thought his comment was particularly useful. But obviously it has to meet a time line, and perhaps you can shed light on what would be in time.

DR. COHN: Yes, well, I certainly think the taxonomy should be – one would hope that it could be low-hanging fruit. In turn, I’ll ask Margaret to begin to take a look at it, and the question would be, as I think we’ll talk about offline, whether somebody needs to come and brief us specifically on the taxonomy. It’s likely going to be a living document, so it’s more a question of –

DR. BLOOMROSEN: Yes, and I was going to say –

DR. COHN: But on the other hand, it’s hard to deal with things that just remain in draft forever. So I think one can appreciate endorsing or supporting a version one. It’s a little harder if it’s a version .4, and that’s just sort of an observation of processes.

You know, clearly the intent of having you here at this early date, knowing that it hasn’t been very long since you completed your work on that expert meeting, was to begin the dialogue. And Kevin is asking, for example, what are the big areas. I think you’ve certainly identified commercial as an area that needs more thinking and more work, which, of course, is the area that Paul Tang brought up yesterday, I think using slightly different terms, and which other speakers have presented as an issue that is particularly vexing. It isn’t that it’s bad or good; it’s just that it’s complex and probably has different things in it.

DR. BLOOMROSEN: Simon, and Harry, perhaps another way to address it is to have you consider looking at the fact that there’s HIPAA, there’s the Common Rule, there’s the FDA’s protection of patients in research, et cetera. There are disparate components that may each address some aspect of secondary use, and that might be another drill-down: the extent to which there are gaps in existing law beyond HIPAA that address these data issues. That’s something we just touched on but have not drilled down into yet. But I believe that might be another area, like commercial uses, where there’s lots of conversation.

MR. REYNOLDS: One other thing we’ve been looking at: having used the sliders, it’s becoming more and more apparent, at least to some of us, that it’s more like a sound mixer – have you ever watched somebody doing a recording? Because here’s what happens. One thing that I don’t see that I would like, and Simon brought it up earlier: there aren’t clear definitions of the data – de-identified, anonymized, pseudonymized, and so on. Once you fix your sliders, in any discussion you have, all I have to do is take those four kinds of data, and I may move every slider again. So I think that’s a cut, because part of what we want to do is come up with something that, once it happens, the general public understands, too. And a lot of these discussions and a lot of this focus wouldn’t necessarily be transferable to somebody trying to understand it.
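
A minimal sketch of the distinction being asked for, using simple invented record fields; the salt, field names and safe-harbor-style generalization here are illustrative, not definitions from the hearing.

    import hashlib

    record = {"name": "Jane Doe", "zip": "20782", "dob": "1950-01-01", "a1c": 7.2}

    def pseudonymize(rec, salt="illustrative-salt"):
        # Direct identifiers are replaced with a stable token; whoever holds
        # the salt/key can still re-link the record to the person.
        token = hashlib.sha256((salt + rec["name"] + rec["dob"]).encode()).hexdigest()[:12]
        return {"pseudonym": token, "zip": rec["zip"], "a1c": rec["a1c"]}

    def deidentify(rec):
        # Identifiers are stripped or coarsened (here, zip truncated to three
        # digits, dates dropped) and no linking token is retained.
        return {"zip3": rec["zip"][:3], "a1c": rec["a1c"]}

    print(pseudonymize(record))  # re-linkable by the key holder
    print(deidentify(record))    # not re-linkable without outside data

Anonymized data would go a step further, for example releasing only aggregates so that no per-person row exists at all; each category plausibly moves every slider, which is Mr. Reynolds’ point.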

Simon also mentioned employers today because, if you listened to some of our testimony yesterday, individuals are most concerned about what data their employers may or may not get. So we say commercial, we say public health, we say quality, but there’s also a big group out there to take into consideration, because as we come up with these definitions, that moves a lot of these sliders very quickly, even after they may have been set.

DR. COHN: I’ll make another comment, which is that I think the work of AMIA in terms of trying to fix where the lion’s share of everything fits is laudable. I’m not at all certain that’s really part of the charge of this committee.

I mean that thinking of these as important dimensions that need to be considered, and then the issue of how you minimize risk, maximize public benefit and all that, really is the way I think we need to think about it, as opposed to having long discussions about whether for a certain use you get 50 percent or 65 percent or 70 percent. That may be an area that is important for you, especially how we mix all of this together.

DR. LABKOFF: I’m not going to claim that it’s necessarily important for us or anyone. But what was important was to be able to take the temperature of folks along these domains and these axes. When you start out having a discussion and you can’t distill it down into discernible parts, it makes the discussion even more complicated.

By breaking it out into dimensions and axes, it at least becomes possible to start the conversation and to understand how folks are thinking about it. Again, it’s not meant to be prescriptive, but to start these discussions and get a sense, so that when this committee decides where things ought to be around research and patient consent, all of a sudden you’ve got a way to home in on that area. We don’t care where you set the slider; that’s not the point. The point is that in a secondary-use-of-data discussion you can actually get the discussion to the research component and then get down into these various axes. I think –

MR. REYNOLDS: Mary Jo’s been holding a comment, and then Kevin you have a comment.

DR. DEERING: It isn’t about – one thing about the slider issue, but then a larger question. Getting back to Mark’s initial comment about contextual –

MR. REYNOLDS: Is this – I thought they were asking us what we need to tell them because there’s a list of people who have questions.

DR. DEERING: Okay. Then the first one is the priority. You began your conversations by recognizing that you had heard a lot of discussion about the full term secondary use, and that you recognized the issues about it, et cetera, and then you said that, just for starters, you were going to use it.

And then there was all the discussion, of course, about secondary uses and primary uses. So you’ve actually locked it in. And by noon or one o’clock yesterday, you would have felt the whole universe was against using the term. There seemed to be a very strong movement away from it – not that the work group is necessarily going to decide to go in that direction ultimately.

So one question to you would be: do you feel the term is useful, and could you ask yourselves that question? How strongly do we feel about the utility of the term “secondary data”? Make a conscious decision as to whether its use is purely opportunistic – that’s where it’s at, so just get on with it – or whether you feel strongly that it is in fact useful.

Now maybe you can’t do it. Maybe it’s out of your time frame. But since there was so much strong feeling yesterday from many other people that, gee, why don’t we try to get away from it, it would be interesting if AMIA could move toward at least making a pronouncement on that.

DR. LABKOFF: I don’t think AMIA is trying to pronounce that thou shalt call this secondary use of health care data. I don’t think that’s the point. When we started the discussions around this area two years ago, it felt like we needed a way to describe it, and if you start with the definition that primary use is the providing of health care, and everything else is secondary to that, that’s where the term came from. I don’t think anybody is wedded to the term.

DR. DEERING: Some of the other people said that those other things are for health care.

MR. REYNOLDS: Let’s do it this way. We’d love you to consider it – what you think or don’t think – and at some point come back to us. That’s great.

What I’m going to do is – we’ve got a tight time frame for the next panel coming. So I’m going to take Kevin and John, and I know there are other people; I have about six other people on the list. I’m sorry, but –

MR. VIGILANTE: I’ll be very brief. This is just a follow-up to the previous conversation, I’m sorry, and I’m just being really tactical here about the work we have to accomplish in a fairly short time frame. I’m not really talking about where the sliders should or shouldn’t be set. I’m trying to tap your experience and what you’re doing now to help us focus where we get the most bang for the buck. I’m just going to throw a hypothesis out there. If you’re saying that the domains that require the most scrutiny and concern rank something like commercial, research, quality, public health, and that within each one, when you talk about those principles and think about what is and what ought to be, there is more of a paucity of regulation, guidance and principles, then that helps us direct our thinking about what recommendations to make to the Secretary. If we can identify those gaps and those areas early on, it helps us triage our activity, and that’s all I’m saying.

MR. REYNOLDS: John?

DR. LOONSK: Thanks. I want to commend the work that’s been done and also the offer to help further with the work of this committee. I think it would be desirable for the committee to take AMIA up on that in some specific areas that I’m not sure are quite formulated yet. So hopefully, as the work here matures a little and the thinking advances, that offer will still be held out, and this committee – obviously the time frames are short – can formulate a specific ask that AMIA might respond to.

This is a tremendously complicated area, and the concept of sliders is clearly one tool to address some of that. There are also multiple axes that need to be considered here, and a number of them are on the screen. Part of the discussion here is that secondary use is itself a slider – it’s a sliding term – and trying to nail it down exactly is fraught with issues as well.

I think these axes and these sliders are very helpful in framing the conversation, which is how they’re offered. My general question is about the commercial aspects, because I’m not sure that’s the right axis to put a slider on. Part of the question is that, looking at it from a consumer perspective, I can see that brazen commercialism associated with the use of their health data might be very unattractive, but where commercial interests align with advancing the quality of care and the quality of how consumers are treated, that’s a win-win that should obviously be encouraged.

So I do think it would be helpful to continue to advance sliders on a number of different axes, some of which were on the screen and some of which are not, cognizant of the fact that eventually, from a consumer perspective, this may have to be packaged into three or fewer considerations. Even what you have here is complicated – complicated for this audience, much less for the consumer.

Just on the commercial side: you said there was a very robust discussion. Was there any light that came from it about other ways of articulating an axis or axes in that domain that would advance the conversation? It does seem like commercialism alone may not be the operative axis.

DR. LABKOFF: Well, I’ll answer that and then hand it to Meryl, because I think she sat in that particular working group. I think the answer is that you’re right on. We were looking for a term to, again, sort of provoke conversation and discussion, and, boy, did we provoke.

I think what we need to figure out is how to dissect that into different channels, because a commercial transaction for the sake of safety is certainly different from a commercial transaction for the sake of marketing. Both are necessary in the world, but they’re most likely going to be treated differently, and just lumping them together under commercial doesn’t do either of them justice, nor does it help the discussion. We learned that along the way. We had to start somewhere, and that’s where we began.

But I think that figuring out what those transactions or those pieces look like in a more granular fashion would help this committee and probably help overall in the space. Do you want to –

DR. BLOOMROSEN: I would just say that is an accurate reflection of the conversation. Of all the terms we decided to use and tried to define, commercial seemed to be one that was very subjective and might need further clarification, because we were talking a lot about business models even within that commercial quadrant, if you will. So under what circumstances, or specifically under what business relationships, data are bought and sold might be something to explore, and for what purpose, and not just for money. It became much more apparent that it was not just controversial; it was very complicated.

DR. LABKOFF: A use case that was presented at the meeting was wrapped around how a health information exchange would sell its data so it could sustain itself as a business, and that’s where the discussion started; then it fragmented into different places.

But if we get into the health information exchange debate – and I don’t want to open that door here yet – the bottom line is that a lot of RHIOs and a lot of HIEs are thinking about ways of using their data to create a sustainable business model. That’s a very complicated discussion, and that’s where this all began.

MR. REYNOLDS: Okay. So in terms of where that discussion went, would you point to use as the principal axis for helping to further delineate commercialism, given that potentially a multitude of these axes could be applied to –

DR. BLOOMROSEN: Well, again, I think another way to imagine it is potentially using the various tools in conjunction with one another. You’ll see a lot of granularity within the current draft taxonomy, with, I would guess, further granularity to come. And if you took those and looked at how they might speak to the different domains, that might be another way to help lump and split, if you will.

Implicit in these quadrants, by the way, is education as well. So our thinking about this is evolving somewhat. If you look at what started in 2006 and where we are now, we’re getting a little more specific and trying to nail down some definitions. As problematic as they might be, to Mary Jo’s point, we feel these terms need defining at least to have the context and the discussion. So we have the taxonomy. We struggled with creating scenarios that made sense; Steve mentioned one. And then we recognized that where people might include something like education also needs to be teased out.

There was also some potential blurring between quality and research in the conversation. So we want to be as helpful as we can, but I think some of the synthesis is allowing us to draw some other conclusions as well.

MR. REYNOLDS: Okay, I’m going to cut this off now, and I apologize to those of you who didn’t get to ask questions. But as we said yesterday, we’ve got a large body of work to cover, and it’s cumulative. So keep asking your questions; the question you didn’t get to will probably come up in one of the next six or seven discussions that we have. And are either of you going to be around for the rest of the day?

DR. LABKOFF: I’m going to stick around for a good chunk of the day.

MR. REYNOLDS: Well, good. If any of these subjects come up and it would be helpful to call you back for your opinion, we’d be happy to do that. So thank you very much, and we’re going to take a ten-minute break – not 15, because we do have a tight panel coming in next. Thank you.

(Brief Recess)

Agenda Item: Impact of Standards on Uses of Health Data

MR. REYNOLDS: Okay, we’re going to start, please. Everybody, take your seats, please. John, can you hear me on the phone?

DR. HALAMKA: I hear you just fine.

MR. REYNOLDS: Okay, our next panel that we’re going to be hearing from is going to talk about the impact of standards on uses of health data, and we have – everybody knows John Halamka who’s on the phone with us and also Floyd Eisenberg.

So we’ll go in the order that’s on here. So, John, thanks for joining us, and we appreciate your being willing to call in, and I’ll tell you if anybody makes any faces while you’re speaking or anything.

DR. HALAMKA: Oh, that’s perfect. Thanks so much; I really wanted to be there in person. Today in Boston we’re doing a documentary on the release of human genome sequences on the web. There are ten individuals who have volunteered to release 60 million base pairs of our individual, person-identified genetic data plus all of our medical records. And, of course, there’s a very interesting standards implication, because we now have to gather all of our medical records from every place they may be and represent them in some standard way on the web along with our genome sequences. So I will certainly report back to NCVHS at a later time on the interesting implications and issues that come out of this initial meeting – how one, as a patient, gathers all one’s data and releases phenome and genome to the public.

To the matter at hand, we will spend 15 minutes introducing in general what HITSP has done with primary and secondary data standards, and then you’ll hear from Floyd in really great detail on the best thinking around how we’re going to take some of the aggregate data that we gather in hospitals, clinics, pharmacies and labs and use it for secondary uses such as quality analysis and population health and research.

Now I presume folks do have the slide stack that I sent along?

MR. REYNOLDS: Yes, we do, John.

DR. HALAMKA: Great. I will just start with the first slide. I will cover the AHIC priorities and timeline so you can see the primary and secondary uses of data in the use cases we get from AHIC, talk about what we’ve done thus far and what implications it has for secondary uses of data, and show you some detail on the complexity of what we’ve had to work through to ensure that primary data sources are also useful for secondary applications.

So if we can go to what would be slide number three in your stack, which is labeled AHIC Priorities and Use Case Roadmap – this is a slide that I apologize for. If John Loonsk is in the room, he knows this is what we call the eye-test slide. But it does, in one slide, illustrate all of the standards that we worked on in 2006 and 2007, what’s coming in 2008, and what you’ll see in 2009 and beyond.

Specifically, in 2006 we were given consumer empowerment, electronic health records and biosurveillance, where consumer empowerment really meant demographic data – who is the patient, including age, gender, such aspects as zip code and name – plus medication lists and history.

The EHR narrow use case was laboratory result reporting. But the implication is if we’re going to come up with a set of data standards for the uniform transmission of a laboratory result that we should also include all those aspects around that process that might be used for secondary purposes, and I’ll show you some detail on that.

A lot of effort has been made to be quite broad in our use of laboratory standards reporting so that it can be repurposed for multiple use cases in the future. One great theme of the standardization that HITSP is doing is that the last thing we want to end up with is 100,000 pages of guidance simply because every use case is a one-off. We really try to think about these use cases and break them down into base standards and what we call transaction packages – composite standards, reusable chunks. The idea is that laboratory data, which implies not only results but the vocabulary around naming a lab test, who it was ordered on and why it was ordered, becomes a set of transactions that can be repurposed across multiple uses, primary and secondary.

And biosurveillance is a perfect example: if we are going to detect outbreaks or bioterrorism events, we may want to look at radiology reports, laboratory results which are numeric, and a variety of demographics so that we understand the distribution of whatever is detected. So all of those kinds of standards that would be necessary for identifying a patient, identifying a lab and identifying a radiology report now serve a secondary use: public health monitoring of the de-identified or pseudonymized data set coming in from multiple sources of data.

In 2007, we’ve been given a variety of use cases which also have primary and secondary purposes. Consumer access to clinical information means that every individual with a personal health record should be able to get at their data regardless of where it’s sourced, and be able to take that data, put it on a network or even a physical piece of media like a thumb drive or a CD, and deliver it to a consumer of data, be it a doctor’s office or a hospital. So we’ve had to come up with a variety of standards allowing the extraction and transmission of data through multiple means, controlled by the patient.

The emergency first responder use case deals with Katrina-type mass casualty incidents: how do we ensure that medical records are available for an individual, and then, when treatment is delivered in such a setting, how does that become part of the permanent record.

Medication management includes all aspects of e-prescribing and inpatient medication reconciliation. So there are quite a lot of interesting issues in longitudinally keeping the medication lists consistent across all the various environments of care.

And quality is a perfect example of a secondary use of data. So we’ve been working very closely with the NQF – you’ll hear all the details from Floyd – on making sure we can look at all of the standards necessary to support AQA, HQA, HEDIS, JCAHO, et cetera. It’s not that we’re going to go through every single one of those hundreds of quality measures, but we believe that if you have atomic lab data, medication data and problem list data, you will be able to derive the bulk of these quality measures from good vocabulary-controlled problem, medication, allergy, medical visit and similar data. And so, although it’s certainly true that many of the data elements we’re going to need are used for the primary treatment of a patient in direct clinical care, we’re going to repurpose many of the existing standards we have already named for this secondary purpose of quality analysis.

And in 2008, early looks at what we might be getting are remote monitoring, which would include such things as device interfaces from the home; remote consultations, which may be a web visit or telemedicine; and referrals and transfer of care for continuity among providers, which would include problem lists, reasons for transfer and the results of such a consultation.

There is some definitely new personalized health care PHR work as we get more and more mature on our 2006, 2007 and 2008 consumer empowerment efforts; public health reporting of reportable infectious disease and direct communication to labs, again repurposing as we can all those microbiology and laboratory results that we had already used in 2006 and 2007 for other use cases; and response management, which would include more of the bioterrorism, mass casualty event kinds of data.

In 2009, you can see a vast array of possible other use cases that we could get to, and working very closely with the Office of the National Coordinator and AHIC will ensure that we do, in a Pareto kind of fashion, those standards which have the greatest impact on primary and secondary uses of data. I might imagine that such things as clinical trials and clinical devices will be things we are given in 2009.

Overlaying all of this is an immense amount of work on privacy and security, because privacy and security are foundational and cross-cutting across every one of these use cases. The great challenge is that there is not a uniform national privacy policy, and I might argue that we may never have one. So HITSP must instead come up with security standards that empower whatever regional variations may occur in privacy policy. Consents may be opt-in or opt-out, and one imagines a whole variety of specificity in what a patient might want released, to whom, and in what situation. So we have to deal with all the aspects of security and confidentiality, from authentication to role-based access control to auditing to consent management to secure transmission. By October of this year, we will have a first cut of all the standards necessary for nine different classifications of this whole security framework that HITSP has developed in conjunction with other organizations – the AHIC Working Group on Privacy and Security and the HITSP Foundations Committee – looking at best practices for security management across many industries.
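
A minimal sketch of the kind of policy-neutral security check being described, where the same code enforces either an opt-in or an opt-out consent model plus role-based access. All names, roles and fields are invented for illustration.

    from dataclasses import dataclass

    @dataclass
    class ConsentPolicy:
        model: str       # "opt-in" or "opt-out", set by local/regional policy
        decisions: dict  # patient_id -> True (consented) or False (refused)

        def permits(self, patient_id):
            if self.model == "opt-in":
                return self.decisions.get(patient_id, False)  # silence means no
            return self.decisions.get(patient_id, True)       # silence means yes

    ROLE_PERMISSIONS = {
        "treating_clinician": {"read_record"},
        "public_health": {"read_deidentified"},
    }

    def authorize(role, action, patient_id, policy):
        # Both the role-based check and the locally configured consent model
        # must pass; the engine itself is agnostic about which consent model.
        return action in ROLE_PERMISSIONS.get(role, set()) and policy.permits(patient_id)

    opt_in = ConsentPolicy("opt-in", {"p1": True})
    print(authorize("treating_clinician", "read_record", "p1", opt_in))  # True
    print(authorize("treating_clinician", "read_record", "p2", opt_in))  # False: no opt-in on file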

The next slide shows the timeline for all these standards. You can see, across 2006 through 2010, how we go from AHIC having an idea about needing a standard for primary or secondary use, to HITSP coming up with interoperability specifications, to those being recognized by the Secretary, to CCHIT incorporating them in its functional criteria for certification of electronic health record and HIS systems. And, obviously, when we have recognition by the Secretary, that has implications, because federal acquisition or federal purchasing of any type of system then has to be compliant with these standards. That’s really the definition of recognition of a standard by the Secretary.

So you can see the timeline runs from idea to complete vendor-based adoption in about three years or so, and you can see the litany of work that’s in process and the timeline for getting it all done.

As for the harmonization process – just so you know where we get these ideas and where our work products come out – AHIC prioritizes primary and secondary requirements for data standards and delivers use cases to HITSP, which turns them into very detailed use cases of actors, actions and events: if a patient wants to do this, if a public health agency wants to do that, if a doctor needs it for a particular purpose, primary or secondary, who are the actors and what are the actions and events that are important, and then what standards exist and what gaps exist. We produce an interoperability specification using as many existing standards as possible and ask SDOs to fill gaps where standards don’t exist.

This is a very regimented process that we go through with multiple layers of review and public comment. We have created comment periods of one-month duration at several points that enable a broad array of input from all stakeholders on the appropriateness of the standards selected and their implementability in the environment by vendors and institutions and certainly comments from the public on such issues as privacy and security concerns.

So, just briefly, on what we’ve done thus far, and then I’ll go through some detail showing you how we’ve prepared for secondary use of the data. On consumer empowerment, we have worked with all of the standards development organizations (SDOs) – including ASTM, DAQH, CDC, Federal Medication Terminologies, HL7, Integrating the Healthcare Enterprise (IHE), NCPDP, X12 and SNOMED – to create a single document-based clinical summary that includes the demographics, the medications, the allergies and certain aspects of patient preference, like advance directives, in an XML-structured, vocabulary-controlled document that incorporates all these standards – really the best thinking from every one of these organizations.

So it took a lot of effort to achieve that level of collaboration in a single interoperability specification, which is now completely finished. Every aspect of it is documented and on the web, it has been accepted by the Secretary, and every one of the implementation guides has been finished by the standards development organizations.

The next slide, slide seven on harmonization, describes some of the efforts we had to go through to bring all of this together – to take a lot of the niche standards that had been developed in the past and bring them into one common framework. I won’t go through that detail, but in essence we took the CCR from ASTM and the CDA from HL7 and brought them together into that one document, the CCD.
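
To make the idea of a vocabulary-controlled XML clinical summary concrete, here is a loose sketch in Python. The element names and code values are simplified for illustration and do not follow the actual CCD/CDA schema.

    import xml.etree.ElementTree as ET

    # A simplified stand-in for a CCD-style summary: coded attributes make it
    # machine-computable, display names keep it human-readable.
    doc = ET.Element("ClinicalSummary")
    ET.SubElement(doc, "Patient", gender="F", birthYear="1950")
    ET.SubElement(doc, "Medication", codeSystem="RxNorm (illustrative)",
                  code="12345", displayName="lisinopril 10 mg oral tablet")
    ET.SubElement(doc, "Allergy", codeSystem="SNOMED CT (illustrative)",
                  code="91936005", displayName="allergy to penicillin")

    print(ET.tostring(doc, encoding="unicode"))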

Biosurveillance – now here’s a very good example of how the standards we brought together for consumer empowerment and for the EHR can be used for secondary purposes. This is an opportunity for taking anonymized or pseudonymized data – the raw clinical data as might be used in consumer empowerment and labs – and using it for public health surveillance: looking at the demographics of where events may occur, the symptoms that may occur, and the lab and radiology results that may point to a specific disease process or event.

Next slide. The key take-home is that we have repurposed for biosurveillance the EHR laboratory result reporting we did for clinical care. The lab-to-doctor’s-EHR standard is the same for that clinical use as for the biosurveillance use. So we have worked with HL7 on really two kinds of standards: a messaging standard that uses HL7 2.5.1 to go from a lab to a doctor’s EHR and transmit data in real time from system to system, and the documents I described for consumer empowerment, a mechanism by which an XML, vocabulary-controlled, human-readable and computable document containing lab data can be exchanged.

Now for the EHR, we have had to work with HL7 to create HL7 2.5.1 implementation guides that are appropriate for all the AHIC use cases – EHR, biosurveillance and consumer empowerment – and that ballot is currently active. It required HL7 to actually amend their existing standards. The ballot closes on August 4th, and we do expect it will be successful. So Secretary Leavitt knows this one is coming. We have finished everything about the EHR except this one ballot, and therefore recognition and federal procurement will likely take place in the March time frame because of the delay in getting this ballot done.

Next slide. Just to show you some detail of that laboratory message that HL7 has now incorporated in this ballot, which enabled this standard to be used for primary and secondary purposes: we had, for example, to have a vocabulary-controlled institutional identifier to understand what institution ordered the test and what lab did the test. This really enables biosurveillance and public health monitoring to localize test results to a regional lab or place, an ordering physician, an ordering entity.

There are many aspects of the workflow around the lab beyond the simple result, which would include why it was ordered and whether this was an employment-related illness – so if you want to look at such things as injuries that occur on the job, or infections acquired because of, say, a needle stick, that kind of data needs to be placed in the laboratory order. Also the clinic name; if the patient was admitted to the hospital, what the type or acuity of admission was; how they were discharged and into what kind of care – long-term acute care, home or another place; when the service was delivered; and what physicians were involved in the care process. And then there is a whole variety of controlled terminology: LOINC for what lab test was ordered, in vocabulary-controlled detail; SNOMED for the nature of the lab test and the reason for its order; UCUM for the unit of measure of the laboratory result itself. We’ve standardized those controlled vocabularies across all the AHIC use cases.

And then, finally, just to show you again the very granular detail: HL7 has created an implementation guide that modifies individual data elements in the PV1 and PV2 segments to empower the secondary uses of data – for public health, for research, for biosurveillance – that would not otherwise have been included if this were just a purely constrained message reporting a result back to an ordering physician in an EHR.
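
An abbreviated, hand-written HL7 v2.x-style result message may help make the preceding description concrete. The segment layout is simplified and the identifiers are invented; this is a sketch, not a conformant example from the HITSP/HL7 2.5.1 implementation guide.

    # MSH = header, PID = patient, PV1 = visit, OBR = order, OBX = result.
    message = "\r".join([
        "MSH|^~\\&|LIS|REGIONAL_LAB|EHR|CLINIC|200707180905||ORU^R01|MSG0001|P|2.5.1",
        "PID|1||123456^^^HOSP^MR||DOE^JANE||19500101|F|||^^HYATTSVILLE^MD^20782",
        "PV1|1|O|CLINIC_A",                        # visit: outpatient, location
        "OBR|1||LAB778899|2345-7^GLUCOSE^LN",      # test ordered, coded in LOINC
        "OBX|1|NM|2345-7^GLUCOSE^LN||182|mg/dL^^UCUM|70-110|H|||F",
    ])

    # Pull the coded observation and its UCUM unit back out of the message.
    for segment in message.split("\r"):
        fields = segment.split("|")
        if fields[0] == "OBX":
            print("LOINC:", fields[3].split("^")[0],
                  "value:", fields[5], fields[6].split("^")[0])

The same coded fields that tell the ordering physician the glucose is high are what let a public health system aggregate results by test, unit and location.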

As for next steps, we continue to work with AHIC and, as I’ve said, we will finish up the ballot for the HL7 2.5.1 message on August 4th. There is one other piece still outstanding, from OASIS, a standards organization that’s working on a message for hospital resource availability: if, in the case of a mass casualty or bioterrorism event, you would like to know what hospital beds are available in a given geographic location, this data standard that reports on a hospital’s capacity will be done in the fall time frame, and we’ll give that to Secretary Leavitt probably the first of October.

We’re working very closely with CCHIT to align their certification processes with the generation of the standards by HITSP, so that the vendors will incorporate these standards and we will therefore have products that empower these secondary uses of data – presuming, of course, that privacy, security and patient consent are achieved. But if we have the vendor products gathering and transmitting the data according to HITSP standards, that is truly going to empower all the other uses of data that public health and other agencies may desire.

We are really focused, as I mentioned, on security and making sure we do document consent, ensure auditing, ensure patient access to audits, all the things that will make data transfers possible. Security standards are absolutely requisite before the public can feel good about the data standards that transmit their data to various entities. And, of course, finishing up our additional 2007 use cases and awaiting the December delivery of 2008 new use cases from AHIC.

So that’s the broad strategy, the broad overview of primary and secondary uses of data, and a capsule summary of the work to date. And Floyd will, of course, now show you in great detail our best thinking on how we’ll measure quality, how we’ll gather data for secondary purposes and some of the aspects of coordinating SDOs and filling gaps around them.

MR. REYNOLDS: John, thank you. Floyd?

DR. EISENBERG: Okay. There is a fair amount of detail in some of these slides, and I’ll try to go through them fairly quickly today. We’ll look at some of the issues around policy and the data re-use landscape.

You’ll notice in my slides that since last year, when we talked about secondary use in our committee, we have had a lot of feedback – which apparently you did yesterday as well – that public health users, especially, are not secondary users. They use the data as primary users; the data just didn’t originate for public health purposes.

So I have adopted the term data re-use. Others have suggested value-add, but let me not get into what you want to call this; let’s just say we’ve dealt with that in HITSP as well. We’ll look at some of the sources of data, quality measurement, re-use management and some of the next steps. Obviously, there is urgency from the Presidential Executive Order promoting quality and efficient health care in federal government administered or sponsored programs.

In HITSP, as well as in IHE, there has been a lot of discussion – while we were waiting for the use case itself – about the re-use of data and where data are reused. In the Patient Care Coordination Committee at IHE, it was requested that folks from the quality area, public health and clinical trials get together to try to identify the commonalities in their needs for data and the differences. In doing so, there was some initial analysis showing a lot of data re-use issues for financial analysis, biosurveillance, reporting of infectious disease – as you’re seeing the next use cases come out – and data sharing for quality as well as research.

And today, as this committee is well aware, coming from existing systems, there are many point-to-point interfaces to make that happen, and they are very complex.

So in order to deal with that more effectively, we came up with three areas of data re-use. The first is rule-driven reporting, meaning a retrospective analysis of the data that exist to determine how well something occurred, to determine the appropriate cohort of patients for a trial, or to identify those who have had certain adverse outcomes.

Related to that is simple case reporting: identifying any single case with an adverse event, or one that might fit into a category. And the third is a bit more complex, and we have not bitten that off yet as part of the use case from ONC: clinical decision support, the decision guidance provided concurrently, as a problem is identified for a patient, to determine what the next step should be as part of direct clinical care – guidance that also applies to the secondary use, since the data will be collected for measurement or for placement into the cohort.

DR. OVERHAGE: Could you distinguish the first two before you go on? In fact, I didn’t understand –

DR. EISENBERG: Okay. Actually, the two may well be very similar. Simple case reporting is for the one case, in the middle, and rule-driven reporting is the aggregate. In many respects, it’s the same rules that are driving the decisions. So the left side is the aggregate, the middle is the individual – those are more retrospective – and then there’s the concurrent decision guidance.

But these three different areas of need are what we identified, so that data being published can be subscribed to: for rule-driven reporting, the aggregate analysis subscribes to the data that are needed, and the reporting group knows what it can subscribe to from what is published; in the individual case, there is the case submission and the case report, which are very similar to the aggregate.
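
A minimal sketch of that relationship, assuming an invented patient structure and rule: the same rule drives both the aggregate, rule-driven report and the single case report.

    def rule_reportable(patient):
        # One shared rule, e.g. "positive culture for a reportable organism".
        return patient.get("culture_result") == "positive"

    population = [
        {"id": "p1", "culture_result": "positive"},
        {"id": "p2", "culture_result": "negative"},
        {"id": "p3", "culture_result": "positive"},
    ]

    # Rule-driven (aggregate) reporting: apply the rule over the whole set.
    cohort = [p for p in population if rule_reportable(p)]
    print(f"aggregate report: {len(cohort)} of {len(population)} cases")

    # Simple case reporting: the same rule fires for one patient at a time.
    new_case = {"id": "p4", "culture_result": "positive"}
    if rule_reportable(new_case):
        print(f"case report submitted for {new_case['id']}")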

The decision guidance is more guideline-related, and we have not really approached that or what standards we would use. That is one of our next steps after this year’s effort, and you’ll see that in the HITSP group we actually have three phases; that falls between the second and third phases.

In the IHE group – with information also coming from the Collaborative for Performance Measure Integration into EHRs, which is a collaborative of the AMA, CMS and NCQA, as well as the IHE group – certain types of workflow were identified as needed to determine a cohort and to manage that cohort around quality.

And so, starting with quality: what are the criteria; what is the group or site we’re looking at; how do we identify the cohort – the patients meeting the criteria for inclusion in that group; what are the exclusion criteria that take a patient out of the denominator, if it’s a quality measure, or out of the numerator – because if they’re expected to receive a medication but are unable to for medical reasons, what are the reasons that remove them from the study; and what are the reporting data – how are they reviewed and fed back, how are they analyzed and, actually, before analysis, aggregated and communicated.

So we looked at those categories, and what was agreed by the groups present at the IHE committee was that the quality strategies, research strategies and public health strategies pretty much align, with similar workflows to identify cohorts of patients out of a set of data from a population.

A white paper was requested, which we will be putting together; there was some delay in getting that done, as various groups in public health asked to have additional input. But so far all of them map to the same issues.

We also looked at what data elements are needed to measure this population. I called this AHIC on the slide – I apologize; I wasn’t able to change the slide – but it was AHIC that asked NQF, the National Quality Forum, to create a Health Information Technology Expert Panel to determine the data elements needed for the quality use case: for the AQA (Ambulatory Quality Alliance) and Hospital Quality Alliance measures, the data elements required to get to the high-priority measures. Those data elements are listed here. I took some liberty with what actually came from the HITEP panel, because I added some additional terms, but they’re the same data elements: the definition of the measure; demographics about the patient; results, which might include laboratory results and imaging results and, in this case, may be quantitative or qualitative.

An example is left ventricular systolic dysfunction measured by ejection fraction. The quantitative form is a numerical ejection fraction greater than or less than 40. The qualitative form is a set of terms indicating moderate to severe dysfunction, which may be allowed by different measures, and I’ll show a slide – also not meant for reading – of what is described in some of the measures, to see how we’d actually be able to collect those data.
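
A minimal sketch of the quantitative/qualitative point, using the spoken example of an ejection fraction below 40; the term list and threshold here are illustrative, not taken from any measure specification.

    QUALITATIVE_LVSD_TERMS = {
        "moderate dysfunction",
        "moderately severe dysfunction",
        "severe dysfunction",
    }

    def has_lvsd(ef_percent=None, qualitative_term=None):
        # A measure has to accept either representation of the same finding.
        if ef_percent is not None:
            return ef_percent < 40                                     # quantitative
        if qualitative_term is not None:
            return qualitative_term.lower() in QUALITATIVE_LVSD_TERMS  # qualitative
        return False                                                   # neither documented

    print(has_lvsd(ef_percent=35))                          # True
    print(has_lvsd(qualitative_term="Severe dysfunction"))  # True
    print(has_lvsd(ef_percent=55))                          # False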

Substance administration, which includes medication, oxygen and other substance administration – that term being used because it was the term used by HL7 for medications and other substances. Procedures performed. Location – and location is specifically important here with respect to not just where the patient is, but where they are going next. So at a hospital discharge, if the patient is going to hospice care, that becomes an exclusion for the measure. So location was important to be able to calculate who is and who is not in the measure.

Events – clinical observations from findings; problems, which include allergies but are not limited to them; diagnoses; and history. And as we looked at these in the HITSP committee, we had additional discussions over procedures and diagnostic tests. Depending on the study or the measure, we could be looking at tests that are ordered, tests that are performed, those that have results and will require a result, and also procedure date and time – especially with respect to surgical measures that have a pre-operative antibiotic within a time frame related to an incision. So the trigger events, given that example, for date and time.

Lab information – the result value as well as the order, because different measures vary: some look for the order, some look for the result. Symptom information, physical findings and observations, vital signs, physical exams, medication allergies, true or anticipated side effects, diagnoses – and what types of diagnoses: principal, admitting, chronic conditions, acute. And some of the information, especially related to exclusions, may come from family history; patient past history, which could include medical or surgical; social history; allergies; medications – existence, orders and trigger events.

Basically, the whole record. In other words, to look at measures and to look at secondary use or re-use of data to be able to find a cohort in a population, there’s very little that’s missing. So that’s a challenge, because now our question is: is there a standard to get to all of these elements, each of these elements, so that we could define them for dealing with this.

So we started with the list of 60 measures, 30 of which had a lot of detail in them, and I’m going to show one example here where we can find some good standards and some where it’s a little bit problematic. In this case, it’s the ACE inhibitor or angiotensin receptor blocker measure – I’m going to refer to the HQA measure – so, at discharge of a patient with left ventricular systolic dysfunction, do they have a prescription for one of these medications.

And so it’s: are they on the medication; do they have systolic dysfunction; do they have allergies; or do they have other medical reasons. And in terms of whether they have an MI – which is actually the first criterion – that was well defined. That’s ICD-9; it certainly could be SNOMED. No problem with that.

Did they have a test for left ventricular dysfunction – we certainly can look at that as far as the standard. But here is a fragment of the measure from the Joint Commission, and our problem was how to take this to: is there a SNOMED code that indicates this, that and all these components – and there very likely is. Part of what we will do in HITSP is take five measures down to that very detailed analysis to say whether SNOMED codes really exist here, because we’re looking at SNOMED as the method to pull those data. And if there aren’t, then recommend that there need to be.

But more importantly, we’ll be referring back to the HITEP expert panel and to standards for measure developers, saying that these kinds of descriptions need to be codified more in the terminology if we are to get this information out of a problem list and elsewhere in the record, whether we’re using parsing technologies or taking discrete codified data. So that becomes problematic.

Another one was heart failure patients with documentation that they or their caregivers were given written discharge instructions on these six items. The six items are listed in bold. There were no SNOMED codes. There were no other codes. That was the end of the measure. So we then went through SNOMED – or I went through SNOMED – to see whether there is some terminology, some term or code, that would apply to this, so that if it were codified out of the record, I could say, okay, we met the standard.

Well, SNOMED played really well. There was a term. Very likely – and I can guarantee the first one came from some nursing terminology that is now in SNOMED. I suspect that most of these others did as well, but not the same nursing terminology, which led to a discussion in our group: we have public domain nursing terminologies, and the two public domain ones that do not have cost to providers we should put into our standard spec. Those were Clinical Care Classification and Omaha.

And what we found was that both of them, as others, are also in SNOMED. And after just a small bit of discussion, we came up with a consensus statement. Again, you don’t have to read it all here. But for the purposes of HITSP and for purposes of interoperability with respect to the ONC quality use case, mapping is required through SNOMED CT. So that will be the standard, at least the one we’re recommending – we’ll see what the comment period comes back with – for the terms for quality and secondary use or data re-use.

For local interface – user interface – terminology, any of the preferred nursing terminologies are fine as long as, for the data re-use, they are mapped to SNOMED, and that was the consensus we came to, with agreement from all the nursing groups and the terminology groups in the discussion.
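
A minimal sketch of what that consensus implies in practice – local interface codes at the point of care, normalized to SNOMED CT for re-use. Every code below is a placeholder, not a real CCC, Omaha or SNOMED identifier:

```python
# Local/interface nursing terminology codes are fine at the point of
# care; for data re-use they are mapped to SNOMED CT. Codes are fake.

LOCAL_TO_SNOMED = {
    ("CCC", "A01.1"): "123456789",   # hypothetical mapping
    ("OMAHA", "42"):  "123456789",   # two local codes, one SNOMED concept
}

def normalize(code_system, code):
    """Map a local interface code to SNOMED CT for data re-use."""
    try:
        return LOCAL_TO_SNOMED[(code_system, code)]
    except KeyError:
        raise ValueError(f"No SNOMED mapping for {code_system} {code}; "
                         "flag for terminology harmonization")

print(normalize("CCC", "A01.1"))  # '123456789'
```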

And briefly – I’m not going to go through the quality use case, but basically, as we look at hospital-based care and at clinical care, one thing that’s important in each of the scenario flows – and the reason I included this – is that the patient-level data is identifiable in most of these situations until it’s aggregated. And I believe there’s a good reason for that, and that is to identify the entire population. Whether it’s for quality, for clinical trials or for public health, unless you know your entire population and the full data set, you could be eliminating patients from the set if you require consent to get them into your population analysis, and that could be skewing the view of the population.

So I think it is important that we have the full patient-level identifiable set to do the initial analysis, with appropriate data stewardship. And we see the same on the clinician ambulatory side – that was recommended as identifiable patient-level data.

And for provisioning, it’s after the aggregation, and for getting back – whether it’s clinical trials, quality or otherwise – to the individual patient, that anonymization, pseudo-anonymization or re-identification would need to be dealt with. But in the initial analysis, my suggestion is that it include the entire population of data. I’ll just skip through this.

One of the other reasons was another piece of the re-use case, and that was augmenting clinical information. When information gets to a health information exchange or some regional or third-party analyzer, there’s often some missing data that may be present in written records that are not scanned and not available to the analyzer.

And the use case, for many good reasons, gives the care provider the opportunity to augment it – not make it up. I don’t like the term augment because it makes it sound like the data are going to be fabricated. That’s not the intent. The intent is that the data do exist but are not in electronic format, and it does imply some auditing will be required to make sure the data reported to exist really did exist. But if the record is to be augmented, this has to go back to who that patient was, so the local site can identify the right patient and augment the data.

Basically, for re-use management, the information required comes from many sources – just as a summary, there are quantitative and qualitative sources. Some of it will come from freeform text. Some of it will be codified. Some will be unavailable, and that’s where augmentation will come in.

Text parsing is one way to get to it. Terminology mapping is another. But a hybrid methodology, I think, will remain a long-term requirement for quality and for public health reporting as well. Basically, of the three strategies – I did not get into commercial secondary use of data – for research, public health and quality, it’s pretty much the same requirements.

As far as anonymization – just briefly, the benefit is privacy protection; the limitation is that to truly anonymize, you have a limited data set. For the biosurveillance use case last year, the intent was just to find a trend – that there is a new syndrome out there, that there is a situation with influenza or with anthrax – and it didn’t matter who the patient was. It could even be adjusted statistically for the fact that some patients might be reported more than once because they were seen in more than one facility, because we were looking at aggregate population trends. Once we get to quality and public health at the local level, it has to get back to that patient; and for real quality measures and identification of cohorts for health care operational management, more detail is needed.

Pseudo-anonymization does add some privacy protection but also increases burden and cost for creation of the pseudonym and for re-identification, depending on where the pseudo-anonymization occurred – at the local level, the regional or larger regional level, or all three. And because we are asked to be architecture neutral, it could actually be occurring in all three depending on the local implementation. So --

MR. REYNOLDS: Could you give us a little more definition of what pseudo-anonymization means?

DR. EISENBERG: Pseudo-anonymization – and you’ll actually hear a lot more about that from Lori Fourquet just after lunch – means providing a pseudonym for the patient so that the patient cannot be identified by the receiver of that information on the other end.

Now there are a lot of ways to take minimal data and know who the patient was, whatever the pseudonym is, but the basic premise is to provide a pseudonym so the direct patient identifiers do not accompany the data. Then it becomes a question of whether you provide just the age, just the year of birth, just the month of birth; and depending on how high-level you make that, it becomes more difficult to determine how well you’re doing on a quality measure in the first four months of the year if you’re really not sure who in your population is what age at what time. The higher a level you take that to, the more you limit your ability to use the data.

But pseudo-anonymization is basically providing an alternate ID, and many of us have done that in local settings in simple ways – though not using standards – for years, for VIPs in our own hospitals. That’s been going on, but we’re looking at it in a standard way. Okay.
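
One common way to provide such an alternate ID is a keyed hash. The sketch below is purely illustrative – a keyed HMAC is one technique among several, the key and field names are made up, and key governance is entirely out of scope here:

```python
# Replace the direct patient identifier with a pseudonym so the receiver
# cannot identify the patient, while the key holder can re-derive the
# same pseudonym for linkage across records.

import hmac, hashlib

SECRET_KEY = b"held-by-the-data-steward-only"  # illustrative key

def pseudonym(patient_id: str) -> str:
    return hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256).hexdigest()

record = {"patient_id": "MRN-0012345", "dx": "influenza", "age_years": 54}
shared = {**record, "patient_id": pseudonym(record["patient_id"])}
print(shared["patient_id"][:16], "...")  # stable pseudonym, not the MRN
```

Without the key, reversing the pseudonym is infeasible; with the key and a candidate list of identifiers, the data steward can re-identify under controlled conditions, which matches the re-identification protocol described later in this session.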

As far as some of the detail of what we’re looking at now for collecting some of the data: we’re looking at using some of the profiles from IHE – the Retrieve Form for Data Capture – to identify the measure definition and collect the data through a form filler; and, where that form filler is unable to get the data electronically, how a human may be able to add additional data to complete the report.

This actually was one of the optional items from the biosurveillance use case interoperability specification for public health reporting, which we know we will be able to re-use for the use case that we expect in December on public health reporting. But it could apply to quality measures as well.

We also see the Query for Existing Data – QED, a profile out for public comment now from IHE – being used to get data from different repositories as needed for quality measures; the RID profile from IHE to get information for display, which would be displayed for a human to read and abstract but at least gets data at an analysis level; and the use of XDS registries to get data out of existing documents, out of the CDA structure. So we would re-use many of the existing profiles that are already out there to get the data for analysis, ask for additional information where necessary and get it back to the agency.

What we’re not tackling right now is how a measure is defined by a measure developer in a standard format, because of work ongoing – and I’ll get to that in a minute – or how the report back to the measure requester will look, because there are different ways that happens today but no definite standard. So that’s still in development.

And actually, our intent is to leverage IHE and also message-based collection of data – those will be part of the constructs we will be doing over the next two months – and also to identify a standard export and import model. What we’ll be doing is looking at efforts from the Collaborative for Performance Measure Integration with EHRs – which needs a shorter name – which is looking at an XML schema for measure expression, as well as identifying a standard import model.

There’s currently HL7 structured report activity ongoing, which should soon be going to ballot, as well as work being done with HL7 on expression of reportable diseases and hospital-associated infections from CDC.

So hopefully – given that these are ongoing now – these will be items that we’ll be able to tackle in 2008 for the rest of the use case. What we’re looking at is to re-use, and to look at anonymization where and if it applies; to manage document sharing – these are all transaction packages from last year. There are some new ones we’ll look at: patient demographic query, and query for existing data as a new package that we’ll be looking at this year.

Patient-level export of quality data is a conglomeration of several profiles, and we’ll be looking at that – that’s also new this year – and retrieve form for data capture and how that would apply for quality. So there are a couple of new ones. I won’t go into all the details of these.

I thought I had another slide – I guess not. Okay. That’s the overview of our current activities, and I’d be happy to take questions or comments.

MR. REYNOLDS: Good. Thanks to both of you. John, you still on there with us?

DR. HALAMKA: I am indeed.

MR. REYNOLDS: Okay, good. Well, I’m going to open up to questions. Justine has one, and then –-

DR. CARR: Yes, thanks very much, John and Floyd. I have a question, and maybe it’s very fundamental, so apologies if it’s very concrete. What you’re mapping here is how we get quality information – how we get clinical information – for quality reporting. And you’ve demonstrated that we have a crosswalk of how data elements could be standardized and transmitted.

And so my question is on the other side: as we’re building electronic health records, how much will data entry by the clinician change in order to facilitate this crosswalk? In other words, can things go on just the way they do today, with all this crosswalk happening somewhere behind the scenes, or will clinical notes become more checklist-like and bounded?

DR. HALAMKA: Well, this is an excellent question, and let me answer it in a couple of ways. Today at Beth Israel Deaconess, we have problem list management in our ambulatory care areas that is a combination of free text entry and choosing terminology from a controlled vocabulary.

Well, that means that about 70 percent of our problem lists end up being free text which is non-computable from a quality standpoint. So, you know, what are your options.

On the one hand, one could use diagnostic codes – which are billing and administrative codes – as a proxy for a problem list. But wait a minute, that’s not exactly the same, because a diagnostic code is an historical item whereas a problem list is a snapshot in time of all conditions you may have. So they’re different.

So the implication may be that we are going to now cause doctors to use controlled terminologies – nursing terminologies like what’s called the CCC, the Clinical Care Classification, or SNOMED – and not give them the option of just typing in free text: “I think the patient has a headache today.” That will be a change for doctors.

Allergies, also, in the past have been a mixture of codified and non-codified data. Those will have to be cleaned up. And, as Justine has said, one might imagine that certain aspects of text-based data – such as, today, a left ventricular ejection fraction in a PDF, a non-computable text-based data element – will now require a structured data element in addition to what may be a free text note describing an overall impression.

So I think we’ll see a staged implementation over time: cleaning up problem lists and allergies, and making notes combinations of structured data entry plus free text.

DR. CARR: Thank you.

DR. EISENBERG: And just if I can add to that, I think it will be a combination, and I think you will see different implementations of those combinations at different sites. Some will have different ways of getting to it. There was a Gartner report recently – I believe it’s a December, maybe January, review of an implementation of a clinical system – with that exact problem: the problem lists were free text and the doctors didn’t want to use the diagnoses, and they used a proprietary solution to get from what they wanted to say to what was important for measurement. That was IMO, a company with which I have no relationship.

But just – there are other mechanisms to do that, to get from the free text, or whatever they want to enter, to the measurement criteria that we need.

Another issue is how one captures an exclusion. Today, it may be in some free text note saying that this patient can’t tolerate this drug, but it’s not on an allergy list because there are no allergies. But there may be implementation mechanisms for making it part of the order set on admission for that problem – rather than a computable order, it becomes a documented piece out of the order set saying, I’m not ordering this for this reason.

So there are different ways to implement components of this. But it will change how some of the electronic records will turn out.

DR. CARR: Just one other part of that. As you know, we’ll be in a hybrid stage for a while. So do you think there will be a larger role for scanning technology – sort of word identification in written notes – that can then trigger the structured data element?

DR. EISENBERG: I’m interested in John’s comment on that. But I do, and I think there are challenges around that – it needs to be validated and accurate in terms of collecting these data, especially as we don’t just use the output of that scanning and parsing for measurement, but also as part of the decision support we want to do. But I do think there will be more of that as time goes on.

DR. HALAMKA: And I would just comment that this is very hard to do. To give you an example: if you build a natural language processing system and an optical character recognition system that will parse free text data and try to codify it into structure, you have to be really careful that you don’t run into the following.

Imagine that I dictated a note that states, “The patient uses alcohol swabs to disinfect their skin before injecting insulin.” Well, the natural language parser sees “The patient uses alcohol,” and has now identified them as a candidate for an intervention – say, Alcoholics Anonymous.

DR. EISENBERG: Especially since he injects alcohol. No, that is a challenge, and it’s not only that – there’s also negation, and how does parsing handle that. There are a lot of challenges around that, which actually suggests there needs to be validation of what that parsing produces – that this is from the clinician, this is what they meant. And the last thing we want to do, though, is create extra workload for the clinician who’s trying to enter the data, just to make sure we’ve identified it correctly.

So it is a challenge. It’s going to be an art to see how this works.
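
A toy illustration of the parsing hazard in the “alcohol swabs” example – naive substring matching flags the note, while even a crude context rule (in the spirit of NegEx-style approaches) avoids the false positive. Real clinical NLP is far more involved; this is only a sketch with made-up rules:

```python
note = "The patient uses alcohol swabs to disinfect their skin before injecting insulin."

def naive_flag(text, keyword="alcohol"):
    # Naive approach: any mention of the keyword triggers the flag.
    return keyword in text.lower()

def context_aware_flag(text, keyword="alcohol"):
    words = text.lower().split()
    for i, w in enumerate(words):
        if w.startswith(keyword):
            following = words[i + 1] if i + 1 < len(words) else ""
            # Crude rule: "alcohol swabs/wipes/pads" is not alcohol use.
            if following in {"swabs", "swab", "wipes", "pads"}:
                continue
            return True
    return False

print(naive_flag(note))          # True  -- false positive
print(context_aware_flag(note))  # False -- the swab context is recognized
```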

MR. REYNOLDS: Simon?

DR. COHN: Well, Floyd, thank you for reminding me how difficult all of this is. I’m also reminded – as you all know, I work for Kaiser Permanente, and as a managed care organization we’ve been living with HEDIS and NCQA measures for many years – that hybrid manual chart abstraction will probably still have a major role going into the future.

I guess I’m also reminded why, for some of these measures, the interim or near-term solution is going to be adding, I think, CPT Category II codes and G codes and all of this to the actual bill to get this stuff going – which I understand is the near-term strategy – versus the longer term strategy which is, I think, being presented here.

Let me ask you maybe a more fundamental – I’ve got one or two more fundamental questions here. And, you know, let me think about which order I want to ask them in, because I really have two.

One is – and John, you may want to comment on this also – that yesterday we heard testimony describing the quality paradigm as terribly broken, in the sense that we spend a tremendous amount of our time doing quality reporting and quality measurement that doesn’t somehow connect well with quality improvement.

And obviously I see a very sophisticated and detailed review of getting the measurement – I think that’s what we’ve seen. I know I’ve seen, at a high level, the AHIC quality use case somehow bringing this back into quality improvement, but some of this is escaping me in terms of how we get this into a quality improvement paradigm. Can you comment or enlighten me on how this actually fits together?

DR. EISENBERG: Well, I’m sure there are others here who could comment very well. But one concern I’ve had – and in fact I am presenting to the HITSP panel on Monday a couple of questions that came up as we’re doing all of this effort – is, as in the example, someone checked all those boxes, or a box, indicating that all that education was given to the patient. We have no idea if the patient understood it, or if it was followed up and the patient actually did what was necessary. So what is the purpose of the effort to collect data to prove that a health care provider did X, if X doesn’t really improve the patient’s care and doesn’t change the outcome?

There was a review in December – I believe it was in JAMA, with an editorial by Susan Horn – showing very minimal improvement in mortality, and I believe length of stay, related to good compliance with quality measures. There was a little bit, but not a lot. So are we looking at the right measures?

And I think there are two answers. One is that sometimes we’re looking at an outcome the measure wasn’t intended to reach, so we have to make sure we look at the right outcome to associate with the measure.

But the other is, I think, the measure – we need growth in the measure area to be sure we’re really looking at improvement in health care and the quality of care delivery, not just the process piece itself.

And A1C is a nice surrogate for good quality care, and it’s a nice process step that I can look at. But some of these elements are not quite as clearly related. So the more we can get to good surrogates for an outcome, the better. But I think we have some limits yet.

DR. HALAMKA: And I completely concur. I think of quality measures as coming in two kinds – process measures and outcome measures. A process measure says a patient with congestive heart failure was given an ACE inhibitor upon discharge. An outcome is that they didn’t return to the hospital five times in 2007 like they did in 2006. And I think the challenge with this whole quality measurement schema is that the measures themselves are going to change over time, because they may go from process measures, as Floyd has said, to outcome measures in the future.

So HITSP has to come up with a standard that will allow us to measure quality regardless of how the measure changes. As long as you have data around problems, meds, visits – these kinds of things – you’re good. And this is why I’m slightly reluctant to say, oh, here is a HITSP report and here are the 100 best quality measures in the world, and we’re going to have standards for those.

I’d rather break the quality measures into their atomic data elements and say we’re supporting problems, meds, allergies, visits, history, and you do with that data whatever you think is best to measure quality.

DR. EISENBERG: And that’s basically the approach we’re taking in the Technical Committee, for that reason. But I think being able to select that information on a broader scale than just sampling will enable us to identify which quality measures really do lead to outcomes, because right now we’re dealing with sample populations and limited data to actually find the evidence for outcomes. So I think it will help, but it’s going to take some time.
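
A sketch of the “atomic data elements” idea just described – standardize problems, meds, results and visits once, and express each (changeable) measure as a function over them. The codes and thresholds are illustrative assumptions only:

```python
from dataclasses import dataclass, field

@dataclass
class PatientData:
    # The atomic, standardized elements; measures are built on top.
    problems: set = field(default_factory=set)
    meds: set = field(default_factory=set)
    results: dict = field(default_factory=dict)
    visits: int = 0

# Process measure: CHF patients discharged on an ACE inhibitor.
def chf_ace_process(p: PatientData) -> bool:
    return "chf" in p.problems and "ace_inhibitor" in p.meds

# Outcome measure: admissions kept under some threshold this year.
def readmission_outcome(p: PatientData, max_visits=2) -> bool:
    return p.visits <= max_visits

pt = PatientData(problems={"chf"}, meds={"ace_inhibitor"}, visits=1)
print(chf_ace_process(pt), readmission_outcome(pt))  # True True
```

The point of the design is that when a measure moves from process to outcome, only the measure function changes; the underlying data elements and their standards do not.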

DR. COHN: This is actually not really a follow-on, so I’ll pause on this issue. If you want to take that one, then I’ll ask the other question.

DR. TANG: I mean, I think you had a very good point, Simon. One suggestion for the group – just as Mark Overhage asked you what rule-based reporting means, and just as we dealt with the term secondary use – it might be useful to make it simple and call it population reporting, case reporting, and care decision reporting. It just helps in understanding.

So, related to that – it’s ironic, and I don’t think you can change this overnight, but it’s the question that Simon is asking – you said that we’re not working on the care decision part now. Ironically, it’s almost as if we should do that first, because that was the goal, and then figure out how to report the results of that, rather than report on things that don’t actually target the outcome and then figure out how to make it happen.

In some sense, it’s almost a paradox, but that’s not something we can figure out tonight.

DR. EISENBERG: Well, I fully agree. I think the issue was, given the time frame, what’s the task that can be done in the near term. But I think identifying the standards for the terms that could be used routinely will help the front end from the back end, rather than starting out with the clinical decision support. Knowing what it is you’re looking for helps the decision support, even though what you’re looking for isn’t quite robust enough to show the right outcomes – at least it’s closer.

So I don’t think there’s a problem with that, and there are concurrent efforts right now about how to do concurrent measures. There’s a quality domain that was sponsored by the American College of Cardiology and the American Heart Association along with NHINS, and their measures are all concurrent. So it was an interesting discussion to say: I don’t just want the problem that’s well defined; I want the problem that I think is going on now, as the patient is admitted to the hospital, and that’s what I want the decision support to be based on.

And it could be that that’s not what you end up with as your final diagnosis or the final reason for being in the hospital. So your population cohort may be different on a concurrent versus a retrospective measure. How to resolve that has been under discussion, but it’s really been a time frame issue as to how we get it done.

MR. REYNOLDS: Okay, you’ve got Simon and then Kevin and then Paul, and then we’ll break for lunch.

DR. COHN: Okay, I’m standing between you and lunch. This is bad. Just another question, and I apologize – you may not be the right one to ask. I probably should be asking this of Kristen or Erin around the quality use case.

As I’m looking at all the diagrams, and all of the data being shipped and anonymized and sent from provider environments or health plan environments to sort of central entities for processing – which is a lot of what this whole diagram in the quality use case is all about – I was actually reminded of a presentation, harking back to the EENIA(?) sessions that we sat in on a couple of weeks ago, where – at least in Europe, and I believe it was in England – they were talking about models where queries were effectively sent to the actual holders of the data, with the idea being that they got responses back on those queries.

Now, of course, England is a different place than the United States; they likely have more comprehensive data stores. But recognizing that we do live in a decentralized environment, does all of the work that you’re doing contemplate a situation where, rather than all of the data going to a central environment or to a data steward, the query may instead go out to 40 or 50 or 100 different environments or actual holders of data, for them to run the queries locally and come back with results that go centrally? Or is that not contemplated in any of the models that we’re seeing?

DR. HALAMKA: Let me start with that one. What HITSP tried to do is not impose architecture; anything that we do should work in all architectures, whether centralized, decentralized or federated. In the State of Massachusetts, everything we do is very decentralized and federated, and the approach we’re taking to quality measurement is exactly what you describe. Let’s imagine the chief executive officer of the physicians’ organization, on a Tuesday night at midnight, says, I wonder what our diabetes care is like. Well, it turns out we have 500 physicians, and some are on the central EMR, but many have private EMRs. We’re working with eClinicalWorks to enable the exact kind of distributed query you describe: at midnight here comes the query – how many diabetics have a hemoglobin A1C less than seven – and, using the kind of methodologies that Floyd has described from a standards perspective, within each EMR the vendor would do a computation and report back: oh, there are six out of 100 patients that fall outside this range. The person in the central quality office would not see patient-identified data, but rather the results, by practice, of many, many EMRs reporting back in answer to the query. So I think the answer is we’ll support both models.

DR. EISENBERG: Yes, and it really was intended to be architecture neutral. If you look at this diagram, the Query for Existing Data would be a very good standard to manage that. So that’s exactly what we’re looking at.
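
A minimal sketch of the distributed model just described: the central office sends a query, each site computes against its own data, and only aggregate counts come back – never patient-level records. The site names, records and threshold are all made up for illustration:

```python
# Federated quality query: sites answer with aggregates only.

SITES = {
    "practice_a": [{"dx": "diabetes", "a1c": 6.4}, {"dx": "diabetes", "a1c": 8.1}],
    "practice_b": [{"dx": "diabetes", "a1c": 7.5}],
}

def local_query(records, threshold=7.0):
    """Run locally at each site; returns counts, not records."""
    diabetics = [r for r in records if r["dx"] == "diabetes"]
    controlled = [r for r in diabetics if r["a1c"] < threshold]
    return {"diabetics": len(diabetics), "a1c_below_threshold": len(controlled)}

# The query "goes out at midnight"; only aggregates return by practice.
results = {site: local_query(recs) for site, recs in SITES.items()}
print(results)
# {'practice_a': {'diabetics': 2, 'a1c_below_threshold': 1},
#  'practice_b': {'diabetics': 1, 'a1c_below_threshold': 0}}
```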

MR. REYNOLDS: Thank you. Okay, Kevin.

MR. VIGILANTE: Thanks. That was a great presentation, I thought – it really illustrates the challenges in collecting and identifying data for quality measurement, with the added challenge, of course, that measures are going to evolve, and, you know, John is right about having more itemized data at our disposal.

I think what I heard, at the end of the day, is that you basically have to collect everything or close to it – a lot of data. And here’s my question. It may reveal my ignorance, but it also may help me understand our scope a little bit.

So you have all this data about all these patients – a really rich data source. And if you’re a big institution like a Kaiser or other big system, you have a ton of data on a lot of people under the rubric of quality measurement, which is part of the treatment, payment and operations covered by HIPAA. So we kind of know what to do with that, how to handle it and what the guidelines are.

But now you have all this data, and somebody comes to you – and say there are two scenarios. One is a medical device company that makes stents, and they want to see the outcomes of drug-eluting stents versus bare metal stents, and they’re willing to pay for it. Or somebody else is doing research – say on using bicarbonate administration in folks with ketoacidosis – and they’re not going to pay for it, but they’re doing research.

And it seems to me that that’s where it gets interesting for us, because now you’re talking about data that was collected under the rubric of TPO for HIPAA which is migrating into an environment that is not so well defined.

And the question is – so that’s a secondary use of that secondary data, and then there could be a tertiary user, and this stuff keeps getting passed along – is that really where we need to be focusing: on that margin, on those hand-offs that are just not well defined yet and vague? And that’s a question for the group. I mean, I don’t know.

DR. EISENBERG: Well, actually, one of the questions posed to me a couple of months ago came from some pharmaceutical groups that were discussing how setting up a clinical trial costs money at any site. So they would like to know where the greatest population at risk for whatever it is they’re trying to treat with their medication or their stent exists, so they only set up their studies at the highest volume sites.

Now at that point they don’t need to know who those people are – who the patients are – until it’s time to really set up a trial and find an investigator. But I do think that’s something that needs to be addressed: how do we handle that information and that access to the information, because I know drug companies and manufacturers of stents and the like do want to identify those sites.

And perhaps if it’s post-marketing surveillance looking at pharmacovigilance, how much of that is part of the operation of health care that should be going on, and how much of it is commercialism – which also should go on, but maybe with different security around it. So I think it needs to be addressed, and hopefully the secondary uses group will be looking at that, because we discuss it in HITSP all the time, and I have to get us back off the tangent to let’s-look-at-the-standards. But it comes up almost every meeting.

MR. REYNOLDS: Paul, you want to ask yours? And Kevin, I think that’s absolutely the question on the table. That’s what our continuing discussion is about – because I think your example was excellent in that the first time the data was given out, it was given out somewhat de-identified, or whatever term we want to use for it.

MR. VIGILANTE: Well defined.

MR. REYNOLDS: But as soon as it gets de-identified, then it’s really not helpful until everybody can go that next step. And then who generates the next step? Is it the original holder of the data, and what is that? Is that marketing, and do they then get permission for it – so you’re exactly right. That’s where it starts getting difficult.

Okay, John and Floyd, thank you very much for everything you covered. We’re scheduled to be back at 1:15 if everybody would rejoin us then and keep your questions and concerns coming.

DR. HALAMKA: Well, thanks, and certainly if I can answer any other questions, I am on email, and I do again apologize for not being able to be there in person today.

MR. REYNOLDS: John, thank you very much. You were very helpful again.

DR. HALAMKA: Thank you.

(Whereupon, a luncheon recess was taken.)

A F T E R N O O N S E S S I O N

Agenda Item: Methods of Protecting Privacy in Uses of Health Data

MR. REYNOLDS: Okay, let’s go ahead and get started. What we’re going to be hearing about will have to do with methods of protecting privacy in uses of health data. We’re going to hear from Glen Marshall and Lori Reed-Fourquet, who we’ve heard from a number of times before. So do you want to go in order of the agenda, or did you have a preferred order?

MR. MARSHALL: It would probably be better if I were to go first and give some framing.

MR. REYNOLDS: We’d like that. Good.

MR. MARSHALL: Okay. Actually, just to point out that a lot of the questions and comments that I heard in Floyd’s presentation are a good lead-in to the presentations that I and then Lori will give. We’re going to get into somewhat more of the issues around the protections on the data.

Just to introduce myself, I’m Glen Marshall. I work for Siemens Medical Solutions, but I’m also the Co-Chair for the HITSP Security and Privacy Committee as well as a co-chair within HL7. I’m not identifying that on the slide because this slide set has not been vetted with any of those people. So this is really my professional view, but I should at least reveal my affiliations and point out that what I’m going to say is not necessarily their opinion. However, you will see some of the thought processes going into those bodies as a result of all of this.

Okay. Even though a lot of the questions that we were asked pertain to specific methods of data protection and hinted that there were some things to be directly addressed, one of the things that I always do when I’m asked, well, should we use this specific technique or this specific standard, is to get into why you would ask that question in the first place. In order to successfully address these issues, we do have to dwell a bit on the risk management aspects associated with it.

So it’s really these questions: what data assets are at risk; who are the stakeholders and what exactly is at stake; what policies apply and what do the policies require; what threats exist, and specifically what happens if a threat is realized; and, of course, what controls already exist in the system – because if people are proposing novel controls that duplicate the purpose of existing ones, why bother?

Okay. I’m going to make some assumptions about the data assets, since we’re talking about secondary uses. Really what we’re dealing with is data obtained for these purposes: quality measurement and improvement, which did come up earlier as a question; population health assessment; health improvement; and, of course, clinical research – which could be not only pure research but could also involve industrial uses of the data.

There may be other uses, but these seem to be the topics that we’re talking about.

DR. TANG: Can I ask one clarification.

MR. MARSHALL: Yes.

DR. TANG: When you say obtain for this purpose, did you really mean the act of acquiring that data for this purpose, or did you mean use for this purpose?

MR. MARSHALL: Yes, I really have to deal with both of those, because from a risk management standpoint, the way that we acquire the data as well as the way we use it are sources of risk and are also covered by policies. So it’s quite important that our scope cover both.

Okay, the stakeholders – and, again, there may be more of these. We’re dealing with the healthcare subjects, the subjects of the data: these are the patients, or people who are somehow identified with healthcare provision. It could be a patient, their caretakers, a variety of situations. It could be an entire family, for example.

Healthcare providers who collect data are obvious stakeholders. Healthcare data repositories – and I do want to qualify that I am not advocating a centralized view; I am just noting that data tends to be at rest some place. Consumers of the data, obviously – that could be anybody who wants to use the data. And then public beneficiaries – these are really the people who benefit from the outcomes of all of this. They all have a stake in it, and each one of them has specific risks and specific benefits that need to be taken into account.

Now I’ll just mention parenthetically here, to give you an example: not only do we have the privacy risk to the individual subjects, but let’s say I were a bio-terrorist. The first thing I’d want to compromise would be the public health data system, and then I’ll introduce my pandemic. Okay – there’s a threat.

Or let’s say that I am a clinical user of the data for clinical research, and industrial espionage is an issue. So each one of these people has an interest to protect, and we have to keep all of these interests balanced as we choose what we’re going to be doing.

Okay, now this is the laundry list, and I ask the question – do we have enough? – with tongue somewhat in cheek. I would, in my opinion, say that we have too many masters that we’re serving here, and this, just at the federal level, is enough to cross your eyes. And then you get into all of the state and local variations on that, and this is an essential thing. In the HITSP world, we have been dealing with individual patient consents not as privacy controls, but as policies that are developed by the patient on their own behalf to apply to their specific cases. And if you deal with them that way, it actually becomes technically easier to treat them.

Obviously, we have enough and perhaps too many, and we would say that we do not need more – we need fewer, and far more understandable ones.

Okay, let’s take a look at the kinds of threats that we’re going to have to consider here. Obviously, loss of data confidentiality – that’s what patient privacy is really all about: the patient’s right to have their data kept confidential. Confidential means not observed by people they don’t want to have observe it.

Loss of data integrity – the data could be corrupted in some way, either maliciously or just accidentally. Loss of data – just outright loss of data via accident or what have you. These are all categories. Loss of data collectors – here’s a case where, if you need to know something and you don’t have the people who collect that data, you have no chance of knowing it, and that is a threat you have to protect against. Loss of the data repositories themselves, which would suggest strategies for backups and that kind of thing.

Loss of funding for threat mitigation. This stuff is not free. It’s not automatic. It doesn’t happen as a result of the system. If I have a data repository, I’m going to have to fund the stewardship of that data for as long as the data is useful, and that could be a period of decades.

The side effects of threat mitigation. For example, if I have a mitigation that shields and de-identifies the data, and later on I have an urgent public need to know who had a case – let’s say it’s drug-resistant TB, which we recently had a case of – then I may have an urgent need to re-identify the data. But if I don’t have those links, all I have is a case record.

Terminology overload and confusion. I’ll give you an example. The term audit, as a security professional, has a very, very specific meaning to me. As a privacy professional, it has a different meaning. As a database administrator, it has a different meaning. As an accountant, it has a profoundly different meaning. What is auditing? It actually is a whole variety of activities and recordkeeping about the data. To give you another example – and this happened during the development of this talk – the term authorization is oftentimes used colloquially as synonymous with consent or permission. From a technical standpoint and from an operational standpoint, they are profoundly different concepts, and they’re done by very different people.

So we have these issues. If you talk to me as a security professional and you want me to do auditing, and what you really mean is you want a record of all of the patient consents, I may give you an answer that you didn’t expect, because I heard the question differently. That happens all the time. And then, of course, there’s outright human ignorance.

This is a difficult topic – on the surface it sounds simple, but when you get into the weeds it is quite difficult, and education is somehow required. So this isn’t ignorance in the sense of illiteracy; it’s just that you don’t know better.

Okay, let’s take a look at the current system and environment. We have a variety of controls – manual protection procedures: keeping things under lock and key, barring people from the door, having guards at the door. There’s a whole variety of things you can do there. Educating the stakeholders: the more people know about this kind of stuff, the better prepared they’re going to be.

Physical and network protections. This gets into things like encryption on the network, making sure the wiring closets are closed, making sure the computer room is secure – those kinds of things. Penalties for privacy and security violations. We already have some prescribed in HIPAA; arguably, there could be more. But the point is that they serve to discourage people from doing things they shouldn’t.

Insurance – something that’s oftentimes overlooked. You may want to protect against your financial risk with insurance, because insurance can help you fund the recovery of data that you may have misplaced. And then, of course, there are localized controls that are very, very situational, and a lot of our healthcare providers have those in place already. So let’s make sure that we don’t overload them and make them duplicate work.

So I’m going to give you a quick tour – now, this is from a security professional’s standpoint – of how I go about doing my job. I start with the privacy policies. What am I required to keep confidential? That could be keeping things confidential in terms of patients; it could be, I’m a healthcare provider and don’t want my competitors to see the data; or, I’m a clinical researcher and don’t want industrial espionage. So there’s a variety of policies. Policies are timeless statements of what outcomes you want. They do not tell you what to do; they tell you what should happen or what shouldn’t happen.

Security policies are very similar. They concern actual protection of the data assets, and they’re different from privacy policies because of their technical nature. But they really do tell you what security outcomes you want. When you get those two together, they form a body that I would call system object access policies.

Threats themselves usually come in as a list of things that could happen, and then you have to go through a step of risk analysis, which is basically determining the probability and the economic value of those threats and getting some sense of priority about them. And, of course, you have environmental protections that also go into the analysis.

The combination of system access policies, risk analysis and environmental protections together forms a body known as security objectives. Now, for those of you who may be familiar with security technology, this is going to start looking a lot like the Common Criteria and the process you use there – the ISO 15408 standard – because this is how you wind up arriving at security objectives in the 15408 framework. From the objectives, there are three outputs. First, technical requirements: these are the things that you do technically to protect the data – to enforce the policies and to mitigate threats.

Then you have assurance requirements. These are things that you do to maintain security over time. For example, a technical requirement of security is to produce audit logs; an assurance requirement is that somebody has to read the things. An assurance requirement is that somebody has to educate the users, make sure the system is installed and wired up correctly, and a variety of things like that. So those are things you do that aren’t necessarily technical, but they serve to provide and lock in a level of assurance. It also means, by the way, that periodically you have to test the system and stress it to make sure the protections are in fact operating.

Then you have environmental requirements. These are things that, in the final analysis, are not technical requirements and that you wouldn’t do for assurance, but that you want provided to you in the environment. For example, I could say I need a specific computer system, but in order for that computer system to run, I have to increase the tonnage of my HVAC. That’s an environmental requirement. Okay, so that’s pretty much the way you run a risk management-oriented program, and because it marries up with a standard, ISO 15408, it also provides the framework to evaluate products and their adequacy to fit into a pre-established framework of this sort.

The reason I went through this little exercise on risk management is that unless you have it in place, you’re going to have proposed controls that may or may not work. They may ignore policies. They may not protect against the threats that you face. They may duplicate existing controls and therefore add cost to the system without improving its security, thereby incurring unreasonable costs – unreasonable meaning that there’s no rational reason you’d spend that money – and they provide no real assurance; and to whom do they provide assurance? So there’s an exercise here that you need to go through so that you can, at a later point, actually provide a rational basis for your choices.

Okay, so now I’m going to get into some proposed controls. I am going to present these without architectural recommendations – speaking as a HITSP co-chair, none of these things implies a specific architecture, and any examples I give are in fact just that: examples.

So the first thing is that you want to achieve accuracy for the data at rest – clinical records; not so much data in motion across the networks, like transactions, but when you get to a repository of some sort. You really want to use standardized data sets. Why? Because you can do a whole bunch of good things if you’re using standardized data sets as opposed to freeform text. A lot of automatic controls become possible.

HL7 CDA seems to be something that’s used a lot – and if you take a look at the HITSP constructs, it most certainly is, because it’s an essential part of the IHE approach that a lot of their work has gone into.

Using a standardized vocabulary – now, there are really two dimensions here. You have to use a standardized vocabulary, obviously, for the clinical data. But we know that these vocabularies are always in flux, and, as Floyd talked about earlier, concepts get mapped. So if I have nursing data sets, for example CCC or Omaha, and those are changed and new terms are brought into them, they have to be mapped against SNOMED. There’s a harmonization process, and that’s something you just have to keep doing.

Also, orthogonal to that: make sure that when you introduce a term like auditing or authorization, you use it in a way that everybody understands exactly what you mean, and it means the same thing to everybody. Providing assurance for the data subjects: it turns out that if you provide a certain degree of assurance to the people – the patients – you’re collecting the data from, they are far more likely to provide you accurate data. If you provide no assurance, they’re going to lie. I mean, how many times have you given your exact phone number to somebody at the checkout counter who asked for it when you’re paying with your credit card? I always give them a fake number. It’s none of their business, because I have no assurance they aren’t going to publish it.

Okay, so consent and confidentiality controls are obviously part of that. Now, providing incentives for data sources. This gets into the ugly little fact that in the current reimbursement arena, just the costs of care are covered. Very little money is left to invest in these new technologies, so people will go seek additional revenue sources, and clinical research is one of the primary ones. So healthcare providers have a very large financial incentive to collect data – but is it an incentive to collect the data that we, the public, need? I don’t know. Right now, if they’re being asked to collect data and they’re not being reimbursed for it, we are not likely to get data as good as we’d like.

And, of course, educating the data subjects and the sources so that they understand the necessary implications of everything they’re doing. So what I’m saying here has to go well outside of this committee and get a lot of detail.

More controls – integrity of the data at rest. A very simple thing here: there is a FIPS standard, developed as part of the rollout of the recent AES, which is basically a very strong form of hashing. It means that for your data at rest, if you provide a hash code for it, you can likely prove its integrity over a period of decades without risk of spoofing.
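
A minimal sketch of that integrity control – compute and store a digest when the record is filed, then recompute and compare later. SHA-256 is used here as a representative modern FIPS-approved hash; the record contents are made up:

```python
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

record = b'{"patient": "pseudonym-1f3a", "dx": "influenza"}'
stored_digest = digest(record)  # filed alongside (or apart from) the data

# Years later: any alteration changes the digest.
assert digest(record) == stored_digest
tampered = record.replace(b"influenza", b"anthrax")
print(digest(tampered) == stored_digest)  # False -- corruption detected
```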

Providing assurance to data subjects. As I pointed out, a lie can perpetuate, and it can create data integrity issues. So we have to provide assurance to people so that they will in fact disclose the correct data to you.

Providing incentives to data repositories – again, the data repositories are the ones who are going to be providing integrity over the long term, and unless they are paid for what they’re doing, they’re going to wave their hands and say, oh yes, I do that, and not really do it. And, of course, educating the data stewards, whoever they may be – whether that is a local repository at a hospital or a regional repository – they have to know what they are doing.

Availability is another issue, and, again, using standardized data sets turns out to enhance the availability of data, because it means you do not have to go through scanning text or, if you will, interpreting it. The data is far more readily available. So this improves the time dimension of availability remarkably.

Recruiting data subjects and sources. This basically means that if you recruit ahead of time, and you know your population ahead of time, you have a better chance of getting the data when you need it. Providing assurance for data subjects – again, if the data subjects are not assured, they’re likely to avoid even being asked questions, making their data unavailable to you.

Providing incentives to data repositories – again, the data repositories are not going to be automatic data-emitting machines without payment. So we have to deal with that. And, of course, educating the data stewards again helps with availability.

What you’re going to see is a pattern: certain of these mitigations actually serve to provide multiple controls, and a lot of them are relatively simple to do and non-technical.

Okay, now the subject of consents has to be brought up at this point. All of us have been to physicians’ offices, and all of us have been asked to sign the HIPAA notice of privacy practices – and it turns out that that isn’t a consent. That’s just somebody telling you what they’re going to be doing with your data.

There has been a suggestion that we want to get much more active consent from people, especially for secondary uses. However, I’ve seen these consents, and they’re things that only lawyers could love. So the first thing is we want a standard form for all people: when you sign a consent, you should recognize it as a consent form right off the bat, and it should be unambiguous as to what it is you’re doing. So there’s a certain amount of standardization there.

Simple language in the subject’s native tongue. We have had, if you will, newspaper reports of people who are not native speakers, maybe uneducated, being recruited for medical tests, and they didn’t know what they were signing because it was in lawyer’s English and they spoke Spanish. There’s a problem with that.

Okay, verbal and written form. It’s important that you be able to make sure, usually in a couple of ways, that the person has given their knowing consent, and verbal plus written is usually the acid test there. And it has to be limited, meaning that the specific purpose has to be noted and the duration of that purpose has to be noted. And especially the accountability – this is something where we’re a little bit lax in the current rules and regulations – the person who is signing the consent should be able to know who is accountable for enforcing that consent. Should it appear violated, they shouldn’t have to go looking for somebody to hold accountable.

And of course, no duress. This has a couple of things. You know, I’m not talking about arm twisting, or paying somebody or what have you. I’m talking about sticking a consent under somebody’s nose when they’re in an office for treatment, and they feel their problem is urgent, and they’re willing to sign anything to get that treatment. That’s duress.

So these are some of the issues that are coming up, and I believe that if we were to go after consent at this level that some of the consents that we get would in fact be of high quality and dependable and would be immune from the litigation threat, which by the way I didn’t list litigation as a threat, but we all know it is.

Okay, confidentiality for data at rest. This gets into something – now Lori’s going to get into this in far more detail. But we’re really talking about standard anonymized de-identification which I’m defining as a permanent redaction of identifying data to provide assurance of confidentiality for the subject of healthcare information, re-identification being highly unlikely.

In security, you never say something is impossible. You really are trying to say that the level of assurance is very high, okay. And that – what that really says is that there are certain data mining techniques that are known to our national security agencies that will in fact reveal our subject’s identifiers, and we just have to realize that those techniques are known to bad guys as well.

Okay, standard pseudonymized de-identification: basically you're substituting identifying data with something else, and you're providing an assurance of confidentiality for the subject, okay. But that doesn't mean that it's a lesser assurance, because re-identification can occur only under already pre-defined conditions. So, for example, let's say I substitute a patient's identity with a number; for a person who knows what that number indexes to, it can serve as a link to re-identify the data. But that means that the person who has agreed to it has to go through a protocol to actually do all of that re-identification, and there are technical aspects of that that Lori can get into.
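
To make the index idea concrete, here is a minimal sketch, in Python, of pseudonym substitution backed by a lookup table held only at the source. The names and table layout are hypothetical illustrations, not part of the testimony or of any standard.

    import secrets

    # Sketch: replace each identity with a random pseudonym and keep the
    # mapping in a table held only by the data source. Re-identification
    # is possible only for a party that holds the table and follows the
    # agreed protocol.
    pseudonym_table = {}  # pseudonym -> identity; never leaves the source

    def pseudonymize(identity):
        pseudonym = secrets.token_hex(8)   # random, not derived from the identity
        pseudonym_table[pseudonym] = identity
        return pseudonym

    def re_identify(pseudonym):
        return pseudonym_table[pseudonym]  # requires access to the secret table

    p = pseudonymize("Jane Doe, MRN 12345")
    print(p, "->", re_identify(p))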

Okay, and then there’s one other thing which is aggregation. If all I do is provide you aggregate data like how many cases of a certain type do I have, it actually does serve to completely de-identify the data. But if you have a small aggregation group, I can re-identify it by forensic techniques. Aggregation exposes you to the risk of data availability if I need to drill into the aggregate data to find out more facts about the details.

And, of course, it's only useful for the aggregation-derived purpose, and oftentimes that also exposes a risk of misuse, where people will take aggregate data and apply meanings to it that it was never intended to carry, asserting them by implication, and this is where you get conspiracy theories.
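
As a hedged illustration of the small-group risk described above, the following minimal Python sketch suppresses any aggregate cell below a threshold; the threshold of 5 is a hypothetical policy value.

    from collections import Counter

    # Sketch: release aggregate counts, but suppress cells below a minimum
    # size, since a count of 1 or 2 can often be re-identified forensically.
    K = 5  # hypothetical minimum cell size
    diagnoses = ["flu", "flu", "flu", "flu", "flu", "flu", "rare disorder"]

    counts = Counter(diagnoses)
    released = {dx: (n if n >= K else "suppressed (<%d)" % K)
                for dx, n in counts.items()}
    print(released)  # {'flu': 6, 'rare disorder': 'suppressed (<5)'}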

In any case, that’s pretty much the realm of the proposed controls that I would go through based upon what I see as a surface analysis of the risk. I’m sure that more would pop out if we did a thorough risk analysis.

Questions?

MR. REYNOLDS: We’ll let Lori go ahead and then we’ll answer questions.

MS. REED-FOURQUET: As Glen indicated, I'm going to take more of a technical dive into some of the specific techniques that we talked about for privacy enhancement. Much of my perspective comes from my involvement in standards, and the topic of pseudonymization and anonymization is one where we've just generated a standard in ISO TC215, for which I serve as the vice convener of WG5 on security.

When we look at considerations of privacy for secondary uses of healthcare data, keep in mind that throughout these slides much of the more informed work comes from what we did last year in HITSP biosurveillance; we haven't taken as deep a dive yet on quality.

If you're looking at privacy matters, and these are in compliance with HIPAA and protection of human subjects, we've got four options for enabling the secondary use of the data: using the personal data with consent or non-objection from the data subject, as Glen just described in the consent approach; obtaining an IRB approval where it's determined that the risks are minimal and/or acceptable; using the personal data without explicit consent under some public interest mandate; or anonymizing, de-identifying that data prior to use.

There are many complications with relying on consents and authorizations for that secondary use. That consent would be very difficult to track to its original authorizations, and this is particularly the case for research subjects. How do you go back and interpret the original authorization in the now broader context of application of that data? And if you need to go back and retrospectively identify and contact those original data subjects to add the additional explanation of how their data is used and extend that authorization, that becomes very difficult, especially where those database projects may contain tens of thousands or millions of records. How are you going to go back and obtain truly informed consent from those data subjects retrospectively in a re-use situation?

There's an additional privacy risk for those patients. They may object to being contacted down the road, particularly if their health or environmental circumstances have changed since their original authorization or consent for using their data for research.

The other privacy threat would be applying linking techniques to what might otherwise have been de-identified data in order to find the contact information for those individuals. So in trying to go back and capture their authorization, in a sense you're compromising their privacy.

Privacy enhancement technology, and notice I use the term enhancement, although throughout here you may see the term protection. It touches upon that statement that Glen just made, that in security we are never 100 percent assured that we have protected something, so we will talk about this as an enhancement.

We're not just talking about securing the information, but using the technology to protect the information and personal privacy. Pseudonymization is the technology we'll drill down upon in more detail; it is essentially a reversible de-identification technique, although it's not always implemented reversibly. It can be consistently traced throughout the information system. Many of the current or historical research projects will track patients within an organization. But if you're doing a community study or otherwise, the linking of those records across that community or other cohort becomes more difficult without some sort of identifier on top of that.

They can be under strict and defined controls and enable re-identification of those data subjects in accordance with policy. Converting the identity into a pseudo-identity for use within an information system is what we're trying to accomplish, and if you're going beyond the walls of an organization, we would typically rely upon some sort of trusted third party service to securely assign and manage those pseudo-identifiers across the set of subject sources.

Re-identification, again, can be restricted to pre-authorized, rigorous procedures backed by fully implemented security controls, and that needs to be specified in some sort of re-identification policy.

The standard that I referenced, ISO TC215 Health Informatics Pseudonymization, was approved in March of this year. The technical specification specifies principles, requirements and guidance for privacy protection using pseudonymization services for the protection of personal health information.

It was named by HITSP last year for privacy enhancement of biosurveillance data, and it's included in the HITSP quality requirements and design document for privacy enhancement of quality data, although we have not yet completed the actual detailed analysis of how much anonymization is going to be required when we get to those data elements.

Some of the terms relate to anonymization and de-identification, and this is how we have defined them in the standard. De-identification is the general term for any process of removing the association between a set of identifying data and the data subject.

Anonymization, the process that removes the association between the identifying data set and the data subject, is a subcategory of de-identification. It does not provide a means to link back to that person across multiple data records or information systems, and it does not allow the re-identification of the anonymized data.

Pseudonymization is a type of anonymization that goes –

MR. REYNOLDS: Lori, hold on a second. I think Paul has a clarifying question.

DR. TANG: Lori, can you help me understand the difference between de-identification and anonymization, given your third bullet says you cannot re-identify anonymized data.

MS. REED-FOURQUET: The de-identification is really a generalized term which includes both anonymization and pseudonymization and other techniques such as removing the data identifiers. It’s, again, a general classification.

The distinction between anonymization and pseudonymization is that the pseudonymization is going to allow you to link and potentially re-identify the data subjects, and it’s going to allow you to link across the multiple entities.

I’m going to go into more detail as well. If that’s still not clear after I drill down, perhaps we can come back to that.

DR. TANG: Well, I think I’m having trouble right at the beginning here because I believe I understand the difference between anonymization and pseudonymization quite clearly. What I didn’t understand is what you said that de-identification is –

DR. CARR: Anonymization is a subset of de-identification. What else is a subset of de-identification? I mean, is there something else?

MS. REED-FOURQUET: Pseudonymization is also a type of de-identification.

DR. TANG: I think that, to me, is in conflict with your definition of de-identification, because that says the removing of the association between a set of identifying data and the data subject, and pseudonymization does not completely remove the identifying data and the association between the identifying data and the data subject.

MS. REED-FOURQUET: It does remove the identifying data, and it removes the association with the data subject. You cannot get back to that data subject unless you go back into the algorithm essentially that was used to provide the pseudo identifier. So in the absence of, say, a key to unlock that identifier, you have no means of identifying that patient.

DR. LOONSK: Yeah, I just wanted to comment that I think there may be different interpretations of how anonymization and how pseudonymization are being used, in the context of how these terms came into use. There were some in public health who were seeking to remove direct identifiers from data being provided to public health, provided to an authority where those data are managed and protected and where that authority potentially has even broader domain in accessing those data, but where de-identified data in the HIPAA context lose a substantial portion of their meaning.

So, again, if you take de-identified data in the HIPAA context, you lose localizing information that may be of importance in biosurveillance circumstance. On the other hand, getting named data with the person’s record number or the patient’s name was not always necessary for biosurveillance, and there was a desire in public health to seek to not get more data than necessary to support the need.

Pseudonymization had a particular role in the public health context wherein, for example, local public health agencies may get named data specifically to follow up on reportable diseases and didn't want to lose those named data or the ability to follow up on those cases, as is part of their authority and part of the requirement for their activity by law.

So the concept of pseudonymization and there are probably – there are certainly a number of other contexts for it, but one is where an appropriate authority can make a request in the case of, for example, a reportable disease to try to ask the data provider to identify more specific information about the patient so that case report can be completed and that can be followed up on if it’s a case of communicable disease, for example.

DR. TANG: Okay, so I think I –

DR. STEINDEL: Lori, let me phrase the clarification question like this, and this has to do with Glen’s comment that a lot of terms in this area are overloaded. And if we talk about de-identification as it’s described in the HIPAA regulation which is the way we usually think in this room about the overloaded term de-identification, that would be a form of anonymization, is that correct?

MS. REED-FOURQUET: That is.

DR. STEINDEL: Because that is a process to remove the association between the data and the identity.

MS. REED-FOURQUET: Yes, that is correct. However, I will step through the process that we went through last year with biosurveillance where that is only step one of a three-step process.

DR. STEINDEL: I’m just trying to clarify terminology.

MS. REED-FOURQUET: Okay.

DR. STEINDEL: And now with respect to de-identification as you have it on this slide, that is a definitional statement with respect to just the ISO publication. So it's really different from de-identification in the HIPAA sense.

MS. REED-FOURQUET: Yes, it is.

MR. REYNOLDS: Last comment on this, and then I’ll let –

DR. LOONSK: I would suggest that in the way that de-identification has a fairly clear definition, and it's principally associated with HIPAA de-identification which is very specific, pseudonymization has a specific meaning which includes the association of some sort of data linker so that one can get back. Anonymization is ambiguous in its meaning and has been used in several different contexts, and whether it's a superset or a subset of de-identification depends on the context and is not clear.

MR. REYNOLDS: All right. Lori, will you please continue. Have you got a real problem on this –-

DR. STEINDEL: Yeah, I just got confused on this.

MR. REYNOLDS: No, that’s fine. Go ahead.

DR. STEINDEL: Because I thought the way I stated it was relatively clear because we’re talking about the word de-identification and anonymization and pseudonymization with respect to these definitions with respect to ISO. And de-identification in the overloaded sense from a HIPAA point of view is very specific. With respect to the ISO definition, it falls under anonymization technique.

So anonymization with respect to this slide which is what we’re using for clarification at this point in time is not an overloaded term. Now I agree with what John says when we put it into the HIPAA world, we have the exact opposite of that where de-identification is relatively clear and anonymization becomes confusing.

So that’s the problem that we’re grappling with on this committee, and I think it’s something that we’re going to have to attack when we write a report.

MR. REYNOLDS: So summary is Lori’s presentation is Lori’s presentation, and –

DR. STEINDEL: And we need to think about it with respect – we have to think about it in the next hour or so with respect to Lori’s presentation.

MR. REYNOLDS: Go, Lori, you’ve got it.

MS. REED-FOURQUET: Okay, and just to follow up on that a little bit, these terms were included in our final comments because the industry commenters at STC groups and the vendors said that this is an area where there is a lot of confusion, and that we needed to put down some specifics in definition. So that's where this ends up. Okay, a little bit more on semantic overloading. For the purposes of the technical specification, we did rely on the definitions from the European Directives for personal data, meaning any information relating to an identified or identifiable natural person, that being the data subject. And we extend the term data subject in the technical specification to also denote other entities such as devices, processes, organizations, et cetera. But let's be careful about further overloaded terms. Since there is other privacy legislation out there that is using these terms, not only do we want to harmonize those terms in the standards world, but let's be careful moving forward on how we do that in the legislative area as well.

So the first thing we did when we looked at the pseudonymization technical specification was identify potential uses that we would need this technology for, specific to health care. There is an annex that describes some use cases, and these include secondary research use of clinical data; clinical trials and post-marketing surveillance; pseudonymous care, in other words, if a patient wants to go to a website and keep their record anonymously, they should be able to do that, or otherwise moving laboratory data as de-identified samples; and patient identification systems, where we've had some discussion in the past on voluntary patient identifiers and how a patient might be able to be in control of their identity and perhaps change that identifier if they felt it was compromised. So this technique supports that.

Public health monitoring and assessment, and we actually included our biosurveillance use case as the specific example in the annex.

Confidential patient safety reporting, such as adverse drug effects, and this is one of the use cases where we may want to protect not only the patient's privacy but also the provider's, so that we do not deter the provider from making those reports.

Comparable quality indicator reporting, as we're looking at in the quality use case now, and other uses such as peer review and consumer groups. You're linking that data, some traffic data, to health data. You may be supporting some of the consumer organizations.

The concept of identification now. We have a set of data subjects and a set of characteristics. So a data subject is identified within a set of data subjects if it can be singled out among the others. So if you have an information resource that has a certain number of subjects, they should be able to be linked by those characteristics.

Some of the associations between the characteristics and the data subjects are more permanent than others, such as your social security number and your date of birth; others may not be as long-lasting, such as your email.

When we go through the processing, we talk about a payload. We take the personal data and split it up into two parts: the payload is the information that is considered anonymous and non-identifying, and the identifying information is separated out. The identifying information contains a set of characteristics that allow the unique identification of that subject. So you're going to have your demographics, your social security number, et cetera as the identifying part.

DR. TANG: When you say it contains anonymous data, do you really mean, in other words, that the payload cannot include any text that is not codified but could be re-identified by classification?

MS. REED-FOURQUET: Yes.

DR. TANG: It truly means there’s absolutely nothing in there.

MS. REED-FOURQUET: Yes, because this is going to be the stage before you finally publish it. So for this processing, we are making the assumption that the payload is already anonymized. So you may have processing happening before this stage. It may be text processing. It may be other codifying. It may be removing the last two digits of the zip code, as we talked about for the zip code small-area risk.

DR. TANG: So, as an example if you were to pseudonymize this packet, then it is true that if you wanted to re-identify, you actually might have to change the payload.

MS. REED-FOURQUET: To re-identify –

DR. TANG: No, I mean I would be allowed to have a different payload when I choose to re-identify if I’m authorized to re-identify a pseudonymized package.

MS. REED-FOURQUET: The re-identification process would bring back in the payload after you have your identifiers. You would have the payload information repository with a pseudo-identifier. You would need to go back and locate the tuberculosis patient or whoever that might be, and you would –

DR. TANG: Well, let me make it very precise, like TB. Okay, I've described everything about this person who went to Europe and back. So in the text of my progress notes, when I sent that to you pseudonymized, I sent a pseudonym, but I've also cleansed all the text. And when I go to re-identify this package and its payload, it really requires re-pulling, which – do you see my question? Really, I no longer have any value, and I can't recreate it by the re-identification algorithm.

MS. REED-FOURQUET: Well, there are many implementation approaches to that and techniques that can be used. So when you cleansed that free-text data, hopefully first of all you codified it so that you didn't have information loss. If you really needed to go back to the free text, there are other techniques. We could have encrypted, for instance, that text and made it part of the payload. And maybe you were only authorized to decrypt it under the same re-identification authorization processes.

Or, as you say, you may go back to the information source now that you do have the true identifier and go to the clinical data that has much more depth.

DR. TANG: So I think in the process of asking the question and hearing your response, I have a different understanding of pseudonymization than I had when we first defined it which is that it’s not true that I can apply an algorithm that I got permission to use to actually re-identify in toto and get the full information content of the payload. So that’s missing.

MS. REED-FOURQUET: Correct. Correct. You’re only getting to the identifier which will then enable you to further get to the payload.

DR. LOONSK: And a related and equally complicating issue: this implies that potentially from the payload you may have excluded things like the text "the woman next door."

MS. REED-FOURQUET: Yes.

DR. LOONSK: Which natural language processing would not normally identify, because it doesn't look like a name, but which would be to some extent identifying information that is eliminated from this payload.

MS. REED-FOURQUET: Yes. So it certainly is a balance because I suppose if you were trying to see who – where the source of an infection came from, that woman next door may be a key piece of information, but this isn’t the only source of information. By getting the identifiers, you should be able to go back to the information source as you would typically do today.

DR. LOONSK: Okay, Lori.

MS. REED-FOURQUET: Okay, anonymization is the process that removes the association between the identifying data set and the data subject. So that might be done by removing or transforming the characteristics of the associated data or by increasing the population of the data sets so that that identifying data is no longer unique. So basically filling it with dummy data, if you will, so that it’s less identifying.

Pseudonymization now is a particular type of anonymization where, after you remove the association, you add an association to a pseudonym. Okay, pseudonymization allows for it to be reversible, but you can also implement it with an irreversible mechanism, where you do not have a method to derive the identity back from the pseudonym. So pseudonymization is still valuable in a one-way scenario by linking the subjects across the multiple domains, but you may choose to never allow re-identification.

In reversible pseudonymization, the model includes a way of re-associating the data, either by reversing the derivation or from a pseudonym in a look-up table. So typically today, if it's a local hospital, the hospital may have a look-up table, and that's how the research organization would come back to them, as opposed to being able to recompute the identifying value.

In the pseudonymization processing, you would take the identifying data and the payload data, split them into two different parts, and pass the identifying data through a person identification service; this is how we specified it in HITSP. That pseudonymization service will turn the identifiers consistently into the same pseudo-identifier, typically, especially if there's a trusted third party approach, through a keyed cryptographic method. Then you would take that pseudonym plus the payload data, and that would be considered de-identified, and you can load it into whatever information resource you're trying to make available.
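
A minimal sketch of that flow, in Python, with a single keyed step standing in for the trusted third party service; the field split and the key are hypothetical, and none of this code is drawn from the HITSP specification itself.

    import hashlib
    import hmac

    SERVICE_KEY = b"held-by-the-pseudonymization-service"  # hypothetical
    IDENTIFYING = {"name", "ssn", "birth_date"}             # hypothetical split

    def pseudonymize_record(record):
        # Split the record into identifying data and payload data.
        identifying = {k: v for k, v in record.items() if k in IDENTIFYING}
        payload = {k: v for k, v in record.items() if k not in IDENTIFYING}
        # A keyed hash maps the same identifiers to the same pseudo-identifier,
        # so one subject's records stay linkable across submissions.
        material = "|".join("%s=%s" % (k, identifying[k]) for k in sorted(identifying))
        pseudo_id = hmac.new(SERVICE_KEY, material.encode(), hashlib.sha256).hexdigest()
        # Pseudonym plus payload is what gets loaded into the resource.
        return {"pseudo_id": pseudo_id, **payload}

    rec = {"name": "Jane Doe", "ssn": "000-00-0000",
           "birth_date": "1950-01-01", "diagnosis": "J10.1"}
    print(pseudonymize_record(rec))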

I’m going to come back to privacy threats – yes?

MR. REYNOLDS: You will not be first on the other question. Go ahead.

DR. TANG: On the previous slide, so there I did not see a method used to anonymize the payload data.

MS. REED-FOURQUET: I have not included a method in here for the anonymization, and I’ll drill down a little more on the anonymization process in a few slides further.

Identification or re-identification is one of the privacy threats. So our concerns, if we're going to keep it a secondary use information resource: one might be, can the data subject be re-identified? If I have a data item, can I establish a link to that data subject? And if I have the data subject, can I establish the data items that are associated with that data subject? That's a little simpler than inference, and it's the inference issue that causes the more complex problem.

So, given the data subject, can I verify it against another set of characteristics that I as an attacker have access to and then associate that with the data subject? This is linking it with some other data resource, whether it's traffic information or anything else that I may have access to that you're not expecting to be linked with your information resource. Or, given the data subject, can I verify that a set of characteristics is not associated? So I can as a hacker come in and start weeding through the data. And, again, pointing out that that attacker may have access to additional information resources, either authorized resources or unauthorized resources.

So we want to be careful that we don’t consider the data that results from anonymization or any of these de-identification processes to be fully protected in and of themselves. If somebody is going to have data mining access to them, they can certainly be a threat to privacy.

Refining the concept of identifiability and anonymity. We need to account for all of the means that are likely and reasonable, either by the controller or by some other person, to re-identify. And, again, this is a quote that was taken from the European Data Protection Directive, and that drove us to refine, in the standard, the concept of identifiability and anonymity and to take into account in the threat model what means are likely and who those other persons might be.

Okay. Levels and approaches for anonymization. Level one is what we kind of coined rules of thumb on data items. For the most part, that was the 18 variables from HIPAA: let's simply remove those.

Level two is a static data model and data flow re-identification risk analysis. If I have a diagnosis code and an admit date, I probably could identify patients under certain circumstances.

Level three is more of a dynamic populated repository. In theory, we've created a repository of information that should be pseudonymous. It should be privacy enhanced. But if I do an analysis against that repository, similar to an audit, if you will, I might be able to identify additional outliers that were not considered in level two.
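
As a rough illustration of the level one approach, a minimal Python sketch follows; the field list is a hypothetical subset of the 18 HIPAA identifiers, not the full set.

    # Sketch: level one anonymization simply drops a fixed list of identifying
    # fields; levels two and three would then analyze what remains (zip code,
    # dates, diagnosis) for residual re-identification risk.
    IDENTIFIER_FIELDS = {"name", "ssn", "address", "phone", "email",
                         "medical_record_number", "birth_date"}

    def level_one_anonymize(record):
        return {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}

    record = {"name": "Jane Doe", "ssn": "000-00-0000", "zip": "06510",
              "diagnosis": "J10.1", "admit_date": "2007-07-01"}
    print(level_one_anonymize(record))
    # {'zip': '06510', 'diagnosis': 'J10.1', 'admit_date': '2007-07-01'}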

So back to level one anonymization, and this is the process that we went through for the biosurveillance use case. We took the 18 HIPAA variables and tried to identify where we might have a problem in generating a biosurveillance information resource by simply removing those identifiers.

The items that came up: fewer than 20,000 people in a zip code. A zip code is a very important element for biosurveillance.

Dates: a birth date is not a biosurveillance issue, but it certainly is coming up in discussion in quality. I did highlight it in this case because we need to know the specific age of a patient before checking whether or not they qualify for a certain measure.

The admit and discharge dates, again, will be an issue with the quality measures.

Level two anonymization: we looked at the data set that was given to us by the AHIC. The variables that were likely to be in the form of freeform text we had suggested needed to be codified and/or removed. Those data elements would be chief complaint, things such as nurse and triage notes, and test interpretations. And in some of these cases, these are the variables that are typically being mined today for detection. So we've recommended that those be codified.

MR. REYNOLDS: Lori, can I interrupt you for a second.

MS. REED-FOURQUET: Yes.

MR. REYNOLDS: I know you’ve had a lot of questions. I noticed you’re about halfway through your slides. I’d love to leave enough time if we could for questions because I think there’s going to be a lot of questions.

MS. REED-FOURQUET: All right. Data variables subject to re-identification from other fields would be things such as facility codes, diagnosis codes, disease state, and laboratory results. You put those together, and you may have some compromise. And then there are outlier variables; really, the diagnosis code is the one at highest risk of being an outlier. The risk analysis, though, is resource specific, so we have to do this risk analysis when we get our data set for quality.

The level three anonymization really is more of an audit process. It's continuous re-identification risk analysis of the live database. It would take into account the content of the data and the outliers that might lead to indirect identification, and we should have routine risk analysis criteria specified by policy or by a service provider so that there's an assurance level that this is happening on a routine basis.

The re-identification risk analysis should be defined by policy. What does it take, who can re-identify, what is the authorization process; and that is very much going to depend on the risk model. And apart from the regular re-assessment, these reviews may be triggered by events. So if you change the data variables, or if you add data variables, you should re-review your risk of re-identification.

Identifying information that is necessary in secondary use may need to be enabled. You may need to roll up to the three-digit zip codes. There are other techniques that can be applied to further encrypt other data elements where you may need to make those data elements accessible, and then they are all going to need to be coupled with all of the other security controls that Glen discussed: access, physical controls, personnel sanctions, et cetera.

Just briefly on consent, if I could deviate slightly: the sensitivity classes and functional roles are standard classifications that we are working on in ISO and trying to work through the HITSP process. When you provide an authorization, if you're going to include consent and authorization to access this information, it would be helpful if we can start getting to a more structured representation of that consent so that machines can interpret, read and act on that information: which data is being authorized, what sensitivity level it is, to whom you are authorizing disclosure, say, by their functional role, and for what purpose; does that purpose include quality assessment? Are you going to ask for authorization for quality?

There is an example included in here of data that would be collected, say, for a research project, where you've gotten some identifying information: name, date of birth, address. Since the address in this case is not necessary, it can simply be removed, and then the remainder of the information at the bottom is used for research data.

The remaining identifying information goes through an identifier calculation using cryptography and is assigned a pseudo-identifier; the privacy-protecting risk management associated with anonymizing the payload gets processed as well, and then at the end that data can be made available to a research repository.

Reasons for re-identification, and these should be included in the audit requirements specified by policy. Why would you re-identify? These are listed in our standards. You might want to verify or validate the data integrity. You may be going back to check for suspected duplicate records. You may be looking to enable requests for additional data. You might want to link or supplement research information variables. You might be looking at a compliance audit, informing data subjects or their care providers regarding significant findings so that they can provide some follow-up care, or facilitating follow-up research, and that last one would be your public health use.

The re-identification requirements: there are a number of requirements that I won't go into. But there is a need to assure the health consumer's confidence. We need to specify what our re-identification policies are in order to make the discloser of the source data comfortable.

We have a list of pseudonymization service standards for privacy policies. We have not defined a standard policy, but we’re looking for at minimum trustworthy services and what needs to be expressed in a policy in order to enable these services and make them trustworthy.

We similarly have trustworthy practices that are minimally expected to be sure that the underlying security is in place, including things like audit and time-sensitive re-identification. This is specifically to accommodate public health concerns. If a pseudonymization service is going to take three days to re-identify, that will be unacceptable to a public health agent that needs to act, say, within 10 to 15 minutes and have that source data. So that needs to be specified and made clear to the subscriber.

International perspectives: this is not an exhaustive list, but I noted there has been use of and reference to this technology specifically in health care in France, the Netherlands, Belgium, the U.K., Canada, Australia, and certainly in our U.S. biosurveillance use case. And while I don't have specifics, the Japanese delegation was extremely active here, and I believe it's in their files.

Some specific concerns about HIPAA, just with respect to enabling this technology. Within de-identification, the privacy rule says that the code is not derived from or related to information about an individual. The concern with that statement is that pseudo-identifiers may be key-encoded based on those identifiable data, and I've had questions asking whether this is compliant with HIPAA, because it's a cryptographic encoding based on those identifiers, or whether that is going to be a problem.

And then "cannot be translated to identify the individual" is stated in there, and the concern is that, using a reversible key-encoding approach, it might be read that reversible re-identification is not permitted.

So if there is any way of clarifying that, it would be very, very helpful. And our policy recommendations are to clarify those HIPAA rules so that pseudonymized resources can be enabled. We need to establish policy for secondary uses. We need risk assessment and risk mitigation expectations so that we can appropriately define mitigations for them.

And then establishing a policy and minimum requirements for re-identification.

MR. REYNOLDS: Before we open up for questions, could you go back to the HIPAA concerns slide one more time and take us through, starting with the second bullet; take us through the subsets of the second bullet there, and let's make sure we get that.

MS. REED-FOURQUET: Okay. So the privacy rule permits the assignment and retention of a code or other means of record identification, if that code is not derived from or related to information about the individual. Now, if we use a technique that's going to take your demographic information and encrypt it using a key-encoding mechanism, is that or is that not derived from that source information?

Now the intent behind that statement, I believe, was that you would not have something like a look up table that once you identify your secret, you can use it over and over. But the technology we’re talking about here wouldn’t be subject to that risk.

Okay, the next one: it cannot be translated to identify the individual. So if we're allowing for re-identification, are we not translating back to intentionally re-identify that individual, and does that mean that we are not able to use this under HIPAA?

DR. TANG: Unless Sue’s in the room – oh, Sue, I mean to me, both of those are disallowed very explicitly.

DR. LOONSK: Can we just further specify the question, because this is for de-identified data in the context of, for example, public release, just to be clear. So there are other data uses specified in HIPAA that were not specified here.

MS. FARQUHAR: The first one probably is technically disallowed, whether or not – I don't think that was the intent. What we were looking for is that you can't just scramble a social security number and claim that you've come up with a de-identified code.

MS. REED-FOURQUET: Exactly.

MS. FARQUHAR: The second point really is the question of who. I mean, clearly we allow for re-identification. And so what we were looking for is, you know, the code, the key has to be kept by the source of the information, the source that pseudonymized it or de-identified it, so that if the key goes with it, or if the scrambling of the information that's derivative of the identity is so transparent that the recipient can translate it, then that would not – I mean, the means of re-identification has to be held and kept secret by the source. It can't go with the data.

MR. REYNOLDS: Any comments from anybody on this? We will now officially –

DR. TANG: I don't think HIPAA says that it's so obvious. I mean, I think that qualifier's not in there. So that's why I think two is almost a derivative of one. You can't have it where you can – that was not in that particular language. Those are both disallowed.

MS. FARQUHAR: But it doesn’t, I mean, there aren’t – you can’t have an anonymous key that goes with it that is identifiable only at the source of the data.

MR. REYNOLDS: And all put together in this way.

DR. LOONSK: Can I – you said that's not what they were talking about, and was an anonymous key not part of one of the pseudonymization techniques? There are at least two pseudonymization techniques being described: one in which number one certainly would be at issue because it is based on the name or medical record number; another where it's an anonymous key, where it's not a data manipulation of anything identifiable but is just assigned at the point of origin, and it sounds like that's what Sue is indicating may be allowable in this context.

DR. TANG: But I think their use, the whole reason for doing this, is that you would be able to link pseudonymized data from multiple sources to the same individual. That would not be true based on what you just said.

DR. LOONSK: That is one function of pseudonymization. That’s not the only function for which pseudonymization is sometimes referred to.

DR. TANG: Well, we just can’t get all of the benefits or the uses that she talked about if you use the other.

MS. REED-FOURQUET: And I also want to point out, and maybe this changes your interpretation of the first one: typically, this would be a two-pass encoding, so the source of the data would do one pass of encoding with their key, and the third party service would do a second. And in doing that, you would need to go through both parties essentially to re-identify. So would that final value, being based on an encrypted identifier that was generated from identifying information, be a problem?
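
A minimal sketch of that two-pass idea, in Python; both keys are hypothetical, and re-identification would require the cooperation of both key holders to recompute the mapping.

    import hashlib
    import hmac

    SOURCE_KEY = b"key-held-by-the-data-source"       # hypothetical
    TTP_KEY = b"key-held-by-the-third-party-service"  # hypothetical

    def source_pass(identifier):
        # First pass: the data source encodes with its own key.
        return hmac.new(SOURCE_KEY, identifier.encode(), hashlib.sha256).hexdigest()

    def ttp_pass(encoded):
        # Second pass: the trusted third party encodes again with its key.
        return hmac.new(TTP_KEY, encoded.encode(), hashlib.sha256).hexdigest()

    pseudo_id = ttp_pass(source_pass("Jane Doe, MRN 12345"))
    print(pseudo_id)  # neither party alone can reproduce this mapping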

MR. REYNOLDS: Simon, Glen did you want to comment first?

MR. MARSHALL: I'd like to point out that this entire discussion over the last five minutes has made Lori's point. The lack of clarity in HIPAA and the lack of clarity in the interpretations: I could go to ten lawyers and get eleven different answers, and it doesn't mean that any of them are correct. So one of her recommendations is to deal with that lack of clarity and lack of specificity.

I'd also point out that the current recommendations in terms of those 18 variables were produced without a rigorous and adequate risk analysis, okay. So somebody came up with these because they thought they sounded good. But the fact is that Lori has just gone through and shown you that they aren't adequate. So we need to deal with those issues.

MR. REYNOLDS: Simon?

DR. COHN: Thank you very much, because I had been a little confused by what was going on. Thank you for clarifying that the whole thing was to confuse us. I'll pass on this. I have a question on a completely different point. Maybe others are still in the confused state I currently am in. I don't know what question to ask at this point other than to say, yes, there needs to be some clarification.

MR. REYNOLDS: John.

DR. LOONSK: The first item there, to follow up on Paul's comment: the potential to link data even though it is pseudonymized has its attractions in the quality context, because it absolutely can go across encounters, go across locations, and do it.

It would be highly dependent on, though – and this is my question, on a very meticulous specification of those data that in practice is probably not found in most care situations. I mean, you’re using an algorithm to do a consistent modification to data with the hope that you can relink it afterwards. You’re highly dependent on the cleanliness of those data to begin with.

MS. REED-FOURQUET: Actually, the way that we specified it in HITSP, it's leveraging the patient identity cross-reference manager, which is already assigning a consistent identifier based on whatever underlying linking algorithm. And so you're not actually feeding raw patient elements as input. You're feeding the link as input.

DR. OVERHAGE: But that’s a big assumption that that’s how they all work. They don’t all work that way. There is not a consistent identifier in all of them.

MS. REED-FOURQUET: Yes. There’s certainly architectural issues that have to be addressed in any implementation of this.

MR. REYNOLDS: Okay, did you –

DR. LOONSK: I just wanted to add there's one more concept that's out there that people talk about, and I think I may have heard it here, which is perturbing data to retain the value of the data for analytic purposes through swapping data attributes so that the data cannot be identified. I just wanted to put it on the table because it is talked about as well.

MR. REYNOLDS: Okay, we usually have a questioning period, but you won’t be able to tell the difference in this one. So I’ve got a question, and then Simon and Paul and then Mark. So you can see it continues.

You said two or three things, and I’d just like a quick answer on it, and both of you did an excellent job. And we knew we were going to really focus on this issue with questions.

You mentioned a trusted third party. As we come up with our definitions, trusted third party a lot of times sits in the eyes of the beholder, depending on who's exchanging the data. So as we continue to think about the consumer and others, when we use some of these other terms like trusted third party, that may or may not translate, you know, to each of these audiences.

So as we come up with any of these other definitions and everything we’re doing as we put our reports and everything together, the eyes of the beholder is always going to be a key subject that we have.

Second, you mentioned that pseudonymization – I’ll learn a new term – equals protection. Again, only depending upon how strong the chain of – so I’m not sure I can completely buy into that right upfront because the strength of the chain of the data being passed really decides whether or not pseudonymization is in fact a protection.

MS. REED-FOURQUET: It can’t be the only protection.

MR. REYNOLDS: Good, good.

DR. LOONSK: And there could be a number of issues, one being how many data elements from a re-identification standpoint are eliminated so that the ability to re-identify from other data sources and pseudonymized data is not clear, or the way in which the linker is held, and both those are potentially –

MS. REED-FOURQUET: Yes, and that’s why surrounding this there is so much focus on policy, definition and specifying it, clarifying it and going back and continuously doing the risk analysis on the information resource and information gathering process.

MR. REYNOLDS: And at some point, I want to make sure that we as a committee, those of us that aren’t necessarily in the public sector, would understand what public health really is by some kind of a definition because we – no, a lot of people talk around the table that if it’s for public health, this is okay, or if it’s for public health, that’s okay. But a lot of us don’t have that definition. So a lot of people that we would be sending this to may not be able to come up with that definition just quickly. I would like to add this.

MR. MARSHALL: There’s a definition in each state.

MR. REYNOLDS: Thank you. That makes it even clearer. Okay, I got Simon, Paul, Mike, and then Mark, and then Paul will be on question time until August.

DR. STEINDEL: Yes, I've noticed the same thing as a representative of CDC about a lot of the confusing statements about public health, and we've asked Dr. Leonard, who's coming on as the director of National, and we intend, assuming that, we haven't vetted the talk yet. But one of the intentions is to put some clarifying statements around the various different uses of clinical data for public health.

DR. COHN: Gosh, you know, I actually just wanted to test my level of understanding, and I'm actually embarrassed because I'm probably going to fall all over myself, and I'll look at Susan to maybe keep me straight here. And this is maybe where my level of confusion is. Now, first of all, going back to something like HIPAA, which gives me some basis: I've heard of a variety of – I mean, HIPAA describes security techniques that relate to some of the secondary uses we're talking about but doesn't get very restrictive in terms of public health uses of data in terms of security. And so, therefore, layering on additional pseudonymization or whatever is more along the lines of additional things that CDC may choose to do to provide additional protections. That is correct, right?

Similarly, quality, depending on how we decide to define this and whether or not this all fits in with PPO; and if indeed it does fit with PPO, there's really once again not the need to de-identify or pseudonymize or anything else for that matter, but we may choose to do that.

So this is once again – and if I'm wrong here, correct me. So we're sort of talking about this one, but the fundamental issue is protecting the data, not necessarily going through elaborate security mechanisms of one sort or another or arguing about whether HIPAA de-identification includes various aspects of pseudonymization, unless we're actually expecting to fully publish the full data set of everything that's being dealt with for quality. Am I correct so far?

Okay, so some of this is potentially over-specification of what we need. So I just want to make sure I was understanding.

MR. MARSHALL: That's actually very close to one of the points I was making, as was Lori: unless you do adequate risk analysis up front, you do run a severe risk of over-specifying and incurring costs that are unreasonable for the situation you find yourself in.

To paraphrase that: never underestimate the value of an existing locked door.

DR. COHN: I just wanted to make sure that I was fundamentally understanding the routine. Now having said that --

MR. REYNOLDS: Can I add one thing?

DR. COHN: Sure, and then I’ll ask my other questions.

MR. REYNOLDS: And so you mentioned PPO, which I had listed. We mentioned public health, and then there's "other," whether that's research or whatever that is. If you would add that to what your premise was, then I agree.

DR. COHN: Which was the third one?

MR. REYNOLDS: Other, and it may include research. It may include – but are there any other things where the state is involved because I agree with everything you’ve said, but it leaves out some segment over here.

DR. COHN: Well, I hadn’t come up with a comprehensive list.

MR. REYNOLDS: And I’m not saying you did, but – then I think we’re all in agreement here. But that’s another category that we’ve got to think about.

DR. CARR: Well, that was my question also. In terms of research that's overseen by an IRB, an IRB doesn't necessarily obligate this level of de-identification. So, again, just scoping where this would apply: not necessarily in research, not necessarily – not in PPO necessarily.

DR. COHN: And I guess, once again, just trying to distill a lot of overhead and a lot of information, what I’m hearing and I guess I must have not really appreciated this morning was obviously your comment that HITSP is recommending as a good practice pseudonymization for it sounds like both quality data that appears to be going around outside of the organization as well as biosurveillance, is that correct?

MS. REED-FOURQUET: Let me qualify that. If you read the HITSP language, it says where it is required by jurisdiction because it’s recognized that there may be jurisdictions or other agreements in place that may clearly allow identified data to be communicated.

DR. COHN: Okay, so I guess I overread your slide then.

MS. REED-FOURQUET: Yes, I am not advocating that HITSP be the way that everything has to go.

DR. COHN: Okay. Now, having said all that, not that I'm any smarter, only less confused: Glen, where do hashing and SHA-256 – where do these fit in --

MR. MARSHALL: Well, what that really does is deal with integrity issues. The data can be corrupted in a variety of ways, including electronic or somebody mishandling a value or deliberate corruption.

In any case, what happens is a hash is nothing more than a numerical algorithm applied to a pile of bytes that results in a value, 256 bits long in this case, that is effectively unique. If one byte of that pile of data changes, the hash value changes, okay. And what that really says is that if I have a hash value and I have the purported original data, what I do is rehash it and compare the rehash that I just calculated with the original hash, and if those two values are equal, I know that the data that's just been handed to me is in fact the data that originated. So it's really a mathematical technique.
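
A minimal sketch of that rehash-and-compare check, in Python; the record bytes are hypothetical.

    import hashlib

    # Sketch: store a SHA-256 hash with the data; to verify integrity later,
    # rehash what you received and compare it with the original hash.
    original = b"patient record bytes"
    stored_hash = hashlib.sha256(original).hexdigest()

    received = b"patient record bytes"
    print(hashlib.sha256(received).hexdigest() == stored_hash)  # True: intact

    tampered = b"patient record bytez"                          # one byte changed
    print(hashlib.sha256(tampered).hexdigest() == stored_hash)  # False: corrupted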

DR. COHN: I’m just listening to you, and I’m reflecting on yesterday when one of our esteemed former members, Dr. Clem McDonald came in and was talking about hash in relation to research data, and how they had been –

MR. MARSHALL: Oh, that’s another overloaded term.

DR. COHN: How does a one-way hash relate to –

MR. MARSHALL: Well, SHA-256 is a one-way hash. Basically, if you will, it's irreversible encryption, which means that from a hash you can never recover the original data. All you can do is check, given the original data, whether it remains with integrity. Trust me, this is one of those things that security geeks get, and we – just pay me and I'll do it for you.

MR. REYNOLDS: Okay, so we’ve got Paul, Mike, Mark and a break.

DR. TANG: So I'm going to attempt to simplify things again. So first of all, thank you, presenters, because it was very, very helpful, and I think we all share the same goal. My bottom line partly goes off of what you asked and what the two of you answered, which is sort of – it's a bit of a technology in search of a solution in search of a problem. So let me say, one, as far as this is concerned, I think Mike said it right, the law is very clear. So I think basically, at least for one, we're all good about one. But the main thing is there are certain things that are allowable by HIPAA, and it does not require hashing anything. And we should just figure out cleanly and clearly what that is and do it with appropriate protections.

There are certain things that are just plain impermissible, about which the law is very clear. Let me just try to be – and I'm not trying to be – this is sort of pseudoconformance. I mean, Lori's saying yes. And there's no reason to pseudoconform to anything, because where we define it, public health, research, we are allowed to use identifiable data and protect it as such.

But there are some things we just aren't allowed to do, and pseudoconforming doesn't accomplish that. So in some sense, whether it's technically like this provision or similar limitations – and I still go back to the payload. One of the things you mentioned is, well, you know, in order to get that information out, you would (1) have to code it so we would even know it's there to get it out. Well, that doesn't apply to text data. I mean, there is no automated way to go into text, figure out what's identifiable and cleanse it, short of the human process of coding it so that the machine can figure this out. That's why I call it pseudoconformance – it really is pseudo. And I think we really confused it. It was really a good chance for us to understand the technology, at least better. There was a gleam of hope that, wow, there was a standard way, a way of standardizing a method, et cetera, but it may have been a pseudo hope.

But does that help clarify a little bit? I think this is not necessarily – and I’ll look to them to – it’s going to be the answer to all of our problems, or the answers to the hard problems that they’re trying to solve using this technology, is that fair?

MR. REYNOLDS: All right, Mark Rothstein – Paul, you kicked over something.

MR. ROTHSTEIN: Well, I may be totally off base here. But I think, Paul, your attempt to simplify it oversimplified it by suggesting that the methods that were discussed here are purely academic. There is one area where I think this is very important, and that is in the research context, because if you wanted to do research on a data set but did not want to go back and – the privacy rule and the common rule – then you have to apply some method of removing identifiers. I will not use any of the different terms. That is where this comes in. Correct?

DR. TANG: When you use the words removing identifiers, you basically were in search of something that would do that reliably. This does not remove the identifiers. It encases them, but it does not, in the HIPAA sense, remove the identifiers.

MR. ROTHSTEIN: Okay, I am not qualified to pass on this particular method, but what I am saying is that some method, some process, of satisfying HIPAA standards and common rule standards has to make sure that the provisions of HIPAA, which would mean requiring authorization, and the requirements of the common rule, which would require IRB approval and forms, et cetera, do not kick in. Something has to be done, and that is where this – not this design so much, but this analytical framework – applies.

DR. OVERHAGE: I guess the case you are talking about is IRB or --

MR. ROTHSTEIN: Okay, so I want to do outcomes research on a whole data set, right? If I am doing it on stuff that is identifiable in a sense, that is human subjects research. However, if there are—if I can satisfy the separate criteria that OHRP has published on what qualifies for anonymous research, then I do not have to go and get—I do not have to re-contact these people. I do not have to get their consent. I do not have to get IRB approval because now it is anonymous research. The same thing in a slightly different way, applies under HIPAA.

MR. REYNOLDS: What I would like to do is, that is on the table along with the other comments. We need to continue discussion. I have Mike and then Marc Overhage. We have some other people on the phone.

DR. FITZMAURICE: I want to follow up on what Paul was saying and also on John's concept of perturbing the data. I can see that an element in the payload may have to be changed. You take out the identifiers, but you may have to change part of the payload. I would like to have those observations flagged so that I might delete them from the observations. My dataset might be used to link hemoglobin A1C with a person's weight, and I also have dietary information and other information in there. I have got the heaviest man in the world in my study. Everybody knows who he is. I do not really know who he is. He is 973 pounds. Somebody could say, well, I am going to take that and put him into the 300-pound class, or I am going to divide outliers by two. But that destroys the robustness and the statistical strength of my analysis. I would be better served by knowing that there is a flag and having it removed, or giving instructions to the data supplier to get that out of there, because the payload itself identifies who the person is. I just want to strengthen Paul's comment that the payload can also be used to identify somebody.

MR. REYNOLDS: Okay, Marc?

DR. OVERHAGE: Well, since we only have a few minutes, I will ask something easy. It goes to Mark’s and Paul’s question. We talked a lot here about pseudo-identifiers, and one of the identifiers I have spent a lot of time worrying about over the last couple of years is location: where the patient lives or works. It is not very amenable to this kind of de-identification for those kinds of purposes; you cannot use it if you mess it up too much. Any thoughts or work or comments on that? Because obviously each of these identifiers has its own unique issues, it seems.

MS. REED-FOURQUET: Right, so you can have two fields. One of them is your ZIP code rolled up into three digits, rather than the five- or nine-digit code, and that is what is available to most of the users of that resource. You can also have an encrypted version of the full ZIP code available for authorized access to the data. There are other lower-level techniques within the information resource that allow you to use those data elements as needed.
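As a minimal sketch of this two-tier approach, assuming Python and the third-party cryptography package as one possible choice of tooling (nothing in the testimony prescribes it): the truncated three-digit ZIP sits in the clear for general users, while the full ZIP is encrypted and readable only by holders of the key.

    # Two-tier ZIP handling: most users see only the three-digit prefix;
    # authorized users holding the key can recover the full ZIP.
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()  # in practice, held by the data steward
    cipher = Fernet(key)

    def protect_zip(full_zip: str) -> dict:
        return {
            "zip3": full_zip[:3],                               # general access
            "zip_full_enc": cipher.encrypt(full_zip.encode()),  # restricted
        }

    def reveal_zip(record: dict) -> str:
        # Only callers with the key (authorized access) can do this.
        return cipher.decrypt(record["zip_full_enc"]).decode()

    rec = protect_zip("20782")
    print(rec["zip3"])      # '207', what most users of the resource see
    print(reveal_zip(rec))  # '20782', the authorized view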

MR. MARSHALL: I will give you a slightly different answer. We have been talking in the last few questions about the analysis of comparative or offsetting risks. You have the risk to the person’s individual privacy, and then you have the risk to the research value of the data. There is a trade-off that we are talking about.

One way to resolve that trade-off is to go to a common point where both things are in fact protected. So, rather than de-identifying individual patient records, it may be appropriate to supply an entire, fully identified patient record set, but to provide protections at the set level, so that you do things like encryption of the set or other things along that line. Then what you have to have is an absolute trust agreement of some sort between the parties that basically says that if you violate the conditions, you are going to go to jail or face some really serious consequences. Basically, what you are doing is decreasing the likelihood of re-identification, or of identification of the data outside of a proper use context, while the research value of the data is in fact protected.

So, it could be that one of the recommendations is: to heck with this de-identification stuff; protect at the dataset level, because that seems to be adequate, and it covers the risks of the two competing stakeholders.

In other words, if you go through the risk analysis, you may come to that conclusion, and that is a perfectly valid conclusion. In order to do it, it has to be reflected in policy.

DR. FITZMAURICE: In the federal government, we have about nine statistical agencies where that sort of thing happens. We happen to be sitting in one right now.

MR. MARSHALL: Oh, I just described your life, right?

DR. FITZMAURICE: The NCHS is such a statistical agency where things like that can happen.

MR. REYNOLDS: Simon has a comment to kind of pull us all together, and then we will break and we will be back at 3:10.

DR. COHN: Gosh, that is asking a lot for me to quickly put it all together. John, did you raise your hand? Did you have a comment before I quickly close?

MR. LOONSK: Thank you, Simon. I feel like we have put our toe into some technologies and some approaches here, but the committee overall has not fully absorbed them. I do not think this problem is as simple as we may have made it out to be, though. If you look at public health, for example, from a regulatory perspective, with existing law and existing regulation, HIPAA says that public health should get the minimum necessary data.

Now, would that include the names if they are not necessary in that context? I think that is an open question. It is not quite as clear-cut. I think that this is not necessarily just about meeting the letter of the regulation either. This is about trust. These technologies bear more discussion in the context of how we get to recommendations that help to ensure that trust is maintained.

So, there are aspects of these that are very esoteric and, just as a personal opinion, almost completely impractical to apply in the short term if you look at existing health care and how it is laid out.

On the other hand, there are some very simple things in here that are very practical, like just not having names bandied about where they do not need to be. That is a very simple solution, and it is pseudonymisation. I think this is worthy of further discussion as we move forward, because part of this is on the agenda for what this workgroup needs to grapple with.
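As a minimal sketch of that simple kind of pseudonymisation, assuming Python and made-up field names: a keyed hash (HMAC) replaces the name with a stable pseudonym, so records stay linkable across messages without the name traveling with them, and only the key holder can reproduce the mapping.

    # Keyed pseudonymisation: replace the direct identifier with a stable
    # pseudonym so records remain linkable without carrying the name.
    import hashlib
    import hmac

    SECRET_KEY = b"held-by-the-data-steward"  # never shipped with the data

    def pseudonym(identifier: str) -> str:
        # An HMAC rather than a bare hash, so outsiders cannot confirm a
        # guess ("is this John Smith?") without the key.
        return hmac.new(SECRET_KEY, identifier.encode(),
                        hashlib.sha256).hexdigest()[:16]

    record = {"name": "John Smith", "mrn": "12345", "a1c": 7.2}
    shared = {"pid": pseudonym(record["mrn"]), "a1c": record["a1c"]}
    print(shared)  # the name itself never leaves the covered entity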

DR. COHN: It is good that John talked, because he said about 80 percent of what I was going to say. We talked a little earlier, at the beginning of the day, about the issue of tools and technologies to help minimize risk. I think we have heard more now: Glen just reminded us about encryption, and certainly pseudonymisation, whether it is required by HIPAA or might just be a very good practice in relation to some of the things we are pondering, is really something we need to think about. These are those issues of tools to minimize risks, which is basically what John was commenting on also.

With that, I agree with Harry. Let’s take a break.

MR. REYNOLDS: Okay, 3:10 because we have got three speakers for the next session.

(Break)

Agenda Item: Providers Who Capture and Use Data

MR. REYNOLDS: Okay, we are going to go ahead and start our next group so that we can make sure we have plenty of time to cover this. We are going to go in the order of Dr. Peterson and then Dr. Nugent should be joining us shortly from the airport and then he will have to turn around immediately after speaking and go back, and then we are going to have Jennifer Lundblad speak third.

DR. PETERSON: Just for clarification, did he fly in just to give testimony, and then he is going to fly back again?

MR. REYNOLDS: No, he is on his way from the airport to here. Okay, Dr. Peterson, we appreciate you joining us, and you have the floor.

Agenda Item: Practice-based Research Networks

DR. PETERSON: Well, thank you very much. I am going to start out by saying I am sorry I am not able to be there directly and speak with you in person. I do appreciate the opportunity to address you over the phone. I am going into a new area here, which is providers who capture and use data. I will talk to you briefly about Practice-Based Research Networks. Someone is going to be handling the slides there for me?

MR. REYNOLDS: Let her know whenever you want it switched, please.

DR. PETERSON: Since I was not able to be there, I thought I would put my picture up there and you can kind of see who it is that is talking to you. I will take the first slide then.

I wanted to start by making sure that everyone knows what a Practice-Based Research Network is. What I am referring to here are collaboratives, in this case of primary care physicians, that are really committed to performing research in their own clinical practices, generally research of relevance to primary care. These are groups of usually experienced primary care clinicians in community settings. They principally take care of patients, but they also have an involvement in clinical research. They are generally organized in regional or national groups, and they do multiple studies within a single network.

The reason I think this has importance is that these Primary Care Practice-Based Research Networks, sometimes called PBRNs, are basically the clinical laboratories that primary care uses, both to do clinical research and to disseminate information. It is important because most visits in this country are to community clinicians, mostly in primary care clinics. In fact, if we look at general internal medicine, pediatrics, and family medicine, there are actually more visits to those primary care physicians than to all of the other specialties combined.

MR. REYNOLDS: What slide are you on now?

DR. PETERSON: That would be Distribution of Office Visits. So the next slide would be the federation, then. The Federation of Practice-Based Research Networks is really a national organization of these networks. It was established in 1998. It currently has about 8,500 physicians across fifty-seven different regional or national networks, and in this case I want to say that we are talking about primary care, not a single specific discipline. The groups that have these Practice-Based Research Networks running include the American Academy of Family Physicians, the American Academy of Pediatrics, the American College of General Internal Medicine, and the American Academy of Nurse Practitioners; I think they have two or three national networks, actually. So all of the areas of primary care are really represented. The patient population served by that group is a little over 16 million, with over 2,700 participating clinics currently.

I do have on this slide a quote from 1999; I think Larry was a part of that. It said that the most promising infrastructure at that time was the Practice-Based Research Network.

Briefly, again to familiarize you with what is going on, these are the locations of the network headquarters.

MR. REYNOLDS: Dr. Peterson? You are cutting in and out. I do not know if you are using a speakerphone, or if you could maybe get closer to whatever you are using? We are starting to lose you at times.

DR. PETERSON: Oh, sorry about that. I am not on a speakerphone, but is this better?

MR. REYNOLDS: Yes, thank you.

DR. PETERSON: Thank you very much for telling me. I was just going on to the slide with the locations of the headquarters of the Practice-Based Networks. Again, what I was saying here is that these are the headquarters of the regional and national networks. In fact, the areas covered include the entire country, but if we just look at the headquarters, they pretty much reflect the distribution of the population.

So, what kind of data are we capturing, and why are we capturing information now? Really, the Practice-Based Research Network is a little different from your usual clinical research network in the sense that we are really beginning to look at capturing data and information within the setting of usual care. Although primary care has tended to have most of the Practice-Based Networks, we now see these kinds of provider networks in dentistry, neurology, and cardiology; oncology has had them for some time.

We are also seeing specialty-based practice-based research. The advantage really is that the patients that are part of a study, regardless of the area that we are looking at, demographically resemble the general population much more than they would in a local research study that occurs within an academic center. We also have access to a really widely diverse population because of the very wide distribution.

There is an advantage in having diverse locations for these and the involvement of many different delivery systems. You will sometimes see networks that are a single delivery system, and that is not what we are talking about. So, we are not talking here about a managed care organization or a large organization of clinics that tends to deliver care in a very uniform way.

The data that we collect from Practice-Based Networks cover a very wide variety of areas. Of course, we often talk about the translation of research into practice. From the perspective of a community clinician, that translation is really both quality, that is, bringing their practice up to date, and the dissemination of new information into the community. It is a place for doing both effectiveness and efficacy research; I think we tend to see more effectiveness work being done now because the sites are so generalizable. You will also hear more about the information that is collected from the perspective of quality. Some of the groups asking for that are regulatory bodies, but in the PBRNs these are really provider-generated performance measures, and we have a number of tools that are used to help providers evaluate their own practice. There are studies in safety, of course, as we would suspect. There is quite a bit of work in health disparities. There are clinical trials going on, and community-based participatory research, where the investigators involve the community they are caring for in identifying both the research question and the process that goes on in the research.

As I begin to look at the needs of this community, the practice-based research network community, I would really begin to emphasize the interoperability perspective. I know that you have had some tremendous experts speak to you earlier about system requirements; they may have had something like this up. Really, what we are talking about is the ability of these systems to exchange information and predictably use the information that has been exchanged. Syntactic interoperability is what we refer to as the ability to actually read information that is exported; the next build, semantic interoperability, is really the ability for us to understand the words that are used. In order for networks, primary care clinicians, or specialty clinicians to speak together, we really need to be able to exchange information like this. We need both the ability to read and the ability to understand each other’s information, because it is only in larger groups that the information that is provided becomes of value.

Currently, the National Institutes of Health is funding an electronic architecture for Primary Care Practice-Based Research Networks. The purpose of what NIH is funding is to facilitate clinical research in primary care practices anywhere in the country and to help that translation, the rapid integration of new research findings into primary care. This is really a trans-institute involvement, with funding coming at this time from several of the institutes: the National Heart, Lung, and Blood Institute; the National Center for Research Resources; the National Institute of Diabetes and Digestive and Kidney Diseases; and the National Cancer Institute, as well as organizations such as AHRQ.

I can show you a little bit about what that looks like. I am going to put up this fairly simple cartoon of the interface that is being built on what we call our local NIH gateway. There is a dataset at the clinic. We can take information from laboratory and billing systems, put that into a registry, and then export it using what we call a continuity of care record, or CCR. This works very well and is taken into consideration as we move forward: the next build shows that an Electronic Health Record does the same thing; an Electronic Health Record is able to export this kind of thing as a CCR, which is able to populate that gateway. That gateway then does a number of things. The next build shows that we are really able to generate a number of quality improvement reports, multiple disease registries, and clinical tools from that gateway, while at the same time that information is able to be shared in HIPAA-compliant ways, through web services, with, in this case, the electronic research network, which is really another portal through NIH.

I wanted to show you what that looked like. That gateway is a concept, really; I have summarized it here because it is made up of a large number of pieces. For those of you who are technically savvy, you can take a look at it and see what it looks like. The importance is really that it is based on the new Internet web services work through Globus. It is really the creation of a grid. That is, we make this information at the local clinic available on a secure grid so that we do not centralize the information; we centralize the querying service, and then queries go out and run locally at servers that are located across the country. Each clinic then maintains its own data, has the query run locally, and only the answer comes back out of the clinic. No data is ever moved or removed from the local center.
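Here is a minimal sketch of that federated pattern, with invented data and function names (the actual gateway software is not described at this level of detail in the testimony): the central service fans a query out, each clinic evaluates it against its own records, and only an aggregate answer travels back.

    # Federated query: each clinic runs the query locally and returns only
    # an aggregate count; patient-level rows never leave the site.
    CLINICS = {  # hypothetical local datasets, never centralized
        "clinic_a": [{"a1c": 8.1}, {"a1c": 6.4}, {"a1c": 9.0}],
        "clinic_b": [{"a1c": 7.2}, {"a1c": 10.3}],
    }

    def run_local(records, predicate):
        """Runs at the clinic: apply the query, return only a count."""
        return sum(1 for r in records if predicate(r))

    def federated_count(predicate):
        """Runs at the central querying service: fan out, sum the answers."""
        return sum(run_local(recs, predicate) for recs in CLINICS.values())

    # "How many patients have an A1c over 8?" Only the number travels.
    print(federated_count(lambda r: r["a1c"] > 8.0))  # -> 3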

In order for that to provide an advantage for the local clinician, there are other benefits that we barter, really. The local clinician has an interest in research, but of course the main drive is clinical care. There are a number of different tools that we also provide. This is one of the tools; it looks like this. This is a patient-specific physician reminder. That is, it is a patient-specific profile of a patient with diabetes. It shows a summary of some of the patient’s demographic information. It shows the graphic history of previous test results. On the bottom, it has a little alert area, as we see in some of the templates in Electronic Health Records, that really identifies: as of this time, this patient needs the following. Those are the kinds of summaries that are provided. When a practice gets an Electronic Health Record, those may be of greater or lesser value depending on the EHR. We thought that once they got the EHR they would be able to do that themselves, but that was not the case.

There are also a number of other tools that are applied that are clinically usable. In this case, there is a name put onto it, the Dusty Roads Clinic. It is really a quality improvement device. In this case, it is focused on diabetes, listing patients and their last A1c, kind of a summary of the clinical results that might be useful to a clinician. As we do that, there is additional clinical buy-in to some of what takes part in this NIH gateway.

So that is kind of the way we bartered entry into some of the clinics. I think that clinics are interested in the clinical tools. There is a great deal of information that is coming out, and I think that we are just at the beginning of being able to do this. By the way, we will be working with some of the people that will be talking to you a little bit later. The two things that I suggest would make the biggest difference in facilitating our use of data within a clinic, and I may have said this over and over again, are really the establishment of a standard data structure and the establishment of standard data elements.

The standard data structure that we have used, and that I think is being used predominantly by groups of clinicians, certainly in clinical care, is the continuity of care record. The continuity of care record is a validated standard from the standards development organization ASTM, one of the largest standards development organizations in the world. It is a standardized XML string, written in what we call W3C-standard XML. It is openly readable; you do not need special expertise to get your information into and out of it. That tends to be of value to those who are sharing. The next build shows the continuity of care document, a mostly compatible document written by HL7. It is a CDA document, a Clinical Document Architecture document, a little bit more like an envelope that we can put information into. Still, it is a data structure that we would encourage; I prefer the CCR, but we could take the CCD. We need a data structure, and we need data elements too, I think, across the world, as with the National Health Service in England, and across the United States.
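To make the openly readable XML point concrete, here is a minimal sketch that emits a CCR-flavored fragment with Python’s standard library; the element names are simplified illustrations, not the full ASTM CCR schema.

    # Emit a small, CCR-flavored XML fragment. Element names are
    # simplified for illustration, not the complete ASTM CCR schema.
    import xml.etree.ElementTree as ET

    ccr = ET.Element("ContinuityOfCareRecord")
    results = ET.SubElement(ET.SubElement(ccr, "Body"), "Results")
    result = ET.SubElement(results, "Result")
    ET.SubElement(result, "Description").text = "Hemoglobin A1c"
    ET.SubElement(result, "Value").text = "7.2"
    ET.SubElement(result, "Units").text = "%"

    # Plain W3C XML: readable in and out without special expertise.
    print(ET.tostring(ccr, encoding="unicode"))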

SNOMED is probably the single most comprehensive global nomenclature that we would use. It would be of value for us to have additional facilitation of how it is used if we are going to include it. Now, those are some of the ideas that I wanted to tell you about that would facilitate our use of data. Not having been at the presentations the previous day and this morning, I do not know if I have answered the questions that you would have, so I thought I would open it up and say, if there are any questions, I would be happy to answer.

MR. REYNOLDS: What we are going to do is we have the two other panelists. I do not know if you are available to stay on the phone or not?

DR. PETERSON: Sure.

MR. REYNOLDS: Yes, we will go through the other two panelists and we will open it to questions from the floor. Okay? Okay, thank you. I am going to go ahead and have you go.

Agenda Item: VHA Upper Midwest Quality Performance Improvement Alliance; MN Community Measures, Better Quality Information Pilot

MS. LUNDBLAD: Hi, I am Jennifer Lundblad. I am President and CEO at Stratis Health. We are a Minnesota based quality improvement organization that collaborates with both providers and consumers to improve health care. When describing the opportunities and uses of health data and health information exchange, quality is often mentioned but it is pretty typically lower on the list. So, I am really pleased that the committee has recognized the important opportunities that health data and health data exchange present related to quality and quality improvement, and so I am very glad to be here.

I hope to share our perspective, both generally on the work that we do as an external change agent, where we are really trying to help translate research into practice, and then by drawing on a couple of specific examples: Minnesota Community Measurement, which you heard Kevin Peterson mention just a moment ago, and the Quality Performance Improvement Alliance, or QPIA.

We have been working with VHA Upper Midwest and the Medicare QIO organizations in the five states of North Dakota, South Dakota, Minnesota, Illinois, and Wisconsin over the past year and a half on an initiative in which 40 hospitals in our region, already very high-performing hospitals, have stepped up at the leadership level to say they want to achieve 100 percent performance. They have a whole variety of policy and program and quality-related reasons for wanting to do this. But because this is a data-driven initiative, I think it will illustrate examples of responses to many of the questions, and I hope that will be helpful to all of you.

So, as you can see here, I think these are often referred to as the flyover states, but there is a lot of good work going on. First, you posed questions to some of the panelists around setting the context in health information exchange, in particular about enablers or restrictions or policies. So, I want to comment a little on what is going on in Minnesota, because I think that will help you understand as I talk later about the specific quality improvement initiatives.

One of the key remaining barriers to fully gaining the benefits of electronic data is the lack of implementation of standards for data transmission and sharing across providers using different vendor systems; the vendor systems simply do not talk to each other. This is particularly problematic in a state like Minnesota, which is a strong integrated health system state. We have situations where the best vendor product choice for a group of hospitals that are part of a health system is in fact a different choice than for the primary care or specialty clinics that are part of that same system. So, they are forced to choose either the least common denominator, to be able to exchange data but not meet their needs and requirements, or to meet their needs and requirements and not be able to exchange data. That is true in our integrated health system environment, and then you think of the QPIA example that I just gave: five states trying to exchange across state borders, and you know that we have real problems around health information exchange.

In addition, from our perspective in supporting quality improvement across the continuum of care, which really includes hospitals, clinics, nursing homes, home health, and some of the long-term care agencies working in our state, we really hope and expect that the next generation of quality performance measures will include patient-centered measures across the continuum of care that a patient experiences for any given episode or condition. So, it will include the kind of silo-based measures and quality improvement work that we are currently doing, but it will also reflect the need for multi-provider measures across the continuum of care that bring accountability for the entire patient experience and allow identification of improvement opportunities across this continuum. Given the current uses of Electronic Health Records and the current capabilities of Health Information Exchange, achieving this will be no small task.

We did hear Kevin also mention the continuity of care record, and we think there is some hope and opportunity in that as a particular tool for looking across the continuum.

You also asked in advance about the environment around current laws and whether they provide sufficient privacy and security protection for identifiable health information to be used in quality improvement. HIPAA is really the floor for data privacy and security regulation across the country; I know you have heard from many experts around HIPAA. Then there is kind of a patchwork, as I think about it, of state-based regulations that come into play above and beyond HIPAA and that create a pretty fractured and confusing environment for health care providers.

In Minnesota, we have a long tradition of very strong patient consent laws. In our state, the combination of HIPAA and our state laws does give sufficient privacy and security protections. Even with, or maybe because of, the strong patient privacy and consent laws, there are perceived barriers in the ways that Minnesota laws and requirements have impeded the electronic exchange of information: undefined terms, ambiguous concepts, difficulties in determining how the law applies in an electronic environment, and really the need to update where we are around patient consent.

As a result, and that is what this next slide and the attachment in my materials show, we passed in this legislative session in May the 2007 Health Record Act, which seeks to provide solutions to many of the barriers that have been perceived by providers. It includes everything from precision in some of the definitions (things like: what does health record mean? what does medical emergency mean?) to long-term care situations where there is perhaps someone who is not physically or mentally capable of providing consent, and how we address those situations. There is also a framework in there for record locator services to support data exchange.

So, this Health Record Act resulted from the Minnesota Privacy and Security Workgroup, which under funding from RTI was a broad multi-stakeholder group that was able to work through a lot of these barriers and develop the solutions that have now made it into this new law. I know you have heard a bit more in detail at the national level about the RTI project yesterday.

These are important advances in health data privacy and security to support health information exchange in Minnesota. If you think about QPIA, the five-state initiative that I just described to you, you can begin to see how this patchwork of state-based initiatives can impede quality improvement. What we passed in Minnesota is not the same as what is in any of our neighboring states, so the health systems in Minnesota have to deal with that as their systems cross the border. Then this QPIA initiative, the Quality Performance Improvement Alliance, which is trying to do collaborative improvement work across five states, really faces a complex crosswalk of regulations and laws that they need to understand in order to accomplish their collaborative improvement work.

You also asked, as part of context setting, about the opportunities to use the National Health Information Network and some of the national infrastructure, and I just want to comment briefly on that. The challenge, it seems to me, from a quality improvement organization perspective in this arena, is striking the balance between national infrastructure and policies and local innovation, control, and needs. We want and need a national network and the associated policies, but we recognize that innovation tends to arise from local needs and local opportunities. So, it strikes me that there are parallels to the world I live in, which is the national quality measurement world. Those of us who are immersed in that world are all trying to strike a similar balance between broad use of national consensus-based measures, for example those that come from the National Quality Forum and are reported nationally, for example through the Hospital Quality Alliance, while not inhibiting, and in fact finding ways to encourage, innovation and new measures that develop up from local providers and local initiatives. Health care delivery is local, having at its core the relationship between the physician and the patient. We strive to measure quality in consistent and comparable ways, but we also want to encourage innovation in ways to assess quality. So, I think there are some parallels as we think about quality measurement and as we think about Health Information Networks, national versus local, and where that balance is.

You then asked about the specifics, and you heard Kevin just speak to some of the sources and uses for data. I want to do a parallel track here as well. I have listed for you on the slide some of the ways we, as an external change agent, draw on sources of data for use in quality improvement. Chart abstraction: of course, the benefit is that it is the most detailed, at-the-bedside data about patient care. The challenge is that it is retrospective, although the opportunities for Electronic Health Records to improve that kind of real-time data collection and analysis are real and in front of us right now. There are some concerns, though, that Electronic Health Records will diminish the details and the nuances of individual patient care. There are opportunities as well in EHRs and chart abstraction: by building guidelines in as logic models and forcing functions, quality can be affected at the bedside right during care delivery.

For electronic registries, the second item I have on my list, the benefit is really the ability to view patient populations by condition, for example all of your diabetic patients, or by treatment, but the challenge is that electronic registries are not accessible by all of the providers who would find the information useful. Internal repositories are an interesting source of quality data, and right now, and you heard me describe the very integrated health system environment in Minnesota, most of our large health systems are really challenged by getting data back out of their internal repositories. They have made the move over the past few years to implement their Electronic Health Records. We have great levels of adoption in our state, particularly by the largest provider groups, and as we work with them, what they feel is that they are data rich and information poor. They cannot get the information out that is going to help them look at their population or help them improve their care delivery. So, they have got a lot of data there, and it is not yet useful for them in many ways for decision-making, for quality management, and for quality improvement.

External warehouses, the fourth item on my list, are data that are part of local, regional, or national repositories. These are really useful for benchmarking purposes, but the data and reports in them often have significant time lags, which means they are not quite as useful as they might otherwise be. Lastly, I have included as a source of data for quality the administrative sources, like claims data, patient satisfaction data, and other kinds of survey data.

I want to share with you a case example going back to the QPIA project, again, more than 40 hospitals across five states. The hospitals that participate in this initiative, all already high performers, told us early on that one of the most useful things for them was going to be the ability to see the data for all of their peers on the topics that we are working on: heart failure, AMI, pneumonia, and surgical care. They all already collect that. It already gets submitted through their vendors to the National Data Repository. We, as the QIOs, have access to that repository and can bring that data back and have them take a look at it. It seems like it should be a fairly straightforward process, but in fact it took us nine grueling months to work through the data policies that exist right now, which I think are very outmoded and outdated, to allow that kind of data sharing to happen. We had to go through data use agreements with each of the hospitals. We had to go through data sharing agreements between each of the five QIOs involved. You can imagine the spider web of paper that was crossing paths, and then each of our organizations had agreements with VHA to be able to share that. CMS was really supportive; they thought this was a great initiative and a great project. It took us about one of those nine months to actually execute the agreements. It took us eight months to reach a clear understanding with CMS about how that data could be shared. So again, CMS thought it was a great project and wanted to be very supportive, but I still feel a little bruised from the whole process of trying to get through all of that. We have gotten through it, and it has been enormously valuable for those hospitals. That is an example of some of the barriers around data exchange and using those external repositories.

You also asked about not only sources but uses of data for quality improvement. I have articulated again the most common examples that we come across in my organization as an external change agent. Internal quality improvement and patient safety: comparing your own performance over time, or comparing one's performance to that of one's peers, whether in a state or region or another breakdown. These comparisons can lead to identification of opportunities for improvement.

Peer review, which I think often does not get mentioned in the quality improvement realm and should: using individual cases to understand sentinel events or near misses and then encouraging peer learning, specifically between and amongst physicians. Public and community health, local, regional or state --

Pay for performance: I think the next iteration of transparency is really the use of quality performance data by health plans, by employers, and by state and federal governments that pay differentially for quality, whether for reporting or for achieving certain outcomes. Then research: quality data can contribute to the research and evidence base that we all should be using as we do quality improvement work.

Here, I want to share with you another case example from Minnesota; again, Kevin mentioned it in his comments as well. That is Minnesota Community Measurement. This is a new organization that was formed a few years ago when, in our state, we looked at the HEDIS data that is collected by health plans about the care that is delivered to their members and is reported in the Health Plan Accreditation Process. All of the work that is undertaken, all of the chart abstraction and administrative data that goes into creating HEDIS, was then reported at the health plan level. That does a medical group or clinic no good in terms of its ability to use that data for performance improvement and quality improvement.

Minnesota Community Measurement had its origin in attempting to take the HEDIS data and attribute it to specific medical groups so that reports could be generated, first internally and now in a very public way, about the performance of medical groups. We know that is an area that nationally has lagged behind hospital public reporting, nursing home public reporting, and home health public reporting. So here in Minnesota, we are making an attempt to say that we want to report at the medical group level, and we are working our way toward reporting at the individual clinic level, to have the kind of data that comprehensively tells us how we are doing at those levels, using data that has already been collected and simply attributing it at the specific group level.

Now, Minnesota Community Measurement is piloting direct data submission, since we have so many providers in Minnesota who are using Electronic Health Records. So, we think it will be an even richer and more detailed data set that will be reported at the medical group level.

I have given you the website there. You can see how those data are reported out for the medical groups in Minnesota. Again, these are ambulatory care measures at this point. I also will let you know that Minnesota Community Measurement was selected last year to be one of the six national pilot sites for the Better Quality Information for Medicare Beneficiaries, or BQI, project. So Minnesota is one of the six sites that is getting Medicare data and merging it with the commercial data and the state public programs data, to bring all of those data sources together for the first time and report at a medical group level, at a clinic site level, and potentially at an individual physician level. So, for the first time, there will be a comprehensive, all-payer source of data reporting in each of these six pilot sites.

These six pilot sites are the precursors for what you are probably all familiar with in the value-driven health care initiative from the Department of Health and Human Services: the value exchanges that are currently being promoted by AHRQ. I believe the public comment period on the description of the value exchanges just ended last week. The BQI projects, when they roll out and move beyond pilots, will evolve into these national value exchanges.

What we learned from the results in Minnesota is that we started reporting with a focus on diabetes in the community measurement project and moved to reporting a composite measure of five different elements of care and whether patients were in line with the target results in each of those five areas. Practice according to medical guidelines on this composite diabetes measure increased from four percent in 2004, which was the first year of public reporting of the data, to ten percent in 2006. That still seems really low to the general public in terms of whether we are giving optimal diabetes care or not, but that is a remarkable improvement. Many of the groups have been able to achieve very high improvement, and so we are trying to learn from their experiences as we continue to improve quality.

As I step back from all of this, I think it is important to spend a minute talking about transparency, since that is a world that Stratis Health is immersed in and since public reporting and transparency are among the strongest uses of health quality data, and to ask three questions about whether it is achieving the results that we intend.

The first goal of public reporting is to have it be effective at bringing the attention of health care leaders to quality and patient safety and at driving improvement. I give this one a resounding yes. I think the move toward transparency has very much accomplished this goal.

The second goal that is often talked about in transparency is to help consumers be more informed decision makers and activated patients. I would say we are not there yet on this one. I think we have a lot more work to do, and research to undertake, to know how to share data meaningfully with consumers so that they can use it to be informed decision makers. I think we have a ways to go with this one.

The third goal is the biggest goal of all: is transparency effectively helping us get where we want to be as a system, whether you define that as better quality and safety, as value, or as the six aims from the IOM? I think it is too early to tell. We do not know what those results will be. Again, I hope Minnesota can continue to be a leader in this.

I want to leave you with two additional pieces that I think an organization like mine may uniquely bring to this committee as you consider the uses of data for quality purposes. First is the need for clarity and consistency, in an electronic record environment, as to what constitutes the legal medical record. We know that for external peer review, for utilization management, and for litigation, we request the chart as part of those activities. In an electronic environment, we, as the QIO in Minnesota, are seeing real challenges, when we request a record from an entity that is using an electronic medical record, in knowing what constitutes that legal record. What is going to fully describe the care in a way that allows us to do our work as a quality improvement organization? And you can imagine other uses around utilization review and perhaps litigation. What screens do you pull? What data do you pull? How do you know what the right information to include is? I think we are lacking in definitions and clarity around what constitutes the legal record in an electronic environment.

Secondly, a very different topic: Minnesota is a very rural state, and my organization has done a lot of national work in rural health improvement. Rural health can be a model and a leader in many areas, but there are particular challenges when it comes to uses of data, and particularly when it comes to transparency. Small provider organizations often do not have the capital and resources to adopt technologies as quickly as their larger urban and suburban counterparts, and they often do not have an adequate number of cases to report in ways that do not explicitly or implicitly identify patients. Yet rural providers are committed to quality; they are committed to patient safety and transparency. I think the last thing we want to end up with in this country is a two-tiered system. I think we very much have to pay attention to the rural environment as we consider the uses of the data.

I thank you all for your time, and I look forward, as Kevin said, to the opportunity to answer questions when we are all done.

MR. REYNOLDS: Thank you. Our third presenter is in the neighborhood somewhere. It took us all a while to get here in the last few days. I would like to go ahead and open the floor. Steve wants to ask a question, and then we will keep asking questions.

DR. STEINDEL: Actually, this is more of a comment than a question, and it is to advise you that HL7 has a functional model for the Electronic Health Record that has just been passed as a national standard. There is a committee defining the legal medical record, chaired by some people from AEMA(?), and that work is now undergoing final scrutiny to be put up for ballot as a profile.

MS. LUNDBLAD: That is terrific news because we have really been challenged by that. Do you know when that happens?

DR. STEINDEL: No, I actually do not, because I have not been paying very close attention to it, but it will actually enumerate those portions of the functional model that will be balloted as the legal medical record. Once it passes, it will be published as a standard.

MS. LUNDBLAD: Great.

DR. CARR: Well, this is a question for both speakers: just great work, so clearly presented, and very helpful. What makes something like this happen? There are a lot of obstacles that you encountered, and yet somehow there is an alignment of motivation and incentives. What do you attribute that to?

MS. LUNDBLAD: I am seeing if Kevin is going to jump in first. I do think there is something to be said for the way that the health system is structured in Minnesota. I think it is not any one factor. I wish I could say it was one simple thing that is replicable, and why can we not all just do this? Unfortunately, it is not that simple. I do think the fact that we have such an integrated health system state means that the largest players that are trying to influence policy and take action in our state are looking fairly broadly across the care continuum, because they are paying attention to all parts of their organizations, particularly hospital and ambulatory care. Our integrated health systems are truly integrated and also include long-term care. I think that is the context in which we do our work, because of how the systems have grown up in Minnesota.

Secondly, we are very fortunate to have ICSI, the Institute for Clinical Systems Improvement, which over the past fifteen years since it was formed has really given Minnesota physicians and other clinicians the opportunity to come together and shape by consensus the guidelines for care delivery in Minnesota. It is not that those replace what comes from any of the other consensus-building or guideline-building bodies, but they reflect, in Minnesota, the standards that we want to hold ourselves to, and because ICSI is very good at engaging physicians in that process, complementing the integrated health systems, we have good grounding and agreement in the guidelines and standards for care.

Again, I am not sure what to trace it back to, but Minnesota is a very collaborative improvement environment. It is also very competitive, but there are certain topics and certain areas where most provider organizations agree to check their competition at the door, whether that is around agreeing to protocols for surgical site marking or agreeing to immunization practices and common messaging. Whatever it might be, there are enough topics around quality and patient safety where they say, that is not what we are going to compete on. We are going to compete on all of the other things and all of the other services, but on quality and patient safety, we are going to take that as kind of a common good, or a public good, and work together on those things. I quite honestly cannot trace for you how we have gotten to that point. That is the environment, in many instances, in which we are trying to do the quality improvement work that we do.

DR. PETERSON: Jennifer, I would like to answer, and maybe my answer will be a little more simplistic too. I think we cannot diminish the importance of the system in Minnesota that has been really pushing this; people have always tried to do better. But I have to say that part of this change in the landscape, it seems to me, is also because of the very simple introduction of substantially more electronic capability. You know, we now have computers that can store enough information, and we have computers that can process fast enough. I think in many ways what we are seeing, then, is not so much a change in the motivation of people, but in the ability to do it technically and technologically, and I think that pushes some of these changes.

MR. REYNOLDS: Paul, I think you had a question?

DR. TANG: I was just going to say something. Minnesotans are nice people, and they play together well. It sort of--

DR. CARR: Is this about Boston?

MR. REYNOLDS: Simon?

DR. COHN: First of all, my congratulations; I think these are two very good examples of work. Jennifer, I want to dig a little further into this. You were talking at a high level about quality improvement. You are the first BQI site that we will be having testimony from; I am anticipating that in July we will be getting additional testimony. In some ways, you represent an example of what the vision of transparency and the quality use case is really pondering: this idea of 360-degree evaluation of providers, both hospitals and physicians, and somehow the creation of quality metrics on a variety of levels. There is an implicit data stewardship perspective in this, which is something that we are going to need to hear more about; we heard some previously, and we shall be digging into this one. We talked earlier, and you were here when we talked, about all of these interesting security practices and all of that. Can you walk me through, at a very basic and maybe real level, how you relate to your physician and hospital environments and how all of this works? Let me just give you a couple of examples of what I am talking about. Do you publish the things that you want and then get results back from the practices, like 78 percent? Or are people actually submitting data to the BQI to enable you to put all of this together? Is it personally identifiable information? Is it pseudonymised information? Maybe you can explain to me how this works in a real-world environment.

MS. LUNDBLAD: I will try to answer all of those things. As I said, Minnesota Community Measurement, before it was a BQI site, was first taking chart-abstracted data that was used originally for reporting HEDIS measures at the health plan level, and it developed an attribution methodology to then attach those results to medical groups. So it was using chart-abstracted data that was done for the purposes of HEDIS and then attributing those results to the medical group. Those were the data that were originally reported, first internally and then publicly, and they have been available now for the past three or four years in a public way. It is ambulatory only, not hospital. It is measures of things like diabetes care, asthma, cardiovascular disease, and depression, from your typical slate of outpatient measures, and it is reported at the group level. It is not at the clinic site level and not at the physician level, because that is what the HEDIS data would allow to happen. Now that more and more of the clinic sites in Minnesota have adopted and are using Electronic Health Records, Minnesota Community Measurement is piloting a direct data submission process, so that the HEDIS data can still sit there as the base, and if a clinic site or a clinic system has electronic data for those same measures, it can submit that, and that data will then supersede what was there from HEDIS. We are going to be doing some studies to see how that lines up. If you have the HEDIS-derived data and you have the Electronic Health Record data directly submitted by the provider, are we going to get the same results, and if we do not, where are the differences going to be? That is in its pilot stage right now.

That is what has been happening, and again, it has been at the group level, soon to be at the clinic site level, and we have some real debates about whether the physician level is appropriate or not. Many people believe that most of the practices we are describing with these measures relate as much to the clinic systems and processes as to what the individual physician is doing, so that the site level is the appropriate level for public reporting, with individual physician results used for internal improvement purposes. That is a pretty healthy debate we can have from this one. So that is what is going on.

Now we have the overlay of being selected as a BQI site. In addition, we now have the whole of the Medicare Part B claims data, and some chart-abstracted Medicare data, that will go through that same attribution process, first at the group and then at the site level, to add Medicare data to what at this point has been commercial and state programs data, for comprehensive reporting. Does that make sense?

DR. PETERSON: Could I try to answer that too, from a different perspective? From the perspective of the group being measured, I think there are a couple of things that we have to be careful of. One of them is the extent to which some measures reflect the patient population and not necessarily the clinical practice. Jennifer, you mentioned the diabetes optimal care measure; I will use that one. One of the five components of the optimal care of diabetes is smoking. Well, if the patient smokes, then that is a point against the practice; the patient does not meet the optimal care measure. To the extent that the rate of smoking varies between different populations, some of those characteristics can be reinforcing. A practice that is in an area that is very poor and underserved may have problems with people affording their medicine and not having very good control. What we may then begin to do is tie that to less reimbursement, so that for a group in an underserved area, the measure reflects the underserved population and the group becomes underpaid as well. I think we have to be careful of that.

I think one of the other concerns that shows up is that when you define what you want, you will get what you ask for. That seems to be what we want, but I would encourage some caution, because sometimes you get too much of that. One of the examples that comes to mind is the contract with the National Health Service in the UK that came in last March. There were few measures in depression, and if you do not measure depression, then you do not get depression care. That is, if we measure what we feel is quality, what will happen is that our physicians are going to focus on making those measures. What we do not measure, they are not going to focus on. Be aware that if we are not going to measure it, it might not get done. I encourage us to measure just the things that are important.

DR. COHN: Okay. Jennifer, I did just want to get a little more specificity. I understand all that you are saying. With HEDIS, you either get the HEDIS results or you actually start looking at the records. I was just trying to get a sense of this: now you are getting all of the claims data on top of that being sent to you. I was just wondering if there was anything that you were doing. Are you getting patient identifiers and all of that on all of the claims? You probably do not get them on the HEDIS information; you are probably just getting the percentages coming out of the clinics or from the health plans. But what parts of this are really patient-specific data that you are looking through? Is there anything going on to protect or make anonymous that data prior to you actually looking at it? Is there anything going on there, and any concerns or otherwise?

MS. LUNDBLAD: Yes, I think there are a lot of concerns. Minnesota Community Measurement has just formed a quality audit committee and, now that it is deriving data from more and more data sources, is trying to address what those needs are. I would say we are still very early on in that. To date, we have really tried not to have identifiable data be what comes in to do that attribution at the medical group level, but as more and more data sources arrive, that is going to happen. I think we are just in the early stages of figuring that out.

MR. REYNOLDS: Okay, I would like to welcome William Nugent. You have that look on your face of travel that all of us have been through. We are sorry about that, and we look forward to your comments.

Agenda Item: Northern New England Cardiovascular Disease Study Group

DR. NUGENT: I appreciate the opportunity to visit you. I have been asked to come and talk about the Northern New England Cardiovascular Disease Study Group. I have put together a brief presentation describing who we are and giving you a little background in what we are trying to accomplish with our cardiovascular population in Northern New England. Basically, I have no commercial or financial affiliation with anything. I am representing NNE, the Northern New England Cardiovascular Disease Study Group and there may be some comments with respect to Dartmouth-Hitchcock Medical Center.

A rocket-sled tour: back in 1987, HCFA decided to release institution-specific and ultimately clinician-specific mortality data. This happened at a time when Jack Wennberg had just published an interesting paper looking at the variation in utilization of transurethral resection in patients with benign prostatic hypertrophy, suggesting that, based on the population you were associated with, it was more important where your prostate was than how big it was in determining whether it was going to be treated medically or surgically. These two things coincided to suggest that we could in fact collaborate by sharing very sensitive outcomes data in a population-based environment and maybe learn from it in a way that would protect and improve the outcomes of our patients. In 1987, the entity was formed: basically a collaborative to exchange information concerning the treatment of cardiovascular disease, a regional multi-disciplinary group consisting of clinicians, hospital administrators, and health care research personnel, with the purpose of seeking to improve the quality, safety, effectiveness, and cost of medical interventions. It is a hefty mission statement, but it has been consistent. We have upheld it, and we have stuck with the mission statement for the last twenty years.

As to the entity's relationship to its member organizations, I think the most important thing to stress here is that we are a regional organization, and we have really put a priority on maintaining that regional focus. Therefore, membership is offered to all institutions performing any sort of cardiovascular intervention in Maine, New Hampshire, and Vermont. We have been approached by institutions outside of those three states, and we basically consider membership on an individual basis. Once you have agreed to become a member of the NNE, you basically sign a contract agreeing to pay the necessary dues, to collect, provide, and allow validation of all data, and to attend and host meetings. Our meetings are held periodically throughout the year, rotating sites within the region.

Our funding came, for the first ten years, basically out of our own back pockets. We paid for our own travel and our own meals and actually funded all of our research activities through grants. That has since matured to the point that all registry activities are now supported by dues collected directly from member organizations. The nice thing is that we have become sufficiently central to the quality improvement and quality assessment systems of all the organizations that we are now part of their budgets, basically, in order to help maintain this registry. All of our research activities are supported directly through grants, typically either through the American Heart Association or through the NIH. Member institutions pay for their own travel, meals, et cetera, and since we rotate meetings around the region, each institution hosts and pays for the meeting when its turn comes up.

It is hard for me to organize what we do in a way that gets through this in fifteen minutes, so this is basically what we do. Our primary activity is to develop and maintain a data registry, and I say with some emphasis that it is a registry; this is not a randomized controlled trial. Through the registry we provide an outcomes data and reporting system; I have brought some examples of that to show you today. We provide clinical decision support tools; I have brought examples of those as well. I think it is worth mentioning that if you have a population-based database and provide risk-stratified outcomes data, you can turn that data back on itself and provide risk-stratified predictions based on individual patient characteristics. So besides informing the clinicians how they are doing with respect to one another, we are also providing decision support tools for practice enhancement, to help them make decisions about individual patient encounters. We organize quality improvement activities; I will show you examples of those. And we generally provide services to clinicians and institutions as needed and as appropriate.

This is a big slide. I apologize for this, but this is probably the most important slide and will probably generate the most questions. I think it is worth stating that although the registry analyzes the data and does house the data, all data is the property of the individual institutions, and it remains the property of the individual institutions. All data collection has been reviewed by the IRBs, and the IRBs have signed off at all institutions. Most of the IRBs have designated the registry a quality improvement activity and therefore exempted it on that basis.

Data is therefore protected under a QA umbrella, and as I pass these things around you will notice we put the disclaimer on the bottom. That has never, to my knowledge, been challenged in any way, shape, or form. It is worth pausing for a minute just to mention that we have had at least three very highly contested certificate of need applications within the region. We have not used this data either to refute or to support them, because that basically goes against the tenor of what we are trying to do; it would promote one institution over another. I think that has been generally acknowledged and recognized by all member institutions. I have wonderful anecdotes to tell you about how important we feel it is to remain owners of the data and to stay in control of it, for appropriate reasons.

We have compiled a consent, sort of a general umbrella consent for the use of data for QA purposes, which is part of the admission process when you sign into the hospital. When we get into the nitty gritty, institutions are responsible for submitting data and allowing for validation. Our validation is primarily of procedure and outcome, meaning mortality outcome. We have elected to use in-hospital mortality as our benchmark outcome because it is so easily validated against procedure. This is as opposed to 30-day or procedural mortality rates, which are harder to validate once the patient leaves the hospital. We have done the due diligence of going back and running subset analyses, which found that in-hospital mortality, which is easy to validate, does not change the rankings when you compare it with 30-day or procedural mortality rates.

Basically, the means of data collection is pretty much up to the institution, and we have examples of both paper-and-pencil submission and electronic submission. We developed an electronic entry system, and it is now available free to all institutions; it is just a matter of when you decide to implement it at your institution. We have done that the last two years at Dartmouth, so we are electronic now. We validate approximately every two years, and I think we are very proud of our validation procedures. The hospitals work with us to provide administrative data to compare against our registry data, to make sure all patients who have had the intervention are counted, that those who have died are counted, and that nobody is counted as having died twice or three times, et cetera. We actually go to the rigor of making sure we find resolution in 100 percent of the cases. I will say we are tracking 100 percent of the patients who are intervened upon. This is not a sampling initiative; this is a 100 percent tracking initiative in the region.
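The reconciliation Dr. Nugent describes, checking the registry against hospital administrative data so that every intervention is counted once and no death is counted twice, can be sketched in a few lines of Python. This is a minimal illustration under the assumption that both sources can be reduced to lists of case identifiers; none of the names or data below come from the NNE registry itself.

    from collections import Counter

    def reconcile(registry_cases, admin_cases):
        """Flag discrepancies between registry and administrative case lists.

        Returns cases missing from the registry, cases present only in the
        registry, and cases recorded more than once (the "died twice or
        three times" problem). The aim, per the testimony, is resolution
        in 100 percent of cases, not a sample.
        """
        counts = Counter(registry_cases)
        missing_from_registry = sorted(set(admin_cases) - set(registry_cases))
        not_in_admin = sorted(set(registry_cases) - set(admin_cases))
        duplicates = sorted(cid for cid, n in counts.items() if n > 1)
        return missing_from_registry, not_in_admin, duplicates

    # Invented case identifiers for illustration.
    registry = ["C001", "C002", "C002", "C004"]
    admin = ["C001", "C002", "C003", "C004"]
    print(reconcile(registry, admin))  # (['C003'], [], ['C002'])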

We do compare our data to the NDI for long-term outcomes every five years. This gets into the issue of patient identifiers: we have to have a patient identifier in order to compare against the NDI, so we do have social security numbers. For a long time we were using Hogben numbers to try to keep the patient identifier at the institution. It would be figured out at the institution if you ever wanted to know who Mrs. Jones actually was; we would be tracking her through what was called a Hogben number, which was some combination of first names and last names and initials and numbers or something. For the last number of years, we have been using social security numbers.
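The Hogben number is described here only as "some combination of first names and last names and initials and numbers," designed so that only the originating institution could map it back to the patient. A minimal sketch of that general idea follows; the salted-hash construction is an assumption for illustration, not the actual NNE scheme, and every input field is hypothetical.

    import hashlib

    def make_pseudonym(first_name: str, last_name: str, birth_year: int,
                       site_salt: str) -> str:
        """Derive a stable pseudonymous identifier from name fragments.

        Only the institution holding site_salt can regenerate, and thus
        re-identify, the pseudonym. This mirrors the idea that who
        "Mrs. Jones actually was" could be figured out only at the
        institution, not at the central registry. (Salted hashing is
        an assumed modernization, not the original scheme.)
        """
        # Combine initials and stable attributes, per the description of
        # "first names and last names and initials and numbers."
        token = f"{first_name[0].upper()}{last_name[0].upper()}{birth_year}"
        digest = hashlib.sha256((site_salt + token).encode()).hexdigest()
        return digest[:12]

    # Same inputs always yield the same pseudonym, so the registry can
    # link records over time without holding direct identifiers.
    print(make_pseudonym("Mary", "Jones", 1942, site_salt="institution-secret"))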

We have a way of vetting new variables. We meet three times a year, and occasionally someone will suggest a new variable be added to our dataset. It typically goes into a trial period first, and we make sure it can be accurately tracked. We make sure that we have a representative record of successful tracking before it is codified into the registry permanently.

What is the status of our registry right now? This is pretty up to date. The PCI registry has about 100,000 patients; it began in January of 1990. We really started with the surgical initiative and then grew out into PCI. Our CABG and valve registry, which goes back as far as 1987, has almost 77,000 consecutive patients in it over a twenty-year period. We have added cardiac anesthesia and perfusion; as I quickly go through some of the work that we have done, you will see we found it was important to get into the meat of the operation and therefore expanded into anesthetic and perfusion variables. You can see the numbers that are included there. Interestingly, these are maintained by the representatives of those specialties. They are tied to the central dataset, but all of the ideas about what to track in perfusion came from the perfusionists, and similarly for the anesthesiologists. That is the beauty of a multi-disciplinary group like this: you can really think beyond your own specialty. We now have a relational database that allows us to cross interventions, from medical to PCI to surgical. It represents about 133,000 patients. We have been tracking long-term mortality as far back as 1987, using the NDI as a benchmark.

This is our website; it is just an example of what is on our website in terms of looking at regional outcomes. We post this as well as provide individual institution reports. I brought the report from Dartmouth; these are produced biannually. All of this other stuff I can leave. Even in today's transparent world, where you can get this on the Internet, this is still pretty sensitive stuff, and it is my only copy. These I can leave, and you can spend as much time looking at them as you like. Anyway, you can see some of the outcomes that we look at: in-hospital mortality, stroke rate, bleeding rate, and mediastinitis, and this is the index-admission mediastinitis rate, which is one of our weaker outcomes because it tends to miss patients who are readmitted with the disease. I add that as a postscript because it has been bugging me for the fifteen years we have been tracking it. Postoperative renal failure or insufficiency, that is a doubling of creatinine. Use of internal mammary arteries, median and mean postoperative length of stay, and there are a lot of other outcomes that you will see in here, but this is a sort of thumbnail look at the region. This is what our PCI data looks like in terms of status and intervention: mortality, those going to emergent bypass, and then vascular and stroke complications associated with the intervention.

I think this is one of the more interesting aspects of what we do, and until we got into the appropriateness work, one of the more appealing and sexier aspects: decision support tools. We now provide pocket cards for cardiac surgery, decision support for interventional cardiology, and even an electronic second opinion that allows us to predict long-term outcomes based on whether patients have surgical, PCI, or medical therapies for their heart disease. These are periodically updated. For example, the coronary bypass card, which I will pass around, allows you to look at individual patient variables and predict their mortality from CABG, from aortic valve surgery, and from mitral valve surgery based on their individual characteristics. You can look at the likelihood of stroke or mediastinitis as well. This is helpful when you are trying to get informed consent; you can try to be as accurate as possible in assessing a patient and what their chances are. This has come directly out of our dataset. These are what I will pass around: a risk card for CABG and a risk card for vascular complications following PCI. This other yellow one is a little narrower in its scope; it looks at the likelihood of low cardiac output. We went through a period of time, and are continuing, trying to understand low cardiac output and prevent it as an intermediate complication leading to death following bypass surgery. This card allows us to predict the likelihood of low output. The hope is that you would take this and stratify care based on the likelihood of developing low output.
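Mechanically, a pocket card like this is a logistic regression model read off a card: sum the coefficients of the risk factors the patient has, add the intercept, and apply the inverse logit. The sketch below shows that arithmetic; every coefficient and variable name is invented for illustration, since the real NNE models are fit to the regional registry and are not reproduced in this testimony.

    import math

    # Illustrative values only; not taken from the NNE risk cards.
    INTERCEPT = -4.5
    COEFFICIENTS = {
        "age_over_75": 0.9,
        "urgent_status": 0.7,
        "prior_cardiac_surgery": 1.1,
        "ef_under_40": 0.6,
    }

    def predicted_mortality(patient: dict) -> float:
        """Map binary risk factors to a predicted probability of death.

        Standard logistic form: compute the linear predictor z, then
        apply 1 / (1 + e^-z). The output is the kind of number a surgeon
        would quote during informed consent.
        """
        z = INTERCEPT + sum(
            coef for name, coef in COEFFICIENTS.items() if patient.get(name)
        )
        return 1.0 / (1.0 + math.exp(-z))

    # Example: an elderly patient presenting urgently.
    risk = predicted_mortality({"age_over_75": True, "urgent_status": True})
    print(f"Predicted in-hospital mortality: {risk:.1%}")  # about 5.2%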

These are some of our quality improvement activities currently underway. We have a readiness-for-revascularization initiative. We decided that we could do better, particularly with transfers, at melding guideline data into our clinical practice, so we have tried to codify steps that we think idealize the patient prior to an intervention, whether it be a PCI or a CABG. We have started a multi-disciplinary, regional look at basically readiness for surgery; that is the one I was interested in, and there is a similar one underway for PCI. We are trying to improve our decision-making by improving our decision support tools. In cardiac surgery, we have a specific intervention right now trying to tie actual actions in the operating room to the likelihood of embolization, particularly what we consider microembolization, which we think is a surrogate and somewhat more common outcome for stroke and for some of the mental changes that occur following surgery. So we have an initiative going there. We have a prospective study; this is actually one that requires full informed consent. We are trying to get as many patients as we can to allow us to take a sample of their blood and freeze it, and at some point go back and look for biomarkers, whether they be markers for CNS injury, markers for myocardial injury, et cetera. We are looking closely at blood utilization. It is a really interesting and quite variable aspect of coronary bypass surgery, in terms of the pernicious impact of blood on patients' postoperative outcomes and the detrimental effects of excessive anemia. We are weighing two bad things: it is bad to get anemic, and it is bad to give blood. We are trying to sort out those two issues, primarily by reducing the amount of blood that we are transfusing.

The services to the institutions are summed up here. We provide this book, which is given basically to every clinician as well as to the institution. We try to present the institutional reports in JCAHO-friendly formats so they can really just read this into their QA environment and then apply it to their JCAHO accreditation environment. We are willing to customize reports based on individual institutional needs. We support regional presentations, and we support national meetings. We have about 80 peer-reviewed publications, which are available on our website up to three years ago; you will see what our bibliography looks like in here. We also support about 100 abstracts or more at national meetings, on top of the peer-reviewed publications. I think the beauty of this group is that it has finally found a way to integrate the private and the academic sectors in a way that is really meaningful in terms of the creation of new science within the specialty. We have published some sentinel papers with private-practice first authors. It really provides a sense of pride and a sense of momentum to the organization when you can do something like this. This is a beautiful melding of the private sector and the academic sector.

I think it is worth mentioning that all of our papers, in order to be published, have to have at least one author from each institution named on the paper. We have a lot of authors on our papers, but it means that every institution is represented and there has been active input by at least one individual at each institution. That has been a tradition within the study group since its inception. You will see our recent studies report, which I am passing around. We now have a website, which you are welcome to visit. We are almost fully transparent right now. Certainly the individual institutions are moving quickly toward full transparency. All of our outcomes short of mortality are fully transparent. There have been some legal glitches in trying to get that final mortality rate to be formally transparent at the website level. It is transparent at Dartmouth, and I think it is transparent at most individual institutions. I think within the next few months we will be fully transparent with respect to coronary bypass surgery. It will take a while before the cardiologists are willing to do that, I think. We periodically meet and try to figure out where we are going; about every three years we have a leadership retreat.

I guess the next question is: that is who we are, now where is the beef? In other words, what have we really accomplished over the last few years? Many of you may have heard about us; we sort of made the papers a number of years ago by realizing that nobody knew how to take care of patients in Northern New England better than Northern New England doctors. When we decided we would learn from one another in our early formative stages, we spent an entire meeting deciding whether we would go to the UABs and the Cleveland Clinics and learn how they took care of their patients, or whether we would first learn from our own back door. We decided we would learn within our own region. We went around with teams visiting all institutions, with the opportunity to look under the rock at every single institution in the region. This was really a sentinel event for this organization, in that we began to gravitate our care toward the mean, I might say. We all started realizing we put our pants on one leg at a time just like everybody else. A lot of the fear and distress began to fall away. There are some wonderful spin-offs that occur when that happens. Mainly, you start trusting your competitors, and when you start trusting your competitors, with trust comes a certain expectation. I think by getting to know each other that well and that intimately, our data got better. We began to collect better data. We expected good data, and we expected to give good data. This collaboration, in the form of regional meetings and regional benchmarking, really did result in a spirit of trust in what had been a considerably contentious and uncomfortable environment.

We began to move from first-order analysis, which is really your traditional mortality outcome, toward second-order analysis, which is what really drives that mortality. While we reported mortality, we very quickly dove into this concept of what was driving the mortality. Was it low cardiac output? Was it transfusion? Was it individual surgeon technique? Though we reported outcomes, we quickly learned that the meat was in understanding process. These are four examples of process variables that are all within the surgeon's control and completely and totally linked by logistic regression to a better outcome: preoperative aspirin; use of the IMA to the LAD; avoidance of transfusion; and preinduction heart rate, the adequacy of beta blockade. By getting over the mortality thing, moving upstream, and trying to find out what drove that mortality statistically in a large population, we could go to meetings and say: give aspirin before surgery; use the internal mammary artery; try not to transfuse the poor soul; and get their heart rate adequately blocked before you take them to the operating room, and chances are that patient will do better.

So what happened? Our mortality is as you see it here. I have to think the 2004 bump was sort of a reeling back from the coated stents arriving on the scene. We have since adjusted and gotten back to where we belong.

This basically looks at the five institutions that started. I think what we learned when we started this initiative back in 1987 and published our data in 1991 was not so much that the mortality rate was high, but that there was significant, wide variation between institutions' practices. You can see how that variation narrowed over the course of the years, and where it narrowed was when we completed our regional site visitations.

This is just an example, a somewhat dated slide, but our work continues to be this good. I think it is worth pointing out that when you stratify patients by urgent, elective, and emergent status and just look at the elective population, you are looking at 1,500 patients throughout a three-state region where six people died in the course of the year. That is pretty good. That is a 0.4 percent mortality risk, and it suggests that our systems are sufficiently robust to get the patients who are able to walk into the hospital carrying their bag back out of the hospital at least alive. I think when you begin to see numbers like this, you begin to get to where I am going to conclude with this presentation. That is, you begin to become more concerned about your regional rate than about your individual rate and your institutional rate. You begin to think of your obligations there as a physician.

So I am going to conclude by saying that clinical use of our data really has begun to preclude all that we have heard about the gaming strategies associated with tracking outcomes in this kind of work. It is important work, there is no question, and even just monitoring is important. But when you own the data and control it, and actually use the data to change your practice, then the gaming issue becomes silly. It becomes ridiculous. Measurable practice changes do occur and have occurred based on our data.

When I had a death, I was more concerned about screwing up Northern New England than about screwing up myself at Dartmouth-Hitchcock Medical Center. I had the ability to stand in front of our entire Board of Trustees at one point and recount the fact that I had just come back from an NNE meeting: the adjusted risk of mortality for anybody having coronary bypass surgery in Northern New England was 1.7 percent, probably the lowest in the entire country.

That simply meant that anybody in that room, which happened to be the Board of Trustees for Dartmouth-Hitchcock Medical Center, could get chest pain any place in Northern New England with the same likelihood of survival, because there had been no statistical difference between institutions for the preceding six years.

That to me was my first chance to take back the night as a clinician who had been told what to do and who had been monitored for the preceding x number of years. Because we had taken the initiative here, we were suddenly looking at a population of patients, not because it was going to make Dartmouth-Hitchcock a better place, but because it would make Northern New England a better place.

So I will conclude. I apologize for that acronym, NNECVDSG. The Northern New England Cardiovascular Disease Study Group has shown that high-quality, population-based clinical data can be accurately collected and creatively used for the purpose of improving patient outcomes. For that message, I traveled long and hard. Thank you very much.

MR. REYNOLDS: I would like to thank all of you. You actually followed our instructions; you were all concise. Well done. I am going to open it up now for questions. Paul Tang?

DR. TANG: That was a very compelling and engaging presentation. Thank you for the good work. I have two questions. One is, you mentioned, at least three times that I picked up on, the importance of maintaining data ownership, and I think the first or second time you promised there were stories behind that. So I am wondering what those are. We are talking about trust; we are talking about protection. You made a big deal out of it. Basically, there was something about the ownership that was enabling. I am trying to figure out--

DR. NUGENT: This is not rocket science. I bet you could answer the question for me. There is no ambiguity in terms of what we are here for, absolutely no ambiguity. I am terrified of having a high mortality rate and my institution winding up on the front page of a newspaper. That is the big stick. That is the stick that keeps me in line. Okay? This data does not scare me. This data helps me. That is because I trust the data, and I trust those who are analyzing the data to use it for the reasons for which we created it. Now, the reason we are able to say that is because we have owned that data from day one. We have never missed an opportunity to congratulate ourselves on, number one, how good it is, and number two, what it is for.

So, I will give you a couple of examples. One of our institutions decided they would put up a, well, we do not really have billboards in Northern New England; they sent out a colored glossy advertisement that they were the best organization in Northern New England when in fact they were not statistically any better than anybody else. It was the surgeons at that institution who went to the consortium to say this was not right, because the advertisement was using the same data that was going into the consortium to say they were somehow better than everybody else, and they were not. That brought the administrators to the table, because they came to us and to the registry and said, we need to be able to talk to our third-party payers, and we also need to be able to be proud of what we do; how can we do that without jeopardizing the integrity of the organization? That allowed us to sit down, and this is where you will notice we are now tracking by the last 500 cases. We have really given up the annual mortality rate. Some hospitals have 150 cases a year, some have 1,200, and that sort of variation blows any meaning out of an annual rate. You really want to look at a sample that is representative, so we look at the last 500 cases as our metric. It is a nice metric to use.

We just worked out ways where you can advertise your data but also provide confidence intervals that allow anybody to see whether or not that data is statistically better than anybody else's. My point is that it was the administrators who came in and said, we do not want to make you mad; we just want to be able to use our own data. So, by owning the data, we were able to do that sort of thing.
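The "last 500 cases" metric with confidence intervals can be illustrated with a short calculation. The testimony does not say which interval method NNE uses, so the normal-approximation binomial interval below is an assumption, and the case data are invented.

    import math

    def last_500_mortality(outcomes, window=500, z=1.96):
        """Trailing-window mortality rate with an approximate 95% CI.

        outcomes: chronological list of booleans, True = in-hospital death.
        A fixed 500-case window keeps the denominator comparable between
        a hospital doing 150 cases a year and one doing 1,200.
        """
        recent = outcomes[-window:]
        n = len(recent)
        p = sum(recent) / n
        half_width = z * math.sqrt(p * (1 - p) / n)
        return p, max(0.0, p - half_width), p + half_width

    # Invented data: 9 deaths in the last 500 cases.
    cases = [False] * 491 + [True] * 9
    rate, low, high = last_500_mortality(cases)
    print(f"{rate:.1%} (95% CI {low:.1%} to {high:.1%})")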

DR. TANG: The second question is, I saw that variance that you talked about, and I saw the 2003 data, and I heard the chronological fact about the site visits. How do we know that this was not a change in the way the data were handled, that is, information sharing about how to calculate the data, versus all the surgeons changing overnight?

DR. NUGENT: I have been asked this so many times: is this true, true, and related, or true, true, and unrelated, basically? In other words, was our group up there, did it really have something to do with the outcome or not? There was a paper by Eric Peterson, interestingly enough; I can dig it out if you want to see it, where he compared 1993 mortality rates. We are going back in time, but it is still representative of basically the time we are talking about. He looked at the 1993 Medicare mortality rate by state, and he included the NNE region as a state. He also looked at how much each had improved in the preceding five years, and he plotted it on an XY axis, which gives you four quadrants: those with low mortality rate and low improvement; those with low mortality rate and high improvement; those with high mortality rates, and so on. You see what I am saying? Northern New England had the lowest mortality rate in 1993 and the most improvement of any place in the entire country. There is nothing to suggest that this would have happened by simply tracking, tracking, tracking, and tracking. There was the intervention, where we got together and watched each other practice and compared notes and did flow diagrams and things like that. I never proved it was a one-to-one relationship, but you will never convince me otherwise.

MR. REYNOLDS: Marc?

MR. ROTHSTEIN: Thank you. That was sort of my question. You are the first person we have heard from who described something like regional grand rounds. I am wondering whether you think the success of that is attributable to the specialty, to the region, or to other factors. How much stock should we put in that as a model to recommend elsewhere?

DR. NUGENT: There is no question that the coronary population was the poster child for this kind of work. It is not even the same population now; it is disappearing. It was a beautiful, homogeneous group of patients that could be compared, and the sacred cows could be systematically slain: my patients are older and sicker, my patients are different. We were really working on the same page throughout our region. There is no question that the specialty was the ideal place to start this kind of work, because of the homogeneity, because of the high cost, and because of the burning platform. All of those things were in place back in the early 90s to get us together.

There is no question that the Northern New England environment is competitive, and I know you guys do not believe me, but it is and it was competitive. I hated those guys in Manchester. They scared me to death. I had recently taken up a leadership role at Dartmouth, and there was this coronary mill down there in Manchester that was going to eat my lunch. They were growing incredibly fast. I hated them. Okay? This got me to know them, on a personal and medical basis. I have to say that we were competitive, but there is something nice about your nearest competitor being 70 miles away. So we were the ideal region to give this thing a shot. Okay? It has been hard to duplicate since. Believe me, I have eaten a lot of rubber chicken in the last 20 years trying to duplicate it. Getting beyond the mortality, beyond the outcome, to the process has been hard for regional or any other groups to transition to. All right?

The third thing is the fact that we published. That has been an incredibly important part of our momentum because it has given us credibility around the country. It is as good as going on CNN. It is better than going on CNN, because it gives you a longevity in our specialty that transcends any kind of 15-second sound bite on the evening news.

Now, can it be duplicated? I maintain that regionality is the important quality characteristic of this group; I think that is where the rubber really meets the road from the clinician's perspective. There is something beautiful about knowing we are taking care of a corner of the world, and we are taking care of that whole corner. All right? It is a large enough population for us to understand and see relationships that are impossible to see on an individual or even an institutional basis. Okay? We get rid of so much of the confounding and so much of the noise that you try to tease out of your own practice or your institutional practice. You have a large enough population to learn from these infrequent outcomes, but it is a small enough group that I still have an identity. Every single surgeon up there has an individual identity, so that if you really screwed up, you would see the blip; you would find yourself up there. There is also this wonderful identity in terms of who we are. So yes, I think it can be duplicated, but there has to be facilitation, and I think the key is the regionality of it. The Holy Grail, for whoever you are, is to have this sort of national benchmark, but I am going to put in a plug for the fact that, as much as I would like to compare myself to southern California, I do not really care. I just want to make sure I am giving the best care I can to my patients in Northern New England. I would put a pitch in for that; I think it is an important attribute. I hope I answered your question.

MR. REYNOLDS: I am going to ask a quick question, and then Kevin and then Justine, but I may stop in the middle because I know Marc Overhage is leaving and we may want a quick summary from you since you are not going to be with us tomorrow. All of you have mentioned, and Jennifer, you actually said the words about the strong patient consent environment in Minnesota and so on. We are talking about a lot of quality data, but nobody is talking about one of our visions, which is: how do the patients trust that you are using that data? I have not heard anybody actually say that. So, any of you who want to respond. Jennifer, maybe you first, since you said the words and kicked it off.
MS. LUNDBLAD: I would just turn your attention back to the document that I attached to the PowerPoint slides you have, which is the Health Records Act. This is our current vision in Minnesota of how we get to that balance of appropriate protections and patient consents and patient privacy while still being able to do quality improvement at the individual provider level, at the regional level, at the state level, and at the collaborative-group level. I think it has in here a lot of those pieces that specify what we think of as that notion. It is the balance; we are trying to achieve the balance. There is no magic solution to it. I heard people characterizing it as if we all get along and everyone works well together in Minnesota, and they do not believe we have competition there either. We really do. Part of the reason we have a tradition of strong patient consent is that we have some of the strongest privacy advocacy groups, which are very active at our state legislature, and it is about striking that balance in a way that is going to work for our state regs and complement the HIPAA pieces. It is complicated. I do not think this will be the end-all. This is what our most recent, hot-off-the-press actions have been.

MR. REYNOLDS: Dr. Nugent?

DR. NUGENT: This is not an issue for our patients. Honestly, the average patient does not know the Northern New England Cardiovascular Disease Study Group exists. The average patient wants to know that they are getting high-quality care at their individual institution. Honestly, they want to make sure that institution is as good as or better than competing and otherwise available institutions. We are working behind the scenes to do that. We have certainly approached patients for specific studies, such as the biomarker study. There, it is really our energy level and not the patient's that determines whether they sign up or not, because it is 15 ccs of blood. In all fairness, patients are really not aware that this is going on. It is not that we are hiding it from them. It is not the sort of thing that makes the evening news or the regional press unless you screw up.

MR. REYNOLDS: Marc, why don't you give us any comments or summaries from today before you take off.

DR. OVERHAGE: Well, I guess the major thing that strikes me, and I will be interested to see if it bears out over the next week, is what I am hearing about the task we were set to investigate: what is the barrier to using these sorts of routinely collected clinical data for other purposes, and especially for quality? It seems to me that mostly what we have heard today is, A, you can do it, and B, there are no legal barriers to doing it. There are a whole bunch of policy and process uncertainty barriers to moving forward, and there are probably some technical things that would make life easier for everybody. Those may be some of the places where it seems to me we need to keep drilling down, including the issue that we touched on a couple of times in the discussions about commercial use and some of these other things; I think we still have not really wrestled with that too much. The good news is that while we could make things cleaner, easier, or simpler, the legal and regulatory frameworks are permissive anyway for these uses, and we have great examples of successes. We just have to figure out the path in between.

MR. REYNOLDS: Kevin, your next question?

MR. VIGILANTE: Just a couple of quick questions. On risk adjustment, did you develop your own algorithm, or did it come from somewhere else? The second is, is there any interest in extending this to noncardiovascular cases, say, something like the ACS NSQIP program? And three, if you publish so regularly, does anybody say you need to start submitting to the IRB? You say you do not need IRB approval, but we do.

DR. NUGENT: All the IRBs have reviewed it. Every time we sign a contract each year, the IRB steps in.

MR. VIGILANTE: I must have misunderstood when you said that.

DR. NUGENT: Most IRBs have passed it on to QA and have allowed the registry work to be treated as QA. Most of our papers are purely descriptive. Whether or not that is different from research, I do not know, but it is purely describing populations. For the most part, that is what we publish. As for the risk stratification tools, that is why it took forever to get the mitral valve card in: it took forever, even for a region, to have enough numbers to provide a viable and valid tool. Of course, any of these risk stratification tools is used like any other clinical tool, like a hemoglobin or a chest x-ray. It is just a conglomeration of data that spits out a number. It is not prescriptive; it is just descriptive. These are all our own data. They are transferable; we tested that out in a couple of other risk stratification environments back in the 90s.

MR. VIGILANTE: Let me go to another question. What if somebody came to you, say, a stent manufacturer and said, we would like to buy your data to look at drug eluting stents versus bare metal stents? How would that complicate your life? What would your response be? What would the barriers be?

DR. NUGENT: That actually happened. In the mid 90s, the APACHE system wanted to use our data to build the risk stratification tool for their software. After our data was collected and analyzed, we did give them batches of it as part of our early days. We have not been approached since, and that arrangement has gone away. How would it complicate things? I basically do not think it would. Would I have to put a disclaimer up here?

MR. VIGILANTE: In terms of, would you feel, it is sort of uncharted territory, but there is this notion that the use is not really visible to the patient, who does not even know the data is being used. In that case, with your registry data, would you have to disclose it, or--

DR. NUGENT: No, in this case we certainly disclosed it among ourselves. In this case there were no patient identifiers; the data was stripped of any patient identifiers whatsoever, which left us with bland data. You would have to tell me; I do not know. It did not complicate things, and it was part of our environment for a period of years. It helped fund us for a period of years. It did not tie us to any organization. Basically, they bought it as a package--

MR. VIGILANTE: Let me just say something interesting. Obviously, doing what you are doing is not free; it costs something. The fact that you are able to get revenue through to help sustain it is an important consideration when you think about sustaining these kinds of activities on a broader scale.

DR. NUGENT: Right, I think that is valid. I think our annual dues are roughly 60,000 dollars today, just to give you an idea.

MR. REYNOLDS: Justine, last question?

DR. CARR: Thank you very much. I just want to go back to saying that we keep hearing themes repeated, certainly the trust theme, but also, going back to what we heard yesterday, how the continuum from clinical care to quality measurement is blended, because it is all one and the same, as we heard early yesterday. Secondly, there is the continuum between research and quality. You cannot draw the line; it begins with quality, and what they learned became publications, but under quality. So it is hard to dissect at what point something changed. Then, I think what is especially great about this is that it is not just reporting out; there is an absolute feedback loop that resulted in tangible clinical decision support and shared learning. I just think this is the kind of place we would want to be. The challenge for this group, in looking at all of these regulatory things and the potential what-ifs and protecting patients, is not to lose something that seems to work so well and began 20 years ago. That is long before--

DR. NUGENT: I will just comment on the research. When you see this tie between quality improvement and research, I think the peer review gives so much credibility to what is otherwise a very soft science. It is beginning to structure and legitimize that work. I think to somehow divorce the two because research has some sort of a negative connotation would be an incredible step backwards. We need to legitimize and structure and formalize a lot of the quality work that is being done, and the peer review process is the best way I know of to do it.

MR. REYNOLDS: Well to play off your comment, and for the rest of you, we did not know who you were until today either.

DR. NUGENT: You do now.

MR. REYNOLDS: I would say that all three of you have made a difference. We will try to make that same difference for you. You all made a big difference for us. That concludes our panel, but as a committee we are going to spend a few more minutes.

Paul, if you have any comments about what you saw in the last day or so, go ahead.

DR. TANG: I certainly want to echo what has been said about trust. I think that is the primary consideration here. One of the things we need to recognize is that the public does have trust in many of these allowable uses, such as research and policy and blended versions of those. In that sense, I think that is not a problem.

The other thing, I think, we heard from Glen and others, and what Marc just reiterated, is that the barriers are actually quite low to conducting those things that we consider research and quality activities. I think where we get into trouble or concern is in the data world. I think it arises because, in Dr. Nugent's way of saying it, it is ownership. It is the same thing with patients: it is ownership and control, knowing where my data is going to be used and for what purpose. When there is a surprise factor, where something they find out about does not meet their expectations, then I think that is something we need to weigh in on, on the patient's or public's behalf.

I think it really focuses our attention on this particular area: the data, and the extent to which we can delineate the various types of uses, as you were saying, where they drilled down into what they call the commercial uses. I think that would serve us and our constituents as well.

DR. SCANLON: I am very much where Marc is, in that I think we have heard some very good applications, and they are successful applications in the current world. Things are feasible. For me, I would go back to yesterday's old world versus new world in the way that data are gathered and used. I would like to see a lot more new world, which means taking advantage of the Electronic Health Record. We had examples today of demonstrations where people are using Electronic Health Records and then sitting down and manually abstracting information; that is not where we want to be down the road. The question is, in what ways can we facilitate that, and what are the barriers that prevent it?

I think I came into this the least informed of anybody in this group. What I have heard is that there might be barriers, but there certainly is uncertainty, and I think for the nervous among the population, uncertainty often creates inaction or inhibits action. Therefore, if there are things that can be done to improve the certainty, that is something we should think about, while at the same time recognizing how complex a task it was to get to where we are with respect to things like HIPAA. You should not let the perfect be the enemy of the good. You should not be thinking that if I pull on this string here, things are going to be fine, because that unravels much too much. I am reserving a lot of judgment for what I hear as we move along because, again, I feel like I came in here with the least amount of experience. I think the successful applications suggest that things are very feasible, and the question now is what the best strategy is for facilitating improvements; that is where I am reserving judgment.

MR. REYNOLDS: I take a little different view of what you had to say. You say the people have trust. I think a lot of what we heard today is that they did not know. In a lot of these studies, the patients did not necessarily know where their data was going, as was mentioned, but that still might be all right, as long as there are frameworks of controls and frameworks of structure and the PPOs. So, having listened to that, I am not sure I can go as far as saying that the people automatically trust; that is the word I heard. I am not saying that is where you are going. I agree with everybody that there are structures, and there are environments, and there are regulations and rules in place that allow those things to go on. I am just not quite sure I can jump that far.

DR. TANG: You may have misunderstood me, because what I was doing was citing the surveys of consumers that ask, would it be okay if your data went for public health or went for clinical research? Something like 80 percent were saying yes. What I mean is that that is a kind of use they would trust. I am not saying they know where data is going at all, no. That is not their fear. Again, Carol summarized that the fear is it going to these other things that would not fit with their expectation of what happens when they go to their doctor and data is collected about them.

MR. REYNOLDS: Other comments from the committee on anything you have heard or anything else? If not, we will see everybody at 8:30.

[Whereupon at 5:06 pm the subcommittee meeting was adjourned.]