DEPARTMENT OF HEALTH AND HUMAN SERVICES
FOOD AND DRUG ADMINISTRATION
CENTER FOR DRUG EVALUATION AND RESEARCH
ONCOLOGIC DRUGS ADVISORY COMMITTEE
ENDPOINTS IN CLINICAL CANCER TRIALS
AND
ENDPOINTS IN LUNG CANCER CLINICAL TRIALS
Advisors and Consultants Staff Conference Room
PARTICIPANTS
Donna Przepiorka, M.D., Ph.D.
Johanna Clifford, M.S., RN, BSN, Executive Secretary
MEMBERS:
John T. Carpenter, Jr., M.D.
Bruce G. Redman, D.O.
Sarah A. Taylor, M.D.
Otis W. Brawley, M.D.
Stephen L. George, Ph.D.
Bruce D. Cheson, M.D.
Gregory H. Reaman, M.D.
James H. Doroshow, M.D.
Pamela J. Haylock, RN (Consumer Representative)
Alexandra M. Levine, M.D.
Maria Rodriguez, M.D.
CONSULTANTS (VOTING):
Philip Bonomi, M.D.
David Ettinger, M.D.
Thomas Fleming, M.D.
Bruce Johnson, M.D.
David Johnson, M.D.
Scott Saxman, M.D.
PATIENT REPRESENTATIVES (VOTING):
Michael S. Katz
Sheila Ross
ACTING INDUSTRY REPRESENTATIVE (NON-VOTING):
Antonio Grillo-Lopez, M.D.
GUEST SPEAKERS (NON-VOTING):
Paul Bunn, M.D.
Richard Gralla, M.D.
FDA:
Robert Temple, M.D.
Richard Pazdur, M.D. (by telephone)
Martin Cohen, M.D.
Grant Williams, M.D.
Patricia Keegan, M.D.
Ning Li, Ph.D.
C O N T E N T S
Call to Order and Introduction of the Committee, Donna Przepiorka, M.D., Ph.D.
Conflict of Interest Statement, Johanna Clifford, M.S., RN, BSN
Endpoints in Clinical Cancer Trials:
Opening Remarks, Grant Williams, M.D.
General Regulatory Background, Ann Farrell, M.D.
Endpoints for Past Approvals, Ramzi Dagher, M.D.
Selected Issues in Oncology Trial Design, Grant Williams, M.D.
Clarification Questions to Presenters
Introduction of the Questions, Grant Williams, M.D.
Questions for Discussion
Endpoints in Lung Cancer Clinical Trials:
Non-Small Cell Lung Cancer Regulatory Background, Martin Cohen, M.D.
FDA/ASCO Non-Small Cell Lung Cancer Workshop Summary, Paul Bunn, M.D.
Quality of Life and Patient Reported Outcomes as Endpoints in Clinical Cancer Trials, Richard Gralla, M.D.
Clarification Questions to Presenters
Open Public Hearing: Mr. Mark Scott
Questions for Discussion
P R O C E E D I N G S
Call to Order
DR. PRZEPIORKA: Good morning to all. I would like to call the meeting to
order. This is a meeting that is covering no drug evaluations but, in fact,
methods for drug evaluations. I think it is a good time for this talk
because there are very new types of drugs coming out for which these issues may
be very germane.
I would like to start the meeting by an introduction of the
committee members, if we could start with Dr. Grillo-Lopez and just go
around. Let us know who you are and where you are from.
DR. GRILLO-LOPEZ: My name is Antonio Grillo-Lopez. This is my first time
sitting around this table. I am a hematologist/oncologist. I spent
half of my career in industry and half in academia so I am hoping to make some
positive contributions here. Thank you.
DR. GEORGE: Stephen George, from Duke University.
DR. CHESON: Bruce Cheson, Georgetown University, Lombardi Comprehensive Cancer Center.
DR. DOROSHOW: Jim Doroshow, City of Hope Comprehensive Cancer Center.
DR. RODRIGUEZ: Maria Rodriguez, M.D. Anderson Cancer Center in Houston, Texas.
DR. BRAWLEY: Otis Brawley, Emory University, Winship Cancer Institute.
MR. KATZ: Michael Katz. I am a 13-year myeloma survivor.
DR. FLEMING: Thomas Fleming, University of Washington.
DR. LEVINE: Alexandra Levine, University of Southern California, Norris Cancer Center.
DR. REAMAN: Gregory Reaman, Children's Hospital and George Washington University.
DR. PRZEPIORKA: Donna Przepiorka, University of Tennessee Cancer Institute.
MS. CLIFFORD: Johanna Clifford, FDA, Executive Secretary to this meeting.
MS. HAYLOCK: Pamela Haylock, oncology nurse and doctoral student in Galveston, Texas.
DR. CARPENTER: John Carpenter, medical oncologist, University of Alabama at Birmingham.
DR. REDMAN: Bruce Redman, University of Michigan Comprehensive Cancer Center.
DR. TAYLOR: Sarah Taylor, University of Kansas Medical Center.
DR. LI: Ning Li, FDA Biometrics.
DR. WILLIAMS: Grant Williams, Deputy Director, Oncology Drug Products.
DR. PRZEPIORKA: Thank you to all.
DR. WILLIAMS: And on the phone, of course, is Dr. Pazdur.
DR. PAZDUR: Hi. I hope you don't hear the dog barking.
DR. WILLIAMS: I was going to say that this was the first time that Dr. Pazdur
has ever been speechless--
DR. PAZDUR: And you love that, Grant!
[Laughter]
DR. PRZEPIORKA: Welcome and, Dr. Pazdur, thank you for joining us. We would
like to move now to the reading of the conflict of interest statement.
Conflict of Interest Statement
MS. CLIFFORD: The following announcement addresses the issue of conflict of
interest with respect to this meeting and is made a part of the record to
preclude even the appearance of such at this meeting.
Based on the agenda, it has been determined that the topics
of today's meeting are issues of broad applicability and there are no products
being approved at this meeting. Unlike
issues before a committee in which a particular product is discussed, issues of
broader applicability involve many industrial sponsors and academic
institutions.
All special government employees have been screened for
their financial interests as they may apply to the general topics at hand. To determine if any conflict of interest
existed, the agency has reviewed the agenda and all relevant financial
interests reported by the meeting participants.
The Food and Drug Administration has granted general matters waivers to
the special government employees participating in this meeting who require a waiver
under Title XVIII, United States Code Section 208. A copy of the waiver statements may be
obtained by submitting a written request to the agency's Freedom of Information
Office, Room 12A-30 of the Parklawn Building.
Because general topics impact so many entities it is not
prudent to recite all potential conflicts of interest as they apply to each
member, consultant and guest speaker.
FDA acknowledges that there may be potential conflicts of interest but,
because of the general nature of the discussion before the committee, these
potential conflicts are mitigated.
With respect to the FDA's invited industry representative,
we would like to disclose that Dr. Antonio Grillo-Lopez is participating in
this meeting as the acting industry representative, acting on behalf of
regulated industry. Dr. Grillo-Lopez is
employed by Neoplastic and Autoimmune Disease Research.
In the event that the discussions involve any other
products of firms not already on the agenda for which FDA participants have a
financial interest, those participants' involvement and their exclusion will be
noted for the record. With respect to
all other participants, we ask in the interest of fairness that they address
any current or previous financial involvement with any firm whose product they
may wish to comment upon. Thank you.
DR. PRZEPIORKA: Thank you. The first item on the agenda then is the opening
remarks. Dr. Pazdur, will you be making those opening remarks?
DR. PAZDUR: Why don't we have Dr. Williams do that?
DR. PRZEPIORKA: Dr. Williams?
Opening Remarks
DR. WILLIAMS: Just a few remarks. First of all, we are just
very appreciative of all of your presence here today to give us advice. I think we are actually pretty excited about
the whole process of getting endpoints out and discussed. For us it is a very difficult problem. We have multiple end of Phase II meetings,
multiple different clinical settings and trying to be consistent with the endpoints
that we require for drug approval across these many settings is quite a
challenge.
This reflects a process that we started about a year ago of
looking into endpoints, or even before that internally, and our plan in this
process is to have a series of workshops, a series of ODAC meetings on specific
clinical settings. We have engaged the
National Cancer Institute, AACR and ASCO to help us with picking experts in the
field to do workshops on very specific endpoint settings and we plan to follow
these with ODAC meetings, and this is the first after these workshops. We had a lung cancer workshop in I think
March or April and then this afternoon we plan to have discussions on lung
cancer endpoints.
As we thought about moving toward creating a guideline or
guidances we also considered that we should have some sort of a broad discussion
to sort of set the foundation, and then also to lay the foundation for a
background section of the guidance. So,
that is what we are trying to do here this morning. This afternoon we would like some voting on
some specific questions. As we go along
we will try to determine those that seem appropriate for voting.
But this morning it is more of a broad discussion that we
are looking for. What are those
principles that we should be evaluating as we move forward to evaluate
endpoints? What are those value
judgments globally so that we can then apply them to specific instances,
specific clinical settings?
So, we look forward to the discussion today. I think it is going to be very interesting
and fun. The first talk will be by Dr.
Farrell, who will talk about regulatory considerations with endpoints in
oncology.
General Regulatory Background
DR. FARRELL: Good morning, everyone.
[Slide]
I am here to discuss regulatory considerations for endpoints
used for approval. Requirements for
marketing approval have been codified and further defined in response to
perceived need. Prior to 1938 there were
no requirements for marketing approval.
As a result of the sulfonamide tragedy, the Food, Drug and Cosmetic Act
required manufacturers to provide evidence that their product was safe for
marketing.
In 1962 Congress, concerned about misleading and
unsupported claims being made about marketed products, amended the FD&C Act to
require that manufacturers provide evidence that the product was
effective. This was to demonstrate
substantial evidence of effectiveness.
In practice the agency has understood that adequate and
well-controlled investigations or substantial evidence of effectiveness means that
efficacy must be demonstrated in at least two adequate and well-controlled
trials.
In 1997 Congress passed the Food and Drug Administration Modernization Act (FDAMA),
which stated that the requirement for substantial evidence of effectiveness
could constitute one adequate and well-controlled trial plus supportive
evidence.
[Slide]
There are two basic mechanisms for approval, regular and
accelerated approval. The requirement
for adequate and well-controlled studies is the same for both mechanisms. The regular approval mechanism provides for
approval based on clinical benefit or on an established surrogate for clinical
benefit.
The clinical benefit endpoint is usually an endpoint
thought of as reflecting quality or quantity of life. In oncology, examples of these endpoints
include survival or improvement in a disease-related symptom.
Accelerated approval is a mechanism for those products
designed to be used for the treatment of serious and life-threatening
illness. The mechanism provides for
approval based on a surrogate that is deemed reasonably likely to predict
clinical benefit. The new therapy must
provide an advantage over available therapy, and that can be the ability to
treat patients who are unresponsive to or intolerant of available therapy, or
it can be a therapy that provides an improved patient response over
available therapy.
[Slide]
The accelerated approval mechanism, as I said, is based on
a surrogate endpoint believed to be reasonably likely to predict clinical
benefit or it can be based on an effect on a clinical endpoint other than
survival or irreversible morbidity. In
any case, post-marketing studies are required to determine clinical benefit.
[Slide]
The evidence for accelerated approval should be substantial
evidence from well-controlled clinical trials regarding a surrogate endpoint,
not borderline evidence regarding a clinical benefit endpoint in a poorly
conducted trial.
[Slide]
As I stated before, ideally the substantial evidence should
come from more than one adequate and well-controlled investigation. The passage of FDAMA allows us to consider
the evidence from one adequate and well-controlled trial plus other supportive
evidence. The effectiveness guidance
discusses supportive evidence and the characteristics of the single trial.
[Slide]
This slide outlines examples of situations where
extrapolation from existing studies combined with a single clinical trial could
support a new indication or new drug application. These include pediatric uses,
bioequivalence, modified-release dosage forms, and different doses or regimens.
[Slide]
The effectiveness guidance lists the characteristics of a
single trial supporting approval. In
general these trials should be large, multi-center. The primary results should show consistency
across study subsets. This could be
thought of as various age categories.
The study should be large enough so it could be considered to have
multiple studies in a single study, and that could be done through a factorial
design. And, the results from secondary
endpoints, if positive, could also be supportive for the use of that single
trial. The primary endpoints should show
statistically persuasive results.
[Slide]
In oncology we have accepted oncology supplemental
applications based on a single trial supported by data in a different stage of
disease. The FDA has approved cancer
drug supplements in an NDA in an adjuvant setting when there has been a single
trial plus supportive evidence in a metastatic setting. One example of this would be Arimidex for
the adjuvant treatment of postmenopausal women. We
have also accepted applications in first-line settings with one trial when
there has been supportive evidence based on approval in a refractory
setting. An example of that is Gleevec.
In addition, we have accepted applications for the use of
products in combination therapy when there has been an approval in a
monotherapy setting. An example of that
would be Xeloda in combination with Taxotere when Xeloda had already received
approval as monotherapy in the treatment of breast cancer.
Theoretically, we could accept an application and approve
it based on a single trial in a second cancer if there was already an approval
in a closely related cancer.
[Slide]
In summary, the agency has some flexibility in judging what
constitutes adequate information to meet its requirements of substantial
evidence from adequate and well-controlled investigations. However, all products must demonstrate that
they are both safe and effective.
Because cancer is a serious and life-threatening illness we have
actually two mechanisms for approval, regular and accelerated approval.
Accelerated approval can be based on a surrogate endpoint
with planned completion of a post-marketing study to verify the clinical
benefit. Approval can also be based on
one trial plus supportive evidence.
Endpoints differ for different approval mechanisms. Drs. Dagher and Williams will discuss this
issue in greater detail. Thank you.
DR. PRZEPIORKA: Thank you very much, Dr. Farrell. Next, Dr. Dagher will be
talking about endpoints for past approvals.
Endpoints for Past Approvals
DR. DAGHER: Good morning.
[Slide]
In the next few minutes I would like to summarize endpoints
used for approval of oncology drugs.
[Slide]
This slide provides a summary of endpoints commonly used in
the oncology clinical trial setting.
Survival has been considered the gold standard in many settings and
provides an unambiguous endpoint that is easily measured. Time to progression may provide several
advantages as well as challenges, which Dr. Grant Williams will discuss later
this morning. Disease-free survival is
an endpoint utilized in the adjuvant setting.
Objective tumor response is an endpoint that measures an effect largely
related to treatment, independent of the natural history of the disease. Tumor-related symptoms and patient-reported
outcomes are quite relevant from the patient's perspective.
[Slide]
For the purposes of regular approval we have considered
improvements in survival or tumor-related symptoms as evidence of clinical
benefit. In the adjuvant breast cancer
setting we have also considered disease-free survival as evidence of clinical
benefit.
[Slide]
In some settings, where tumor shrinkage has been associated
with symptom benefit or survival, we have considered objective tumor response
as an endpoint supporting regular approval.
In leukemias and some solid tumors, such as testicular cancer, durable
or complete responses have been utilized for this purpose. In the case of hormonal therapies for breast
cancer partial responses have been considered evidence of clinical benefit.
[Slide]
A summary of endpoints and approvals from our Division,
published in The Journal of Clinical Oncology, reveals that more than
half of the approvals have been based on endpoints other than survival. This applies to all approvals as well as
those excluding accelerated approval, a setting in which response rates are
often utilized.
[Slide]
The following table, adapted from this publication,
illustrates the diversity of endpoints used.
For approvals between 1990 and the end of 2002 in the Division of
Oncology Drug Products survival was used in 18 of 55 approvals. Response rate, either alone or in conjunction
with improvements in tumor symptoms or time to progression, was utilized in 26
approvals. As discussed, improvement in
tumor-related symptoms has been used as a basis for approval. Disease-free survival or other endpoints were
used infrequently.
[Slide]
The first two bullets of this slide provide examples where
improvement in tumor-related symptoms was the basis for regular approval. In patients with advanced hormone refractory
prostate cancer a pain scale was utilized to evaluate mitoxantrone plus
prednisone versus prednisone alone.
Photofrin was evaluated for obstructive esophageal
lesions. In this case a dysphagia scale
was used with supportive evidence for objective tumor response.
In the case of several bisphosphonates approval was based
on evaluation of a number of skeletal related events, including pathologic
fracture, radiation to bone, surgery to bone or spinal cord compression. In the case of prostate cancer, pain
requiring a change in anti-neoplastic therapy was also a component of the
evaluation.
[Slide]
As Dr. Farrell mentioned, accelerated approval is based on
a surrogate endpoint reasonably likely to predict clinical benefit. In our experience, most of the accelerated
approval indications were based on an evaluation of objective tumor response in
studies without an active comparator, that is, single-arm studies or those
comparing two dose levels of the drug in question. However, randomized trials were conducted in
some settings with an active or placebo comparator, allowing for evaluation of
time to event endpoints such as disease-free survival or time to
progression. Some examples are shown
here.
[Slide]
As was also discussed, accelerated approval requires
further evaluation of the drug to confirm clinical benefit. Therefore, two strategies have emerged for
approaching accelerated approval and subsequent confirmatory evaluation of clinical
benefit.
With the first strategy accelerated approval is based on
response rate evaluated in single-arm studies of refractory patients and
confirmatory studies are conducted in related populations such as those with
less refractory disease. This approach
has the potential advantage of allowing rapid completion of single-arm studies.
[Slide]
However, accelerated approval may influence the ability to
enroll patients for confirmatory studies.
Furthermore, it has become more and more challenging to evaluate
marginal benefits in more and more refractory populations, and findings in
refractory populations may not be relevant to other populations which may
benefit from the drug. In fact,
evaluation in refractory populations first may lead us to miss an active
drug. The single-arm component of the
strategy has its own limitations: an inability to evaluate time to event
endpoints in a non-randomized setting, and difficulty in completely assessing
the toxicity profile.
[Slide]
The second strategy for accelerated approval depends on
evaluation of a surrogate endpoint at an interim analysis of a randomized
study, with subsequent evaluation of clinical benefit in the same trial using a
final analysis. This approach allows for
evaluation of the same population for accelerated approval and regular approval
and facilitates completion of a confirmatory study. The randomized setting allows comparison to
available therapy and a thorough evaluation of the toxicity profile.
[Slide]
However, this approach may require more time and patients
than single-arm studies and accelerated approval could still influence
completion of the study.
[Slide]
In summary, improvements in survival or tumor-related
symptoms have been considered evidence of clinical benefit. In some settings durable, complete or partial
responses have been considered endpoints supporting regular approval. Finally, objective tumor responses in
single-arm trials have been the basis of approval in most cases of accelerated
approval. Thank you.
DR. PRZEPIORKA: Thank you, Dr. Dagher. We are going to hold questions until
the end of the presentations and Dr. Williams will now talk to us about
selected issues in oncology trial designs that are pertinent to this morning's
topic.
Selected Issues in Oncology Trial Design
DR. WILLIAMS: Well, thank you, Dr. Przepiorka.
[Slide]
Members of the committee, ladies and gentlemen, what I
would like to do is to first review the selected issues in oncology trial
design before we go to discussing specific problems and your recommendations
for our further deliberations.
[Slide]
Here is the outline of my presentation. I will begin with several difficulties we
face in oncology that are well-known to all of you, and I will briefly discuss
the non-inferiority trial design and the difficulties we face with this
approach. Finally, I will discuss time
to progression, expanding upon some of the regulatory issues presented by Dr.
Farrell and Dr. Dagher, especially the issues relating to the meaning of clinical
benefit and also surrogates for clinical benefit. Then I will discuss the pros and the cons of
TTP as an approval endpoint.
[Slide]
During our end of Phase II meetings with sponsors we often
ask whether trials can be blinded and we are usually told they cannot. These are the reasons we are given.
First, there are toxic side effects that are said to unmask both the
physician and the patient. Second, the
investigators adjust doses based on drug-specific toxicities and the
investigators believe they need to know drug assignment to do this safely. These seem to be very difficult problems,
although I think maybe the first point might bear some further discussion--has
anyone actually studied the degree of unmasking by side effects of oncology
drugs? As we move to new potentially
targeted therapies and to oral therapies we should consider whether we can
blind more trials.
[Slide]
Placebos are widely used in many areas of drug
development. The use of the placebo is
seldom feasible in evaluation of advanced cancer. There are some cancer settings where placebo
use may be possible. Blinded,
placebo-controlled studies might be performed in some early disease settings
where no effective treatments exist. In
advanced settings the so-called add-on design can allow placebo use, comparing
drug A plus placebo to drug A plus drug B.
In some settings it may be reasonable to continue placebo or drug B
even beyond progression. Examples of
this were the bisphosphonate trials, which assessed effects on bone morbidity
even after chemotherapy was changed.
[Slide]
So, the unfortunate result of not having blinded,
placebo-controlled studies is that we must use controls which are active. If we use a superiority trial design the new
drug must beat the active drug, or we can use an add-on design. Not surprisingly, many trials for drug
approval are based on drug combinations and add-on designs. Certainly, this can lead to toxic combinations.
The other possibility is to do non-inferiority
studies. As I will discuss, these tend
to be very large trials and the quality of historical data in oncology is
frequently insufficient to support this approach. Again unfortunately, in this setting where
blinded, placebo-controlled trials may not be feasible it is very difficult to
demonstrate that new drugs are less toxic but have similar efficacy to an
approved drug.
[Slide]
The frequent use of drug combinations in oncology also
presents regulatory challenges. Since
marketing approval is for a single drug rather than a combination of drugs,
trials supporting regulatory approval need to isolate the effectiveness of the
proposed agent. Evidence is needed
showing not only the effectiveness of the combination but also establishing that
there is a contribution of the new drug to that regimen.
[Slide]
Now I would like to turn to the topic of
non-inferiority. Obviously, I am not a
statistician but I will try to share with you what I understand about it. The reason we are not having statisticians do
this discussion is because we don't want to be at this a whole day on
non-inferiority.
[Laughter]
[Slide]
So, here is the way I see it. First I want to review some non-equivalent
words. I don't know if anybody caught the pun in the title here. First of all, we love superiority. We love to hear the word superiority; we love
superiority trials. Equivalence is a
word you should never say to a statistician, but I was corrected on this, it is
all right to say it to a Bayesian.
[Laughter]
Equivalence is something that can never be proven. Because we cannot show equivalence we rule
out inferiority by a prespecified margin.
We call this demonstration of non-inferiority. A very important regulatory concept is that
proof of non-inferiority does not necessarily prove efficacy, and we will
discuss this a bit further. I think the
use of these words in our oncology journals can create serious
misconceptions. A common problem is the
assumption in oncology journals that no statistical difference is the same as
equivalence or non-inferiority.
[Slide]
This slide lists the steps needed to perform a
non-inferiority analysis. Just the
number of steps should suggest the complexity of this process and the potential
for error. In this example we are
demonstrating that drug B is effective.
In order to do this we refer to the effect of drug A observed
historically in randomized studies. I
think I have these steps out of order; I will stick to the third one.
We then prospectively identify a margin that includes an
acceptable fraction of drug A's efficacy.
We randomize drug A versus drug B.
We prove that drug B is no worse than drug A by that margin. Probably the step that is most often ignored
is that we determine that the constancy assumption is valid. Invalid assumptions at any stage of this
process could lead to a false result and this is why non-inferiority studies
are not FDA's favorite trial design.
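The steps just listed can be made concrete with a minimal sketch of a fixed-margin non-inferiority check on the hazard ratio scale. It is an illustration only and was not part of the presentation: the function names, the choice to retain half of drug A's effect, and every number below are hypothetical assumptions, and the sketch takes the constancy assumption for granted rather than testing it.

```python
import math

def noninferiority_margin(hist_hr_upper, retain_fraction):
    """Largest acceptable hazard ratio (drug B vs. drug A).

    hist_hr_upper: upper confidence bound of the historical hazard ratio
        for drug A vs. its control; a conservative estimate of A's effect.
    retain_fraction: fraction of A's effect that B must preserve (0 to 1).
    """
    if not 0.0 < hist_hr_upper < 1.0:
        # No reliably documented historical effect: no margin can be set.
        raise ValueError("historical effect of drug A is not established")
    if not 0.0 <= retain_fraction <= 1.0:
        raise ValueError("retain_fraction must lie between 0 and 1")
    # Effect of A on the log scale (control vs. A), a positive number.
    log_effect = -math.log(hist_hr_upper)
    # B may give up at most (1 - retain_fraction) of that effect.
    return math.exp((1.0 - retain_fraction) * log_effect)

def is_noninferior(hr_b_vs_a, se_log_hr, margin, z=1.96):
    """Return (verdict, upper bound). B is called non-inferior when the
    upper two-sided 95% confidence bound of its hazard ratio is below
    the margin."""
    upper = math.exp(math.log(hr_b_vs_a) + z * se_log_hr)
    return upper < margin, upper

# Hypothetical numbers, for illustration only.
margin = noninferiority_margin(hist_hr_upper=0.80, retain_fraction=0.5)
verdict, upper = is_noninferior(hr_b_vs_a=1.00, se_log_hr=0.05, margin=margin)
print(f"margin={margin:.3f} upper bound={upper:.3f} non-inferior={verdict}")
```

Using the upper confidence bound of the historical hazard ratio, rather than the point estimate, is one conservative convention among several; a weaker historical effect gives a margin closer to 1 and a harder trial.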
[Slide]
The important constancy assumption is that the historically
observed drug effect of the active control drug also exists in the current
non-inferiority trial and in its population.
The problem is that conditions are never the same in historical trials
and a current trial. Differences include
different populations; differences in supportive care; differences in
availability of new drugs that can be taken after failing, including the
possibility of crossover. Finally, the
designs can be different with different frequency of follow-up. So, any of these could change the sensitivity
of the trial to detect the treatment effect.
The serious consequence of violating that constancy assumption could be
the approval of what has been termed a toxic placebo.
[Slide]
This is another property of non-inferiority trials that Dr.
Temple has noted: sloppiness obscures the observation of differences. For superiority trial designs sloppiness
obscures efficacy but for non-inferiority trials sloppiness could lead to a
false efficacy claim. Again, this is why
we like superiority trials. I think that
is a common theme you will be hearing here perhaps.
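One way to see the point about sloppiness is a small arithmetic sketch, offered as an illustration rather than as anything shown at the meeting. It assumes a simple hypothetical model in which the same fraction of patients in each arm effectively receives the other arm's treatment; the response rates and the margin are invented numbers.

```python
def observed_difference(rate_a, rate_b, crossover):
    """Observed difference in response rates (arm A minus arm B) when a
    fraction `crossover` of each arm effectively gets the other treatment."""
    if not 0.0 <= crossover <= 0.5:
        raise ValueError("crossover must lie between 0 and 0.5")
    obs_a = (1.0 - crossover) * rate_a + crossover * rate_b
    obs_b = (1.0 - crossover) * rate_b + crossover * rate_a
    # Algebraically this equals (1 - 2 * crossover) * (rate_a - rate_b).
    return obs_a - obs_b

# Hypothetical numbers: drug B is truly worse than drug A by 0.15,
# which exceeds an assumed non-inferiority margin of 0.10.
MARGIN = 0.10
clean = observed_difference(rate_a=0.40, rate_b=0.25, crossover=0.0)
sloppy = observed_difference(rate_a=0.40, rate_b=0.25, crossover=0.2)
print(f"clean trial:  difference={clean:.2f} within margin={clean < MARGIN}")
print(f"sloppy trial: difference={sloppy:.2f} within margin={sloppy < MARGIN}")
```

Under this model the observed difference always shrinks toward zero. In a superiority trial that works against the sponsor, but in a non-inferiority trial it can move a truly inferior drug inside the margin.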
[Slide]
A critical problem in doing non-inferiority studies in
oncology is the paucity of studies that are available to determine the
historical effect of the active control drug.
We basically strike out at the first step of this process. What we really need is multiple trials
showing a consistent, large effect and we need to perform a meta-analysis of
those trials which provides us with a dependable effect precisely estimated.
The real situation in oncology, almost without exception,
is that we have one or two rather small trials with small effects and with
marginal statistical significance. This
leads to small historically documented effect sizes; small margins; and very
large non-inferiority studies. The
process becomes even more complicated when we consider drug combinations and
the contribution of individual drugs to historical effect.
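The link between small margins and very large studies can be sketched with a standard event-count approximation for time-to-event trials (a Schoenfeld-type formula). This is an illustration under stated assumptions, not a calculation from the presentation: it assumes 1:1 randomization, a true hazard ratio of 1 between drug B and drug A, a one-sided alpha of 0.025 and 80 percent power. The function name and the margins are hypothetical.

```python
import math
from statistics import NormalDist

def events_required(margin, alpha=0.025, power=0.80):
    """Approximate number of events needed to rule out a hazard ratio
    of `margin` (drug B vs. drug A) when the drugs are truly equal."""
    if margin <= 1.0:
        raise ValueError("margin must be a hazard ratio greater than 1")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # one-sided type I error
    z_beta = NormalDist().inv_cdf(power)
    # The factor 4 comes from equal allocation: 1 / (0.5 * 0.5).
    return math.ceil(4.0 * (z_alpha + z_beta) ** 2 / math.log(margin) ** 2)

# Hypothetical margins, from generous to tight.
for margin in (1.25, 1.15, 1.10, 1.05):
    print(f"margin {margin:.2f}: about {events_required(margin)} events")
```

Because the required number of events grows with the inverse square of the log margin, halving the log margin roughly quadruples the trial.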
The reason I am presenting this is that I think this is
such a complex topic and people don't understand why you don't do a
non-inferiority study. I don't think you
can say it without trying to go through all these steps, but it is basically
just not possible in many of our settings at least using the primary endpoints.
[Slide]
Now I would like to turn to endpoints and surrogates. Dr. Farrell and Dr. Dagher provided an
overall review of regulations on oncology endpoints. So, I want to briefly review the history of
regulatory standards for efficacy endpoints.
The 1962 amendments to the FD&C Act simply stated that
a drug must be shown to have the effect claimed in the label. However, subsequent judicial decisions
established that effectiveness meant that the drug must have clinical
meaning. In the 1970s marketing
applications for cancer drugs were approved primarily based on objective
response rates and on rather minimal activity we would say today.
However, based on advice from ODAC in the late '70s and
early '80s, FDA determined that the response rate should generally not be the
sole basis for drug approval because the possible benefits associated with tumor
shrinkage did not necessarily justify treatment with toxic anti-cancer
drugs. Acceptable endpoints for drug
approval were improvement in survival or improvement in physical functioning or
relief of pain.
As Dr. Dagher discussed, in the 1990s FDA struggled with
the difficulty of measuring patient benefit and in some settings found various
surrogates to be adequate in specific clinical situations.
[Slide]
There are various definitions for a surrogate. In this context we will use the definition
from Dr. Temple. A surrogate endpoint of
a clinical trial is a laboratory measurement or a physical sign used as a
substitute for a clinically meaningful endpoint that measures directly how a
patient feels, functions or survives.
Changes induced by a therapy on a surrogate endpoint are expected to
reflect changes in the clinically meaningful endpoint.
[Slide]
In various settings for many years FDA has based regular
drug approval on surrogate endpoints which were judged by FDA and experts in
the field to be reliable indicators of clinical benefit. Examples outside the field of oncology
included blood pressure, blood sugar and blood cholesterol.
[Slide]
It may be useful to review where we have used the term
surrogate in oncology. In accelerated
approval the surrogate need only be reasonably likely to predict benefit. Obviously, this is a lower standard than the
usual use of the word surrogate.
We have had discussions with statisticians, such as Dr. Fleming,
regarding validated surrogates, for which we expect to prove quantitatively the
relationship between the surrogate and the established endpoint. Unfortunately, in oncology we have very few
settings where we quantitatively validate the surrogate. It would be easier to validate surrogates if
we had more effective drugs with large effects to compare surrogate and
clinical benefit. Finally, we have
surrogates that have been used to support regular approval of cancer drugs in
very specific settings, usually based on clinical inference and judgment that
these surrogates relate to clinical benefit.
[Slide]
At the recent colon cancer workshop Dr. Fleming reviewed
Prentice's criteria for strictly validated surrogates. The surrogate endpoint must be correlated
with the clinical outcome. The surrogate
must fully capture the net effect of the treatment on that clinical outcome.
[Slide]
In the clinical setting this would involve meta-analyses of
clinical trials and a comprehensive understanding of the disease and the
intended and unintended effects of drugs.
As I stated, where possible this is the kind of evidence we would like
for a surrogate endpoint. The question
for us today is what should we do with endpoints we have today? What can we use for approval endpoints today
and in what settings can we use them?
And, what can we do to gather more data for the future?
[Slide]
As we looked at TTP to ask whether it is an acceptable
surrogate in various settings, I propose that the question we should ask should
not be whether an improvement in TTP has clinical meaning. I suggest that nobody in the field of
oncology really doubts that it is good to delay the growth of cancer. That is not really the question that we need
to answer.
[Slide]
The real question is whether you can reliably measure TTP
and, if you can, what does it mean? How
much delay in progression is worth how much toxicity? With survival we seldom quibble about the
size of the effect. Given the low statistical
power of our studies, a statistically convincing survival benefit is generally
considered to be worth the toxicity of treatment. However, can we say the same for the delay in
TTP? That is, when progression is
determined by only images on a scan. So,
the real question is how do we trade off a TTP benefit compared to drug
toxicity?
Another question is the relative value of treatments
evaluated by different endpoints. When a
well-established survival benefit exists for an approved drug what is the
meaning of the claimed TTP effect for an investigational drug? Although two treatments are not required to
have equal efficacy this is, nonetheless, an important consideration for us.
[Slide]
FDA's approach to endpoints for hormonal treatment of
cancer illustrates how clinical judgment has played a role in the acceptance of
surrogates for regular drug approval. For
many years these drugs have been approved primarily based on comparison of
response rates with two reasonably large, randomized, controlled studies. TTP and survival were assessed as secondary
endpoints. Many hormonal drugs have been
approved with this approach. I think
that everybody is satisfied that we approved effective drugs through this
approach.
So, what allowed this approach? These are what I believe are the critical
factors. We have a long experience with
tamoxifen and, despite little data with regard to a survival or TTP benefit,
tamoxifen was widely observed to provide benefit to patients. The main indicator of activity was response
rate. Given the non-toxic nature of the
drugs and similar mechanisms of action, response rates seemed a reliable
indicator of clinical benefit in this setting.
[Slide]
Four years ago at ODAC we discussed TTP as an approval
endpoint for first-line cytotoxic treatment of breast cancer. The committee was not supportive of TTP for
regular approval but did suggest its use for accelerated approval. Prominent in the ODAC deliberations was
whether the standard treatment doxorubicin produces a survival effect and, if
so, what size is that benefit. Committee
members noted that current treatments only produce small TTP effects and they
questioned whether there was or was not a correlation between TTP and survival,
whether it was reliable. As I note in
later discussion, I think this question needs to be carefully evaluated because
of the under-powered nature of most of our studies.
Questions were also raised about the reliability of TTP
measurement and also a claim that in order to measure TTP accurately frequent
scans would be needed. So, the ODAC
criticisms were varied and they addressed the data available at the time in the
specific cancer setting.
[Slide]
So, I would like to take a closer look at TTP. First of all, what is TTP? The basic definition is time from
randomization to documented progression.
However, there are very many different definitions of TTP with a lot of
different details, such as how do you handle missing data and how to
censor. If TTP is to be used as an
important endpoint there should be careful agreement between FDA and the
sponsor on the protocol, case report form and the statistical analysis
plan. Difficult issues include how to
follow the patient for new lesions and how to define and validate progression
of non-measurable disease.
[Slide]
I want to mention three TTP-like endpoints that we
frequently encounter, time to progression, progression-free survival and time
to treatment failure. For TTP the
measured event is progression. TTP may
be thought of as a measurement of anti-tumor activity. Patients going off study for toxicity and
non-tumor deaths are not counted as events.
Note that for non-tumor deaths censoring occurs at the last visit where
TTP was evaluated. This censoring makes
the assumption there is no relation between death and progression, an
assumption that might be questioned.
[Slide]
With progression-free survival all deaths are counted as
progression events. Dr. Fleming
suggested at the recent colon cancer workshop if TTP is being considered as a
clinical benefit surrogate, perhaps the deaths should be counted. FDA has often counseled sponsors to keep TTP
and death separate, however; that is, to measure TTP without the deaths and to
measure deaths in the survival analysis.
The main concern with including deaths is that patients lost to
follow-up will subsequently be counted as progression events at the time of death. In such a scenario sloppy progression
follow-up leads to longer progression times, and asymmetric follow-up of such
cases could lead to a false result. If
deaths are included in the analysis, then careful symmetric follow-up is
needed. Perhaps we need analysis rules
to deal with patients who have inadequate follow-up.
[Slide]
Time to treatment failure is a composite endpoint measuring
time from randomization to discontinuation of treatment for any reason,
including progression, treatment toxicity and death. Because it combines elements of safety and
efficacy, TTF is not an acceptable endpoint for documenting efficacy. Time to treatment failure has not supported
drug approval.
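The three endpoint definitions above differ only in which occurrences count as events and where a patient is censored. The following is a minimal sketch of those rules, not an FDA-specified algorithm: the `Patient` record, its field names, and the choice to censor at the last tumor assessment are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Patient:
    # Hypothetical record; all times are months from randomization.
    # None means the occurrence was not observed.
    last_tumor_assessment: float           # last visit where progression was evaluated
    progression: Optional[float] = None    # documented progression
    death: Optional[float] = None          # death from any cause
    off_treatment: Optional[float] = None  # treatment stopped for any reason

def _first_event(times, censor_at: float) -> Tuple[float, bool]:
    """Return (time, is_event): earliest observed event, else censor."""
    observed = [t for t in times if t is not None]
    if observed:
        return min(observed), True
    return censor_at, False

def ttp(p: Patient) -> Tuple[float, bool]:
    # Only progression is an event; deaths without progression are
    # censored at the last visit where progression was evaluated.
    return _first_event([p.progression], p.last_tumor_assessment)

def pfs(p: Patient) -> Tuple[float, bool]:
    # Progression or death, whichever comes first, is an event.
    return _first_event([p.progression, p.death], p.last_tumor_assessment)

def ttf(p: Patient) -> Tuple[float, bool]:
    # Progression, death, or discontinuation for any reason is an event.
    return _first_event([p.progression, p.death, p.off_treatment],
                        p.last_tumor_assessment)

# A patient last scanned at month 4, then lost to follow-up, who dies at month 30.
lost = Patient(last_tumor_assessment=4.0, death=30.0)
print("TTP:", ttp(lost))
print("PFS:", pfs(lost))
```

For this hypothetical patient, TTP censors at month 4 while progression-free survival records an event at month 30, which is the situation described above where poor follow-up is credited as a long progression-free interval.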
[Slide]
Let's look more closely at TTP as a potential regulatory endpoint. Here are some of the positive qualities of
TTP. TTP is measured in all patients and
might, therefore, be a better measure of overall benefit than response. TTP does not require massive tumor shrinkage
and might be a better measure for cytostatic agents.
From a practical standpoint, progression is often the
reason oncologists change therapy.
Therefore, an advantage of TTP is that TTP is measured before patients
cross over to other therapies. This is
of growing importance as we develop more effective drugs. Moreover, because progression often occurs
months to years before death much smaller studies may be needed to study TTP
than survival and this can vary dramatically with the different diseases.
Finally, some would argue that delaying progression has
face validity as an indicator of benefit.
The benefit seems obvious because progression is a necessary step
between cancer growth, patient morbidity and death.
[Slide]
But here are some problems with TTP. It has been said that it may not correlate
with survival. It is an indirect measure
of clinical benefit, sometimes reflecting minor changes on a radiograph. Therefore, small differences in TTP may be of
unclear clinical value, especially when one is evaluating toxic treatments.
There are obvious concerns relating to ascertainment bias
in unblinded trials, and there are concerns regarding the reliability of a
small effect with the kind of trials we have today with monitoring schedules
which may vary from patient to patient.
Finally, careful assessment of progression at frequent intervals is
labor intensive and expensive.
[Slide]
We encounter difficulties in determining the exact
relationship between TTP and survival.
First of all, there are many different cancer settings so the database
for any one setting may not be large and it isn't clear when you can combine
data across different cancers. Secondly,
unfortunately, we don't have many treatments that produce large survival
effects.
A fundamental difficulty is that there is always more statistical
power for the analysis of TTP than survival.
On this basis alone even if TTP were a perfect surrogate one would
expect some studies to show a statistically positive TTP benefit without a
statistically positive survival benefit.
Oncology studies are virtually never large enough to rule out a
meaningful survival effect and, thus, individually cannot establish a lack of
correlation.
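The power argument can be made concrete with a rough calculation. This sketch uses the Schoenfeld approximation for a two-sided log-rank test with 1:1 allocation; the event counts and the common hazard ratio of 0.75 are hypothetical numbers chosen for illustration, not figures from any trial discussed here.

```python
from math import log, sqrt
from statistics import NormalDist

def logrank_power(events: int, hazard_ratio: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided log-rank test, 1:1 allocation.

    Schoenfeld approximation: power = Phi(sqrt(events) * |ln HR| / 2 - z_{1-alpha/2}).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(sqrt(events) * abs(log(hazard_ratio)) / 2 - z_alpha)

# Hypothetical trial at the time of analysis: assume the same hazard ratio on
# both endpoints, but more progressions than deaths have been observed.
ttp_power = logrank_power(events=350, hazard_ratio=0.75)
os_power = logrank_power(events=200, hazard_ratio=0.75)
print(f"TTP power:      {ttp_power:.2f}")
print(f"survival power: {os_power:.2f}")
```

Even when the treatment effect is assumed identical on both endpoints, the endpoint with fewer observed events has less power, so a positive TTP result alongside a non-significant survival result is an expected outcome rather than evidence against correlation.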
Finally, there is the crossover issue. Even if TTP were a perfect surrogate for
survival, crossover to other effective therapies could prevent detection of a
potential benefit.
In summary, with the trials of the size we usually see in
oncology or therapies of only marginal benefit it would be difficult to
determine the exact relationship between TTP and survival.
[Slide]
In reviewing these slides from the 1999 ODAC, I came upon
this one. Dr. Johnson I thought did a
really good job of summing up a comparison of survival and TTP. Survival time is precisely determined
regardless of follow-up. Survival is a
known entity. On the negative side,
survival takes longer to assess, needs larger trials and its benefit can be
obscured by secondary therapy.
[Slide]
TTP is only a surrogate, not a direct measure of clinical
benefit. Later today during your
deliberations we want to hear your thoughts on the important factors FDA should
consider when evaluating TTP as a surrogate for clinical benefit in specific
settings. For instance, would TTP be
more acceptable in cancer settings where symptoms occur at the time of or soon
after progression? What TTP benefit
increment would be persuasive? How
important is the toxicity of treatment in evaluating a TTP benefit? Finally, to what extent is the benefit of
other available drugs important? For
instance, what if other drugs produce a substantial survival benefit?
One approach to the problem of TTP measurement has been to
convert TTP to a direct measure of clinical benefit by measuring time to
worsening of cancer symptoms. For years
FDA has suggested this endpoint to sponsors at the end of Phase II
meetings. However, sponsors and
investigators have cited several problems with this approach. First, there is the ever-present problem of
lack of blinding and potential bias; thus the endpoint may not be reliable. Another problem is the usual delay between
the time of objective progression and the onset of cancer symptoms. Often alternative treatments are begun before
reaching the symptom endpoint. At our
colon cancer workshop Dr. Langdon Miller presented data suggesting that in
colon cancer there is a fairly long time lag between progression and onset of
symptoms. When alternative treatments
are begun prior to symptom progression the issue of confounding effects arises,
just as it does in analysis of survival.
[Slide]
We must remember a critical difference between analyses of
survival and tumor progression. The date
of death, represented by the star in this cartoon, will not change regardless
of the evaluation schedule or censoring.
For progression measurement, however, the date we assign for progression
is usually the date of a scheduled visit occurring some time after the actual
progression date. It should not be
surprising that assessing progression at longer intervals leads to longer time
to progression and that asymmetry in this process could lead to bias.
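The effect of the visit schedule on recorded progression dates can be illustrated with a small simulation. This is a sketch under stated assumptions: true progression times are drawn from an exponential distribution with a hypothetical 6-month median, and progression is recorded at the first scheduled visit on or after the true date. The function names are illustrative.

```python
import math
import random

def recorded_progression(true_time: float, visit_interval: float) -> float:
    """Date assigned to progression: the first scheduled visit at or after
    the true (unobserved) progression time."""
    return math.ceil(true_time / visit_interval) * visit_interval

def mean_recorded_ttp(visit_interval: float, true_median: float = 6.0,
                      n: int = 20000, seed: int = 1) -> float:
    """Average recorded TTP (months) for n simulated patients.

    The same seed gives the same true progression times for every schedule,
    so differences come only from the assessment interval.
    """
    rng = random.Random(seed)
    rate = math.log(2) / true_median  # exponential hazard for the assumed median
    times = [rng.expovariate(rate) for _ in range(n)]
    return sum(recorded_progression(t, visit_interval) for t in times) / n

for interval in (1.0, 2.0, 4.0):
    print(f"visits every {interval:.0f} months: "
          f"mean recorded TTP = {mean_recorded_ttp(interval):.2f}")
```

With identical underlying disease, the recorded time to progression grows as the interval between assessments grows, so an arm that is scanned less often would appear to progress later.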
[Slide]
With measurements repeated over many visits assessment of
TTP by traditional methods is difficult and labor intensive. Many problems are encountered by FDA during
reviews such as not all lesions being followed, or extra scans being performed,
or measurements being missing. So, how
do you assure equal measurement? How do
you assess the impact of bias? How do
you verify progression of evaluable disease by unblinded investigators? These are the difficult issues for review of
TTP data.
[Slide]
One approach to making progression assessment practical and
reliable would be to consider different progression endpoints. An approach that seems worthy of research is
to assess progression at only a single time point. This would considerably decrease the burden
in the amount of data collected and eliminate the concern of time-related
assessment bias. Scans would need to be
evaluated only at baseline and either to document progression for that time or
at the prespecified time to document stable disease.
[Slide]
Progression measured at a single point would be much easier
to audit and verify, needing only two sets of scans per patient and
time-related bias, as mentioned, would be minimized if not eliminated.
So, I think research into approaches such as this would be
of great interest to identify the benefits and problems. In this case you would certainly lose some
statistical power, requiring larger studies.
There would be concern that you would miss a transient TTP benefit if
you hit the wrong point with your single time analysis, and we would lose the
information we are used to seeing about other parts of the curve, such as the
early effects or the potential benefit of a plateau.
[Slide]
In conclusion, here are some issues you may wish to
consider in your deliberations. As FDA
proceeds with the workshops and meetings on endpoints for cancer treatment
settings, is TTP ready for active consideration as a drug approval endpoint? If so, what are the factors that determine the
acceptability of TTP as a drug approval endpoint? What amount of TTP evidence would be needed
to support a TTP claim, such as number of trials, value, magnitude and
precision of TTP benefit?
[Slide]
And, can we improve our approach? Do we need research on novel progression
endpoints such as a single point analysis?
Do we need research on the association between TTP and survival data to
validate TTP as a survival surrogate?
Should we develop an approach to TTP endpoint definition and censoring
methods that are standard? Do we perhaps
need a separate workshop just to concentrate on TTP methodology? Can more trials be blinded? Does independent blinded radiologic review
improve endpoint assessment? And, can
symptoms be incorporated into the endpoint?
So, this ends my presentation. I think what we will do is take questions
from our seats and just briefly introduce the questions at the beginning of the
question discussion rather than to do it now.
How long do we have for questions?
Clarification Questions to the Presenters
DR. PRZEPIORKA: Two
hours, just for clarification or the actual questions? Until the break--about 20 minutes. We have the floor open now for questions for
the presenters for this morning.
I have a question for Dr. Williams. Just for a point of clarification, for
non-inferiority you are not truly looking for non-inferiority per se in terms
of the response but it has to be non-inferior in terms of its treatment effect
as well as less toxic to be a real winner in that sort of design.
DR. WILLIAMS: Well,
let me start with just non-inferiority in general. It just means that you have met your
margin. Okay? Non-inferiority for the FDA means that you
have met your margin and that margin means the drug works. It is a separate judgment about whether you
are less toxic; I mean about the risks and benefits. But there wouldn't be a direct requirement to
be less toxic from our regulations, I don't think.
DR. PAZDUR: I think
a lot of people confuse that issue of toxicity and non-inferiority since
several applications came in dealing with perceived less toxic drugs and
comparing them to a standard drug. But,
as Grant said, the toxicity evaluation is different. Many times what we actually see is not really
less toxic drugs but a different spectrum of toxicity, and that is another
thing that people have to consider also when they are evaluating toxicity.
DR. WILLIAMS: We
have never applied this approach but I know I have heard Dr. Fleming talk about
it and we have talked about it before, you could always have the toxicity
affect your margin. That means you might
be willing to accept less proof of efficacy if you knew it was less toxic. But that would be involved in the judgment
process.
DR. PRZEPIORKA: Dr.
Temple?
DR. TEMPLE: The grim
reality of non-inferiority studies is that we usually set a margin at something
like preserving half of what we think the effect of the drug is. That is not very gratifying. I mean, you would hate to lose half of the
valuable effect and, yet, if you explore sample sizes it is really not possible
to do much better than that. So, in
return for getting a drug that might have less toxicity, or is easier to give,
or is a different dosage form and things like that, we do the best we can.
Sometimes, as Grant pointed out, there often isn't. So, it is a tremendous problem to get less
toxic or more easily taken drugs. The same problem actually arises when you are
looking for drugs that mitigate the side effect of another drug. If you want to show that you preserve the
effect of the drug, I can't imagine what size studies would make a convincing
case and, as Grant said, there is often very unclear evidence on what the
actual beneficial effect of the drug is in the first place. This isn't unique to oncology; it occurs
everywhere but it is a major challenge.
DR. PRZEPIORKA: Mr.
Katz?
MR. KATZ: Where in
this do we account for differences in durability of response? For instance, you could have two treatments
that have equivalent TTP but very different duration of response and that would
be something that would be very different in terms of patient benefit.
DR. WILLIAMS: Well,
I guess it would be a separate judgment.
If they had the same TTP, that is one thing but duration of response
would relate also to response rate. I
have never had considerations where we were looking at TTP as a primary
endpoint and we saw differences in response rate and we were making a
judgment. But I think, obviously, if you
are looking at response rate, duration of response is always an important
consideration and a big judgment call when you have such a long duration. I think the O'Shaughnessy paper had some
discussions about that in the early '90s about certain settings with big
response rates and long durations of response that we might consider using it
as an endpoint for clinical benefit, but it is very much of a judgment call.
MR. KATZ: I guess I
was raising it strictly because of, you know, the difference in quality of life
between being treated with something constantly over a three-year period
between your randomization and progression versus being treated with a blast at
the front. That is a significant
difference. You know, it is separate from
the response rate.
DR. PRZEPIORKA: Dr.
Grillo-Lopez?
DR. GRILLO-LOPEZ: I
believe that TTP is an excellent endpoint for regular approval even and that,
in fact, it is much better than survival.
It may not be obvious but survival is plagued by a number of biases that
we can discuss during the course of the day.
One would tend to state that death is the ultimate endpoint when you are
talking about survival but, again, there are a number of biases when you are
looking at death as an endpoint.
But to address your question, I think that one way to
address the issue of TTP and its relationship to response is to do an analysis
of TTP for responders. When you look at
TTP for responders, this is even a better endpoint than duration because the
problem with duration of response is that you are looking at two time points,
both of which are variable. The duration
of response starts from the first day that you see a response, and that can
vary depending on when the evaluations are done, and ends with progression of
disease which, again, can be somewhat variable.
Whereas, TTP at least has a definite calendar date for the onset of TTP.
DR. WILLIAMS: WHO
does response duration--or EORTC or somebody--from the time of
randomization. That is where they
routinely measure response duration but, obviously, that is a longer but
perhaps more precise measure.
DR. PRZEPIORKA: Dr.
Temple?
DR. TEMPLE: I was
just going to comment on duration of response.
There certainly have been situations where very long response was
considered sort of self-evidently beneficial in some of the leukemia/lymphoma
drugs. In testicular cancer, if you are
still alive and have not progressed at a year everybody assumes that you would
have been dead. So, there are some of
those cases but as an endpoint in clinical trials we have never been
successful, to my best knowledge, in incorporating that particular measurement
into the overall evaluation. We sort of
say if it is too short, that might not be meaningful but I don't think it has
been more precise than that except when you get these partial responses that
last for a year and everybody is very impressed by that as a likely clinical
benefit.
DR. WILLIAMS: That
was a big role with IL2, wasn't it, Pat?
Long duration response?
DR. KEEGAN: Yes,
that was the basis for the approval both in metastatic renal cell and metastatic
melanoma. Although there were relatively
few responses--I think it was less than a 15 percent overall response rate for
either one. The responses were measured
in months for partial responders and years for complete responders.
DR. TEMPLE: And the
treatments for hairy cell leukemia all sort of had those characteristics.
DR. PAZDUR: And that
was for Fludara and for valcane too.
DR. PRZEPIORKA: Dr.
Redman?
DR. REDMAN: Dr.
Farrell, just for my own clarification because I heard the words being used in
the same sentence, in the regulations clinical benefit is not defined as
survival?
DR. FARRELL: Right.
DR. REDMAN: It is
defined as clinical benefit. What we are
trying to discuss is what is a clinical benefit and assuming that time to
progression is a surrogate endpoint to survival may be false just by
definition.
DR. WILLIAMS: But as
I said in my talk, clinical benefit is not in the regs, or at least it is not
in the Act. Do you want to say more
about it, Dr. Temple?
DR. TEMPLE: It is
definitely not in the Act. An important
court of appeals case--whether that really changes the law or not is debatable,
but Warner Lambert versus Heckler said it is just obvious that the Commissioner
needs to consider what the effect is. He
doesn't have to approve something silly, like there used to be drugs to
increase bile flow. You know, that
doesn't sound like it is very useful.
But that is what it is and it has never been defined as a particular
thing. In other words, as Grant said,
everybody thinks that delayed time to recurrence in adjuvant settings probably
is a clinical benefit because, you know, you don't have tumor yet or you don't
know you have tumor yet or because it is usually symptomatic. That is okay.
If somebody thinks that very delayed time to progression must
correlate--there is a lot of judgment in it.
There is no rule; nothing is written down.
As Grant said, up until 1985 we used to approve everything
based on response rate. We didn't think
that was illegal but we concluded it wasn't so good.
DR. WILLIAMS: And
looking back at the history of oncology, at the very time that we made this
decision the Supreme Court was evaluating Laetrile and the Supreme Court was
supporting the FDA that we could demand proof of efficacy in terminal cancer
patients. The words used were symptoms,
function and survival. So, I mean, it is
a collection of sort of legal arguments as sort of the basis I think.
DR. PRZEPIORKA: Dr.
Fleming?
DR. FLEMING: In
considering the concept of clinical benefit, I think many of us have, across
many disease areas, considered direct measures of clinical benefit to be
measures that unequivocally reflect tangible benefit to patients. So, Grant had put forward examples of those. Obviously, duration of survival; measures
that reflect quality of life; disease-related symptoms, those are obvious
measures.
Where we struggle is that in any disease area there are
targeted mechanisms by which we are hoping to achieve those clinical benefits,
and we may be more or less right about those.
In oncology we would tend to think those would be most directly measures
that reflect disease tumor burden. Time
to progression, response rate are, in that regard, measures that we would give
considerable attention to. One could
argue though that you could shrink a tumor by a certain fraction or delay time
to progression by a certain fraction and that doesn't necessarily lead to
something that the patient would be tangibly aware of unless, as was pointed
out--I think Bob pointed out, if progression is associated with symptomatic
disease or disease-free survival, if the delay in the time to having detection
of disease provides a psychological benefit.
Those are direct tangible factors.
But the complication that arises here is that time to
progression may, in fact, be the intended mechanism by which we hope to achieve
clinical benefit but the problem is you may delay progression by two weeks or
four weeks without that translating into something that the patient is tangibly
aware of in terms of longer survival or improvement in symptoms or quality of
life.
DR. REAMAN: For
clarification, are we lumping together time to progression and time to
recurrence and the issue of stable disease as an endpoint?
DR. WILLIAMS: I am
specifically mentioning time to progression.
We will talk about disease-free survival during the questions. We have taken a stronger stance, as Dr.
Dagher has stated, that disease-free survival in some settings is a
clinical benefit. Disease-free survival
in the adjuvant setting I don't think we would say is the same as time to
progression. So, our discussion here so
far has just been time to progression.
If you would like to bring up the other now, but we will certainly
discuss it later too.
DR. PRZEPIORKA: Dr.
Cheson?
DR. CHESON: I think
what we are going to find here at the end of the day is that the importance of
the various endpoints is going to vary considerably by disease. Dr. Temple was citing all these examples
about how drugs got approved, single agents, all hematologic malignancies. What has been referred to this morning has
been more referable to solid tumors. So,
this is going to be really complicated.
I would like to get some input from people like Dr.
Fleming. All too often we see that time to progression does not translate into
a survival advantage. Is the cause of that
that the survival measurement is under-powered, or is it that once
they progress with a longer time to progression they don't respond to subsequent
therapy? What is the explanation for
this because we see it all too often?
DR. FLEMING: That is
a good question and it is one in general that arises as we consider markers as
potential replacement endpoints. Just as
a quick, brief response to your question, if we are using time to progression
and we are using it as a measure of the intended mechanism by which we hope to
achieve clinical benefit, such as survival, why is it that you may see a time
to progression effect and not a survival effect? Part of it may be that it is not fully
captured in the entire mechanisms through which these processes are influencing
outcome.
A better example I think of that might be if you used
objective response rate as the surrogate because it may be that you are
under-estimating the true effect on the clinical endpoints, such as survival,
because the intervention has a cytostatic component that delays progression
without necessarily shrinking tumors.
Of course, the other factor is the clinical endpoint can be
influenced by unintended mechanisms so that you may be having a potentially
partial beneficial effect mediated through the intended delay in time to
progression, but that could be offset by other unintended mechanisms,
toxicities, etc., which would yield in the end a less impressive survival
effect.
Typically the marker is more proximal and often the true
clinical endpoint is more distal. So, it
is not surprising that the nature and magnitude of the effect on the more
proximal measure may be different from the more distal.
The critical issue in validating a surrogate, as we will
get to later on, is that it shouldn't be assessed in terms of statistical
significance, yes/no. It should be
assessed in terms of does a relative risk reduction in the time to progression
translate into some definable and predictable relative risk reduction in
survival. So, if we reduce progression
by a rate of 30 percent, is that a pretty reliable estimate of a reduction in
death rate by 20 percent? In fact, if
that is true, clearly a study is going to be more adequately powered for
progression than survival because you can detect a 30 percent reduction with
half the sample size of a 20 percent reduction of death.
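The sample size comparison in this answer can be checked with a back-of-the-envelope calculation. The sketch below uses the Schoenfeld formula for the number of events needed by a two-sided log-rank test with 1:1 allocation; treating a "30 percent reduction" as a hazard ratio of 0.70 and a "20 percent reduction" as 0.80, and choosing 90 percent power, are assumptions made for illustration. The formula gives required events, not patients.

```python
from math import ceil, log
from statistics import NormalDist

def required_events(hazard_ratio: float, alpha: float = 0.05,
                    power: float = 0.90) -> int:
    """Events needed for a two-sided log-rank test with 1:1 allocation
    (Schoenfeld): d = 4 * (z_{1-alpha/2} + z_{power})^2 / (ln HR)^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(4 * z ** 2 / log(hazard_ratio) ** 2)

progression_events = required_events(0.70)  # assumed 30% reduction in progression rate
death_events = required_events(0.80)        # assumed 20% reduction in death rate
ratio = progression_events / death_events

print("events needed for progression endpoint:", progression_events)
print("events needed for survival endpoint:   ", death_events)
print(f"ratio: {ratio:.2f}")
```

Under these assumptions the progression endpoint needs somewhat less than half the events of the survival endpoint, which is in line with the "half the sample size" figure given above. The ratio does not depend on the chosen alpha or power, because both cancel out.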
DR. PRZEPIORKA: Dr.
George?
DR. GEORGE: I would
like to talk a little more about the time to progression in symptoms
issue. I think we all would tend to
agree that conceptually, ignoring the methodologic difficulties, a delay in
progression is a good thing. We have a
lot of problems with measuring it, and how the design is done and all these
things that contribute to it. But it
seems to me that if we are after a clinical benefit, an important clinical
benefit is that development of symptoms.
So, you have some diseases I suppose where you have the distribution of
time to development of symptoms after progression that would be relatively
short, in which case you would look to build that probably into the definition
somehow. In other diseases you might
have a very long time, and that becomes a lot more problematic I think because
that would be more variable and longer-term in individuals and then you really
have to worry about how it translates into individual patient benefit.
I noticed you briefly talked about some related things,
like progression-free survival, and you just kind of briefly touched on
them. So, do you have any more comments
about this issue?
DR. WILLIAMS:
Certainly, we look forward to your deliberations on this matter. Of course, right now this is just questions
to the speaker. That is one of the
biggest things we would like to know, can you do this or not? If you can't do it, then forget it. And, that is basically the answer we have got
from most investigators, we can't do this.
But if you can, we would love to see it.
DR. PRZEPIORKA: Just
to clarify, I don't mean to put words into Dr. George's mouth but, again, it
seemed that you were somewhat negative on the concept of progression-free
survival as opposed to time to progression.
Would you like to expound on that?
DR. WILLIAMS: Okay,
what I should have said was that we have often said don't do
progression-free. It has been our
approach because we have been disturbed by loss to follow-ups coming in as
deaths, you know, prolonging survival.
It is a very sloppy business and there is no rule in there about how you
deal with that. As a secondary endpoint
I think that is quite reasonable but I think, as Dr. Fleming said, if you are
really going to try to capture more in this endpoint if it is relevant, then
include deaths. I think that is a good
thing for you to discuss, is that reasonable to do? But if we do, then we have to do something to
make sure those deaths don't mess up our analysis and produce unreasonable
results like, you know, three-year progression-free survival and then death,
things like that.
DR. PRZEPIORKA:
Again, your definition of progression-free survival does not include
death?
DR. WILLIAMS: TTP
does not include deaths.
Progression-free survival includes deaths. That is the terminology I use.
DR. PRZEPIORKA: Dr.
Temple?
DR. TEMPLE: With TTP
you censor the deaths and don't count them.
With progression-free survival your worry is that you gain credit for
very great delay in progression because nobody observed you for a long time
until you died. It doesn't have an
obvious bias, it just gives you a wrong number.
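The distinction drawn above between TTP and progression-free survival can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Patient` record and its field names are hypothetical, and censoring at the last tumor assessment is one convention among several, not a rule stated in this discussion.

```python
from typing import NamedTuple, Optional, Tuple


class Patient(NamedTuple):
    # Hypothetical record; all times are months from randomization.
    progression_month: Optional[float]  # None if progression was never documented
    death_month: Optional[float]        # None if alive at last contact
    last_assessment_month: float        # last tumor assessment without progression


def ttp(p: Patient) -> Tuple[float, bool]:
    """Time to progression: a death without documented progression is censored."""
    if p.progression_month is not None:
        return (p.progression_month, True)
    return (p.last_assessment_month, False)


def pfs(p: Patient) -> Tuple[float, bool]:
    """Progression-free survival: progression or death counts as the event."""
    if p.progression_month is not None:
        return (p.progression_month, True)
    if p.death_month is not None:
        return (p.death_month, True)
    return (p.last_assessment_month, False)


# A patient last assessed at month 3, then lost to follow-up, reported dead at month 36.
lost_then_died = Patient(progression_month=None, death_month=36.0,
                         last_assessment_month=3.0)
print("TTP:", ttp(lost_then_died))
print("PFS:", pfs(lost_then_died))
```

For this patient TTP is censored at month 3, while progression-free survival records an event at month 36, which is the kind of three-year progression-free interval followed by death that the speakers are worried about.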
DR. WILLIAMS: Well,
both of them produce wrong results. I
mean, we like to censor at the visit before the death instead of at the death but
still, you know, that is being cut off because the patient died. Was that death really unrelated? If it was related, then you have
informative censoring. So, it is
which kind of bad data do you want. So,
the real way to do it is to do the trial right and not have these kinds of
things.
DR. TEMPLE: Can I
pursue a previous discussion with anybody?
The practical difficulties of doing time to death in addition to time to
progression I don't think have been adequately recognized. Just as a quick example, which will be statistically
incorrect, if you delay progression from six to eight months my quick hazard
ratio is 0.75. If you improve survival
from 12 to 14 months, the same difference; you can't expect to have a bigger
effect. So, your hazard ratio is only
0.86.
Now, the implications of that for sample size are major and
I haven't even calculated a crossover.
So, if you imagine that the crossover to study drug now reduces your
advantage from two months to one month, we are talking about major differences
in sample size. I am not sure anybody
has actually modeled the difficulty but it is clearly going to be very, very
hard just on practical grounds alone.
You don't even have to postulate that there is a difference in effect on
progression to survival. I am just
assuming it is the same but still I am sure the sample size goes up a factor of
four with what I just said, but someone can correct that. It is a very substantial problem, not really
addressed.
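Dr. Temple's quick arithmetic can be checked with a short Python sketch. It rests on assumptions that are not part of the discussion: both arms are treated as exponential (so the ratio of medians stands in for the hazard ratio), randomization is 1:1, and the required number of events comes from the Schoenfeld approximation with a two-sided alpha of 0.05 and 80 percent power. The function names are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def quick_hazard_ratio(control_median: float, treated_median: float) -> float:
    """Ratio of medians; equals the hazard ratio only if both arms are exponential."""
    return control_median / treated_median


def events_required(hazard_ratio: float, alpha: float = 0.05,
                    power: float = 0.80) -> int:
    """Schoenfeld approximation for a 1:1 randomized log-rank comparison."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(4 * z_total ** 2 / log(hazard_ratio) ** 2)


hr_ttp = quick_hazard_ratio(6, 8)             # progression: six versus eight months
hr_os = quick_hazard_ratio(12, 14)            # survival: 12 versus 14 months
hr_os_crossover = quick_hazard_ratio(12, 13)  # advantage diluted to one month

baseline = events_required(hr_ttp)
for label, hr in [("TTP 6 vs 8", hr_ttp),
                  ("Survival 12 vs 14", hr_os),
                  ("Survival 12 vs 13", hr_os_crossover)]:
    needed = events_required(hr)
    print(f"{label}: HR {hr:.2f}, events {needed}, "
          f"{needed / baseline:.1f}x the TTP trial")
```

The ratio between the event counts does not depend on the alpha and power chosen, only on the two hazard ratios. Under these assumptions the survival trial needs between three and four times the events of the progression trial, and more than ten times once the advantage shrinks to one month.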
DR. WILLIAMS: But
underlying that, Bob, we have had many of these discussions and the issue is do
you assume a constant hazard or do you assume a constant increment? I don't know what we should expect.
DR. TEMPLE: Grant,
why would anybody imagine that a two-month increase in time to progression
would lead to a four-month increase in survival?
DR. WILLIAMS: I
don't know but you heard Tom do it and I think the statisticians continually do
kind of assume a constant hazard when they go from one endpoint to the other.
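The two assumptions being contrasted here lead to different survival expectations, which a small Python sketch makes concrete. The function names are hypothetical, and the constant-hazard version again treats the ratio of medians as the hazard ratio, which holds only for exponential arms.

```python
def survival_median_constant_hazard(os_control: float, ttp_control: float,
                                    ttp_treated: float) -> float:
    """Carry the progression hazard ratio over to survival unchanged."""
    hazard_ratio = ttp_control / ttp_treated
    return os_control / hazard_ratio


def survival_median_constant_increment(os_control: float, ttp_control: float,
                                       ttp_treated: float) -> float:
    """Carry the absolute gain in months over to survival unchanged."""
    return os_control + (ttp_treated - ttp_control)


# Numbers from the discussion: progression six versus eight months,
# control-arm survival 12 months.
print(survival_median_constant_hazard(12, 6, 8))     # 16.0, a four-month gain
print(survival_median_constant_increment(12, 6, 8))  # 14, a two-month gain
```

A constant hazard ratio of 0.75 turns a two-month delay in progression into a four-month gain in survival, which is the expectation Dr. Temple questions. A constant increment keeps the gain at two months and weakens the survival hazard ratio to about 0.86.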
DR. PRZEPIORKA:
However, this again begs the question of whether or not one is supposed
to be a surrogate for the other, or can you say time to progression is a
clinical benefit and we don't have to worry about whether it is a surrogate?
DR. TEMPLE: Right,
but one of the tempting reasons to do that is the implication for sample size.
DR. WILLIAMS: Maybe
we could hear Tom. What is the
assumption and which is valid?
DR. FLEMING: Well, I
think the essence of what Bob is saying is what drives interest in looking at
replacement endpoints. The example I
gave was a 30 percent reduction in progression rate compared to a 20 percent reduction
in death rate and that would lead to a doubling in sample size.
DR. TEMPLE: It
depends how much delayed death is compared to progression.
DR. FLEMING:
Indeed. The example you gave,
Bob--you are actually not too far off, it would be a three- to four-fold
difference in numbers of events required to detect a 12- versus 14-month
difference in survival rather than a six- versus eight-month difference in time
to progression. It is what drives a lot
of interest in looking at replacement endpoints. It is not just because they occur six months
sooner that would cut six months off the regulatory process, but the relative
risk that you would expect to see in the endpoint that is the direct mechanism
by which you hope to achieve ultimate benefit, and it is more proximal, is
typically going to be greater.
There are counter examples, Bob? How could it be that there is a counter
example? Because your surrogate may be
noisy and may not, in fact, be capturing the essence of the mechanism by which
you achieve clinical benefit. So you
may, in fact, have as impressive a result on the more distal clinical
endpoint. But in general what you say is
right, and that is that typically you are going to see a bigger relative risk
reduction.
So, the challenge is can we achieve that payoff of a
quicker assessment based on a smaller sample size, using Bob's logic, without
paying the price of having less reliability?
When is this quicker answer reliably telling us what we need to know
longer term?
But while I have the mike let me just quickly go back to
one of your earlier issues and defend what Grant had indicated I had advocated
in the past, which is disease-free survival.
Disease-free survival and time to progression are both important
markers. Time to progression is
censoring the deaths and if one is really trying to get at the mechanism by
which I am achieving clinical benefit, a targeted mechanism such that what I
really want to look at is the treatment effect on the targeted mechanism of
tumor burden and I don't want that assessment to be clouded or complicated by
the noise of unrelated deaths, I will censor the deaths and look at time to
progression. That would make sense if it
is a supportive measure of biologic activity.
But if it is a registrational endpoint you want it to be as close as
possible to what is really clinically relevant and clinically interpretable.
What is really relevant here would be to say I want to
delay the time that I have progression or death. A good thing is to be alive and free of progression. So, those deaths should count. When you censor the deaths, and I think it is
important for clinicians to know the game that statisticians are playing, if
Grant and I are going along and I die and Grant doesn't and we are in the same
arm, I am censored in time to progression but I am not left out. Some people think I am censored and I am
taken out. No, I am still in the
analysis and we are imputing my time to progression by what Grant's time to
progression is.
Now, it is an incredible assumption of non-informative
censoring that because I die I am no different than Grant. I am probably more frail; I am different and
so my time to progression would have been different from his. So, when we look at time to progression I
would hope that we would also look at that with tremendous caution because we
are censoring the deaths and we are making a major assumption about
non-informative censoring that is almost certainly not true.
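Dr. Fleming's point about frailty can be illustrated with a small simulation, offered as a sketch rather than a model of any real trial. Everything in it is an assumption: half the patients are "frail" with short times to both progression and death, the other half are "robust", all times are exponential, progression is observed continuously, and the means are invented for illustration.

```python
import random
from statistics import median


def km_median(times, events):
    """Kaplan-Meier median: first time the estimated curve reaches 0.5 or below."""
    at_risk = len(times)
    surv = 1.0
    for t, is_event in sorted(zip(times, events)):
        if is_event:
            surv *= 1.0 - 1.0 / at_risk
        at_risk -= 1
        if surv <= 0.5:
            return t
    return None  # curve never reached 0.5


def simulate(n=20000, seed=1):
    rng = random.Random(seed)
    true_progression, observed_time, progressed = [], [], []
    for _ in range(n):
        frail = rng.random() < 0.5
        # Hypothetical mean months to progression and to death.
        prog_mean, death_mean = (4.0, 4.0) if frail else (12.0, 60.0)
        t_prog = rng.expovariate(1.0 / prog_mean)
        t_death = rng.expovariate(1.0 / death_mean)
        true_progression.append(t_prog)
        # TTP analysis: a death before progression censors the patient.
        observed_time.append(min(t_prog, t_death))
        progressed.append(t_prog < t_death)
    return median(true_progression), km_median(observed_time, progressed)


true_med, km_med = simulate()
print(f"median TTP if no one had died first: {true_med:.2f} months")
print(f"Kaplan-Meier median with deaths censored: {km_med:.2f} months")
```

In this setup the frail patients who die early would also have progressed early, so censoring them and letting the survivors stand in for them makes the Kaplan-Meier median come out longer than the median that would have been seen had no deaths intervened.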
DR. PRZEPIORKA:
Grant, I have a question for you.
You talked about validated surrogates.
Who is responsible for validating surrogates, the FDA or the sponsors?
DR. WILLIAMS: Well,
I really don't think that we use the term as a regulatory term. We are looking for something that is a
substitute. In this case I was using
validated to refer to the Prentice criteria for strict quantitative
analyses. Certainly, our regulations
don't have validated surrogate in them.
I don't think we really have a regulatory answer for what a validated
surrogate is, maybe Bob does.
DR. TEMPLE: No, we
don't. But the accelerated approval rule
says you know those other surrogates we used to use--blood pressure, blood
sugar, the ones we are talking about now are less validated than that. That is really all it says. It gives you a direction and that is quite
explicit in the preamble, but it doesn't say the other ones meet the Prentice
criteria. I don't think anything has
ever met the Prentice criteria because there is too much noise in the system to
make a very persuasive case for that.
But the contrast is with blood pressure, blood sugar and cholesterol
which a lot of people would argue about anyway even though those are widely
accepted. But it is a qualitative,
somewhat seat-of-the-pants judgment about whether this is persuasive or not.
DR. PAZDUR: Could I
answer Donna's question?
DR. PRZEPIORKA:
Sure.
DR. PAZDUR: I think
the academic and scientific community have the obligation to validate these
surrogates. We could accept or not
accept the information that is provided to us but this tends to be a long and
complicated process and what we are looking for is basically external
validation that these are real, true scientific findings to then base
regulatory decisions on.
DR. PRZEPIORKA: In
that case I would like to follow up and I am going to assume that there is no
guidance document on what you would accept as a validated surrogate. Is there a guidance document available for
how to validate a surrogate?
DR. TEMPLE: No,
there isn't and when you actually get into it, it becomes extremely difficult. For example, I bet if you looked at all
studies over all time, shrinking tumors is probably good; I mean I think it is
likely if you had a large enough database.
What does that tell you about an individual study where the difference
in tumor response is a small percent?
Putting a quantitative thing on these is extremely difficult. I mean, people could try to do that. It would be a massive project but I wonder
how much it would help you in each individual case as to whether it was
plausible or not. But your question
leads to the answer that there really isn't much in the way of guidance on
this.
DR. PAZDUR: But to
follow-up on Bob's comment, I think this is one of the major problems we have
had in oncology, that is, as we try to make some correlation here basically our
treatment effects have been so small that it is hard to really impact the
subsequent endpoint.
DR. PRZEPIORKA: Dr.
Dagher, a question for you. You had gone
through the list of all the ways of accelerated approval and obviously they
need further follow-up for full approval.
Can you tell us has there been any drug that has been approved on
accelerated approval but had its post-marketing study turn out to be negative,
and what did we learn from that and what did we do with it?
DR. DAGHER: Well, we
discussed some of these at the March ODAC last year and I mentioned that you
could have confirmatory benefit either in the exact same population or I used
the term related population. The reason
I mention that is that it is intuitive that you would expect confirmatory
studies to be done in the less refractory populations when you are looking for
people for second- or third-line accelerated approval. But we have had settings where we have had evidence
of clinical benefit confirmed in related populations.
What do I mean by that?
We have some settings where we still had somewhat refractory populations
but they were related. For example, the
approval for Taxotere was for failure of prior anthracycline. Then when we looked at confirmatory benefit,
that was a population where there were some patients that had failed prior
alkylator therapy. So, if you look at
the label, after we did the conversion we now have a slightly expanded
population, if you will, to say failure of prior chemotherapy which might have
included either anthracycline or alkylators.
So, that is one situation where you could argue, okay, the population
was still somewhat refractory but it is a slightly different population.
In the case of irinotecan, the evidence that was helpful in
confirming clinical benefit came, as you know, from two
European studies, not the studies that were originally
designated as those that would provide evidence of clinical benefit. In those studies, you could say those were
fairly close populations in terms of the patient populations.
So, basically what we are saying is that you could have
confirmation of benefit either in the same population or related
populations. In terms of regulatory guidance,
the 1996 document on reinventing the regulation of cancer drugs illustrated
some concepts. One of the concepts was
that clearly we recognize that confirmation of clinical benefit doesn't always
necessarily have to occur in the exact same population that we use for
accelerated approval. Obviously, the
reason for that is that it could be more informative for us that further
studies are done in different populations.
For example, if you had accelerated approval in a third-line setting one
could argue that it would be much more informative to have further studies done
in the first-line setting and evaluate benefit in that setting.
DR. PRZEPIORKA: I
think my question was probably addressing more a specific individual study as
opposed to a confirmatory trial where a drug received accelerated approval on
the basis of a surrogate but in long-term follow-up survival was either not
different or, in fact, worse with the new drug.
Has that ever occurred?
DR. PAZDUR:
Yes. Donna, a recent example of
this is oxaliplatin. Although we
approved the drug on the basis of an interim analysis of a randomized study
which showed an improvement in time to progression and response rate, the
survival did not show any advantage.
Hence, you know, we knew that this was a high probability because there
was a built-in crossover for all patients to receive the drug subsequently.
I think an important aspect is that when we take a look at
accelerated approval--and this came out in the March talk--that we really have
to take a look at the whole context of the drug development. It is not just one trial; this drug also had
positive trials in a first-line setting and in an adjuvant setting. So, yes, there are examples. I think we have to take a picture of how the
drug fits into the context of other trials going on.
DR. PRZEPIORKA: Dr.
Temple?
DR. TEMPLE: Well,
the oxaliplatin is a very telling example and certain studies in breast cancer
in my opinion came out roughly the same way despite a dramatic effect on
disease-free survival. But that is
because of the reason we gave before.
There is crossover and it is later so it is much harder to win.
There are some examples, I mean there is a near miss, if
you like. In the ordinary course of
things Iressa probably would have been approved for third-line therapy with a
requirement that they go study first-line therapy. Well, we know what happened there. They would have failed utterly. The message I think is, you know, you are not
always as smart as you think you are.
Drugs don't always work better--
DR. BUNN: [Not at
microphone; inaudible]
DR. TEMPLE: I am
just talking about the results of the well publicized first-line therapy study
that was done, an excellent pair of studies.
Nobody criticized the design.
Yet, if those studies had been the requirement on an accelerated
approval--other studies are now the requirement for accelerated approval--you
would have had a case where you didn't get confirmation but, of course, it was
a different disease. So, it is
possible. Can I say accelerated approval
contemplates that. It contemplates the
possibility that we will put a drug into the marketplace that ultimately proves
not to be effective. The risk is considered
worth it in bad diseases with no good treatment.
DR. PRZEPIORKA: Dr.
Fleming, a final question?
DR. FLEMING: I was
just following up on what I thought your question was, which is are there
examples where an accelerated approval is granted and then a validation study
is done and the results are not confirmatory.
I think in the March 12 and 13 ODAC committee meeting we had we saw
several examples. One of those examples
was Ethyol in advanced non-small cell lung cancer that was used for
chemoprotection against renal toxicity, and where a validation study was done
and duration of responses were much shorter with Ethyol and survival was
shorter, time to progression was shorter.
Survival was almost statistically significantly shorter and was, in
fact, shorter in the subgroup of ECOG performance status.
That was, in fact, an issue that came to light in that
advisory committee, that not all validation studies are going to be positive
and it is not as simple as saying, well, with crossovers at progression we are
going to dilute survival differences. At
times markers don't give a reliable assessment of what the ultimate clinical
benefit will be. And, one of the
complexities here is when those validation studies are quite unfavorable what
happens?
DR. PRZEPIORKA: Dr.
Dagher?
DR. DAGHER: Just to
follow up, this is why Dr. Pazdur was emphasizing this concept of an overall
development plan because we talk about confirming clinical benefit in the exact
same population or in different populations, the fact is that you could have
for a variety of reasons, as Dr. Fleming mentioned, studies that are
"designated" as those that are going to be supportive for approval
and, yet, those either aren't completed or when they are completed they don't
show the results you expect.
This is why we encourage sponsors to sort of have a broad
view of the development plan, meaning that we would like to have, you know,
several trials ongoing or in the process of being developed that could
ultimately support that full approval.
Like in the irinotecan example I provided, because there were other
large randomized studies being conducted, even though they weren't designated
as those that would be reviewed for confirmation of benefit because they were
ongoing they could provide that evidence.
So, when we talk about an overall development plan one of the things we
are talking about is having other trials ongoing even if they are not
necessarily "designated" at the time of the original accelerated
approval as the ones we are going to necessarily review for confirmation of
clinical benefit.
DR. PRZEPIORKA: Thank
you. I think we are going to stop here
for a break and we will come back for the open public hearing and Dr. Temple's
comments starting at 9:45.
[Brief recess]
DR. PRZEPIORKA: Is
there anyone in the public who wishes to make a comment? Now would be the time. Please come forward to the microphone in the
front of the room. Seeing no takers, we
will proceed to the discussion of the questions and Dr. Williams I think will
give us some introductory comments.
Introduction of the Questions
DR. WILLIAMS: I
don't know if Dr. Pazdur is on the phone; I don't hear a cough. I imagine that is going to be the rest of our
Division next week.
I just want to introduce you to the questions, sort of the
structure. Why don't you turn to
them? This morning there will be just
sort of general discussion questions that we want to take general principles
from to guide us as we go to specific areas.
In the afternoon we will look into the questions on lung cancer and have
a few voting questions if it seems that that will be helpful.
For this morning's session the first question is just on
survival. It will be a continuation of
what we have had here. The second
question is about time to progression.
We have had a lot of trouble trying to figure out how to do this. So, what happened is, you know, Dr. Pazdur
took all of my little questions and was going to throw them away. Instead, I stuck them in the appendix.
[Laughter]
So, what we need to do is to talk about time to progression
but also all of the different factors about time to progression, how important
are the different factors? In the
appendix I have sort of taken the different factors out to give you a little
idea of what we are talking about, if you need to refer to that, things like
relationships of time to death; whether patients are symptomatic; the magnitude
and precision of the benefit; whether or not there is a benefit out there that
has a survival effect for instance, whether that matters; how much does it
matter if the endpoint is highly reliable or if it is more fuzzy; toxicity and
the design, superiority versus non-inferiority.
I mean, you can come up with all kinds of scenarios but
these are the factors that we are often considering when we say is this
acceptable or not. So, there is a
question here that mentions each of these factors and if you need to think more
about them there is the appendix.
Then, there is the question of disease-free survival. We didn't really present on it but there is a
little discussion here. Basically the
issue is we have accepted disease-free survival in breast cancer, partly
because it is hormonal therapy and I think one of the early defenses was that
these patients were more symptomatic at the progression so it is more like
delaying symptoms. But others will argue
that disease-free survival itself is clinical benefit, that you don't have
known cancer and now you do and now you get toxic treatment. So, how you weigh in there I think will be
important to us as we move forward.
Those are really the main two questions for this
morning. Certainly, if you feel like
there are other questions or points that you want to discuss, that is
fine. So, I will turn it over to Dr.
Przepiorka.
Questions for Discussion
DR. PRZEPIORKA:
Thank you. Dr. Williams, just as
a point of planning for this discussion and trying to make sure we get
everything in, especially that last question which may actually have some
importance regarding hematologic malignancies, and recognizing the complexities
of the discussion for TTP, would you mind terribly if I took some of these out
of order?
DR. WILLIAMS: You
are welcome to.
DR. PRZEPIORKA:
Thank you. Let's start with the
first question for the committee.
Discuss the role of survival as an endpoint. Consider in your discussion the importance of
whether existing therapies prolong survival and the potential confounding of
survival results by patient crossover or where several subsequent therapies may
also affect survival.
We actually discussed this a little bit about four years
ago, if I recall. At that time I do
recall Dr. Pazdur very pessimistically stating there is no drug that really
improves survival in cancer so crossover shouldn't make any difference.
But I think in the modern era that is no longer true, or am I incorrect
about that? Dr. Grillo?
DR. GRILLO-LOPEZ:
Perhaps even before we start discussion we need to make a distinction
between survival as a goal and objective and survival as an endpoint. Survival is a goal for all of us here in this
room because we are all involved either in patient care or in some way trying
to better the lot of patients. You know,
I have taken care of cancer patients and survival is very important to me. I am a cancer survivor myself. Survival is very important to me. But it is a word that is very compelling and
that has a lot of emotional baggage behind it.
Perhaps because of that we are tempted many times to follow it with the
phrase gold standard and perhaps we shouldn't.
Perhaps as you said earlier in our discussions today in
considering TTP, and we will hear a lot about the pros and cons of TTP, we have
to divorce that from survival as TTP being a surrogate for survival because
survival is not a very good endpoint in fact.
I love survival as a goal, as an objective. I dislike it intensely as an endpoint because
it is subject to so many biases and a lot of people don't recognize that. The most important one may be that patients
do get subsequent therapies and those subsequent therapies may or may not be
active but there are extremes. There is
the patient who chooses to have the best possible care, who takes care of
himself, who follows treatment and who happens to respond to subsequent
therapies. He will have a longer
survival than at the other extreme, the patient who chooses to expedite his
demise ultimately, perhaps even through suicide. If you have done enough clinical trials you
will have had patients who committed suicide.
It can be subtle at times. It can
be as subtle as stopping your medication and no one knows about it but you. But we think it is just jumping under the
train; it is not like that. So, it is a
very biased endpoint. It has more
biases, in my mind, than TTP does.
DR. PRZEPIORKA: Dr.
Brawley?
DR. BRAWLEY: I am
sorry, are you talking about survival as measured in a randomized clinical
trial or are you talking about survival as simply increased time from diagnosis
to death as measured through comparing various trials?
DR. WILLIAMS:
Randomized trial as a primary endpoint.
DR. TEMPLE: It is
not that you couldn't be persuaded by a historically controlled trial but it
just almost never happens.
DR. BRAWLEY: I have
a second question which is more for Dr. Fleming and Dr. George. I sort of mentioned it to both of them. Are we assuming that increased survival in a randomized
clinical trial translates into a decrease in either overall mortality or
cause-specific mortality?
DR. PRZEPIORKA: Dr.
George?
DR. GEORGE: Since I
heard my name mentioned--yes, we talked about this at the break. Well, let's talk about lung cancer since we
are going to talk about it this afternoon, I think it is traditional to use
overall survival as the primary endpoint even though in many studies, if you
look at attribution of cause of death, there are quite a few deaths that are
not attributable to the treatment, not attributable to the disease but are from
other competing causes of risk. So, I
don't think we are assuming that. What
we are doing though is we are saying that we don't really know; we can't really
trust this attribution, first of all, in cause of death. Secondly, we wouldn't know quite how to
interpret, say, a difference in cause specific mortality, say in lung cancer in
this case, in the two treatments if there wasn't an overall survival difference
because we don't know what the full mechanism of action of the treatments is.
So, I think it is not true that we are assuming anything
about the different causes of mortality but what we are doing is saying that
the overall survival is the important thing in those kinds of settings.
DR. PRZEPIORKA: Mr.
Katz?
MR. KATZ: Well, I
think we have to be careful to talk both about the difficulty and the
practicality of each of these measures separately from the validity of these
measures as true measures of patient benefit because they are different
issues. It seems apparent that we don't
really have the capacity since we can't freeze time and we don't have computer
models to basically run clinical trials in the blink of an eye, we can't answer
the questions adequately.
I think Dr. Cheson said that the punch lines are likely to
be different for different disease settings.
I agree. But I think the other
thing is that the punch line in terms of whether a certain endpoint is really
an indicator of patient benefit is likely to be different for different
patients because different patients may view overall survival benefit of eight
months as something huge, whereas someone else, you know, may value
disease-free, progression-free survival and maintaining a constant in terms of
their current life styles as a higher benefit.
So, I think we ought to view all of these, and is each of them valid to
use as a measure and sort of add them as arrows in a quiver as opposed to
saying which is the best one to use because we have to use a lot of them I think
to get the right result.
DR. PRZEPIORKA: The
question not here that I would like to throw out came up with our journal club
back at home yesterday. We were
reviewing a paper where difference in median survival ended up being 1.2 months
but, because there were so many patients, the p value was 0.003. Dr. Williams I believe stated earlier that
survival, when considered the endpoint, was easy to measure because when it is
significantly different it is acceptable.
But here our group looked at a paper and said we still wouldn't change
therapy based on that. Any discussion on
what is a meaningful increase in survival?
Dr. Cheson?
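How a small difference becomes highly significant in a large trial can be sketched in Python. The numbers below are invented for illustration and are not taken from the paper under discussion: medians of 10 versus 11.2 months are assumed, both arms are treated as exponential with 1:1 allocation, and the p value uses the usual normal approximation for the log hazard ratio. The function name is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_two_sided_p(control_median: float, treated_median: float,
                       events: int) -> float:
    """Approximate log-rank p value when the observed hazard ratio equals
    the ratio of medians (exponential arms, 1:1 allocation)."""
    log_hr = log(control_median / treated_median)
    z = abs(log_hr) * sqrt(events) / 2
    return 2 * (1 - NormalDist().cdf(z))


# Same hypothetical 1.2-month difference, different numbers of deaths observed.
for events in (300, 1000, 2800):
    p = approx_two_sided_p(10.0, 11.2, events)
    print(f"{events} events: p = {p:.3f}")
```

Under these assumptions the same 1.2-month difference is far from significant with a few hundred events and falls below 0.005 with a few thousand, so the p value speaks to how certain the difference is, not to whether it is large enough to change therapy.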
DR. CHESON: Again
getting back to what I said before, it is all relative. Whether you are talking lung cancer, whether
you are talking follicular lymphoma or let's look at melanoma. We have some interesting drugs there. A difference of two months may be very
meaningful. Yet, if you look at that in
follicular lymphoma, as you know, we would go "pah."
DR. PRZEPIORKA: I
think Dr. Williams asked earlier for discussion of principles and I think he is
going to want some rather specific examples.
So, if you would like to discuss what you would consider meaningful
survival in a lung cancer patient versus a low grade lymphoma patient he would
probably be happy to hear those numbers.
[Laughter]
Just as examples of people who have long lives and short
lives.
DR. CHESON: Well, I
think also you have to look at whether you are talking front-line therapy or
relapse therapy and, as he also mentioned, the risk of the therapy. For follicular lymphoma in the relapse
setting I would think four to six months with a new therapy might be something
important, whereas that would be only of marginal interest in up-front where
some of the newer agents are, hopefully, getting us nine months to a year with
additional therapy.
This is a totally moving target, particularly in the
hematologic malignancies which, as you know, are far ahead of the solid tumors.
DR. PRZEPIORKA: Yes.
DR. CHESON: Every
time we get a new drug approved, the bar just gets set higher and higher. So, what you say today is not going to be
relevant in another six months for lung cancer, which I don't follow. Paul and Bruce can certainly comment much
better on what would be a meaningful endpoint.
I know when I was still in my former job they were talking about
response rates of interest in lung cancer being in the ten percent range. We saw that with Iressa and that would not
cut it at all in hematologic malignancies, even in the most aggressive of
those. So, it is a totally moving
target.
DR. PRZEPIORKA: Any
guiding principle you might come up with though? If drug A gives you two years benefit over no
therapy and drug B is coming along, how much more benefit would you want to
see?
DR. CHESON: It is
hard to give an absolute number.
DR. WILLIAMS: Dr.
Przepiorka, maybe I could focus that a bit?
DR. PRZEPIORKA:
Sure.
DR. WILLIAMS:
Because we have not, that I know of, not approved a drug that had a
survival effect that we really believed.
I mean, you also have to trade off the toxicity. But I think what we would really like to know
is when you have a drug with a survival effect out there, how does that affect
your acceptance of another endpoint that isn't survival? A lot of times these survival effects are not
so big--one or two months, as you mentioned, and that is what you have, maybe
it is a symptom endpoint, maybe it is TTP or another endpoint with another
drug. How does that, and what magnitude
of effect of survival would affect the way you looked at this endpoint?
You know, we don't have a definite comparative efficacy
standard but, nonetheless, I do think it is important we do consider these
things, whether there is a large survival effect or not.
DR. TEMPLE: You have
to be specific about the study. I mean,
if you have a standard therapy out there that you knew something about and now
along comes another drug and it actually shows improved survival, well, you
know something about this drug. It is
not worse than the other drug at least, and even if you are not bowled over by
the effect it is sort of showing you that it does something other than shrink
tumors. You might consider that as sort
of proof of principle and a statement that, well, it is at least as good as
what we have and actually it is probably better. Even if you think that one month is not of
particular value, it has told you something about the drug and what it can
do. Whether that becomes standard
therapy or not is a different question, but from our point of view maybe it has
shown the kind of effectiveness you want if it is not over-toxic.
DR. PRZEPIORKA: Dr.
Rodriguez?
DR. RODRIGUEZ: You
are asking about developing principles and I think that coming up with specific
numbers doesn't address a principle. I
think a concept of principle would be, as Dr. Cheson has said, that there
should be different guidelines for each malignancy. We are finding today that even within a
defined category of malignancies we, in fact, have many biological variants of
that same disease and we all have been bowled over at the recent meetings about
how we now have to start thinking of proteomics and genomics in the definition
of treatment for patients.
So, I think that this is, indeed, a moving concept and the principle
should be that the endpoint should be appropriate for the disease and that it
should be appropriate for the stage and/or status of the disease because
patients who are in relapse are different from patients who are being treated
in the adjuvant setting, or for metastatic front-line treatment, and/or for
post-transplant, or being considered for transplant, etc. I mean, I think we know as clinicians that we
manage all of these patients very differently so we should not have
"standard expectations" of any one of these categories of
patients. They should be different.
DR. LEVINE: I would
agree. I would add one more point to the
principle. If, in fact, the survival
benefit is a very small one, it would seem to me that I would want some
confirmatory advantage as well as far as symptoms are concerned, or toxicity,
or quality of life. So, one month in the
hospital, you know, on IV morphine, or whatever, is not necessarily something
that I would be aiming toward. I would
want that in a small survival difference.
DR. TEMPLE: We don't
really have authority to refuse a drug because its advantage over other therapy
isn't big enough. We have said publicly
that in oncology, unlike many situations where we would be obliged to approve
something even if it was inferior, we would not feel obliged to approve an
inferior cancer drug because there are serious consequences to that. But to insist that it be better is really not
within our statute. It doesn't have to
be better.
It is important to make the distinction between showing
that you are better as a way of showing that you work at all, which is what a
superiority study does, and showing that you are better because you have to
show you are better in order to be approved.
You really don't have to show you are better to be approved. The statute and the legislative history are
very clear that they were not trying to set a relative efficacy standard, much
as one might want to know that a new drug was better. But we can't insist on that. What we do is we find superiority studies
interpretable so that they show that the drug works. They also happen to show that it is better
but that is in some sense incidental.
DR. PRZEPIORKA: Dr.
Carpenter?
DR. CARPENTER: It
seems to me that a couple of things may be helpful. One is that we have diseases, hematologic
malignancies or breast cancer being examples, where there are a lot of
therapies that are at least somewhat effective and that probably do impact
survival. How one stacks up a new
therapy at a given stage in that setting and how one stacks up a new therapy
in, say, disseminated melanoma where I think there is probably no generally
accepted treatment that dependably improves survival are just going to be
different scenarios and you almost have to have different rules there.
The other thing that has to be factored into this, but
there is not a very quantifiable scientific way that such a committee always
does, is to try to balance benefit and toxicity. Richard Gelber's analysis in breast cancer is
one reasonably validated, not very scientific but it is an effort to quantify
this kind of balance. I am not
suggesting that we all adopt that but it is that kind of balance that I think
is going to have to be left as a non-quantifiable but important aspect of this.
DR. PRZEPIORKA:
Bruce?
DR. REDMAN: I think
it is important--in reading the question, you are asking about comparing in a
randomized trial against drugs that have proven survival benefit. I think that is a kicker because there are
Phase III trials out there with a survival endpoint and the comparator is a
drug that has never been proven to show a survival benefit. It may be approved. In melanoma it is DTIC, and DTIC has never been shown
to improve survival but it is used as a comparator. It may actually shorten survival; we don't
know. So, if you are going to accept
survival it has to be compared against a drug or a therapy that has been proven
to affect the survival, or one that we think does.
DR. TEMPLE:
Right. In a situation that you
describe we would never accept non-inferiority as meaningful, obviously, but if
it was superior, and ignoring your concern that the control might actually
shorten survival--that is a big problem because you do have to assume it is at
least neutral, in a study like that you would have to show an advantage over
the available therapy and the available therapy would just be there as your
placebo equivalent.
DR. REDMAN: Then the
advantage of that has to be predetermined up front, what is acceptable. Then we are back to what Dr. Cheson was
saying. You know, is what is acceptable in
stage IV untreated the same as the advantage in stage IV in someone who has
received two prior treatments, specific in lung cancer, melanoma, kidney
cancer.
DR. TEMPLE: I mean,
historically we have taken the position with the committee that if there is no
available treatment that works for people we grant accelerated approval based
on a showing of tumor response, time to progression, any one of a number of
non-clinical, borderline clinical endpoints.
We would never worry if somebody managed to show improved survival and,
as Grant said, even modestly improved survival.
That has always been the basis for approval if you can show it. What you can show is really determined by the
sample size you choose at the beginning as much as anything. I suppose if you made the study big enough
you could show improved survival that a lot of people wouldn't think is very
important. Historically, you would
probably advise us to approve it anyway.
That has been the pattern up till now.
DR. PRZEPIORKA: That
is a very telling comment that you just made though since we are supposed to be
approving drugs on the basis of clinical benefit, but I think I just heard you
say, if I can paraphrase this correctly, that we always approve drugs on the
basis of survival even if people don't think it is a very meaningful survival.
DR. TEMPLE: Yes, in
practice studies are hardly ever large enough to show a completely trivial
effect. So, we are in the 2-month,
2.5-month area and the recommendations we have gotten and our actions have
usually said that is good enough in solid tumors; that is the best you can hope
for so far.
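[Illustrative sketch, not part of the proceedings.] The point that what a trial can show is determined largely by its sample size can be made concrete with the standard Schoenfeld approximation for a 1:1 log-rank comparison. The sketch below rests on assumptions that do not come from the discussion: exponential survival (so the hazard ratio equals the ratio of medians), a two-sided alpha of 0.05, 80 percent power, and a hypothetical control-arm median of 8 months. The function name is also hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def detectable_median_gain(events, control_median_months, alpha=0.05, power=0.80):
    """Smallest gain in median survival (months) a 1:1 trial can detect.

    Uses the Schoenfeld approximation, events = 4 * (z_a + z_b)^2 / ln(HR)^2,
    solved for the hazard ratio, and assumes exponential survival so that
    HR = control median / treatment median.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    log_hr = -2 * z_sum / sqrt(events)       # detectable log hazard ratio
    hazard_ratio = exp(log_hr)
    treatment_median = control_median_months / hazard_ratio
    return treatment_median - control_median_months


# Hypothetical control median of 8 months; the event counts are arbitrary.
for events in (200, 400, 1600, 6400):
    gain = detectable_median_gain(events, control_median_months=8.0)
    print(f"{events:>5} deaths: smallest detectable median gain = {gain:.2f} months")
```

Under these assumptions the detectable gain shrinks steadily as the number of deaths grows, which is the sense in which a large enough study could demonstrate a survival improvement that many people would not consider important.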
DR. PRZEPIORKA: Dr.
George?
DR. GEORGE: Could I
address the second part--
DR. PRZEPIORKA: Yes,
please, yes.
DR. GEORGE: --the
confounding thing? This always puzzles
me somewhat. If you have two therapies,
let's say A and B, and then you have some other therapies that would be given
after, say, recurrence or at some later point and often you don't have very
good evidence that they have any effect, first of all. You might assume they do just to explain away
the reason you didn't get any difference in survival. But whether or not they do, let's suppose
that happens. You had a strategy of
giving A and B followed by whatever is available at the time that they have
recurrence, and let's suppose that that treatment does have some effect and
sort of obliterates any potential survival effect you would have gotten if you
had done an unethical study, say, to force people to stay on treatment and not
give them anything else no matter what happens--you couldn't do that, of
course, ethically--so what is the overall conclusion you would come to? To me, it is that the treatment strategy you
started off doing with A and B didn't work in terms of the outcome of overall
survival in the context of that disease and in that setting with other
potentially available therapies. So, in
fact, if treatment A was the comparator and treatment B was the new treatment,
in terms of overall survival you would say it doesn't have an effect. That is a simple answer.
Now, in terms of whether it is approvable, that means you
had better have thought through other endpoints that you might be trying to use
to get it approved. But in terms of
overall survival it didn't work and it is not worth all the discussion about,
well, maybe it was because we had all these other therapies or maybe it was
this or that. The fact is it didn't work
in this setting at this time.
DR. TEMPLE: The
trouble is if the only endpoint that leads to approval was survival, then this
active drug has just failed.
DR. GEORGE: Exactly.
DR. TEMPLE: Even
though if there weren't other therapies it would have been active in the usual
sense. That is the problem.
DR. GEORGE: That just
means you had better come up with the right endpoints and you had better not be
using overall survival.
DR. TEMPLE: That is
what we are here for.
DR. GEORGE: Well, I
am just pointing out that people spend a lot of time discussing why it didn't
work in terms of overall survival.
DR. TEMPLE: But that
is because historically there has been a bias, not surprising and not
unreasonable, in favor of a survival outcome because everybody knows that is
tangible, that is a real benefit with some expressions of concern even about
that. That is hard-wired. It is not subject to interpretation too much
and everybody likes it. The trouble is
the very things you are talking about can obliterate the ability of a drug that
could be valuable to show its effect.
That is what our trouble is, especially if the crossover is to the very
drug that is being studied which happens for any marketed drug all the time.
DR. PRZEPIORKA: Dr.
Cheson?
DR. CHESON: Harking
back to something Dr. Rodriguez said, these diseases aren't failing these
drugs. They are different diseases
looking for the right therapy. We have
certainly learned that in the hematologic malignancies where we started with,
you know, leukemias and now we have separated them out into a myriad of
different diseases. When we approve
drugs, as we have seen recently, we are going to miss active drugs because the
population in which they work is obscured by all the patients for whom the drug
doesn't work, and there are some drugs that you all are approving that only
work in small populations of a certain disease and, yet, they are getting
generalized to the disease group at large and both of these are unfortunate
circumstances for a variety of reasons.
So, I think we need to recognize--and we certainly will be doing that
more and more and we certainly do this in leukemias and lymphomas--that these
are a bunch of very different diseases and we are going to have to be studying
them like that. Instead of studying
non-small cell lung cancer, we are going to have to find out, you know, what
are the different subsets and how they respond differently to drugs like Iressa
etc., else we are just going to miss effective drugs and we are going to be
spending a lot of money on ineffective therapies for patients in whom they don't
work.
DR. PRZEPIORKA: Dr.
Grillo?
DR. GRILLO-LOPEZ: I
want to go back to what Dr. Rodriguez and Dr. Levine said earlier and add to
what they said, that another consideration in choosing the appropriate endpoint
and having an idea of what the expected magnitude of the effect should be is
whether you are evaluating that new agent as monotherapy as opposed to that new
agent within a combination therapy. If
you are evaluating it as monotherapy and you are comparing it one-on-one, like
the DTIC example that was provided by Dr. Redman, then I believe a survival
endpoint becomes even less desirable because it is seldom that you see a single
agent be curative in any malignancy.
There are some exceptions but this is seldom.
The other extreme is when you are evaluating within a
combination therapy. Now, we do have
combination therapies that are curative in at least some percentage of patients
with certain tumor types. However, how long
did it take us as a research community to find those optimal combinations? It takes years and years and years. Consider in your minds the ones that are
available and you know how long it took to get there. It took many years after approval. Now, are you saying that you would deny the
oncology community the opportunity to research this via an approved drug that
can be worked into a combination, or that you would deny patients a drug that
has shown efficacy in Phase II, that has reasonable activity, because you have
not determined the optimal combination that would be curative and then you can
use a survival endpoint? I would say no,
you can't do that. Other endpoints are
suitable to that outcome because it is very unlikely that during development, pre-approval,
you are going to have the optimal combination identified.
DR. PRZEPIORKA: Dr.
Williams?
DR. WILLIAMS: There
is an underlying question that I don't think has really been heard. Let me just give you a situation. You have a marginal survival benefit out
there. You are accepting TTP now; you
believe in it as clinical benefit, let's say, but you are getting now this
survival benefit over here so there are a couple of different settings. One is something like fairly marginal,
two-month median survival increase. You
have a trial over here that is not even going to evaluate that; it is just
going to use time to progression alone because of its clinical benefit too.
So, what is the tradeoff here? When do you have a survival effect here that
is so significant that you can't do that trial; it is not ethical basically to use
TTP to approve a drug? You wouldn't make
the tradeoff for TTP because you have something else over here that is so
good. One setting would be that you
compare directly to this drug and you beat it in TTP. If you accept TTP, would that lead to approval?
Another would be that you evaluated TTP in another setting
and you didn't beat it; you just showed that you had a TTP benefit. The question is when does the survival effect
proven in one setting affect you so much that you can no longer accept this endpoint
in another setting?
The way this happens is we have trials coming along. All of a sudden, one of these drugs is
approved based on some survival benefit.
It might be a little one; it might be a big one. Then, at what point does that become so significant
that it affects your ability to consider a different endpoint such as TTP?
So, that is the tension that I want to hear some discussion
on. For instance, in the colon cancer setting, the lung cancer setting where
you have one- or two-month survival benefit, does that then mean that you
wouldn't even look at TTP as a separate benefit or that you would only look at
it if you were beating that drug that had the little survival benefit? So, when I am talking about the size of
survival benefit it is not necessarily would you approve it based on survival
but how does that trade off and affect you looking at other endpoints?
DR. PRZEPIORKA: If I
hear your question correctly, when would we actually insist on using survival
as an endpoint and not use anything else?
DR. WILLIAMS: That
is assuming that originally you had already accepted another kind of endpoint,
such as TTP.
DR. TEMPLE: I assume
this comes up because of the disconnected nature of the approvals. If there was something out there that had a
survival benefit you would compare the new drug with it because you couldn't
really not.
DR. WILLIAMS: That
is a question though. If you have a very
small survival benefit you either have to say I am going to beat that drug, do
a non-inferiority study which is impractical, or this is so small that it is
not of any real meaning.
DR. TEMPLE: But it
would be the standard and everybody would use it, but what you are saying is
now you have just suddenly discovered something and you have all these people
developing drugs without a comparison out there because they didn't know about
it.
DR. PRZEPIORKA: Dr.
Reaman?
DR. REAMAN: These
trials are being designed and conducted to demonstrate a clinical benefit, not
to dictate and define what the standard or a new standard is going to be. Correct?
DR. WILLIAMS: Yes,
we don't do those kind of trials. We
don't do trials to develop standards.
So, yes, they are all being developed for clinical benefit but it is a
different nature of clinical benefit here, the survival versus other drugs
which might be TTP, let's say.
DR. REAMAN: But I
think the question you raise really has to be considered within the context of
the disease and the patient population in which the study is being
conducted. I just don't think there are
any absolutes that can be given, yes/no, will we always demand survival as the
ultimate endpoint and can time to progression replace it.
DR. TEMPLE: Can I
refine the question a little more? I
guess if there were something that had a major effect in a particular setting,
stage of disease--let's leave leukemias and cures, but had a major effect, most
people would think the right way to develop a new drug is to compare it with
that drug or add it to it or something like that. Right?
So, I think Grant is asking if you developed something that
had an effect like that while other studies were going on that were looking at
response rate, time to progression, would you be happy approving a drug not
knowing how its survival effect compared to this thing that is now there? That is very important to people who are
developing drugs without knowledge of what other people are doing at any given
time. Does that capture your question?
DR. PRZEPIORKA: Do
you have a response?
DR. REAMAN: I would
say yes. I mean, it may take a very long
time to know about some of the impacts of drugs being approved and the impact
that they could have on survival long-term, particularly using combinations.
DR. PRZEPIORKA: Dr.
Grillo?
DR. GRILLO-LOPEZ: I
have to say this is fun. I am
practically jumping out of my seat here to address what Dr. Temple said. I did that.
I developed Rituxan and we didn't find out until after the year 2000
that it was adding to the cure rate in intermediate grade lymphoma. We presented it to you for low grade lymphoma
in a relapse or refractory setting, where survival was not an issue because it
was not the appropriate endpoint, and you approved it. So, this is an example of an agent that had
the potential of being curative within a combination but got approved
earlier on for relapse/refractory combination with a single-arm trial where
survival was not the endpoint, and it was a regular approval.
DR. TEMPLE: Yes, we
are well aware that the initial approvals of drugs do not define their total
use in the community. One of the reasons
for accelerated approval was a barrage of arguments, often from the oncology
community, that said, look, if you don't have the tools to do it, it is just
impossible to develop drugs properly.
Within limits at least, we bought that idea. That is why half of all drugs at least are
now approved under accelerated approval based on response in refractory
disease, the thesis being if refractory disease responds it is probably useful
other places, and people are going to do studies, there will be cooperative
studies and all that.
So, I think there isn't any particular debate about that
question. There still is a lot of
concern about what the standard should be given past guidance we have gotten
for other kinds of approvals, not really most about accelerated approval which
is sort of at least moderately settled if we know we could get the definitive
studies done later. It is what should
the standard be in first-line therapy given sample sizes, given crossover, and
maybe that should be different from one tumor to the other. That is one of the things you are talking
about.
DR. PRZEPIORKA: Dr.
Redman?
DR. REDMAN:
Regarding Dr. Williams' question, I guess a lot depends--you know, if
you are talking about two randomized trials and if the comparator in the two
trials is different, if the comparator arm is different and one shows a
survival advantage while the other one was powered to show a time to
progression advantage, I mean I guess you are never going to dissolve ODAC, you are
going to have to ask somebody. I don't
know the answer.
But if the comparator is the same and you said to them at
the end of Phase II, listen, we will accept this as a valid endpoint as a
clinical benefit, I think you have to.
DR. WILLIAMS: But it
sounds like it is a value judgment and basically there is no over-arching rule
that we are going to apply across the different diseases and it will be a
case-by-case kind of discussion.
DR. PRZEPIORKA: I
think we have beaten survival to death--
[Laughter]
Just to summarize, I think we started out with excellent
philosophical points from Dr. Grillo, which is that survival is a goal but not
necessarily an endpoint, and that survival can be biased, as is pointed out in
the questions, by subsequent therapy that is not standardized. However, under those circumstances we have to
ignore the confounding factors if the original agreement was that we would look
at survival; we should have different guidelines for each biological subset,
meaning the disease, the status or any biological subset within a disease or
disease status. At this point we can't
demand survival under any specific circumstances. Everything has to be looked at individually.
Any other comments to add to that? Dr. Fleming?
DR. FLEMING: Well,
it may be just a bit of a reinforcement but, to my way of thinking, choice of
endpoints ought to be based on what it is that patients really care
about. In oncology, certainly, cancer
has a huge effect on duration of survival and, certainly, from a patient's
perspective to prolong survival would be of profound importance. That doesn't mean though that that is the
only benefit that patients would look to.
I would go back to Mr. Katz' comments, there may well be other measures
but I would ask that we distinguish whether those other measures unequivocally
reflect tangible benefit to patients.
Others that do, that we have heard a lot about, are disease-related
symptoms or, as he was talking about, patient's functional status, being able
to carry out normal activities.
Those would all be very tangible benefits. Those need to be put in contrast to the
mechanisms by which we hope to achieve those benefits. In oncology classical measures would be tumor
burden type measures such as response and time to progression. But I would only caution it may well be that
we affect those measures which are the treatment mechanisms without, in fact,
impacting the clinical endpoints of interest.
I would argue then that our primary endpoints for registration should be
these measures that unequivocally reflect tangible benefit or, as we will talk
about a little bit later on, measures of biologic activity that have been
validated.
I would like to reinforce one more thing that Dr. George
pointed out, and that is the argument that has been given against survival is
that it may be impacted by subsequent interventions. I would argue again from a patient's
perspective that the goal here is to formulate regimens which, when implemented
in the best standard care approach in clinical practice, would prolong survival
and improve quality of life. So, if I
randomized to an experimental therapy against a control and secondarily
supportive interventions allow for equal survival to be achieved, that is the
truth. That is the truth. Even if the experimental therapy would give
you an improvement in time to progression, if supportive care improves in the
control arm such that there is no difference, that is the truth.
Now, it may be though that we have the wrong endpoint. In this case there may be clinical benefit in
other measures. It may be that we are
reducing the need for other toxic interventions, etc., in which case those
factors need to be considered as well.
But the one thing that complicates this, and what Dr. Temple
referred to before, is if best supportive care isn't what is being delivered to
the control regimen but, rather, cross in to the experimental therapy so that
you are looking at experimental now versus experimental later. That is answering the right question if you
have established that experimental is efficacious and you are just looking at
what is the optimal timing for delivery.
But it is a circular issue if you are really trying to find
out whether or not it is truly effective.
I realize going down this path is going to be a very complicated pathway
but I question the ethics and the scientific validity of crossing in to an
experimental therapy that hasn't been established to be effective. Is it imperative to do so? No, it is not. An example would be the Avastin trials that
have just been done in advanced colorectal cancer. Is it possible if you do that you will still
be able to show benefit? The answer was
yes, as was seen with Herceptin in advanced breast cancer.
But in general, as Dr. George had pointed out, crossing in
to a best available standard of care is the scientific question of
interest. That is not a bias. That is not diluting survival. That is the true effect on survival and if
you are not going to impact survival in that way, then a different measure
could be the relevant approach but it, again, should be a measure that
unequivocally reflects tangible benefit.
DR. TEMPLE: I just
want to make a distinction between the best treatment of cancer patients and
whether this drug is an effective drug because they are not the same
thing. Tom, you are saying that if order
doesn't matter, if you are studying drug A versus some treatment and now, when
you progress everybody gets some other drug, if that drug turns out to be
effective enough, not necessarily more effective than the test drug but
equally effective, say, it could obliterate or substantially reduce the
apparent survival effect.
Now, that may be true information and useful information
for the community of people treating cancer but it gives you the wrong answer
on whether drug A works if survival is your endpoint. And, that is our worry. Also, if the drug is already available, if
you are talking about a Phase IV study, you can rail about the undesirability
and lack of ethics of crossing people over to the test drug but they are all
going to be crossed over to the test drug anyway despite your view, which means
that in many cases the confirmatory studies we want are perfectly predictably
going to be much less powered than you wanted them to be in the first
place. That is a consequence of
insisting on survival.
So, I need to press this point because it comes up in
conversations all the time and it is very important for us to distinguish
between is this an effective drug and, therefore, should be marketed and what
is the best way to treat people. It may
be that, you know, using the other drug first is just as good, or the sequence
matters, or any one of a bunch of conclusions.
That is all fine. But what we
want to figure out and we want to be able to tell people who come to us for
advice how to figure out is what do you need to do to show that the drug
works. And, I am very worried about
survival where crossover is either predictable or unavoidable for the reason I
gave before. I am sure somebody could
model this. You probably need studies
four times the current size, five times the current size.
So, if survival is going to be the endpoint at least in
certain settings, then everybody has to sit down and say, okay, we are not
going to allow crossovers or we are going to try as hard as we can to prevent
them, or we are going to do studies five times the size we are doing. You can't keep saying survival is the
endpoint and not account for those things or then you get failure to meet the
desired endpoint and then you are scuffling for what you really meant in the
first place.
I am hoping for real straightforwardness in this. If that is really, in practical terms, almost
impossible to do, then we should hear that and not advise people to try to do
it because they are not likely to be successful if the thing they cross over to
is active, or somebody should model these things. It wouldn't be very hard. We could all do it. I couldn't but you could. We could model what the consequence of
crossing over to an active drug is. You
could calculate what the effect on power would be. But we really need to know the answer because
otherwise we can't give anybody intelligent advice.
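[Illustrative sketch, not part of the proceedings.] One simple way to model the consequence of crossover for power is to treat it as shrinking the observable log hazard ratio for survival and then apply the Schoenfeld approximation for a 1:1 log-rank test. The dilution model, the function names, the true hazard ratio of 0.75, and the dilution fractions below are all assumptions chosen for illustration. A real analysis would depend on when patients cross over and on how active the drug is at that point.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

_Z = NormalDist()


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Deaths needed for a 1:1 log-rank test (Schoenfeld approximation)."""
    z_sum = _Z.inv_cdf(1 - alpha / 2) + _Z.inv_cdf(power)
    return ceil(4 * z_sum ** 2 / log(hazard_ratio) ** 2)


def diluted_hazard_ratio(true_hr, dilution):
    """Hypothetical model: crossover removes a fraction of the log hazard ratio.

    dilution = 0 means no crossover effect; values near 1 mean the survival
    difference is almost entirely erased.
    """
    if not 0 <= dilution < 1:
        raise ValueError("dilution must be in [0, 1)")
    return true_hr ** (1 - dilution)


def power_at_events(hazard_ratio, events, alpha=0.05):
    """Approximate power of the log-rank test for a given number of deaths."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    return _Z.cdf(abs(log(hazard_ratio)) * sqrt(events) / 2 - z_alpha)


true_hr = 0.75                      # assumed effect with no crossover
planned = required_events(true_hr)  # trial sized as if nobody crosses over
for dilution in (0.0, 0.25, 0.5):
    observed_hr = diluted_hazard_ratio(true_hr, dilution)
    needed = required_events(observed_hr)
    achieved = power_at_events(observed_hr, planned)
    print(f"dilution {dilution:.2f}: observable HR {observed_hr:.3f}, "
          f"deaths needed {needed}, power at planned size {achieved:.2f}")
```

Because the required number of deaths scales with the inverse square of the log hazard ratio, halving the observable log hazard ratio roughly quadruples the deaths needed. That is the arithmetic behind a guess of studies four times the current size, and it holds only under the dilution assumed here.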
DR. PRZEPIORKA: Dr.
George, last comment?
DR. GEORGE: Just to
follow-up on that a little bit, you certainly could model it but it would be
based on assumptions. And, one of the
assumptions that seems to be behind this worry about the crossover is that when
you cross over, the agent that is crossed over to, the same one, is going to have
equal effect. In fact, that might
entirely be wrong.
DR. TEMPLE: Fifty
percent.
DR. GEORGE: Well,
even if you assume some percentage, you just don't know. That is why you are worried about it I
guess. But I think there are examples
that show it is the timing of it that is critically important. So, later, at progression, it may not have
the same effect or maybe a very small effect so you could still get a survival
benefit. But I think your point is
correct that you just have to think clearly about those endpoints, and if you
think there is a possibility that that could occur survival may not be the best
thing. You may get the right answer in
terms of the strategy of using it but the wrong answer in terms of whether it
is an effective agent.
DR. PRZEPIORKA:
Let's move on to the questions regarding disease-free survival. The FDA has stated that disease-free survival
can support regular drug approval in cancers where the majority of recurrences
are symptomatic. Others propose that
prolongation of disease-free survival should support regular approval in all
clinical settings because a delay in cancer detection or a delay in the need
for toxic cancer treatment is of clinical benefit.
So, question number three is discuss whether disease-free
survival is generally an adequate endpoint for approval of cancer drugs or
whether additional evidence is needed, such as data demonstrating or suggesting
that disease-free survival is a survival surrogate. So, I guess the question is, is disease-free
survival an endpoint or is it only a surrogate.
Dr. Brawley?
DR. BRAWLEY: I think
they are two different things. I think
disease-free survival without increase in survival could be a patient
benefit. This is a purely hypothetical example
where the patient's disease is suppressed for a prolonged period of time. The patient is without symptoms because of
that suppression of disease. When that
disease comes back and flares up, perhaps even more aggressively than if it had
not been suppressed by the original drug--a purely hypothetical position, I
think there is patient benefit there.
So, again, I am lapsing into what Dr. Cheson and Dr.
Rodriguez have stressed before, that it is a disease specific entity and
perhaps Dr. Redman is correct that we are going to prolong the life of ODAC by
making these arguments but I really do think you can use disease-free survival.
DR. PRZEPIORKA: Dr.
Cheson?
DR. CHESON: I was
just thinking but, no, I do agree with Dr. Brawley. I think disease-free survival is important,
that the patient has no disease. The
patient is generally seeing the doctor less commonly, has less complications,
no treatment, less lab tests. So, even
if there isn't a survival benefit there is generally a quality of life benefit
and certainly the patients, as was mentioned before, would rather not have
disease than to have disease around but it is just not progressing. But certainly from the quality of life
aspect, visits and labs, and all that stuff, it is clearly a benefit. Now, whether that is important for regulatory
approval of drugs is I guess something we are talking about.
DR. PRZEPIORKA: I
would just like to add that I would also agree that disease-free survival is of
actual importance, not a surrogate, specifically in the leukemia patients.  Patients with acute leukemia who relapse end up
having to drop their job; put their lives on hold; get back to the hospital and
be on therapy for another six months.
And, being able to delay that by one or two years makes a huge difference
in their life, especially in young adults who are primary caregivers in a
family. So, I don't think disease-free
survival as an actual endpoint should be limited to the adjuvant setting. There are some diseases now with very high
response rates where disease-free survival could probably be a good
endpoint. Dr. Taylor?
DR. TAYLOR: Well, I
would agree that disease-free survival is a good endpoint but I think, again,
you have to go back to it being very individual because some of the therapies we
use to maintain a disease-free survival are very toxic, as with interferon with
melanoma patients and it is something that you have to really weigh for each
disease and each drug. I don't have any
problem with disease-free survival but it may not be important if that entire
time is spent doing high-dose chemotherapy and seeing the doctor anyway. Bruce already pointed out if you are going to
have less doctor visits and less troublesome and better quality of life, that
is an important aspect of it.
DR. PRZEPIORKA: Dr.
George?
DR. GEORGE: I just
wanted to be clear on this. This is a
composite endpoint. It obviously can be
closely related to survival just by definition almost. You know, if you die without a recurrence, I
mean, that is an event in disease-free survival. So, it is going to be important to know in
whatever setting we are talking about what is the likely percentage of patients
that that might occur for. What are the
sort of competing risks of death in the given disease setting you are talking
about, and what is sort of known about the expected distribution about time
from recurrence to death. Those are
important considerations about whether this is going to be an important
endpoint. I think in general it is a
fairly good endpoint in a variety of settings because of those things but it
just needs to be considered.
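[Illustration, not part of the proceedings.]  The composite definition Dr. George describes can be sketched in Python as follows.  This is a minimal sketch under stated assumptions: the record layout and the names Patient, dfs_event, and share_of_deaths_without_recurrence are hypothetical, times are in months from randomization, and the first of recurrence or death counts as the disease-free survival event.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Patient:
    # Hypothetical record layout; all times in months from randomization.
    recurrence_month: Optional[float]  # None if no recurrence was observed
    death_month: Optional[float]       # None if alive at last follow-up
    last_followup_month: float


def dfs_event(p: Patient) -> Tuple[float, bool, str]:
    """Return (time, event_observed, event_type) for disease-free survival.

    The DFS event is whichever comes first: recurrence, or death without
    a prior recurrence. Patients with neither are censored at last follow-up.
    """
    candidates = []
    if p.recurrence_month is not None:
        candidates.append((p.recurrence_month, "recurrence"))
    if p.death_month is not None:
        candidates.append((p.death_month, "death without recurrence"))
    if not candidates:
        return (p.last_followup_month, False, "censored")
    time, kind = min(candidates)  # earliest event defines the endpoint
    return (time, True, kind)


def share_of_deaths_without_recurrence(patients: List[Patient]) -> float:
    """Fraction of observed DFS events that are deaths without recurrence."""
    kinds = [dfs_event(p)[2] for p in patients if dfs_event(p)[1]]
    if not kinds:
        return 0.0
    return kinds.count("death without recurrence") / len(kinds)
```

The second function corresponds to the percentage Dr. George says should be known in a given setting: the larger the share of events that are deaths without recurrence, the more closely disease-free survival tracks survival by construction.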
DR. PRZEPIORKA: Dr.
Carpenter?
DR. CARPENTER: I
think this is a critical area. I talked
about balancing whatever these considerations are with symptoms. To be a little bit more specific,
disease-free survival without major symptoms of disease or major symptoms of
treatment is something that I think almost all of us would say would be
important. The bigger the impact of
disease symptoms, the bigger the impact of symptoms from treatment, I think you
would have to down-regulate that same benefit.
DR. TEMPLE: Wouldn't
you presume that there are no symptoms from the disease if you are disease
free? I mean, what would we be meaning
if not that?
DR. CARPENTER: Well,
let's give an example of allogeneic bone marrow transplantation.  You have no leukemia after your transplant
but you have graft versus host disease which compromises your quality of life.
DR. TEMPLE: No, I
understand about toxicity but not--
DR. CARPENTER: If it
is disease-free, then you are free of disease and you have no symptoms from the
disease. You are right.
DR. TEMPLE: Yes.
DR. CARPENTER:
Absolutely.
DR. WILLIAMS: Dr.
George, you mentioned duration from recurrence to death. I guess what you are saying is if there is a
longer duration between recurrence and death it is a less important
phenomenon. Perhaps for instance, you
know, PSA recurrence in prostate cancer might be many, many years. Is that what you meant?
DR. GEORGE: This
needs to be considered. For example, if
you have a very short time from recurrence to death you really are talking
about sort of the same thing, especially if you have a lot of deaths that occur
without recurrence. But, you know, you
need to know that in a given setting because when you look at disease-free
survival, for example in a setting where there is a long time between
recurrence and death the curve is going to look real short and fast and then
you have to kind of worry about that translation and relationship to survival. But that doesn't mean it is not a good
thing. I think it is a very valid
endpoint in many settings and is a good one.
DR. PRZEPIORKA: Mr.
Katz?
MR. KATZ: Actually,
what I wanted to cover was covered. I
would just agree that definitely, you know, from a patient's standpoint it is a
benefit to have increase in disease-free survival.
DR. PRZEPIORKA: Dr.
Reaman?
DR. REAMAN: I would
just argue that I don't think disease-free survival always connotes the absence
of symptoms for every disease. Certainly,
individuals who have had surgical interventions for management of their initial
disease may have long-lasting symptoms as a result of that. Patients with brain tumors may similarly have
symptoms which aren't going to disappear.
I also agree with Dr. Przepiorka that disease-free survival should be an
endpoint and not necessarily be considered as a surrogate for survival.
DR. PRZEPIORKA: Dr.
Levine?
DR. LEVINE: I was
going to say the same about surrogate.
This is not, to me, a surrogate; this is a valid endpoint. The only other point that I would like to
mention is that if this is the only endpoint you will exclude some drugs
perhaps unnecessarily. In other words,
to get into that equation you have to be a responder in some sense and there may
be other benefits of drugs that we are going to talk about later. But, to me, this is an extremely valid, real
endpoint.
DR. TEMPLE: Well,
you almost need either the adjuvant setting or something where there are a lot
of complete responses or something not commonly seen in solid tumors certainly.
DR. PRZEPIORKA: Dr.
Fleming?
DR. FLEMING: My own
sense of whether I would consider a surrogate or not a surrogate would depend
on the setting. We have heard a number
of different potential benefits that could arise or could be accrued by having
a delay in disease-free survival. One is
if, in fact, this is a disease where at recurrence there is clear and frequent,
if not standard, occurrence of symptoms, then clearly it is, in fact, a direct
measure of clinical benefit.
One, of course, might argue that if that were the case then
a direct symptom outcome measure ought to be able to also show that overall
benefit. It has also been argued that
there are potential psychological effects where, if we delay recurrence or
detection of recurrent disease, there is that overall benefit to the
patients. I would also accept that
although that psychological benefit I would consider to be of much less
profound importance than an actual delay in death.
As has been pointed out, what is the tradeoff in benefit to
risk? If what we said is we are going to
delay by six months or a year the knowledge of recurrent disease, how much
toxicity would you accept for that benefit against saying I am actually going
to prevent the recurrence of disease; I am curing you of this cancer in 25
percent of the patients? I would
consider that, as a patient, a far more profound piece of information, that I
have a 25 percent increased chance of being cured than a delay in a year of the
time in which I am going to have recurrence of disease.
So, it does become important to understand what it is that
we can reliably conclude from a delay in disease-free survival. It is in part, in those cases where it is
symptomatic disease, a direct clinical efficacy endpoint. In cases where it isn't it could also be a
very relevant measure but now it is in the arena of a surrogate. We have to be able to know whether or not a
delay in disease-free survival is reliably telling us we have a delay in death.
Maybe later in the discussion I will comment that there are
specific standards that are emerging for what that evidence would have to be,
but at this point I want to just distinguish that there are two different
realms in which disease-free survival would be of interest. One is a direct clinical endpoint through the
symptom aspect and another is through its surrogacy for survival.
DR. PRZEPIORKA: Dr.
Reaman?
DR. REAMAN: I guess
I am still unclear about the symptom issue and why it would be a surrogate for
survival. I am not aware of any disease
that is easier to manage once it recurs.
So, I don't understand why disease-free survival couldn't be an endpoint
for determining clinical benefit. It is
a clinical benefit if you prevent something from recurring.
DR. FLEMING: Yes, I
think what I was saying is if, in fact, there was something tangible, such as
symptom prevention or occurrence of symptoms or the psychological benefit,
those are, in fact, direct clinical benefits.
But that is separate from whether this is also predicting a prolongation
of survival.
DR. REDMAN: But if
it prevents the disease from coming back it could be predicting a prolongation
of survival.
DR. FLEMING: Well,
in fact, that is the hope and, yet, there needs to be some validation. Of all surrogates, this is one that tends to
be much more plausibly valid, that if we can delay recurrence of disease we are
very likely to be prolonging survival.
DR. PRZEPIORKA: Mr.
Katz?
MR. KATZ: I think
given the fact that we are talking about diseases which can't be cured, I think
we have to view this in terms of providing patients with options that they
might not otherwise have that a rational person could perceive to be a
benefit. Something like disease-free
survival may be absolutely critical to someone based on where they are in their
life. Someone may be in a position where
being able to function without the disease for some number of years may be
critical to putting their family in a financial position so they feel they have
done the right thing. I mean, there is a
lot of theory around this but I think it is all about patient options and that
clearly provides patients with options that they don't have.
DR. PRZEPIORKA: Dr.
Carpenter?
DR. CARPENTER: I was
going to say something similar. Most of
the situations we are dealing with here have to do with new agents for solid
tumors and, in fact, curative medical treatment is generally unavailable for
all these. So, things based on a
theoretical increase in cure are a little bit far out. Whereas, things that keep your disease from
coming back for a tangible period of time or that keep your disease simply
controlled for a tangible period of time seem to be a very direct benefit for
that person.
DR. PRZEPIORKA: Dr.
Brawley?
DR. BRAWLEY: No.
DR. PRZEPIORKA:
There are two very interesting questions that are lumped into number
four which come to the meat of what we do when things come here. Consider whether the adequacy of disease-free
survival varies with the clinical setting in terms of an endpoint. B is treatment where the investigational drug
shows prolongation of survival when randomized against an effective standard
therapy where the standard therapy has already been shown to impart a survival
benefit.
Would this august body be inclined to recommend approval
based on disease-free survival for the investigational drug when compared
against a drug that has already been shown to have a survival benefit? Dr. Carpenter?
DR. CARPENTER: Yes.
[Laughter]
DR. CHESON: This
gets back to what Dr. Fleming was talking about before, that it is a
bi-functional endpoint, the surrogate nature and the non-surrogate nature. Again, it is going to vary a bit with disease
but I think in general--and I would think also when we were talking about time
to progression before, it is not like you looked at survival and you didn't
look at all the other endpoints along the way, like response rates and time to
progression and disease-free survival.
So, you will have some parameters to compare to this drug or this
regimen that caused prolongation in survival and also had some point of
disease-free survival and also had some time to progression and also had some
response rate, looking at it backwards.
So, you do have something to compare it against, which may give a little
more support to using it as a surrogate endpoint in that particular condition.
DR. PRZEPIORKA: Dr.
Carpenter?
DR. CARPENTER: And
from a regulatory standpoint you just told us it doesn't have to be necessarily
better to be approvable. It just has to
be would we consider this evidence of effectiveness, and I think probably so.
DR. TEMPLE: Yes, I
think B goes to, you know, you have one thing that you know has an
increase in actual survival. Now comes
along something that is actually better on disease-free survival which you
don't know the effect on actual total survival.
How worried would you be not knowing that last?
DR. CARPENTER: Well,
if you were to grant accelerated approval, I would think that would be the very
right setting and you would hold that other in abeyance--
DR. TEMPLE: That is
okay, other people would also want to know whether they could get regulatory
approval on the basis of being superior to a drug that is already hot stuff in
one measurement that isn't ultimate survival.
DR. PRZEPIORKA: Dr.
Redman?
DR. REDMAN: I sort
of agree with the statement that that would be fine but I would really like to
see the data. What if the disease-free
survival advantage was compared with the second-line regimen that prolonged the
survival of the standard therapy that was given after those patients relapsed
and they lived longer because they had the second therapy and now you have
brought it up front-line and there is no second-line?
DR. WILLIAMS: This
is disease-free survival here.
DR. REDMAN: No, no,
but something that has shown overall survival advantage. It may be that the overall survival advantage
is then partly due to the regimen that you are now bringing up front.
DR. WILLIAMS: Well,
I think what the question is meant to say is that you have a treatment that
does improve disease-free survival. We
know that; it is not secondary therapy.
You have another treatment that comes along. It is either under-powered or the data aren't
yet mature enough and it beats that treatment in disease-free survival but you
don't yet know that it has the survival effect yet it is better in this
surrogate or also maybe clinical benefit endpoint itself. Is that enough or are you going to be nervous
about approving it until you see a lot more survival data?
DR. REDMAN: I guess
I would have to know what the agents are, what the disease is. I mean, overall what you are saying is
intuitively correct. If it beats it in
disease-free survival and, you know, the other one has gone out longer and
shown an overall survival advantage, yes.
But I couldn't in a blanket way say that.
DR. PRZEPIORKA: And
I think a number of folks have already indicated that under the right
circumstances disease-free survival is the endpoint. So, we would not be so worried about survival
to demonstrate efficacy as opposed to let's look at the survival information
when it is available for safety. Dr.
Keegan?
DR. KEEGAN: Yes, I
would like you to actually revisit the right circumstances because the right circumstances
seem to be integrally involved with the toxicity of the agent. I think this is important if we need to meet
with sponsors and tell them, well, it depends upon how toxic you are and your
evaluation of the toxicity of this agent and the impact on the quality of life
of the patient. Are you suggesting that
for an agent which has more than minimal toxicity for adjuvant treatment or
more than extremely short course that we need to be measuring some aspect of
the quality of life and, if so, what aspects do you think are important? Because if, in fact, they lose on that they
have to have as a backup plan a trial powered to look at survival.
DR. PRZEPIORKA: Dr.
Grillo?
DR. GRILLO-LOPEZ:
Although there are exceptions, usually you are going to be evaluating an
agent versus a combination therapy which may have some prolongation of survival
and all of the issues of single agent versus combination come up again. It is unlikely that even though you are using
the experimental agent within a combination that it is the optimal combination
ever to be found with this agent. So, I
would say in that situation disease-free survival is still a good endpoint.
If you are doing a single agent study, single agent versus
single agent, standard single agent and experimental single agent, and you have
a standard therapy that cures 100 percent of the patients and is totally free
of adverse events, then disease-free survival is not the appropriate endpoint
but I can't think of an example.
DR. KEEGAN: What
about, for instance, areas where there is not a curative standard adjuvant
therapy accepted so it would be single agent against observational
control? I mean, obviously, it can't be
less toxic than an observational control so what components of toxicity should
be evaluated? What are the important
factors? One thought that was mentioned
was that the individual is able to work and carry on all their activities of
daily living. Is that the important
component, you know, as opposed to just collection of adverse event information,
which is hard to put into context of impact of a patient's physical functioning
sometimes.
DR. PRZEPIORKA: Dr.
Levine?
DR. LEVINE: A couple
of thoughts. I differ a little from the
group. In this example, B, my thought
would be if we do have a curative regimen at some level, whatever it is
depending on the disease, and now you have another drug which shows
prolongation of the disease-free survival, in that setting I would say that is
the surrogate marker. This, to me, is
what accelerated approval should be all about.
It is highly likely to convert into a survival benefit in the
future. You don't want to withhold it
from the people right now. In that
example I would say it is a surrogate but I think it is still a good surrogate
marker.
In answer to the question related to what would be
important, I defer to Mr. Katz and others but it seems to me that functionality
is the critical issue. You know, if the
patient is on this drug and the patient is able to work, or go to school, or
care for family, that, to me, is critically important and far more
objective--you know, the quality of life measures are very difficult to put
meaning onto. Functionality is easier
and more objective, it seems to me, and perhaps more valid.
DR. TEMPLE: This is
the way you would measure how troublesome the toxicity is.
DR. LEVINE: Yes, can
you function.
DR. TEMPLE: I have
to say that we rarely get data of that kind.
DR. LEVINE: That is
probably the most valid, I would think.
DR. CARPENTER: You
should though.
DR. TEMPLE:
Maybe. We do try. It is extremely hard to do in unblinded
settings, which most of them are although not all adjuvant settings are
unblinded. It is just very hard. I mean, in these quality of life things you
usually don't know what to look for in advance.
So, you are looking at multiple things and it is really hard. Many people have brought us patient-reported
outcome data and very few of them have been even close to persuasive.
I wanted to throw one thing out as part of the discussion. We are talking here about controlled trials
where there is a control group. It is a
fact though that for many years we have recognized the potential benefit of a
very durable complete response, which is sort of related to disease-free
survival, and we don't see that very often but where it does occur that has
been a persuasive endpoint even on sort of historically controlled observations
and I think that reflects the same thing you are saying here. All the treatments for testicular cancer that
are approved were approved based on data like that.
DR. PRZEPIORKA: Mr.
Katz?
MR. KATZ: Actually I
have three points that have been stacking up here. One, relative to Dr. Keegan's question or
comment, you know, I think that we have to distinguish between toxicities that
are kind of quality of life issues and toxicities that are irreversible
because, clearly, safety issues are a big deal.
You know, relative to Dr. Temple's comments, I think that
that is one of the reasons that we, patients, are really grateful to be at the
table here because I think the size of the instruments that you guys come up
with to measure quality of life is indicative of the fact of how hard it is to
really explain. So, I think having real
patient input on those things is really the only way to gauge that.
Also, I agree wholeheartedly with Dr. Levine. You know, when we are in the situation where
we have low cure rates, low effectiveness of cure with these treatments I think
we would all hope that people sitting around this table are basically asking
themselves would a reasonable clinician give this to a patient and expect a better
result even though we don't know for sure, and we don't want to hold back
something that is potentially valuable.
I think that is what I hear in this room and I am very encouraged by it.
DR. PRZEPIORKA: Dr.
Carpenter?
DR. CARPENTER: I am
just wondering about this issue that you asked about, functionality and how you
measure impact. Functionality, even
though hard to measure and maybe frequently we are unable to, I think most of
us would accept is important. The other
thing is some way to measure the impact of the symptoms on the person's
function. And, how many measures or how
many other drugs in an adjuvant setting have to be used to take care of the toxicity
or the side effects of the treatment, however you would want to quantitate
that, it seems that one way to try to assess impact on quality of life and
sometimes it is easier to count that or ask a few things. Pain medications are a long-standing thing
but certainly not the only things used.
Particularly in an adjuvant setting, you wouldn't expect to use many of
them. But there are other things which
may have to be used. Neuropathy would be
a common thing that could have a big impact and is important in certain
adjuvant settings--some way to try to measure that or sort that out because
what you want is to control all the symptoms and not have the disease come back
in this setting and some kind of way to quantitate how close you have come to
do that. It seems to me a way to be able
to compare and know what the impact of the new thing may be.
DR. PRZEPIORKA: Dr.
Cheson?
DR. CHESON: Most of
what I was going to ask has already been said.
But it gets a little more complicated because some of these therapies
that prolong disease-free survival may be something you give immediately at the
time you are initially treating the patient and some may be things you have to
chronically administer and that has a different impact on patient quality of
life, how you are going to follow toxicity, etc.
I certainly agree that we need in any circumstance to
continue to monitor the AEs because there may be untoward events that are
clearly unanticipated. Secondary
malignancies are the ones that always come to my mind. It is nice that people are 100 percent
functional but if five years down the line the risk of acute leukemia becomes
eight or ten percent, then we have to reconsider what we are doing.
DR. PRZEPIORKA: Dr.
Li?
DR. LI: I would like
to hear Dr. Fleming's and Dr. George's comment on the single-point analysis
discussed by Dr. Williams. The issue was
raised for different assessment period imposed for the TTP or disease-free
survival and that may cause bias and the need for a similar analysis at
one-year survival or two-year survival as a single-point analysis for TTP or
disease-free survival that may provide a kind of alternative. So, I would like to hear some comment from
the committee.
DR. GEORGE: It has a
certain charm but is, like other things, I think a risky thing to do because
you have to settle on what that point is.
In terms of determining the progression you have to assess it at that
time or close enough to it, whatever that means, so it makes sense.  If you miss it, that is worse than having a
sequence of values of which you are missing one. So, it has some appeal in a setting where you
know what that time would be and you are sure you are going to get all
readings. Otherwise, I doubt that it
would be of benefit. You are obviously
losing some information and the question is whether that information is
critical. I don't know. I would tend to say that is not the way to
go. That is my feeling. You just need to develop procedures and
carefully design studies so you kind of minimize the problems we talked about,
that Grant talked about this morning, but not try to fix it with a single
point.
DR. PRZEPIORKA: So,
in summary, I think we are saying that--oh, Dr. Fleming?
DR. FLEMING: Had you
already gotten to part C or are you still looking--
DR. PRZEPIORKA: No,
C is open for discussion.
DR. FLEMING: Okay,
if it is open discussion I might just add that C becomes much more problematic
than B. I think we have discussed the
complexities with B. In C, what we are
saying is we haven't proven superiority; we have just ruled out that
disease-free survival is meaningfully worse by some margin.
I think C is an extremely complex circumstance and I come
back to this distinction again, is disease-free survival itself a clinical
endpoint because it carries with it symptomatic improvement and it carries with
it the psychological benefit? Or, is the
major focus or a different focus of disease-free survival that it is, in fact,
a surrogate at some level of validity for evidence for prolongation of
survival?
In that first domain it is entirely possible to say that
if, in fact, we are using this as a measure of symptom relief, efficacy could
follow if we establish that we are maintaining at least half of the symptom
relief. On the other hand, if we are
using it as a way of providing evidence that we are actually going to have a
survival improvement, which I still maintain, to my way of thinking, is a much
more profound benefit if the intervention is actually providing a survival
improvement. It is now very problematic
as to whether or not not being a certain amount worse in disease-free survival
allows me to conclude we maintained some of the survival benefit. So, I go back to some of the earlier comments
and we will talk about this in more depth with time to progression later on
this afternoon.
If we have established that an agent improves survival,
let's say, and following Grant Williams' discussions from this morning we are
saying we want to know that we are maintaining at least half the benefit, we
have to know not only that a benefit on the surrogate is telling us we have a
benefit on the clinical endpoint, let's say survival. To do a non-inferiority argument we have to
know how much improvement we can have or need to have in the surrogate to get a
certain amount of improvement in survival.
For example, it may be that, as with 5-FU/levamisole or 5-FU/leucovorin in
the adjuvant colon setting, we have a 40 percent reduction in the rate of
disease-free survival and that translates into a 33 percent reduction in death
rate. If we want to maintain at least
half that benefit in survival, how much reduction can we see in disease-free
survival to maintain half? That is
wishful thinking, to think we know the answer to that. So, essentially what we are doing is what I
often refer to as my worst nightmare, a non-inferiority trial design in the
context of using a surrogate endpoint.
[Laughter]
So if, in fact, here disease-free survival is of importance
to us in a substantial manner because of its prediction of survival benefit, C
becomes incredibly problematic. On the
other hand, if all we care about in disease-free survival isn't because it
tells us anything about survival but it is just that it tells us something
about symptom relief, then it is possible to do this, although I would say it
is pretty weak evidence that we know we are maintaining a small fraction of the
symptom relief that standard of care would provide.
DR. PRZEPIORKA: So,
in summary, I think what we are saying is that disease-free survival could be a
primary endpoint rather than surrogate, most useful in diseases that have high
response rates, testing drugs that have a very good likelihood of giving a high
response rate. It is important to keep
people off therapy or on treatment with little or mostly reversible
toxicities; that functionality is what is critical when looking at disease-free
survival, and that we should also keep in mind the other endpoints that should
be looked at just for confirmation of clinical benefit. In the situation for randomized trials where
the comparator is already a highly effective therapy that has a curative
fraction, there is some variation in thought regarding whether that
disease-free survival should be an adequate endpoint or just a surrogate.
Let's move back to question number two--
DR. TEMPLE: Can I
just comment on Tom's thing? I am sure
it won't placate your nightmares--
[Laughter]
--but for the adjuvant setting, at least in breast cancer,
we have asked for 75 percent retention of the effect on disease-free survival. Also, for what it is worth, even for
tamoxifen I don't believe very many individual studies have actually shown
improved survival. The meta-analysis
does but that is not the same thing if you are talking about an individual
trial. So, that is not so easy.
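[Illustration, not part of the proceedings.]  The retention fractions mentioned by Dr. Fleming (half) and Dr. Temple (75 percent) can be turned into a non-inferiority margin on the disease-free survival hazard ratio.  The sketch below assumes the common log-hazard-ratio convention and, for simplicity, uses the point estimate of the standard therapy's effect instead of a confidence bound; the function name ni_margin is hypothetical.  It says nothing about how much survival benefit is retained, which is the gap Dr. Fleming describes.

```python
import math


def ni_margin(standard_vs_control_hr: float, retention: float) -> float:
    """Largest new-vs-standard hazard ratio that still retains `retention`
    of the standard therapy's effect, measured on the log hazard ratio scale.

    standard_vs_control_hr: e.g. 0.60 for a 40 percent reduction in the
        rate of disease-free survival events (assumed point estimate).
    retention: fraction of that effect to preserve, between 0 and 1.
    """
    if not 0.0 < standard_vs_control_hr < 1.0:
        raise ValueError("standard therapy must be better than control")
    if not 0.0 <= retention <= 1.0:
        raise ValueError("retention must be between 0 and 1")
    # Effect of control relative to standard, on the log scale.
    log_effect = math.log(1.0 / standard_vs_control_hr)
    # The new drug may give up at most (1 - retention) of that effect.
    return math.exp((1.0 - retention) * log_effect)


if __name__ == "__main__":
    for keep in (0.50, 0.75):
        print(f"retain {keep:.0%}: margin = {ni_margin(0.60, keep):.3f}")
```

With a 40 percent reduction in disease-free survival events, retaining half the effect gives a margin of about 1.29 and retaining 75 percent gives about 1.14 on the disease-free survival scale.  Translating either margin into a retained fraction of the 33 percent reduction in death rate would need the surrogate-to-survival relationship that the committee notes is not known.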
DR. PRZEPIORKA: So,
time to tumor progression, it has been proposed as an endpoint for regular
approval, not a surrogate. Page two at
the top lists the pros and cons that Dr. Williams has already gone
through. What we need to do for the next
35 or 40 minutes or so is to discuss whether clinical settings exist where time
to progression improvement should be considered an established surrogate for
clinical benefit and should support regular drug approval. We need to identify the factors that determine
when time to progression is an adequate endpoint for drug approval.
The factors that we are supposed to consider include
reliability in measuring the endpoint, the relationship of disease progression
to death, established benefit of available therapy, drug toxicity, and whether
progressing patients are symptomatic.
Dr. Williams has kindly provided us with a host of scenarios to
stimulate our discussion.
If we could actually just pick up with Dr. Li's question
from before about whether or not the clinicians on this panel also have any
comments about the single endpoint with regard to time to progression. Dr. Cheson is chomping at the bit.
DR. CHESON: I think
using the single endpoint--again, I am thinking from my sphere of diseases, has
the potential to be very dangerous. If
you take some therapies where the initial toxicity, whether it be
pharmacogenomic or for whatever reason, is exceptionally toxic and if you
survive that you do well, then you are going to miss that initial real drop-off
which might be a very undesirable effect.
I drew a little curve here but, you know, the curve may go straight down
and then sort of level off for the people who survive the therapy and you would
miss that because of the same six-month point or whatever point you
choose. Another therapy might get there
but not have this initial somewhat disastrous effect on a large proportion of
patients. So, I would be strongly
opposed. I think you would lose too much
very important information on patients proximal to that point in time.
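[Illustration, not part of the proceedings.]  Dr. Cheson's concern can be shown with two made-up curves that agree at a six-month landmark but differ sharply before it.  All numbers below are hypothetical, the names landmark and restricted_mean are illustrative, and the trapezoid rule is used only as a rough approximation of the area under each curve.

```python
from typing import Sequence

# Hypothetical fraction of patients alive and progression-free at months 0..6.
EARLY_DROP = [1.00, 0.60, 0.55, 0.52, 0.50, 0.50, 0.50]  # steep early loss, then plateau
GRADUAL = [1.00, 0.95, 0.90, 0.80, 0.70, 0.60, 0.50]     # steady decline


def landmark(curve: Sequence[float], month: int) -> float:
    """Single-point analysis: the curve's value at one chosen month."""
    return curve[month]


def restricted_mean(curve: Sequence[float], horizon: int) -> float:
    """Approximate area under the curve from month 0 to `horizon`
    (trapezoid rule), i.e. mean event-free months within that window."""
    return sum((curve[m] + curve[m + 1]) / 2.0 for m in range(horizon))


if __name__ == "__main__":
    for name, curve in (("early drop", EARLY_DROP), ("gradual", GRADUAL)):
        print(name, landmark(curve, 6), round(restricted_mean(curve, 6), 2))
```

Both curves read 0.50 at six months, so a single-point comparison calls them equal, while the area under the gradual curve is clearly larger.  The information lost is exactly the early drop-off Dr. Cheson describes.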
DR. PRZEPIORKA: Yes,
I would tend to agree in that the name of the endpoint is time to progression,
not progression-free survival at some point.
So, if we really wanted to say that time to progression is what provides
clinical benefit, we actually have to look over a course of time.
One issue raised earlier today is how do you measure this,
knowing that patients come in for their staging at various time points and that
can be somewhat difficult. My response
to that was if the sponsor chooses to use time to progression as an endpoint,
they need to do the work and they need to provide the data. If the data is missing, then they haven't
done the study and they shouldn't get approval based on lack of data.
DR. TEMPLE: Could
you talk about that a little more? One
possible argument is that too infrequent measures decrease the precision of the
measurement but, unless there is a biased tendency to get people in to look, it
might not introduce a bias. So, how do
you rate those two things? I mean, it
might be true anyway even though you are only seeing them every three or four
months. You might still be able to
detect a difference as long as, say, the visits were similar in the two groups
and there wasn't a bias. So, which is
the worst problem or which problem are you focusing on?
DR. PRZEPIORKA: I
think the problem that I would focus on is missing patient data and missing the
fact that if somebody doesn't show up for staging in a year you really can't
make measurements based on every three-month interval. I mean, it is the difference between looking
at a Kaplan-Meier and a life table analysis.
In fact, some people put out Kaplan-Meier plots and you can tell how frequently
they do their restaging because the Kaplan-Meier plots fall every three
months. That is the kind of analysis
that needs to be done as opposed to continuous analysis. The statisticians may end up having to come
up with a new way to do comparisons using that sort of data because it is
clearly not continuous.
DR. TEMPLE: So, they
should make sure, if they are going to use this as an endpoint, that they are
seeing people at some regular interval, every two months or every three months
or whatever gives you the adequate precision.
DR. PRZEPIORKA:
Hand-in-hand with that, when you are looking at power calculations to
determine how much of an interval in improvement you have to make, that
interval has to be at least one interval between stagings. You can't say you are going to stage people
every three months and then you are going to power to look for a one-month
difference in time to progression. That
would not make sense.
DR. WILLIAMS: I will
follow-up on that because I have heard that and I honestly do not believe that
is true. It depends on whether you are
trying to precisely estimate the effect; maybe it is true then. But in terms of producing a highly
statistically valid detection of effect, you can do it at one point just as
well. So, the frequency really doesn't
determine your ability to detect a small effect. It might determine your ability to precisely
estimate the difference perhaps--maybe the statisticians can correct me on that
point, but I have heard that discussed several times at ODAC and I don't
believe it is true that you have to look at an interval that is smaller than
the measured median difference that you are after.
DR. PRZEPIORKA: Dr.
George?
DR. GEORGE: Just one
thing about these kinds of measurements, of course, there is a whole big issue
in statistics about how you handle missing data in longitudinal kinds of
studies. This is a little different
because here let's say you do a reading, then you have a long interval and you
do a reading again and there has not been progression, it is reasonable in this
setting I think to assume that they never progressed. You are not monitoring a process that
progressed and then un-progressed and you missed it. The problem comes in when you have those long
intervals when you discover that they did progress and you don't know exactly
when that occurred between this measurement here and here. So, you have to consider in this setting the
disease I guess. We are back to
that. What is the disease setting and
what is your prior estimate of when these things would be occurring. So, you just don't want that to be too imprecise. You can quantitate that if you know something
about the setting you are in.
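[Editorial illustration, not part of the meeting record. Dr. George's point is that a progression found at a scheduled reading is only known to have occurred somewhere between the last clean reading and that reading. The sketch below shows one way to record this as an interval; the function name and the visit format are hypothetical, chosen only for this example.]

```python
def progression_interval(visits):
    """Return (last_clean_month, first_progressed_month) for one patient.

    `visits` is a list of (month, progressed) pairs, an assumed format.
    If no reading shows progression, the right end is None, meaning the
    patient is only known to be progression-free through the last reading.
    """
    last_clean = 0.0  # assumption: every patient is progression-free at entry
    for month, progressed in sorted(visits):
        if progressed:
            # Progression happened somewhere in (last_clean, month].
            return (last_clean, month)
        last_clean = month
    return (last_clean, None)


if __name__ == "__main__":
    # Readings at months 3 and 6 are clean, then a long gap until month 12.
    print(progression_interval([(3, False), (6, False), (12, True)]))
```

[The wider the gap between readings, the wider this interval, which is the imprecision Dr. George says should be kept within reason for the disease setting.]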
DR. PRZEPIORKA: Dr.
Redman?
DR. REDMAN: I agree
with Dr. George. I got kind of thrown
off by Dr. Przepiorka's one-year follow-up on a patient with advanced disease
without progression. But I think,
depending on the disease category, with the diseases I deal with you can define
and I hate to say mandate but, you know, if you are going to say you are going
to follow the patient every month by CT scans and every month you have to have
the CT scans, and it has become less of a problem in today's technology
world. We just send them to a third
party and they actually have copies.
I guess the question I have, and Dr. Fleming and I had a
conversation, I am a little concerned about, you know, what happens in time to
progression for the patients who die on therapy while they are responding. I got the sense from Dr. Fleming that those
patients are censored and not evaluated and it has been diluted out. I am a little bit concerned about that
because that somewhat speaks to the toxicity of therapy.
DR. TEMPLE:
Certainly people look at toxic deaths as a separate item. How that gets factored into the analysis is
something of a question.
I wanted to be sure about this, could I ask Tom and Steve,
should we be advising people who are hoping to detect an advantage of, say, two
months that if they don't see patients every two months they don't have a
prayer; it is not valid? Or, could you,
in fact, see them every three months and still detect a difference of a couple
of months? That is the question Grant
was raising. Is there a precise
relationship or requirement? This is
very important for how we advise people.
If they are looking for differences that are small, two or three months,
they had better make sure they are seeing people at least as often as that or
perhaps more often.
DR. FLEMING: It
depends on the nature of the true distributions of time to progression. If we just said, for example, if we had
exponential distributions for time to progression, i.e., time to let's say a
certain amount of growth in tumor volume and there was a two-month difference
in the median, you could look less frequently than two months and you could
still see the difference. But, you know,
sensitivity to that overall difference is going to be somewhat less. So, it is not a black and white, yes, you do;
no, you don't but your sensitivity will be somewhat diminished if you are not
following them with as great a frequency.
In fact, you said before how could you have a bigger
survival effect than time to progression effect, this is one of the ways. This is one of the contributing ways. You are actually getting a noisy measure of
what truly is happening by the intervention to tumor burden.
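[Editorial illustration, not part of the meeting record. Dr. Fleming's example assumes exponential times to progression with a two-month difference in medians. The sketch below works out, under that assumption, the fraction of patients expected to have progressed by each scheduled reading when readings are three months apart. The medians of 5 and 7 months and the function names are hypothetical choices for this example only.]

```python
def prob_progressed_by(month, median):
    """P(progression by `month`) when time to progression is exponential
    with the given median: 1 - 2 ** (-month / median)."""
    return 1.0 - 2.0 ** (-month / median)


def compare_at_visits(median_control, median_treated, interval, n_visits):
    """List (month, control fraction, treated fraction) at each reading."""
    rows = []
    for k in range(1, n_visits + 1):
        month = k * interval
        rows.append((
            month,
            prob_progressed_by(month, median_control),
            prob_progressed_by(month, median_treated),
        ))
    return rows


if __name__ == "__main__":
    # Assumed medians: 5 months (control) and 7 months (treated),
    # with readings every 3 months, i.e. less often than the 2-month gap.
    for month, control, treated in compare_at_visits(5.0, 7.0, 3.0, 4):
        print(f"month {month:4.1f}: control {control:.3f}  treated {treated:.3f}")
```

[Under these assumptions the arms differ at every reading even though readings are spaced wider than the difference in medians. The sketch does not quantify the loss of sensitivity Dr. Fleming mentions; that would need a power calculation for the specific design.]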
DR. PRZEPIORKA: Dr.
George?
DR. GEORGE: I
support Tom's opinion on that but I would also say that you need to consider
the circumstance you are in. That is,
there is no hard and fast rule that says if you are trying to pick up a certain
difference you have to do the measurements like this. But you should be considering what you know
about the rate, or what you suspect would be the rate of progression over
time. I guess that is what Bruce was
saying too. In other words, you would do
it differently in different settings.
So, I think you want to have reasonably careful measurements in that
period where there is a high risk.
DR. PRZEPIORKA: Dr.
Grillo-Lopez?
DR. GRILLO-LOPEZ:
No.
DR. PRZEPIORKA: Mr.
Katz?
MR. KATZ: I would
suggest adding one factor to the list that we put here. We said whether progressing patients are
symptomatic. I think whether stable
patients are symptomatic is also germane here because you have tumor reduction
but no symptom relief.
DR. PRZEPIORKA:
Could you speak a little bit more about that with regards to who might
actually be a good candidate for a time to progression patient? If somebody is symptomatic already, is time
to progression really an endpoint that you would consider clinically valid? That is, you are sick and as long as you
don't get any sicker it is okay or, is this something for patients who have
minimum disease and are not exactly ill?
MR. KATZ: Well,
clearly if you start in a situation where you are highly symptomatic everything
is valid. If you can get a treatment and
it relieves the symptoms and it delays the time to those symptoms getting
worse, then there is certainly an argument to say that that has a value to a
patient. If a patient has profoundly
serious symptoms that are horrible but you know that they can get worse but
they are not getting worse because we have done this and it hasn't progressed,
then I think that is also valuable. You
know, things get more acceptable depending on what you are looking at coming next.
DR. PRZEPIORKA: Dr.
Temple?
DR. TEMPLE: As Grant
said, we have been encouraging people for years to look at time to symptomatic
progression and I would say we have met with total failure. Nobody does that for a lot of reasons. I don't know why. You probably know better than I do why. Symptomatic improvement in a group that is
symptomatic has always been accepted as a valid endpoint. But as Grant also said, except for a couple
of pain things with prostate, we have had very little success in attempts to do
that and you have seen them--esophageal obstruction, you know, that works fine
but most of the other things have been very resistant to success.
DR. PRZEPIORKA: Dr.
Levine?
DR. LEVINE: I was
just going to say that in considering time to tumor progression as the
endpoint, not as a surrogate but as a real endpoint, it would seem to me that I
would want it in the context of some sort of confirmatory clinical benefit
other than that itself, i.e., symptoms are manageable; symptoms are better or
have not re-occurred; toxicity of the drug is "acceptable"; quality
of life. So, if it is just time to tumor
progression alone without these other things, I don't know that that would be
valid in a clinical sense.
DR. CHESON: Again,
that depends on the clinical sense because there are some settings where you
start with nothing. When you
"ain't" got nothing you have nothing to lose. If they start in an adjuvant setting or some
setting where the patients just have disease, are asymptomatic, like early
stage follicular lymphoma, and they don't have anything, then it doesn't work
there.
DR. LEVINE: Right,
you are right. So, in other words, it
goes back again to disease specific situations.
DR. CHESON: Right.
DR. BRAWLEY: Can I
ask for a point of information?
DR. PRZEPIORKA: Yes.
DR. BRAWLEY: Was
gemcitabine approved for quality of life or for prolongation of disease-free
survival?
DR. TEMPLE: Two
reasons. Lilly invented a clinical
benefit scale that had some elements of tumor progression and some elements of
other stuff and they won on that. That
is one thing.
But I think what actually persuaded people most was the
one-year survival of 18 percent versus 2--not an official endpoint but it sort
of looked pretty impressive. So, that is
what it is for better or worse.
DR. PRZEPIORKA: Dr.
George?
DR. GEORGE: Can we
talk a little more about the issue of the deaths that occur when you are
looking at time to progression and death occurs before progression? I think if you are in a setting where there
is some substantial percentage of patients for which that is true, that greatly
decreases the value of the time to progression kind of analysis, in my view,
because you don't know what that means.
Further, even if you don't have deaths first it is pretty important to
know something about that distribution from progression to death in different
diseases, again to get back to the point I made earlier. If it is very short then, of course, it is
sort of the same as survival really but if it is long, then you are in a
setting where you probably need to consider this more as a surrogate or a
potential surrogate. But I am worried
about a situation in which you have some substantial proportion of deaths
without progression and how you handle those then becomes critical. In the usual way you just kind of censor them
but that is clearly subject to a lot of problems.
DR. PRZEPIORKA: Dr.
Williams also talked about time to treatment failure as being an unacceptable
endpoint and, yet, if we talk about time to treatment failure defined as
disease progression or death would that satisfy your concern about how to
incorporate death?
DR. GEORGE: Yes, but
that is more like progression-free survival.
I like that.
DR. WILLIAMS: I
think we need to bring that up and the question is when we are looking at TTP
as more like a clinical benefit endpoint or a surrogate, should we use
progression-free survival, include the deaths and do a very careful evaluation
and analysis to deal with the deaths or should we use TTP? It sounds like there is at least some
consensus that progression-free survival is a good endpoint.
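[Editorial illustration, not part of the meeting record. The distinction being drawn is how a death without documented progression is scored: censored under time to progression (TTP), counted as an event under progression-free survival (PFS). The sketch below encodes that rule as described in the discussion; the record layout and function names are hypothetical.]

```python
from typing import NamedTuple, Optional, Tuple


class Patient(NamedTuple):
    progression_month: Optional[float]  # None if progression never documented
    death_month: Optional[float]        # None if alive at last follow-up
    last_followup_month: float


def ttp(p: Patient) -> Tuple[float, bool]:
    """(time, is_event) for TTP: only documented progression is an event."""
    if p.progression_month is not None:
        return (p.progression_month, True)
    if p.death_month is not None:
        return (p.death_month, False)  # death without progression is censored
    return (p.last_followup_month, False)


def pfs(p: Patient) -> Tuple[float, bool]:
    """(time, is_event) for PFS: progression or death, whichever is first."""
    times = [t for t in (p.progression_month, p.death_month) if t is not None]
    if times:
        return (min(times), True)
    return (p.last_followup_month, False)


if __name__ == "__main__":
    early_death = Patient(None, 2.0, 2.0)  # dies at month 2, no progression
    print(ttp(early_death), pfs(early_death))
```

[A patient who dies early on therapy without progressing drops out of the TTP analysis as a censored observation but counts against the treatment under PFS, which is the concern Dr. Redman and Dr. George raise about toxic deaths.]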
DR. PRZEPIORKA: Any
disease categories where anyone here thinks that progression-free survival or
time to progression simply would not fit and should never be used, or the
converse where this is clearly the best endpoint because they will never get a
remission and all you could hope for is progression-free survival?
DR. TEMPLE: Well,
just to be clear, I heard some uncertainty about that from Dr. Levine. I mean, if along with that you need to
improve symptoms or something like that, then it is not just progression-free
survival; it is symptomatic benefit too.
So, I think we need to be clear on what people do think. But our initial question is, assuming you
don't have all those clinical benefits, do you think progression-free survival
or time to progression is a good stand-alone endpoint in this current, real
world? If that is not clear, we are very
interested in hearing whether it is or not.
DR. PRZEPIORKA: Dr.
Redman?
DR. REDMAN: I think
progression-free survival, at least in the tumor types I deal with, is
fine. I don't think this is the
implication, but if you have a drug that is coming in and you say, okay, we are
going to pick progression-free survival and it cures 100 percent you are not
going to miss it. I mean, it is going to
be there. You are just saying what is
the lowest, minimum activity or clinical benefit we are willing to accept.
DR. PRZEPIORKA: Dr.
Fleming?
DR. FLEMING: Just to
return to kind of a general response to this question of where and when can TTP
or progression-free survival be used for regular drug approval, I would return
to the pros and the cons and, just in the interest of shortness of time looking
at the cons, what we have to overcome are these uncertainties, uncertainties
that arise because it is an indirect measure.
The clinical meaning of TTP differences, of small differences is
unclear. The reliability of unblinded
interpretation of results is an issue. I
would add to that another one that, in fact, did come up in the oral
presentation, and that is just the noise and the variability factors add
complications due to variability in imaging assessments or timing of
assessments, as we were talking about some ten minutes ago, and missing data. There tends to be a bigger missing data
problem with the TTP endpoint, less so with progression-free survival and,
obviously, even less so with survival.
Because of this issue of clinical relevance and missingness
induced by death, I find TTP especially problematic if I am using it as a
registrational endpoint as opposed to a supportive measure of biologic
activity. So, among the two, if we were
looking at it as a registrational endpoint, certainly I would prefer
progression-free survival.
But I would like to just step back for a minute. Rather than say, yes, it is a good endpoint;
no, it isn't a good endpoint, just talk a little bit about the principles that
should guide the decision as to when it is a good endpoint and what kind of evidence
we would like to have because there is now a lot of science behind what it
takes to validate a surrogate.
So, in our November 12 meeting of the FDA/ASCO working
group, basically in that session we talked about a marker such as time to
progression as being one of four levels.
Level one would be the best. In
level one forget about surrogacy, it is, itself, a clinical endpoint. We said examples of that would be when you
have the event disease-free survival or progression-free survival it is
inherently linked to symptomatic disease.
So, symptomatic events, preventing or delaying symptomatic events are
inherently of tangible benefit to patients.
If that is the case, then we have an endpoint that is, in fact, in its
own right a valid clinical endpoint and surrogacy issues don't arise.
The second level would be an endpoint that reliably
predicts clinical benefit. So, when I
see an effect on time to progression I can know that I will see--let's say if
it is a surrogate for survival--a certain level of effect on survival.
The third level is reasonably likely to predict clinical
benefit where the agency then uses this as a measure for accelerated approval
but with the understanding that the ultimate answer on clinical endpoints will
still have to be obtained in a validation trial.
The fourth level I will call none of the above, none of the
above often being a correlate. There are
an awful lot of correlates out there that, in fact, aren't any of the top three
levels.
What does it take to be in level two, versus three, versus
four? Well, the first thing we will look
for is if it is a correlate. Is time to
progression a correlate of survival or whatever the clinical endpoint is on a
patient specific basis? Almost certainly
it is but, in essence, that doesn't tell us anything about whether specifically
the benefit or the outcome on the clinical endpoint is mediated through
that. For example, you may have CEA
correlated with survival but it is not through changing CEA that the disease
process leads to an outcome in survival.
So, changing CEA may not change survival. That could be a level four.
So, we have to go beyond that. The evidence that we typically look at to go
beyond that is guided by the Prentice criteria.
So, what we are typically looking for is not just having a
correlate. That is a necessary
condition. It is not a sufficient
condition for validity of a surrogate.
We want to find out whether or not the effects on that marker are, in
essence, capturing the net effect of the intervention on the clinical
endpoint. At a certain level of
persuasiveness that would get us to level three and I think in many settings
people would argue time to progression because it is, in fact,
directly--getting at tumor burden is very likely to be at that level but
obviously it needs to be addressed on a case-by-case basis.
The bigger challenge is to say when is it a valid surrogate
such that I know if I achieve an effect on this measure I don't need
accelerated approval; I have actually established clinical benefit. That best evidence is obtained by meta-analyses
of studies that have looked at an array of trials, an array of studies that
establish treatment effect on the surrogate--in this case I will call it time
to progression and treatment effect on the clinical endpoint I will call
survival--specifically saying what is the functional relationship between a
certain level of reduction in the failure rate on time to progression versus a
level of reduction in the failure rate on survival.
Understanding that is really critical and, in fact, in many
settings we don't have that kind of evidence and, as has been pointed out
before, partly because we are looking at interventions that at this point don't
establish much of an effect on the clinical endpoint. But the essence of validating a surrogate and
saying we can use time to progression as a surrogate for, for example, survival
would be having meta-analyses of studies that would show reduction in time to
progression rates and reliably would tell us we would have reductions in
whatever the clinical endpoint is, such as death rate--reduction in the
rates. So, if we reduce the rate of time
to progression we are improving time to progression and we want to reduce the
rate of death to improve the survival time.
DR. PRZEPIORKA: Dr.
Williams?
DR. WILLIAMS: Dr.
Fleming, I saw your categories at the workshop on colon cancer but when I was
preparing my talk I was wondering what category we would put our practice of
breast cancer hormones and response rates.
I mean, perhaps category four, which is even worse than accelerated
approval category or what I think it is, it is clinical inference about number
one. I don't know if you have a category
for that and I don't think you do.
DR. FLEMING: Well,
my sense is that if you are talking about response rate in breast cancer--I
think that is the example you were giving--
DR. WILLIAMS: Well,
it was hormonal breast cancer where there is a long history with tamoxifen--
DR. FLEMING: Right.
DR. WILLIAMS: --and
assume benefit but a long history of using tamoxifen and it was felt certainly
by experts in the field that it was useful and this was used as a surrogate and
maybe the blood pressure and maybe some of these others. I don't see a category here that I could put
them in. They are basically clinical
judgment, clinical inferences about the benefit. So, what do you do with those?
DR. FLEMING:
Certainly, my sense has been--and you can clarify what your sense is,
but my sense has been for some of these interventions that provide a duality
here, that are providing some direct evidence of benefit through, for example,
delay in symptoms and a surrogacy aspect of them, saying that if you are in
fact delaying progression that is some suggestion of a prolongation in
survival. The duality of that in the
context of a very safe intervention is giving you adequately persuasive
evidence of benefit to risk. In the end
that is what it comes down to. In the
end is benefit to risk established to be favorable? The stronger the evidence of efficacy, then
the more resilient you are on safety and, similarly, if you have an incredibly
safe intervention you might accept or you might be more resilient in what you
consider adequately strong efficacy.
Certainly showing a survival benefit I would say in many ways is the
most compelling thing to do because it is the most compelling benefit and
provides more resilience to issues of irregularities in trials and issues in
safety that could arise.
In this case, what I understand you to be doing is really,
in essence, saying we have partially a level one here because we have some very
direct tangible benefits that are occurring and it is reinforced by an
anticipation at some level, valid or invalid, that you are actually delaying
death as well. With a very safe
intervention that is favorable benefit to risk.
DR. WILLIAMS: I
think that is really basically a lot of what we are doing here today with
progression-free survival. Are there
settings where we can accept, or the clinical experience with this endpoint,
the broad experience it seems clear we don't have the strong quantitative
validation we would like but, you know, what are those factors which might
allow it to be used in some very specific settings at this time?
DR. FLEMING: Just
one last response to this, you identified some of those in your appendix. So, specifically the ideal settings are C, E,
P, J and N, C being itself patients are symptomatic so you have at least in
part a level one endpoint. By delaying
time to progression you are directly getting evidence of an improvement in symptoms
or delay in symptoms.
I might challenge whether there would have been another way
to do that, specifically looking at a symptom endpoint as a way to establish
that. I also might challenge that that
is, in my own view, not as compelling as actually having evidence of a survival
effect. But C does get, in my
definition, potentially into level one.
So, surrogacy issues are not as compelling.
If we don't have C, and many times we don't have
specifically symptomatic disease at progression. In the November 12 meeting that was certainly the
agreement, that in first-line colorectal cancer at the time of progression we
don't typically see symptoms. Then,
these other aspects that come into play are do we have a large and precisely
defined benefit? The larger the benefit
on the measure, obviously the more plausible it is going to be that it actually
translates into clinical benefit. Hence,
P, a superiority trial, is far more persuasive a setting. A non-inferiority trial and surrogate, as I
have already said, is my worst nightmare.
Blinded trials are important and we probably can't achieve
that routinely so it does, in fact, diminish our confidence. We can in fact though, as you say in K, try
to have some kind of an independent evaluation committee that is itself
blinded.
N, drugs that have minimal toxicity, that is where I see in
part the example you have given comes into play. The evidence on efficacy is somewhat less but
if you have an intervention with an established record that is extremely safe
you may, in fact, have a little more resilience on what the strength of
evidence on efficacy would be.
DR. PRZEPIORKA: Dr.
Grillo?
DR. GRILLO-LOPEZ:
Having heard all of that with a bit of impatience--
[Laughter]
--I have to say that clinical medicine even today is still
an art and clinical research resists our efforts to quantitate it; it is also
an art. And, there is no such thing as a
perfect endpoint. There is no such thing
as a perfect endpoint and TTP has its problems but it has a lot of pros. You have to also make a distinction between
those problems that are inherent to TTP and those problems that have to do with
how TTP is measured, presented, how the data is acquired in the clinic, issues
like GCP, sloppy data or good quality data, and put those aside because your
assumption has to be that the data is going to be of good quality. That should not be a deciding factor on
whether or not TTP is a good endpoint.
You have to assume it is going to be good quality.
DR. PRZEPIORKA: Dr.
Cheson?
DR. CHESON: Just one
more small comment. Listed under your
pros there is a theme. TTP is a measure
of tumor effect in all patients, rather than measure effect in a subset of
patients. I would look at that as a con
rather than a pro. We are talking about
all the different subsets of patients that may respond totally differently and
you have to have a very strong impact on the right group to overcome--going
back to Iressa for example--to overcome the negative impact on another. Personal
bias, but that is how I would look at that.
DR. PRZEPIORKA: Dr.
Temple?
DR. TEMPLE: I have a
comment along the same lines. One of the
difficulties, and you have described this repeatedly, is that we are trying to
look for an effect in an overall population when we are only probably
influencing a small fraction. That is a
real burden. In most other conditions
you don't have to do that and you have some hope of treating everybody's
headache even if that is not true. So,
that is all going to get better when we get all pharmacogenomics--
[Laughter]
--I think Grant listed that as a pro for the following
reason and I wonder what people think about it, that to actually shrink tumor
volume by 50 percent you really have to be quite a good responder. There may be people who don't get quite that
good a response but whose tumor growth is slowed, and you might think there are
more of those than the former. That is
why I think he thought that might be a more powerful measure.
But I also have a question.
Remember, I don't treat patients with cancer so if you think this is
really stupid just tell me. If there is
no really good follow-on therapy, which is often the case, why do we monitor
progression other than by symptomatic progression at all if there is nothing
much we can do about it? If everybody
progressed with symptoms then there wouldn't be any argument about it. So, why do we do that? If that is really a stupid question, just
tell me.
DR. PRZEPIORKA: Dr.
Taylor?
DR. TAYLOR: No, it
is not a stupid question. For many of us
who have patients in whom there won't be treatment we don't do repeated x-rays
and you do go by symptoms and you treat them by symptoms because that is the
most practical thing to do. In essence,
that is why ASCO recommendations are for follow-up after adjuvant breast
cancer, to follow symptoms and to do mammograms and physical exams. So, that is not a stupid question.
The only time we are compelled I think to look for
progression is when we are in an investigative setting in which we want to know
what is going on with this particular drug.
DR. TEMPLE: For what
it is worth though, we wouldn't mind seeing a study that was simplified and
that only waited for symptomatic progression.
Whether it is ethical to do that is a different question. But if it was time to symptomatic progression
there would be no debate about whether that was clinically meaningful at all.
DR. TAYLOR: Again, I
would say that is only specific diseases.
There are some diseases where you do need to monitor.
DR. BRAWLEY: For
example, in certain diseases--I live in the world of prostate cancer, the
patients insist upon PSA to look for relapse.
There are other diseases as well where the patients insist upon some
type of radiologic imaging to look for relapse.
Believe me, it is very difficult to explain to the patient that I don't
really know if this is in your best interest.
DR. PRZEPIORKA: The
other issue is always medical-legal. If
you miss a diagnosis the patient always comes back and says, well, maybe I
would have survived two years longer had you caught my tumor before it became
symptomatic. So, that is another big
issue. Dr. Rodriguez?
DR. RODRIGUEZ: The
reality is, at least in the patient subset that I follow and I mostly treat
patients with lymphomas, is that they can have other malignancies, not just
lymphomas and that the second or third malignancies could be potentially
curable if caught early. So, that is
another overlying concern.
DR. PRZEPIORKA: Dr.
Cheson?
DR. CHESON: However,
there have been two to three randomized trials--you say you don't know whether
it is ethical or not to do them--in which patients with lymphoma, both Hodgkin's
lymphoma and non-Hodgkin's lymphoma, have been randomized to looking at
patients presenting with symptoms, physical examination and simple things like
that versus regular CT scans at certain intervals, and the overall outcome was
identical. The patient was in general
the best indicator of when the disease was coming back, although we all have
patients where we do pick up things early and, in the grand scheme of things,
survival was not adversely affected in any of those three studies.
DR. PRZEPIORKA: Dr.
Williams?
DR. WILLIAMS: I
wonder if all this discussion mostly refers to settings where the disease has
gone away and you are not treating them.
I am thinking that when you are giving cytotoxic therapy I think a lot
of investigators feel like they need to know whether there is progression or
not and generally they tend to stop the treatment, cytotoxic treatment--Dr.
Temple brought up the question, if it is not a toxic treatment do you really
need to know or you can just continue the drug anyway.
DR. TEMPLE: Of
course, we don't really know if it is time to stop a therapy just because it
has progressed. Maybe it is still
providing benefit. We have had lots of
conversations with companies about that with these newer non-cytotoxic
therapies. But I guess if it is
cytotoxic everybody wants to get rid of it.
DR. PRZEPIORKA:
Other comments? Yes? Could you come up to the microphone? If you could just identify yourself for the
record, please?
DR. SRIDHARA: Yes, I
am Raji Sridhara, from FDA Biometrics. I
am team leader. I have a question going
back to the first one that George and Fleming commented on. You know, when you have crossover you are
saying that, okay, it can't be helped; it happens and we leave it at that. I think we get to a point where actually the
design is such that your primary endpoint is survival and then you don't know how
much you will cross over and at the end you will have some crossover and you
are left with all these secondary endpoints which were never powered properly,
or we don't have specific secondary endpoints.
Would you rather suggest then that we should have specific secondary
endpoints which we can rely on just in case the primary analysis is not
feasible because of too many crossovers, loss to follow-up or any of those?
DR. GEORGE: You are
bringing up a very good point. I think
there was an issue some time ago, not in cancer, that came before the FDA in
which the primary endpoint was not survival.
The survival endpoint seemed to show a survival advantage and then what
do you do? You know, it didn't show
something in the primary endpoint which was not survival but did show a
survival advantage in a surprising way; you didn't expect it. Could you get approval? That is not a question for me I guess.
DR. TEMPLE: Well, in
other settings, other than cancer, the unexpected discovery of survival
benefits turns out, not surprisingly, to carry a lot of weight. We agonize a lot but we tend to say, hm, that
is good.
DR. GEORGE: I think
so. I mean, I think that is the right
kind of approach but you can get yourself into conundrums with saying this is
the primary endpoint; survival is secondary.
But to answer your question, if you really think all of the crossovers
and subsequent treatments are going to be a serious issue in the trial you
really do have to rethink whether survival is the proper primary endpoint, and
in those settings it may not be.
DR. SRIDHARA:
Picking up on what you said about other settings where there was a
survival advantage or where it was not termed as the primary endpoint, then
should we be considering in all these settings co-primary endpoints survival
and time to progression so that it will allow us to look at either one of
them? Since generally until the trial is
over we don't know really how much crossover is going to happen.
DR. GEORGE: What
does a co-primary endpoint mean? Does
that mean you have to meet both of the objectives?
DR. SRIDHARA: One or
the other, or however you want--it depends I guess on the disease setting and
what we are doing.
DR. TEMPLE: Sorry,
did you ask about co-primary?
DR. GEORGE: Yes,
what does that mean?
DR. TEMPLE: Usually
people divide the alpha appropriately, whatever appropriately turns out to
be. There have been cases, but not
mostly in oncology, where we expect a benefit on more than one endpoint. But, as everybody knows, that becomes a
formidable challenge and we get requests to reduce the alpha or make the alpha
less demanding. But usually that means
people have to make some accommodation to multiplicity--always tricky.
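As a rough illustration of what dividing the alpha can mean for two co-primary endpoints, the sketch below splits an overall significance level across endpoints in proportion to weights and checks each endpoint against its own share. The function names, the 4:1 weighting and the p values are hypothetical, chosen only for illustration; they are not taken from any trial discussed here, and a weighted Bonferroni-style split is only one of several ways to account for multiplicity.

```python
def split_alpha(total_alpha, weights):
    """Divide an overall alpha across endpoints in proportion to their weights."""
    total_weight = sum(weights.values())
    return {name: total_alpha * w / total_weight for name, w in weights.items()}


def endpoints_met(p_values, allocated_alpha):
    """Report, per endpoint, whether its p value is within its allocated alpha."""
    return {name: p_values[name] <= allocated_alpha[name] for name in allocated_alpha}


# Hypothetical design: most of the alpha is spent on survival.
allocation = split_alpha(0.05, {"survival": 4, "time_to_progression": 1})

# Hypothetical results: survival misses its share, time to progression meets its share.
results = endpoints_met(
    {"survival": 0.06, "time_to_progression": 0.004},
    allocation,
)
trial_positive = any(results.values())  # the "one or the other" reading
```

Under the "one or the other" reading, the trial is positive if either endpoint meets its own threshold. Requiring both endpoints to succeed is a different design, and in that case each endpoint is usually tested at the full alpha.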
DR. PRZEPIORKA: Dr.
Fleming?
DR. FLEMING: Just to
return to this point, it seems to me that therapeutically what we are trying to
do is improve the regimens and the therapeutic strategies. I think that was the term that Dr. George
used earlier. We are looking at
comparing a therapeutic strategy involving the experimental agent versus the
standard of care strategy and trying to show that this experimental strategy
is, in fact, better in a tangible way to patients. Obviously, that means that we should be
delivering care in an optimal fashion and when the first intervention to which
you are randomized leads to failure at some level you are going to follow-up
with best supportive care, as you should.
In fact, we would hope that we can improve on strategies
that will ultimately lead to an improvement in survival relative to what is
available in the standard of care. So,
clearly, in many settings it would be an appropriate endpoint. But there are many other settings where it
may not be anticipated that that would be the most sensitive measure to what
beneficial influence we provide to patients.
If, in fact, that is in part because of crossovers diluting the
long-term survival effect, I would still argue that is the truth. That is what I am ultimately doing on
survival. There may be need for other
measures. I would argue that those other
measures ideally should be direct clinical measures of benefit, measures
reflecting improvement in functional status; measures that reflect overall
improvement in symptoms. With
bisphosphonates, for example, what we have gone to is skeletal related events
as an alternative clinical efficacy measure.
Beneficial effects may be reflected in survival but a more sensitive
clinically tangible measure may be the measure in reduction in fractures and
spinal cord compression and radiation and surgery to the bone, other rescue
therapies. So, if I can improve that
measure that is clinically tangible benefit.
I would rather see that measure being the co-primary endpoint rather
than a surrogate measure, unless that surrogate has been truly validated.
I just want to come back to one of my colleague's earlier
points that was raised in the criticism of time to progression. You are absolutely right, we want to do high
quality studies. So, we are going to
presume that people are going to do the very best study they possibly can on
whatever endpoint they are looking at.
However, certain endpoints lend themselves to more readily being
assessed in an unbiased, objective way.
In an unblinded trial it is much more problematic when you have an
endpoint that requires judgment, such as a symptom endpoint or a time to
progression endpoint, as opposed to survival.
And, missingness has over history been more of a problem when we are
looking at these markers as opposed to survival as an endpoint. In particular, as we have said, with time to
progression we are building in missingness because automatically time to
progression, by censoring deaths, means you are missing what happens in time to
progression subsequent to death in those patients who die. So, there are some inherent problems that
exist with lack of blinding and with censoring deaths that even in the best
quality study you are going to have some difficulties with.
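The point about censoring deaths can be made concrete with a small sketch. It derives, for each patient, the (time, event) pair that a time-to-progression analysis would use when deaths without documented progression are censored, and the pair that a progression-free survival analysis would use when such deaths count as events. The record layout, function names and patient values are hypothetical, and real protocols define these endpoints in more detail.

```python
from typing import NamedTuple, Optional, Tuple


class Patient(NamedTuple):
    progression_month: Optional[float]  # None if progression was never documented
    death_month: Optional[float]        # None if the patient is alive at last contact
    last_followup_month: float


def ttp_outcome(p: Patient) -> Tuple[float, bool]:
    """Time to progression: a death without progression is censored, not an event."""
    if p.progression_month is not None:
        return (p.progression_month, True)
    if p.death_month is not None:
        return (p.death_month, False)
    return (p.last_followup_month, False)


def pfs_outcome(p: Patient) -> Tuple[float, bool]:
    """Progression-free survival: progression or death, whichever comes first, is an event."""
    if p.progression_month is not None:
        return (p.progression_month, True)
    if p.death_month is not None:
        return (p.death_month, True)
    return (p.last_followup_month, False)


# Hypothetical patients: one progressed, one died without progression, one still followed.
patients = [
    Patient(progression_month=4.0, death_month=9.0, last_followup_month=9.0),
    Patient(progression_month=None, death_month=5.0, last_followup_month=5.0),
    Patient(progression_month=None, death_month=None, last_followup_month=12.0),
]
ttp = [ttp_outcome(p) for p in patients]
pfs = [pfs_outcome(p) for p in patients]
```

The second patient is where the two endpoints diverge: the death is treated as missing information under time to progression but as a failure under progression-free survival. If such deaths are related to treatment or disease, the censoring is informative.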
DR. PRZEPIORKA: If I
could just summarize--
DR. GRILLO-LOPEZ: I
disagree with that.
DR. PRZEPIORKA: Feel
free.
DR. GRILLO-LOPEZ: I
cannot agree that you can measure survival better than time to
progression. I think that if you have an
appropriately designed trial with the appropriate interval for CT scans you can
measure time to progression better than you can measure survival because of all
the biases in the survival measurement that I mentioned earlier. So, it all depends on how you design your
protocol; how you schedule your evaluations and how good the quality of the
data is. Again, there are so many biases
inherent to the survival kind of endpoint that it is not an acceptable endpoint
in most situations, in my mind at least.
The other thing that I would like to mention is that the
issue of crossover goes away completely if you are not using survival as an
endpoint. It is an important issue
because if you have a drug, a new agent that has gone through Phase II trials
you know of its clinical activity; you know of its safety and you know what the
patients know of its clinical activity and safety because they go to ASH and they
go to ASCO and they go to the websites and they know that there is an option
which in some situations, in the refractory setting, may be the best option for
them and they are not going to go into a Phase III trial and take a 50 percent
chance of being randomized to a standard therapy that may not be as good in
fact as the experimental therapy and never have the chance to get the
experimental agent unless they know that there is some opportunity, not perhaps
within the same protocol but some time later on, to get the experimental agent.
DR. FLEMING: But
your response is presuming that access to that intervention on a delayed basis
is going to provide the essence of what the benefit is when you deliver it up
front--in some settings more plausible but in other settings much less
plausible. And, your response hasn't
addressed the issue of the inherent risk of bias that arises in what is
typically done in oncology, which is unblinded trials, and it hasn't addressed
the issue of the informative censoring that arises if you choose to censor
deaths.
DR. GRILLO-LOPEZ:
But that is not my assumption. I
am saying that it is the patient's assumption.
It is the patient's assumption that there is benefit and they want to
get that experimental--
DR. FLEMING: That
doesn't matter if it doesn't, in fact, carry a substantial part of the overall
benefit up front. It doesn't matter if
that is the patient's assumption.
DR. GRILLO-LOPEZ:
You miss the point. What I am
trying to convey is the difficulty of doing a Phase III randomized trial if the
patient knows that he has only a 50 percent chance of getting an agent which
the patient perceives as an active agent.
DR. FLEMING: The
Avastin trial in colorectal cancer was just successfully completed in a manner
that you are saying couldn't have been done.
DR. GRILLO-LOPEZ: It
may be an exception.
DR. TEMPLE: Surely a
company can control whether it makes an experimental drug available to
everybody and allows crossover or not.
It is their drug.
But I thought the earlier point you made, and it is one of
the reasons we are here, is crossover doesn't matter if you are measuring time
to progression because crossover happens after that.
DR. FLEMING: If, in
fact, time to progression is the answer to the question that we care about and
can be addressed without the problems of these other biases that arise so it is
not getting us out of the woods.
DR. TEMPLE: No, it
just solves one problem.
DR. PRZEPIORKA: Dr.
Brawley, last comment?
DR. BRAWLEY: Well,
it was actually somewhat of a question.
It is just sort of a gut check. I
am just sort of remembering all those trials, many of them not in cancer
treatment but in other areas where initial endpoints and initial surrogates
seemed to be very positive and then, when we finally got to the randomized
clinical trials we found out that the intervention actually was not as
positive. I am thinking specifically
right now of Premarin in the Women's Health Initiative, although I have some
rumblings of Iressa Phase III clinical trials in the back of my mind, Iressa
trials using Iressa and chemotherapy as well.
We have to be very careful as we go down this path.
DR. PRZEPIORKA: A
very good point. If I could summarize
what I heard, there are actually a few parallels to our discussion on
disease-free survival. Specifically for
time to progression, we did not think that a single endpoint design would be
attractive at all. There is concern
about death on therapy and perhaps progression-free survival might be better
than just time to progression.
We agree that there has to be rigorous assessment for
scientific reasons, not for clinical reasons.
So, repeated assessments may be done in studies where we would not
usually do them in clinical medicine but we do want to get the scientifically
valid results.
We would not use this therapy for patients who are very
symptomatic because progression there would not be good for those patients as
opposed to really trying to get a response.
And, toxicity needs to be factored in as a risk-benefit for whether or
not this is something useful.
So, it appears that progression-free survival would be for
diseases with low CR rates in therapies that would be unlikely to alter
survival because of the underlying disease to be used as a primary endpoint,
but in a comparative study when standard therapy is already shown to have a
benefit it would probably only be as opposed to a real endpoint. Any other comments on that summary? Dr. Temple?
DR. TEMPLE: One of
the points was that we don't expect these drugs to alter survival. I guess I am not sure that is the
assumption. We think it may be difficult
to demonstrate that because of crossover and because it is going to occur
later, but I guess I think one of the assumptions is that if you have an effect
on time to progression, or something like that, it probably does have a
favorable effect on survival even if you are not able to measure it very
well. Am I wrong in that?
DR. PRZEPIORKA: I
don't think I would disagree with that but I think time to progression would be
an excellent endpoint in a disease such as metastatic prostate cancer in the
elderly where, no matter what you do, they are going to end up dying of
non-cancer reasons. Whereas, if you can
keep them symptom free it would be very valuable.
DR. TEMPLE:
Actually, the last point is one we didn't talk much about, survival is
tough if it is an old population that is dying of a lot of other things. We didn't really discuss that but in prostate
that is probably a major factor.
DR. PRZEPIORKA: We
will close this session with an announcement about lunch.
MS. CLIFFORD: The
statement I made earlier, unfortunately, is not true about your badge. It will not grant you access into the
building next door. I am sorry. At the front desk there is a list of six
restaurants that are local, that are within walking distance that you are
welcome to visit. Thank you.
DR. PRZEPIORKA: We
will reconvene promptly at 1:00 p.m.
Thank you.
[Whereupon, at 12:05 p.m.,
the proceedings were recessed for lunch, to reconvene at 1:00 p.m.]
A F T E R N O O N
P R O C E E D I N G S
DR. PRZEPIORKA: In
this afternoon session we will discuss non-small cell lung cancer endpoints and
we do have a different group with us this afternoon so, for the record, I would
like to go around the table one more time with introductions for everyone who
is new this afternoon and everyone from this morning. If we can, let's start with introductions
with Dr. Ettinger, if you could let us know who you are and where you are from,
please.
DR. ETTINGER: David
Ettinger, the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins in
nearby Baltimore.
DR. SAXMAN: Scott
Saxman, in the Cancer Therapy Evaluation Program of the National Cancer
Institute.
DR. BONOMI: Phil
Bonomi, Rush Medical College, Chicago.
DR. JOHNSON: David
Johnson, Vanderbilt University in Nashville, Tennessee.
DR. JOHNSON: Bruce
Johnson, from the Dana Farber Cancer Institute.
DR. GRILLO-LOPEZ:
Antonio Grillo-Lopez, acting industry representative.
DR. GEORGE: Steve
George, Duke University.
DR. CHESON: Bruce
Cheson, Georgetown University Lombardi Comprehensive Cancer Center.
DR. DOROSHOW: Jim
Doroshow, City of Hope Comprehensive Cancer Center.
DR. RODRIGUEZ: Maria
Rodriguez, M.D. Anderson Cancer Center.
DR. BRAWLEY: Otis
Brawley, Emory University, Winship Cancer Institute.
MS. ROSS: Sheila
Ross, Washington representative for Alliance for Lung Cancer, and I am a lung
cancer statistic.
DR. FLEMING: Thomas
Fleming, University of Washington.
DR. LEVINE:
Alexandra Levine, University of Southern California Norris Cancer
Center.
DR. REAMAN: Greg
Reaman, George Washington University and the Children's Hospital in D.C.
DR. PRZEPIORKA:
Donna Przepiorka, University of Tennessee Cancer Institute.
MS. CLIFFORD: Johanna
Clifford, FDA.
MS. HAYLOCK: Pamela
Haylock, oncology nurse from Texas.
DR. CARPENTER: John
Carpenter, University of Alabama at Birmingham.
DR. REDMAN: Bruce
Redman, University of Michigan Comprehensive Cancer Center.
DR. TAYLOR: Sarah
Taylor, University of Kansas Medical Center.
DR. LI: Ning Li, FDA
Biometrics.
DR. KEEGAN: Dr.
Keegan, CDER Office of Drug Evaluation VI.
DR. WILLIAMS: Grant
Williams, FDA Drugs.
DR. TEMPLE: Bob
Temple, Director of ODE I.
DR. PRZEPIORKA: This
afternoon's session is actually split into two.
The first will be three talks regarding non-small cell lung cancer and
clinical trials. We will have a brief
break, followed by an open public hearing and then address the questions that
have been posed to us by the FDA. We
will start this afternoon's session with a talk by Dr. Cohen on non-small cell
lung cancer, the regulatory background.
Non-Small Cell Lung Cancer
Regulatory Background
DR. COHEN: I am
going to review the approvals in lung cancer that the agency has made through
the years.
[Slide]
The data that I am going to present is the data that is in
the individual labels for each drug. So,
the data may be somewhat different from published data that you would find for
each of these trials.
[Slide]
For non-small cell lung cancer there have been first-line
approvals, second-line and third-line.
There were five approvals for first-line. All of these approvals were regular
approvals. For second-line there has
been one approval, also a regular approval.
For third-line non-small cell lung cancer there is one recent approval
which was an accelerated approval. For
small-cell lung cancer second-line there has been one regular approval and
there has been one approval for palliation of non-small cell lung cancer.
[Slide]
This is a listing of the five approvals for first-line
non-small cell lung cancer. There was
one single agent, vinorelbine and four approvals for doublets containing
cisplatin, and the doublet partners have been vinorelbine, gemcitabine,
paclitaxel and most recently docetaxel.
[Slide]
What I am going to do in the next group of slides is review
each of these approvals. This is the
vinorelbine approval. The approval was
based primarily on an improvement in one-year survival and also, as supporting
evidence, there was improvement in response rate. In this trial the comparator regimen was 5-FU
leucovorin given in the Mayo Clinic type regimen.
There were 211 patients entered into the study. There was a 2:1 randomization in favor of
vinorelbine. As you can see, the
response rates were 12 percent versus 3 percent. Median survivals were 30 weeks versus 22
weeks and one-year survival was 24 percent versus 16 percent. The p value refers to the difference in the
survival curves.
[Slide]
Vinorelbine/cisplatin was evaluated in two studies. In the first study vinorelbine/cisplatin was
compared to cisplatin alone and 432 patients were entered. Response rates favored the combination
therapy. Median survivals were 7.8
months versus 6.2 months. One-year survivals
were 38 percent versus 22 percent, and the p value for the survival comparison
was 0.01.
The second study was a three-arm study that included
vinorelbine, cisplatin compared to vinorelbine alone and the third arm was
vindesine/cisplatin. You can see that
the response rates in this study favored the vinorelbine/cisplatin
combination. Median survivals were 9.2
months versus 7.2 months for vinorelbine alone versus 7.4 months for the
vindesine/cisplatin combination. One
year survivals were as listed. The p
value for survival comparing vinorelbine/cisplatin to vinorelbine alone was
0.05 and the p value for the comparison of vinorelbine/cisplatin versus
vindesine/cisplatin was 0.09.
[Slide]
Gemcitabine/cisplatin was also evaluated in two randomized
trials. In the first trial the
comparator regimen was cisplatin alone.
There were 522 patients entered.
Response rates were 26 percent versus 10 percent favoring the
combination. Median survivals were 9
months versus 7.6 months and the p value for that comparison was 0.008.
In the second study, which was somewhat smaller, the
comparator regimen was etoposide/cisplatin.
The response rates were 33 percent for the gemcitabine/cisplatin regimen
versus 14 percent for the VP16/cisplatin.
Median survivals were 8.7 months and 7.0 months. As you can see, that survival difference was
not statistically significant.
[Slide]
Paclitaxel/cisplatin was evaluated in an ECOG trial that
was a three-arm trial. The first arm
included paclitaxel 135 mg/m2.
There was a 24-hour infusion with cisplatin. The second arm was paclitaxel 250 mg/m2
with cisplatin. The comparator regimen
was etoposide/cisplatin.
As you can see, both paclitaxel regimens had an increased
response rate as compared to etoposide/cisplatin. Median survivals were 9.3 months for
paclitaxel 135, 10 months for paclitaxel 250 with cisplatin and 7.4 months for
the VP/cisplatin regimen. In terms of
survival, which is listed on the bottom on the right, the survival comparison
of paclitaxel 135 mg/m2 plus cisplatin compared to
etoposide/cisplatin, the p value was 0.08 and for the paclitaxel 250 mg/m2
the p value was 0.12. However, if you
look at response rates which is a), and time to progression which is b) on the
bottom, both of these were statistically significant in favor of the paclitaxel
regimens, with paclitaxel 250 doing somewhat better than paclitaxel 135.
[Slide]
Docetaxel/cisplatin was evaluated against
vinorelbine/cisplatin and also against docetaxel/carboplatin. A total of approximately 1200 patients were
entered into this study. As you can see,
the median survivals were relatively similar for all three regimens. This was a non-inferiority analysis and doing
the non-inferiority analysis docetaxel/cisplatin retained greater than 50 percent
of the therapeutic benefit of vinorelbine/cisplatin. On the other hand, docetaxel/carboplatin did
not. So, the docetaxel/cisplatin regimen
was approved.
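One common way to express "retaining a fraction of the therapeutic benefit" is on the log hazard ratio scale. The sketch below is a simplified point-estimate version under stated assumptions: the control regimen's historical benefit is summarized as the hazard ratio of a reference treatment relative to the control (greater than 1 when the control is better), and the new regimen is summarized by its hazard ratio relative to the same control. The function name and all numbers are hypothetical. This is not the analysis in the label, which would also have to account for uncertainty in both estimates.

```python
import math


def fraction_retained(hr_new_vs_control, hr_reference_vs_control):
    """Point estimate of the share of the control's benefit kept by the new regimen.

    hr_reference_vs_control must exceed 1, meaning the control regimen beat the
    historical reference treatment. The result is 1.0 when the new regimen matches
    the control and 0.0 when it is no better than the historical reference.
    """
    if hr_new_vs_control <= 0:
        raise ValueError("hazard ratios must be positive")
    if hr_reference_vs_control <= 1.0:
        raise ValueError("the control must have shown a benefit over the reference")
    return 1.0 - math.log(hr_new_vs_control) / math.log(hr_reference_vs_control)


# Hypothetical inputs: the control's benefit corresponds to a hazard ratio of 1.30
# for the reference relative to the control, and the new regimen has a hazard ratio
# of 1.05 relative to the control.
retained = fraction_retained(1.05, 1.30)
meets_half_retention = retained > 0.5
```

In practice the comparison is usually made with a confidence bound on the retained fraction rather than the point estimate, so a regimen can have a favorable point estimate and still fail to establish non-inferiority.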
[Slide]
Docetaxel was also evaluated as a second-line treatment
regimen in two studies. In the first
study docetaxel was compared to best supportive care and 104 patients were
entered. The response rate to docetaxel
in this patient population was 5.5 percent.
Median survivals favored docetaxel, 7.5 months versus 4.6 months, with a
p value of 0.01.
The second study involved docetaxel compared to
chemotherapy that was investigator's choice and 248 patients were entered. The response rates for docetaxel were again
in the 5-6 percent range. The median
survivals were comparable for docetaxel and investigator's choice
chemotherapy. But one year survival for
docetaxel was 30 percent versus 20 percent for investigator choice, and that p
value was significant at less than 0.05.
[Slide]
Gefitinib or Iressa was recently evaluated as a third-line
treatment regimen in patients who had failed a platinum and who had failed
docetaxel. There were 143 patients who
met these eligibility criteria. They
were randomized to receive Iressa 250 or 500 mg/day. Overall, if one combines the two treatment
groups and that was done because it was relatively comparable for each group,
the overall response rate was 10.6 percent with a 95 percent confidence interval, as
listed, and it was of interest that in exploratory analyses response rates were
higher in females, in nonsmokers and in patients with adenocarcinoma.
[Slide]
The one approval in small cell lung cancer was Hycamtin or
topotecan and that was compared to CAV,
Cytoxan, Adriamycin and
vincristine. The eligible population for
this trial were patients who had responded to first-line treatment and who had
then progressed greater than or equal to 60 days after stopping treatment. There were 107 patients in the Hycamtin arm,
104 patients in the CAV arm. The
difference in this study was only in response rate. The response rate was 24 percent for Hycamtin
versus 18 percent for CAV and this difference in response rate was felt to be
of sufficient importance to warrant approval.
[Slide]
The one palliative approval in non-small cell lung cancer
involved Photofrin photodynamic therapy, and that was compared to Nd:YAG laser
therapy. The patient population eligible
for this study were individuals with symptomatic obstructive bronchial
lesions. Symptom severity scales were
used as the evaluation tool. Symptoms
rated were dyspnea, cough and hemoptysis.
Photofrin therapy was of comparable efficacy to Nd:YAG laser therapy.
[Slide]
So to summarize the approval endpoints, in first-line, as I
mentioned earlier, there were five studies.
Three of the approvals were based on superior survival. One approval was based on non-inferior
survival and one approval was based on superior time to progression and
response rate with a trend toward improved survival.
In the second-line setting there was one study and approval
was based on superior survival in that study.
In the third-line setting, which was the one accelerated approval in
non-small cell lung cancer, the accelerated approval was based on response
rate. And, there was one approval based
on symptom palliation.
[Slide]
In second-line small cell lung cancer there was one
approval and that approval was based on response rate. That concludes my presentation.
DR. PRZEPIORKA:
Thank you. We will hold questions
until all three speakers have had the opportunity to present. Next, Dr. Paul Bunn will talk about the FDA/ASCO
non-small cell lung cancer workshop.
FDA/ASCO Non-Small Cell
Lung Cancer
Workshop Summary
DR. BUNN: Members of
ODAC, members of the FDA and guests, I would first like to say that I am
honored to be here. It is a privilege to
be here and I want to mention that I take this extremely seriously because what
I do for a living is to take care of lung cancer patients and I think what you
are deliberating is extremely important.
[Slide]
With respect to the history of why we are here, Rick
Pazdur, in his infinite wisdom, I think agreed with a comment that Bruce Cheson
made this morning and that is not all cancers are the same and in the future it
is highly likely that we are going to have to look at these endpoints in
individual cancers based on data from the individual cancers, not based on
feelings but based on data from these individual cancers. Of course, this morning we heard a lot of
theoretical discussion. Hopefully, this
afternoon we are going to be talking about data-driven discussion.
So, to put the data into context, the FDA and the American
Society for Clinical Oncology had a series of telephone conferences and a
single open public hearing discussing endpoints for approval of drugs for lung
cancer. What you are hearing this
afternoon is somewhat of a rehash of that.
You will be asked some questions based on what you hear.
The way we have done this is that we have divided the
discussion into two topics. The first
topic is what has been called classical endpoints. The classical endpoints that we discussed
were objective response, time to progression and survival. For whatever reason, we called another one
non-classical endpoints. The distinction
I think is incorrect but, anyway, that was largely patient-reported
outcomes. After I get done talking about
the classical endpoints of objective response, time to progression and
survival, Richard Gralla is going to talk about patient-reported outcomes.
I have an apology to make.
The slides that you have in front of you--my secretary and I were in a
miscommunication mode and they have nothing to do with what I am going to say--
[Laughter]
--so don't bother looking at your handout. You will be very confused. You will actually have to look at the slides
and I apologize for that.
Before I actually begin I want to make one correction to
what Marty said and one other comment.
Actually, the Albain study of vinorelbine/cisplatin versus cisplatin
happened after the approval. Actually,
the LeChevalier study for the combination was the primary study and the
Crawford study for single agent was the primary study. The Albain study actually came later and
confirmed what happened but was actually not known at the time of the ODAC
presentation. I know because I am old
and I was there.
I have great respect for the consultants here. I also have great respect for Dan Ihde. What I am going to say is something that I
think in 1985 Dan Ihde and I agreed on and I wish he were here to agree with me
now that what happened in 1985 was a big setback to lung cancer drug approvals.
[Slide]
I am going to begin by trying to keep this simple,
stupid! Why are we here? Drug development takes an enormous amount of
fiscal resources and long periods of time.
Currently we know more about novel targets than ever before. At the same time, there are fewer new drug
applications. We could ask why is
that. It is undoubtedly for many
reasons. It is possible that stringent
FDA requirements for approval at the moment are a deterrent to new drug
applications.
I think we could all agree that most knowledge about drug
utilization and toxicity occurs after the initial approval. We might also agree that if we had safe and
efficacious drugs, expedited drug development might benefit society. Therefore, I think it is appropriate that we
are looking here at criteria for endpoints for NDAs, or new drug applications,
for lung cancer.
As you heard this morning, FDA regulations require that
drugs be safe and efficacious for a defined population by adequate and
well-designed clinical trials. As you
also heard this morning, simple statements are sometimes gray, not black and
white. As you also heard this morning,
FDA legislation does not require that a drug be shown to be superior to other
drugs. It has to be safe and
efficacious; it doesn't have to be better than approved drugs, with a single
exception which I believe should be discussed openly and frankly in this
afternoon's deliberations. The oncology drug
division has determined that drugs given accelerated approval should offer an
advantage over existing agents.
DR. TEMPLE: It is in
the reg.
DR. BUNN: It is in
the reg? Okay. Well, we are going to discuss this during my
presentation.
[Slide]
The question is, well, why would we be here just for lung cancer? What are some of the differences between lung
cancers and other diseases? One of the
differences is that almost all the patients, three-quarters, present with
advanced disease. That is, they are III
or IV.
Most studies show that 90 percent of patients or more are
symptomatic at the time of presentation.
So, our discussion this morning about whether patients would be
symptomatic or not, in lung cancer the basic idea is that they are symptomatic. When they get relapse they are symptomatic;
when they present they are symptomatic.
The majority of patients have co-morbid cardiopulmonary disease. Dr. George was talking about deaths from
unrelated causes. This is a huge problem
in lung cancer. If you look at trials of
adjuvant radiation and adjuvant alkylating agents the hazard rates are 1.2, so
a 20 percent increase in the hazard rate of death is not due to the disease but
it can accelerate the disease. Many of
those deaths are not actual toxic deaths that you would define as a toxic death
but these are sick people and when they get tough treatments sometimes they
die.
In the current SEER data in the U.S. the median age is 70
years old. The majority of these
patients are elderly. Recruitment to
surgical trials is extremely difficult.
In this disease at the moment, unfortunately, complete responses are
rare. So, talking about disease-free
survival is an oxymoron when you are talking about stage IIIB and IV lung
cancer. We don't have to have that
discussion that we had this morning; it doesn't happen.
It used to be that objective responses of 20 percent were
very rare. Fortunately, we have drugs
that work now. We have drugs that make
people live longer and objective responses oftentimes do occur in more than 20
percent of patients.
It used to be that second-line therapy did not influence
survival but now, as you heard from Dr. Cohen, it does. So, some of the issues we heard this morning
about second-line therapy influencing survival will be an issue.
[Slide]
So, classical endpoints--objective response. Up until 1985 this was a major deal. In 1985 Dan Ihde, along with the FDA, looked
at a bunch of data and there was not a wonderful correlation between response
and survival. That probably would be
true today for melanoma and other diseases where responses over 10 percent are
rare. We are going to re-discuss that
now in 2003 to actually look at what the relationship is between response rates
and survival.
Time to progression has not often been used because it is
very difficult to assess and, in the past, because second-line therapy didn't
affect survival. The difference between
progression and survival was very short but we will have a little bit of
discussion about that. Survival I guess
is not only FDA's favorite endpoint. As
you heard this morning, most of us can agree that it is a real and important
endpoint.
[Slide]
So, in the past objective response rates were quite
variable, not consistently assessed; did not always correlate with survival and
most agents, such as the alkylating agents and the anthracyclines were toxic to
smoking patients. Some of these agents
produced response in up to 20 percent but rarely higher of untreated patients
but there was no survival improvement.
Thus, in 1985 the FDA decided that objective response rate was not
definitely associated with patient benefit.
[Slide]
What happened since that time? I think that this is a very important study
and one which really needs to be updated.
In fact, after this morning's discussion I am thinking about having one
of my fellows go back and actually do this.
I partially did this but not in a real meta-analysis.
But there was a study that looked at the correlation
between response and survival in 176 Phase II trials with 7000 patients between
'76 and '95. Since that time, the drugs
that Dr. Cohen mentioned have largely been approved and were not part of
this. The average response rate in these
trials was only 11 percent. I think
since 1995 we are in a different place.
In these 176 trials they found 12 drugs, or 11, that had a
response rate of more than 20 percent.
Those are cisplatin, vinorelbine, docetaxel and paclitaxel. As you heard, all those are approved. This also included small cell so irinotecan,
etoposide, vindesine, epirubicin and ifosfamide and edatrexate showed up in
that list.
They also did a correlation between response rate and
survival time. You can see the
correlation coefficient and the p value.
Then they did a logistic regression coefficient and you can see the p
value between the relationship between response and survival was 0.0003.
[Slide]
So, what has happened since 1995 in terms of what is in the
literature? These are the drugs that
most of us would consider the most active cytotoxic drugs. We have the Phase II single agent studies of
these drugs in untreated advanced non-small cell lung cancer. As you can see, these have response
rates--these are limited institution studies now, not the big cooperative
groups and I will get to those. They had
response rates varying from 20 percent to 27 percent. They had median survival times ranging from
7.6 months to 9.7 months and one-year survival rates ranging from 22 percent to
41 percent. I think from historical
controls, any of us would say, if you are an optimist, the median survival would
be 5 months and the one-year survival rate would be 10 percent. Vinorelbine, as you heard a moment ago, is
the only one of these drugs approved for non-small cell lung cancer.
[Slide]
What about multi-institution Phase III trials with these
same therapies? You can see here that,
again, there are large numbers of patients but there are some differences. The response rates before varied from 20
percent to 27 percent and now the response rates vary from 16 to 18. Why is that?
The primary reason for that is that the cooperative groups require a
post CT scan done four or more weeks later and most trials have them done eight
weeks later. Many of the patients don't
have the second scan and those are unconfirmed responses and the cooperative
groups don't count those patients as having a response. So, it is generally true--and some of the
ECOG or other people could comment on this--that in the multi-institutional
cooperative group trials response rates are approximately five percent lower
than in the limited institutions primarily for that reason.
You can also see that the confidence intervals around these
response rates are actually quite narrow.
Largely, that is because people can actually use RECIST and actually
have objective response rates that are fairly reproducible. Median survivals in these trials range from
6-7 months and one-year survival from 25-33.
[Slide]
I am going to come back to first-line therapy after a
minute but something new happened, and that is patients are living longer. Now, just remember that the minority of
patients have benefit. If you have a
response rate of 20 percent, that means that most patients aren't having any
benefit. Now, median survival is not
likely to change a lot when 10 percent or 20 percent of the patients are benefited. Two-year survival goes from 1 percent to 20
percent in advanced lung cancer with treatment but median survival only goes up
by a couple of months.
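[Editorial illustration] The point that a benefiting minority moves two-year survival far more than it moves the median can be sketched with a toy model. Everything in it is an assumption made for illustration, not data from the slides: survival is treated as a mixture of two exponential groups, and the values of 20 percent benefiting, a 5-month median without benefit and a 24-month median with benefit are hypothetical.

```python
def mixture_survival(t, frac_benefit, median_no_benefit, median_benefit):
    """Fraction alive at time t (months) in a two-group exponential mixture.

    Hypothetical model: each group has exponential survival with the given median.
    """
    s_no = 0.5 ** (t / median_no_benefit)
    s_yes = 0.5 ** (t / median_benefit)
    return (1 - frac_benefit) * s_no + frac_benefit * s_yes


def mixture_median(frac_benefit, median_no_benefit, median_benefit):
    """Time at which mixture survival crosses 50 percent, found by bisection."""
    lo, hi = 0.0, float(max(median_no_benefit, median_benefit))
    for _ in range(60):
        mid = (lo + hi) / 2
        if mixture_survival(mid, frac_benefit, median_no_benefit, median_benefit) > 0.5:
            lo = mid  # still more than half alive, median is later
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical inputs, chosen only to illustrate the argument.
baseline_median = mixture_median(0.0, 5, 24)
treated_median = mixture_median(0.2, 5, 24)
baseline_two_year = mixture_survival(24, 0.0, 5, 24)
treated_two_year = mixture_survival(24, 0.2, 5, 24)

print(f"median: {baseline_median:.1f} -> {treated_median:.1f} months")
print(f"two-year survival: {baseline_two_year:.1%} -> {treated_two_year:.1%}")
```

Under these assumed inputs the median moves by only a little over a month, while two-year survival rises several-fold, which is the pattern described above.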
In the second-line setting the drugs that have been
approved and the drugs that we think about the most are shown here. Response rates range from 9 percent to 16
percent in these trials although the confidence intervals and the ranges are
much broader in the second-line limited institution setting than they are in
the first-line setting.
[Slide]
With respect to multi-institution Phase III single-agent
therapy in non-small cell lung cancer, the data from the trials that we have
had are listed here. Response rates vary
from 8 percent up to 14 percent. Now, as
you heard, docetaxel is approved and gefitinib is approved. Question number six in your handout could be
viewed as a pre-setting for a pivotal trial looking at pemetrexed in the
second-line setting and the response rate, median survival and one-year survival
from that trial are shown here.
[Slide]
So, a question that I hope you all will address, because I
think it is extremely important--in 1985 it was basically determined that
objective response was not either a likely patient benefit or a definite
patient benefit, and in my opinion objective response that exceeds a certain
threshold should be considered as likely evidence for patient benefit--likely,
not proven. In Dr. Fleming's terms this
morning, that would be his group C. I
think that objective response over 20 percent in untreated patients is a likely
surrogate for patient benefit. It is
possible that meta-analysis could change that into a definite evidence of
patient benefit, as documented by symptom relief and/or survival.
Every drug that we know of with a response rate over 20
percent in limited institution trials and over 16 percent in
multi-institutional trials has been shown in randomized trials to affect
survival, and most of them have been shown to relieve [sic] patient
benefit. I am not going to discuss
patient benefit in terms of symptoms because Richard Gralla is going to talk
about that.
So, if one could consider that objective response is a
likely indicator of clinical benefit, the question is could accelerated
approval be given based on objective response rates? Certainly, I think that they could. One could say that if the surrogate is
definite it is full approval. If the
surrogate is likely, it is an accelerated approval. Well, I believe it is likely. It could be definite but I think it is likely
so it should be considered for accelerated approval.
Another thing is that RECIST criteria I believe are
actually good and can be reviewed independently by the FDA and independent
committees. So, I believe that the
endpoint we are talking about here is a reproducible endpoint.
[Slide]
In the first-line setting one could argue that if an agent
had an objective response rate of more than 20 percent in a limited institution
study or 15 percent in a multi-institution trial that a drug might be given
accelerated approval. One could argue in
a second-line setting active agents have objective response rates of more than
10 percent in limited institution studies and more than 8 percent in
multi-institutional studies.
Now, to demonstrate this type of response is actually not
trivial. These data are I think almost
right but not exactly right. I have a
little bit better data from Dr. Piantadosi.
If you want to show that a drug has a 25 percent response rate,
plus/minus 5 percent, a 95 percent confidence interval of 5 percent, Dr. Piantadosi
informs me that would be a 400-patient trial.
If that goes to plus/minus 4 percent the number would be 625 patients.
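[Editorial illustration] The trial sizes quoted here can be checked against the standard normal-approximation formula for estimating a proportion to within a given half-width, n = z^2 p(1 - p) / d^2. This is a sketch of the usual textbook calculation, not a statement of how the quoted figures were actually derived. As an observation only, 400 and 625 match the conservative choice p = 0.5 with z rounded to 2, while p = 0.25 with z = 1.96 gives smaller numbers. The function name `patients_needed` is hypothetical.

```python
from fractions import Fraction
import math


def patients_needed(p, half_width, z):
    """Patients needed so a confidence interval on a response rate has the given half-width.

    Normal approximation: n = z^2 * p * (1 - p) / half_width^2, rounded up.
    Arguments are decimal strings so the arithmetic is exact.
    """
    p, d, z = Fraction(p), Fraction(half_width), Fraction(z)
    return math.ceil(z * z * p * (1 - p) / (d * d))


# Conservative planning values (p = 0.5, z = 2): assumption, see lead-in.
print(patients_needed("0.5", "0.05", "2"))      # plus/minus 5 percent
print(patients_needed("0.5", "0.04", "2"))      # plus/minus 4 percent

# Using the stated 25 percent response rate and z = 1.96 instead.
print(patients_needed("0.25", "0.05", "1.96"))
print(patients_needed("0.25", "0.04", "1.96"))
```

Either way, tightening the interval from plus/minus 5 to plus/minus 4 percent raises the required number of patients by more than half, because the sample size scales with the inverse square of the half-width.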
[Slide]
This is not actually just an academic consideration
here. Not all the drugs work that are
developed. Current FDA policy promoting
Phase III survival trials has led to the institution of multiple Phase III
trials after the completion of a Phase I trial even when no single-agent
activity was observed in the Phase I trial.
No inactive drug has ever been shown to improve survival or improve
patient symptoms when used alone or in combination with chemotherapy. However, going straight from Phase I to Phase
III has led to multiple negative trials costing not thousands but millions of dollars
and thousands, not hundreds, of patients' lives and resources.
Examples of randomized trials of agents not showing any
activity up until the time of a survival Phase III trial are shown here,
tirapazamine, MMPIs and a Genasense compound and a whole bunch ongoing.
[Slide]
This is what we have learned from these trials. These inactive agents when combined with
active agents do nothing. This
particular negative trial had 700 patients.
No benefit to the patient.
Probably approximately 100 million dollars wasted. If objective response had been available to
get accelerated approval, people would throw away the inactive drugs. Because they can't get accelerated approval
for active drugs, they go straight from Phase I to Phase III, waste millions of
dollars, thousands of patients' lives. I
would submit this is not a good state of affairs. Obviously, you may all disagree but it is not
my favorite thing.
[Slide]
Single-agent activity of tirapazamine has never been
established. Nonetheless, for the same
reason multiple Phase III trials were done.
Interestingly enough, one of these Phase III trials, shown here, showed
an improvement in response rate of tirapazamine/cisplatin versus
cisplatin. The response rate was higher,
survival was higher but when this was done in another trial response rate was
not improved nor was survival. This does
show why we should also discuss in certain instances why you might want two
trials instead of one. Perhaps we can
discuss that.
[Slide]
Now, some drugs that get developed are not all that far
from patent expiration. When companies
need a Phase III survival advantage trial to get a drug approved and it is
going to take five years and they are four years away from their patent
expiring, they may not want to develop the drug. So, a drug called oxaliplatin was done as a
Phase II trial in lung cancer.
Interestingly enough, it was done in performance status II patients
which, as everyone knows, is a very bad group of patients. The response rate was 15 percent, median
survival was 8 months and there was not a single grade III or IV hematologic
toxicity.
If accelerated approval was available for this drug, on the
basis of this probably one would want to do a big trial to try to get
accelerated approval. The huge question
is whether this drug will ever see the light of day for lung cancer patients
because of the current interpretation of how to get a drug approved.
[Slide]
When we get into combinations response rate sometimes gets
a little trickier. This is a trial that
makes us all humble of course and it highlights the issue about response and
median time to progression, and perhaps would be used to say that there should
be surrogates for likely benefit, not definite benefit.
This was a study from Germany that compared cisplatin to
Taxol and cisplatin. The Taxol and
cisplatin arm had a much higher and statistically significant higher response
rate. It also had a statistically
improved median time to progression. On
the other hand, survival was actually a little worse, not statistically so but
a little worse in the combined therapy arm.
I don't know what to make of this trial. It is certainly an outlier and it shows why
outliers happen. One could argue that
this is why objective response and time to progression should be surrogates as
opposed to definite relationship to patient benefit.
[Slide]
Now, if accelerated approval was actually available and
people took advantage, where would we be today?
Actually, docetaxel, paclitaxel, gemcitabine, irinotecan, pemetrexed and
cisplatin would be approved for lung cancer and I don't think there is a single
person in this room who thinks that would be bad. Drugs that would not be approved and have
either been shown not to be useful under Phase III trials at the moment are
equally as many. And, why do we have to
go through large, 1000-patient, randomized trials for inactive drugs?
There were drugs approved, vinorelbine and gefitinib, and
gefitinib was actually approved by accelerated approval based on response. That precedent that you all set--I think what
you did was right. I think what you did
should be common, not uncommon. Not
every active agent has a response rate over 20 percent. Carboplatin, I think most of us would agree,
is a useful drug and makes people with lung cancer live longer but doesn't have
a response rate over 20 percent.
[Slide]
So, just to reemphasize what you did, if gefitinib had not
been studied in large numbers of patients and approved based on response rate,
it would be gone because the company did what all the other companies have been
doing, going straight from Phase I to Phase III, and they did that as
well. They went straight into combined
studies. As you all know, those trials
were negative.
Besides the fact that most of us think that lonafarnib and
gefitinib are drugs that should be approved for lung cancer, we have to learn
how to use them. Look at the time to
progression in these trials. After the
chemotherapy was stopped the groups that got gefitinib did better than the
group that got placebo in both trials. I
think everybody in this room thinks we need to understand why that is. We wouldn't be able to understand why that is
if these drugs were not given accelerated approval--these would be gone.
[Slide]
Now I am going to talk a little bit about these EGFR
inhibitors--
[Slide]
Before I do I want to say one thing, FDG PET hasn't been
studied nearly as much as CT response.
In every trial comparing CT response to PET response, PET response is
correlated with survival better than CT response. There is not a single trial where PET response
is not correlated with survival. I
think, if nothing else, we should be encouraging our pharmaceutical colleagues
to consider this for development as a potential surrogate endpoint that
actually could be better than actually objective response by CT.
[Slide]
So, what about subsets?
Lung cancer is not one disease.
We heard this morning that leukemias are not all the same. Bronchoalveolar carcinoma and large cell
neuroendocrine carcinoma are not the same disease. Small cell carcinoma is not the same as
non-small cell carcinoma. What are we
going to do about subsets? If we require
that for a subset approval a company has to do a Phase III survival trial,
forget subsets. Forget it. If companies can get accelerated approval
based on response rates in subsets, we might be able to make some progress.
[Slide]
Everyone sitting at the front of the room can identify this as
a classic patient with bronchoalveolar carcinoma, which is one subset of
non-small cell carcinoma. Those of us
who deal with this disease know this is not a very chemosensitive disease. We don't have a ton of data but what data we
have suggests response rates are lower in bronchoalveolar than in any other
histology.
Anecdotally it was found that EGFR inhibitors often make
responses in patients that have this chemorefractory disease. It is also anecdotally noted that these
patients have high expression of EGFR and HER-2, which was unexpected.
[Slide]
Now, we have a problem between the pathologist and the
clinicians. Pathologists say that
bronchoalveolar carcinoma has to be non-invasive. So, they are talking about infiltration among
the alveolar septa where there is basically no invasion. They divide bronchoalveolar carcinoma into
mucinous and non-mucinous forms. When we
see these bilateral infiltrates what we usually have is invasive adenocarcinoma
with bronchoalveolar features. So, that
is something that we have to work out between the clinicians and the
pathologists.
[Slide]
But as I mentioned, bronchoalveolar carcinomas have very
high expression of EGFR and HER-2.
[Slide]
This is what we know about bronchoalveolar carcinoma
clinically. Chemotherapy, as I
mentioned, has response rates that generally are lower. So, Taxol which has a response rate of 25
percent in other Phase II trials had a response rate of 14 percent. There tends to be a little more indolence so
survival is a little bit better even despite the low response rates; median
survival at one year 50 percent.
There have been two Phase II trials of erlotinib and
gefitinib in bronchoalveolar carcinoma.
Response rates were 24 percent and 19 percent. Median survival was 12.5 months versus not
reached after 7 months. One-year
survivals were 80 percent and 57 percent.
Remember, these are pills compared to cytotoxic chemotherapy.
[Slide]
This is the Southwest Oncology Group, two consecutive
trials, not randomized. Overall survival
standard Taxol--this is the data we saw before.
Response rate was 1 percent; median survival 12 months.
[Slide]
This is the data with gefitinib in the Southwest Oncology
Group. The untreated patients had a
median survival of 15 months and a one-year survival rate of whatever I said,
57 percent. Even the previously treated
patients had a median survival of 10 months.
It is likely, when we get to randomized trials, that these
single-agent pills will be better than our standard two-drug chemotherapy. Remember, if accelerated approval had not
been granted for these drugs--we only had those randomized Phase III
trials--these drugs would not be seeing the light of day. And, in that large list of other drugs that
went to Phase III trials, how many are actually active? We don't know because people were afraid to
give approvals based on objective response.
[Slide]
Time to progression, there are a lot of problems that you
heard about. One of the major of those
is the frequency of assessment. We are
looking at changes. Median time to
progression in untreated patients is four months. A 25 percent reduction is going to be a
difference of a month or less. We get CT
scans every eight weeks. The frequency
of assessment for time to progression is a huge issue here. Not only that, cycle length can actually
affect time to progression. If the cycle
length varies, therefore, the time you get the CT varies.
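[Editorial illustration] The effect of scan frequency on measured time to progression can be sketched with a simple rule: progression is only recorded at the first scheduled scan on or after the week it truly occurs. The rule and the example weeks are assumptions for illustration (roughly 17 weeks for a 4-month median and 13 weeks for a 25 percent shorter time), not data from any trial discussed here.

```python
def recorded_progression_week(true_week, scan_interval_weeks):
    """Week of the first scheduled scan at or after true progression.

    Assumes scans happen exactly every `scan_interval_weeks` weeks from week 0
    and that progression is never documented between scans.
    """
    scans_needed = -(-true_week // scan_interval_weeks)  # integer ceiling division
    return scans_needed * scan_interval_weeks


# Hypothetical true progression times for two arms, in weeks.
arm_a_true, arm_b_true = 13, 17

for interval in (4, 6, 8):
    a = recorded_progression_week(arm_a_true, interval)
    b = recorded_progression_week(arm_b_true, interval)
    print(f"scans every {interval} weeks: recorded {a} vs {b}, difference {b - a}")
```

With the same true 4-week difference, the recorded difference under these assumptions is 4 weeks with scans every 4 weeks, 0 weeks with scans every 6 weeks, and 8 weeks with scans every 8 weeks, so the schedule alone can hide or exaggerate a difference of this size.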
Another issue is sick and progressing patients may not be
evaluated. Most of us who treat lung
cancer patients, when they get sick and get worse, that is the end of it. If they need a CT scan six weeks later and
they have already progressed, and all that, a CT scan is not obtained. As you heard, oftentimes these patients die
without any documentation of what actually happened.
[Slide]
This is an example of some of the problems with TTP that
might argue it might be a surrogate endpoint.
This is the four-arm ECOG trial.
The PIs of that trial are sitting to my right. It was comparing four different two-drug
combinations. The response rates you see
here. Time to progression varied from
3.3 [sic] months to 4.5 months. The 4.5
months with gemcitabine and cisplatin was actually statistically significant
compared to the 3.5 [sic] months in the paclitaxel/cisplatin arm. But just remember this is a three-week cycle
and CT scans are obtained every six weeks.
This is a four-week cycle and CT scans are obtained every four
weeks. As you can see, there is no
difference in any of the survival outcomes.
So, this might be a surrogate but it would be hard to say that this is a
definite endpoint, definitely associated with survival and I think in lung
cancer time to progression has really a lot of issues.
[Slide]
You were talking about disease-free survival or time to
progression in early stage patients.
Certainly, if you progress you are symptomatic but the question is what
is the timing of the assessments.
Another thing is that relapses are essentially always
followed by a short survival. So, the
advantage you have in some other diseases of doing this with much shorter
intervals may not happen here.
Another problem is that, again, these patients are highly
likely to die, not from toxic deaths but related to a toxic therapy. Those deaths are scored in very many
different ways.
[Slide]
Just to show you that in the recent trials, this is a trial
of a very toxic regimen, MIC. Three
drugs, mitomycin, ifosfamide, cisplatin.
Remember, ifosfamide-based treatments increase the hazard-related
death. In this particular trial there
was an improvement with the MIC chemotherapy.
The hazard rate was 0.89. It
wasn't statistically significant. It
certainly favored the chemotherapy. But
look at what happened in survival. The
people who got the chemotherapy were dying earlier. They did cross but the hazard rate for
survival was 0.96 and, obviously, that wasn't statistically significant. So, if this had been a little bit better in
progression-free survival there might have been an approval without an
improvement in overall survival.
[Slide]
That actually happened.
These are all trials, by the way, from ASCO this year or last year. This was an intergroup trial looking at chemo
radiation versus chemo radiation followed by surgery. Time to progression favored the triple
therapy. You can see this is the time to
progression in the triple therapy and the p value was 0.02. It was better in terms of time to
progression. What happened in terms of
survival? The triple therapy arm had a
lot of deaths early on. It was worse
early on. Perhaps it was a little better
later on, a p value of 0.51.
Now, some people have interpreted this to say that triple
modality therapy is better. I have a
hard time with that. I think we still
all agree that survival is a pretty hard and important endpoint. And, I think that in some of these trials we
might have been misled by the time to progression analyses, not always, especially
if the treatment is not so toxic.
[Slide]
This is a two-drug platinum based regimen, a more modern
regimen looked at in the adjuvant setting.
This is disease-free survival, statistically significantly in favor of
the chemotherapy. Survival looked like
this. Survival was statistically better
as well. In this case time to
progression or disease-free interval and survival were the same but it didn't
take much extra time to find out that survival was also better as well.
[Slide]
So, I still think that survival does remain as a major
indicator of clinical benefit and symptom relief may also be a major indicator
of patient benefit. Richard Gralla is
going to talk about that.
[Slide]
So, I believe that survival should remain as a major
endpoint for clinical benefit and for approval.
Richard Gralla is going to talk about this, but I believe symptom relief
can be considered as an indicator of clinical benefit and also granted full
approval, but Dr. Gralla is going to talk about that. In my belief, objective response can be
considered as a likely endpoint of clinical benefit and, therefore, an
acceptable endpoint for accelerated approval.
With the current regulations, since new drugs are likely to
offer an advantage in toxicity over existing drugs, requirement for a benefit
over existing therapies is not a major obstacle if response was considered as a
surrogate. But in the future this could
limit drug development if this requirement of being better isn't gotten rid
of. I hope that you, as ODAC, might
advise the FDA whether they really ought to look at that accelerated approval
improvement requirement for being better than existing therapies. Right now if you granted accelerated approval
based on objective response, I think since we are going to have better toxicity
with the new drugs it will be okay but in the future when we get a bunch of
targeted therapies if you got two targeted therapies that are active one is not
going to be less toxic than the other, and why should one be approved and not
another? I don't understand that. I think drugs should be approved because they
are safe and efficacious, like the law says, not efficacious and better than
something else. TTP--I am not sure if it
is a marker for accelerated approval at the time or not. Thank you very much.
DR. PRZEPIORKA:
Thank you, Dr. Bunn. The final
speaker for this session will be Dr. Richard Gralla who will talk about quality
of life and patient-reported outcomes as endpoints in clinical cancer
trials. Due to technical difficulties,
why don't we take our break a little early.
Let's be back here at 2:10. Thank
you.
[Brief recess]
DR. PRZEPIORKA:
Would you take your seats, please?
Dr. Gralla?
Quality of Life and Patient-Reported Outcomes as Endpoints in Clinical Cancer Trials
DR. GRALLA: Thank
you very much. We had an unplanned pause
but it looks like we all benefited from it.
[Slide]
It is always a pleasure to share the podium with Dr. Bunn
and to be here at the FDA to discuss these interesting areas. I am going to add to the non-small cell lung
cancer a little bit on mesothelioma, given that it fits all of Dr. Bunn's
criteria in terms of being a difficult disease with very similar parameters.
I also want to thank the many members of the group that
contributed to the presentation.
Obviously, we are not all going to agree. Where you agree with me, those are my
ideas. If we disagree, those are the
other folks on the committee.
[Laughter]
[Slide]
This new term, patient-reported outcomes, PROs, sort of
defines clinical benefit or a term that probably could have stayed as
palliation for this purpose and quality of life.
For quality of life we need a multidimensional concept that
includes areas less likely to be affected by chemotherapy, the spiritual,
perhaps less the psychological and social but certainly the physical and
functional.
For clinical benefit, we talked about the original
definition. It includes areas more
likely to be affected by the treatment choice.
Why isn't it just symptom benefit?
Well, performance status is not a symptom is probably the reason. So, it includes functional and physical
aspects as well but areas likely to be affected.
So, this is sort of the overall working of PROs--symptom
palliation, quality of life as well, but quality of life used in a
denotative way, not as a connotation of oh, it must affect his quality of life.
[Slide]
This is probably my slide that I should have entitled much
like Dr. Bunn's, sort of the why are we here?
Is there really a need to look at PROs?
I think the answer is absolutely yes.
Every physician knows that hardly a day goes by that a patient doesn't
say to us, you know, doctor, I am interested in my quality of life as well, and
why isn't that involved in drug approval?
It should be and I think we have heard the desire for it to be.
Lung cancer and mesothelioma are highly symptomatic
diseases. Survival and response reveal only
a portion of the experience that our patients and families have. Our treatments vary in their side effects and
risk profiles, some of them really being quite toxic but this applies to
surgery, radiation and chemotherapy. So,
we have to be able to balance that experience in some way. The response rate simply won't do that. Actually, if we are honest with ourselves,
meaningful survival differences are most uncommon. Every trial is designed to look at the
survival differences but they are extraordinary when they occur.
[Slide]
The question came up before do we really know what symptoms
to look at? You are darned right we do in
lung cancer. We absolutely do,
mesothelioma as well. Look at the
frequency on presentation or during the time for non-small cell lung cancer and
small cell lung cancer for these common symptoms that our patients present with
and tell us about.
In the development of the better instruments, which I will
talk about, the input of patients is absolutely crucial or we could not have
been able to assemble such instruments.
These were not developed by people in "ivory towers."
[Slide]
Our patients are highly symptomatic at baseline. This is a large, 30-center trial. We looked at using a validated quality of
life instrument in the beginning. As you
can see, 80 percent of patients present with three or more of these symptoms,
92 percent with two or more. So, another
way perhaps of doing it, to get away from some of the multiplicity issues, is
to look at how patients rate their overall symptom distress, what the symptoms
really mean to them. It gets back to
some of the functional issues as well.
Unfortunately, people at presentation first-line are extremely
symptomatic.
[Slide]
Looking at survival, and this is just a compilation of
large randomized trials over the past decade.
The red bar represents supportive care.
We no longer have the issue does chemotherapy improve survival over
supportive care. Seventeen out of 17
trials with this design--way too many--showed improvement over supportive
care. The majority of those trials
independently showed an improvement in survival. Way too many trials were done there.
The next bar, next to the red, is just platinum alone and
Dr. Cohen told us about platinum alone.
But if we look at the last three bars, carboplatin combinations, older
cisplatin combinations and newer cisplatin combinations, yes, the newer drugs
have a little bit of a benefit for us; they are easier for us to use in many
ways and we prefer them. But in terms of
survival benefit, it is very, very difficult to have a meaningful survival
benefit although, God knows, we don't want to talk about what a meaningful
survival benefit might mean. We have
already sort of addressed that one. But
it is pretty hard to have survival benefit that gets our attention.
[Slide]
Dr. Janet Dancey really put together a lot of this and I
think she is just right. Here PROs can
create an accurate picture of the disease.
Without this we are missing what our patients tell us about in every
single patient encounter. We must have
this to really understand about the disease.
The second paragraph--unfortunately, many studies have
shown us that we are not so good as nurses and as doctors in predicting how our
patients feel about these things. It is
too bad but, unfortunately, has been reproduced even in the JNCI and in the
Miles trial was shown once again.
Interestingly, why we need this is that response rates
under-estimate the benefit. It appears
we don't need a major response to be able to have enough change to be able to
have benefit.
Finally, how do we have this balance between symptom
improvement, toxicity, the difficulties of treatment and the benefits? There are many examples where more toxic
regimens are associated with greater patient benefits, including their symptom
relief, etc. So, to be able to put this
together is not easy--actually, it is easy, we have to ask the patients and
they can tell us.
[Slide]
So, the four questions I have always had with these areas
are can we define quality of life? We
surely can define pain, dyspnea and cough.
Can we measure quality of life? That is what a lot of the conversation was
about. Can we quantify the more
subjective aspects? We quantify
subjective aspects all the time in many different areas in behavioral science.
Can we agree on how to analyze the data? I am not sure we are quite there yet but I
think we are getting closer. We have a
lot of good people around the table who can help us with that.
Can we present the data in a way that is clear and useful,
not looking at 99 different endpoints, etc.?
That is nuts!
[Slide]
Define it. If we ask
each one of us in the room to define quality of life in one, two or three
sentences we will probably end up with some disagreement. If we sat here for a while we would probably
come pretty close and be able to carve out one paragraph. One thing we can agree on is this is probably
made of these dimensions, the physical such as symptoms and side effects; the
functional which we talked about earlier, psychological, social and
spiritual. Spiritual doesn't have to
mean religious; it can be meaning of life.
So, these are the denotation areas of quality of life. Now, the other PROs, the patient-reported
outcomes, deal more with the physical and functional.
[Slide]
This is the model part of the content or actually the
construct validity for quality of life.
Dr. Patricia Hollen publishes for the LCSS instrument. Well, if we look at the physical dimension
and the functional, those are what are, for the most part, discovered or looked
at in the other PRO dimensions, the symptoms, the performance status. Yes, we can look at functional
dimensions. The FACT-L actually does a
very nice job of looking at the differences in function and how function is
meaningful, and we don't have to look at these as a lot of different endpoints. So, we can focus on the physical and
functional, which account for about 75 percent of the variance in many of the
studies, and globally capture quality of life in the others.
[Slide]
Instrument development has changed, or instrument use has
changed in quality of life. We have
instruments that are good for all populations that are kind of interesting to
look at, but I think it is clear that there would be a need for instruments
that are more cancer specific than, say, osteoarthritis. The pace of these diseases can be quite
different.
We talked a little bit about lymphoma. The B symptoms of Hodgkin's disease are a
great deal different than the symptoms of lung cancer. Issues such as fertility are issues that we
think about all the time in younger patients with lymphoma but it is not really
such an issue in lung cancer. So, we
need disease specific instruments. We
might even need treatment specific. We
talked earlier today about adjuvant trials.
In adjuvant trials in lung cancer in patients with stage I and II, we
want to look a year later to see if our interventions in an adjuvant trial in
somebody who has undergone a right pneumonectomy whether we have good quality
of life a year later. That may be a
different instrument that refocuses on the functional endpoints than we would
use in a clinical trial in stage IV where that patient has an expected 7-, 8-,
9-month life altogether and we have such instruments as well.
[Slide]
Here are the three instruments with acceptable
psychometrics. We will look at the
psychometrics in a second, the LCSS, EORTC QLQ-C30 and the FACT-L. The latter two, the EORTC and the FACT-L are
similar. They are 30-40 items total, a
general module 7-13 for the lung cancer.
The LCSS was developed specifically for clinical trials and clinical
management. It is shorter; 8 items in
mesothelioma, 9 in lung cancer and 6 observer items but the observer scale is
optional. They take between 3 to 10, 12,
15 minutes. These are not the 99-item
instruments that are out there, and more.
We do not need those.
[Slide]
What kind of validation have they been through? They have been through very serious
validation methods. These validation
methods were not set up for cancer; they were set up for behavioral science and
they are very strict and are much more difficult than, say, RECIST or most of
the other things that we have been talking about. We can see that these instruments to be
useful must be valid, reliable and feasible, able to be used in a real clinical
practice in real time studies.
Here are some of the psychometrics that are there. As far as the content validity, the content
of what we looked at if we didn't have patient agreement, patient input, it
wouldn't be worthwhile. Fortunately,
that is true in all these instruments.
[Slide]
If we look at internal consistency, if we look at the
reliability, stability--do you get the same results if you give it again to the
same patient? Do you get it if you give
it in different groups of patients who have the same characteristics? The answer is yes. Dr. Nunnally wrote the textbook in this area,
not as far as oncology is concerned, and the instruments that I showed you,
those three instruments stand up very, very well.
[Slide]
If we look at two of the lung cancer instruments, for
instance, that are used the most in U.S. trials which is why I looked at them,
if we look at their reliability coefficients, the Cronbach's alpha for their
core measures, they come out very, very well, and much better than needed for a
new measure. For the lung cancer module
they come out really quite well also. In
fact, we have a new publication from Dr. Chris Earl and Jane Weeks that looked
at quality of life and PRO instruments in oncology and the lung cancer
instruments, specifically the LCSS, are among the very best in all of
oncology. So, as far as lung cancer is
concerned, we are blessed by having some really pretty good instruments and
most of these instruments now are being put into electronic format so that they
can be very, very easily done with very little extra time for patients or data
managers.
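As a minimal sketch of the reliability coefficient named above, Cronbach's alpha can be computed from item-level scores roughly as follows. The function name and the sample scores are hypothetical and are not data from the LCSS or FACT-L publications; the formula itself is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration only.
    """
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up scores: three respondents, two items.
example = [[1, 2], [2, 4], [3, 3]]
alpha = cronbach_alpha(example)
```

Values closer to 1 indicate that the items move together; the acceptable threshold for a given use is a judgment that this sketch does not settle.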
[Slide]
If we look at other types of validity, construct and
criterion-related, they are really there. They
compare well to gold standards and other aspects. So, there is no doubt that the validity
process that has been used for these types of measures in a variety of
different conditions are met by these validated instruments, not necessarily by
other instruments.
[Slide]
We talk about this clinically meaningful difference. I am just floored why it is that this should
be answered for these PRO endpoints and quality of life but not for
survival. I really am amazed that we can
even talk about non-inferiority if we can't set what the border is for survival
that would be important. I think that
this really becomes rather difficult. We
know it doesn't meet non-inferiority but what was the border? Why was that boundary selected? The same thing is true here.
I like what Dr. Williams said, we look at whether there is
a statistically significant difference, whether we can be confident that there
is a difference. Let's apply whatever we
are applying to these PRO or quality of life endpoints too. Either we have a difference or we don't. It is for somebody to look at and say that
three percent difference doesn't mean much to me. We heard the five-week difference didn't mean
much. But, of course, Dr. Cohen
presented a lot of five-week differences here that we have approved drugs on,
and there is value to normative data being collected as well.
[Slide]
Phase II trials, single-arm, non-randomized trials, these
trials suffer from the same problems that survival studies do. We talked about the gefitinib trial
before. We were all glad to see that
patients had a rapidly occurring change.
Of course, that was really looked at from the subscale FACT-L, not
necessarily the whole FACT-L and, yes, there was symptom improvement and these
are all very nice things to see. But the
problem with these is, just as with survival analysis, that with the lack of a
control group we don't have a context.
[Slide]
What makes it particularly difficult in symptom control is
that we are giving standard palliation.
It is not a blinding issue. Of
course, we are giving pain medicines to people who have pain; cough medicines
to people who have cough and oxygen to people who are dyspneic. We wouldn't want to do a trial that was any
other way. These are confounding
problems but they are what we deal with in clinical medicine every day. So, without having something for context I
have no idea whether or not that is a great response rate we see or not. So, in Phase II these are helpful in
hypothesis generating but difficult for us to say that they lead to true
improvement.
This can lead to an overestimate of benefit. On the other hand, if we just looked at the
response rates, since less than a major response gives benefit, that has been
an underestimate of benefit. So, there
are problems with Phase II. It is
probably really good to analyze these data in Phase II studies so it can be
more useful in trying to guess what difference we need to look at in Phase III.
[Slide]
What about Phase III trials? What kind of problems do we run into there in
comparison trials? Well, these are the
complaints that we hear the most, cumbersome instruments. Yes, but actually the three instruments I showed
you are not so cumbersome, the 3-, 5-, 15-minute analysis isn't so bad. It takes a whole lot less time than the MRI
that we get all the time or the PET scan or the CT scan. People say how can you ask a sick person to
complete this questionnaire that might take them five minutes, you mean as
opposed to getting into an MRI machine?
It is really very easy. It is
tough to get the sick patient who may have progressed over to the PET scanner
but it is not so hard to do these instruments and many of these can be done by
phone.
Patient deterioration is a big problem and this can lead to
the sloppy data that we heard about before or asymmetrical follow-up--nice
term; I like that term. If we don't
follow-up equally in two groups in a Phase III, that is not good. So, we need to be looking at patients even
after they progress. Lack of
investigator commitment. How do we
prevent that? We emphasize it from the
very beginning.
[Slide]
This looks at those same 673 patients that I showed you
before with those symptoms. We wanted to
see after three cycles how many were staying on study, 64 percent. The main
reason for coming off and not having assessment was disease progression. This is completely controllable simply by
following with something as simple as an instrument that costs pennies, not
thousands of dollars, to be able to follow this.
Another advantage of following the PROs is we talked about
the problem of contamination with crossover.
This isn't crossover. We don't
have to worry about that. It is eliminated
from looking at this. So, we should be
able to improve this follow-up by at least 20 percent to be able to get 80-90
percent adherence rather than the 64 which is certainly not good.
[Slide]
Who drops out? Who
is in the attrition group? Well, we
looked at age which is not a prognostic factor in lung cancer and there was no
difference between the on-study group and the attrition group by age. Indeed, if the symptom burden was worse or if
the quality of life was lower, those patients were disproportionately seen in
the attrition group. Think what that
does. That takes an arm that is inferior
in terms of response or survival and it drops out the more symptomatic or lower
quality of life patients, artificially making the inferior arm look
better. So, that is a real problem. Is it surmountable? Easily and it has been surmounted.
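The attrition effect described above can be illustrated with a small sketch. All scores, arm names and the dropout threshold below are made up for illustration, and the rule that every patient under the threshold drops out is a deliberate simplification, not the pattern observed in the 673-patient data.

```python
from statistics import mean


def completer_mean(scores, dropout_below):
    """Mean score among patients still assessed, assuming (hypothetically)
    that every patient scoring below `dropout_below` leaves the study."""
    remaining = [s for s in scores if s >= dropout_below]
    return mean(remaining)


# Made-up quality of life scores (0 = worst, 100 = best).
better_arm = [60, 65, 70, 75, 80]
inferior_arm = [40, 45, 50, 70, 75]

# Gap between arms when every randomized patient is assessed.
true_gap = mean(better_arm) - mean(inferior_arm)

# Gap when only patients scoring 50 or more are still being assessed.
observed_gap = completer_mean(better_arm, 50) - completer_mean(inferior_arm, 50)
```

With these invented numbers the gap between arms shrinks from 14 points to 5 once the low scorers stop being assessed, which is the sense in which the inferior arm is made to look better.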
[Slide]
This is from a mesothelioma study. I will talk a little bit more about it. Nick Vogelzang published this study in the
JCO this summer. It is pemetrexed-CIS
versus CIS in advanced mesothelioma.
What did they do? They conducted
a brief training session so that everybody involved understood why quality of
life and PROs were being done. They
included baseline quality of life data as part of the randomization which
emphasized the importance that we really want this as much as we want the CT
scans. They continued to have emphasis
while monitoring the trial and, as a result, more than 90 percent of the
planned assessments--this was done weekly which I think is excessive and there
are reasons to believe it is excessive, but more than 90 percent of the planned
assessments were done. So, this is
probably the industrial standard.
[Slide]
We talk about survival, quality of life and response as
being separate. We need to analyze them
separately, that is correct but, of course, they are more related than
different. They are related because they
are largely determined by the malignancy.
If we cannot control the cancer we will not be able to improve survival
very likely or quality of life. Of
course, if the treatment is harsh then this could have a negative impact on
survival or quality of life or both.
But when we look at the approved regimens that Dr. Cohen
showed us, they are all pretty similar in terms of their toxicities. There are not big differences. So, we shouldn't expect with modern care that
that is the problem. So, they are
inter-related but they are not identical, these endpoints, and quality of life
is a very important one. But I don't
think we should ever look at quality of life without looking at survival or
looking at survival without looking at quality of life, but either one of these
could be a primary endpoint.
I like what Dr. Bunn had to say about response and
accelerated approval but when we talk about large trials response is probably
not of great value if it doesn't contribute to quality of life or if it doesn't
contribute to survival, and probably any good treatment will contribute to both
because it is mediated through the malignancy.
[Slide]
This looks at the survival based on quality of life at
baseline. If we look at that group that
scored their quality of life in the lower half of the group, they had a much
inferior ultimate survival when compared with the group that scored their
quality of life in the top half of the group.
That is not too surprising but this was a more important prognostic
factor in multivariate analysis than any other, including stage III versus IV,
including gender, including performance status.
So, ignoring quality of life is missing the boat on a lot of these
areas. Yes, it is more difficult to
measure quality of life than to use the instrument that we use for survival,
that instrument being a calendar, but I should think we are a little bit more
sophisticated than just having the ability to use a calendar.
[Slide]
For Phase III we have problems in analysis. The standards for statistical approaches
remain controversial. I do agree that
the less modeling we can use, the more data that we can include, the better off
we are. There are problems with simply
averaging scores. Survival differences
complicate quality of life analysis because the attrition is not random. But these are correctable.
As Dr. Fleming has emphasized, results from all patients on
trial need to be analyzed. Instead of
looking for a way to adapt for that, we need to follow all the patients. They did that in the mesothelioma trial and
we can do that too.
[Slide]
Well, does it really add to response or to survival, the
common endpoints? Let's just look at
these data. This is almost a 500-patient
study. If we look at this in terms of
the PRO outcome of pain, which is something Dr. Carpenter brought up as
something important, it is not too surprising to us that patients rated their
pain control as better if they had either a CR or PR, but we know there are not
real CRs--a major response versus stable disease versus progressive disease.
But what we didn't expect to see is if you just look within
response, because we think of response as a blunt instrument and you either
have a response or you don't, if we looked at how patients rated their pain
there was a major difference between the pain control for those who got the
combination regimen, in this case pemetrexed-CIS versus the single agent. You can see the yellow bar versus the blue
bar. These patients were all followed to
the same degree. They all responded but
there was a change in pain. In fact, in
all 8 LCSS parameters the same pattern existed within responders and patients
on the combination rated their patient-reported outcome, including quality of
life, as being better. So, it is
possible that this is a more sensitive measure than the blunt instrument of
response.
[Slide]
What about survival?
Well, Dr. Vogelzang reported in the JCO that there was a survival
difference between the combination regimen and cisplatin alone. If you look at 12 weeks there was no sign of
this. At 18 weeks there was only a slight
suggestion that there might be a survival difference.
But let's look at quality of life and symptom
distress--this covers all the PRO aspects.
If we look at quality of life we can see that there was already some
difference at week 12 and a larger difference at week 18. When patients rated distress from their
symptoms the same pattern was seen. At
12 weeks this was not significant. At 18
weeks this was highly significant, even if one addresses the issues of
multiplicity, showing that it was easier to show quality of life differences
and symptom distress as the patients reported which was significant earlier on
than was survival. In fact, this is
predictive validity, predicting what will happen to survival which is
considered to be a very strong validity point.
[Slide]
My conclusions would be, and our group said, yes, this is
ready for "prime time." There
are validated instruments but when we do these studies we must select
carefully. We need to use a validated
instrument but, remember, some of these instruments measure different aspects,
such as a clinical trial versus an adjuvant trial, a little bit different and
we need to be sure that we have the right languages and cultural aspects which
many of these instruments address.
As with other study endpoints, before the trial begins we
need to delineate what are the primary endpoints. We need to address areas of multiplicity and
of analysis. Too often I see protocols
that say, well, here is the instrument we are going to use and we are going to
analyze it and then later comes the analysis.
No, that has to be thought out ahead of time. If so, we will have something that we can
present to our colleagues at FDA that I think they can probably get their arms
around.
We need to follow all patients whether they are progressing
or not. That is one of our biggest areas
of problems so we need to follow all patients throughout a predetermined
interval. So, if we have an interval to
follow the patient, how long should that interval be? Appropriate to be able to see response and
appropriate to be able to see the toxicities.
If we can see that, we can see that area.
There are other uses for quality of life. In terminal care we can look at it in those
areas but that is a different issue. But
in the beginning in a clinical trial, follow for a specified time but follow
all patients. When patients die, is that
a problem? It is not a problem. Quality of life is a function of life. If some patients have died, that is what
occurs; we don't follow those. But we
don't look for the patient who is no longer contributing, the patient lost to
follow-up. That is as bad as with
toxicity and response.
[Slide]
We need to use an appropriate control group. Sometimes this is difficult. And, all these comments refer to quality of
life measures when we are looking at drugs that are likely to have their
benefit by means of anti-cancer activity.
We are not talking about pain medicines here. We are talking about anti-cancer drugs and
looking at approval for those. Their
appropriate control group is important.
We need to emphasize compliance throughout the study and as
long as the investigators and the patients understand this, then I think we are
likely to have people included. When it
is feasible to blind the patients and the doctors, especially the staff, that
is great but it is not always possible to do that and I am not sure that is the
biggest objection.
[Slide]
So, can we define quality of life adequately? Can we measure quality of life? I think we have some decent instruments. They are not perfect but they are
decent. When they are put in electronic
media they take almost no time from the staff, almost no time from the
patients. Can we agree on how to analyze
quality of life results? We are getting
closer. There are thoughtful ways that
we can talk about. Can we present
quality of life findings clearly? Sure,
we can. We don't have to present every
last aspect, especially when we have determined at the beginning of a trial
which are the primary endpoints that we wish to look at. Thanks.
Clarification Questions to the Presenters
DR. PRZEPIORKA:
Thank you. Before we have our
introduction to the questions I would like to actually ask the three speakers
to take the podium together and have the committee have the opportunity to ask
them questions. While the synapses are
all firing up here, I will take the prerogative to ask the first question.
Dr. Gralla, you went through what validation means for
quality of life which, in the lab, would qualify as qualification rather than
validation which would be predictive of an outcome. You did mention "the gold standard"
but did not identify it. What do you use
as the gold standard? For example, if we
had a surrogate as a response rate we would hope that would predict for
survival. What do the quality of life
instruments measure for?
DR. GRALLA: For
instance, predictive validity from an instrument, and this could be true for
time to progression or whatever and for quality of life, predicts for another
validated endpoint. But when you do
against gold standards, if we looked at instruments such as the American
Thoracic Society dyspnea scale, if we looked at the Melzack-McGill pain scale,
etc. we now have huge numbers of questions to ask. So, what we look for are correlations between
using these already validated instruments.
So, for pain the Melzack-McGill scale is one that one could select,
there is a whole variety of different scales that are out there for different
aspects that are used as gold standards.
This is why if you read the papers, and each one of these
three instruments have published psychometrics, they tell you exactly which
scales they used, the POMS, etc. to look at various aspects. It takes years to validate these scales which
is why we don't want to see somebody just ad hoc make up a scale to be used in
the next myeloma, lymphoma, lung cancer trial.
So, there are specific scales that are found in each of the
publications.
DR. PRZEPIORKA: Dr.
Levine?
DR. LEVINE: I have
kind of a crazy question but people are all different. I saw this on one of your slides but, you
know, one person may call something pain and that is not pain at all to
somebody else.
DR. GRALLA: Right.
DR. LEVINE: So, is
it valid to just look at what I say is my quality or maybe what you should be
looking at is change, you know the delta, in each given patient. How do you analyze that?
DR. GRALLA: You
brought up a very good point. For many
of these instruments, that is what the Cronbach's alpha, the internal
consistency, can look at. When you look
at certain items that don't make sense--for instance, the fatigue question, 15
years ago when we looked at that we said we don't think people understand what
fatigue is. So, we will look at
tiredness; we will look at weakness.
Well, they all meant different things to different people. It turned out that the right term to use,
years later with much more testing, was fatigue--
[Laughter]
--and only by testing could you find that out. So, you must find that out. In emesis scales, which is different, nausea
means something rather different. Don't
ask my mother-in-law what nausea means to her.
It is entirely different from what it means to others. And, that is a real problem. But for each of these instruments those
points are there.
Now, do you ask about change over time? You must have a time period. For instance, if you ask a patient how did
you feel nine weeks ago it is really difficult for us to say. So, for many of these instruments the time
frame is in the past day or in the past week.
DR. LEVINE: I didn't
mean that. I meant let's say the
instrument is done at baseline and then every week. I guess it is an analysis question, couldn't
you just look at changes between week 1, week 2 and week 3 and that they have
answered in a timely way?
DR. GRALLA: Indeed,
that is the way that many analyses are looked at.
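A minimal sketch of the change-from-baseline approach raised in this exchange follows. The helper names, arm labels and scores are hypothetical; the transcript does not specify how any particular trial computed these changes.

```python
from statistics import mean


def changes_from_baseline(scores):
    """One patient's scores in visit order, baseline first.
    Returns each later score minus the baseline score."""
    baseline = scores[0]
    return [s - baseline for s in scores[1:]]


def mean_change_at_visit(arm, visit):
    """Average change from baseline across patients in one arm.
    `visit` counts post-baseline assessments, starting at 1."""
    return mean(changes_from_baseline(p)[visit - 1] for p in arm)


# Made-up scores: [baseline, week 1, week 2] for three patients per arm.
arm_a = [[70, 68, 67], [50, 52, 49], [80, 75, 75]]
arm_b = [[70, 64, 60], [50, 47, 42], [80, 72, 71]]

change_a = mean_change_at_visit(arm_a, 2)
change_b = mean_change_at_visit(arm_b, 2)
```

Because each patient serves as his or her own reference, differences in how individuals use the scale at baseline drop out of the comparison.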
DR. PRZEPIORKA: Dr.
Bonomi?
DR. BONOMI: Along
the same lines to Dr. Levine's question, maybe we could define a quality of
life response just relating to the physical elements, not the whole quality of
life instrument, and the point that you made, a baseline and, say, four weeks
and eight weeks. What is the
statistically significant change? I know
in gefitinib they talked about a difference of two points. I don't know the statistics of it but it
sounds like an awfully small change to be considered significant. It seems like we need to look at that. Could we define some type of quality of life
response that could be then applied across studies?
DR. GRALLA: Phil, I
think that Dave Cella meant 2 points out of his 7 questions, and of 29 total
points yielding a 7 percent difference.
We can either accept that or not as such. It is kind of the same discussion that we
have had before. Think of the
risk-benefit aspect there. If you were
looking at imatinib versus marrow transplant in CML, clearly you would have to
have a better benefit in the marrow transplant to be able to be worthy to most
people than, say, just giving Tylenol or just giving imatinib. So, the risk-benefit probably comes in there
and it is just the discussion that we talked about before, in rapidly
progressive disease, highly symptomatic.
One of the problems is when the baseline is 70 percent
where 100 is perfect and 0 is terrible and you improve by just 6 or 7 percent,
that doesn't sound like very much but actually it is 25 percent of the amount
that you could improve. So, it is the
relative difference versus the absolute.
These are very, very difficult things to answer. In a progressive disease like lung cancer is
it the number of patients who report an improved quality of life, a stable
quality of life, or is it when treatment A preserves more quality of life over
that entire group versus treatment B even though there is a deterioration in
both groups? I favor the latter rather
than looking at the quality of life response.
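The arithmetic behind the relative versus absolute comparison in this answer can be sketched as below. The function names are hypothetical; the inputs (2 of 29 points, a baseline of 70 on a 0 to 100 scale, an improvement of 6 or 7) are the figures quoted in the answer.

```python
def percent_of_scale(points, scale_max):
    """Absolute change expressed as a percentage of the full scale."""
    return 100.0 * points / scale_max


def percent_of_headroom(baseline, improvement, best=100.0):
    """Improvement expressed as a percentage of the room left to improve."""
    headroom = best - baseline
    if headroom <= 0:
        raise ValueError("baseline is already at or above the best score")
    return 100.0 * improvement / headroom


two_of_29 = percent_of_scale(2, 29)           # about 7 percent of the scale
six_from_70 = percent_of_headroom(70, 6)      # share of the 30 points available
seven_from_70 = percent_of_headroom(70, 7)
```

On these inputs a 6 or 7 point gain from a baseline of 70 works out to roughly a fifth to a quarter of the possible improvement, in line with the approximate figure quoted.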
DR. PRZEPIORKA: Ms.
Ross?
MS. ROSS: Thank
you. I guess this would be to Dr. Bunn
and Dr. Cohen. Dr. Bunn made the
statement that only in oncology drugs is accelerated approval dependent on
showing an advantage over existing drugs. Was that your statement, Dr. Bunn?
DR. BUNN: Right.
MS. ROSS: I heard
someone say that is not true.
DR. TEMPLE: The
accelerated approval rule refers to showing an advantage over available
therapy. That is why you would accept a
lesser standard of approval.
MS. ROSS: Is that
only on oncology?
DR. TEMPLE: Oh, no,
it is for everything, for any accelerated approval.
MS. ROSS: Has that
ever been changed? Is it a rule?
DR. TEMPLE: It is a
rule; it is a regulation.
MS. ROSS: It is a
regulation?
DR. TEMPLE: Yes.
MS. ROSS: Or is it
law?
DR. TEMPLE: It is
actually now in law as well. It is part
of the fast-track provision of FDAMA as well as the rule.
MS. ROSS: Thank you.
DR. BUNN: As I
mentioned, right now that is probably not a huge problem for oncology because
many of the new drugs have less toxicity so they do have an advantage over
existing drugs in terms of toxicity. I
brought that up in terms of thinking about the future. You know, laws and rules are made to be
changed so perhaps in the future one would consider whether that provision for
accelerated approval is a bit too strict.
Certainly for regular approval that provision doesn't exist, only for
accelerated approval. Is that right,
Bob?
DR. TEMPLE:
Yes. There is one thing that is
important. The Commissioner has
announced this. We were trying to decide
among ourselves whether this has made it into a rule but you can or will be
able to have a second accelerated approval, say, for another drug that is not
cytotoxic as long as it still has an advantage over anything that has full
approval. I don't think that
completely--
DR. BUNN: It is
halfway there.
DR. TEMPLE: I don't
think it goes completely to where you want to go but that is important.
DR. WILLIAMS: Dr.
Bunn, as I read it, there is no reason to have accelerated approval. You know, according to your proposal you
could use a different endpoint then that would be tantamount to full approval
and there wouldn't be any particular setting where you needed it. It would be in every setting. You would get approval in every setting for
the surrogate endpoint. Right? That is what you are proposing? There is no particular setting--
DR. BUNN: No, no, if
you had a response rate in an untreated population of 25 percent and you had
the same toxicity profile, then you wouldn't be able to get accelerated
approval. If you had a response rate of
25 percent and you had less toxicity, then you could get accelerated approval.
DR. TEMPLE: If it
was still accelerated approval now and it was based on response rate alone and
there was no other drug and the second, third, fourth still had an advantage
over available therapy, they could still be approved. I think you really want to say if it is a
useful drug none of that should matter and you would like to make that a
standard for all cases, but we haven't done that--
DR. BUNN: Right.
DR. TEMPLE: --but
accelerated approval is not terminated by the approval of one drug under the
accelerated approval rule.
DR. PRZEPIORKA: Dr.
Cheson?
DR. CHESON: Paul,
response rate in lung cancers to you is an important endpoint. Does it matter how long the responses last?
DR. BUNN: Of course,
it does but--
DR. CHESON: Is there
a minimum duration of time which you would accept for that?
DR. BUNN: We don't
know that. That hasn't actually been
looked at and it is something that probably could and should be looked at. But, surprisingly, there is very little
variation in duration of response. They
are very similar. I don't know why it
is. You know, why is 20 percent, more or
less, sort of the magic threshold for what will lead to an improved
survival. It is hard to say. Almost all those drugs have a median duration
of response in terms of three months. If
you had one that had a median duration of response of a year it might make a
bigger impact on survival. If you had
one that only had a median duration of response of a month would it still affect
survival? I don't know and that is
because we don't have any examples. So,
it is something that we should certainly look at but there is not a lot of data
and there is not much we can say about it at the moment. Do any of the experts over here disagree with
that? I mean, I think at the moment it
would be hard to put median duration of response into the equation.
DR. PRZEPIORKA: Dr.
Fleming?
DR. FLEMING:
Actually, I have questions for both Richard and Paul but to avoid
confusion let me just start with Richard.
DR. GRALLA: I was
afraid of that, Tom!
DR. FLEMING:
Actually, I was pleased to see that you addressed a number of the issues
with PROs that we struggle with, issues of how imperative it is to ensure you
are following everybody so you are getting an unbiased assessment. I still struggle a little bit with how to
handle the deaths in that regard.
With the validity issue, you talked a lot about that. Blinding still troubles me as to how we could
address that. I think blinding is really
key to the objectivity of measuring these.
A question that I would like to ask or a comment maybe in
response to one of your questions, you had pointed out this committee, in a
sense, dodged the question of how much of a survival effect you need to see for
it to be relevant and you were saying why should we be asking the same thing
for PROs. At least for some of us the
reason that there is a difference there comes down to a multiplicity issue with
PROs. There usually is a wide array, as
you have mentioned, with these various scales, 6-plus 9 or 15 measures, 30-40 measures
etc. It really is important to formalize
this into something that is a primary endpoint.
Sometimes that may be based on a composite. What you get then is you compromise
interpretability for enhanced sensitivity and here is the issue, you might now
have exquisite sensitivity to small differences in these composite measures and
then it is, in fact, much more likely that you could achieve statistical
significance there and wonder if it is clinically significant. It is much less likely to occur on survival, for all
the reasons we have heard--it is difficult to get an even adequately powered
survival study. So, I would say there is a reason. I don't know if you wanted to comment
specifically on the issue of multiplicity on this.
DR. GRALLA: I agree
with you entirely, Tom, it is a real issue and that is why you need to define
it in the beginning. First, it is simply
something as simple as looking at quality of life which can be looked at
globally, or looking at symptom distress or looking at pain, whatever you feel
would be most important in this population.
You don't need to look at all of them.
The problem we have had the most is with people looking afterwards and
then choosing, oh, here is the one that came out, or overwhelming us with standard
data in a 99 instrument and 44 looked at this and 33 didn't. That is over.
That time is over. Those aren't
the issues.
When we use these instruments we can look at families and
maybe we do give away some sensitivity but, in fact, in looking at some of the
data that I was pleased to see with some of the trials that I mentioned, we in
fact don't have a multiplicity issue.
When we look at two or three of these areas, even if we adjust for the
fact that we are looking at three endpoints, it is still significant.
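As an illustration of the adjustment Dr. Gralla refers to, the following is a minimal sketch of two common multiplicity corrections (Bonferroni and Holm) applied to three prespecified endpoints. The p-values are hypothetical and are not taken from any trial discussed at the meeting; the function names are likewise assumptions made for this example.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down procedure: test p-values from smallest to largest,
    comparing the k-th smallest (k = 0, 1, ...) against alpha / (m - k),
    and stop at the first one that fails."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject


# Hypothetical p-values for three prespecified PRO endpoints
# (for example global quality of life, symptom distress, pain).
hypothetical_p = [0.004, 0.012, 0.030]

if __name__ == "__main__":
    print("Bonferroni:", bonferroni(hypothetical_p))
    print("Holm:      ", holm(hypothetical_p))
```

Both procedures control the family-wise error rate; Holm is never less powerful than Bonferroni, which is why the two can disagree on the largest p-value.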
I know that that gets back to your other point of looking
at small differences in survival. Again,
we are talking lung cancer. Marty showed
us approvals with five-week, three-week survival. So, I don't think that we should be rushing
to worry about those small differences.
I can't understand why a patient would say to me, well, let's see,
doctor, there was only a 7, 9, 10, 12, you-name-it, percent difference, why
wouldn't I want the one that had that 12 percent difference? And we look at what patients want and whether
we are fulfilling those needs.
The blinding, it is great to do when you can and often it
can be done and should be done but, you know, when you think about it, you have
a large trial and you are looking at pain control and you give the patient the
pain visual analog scale. The patient I
think is pretty honest about telling you what it is and as an investigator in a
400-patient trial I have no clue as to how that affects. In other words, I am not putting my input in,
the patient is. I am not sure the
patient understands which one is better in that regard.
Where it is also important though is the context. Did it require more pain medicine to be able
to get that pain control result? So, we
do need to look at that. Anyway, that is
sort of how I would address some of those key issues that you bring up,
Tom. They are important but they need to
be thought of, just as survival, ahead of time; just as whether we are going to
look at disease-free survival, TTP, TTF and survival. I think they are similar issues.
DR. FLEMING: I think
it is when we use the composite scales that are harder to interpret and then we
can see very small differences. Yes, I
would say a small difference is better than no difference if I can get it for
free but then it is benefit to risk.
Let me get to a question that is probably more for Paul
although it relates a little bit to what you were talking about as well,
Richard. Paul, one of the take-home
messages I get from what you are saying is you are identifying concerns with
launching large-scale Phase III trials because we have to show survival effects
when there really isn't adequate evidence at hand at baseline to say the
plausibility of achieving that positive effect on survival is adequately
high. Gee, if we had responses and we
were looking at 15, 20 percent responses, then your sense from the data you are
looking at is that it is much more likely that we will see a survival
effect. I guess one take-home message I
get from what you are saying is then we ought to have fewer study settings
jumping from Phase I to Phase III. Let's
do that Phase II trial with 100 people and see if we get a 15 or 20 percent
response rate.
The issue that is troublesome here, and is a little bit
related to what Bruce's comment was before, as I look at response it seems to
me that response is a component of what we would think of as an integral causal
pathway through which the oncology disease process is influencing outcome like
survival. My worry is that when we look
at percent of patients that achieve a certain level of tumor shrinkage we
dichotomize the world and that dichotomization may be missing part of what the
intervention and disease process is really doing here. It is not just a matter of did you achieve a
response. What was the magnitude of that
response? What was the durability of
that response? It is easy to envision
that an intervention could readily be achieving intended benefit on clinical
endpoints like survival and an oversimplification of what is really happening
to the disease process, to the tumor burden may not be adequately captured by
percent responding.
One of the things that troubles me too, and you and I had a
brief chance to talk about this, when you look at that meta-analysis of the 176
Phase II trials, those studies are looking at the relationship between whether
somebody responds and what the overall survival is.
So, Richard and Paul, you are vigorous and I am frail at
time zero. In fact, Richard, you have a
better quality of life than I do and, Paul, you have a better response than I
achieve and both of you survive longer.
What do we see from those data?
That there is an obvious correlation between quality of life and
survival and response and survival. Now,
Richard, I don't care that that is the case in what you are advocating because quality
of life is a value to me whether or not it is a surrogate for survival. But with response, Paul, I do care because I
do want to know that this is, in fact, giving me evidence that mediated through
that response I am causally inducing what I really care about.
Here is the rub, we could have a million patients in the
data set that you have been providing to us.
What it does is it tells us about a correlation that exists but it could
be that the causal mechanism for that correlation is not induced responses
leading to prolonged survival. What I
need for that, and this is critical information, is properly controlled trials
that can compare what is the treatment induced influence on response versus the
treatment induced influence on survival.
That relationship across a meta-analysis is telling me whether or not I
am causally influencing survival mediated through response.
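Dr. Fleming's distinction between correlation and causal mediation can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis of any real data: every patient has a hypothetical latent "vigor" at time zero, treatment raises the chance of response, and survival depends on vigor alone. All parameter values are invented for the example.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Simulate (treated, responded, survival_months) for n patients.

    Assumptions (all hypothetical): vigor is uniform on [0, 1]; response
    probability rises with vigor and with treatment; survival depends on
    vigor only, so treatment has NO effect on survival by construction.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        treated = rng.random() < 0.5
        vigor = rng.random()
        p_response = 0.02 + 0.50 * vigor + (0.15 if treated else 0.0)
        responded = rng.random() < p_response
        mean_survival = 6.0 + 12.0 * vigor  # months, independent of treatment
        survival = rng.expovariate(1.0 / mean_survival)
        rows.append((treated, responded, survival))
    return rows


def mean_survival(rows, keep):
    """Mean survival among the rows selected by the predicate keep."""
    return statistics.mean(r[2] for r in rows if keep(r))


rows = simulate()
responder_gap = (mean_survival(rows, lambda r: r[1])
                 - mean_survival(rows, lambda r: not r[1]))
arm_gap = (mean_survival(rows, lambda r: r[0])
           - mean_survival(rows, lambda r: not r[0]))

if __name__ == "__main__":
    print(f"responders minus non-responders: {responder_gap:.2f} months")
    print(f"treated minus control:           {arm_gap:.2f} months")
```

In this setup responders live noticeably longer than non-responders even though the randomized comparison shows essentially no survival difference, which is the pattern that a responder-versus-non-responder analysis cannot distinguish from a real treatment effect.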
DR. BUNN: I don't
really disagree with what you say. One
of the issues gets down I suppose to semantics but, you know, it has to do with
cytotoxic versus cytostatic. If a lot of
the drugs that we have actually worked by being cytostatic this would be a huge
problem. Maybe bevacizumab will be the
first but maybe some day we will get confounded by cytostatic. But most of the drugs that improve survival
and, in fact, in my belief all of them at the moment, have actually worked
because they are killing cancer cells.
Even tamoxifen causes objective responses in patients and certainly
Iressa causes objective responses.
So, I think when the mechanism is to kill cancer cells,
that objective response actually makes sense.
Sometimes, you know, examples are useful. I think it is not out of school to be
actually thinking about what is coming along.
You heard about a trial that looked at a non-inferiority survival
advantage in second-line non-small cell as the major endpoint. In every efficacy parameter, including
symptoms, both pemetrexed and docetaxel were identical. It is the biggest trial ever done in
second-line non-small cell. But the
non-inferiority p value was 0.051. I
don't know what the committee will do but I do know that the response rate to
pemetrexed was 9.1 and to docetaxel it was 8.8 and the symptoms were just as
often relieved.
So, if the committee can't deal with a single trial with a
p value of 0.05 in terms of non-inferiority, accelerated approval could be
given on the basis of response for, you know, a drug that I think needs to see
the light of day in this disease and killing some of these drugs may be the end
of the light of day. Erlotinib is going
to come before this committee in a trial where the hazard rate for the study
was a hazard rate of over 30 percent reduction for a single pill in second- or
third-line non-small cell that is a big change and that may not make it against
best supportive care in terms of survival but I will eat my hat if in terms of
response it is not highly statistically significant and if it isn't eight
percent or higher.
DR. FLEMING: But
your example is a bit changing the topic here because you gave an example where
you were talking about evidence on response and time to progression and
survival, and you are really asking the question, in a non-inferiority setting,
what is an adequate amount of evidence on the aggregate of those measures,
which is different from the thrust of your presentation which was let's
reexamine whether or not there is adequate evidence that if you can induce an
impressive response rate at a certain level that is now adequately reliable
evidence for benefit.
DR. BUNN: Right, if
erlotinib has nine percent and best supportive care has two percent I would say
accelerated approval should be given.
DR. PRZEPIORKA: Dr.
Cheson?
DR. CHESON: Paul,
coming back to part of your elegant presentation, there are some drugs which
you had on your list that never should have gone on to Phase III because they
are inactive as single agents. I take
issue with that because there are some drugs, particularly one of them that you
had on your list, which are probably not active as single drugs but work better
by enhancing the activity of other agents.
What I am thinking of is Genasense, for example. So, I would be reluctant to throw out some
drugs like that that have a unique mechanism of action. Some of the growth factor receptors may be
the same sort of thing. The typical
cytotoxics, okay, but when you get to the new targeted therapies I think a lot
of them may work better and should be studied going right from Phase I to Phase
III if there is in vitro rationale for such combinations.
DR. BUNN: I am sorry
I don't have my slide to put up but the bottom sentence on that was unless
there is very good compelling preclinical evidence for why that would
happen. So, that is not uncommon to the
situation up until now but I certainly don't disagree with your sentiments but
I think there should be compelling preclinical reasons for that. Again, you know, bevacizumab may be the first
one to actually prove me wrong but I will be happy to be wrong.
DR. PRZEPIORKA: Dr.
George?
DR. GEORGE: Richard,
I have a couple of things. One is that
you make very compelling arguments of why we should be able to do these kinds of
studies in quality of life. One of the
frustrating things to me, sitting on this committee, is we don't see these
things. We don't see good, well done
studies in this area and I was wondering if you have any notions, accepting
what you have said, that we are not seeing them because they certainly could
add a lot to a lot of these kinds of applications.
DR. GRALLA: Steve, I
agree with you 100 percent. The problem
is in the past we really haven't seen so many good ones. In fact, over the last five years what we
have seen is sort of leapfrogging. Each
trial gets a little bit better than the last at doing these. We see more trials that start to use
validated instruments. We have even
heard of some ad hoc instruments. I
think now with the electronic way of keeping the data we are there on some of
these. So, I think that we are now
poised for you to be seeing more of these.
The second line in small cell approximated some of these,
approximated one of the validated instruments.
It wasn't really an elegant presentation for looking at the topotecan
second-line but it was getting there.
So, I think why we are here is to encourage that and to try to set some
points along the road to help those who are doing these studies to be able to
present trials in that way to this group so that you are more able to evaluate
these results.
We have had some presentations at ASCO this past year that
looked in that way, and maybe the year before.
So, I think that is what we are going to be seeing in the future.
DR. GEORGE: This
just seems to be an area where theory and practice seem to be far apart.
DR. GRALLA: You have
a very good point but I think we are getting much, much closer now and I think
you will see them soon.
DR. GEORGE: One
quick question, just a small point, on this blinded evaluation, blinded to the
interventions, there are other types of blinding that can be equally important
in this area. I guess we saw some of
that before. For example, just knowing
sort of the clinical development of things could presumably influence quality
of life. That is, you have to know when
you are asking these questions if the patient was just told that they had, say,
a response--
DR. GRALLA: Right.
DR. GEORGE: --Mrs.
Jones, your tumor is shrinking. Now,
would you please answer this question, how do you feel?
DR. GRALLA:
Right. That is why all of these
instruments believe your point and have taken it for granted. It is not just a response; how about your
white count? Your white count is
1.9. We are not going to treat you
today. Oh, my God, I am going to
die. So, for almost all of these
instruments is when you repeat the measure.
You do it before the patient sees the doctor and before the patient gets
any clinical results. You are 100
percent correct. That must be done or
you could have wonderful impact on the study through more subtle means. So, those areas have been addressed.
DR. BUNN: I would
like to make just one comment. I think,
you know, we are getting better. The FDA
actually has said for a long time that symptom benefit could be for a primary
approval but sometimes the studies have been so bad that that hasn't
happened. I will just give you that same
example again where there are going to be three endpoints. There is going to be survival, and in my
opinion the study is a bit under-powered because it is looking for a big
survival advantage, but there is symptom benefit. This is erlotinib versus best supportive
care. I believe full approval should be
granted if there is a trend in survival and there is symptom benefit that is
statistically significant if you believe it was done well. If you don't believe it was done well and
there is a statistically significant difference in response and the response is
eight percent or higher, then I believe accelerated approval should be given
based on response. So, I mean, you have
three endpoints and you need to decide what to do.
DR. PRZEPIORKA: We
are approaching the scheduled time for the open public hearing but I don't want
to squash questions. I see a few more
hands back there. Dr. Bonomi?
DR. BONOMI: I have a
question for Tom. I think there is no
question that response is at least a treatment-related prognostic factor but,
you know, the cause and effect thing--we have been talking about it for 25
years and we used to plot out the curves, the PRs and the stable disease and we
can't do that because maybe the people who were better, who were going to live
longer also exhibit a biologic response.
But with all the data we have and all the cooperative group studies, is
there some type of statistical modeling that could be done to try to elucidate
this? You know, my gut feeling is
response does translate into some benefit for the patient but how can we go at
this?
DR. FLEMING:
Absolutely, there is and you are exactly right to say that it has been
25 years since we have recognized this issue that, you know, responders live
longer than non-responders but that is not evidence that I have a
treatment-induced effect on survival mediated through response because, as you
say, people who are intrinsically better may be the people who would have
survived longer and would be more likely to respond and treatment has just
labeled those people who were better.
It is, however, the first step. If I have a marker that I am going to use as
a potential replacement endpoint the first thing I need to know is, is it
correlated. So, it is not a useless
step. By the way, if it is correlated
then, in that sense, it can be useful in other ways. PSA can be correlated with prognosis and it
could be a very good measure to counsel patients or to detect disease but that
doesn't mean that it is a good measure to indicate treatment effect. What we have to know for that is that the
disease influence on the clinical endpoint is predominantly captured by this
marker, that this marker is in that pathway mediated through which these benefits
occur. And, we have to have some sense
that it is unlikely, and this is tough, that there aren't unintended mechanisms
that can influence outcome not captured by the marker.
Those are clinical insights that are important to
supplement the data. The data, as you
point out, can also though be very helpful and it needs to be analyzed in a
much more sensitive way. It is only the
first step to see that people who respond live longer than non-responders, have
a better quality of life, blah, blah, blah.
What I really want to know is if you have 20, 30 or 50 or 100 studies
that have been done, and these need to be randomized, controlled trials, and
those studies have measured treatment-induced effect on the marker--let's say
it is response, let's say it is time to progression, and treatment-induced
effect on the clinical endpoint, what we need to understand is what is the
functional relationship between the level of treatment-induced effect on that
marker, such as response, and the level of treatment-induced effect on the
clinical endpoint, which is other than what that meta-analysis of 176 studies
did. It is a different issue. An example of this is the analysis that was
presented on November 12, looking at whether disease-free survival--this was Dan
Sargent's analysis--could be a surrogate endpoint for survival in the colon
adjuvant setting. They at least did a
meta-analysis on all potentiated 5-FU colon adjuvant trials and showed a fairly
strong relationship between the magnitude of treatment effect on, in that case,
disease-free survival and the magnitude of treatment effect on survival.
So, the kind of thing that would be very informative here,
in this setting if we were talking about time to progression for example, is
this meta-analysis looking at an array of studies to see whether or not when
you achieve a given level of reduction in failure rate and time to progression,
does that translate reliably to a given level of reduction in survival.
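The trial-level question Dr. Fleming describes, whether a given treatment effect on the marker translates into a given treatment effect on survival, is often approached by regressing one effect on the other across randomized trials. The following is a minimal sketch of that idea using an unweighted least-squares line; the per-trial log hazard ratios are hypothetical, and a real analysis would normally weight trials by size and account for estimation error in both effects.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (slope, intercept, r_squared). Unweighted; this is a
    simplification chosen for the sketch, not a recommended method.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, intercept, r_squared


# Hypothetical per-trial treatment effects (log hazard ratios).
# x: effect on time to progression; y: effect on overall survival.
log_hr_ttp = [-0.40, -0.30, -0.25, -0.10, -0.05, 0.00]
log_hr_os = [-0.22, -0.18, -0.10, -0.06, -0.01, 0.02]

slope, intercept, r_squared = fit_line(log_hr_ttp, log_hr_os)

if __name__ == "__main__":
    print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r_squared:.2f}")
```

An intercept near zero and a high trial-level R-squared would be read as support for the marker; the fitted line also suggests how large an effect on the marker is needed before a survival benefit becomes likely.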
My biggest concern is to be able to rule out cases where
when I achieve a certain response rate or when I achieve a certain reduction in
time to progression, does that ever translate into no benefit? How big do those effects have to be such that
we don't get no benefit on survival?
Those are answerable questions.
We can go to the data and start doing those meta-analyses. They will give us very important
insights. Those, however, have to be
supplemented. Just to quickly repeat what
I said before, we really do need to have a clear sense of mechanism. So, if we are talking about biomarkers, is the
biomarker the result of the tumor burden and it is not mediated through the
change in the biomarker that the patient has worse survival? I suspect that is the case. So, that wouldn't be a classic example of
what we would go for. But basic measures
of tumor burden would be the likely candidates that we would be looking for,
and if we have interventions that are thought to be fairly safe so that it is
unlikely that there would be major unintended negative effects, then we are in
the ball park of the kind of evidence that we would be needing to see and the
kinds of settings we would need to be in.
DR. PRZEPIORKA: Any
burning questions before we move on? Dr.
Temple?
DR. TEMPLE: Actually
I have a burning question for Dr. Gralla.
Most of the time when you study symptoms you make sure the people
entering the trial have one. You
wouldn't study headaches in people who didn't have the headache but you thought
might get one some day. A lot of the
quality of life efforts we have seen do not make sure that the people who are
entering the trial are impaired in those dimensions and, even more, even if
they have one of the things on your list of physical symptoms they don't have
all of them. So, anybody trying to show
improvement is starting out with a huge disadvantage because there is no
prominence to the symptom.
So, my question is this, we have urged people to think
about this, for each patient identify a target symptom, namely, one that they
actually have and try to focus on that, even if it was actually different for
each patient in the trial. I wonder if
you have any thoughts about that. I
mean, if I were doing it that would seem the way to find an effect if there is
one because you are at least identifying people who have the problem, whereas
in so many of the trials we have seen the people don't even have that
problem. It is hard to win.
DR. GRALLA: Yes, I
understand your point and I think that is another reason why we have to be
careful about setting an absolute number on improvement. Three percent of patients are asymptomatic,
three percent. When people ask me how do
you treat the asymptomatic patient, I don't worry about it, I just wish more
would walk in the door. So, everyone has
symptoms.
The question of looking at symptom burden, how do your
symptoms affect you is not a bad one to look at in that way because, therefore,
it doesn't matter whether it is pain, cough or dyspnea.
DR. TEMPLE: But you
want to be sure they are having an effect.
It wouldn't be a good question to ask if they said, no, it doesn't
bother me, I get through it.
DR. GRALLA: No, no,
everyone rates that question from zero to 100.
You can rate it zero, you can rate it 100. So, you can see the whole group. If you have 200 patients in an arm, you make
up the number and you can see what the scores are. If you start out at baseline with one group
being much more symptomatic than the other, then you have big problems but that
is not what usually happens. And, what
you can see here are differences, real differences when you see drugs that
work. So, what you can see is patients
rate the effects of their symptoms as being improved more on treatment A versus
treatment B. It is not a huge effect but
it is there.
If you want to, you can start with those patients. People have correlated different scores on a
visual analog scale with mild, severe and marked. So, if you want to say I only want to look at
those patients who rate their pain above 25 at baseline and what happened to
that group, you can do that from this same set. But now what we are doing is getting to what Dr.
Fleming told us. Maybe you don't want to
go there; now you are looking at a subset analysis.
DR. TEMPLE: Yes, but
I could also stratify and I could make that my primary hypothesis.
DR. GRALLA: You
could; you could.
DR. TEMPLE: You
could say to yourself if they don't have a whole lot of impairment in this
dimension I am not likely to see much benefit.
So, I want to make my primary hypothesis people who are very impaired in
this dimension.
DR. GRALLA: Yes, I
like to think of the opposite criticism.
So, you only looked at those patients who rate their pain. So, is your drug no good for people who don't
have pain?
DR. TEMPLE: It
doesn't improve their pain.
DR. GRALLA: But what
I showed you before, looking at the difference between pemetrexed and CIS, even
within responders was eight out of eight parameters favored the combination, a
significant difference in itself. This
is what the patients say and, to me, that is very compelling. I don't know how the FDA would see that but
to me that was very compelling. But no
one of those was hugely different but in each one of those areas people looked
at it being different. Your suspicion
would have been that many of them would have been the same.
DR. TEMPLE: I am
only asking because we see so many "unsuccesses" and one of the
possible explanations for that is that there isn't much room for
improvement. You know, if you have ten
items in a score and only one of them is capable of being improved, that is
pretty tough. If all ten are, well, you
are much more likely to show something.
DR. GRALLA: But the
differences in the areas that are looked at here--for example since we were
talking about mesothelioma, there are only five. In the validation studies for the instrument
there were only five that were important.
When you think of pain and dyspnea and cough and anorexia and this sort
of thing--I can't remember the other one, you know it is not too surprising
when you get a tumor response. The
problem is lung cancer comes up with dyspnea where you have COPD as a
concomitant illness. If we have a drug
that fixes the COPD we are really in good shape. There you have the confounding variable
problem.
DR. PRZEPIORKA:
Thank you. Thank you to all the
speakers. I would like to now open the
open public hearing and call to the podium Mr. Mark Scott. While he is coming up to the podium I have
been asked to read a statement about financial disclosure.
Both the FDA and the public believe in a transparent
process for information gathering and decision-making. To ensure such transparency at the open
public hearing session of the advisory committee meeting, the FDA believes that
it is important to understand the context of an individual's presentation. For this reason, the FDA encourages you, the
open public hearing speaker, at the beginning of your written or oral statement
to advise the committee of any financial relationship that you have with any
company or any group that is likely to be impacted by the topic of the
meeting. For example, the financial
information may include a company's or a group's payment for your travel, lodging
or other expenses in connection with your attendance at this meeting. Likewise, FDA encourages you at the beginning
of your statement to advise the committee if you do not have a financial
relationship. If you choose not to
address this issue of financial relationship at the beginning of your statement, it
will not preclude you from speaking. You
may go ahead.
Open Public Hearing
MR. SCOTT: My name
is Mark Scott. I am the executive
director for development in the U.S. and I work for AstraZeneca Pharmaceuticals
so that would be the financial interest, and they did pay my way here today.
[Laughter]
Madam Chairman, members of the committee, ladies and
gentlemen, thank you for the opportunity to speak. I am representing actually AstraZeneca
Oncology for this presentation today and I believe in your package you received
a seven-page document outlining a number of points we intended to make as part
of this committee meeting.
I believe that most of the points have already been
discussed today so I won't go into them with the detail I had originally
intended. Some of the points were made
this morning and some of the points are directly relevant to the discussion you
will have after this with respect to the questions that are being addressed.
The first point is that we wanted to endorse the committee
discussion on symptomatic improvement as used as the basis for full approval
for oncologic agents, and especially for non-small cell lung cancer as it is a
disease of symptoms. With well validated
scales that are available, including the lung cancer symptom scale, a
demonstration of relief of these symptoms as determined by well conducted and
controlled patient-reported outcome studies could be acceptable as a sole basis
for full approval of new agents.
The next area was in trials in subsets of patients,
specifically performance status II. This
wasn't necessarily directly germane to the discussion but, given that you are
talking about lung cancer, we thought it to be important. Inclusion and exclusion criteria for many
clinical trials in non-small cell lung cancer exclude performance status II
patients because of their short life expectancy and because many are considered
unsuitable for cytotoxic chemotherapy.
Novel agents with better tolerability may offer a chance to
bring clinical benefit to this ill-served patient population. The FDA has recently granted fast-track
status for a compound to be investigated in a trial in performance status II
patients and we are asking the committee do they agree that a PS-II population
in advanced non-small cell lung cancer is an identifiable population worthy of
clinical study, and for whom an indication could be written? If the answer was no, how would they propose
to define the population of patients often considered too unfit to tolerate
chemotherapy and, therefore, being excluded from many current clinical trials?
Another area that we wanted some debate about which got
covered this morning is that we are very encouraged that there was a
recommendation by the committee that progression-free survival could serve as
the sole basis for approval in certain situations.
The last area we wanted to discuss was the efficacy
standard, and I will not go into it in great detail but it has to do with
non-inferiority trials, which I will talk about at the end. We would briefly like to reinforce the
implications for oncologic drug development as raised by Dr. Williams this
morning. It is actually through an
article by Rothman et al. that was published in the January, 2003 edition of Statistics
in Medicine on non-inferiority trials.
The methods described in this article are increasingly used by
regulators in the United States and Europe to evaluate the design and analysis of
trials of new agents. The consequences
for trial size are enormous as a result of this paper.
In this context, there has been something of a paradigm
shift though in the approach to cancer treatment over the recent years. Academia and industry alike are now fully
engaged in the discovery, research and development of novel, well tolerated,
biologically targeted anti-cancer agents.
It is hoped that these new treatments will offer significant advantages
to patients in terms of improved tolerability, but they may not always
demonstrate increased efficacy. This
naturally leads to the use of active control in non-inferiority trials to
compare the new agent to standard agents, with the conventional aim being
to show no clinically relevant loss of efficacy.
But the key problem for researchers, physicians and
patients alike is that with Rothman's approach there is a dramatic increase in
the size of the trial required to determine non-inferiority. We don't believe that the answer is to avoid
non-inferiority trials. We believe that
there are situations that are clinically relevant where a non-inferiority trial
would be the trial of choice to define efficacy.
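The effect of the margin on trial size can be sketched with the standard events formula for a time-to-event comparison with 1:1 randomization. This is a fixed-margin simplification offered only as an illustration; it is not the method of Rothman et al., and the historical control effect used below is a hypothetical number. The function names are assumptions made for this example.

```python
import math
from statistics import NormalDist


def margin_from_retention(control_hr_vs_placebo, fraction_retained):
    """Non-inferiority margin (hazard ratio, new agent vs. control).

    control_hr_vs_placebo: assumed hazard ratio of placebo relative to the
    active control (greater than 1 when the control works).
    fraction_retained: share of the control effect, on the log scale, that
    the new agent must preserve (0 = merely no worse than placebo).
    """
    return math.exp((1.0 - fraction_retained) * math.log(control_hr_vs_placebo))


def events_required(margin_hr, alpha=0.025, power=0.80):
    """Approximate number of events needed to exclude margin_hr with a
    one-sided alpha, assuming the true hazard ratio is 1 and equal
    allocation (Schoenfeld-type approximation)."""
    z = NormalDist().inv_cdf(1.0 - alpha) + NormalDist().inv_cdf(power)
    return 4.0 * z * z / math.log(margin_hr) ** 2


# Hypothetical historical effect: placebo hazard 1.5 times that of control.
for retained in (0.0, 0.5, 0.75):
    margin = margin_from_retention(1.5, retained)
    if __name__ == "__main__":
        print(f"retain {retained:.0%}: margin HR {margin:.3f}, "
              f"events ~ {math.ceil(events_required(margin))}")
```

Because the required events scale with the inverse square of the log margin, halving the log margin (retaining 50 percent of the control effect rather than none) quadruples the number of events in this approximation.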
We don't believe that the scientific statistical debate
about how to best draw inferences from active control, non-inferiority trials
should be considered complete. Rothman's
approach serves to highlight that considerable statistical, methodological and
philosophical issues remain, and failure to consider these issues
constructively will, at the very least, lead to ever-increasing drug
development costs and time, and delay the availability of new therapeutic options
to patients with life-threatening diseases.
At worst, the barriers posed will discourage drug development where it
otherwise might have been feasible and so prevent potentially useful new
medicines from becoming available to patients.
We sincerely hope the scientific community, together with
regulatory bodies worldwide will give this important area further careful
thought, and we, at AstraZeneca, recommend that the advisory committee here, as
well as academic interest and industry interest have a panel like this meeting
to address this issue. Thank you.
Questions for Discussion
DR. PRZEPIORKA: Any
questions for Mr. Scott?
[No response]
Thank you. Our hosts
have provided some guidance, if you will, on the importance of the questions
and, given the hour, we will be taking these out of order.
The first question to be discussed will be question seven,
under the surgical adjuvant setting. The
FDA has stated that disease-free survival can support regular drug approval in
cancers where the majority of recurrences are symptomatic. Others propose that prolongation of
disease-free survival should support regular approval in all clinical settings
because a delay in cancer detection or a delay in the need for toxic cancer
treatment is of clinical benefit.
In non-small cell lung cancer, should a disease-free
survival improvement from adjuvant chemotherapy support regular drug
approval? If so, clarify why you
consider disease-free survival an established surrogate for clinical benefit in
this setting.
Part b) is if not, could a disease-free survival
improvement support accelerated approval?
Would a survival advantage ultimately be required for conversion to
regular approval?
So, the question before us is should disease-free survival
in the adjuvant setting be a primary endpoint or a surrogate for survival. Dr. Johnson?
DR. B. JOHNSON: I
think this is more a philosophical than a real question in that adjuvant
therapy hasn't yet been proven to play a role in lung cancer, and I can't
imagine--I don't know of any company that has a plan to look at this. So, it is not something that is going to come
up for three to five years. So, I think
yes is probably the answer but I don't think it is terribly important to define
the answer at this time.
DR. PRZEPIORKA: Just
to question you, you indicated that there has been no drug that has been shown
to have an advantage in that setting.
Was that based on survival as opposed to disease-free survival, and
would you be willing to suggest that disease-free survival would be an
appropriate endpoint rather than survival?
DR. B. JOHNSON:
There are two studies that have been presented in abstract form that
Paul talked about, and it looks like there will likely be an advantage for at
least one of those two studies when it gets published and the disease-free
survival fits with the actual survival.
The point I was trying to make is I can't imagine that somebody is going
to submit for approval a new drug unless you are going to be approving it for a
new indication.
DR. PRZEPIORKA: Dr.
Johnson?
DR. D. JOHNSON: Dr.
Bruce Johnson and I decided ahead of time to avoid the confusion that the
good-looking Johnson--
[Laughter]
DR. PRZEPIORKA: You
are also in alphabetical order!
DR. D. JOHNSON: I
would say yes, disease-free survival can be used as a primary endpoint and I
would say that I would interpret the two studies that have been presented
slightly differently. One will be
published in The New England Journal soon, which was presented at a
plenary session at ASCO this year. It is
really the only study that is sufficiently large to address this question. It was an international study, done largely
out of France. The disease-free survival
essentially mirrors the overall survival.
This is essentially identical to what we see in breast cancer adjuvant
trials.
The second trial, which shows the same pattern, is a trial
out of Japan which used a drug that is not available in the U.S., UFT. It too showed a disease-free survival that
was reflected in the overall survival.
So, I personally think that this is a worthwhile
endpoint. If it is going to be used in
future trials, I think DFS can be used as it is in breast cancer adjuvant
trials.
DR. PRZEPIORKA:
Other comments from our experts?
DR. BONOMI: I agree
and I think you are going to see that there is going to be a lot more activity
in this area with these trials, especially with the IALT trial turning out to be
positive. I know the cooperative groups
are gearing up to do new studies.
DR. ETTINGER: There
are two studies. One is the Canadian
study that has been completed with vinorelbine/CIS that we await with bated
breath in early disease, stage I actually, and there is the CALGB study that is
very similar with a different set of drugs, hopefully, going in the same
direction otherwise we will have a real problem on our hands. Right now we have the ALPI study, although
there was a trend that was negative, and we have the IALT study that obviously
is positive.
So, I agree that disease-free survival in that study as
well as the UFT study in Japan show that the disease-free survival and survival
are in the same direction and we should be able to use either one of them or both.
DR. PRZEPIORKA:
Other questions? Comments? Ms. Ross?
MS. ROSS: Just a
quick comment because my duty here is to represent patients, and the status quo
is not acceptable. We can't remain with
a 14 percent survival rate with lung cancer.
We have to open this up. Yes, I
would agree with that position. Please
open it up.
DR. PRZEPIORKA: Do
you have other points you want us to discuss with that question? No?
Okay.
DR. WILLIAMS: There
is one other issue though. I would like
you to vote on it.
DR. PRZEPIORKA: To
vote on it?
[Multi-member discussion]
DR. WILLIAMS: What
we are asking for, call it what you want, is would you grant full approval for
this? That is the question before you--or
regular approval.
DR. PRZEPIORKA: If
we get a positive vote on a) we won't need to vote on b) then. Going around the table then, the question
before us is in the surgical adjuvant setting would one accept disease-free
survival improvement to support regular full approval for a drug. Dr. Ettinger?
DR. ETTINGER: Yes.
DR. PRZEPIORKA: Dr.
Saxman?
DR. SAXMAN: No.
DR. BONOMI: No.
DR. D. JOHNSON: Yes.
DR. B. JOHNSON: Yes.
DR. GRILLO-LOPEZ:
Although I don't have a vote, if I had one I would like you to know that
I would vote yes.
[Laughter]
DR. GEORGE: Yes.
DR. CHESON: Yes.
DR. DOROSHOW: Yes.
DR. RODRIGUEZ: Yes.
DR. BRAWLEY: Yes.
MS. ROSS: Yes.
DR. FLEMING:
Conditionally yes. Sorry, I have
to give a condition because it wasn't totally clear to me. If we can say consistently that at recurrence
there are symptoms, then that makes it what I would call a level one
outcome. Short of that, if we can put
forward data that would indicate that there is a clear consistency between
effects on disease-free survival and effects on survival that would also be the
basis.
DR. LEVINE: Yes.
DR. REAMAN: Yes.
DR. PRZEPIORKA: Yes.
MS. HAYLOCK: Yes.
DR. CARPENTER: Yes.
DR. REDMAN: Yes.
DR. TAYLOR: Yes.
DR. PRZEPIORKA: It
is overwhelmingly yes so we will forgo b).
Back to the first page of the afternoon session, first-line
non-small cell lung cancer treatment setting, approval based on demonstrating
superior time to progression. So,
considering the pros and cons that we all discussed this morning in the time to
progression session, for approval of drugs for first-line treatment of advanced
lung cancer, could time to progression benefit of a new drug compared to a
standard first-line regimen justify regular full approval? Assume that the standard control arm has a
known small, two-month, benefit.
Comments?
DR. CHESON: So, we
are really keeping this at time to progression and not progression-free
survival?
DR. WILLIAMS: Why
don't you change it to progression-free survival?
DR. PRZEPIORKA:
Progression-free survival.
DR. WILLIAMS: Thank
you, you have made it easier.
DR. PRZEPIORKA: Dr.
Johnson?
DR. D. JOHNSON:
Actually, my comments were relative to time to progression, but actually
I just want to make one other point that may be self-evident to everybody at
the table but it may be more germane to Dr. Bunn's comments vis-a-vis
response. One of the problems I think in
lung cancer studies is the tremendous heterogeneity of the population that we
study. I think one of the problems that
FDA faces and this advisory committee faces when it comes to lung cancer is the
fact that there has been a stage creep that affects us. Stage IV disease is very much more
homogeneous and a lot of the data that I think that Dr. Bunn presented really applies
principally to stage IV disease. When
you start including unresectable stage III disease, first of all, you have to
define unresectable and then you have to define which stage III disease one is
dealing with. At least in cooperative
group trials, a review of the database shows as much as a three-month
difference in median survival in various so-called unresectable stage III
patients relative to stage IV. That is
actually the difference that many trials are designed to see. None, as Dr. Bruce Johnson has shown,
actually has quite achieved that level in advanced disease. Typically, the best one sees is about a
two-month improvement in the so-called statistically positive trials in stage
IV.
So, I just want to make this point. It also has to do with response rates because
response rates are consistently higher in patients with unresectable but
locally advanced disease as compared to patients that have metastatic,
extrathoracic metastases. So, there is a
huge issue here that I didn't really hear addressed but I am assuming, maybe
incorrectly, that this particular committee is familiar with and knows about.
DR. PRZEPIORKA:
Would you feel more comfortable asking this question in a metastatic
setting versus the non-metastatic setting separately?
DR. D. JOHNSON: I
think it would be helpful to our colleagues at FDA but maybe they can answer
that question for themselves.
DR. PRZEPIORKA:
Would you like to hear that?
DR. WILLIAMS:
Certainly, if it makes a difference, we would.
DR. PRZEPIORKA:
Other comments before we move to vote?
Dr. Fleming?
DR. FLEMING: I would
be interested to know if there is more evidence to put on the table than what I
have heard thus far. The distinction
here between what I have been calling a level two as a marker versus level
three is profound. Level three means it
is reasonably likely to predict clinical benefit. Level two is, is it reliable? It is reliable evidence; it is
established. Across clinical areas the
number of established surrogates is really small. They are very rare. It takes striking evidence to be able to
reliably say that the effect on this marker will tell us the effect on the
clinical endpoint.
When this FDA/ASCO group met, after several meetings the
summary of the conclusions, which are presented in this document, basically
were it has not been established that the benefit on TTP reliably predicts
benefit on survival--reliably predicts.
Listening to Paul's presentation, the vast majority of it was advocating
for greater attention to response. His
comments indicated, if anything, some real skepticism, pointing out a number of
inconsistencies in time to progression prediction of survival. So, I would consider that a fairly negative
summary that, in fact, endorsed what the FDA/ASCO summary indicated after its
sessions. But maybe there are more
comprehensive analyses other people have done that can give a more positive
view than this.
Essentially I am trying to summarize what I heard at
FDA/ASCO and what I heard from Paul. It
sounds as though for time to progression these data are well short of what we
would typically think of as necessary to say reliable.
DR. WILLIAMS: Tom, I
think some of those things we were talking about this morning really need to be
discussed a little bit here. Does it matter
that there is a short difference between time to progression and survival, and
which way does it matter? Does it make
it more acceptable or less acceptable?
Do you think there are symptoms when people progress and, therefore, is
that the reason you would accept it? You
know, what would be the pros and cons of accepting it here? So, I think a bit of discussion on that point
would be helpful.
DR. B. JOHNSON: One
of the potential means for this is that this will pick up an important endpoint
that survival misses. The length of time
between time to progression and death in advanced disease is very short. So, the help of that would be very small as a
surrogate to outcome.
The second potential problem is that now with therapies in
the second- and third-line you would have problems in interpreting data that
the randomization did not take care of. To
me, that is a hypothetical problem; not a problem that has been proven to be
shown. So, I don't see that adding a
time to progression or progression-free survival would be particularly helpful
in interpreting the trials.
DR. D. JOHNSON: I
don't know if this helps, Tom, but one thing that we have done over the last
several years is to do a detailed analysis of the ECOG database for advanced
disease, with all of the recognized limitations of such an analysis. But what I can say is that at least in stage
IV disease--which is fairly reliably diagnosable, perhaps even more so today
but certainly in the '80s and '90s with CT scans one could pretty reliably
diagnose stage IV disease--one thing we observed is at the time of progression,
as documented by the individual taking care of the patient, typically by a
physical finding or a new radiographic finding, before widespread availability
of second-line treatment or the widespread acceptability of that, the median
survival of patients from that point forward was approximately 14 weeks or
so. That was borne out in the docetaxel
study that Dr. Cohen alluded to where the median survival of patients after
first-line therapy was four months. What
docetaxel did was extend that by approximately two and a half months, more or
less, in one study not in the second study.
We did an analysis which we then presented this year at
ASCO, looking at the ECOG trials subsequent to the approval of docetaxel. That is, presumably the widespread
availability of second-line therapy.
What we found was that the median survival of patients from progression
was extended by approximately six weeks beyond what it had been according to
the data prior to that. Again, this more
or less validates in my mind the data that we saw in that relatively small
trial of docetaxel.
Another thing we did during that same analysis which was of
interest to me, and I presented this at the forum, were two separate
analyses. Again, we are talking almost
exclusively about stage IV disease.
These data were developed in patients, 85-90 percent of whom had
documented stage IV disease. Patients
that had disease control--forget about whether their tumor got smaller or not
but they didn't progress, did as well regardless of whether their disease got
smaller by X amount, 30 percent, 40 percent or whatever. Those patients had virtually identical
survivals.
The other thing we looked at was percent of progression at
various time points. We chose time
points when physicians would have evaluated patients according to the
protocol. So, that would be every three
weeks or every four weeks, whatever. It
didn't really matter whether one chose three weeks, six weeks, nine weeks or
whatever. If one selected a time point
and then calculated the percent of progressors, non-progressors, in only those
studies where there was a statistically significant survival benefit was there
a difference in percent of non-progressors in favor of the arm that did better,
if you follow what I am saying.
So, it is a little bit different than progression-free
survival, but it is a fixed time point where one can say X amount of patients
are progressing at this point in time, fewer in this group and this group does
better. And, that was surrogate, if you
will, of survival. So, we looked at
those. I think that was something you
were talking about earlier, could one use some marker of that nature to do
that.
DR. FLEMING: The
evidence that we really need here would be a wide array of studies, conducted
in a given setting where we are advocating the use of a given marker as the
reliable evidence of benefit that would show treatment-induced effects on that
marker at a certain level which are always going to tell us that we have
treatment-induced effects on survival and, more generally, that the
relationship between those two is very strong.
Some of the examples that Paul gave were ones that gave very
inconsistent results in progression from survival. He also mentioned the ECOG 1594, saying that
the GC arm was a month and a half longer in time to progression, suggesting a
difference but the survival effects were the same.
DR. D. JOHNSON:
Actually, those survival results are not the same. They are not statistically significantly
different but actually the better survival is in that arm. But that is a whole other argument. I would disagree with Paul's analysis of that
particular data.
But let me say this, that what we did was develop those
markers in one set of data, 5592 which was the predecessor trial and was a
three-arm trial, and we tested the model in the 1594 data. We also went back and tested it in another
data set, 1583, which was a study that Dr. Bonomi chaired back in 1983. He is not that old; he just looks that old--
[Laughter]
--and again validated those endpoints in the same
direction. There was a survival
advantage in his study with carboplatin as a single agent and, yet, it had the
lowest objective response rate. But the
percent of patients who progressed at various time points was lower in that
particular arm. There was
"crossover" but only a small percentage of patients actually crossed
over. But it was that percent of
non-progressors that actually best correlated with outcome in that particular
study.
DR. FLEMING: But you
are saying the aggregate data showed a lower time to progression in the arm--
DR. D. JOHNSON: No,
what I am saying is the objective response rate in 1583 for carboplatin as a
single agent was nine percent. That was
the lowest overall response rate. The
highest response rate was 27 percent, as I recall, a three-fold difference in
response rate, and yet the 27 percent group had the lowest, statistically less
survival compared to carboplatin. But
then when we applied our rule of non-progression, and you could pick the point
you want, after two cycles, after three cycles or whatever, not looking at
objective response rate but non-progression it comes out in favor of the
carboplatin arm, just as we had predicted from the 5592 data and 1594 data and
then applied to the 1583 data. So, there
were three separate databases.
DR. FLEMING: It is
this kind of data that certainly gives one concern about the reliability of the
response predictor where you are telling us it goes in the wrong
direction. More broadly, for time to
progression or any other measure of tumor burden what one needs is much more
evidence than what I am hearing, and it may exist but just needs to be looked
at in a meta-analysis framework to understand whether treatment-induced effects
on whatever measure you are advocating--time to progression right now--is
reliably telling us treatment-induced effects on clinical endpoints such as
survival.
DR. PRZEPIORKA: Dr.
Bonomi?
DR. BONOMI: I want
to make one comment. The MBP regimen is
a peculiar regimen. I don't know if Dick
Gralla is still here. We used a very low
dose of cisplatin, 40 mg/m², and some people would say, and I think
Dick would be one of them, that dose might be below or right at the minimum
effective dose. The point I want to make
is there is discordance between response and survival in the study but that
particular regimen isn't a good one to base it on because in three consecutive
studies it gave the highest response rate, statistically significant in I think
two out of the three, and a trend for a shorter survival. In fact, when it was lumped together it
actually gave a significantly lower one-year survival rate, MBP did. So, higher response rate, lower
survival. We thought that regimen either
was doing something detrimental in people or possibly the platinum dose was too
low. Mitomycin might have been
detrimental. We thought it was a
combination of toxicity and the actual anti-tumor effects. That is a peculiar regimen. I wouldn't want to base any correlation
of response and survival on that particular one.
DR. FLEMING: But
that really gets at the essence of what leads these predictors to not be
reliable. It is not that they are
irrelevant; they are relevant but are they adequately relevant? Are they adequately capturing the
complexities of how the disease process influences the outcome, and are they
adequately capturing some of the unintended effects? This is the heart of why these are often
misleading.
DR. GRALLA: If I could
make a comment?
DR. WILLIAMS: You
need a mike.
DR. PRZEPIORKA: Will
you take the podium?
DR. GRALLA: There
are other aspects, such as Lucio Guino's study where, with different doses of
cisplatin, he finds that the same drugs put together differently equal, for
example, gemcitabine/cisplatin which is approved.
I think that we can find exceptions, but what I think Paul
was trying to do was to put them all together.
He was looking at single agents.
When you put single agents together at the doses at which they are used,
you do find exceptions but what you find is a fairly strong correlation between
response and survival. You know, we can
put together regimens in ways that don't have duration of response, that are
too low to do that. So, I think Paul was
looking at single agents, not combinations that are more subject to that
because when you put that together differently you can get a different result.
DR. FLEMING: But,
Richard, a lot of that single agent was Phase II data and that is not the kind
of data that you need to have to validate a surrogate because that is just
getting at correlation of response and the outcomes. That is just a foot in the door step.
DR. GRALLA: It may
be. I mean, you are right, many of those
were Phase II studies. I think if you
looked at the randomized studies looking at single agents though you would come
up with a clearer correlation between survival and response but we only have
about 15 or 20 of those in the last few years.
I must say, in my heart of hearts I believe really
ultimately response does agree with survival.
The question is are the data robust enough to agree with that at this
time, and that I am not sure of and why wouldn't we want to look at the data to
see that rather than just have an opinion?
DR. PRZEPIORKA: We
will get back to the question of progression-free survival. Dr. Johnson, before I could answer this
question the question I really have for you or anyone else in the expert row
there is would you limit enrollment in such a study on the basis of performance
status? If, in fact, we want to use
progression-free survival as the ultimate reason for approval and we think
progression-free survival is actually a measure of clinical benefit, is it going
to be likely in somebody who has ECOG performance status II or are we looking
for people who are pretty healthy looking people?
DR. D. JOHNSON:
Well, I think most of the data that have been developed in the last
decade has really been restricted to patients with performance status 0 or
I. We could debate about whether II should be
allowed or not but, frankly, the numbers here are not generally a problem. So, I personally think restricting to 0 or I
is still the way to go. There is a
higher level of toxicity associated with performance level II. Actually, response rates tend to be fairly
similar across the performance status and we have shown that several times in
the ECOG database but the toxicity levels are much different. So, I personally think it should be
preferentially in patients with performance status 0 and I. I wouldn't mandate that it be limited that
way but I would certainly urge that that be done in that fashion.
DR. PRZEPIORKA: Dr.
Ettinger?
DR. ETTINGER: Since
progression-free survival in my opinion is a fuzzy endpoint, it seems to me the
quality of life issue becomes paramount.
Therefore, I would say you want patients that are symptomatic if you are
going to use that as an endpoint because then there is clinical benefit, and I
think that is critical and I think that is what the patient wants. If the survival didn't come out to be
statistically significant, at least there was a clinical benefit and that is
enough to approve a drug, especially if the progression-free survival was in
the right direction that was statistically significant.
DR. TEMPLE: Just to
make the point, we have long said that improvement in symptoms is a basis for
full approval. That is why we haven't
been asking you about that. So, that is
already true and we haven't had any reason to debate it. The question here is suppose you don't have
that. So, if you have that along with
whatever it is, you are fine; that is not an issue.
DR. PRZEPIORKA: Dr.
Williams?
DR. WILLIAMS: First,
I believe Dr. Johnson is saying that you believe there probably is a correlation,
at least that it could be that progression-free survival could be a substitute
or a surrogate for survival. Perhaps we
don't have all the data yet to validate it as such. So, I would like to pursue a little bit
further also whether or not in these patients you believe that progression is
an indicator of symptoms and that would be the other basis where you might
consider this endpoint--a little discussion on that matter.
DR. D. JOHNSON:
Well, I got off on a little bit of a tangent. The point I was trying to make when I was
talking with Dr. Fleming is the fact that I do believe progression-free
survival is a valid endpoint, and I do think that upon progression, even in
this era when we have second-line therapy, the overall survival after that is
not that good. I mean, it is really
pretty modest and those patients are for the most part symptomatic. Most of the recurrences take place because
the patient walks back in your office not on a scheduled visit but because they
have new lung pain, or they had a seizure, or they are short of breath, or they
are coughing up blood, or they are coughing their lungs out. So, this is not a subtle thing in most
instances. We don't find it on screening
PET scans. It is the type of thing that
patients are really quite symptomatic.
So, I do think prolonging their progression-free survival is almost
tantamount to their symptom improvement, not symptom free because they rarely
completely resolve their symptoms.
I might add that the first drug that showed benefit in
non-small cell lung cancer that we know about was published in 1948 in Cancer
by David Karnofsky and it was nitrogen mustard.
Nitrogen mustard actually--the reason that he recommended its usage was
not because it induced tumor regression but because it improved symptoms in 70
percent of patients. I am mindful of the
fact that the FDA did approve gefitinib because of its objective response in
symptom improvement, and the rapidity with which that occurred I think was on
average eight days. If you go back and
read Dr. Karnofsky's paper you will note that nitrogen mustard which, by the
way, most of us don't use to treat lung cancer these days, improved symptoms in
approximately six to seven days.
Procarbazine has been shown to do the same thing too in non-small cell
lung cancer. So, this is not a new
concept. This has been going on for 55
years.
DR. PRZEPIORKA:
Other discussion that you need before the vote?
[No response]
As recommended by Dr. Johnson, we will split this out
looking at locally advanced versus metastatic disease, and we will start with
the metastatic patients. So, would you
consider progression-free survival as an appropriate endpoint for full approval
for a patient with metastatic non-small cell lung cancer? We will start with Dr. Taylor and work our
way around.
DR. TAYLOR: No.
DR. REDMAN: Yes.
DR. CARPENTER: Yes.
MS. HAYLOCK: Yes.
DR. PRZEPIORKA: Yes.
DR. REAMAN: Yes.
DR. LEVINE: Yes.
DR. FLEMING: No, and
just to amplify a bit, there is a correlation here but I still think that the
essence of the nature of what we need still maybe hasn't gotten clarified
adequately. There is a correlation
between those people who have a longer time to progression and those people who
have a longer time of survival. The
evidence, at least as was brought forward before the ASCO/FDA group and the
evidence that Paul Bunn brought forward today certainly brings out that there
are serious concerns about whether we can rely on time to progression effects
to predict survival effects. Symptomatic
effects have been mentioned. I wonder if
the best way to measure symptom improvement is through time to progression or
whether it would be through some of Richard's approaches that he has indicated
using PROs.
But, in essence, the number of truly validated surrogates
are rare in clinical practice. I think
the data that we would need potentially could be out there but they haven't
been brought forth to be analyzed.
DR. PRZEPIORKA: Ms.
Ross?
MS. ROSS: Yes.
DR. RODRIGUEZ: Yes.
DR. DOROSHOW: No.
DR. CHESON: No.
DR. GEORGE: Yes.
DR. B. JOHNSON: No.
DR. D. JOHNSON: Yes.
DR. BONOMI:
Suggestive but no.
DR. SAXMAN: No.
DR. ETTINGER: No.
DR. PRZEPIORKA: So,
it is 8 no and 11 yes.
DR. WILLIAMS: Can we
do a subgroup analysis? Any particular
group occur to you?
[Laughter]
DR. PRZEPIORKA:
Let's do the second part and see if that changes.
DR. WILLIAMS: Okay,
go ahead.
DR. PRZEPIORKA: So,
those with inoperable, locally advanced disease, would you use progression-free
survival as your primary endpoint for approval?
We will start with Dr. Ettinger.
DR. ETTINGER: No.
DR. SAXMAN: No.
DR. BONOMI: No.
DR. D. JOHNSON: No.
DR. B. JOHNSON: No.
DR. GEORGE: No.
DR. CHESON: No.
DR. DOROSHOW: No.
DR. RODRIGUEZ: No.
MS. ROSS: Yes.
DR. FLEMING: No.
DR. LEVINE: No.
DR. REAMAN: No.
DR. PRZEPIORKA: No.
MS. HAYLOCK: Yes.
DR. CARPENTER: No.
DR. REDMAN: Yes.
DR. TAYLOR: No.
DR. PRZEPIORKA:
Overwhelming no. So, clearly that
reflected the discussion earlier regarding a slightly better prognosis group
that you want to get good, hard endpoints in.
DR. WILLIAMS: So, in
patients that might be more symptomatic or more likely to be symptomatic upon
progression the "non-lungers" said yes and the "lungers,"
except for one, said no. That is what I
heard.
DR. PRZEPIORKA: Do
you want us to continue on question two regarding the metastatic patients?
DR. WILLIAMS: No,
why don't we move on?
DR. PRZEPIORKA:
Well, we can move on because we have said no. If it doesn't support full approval, would it
support accelerated approval? We will again start with Dr. Ettinger.
DR. ETTINGER: No.
DR. SAXMAN: I think
that would depend on the magnitude so I guess the answer is yes.
DR. WILLIAMS: Let me
just give a little guidance here now.
The accelerated approval regulations say that you must show an advantage
over available therapy. Let's say this
is a first-line therapy with a survival advantage and you are showing a TTP
advantage over it so what you need to ask is, is this endpoint reasonably
likely to predict clinical benefit. You
don't have to show that there is clinical benefit. So, that is the call for accelerated
approval, to feel that this is reasonably likely to predict clinical
benefit. So, you can also discuss the
magnitude but I just wanted to make sure that that was clear.
DR. SAXMAN: That is
TTP.
DR. WILLIAMS: Or
progression-free survival, or we will substitute that for each of these.
DR. SAXMAN: What
about accelerated approval?
DR. WILLIAMS:
Accelerated approval. In other
words, you are getting the best thing out there with respect to time to
progression or progression-free survival.
DR. SAXMAN: With the
idea that full approval was intended upon subsequent survival advantage.
DR. WILLIAMS: Right.
DR. BONOMI: I will
say yes on that one.
DR. D. JOHNSON: Yes.
DR. B. JOHNSON: Yes.
DR. GEORGE: Yes,
assuming all those methodologic issues are addressed that we discussed.
DR. CHESON: Yes.
DR. DOROSHOW: Yes.
MS. ROSS: Yes.
DR. FLEMING:
Abstain.
DR. REAMAN: Yes.
DR. PRZEPIORKA: Yes.
MS. HAYLOCK: Yes.
DR. CARPENTER: Yes.
DR. REDMAN: Yes.
DR. TAYLOR: Yes.
DR. PRZEPIORKA: That
is overwhelmingly yes. Then we have to
answer the more important question which is what would be the interval that you
would want to see to say that your progression-free survival was of clinical
benefit. It is open for discussion. Dr. Johnson?
DR. B. JOHNSON:
About three months beyond control.
DR. WILLIAMS: We are
talking about accelerated approval now, right?
So, we are talking about what would be a surrogate reasonably likely to
predict clinical benefit.
DR. PRZEPIORKA: Dr.
Carpenter?
DR. CARPENTER: All
the differences in therapy we have heard about were all either in the two-month
or the three-month range of any therapy over another, if I understand the
experts. It would seem unrealistic to
expect anything larger than that of a new therapy, or not very likely. So, David mentioned the biggest difference in
survival and the disease-free survival threshold level usually pretty closely
parallels that. I think that the data
needed for accelerated approval would have to be pretty compelling and there
would need to be a large, well-controlled study that showed a difference that
is larger than we typically see for survival with best supportive care with a
doublet. I think it would need to be at
least three months.
DR. PRZEPIORKA: Dr.
Johnson?
DR. D. JOHNSON: Just
to give some context and, again, I think you have to think about this in stages
and stage IV I would argue is the most homogeneous group and a group about whom
we have the most data accurately in terms of these numbers. So, median survival in stage IV disease is
about seven and a half, maybe eight months with PS-0 and 1 patients. If you throw 2's in, that drops down. The median time to progression in SWOG and
ECOG trials is pretty reliably--the time to progression, not progression-free
survival--is about three and a half months.
You saw that in the 1594 data.
That is unbelievably reproducible.
I use that all the time. You can
just about double the time to progression in most of the cooperative group
trials and you can come up with the survival, median survival. That is what it is going to be.
Now, progression-free survival is a little bit harder to
come up with because those data haven't been as well characterized, at least
within the cooperative group data. But I
would agree with Bruce. I think if one
is looking for accelerated approval one needs to see something that is more
than just a few weeks difference in progression-free survival, and I think
three months may be unattainable. I
don't know but you are talking about accelerated approval here and I would
agree with that number.
There is one methodologic question that has been posed
which I think may be germane even in the accelerated approval setting and that
is should the trial be blinded and, if it is not, or even if it is, should
progression be verified by a blinded central reading of scans. One shakes their head yes, one, no.
DR. BONOMI: I don't
think so. David has pointed out it is
pretty obvious when these people are progressing and I think probably you don't
need to go to that degree of rigor.
Maybe David might dissent.
DR. D. JOHNSON: No;
I don't dissent. I just want to point
out that, in the studies, at least the ones I have been involved in, where
there has been a review committee that reads the X-rays, there is as much
disagreement amongst the review committee as there is amongst the original
investigators. So I am not sure who is
truly accurate in reading these.
Actually, it is my personal view that the way to get better
rigor is not to have someone else read the films but to have someone
consistently read the films at one's institution. That way, I think one gets more accurate. But that is a debate for another day, I
think.
DR. BONOMI: One
other thing. I think more and more
places now have digital radiographs with a cursor and you can measure it. There was just a paper in JCO; that is what
Dave said; it should be one person reading these things consistently. You can keep it, put it in a PowerPoint
presentation. If somebody wants to look
later and see what you did, they can see exactly what they did. The reading stays right on there in
millimeters. It is much more reliable
than it used to be but it should be one person.
DR. B. JOHNSON: One
point of clarification. When you talk
about blinded, is it blinded to the treatment or is it blinded for determining
the time of progression?
DR. PRZEPIORKA:
Either.
DR. B. JOHNSON: One
of the things, and I think we have heard this consistently, it is nice to blind
you to the treatment but, if you are getting some kind of I.V. infusion, I
don't think it is going to be ethically or practically possible to blind you to
the treatment.
So I think it depends on the circumstances. If it is a pill, certainly. If it is a 14-day infusion, no.
DR. PRZEPIORKA: Dr.
Temple.
DR. TEMPLE: I am
having a disconnect. The question here
is about time to progression irrespective of whether the person is
symptomatic. What you are all saying is
they are always symptomatic, or almost symptomatic, and that is what makes you
know they have progressed. But we never
see that. We are never given data that
show symptomatic progression. If it is
that easy, why isn't everybody collecting it because then there would be
regular approval. It wouldn't be
accelerated. There wouldn't even be a
discussion.
DR. D. JOHNSON: I am
reminded of the time that I sat in this committee formerly as a member and
this is like deja vu because I remember your comments many times, Bob--
DR. TEMPLE: Sorry.
DR. D. JOHNSON: No,
no. I am glad to find you are
consistent. In my after-ODAC life, I
have been involved in advising folks and I have made that point many times that
it is something. I think Richard has
made the point many, many times as well.
We, basically, agree with you. We
do think that that is a reason for approval of drugs and we would like to see
more of it ourselves.
So I can't answer why people don't do it. But I am also reminded of one of my favorite
quotes. I actually put it--after I heard
you make this quote, I actually had my wife embroider it and it is on my wall. It is listed there, "Bob Temple, FDA,
Survival Trumps Everything." That
was a quote from you and I have never forgotten that. So we always remind people when they--
DR. TEMPLE: Just one
other observation; we have also asked people, even if you are not absolutely
sure that, at the time of radiologic progression, there are symptoms. It has always been our assumption that, in
something like lung cancer, symptomatic progression must be fairly near at
hand, even if they have crossed over or stopped the drug.
We have invited people to look for symptomatic progression
at any time, even if they are off therapy or moved out and, again, gotten very
little interest in doing that.
DR. B. JOHNSON: Let
me make a comment about this. It has to
do with the clinical practice of it. One
of the things that happens is, when we go in to see somebody and they tell us
they have shortness of breath, you examine them and they have decreased breath
sounds half of the way up, you send them for a chest X-ray and you get the
chest X-ray and it shows a new pleural effusion and enlarging nodules. The thing I always tell the patient--well,
usually I tell them when they are responding, responding, getting better, it is
easier to make jokes when they are responding.
But we say, well, one of the things that's nice about being
an oncologist is it is not that complicated because 95 percent of the time the
radiographs agree with the symptoms.
Now, we have grown up with radiographs as our objective criteria for
assessing disease progression. So that
gets categorized not as a symptomatic progression but it gets categorized as a
radiographic progression because that is what has been reviewed in every
cooperative-group study.
Now, one of the things that Richard has talked to us about
is that the symptom scales have evolved so that they may be more objective than
assessing radiographic response which will be a step forward in being able to
recognize and use the data. That hasn't
been something that has been easily available to us outside of a
clinical-study setting.
DR. GRALLA: One of
the problems has been feasibility. The
point is if you see the X-ray that Bruce is pointing to, you say, well, why do
I need to validate this on a scale. I
have this. Unfortunately, we have often
gone from the chest X-ray to the CAT-scan so it is a $1,000 procedure that you
wait for a little while on.
It has been necessary to convert these scales to easy
ways. They are now like on a Palm Pilot,
some of them. They are just being rolled
out in trials. This should make it
easy. But how do we now adopt that into
clinical practice because we are not used to doing that and, God knows, getting
us to change is the hard part.
So you have got this case-report form that is 40 pages
long and the rest of this and now you want to add something else to it. That is why I think you haven't seen it but I
think it is up to us now, from the cooperative group and from other areas, to
get this so you can see it in a way where most of the patients have it.
DR. PRZEPIORKA: Dr.
Temple, just to bring his point back to you and your definition of symptomatic
progression. Would you be looking for
something on a scale that is objective and you can measure or, as he points
out, the patient says, I'm short of breath?
Is that enough to say this is a symptomatic progression?
DR. TEMPLE: That is
a fair question. If we are all blinded,
it would be a much easier question because then you could accept a lot of
things. But there are people here much
better able to think about that than me, but somebody showed the five or six
things that are most of what bother patients.
If there were some systematic question that even asked them
on a ten-point scale, how is your fatigue, your this, your this, your this,
your this, and that was done regularly.
When it looked worse, you then sent them out for an X-ray. That would greatly help the persuasiveness of
that finding of progression as a meaningful thing.
The other thing, of course, is if, in several studies, it
always came out that way, you would have at least some case for saying that
progression pretty much always means symptomatic progression. Then we wouldn't have to do all that anymore.
DR. GRALLA: I think
Dr. Taylor pointed out in second line, where we saw these response rates of 6
to 10 percent, do we need to send all these patients for X-rays for
this? When the patient tells you that
they can't breathe, and you have got a valid way of measuring it, that they
have more pain, that they are using more pain medicine and they are dropping
weight like a stone, I am just not sure that we need the chest X-ray, the MRI,
the PET scan.
DR. TEMPLE: We
totally agree because symptomatic progression is a no-brainer approval, if you
believe it--if you believe it. That's
important.
DR. GRALLA: These
instruments do that now. The problem is
getting them incorporated into trials in a feasible way. It is the feasibility that is the problem.
DR. B. JOHNSON:
There is one other problem that comes up with this. Richard may want to address this. We have gone through the design of a trial
now where the symptoms are being assessed on one of the formal scales and the
designers want to withhold that information from the physician because they think
it will bias the physician's decision-making.
We are wrestling with the ethical dilemma about do you
withhold patient information from the treating physician with the potential of
biasing the outcome. I would like to
hear Richard's comments on this.
DR. GRALLA: It is a
great point, Bruce. We are doing a
200-patient trial in Ontario right now trying to look at that, trying to look
at how these data affect--did these data affect the physician
decision-making. So I hope we have some
information there. I think it is going
to be difficult to say because the patient comes in and has pain. As David said, it is not at the regular visit
that the patient comes in with this. The
patient comes in telling you this. It
wasn't on the screening PET scan.
But we have a 200-patient study looking at this where the
physicians are given this prospectively and they are given the data each
time. We will see what they tell
us. It will also be interesting to see
the average number of cycles that they use.
DR. WILLIAMS: I
guess the biggest problem in my mind is what about blinding. Can we believe it? How do we know we can believe it. These validations of this and that, they
don't seem to be taking into account the placebo effect or the effect of
knowing your treatment.
So how do we address that?
If we can't blind trials, then can we use these endpoints? We have basically moved down to No. 7 and 8
with this discussion, I think. Can
we? I wonder what Dr. Gralla would have
to say about that.
DR. GRALLA: So, by
"these endpoints," you mean these subjective endpoints, the pain, et
cetera?
DR. WILLIAMS: Right.
DR. GRALLA: Let's
look. We have talked about 1594, this
four-arm lung-cancer trial. Was the
patient supposed to feel that they should mark it better because they were
getting the docetaxel or the paclitaxel?
Most of these trials are in that way.
Now, if the patient is getting the gemcitabine or
the paclitaxel, my guess is that we could tell which one the patient was
getting if we were blinded. So I think
that actually maintaining the blind is unlikely and that these are, to me,
almost moot points because we are usually looking at Treatment A versus
Treatment B. The patient is usually told
if we are using the best standard versus a new agent, well, you are getting the
very best that we know of.
I don't think that patients answer that their cough or pain
is different six, eight, twelve weeks into a study because of this. Now, I think it is important, such as in the
gefitinib study, et cetera, that the patient then being given a pill is given a
placebo on the other arm when they maybe are getting nothing in second
line. I think that that really is
important.
But, in most of these first-line Stage IV patients--and
that is the other reason that the normative data will be important, also, to be
sure that this is a group.
DR. KEEGAN: Dr.
Gralla, I guess having lived through enough of the hype of certain
drugs--Herceptin was one, Iressa and Gleevec were others--in a lot of trials,
some patients actually are concerned about which arm they are randomized to and
do have a strong feeling. Perhaps
patients might not be as concerned about being on a certain arm and declaring
symptoms as patients who are on the "unfavorable" arm, or what they
perceive to be unfavorable, and want to hurry up and declare their symptoms so
they can be crossed over. Is that a
concern in an unblinded trial, because I think that has been a concern we have
had.
DR. GRALLA: I
certainly think whenever possible to blind, why not. There is absolutely no reason not to. There are many studies where we didn't see that
being done. However, I must say that, in
most of the trials that we have done in the '90s, this really hasn't been where
people have been so excited and where they have dropped out in that way.
If you look at the pemetrexed study that I showed,
basically, you can see a lot of patients showing improvement on the cisplatin
study, et cetera. There is a strong
correlation with response there, on the cisplatin arm, et cetera.
I agree that it is an issue and whenever possible to blind,
it is reasonable to do. But maybe the
burden of proof is on us to show that your concern actually occurs because it
is like the placebo effect, when they looked at it carefully, it was pretty
hard to show it was really there.
DR. WILLIAMS: That
is kind of our tradition to have the sponsor show that something exists. That is hard to get around.
DR. PRZEPIORKA: Dr.
Bonomi.
DR. BONOMI: Just
very brief. One other objective thing
that could be done in every Stage IV lung-cancer trial is just measure the
serial weights. Obviously, people with
edema would throw that off. But, otherwise,
if I had one thing I could look at in a patient, just show me their serial
weights and pretty much that is going to tell you what is happening to them.
DR. PRZEPIORKA: Is
performance status still a valid--
DR. BONOMI: Oh,
absolutely but it is--you know, the weights are so--it is a quantitative--one,
two is not--Karnofsky is a little bit more detailed.
DR. GRALLA: These
are all valuable. But they are not
surrogates for quality of life. So they
are all valuable. They are components of
quality of life. But they are not, by
themselves, that. So performance status
is really a function scale. It is of
real value, what is your ability to do things.
Actually, we like now the patient-generated activity scale
where they fill that out. That can be
useful. These are all valid points that
are very helpful in clinical management.
It is pretty hard to see a patient who is losing weight like crazy and
think that you are doing something good for that patient.
DR. PRZEPIORKA: Dr.
Saxman.
DR. SAXMAN: Getting
back to the original question which was to choose a magnitude of
progression-free survival that one would think would be clinically relevant, it
seems to me that the problem with that, and maybe I don't understand this
correctly--but the problem with that is that it dissociates that endpoint from
the toxicity issue.
Whereas, I think a three-month progression-free survival
advantage in a minimally toxic drug may be quite interesting and important, a
three-month progression-free-survival advantage with a very highly toxic drug
probably wouldn't be. So my own opinion
is you can't choose an absolute magnitude that is of clinical relevance, that
you have to take into account the toxicity of the agent. So it is going to be a judgment call each
time this comes up.
So I guess, in that regard, I disagree with Dr. Johnson, B.
Johnson. I don't think it is going to be
possible, quite frankly, to choose an absolute magnitude. That consideration is too important, I think.
DR. PRZEPIORKA: Dr.
Fleming.
DR. FLEMING: I had
voted against use of time to progression as a full reliable endpoint because of
the uncertainties we have talked about.
I abstained on the issue of its use as an accelerated approval because I
am a bit on the fence. I think we are
getting at some very good discussion that I think are the relevant factors that
would pull me off the fence one way or the other.
If we are conducting these studies with a high level of
rigor that minimizes bias due to unblinding which does concern me, and
minimizes missingness, those are issues that certainly are important. I am very favorably persuaded by my
colleagues' comments that, if we were relying on time to progression as an
accelerated-approval endpoint, it would have to be based on a very substantial
evidence of benefit.
I think Scott makes the good point; ultimately, it is
benefit to risk. So what that level of
benefit is going to have to be will be dependent on what the overall safety
profile is. That is certainly relevant
although it is helpful to get Dr. Johnson's sense, three months. My own sense here is it should be something
very substantial taking into account, of course, the toxicity profile.
We didn't talk about statistical strength of evidence, but
it should be strong statistical strength of evidence. Traditionally, we call it strength of
evidence of two trials, 0.025 squared, something on that order, something on
that order. It should be stronger evidence
than I might have asked for for survival because, in fact, it is not as
reliable a measure.
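[Editor's sketch, not part of the spoken record. It assumes the "two trials" standard means two independent trials that each meet a one-sided 0.025 significance level, so the combined false-positive rate is the product of the two. The variable names are illustrative.]

```python
# Assumption: two independent trials, each tested at one-sided alpha = 0.025.
alpha_per_trial = 0.025

# Chance that both trials are falsely positive when there is no true effect.
combined_alpha = alpha_per_trial ** 2

print(f"{combined_alpha:.6f}")
```

Under that assumption the combined level is about 0.0006, or roughly 1 in 1,600, which is the order of strength of evidence being described.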
The study presumably will give us some information on PROs
or survival. Certainly seeing some
suggestive evidence that those results look to be trending in the right
direction, obviously, would be also very importantly reinforcing.
The final point that I would make is a very important
issue; is accelerated approval tantamount to full approval and, if it is, then
I would argue we should be using criteria close to that for a full
approval. But, if accelerated approval
really is to get early access while we complete the validation trial in a
timely way and, if we have procedures in place that would give us a process to
withdraw the accelerated approval if the validation study shows lack of
benefit, then I am much more willing to say yes, this lower level of evidence
that we would have is, in fact, a basis to providing an accelerated approval.
So I guess I am saying under all of the conditions that we
have talked about, I would also support the accelerated approval. But those conditions mean that we need to
have considerable strength of evidence on time to progression. It would be useful to have supportive
evidence on survival and it would be important to know that, if the validation
study, when completed, showed lack of benefit, that this wasn't going to lead
to indefinite access. If it were, then
we should be looking at full approval criteria.
DR. PRZEPIORKA: Dr.
Johnson.
DR. B. JOHNSON: I
wanted to get back to Dr. Williams' point about being concerned about using the
PROs and the blinding issue. One of the
things that we don't have a lot of examples of in lung cancer is a big
dissociation between patient-related symptoms or patient outcomes and what is
happening with the underlying disease.
The duration of time is relatively short that we typically
see so, until we come up with some examples where there is a moderate
dissociation between the patient's perception of outcome and what we typically
measure in the disease, I think it should be okay. It is not something I would lay awake at
night worrying about.
DR. PRZEPIORKA: Dr.
Temple?
DR. TEMPLE: I guess
something I want to flag for a later discussion is the difference between what
we usually measure, which is medians or the shape of the curve, and the
possibility that there are widely different results from one piece of the
patient population to the other; that is, a small responder set.
I don't want to try to resolve that now, but it has always
sort of bothered me because I have always been struck by the end of the tail
that goes out real far. That seems, in
some ways, more important than the median.
None of our analyses really reflect that. But I don't want to talk about it now. I just want to flag it for later. Much later.
DR. PRZEPIORKA: In
that case, we will move on to the Question No. 6 which we are now getting into
dreaded territory. First-line
non-small-cell lung-cancer treatment setting approval based on the
noninferiority analysis of time to progression or progression-free survival
and/or response rate.
So, specifically addressing the following situation; a less
toxic experimental drug demonstrates noninferiority of both response rate and
progression-free survival compared to the standard toxic regimen. The standard toxic regimen has previously
demonstrated an estimated two-month survival benefit in one trial comparing it to
best supportive care.
In the current trial data, 95 percent confidence intervals
cannot establish whether the experimental therapy retains the survival benefit
of the standard regimen. Could approval
be based on noninferiority analyses of response rate and/or progression-free
survival in situations where the noninferiority analysis of survival cannot be
performed?
Examples would be when there are insufficient patient
numbers to allow the survival noninferiority analysis or when there is
confounding of the survival analysis by crossover.
Discussion? Dr.
Fleming?
DR. FLEMING: 5 and 6
are related. They are both
noninferiority questions. 5 was on
survival, 6 was on surrogate for survival.
I am just wondering, since 5 lays out the fundamental issues that have
to be considered for a valid noninferiority trial which also have to be
considered in Question 6, is it okay to consider those two questions together,
or can we start with 5?
DR. WILLIAMS: I
would prefer not to get into the details.
Let's suppose that we have everything we need for a noninferiority
trial, for time to progression and response rate. I don't want to get into whether we do and
how you would do that, but let's suppose we do.
Not a likely situation, but let's suppose. Given that, and given that we can't deal with
survival compared to this marginal survival benefit of this other agent, but it
is less toxic--I mean, this is a real situation that we definitely will face
with several drugs in the near future.
The question is can you do noninferiority comparison with response and
time to progression.
Certainly, you can do it with response rate. And they are less toxic. So that is the question. I don't want to get into the details of what
are the various numbers of trials we have in order to demonstrate the
time-to-progression effect and the response-rate effect. Let's just assume that we have a margin that
we can establish and we can establish that we have a noninferior response rate
and time to progression.
I would like to take that as a given, in this question.
DR. TEMPLE: It
didn't say noninferiority on the surrogates.
DR. WILLIAMS:
Right. Response rate and time to
progression.
DR. TEMPLE: But not
for the survival, but tolerability advantages.
DR. WILLIAMS:
Yes. This is an extremely real
example. All of the doublets have very
poorly documented survival effects. It
is very difficult to do a noninferiority survival analysis. So you have either got to beat them or the
other alternative would be to say, I have the same response rate, time to
progression with some sort of rigor and that I am less toxic.
So it is sort of a value judgment. You have already said--part of the committee said
they wouldn't take progression-free survival as a benefit anyway. On that basis, maybe it seems obvious. But the situation may be that you cannot deal
with survival here unless you beat the drug. So I would just like you to kind
of struggle with what we are struggling with.
DR. PRZEPIORKA: So,
if I can reinterpret the question, if you have a drug that is really not toxic
and it gives you the same response rate and time to progression as your current
standard which is, come in, get your white count wiped out and have lots of
nausea, vomiting and throwing up and, on the basis of numbers, response rate
and time to progression are exactly the same for the toxic and nontoxic drugs
and there is no way you could look at survival--
DR. WILLIAMS: We
have to go a little better than just on the numbers. We would have to satisfy Dr. Fleming they are
noninferior.
DR. PRZEPIORKA: But
there is no way you could look at survival in those patients because there is
just not enough. Would you be willing to
recommend approval?
DR. D. JOHNSON:
Regular approval.
DR. PRZEPIORKA:
Regular approval.
DR. WILLIAMS: Or
even accelerated approval. That would be
a possibility.
DR. SAXMAN: But that
is not exactly what this says. What this
says is that you cannot establish whether the experimental therapy retains the
survival benefit. So the confidence
intervals here are overlapping null.
DR. WILLIAMS: Well,
no. When we are talking about with
respect to survival, you are correct.
But we cannot establish it either because we don't have enough data or
because the effect is so poorly established historically that it could never be
practically done.
DR. TEMPLE:
Realistically, if you have a two-month survival, the lower bound for
the confidence interval is at somewhere less than that, and you want to preserve
50 percent of it, you would have to rule out a loss of half a month or
something. The size of study that could
do that is not really thinkable.
DR. B. JOHNSON: Can
you give us an example of the sizes? The
unspoken thing here is that it would take a huge trial to do that with a
two-month difference.
DR. WILLIAMS: I
would say 2,000 or 3,000. I don't know
what the statisticians would say.
DR. B. JOHNSON: Can
you give us an idea about the size we are talking about?
DR. FLEMING: It is
easier if you go with me for a moment.
It is easier to start with the perspective of survival and then move
into the perspective of time to progression.
But the size, just to jump ahead, of the trial is going to be dependent
on what alternative you are presuming.
The way this would frequently be done, if it were survival,
for example--let's suppose we have a three-month advantage in survival and it
is estimated with considerable precision, plus-or-minus a month. So it is three months, plus or minus a month.
Now, by the way, that clearly is going to be based on a
metaanalysis because three months plus-or-minus three months is what you get
when you have a p-value that is two-sided 0.05.
So you are talking about very strong evidence to be three months
plus-or-minus a month.
Then the typical approach is to say, all right, that means
it is at least two months. I will
preserve half the benefit so I will have a one-month margin.
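[Editor's sketch, not part of the spoken record. It works through the arithmetic just described under a normal approximation: an estimate whose 95 percent interval half-width equals the estimate itself corresponds to a two-sided p-value of about 0.05, and a three-month benefit known to plus-or-minus one month gives a two-month lower bound and, preserving half of it, a one-month margin. The function names are hypothetical.]

```python
from statistics import NormalDist

def two_sided_p(estimate, half_width, level=0.95):
    """Two-sided p-value implied by an estimate and the half-width of its
    confidence interval, using a normal approximation."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = half_width / z_crit
    z = abs(estimate) / standard_error
    return 2 * (1 - NormalDist().cdf(z))

def ni_margin(estimate, half_width, fraction_preserved=0.5):
    """Noninferiority margin: take the lower confidence bound of the
    historical benefit and allow losing (1 - fraction_preserved) of it."""
    lower_bound = estimate - half_width
    return lower_bound * (1 - fraction_preserved)

print(round(two_sided_p(3, 3), 3))  # three months plus-or-minus three months
print(ni_margin(3, 1))              # three months plus-or-minus one month
```

The first call gives a p-value of about 0.05 and the second a one-month margin, matching the figures in the discussion.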
DR. TEMPLE: In that
case, you could do it.
DR. FLEMING: In that
case, it is like the Aredia/Zometa example where this is the exact approach
that was used. But, clearly, it takes a
metaanalysis. There has to be
substantial evidence of some benefit.
However, I would even say here the sample size may not be
as horrendous as you would think because, if we are somewhat better, we can
rule out we are somewhat worse. There
was a noninferiority survival improvement and that was docetaxel against
Navelbine. In essence, the docetaxel median
survival was a month longer. You can
rule out that you are a month worse when you are a month longer without it
being an extraordinary sample size.
Where it becomes extraordinary is if you truly are not any
better and then you are having to rule out a small margin. Then it takes a big sample size.
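[Editor's sketch, not part of the spoken record. It illustrates why the required size depends on the assumed alternative, using a Schoenfeld-type approximation for the number of deaths in a 1:1 randomized trial. Everything numeric here is an assumption made for illustration: exponential survival (so the hazard ratio equals the ratio of medians), an eight-month control median, a one-month margin, one-sided alpha of 0.025 and 90 percent power. The function name is hypothetical.]

```python
from math import ceil, log
from statistics import NormalDist

def ni_events(control_median, margin_months, true_diff_months,
              alpha=0.025, power=0.90):
    """Approximate deaths needed to rule out that the new drug's median
    survival is `margin_months` shorter than control, when its true median
    is `true_diff_months` longer than control (1:1 randomization)."""
    # Exponential assumption: hazard ratio (new vs. control) = control median / new median.
    hr_margin = control_median / (control_median - margin_months)
    hr_true = control_median / (control_median + true_diff_months)
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    return ceil(4 * z ** 2 / (log(hr_margin) - log(hr_true)) ** 2)

same = ni_events(8, 1, 0)    # new drug truly no better than control
better = ni_events(8, 1, 1)  # new drug truly one month better
print(same, better)
```

Under these assumptions the count is on the order of 2,400 deaths when the new drug is truly no better, and on the order of 700 when it is truly a month better. These are deaths, not patients enrolled, and the figures shift with every assumption listed above.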
I would hope we would learn from experience, and I think we
are learning from experience. The
temptation is to say, if I have an effective standard of care and I can come
along with something that is less toxic, if the curves are overlapping, if
their time-to-progression curves, survival curves, whatever, it is very
tempting to say, come on; efficacy is the same and safety is better.
It brings me back to March 14, 1986 when ODAC was meeting
and we were looking at advanced breast cancer with adriamycin as the standard
and mitoxantrone was being considered and everybody was impressed by the fact
that it was less nausea, vomiting, cardiotoxicity, myelosuppression. The committee voted 9 to 2 in favor of
approval because there wasn't anything that was compellingly different in
survival.
Yet, the fact that the curves are close together doesn't
really mean we can rule out that it is worse.
Fortunately, Bob Temple and others at the FDA came back and said, let's
revisit this in a year. It was revisited
in December of '87 and, at that point, the differences were significant
favoring the control, now adriamycin, and the committee completely reversed its
vote and it was 11-nothing against approval.
The relevance of what we learned fifteen years ago was it
is important to understand what levels of rigor we have to have in order to
judge that we can rule out that it is meaningfully worse. These margins are not just a statistician's
configuration of something to make clinicians' lives complicated. It does do that, but there is much more of an
intention than that, and that is to be able to say, what is the difference
between evidence that looks consistent with noninferiority versus evidence that
really establishes noninferiority.
For superiority, if you had 30 patients on an arm and you
had a two-month survival difference, we wouldn't claim that superiority if the
p-value is 0.15. We have to be as
rigorous, if not more rigorous, in a noninferiority setting.
So the conclusions that are actually derived and the points
that are made in Paragraph 5 for Question 5 are relevant for Point 5 and Point
6. It is very important that we
understand that we have active comparators that truly provide substantial
benefit that is precisely estimated and where those estimates apply to the
setting in which the noninferiority trial is going to be done. That is called the constancy assumption.
A lot of methods are out there. The Rothman method was referred to by Mark
Scott in his open-session discussion. I
would just point out, that method or any other needs to adjust for the
constancy assumption. Mark Rothman
mentioned that to me also at lunchtime.
The method is now frequently being applied when it doesn't adjust for
the validity of the constancy assumption which, again, clinically means,
historically, I may have estimated my active comparator to have a certain level
of effect, but it may not have that level of effect as an imputed placebo in my
noninferiority trial if I have different sensitivities for efficacy, if I have
different ways of measuring, if I have different supportive care.
The analysis that is being brought before this committee, I
hope one question people would ask is, are we using rigorous methods to truly
rule out meaningful differences and is that constancy assumption factor being
factored in.
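[Editorial illustration, not part of the spoken record. One simple way to see how the constancy assumption enters a noninferiority margin is a fixed-margin calculation on the log hazard ratio scale. This is a minimal sketch under stated assumptions. It is not the Rothman method mentioned above, and the function name, the discount parameter, and all numbers are hypothetical. The historical effect is taken as the lower 95% confidence bound of the placebo-versus-active hazard ratio, shrunk by a discount to allow for a smaller effect in the current setting, and the margin is the part of that discounted effect the sponsor is willing to give up.]

```python
import math

def ni_margin(hist_hr_lower, preserve_fraction, constancy_discount):
    """Noninferiority margin for the hazard ratio of new vs. active control.

    hist_hr_lower      -- lower 95% confidence bound of the historical hazard
                          ratio, placebo vs. active (above 1 means the active
                          comparator showed benefit)
    preserve_fraction  -- fraction of the comparator's effect that must be kept
    constancy_discount -- fraction by which the historical effect is shrunk
                          because it may not carry over to the current trial
    All three inputs are illustrative assumptions.
    """
    if hist_hr_lower <= 1.0:
        raise ValueError("historical benefit is not established; no margin can be justified")
    if not (0.0 <= preserve_fraction < 1.0 and 0.0 <= constancy_discount < 1.0):
        raise ValueError("fractions must lie in [0, 1)")
    discounted_log_effect = (1.0 - constancy_discount) * math.log(hist_hr_lower)
    allowed_loss = (1.0 - preserve_fraction) * discounted_log_effect
    return math.exp(allowed_loss)

# Hypothetical historical lower bound of 1.30, preserving half of the effect.
margin_no_discount = ni_margin(1.30, preserve_fraction=0.5, constancy_discount=0.0)
margin_discounted = ni_margin(1.30, preserve_fraction=0.5, constancy_discount=0.25)
print(margin_no_discount, margin_discounted)
```

[Under these assumptions, discounting for doubts about constancy always tightens the margin, so a trial that passes against the undiscounted margin may fail once the discount is applied. If the historical lower bound does not exceed 1, the sketch refuses to produce a margin at all, which mirrors the requirement that the comparator's benefit be substantial and precisely estimated.]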
Moving to Question 6, we make our life far more complicated
when we now try to do a noninferiority analysis on a surrogate endpoint. That is where we are in Question 6. If one is looking at ruling out a certain
level of difference in time to progression--let's say you have got these
combination regimens that have been established in first-line as standard of
care on survival and we now want to look to see whether we are not meaningfully
worse in time to progression.
We are not even saying are we better. We are saying, are we not meaningfully
worse. Then what we have to be able to
say--I have registered concerns in using time to progression as a superiority endpoint
because I haven't seen the evidence here presented that indicates that if we
achieve a certain difference, beneficial effect in time to progression, that
reliably means a treatment-induced effect in survival.
To answer Question 6 positively, you need far more
information. You have to be able to know
that if you give up a certain fraction of the benefit in time to progression,
that will translate into the fraction of survival benefit that you are willing
to give up. That type of functional
relationship is extraordinarily hard to get at.
We talked about lipids as an example where FDA has used
this as an acceptable surrogate. We have
myriads of studies showing you can get a 10 percent reduction in
cholesterol. It doesn't provide any kind
of benefit. But a 30 to 40 percent does
provide major benefit.
You have got to understand the functional relationship that
says how much time to progression difference translates into the amount of
survival difference I am willing to give up.
I would argue that is wishful thinking.
That level of insight and the data that we would need to be able to do
that just doesn't exist.
DR. PRZEPIORKA: I
think a key question here that he brought out was making sure that survival
doesn't pay the price. If there is a way
that you could keep the confidence intervals--or predict how much you have to
keep the confidence intervals down so that you don't lose survival, if you know
the correlation between the surrogate and survival, that would be one way to
say, okay; it is kind of safe to do this since it is less toxic.
But if you can't predict, I think everybody would have a
difficult time knowing the history of the drugs that we have seen in the long
run to say yes, this would probably be okay to approve.
DR. TEMPLE: In some
ways, probably the example we are more likely to see is where response rates
may be a little better, time to progressions may be a little better than the
control and we don't really have much data on survival. That would raise an interesting question
about accelerated approval, I think.
That is probably more likely to face us.
It is not easy for me to imagine how we would be able to do
successful noninferiority on time to progression if we didn't have a clue about
survival. I am not sure how you could do
that.
DR. PRZEPIORKA: I
guess from our earlier discussion that if this little bit better is less than
three months, as far as we are concerned, it is not inferiority, it is not
superiority.
DR. TEMPLE:
Right. Thanks.
DR. WILLIAMS:
Perhaps we could go to the last area about symptoms again and have a
little bit of discussion. We have heard all
of Dr. Gralla's presentation about the merits of these endpoints, but what are
we ready for now and how should they be used in the studies we are doing? Do we think that they are ready to be a
primary endpoint? Is there a specific
area we need to go with these endpoints?
Do we need to include them in all the studies?
DR. PRZEPIORKA: We
will go ahead and go through the second No. 7 and No. 8. But, before we do that, I just wanted to make
a statement of concern that I had regarding the meaning of validation in these
quality-of-life tools that are used since they seem to be validated against
other quality-of-life tools.
I work with these patients.
I understand their quality of life needs to be good but what is the
definition of quality of life? I sit in
the chair under a cover and don't move but my pain is better, or is it that I can
take the cover off, fold it up, do some laundry? So I am disappointed to hear that these are
not validated against a functional scale which I think would be a meaningful
clinical benefit.
DR. GRALLA: I'm
sorry, but I think that is incorrect.
These are not, but my pain is a little better, I am shivering. How much pain do you have? None at all or as much as it could be? These are validated in ways that are quite
clear. If you, certainly, take an
example of the FACT-L, there is looking at physical symptoms and how that
affects functionality. So these are
strongly validated. The Melzack-McGill scale
looks at these issues and looks at the quality of pain.
We don't look at the quality of pain. So there is strong correlation with these if
you look at how they are looked at. For
instance, the observer scale, as part of the LCSS, correlates with what type of
pain medicine you now need. So, have you
gone more or less down the WHO ladder or are you just taking Tylenol?
So these are validated in ways that correlate with
function, et cetera, but the main answer is do they tell you whether a person
has pain or not. So a pain questionnaire
answers the pain question. They have predictive
validity for survival and maybe even for response as well.
But we don't ask that of survival, does this give us a
function answer. We don't ask it of
response, does it give us a function answer.
Now we are asking of pain? I
think the validity methods, the gold standards that are used are those that are
used elsewhere and that, if you look at the function analysis, especially in
the FACT, it really gives you a lot of information as to how people
function. And they all correlate with
performance status.
DR. PRZEPIORKA: That
was not clear in your presentation, but we would certainly like to know more
about how the quality-of-life scales predict function. I think that is really important.
DR. GRALLA: Again,
if you just look at the validation study for the FACT-L, and it would have
taken half an hour to discuss that alone, you can see that it is divided into
social functioning, physical functioning, psychological functioning. All these areas are right there. So these address exactly the points that you
wish to look at.
DR. PRZEPIORKA: Dr.
Bonomi?
DR. BONOMI: In the
FACT instrument, they have a thing Dr. Cella calls a Trials Outcome
Index. It has 21 questions and it
addresses the things that Dick just talked about. It has lung-cancer symptoms. It has functional symptoms. And it does get all of that stuff. In fact, David alluded to it earlier. That was the best predictor of survival in
the study, the 5592 study. It was better
than performance status; the initial Trials Outcome Index score was the best
predictor.
The problem is, and this I would like to raise, it has 21
questions that the patients have to answer.
I think that the things that are probably most valuable are the
lung-cancer symptom scale or the FACT-L which is just seven questions about
lung-cancer symptoms.
That is something you can get pretty reliably. You start going to 21, it starts getting a
little tougher. But maybe Dick has a
comment about that.
DR. GRALLA: I
agree. If you want detail--there are
always tradeoffs. How much detail do you
wish to have? If you will accept the
fact that these validity studies that are done and published in the
psychometrics that show all these outcomes that you want, and are boring as
hell to read, these 20-page papers, or whatever, they go into these issues.
The question is when you get this ready for prime time, you
don't want to be doing all those scales that they did because there is
correlation with each one of these areas.
So Dave Cella has developed this 7-question subscale which some people
like, et cetera. The LCSS, which is
supposed to address these, has only nine items to be done.
So these get to the questions, is there really pain relief,
et cetera. The basis has already been
done as to what this means to patients.
There is a lot of information on that.
It is like looking at a CAT-scan and saying, but how do I know it really
works each time. There are other studies
that have shown what it really means, as far as that is concerned.
So I would have to say that if you want to look at the full
scale which is what really Phil is talking about, and you take the TOI out
of that, you get 21 items, et cetera.
You can do these. But you can get
answers that tell you that patients are improving in the areas that are most
important to patients just by using the smaller areas. For the LCSS, it is the whole instrument. For the FACT-L and for the EORTC, it is a
subscale.
DR. PRZEPIORKA: Dr.
Temple?
DR. TEMPLE: One of
the areas in which we think we have made progress is we don't call these
quality-of-life scales anymore. We call
them patient-reported outcomes because quality of life captures--you have got
to check the spiritual nature of it all and we are not so sure cancer treatment
fixes that.
But we think it is at least plausible that it might fix a
good scale of lung-cancer symptoms. So
the focus there is on those and they have a certain amount of face
validity. They seem at least as valid as
the typical questions a physician will put to the patient, like, how is your
breathing or how are you feeling.
They are pretty solid.
Those seem like the most promising things. Whether performance in the community for
someone with advanced lung cancer is as relevant as how is your breathing these
days, I think could be debated. But at
least some of them seem very plausible on their face and we would be very happy
to see effects on those things, I think.
DR. PRZEPIORKA: I
think a question came up earlier regarding performance status II patients and
whether or not there should be quality-of-life instruments or PROs as a
primary outcome for studies in that subset of patients with lung cancer. Any comments?
DR. D. JOHNSON: You
mean as separate studies altogether and is it something that is valid. I think the answer to that is yes. We have data from, again, prospective
studies, one from Michael Cullen which I think is a really nice trial that was
done in the U.K. in which they included patients with advanced disease who had
performance status II, and they did patient-reported-outcome analyses.
What he demonstrated was what ECOG, SWOG and CALGB and
others have demonstrated, that the better your performance status at diagnosis,
the greater is your "survival benefit." Again, just to give everybody some baseline
data who aren't lung-cancer docs here, if you get a platinum-based therapy and
you are Stage IV, your median survival will be nine months if you are 0
performance status, six months if you are 1, and three months if you are
2. That I call my Rule of 3s.
In Cullen's study, he showed really exactly the same
thing. It was the exact reverse of that
in terms of symptom benefits. Obviously,
if you are asymptomatic, you can't get better.
You can't get more asymptomatic.
The amount of benefit in terms of symptom improvement was
greatest in the patients who were PS-2.
So there was a balance. Their
survival benefit was not as great. It is
one-and-a-half months to two months with no treatment, three months to four
months maximally with treatment.
But, by contrast, their improvement, however you chose to
define that, was a higher percentage of improvement relative to the PS-1
patients although their survival, the PS-1s, was better than the PS-2s. That makes sense. The more symptomatic you are, the more likely
you are to improve.
DR. GRALLA: Dr.
Przepiorka, in the validation studies, Dr. Holland, in Cancer in 1994, looked
at "known groups." So we know
that survival varies by each decline of the Karnofsky scale. So she looked at very low performance-status
group patients, performance status 30 to 50.
She found validity for the very low performance-status group, the
median, the Karnofsky 50 to 70, and then the better 80 to 100. So part of the validation is looking at known
groups and then seeing if this goes true.
This sort of paradoxical finding that David has explained
to us seems to exist through that as well.
So these instruments have all looked at those groups and these
instruments, to some degree, have been looked at in the hospice population as
well.
DR. PRZEPIORKA: Are
there any other lung-cancer settings where the symptom-based endpoints can then
serve as the primary endpoint for approval?
DR. B. JOHNSON: One
of the things I would like to address is one of the reasons why--we work quite
a bit with mesothelioma patients. One of
the reasons why they generated that--why the symptom scale--and we participated
in that study where we assessed that--is that it is very difficult to assess
responses in mesotheliomas because it is pleural based and you can't do RECIST
criteria. So you have either got to come
up with a new way of doing it which has since been better validated.
But when those trials started, they didn't really
exist. So they embedded that symptom
scale in there. We got experience doing
it and I agree with Dick. I think that
was one of the first times we were really consistent about it and got it short
enough so the patients could reproducibly do it.
And so mesothelioma would be a very good one to take a look
at. But the thing that happened there is
that the symptoms very closely paralleled what they saw radiographically which
is what you see in almost every situation.
The other thing that happened that we learned in there is
that, and this may be shocking to some people, but they don't always tell the
doctor everything. If you took a look at
what they filled out, they say, I feel great.
Everything is going wonderful.
And they have got it all maximally symptomatic.
So it does collect information, no matter how thorough we
try to be, that does not otherwise exist in the medical record.
DR. PRZEPIORKA: That
is No. 7. Moving on to No. 8, discuss
the role of quality of life as a drug-approval endpoint. Are quality-of-life results meaningful in
single-arm studies? I think Dr. Gralla
actually addressed that a little, if he wants to reiterate his opinion.
DR. GRALLA: My
opinion on this would be that it is very interesting to see it is exploratory,
but, for drug approval, I have real difficulty with it.
DR. PRZEPIORKA: Does
anyone disagree with that? Okay. We also talked about blinding a little bit so
I will skip b. and go to c.; should quality-of-life instruments be routinely
included in lung-cancer studies and, if so, which ones.
DR. B. JOHNSON: If
it is routine, then why would you have to pick them?
DR. D. JOHNSON:
Actually, I am not sure I would mandate that they be included. There are circumstances where, if we are
curing 100 percent of the patients and their quality-of-life drops a little
bit, I think they might accept that to some degree. I am being facetious, but I do think that
there are circumstances where quality of life is really not going to be
necessarily beneficial to the outcome of the trial.
Again, if you are powering for survival benefit, it seems
to me redundant to look at the quality of life and then try to come in for a
drug approval on the basis of that later on, as a secondary endpoint. Now, maybe FDA would feel differently about
that, but, to me, if you want to use it, you should use it in the proper way.
DR. PRZEPIORKA: Dr.
Bonomi.
DR. BONOMI: I agree
with Dr. Johnson completely. I would not
make it mandatory. You would pick it
and, if it is your primary objective, great, and make it simple. It has got to be simple, lung-cancer symptom
scale or something like that.
DR. PRZEPIORKA: Dr.
Saxman.
DR. SAXMAN: I agree
with Dr. Johnson.
DR. PRZEPIORKA: Dr.
Ettinger?
DR. ETTINGER: I
agree.
DR. PRZEPIORKA: Do
you have other questions you want us to look at?
DR. WILLIAMS:
No. I just want to thank
everybody for all their input. I think
it has been a great discussion. It is a
great way to sort of kick off the endpoints process.
DR. PRZEPIORKA: Ms.
Ross.
MS. ROSS: Thank you,
Madam Chair. Would it be in order for
me to make a motion to have a vote on objective response rate as an acceptable
endpoint for accelerated approval?
DR. WILLIAMS: We
have already used it. The reason we
didn't ask is because we already did it with Iressa. I guess you could. The only thing that could happen is that it
would turn around that decision which isn't what you want, I don't think.
MS. ROSS: Drop the
motion. Okay. Thank you.
DR. PRZEPIORKA: Just
as a point of information, our next meeting will be March 4 and it will be one
day. It might be one day, it might be
two, but it is a different day than originally planned so please check your
calendars and this meeting is now adjourned.
Thank you.
[Whereupon, at 4:43 p.m., the meeting was adjourned.]
- - -