Type: Article
Publication Date: 2019-10-24
Citations: 78
DOI: https://doi.org/10.1200/jco.19.01681
Evaluation of new anticancer therapies in randomized clinical trials (RCTs) is typically based on comparing a new treatment with a standard one, using a time-to-event end point such as overall survival or progression-free survival (PFS). Although the statistical framework underlying the design of these RCTs is centered on formal testing of a treatment effect, methods for estimation (quantification) of the treatment benefit are also specified. Currently, log-rank statistical tests and/or proportional hazards models are commonly used for the trial design and primary analysis. These methods are optimized for treatment effects that do not change substantially over time (the proportional hazards assumption). The introduction of immunotherapeutic agents with potentially delayed treatment effects has renewed interest in statistical methods that can better accommodate general departures from proportional hazards and, particularly, a delayed treatment effect. This has led to considerable attention to, and some controversy about, the appropriate statistical methodology for comparing survival curves, as demonstrated by the comments and replies on trial reports1-24 and at a Duke–US Food and Drug Administration workshop25 that offered alternatives to the standard log-rank/hazard-ratio methodology. Although these new methods could be useful, as outlined in comprehensive reviews,26-30 we offer a caution about the limitations of some of these methods in translating statistical evidence into clinical evidence, both for formal treatment-effect hypothesis testing and for estimation (when used for the primary analysis).
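To make the issue concrete, the following is a minimal simulation sketch, assuming Python with the numpy, pandas, and lifelines packages (illustration-only choices, not tools used in the article). Two arms are simulated with a hypothetical 6-month delayed benefit; the standard log-rank test and Cox hazard ratio are then computed, together with a restricted-mean-survival-time difference as one example of an alternative summary of the kind discussed in the cited reviews. All numbers (sample size, medians, delay, 24-month horizon) are arbitrary.

```python
# Minimal sketch: how a delayed treatment effect interacts with the standard
# log-rank/hazard-ratio analysis. Assumes numpy, pandas, and lifelines;
# all data are simulated and hypothetical.
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test
from lifelines.utils import restricted_mean_survival_time

rng = np.random.default_rng(0)
n = 300  # patients per arm (arbitrary)

# Control arm: exponential survival with a median of about 12 months.
t_control = rng.exponential(scale=12 / np.log(2), size=n)

# Experimental arm: no benefit before month 6, hazard reduced afterwards
# (a piecewise-exponential stand-in for a delayed treatment effect).
pre = rng.exponential(scale=12 / np.log(2), size=n)
post = 6 + rng.exponential(scale=24 / np.log(2), size=n)
t_treat = np.where(pre < 6, pre, post)

# Administrative censoring at 30 months of follow-up.
def censor(t, cutoff=30):
    return np.minimum(t, cutoff), (t <= cutoff).astype(int)

T0, E0 = censor(t_control)
T1, E1 = censor(t_treat)

# Standard (unweighted) log-rank test: most powerful under proportional
# hazards, but early follow-up with no separation dilutes its power here.
lr = logrank_test(T0, T1, event_observed_A=E0, event_observed_B=E1)
print(f"log-rank p-value: {lr.p_value:.3f}")

# Cox model: the single reported hazard ratio averages a ratio that is
# roughly 1 before month 6 and below 1 afterwards.
df = pd.DataFrame({"T": np.concatenate([T0, T1]),
                   "E": np.concatenate([E0, E1]),
                   "treat": [0] * n + [1] * n})
cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
print(f"estimated hazard ratio: {cph.hazard_ratios_['treat']:.2f}")

# One alternative summary: difference in restricted mean survival time,
# which remains interpretable without the proportional hazards assumption.
km0 = KaplanMeierFitter().fit(T0, E0)
km1 = KaplanMeierFitter().fit(T1, E1)
tau = 24
rmst_diff = (restricted_mean_survival_time(km1, t=tau)
             - restricted_mean_survival_time(km0, t=tau))
print(f"RMST difference at {tau} months: {rmst_diff:.1f} months")
```

The sketch is only intended to show mechanically how a delayed effect interacts with these summary measures; it does not reproduce any analysis from the trials or reviews cited above.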