Ask the experts: missing data
18 September 2009


In April 2009, the European Medicines Agency issued an updated biostatistical guideline on Missing Data. Alan Phillips discusses some of the critical issues associated with missing data when designing and analysing clinical trials.

Why is missing data important to consider when designing the study?

Missing data are a potential source of bias when analysing clinical trials. The extent to which missing data may lead to biased study results is influenced by many factors, among them the relationship between the mechanism causing the missing data, treatment assignment, and outcome. In the design and conduct of a clinical trial it is therefore imperative that all efforts are directed towards minimising the amount of missing data likely to occur.

What practical steps can be taken to avoid the presence of missing data?

There are many reasons for missing data – such as patient refusal to continue in the study, treatment failures or successes, adverse events – only some of which are related to study treatment. In fact, most clinical trials will be subject to missing data. Some practical suggestions to minimise the amount of missing data include:

1. Proactively plan for missing data in the protocol; for example, unambiguously state the objectives of the study, the patient population of interest and how missing data may impact any inferences to be made.
2. Consider a two-step withdrawal process for patients: withdrawal of consent for treatment and withdrawal of consent from observation. Once a patient has withdrawn consent for treatment, only assessments needed to address key efficacy and safety questions of interest should be undertaken. Continuing to monitor patients after withdrawal of treatment can raise its own issue: patients may switch treatments, which can confound the treatment effects and make the study results difficult to interpret. Nevertheless, in a large number of cases the advantages of implementing a two-step withdrawal process outweigh the disadvantages.
3. In any clinical trial there will always be ‘necessary’ and ‘unnecessary’ discontinuations. For ethical reasons, a trial must always be designed to permit ‘necessary’ discontinuations, such as allowing a patient to discontinue because of lack of efficacy or an adverse event. These outcomes in themselves are often useful when assessing a treatment’s effectiveness and safety. However, ‘unnecessary’ discontinuations, such as ‘lost to follow up’, can be reduced through stricter inclusion or exclusion criteria. The disadvantage of the approach is that it can reduce the ability of the trial findings to be generalised – although of course the occurrence of missing data will have an impact on this anyway.

What methods can be used to understand the nature of missing data?

Understanding patient withdrawal patterns and the associated missing data mechanism starts with the collection of relevant information. In a large number of clinical trials standard withdrawal or discontinuation Case Report Forms (CRFs) are employed. These have prescribed standard lists of reasons for withdrawal, such as adverse event, lack of efficacy, ‘lost to follow up’, etc. Trialists often don’t give enough thought to customising these CRFs for the disease under consideration or the study objectives; for example, how often are disease- or study-specific reasons included? To illustrate the point, consider ‘lost to follow up’ in oncology trials. What does this actually mean? Should study-specific reasons be provided to better understand what happens to these patients?

What statistical methods can be used to understand the causes of missing data?

Trialists often investigate the observed patterns of missing data to provide information relating to the missing data mechanism. Typical analyses include investigating:

• Differential timing of discontinuation by treatment group
• Differential reasons for discontinuation by treatment group, by time, or by treatment and time
• How baseline and/or post-baseline characteristics of those who discontinue differ from those who complete a trial.
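
As a minimal sketch of the second analysis in the list above, discontinuation reasons can be cross-tabulated by treatment group. The records, group names and reasons below are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical withdrawal records: (treatment group, reason for discontinuation).
withdrawals = [
    ("test", "adverse event"),
    ("test", "adverse event"),
    ("test", "lost to follow up"),
    ("control", "lack of efficacy"),
    ("control", "lack of efficacy"),
    ("control", "lack of efficacy"),
    ("control", "adverse event"),
]

# Count each (group, reason) pair; pairs that never occur count as 0.
counts = Counter(withdrawals)
groups = sorted({group for group, _ in withdrawals})
reasons = sorted({reason for _, reason in withdrawals})

# Print one row per reason and one column per treatment group.
print(f"{'reason':<20}" + "".join(f"{group:>10}" for group in groups))
for reason in reasons:
    row = "".join(f"{counts[(group, reason)]:>10}" for group in groups)
    print(f"{reason:<20}{row}")
```

A marked imbalance between groups in such a table, for example adverse events concentrated in one arm and lack of efficacy in the other, is a signal that the missing data mechanism may differ by treatment.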

Graphical display is one of the most important tools available to statisticians when trying to understand the causes of missing data. Although analytical methods exist for exploring missing data, a large amount of information can be ascertained by simply plotting the data (eg, Kaplan-Meier plots to look at time to withdrawal both overall and for specific reasons). The key to success is thinking through the question of interest and intelligently plotting the data. It is often useful to complement such graphs with logistic regression to explore predictors of dropouts. This is especially true when trying to identify key predictors from a set of candidate predictors.
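
The Kaplan-Meier estimate behind such a plot can be sketched in a few lines. This is an illustrative implementation using only the Python standard library, with hypothetical follow-up data; the function name `kaplan_meier` and its arguments are assumptions made for this example, and in practice a statistical package would be used and the resulting curve plotted by treatment group.

```python
from collections import Counter

def kaplan_meier(times, withdrew):
    """Return [(time, estimated probability of still being on study)].

    times    -- follow-up time for each patient (e.g. weeks on study)
    withdrew -- True if the patient withdrew at that time,
                False if they completed or were otherwise censored
    """
    events = Counter(t for t, w in zip(times, withdrew) if w)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    remaining = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients who did not withdraw.
            remaining *= 1 - d / at_risk
            curve.append((t, remaining))
        at_risk -= leaving[t]
    return curve

# Hypothetical data: weeks on study and whether the patient withdrew.
weeks = [2, 4, 4, 8, 12, 12, 12, 12]
withdrew = [True, True, False, True, False, False, False, False]

curve = kaplan_meier(weeks, withdrew)
for t, s in curve:
    print(f"week {t:>2}: {s:.3f} still on study")
```

To look at withdrawal for a specific reason, the same function can be reused by marking only withdrawals for that reason as events and treating all other withdrawals as censored.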

Regulators have stated on numerous occasions that missing data from patients who drop out are different from other types of missing data. What are the principles for handling different types of missing data?

The key issue when handling any missing data is understanding the mechanism causing the missing data. It is essential that the proposed method of analysis and the associated handling of missing data, regardless of whether the patient discontinued or not, are directly linked to and properly reflect the original objectives of the study, including any assumptions made when designing the trial. Specifically for patients who withdraw, the critical question is what information needs to be collected after discontinuation, as such patients will occur in every trial. How missing data are handled is an integral part of the description of the primary comparison.

What assumptions are made for the missing data mechanism for the commonly used statistical methods?

The statistical techniques developed for handling missing data usually assume the missing data mechanism can be one of the following.

• Missing completely at random (MCAR)
• Missing at random (MAR)
• Missing not at random (MNAR).

MCAR assumes the missing value mechanism is unrelated to the observed or unobserved responses, or to other measurements such as baseline values and treatment group. In particular, the probability that an observation is missed does not depend on how big or small it would have been if observed or on the size of the previous or subsequent observations on the same or any subject.

MAR assumes the missing value mechanism is dependent on observed measurements, including responses, but given these measurements, there is no remaining dependence on unobserved responses. The concept of MAR is most simply explained in the context of patient dropout in a longitudinal study. Suppose two patients share the same treatment and covariates, and exactly the same response measurements up to the point at which one drops out and the other remains. The missing data from the subject who drops out are then MAR if they have the same statistical behaviour as the observations from the subject who remains.

MNAR assumes that after accounting for observed measurements, there remains dependence between the missing value mechanism and the unobserved responses. Under MNAR a valid analysis does require knowledge of the specific form of the missing value mechanism, but in practice we will almost never know this mechanism.
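
The practical difference between the three mechanisms can be illustrated with a small simulation. The sketch below is a toy example under stated assumptions, not a recommended analysis: the data-generating model, the dropout probabilities and the names `observe` and `adjusted_mean` are all hypothetical. Each patient has a fully observed baseline value and a response that may be missing. The simulation compares the simple mean of the observed responses with a mean adjusted for the baseline stratum.

```python
import random
from statistics import mean

random.seed(2009)
N = 20000

# Hypothetical patients: (baseline, response), with the response correlated with baseline.
patients = []
for _ in range(N):
    baseline = random.gauss(0, 1)
    response = baseline + random.gauss(0, 1)
    patients.append((baseline, response))

# Probability that the response is missing under each assumed mechanism.
mechanisms = {
    "MCAR": lambda b, r: 0.3,                   # unrelated to any data
    "MAR": lambda b, r: 0.6 if b > 0 else 0.1,  # depends on the observed baseline only
    "MNAR": lambda b, r: 0.6 if r > 0 else 0.1, # depends on the unobserved response
}

def observe(p_missing):
    """Keep only the patients whose response is observed."""
    return [(b, r) for b, r in patients if random.random() >= p_missing(b, r)]

def adjusted_mean(observed):
    """Average observed responses within baseline strata, weighted by all patients."""
    total = 0.0
    for in_stratum in (lambda b: b > 0, lambda b: b <= 0):
        weight = sum(in_stratum(b) for b, _ in patients) / N
        total += weight * mean(r for b, r in observed if in_stratum(b))
    return total

true_mean = mean(r for _, r in patients)
results = {}
for name, rule in mechanisms.items():
    observed = observe(rule)
    simple_bias = mean(r for _, r in observed) - true_mean
    adjusted_bias = adjusted_mean(observed) - true_mean
    results[name] = (simple_bias, adjusted_bias)
    print(f"{name}: simple bias {simple_bias:+.3f}, adjusted bias {adjusted_bias:+.3f}")
```

In this toy setting the simple mean is approximately unbiased only under MCAR, adjusting for the observed baseline removes the bias under MAR, and neither estimate recovers the true mean under MNAR.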

Which statistical method is commonly used to address missing data issues?

Unfortunately, there is no single methodological approach for handling missing values that is universally accepted in all situations. Commonly used methods include single imputation methods such as last observation carried forward (LOCF), mixed models for repeated measures (MMRM) and multiple imputation (MI). Whichever method is employed it is important – given the study design assumptions – that an analysis is used that does not bias the treatment comparison in favour of the test treatment.

What are the underlying assumptions of single imputation methods such as LOCF? What are the advantages/disadvantages of the approach?

Single imputation methods such as LOCF make an implicit assumption that the patients would sustain the same response seen at an early study visit for the entire duration of the trial. The assumption is untestable and potentially unrealistic.

Single imputation methods tend to be simple to implement, and the imputation process is easy for non-statisticians to understand. However, even the strong MCAR assumption does not guarantee that an LOCF analysis is valid. Further, the methods suffer from the key disadvantage of systematically overstating the precision of any treatment effect: the imputed values are analysed as if they had been observed, so the uncertainty of imputation is not taken into account.
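
To make the LOCF assumption concrete, the following minimal sketch carries the last observed value forward for a single hypothetical patient. The function name `locf` and the scores are illustrative only.

```python
def locf(visits):
    """Fill each missing visit (None) with the most recent observed value."""
    filled = []
    last = None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Hypothetical symptom scores (lower is better) at baseline and four visits.
# The patient improves early, then drops out after the second post-baseline visit.
scores = [30, 24, 20, None, None]

completed = locf(scores)
print(completed)  # [30, 24, 20, 20, 20]
```

The two imputed values of 20 enter the analysis exactly like observed data, which is the source of both the implicit "response is sustained" assumption and the unacknowledged imputation uncertainty.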

What are the underlying assumptions of mixed models for repeated measures (MMRM) methods? What are the advantages/disadvantages of the approach?

MMRM analyses make the assumption that data are MAR. In an MMRM analysis, information from the observed data is used via the within-patient correlation structure to provide information about the unobserved data, but the missing data are not explicitly imputed.

MMRM methods provide unbiased treatment estimates under the MAR assumption. They estimate the treatment effects assuming the withdrawn patients have the same statistical behaviour as those who continued. That is, MMRM assumes that the data observed until the point of discontinuation are a valid predictor of the unobserved data.

What are the underlying assumptions of Multiple Imputation (MI) methods? What are the advantages/disadvantages of the approach?

Like MMRM, MI analyses also make the assumption that data are MAR. In MI, the imputation step is separate from the modelling step, and so there is additional flexibility to explore different assumptions about the nature of the missing data. If this flexibility is not used then MI may in some circumstances give essentially the same results as MMRM, and so offers no advantage over that method. From a regulatory perspective, MI appears to raise the same concerns as MMRM.
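
The article does not describe how the results from the separately imputed datasets are brought back together. The standard approach is Rubin's rules, sketched below with hypothetical numbers: the pooled estimate is the average across imputations, and the total variance adds the between-imputation variability to the average within-imputation variance, so that imputation uncertainty is reflected in the final result. The function name `pool` and the inputs are assumptions for illustration.

```python
from statistics import mean, variance

def pool(estimates, variances):
    """Combine per-imputation estimates and variances using Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance within each imputed dataset
    between = variance(estimates)  # sample variance of the estimates across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical treatment-effect estimates and their variances from five imputed datasets.
estimates = [2.0, 2.4, 1.8, 2.2, 2.1]
variances = [0.25, 0.24, 0.26, 0.25, 0.25]

pooled, total = pool(estimates, variances)
print(f"pooled estimate {pooled:.2f}, total variance {total:.2f}")
```

Here the total variance is larger than any single within-imputation variance, in contrast to single imputation where the between-imputation component is ignored.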

Since all approaches seem to have significant drawbacks from a regulatory perspective, what method should be used in regulatory submissions?

One of the main issues when determining how to handle missing data is that the true missing data mechanism will always be unknown and untestable from the data, and no amount of clever modelling can overcome this. If the mechanism for missing data is informative then it may not be possible to fully evaluate the impact of the treatment on the missing data in the analysis, and this must be carefully considered in the interpretation of the data. Therefore the key issues are what questions are being answered by the analysis for the trial, and under what assumptions the proposed analysis answers those questions. Doubts about aspects of the assumptions can be addressed through appropriate sensitivity analyses. While the use of these analyses needs to be approached carefully, when focused on specific assumptions they can help determine the robustness of the inferences from a clinical trial.
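
One possible form of sensitivity analysis focused on a specific assumption, offered here as an illustration rather than as the approach endorsed by the guideline or the PSI Expert Group, is a delta adjustment: values imputed for test-arm dropouts under MAR are made progressively worse by an amount delta, and the treatment difference is recomputed each time. All data and names below are hypothetical, and the sketch looks only at the point estimate, whereas a real analysis would also track confidence intervals or p-values.

```python
from statistics import mean

def treatment_difference(test_observed, test_imputed, control, delta):
    """Mean difference (test minus control) after worsening imputed test-arm values by delta."""
    shifted = [value - delta for value in test_imputed]
    return mean(test_observed + shifted) - mean(control)

# Hypothetical improvement scores (higher is better).
test_observed = [5.0, 6.0, 7.0, 6.0]
test_imputed = [6.0, 6.0]                 # imputed for dropouts under a MAR assumption
control = [4.0, 5.0, 4.0, 5.0, 4.0, 5.0]  # treated as complete to keep the sketch short

differences = {}
for delta in (0, 1, 2, 3, 4, 5):
    differences[delta] = treatment_difference(test_observed, test_imputed, control, delta)
    print(f"delta {delta}: treatment difference {differences[delta]:+.2f}")
```

The size of delta needed to remove the apparent benefit can then be judged for clinical plausibility: if only an implausibly large departure from MAR overturns the conclusion, the inference is more robust.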

Unfortunately there is no universally applicable method of handling missing values. This means the different approaches may of course lead to different results.

Alan Phillips, on behalf of PSI Expert Group. The material in this article is based on both the updated guideline, and discussions and agreements from the PSI Expert Group on missing data.


© 2006 TransEuroMedia Ltd. All rights reserved. Reproduction in whole or in part is prohibited.