By Ruth Whittington

Part 4 – Interpreting the results of observational studies

In the last of our series on observational studies, this article addresses some of the issues involved in interpreting the results of such studies.

When interpreting or reviewing the results of any study or trial, there are important questions that should be kept in mind to ensure objective and balanced assessment of both the results and the possible implications for future healthcare.

Some of these questions apply to the conduct and design of the trial, for example:

  • was the study conducted in an ethical manner?
  • was the design of the study appropriate to answer the research questions that were asked?
  • were the research questions clinically relevant and scientifically valuable?

Other questions concern the analysis and interpretation of the data. Bias, confounding factors, heterogeneity of the patient groups, and statistical power can all affect the interpretation and implications of the results. These potential influences need to be examined closely.

Bias
Bias is a systematic error, arising for example from preconceptions or from the way patients are selected or outcomes are measured, that leads to incorrect conclusions about the effects of treatment. It is important to avoid bias in health research because it distorts outcomes: it could even result in an unsafe or ineffective treatment being licensed for use, or in useful treatments being overlooked. In RCTs, randomisation protects against bias in the allocation of treatment; in observational studies, statistical analyses can minimise its effects but cannot be relied on to remove them entirely.

Confounding factors
‘Confounding’ occurs when a factor other than the treatment in question influences the outcome and is also associated with which treatment a patient receives. This can lead to erroneous conclusions, particularly in an observational study. For instance, patients with the worst prognosis may be systematically allocated to a particular treatment, so that the treatment appears to perform worse than it really does. It is possible to control for those confounding factors that are known and measured, but it may not be possible to control for all confounding factors in an observational study.
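
The worst-prognosis example can be made concrete with a small sketch. The counts below are invented purely for illustration, and the Mantel-Haenszel method is only one common way of adjusting for a known confounder; the original article does not prescribe a particular technique. Within each severity group the treatment makes no difference to the odds of a poor outcome, yet the unadjusted comparison suggests it is harmful, because most treated patients were severely ill.

```python
# Hypothetical counts, invented for illustration only.
# Each stratum: (treated with event, treated without event,
#                untreated with event, untreated without event)
strata = {
    "mild":   (10, 90, 40, 360),
    "severe": (200, 200, 50, 50),
}


def odds_ratio(a, b, c, d):
    """Odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)


def crude_odds_ratio(tables):
    """Odds ratio after pooling all strata, ignoring the confounder."""
    a, b, c, d = (sum(column) for column in zip(*tables))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel odds ratio, adjusted for the stratifying factor."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


tables = list(strata.values())
crude = crude_odds_ratio(tables)
adjusted = mantel_haenszel_odds_ratio(tables)

for name, table in strata.items():
    print(f"{name:>6} stratum OR: {odds_ratio(*table):.2f}")
print(f"Crude (unadjusted) OR:  {crude:.2f}")
print(f"Mantel-Haenszel OR:     {adjusted:.2f}")
```

In this made-up data set the crude odds ratio is well above 1 while the adjusted value is 1, which is the pattern to expect when severity drives both the choice of treatment and the outcome. Adjustment of this kind only works for confounders that were recorded.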

Heterogeneity
Because enrolment in an observational study has few restrictions, the study patient population is usually more heterogeneous than that of an RCT, and results may vary between subgroups of patients. Statistical tests of heterogeneity are used to assess whether the observed variability in results is greater than that expected to occur by chance.
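
As a rough sketch of what such a test involves, the following computes Cochran's Q and the I² statistic for three subgroups. Both the choice of Cochran's Q (one widely used heterogeneity test, not necessarily the one a given study will use) and the subgroup estimates are assumptions made for illustration.

```python
# Hypothetical subgroup results: (log odds ratio, standard error).
subgroups = {
    "site A": (0.10, 0.10),
    "site B": (0.30, 0.10),
    "site C": (0.50, 0.10),
}


def cochran_q(estimates):
    """Return (Q, degrees of freedom, pooled effect) using inverse-variance weights."""
    effects = [effect for effect, _ in estimates]
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(estimates) - 1, pooled


def i_squared(q, df):
    """Share of the observed variability beyond what chance alone would explain."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q)


q, df, pooled = cochran_q(list(subgroups.values()))
i2 = i_squared(q, df)

print(f"Pooled log odds ratio: {pooled:.2f}")
print(f"Cochran's Q: {q:.2f} on {df} degrees of freedom")
print(f"I-squared: {i2:.0%}")
```

If the subgroups differed only by chance, Q would be expected to be close to its degrees of freedom. Q is referred to a chi-square distribution with that many degrees of freedom; with 2 degrees of freedom the 5% critical value is about 5.99, so a larger Q suggests more variability than chance alone would produce.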

Statistical power
Statistical power is the probability that a study will detect an association or a difference between treatments of a given size, if one truly exists. If the statistical power of a study is low, the results will be questionable; in particular, a failure to find a difference cannot be taken as evidence that there is none. By convention, 80% is regarded as an acceptable level of power. As with the design of an RCT, researchers must estimate the parameters needed to detect a difference between treatments in an observational study, for example the number of patients and the length of follow-up.
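
To show how the number of patients follows from the chosen power, here is a minimal sketch using the standard normal-approximation formula for comparing two proportions. The event rates (30% versus 20%) and the two-sided 5% significance level are assumptions chosen for illustration, and the calculation ignores the adjustment for confounders, drop-outs and unequal group sizes that a real observational study would need.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Patients needed per group to detect p1 versus p2 (two-sided test)."""
    z_alpha = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STANDARD_NORMAL.inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


def approximate_power(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of the same comparison for a given group size."""
    z_alpha = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    standard_error = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)
    return STANDARD_NORMAL.cdf(abs(p1 - p2) / standard_error - z_alpha)


# Hypothetical event rates: 30% on one treatment, 20% on the other.
n_needed = sample_size_per_group(0.30, 0.20)
print(f"Patients per group for 80% power: {n_needed}")
print(f"Power with 150 per group: {approximate_power(0.30, 0.20, 150):.0%}")
```

Under these assumptions roughly 290 patients per group are needed, and a study half that size would fall well short of the conventional 80%.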

A checklist can be helpful to determine whether these and other issues have been adequately addressed in the study report, and thus give a degree of confidence about the results. Some useful questions to consider are:

  • has the measurement of important confounding factors been described, so that the reader can judge how well they have been controlled for?
  • have all the subjects been accounted for in the study follow-up?
  • has any issue of possible bias been addressed?
  • is the statistical power of the study adequate to establish a difference between treatment groups?
  • has ‘clinical significance’ been discussed so that it can be differentiated from ‘statistical significance’?
  • do any measures of association have confidence intervals reported?
  • has the issue of multiple comparisons been addressed?
  • have the study results been placed in the context of existing findings, and any reasons for differences been discussed?
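
Two of the checklist items, confidence intervals and multiple comparisons, lend themselves to a short worked sketch. The counts and p-values below are invented for illustration; the log-based (Woolf) interval and the Bonferroni adjustment are simply common, easily checked choices, not methods named in the original article.

```python
import math
from statistics import NormalDist


def odds_ratio_with_ci(a, b, c, d, level=0.95):
    """Odds ratio and confidence interval for a 2x2 table (log-based method).

    a, b: treated patients with and without the event
    c, d: untreated patients with and without the event
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical 2x2 table.
estimate, lower, upper = odds_ratio_with_ci(30, 70, 15, 85)
print(f"Odds ratio {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")

# Hypothetical p-values from four separate outcome comparisons.
raw_p = [0.004, 0.020, 0.030, 0.200]
adjusted_p = bonferroni(raw_p)
still_significant = [p < 0.05 for p in adjusted_p]
for raw, adj, flag in zip(raw_p, adjusted_p, still_significant):
    print(f"raw p = {raw:.3f}, adjusted p = {adj:.3f}, significant: {flag}")
```

The interval conveys how precisely the association has been estimated, which a p-value alone does not. In the made-up p-values, three comparisons fall below 0.05 before adjustment but only one does afterwards, which is why a report should state how many comparisons were made.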

It is also vital, once you have reached the manuscript stage, to put the study into context alongside the evidence generated by other sources, including RCTs. Explaining any differences in findings is a crucial part of having the study results accepted as a significant contribution to the whole. We wish you luck in conducting these studies – we consider them an essential part of the evidence base for a therapy, and encourage you to consider them in your research planning.

If you want examples of excellent observational studies from the industry, Lilly is a major contributor to this type of research: SOHO, ADORE, EDOS and EMBLEM are the acronyms of some of its studies. A Google search will find them if you add the word ‘study’ to the search term.