Open Access. Published by De Gruyter, February 20, 2023

The EBM+ movement

  • Michael Wilde

Abstract

In this paper, I provide an introduction for biostatisticians and others to some recent work in the philosophy of medicine. Firstly, I give an overview of some philosophical arguments that are thought to create problems for a prominent approach towards establishing causal claims in medicine, namely, the Evidence-Based Medicine (EBM) approach. Secondly, I provide an overview of further recent work in the philosophy of medicine, which argues that mechanistic studies can help to address these problems. Lastly, I describe a novel approach for establishing causal claims in medicine that has been informed by this recent work in the philosophy of medicine, namely, the EBM+ approach.

1 Introduction

How do we establish causal claims about the effectiveness of medical interventions? One approach is to carry out some Evidence-Based Medicine (EBM). Of course, most people agree that such causal claims are established on the basis of the evidence; the devil is in the details (cf. [1: 981–2]). In this paper, I begin by providing some of the details of the EBM approach. In particular, I present its emphasis on the role of clinical studies in establishing causal claims about the effectiveness of medical interventions. Next, I present a couple of problems with this approach, emphasized by recent work in the philosophy of science, for instance, by Cartwright [2, 3] and Worrall [1, 4]. I then provide an overview of further recent work in the philosophy of science to the effect that mechanistic studies can help to address these problems [5]. Finally, I describe a novel approach for establishing causal claims in medicine that has been informed by this recent work in the philosophy of science: the EBM+ approach [6–8]. I hope this paper serves as an introduction for biostatisticians and others to some of the implications of recent work in the philosophy of science, and in particular the philosophy of medicine.

2 EBM

According to a classic definition, EBM is ‘the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients’ [9: 71, quoted by Clarke et al. 5: 339]. I will focus on decisions about the best treatment of patients, where a good treatment decision often involves establishing a causal claim about the effectiveness of a medical intervention, and then integrating this knowledge with the individual patient’s values and circumstances [10: 177]. Presumably, any clinician worth their salt has always believed themselves to be making such decisions on the basis of a conscientious, explicit, and judicious use of the current best evidence concerning the effectiveness of medical interventions. EBM is novel in: (i) often appealing to the notion of an evidence hierarchy in order to explain what counts as the current best evidence; (ii) providing explicit guidance to help ensure that this current best evidence is used conscientiously, explicitly, and judiciously.

An evidence hierarchy is simply a ranking of different methods for providing evidence to inform medical decisions: loosely speaking, the higher the ranking of the method, the better the evidence that it provides. In the case of treatment decisions, an evidence hierarchy will typically rank methods in accordance with their potential for establishing a causal claim about the effectiveness of a medical intervention, at least all other things being equal. Such a hierarchy ranks comparative clinical studies above mechanistic reasoning, reflecting the view that clinical studies are better than mechanistic reasoning at establishing causal claims about the effectiveness of medical interventions [10: 5]. This evidence hierarchy helps to explain what counts as the current best evidence: if there are clinical studies available, then they provide the current best evidence, otherwise ‘we must follow the trail to the next best external evidence and work from there’ [9: 72]. And if we follow the trail all the way down, then we end up with expert judgement and mechanistic reasoning. In practice, however, this next best external evidence is rarely taken to include non-clinical studies; indeed, even non-randomized clinical studies are sometimes excluded: ‘[i]f the study wasn’t randomized, we’d suggest that you stop reading it and go on to the next article in your search’ [11: 108].

What exactly is mechanistic reasoning? It perhaps helps to begin with the standard definition of a mechanism from the philosophy of science: a mechanism is something that can be appealed to in an attempt to explain some phenomenon; it is simply an arrangement of entities that perform certain activities in order to produce that phenomenon [12: 3]. For example, if we want to explain the putative effectiveness of some medical intervention, then we can point towards its proposed mechanism of action, such as the proposed mechanism by which inserting a small valve or grommet in the eardrum works to equalize pressure in the ear, allowing the build-up of glue-like fluid in the middle ear to be drained away, thereby improving so-called glue ear [1: 985]. Mechanistic reasoning involves inferring a causal claim about the effectiveness of a medical intervention simply on the basis of a claim about the existence of such a mechanism [10, 13, 14]. To continue the present example, on the basis of the claimed mechanism linking the insertion of grommets to the draining of fluid in the middle ear, one may infer the causal claim that grommets are an effective intervention for the treatment of patients with glue ear.

One worry is that mechanistic reasoning is known to have a pretty bad track record. Indeed, Howick [10: 154–7] lists a number of cases where ineffective or harmful medical interventions were recommended on the basis of mechanistic reasoning. For example, it is now thought that the insertion of grommets offers no benefit for the treatment of glue ear [1: 985]. Moreover, there is a good explanation of this bad track record for mechanistic reasoning. Firstly, such reasoning may suffer from the problem of story-telling: mechanistic reasoning can be overly psychologically compelling, discouraging further investigation simply because people are too easily convinced by a gripping narrative (cf. [5: 350]). Mechanistic reasoning is therefore liable to be based upon a false theory about the details of some mechanism, in which case such reasoning is not going to establish a causal claim about the effectiveness of the relevant medical intervention. Secondly, mechanistic reasoning suffers from the problem of incompleteness: the typical complexity of the relevant mechanisms means that any knowledge of the details of those mechanisms is often incomplete; even if we have some knowledge of an intervention’s mechanism of action, the intervention may interact with known mechanisms in unexpected ways, or it may set off other unknown mechanisms that work to counteract or mask the expected effects of the intervention, such that the intervention has no overall beneficial effect [14–16]. It is therefore difficult to obtain the knowledge required for mechanistic reasoning to provide enough evidence to establish a causal claim about the effectiveness of a medical intervention. These problems with mechanistic reasoning have been nicely summed up by the philosopher of science Miriam Solomon: ‘A general problem with mechanistic accounts is that they are typically incomplete, although they often give an illusion of a complete, often linear, narrative’ [17: 131].

A proponent of the evidence hierarchy will prefer that treatment decisions are based upon the results of comparative clinical studies, such as observational studies or preferably randomized controlled trials (cf. [18: 2420, quoted by Williamson 8: 192]).

A standard type of clinical study compares pre-determined average health outcomes of participants with a certain disease sometime after they have been separated into one of two groups: participants in the test group receive some novel intervention, while participants in the control group receive the current standard treatment. If there is a difference in such a health outcome observed between the test group and the control group, then the existence of a correlation is established between the intervention and that health outcome. And if the novel intervention is the only difference between the test group and the control group, then it must be the intervention that explains this correlation; that is, a causal claim about the effectiveness of the intervention is established. For example, a clinical study may reveal that a novel intervention is correlated with an improved average time taken to recover from some disease; if that intervention is the only difference between the test group and the control group, then the study establishes not only the claim that the intervention is correlated with improved recovery times, but also the causal claim that the novel intervention is effective at reducing the average time taken to recover from the disease (cf. [2]).
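
To make this comparison concrete, here is a minimal sketch in Python of the inference such a study performs, using a permutation test to check whether an observed difference in mean recovery time is larger than chance alone would suggest. The outcome variable, effect size, and group sizes are invented purely for illustration and are not drawn from any study discussed in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical recovery times in days (all numbers invented): the
# control arm receives the standard treatment, the test arm a novel
# intervention imagined to shorten recovery by 2 days on average.
control = rng.normal(loc=14.0, scale=3.0, size=200)
test = rng.normal(loc=12.0, scale=3.0, size=200)
observed_diff = control.mean() - test.mean()

# Permutation test: if group labels were irrelevant (no effect), then
# reshuffling the labels should often produce a difference at least as
# large as the one actually observed.
pooled = np.concatenate([control, test])
n_test = len(test)
perm_diffs = []
for _ in range(10_000):
    rng.shuffle(pooled)
    perm_diffs.append(pooled[n_test:].mean() - pooled[:n_test].mean())
p_value = np.mean(np.abs(perm_diffs) >= abs(observed_diff))

print(f"observed difference in mean recovery: {observed_diff:.2f} days")
print(f"permutation p-value: {p_value:.4f}")
```

A small p-value here establishes at most the correlation claim; whether the intervention itself explains that correlation is precisely the question taken up next.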

Of course, the difficulty is in ensuring that the test group and the control group differ only with respect to the novel intervention; if there remains some other difference between the groups, then it may be that any correlation between the novel intervention and some health outcome is explained by appealing to this other difference, at least as long as this other difference is a factor plausibly relevant to the prognosis of the disease, that is, as long as the difference concerns a so-called prognostic factor. For example, if the participants in the test group were aware that they were receiving a novel intervention, or if these participants were treated differently to those participants in the control group, then it may be these differences that explain any established correlation. These sorts of differences can be addressed by carrying out double-blinded, placebo-controlled trials, in strict accordance with a sensible study protocol (cf. [19: 1400]). However, if the control group still differs from the test group in terms of some prognostic factor, then it may be this difference that explains any established correlation between the intervention and some health outcome in the study. For example, the control group may perform less exercise on average than the test group, and it may be this difference that explains any established correlation between the intervention and improved average time taken to recover from the disease; that is, level of exercise is said to be a potential confounding factor, that is, a factor other than the intervention that may be appealed to in order to explain an established correlation between the intervention and some health outcome. As the old saying goes: correlation is not causation.
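
The worry about confounding can likewise be illustrated with a small simulation; again, all numbers are invented. Here the intervention is inert, but the test group happens to exercise more, and exercise alone drives the difference in outcomes.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# All numbers invented. The intervention does nothing, but the test
# group happens to exercise more (hours per week), and exercise alone
# shortens recovery time.
exercise_test = rng.normal(5.0, 1.0, n)
exercise_control = rng.normal(3.0, 1.0, n)

def recovery_days(exercise_hours):
    # Recovery depends only on the confounder, not on group membership.
    return 16.0 - 0.8 * exercise_hours + rng.normal(0.0, 2.0, len(exercise_hours))

test = recovery_days(exercise_test)
control = recovery_days(exercise_control)
print(f"mean recovery, test: {test.mean():.1f} days; "
      f"control: {control.mean():.1f} days")
# The test arm recovers about 1.6 days faster even though the
# intervention does nothing: the correlation is confounded.
```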

One method to address this problem of confounding factors involves intentionally matching the test group and the control group in terms of such factors, for example, by ensuring that both groups perform on average the same level of exercise. However, it is only possible to carry out this matching with respect to known potential confounding factors; after such matching, there may remain unknown confounding factors, that is, unsuspected prognostic factors that may be appealed to in order to explain an established correlation between the intervention and some health outcome [4: S322; 1: 1003–4]. So the preferred method involves allocating participants in the study to the test or control group at random, that is, by carrying out a randomized controlled trial. The idea is that the chance that a participant with a certain prognostic factor will be included in the test group is equal to the chance that they will be included in the control group. As a result, if the trial is sufficiently large, then there is a high chance that this factor will be evenly distributed between the test group and the control group. For example, following randomization, a sufficiently large trial will result in a high chance that the test group and the control group perform on average the same level of exercise, even if those groups have not been deliberately matched for this prognostic factor. And so any significant difference in average health outcomes between the test group and the control group cannot be explained by appealing to a difference in level of exercise. Indeed, following randomization, a sufficiently large trial will result in a high chance that any given prognostic factor is similarly distributed in the test group and the control group. And so if such a randomized trial establishes the existence of a correlation between a novel intervention and some health outcome, the trial also arguably establishes the causal claim that it was the intervention that caused this health outcome, on the grounds that the correlation cannot be explained by appealing to some confounding factor. It is in this sense that a randomized trial is said to control for all potential confounding factors, known or unknown. And it is partly for this reason that randomized trials are ranked above non-randomized observational studies in the evidence hierarchy for treatment decisions [10: 31–116]. Of course, things are not quite so cut and dried: more recent evidence hierarchies allow for the possibility that a sufficiently high quality observational study can provide better evidence than a sufficiently low quality randomized trial [20]. And evidence hierarchies more generally permit a number of different interpretations [21]. Although, all other things being equal, randomized trials still rank higher than observational studies in these evidence hierarchies.
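
The claim that randomization controls for a given prognostic factor in sufficiently large trials can be illustrated as follows; the prevalence figure and trial sizes are again invented, and treating each arm as an independent binomial sample is a simplification of actual random allocation.

```python
import numpy as np

rng = np.random.default_rng(2)

def mean_imbalance(n_per_arm, prevalence=0.3, reps=10_000):
    """Average absolute difference between arms in the proportion of
    participants carrying a binary prognostic factor, treating each arm
    as an independent binomial sample (a simplification of actual
    random allocation)."""
    test = rng.binomial(n_per_arm, prevalence, size=reps) / n_per_arm
    control = rng.binomial(n_per_arm, prevalence, size=reps) / n_per_arm
    return float(np.mean(np.abs(test - control)))

for n in [20, 100, 1000]:
    print(f"n per arm = {n:4d}: mean imbalance = {mean_imbalance(n):.3f}")
# The imbalance shrinks roughly like 1/sqrt(n): a sufficiently large
# trial gives a high chance that any *given* factor is distributed
# similarly across the two arms.
```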

One of the putative advantages of a randomized trial is thus that it can establish a causal claim about the effectiveness of a novel intervention without requiring much in the way of knowledge about the exact mechanisms by which the intervention is effective. Accordingly, Richard Ashcroft says that:

[T]he beauty of the [randomized controlled trial] as a methodology is that it seems to operate at a level of scientific theory autonomous from the basic sciences. Apparently we need to know little or nothing of pathogenesis or drug action in order for a randomized controlled trial to be designed and implemented and (perhaps) interpreted successfully. Indeed, our theories at this more basic level could simply be wrong [or at least incomplete] [22: 134, quoted by Solomon 17: 110].

It is for this reason that randomized trials are also ranked above mechanistic reasoning in the evidence hierarchy: randomized trials can establish the effectiveness of an intervention, while avoiding the problems associated with mechanistic reasoning, namely, the problem of incompleteness and the problem of story-telling. At least all other things being equal, a randomized trial thus provides the current best evidence about the effectiveness of a medical intervention.

Some philosophers of science have been critical of this emphasis on randomized trials; they have argued that randomized trials have their own set of problems. One such problem is sometimes known as the problem of non-causal correlations [5: 343–4]. A version of this problem has been explained in detail by Worrall [1, 4], building upon work by Urbach [23, 24]. Worrall argues that randomization ensures at best a high chance that a given confounding factor is distributed similarly in the test and the control group; there will remain a non-zero chance that some factor is better represented in one group than the other [4: S322–3]. He also argues that such a high chance is ensured only by indefinite randomization:

[I]f the randomization were performed indefinitely often, the number of cases of groups skewed with respect to that factor would be very small. The fact, however, is that a given [randomized trial] has not been performed indefinitely often but only once. Hence it is of course possible that, “unluckily,” the distribution even of the one unknown prognostic factor we are considering is significantly skewed between the two groups [4: S323].

In other words, if a correlation between an intervention and some health outcome is established in a randomized trial, it may not be because the intervention caused this health outcome, but rather because there was some other difference between the test and control groups that was not controlled for by randomization.

Now, it might be objected that a systematic review or meta-analysis of multiple replications of a randomized trial will provide an adequate approximation of indefinite randomization. In that case, there will be a high chance that any given prognostic factor is similarly distributed in the test and control groups. Indeed, in more detailed evidence hierarchies, systematic reviews and meta-analyses of randomized trials rank even higher than randomized trials alone (cf. [5: 340]). However, Worrall then argues that it would be a quantification fallacy to conclude that any difference observed between the test and control group must then be attributable to the intervention [4: S324]. An example from contemporary epistemology may help to make this fallacy clear, namely, the example of a large lottery with one winning ticket: although for each given ticket there is a high chance that the ticket will lose, it does not follow that there is a high chance that every ticket will lose; indeed, by the description of the example, it is certain that some ticket will win (cf. [25]). Similarly, although for each potential confounding factor there is a high chance that it is distributed similarly in the test and control groups, it does not follow that there is a high chance that every such factor is distributed similarly in the two groups; there may even be a high chance that some confounding factor is not distributed similarly, in which case even a correlation observed across multiple randomized trials may not be sufficient to establish a causal claim about the medical intervention, since the correlation may be explained by appealing to the chance of there being such a factor. One possible example here concerns a meta-analysis of randomized trials on the clinical efficacy of homeopathy; this meta-analysis established that homeopathic interventions are correlated with better health outcomes than placebo, but this alone was not sufficient to establish that the interventions caused those better health outcomes in the study population, presumably because there was still a significant chance that there was a non-causal explanation of this correlation [26, cited by Clarke et al. 5: 344]. It is thus argued that the problem of non-causal correlations means that even a systematic review or meta-analysis of randomized trials may fail to establish an efficacy claim, that is, the claim that a medical intervention caused some health outcome in the study population. More generally, Stegenga [27] has argued that meta-analyses should not be regarded as the platinum standard of evidence. In particular, Stegenga argues that a given meta-analysis will be insufficiently objective, since it involves making a number of subjective decisions, for example, decisions about which trials to include, and what weighting to give those trials in the final analysis.
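
A back-of-the-envelope calculation may help to convey the quantification fallacy; the per-factor probability of balance is invented, and the factors are assumed independent purely for illustration.

```python
# Invented figure: suppose each of k independent prognostic factors is
# well balanced between arms with probability 0.95 after randomization.
p_balanced = 0.95

for k in [1, 5, 20, 50]:
    p_all = p_balanced ** k
    print(f"{k:2d} factors: P(all balanced) = {p_all:.3f}, "
          f"P(some imbalance) = {1 - p_all:.3f}")
# With 20 factors, P(some imbalance) is already about 0.64: a high
# chance for each factor does not give a high chance for all of them.
```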

It may be that the mere theoretical chance of there being a confounding factor is not sufficient by itself to preclude a systematic review or meta-analysis of randomized trials from establishing an efficacy claim about a medical intervention in practice (cf. [17: 134–7]). However, there remains another problem with emphasizing randomized trials, a problem that has been highlighted by Cartwright [2, 3]. It is sometimes called the problem of external validity [5: 346]. A study has external validity to the extent that its results can be extrapolated from the study population to some target population. Cartwright allows that ideal randomized trials may indeed establish the efficacy claim that an intervention caused some health outcome in the study population, but then argues that this alone does not establish the causal claim of most interest, namely, the effectiveness claim that the intervention causes this health outcome for distinct target populations; the study population may not be representative of a target population, for instance, if participants were recruited to the study from the target population in strict accordance with narrow exclusion criteria. Indeed, trials may exclude participants with multiple comorbidities from the study population, even though the target population is likely to have such comorbidities [28]. One example here concerns the randomized trials which established in their study populations that benoxaprofen was efficacious for musculo-skeletal pain; benoxaprofen was found not to be an effective treatment of musculo-skeletal pain in an older target population [1: 994–5]. Cartwright argues more generally that ‘[t]he lesson to be learned is that although (ideal) [randomized trials] are excellent at securing causal principles, there is a very great deal more that must be assumed—and defended—if the causal principles are to be exported from the experimental [study] population to some target population’ [3: 66].

Now, it might be argued that the problem of external validity does not always preclude randomized trials from establishing an effectiveness claim. Indeed, Petticrew and Chalmers [29] have objected that the default presumption should be that a systematic review of good randomized trials on varied enough study populations does indeed have external validity. However, Cartwright has responded that this objection involves giving up evidence-based medicine in place of default-based medicine. She argues that ‘[w]e should not adopt default positions; we should have evidence. Very often we cannot get it and have to bet. But at that point, when evidence stops and betting starts, we are no longer doing evidence-based medicine’ [30: 1697].

To sum up, a number of philosophers of science have argued that even a group of well-conducted randomized trials can be insufficient to establish a causal claim about the effectiveness of a medical intervention on some health outcome. Firstly, the trials may demonstrate only that the intervention is correlated with the health outcome, thanks to randomization failing to completely control for confounding factors; this is the problem of non-causal correlations. Secondly, even if the trials can in fact demonstrate that the intervention causes the health outcome in the study population, that is not to say that the intervention will similarly cause the health outcome in a distinct target population; this is the problem of external validity. How then do we establish causal claims about the effectiveness of medical interventions, when clinical studies alone are insufficient?

3 EBM+

Some philosophers of science have argued that establishing a causal claim about a medical intervention involves also establishing the existence of a relevant mechanism [5–7, 31]. One line of argument draws upon the nine viewpoints for distinguishing mere correlation from causation advocated by Hill [32]. Among other things, Hill argues that, in determining whether an observed association is causal, it can be helpful to consider not just the strength of the observed association, but also its coherence with ‘the generally known facts of the natural history and biology of the disease’ [32: 298]. Russo and Williamson [31: 160–1] have argued that the former strength viewpoint demonstrates the importance of establishing a correlation when establishing a causal claim, whereas the latter coherence viewpoint demonstrates the importance of also establishing the existence of a relevant mechanism. Another line of argument appeals to the uses of causal claims in medicine [5: 345–6; 31: 159]. Causal claims are used in medicine for prediction and explanation; for instance, after establishing the claim that a medical intervention causes some health outcome, one is better placed to predict and explain that health outcome in the presence of that intervention. Russo and Williamson [31] argue that such a causal claim can be useful for prediction only if the intervention is appropriately correlated with the health outcome, that is, only if the causal claim tracks the existence of an appropriate correlation. They argue also that such a causal claim can be useful for explanation only if the intervention is linked by a relevant mechanism to the health outcome, that is, only if the causal claim tracks the existence of a relevant mechanism. Their conclusions are sometimes summed up as the Russo–Williamson thesis [33: 133–84]. A recent statement of the thesis goes like this: ‘In order to establish a causal claim in medicine one normally needs to establish two things: first, that the putative cause and effect are appropriately correlated; second, that there is some mechanism which explains instances of the putative effect in terms of the putative cause and which can account for this correlation’ [7: 33]. Given this, it is perhaps unsurprising that establishing the existence of a mechanism can help to address the problem of non-causal correlations and the problem of external validity.

Let us look first at the problem of non-causal correlations. Suppose that randomized trials have established a correlation between some novel intervention and an improved health outcome, but that the effectiveness of this intervention is not the only available explanation of this established correlation. In particular, it may be that the test and control groups still differ in some way other than the novel intervention, and this difference provides an alternative potential explanation of the extent of the established correlation. Attempting to establish a mechanism can help in deciding between these competing explanations [5: 343–4]. On the one hand, establishing the existence of a mechanism linking the novel intervention and the improved health outcome can help to rule out the alternative potential explanations, at least where this mechanism explains the extent of the established correlation. A potential example here involves the move to a pegylated combination therapy for chronic Hepatitis C [13]. Pegylation is the process of modifying molecules with polyethylene glycol (PEG) [34]. Although randomized trials established a correlation between a pegylated combination therapy and improved health outcomes compared to the standard combination therapy, there remained a worry that this correlation could be explained by bias in the available trials rather than the improved effectiveness of the pegylated combination therapy. But independent evidence helped to rule out this alternative potential explanation, by helping to establish the existence of the antiviral mechanism of pegylation, where this mechanism could explain the extent of the established correlation. On the other hand, if the existence of such a mechanism was not established after such an investigation, then the effectiveness of the medical intervention would also not be established. A potential example here concerns the above-mentioned trials establishing a correlation between homeopathic interventions and improved health outcomes; the effectiveness of such interventions has not been established, since the proposed mechanism underlying such effectiveness has not been established [5: 344]. It is in this way that attempting to establish the existence of a mechanism can help to determine whether a correlation is causal.

Let us now look at the problem of external validity. Suppose that observational studies have established the existence of a correlation between a medical intervention and some health outcome in a target population. Suppose also that an efficacy claim has been established by some ideal randomized trials, namely, that the intervention is a cause of the health outcome in some study population. On the present account, this requires establishing in the study population both that there is a correlation between the intervention and the health outcome, and that there exists a mechanism in this population that can explain the extent of this correlation. One way to establish that the intervention is a cause of the health outcome in other target populations is by establishing that there also exists a sufficiently similar mechanism in the target population [5: 347; 35]. And failing to establish the existence of such a similar mechanism will lead to failing to establish that the intervention will cause the health outcome in a distinct target population, as in the benoxaprofen example above [1: 994–5]. It is in this way that attempting to establish the existence of a sufficiently similar mechanism can help to determine the conditions under which randomized trials have external validity. (Williamson [7] provides a more detailed account of the role of establishing mechanisms in determining the external validity of clinical studies).

One worry is that this appeal to mechanisms reintroduces all the problems with mechanistic reasoning, namely, the problem of story-telling and the problem of incompleteness (cf. [17: 116–24]). However, the idea is that a causal claim about the effectiveness of a medical intervention should not be based only upon an established mechanism, but instead also upon an established correlation, where this correlation provides evidence that the intervention has a net effect on the health outcome, thus addressing the problem of incompleteness. Moreover, a causal claim is not established by being based upon a mere story about a mechanism, but instead by being based in part upon an established claim about a mechanism, where establishing such a claim requires evidence that may come from mechanistic studies into the details of the relevant mechanisms, that is, the relevant entities, activities, and their organization [6: 14]. (A list of examples of sources of evidence of mechanisms is given in table 1 in Williamson [7: 35]).

Now, it might be argued that there are cases in which mechanistic studies were not required to establish a causal claim about a medical intervention. Indeed, Howick objects that ‘there are many counterexamples where medical interventions have been [rightly] accepted on the basis of evidence from comparative clinical studies alone’ [15: 930]. But this objection seems to assume that the existence of a mechanism can be established only by mechanistic studies. However, Illari has argued that ‘there is no principled distinction between the kinds of empirical work by which we get evidence of mechanisms, and evidence of [correlation]—although in practice for any particular case in the health sciences these different items of evidence are usually got from different studies’ [16: 145]. Indeed, it has been argued in particular that sufficiently many well-conducted randomized trials that establish a large enough correlation can themselves establish the existence of a mechanism, for instance, if there is no other plausible explanation of such an established correlation [7: 43–5]. So the claim that clinical studies are sometimes sufficient to establish a causal claim about a medical intervention need not go against the Russo–Williamson thesis. And it certainly does not go against the claim that mechanistic studies can be helpful in those cases where the available clinical studies fail to establish the effectiveness of a medical intervention. At least where such studies are available, the current best evidence often consists of a combination of mechanistic studies and clinical studies.

Another worry is that there is nothing novel about the claim that causal claims can be established by relying upon mechanistic studies as well as comparative clinical studies. Indeed, EBM proponents acknowledged from the outset that:

The dearth of adequate evidence demands that clinical problem solving must rely on an understanding of pathophysiology. Moreover, a good understanding of pathophysiology is necessary for interpreting clinical observations and for appropriate interpretation of evidence (especially in deciding on its generalizability) [18: 2423].

Presumably, a good understanding of pathophysiology can ultimately be traced back, through a sound medical education, to a number of mechanistic studies. Moreover, a published paper describing the results of a clinical study on some medical intervention will typically include a discussion of the intervention’s possible mechanism of action (cf. [6: 67]). Arguably, there has therefore always been an acknowledged role for relying upon both comparative clinical studies and mechanistic studies. Indeed, perhaps a reliance also upon mechanistic studies is precisely required for the use of evidence from randomized trials to count as ‘judicious’ (cf. [9: 71]).

But to count as evidence-based, the use of current best evidence is also supposed to be ‘conscientious’ and ‘explicit’ (cf. [9: 71]). In other words, it requires ‘conducting an efficient search of the literature; selecting the best of the relevant studies and applying rules of evidence to determine their validity’ [18: 2420]. EBM provides a wealth of guidance to ensure the conscientious and explicit use of evidence from clinical studies, for example, by providing explicit guidance on how exactly to find and critically evaluate relevant clinical studies in an efficient, transparent, and reproducible manner (cf. [36: 124]). However, little guidance is provided on how exactly to find and critically evaluate mechanistic studies in a similar manner. Any use of mechanistic studies is therefore arguably neither explicit nor conscientious. The EBM+ approach agrees that evidence from comparative clinical studies should be used conscientiously, explicitly, and judiciously, but also stresses the conscientious, explicit, and judicious use of evidence from mechanistic studies (cf. [6: 3–4]). In the words of its proponents, ‘if EBM was a useful first approximation to evidence evaluation, then EBM+ is intended as a second, improved, approximation’ [6: 7]. In particular, the EBM+ approach provides an explicit procedure for evaluating evidence from mechanistic studies alongside evidence from clinical studies.

Broadly speaking, the EBM+ procedure is as follows. (A pictorial representation of this procedure in broad outline is captured by Figure 1). The first step is to find and evaluate the available clinical studies in accordance with the guidance provided by EBM. (That is, we start in the bottom left-hand corner of Figure 1). This will result in a body of evidence that can then be ranked in terms of its overall quality: high, moderate, low, or very low quality [6: 26]. The quality of this evidence in part determines the status of two different claims: on the one hand, a correlation claim, namely, that the intervention is appropriately correlated with some health outcome in the study population; on the other hand, a general mechanistic claim, namely, that there exists a relevant mechanism in the study population that can explain the extent of this correlation. (That is, we move along the left-hand arrows of Figure 1). In particular, it partly determines whether each claim is either: (i) established; (ii) provisionally established; (iii) arguably true; (iv) speculative; (v) arguably false; (vi) provisionally ruled out; or (vii) ruled out [6: 27]. For example, a claim is established when high quality evidence warrants a high level of confidence in that claim; a claim is merely provisionally established when it is instead moderate quality evidence that warrants a high level of confidence in that claim.

Figure 1: How to evaluate evidence according to the EBM+ approach [6: 28].

In turn, the statuses of the correlation claim and the general mechanistic claim help to determine the status of a causal claim, namely, the claim that the medical intervention causes the health outcome in the study population. (That is, we move to the causal claim box from the correlation claim box and the general mechanistic claim box in Figure 1). In particular, the status of this causal claim is the same as the lower of the statuses of the correlation claim and the general mechanistic claim [6: 92]. For instance, if the correlation claim is established but the general mechanistic claim is merely provisionally established, then the causal claim is likewise merely provisionally established. Now, it may be that the evidence from ideal clinical studies is sufficient to give the status of established to both the correlation claim and the general mechanistic claim; such evidence may then also give the status of established to this causal claim. All of this is in line with the Russo–Williamson thesis. However, it may also be that the evidence from clinical studies alone is insufficient to give the status of established to the causal claim, due to the problem of non-causal correlations: the available evidence establishes the correlation claim, but it fails to establish the causal claim, since it fails to establish the general mechanistic claim. Perhaps the general mechanistic claim is merely provisionally established on the basis of the clinical studies.
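
To make this combination rule concrete, here is a minimal sketch of the seven-point status scale and the ‘take the lower status’ rule; encoding the scale as a Python enum is my own illustrative choice, not part of the published EBM+ guidance.

```python
from enum import IntEnum

class Status(IntEnum):
    """The seven statuses of [6: 27], ordered from weakest to strongest."""
    RULED_OUT = 0
    PROVISIONALLY_RULED_OUT = 1
    ARGUABLY_FALSE = 2
    SPECULATIVE = 3
    ARGUABLY_TRUE = 4
    PROVISIONALLY_ESTABLISHED = 5
    ESTABLISHED = 6

def causal_status(correlation: Status, general_mechanistic: Status) -> Status:
    # The causal claim takes the lower of the two statuses [6: 92].
    return min(correlation, general_mechanistic)

# Example from the text: the correlation claim is established, but the
# general mechanistic claim is only provisionally established.
print(causal_status(Status.ESTABLISHED, Status.PROVISIONALLY_ESTABLISHED).name)
# -> PROVISIONALLY_ESTABLISHED
```

On this encoding, further clinical or mechanistic studies can raise or lower either input status, but the causal claim is only ever as strong as its weaker support.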

The next step then is to also find and evaluate the evidence from mechanistic studies [6: 63–90]. (That is, we carry on to the bottom right-hand corner of Figure 1). Together with the first step, this will result in a total body of evidence consisting of both clinical studies and mechanistic studies. This total body of evidence can then also be ranked in terms of its overall quality, and then relied upon to determine the new statuses of both the correlation claim and the general mechanistic claim. (That is, we move along the right-hand arrows of Figure 1). Of course, mechanistic studies are typically not concerned with testing a general mechanistic claim directly; rather, they are concerned with testing a particular, specific mechanism hypothesis by investigating the proposed entities and activities. But such mechanistic studies can nevertheless still provide evidence that helps to determine the status of the logically weaker general mechanistic claim, that there simply exists a mechanism, without specifying its particular details. (That is, we move from the specific mechanism hypotheses box to the general mechanistic claim box in Figure 1). And if this total evidence determines that both the correlation claim and the general mechanistic claim are established, then it also determines that the causal claim is now established, since the status of the causal claim is the same as the lower of the statuses of the correlation claim and the general mechanistic claim. (That is, we move to the causal claim box from the correlation claim box and the general mechanistic claim box in Figure 1). In other words, the efficacy claim is now established, that is, it is now established that the medical intervention causes the health outcome in the study population. However, this does not mean that the combined clinical studies and mechanistic studies have thereby established the distinct effectiveness claim, namely, that the medical intervention causes the health outcome in some target population. Perhaps this causal claim is only provisionally established. This is just the old problem of external validity [2, 3].

The final step then is to determine the status of another general mechanistic claim, namely, the claim that there exist similar mechanisms in the target population as in the study population. Again, this status is determined in part by the quality of the total body of evidence, including clinical studies and mechanistic studies. And the status of this general mechanistic claim can be used to determine the status of the effectiveness claim that the medical intervention causes the health outcome in the target population. In particular, if the efficacy claim is provisionally established on the basis of the total evidence, but it is also established that there are sufficiently similar mechanisms at work in the target population, then the effectiveness claim is also established. (Other combinations of statuses and their impact on an effectiveness claim are covered in Parkkinen et al. [6: 94–7]).
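
Extending the illustrative Status sketch above, this final step might be encoded as follows; only the single combination stated in this paragraph is implemented (treating an established efficacy claim as at least as good as a provisionally established one is my own assumption), and the remaining combinations are deferred to Parkkinen et al. [6: 94–7].

```python
from typing import Optional

def effectiveness_status(efficacy: Status,
                         similar_mechanisms: Status) -> Optional[Status]:
    """Encodes only the combination stated in the text: a (provisionally)
    established efficacy claim plus an established claim of similar
    mechanisms in the target population yields an established
    effectiveness claim. Other combinations are covered in [6: 94-7]."""
    if (efficacy >= Status.PROVISIONALLY_ESTABLISHED
            and similar_mechanisms == Status.ESTABLISHED):
        return Status.ESTABLISHED
    return None  # consult the full EBM+ guidance for the rest

print(effectiveness_status(Status.PROVISIONALLY_ESTABLISHED,
                           Status.ESTABLISHED).name)  # -> ESTABLISHED
```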

I have here described the procedure only in broad outline. Parkkinen et al. [6] provide a more detailed account. They also provide guidance on how exactly to carry out each step of this procedure in order to help ensure that the use of the total body of evidence remains conscientious, explicit, and judicious. For example, guidance is given on how to effectively gather and critically evaluate mechanistic studies alongside clinical studies [6: 63–98]. And Parkkinen et al. describe exactly how to determine the status of both a correlation claim and a general mechanistic claim on the basis of a body of evidence that includes both clinical studies and mechanistic studies, for instance, by giving guidance on how to evaluate: (i) the methods used in mechanistic studies; (ii) the implementation of these methods; (iii) the results of the mechanistic studies [6: 80–2]. Moreover, tools are provided to facilitate these tasks [6: 37–59].

The EBM+ approach thus provides procedural guidance to ensure the conscientious, explicit, and judicious use of the current best evidence, where this current best evidence includes evidence from mechanistic studies as well as comparative clinical studies. Of course, this approach is not completely uncontroversial. One important worry is that it fails to adequately address the biasing effects of financial or other conflicts of interest in medicine [37–39]. In response, Williamson argues that taking on board mechanistic studies as well as clinical studies lowers the risk of such biasing effects by giving ‘less scope for any malleability with respect to individual judgements to influence the final assessment of causality’ [8: 206]. In effect, the idea is that a combination of mechanistic studies and clinical studies helps to keep such biases in check, since the mechanistic studies act as independent witnesses for the clinical studies, and vice versa.

Regardless, the EBM+ approach was never intended to be the final word on the matter. This point is acknowledged by Adam La Caze:

The next step is implementing the framework and the tools it provides into decision-making in medicine, public health, and policy. Opportunities for improving the tools and the evaluation framework will come from widespread implementation in a range of contexts [40: 2].

In particular, it is by implementing the EBM+ approach that we may be able to determine whether the approach is more or less susceptible to the biasing effects of conflicts of interest than EBM. Indeed, Andreoletti and Teira suggest comparing approaches against appropriate empirical benchmarks in a pilot committee [37: 1109]. By doing this, we may also find ways to address any such biasing effects, thereby improving our methods for establishing the effectiveness of medical interventions.

4 Conclusions

In this paper, I have provided an introduction to some recent work in the philosophy of medicine. I have also drawn out the implications of this work for the practice of establishing causal claims about the effectiveness of medical interventions. In particular, this recent work in the philosophy of medicine argues that such causal claims are best established by evaluating evidence from mechanistic studies as well as clinical studies. And this work has led to guidance on how to evaluate such a diverse body of evidence in a conscientious, explicit, and judicious manner, namely, the EBM+ approach. Although this paper has focused on guidance concerning treatment decisions, similar guidance can be provided for other medical decisions, for example, deciding whether an exposure is a cause of disease [6: 101–10].


Corresponding author: Michael Wilde, Philosophy, University of Kent, CT2 7NZ, Canterbury, UK, E-mail:

Acknowledgements

I would like to acknowledge very helpful comments from David Corfield, Isabelle Drouet, Erica Moodie, Yafeng Shan, and Jon Williamson.

  1. Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

  2. Research funding: None declared.

  3. Conflict of interest statement: The authors declare no conflicts of interest regarding this article.

References

1. Worrall, J. Evidence in medicine and evidence-based medicine. Philos Compass 2007;2:981–1022. https://doi.org/10.1111/j.1747-9991.2007.00106.x.

2. Cartwright, N. Are RCTs the gold standard? BioSocieties 2007;2:11–20. https://doi.org/10.1017/s1745855207005029.

3. Cartwright, N. What are randomized controlled trials good for? Phil Stud 2010;147:59–70. https://doi.org/10.1007/s11098-009-9450-2.

4. Worrall, J. What evidence in evidence-based medicine? Philos Sci 2002;69:S316–30. https://doi.org/10.1086/341855.

5. Clarke, B, Gillies, D, Illari, P, Russo, F, Williamson, J. Mechanisms and the evidence hierarchy. Topoi 2014;33:339–60. https://doi.org/10.1007/s11245-013-9220-9.

6. Parkkinen, VP, Wallmann, C, Wilde, M, Clarke, B, Illari, P, Kelly, MP, et al. Evaluating evidence of mechanisms in medicine: principles and procedures. Cham, Switzerland: Springer; 2018. https://doi.org/10.1007/978-3-319-94610-8.

7. Williamson, J. Establishing causal claims in medicine. Int Stud Philos Sci 2019;32:33–61. https://doi.org/10.1080/02698595.2019.1630927.

8. Williamson, J. The feasibility and malleability of EBM+. Theoria 2021;36:191–209. https://doi.org/10.1387/theoria.21244.

9. Sackett, DL, Rosenberg, WMC, Gray, JAM, Haynes, RB, Richardson, WS. Evidence based medicine: what it is and what it isn’t. BMJ 1996;312:71–2. https://doi.org/10.1136/bmj.312.7023.71.

10. Howick, J. The philosophy of evidence-based medicine. Chichester: Wiley-Blackwell; 2011. https://doi.org/10.1002/9781444342673.

11. Sackett, DL, Straus, SE, Richardson, WS, Rosenberg, W, Haynes, RB. Evidence-based medicine: how to practice and teach EBM, 2nd ed. Edinburgh: Churchill Livingstone; 2000.

12. Machamer, P, Darden, L, Craver, CF. Thinking about mechanisms. Philos Sci 2000;67:1–25. https://doi.org/10.1086/392759.

13. Auker-Howlett, D, Wilde, M. Reinforced reasoning in medicine. J Eval Clin Pract 2020;26:458–64. https://doi.org/10.1111/jep.13269.

14. Wilde, M. Mechanistic reasoning and the problem of masking. Synthese 2021;199:6103–18. https://doi.org/10.1007/s11229-021-03062-2.

15. Howick, J. Exposing the vanities—and a qualified defense—of mechanistic reasoning in health care decision making. Philos Sci 2011;78:926–40. https://doi.org/10.1086/662561.

16. Illari, P. Mechanistic evidence: disambiguating the Russo-Williamson thesis. Int Stud Philos Sci 2011;25:139–57. https://doi.org/10.1080/02698595.2011.574856.

17. Solomon, M. Making medical knowledge. Oxford: Oxford University Press; 2015. https://doi.org/10.1093/acprof:oso/9780198732617.001.0001.

18. Guyatt, G, Cairns, J, Churchill, D, Cook, D, Haynes, B, Hirsh, J, et al. Evidence-based medicine: a new approach to teaching the practice of medicine. JAMA 1992;268:2420–5. https://doi.org/10.1001/jama.1992.03490170092032.

19. Cartwright, N. A philosopher’s view of the long road from RCTs to effectiveness. Lancet 2011;377:1400–1. https://doi.org/10.1016/s0140-6736(11)60563-1.

20. Guyatt, GH, Oxman, AD, Vist, GE, Kunz, R, Falck-Ytter, Y, Alonso-Coello, P, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008;336:924–6. https://doi.org/10.1136/bmj.39489.470347.ad.

21. Jerkert, J. On the meaning of medical evidence hierarchies. Philos Med 2021;2:1–21. https://doi.org/10.5195/pom.2021.31.

22. Ashcroft, RE. Current epistemological problems in evidence based medicine. J Med Ethics 2004;30:131–5. https://doi.org/10.1136/jme.2003.007039.

23. Urbach, P. Randomisation and the design of experiments. Philos Sci 1985;52:256–73. https://doi.org/10.1086/289243.

24. Urbach, P. The value of randomization and control in clinical trials. Stat Med 1993;12:1421–31. https://doi.org/10.1002/sim.4780121508.

25. Wheeler, G. A review of the lottery paradox. In: Harper, W, Wheeler, G, editors. Probability and inference: essays in honour of Henry E. Kyburg, Jr. London: College Publications; 2007:1–31 pp.

26. Cucherat, M, Haugh, MC, Gooch, M, Boissel, JP. Evidence of clinical efficacy of homeopathy. A meta-analysis of clinical trials. Eur J Clin Pharmacol 2000;56:27–33. https://doi.org/10.1007/s002280050716.

27. Stegenga, J. Is meta-analysis the platinum standard of evidence? Stud Hist Philos Biol Biomed Sci 2011;42:497–507. https://doi.org/10.1016/j.shpsc.2011.07.003.

28. Clarke, B, Russo, F. Causation in medicine. In: Marcum, J, editor. The Bloomsbury companion to contemporary philosophy of medicine. London: Bloomsbury; 2017:297–322 pp.

29. Petticrew, M, Chalmers, I. Use of research evidence in practice. Lancet 2011;378:1696. https://doi.org/10.1016/s0140-6736(11)61735-2.

30. Cartwright, N. Author’s reply. Lancet 2011;378:1697. https://doi.org/10.1016/s0140-6736(11)61737-6.

31. Russo, F, Williamson, J. Interpreting causality in the health sciences. Int Stud Philos Sci 2007;21:157–70. https://doi.org/10.1080/02698590701498084.

32. Hill, AB. The environment and disease: association or causation? Proc Roy Soc Med 1965;58:295–300. https://doi.org/10.1177/003591576505800503.

33. Gillies, D. Causality, probability, and medicine. Abingdon, Oxon: Routledge; 2019. https://doi.org/10.4324/9781315735542.

34. Strader, DB, Wright, T, Thomas, DL, Seeff, LB. Diagnosis, management, and treatment of hepatitis C. Hepatology 2004;39:1147–71. https://doi.org/10.1002/hep.20119.

35. Steel, D. Across the boundaries: extrapolation in biology and social science. Oxford: Oxford University Press; 2008. https://doi.org/10.1093/acprof:oso/9780195331448.001.0001.

36. Straus, SE, Glasziou, P, Richardson, WS, Haynes, RB. Evidence-based medicine: how to practice and teach EBM, 5th ed. Edinburgh: Elsevier; 2019.

37. Andreoletti, M, Teira, D. Rules versus standards: what are the costs of epistemic norms in drug regulation? Sci Technol Hum Val 2019;44:1093–114. https://doi.org/10.1177/0162243919828070.

38. Holman, B. Philosophers on drugs. Synthese 2019;196:4363–90. https://doi.org/10.1007/s11229-017-1642-2.

39. Howick, J. Exploring the asymmetrical relationship between the power of finance bias and evidence. Perspect Biol Med 2019;62:159–88. https://doi.org/10.1353/pbm.2019.0009.

40. La Caze, A. Better evaluating mechanisms in medicine. J Eval Clin Pract 2019;25:1228–31. https://doi.org/10.1111/jep.13222.

Received: 2022-10-06
Accepted: 2022-12-09
Published Online: 2023-02-20

© 2022 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
