The problem of evidence for evaluation in a post-truth world
We need better ways to have conversations with each other about action in the social world. Evaluation has the potential to bridge radically different viewpoints. Yet program evaluation has a problem reconciling that part of its history focused on measuring outcomes attributable to an intervention using rigorous quantitative measures, with formative and constructivist attempts to develop initiatives that are more inclusive of diverse values about program objectives and intended outcomes.
The problem of different perspectives is not new. In the 1990s this was apparent in the debate about the primacy of quantitative vs. qualitative methods. More recently the focus on power has shifted the emphasis for evaluation to recognise systems of marginalisation and exclusion and to place greater emphasis on the question of ‘whose values?’ A common solution proposed by evaluators is also not new – mixed methods and multi-criteria evaluation that include a diversity of perspectives are now as old as the hills.
What is new is an increasingly post-truth attitude that has moved from the fringes, into the mainstream, and now into the practice of evaluation. Jeff Noonan is correct to point out the flaws in both left- and right-wing post-truth politics. Without any kind of objective reality to discuss, the basis for negotiated compromise in political action – rather than winner-takes-all contests – is hard to find. One group brings its own ‘evidence’; the other brings its own in turn. Each has its own statistics and stories. One group seeks to impose its views on the polity in terms of what is ‘true for them’. There appears to be little appetite for unmotivated reasoning, for discussion of common values, or for dialectic about the good society that caters to the needs and aspirations of all people. It is a fairly hopeless situation.
One response to this problem of ‘anything goes’ is to seek more objective measures of program worth – to put questions of power and values largely to the side (or allow them to appear in the form of case studies) and leave evaluation with the major task of applying quantitative methods for causal inference. To simply ask: what were the outcomes, and without the spin, please? There are certainly enough poor-quality evaluation reports around – reports that make vague claims and provide weak evidence – to lead a reasonable person to favour approaches to evaluation with more quantitative and less ambiguous answers. But this is not the only, or even the best, approach to generating more rigour in the evaluation of complex social interventions. There is another way that, I think, offers both the benefit of rigour and the promise of practical utility – one that may better engage the people responsible for administering complex programs with evaluation.
Let’s consider a program as a proposition for action, or in the simplest possible terms, a plan. Let’s consider evaluation as more about the management of risky propositions for action than applying social science. Let’s bring people together to deliberate about the logic of a proposed course of action, and while we are doing it, let’s make the values explicit. Let’s consider rigour in terms of the clarity, credibility and utility of claims that can be made about a program’s worth. Let’s focus on making those claims, citing facts and using warrants to turn facts into evidence that supports those claims. Let’s be open to refutation. Let’s use evaluation as an opportunity to flex the core muscles that make any democracy work – the ability to persuade rather than dictate. In short, let’s recast evaluation as a discipline concerned with reasoned action.
Once again, this is not a new idea. Aristotle referred to the making and evaluation of claims about the value of public action as deliberative rhetoric. Aristotle’s use of the term carried its original meaning, not the art of persuasion or trickery with which ‘rhetoric’ was later associated. He was interested in the art of finding the inherent persuasiveness of any proposition for action. Logic has developed over the centuries since. Now we may say that a program provides a valid proposition for action if we can accept the reasons provided by proponents that it is likely to work. A well-grounded proposition does in fact work. So, if a program is both valid and well-grounded it is ‘sound’ – the highest praise that should be necessary (and sufficient) to support the expenditure of taxpayer dollars on an initiative to advance the public good. On this approach, we see diagrams with boxes and arrows not as ‘theories of change’, ‘logic models’, or overly linear ‘causal models’ but as the bare bones of an argument structure. To accept this approach requires just one paradigm shift in evaluation: from applied social science to reasoned action.
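For readers who like their logic spelled out, the valid/well-grounded/sound distinction can be sketched in standard propositional notation. This is an illustrative formalisation only – the symbols I₁…Iₙ, A and O for inputs, assumptions and outcomes are shorthand of mine, not part of any established evaluation notation:

```latex
% A program proposition: inputs plus assumptions are claimed to yield outcomes
P:\; (I_1 \land I_2 \land \dots \land I_n \land A) \rightarrow O

% Valid: we accept that the conclusion follows from the premises
\mathrm{valid}(P) \iff I_1, \dots, I_n, A \vDash O

% Well-grounded: the premises were in fact realised during delivery
\mathrm{grounded}(P) \iff I_1 \land \dots \land I_n \land A \ \text{held in practice}

% Sound: valid and well-grounded -- the program worked, for the reasons given
\mathrm{sound}(P) \iff \mathrm{valid}(P) \land \mathrm{grounded}(P)
```

On this reading, a boxes-and-arrows diagram is simply the proposition P drawn out: each box a premise, each arrow a claimed inference that can be examined and contested.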
On a more propositional approach, I believe public servants and program designers (perhaps supported by evaluators) should be concerned with translating the values advanced by government ministers into propositions for action that are logically valid – propositions that include compelling reasons as to why inputs + assumptions will be sufficient to generate outputs and outcomes. This must include reasons as to why each input or activity is necessary and why collectively these will be sufficient for generating intended outputs and outcomes – in a given context. Reasons will include evidence from previous studies, including randomised controlled trials (RCTs). As an aside, I have no particular problem with the statement that an RCT is the most rigorous method for making claims of causal attribution when the assumptions that underpin the method can be met. I also believe that while RCTs are strong on internal validity (answering questions about whether it was our actions that caused the change) they are weak on external validity (whether the same action would yield the same result in a different context). This means RCTs are useful (if not necessary) but are not sufficient to provide evidence that a program with many moving parts that worked in the past will work again in the future.
On a reasoned, logical, or propositional approach, the role of evaluation is twofold. Evaluators should be concerned with testing the validity of a proposition in the design phase as well as verifying whether the proposition was well-grounded during delivery. This makes room for an ex-ante, prospective, or propositional evaluation phase that is relatively cheap and can rule out plans that were never going to work at a very early stage. In the ex-post phase (more traditionally associated with evaluation) we can proceed methodically and rigorously, assessing the extent to which each premise in the argument (or proposition) was brought about. In this way, evaluation can be concerned with managing the risk of program failure and is sufficiently detailed to move beyond black-and-white statements about whether a program works (or worked). It can provide clear information, supported with evidence, as to how and to what extent a program worked – and in such a way that the program may be modified, re-targeted, or abandoned.
In summary, to address the challenges for evaluation in our modern context we need an approach that can handle a volatile, uncertain, complex and ambiguous world (VUCA – ontology) in a post-truth era (a fraught epistemology) that must incorporate a plurality of values (a contested axiology). We need a basis for negotiated compromises. I believe a paradigm that shifts evaluation from the task of accumulating knowledge to one of supporting reasonable, valid or justified decisions about action is our best bet. We might never know, or all agree – but still we must act, and providing transparency for our reasons and then testing them is a rational way of justifying collective action in a democratic society. For more information and examples of propositional evaluation see other blogs and content on this website.