Evaluation is to science as wisdom is to knowledge

andrewjhawkins
Apr 12, 2024

Updated: Apr 14, 2024

This morning, I helped my son with his homework. He had to find a synonym of ‘wisdom’ from a set of focus words. The answer in the book was ‘knowledge’. That this was given as the correct answer illustrates that, while science is important, it cannot set a standard for evaluation. Let me explain.

Evaluation is often considered a form of applied social science and is less often associated with the study of wisdom or reasoned action. Recently, it was suggested that arguments I was making about the relationship between evaluation and science might benefit from a reading of What Is This Thing Called Science? by Alan Chalmers. I suspected the suggestion was along the lines of ‘Andrew, your view of science is narrow: what has been considered to be science has changed throughout history and there is much more to science than the testing of theories with experimental evidence’. While this was never my view, I obtained and read a copy of the book to learn Chalmers’s views on the matter and see how they might apply to evaluation.

Chalmers’s excellent and very readable book provides a great introduction to the philosophy and epistemology of physical science. Examples are drawn from astronomy, physics, and chemistry over the last 400 or so years. It is engaging and draws on many currents in the philosophy of science to give a comprehensive yet gentle account of the field. While moving through the book, I could not escape the question, ‘Why would a course on the evaluation of public policy set as a foundational text a book aimed at giving students of the physical sciences a grounding in the history of their field?’ Perhaps to illustrate that it is hard to say exactly what science is, and that views on what science is and how it progresses are contested and evolving? This would appear a useful exercise to ground aspiring scientists in the complexities and difficulties of what they may be setting out to achieve in their careers, but it is not sufficient justification for setting the book as a useful entrée to evaluation in the social world. Chalmers himself, while he barely touches on social science, appears mildly hostile to the concept when he refers to ‘the so-called social or human sciences’ (Chalmers 2013, pxx).

Chalmers resists revealing his own conclusions about science directly. He appears, however, to reach one at the commencement of Chapter 13, his last chapter on epistemology, after explicating observation, inductivism, falsifiability and experimentation, paradigms, realism, Bayesianism and then experimentalism again: the social activity of science is a program of work that makes bold claims that predict novel phenomena or explain phenomena in novel ways, claims that can be falsified, and claims that may be improved in light of empirical evidence. In the postscript he confirms his view that ‘the distinguishing feature of science lies precisely in the sense in which it [a claim about the world] is empirically supported’ (Chalmers 2013, p233). For our purposes, as evaluators seeking to learn from this book, it is important to note that he tends to prefer experiments as the basis for science, which is not the view of many who seek to present evaluation as part of the social sciences.

The tension between theory and observation is a constant feature of science in the book. Popper and Kuhn give priority to theory over observation (Chalmers 2013, p121) and agree that the point of observation is to test a theory. Chalmers concludes his epistemology discussion with an affinity for ‘new experimentalism’ and states that, even though he finds it far from a complete answer, ‘there is no doubt that new experimentalism has brought the philosophy of science down to earth in a valuable way and that it stands as a useful corrective to some of the excesses of the theory dominated approach’ (Chalmers 2013, p179). New experimentalism focuses on the importance of experimental practices and the autonomy of experimentation from theoretical science. Chalmers refers frequently to Deborah Mayo and her critique of the traditional view, in which experiments were seen primarily as mechanisms for confirming theories. She argues that experimental practice is not merely a tool for theory validation but an independent source of empirical information for reliably learning about the world. Chalmers also appears to favour the view that experiments can and should challenge and refine scientific theories, not just confirm them. This does not mean he would expect the evaluation discipline to focus on scientific experiments.

Chalmers does not attempt to distinguish an experimenting science from other worthwhile forms of inquiry such as history, legal reasoning, or ethics, but he alludes to their value when presenting Feyerabend’s critique that scientists need access to ‘other forms of knowledge’ (Chalmers 2013, p144). Chalmers himself appears to anticipate the danger of an uncritical acceptance of a scientific worldview when he warns in the introduction:

‘A high regard for science is seen as a modern religion, playing a similar role to that played by Christianity in Europe in earlier eras’ (Chalmers 2013, pxxi).

To consider evaluation as a type of science, it is necessary to provide an account of evaluation that bears some resemblance to science. Evaluators can make bold claims about law-like behaviour: these are the ‘what works’ claims or, on a realist account, ‘demi-regularities’. The realist account provides a more sophisticated iteration of the phrase, focused not on interventions but on abstract causal mechanisms: ‘what works for whom under what circumstances and to what effect’. In either case, these are statements for which law-like regularity is the goal and the highest standard for claims about the value of interventions. These are claims that could, in principle, be tested against observation.

Does evaluation, as it is actually practiced, progress as a program of work that makes bold claims that are falsifiable and generalisable about the law-like, or at least regular, behaviour of naturally occurring phenomena? Clearly evaluation is not concerned, as science is, with naturally occurring phenomena but with human interventions into social systems. It may be claimed that, as these interventions are the creation of naturally occurring phenomena (i.e. humans), they are themselves natural. On the face of it this seems disingenuous. Setting this as a focus for science would suggest that all the humanities are best understood through scientific study, and that their value to humanity can be reduced to it, as the positivists attempted in the 19th century and as Bhaskar critiques in The Possibility of Naturalism. Donald Campbell discussed this state of affairs for evaluation in 1985 in his seminal paper Can we be scientific in the applied social sciences?, where he described the ‘current pseudoscience in which we inadvertently find ourselves engaged’ and offered a prescription that aligns with the ideas of Chalmers. His ideas have yet to be taken up. I claim that taking them up would not actually promote a more valid or useful discipline of evaluation, even if it would make evaluation appear more scientific.

The point of measuring an outcome in a theory-driven or scientific evaluation would surely be to test a theory. This would mean evaluation reports focus on developing and testing theory, rather than on reporting the process and outcomes of a particular intervention. Government guides to evaluation do not suggest theory testing has any relevance other than as a means to the end of demonstrating or measuring valuable outcomes. Academics writing in the research tradition may argue that testing theory is what they are being paid to do, but it would seem a rather severe case of bad faith not to realise that a client generally wants an answer to some variation of ‘what happened, why, and how can we do it better?’ A client’s level of interest in theory is, appropriately, limited to whether or not the intervention is based on sound theory, not how the evaluation could develop theory.

Evaluators are almost always working on the ‘here and now’ and the value of a particular intervention for a particular purpose. They are generally concerned with interventions by specific people into specific contexts, ‘a somewhere and somewhen’ as described by Nancy Cartwright. As Stephen Toulmin, drawing on Tom Nagel, points out: ‘The substance of everyday experience refers always to a “where and when”: a “here and now” or “there and then”. General theoretical abstractions, by contrast, claim to apply always and everywhere, and so – as Tom Nagel points out – hold good nowhere-in-particular’. Evaluators are not usually generating abstract knowledge claims about the behaviour of phenomena (i.e. programs or causal mechanisms within programs) in some kind of law-like manner. If they are, they are then acting as researchers, not evaluators. This is not to say that research and evaluation cannot co-exist, but it is to claim they are different things. If evaluators want to know whether a new curriculum is effective, it is in order to improve the curriculum, or possibly replace it, not to make causal claims about curricula in general. I suggest that most experienced evaluators would accept that evaluation as actually practiced (what might be called an immanent or transcendental critique of the idea of evaluation as science) is not a program of inquiry by a collaborative group of social scientists who set out to make bold claims about the fundamental nature of what works in any kind of generalisable manner.

My own conclusion is that evaluation is the study of action and of the ethics, values and, above all, soundness of the logic of a proposed or current course of action. This approach is set out in various blogs and articles that appear on this website.

“The idea that political science rests on laws and experiments like those of physics, was the notion, either concealed or open, of both Hobbes and Spinoza, each in his own fashion – and of their followers – a notion that grew more powerful in the eighteenth and nineteenth centuries, when the natural sciences acquired enormous prestige, and attempts were made to maintain that anything not capable of being reduced to a natural science could not properly be called knowledge at all…To be rational in any sphere, to apply good judgment in it, is to apply those methods which have turned out to work best in it, anything else is mere irrationalism…Bad judgment [in human studies] consists not in failing to apply the methods of natural science, but, on the contrary in over applying them” – Isaiah Berlin, BBC Third Programme, June 19, 1957, cited in Stephen Toulmin (2003) Return To Reason.

Wisdom, like evaluation, is about applying knowledge in the world. If I were writing homework, I might offer the following dictionary definitions of wisdom as synonyms of evaluative judgment: ‘The ability to use your knowledge and experience to make good decisions and judgments’, or ‘Capacity of judging rightly in matters relating to life and conduct; soundness of judgement in the choice of means and ends; sometimes, less strictly, sound sense, esp. in practical affairs: opposed to folly’.