Policy-making, the challenges of reductionism and the role of evidence
In the context of a discussion on kitemarks for policy, Claire Melamed of ODI tweeted: “national resources still have to be allocated [between] diff[erent] things. some way of comparing impact v imp[ortant]” (helpful filling in, my own). I thought this was interesting, and it triggered this rather long post. Reading the tweet again, it struck me that it could be taken in two ways: the less extreme interpretation says that evidence is important for understanding impact and informing judgements; the more extreme argues for boiling the policy process down to a comparison of evidence, one predicted intervention against another.
The latter interpretation would be based on a very bullish view of the value and capabilities of research. It reminded me of a hypothetical decision-making process described in Charles Lindblom’s article back in 1959 (Public Administration Review, 19:2, 79-88), the article which, as far as I know, really started the examination of the use of evidence in policy (it’s worth quoting at length):
“[A policy-maker] might start by trying to list all related values in order of importance … Then all possible policy outcomes could be rated as more or less efficient in attaining a maximum of these values. This would of course require a prodigious inquiry into values held by members of society and an equally prodigious set of calculations on how much of each value is equal to how much of each other value. He could then proceed to outline all possible policy alternatives. In a third step, he would undertake systematic comparison of his multitude of alternatives to determine which attains the greatest amount of values. In comparing policies, he would take advantage of any theory available that generalized about classes of policies… Finally, he would try to make the choice that would in fact maximize his values.”
Lindblom makes his attitude to the feasibility of this approach clear:
“For complex problems, [this approach] is of course impossible. Although such an approach can be described, it cannot be practiced except for relatively simple problems and even then only in a somewhat modified form. It assumes intellectual capacities and sources of information that men simply do not possess, and it is even more absurd as an approach to policy when the time and money that can be allocated to a policy problem is limited, as is always the case.”
It is worth noting that it is precisely this Herculean task that the Public Accounts Committee expect DfID to engage in (para 18), with value-for-money indicators serving as a common currency for our values (itself something of a reach, since even for myself I’m not sure in these straitened times whether my money translates better into, say, a first-rate rye loaf for lunch today or a pint of stout in my local at the end of the week).
Indeed, Lindblom’s article predates some key elements of the literature which have since emerged, and which further qualify the validity of the bullish interpretation of the tweet:
- Complex projects are not predictable and are therefore not replicable – previous experience does not help, so previous evidence is no guide to whether the intervention will work again – while complicated projects can in principle be evaluated, but have so many factors that they are prohibitively difficult to measure reliably; and
- If the question is how you allocate resources to achieve an impact, then there’s an institutional complexity – as the donor side fragments, and becomes more difficult to coordinate, resources are going to be increasingly allocated through different channels, and with different controls, and through different agents in ways that are difficult to predict – what are others doing, what if they change the game, what if they make your aid dollar less valuable through prior actions?
All very difficult.
Evidence plays a vital role in informing decision-making. However, for me, that role is to inform the decisions of policy-makers who are (of course) competent and humble, willing to learn, deeply embedded in their context, and possessed of strong relationships with partners and a good dose of political nous. They will make decisions based on their knowledge and intuition. No leader working in complex and dynamic environments – whether an officer in charge of a FOB in Helmand or an executive in the private sector – is expected to boil their decision-making down to numbers. It seems strange to me to want to reduce this to a formula of compared values – that’s why we have experts – but I suspect it’s because the wider development industry isn’t trusted by the public, and therefore needs to be scrutinised. Again, for me, accountability for these decisions could be based on a review of whether the decision was rational, given the inputs and stated criteria for justification (thinking about it, not unlike an appeal in our law courts from a lower to a higher court) – i.e. a process assurance, plus a review of the outcomes, measured insofar as possible, for the sake of learning.
It’s been interesting to see how DfID has wrestled with these challenges. Writing this reminded me of two recent evaluations of programme areas that I came across while scutting about the web – one is of public awareness campaigns in the UK, the other of transparency and accountability initiatives (TAIs). In the former, the evaluators said:
“We are confident that … raising awareness of development issues in the UK has contributed to reducing poverty overseas. However, the evidence is circumstantial and consequently we have been unable to prove conclusively that this is the case. We can make the argument that it does, but there are simply too many causal connections to be able to prove it.”
Without this proof, the team concluded that DfID should reconfigure its spending. For the second field, that of TAIs, the evaluators said “[t]he evidence base is not large enough – there are simply not enough good impact studies – to begin to assess overall trends” (PDF p. 19). Yet investment in transparency and accountability remains (appropriately in my view) a core area of DfID spending. The cases are not identical, since there are differences in the volume and nature of the literature, and in the degree to which theories of change for the interventions have been articulated and tested, but the difference in response to a lack of evidence is illuminating. Another example I came across this morning, from the recent Africa All Party Parliamentary Group report on the Bilateral Aid Review, is also interesting:
“Our concerns relate to the lack of objective criteria used to select focus countries… The Needs-Effectiveness Index (NEI) appears to have been used to justify the subjective decisions of officials, rather than to make objective decisions. DFID did not provide a clear and convincing explanation of why its Needs-Effectiveness Index was constructed in the way it was, and our study shows that alternative and equally credible assumptions would have pointed to significantly different results… In particular we disagree with the decision to close DFID’s bilateral programme in Burundi, a small, extremely poor, fragile country recovering from decades of civil war, which is highly dependent on aid and whose stability has deep consequences for the wider region. When our alternative assumptions are used in the Needs-Effectiveness Index, Burundi ranks as the number one country, and, had these criteria been used to select focus countries, DFID’s programme in Burundi would not have been selected for closure.”
The critical conclusion of this work is, I think, an inevitable consequence of the vast task DfID is being asked to do in rolling out, across the board, a comparable and aggregatable VfM approach. For me, when decision-making becomes a technical process of reducing complex programmes to a comparison of aggregated numbers, the actual, real decision is inevitably and inadvertently submerged somewhere in the avalanche of assumptions that are required to make the exercise in any way feasible. In the end, by obscuring the reasons for decisions, I think this makes for less rather than more accountability.