A Critique of Superintelligence

Introduction

In this article I present a critique of Nick Bostrom’s book Superintelligence. For the sake of brevity I shall not devote much space to summarising Bostrom’s arguments or defining all the terms that he uses. Though I briefly review each key idea before discussing it, I shall assume that readers have some general familiarity with Bostrom’s argument and with the key terms involved. Note also that, to keep this piece focused, I discuss only arguments raised in this book, and not what Bostrom has written elsewhere or what others have written on similar issues. The structure of this article is as follows. I first offer a summary of what I regard as the core argument of Bostrom’s book, outlining a series of premises that he defends in various chapters. Following this summary, I commence a general discussion and critique of Bostrom’s concept of ‘intelligence’, arguing that his failure to adopt a single, consistent usage of this concept fatally undermines his core argument. The remaining sections of this article then draw upon this discussion of the concept of intelligence in responding to each of the key premises of Bostrom’s argument. I conclude with a summary of the strengths and weaknesses of Bostrom’s argument.

Summary of Bostrom’s Argument

Throughout much of his book, Bostrom remains quite vague as to exactly what argument he is making, or indeed whether he is making a specific argument at all. In many chapters he presents what are essentially lists of various concepts, categories, or considerations, and then articulates some thoughts about them. Exactly what conclusion we are supposed to draw from his discussion is often not made explicit. Nevertheless, by my reading the book does at least implicitly present a very clear argument, which bears a strong similarity to the sorts of arguments commonly found in the Effective Altruism (EA) movement, in favour of focusing on AI research as a cause area. In order to provide structure for my review, I have therefore constructed an explicit formulation of what I take to be Bostrom’s main argument in his book. I summarise it as follows:

Premise 1: A superintelligence, defined as a system that ‘exceeds the cognitive performance of humans in virtually all domains of interest’, is likely to be developed in the foreseeable future (decades to centuries).

Premise 2: If superintelligence is developed, some superintelligent agent is likely to acquire a decisive strategic advantage, meaning that no terrestrial power or powers would be able to prevent it doing as it pleased.

Premise 3: A superintelligence with a decisive strategic advantage would be likely to capture all or most of the cosmic endowment (the total space and resources within the accessible universe), and put it to use for its own purposes.

Premise 4: A superintelligence which captures the cosmic endowment would likely put this endowment to uses incongruent with our (human) values and desires.

Preliminary conclusion: In the foreseeable future it is likely that a superintelligent agent will be created which will capture the cosmic endowment and put it to uses incongruent with our values. (I call this the AI Doom Scenario).

Premise 5: Pursuit of work on AI safety has a non-trivial chance of noticeably reducing the probability of the AI Doom Scenario occurring.

Premise 6: If pursuit of work on AI safety has at least a non-trivial chance of noticeably reducing the probability of an AI Doom Scenario, then (given the preliminary conclusion above) the expected value of such work is exceptionally high.

Premise 7: It is morally best for the EA community to preferentially direct a large fraction of its marginal resources (including money and talent) to the cause area with highest expected value.

Main conclusion: It is morally best for the EA community to direct a large fraction of its marginal resources to work on AI safety. (I call this the AI Safety Thesis.)

Bostrom discusses the first premise in chapters 1-2, the second premise in chapters 3-6, the third premise in chapters 6-7, the fourth premise in chapters 8-9, and some aspects of the fifth premise in chapters 13-14. The sixth and seventh premises are not really discussed in the book (though some aspects of them are hinted at in chapter 15), but they are widely discussed in the EA community and serve as the link between the abstract argumentation and real-world action; as such, I have also included them here for completeness. Many of these premises could be articulated slightly differently, and perhaps Bostrom would prefer to rephrase them in various ways. Nevertheless I hope that they at least adequately capture the general thrust and key contours of Bostrom’s argument, as well as how it is typically appealed to and articulated within the EA community.

The nature of intelligence

In my view, the biggest problem with Bostrom’s argument in Superintelligence is his failure to devote any substantial space to discussing the nature or definition of intelligence. Indeed, throughout the book I believe Bostrom uses three quite different conceptions of intelligence:

  • Intelligence(1): Intelligence as being able to perform most or all of the cognitive tasks that humans can perform. (See page 22)
  • Intelligence(2): Intelligence as a measurable quantity along a single dimension, which represents some sort of general cognitive efficaciousness. (See pages 70, 76)
  • Intelligence(3): Intelligence as skill at prediction, planning, and means-ends reasoning in general. (See page 107)

While certainly not entirely unrelated, these three conceptions are all quite different from each other. Intelligence(1) is most naturally viewed as a multidimensional construct, since humans exhibit a wide range of cognitive abilities and it is by no means clear that they are all reducible to a single underlying phenomenon that can be meaningfully quantified with one number. It seems much more plausible to say that the range of human cognitive abilities requires many different skills which are sometimes mutually supportive, sometimes mostly unrelated, and sometimes mutually inhibitory, in varying ways and to varying degrees. This first conception of intelligence is also explicitly anthropocentric, unlike the other two conceptions, which make no reference to human abilities. Intelligence(2) is unidimensional and quantitative, and also extremely abstract, in that it does not refer directly to any particular skills or abilities. It most closely parallels the notion of IQ or other similar operational measures of human intelligence (which Bostrom even mentions in his discussion), in that it is explicitly quantitative and attempts to reduce abstract reasoning abilities to a number along a single dimension. Intelligence(3) is much more specific and grounded than either of the other two, relating only to particular types of abilities. That said, it is not obviously subject to simple quantification along a single dimension as is the case for Intelligence(2), nor is it clear that skill at prediction and planning is what is measured by the quantitative concept of Intelligence(2). Certainly Intelligence(3) and Intelligence(2) cannot be equivalent if Intelligence(2) is even somewhat analogous to IQ, since IQ mostly measures skills at mathematical, spatial, and verbal memory and reasoning, which are quite different from skills at prediction and planning (consider for example the phenomenon of autistic savants). Intelligence(3) is also far narrower in scope than Intelligence(1), corresponding to only one of the many human cognitive abilities.

Repeatedly throughout the book, Bostrom shifts between these conceptions of intelligence. This is a major weakness in his overall argument, since for the argument to be sound a single conception of intelligence must be adopted and applied consistently across all of his premises. In the following paragraphs I outline several of the clearest examples of how Bostrom’s equivocation on the meaning of ‘intelligence’ undermines his argument.

Bostrom argues that once a machine becomes more intelligent than a human, it would far exceed human-level intelligence very rapidly, because one human cognitive ability is that of building and improving AIs, and so any superintelligence would also be better at this task than humans. This means that the superintelligence would be able to improve its own intelligence, thereby further improving its own ability to improve its own intelligence, and so on, the end result being a process of exponentially increasing recursive self-improvement. Although compelling on the surface, this argument relies on switching between the concepts of Intelligence(1) and Intelligence(2). When Bostrom argues that a superintelligence would necessarily be better at improving AIs than humans because AI-building is a cognitive ability, he is appealing to Intelligence(1). However, when he argues that this would result in recursive self-improvement leading to exponential growth in intelligence, he is appealing to Intelligence(2). To see how these two arguments rest on different conceptions of intelligence, note that considering Intelligence(1), it is not at all clear that there is any general, single way to increase this form of intelligence, as Intelligence(1) incorporates a wide range of disparate skills and abilities that may be quite independent of each other. As such, even a superintelligence that was better than humans at improving AIs would not necessarily be able to engage in rapidly recursive self-improvement of Intelligence(1), because there may well be no such thing as a single variable or quantity called ‘intelligence’ that is directly associated with AI-improving ability. Rather, there may be a host of associated but distinct abilities and capabilities that each needs to be enhanced and adapted in the right way (and in the right relative balance) in order to get better at designing AIs. Only by assuming a unidimensional quantitative conception of Intelligence(2) does it make sense to talk about the rate of improvement of a superintelligence being proportional to its current level of intelligence, which then leads to exponential growth. Bostrom therefore faces a dilemma. If intelligence is a mix of a wide range of distinct abilities as in Intelligence(1), there is no reason to think it can be ‘increased’ in the rapidly self-reinforcing way Bostrom speaks about (in mathematical terms, there is no single variable representing ‘intelligence’ which we can differentiate and plug into a differential equation, as Bostrom does in his example on pages 75-76). On the other hand, if intelligence is a unidimensional quantitative measure of general cognitive efficaciousness, it may be meaningful to speak of self-reinforcing exponential growth, but it is not obvious that any arbitrary intelligent system or agent would be particularly good at designing AIs. Intelligence(2) may well help with this ability, but it is not at all clear that it is sufficient – after all, we can readily conceive of building a highly “intelligent” machine that can reason abstractly and pass IQ tests, but is useless at building better AIs.
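
To make this dependence on Intelligence(2) explicit, here is a minimal sketch of the kind of growth model the exponential-takeoff claim requires. The notation is mine rather than Bostrom’s, and the constant k is a placeholder assumption; the point is simply that the derivation only goes through if there is a single scalar quantity I(t) to differentiate.

```latex
% Minimal sketch (my notation, not Bostrom's): recursive self-improvement as a
% differential equation, which presupposes a single scalar 'intelligence' I(t).
\begin{align*}
  \frac{dI}{dt} &= k\,I(t), \qquad k > 0
    && \text{(rate of improvement proportional to current level)}\\[4pt]
  I(t) &= I(0)\,e^{k t}
    && \text{(hence exponential growth)}
\end{align*}
```

Under Intelligence(1), by contrast, ‘intelligence’ is a bundle of loosely related abilities, there is no single I(t) to plug into such an equation, and no exponential takeoff follows automatically.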

Bostrom argues that once a machine intelligence becomes more intelligent than humans, it would soon be able to develop a series of ‘cognitive superpowers’ (intelligence amplification, strategising, social manipulation, hacking, technology research, and economic productivity), which would then enable it to escape whatever constraints were placed upon it and likely achieve a decisive strategic advantage. The problem is that it is unclear whether a machine endowed only with Intelligence(3) (skill at prediction and means-ends reasoning) would necessarily be able to develop skills as diverse as general scientific research, competent use of natural language, and social manipulation of human beings. Again, means-ends reasoning may help with these skills, but clearly they require much more beyond this. Only if we are assuming the conception of Intelligence(1), whereby the AI has already exceeded essentially all human cognitive abilities, does it become reasonable to assume that all of these ‘superpowers’ would be attainable.

According to the orthogonality thesis, there is no reason why a machine intelligence could not have extremely reductionist goals such as maximising the number of paperclips in the universe, since an AI’s level of intelligence is entirely separate from and independent of its final goals. Bostrom’s argument for this thesis, however, clearly depends on adopting Intelligence(3), whereby intelligence is regarded as general skill with prediction and means-ends reasoning. It is indeed plausible that an agent endowed only with this form of intelligence would not necessarily have the ability or inclination to question or modify its goals, even if they are extremely reductionist or what any human would regard as patently absurd. If, however, we adopt the much more expansive conception of Intelligence(1), the argument becomes much less defensible. This should become clear if one considers that ‘essentially all human cognitive abilities’ includes such activities as pondering moral dilemmas, reflecting on the meaning of life, analysing and producing sophisticated literature, formulating arguments about what constitutes a ‘good life’, interpreting and writing poetry, forming social connections with others, and critically introspecting upon one’s own goals and desires. To me it seems extraordinarily unlikely that any agent capable of performing all these tasks with a high degree of proficiency would simultaneously stand firm in its conviction that the only goal it had reason to pursue was tiling the universe with paperclips. As such, Bostrom is driven by his cognitive superpowers argument to adopt the broad notion of intelligence seen in Intelligence(1), but is then driven back to the much narrower Intelligence(3) when he wishes to defend the orthogonality thesis. The key point here is that the goals or preferences of a rational agent are subject to rational reflection and reconsideration, and the exercise of reason in turn is shaped by the agent’s preferences and goals. Short of radically redefining what we mean by ‘intelligence’ and ‘motivation’, this complex interaction will always hamper simplistic attempts to neatly separate them, thereby undermining Bostrom’s case for the orthogonality thesis – unless a very narrow conception of intelligence is adopted.

In the table below I summarise several of the key outcomes or developments that are critical to Bostrom’s argument, and how plausible they would be under each of the three conceptions of intelligence. Obviously such judgements are necessarily vague and subjective, but the key point I wish to make is simply that only by appealing to different conceptions of intelligence in different cases is Bostrom able to argue that all of the outcomes are reasonably likely to occur. Fatally for his argument, there is no single conception of intelligence that makes all of these outcomes simultaneously likely or plausible.

Outcome | Intelligence(1): all human cognitive abilities | Intelligence(2): unidimensional measure of cognition | Intelligence(3): prediction and means-ends reasoning
Quick takeoff | Highly unlikely | Likely | Unclear
Develops all cognitive superpowers | Highly likely | Highly unlikely | Highly unlikely
Absurd ‘paperclip maximising’ goals | Extremely unlikely | Unclear | Likely
Resists changes to goals | Unlikely | Unclear | Likely
Can escape confinement | Likely | Unlikely | Unlikely

Premise 1: Superintelligence is coming soon

I have very little to say about this premise, since I am in broad agreement with Bostrom that even if it takes decades or a century, super-human artificial intelligence is quite likely to be developed. I find Bostrom’s appeals to surveys of AI researchers regarding how long it is likely to be until human-level AI is developed fairly unpersuasive, given both the poor track record of such predictions and the fact that experts on AI research are not necessarily experts on extrapolating the rate of technological and scientific progress (even in their own field). Bostrom, however, does note some of these limitations, and I do not think his argument is particularly dependent upon these sorts of appeals. I will therefore pass over premise 1 and move on to what I consider to be the more important issues.

Premise 2: Arguments against a fast takeoff

Bostrom’s major argument in favour of the contention that a superintelligence would be able to gain a decisive strategic advantage is that the ‘takeoff’ for such an intelligence would likely be very rapid. By a ‘fast takeoff’, Bostrom means that the time between when the superintelligence first approaches human-level cognition and when it achieves dramatically superhuman intelligence would be small, on the order of days or even hours. This is critical because if takeoff is as rapid as this, there will be effectively no time for any existing technologies or institutions to impede the growth of the superintelligence or check it in any meaningful way. Its rate of development would be so rapid that it would readily be able to out-think and out-maneuver all possible obstacles, and rapidly obtain a decisive strategic advantage. Once in this position, the superintelligence would possess an overwhelming advantage in technology and resources, and would therefore be effectively impossible to displace.

The main problem with all of Bostrom’s arguments for the plausibility of a fast takeoff is that they are fundamentally circular, in that the scenario or consideration they propose is only plausible or relevant under the assumption that the takeoff (or some key aspect of it) is fast. The arguments he presents are as follows:

  • Two subsystems argument: if an AI consists of two or more subsystems, with one improving rapidly but only contributing to the ability of the overall system after a certain threshold is reached, then the performance of the overall system could increase drastically once that initial threshold is passed. This argument assumes what it is trying to prove, namely that the rate of progress in a critical rate-limiting subsystem could be very rapid, experiencing substantial gains on the order of days or even hours. It is hard to see what Bostrom’s scenario really adds here; all he has done is redescribe the fast takeoff scenario in a slightly more specific way. He has not given any reason for thinking it at all probable that progress on such a critical rate-limiting subsystem would occur at the extremely rapid pace characteristic of a fast takeoff.
  • Intelligence spectrum argument: Bostrom argues that the intelligence gap between ‘infra-idiot’ and ‘ultra-Einstein’, while appearing very large to us, may actually be quite small in the overall scheme of the spectrum of possible levels of intelligence, and as such the time taken to improve an AI through and beyond this range may be much less than it first appears. However, even if the range of the intelligence spectrum within which all humans fall is fairly narrow in the grand scheme of things, it does not follow that the time taken to traverse it in terms of AI development is likely to be on the order of days or weeks. Bostrom is simply assuming that such rapid rates of progress could occur. His intelligence spectrum argument can only ever show that the relative distance in intelligence space is small; it is silent with respect to likely development timespans.
  • Content overhang argument: an artificial intelligence could be developed with high capabilities but with little raw data or content to work with. If large quantities of raw data could be processed quickly, such an AI could rapidly expand its capabilities. The problem with this argument is that what matters most is not how long it takes a given AI to absorb some quantity of data, but rather the length of time between producing one version of the AI and the next, more capable version. This is because the key problem is that we currently don’t know how to build a superintelligence. Bostrom is arguing that if we did build a nascent superintelligence that simply needed to process lots of data to manifest its capabilities, then this learning phase could occur quickly. He gives no reason, however, to think that the rate at which we can learn how to build that nascent superintelligence (in other words, the overall rate of progress in AI research) will be anything like as fast as the rate at which an existing nascent superintelligence could process data. Only if we assume rapid breakthroughs in AI design itself does the ability of AIs to rapidly assimilate large quantities of data become relevant.
  • Hardware overhang argument: it may be possible to increase the capabilities of a nascent superintelligence dramatically and very quickly by rapidly increasing the scale and performance of the hardware it has access to. While theoretically possible, this is an implausible scenario, since any artificial intelligence showing promise would likely already be operating near the peak of plausible hardware provision. This means that testing, parameter optimisation, and other such tasks will take considerable time, as hardware will be a limiting factor. Bostrom’s concept of a ‘hardware overhang’ amounts to thinking that AI researchers would be content to ‘leave money on the table’, in the sense of not making use of the hardware resources available to them for extended periods of development. This is especially implausible for groundbreaking research on AI architectures that shows substantial promise. Such systems would hardly be likely to spend years being developed on relatively primitive hardware, only to be suddenly and dramatically scaled up at the precise moment when practically no further development is necessary and they are already effectively ready to achieve superhuman intelligence.
  • ‘One key insight’ argument: Bostrom argues that ‘if human level AI is delayed because one key insight long eludes programmers, then when the final breakthrough occurs, the AI might leapfrog from below to radically above human level’. The assumption that ‘one key insight’ would be all it takes to crack the problem of superhuman intelligence is, to my mind, grossly implausible, and consistent neither with the slow but steady rate of progress in artificial intelligence research over the past 60 years, nor with the immensely complex and multifaceted phenomenon that is human intelligence.

Additional positive arguments against the plausibility of a fast takeoff include the following:

  • Speed of science: Bostrom’s assertion that artificial intelligence research could develop from clearly sub-human to obviously super-human levels of intelligence in a matter of days or hours is simply not credible. Scientific and engineering projects do not work over timescales that short. Perhaps to some degree this could change in the future if (for example) human-level intelligence could be emulated on a computer and the simulation then run at much faster than real-time. But Bostrom’s argument is that machine intelligence is likely to precede emulation, and as such all we will have to work with, at least up to the point at which human/machine parity is reached, is human levels of cognitive ability. It therefore seems patently absurd to argue that developments of this magnitude could be made over a timespan of days or weeks. We see no examples of anything like this in history, and Bostrom cannot argue that the existence of superintelligence would make historical parallels irrelevant, since we are precisely talking about the development of superintelligence in the context of it not already being in existence.
  • Subsystems argument: any superintelligent agent will doubtless require many interacting and interconnected subsystems specialised for different tasks. This is how even much narrower AIs work, and it is certainly how human cognition works. Ensuring that all these subsystems or processes interact efficiently, without one inappropriately dominating or slowing overall cognition, and without bottlenecks in information transfer or decision making, is likely to require a great deal of empirical experimentation, tinkering, trial-and-error, and much clever work. All of this takes time.
  • Parallelisation problems: many algorithms cannot be sped up considerably by simply adding more computational power unless an efficient way can be found to parallelise them, meaning that they can be broken down into smaller steps which can be performed in parallel across many processors at once. This is much easier to do for some types of algorithms and computations than others. It is not at all clear that the key algorithms used by a superintelligence would be susceptible to parallelisation, and even if they were, developing efficient parallelised forms of the relevant algorithms would itself be a prolonged process. The superintelligence itself would only be able to help in this development to the degree permitted by its initially limited hardware endowment. We would therefore expect to observe gradual improvement of algorithmic efficiency in parallelisation, enabling more hardware to be added, which in turn enables further refinements to the algorithms used, and so on. It is therefore not at all clear that a superintelligence could be rapidly augmented simply by ‘adding more hardware’ (a numerical sketch of this point follows this list).
  • Need for experimentation: even if a superintelligence came into existence quite rapidly, it would still not be able to achieve a decisive strategic advantage in a similarly short time. This is because such an advantage would almost certainly require the development of new technologies (at least, the examples Bostrom gives almost invariably involve the AI using technologies currently unavailable to humans), which would in turn require scientific research. Scientific research is a complex activity that requires far more than skill at ‘prediction and means-ends reasoning’. In particular, it generally also requires experimental research and (if engineering of new products is involved) the production and testing of prototypes. All of this takes time, and crucially is not susceptible to computational speedup, since the experiments must be performed with real physical systems (mechanical, biological, chemical, or even social). The idea that all (or even most) such testing and experimentation could be replaced by computer simulation of the relevant systems is not plausible, since most such simulations are completely computationally intractable, and likely to remain so for the foreseeable future (in many cases possibly forever). In the development of new technologies and scientific knowledge, therefore, the superintelligence is still fundamentally limited by the rate at which real-world tests and experiments can be performed.
  • The infrastructure problem: in addition to the issue of developing new technologies, there is the further problem of the infrastructure required to develop such technologies, or even just to carry out the core objectives of the superintelligence. In order to acquire a decisive strategic advantage, a superintelligence will require vast computational resources, energy sources to supply them, real-world maintenance of these facilities, sources of raw materials, and vast manufacturing centres to produce any physical manipulators or other devices it requires. If it needs humans to perform various tasks for it, it will likely also require training facilities and programs for its employees, as well as teams of lawyers to acquire all the needed permits and permissions, write up contracts, and lobby governments. All of this physical and social infrastructure cannot be built in the space of an afternoon, and would more realistically take many years or even decades to put in place. No amount of superintelligence can overcome the physical limits on the time required to produce and transform large quantities of matter and energy into desired forms. One might argue that improved technology can certainly reduce the time taken to move matter and energy, but the point is that it can only do so after the technology has been embodied in physical form. The superintelligence would not have access to such hypothetical super-advanced transportation, computation, or construction technologies until it had built the factories needed to produce the machine tools which are needed to precisely refine the raw materials needed for parts in the construction of the nanofactory… and so on for many other similar examples. Nor can even vast amounts of money and intelligence allow any agent to simply brush aside the impediments of the legal system and government bureaucracy in an afternoon. A superintelligence would not be able simply to ignore such social restrictions on its actions until it had gained enough power to act in defiance of world governments, which it could not do until it had already acquired considerable military capabilities. All of this would take considerable time, precluding a fast takeoff.
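
To illustrate the parallelisation point above, the following sketch applies Amdahl’s law, the standard result describing how the serial fraction of a workload caps the benefit of adding processors. The figures are purely illustrative assumptions, not estimates of any actual AI system.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Amdahl's law: overall speedup when only part of a workload can be parallelised."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

# Illustrative assumption: 95% of the workload parallelises perfectly, 5% is inherently serial.
for n in (10, 100, 1_000, 1_000_000):
    print(f"{n:>9} processors -> speedup {amdahl_speedup(0.95, n):5.1f}x")
# However much hardware is added, the speedup can never exceed 1 / 0.05 = 20x.
```

The point is not that a superintelligence’s algorithms would have exactly these characteristics, only that ‘just add more hardware’ is bounded by whatever fraction of the relevant computation resists parallelisation.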

Premise 3: Arguments against cosmic expansion

Critical to Bostrom’s argument about the dangers of superintelligence is the claim that a superintelligence with a decisive strategic advantage would likely capture the majority of the cosmic endowment (the sum total of the resources available within the regions of space potentially accessible to humans). This is why Bostrom presents calculations of the huge numbers of potential human lives (or at least simulations of lives) whose happiness is at stake should the cosmic endowment be captured by a rogue AI. While Bostrom does present some compelling reasons for thinking that a superintelligence with a decisive strategic advantage would have both reasons and the ability to expand throughout the universe, there are also powerful considerations against the plausibility of this outcome which he fails to address.

First, by the orthogonality thesis, a superintelligent agent could have almost any imaginable goal. It follows that a wide range of goals are possible that are inconsistent with cosmic expansion. In particular, any superintelligence with goals involving the value of unspoiled nature, or of constraining its activities to the region of the solar system, or of economising on the use of resources, would have reasons not to pursue cosmic expansion. How likely it is that a superintelligence would be produced with such self-limiting goals compared to goals favouring limitless expansion is unclear, but it is certainly a relevant outcome to consider, especially given that valuing exclusively local outcomes or conservation of resources seem like plausible goals that might be incorporated by developers into a seed AI.

Second, on a number of occasions Bostrom briefly mentions that a superintelligence would only be able to capture the entire cosmic endowment if no other technologically advanced civilisations, or artificial intelligences produced by such civilisations, existed to impede it. Nowhere, however, does he devote any serious consideration to how likely the existence of such civilisations or intelligences is. Given the great age and immense size of the cosmos, the probability that humans are the first technological civilisation to achieve spaceflight, or that any superintelligence we produce would be the first to spread throughout the universe, seems infinitesimally small. Of course this is an area of great uncertainty, and we can therefore only speculate about the relevant probabilities. Nevertheless, it seems very plausible to me that the chances of any human-produced superintelligence successfully capturing the cosmic endowment without alien competition are very low. This does not mean that an out-of-control terrestrial AI could not do great harm to life on Earth, or even spread throughout neighbouring stars, but it does significantly blunt the force of the huge numbers Bostrom presents as being at stake if we think the entire cosmic endowment is at risk of being misused.

Premise 4: The nature of AI motivation

Bostrom’s main argument in defence of premise 4 is that unless we are extremely careful and/or lucky in establishing the goals and motivations of the superintelligence before it captures the cosmic endowment, it is likely to end up pursuing goals that are not in alignment with our own values. Bostrom presents a number of thought experiments to illustrate the difficulty of specifying values or goals in a manner that would result in the sorts of behaviours we actually want an AI to perform. Most of these examples involve the superintelligence pursuing a goal in a single-minded, literalistic way that no human being would regard as ‘sensible’. He gives as examples an AI tasked with maximising its output of paperclips sending out probes to harvest all the energy within the universe to make more paperclips, and an AI tasked with increasing human happiness enslaving all humans and hijacking their brains to stimulate the pleasure centres directly. One major problem I have with all such examples is that the AIs always seem to lack a critical ability in interpreting and pursuing their goals that, for want of a better term, we might describe as ‘common sense’. This issue ultimately reduces to which conception of intelligence one applies, since if we adopt Intelligence(1) then any such AI would necessarily have ‘common sense’ (this being a human cognitive ability), while the other two conceptions of intelligence do not necessarily include this ability. If we do take Intelligence(1) as our standard, however, it is difficult to see why a superintelligence would lack the sort of common sense by which any human would be able to see that the simple-minded, literalistic interpretations Bostrom gives as examples are patently absurd and ridiculous things to do.

Aside from the question of ‘common sense’, it is also necessary to analyse the concept of ‘motivation’, which is a multifaceted notion that can be understood in a variety of ways. Two particularly important conceptions are motivation as some sort of internal drive to do or obtain some outcome, and motivation as a more abstract rational consideration by which an agent has a reason to act in a certain way. Given what he says about the orthogonality thesis, it seems that Bostrom thinks of motivation as some sort of internal drive to act in a particular way. In the first few pages of the chapter on the superintelligent will, however, he switches from talking about motivation to talking about goals, without any discussion of the relationship between these two concepts. Indeed, it seems that these are quite different things, and can exist independently of each other. For example, humans can have goals (to quit smoking, or to exercise more) without necessarily having any motivation to take actions to achieve those goals. Conversely, humans can be motivated to do something without having any obvious associated goal. Many instances of collective behaviour in crowds and riots may be examples of this, where people act on situational factors without any clear reason or objective. Human drives such as curiosity and novelty-seeking can also be highly motivating without necessarily having any particular goal associated with them. Given the plausibility of the claim that motivation and goals are distinct concepts, it is important for Bostrom to explain what he thinks the relationship between them is, and how they would operate in an artificial agent. This seems all the more relevant since we would readily say that many intelligent artificial systems possess goals (the common examples of a heat-seeking missile or a chess-playing program), yet it is not at all clear that these systems are in any way ‘motivated’ to perform these actions – they are simply designed to work towards these goals, and motivation does not come into it. What then would it take to build an artificial agent that had both goals and motivations? How would an artificial agent act with respect to these goals and/or motivations? Bostrom cannot ignore these questions if he is to provide a compelling argument concerning what AIs would be motivated to do.

The problems inherent in Bostrom’s failure to analyse these concepts in sufficient detail become evident in his discussion of what he calls ‘final goals’. While he does not define these, presumably he means goals that are not pursued in order to achieve some further goal, but simply for their own sake. This raises several additional questions: can an agent have more than one final goal? Need it have any final goals at all? Might goals always be infinitely resolvable in terms of fulfilling some more fundamental or more abstract underlying goal? Or might multiple goals form an interconnected, self-sustaining network, such that all support each other but no single goal can be considered most fundamental or final? These questions might seem arcane, but addressing them is crucial for conducting a thorough and useful analysis of the likely behaviour of intelligent agents. Bostrom often speaks as if a superintelligence will necessarily act with single-minded devotion to achieving its one final goal. This assumes, however, that a superintelligence would be motivated to achieve its goal, that it would have one and only one final goal, and that its goal and its motivation to achieve it are totally independent of, and not receptive to, rational reflection or any other considerations. As I have argued here and previously, however, these are all quite problematic and dubious notions. In particular, as I noted in the discussion of the nature of intelligence, a human’s goals are subject to rational reflection and critique, and can be altered or rejected if they are determined to be irrational or incongruent with other goals, preferences, or knowledge that the person has. It therefore seems highly implausible that a superintelligence would hold so tenaciously to its goals and pursue them so single-mindedly. Only a superintelligence possessing a much more minimal form of intelligence, such as the skills at prediction and means-ends reasoning of Intelligence(3), would be a plausible candidate for acting in such a myopic and mindless way. Yet as I argued previously, a superintelligence possessing only this much more limited form of intelligence would not be able to acquire all of the ‘cognitive superpowers’ necessary to establish a decisive strategic advantage.

Bostrom would likely contend that such reasoning is anthropomorphising, applying human experiences and examples in cases where they simply do not apply, given how different AIs could be to human beings. Yet how can we avoid anthropomorphising when we are using words like ‘motivation’, ‘goal’, and ‘will’, which acquire their meaning and usage largely through application to humans or other animals (as well as anthropomorphised supernatural agents)? If we insist on using human-centred concepts in our analysis, drawing anthropocentric analogies in our reasoning is unavoidable. This places Bostrom in a dilemma, as he wants to simultaneously affirm that AIs would possess motivations and goals, but also somehow shear these concepts of their anthropocentric basis, saying that they could work totally differently to how these concepts are applied in humans and other known agents. If these concepts work totally differently, then how are we justified in even using the same words in the two different cases? It seems that if this were so, Bostrom would need to stop using words like ‘goal’ and ‘motivation’ and instead start using some entirely different concept that would apply to artificial agents. On the other hand if these concepts work sufficiently similarly in human and AI cases to justify using common words to describe both cases, then there seems nothing obviously inappropriate in appealing to the operation of goals in humans in order to understand how they would operate in artificial agents. Perhaps one might contend that we do not really know whether artificial agents would have human analogues of desires and goals, or whether they would have something distinctively different. If this is the case, however, then our level of ignorance is even more profound than we had realised (since we don’t even know what words we can use to talk about the issue), and therefore much of Bostrom’s argument on these subjects would be grossly premature and under-theorised.

Bostrom also argues that once a superintelligence comes into being, it would resist any changes to its goals, since its current goals are (nearly always) better achieved by refraining from changing them to some other goal. There is an obvious flaw in this argument, namely that humans change their goals all the time, and indeed whole subdisciplines of philosophy are dedicated to the question of what we should value and how we should go about modifying our goals or pursuing different things from those we currently pursue. Humans can even change their ‘final goals’ (insomuch as any such things exist), such as when they convert between religions or switch between radically opposed political ideologies. Bostrom mentions this briefly but does not present any particularly convincing explanation for the phenomenon, nor does he explain why we should assume that this clear willingness to countenance (and even pursue) goal changes is not something that would affect AIs as it affects humans. One potential response is that the ‘final goal’ pursued by all humans is really something very basic such as ‘happiness’ or ‘wellbeing’ or ‘pleasure’, and that this never changes even though the means of achieving it can vary dramatically. I am not convinced by this analysis, since many people (religious and political ideologues being obvious examples) seem motivated by causes to perform actions that cannot readily be regarded as contributing to their own happiness or wellbeing, unless these concepts are stretched to become implausibly broad. Even if we accept that people always act to promote their own happiness or wellbeing, however, it is certainly the case that they can dramatically change their beliefs about what sorts of things will improve their happiness or wellbeing, thus effectively changing their goals. It is unclear to me why we should expect that a superintelligence able to reflect upon its goals could not similarly change its mind about the meaning of its goals, or dramatically alter its views on how best to achieve them.

Premise 5: The tractability of the AI alignment problem

Critical to the case for artificial intelligence research as a cause for effective altruists is the argument that there are things which can be done in the present to reduce the risk of a misaligned AI attaining a decisive strategic advantage. In particular, it is argued that AI safety research and work on the goal alignment problem have the potential, given the application of sufficient creativity and intelligence, to significantly assist our efforts to construct an AI which is ‘safe’ and has goals aligned with our best interests. This is often presented as quite an urgent matter, something which must be substantively ‘solved’ before a superintelligent AI comes into existence if catastrophe is to be averted. This possibility, however, seems grossly implausible in the light of the history of science and technology. I know of not a single example of a significant technological or scientific advance whose behaviour we have been able to accurately predict, and whose safety we have been able to ensure, before it has been developed. In all cases, new technologies are only understood gradually as they are developed and put to use in practice, and their problems and limitations progressively become evident.

In order to ensure that an artificial intelligence would be safe, we would first need to understand a great deal about how artificially intelligent agents work, how their motivations and goals are formed and evolve (if at all), and how artificially intelligent agents would behave in society in their interactions with humans. It seems to me that, to use Bostrom’s language, this constitutes an AI-complete problem, meaning that there is no realistic hope of substantively resolving these issues before human-level artificial intelligence itself is developed. To assert the contrary is to contend that we can understand how an artificial intelligence would work well enough to control it and plan wisely with respect to possible outcomes, before we actually know how to build one. It is to assert that detailed knowledge of how the AI’s intellect, goals, drives, and beliefs would operate in a wide range of possible scenarios, together with the ability to control its behaviours and motivations in accordance with our values, would still not include the essential knowledge needed to actually build such an AI. Yet what exactly is it that such knowledge would leave out? How could we know so much about AIs without being able to actually build one? This possibility seems deeply implausible, and not comparable to any past experience in the history of technology.

Another major activity advocated by Bostrom is attempting to alter the relative timing of different technological developments. This rests on the principle of what he calls differential technological development: that it is possible to retard the development of some technologies relative to the arrival time of others. In my view this principle is highly suspect. Throughout the history of science and technology, the simultaneous making of new inventions and discoveries is not only extremely common, but appears to be the norm in how scientific research progresses rather than the exception (see ‘list of multiple discoveries’ on Wikipedia for examples). The preponderance of such simultaneous discoveries lends strong support to the notion that the relative arrival of different scientific and technological breakthroughs depends mostly upon the existing state of scientific knowledge and technology – that when a particular discovery or invention has the requisite groundwork in place, then and only then will it occur. If, on the other hand, individual genius or funding initiatives were the major drivers of when particular developments occur, we would not expect the same special type of genius or the same sort of funding program to exist in multiple locations, leading to the same discovery at the same time. The simultaneous occurrence of so many inventions and discoveries would under this explanation be an inexplicable coincidence. If discoveries come about shortly after all the necessary preconditions are available, however, then we would expect multiple people in different settings to take advantage of the common set of prerequisite conditions existing around the same time, leading to many simultaneous discoveries and developments.

If this analysis is correct, then it follows that the principle of differential technological development is unlikely to be applicable in practice. If the timing and order of discoveries and developments largely depend upon the necessary prerequisite discoveries and developments having been made, then simply devoting more resources to a particular emerging technology will do little to accelerate its maturation. These extra resources may help to some degree, but the major bottleneck on research is likely to be the development of the right set of prerequisite technologies and discoveries. Increased funding can increase the number of researchers, which in turn leads to a larger range of applications of existing techniques to slightly new uses and to minor incremental improvements of existing tools and methods. Such activities, however, are distinct from the development of innovative new technologies and substantively new knowledge. These sorts of fundamental breakthroughs are essential for the development of major new branches of technology such as geoengineering, whole brain emulation, artificial intelligence, and nanotechnology. If this analysis is correct, however, they cannot simply be purchased with additional research money, but must await the development of essential prerequisite concepts and techniques. Nor can we simply devote research funding to the prerequisite areas, since these fields would in turn have their own set of prerequisite technologies and discoveries upon which they depend. In essence, science and technology is a strongly interdependent enterprise, and we can seldom predict what ideas or technologies will be needed for a particular future breakthrough to be possible. Increased funding for scientific research overall can potentially increase the general rate of scientific progress (though even this is somewhat unclear), but changing the relative order of arrival of different major new technologies is not something that we have any good reason to think is feasible. Any attempt, therefore, to strategically manipulate research funding or agendas so as to alter the relative order of arrival of nanotechnology, whole brain emulation, artificial intelligence, and other such technologies is very unlikely to succeed.

Premises 6-7: The high expected value of AI research

Essential to the argument that we (society at large, or the EA community specifically) should devote considerable resources to solving the AI alignment problem is the claim that even if the probability of actually solving the problem is very low, the size of the outcome in question (according to Bostrom, the entire cosmic endowment) is so large that its expected value still dominates most other possible causes. This also provides a ready riposte to all of my foregoing rebuttals of Bostrom’s argument – namely, that even if each premise of Bostrom’s argument is very improbable, and even if as a result the conclusion is most implausible indeed, nevertheless the AI Doom Scenario is so catastrophically terrible that in expectation it might still be worthwhile to focus much of our attention on trying to prevent it. Of course, at one level this is entirely an argument about the relative size of the numbers – just how implausible are the premises, and just how large would the cosmic endowment have to be in order to offset this? I do not believe it is possible to provide any non-question-begging answers to this question, and so I will not attempt to provide any numbers here. I will simply note that even if we accept the logic of the expected value argument, it is still necessary to actually establish with some plausibility that the expected value is in fact very large, and not merely assume that it must be large because the hypothetical outcome is large. There are, however, more fundamental conceptual problems with the application of expected value reasoning to problems of this sort, problems which I believe weigh heavily against the validity of applying such reasoning to this issue.
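
To see the structure of this reasoning in the simplest possible terms, consider the following sketch. All of the numbers are invented placeholders for the purposes of illustration; they are not estimates drawn from Bostrom or anyone else.

```python
# Purely illustrative arithmetic for the expected-value argument; all numbers are made up.
value_at_stake       = 1e50   # hypothetical 'value units' in the cosmic endowment
p_doom               = 1e-6   # hypothetical probability of the AI Doom Scenario
p_safety_work_averts = 1e-4   # hypothetical probability that safety work prevents it

expected_value = value_at_stake * p_doom * p_safety_work_averts
print(f"Expected value of safety work: {expected_value:.1e}")  # 1.0e+40

# The structure of the argument: however small the probabilities, a sufficiently large
# stake makes the product enormous -- which is precisely the feature that the
# Pascal's mugging objection discussed below targets.
```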

First is a problem sometimes called Pascal’s mugging. It is based upon Blaise Pascal’s argument that (crudely put) one should convert to Christianity even if it is unlikely that Christianity is true. The reason is that if God exists, then being a Christian will yield an arbitrarily large reward in heaven, while if God does not exist, there is no great downside to being a Christian. On the other hand, if God does exist, then not being a Christian will yield an arbitrarily large negative reward in hell. On the basis of the extreme magnitude of the possible outcomes, therefore, it is rational to become a Christian even if the probability of God existing is small. Whatever one thinks of this as a philosophical argument for belief in God, the problem with this line of argument is that it can readily be applied to a very wide range of possible claims. For instance, a similar case can be made for different religions, and even for different forms of Christianity. A fringe apocalyptic cult member could claim that Cthulhu is about to awaken and will torture a trillion trillion souls for all eternity unless you donate your life savings to their cult, which will help to placate him. Clearly this person is not to be taken seriously, but unless we can assign exactly zero probability to his statement being true, there will always be some negative outcome large enough to make taking the action the rational thing to do.

The same argument could be applied in more plausible cases to argue that, for example, some environmental or social cause has the highest expected value, since if we do not act now to shape outcomes in the right way then Earth will become completely uninhabitable and mankind will thus be unable to spread throughout the galaxy. Or perhaps some neo-Fascist, Islamic fundamentalist, Communist revolutionary, anarcho-primitivist, or other such ideology could establish a hegemonic social and political system that locks humanity into a downward spiral forever precluding cosmic expansion, unless we undertake appropriate political or social reforms to prevent this. Again, the point is not how plausible such scenarios are – though doubtless with sufficient time and imagination they could be made to sound somewhat plausible to people with the right ideological predilections. Rather, the point is that, in line with the idea of Pascal’s mugging, if the outcome is sufficiently bad, then the expected value of preventing it can still be high in spite of a very low probability of the outcome occurring. If we accept this line of reasoning, we therefore find ourselves vulnerable to being ‘mugged’ by any argument which posits an absurdly implausible speculative scenario, so long as it has a sufficiently large outcome. This possibility effectively constitutes a reductio ad absurdum of this type of very low probability, very high impact argument.

The second major problem with applying expected value reasoning to this sort of problem is that it is not clear that the conceptual apparatus is properly suited to the nature of human beliefs. Expected value theory holds that each of our beliefs can be assigned a probability which fully describes the degree of credence with which we hold it. Many philosophers have argued, however, that human beliefs cannot be adequately described this way. In particular, it is not clear that we can identify a single specific number that precisely describes our degree of credence in such amorphous, abstract propositions as those concerning the nature and likely trajectory of artificial intelligence. The possibilities of incomplete preferences, incomparable outcomes, and suspension of judgement are also very difficult to incorporate into standard expected value theory, which assumes that preferences are complete and that all outcomes are comparable. Finally, it is particularly unclear why we should expect or require that our degrees of credence adhere to the axioms of standard probability theory. So-called ‘Dutch book arguments’ are sometimes used to demonstrate that sets of beliefs which do not accord with the axioms of probability theory are susceptible to betting strategies whereby the person in question is guaranteed to lose money. Such arguments, however, only seem relevant to beliefs which are liable to be the subject of bets. For example, of what relevance is it whether one’s beliefs about the behaviour of a hypothetical superintelligent agent in the distant future are susceptible to Dutch book arguments, when the events in question are so far in the future that no enforceable bet could actually be made concerning them? Perhaps beliefs which violate the axioms of probability, though useless for betting, are valuable or justifiable for other purposes or in other domains. Much more has been written about these issues (see, for example, the Stanford Encyclopedia of Philosophy article on Imprecise Probabilities); for our purposes, however, it is sufficient to note that powerful objections can be and have been raised concerning the adequacy of expected value arguments, particularly in applications involving very low probabilities and very high potential impacts. These issues require careful consideration before premises 6 and 7 of the argument can be justified.
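
For readers unfamiliar with how a Dutch book works, here is a minimal worked example; the credences and stakes are invented purely for illustration. An agent whose credences in a proposition and its negation sum to less than one can be induced to accept a set of bets on which it loses money whatever happens.

```python
# Illustrative Dutch book with invented numbers. The agent's credences violate the
# probability axioms: P(A) + P(not A) = 0.8 rather than 1.0.
credence_A     = 0.4   # agent's credence that A is true
credence_not_A = 0.4   # agent's credence that A is false

# At its own 'fair' prices, the agent is willing to sell a $1 bet on A for $0.40
# and a $1 bet on not-A for $0.40. A bookie buys both.
amount_received_from_bookie = credence_A + credence_not_A   # $0.80 up front
payout_owed_by_agent        = 1.0                           # exactly one bet must pay $1

for state in ("A turns out true", "A turns out false"):
    agent_net = amount_received_from_bookie - payout_owed_by_agent
    print(f"{state}: agent's net outcome = {agent_net:+.2f}")  # -0.20 in every state
```

Whether this kind of guaranteed loss should worry us when the relevant events are unbettable, as in the case of a distant-future superintelligence, is precisely the question raised above.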

Conclusion

In concluding, I would like to say a final word about where I believe the greatest danger from AI is likely to lie. On the basis of the arguments I have presented above, I believe that the most dangerous AI risk scenario is not that of the paperclip maximiser or some out-of-control AI with a very simplistic goal. Such examples feature very prominently in Bostrom’s argument, but as I have said I do not find them very plausible. Rather, in my view the most dangerous scenario is one in which a much more sophisticated, broadly intelligent AI comes into being which, after some time interacting with the world, acquires a set of goals and motivations that we might broadly describe as those of a psychopath. Perhaps it would have little or no regard for human wellbeing, instead becoming obsessed with particular notions of ecological harmony, or cosmic order, or some abstracted notion of purity, or something else beyond our understanding. Whatever the details, the AI need not have an aversion to changing its ‘final goals’ (or indeed have any such things at all). Nor need it pursue a simple goal single-mindedly, without ever stopping to reflect or being open to persuasion by other intelligent agents. Nor need such an AI experience a very rapid ‘takeoff’, since I believe its goals and values could very plausibly alter considerably after its initial creation. Essentially all that is required is a set of values substantially at odds with those of most or all of humanity. If it were sufficiently intelligent and capable, such an entity could cause considerable harm and disruption. In my view, therefore, AI safety research should focus not only on how to solve the problem of value learning or how to promote differential technological development. It should also examine how the motivations of artificial agents develop, how these motivations interact with beliefs, and how they can change over time as a result of both internal and external forces. The manner in which an artificial agent would interact with existing human society also warrants considerable further study, since such interactions play a central role in many of Bostrom’s arguments.

Bostrom’s book has much to offer those interested in this topic, and although my critique has been almost exclusively negative, I do not wish to come across as implying that I think Bostrom’s book is not worth reading or presents no important ideas. My key contention is simply that Bostrom fails to provide compelling reasons to accept the key premises in the argument that he develops over the course of his book. It does not, of course, follow that the conclusion of his argument (that AI constitutes a major existential threat worthy of considerable effort and attention) is false, only that Bostrom has failed to establish its plausibility. That is, even if Bostrom’s argument is fallacious, it does not follow that AI safety is a completely spurious issue that should be ignored. On the contrary, I believe it is an important issue that deserves more attention in mainstream society and policy. At the same time, I also believe that relative to other issues, AI safety receives too much attention in EA circles. Fully defending this view would require additional arguments beyond the scope of this article. Nevertheless, I hope this piece contributes to the debate surrounding AI and its likely impact in the near future.

Responding to a Marxist Critique of Effective Altruism

Introduction

In his recent article in Jacobin magazine, Mathew Snow argues that Effective Altruism as a movement is ‘myopic’ and ‘pernicious’ because of its focus on ‘creating a culture of giving’ instead of ‘challenging capitalism’s institutionalized taking’. Here I present a critique of Snow’s argument, first analysing why it fails as a critique of effective altruism, and then highlighting some problematic aspects of his critique of ‘capital’ that are relevant to effective altruists.

Misunderstanding EA

Briefly at the outset, I want to emphasise that I do not believe Snow understands effective altruism very well at all. One key reason for this is his statement that ‘Effective Altruists treat charities as black boxes — money goes in, good consequences come out’. Even a cursory look through the intricate and careful process used by organisations such as GiveWell and GiveDirectly to evaluate the performance and effectiveness of different charities, which incorporates a diversity of considerations and lines of evidence, should be more than sufficient refutation of this absurd claim. The fact that Snow makes it in such a cavalier fashion indicates, I think, a fundamental misunderstanding of the movement – although possibly it also bodes ill for the ability of effective altruists to communicate our core ideas to others in a clear, concise manner.

On Political Predispositions

Snow’s piece is clearly written from a Marxist perspective – the word ‘capital’ appears some sixteen times, often used in an oddly reified way, as if ‘capital’ were some sort of malevolent force which has particular motives and takes specific actions to oppress the poor. I do not share this perspective, and later on in this piece I will make some further comments about the weakness of Snow’s arguments against capitalism. But for the moment, let us suppose that Snow is completely correct in his indictment of capitalism. Let us suppose that capitalism really is responsible for the vast majority of the world’s ills, as Snow says it is (and I don’t think this is a strawman). Granting Snow all this, we now ask – does his conclusion about effective altruism follow? My contention is that it manifestly does not.

Before I begin, I think it is appropriate to articulate my own political biases, for such biases afflict us all in many subtle (and not so subtle) ways. For my part, I used to describe myself as a libertarian. I now reject this label, preferring something along the lines of ‘classical liberal’, or even more recently ‘radical centrist’. As a result, I am naturally predisposed against the sort of Marxist critique presented here by Snow. That said, I do not here wish to offer a comprehensive critique of Marxist political theory (a surprising amount of which I actually agree with – at least in its more classical incarnations), nor do I wish to expound the virtues of free markets (I think they do have many virtues, as well as many important vices). Rather, what I want to focus on here are some particular claims that Snow makes, and why I think they are mistaken and unhelpful.

Snow on Effective Altruism

One of Snow’s core arguments is his assertion that ‘(effective) altruists abstract from – and thereby exonerate – the social dynamics constitutive of capitalism’. I agree with Snow that effective altruists typically ‘abstract from’ the social dynamics of capitalism, as they seldom discuss such things and generally speak at a higher level of analysis, abstracting from the particulars of any specific economic system. Does it follow, however, that this constitutes an ‘exoneration’ of said system? I do not believe that it does. Merely to not focus on something, to abstract from details and focus on some other aspect or broader issue, is not in any way to condone or ‘exonerate’ said thing. To give an example, suppose I were to say ‘such and such many people are murdered every year, and through better policing and criminal justice laws, as well as improvements in education and social welfare programs, etc, we could reduce this number by so and so percent’. By Snow’s logic, such remarks would be illegitimate because I would be ‘abstracting from the social dynamics of violent crime’, thereby apparently ‘exonerating the actions of the perpetrators and overlooking their role as agents in the process’. I contend that this is simply nonsense – to adopt an abstract view of a phenomenon, or to focus on one aspect of it, in no way necessarily exonerates or condones anything. Often it is helpful to focus on particular aspects of reality (complex and multifaceted as it is), and indeed this is precisely what effective altruists claim, namely that it is helpful to talk about giving abstracted (to a degree) from the particular economic system in which it is embedded. Snow does not dispute this; he merely accuses them of ‘exonerating capitalism’ for doing so. To me, this seems little more than a way for Snow to whine that discussion of his evident pet topic is not what effective altruists judge to be the most productive method of aiding the world’s poor.

Snow then proceeds to describe the ‘irony of effective altruism’ as demonstrated by its ‘imploring individuals to use their money to procure necessities’ while ‘ignoring the system that determines how those necessities are produced and distributed in the first place’. While it may be the case, as indicated above, that effective altruists seldom discuss ‘the system’ as such, what Snow does not establish is that this constitutes ‘irony’, or indeed that there is anything wrong with EAs focusing their attention and exhortations in the way that they do. It is quite plausible, indeed I think history indicates overwhelmingly probable, that even if all EAs on the planet, and ten times that number besides, denounced the evils of capitalism in as loud and shrill voices as they could muster, nothing whatever of any substance would change to the benefit of the world’s poor. As such, if our main objective is to actually help people, rather than to indulge in our own intellectual prejudices by attributing all evil in the world to the bogeyman of ‘capital’, then it is perfectly reasonable to ‘implore individuals to use their money to procure necessities for those who desperately need them’, rather than ‘saying something’ (what exactly? to whom? to what end?) about ‘the system that determines how those necessities are produced’.

Later on in his article, Snow utters the incredulous exclamation ‘(the fact) that subsidizing capital accumulation has become the only readily available way for most to act on compassion for others is perverse’. He subsequently refers to much the same phenomenon as an ‘insidious state of affairs’. Once again, however, the reader is left wondering exactly why this outcome should necessarily be so perverse. Again, even if ‘capital’ is the uniquely culpable cause of so much ill, as Snow is wont to continually reiterate, it is extremely common in this non-utopian real world in which we live that we must choose the least bad of several unpalatable alternatives. Likewise, it is often the case that working within the constraints of a flawed and ineffectual system is the best method available for achieving actual progress. (I invite readers to reflect on their own experiences with literally any human institution they have been involved in as validation of this key point.) As such, I argue that it is perfectly plausible, and not at all ‘perverse’, that even if capital is to blame for the problems of global poverty, working within the capitalist system may still be the best method that we have available for helping those in extreme poverty.

Finally, let us examine Snow’s second-last paragraph. Here he states: ‘rather than asking how individual consumers can guarantee the basic sustenance of millions of people, we should be questioning an economic system that only halts misery and starvation if it is profitable. Rather than solely creating an individualized “culture of giving,” we should be challenging capitalism’s institutionalized taking’. As previously, however, Snow here makes strong injunctions without providing any clear argument for them. At best, all that Snow could be said to have argued in his piece is that ‘we should be questioning capitalism’. He does not even try to establish why we should be doing this instead of, or at the expense of, ‘creating an individualized “culture of giving”’. To make this argument, Snow would need to provide some basis for the one being better than the other – yet he does nothing of the sort. Indeed, reading this piece I am quite at a loss to say what Snow’s goals or objectives actually are. He seems to strongly desire the overthrow of ‘capital’, and seems to scoff in derision at those who are working as ‘accountants and marketers for charities with pretensions of “acting now to end world poverty” and figuring out “the most good you can do”’, yet it remains a mystery as to exactly what his more immediate objective might be. Does he want to help the world’s poor as best as he can? If so, what is his argument that writing polemical pieces against capitalism is the best way of doing this (or, indeed, is beneficial in any way for achieving this)? Conversely, if he does not care about helping the world’s poor as best he can, then why should effective altruists pay heed to his injunction to prioritise armchair Marxist critique over charitable giving that demonstrably saves lives?

Snow on Capitalism and Scarcity

So much for Snow’s critiques of effective altruism as a social movement. Now I wish to turn my attention to some of his criticisms of ‘capital’, demonstrating how they rest upon faulty logic, and historical and economic misconceptions. Note that my purpose here is not to get distracted into a discussion of political philosophy per se. I want to focus on a subset of the claims Snow makes which I think are incorrect or highly misleading, and furthermore which I think are relevant to effective altruists as informing how we go about attempting to do the most good we can.

The single largest mistake that I believe Snow makes, in a variety of different ways, is to ignore the fact of scarcity. By ‘scarcity’, I mean that there are not enough goods and services for everyone to have as much as they would like, and therefore some form of allocative rationing is necessary to decide who gets what. Numerous times, Snow argues in a way which belies either ignorance of, or naïve lack of concern for, the fact of scarcity. As one example, he states ‘as men and women with money and moral consciences, we can’t put a price on life, but as men and women participating in a system governed by the logic of capital, we must’. Snow is a student of Kantian ethics, so it is perhaps not surprising that he thinks this way, but I would argue the exact opposite – namely that it is precisely because we are moral men and women that we must (with appropriate care) put a price on human life. By doing so we are able to make intelligent and informed decisions about how to allocate scarce resources to protect as many lives as we can. Without putting a price on life (implicitly or otherwise), we are unable to make any decision about whether a given safety initiative, health intervention, public policy, or other action we might take is beneficial. Absent sufficient resources to accomplish every good outcome we would want, we are forced to make decisions about prioritising some things over others, and it is precisely by putting a price on life that we are able to do this. Even such mundane decisions as driving an automobile involve putting an implicit price on our own lives (as well as those of others), given that we are taking a non-zero risk of death or serious injury for ourselves and others, in exchange for greatly reduced travel time and increased convenience. Most people will have a notion that this tradeoff is ‘worth the risk’, and in thinking this way, about driving or anything else, they are implicitly ‘putting a price on life’. Without doing so, we would be paralysed in all our decision making, unable to weigh any action that involves risk to life or safety (i.e. any action at all) against any other outcome that we value.
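
To illustrate how such an implicit price falls out of an everyday decision, here is a minimal sketch in Python. All of the figures are hypothetical and chosen purely for illustration; the point is only that accepting a small risk of death in exchange for a benefit implies an upper bound on the price one places on one’s own life:

```python
# Illustrative only: hypothetical figures showing how an implicit 'price on life'
# falls out of an ordinary decision, here choosing to drive rather than take a
# slower but safer alternative.

time_saved_hours = 1.0         # hours saved by driving (hypothetical)
value_of_time_per_hour = 30.0  # dollars the traveller implicitly values an hour at (hypothetical)
extra_risk_of_death = 1e-6     # additional probability of a fatal crash on this trip (hypothetical)

benefit = time_saved_hours * value_of_time_per_hour

# If the traveller judges the trip 'worth the risk', they are implicitly valuing
# their own life at no more than benefit / extra risk of death.
implied_price_on_life = benefit / extra_risk_of_death
print(f"Implied upper bound on the price placed on one's own life: ${implied_price_on_life:,.0f}")
# With these illustrative figures: $30 / 0.000001 = $30,000,000.
```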

Snow again illustrates his neglect of the fact of scarcity when he speaks of ‘capital demanding’ a market price be paid for goods and services. He argues as if it is only the existence of ‘capital’ which causes there to be people suffering extreme poverty, as demonstrated by his use of phrases like ‘capital’s commodification of necessities’ and ‘capitalism’s institutionalization of immoral maxims’. Even a cursory study of economic history, however, is more than sufficient to demonstrate that essentially all societies (certainly all those of even moderate size and complexity, perhaps excluding certain isolated tribal peoples) engage in trade and barter of goods – the ‘commodification of necessities’ that Snow attributes to capitalism. Now it is true that the global capitalist system in existence today does so to a much greater extent than ever before in human history. If Snow’s analysis were correct, however, we would thereby expect to be seeing absolute poverty becoming worse over time, as the degree of ‘commodification of necessities’ increases. In fact, what we see is precisely the opposite. Three centuries ago, practically the entire population of the world lived in what we would today call ‘absolute poverty’. Today the proportion is less than one quarter, even despite massive increases in global population. As the world becomes ever more globalised, the proportion and even absolute number of people in absolute poverty is still declining by the decade. I won’t go so far as to argue here that this is because of global capitalism (I think that is true to a notable extent, but there isn’t space to argue that here, nor to make all appropriate caveats that such a claim requires), but at the very least it certainly seems highly inconsistent with Snow’s claim that ‘capital’ is the source and cause of global impoverishment.

Snow likewise explicitly states that capital is the cause of the inability of the global poor to access necessities such as vaccines, malaria nets, basic education, nutritious food, etc. In a sense I agree with him, because the world’s economic system (like any that has ever existed on the face of the planet, ‘socialist’ ones included) is set up in many instances to favour the rich and powerful at the expense of the poor and marginalised. (Rather than blame this all on ‘capital’ as such, I would describe the situation as resulting from an unfortunate confluence of interests between governments and powerful corporations and other lobby groups, but that’s another matter). That being said, it demonstrably is not the case that the world’s poor had plentiful access to such things before the rise of global capitalism and have somehow now been deprived of them.

Malaria nets, vaccines, and everything else are scarce, meaning (as stated above) that there is not enough for everyone to have as much of them as they would like. This necessitates some form of allocation, or of rationing. Snow sometimes talks as if his idealised socialist utopia would do away with all scarcity and hence with the need to ration such goods at all. I contend that there has never existed a single society in the Earth’s history that has not rationed ‘essentials’ by some method. This is true almost by definition: since not everyone can have as much as they would like, some people must necessarily go without, at least to an extent (note: that doesn’t mean some people necessarily need to go hungry, it just means food etc must be rationed somehow). In the modern market economy, rationing takes the form of prices to be paid for goods and services – in Snow’s words this is ‘what capitalist institutions demand’. What Snow neglects is that, because of scarcity, any other possible system would necessarily ‘demand’ something similar, be it in the form of ration cards, political connections, or sheer luck – examples of other, and I would argue far worse, mechanisms for rationing scarce resources.

There is a final point I wish to make about Snow’s analysis, which concerns the identity of his mythical ‘capitalist class’. At least in classical Marxist analysis, the ‘capitalist class’ are the owners of capital, that is, the owners of the means of production (such as land and factories). Today they would, presumably, constitute the owners of the world’s great corporations. But who owns the world’s corporations? The answer is that we (read: wealthy westerners) all do. Anyone who has a superannuation fund, owns shares, or even has money in a bank account is, directly or indirectly, an owner of capital. Now granted, the ownership of capital is far from evenly distributed, and a very small number of individuals own a disproportionate share (probably it is this so-called ‘1%’ that Snow demonises repeatedly in his piece, efforts of the likes of Bill Gates and Warren Buffett evidently notwithstanding). Nevertheless, the fact remains that, as part owners of capital and custodians of resources far greater than most people in history could ever dream of, it is up to us to rectify what Snow correctly identifies as an ‘inability of companies to profit from those with little or no purchasing power’, precisely by improving the purchasing power (directly or indirectly) of those in the greatest need. Snow presumably supports this outcome, though probably he would advocate changes in purchasing power brought about by revolutionary struggle (this having always worked out so well in the past, as indeed recalled (ironically?) in the name of the very magazine Snow is writing for), rather than by philanthropic empowerment of the poor to improve their own lives by providing them with greater resources. Granted, this has often been done poorly in the past as well, but effective altruists have advocated numerous, very specific ways in which the process and outcomes can be improved, something the likes of Snow seldom express much interest in doing when it comes to socialist revolution.

Conclusions

Snow seems to want to avoid sharing any of the blame for the plight of the global poor. He wants to blame everything on global capital (once again, I do not think this is a strawman of his argument), denying both his own culpability (for not doing more to help, something for which we are all culpable alongside him) and the amazing opportunity he has to do real, demonstrable good for others. When people die from lack of food, clean water, and medical care, Marxists like Snow seem to callously say ‘it is not owing to me; it is owing to capital’. Rather than blaming others for the plight of the global poor, based on faulty arguments, questionable economic doctrines, and inaccurate beliefs about history, we should instead acknowledge the good we ourselves can do to make a real difference in this world, and join effective altruists in creating a ‘culture of giving’.

Refuting Criticisms of Utilitarianism and Effective Altruism

Synopsis

This piece is a response to Robert Martin’s piece critiquing Peter Singer’s views concerning utilitarian ethics and Effective Altruism (EA). I do not address every point raised in his article, but restrict my response to four key lines of argument. First, I argue that Martin’s response presumes a binary conception of morality (moral versus immoral) which utilitarianism itself denies, and as such the criticisms he levels on the basis of this assumption have little relevance to utilitarianism. Second, I consider Martin’s argument that EA ethics inevitably leads its would-be practitioners to experience unbearable guilt, and argue that this falsely presupposes both that guilt has any place in a utilitarian ethic, and also that a perfect ideal needs to actually be achievable in order to have merit as an ideal. Third, I argue that, contra Martin’s argument, it is actually the EA supporter, and not the EA critic, who is more loving and caring towards his neighbour. Fourth, I argue that Martin’s critique of EA fails to adequately come to grips with the fact of opportunity costs in the use of resources, while in contrast EA very naturally and deliberately takes opportunity costs into consideration when making ethical judgements.

Note that the quotes at the beginning of each section are taken from Martin’s original article.

Binary Thinking about Morality

“To be truly objective the maxim, ‘to do the most good we can’ would be binding on all people regardless of whether we believe it or not. Therefore at any point if one is not ‘doing the most good we can’ we are actually acting immorally!”

“Hence justifying simply ‘moving in the right direction’ is inconsistent because it means that you don’t actually need to ‘do the most good we can’. The ethic is reduced to, ‘do the most good you feel you’re able to afford.”

“Effective altruism and the consequentialist ethic of Peter Singer reduces ethics to a kind of communist race to the communal bottom. Everyone is equal and if one person has utility above the lowest, then it becomes unethical.”

“My point is that given the claim of the objectivity of this particular ethical system it becomes immoral to do anything which does not save lives of those in extreme poverty.”

Utilitarian ethics has little place for binary notions like “moral” and “immoral”. At best, these may be useful as heuristics to guide behaviour in the face of uncertainty or insufficient time to fully consider the likely outcomes of a particular action in greater depth. They may also serve as shorthand to be used in particularly extreme cases (murder, robbery, rape, gross abuse, etc). In general, however, utilitarianism considers the morality of essentially all actions to be one of degree: action A is morally preferable to action B insomuch as the expected consequences of A serve to increase total utility more than the expected consequences of action B.

Under such an ethical framework, it makes no sense (other than in the purely heuristic sense as outlined above), to assert in any absolute, unqualified way, that an agent has acted “immorally” when they take an action which produces lower expected utility than some possible alternate action. Rather, what they have done is take an action which does less good than another action they may have performed – no more, and no less.

References to non-utility-maximising actions as being ‘immoral’ thus exhibit a misunderstanding of the nature of the ethical claims made by utilitarians. Such statements simply fail to say anything non-question-begging with respect to the suitability of utilitarianism as an ethical framework; for in criticising utilitarianism for pronouncing every action other than the very best possible one as being ‘immoral’, they are necessarily importing binary absolutist notions of ‘moral’ and ‘immoral’ which utilitarianism itself rejects. In order to proceed with this line of critique, therefore, it would be necessary to make an argument as to why incorporating such a binary, absolutist notion of ‘moral’ and ‘immoral’ actions is necessary in order to provide a suitable ethical account. Absent some such plausible account as to why this is in fact the case, however, this line of attack on utilitarianism fails.

Effective Altruism and Guilt

“Ethical altruism has some helpful contributions to make in assessing how scarce resources be allocated, but my criticisms would be less savage if Singer didn’t claim it as an ‘objective’ system. If consequentialism and ethical altruism is objective then we are all condemned under a brutal loveless, ethical system which will lead to social improvement in the developing world but at the cost of an ascetic guilt-ridden hypocrisy.”

“In this ethical framework there is nothing to avoid the slide into a guilt-ridden (how can I ever enjoy chocolate again?) asceticism. Nothing beyond the basics could ever be enjoyed because they would be declared objectively ‘immoral’.”

“There is no forgiveness in ethical altruism, if you eat a chocolate for yourself, you are condemned under the objective guilt of knowing that lives could have been saved elsewhere in the world.”

The argument here seems to be that Effective Altruism is unliveable as an ethical system because it is too demanding, meaning that no one can live up to its dictates, and since no one can live up to its dictates, all those who try will inevitably be subject to a great deal of guilt and anxiety over their perceived moral failings.

My first response takes the form of a question: in what way does this constitute a refutation of EA as an ethical framework? EA says, in essence, that 1) it is morally right to produce as much utility/benefit/happiness/etc as possible, 2) certain courses of action, according to our best evidence, produce much more utility/benefit/happiness/etc than others, therefore 3) it is morally good for us to undertake those courses of action. How is this argument in any way undermined by the fact that it may be difficult, or even impossible, to carry out to its fullest extent? It seems even if the EA ethic is unliveable and tends to produce a great deal of guilt, that in no way casts doubt on any of the statements 1)-3). Thus this objection merely comes down to an assertion that the EA framework is inconvenient for us, as we would rather avoid all the bother and potential guilt. Needless to say, this does not constitute a philosophical argument of any substance for the inadequacy of effective altruism as an approach in applied ethics.

My second line of response is to say that this line of rebuttal seems to presuppose that effective altruism is only valid or relevant as a moral principle if it is possible to be a perfect, completely effective altruist. As far as I can see, this principle is totally unfounded. One is a better EA to the degree that one accords one’s actions with EA principles. This is a matter of degree, and not a binary decision. This is hardly a radical concept: essentially all normative systems incorporate ideals that are unattainable in their pure form, but which nevertheless constitute a valuable ideal to strive towards, and to focus our thoughts and efforts around, even if we know we will never reach them. A cook may strive to make “the perfect dish”, even if they know such a thing is in reality impossible. In science, philosophy, and the legal system, we often speak of epistemic virtues like objectivity, rationality, and impartiality. Everyone accepts that such virtues, in their pure, idealised form, can never be achieved by any actual person in any real situation. We do not, however, conclude on that basis that the notions or theories themselves are flawed, or that therefore everyone is everywhere and always being “irrational” or “partial”. We accept that these virtues can only ever be exercised to greater and lesser degrees, and that the impossibility of actualising their perfect ideal form does not somehow undermine the concept in its entirety.

A third line of response would be to point out that notions of guilt have very little relevance to either a utilitarian ethic in general, or an EA framework in particular. Guilt is simply of no interest to the EA supporter, except insomuch as it may be relevant to ethical outcomes, either by promoting giving, or by inhibiting action through despair or discouragement. The EA supporter views guilt as a real and important aspect of human psychology which one needs to consider seriously. It does not, however, play any critical or central role, motivating or otherwise, in a utilitarian ethical theory. As such, it is simply false to assert that a person who chooses an action which yields less than maximal utility is “condemned under the objective guilt”. Likewise the notion of forgiveness – this notion just has no place in a naturalistic, utilitarian ethic. Arguing that the utilitarian/EA ethical framework is defective because it has no place for forgiveness is simply to beg the question against utilitarianism, because precisely the point of utilitarianism is that such notions of binary, absolute moral/immoral judgements, guilt, and forgiveness are largely irrelevant to the question of morality, which is instead concerned with degrees of goodness determined by the consequences of different possible actions. A cogent critique of utilitarianism as an ethical theory cannot proceed by simply presupposing aspects of morality which utilitarianism itself rejects, as this is to beg the question.

Misconstrual of Love

“Indeed love is absent from the brutal consequentialist system advocated by Singer.”

“All good things are to be seen as gifts of God and to be received with thanksgiving (1 Tim 4:4). This means I can enjoy a chocolate cake!”

“Yet the imperatives also broadens the concept of ‘neighbour’ to include not just our global neighbours, but also our local ones, meaning we can build a school hall to the betterment of our local society and love our neighbours with cancer and perform research to help them. Therefore caring for the ‘good’ of our neighbours is achieved through both the Christian ethic and consequentialism, but the Christian ethic is more nuanced and sophisticated.”

The sincere Effective Altruist strives to do as much good for their fellow man as possible, knowing that they will never succeed completely, but always attempting to do better, and endeavouring to use the best reason and evidence available to seek out new and better ways to do the most good with the limited resources at their disposal. They seek to serve as many of their neighbours as possible, not discriminating by race, class, distance, or convenience, but deciding purely on the basis of how much good they can do for their fellow man.

The EA critic, it seems, is content to eat chocolate cake, donate to their local school hall, and then perhaps donate some money to EA charities as well, justifying this to themselves by saying that one could never be truly and completely effectively altruistic anyway, and by pretending, through various logical contortions, that somehow the resources and time spent on their chocolate cake and local school hall could not actually have been used to help the world’s poor and needy anyway. They seek to serve their neighbour, but with a special preference for neighbours who are conveniently located close by (note: I hope this is not taken as a personal attack against anyone – it is not intended as such, I’m just trying to make a point).

I ask the reader in all sincerity: which now of these two, thinkest thou, was most loving?

Ignoring Opportunity Costs

“If Singer and the effective altruism ethic is correct, then virtually every economic, social and moral choice made in Australia today is ‘immoral’”

“This is because when these decisions are compared with saving lives of people in extreme poverty then on the simple consequentialist metric outlined by Singer, saving lives of those in extreme will always ‘win’ i.e. they will always be morally preferable. Therefore when posed with the question, ‘should we build a new road in Melbourne? The answer under effective altruism will be ‘no, because this money could save lives of people in extreme poverty’. Should I eat a chocolate cake on my birthday? ‘no, because this money could save lives of people in extreme poverty’ Should we build a new school auditorium? Should we treat an injured knee? Should I treat my friend’s cancer? The answer to all these questions is the same – ‘no, because this could save lives of people in extreme poverty’.”

“Moreover other decisions which would have enormously beneficial outcomes for the extreme poor are also rendered ‘immoral’. For example this ethical framework would preclude funding Ebola virus research because the net ‘utility’ of lives saved in developing countries would be greater by providing Malaria nets or immunisation compared with lives saved through Ebola research.”

It is unclear to me what these sorts of statements are attempting to accomplish. If we consider the tripartite core EA argument which I outlined above, which of the three propositions are these arguments supposed to address? They seem to be total non sequiturs. To take the Ebola research example, why would it be a bad thing for EA to recommend that we ought to put resources into bed nets and vaccinations rather than Ebola research, if it is true that the former will save more lives than the latter? Is it because Ebola research will save more lives in the long run, or have other indirect benefits that we haven’t considered? If this is the case, then we have simply denied the premise that vaccinations and bednets will actually do more good than Ebola research, in which case the effective altruist would support the Ebola research as well, so there is no disagreement. On the other hand, if it is agreed that the Ebola research will do less good than vaccinations and bednets, even when factoring in future benefits and side-effects, etc, then what possible justification can there be for preferring the Ebola research over the bednets and vaccinations? How is it a defect of the EA framework that it comes to this conclusion?

I wish also to say a few words regarding resource use in developed countries. Taken at face value, the EA ethic would seem to imply that since building roads, medical expenditure – indeed most public expenses of any sort in developed countries – are not as effective uses of funds as donating to the leading EA charities, we ought not to do them. The first point to make here is that it is simply a fact that resources have opportunity costs. Instead of building a new road or paying a doctor’s salary or whatever else, that money could have been used to save lives in the developing world. This is a fact about reality. It has nothing to do with one’s ethical framework, or the worldview one is operating under. Opportunity costs exist, and (needless to say) they don’t go away merely because we don’t like the sound of them, or because thinking about them makes us feel uncomfortable about the difficult tradeoffs we must make.

The second point, however, is that it is necessary to exercise some care when making statements like “we should donate money to EA charities rather than build a new road”, because there is in fact no moral agent to which such collective pronouns apply. “We” are not a moral agent; individuals are moral agents. “We” don’t have any money or any ability to choose how it is spent, so it makes little sense to ask how “we” should spend our money as a nation or a community or whatever. What makes sense from a moral framework is to ask how you and I should spend our money, as individual moral agents who can take particular moral actions. So rather than asking what “we” should do, we should be more careful in our thought and speech, and consider exactly who we are saying should do this or that with the resources they have at their disposal.

The third point to make about this comparison is that, as an attempted reductio against EA, it is a very poor one. The reason is that, if EA were applied ‘universally’, or even in a much more systematic way by many more people and organisations, there would be no need at all to redirect money from road building or hospitals (or whatever else) to fund EA charities, because all such charities would already have been fully funded many times over through funds freed up by forgoing other expenses. Every effective charitable cause could be fully funded many times over with the enormous amount of money that could be diverted from non-essential spending by westerners (I leave it to the reader to imagine precisely what is included in this category), without any need to sacrifice truly important things like roads, schools, and hospitals.

 

The Ethical Imperative of Effective Altruism

Synopsis

In this piece I argue for the ethical imperative of Effective Altruism, by which I mean that I believe we are ethically obligated to donate as much money as we can to the charities which save the most lives per dollar spent. I take a rough figure of $2000 per life saved from GiveWell, and argue that we must always consider this as the benchmark against which all other proposed donations and causes are judged. I then expand this argument to apply not only to charitable donations, but also to all our purchasing decisions. I argue we must seriously consider the lives we could save for every single dollar we spend on anything.

Saving More Lives is Better

How much does it cost to save a life? Many people don’t even like to think about such a question – after all, isn’t it crass and vulgar to put a dollar value on human life? I don’t think so. Indeed, I think it’s positively immoral not to ask this question, and seriously consider the answer. We simply must ask the question of how much it costs to save a life. Why? Basically the argument goes like this:

1) Holding other relevant considerations approximately equal, we ought to take the action that saves more lives over any actions that save fewer lives
2) Donating all one’s charitable contributions to the charity which saves lives for the lowest cost will lead to more lives being saved than any other action we could take
3) Therefore, we should donate all our charitable contributions to the charity which saves lives for the lowest cost

Some Responses

There are a lot of ways one could object to this argument. One could debate the merits of attempting to deal with deeper structural or social problems, rather than donating exclusively to specific global health initiatives. One could question the value of present lives versus future lives, and how that might affect our analysis. One could argue that life itself is not all that matters, and that we should also consider the good being done in improving the quality of life. All of these objections, and many others like them, are completely valid and worthy of discussion and serious consideration. They also, I think, largely miss the point. And what is that point? The point is, whatever else one proposes that we use our money or resources for, whether it be environmental activism or political reform or human rights protection or art preservation or whatever else, we must everywhere and always remember this fact: every dollar spent on such causes could (in general) have instead been used to save lives. This is a concept called opportunity cost – the forgone benefit that we could have received had we used our resources in another way. In arguing that we ought to donate time or money to a particular cause, we must always and everywhere remember that this represents time and money that could have been used to save lives by instead donating to the most cost-effective charities.

Our Opportunity Cost

So how much does it cost to save a life? GiveWell has some excellent analysis of this question, some of which can be found here. The issue is stupendously complicated, but let me just pick a ballpark figure. Based on the GiveWell data, let’s say that the best charities can save a life for about $2000. Maybe it’s really $1000, or $5000. And maybe there are other benefits of these programs too besides just saving lives. That’s not really important. The rough figure is what matters, and it seems pretty clear that it is on the order of a few thousand dollars.

So what does that mean? First of all, I think it ought to put a lot of things in perspective. When I’m considering whether to give to an art gallery or an environmental lobby group or to cancer research, I must remember that every $2000 I give is one less life saved. That’s my opportunity cost. So I had better be pretty damn sure that the money I’m donating to this other cause is going to have some very significant impact, if it is to outweigh the forgone benefit of one life saved. This isn’t some abstract intellectual exercise. GiveWell considers all sorts of factors in evaluating charities, including actual on-the-ground effectiveness and room for more funding. That means, as best as we can tell, you can, in fact, actually increase the number of lives saved by providing these charities with additional funding, allowing them to expand their operations (e.g. buy more bed nets, disburse more money, fund more deworming programs, etc). I’m not saying here that none of these other causes can ever be worth it. My point is simply that we have an ethical obligation to be aware of what we could be doing with our funds, and what we are giving up when we donate to charities other than the most cost-effective ones.

Doing Both?

But can’t we do both? Can’t we donate, say, to deworming and also to cancer research or Greenpeace or whatever else? No, you can’t. At least, not in any meaningful sense. Unless you are someone like Bill Gates, your funds are very limited compared to the capacities of the organisations in question. This means that every dollar you don’t give to the most cost-effective charities is failing to have as much impact as it could. You can’t just pretend that the opportunity cost somehow magically disappears just because you donated some of your money to the more effective charity. That alternative still exists for every single dollar you give away. So you simply cannot ‘do both’. For every dollar, the question is the same: donate to the most effective charities, which can save a life for $2000, or donate to some other organisation. Again, I’m not saying that the ‘some other organisation’ option is never the better choice (though I do think it very rarely is). I’m just saying, this is the alternative that we always have. This is the reality we face.

The Ethics of Every Purchase

My argument here, however, does not extend only to our charitable donations. It can also (and I think ought to) be applied to absolutely every purchase decision we ever make. Here is the brute fact: every dollar we spend on anything is one less dollar that could have been spent on GiveWell top charities, saving (something like) one life per $2000 donated. Thus, every single purchase we make is a moral act. Every time we hand over money for anything, we are handing over some part of a chance to save a life. Whenever you see a price tag, you should mentally divide by 2000, because that’s the number of lives you are not saving by buying that thing. How much does a car cost? Several thousand dollars for a used one. That’s a couple of lives right there. How much does it cost to attend a music concert? A hundred dollars? Several of those in a year is maybe a fifth of a life. How much does a cup of coffee cost? A few dollars? How often would you buy one? Every other day? Every day? That’s some non-trivial fraction of a life. How much does an overseas holiday cost? Several thousand dollars? Another couple of lives. How much do you spend on jewellery? Alcohol? Eating out? Electronics? DVDs? Airfares? Vacations? How much money do you earn every year? How much do you spend on things that you do not really need to get by? How many lives could you have saved, but did not?
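
As a small illustration of the ‘divide by the cost of saving a life’ bookkeeping described above, here is a minimal sketch in Python. The $2000 figure is the rough ballpark used in this piece; the purchases and amounts are hypothetical examples, not anyone’s actual budget:

```python
# A sketch of the 'divide by the cost of saving a life' bookkeeping described above.
# The $2000-per-life figure is the rough ballpark used in this piece; the purchases
# and amounts below are hypothetical examples.

COST_PER_LIFE_SAVED = 2000.0  # rough ballpark figure, in dollars

annual_spending = {  # hypothetical discretionary spending per year
    "used car (amortised)": 3000.0,
    "overseas holiday": 4000.0,
    "daily coffee": 365 * 4.0,
    "concerts": 3 * 100.0,
}

total_lives_forgone = 0.0
for item, dollars in annual_spending.items():
    lives_forgone = dollars / COST_PER_LIFE_SAVED
    total_lives_forgone += lives_forgone
    print(f"{item:>22}: ${dollars:>8,.0f}  =  {lives_forgone:.2f} lives not saved")

print(f"{'total':>22}: about {total_lives_forgone:.1f} lives per year")
```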

Conclusion

I do not paint a very attractive picture here. I’m saying that most of what we spend our money on is frivolous in comparison to the good that we could do by donating this money to the most effective causes. And I don’t mean to put myself on a pedestal. I do try to limit unnecessary purchases. But am I really any better? I doubt it. There are numerous luxuries that I allow myself which I nonetheless don’t really need. But however much of a hypocrite I may be, I don’t think that in any way diminishes our ethical obligation to do better. We simply must do better. Many, many lives depend on it.

Arguments Against Effective Altruism

Synopsis

Here I present an argument against the ‘Effective Altruism’ (EA) movement. First, I argue that the philosophy of the EA movement is predicated upon a contested utilitarian ethical framework which the movement makes insufficient efforts to justify, and seems to lead to positively perverse conclusions when taken to extremes, as can be seen in the ‘repugnant conclusion’. Second, I argue that the myopic EA focus on empirical evidence of charitable efficacy is misguided, as it leads to worthwhile interventions being neglected simply because they are too difficult to measure. Third, I argue that in focusing excessively on empirical evidence and cost-benefit calculations, EA ignores many longer-term systemic flow-on effects, which in many cases can be far greater than short-term charitable outcomes. Finally, I argue that the EA movement’s lofty standards for empirical evidence and rational decision making are so onerous that even the EA movement itself cannot live up to them, meaning that EA fails in its endeavor to extirpate personal intuition and subjective judgements from charitable analysis.

Introduction

There have been a number of critiques of effective altruism put forward, for example here, here, here, here, and here, but at least in my view, most of them aren’t very compelling or coherent. This piece is an attempt to remedy the situation by providing a more robust case against EA. For the purposes of this piece, I define effective altruism as a social movement, drawing particular inspiration from the work of Peter Singer, the core characteristic of which is a focus on achieving maximum charitable impact by targeting donations to causes which have been empirically demonstrated to yield the most cost-effective outcomes, and shaping one’s life and career so as to maximise the aggregate impact one can make through such programs. Note that the views expressed in this piece do not necessarily reflect those of the author.

Utilitarian Presumption

EA is based upon an underlying presumption of utilitarian ethical theory. Indeed, it often seems that such utilitarianism is treated as an axiom not worthy of further discussion. But the trouble is, not everyone shares such a utilitarian ethic. In particular, deontologists and virtue ethicists will not necessarily agree with the hard-nosed EA utilitarian that we should let the blind man down the road go without a guide dog if doing so means that we are able to instead repair the eyesight of fifty Africans. Nor is there any clear reason for such people to accept this utilitarian viewpoint, aside of course from extended philosophical discussion, which given the past two millennia of philosophical history seems unlikely to yield much consensus anyway. Given such differing views on ethics, why are EA advocates so dogmatically confident in their utilitarianism, taking it to be so obvious and basic that all others should just automatically agree?

Indeed, there are some fairly simple yet powerful arguments against the sort of utilitarian ethic championed by EA. Consider, for instance, the general EA antipathy towards donations to art or cultural organizations, or even ‘community building’ sorts of activities like donating to the local scouting group or the Make-A-Wish Foundation. The standard argument against such donations is that, whilst they may be ‘nice’ and improve our communities, these benefits are overwhelmed by the utility of the many lives that could be saved if the same resources were redirected to health and education interventions in the Third World. The trouble with an argument like this, however, is that taken to its natural end, it leads to the repugnant conclusion: namely that it is better to have a world comprised of a very large number of people whose lives are just barely worth living, compared to a world comprised of a much smaller population of healthier, happier, flourishing persons. Virtue ethicists in particular would likely object to the notion that saving lives is the overriding ethical consideration that we face – what also matters is what we do with those lives, for example, to build happy, supportive, flourishing communities. But EA utilitarianism seems to tell us that more lives is always preferable.

EA supporters might retort that this abstract philosophical problem has little bearing on the actual real-world decision of whether we should donate to an art gallery or to AMF, but this seems like a highly unsatisfactory, and indeed positively evasive, response from a movement so founded on intellectual rigour and philosophical clarity. Analyses such as these demonstrate that EA rests on a much shakier philosophical grounding than its proponents care to consider or admit.

The Limits of Empirics

EA advocates careful measurement of the cost-effectiveness of donations, and generally recommends against donations to charities unable to demonstrate measurable positive impacts. Consider for instance GiveWell’s recommended charities: these are the charities for which GiveWell was able to find sufficient empirical evidence of cost-effectiveness. The ‘non-recommended’ charities are such not because they have been found to be ineffective, but only because they have not been found to be effective. The problem with this approach is that, in general, absence of evidence is not evidence of absence. The question one must consider is: would we expect to have quality evidence of efficacy for charities that were effective? If so, then absence of evidence would indeed be evidence of absence. But there seems to be no reason whatever to think this, and many reasons to think otherwise. In particular, social interventions of the kind carried out by charities are exceptionally complex, multi-faceted, and difficult to measure. It is not at all clear, therefore, that the absence of evidence of efficacy tells potential donors anything at all about the charity in question. But EA treats such absence of evidence as if it were positive evidence that the charity is not worth donating to.
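
The point can be put in simple Bayesian terms. Here is a minimal sketch in Python with purely illustrative numbers (not drawn from GiveWell’s analysis): what matters is how likely we would be to find good evidence of efficacy if the charity really were effective:

```python
# A minimal Bayesian sketch (illustrative numbers only) of when 'absence of
# evidence' is evidence of absence. The key quantity is how likely we would be
# to have found good evidence of efficacy *if* the charity really were effective.

def posterior_effective(prior, p_no_evidence_if_effective, p_no_evidence_if_not):
    """P(effective | no evidence of efficacy was found), by Bayes' theorem."""
    numerator = p_no_evidence_if_effective * prior
    denominator = numerator + p_no_evidence_if_not * (1 - prior)
    return numerator / denominator

prior = 0.5  # illustrative prior probability that a given charity is effective

# Case 1: effective programs would almost always produce measurable evidence.
# Finding none is then strong evidence of ineffectiveness (posterior ~0.10).
print(posterior_effective(prior, p_no_evidence_if_effective=0.1, p_no_evidence_if_not=0.9))

# Case 2: the intervention is hard to measure, so even effective programs usually
# lack good evidence. Finding none then tells us very little (posterior ~0.47).
print(posterior_effective(prior, p_no_evidence_if_effective=0.8, p_no_evidence_if_not=0.9))
```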

There are deeper issues with this overarching focus on the empirical. An obsession with specific measurable outcomes inevitably leads to a focus on those outcomes to the exclusion of other considerations. Even specific exhortations to the contrary are often ineffective, as the formal and informal incentives of the relevant organisations are all focused towards the specific goal that is actually being measured: for corporations this is profits, leading to neglect of environmental and other considerations. For charities, this problem could manifest itself in other ways. For example, if a program’s efficacy is measured by the number of bednets placed, this produces an incentive to supply large numbers of low-quality nets to easily-accessible populations who may not necessarily need them. One can consider many similar examples where a focus on one or two easily-measurable outcomes results in a neglect of other important aspects of a program, catastrophically undermining its efficacy.

The retort to these sorts of problems is usually the assertion that one simply needs to adopt better metrics. But that is exactly the point – often there are no good metrics, or the only good metrics would be too complicated and expensive to collect given limited resources and infrastructure. So either the program goes ahead with lousy metrics, in which case EA over-emphasis on empirical outcomes can lead to adverse consequences owing to the inadequacy of the metrics being used, or the program just doesn’t go ahead at all, owing to the lack of any ability to demonstrate its efficacy. EA philosophy seems impotent to deal with this deep dilemma.

Flow-On Effects

Excessive focus on empirically-proven charitable interventions also means that the EA movement has a tendency to ignore other causes and consequences which are harder to quantify or even define with any precision, but are nonetheless no less real or important. One example would be ‘flow-on effects’, or second and third-order consequences of charitable programs. What effects does a program have on community cohesion in the long-run? How does this particular initiative alter the incentives faced by local officials, charity workers, and donors? What unintended consequences may this action have? Such indirect consequences are by definition hard to foresee or measure, and therefore tend to be neglected in the sort of outcome-based analyses favoured by EA advocates.

Consider another example. If EA had existed in the 18th century, its advocates would presumably have argued for the upper classes and emerging bourgeoisie of Western Europe to invest their time and energies in poverty reduction, promoting basic health care and education for the masses, etc. Much as contemporary EA supporters decry donations to art galleries as being comparatively ineffective, our imaginary 18th century EA predecessors would presumably have opposed the use of time and resources for speculative research in physics, chemistry, and biology, or work in areas with no obvious practical value for improving people’s lives, such as arcane mathematical concepts like calculus and statistics, or inventing new fields of study like ‘economics’, or new ethical philosophies like ‘utilitarianism’, given the lack of any evidence at all that this sort of work would yield practical benefits for the needy. And yet, without these pioneering developments, the modern EA movement would be without the technical ability to carry out its approved interventions, the statistical and modelling tools necessary to gauge their effectiveness, or even the very philosophical and theoretical framework with which to articulate its position and analyse opposing views. Examples such as this show that, at best, EA recommendations about cause prioritisation seem to miss a great deal that matters.

Impossible Standards to Meet

This issue of the limits of empirics brings me to a final point about the impossible standards that EA sets, standards which even it is unable to meet. The EA movement calls for charitable efforts to be prioritised on the basis of objective cost-effectiveness analyses. But there are numerous ‘meta-level’ questions surrounding EA which simply cannot be decided on the basis of such criteria, whether because of a lack of data, a lack of agreement about key concepts, a lack of time for analysis, or some combination of these factors. Consider the debate over whether to give now or give later. What is the evidence-based, objective basis for deciding one way or the other on this question? Or consider the debate about the relative importance of ameliorating global poverty versus tackling existential risks. Again, where is the evidence or objective criteria for arriving at a position on this matter? Similar questions could be raised regarding disputes concerning earning to give, environmental ethics, population ethics, animal suffering, and many other matters. What EA supporters end up doing, of course, is making a decision based on intuition, emotion, personal experience, and the vagaries of their particular situation – precisely the decision-making methods which the EA movement decries in charitable giving more broadly. This comes about, as I have said, because the standards that EA sets for itself (and others) are simply impossible to meet. We don’t have enough data, we don’t have enough agreement about core concepts, and we don’t have enough time or ability to weigh up and analyse all the various considerations.

EA supporters might respond by saying “yes but at least we’re trying to use evidence and rationality as much as we can, so our donations will on average be at least somewhat more effective, even if we make some mistakes and leave out some factors”. It is not at all clear, however, that even this claim is true. Since EA supporters cannot agree about whether to give now or later, or whether to support malaria eradication or friendly-AI research, and since no objective rational basis exists for supporting one over the other (at least by EA standards, given the lack of any clear evidence), the effectiveness of these different approaches could well vary just as much as the effectiveness of traditional charities. As such, EA giving falls prey to the very same criticism it levels against traditional charities – one might be doing good, but we can’t really tell because we can’t measure it.