Symposium on Military AI and the Law of Armed Conflict: Front- and Back-End Accountability for Military AI

[Rebecca Crootof is a Professor of Law at the University of Richmond School of Law and the inaugural ELSI Visiting Scholar at DARPA.

The author is not authorized to speak on behalf of DARPA and has written this purely in her personal capacity.]

Want to get a sense of what’s coming with regard to AI-enabled military systems? Allow me to share some of what I’ve learned about DARPA’s research programs – and how DARPA is expanding its efforts to identify and address the ethical, legal, and societal implications of new military technologies from the earliest stage of development.

The mission of the U.S. Defense Advanced Research Projects Agency (DARPA) is “to make pivotal investments in breakthrough technologies for national security.” It has a strong track record of doing so, having developed prototype stealth technology, miniaturized GPS devices, the micro-electro-mechanical systems which are now used in everything from air bags to video games, the technology that allows for voice-based interaction with handheld devices, and the digital protocols that enabled the Internet. It arguably has “the longest-standing, most consistent track record of radical invention in history,” in part because it focuses on “DARPA hard” (*cough**mindbogglingly implausible**cough*) research, like growing plants that sense national security threats (Advanced Plant Technologies (APT)), enabling scalable quantum computers (Optimization with Noisy Intermediate-Scale Quantum (ONISQ)), and exploring space-based biomanufacturing methods to convert astronaut waste into useful materials (Biomanufacturing: Survival, Utility, and Reliability beyond Earth (B-SURE)).

As relevant to this symposium, DARPA is at the forefront of U.S. military AI research and development. To address AI’s general inability to extrapolate from one scenario to another, the Science of Artificial Intelligence and Learning for Open-world Novelty (SAIL-ON) program is dedicated to creating systems that can identify and adapt to novel inputs. To foster machine learning progress, the Real Time Machine Learning (RTML) program is designing chips with the computing power needed for future advances. To minimize human overtrust of generated text, the Friction for Accountability in Conversational Transactions (FACT) program is exploring when and how to introduce friction into human/AI interactions. To protect vulnerable AI, the Guaranteeing AI Robustness Against Deception (GARD) program is designing adversarial AI defenses. 

Simultaneously, some programs are exploring how AI might enable new capabilities, ranging from improving designs for cyber-physical systems (Symbiotic Design) and finding and fixing software vulnerabilities (AI Cyber Challenge) to facilitating knowledge curation to improve intelligence analysis and military decisionmaking (Collaborative Knowledge Curation (CKC)). Still more programs straddle the line between improving AI and enabling new capabilities. Air Combat Evolution (ACE) is investigating when and why humans trust AI with vital tasks, such as assisting in a dogfight; In the Moment (ITM) is assessing whether aligning AI to human values increases our willingness to grant them decision-making power. (And, if that wasn’t challenging enough, ITM is evaluating this question in the context of battlefield triage: a complex, time-sensitive situation where even experts disagree about what should be prioritized.)

DARPA’s structure foregrounds how much individual choice can impact technological research and development. Program managers wield enormous discretionary power over what technological breakthroughs are pursued and how new technologies take shape. They determine the problems that significant funds are dedicated to solving, set the metrics for success, select the performers (researchers and developers who endeavor to achieve the set goals), decide which designs advance or are abandoned, and oversee transitions that enable further development with defense, other government, and commercial partners.

In some ways, DARPA might seem an odd place for a scholar whose work is often about increasing international legal accountability regimes, including for harms caused by malicious cyberoperations, autonomous weapon systems, and war more generally. It is neither an international institution nor much concerned with the structures of new legal regimes. And while individuals here exercise immense influence over the development of new military technologies, not one of them is subject to legal liability for these choices (nor should they be, under any reasonable mens rea or proximate cause analysis). And yet, a few months ago, I began my tenure as DARPA’s inaugural Ethical, Legal, and Societal Implications (ELSI) Visiting Scholar, shifting my focus from conceptualizing back-end accountability mechanisms to affirmatively fostering front-end moral accountability practices.


Every DARPA program proposal must have answers to the Heilmeier Catechism questions, which include, “What are you trying to do?”; “Who cares? If you are successful, what difference will it make?”; and “What are the risks?” 

It’s possible to answer these questions narrowly, focused solely on a particular problem. Say, if you’re interested in inventing a magnetohydrodynamic drive – using magnets to silently propel a submarine – you might perceive the risks as being the technological limitations which threaten the ultimate success of your program. Maybe it won’t be possible to generate sufficiently powerful magnetic fields, or maybe the likelihood of corrosion to electrode materials will render any such technology inherently short-lived and ultimately useless. 

But one can also think about these Heilmeier questions broadly. Assuming technical success, how far out would the magnetic field project? Would it affect local sea life? Risk detonating mines? Disturb whale song or fish and bird migration patterns? To DARPA’s credit, they have long taken a broader perspective on risks. Before I arrived, the manager of Principles of Undersea Magnetohydrodynamic Pumps (PUMP) had already identified the potential environmental impacts of magnetic fields as a concern and devised a research plan for better understanding their scope and effects. 

But in this “Year of ELSI,” as DARPA Director Stefanie Tompkins has termed it, I am charged with devising processes and strategies to make explicit what is already happening implicitly and expanding the ways in which folks think about programs and their impacts. How might design choices made today affect the world, should today’s DARPA projects become just as commonplace as navigation apps, Siri, and the Internet?

To that end, I and others at DARPA are developing institutional infrastructure to further a culture of ELSI. Program managers will be able to work with advisors with subject matter expertise to consider a range of questions, imagine potential use cases, identify constraints that could limit or mitigate risks, and select metrics and partners that will maximize opportunities. These questions include:

  • How is the technology intended to be used? 
  • What else might the technology enable? 
  • What training, expertise, or infrastructure is necessary for the technology to be used effectively? 
  • What are the sources of error and malfunction? 
  • What are the vulnerabilities? 
  • How might it be foreseeably misused? 
  • How might it be weaponized?
  • What are the benefits and risks of creating this technology? Of not creating it? 
  • Are those benefits or risks marginal when compared with practices enabled by current technologies? (And might that answer change, depending on which technology is used as the comparator?)
  • Who will be affected? Which individuals, populations, or entities will enjoy the benefits? Which will bear the risks?
  • Which of these ELSI considerations are raised during the program’s lifecycle? If the program is successful, which of these arise in the near-term? In the long-term?
  • What actions or constraints should be incorporated to address unknowns, to promote beneficial usage and externalities, or to minimize problematic usage and externalities?

One exhilarating aspect of this work is the possibility that taking time to pause and consider these and related questions may highlight considerations or concerns that a purely defense-driven perspective might miss. Identifying unknowns – such as the extent of the environmental impact associated with a magnetohydrodynamic drive – allows one to draft a program plan with requirements for measuring the scope of the resulting magnetic fields and evaluating their effects. Identifying potential negative uses and externalities might scuttle a program – or result in choices and changes that enable achieving a similar goal by other means. And identifying positive alternative uses and externalities might result in choices and changes that maximize them, as discussed further below. My favorite conversations with program managers are those in which they pause and say, “Huh – I hadn’t thought of that . . .”

Nor is this a zero-sum game, where considering these issues comes at the cost of speed or operational performance. Quite the contrary: One of the more surprising outcomes of Urban Reconnaissance through Supervised Autonomy (URSA), which had ongoing meetings with an ELSI working group, was that many of the “ELSI” recommendations – that the system be more explainable and transparent and have a less biasing and more intuitive user interface – were welcomed by the folks designing the systems as simply good engineering. Based on these suggestions, the performers altered the technological design in ways that resulted in the operator having more information and options than they would have had otherwise, allowing the system to be used in a way that minimized the risk of certain accidents and more accurately reflected the operator’s intent.

Identifying concerns can even spur entirely new programs and research areas. Augmented Reality (AR) devices might help pilots and troops identify and avoid restricted targets, but they also introduce vulnerabilities that adversaries might exploit, ranging from inducing operator motion sickness to data corruption (such as data poisoning) that could increase the risk of accidental targeting. In anticipation of the proliferation of military AR applications, Intrinsic Cognitive Security (ICS) is exploring how to build protections into the design of AR devices before there’s a stabilized design and accompanying ecosystem of insecure and dangerous products. Imagine the problems (and problems) (and problems) we might have sidestepped if the insecure Internet of Things had been subject to a similar set of baseline requirements.

Even more broadly, some DARPA programs have focused on norm-building within a research and development community. Safe Genes sought to produce methods and technologies to counter the misuse of genome editors. In light of the obvious ethical issues associated with gene editing, it brought together various stakeholders to draft a Code of Ethics for Gene Drive Research. More recently, DARPA has launched the Autonomy Standards and Ideals with Military Operational Values (ASIMOV) program, which aims to develop quantitative benchmarks for measuring and evaluating the ability of autonomous systems to perform ethically within various military scenarios. If successful, it would dispel some of the confusion that arises when technologists, policymakers, civil society advocates, and others concerned about autonomous weapon systems talk past each other. A shared understanding of what it means for a system to be, say, “Traceable Readiness Level 5” would be invaluable in designing systems to minimize risk, in enabling interoperability with allies’ systems, and in making progress towards crafting detailed and effective regulations for weapons with varying autonomous capabilities.


One daunting aspect of this work is that, as the great philosopher Yogi Berra observed, “It’s tough to make predictions, especially about the future.” I imagine being at DARPA (then ARPA) in 1969, attempting to brainstorm the possible implications of the network protocols they were constructing and which are foundational to internet communications today. It might have been possible to foresee email and file sharing – but social media? E-commerce? Online entertainment? Telemedicine? Botnets? Changed expectations of surveillance and privacy? Please.

I am also well aware of what the road to hell is paved with, and I am hardly a techno-utopian. (Rather, my general perspective has been kindly referred to as that of a “skeptical, progress-hating monster.”) No matter how much time and resources are poured into anticipating issues, humans have a remarkable ability to figure out how to use devices in unanticipatable ways. And, ultimately, while technology might mitigate or exacerbate societal issues, it generally cannot solve them. 

But I do believe that some harms may be minimized through a responsible approach to research and development – yes, even weapons research and development – that encourages everyone involved to consider the impact of and feel morally accountable for their choices, regardless of whether they are ever at risk of legal liability. 


All that being said, I stand by my earlier legal accountability arguments. No matter how much good faith work is done on the front end, accidents will still occur. And in warfare, accidents are often destructive and lethal, with devastating consequences for warfighters and civilians. Presumably, warfighters understand and knowingly undertake the risk of such harms. Civilians do not.

New AI military technologies may increase both accidental and incidental – and therefore lawful – civilian harm. This raises an accountability problem. Absent willful action, no human can (or should) be held criminally liable. And, to the extent no international humanitarian rules are violated, no state can be held responsible for an internationally wrongful act. In response, some have proposed stretching the law of superior/command responsibility to criminalize negligence by commanders, procurers, and others involved in the design and deployment of AI-enabled weapon systems, but this is a misguided and insufficient response. (Misguided because it threatens to further delegitimize international criminal law, insufficient because it would still not address all unintended civilian harms.)

When civilians suffer the horrific consequences of armed conflict, they deserve redress. But neither international criminal law nor state responsibility provides any form of remedy when civilian harm is accidental or incidental to otherwise lawful action. And even in cases where individuals are held criminally liable or states are deemed responsible, there is no guarantee that individual civilians will be compensated for their individual harms. 

To address the actual accountability gap at the heart of international humanitarian law, we need to establish an effective mechanism for compensating harmed civilians – all harmed civilians.


It is impossible to anticipate all potential consequences of a technological breakthrough ahead of time. And after-the-fact legal accountability is inherently inadequate – even in the most robust and efficient legal regimes, anyone would far prefer to have prevented a harm in the first place than to be eventually compensated for it. 

A comprehensive governance structure for military AI requires developing both front-end moral accountability practices and back-end legal accountability mechanisms. This will entail intentionally incorporating strategies and processes for identifying and minimizing sources of unnecessary harm at the earliest technological design stages and creating routes to redress for civilians who unfairly suffer the harms of war.
