When Tech Disrupts Faster Than Rules Adapt: Drafting Emergency Guidance for AI-Affected Evidence

[Sabrina Rewald is a lawyer and independent legal consultant specialising in criminal justice, human rights, and technology, and a co-founder of Fénix Foundation.

Basile Simon is the director of the law program at the Starling Lab for Data Integrity, and a fellow at Stanford University.

Emma Irving is an independent legal consultant specialising in standards for digital evidence, and a co-founder of Fénix Foundation.

Kate Keator supports nonprofits with data strategy, operations, and emerging technologies at the intersection of peace and social impact. She is also a co-founder of Fénix Foundation.]

On July 11, 2025, as legal experts and technologists gathered at Leiden University’s campus in The Hague, the mood was one of immediacy and pragmatism. With the rapid democratisation of generative AI, criminal accountability is facing an epistemic crisis. The question facing the room: when it comes to digital evidence, what changes when generative AI becomes widespread, cheap, and accessible?

This question, interrogated by scholars (e.g., here and here), was the focal point of a multi-disciplinary international criminal justice roundtable at Georgetown University in September 2024. Following on from that discussion, Fénix Foundation and the Starling Lab are now seeking to bridge the growing gap between ‘traditional’ thinking on digital evidence and the challenges posed by AI. To that end, the two organisations convened experts from diverse legal traditions to consolidate guidance on AI and evidence for immediate use by investigators, legal practitioners, and civil society.

Drawing on our collective knowledge and expertise on digital evidence standards and AI, we have identified eight pillars to guide fact-finders when handling AI-affected evidence: four of these are built upon applicable (digital) evidence principles adapted for the AI age, and four are specific to AI challenges (such that we can term them ‘AI-native’). The objective of this post is to 1) define the term ‘AI-affected’ evidence, 2) highlight gaps that are emerging in current standards for digital investigations, and 3) provide an overview of the evidentiary pillars that can guide the treatment of AI-affected evidence. 

‘AI-Affected Evidence’: A Forward-Looking Term

Evidence can be affected by AI in a number of complex ways – a diversity that current legal terminology does not yet fully capture. These influences fall into three main categories:

  • ‘AI-generated’: Evidence that is entirely synthetic, such as deepfakes.
  • ‘AI-modified’: Non-synthetic material that has been altered using AI, for example producing a clear image from a grainy original (e.g., super-resolution satellite imagery).
  • ‘AI-surfaced’: Evidence that has been identified or collected through the use of AI tools. This can include filtering large collections via object recognition algorithms, or using AI-assisted data scraping for the collection of online material.

While AI-generated, -modified, and -surfaced evidence raise different questions and require distinct solutions, they share a core set of evidentiary challenges. As such, using the single term ‘AI-affected evidence’ helps to more comprehensively capture the various dimensions of the relationship between AI and evidence. 

Gaps are Emerging in Existing Standards

Over the past decade, different initiatives have emerged aimed at establishing standards for digital investigations and evidence. A prominent initiative is the Berkeley Protocol on Digital Open Source Investigations, which has served as a pioneering resource for digital investigations since its first publication in 2020. 

The Berkeley Protocol’s methodological core remains highly relevant for AI-affected evidence, in particular the importance placed on corroboration, on adopting a skeptical mindset, and on establishing an ‘unbreakable’ chain of custody from the moment of collection. Even as the information environment grows more polluted, the Berkeley Protocol remains crucial guidance for digital investigations. Nevertheless, the rapid development of machine learning is exposing notable gaps in its approach.

The Berkeley Protocol’s three-part verification framework (source, technical, and content analysis) rests on the assumption that there is a human source whose credibility can be assessed and a verifiable link between digital media and a physical event. All three components can be undermined by powerful generative AI: the ‘source’ might be an algorithm; geolocation and chronolocation can be compromised by the ease of generating internally consistent but fictional scenes; and AI modifications can introduce artifacts or alter key details in ways that are difficult to detect. While the challenge of verifying digital content is not new, evidence affected by AI before collection requires even more context, such as the AI model used, input prompts, or training data (a ‘chain of generation’ – expanding upon the ‘chain of custody’).
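
To make the ‘chain of generation’ idea more concrete, the sketch below shows one form such a record could take alongside a collected item. It is a minimal illustration only: the field names are our own assumptions rather than part of any existing standard.

```python
from dataclasses import dataclass, asdict
from typing import Optional
import json

@dataclass
class GenerationRecord:
    """Illustrative 'chain of generation' entry for one AI-affected item.
    Field names are hypothetical and not drawn from any existing standard."""
    asset_sha256: str                           # hash of the file as collected
    ai_involvement: str                         # 'generated', 'modified', or 'surfaced'
    model_name: Optional[str] = None            # generative or enhancement model, if known
    model_version: Optional[str] = None
    prompt_or_parameters: Optional[str] = None  # inputs used, where recoverable
    operator: Optional[str] = None              # who ran the tool
    timestamp_utc: Optional[str] = None
    notes: str = ""                             # known limitations, unknown provenance, etc.

# Example: a satellite image enhanced with a super-resolution model before collection.
record = GenerationRecord(
    asset_sha256="<sha256 of the collected file>",
    ai_involvement="modified",
    model_name="<super-resolution model, if disclosed>",
    notes="Original low-resolution source preserved separately.",
)
print(json.dumps(asdict(record), indent=2))
```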

Similarly, on analysis: the Berkeley Protocol’s guidance on interpreting findings does not account for the extraordinarily compelling nature of hyper-realistic AI-generated media, which risks a prejudicial effect on fact-finders. New procedures (such as technical review of provenance before substantive content is watched) may be necessary. Furthermore, the use of AI as an investigative tool creates novel and heightened ethical quandaries. Investigators who use AI tools for translation, pattern recognition, or predictive analysis face concerns such as algorithmic bias and the transparency of AI-assisted conclusions. These AI-specific concerns are not covered in the Berkeley Protocol’s ethical and methodological framework, but its approach serves as a model to build upon.

To address the urgency created by developments in AI outpacing existing standard-setting initiatives, Fénix and Starling Lab have identified evidentiary pillars that fact-finders can use to fill lacunae in legal frameworks. 

Evidentiary Pillars for the AI Era

In line with Fénix’s interdisciplinary ethos, the language of ‘pillars’ is borrowed from the peacebuilding and peacekeeping realms (e.g., the Women, Peace, & Security (WPS) framework and its four pillars, or the positive peace framework and its eight pillars). The term reflects our understanding that, while it is still too early to speak of best practices in this time of uncertainty, fact-finders would benefit from foundational ‘scaffolding’ to help guide their decision-making. This section outlines four of the eight pillars, focusing on those built on broadly applicable (digital) evidence principles. Each pillar’s description follows our methodological approach to its development: 1) identify the evidentiary principle; 2) elucidate the principle in a digital evidence context; and 3) position the principle in the context of AI-affected challenges.

1. Auditability: The ‘Explainable-Enough’ Standard

Auditability refers to the creation of a chronological record (an ‘audit trail’) that documents the evidence handling process – how evidence was collected, processed, and stored, along with when, and by whom, evidence was accessed. 

The audit trail of digital evidence is often reflected in its metadata, which can document or validate the chain of custody. A detailed audit trail has been crucial in international criminal law (ICL) practice for establishing the probative value of digital evidence.

For AI-affected evidence, the ‘black box’ nature of deep learning models – where reasoning pathways are opaque and training data is proprietary, massive, or contains biases – poses a direct challenge to the principle of auditability. If an investigator cannot explain why an AI tool flagged a specific video, can it be relied upon?

Translating the auditability pillar to an AI context may require the adoption of an ‘explainable-enough’ standard. Recognising that total transparency is often technically impossible, an explainable-enough standard would focus on creating an audit trail of the tools used, with the aim of ensuring that the methodology is as reproducible as possible given technical constraints. The threshold for ‘explainable-enough’ could differ depending on the type of AI-affected evidence, and would adapt over time as efforts advance to make AI more transparent.
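
As a rough illustration of what an ‘explainable-enough’ audit trail might look like in practice, the sketch below logs each handling or tool-use step and chains the entries together with hashes, so that later alteration of the log itself becomes detectable. The structure and field names are assumptions made for illustration, not a prescribed format.

```python
import hashlib
import json
from datetime import datetime, timezone

def _entry_hash(payload: dict) -> str:
    """Deterministic SHA-256 over a JSON-serialised entry."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

def append_entry(trail: list, actor: str, action: str, tool: str = "", detail: str = "") -> None:
    """Append a tamper-evident entry; each entry commits to the hash of the previous one."""
    entry = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "actor": actor,    # who accessed or processed the item
        "action": action,  # e.g. 'collected', 'ai_tool_run', 'reviewed'
        "tool": tool,      # name/version of any AI tool used
        "detail": detail,  # settings, thresholds, known limitations
        "prev_hash": trail[-1]["entry_hash"] if trail else None,
    }
    entry["entry_hash"] = _entry_hash(entry)
    trail.append(entry)

trail: list = []
append_entry(trail, actor="analyst_01", action="collected",
             detail="Video scraped from public post")
append_entry(trail, actor="analyst_01", action="ai_tool_run",
             tool="<object-recognition model, name/version>",
             detail="Flagged at confidence 0.91; threshold and known error rates recorded")
print(json.dumps(trail, indent=2))
```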

2. Corroboration: The Ultimate Defense

While international criminal procedure does not formally require corroboration, it is critical for establishing the probative value and credibility of evidence. Digital information is especially easy to manipulate compared to non-digital information. Consequently, external corroboration through varied and diverse sources, both digital and non-digital, has evolved as an essential method of verification. 

In an information environment polluted by synthetic media, corroboration demands are further heightened. Approaches to verifying the source of potential evidence typically depend on evaluating indicators that were once difficult and time-consuming to fabricate convincingly –  e.g., for social media evidence, this would include account history, posting patterns, and internal consistency. Yet with cheap, powerful, and accessible generative AI, bad actors can now produce entire synthetic personas with coherent histories and internally consistent details at minimal cost, dramatically reducing the reliability of known authenticity markers. 

With that said, when positioning the corroboration pillar in an AI context, caution is needed to guard against over-correction. The ‘Liar’s Dividend’ (where perpetrators cast doubt on authentic evidence by claiming it is synthetic) and ‘Impostor Bias’ (an a priori distrust of the authenticity of all digital media) have the potential to undermine trust in genuine digital information, even when corroborated. In response, practitioners may impose unduly high corroboration requirements on themselves and others, potentially straining time and resources as well as risking probative evidence being left unused.

3. Provenance: Cryptography in the Courtroom

Provenance identifies who created or authored a piece of evidence and where it came from. Before international criminal courts and tribunals, judges prefer the creator or author of evidence to testify in court. However, in a digital investigations context, and particularly in an open-source investigation, the author may be uncertain or unknown.

On a more existential level, generative AI calls the very concept of authorship into question. While investigators have historically compensated for a lack of provenance for open-source evidence by relying on content-based verification, sophisticated deepfakes can now convincingly mimic genuine material, making these methods increasingly unreliable. The AI context may therefore require a pivot toward tools that are designed to identify and secure the provenance of AI-affected evidence. One such tool, the Coalition for Content Provenance and Authenticity (C2PA) content credentials standard, embeds cryptographically signed metadata in a digital asset, providing a verifiable record of its origin and of any subsequent alteration. Yet the extent to which such cryptographically verifiable evidence satisfies admissibility requirements remains to be adequately tested in courtrooms.
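
The cryptographic principle behind such content credentials can be illustrated with a much simpler sketch: hashing an asset at the point of capture and signing that hash, so that any later modification causes verification to fail. The example below (using the third-party ‘cryptography’ package) demonstrates the general mechanism of a signed digest only; it is not an implementation of the C2PA manifest format.

```python
# Minimal illustration of signed provenance; not the C2PA specification itself.
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# At capture: the capture device (or tool) hashes the asset and signs the hash.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

asset = b"...raw bytes of the image or video as captured..."
signature = private_key.sign(hashlib.sha256(asset).digest())

# Later: a fact-finder re-hashes the asset they received and verifies the signature.
received_asset = asset  # unchanged in this example; any edit would break verification
try:
    public_key.verify(signature, hashlib.sha256(received_asset).digest())
    print("Signature valid: the asset matches what was signed at capture.")
except InvalidSignature:
    print("Signature check failed: the asset was altered after signing.")
```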

4. Prejudice: Managing the ‘Reverse CSI Effect’

Under Rome Statute Art 69(4), judges must take into account the prejudicial effect of evidence on a fair trial. Thus, even where evidence is relevant and probative, it may still be excluded where its admission would cause undue prejudice. 

Visual evidence is psychologically potent; highly realistic AI-generated or AI-modified imagery even more so. Positioning the prejudice pillar in the digital and AI context requires the fact-finder to understand the extraordinarily compelling nature of visual evidence and its potential to create severe prejudicial effects. Indeed, research on the ‘Continued Influence Effect’ suggests that even if a fact-finder is aware that evidence is AI-generated, the impression may still substantially impact their perception of the facts. Undertaking a technical review of provenance before the substantive content is viewed may help to minimise these psychological biases.

Conversely, practitioners should be wary of the ‘Reverse CSI Effect’, where a fact-finder may give undue deference to AI-affected evidence simply because of its technical sophistication, despite limitations such as unauditable algorithms.

Next Steps

The four pillars outlined above are built upon existing evidentiary standards, applied to the digital and AI-affected evidence contexts. We have additionally identified four novel, ‘AI-native’ pillars: explainability, literacy, collaboration, and independence. We will be elaborating on these in the coming months.

Crossing silos is more important than ever in the current technological climate. The four evidentiary pillars discussed in this post were well received during a recent expert group meeting in Jakarta on countering terrorist exploitation of AI in Southeast Asia (organised by IIJ). This meeting made clear, once again, the need for guidance at both international and domestic levels, and across different fields of criminal law, on how to approach AI-affected evidence. In this spirit, we hope this post will act as an open invitation for input and feedback from legal scholars and practitioners, as well as non-legal experts. We invite you to contact us at info[at]fenix.foundation.
