The New Evidentiary Frontier: Supreme Court's 'Twin-Test' for AI in Criminal Trials
- Chintan Shah

- Jul 15
In a seminal judgment with far-reaching consequences for the criminal justice system, the Supreme Court of India in State of Maharashtra v. Arjun Sharma has established a stringent 'twin-test' for the admissibility of evidence generated by Artificial Intelligence (AI). The ruling arrives at a critical juncture where law enforcement agencies are increasingly reliant on advanced technology, yet the legal framework for evidence, primarily the Indian Evidence Act, 1872, has struggled to keep pace. Provisions like Section 65B of the Act, designed for conventional computer outputs, have proven inadequate for the complexities of generative AI, creating a significant legal vacuum that this judgment seeks to fill.[1]
The 'Twin-Test' Explained
The framework laid down by the bench, comprising Chief Justice D.Y. Chandrachud and Justices B.V. Nagarathna and J.B. Pardiwala, mandates that AI-generated evidence must satisfy two distinct but interconnected prongs: authenticity and reliability.
The Authenticity Prong requires the proponent of the evidence to establish the integrity of the process from data input to final output. This goes significantly beyond the procedural requirement of a certificate under Section 65B. The prosecution must now demonstrate an unbroken chain of custody for both the data fed into the AI system and the system itself. This includes proving that the system was secure and not subject to tampering, hacking, or manual override while the evidence was generated. It shifts the focus from a mere declaration of proper operation to positive proof of the integrity of the entire process.
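In digital forensics practice, this kind of process integrity is commonly documented with cryptographic hashes. The sketch below is purely illustrative, not anything prescribed by the judgment: every custody event over the source data and the AI's output is hashed and chained, so any later alteration of the record or the artefacts is detectable. All names here (`CustodyLedger`, the event labels, the handlers) are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


class CustodyLedger:
    """Append-only ledger: each entry folds in the hash of the previous
    entry, so altering any earlier record breaks the whole chain."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def record(self, event: str, artefact: bytes, handler: str) -> None:
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else "0" * 64
        entry = {
            "event": event,                          # e.g. "seizure", "AI output generated"
            "artefact_hash": sha256_hex(artefact),   # fingerprint of the data itself
            "handler": handler,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "prev_hash": prev_hash,
        }
        entry["entry_hash"] = sha256_hex(json.dumps(entry, sort_keys=True).encode())
        self.entries.append(entry)

    def verify(self) -> bool:
        """Recompute every link; False means the record was altered."""
        prev = "0" * 64
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "entry_hash"}
            expected = sha256_hex(json.dumps(body, sort_keys=True).encode())
            if e["prev_hash"] != prev or e["entry_hash"] != expected:
                return False
            prev = e["entry_hash"]
        return True


# Hypothetical usage: log the source data at seizure and the AI output at generation.
ledger = CustodyLedger()
ledger.record("seizure of source data", b"<raw CCTV footage bytes>", "Investigating Officer")
ledger.record("AI output generated", b"<enhanced frame bytes>", "FSL analyst")
assert ledger.verify()  # any edit to an entry or artefact now fails verification
```

The design choice mirrors what the prong demands: each link depends on every link before it, so the proponent can offer positive proof of integrity rather than a bare declaration that the system worked.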
The Reliability Prong is the more novel and challenging component of the test. It compels the prosecution to open the "black box" of the AI system for judicial scrutiny. To satisfy this prong, the party introducing the evidence must demonstrate the scientific and technical soundness of the underlying AI model. This involves providing clear evidence of the AI's architecture, the nature and quality of its training data, and the methods used to mitigate bias.[1] The Court has effectively mandated a level of transparency that requires disclosure of the AI's error rates and the validation processes it has undergone. This is a direct response to the inherent opacity of many advanced AI systems and the risk of "hallucinations" or biased outputs that could lead to grave miscarriages of justice.[3]
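What might "opening the black box" look like as a concrete disclosure artefact? One hypothetical way to organise it (the field names below are the author's assumptions, not the Court's language) is a structured record that a forensic auditor or the defence could check for material gaps:

```python
from dataclasses import dataclass, field


@dataclass
class ModelDisclosure:
    """Hypothetical record of the items the reliability prong calls for:
    architecture, training data, validation, error rates, bias checks."""
    model_name: str
    architecture: str                  # e.g. "deep CNN embedding + cosine similarity"
    training_data_description: str     # provenance and composition of the training set
    validation_method: str             # how the error rates below were measured
    false_positive_rate: float
    false_negative_rate: float
    bias_audits: list[str] = field(default_factory=list)  # mitigation steps taken

    def material_gaps(self) -> list[str]:
        """Flag undisclosed items that would invite an admissibility challenge."""
        gaps = []
        if not self.training_data_description:
            gaps.append("training data undisclosed")
        if not self.bias_audits:
            gaps.append("no bias audit on record")
        return gaps


disclosure = ModelDisclosure(
    model_name="FaceMatch-X (hypothetical)",
    architecture="deep CNN embedding + cosine similarity",
    training_data_description="",      # vendor declined to disclose
    validation_method="held-out benchmark of 50,000 labelled pairs",
    false_positive_rate=0.01,
    false_negative_rate=0.04,
)
print(disclosure.material_gaps())  # ['training data undisclosed', 'no bias audit on record']
```

Under the judgment's logic, gaps of this kind are no longer a vendor's commercial prerogative; they are grounds for exclusion.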
Balancing Technology and the Rights of the Accused
The judgment meticulously balances the investigative potential of AI with the sacrosanct right to a fair trial guaranteed under Article 21 of the Constitution. While acknowledging that AI can be a powerful tool in solving complex crimes, the Court has foregrounded the rights of the accused. The high evidentiary bar set by the 'twin-test' serves as a crucial procedural safeguard against the introduction of unreliable, manipulated, or fabricated digital evidence. It directly confronts the emerging threat of "deepfakes" and other forms of AI-generated fabrications, which have been a growing concern in global legal circles.[3] By requiring comprehensive proof of the underlying algorithm and its training data, the Court empowers the defense to mount a meaningful challenge to the evidence, moving beyond a superficial cross-examination to a substantive critique of the technology itself.
Practical Challenges and Comparative Analysis
The implementation of this judgment poses formidable challenges for the Indian criminal justice ecosystem. Law enforcement agencies must urgently develop new standard operating procedures for the seizure, preservation, and documentation of AI systems and their data logs, a task requiring significant technical upskilling. India's Forensic Science Laboratories (FSLs) are largely unprepared for this new reality; they will require massive investment in sophisticated software, hardware, and trained personnel capable of conducting forensic audits of complex AI algorithms.[2] Furthermore, the ruling creates a pressing demand for a new class of expert witnesses: individuals who can not only understand but also lucidly explain the intricacies of neural networks and machine learning models to non-technical judges.
The Supreme Court's approach is notably more proactive and stringent than that of other common law jurisdictions, as the comparison below shows.
| Parameter | India (Post-Arjun Sharma) | United Kingdom | United States |
| --- | --- | --- | --- |
| Primary Legal Basis | Judicially created 'twin-test' under the framework of the Indian Evidence Act, 1872, and Article 21. | Common law presumption that a computer was working properly at the material time. | Existing Federal Rules of Evidence (FRE), primarily FRE 901 (Authentication) and FRE 403 (Prejudice), supplemented by the Frye or Daubert standards for scientific evidence. |
| Burden of Proof | Lies squarely and heavily on the proponent of the evidence (prosecution) to prove both authenticity and reliability from the outset. | Presumption of reliability; the burden shifts to the opposing party to adduce "evidence to the contrary" that the computer was not working properly.[5] | The proponent must produce sufficient evidence to support a finding that the item is what it is claimed to be. The standard can be higher for novel scientific methods.[2] |
| Key Test(s) | 'Twin-Test' of Authenticity (process integrity) and Reliability (algorithmic soundness, data quality, bias check). | Rebuttable presumption of reliability. | Varies by jurisdiction; generally involves authentication under FRE 901 and a reliability assessment under Daubert/Frye standards for expert/scientific evidence.[2] |
| Approach to "Black Box" Algorithms | Mandates transparency. The 'reliability' prong requires the "black box" to be opened for scrutiny of its architecture, training data, and error rates. | The presumption of reliability is increasingly seen as inadequate for complex, opaque AI systems, especially in the wake of the Horizon IT scandal.[5] | Courts are grappling with this. Admissibility may require disclosure of underlying data and system operations to allow the opposing party a fair chance to challenge it.[2] |
| Key Challenges | Building technical capacity in law enforcement and FSLs; scarcity of qualified expert witnesses; risk of proprietary AI systems being inadmissible. | Overcoming the outdated legal presumption, which is ill-suited to modern probabilistic AI systems; risk of admitting flawed evidence. | Lack of a uniform, AI-specific standard; inconsistent application across different courts; challenges in applying traditional evidence rules to generative AI.[4] |
The Supreme Court's decision can be seen not merely as an evidentiary rule but as a form of de facto regulation for the use of AI in the public sector. By conditioning admissibility on the transparency of algorithms and training data, the Court is compelling state agencies to procure and develop "explainable AI" (XAI). Vendors offering proprietary "black box" solutions will find their products at a significant disadvantage, as evidence generated by them risks being ruled inadmissible. This judicial act of setting a high evidentiary standard will thus have a powerful ripple effect, shaping the government procurement market and pushing the development of AI technology towards greater accountability and transparency, a regulatory role typically reserved for the legislature.
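To make the "explainable AI" point concrete: an explainable system can decompose its output into per-input contributions that can be disclosed and cross-examined, rather than returning a bare score. A deliberately toy sketch follows; the features and weights are invented for illustration and bear no relation to any real forensic tool.

```python
# Toy "explainable" scorer: a linear model whose per-feature contributions
# can be disclosed alongside the verdict, unlike an opaque end-to-end black box.
WEIGHTS = {"gait_match": 0.5, "face_similarity": 0.3, "voice_similarity": 0.2}


def explain_score(features: dict[str, float]) -> tuple[float, dict[str, float]]:
    """Return the overall score and each feature's contribution to it."""
    contributions = {name: WEIGHTS[name] * value for name, value in features.items()}
    return sum(contributions.values()), contributions


score, reasons = explain_score(
    {"gait_match": 0.9, "face_similarity": 0.7, "voice_similarity": 0.4}
)
print(f"overall score: {score:.2f}")            # overall score: 0.74
for feature, contribution in reasons.items():
    print(f"  {feature}: {contribution:+.2f}")  # each input's share of the score
```

A defence expert handed this kind of decomposition can contest individual inputs and weights; handed only "score = 0.74", they cannot, which is precisely the asymmetry the 'twin-test' is designed to eliminate.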


