Resources
Benchmarks and Evaluations
Addleshaw Goddard: RAG report: Can Large Language Models be good enough for legal due diligence?
AdSoLve: Addressing Socio-technical Limitations of LLMs for Medical and Social Computing
AI Future Forum: AI Adoption in UK Law Firms: Benchmarking Its Adoption and Anticipating the Future
Ashurst: Vox PopulAI: Lessons from a global law firm's exploration of generative AI
CodeX - Stanford Law School: A Supervisory AI Agent Approach to Responsible Use of GenAI in the Legal Profession
Harvey: BigLaw Bench
Legal Benchmarks: AI Benchmarks for the Legal Industry
Linklaters: The LinksAI English law benchmark (v2); coverage: Linklaters Put AI Through Law Exams (Above the Law)
ODI (Open Data Institute): Building a better future with data and AI white paper
OpenCompass LawBench (China): An evaluation benchmark assessing the comprehensive performance of large language models (LLMs) in highly specialized legal domains
Oxethica: AI Audit Tool
Ryan McDonald: Building Your Own Legal Benchmarks for LLMs and Vendor AI Tools
Stanford LegalBench: A collaboratively built large language model benchmark for legal reasoning
Stanford study: Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools
UK Government Guidance: Introduction to AI assurance
Vals: Legal AI Report
Due Diligence of AI Tools
Regulation
UK Government Whitepaper: A pro-innovation approach to AI regulation