AI and Legal Informatics

Background
I came to law school after spending several years doing data analysis and modeling in the physical sciences and finance. In contrast to those fields, legal practice seemed old-fashioned. How could I automate a process? How could I make sense of everything I was reading? Are we still in the nineteenth century or what? For all of its zeal to tackle new problems, law remained in many ways stuck in the past when it came to actually doing the work. It was frustrating, and I even wrote a Note about it as a kind of protest.
We’re now in a different era, and the current crop of transformer-based language models promises to bring within reach many ideas that were enticingly out of reach a decade or two ago. How can we take advantage of the new AI tools?
Caveat lector: this list is clearly a bit scattered and idiosyncratic. Among other things, it doesn’t include any individual papers. May it improve over time!
AI, general background
- Bishop, C. and Bishop, H. (2024). Deep Learning: Foundations and Concepts. Springer. Authors’ site.
- Jurafsky, D. and Martin, J. (2025). Speech and Language Processing (3rd ed. draft Aug. 24, 2025). Full draft here.
- Russell, S. and Norvig, P. (2021). Artificial Intelligence: A Modern Approach (4th ed.). Pearson. Authors’ site.
Periodicals, conferences, and collections
Organizations and projects
Books on reasoning
- Besnard, P. and Hunter, A. (2008). Elements of Argumentation. MIT Press. Full text at ResearchGate.
- Bradley, A. and Manna, Z. (2007). The Calculus of Computation. Springer. Manna lecture notes.
- Halpern, J. (2017). Reasoning about Uncertainty (2nd ed.). MIT Press. Full text at MIT Press Direct.
- Kroening, D. and Strichman, O. (2017). Decision Procedures (2nd ed.). Springer. Authors’ site.
- Kowalski, R. (2011). Computational Logic and Human Thinking. Cambridge Univ. Press. Publisher site.
- Mueller, E. (2014). Commonsense Reasoning: An Event Calculus Based Approach (2nd ed.). Elsevier. Publisher site.
Books on computational law, generally
- Ashley, K. (2017). Artificial Intelligence and Legal Analytics. Cambridge Univ. Press. Publisher site.
- Katz, D., Dolin, R. and Bommarito, M., eds. (2021). Legal Informatics. Cambridge Univ. Press. Publisher site.
- Lodder, A. and Oskamp, A. (2006). Information Technology & Lawyers. Springer.
- Stranieri, A. and Zeleznikow, J. (2005). Knowledge Discovery from Legal Databases. Springer.
Books on ethics & society
- Bullock, J. et al., eds. (2022). The Oxford Handbook of AI Governance. Oxford Univ. Press. Publisher site. The print version is expensive, but it should be available in electronic form through institutional libraries.
- Crawford, K. (2021). Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale Univ. Press. Author’s site.
- Dubber, M., Pasquale, F. and Das, S., eds. (2020). Oxford Handbook of Ethics of AI. Oxford Univ. Press. Publisher site.
- Hernández-Cano, A. et al. (2025). Apertus v1 Technical Report. Project site. Report on a family of language models trained by a team at EPFL and ETH Zurich with transparency and other ethical considerations top of mind.
- Lessig, L. (2006). Code v2.0. Basic Books. Author’s site with full text. Code may be law, but isn’t the converse also true?
- Schneier, B. (2023). A Hacker’s Mind. Norton. Author’s site. Especially the chapters on law and government.
- Schneier, B. and Sanders, N. (2025). Rewiring Democracy: How AI Will Transform Our Politics, Government, and Citizenship. MIT Press. Authors’ site.
- Vallor, S. (2024). The AI Mirror. Oxford Univ. Press. Author’s site.
Useful tools for hacking LLMs
- Axolotl: fine-tune open-weight models.
- DeepEval: library for LLM evaluation.
- dstack: orchestration for GPU workloads.
- Hugging Face Transformers: Swiss Army knife for working with transformer models.
- Instructor: provider-agnostic library for structured LLM outputs.
- LLM: Simon Willison’s CLI and library for interacting with LLMs.
- Marimo: reactive Python notebooks with bells and whistles.
- nupunkt: sentence boundary detection library optimized for legal texts.
- NVIDIA DGX Spark: examples and docs for GB10 developer workstations.
- Ollama: local LLM inference (e.g., on a laptop).
- Pydantic AI: Python ‘agent’ framework (validation, model abstraction, evals, workflows).
- PyTorch: tensor library for deep learning.
- Textual: TUI application framework (‘move at terminal velocity’).
- Typer: Python CLI application library.
- uv: a modern Python package manager.
- vLLM: LLM inference (e.g., on NVIDIA data center GPUs).
Data
- Free Law Project: case law database (RECAP, CourtListener, etc.).
- HLS Caselaw Access Project: scanned law books.
- Hugging Face: open models and data sets.
- KL3M: free-range LLMs, with training data (lots of legal sources) on Hugging Face.
- Pile of Law: dataset on Hugging Face.