Surface · Legal Intelligence

Citations: extracted, resolved, and verified — end to end.

A brief or memo cites cases, statutes, regulations, and Federal Register notices. The platform pulls those citations out of the text, follows each one to its canonical source (eCFR, Federal Register, GovInfo, CourtListener), and decides whether the claim being made about that source is supported, contradicted, or unsupported. Government records and entity diligence ship in the same group.

pip install kaos-citations kaos-source

Extract, resolve, verify

Every legal-AI vendor talks about citation checking; kaos-citations ships the actual pipeline as open source, with a fixture-backed evaluation. The extractor parses eight kinds of citation out of free text: CFR, statute, case law, Federal Register, Constitution, DOI, PubMed, and arXiv (opt-in). The resolver layer turns each typed citation into a canonical URL and, for the four heavyweight resolvers (eCFR, Federal Register, GovInfo, CourtListener), fetches the actual source body. The verifier then returns a verdict (supported, contradicted, or unsupported) with the source span behind it. It is a verification aid, not a sign-off: every verdict points back to text a lawyer can read and confirm.
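To make the resolve step concrete, here is a minimal sketch of turning a typed citation into a canonical URL. The class names, the function name, and both URL patterns are assumptions chosen for illustration; they are not the kaos-citations API, and the real resolvers may build their URLs differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CfrCite:
    """Hypothetical typed CFR citation (not the kaos-citations class)."""
    title: int
    section: str  # e.g. "240.10b-5"


@dataclass(frozen=True)
class FrCite:
    """Hypothetical typed Federal Register citation."""
    volume: int
    page: int


def canonical_url(cite) -> str:
    """Map a typed citation to a URL. Both URL patterns are assumptions."""
    if isinstance(cite, CfrCite):
        return f"https://www.ecfr.gov/current/title-{cite.title}/section-{cite.section}"
    if isinstance(cite, FrCite):
        return f"https://www.federalregister.gov/citation/{cite.volume}-FR-{cite.page}"
    raise TypeError(f"no resolver for {type(cite).__name__}")


print(canonical_url(CfrCite(17, "240.10b-5")))
print(canonical_url(FrCite(88, 26100)))
```

A heavyweight resolver would go one step further and fetch the body behind that URL; that network call is left out here so the sketch runs offline.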

Verification has two strategies. The first is a deterministic substring match that uses no LLM at all and answers in microseconds. The second is a natural-language-inference judge that calls a small Claude model when the exact words aren't in the source. The default is a cascade: try substring first, escalate to NLI only when needed. Free verification when it works, model escalation when it doesn't.
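The cascade can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: `Verdict`, `verify`, and the `nli_judge` callable are hypothetical names, the substring stage here is a plain case-insensitive search, and the judge is a stub standing in for whatever model call the real pipeline makes.

```python
from typing import Callable, NamedTuple, Optional, Tuple


class Verdict(NamedTuple):
    label: str                       # "supported", "contradicted", or "unsupported"
    strategy: str                    # stage that decided: "substring" or "nli"
    span: Optional[Tuple[int, int]]  # offsets into the source, when a match exists


def verify(
    quote: str,
    source: str,
    nli_judge: Optional[Callable[[str, str], str]] = None,
) -> Verdict:
    """Try the free deterministic check first; escalate only on a miss."""
    start = source.lower().find(quote.lower())
    if start != -1:
        return Verdict("supported", "substring", (start, start + len(quote)))
    if nli_judge is None:
        # No model configured: the deterministic stage has the last word.
        return Verdict("unsupported", "substring", None)
    # Hypothetical judge: takes (premise, hypothesis), returns one of the labels.
    return Verdict(nli_judge(source, quote), "nli", None)


SOURCE = "It shall be unlawful to employ any device, scheme, or artifice to defraud."

print(verify("any device, scheme, or artifice to defraud", SOURCE))
print(verify("The rule bans fraudulent schemes.", SOURCE, nli_judge=lambda p, h: "supported"))
```

The design point is that the judge is only ever invoked on the second path, so a quote that appears verbatim never costs a model call.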

Citations carry the character offsets where they were found and an optional source URL, so they round-trip onto the typed document model as annotations. When extraction came from a PDF, provenance points back to the exact page and bounding box in the file.
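As a rough picture of what such a round-trip looks like, the sketch below stores a citation as a half-open character range plus optional provenance. The class and its field names are hypothetical, chosen only to mirror the description above; the actual annotation model may be shaped differently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CitationAnnotation:
    """Hypothetical annotation shape; field names are assumptions."""
    start: int                        # inclusive character offset
    end: int                          # exclusive character offset
    normalized: str
    source_url: Optional[str] = None
    page: Optional[int] = None        # set only when the text came from a PDF
    bbox: Optional[Tuple[float, float, float, float]] = None

    def text_in(self, document: str) -> str:
        """Recover the cited text from the document it annotates."""
        return document[self.start:self.end]


document = "The court relied on 17 CFR 240.10b-5, the SEC's anti-fraud rule."
raw = "17 CFR 240.10b-5"
start = document.index(raw)
ann = CitationAnnotation(start, start + len(raw), normalized=raw)

# Round-trip: slicing the document by the stored offsets yields the citation.
assert ann.text_in(document) == raw
print(ann)
```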

The pipeline

Citation-checking is one of the eight industry-standard legal-AI tasks (the VLAIR taxonomy). KAOS ships the open-source extraction-to-verdict pipeline; the agent layer wraps it into a one-line workflow.

Extract: eyecite + regex, 8 citation kinds
Resolve: 8 resolvers, URL + source body
Verify: substring + NLI, auto cascade
Verdict: supported / not, with reasoning

Citations the extractor recognizes

eyecite (Free Law Project, BSD-2) handles case and statute parsing; pure regex covers the rest so the offline path stays dependency-free.

CFRCitation: 17 CFR 240.10b-5
CaseCitation: eyecite
StatuteCitation: eyecite
FederalRegisterCitation: 88 FR 26100
ConstitutionCitation: U.S. Const. art. I
DOICitation: doi.org/...
PubMedCitation: PMID
ArXivCitation: opt-in
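For the regex-covered kinds, a dependency-free extractor might look roughly like the sketch below. The patterns are simplified assumptions written for this example, not the ones kaos-citations ships, and they will miss many real-world variants; `Found`, `PATTERNS`, and `extract` are hypothetical names.

```python
import re
from typing import List, NamedTuple


class Found(NamedTuple):
    kind: str
    normalized: str
    start: int
    end: int


# Simplified illustrative patterns: (compiled regex, normalization template).
PATTERNS = {
    "cfr": (
        re.compile(r"\b(\d{1,2})\s+C\.?F\.?R\.?\s+(?:§\s*)?(\d+\.\d+[A-Za-z]?(?:-\d+)?)"),
        "{0} CFR {1}",
    ),
    "fr": (
        re.compile(r"\b(\d{1,3})\s+(?:FR|Fed\.\s+Reg\.)\s+(\d+)\b"),
        "{0} FR {1}",
    ),
    "pubmed": (
        re.compile(r"\bPMID:?\s*(\d+)\b"),
        "PMID {0}",
    ),
}


def extract(text: str) -> List[Found]:
    """Run every pattern over the text and return hits in document order."""
    hits = []
    for kind, (pattern, template) in PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append(Found(kind, template.format(*m.groups()), m.start(), m.end()))
    return sorted(hits, key=lambda f: f.start)


for hit in extract("See 17 CFR 240.10b-5 and 88 FR 26100; cf. PMID: 12345678."):
    print(hit)
```

Only the standard library is used, which is the property the offline path depends on.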

A taste

from kaos_citations import extract_citations
from kaos_nlp_transformers import NliModel
# Parse the citation a model emitted, then check the claim it made about it.
text = "The court relied on 17 CFR 240.10b-5, the SEC's anti-fraud rule."
cite = extract_citations(text, kinds=["cfr"])[0]  # -> 17 CFR 240.10b-5
rule = ("Rule 10b-5 makes it unlawful to employ any device, scheme, or "
        "artifice to defraud in connection with the purchase or sale of a security.")
claim = "17 CFR 240.10b-5 prohibits employment discrimination."
# On-device NLI — no API key. Does the rule text actually support the claim?
score = NliModel.load().score(rule, [claim])[0]
verdict = "supported" if score.entailment > 0.5 else "UNSUPPORTED"
print(f"{cite.normalized}: claim {verdict}")
print(f" entailment {score.entailment:.2f} contradiction {score.contradiction:.2f}")
# 17 CFR 240.10b-5: claim UNSUPPORTED
# entailment 0.00 contradiction 0.98
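The snippet above collapses the result to two labels. A three-way mapping, in line with the supported / contradicted / unsupported verdicts described earlier, could be sketched as follows. The function name and the 0.5 threshold are assumptions for illustration; nothing here is taken from the library.

```python
def nli_verdict(entailment: float, contradiction: float, threshold: float = 0.5) -> str:
    """Map two NLI scores to one of three verdict labels (illustrative thresholds)."""
    if entailment >= threshold and entailment > contradiction:
        return "supported"
    if contradiction >= threshold:
        return "contradicted"
    # Neither signal is strong enough: the source is silent on the claim.
    return "unsupported"


# Scores taken from the example output above.
print(nli_verdict(entailment=0.00, contradiction=0.98))
```

Keeping "contradicted" separate from "unsupported" matters to a reviewer: one means the source says the opposite, the other means the source does not address the claim.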

Packages in this group

kaos-citations handles extraction, resolution, and verification. kaos-source brings the public-record corpus: Federal Register, eCFR, EDGAR (filing search, company lookup, ticker lookup), GovInfo, GLEIF (~2.5 million legal entities by LEI), plus email and EXIF forensics tools for diligence work.

How it compares

vs. eyecite alone. eyecite is the BSD-2 case + statute parser kaos-citations builds on; it's the right tool when you only need extraction. kaos-citations adds resolution (8 resolvers, including 4 heavyweight that fetch the actual source body from eCFR / Federal Register / GovInfo / CourtListener) and verification (a substring-then-NLI cascade that runs the deterministic check first and escalates to the model only when needed).

vs. Lexis Shepard's Citation Agent, Westlaw Litigation Document Analyzer, Harvey citation checker. Every Tier 1 vendor ships citation checking as a flagship feature. The proprietary platforms have richer historical-treatment data; KAOS ships an open-source pipeline against public-record sources, an explicit substring-then-NLI cascade, and 45 fixture-backed citation goldens. Citation checking is one of the eight industry-standard legal-AI tasks: table stakes for the market, and here it has an open-source answer.

vs. Everlaw Deep Dive refusal. Everlaw's "explicitly refuses when uncertain" UX is the gold standard for trust-first Q&A. kaos-citations exposes the same behavior at the schema level: verify() returns supported, contradicted, or unsupported, and never silently fabricates. Build the refusal UX on top of that contract, not the other way around.
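One way to build a refusal UX on that contract is to gate every outgoing claim on its verdict. The sketch below assumes only the three verdict strings named above; the `render` function and its messages are hypothetical and not part of kaos-citations.

```python
def render(claim: str, verdict: str) -> str:
    """Show a claim only when its verdict is 'supported'; otherwise refuse explicitly."""
    if verdict == "supported":
        return claim
    if verdict == "contradicted":
        return f"Withheld: the cited source contradicts this claim ({claim!r})."
    if verdict == "unsupported":
        return f"Withheld: the cited source does not support this claim ({claim!r})."
    # An unknown label is a contract violation, not something to paper over.
    raise ValueError(f"unexpected verdict: {verdict!r}")


print(render("17 CFR 240.10b-5 prohibits securities fraud.", "supported"))
print(render("17 CFR 240.10b-5 prohibits employment discrimination.", "contradicted"))
```

Raising on an unrecognized label keeps the UI from silently treating a new or misspelled verdict as a pass.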

See /compare.

Get started

See the quickstart, browse all 18 packages, or read the docs.