Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision

Use the Case-Grounded Evidence Verification framework to make a model's decisions depend on the evidence it is given, rather than merely attaching that evidence to a decision it would have made anyway. The aim is better factual accuracy and trustworthiness: when the evidence changes, the model's decision should change with it.

Intermediate | 1 hour | 5 steps

The play

Define Grounding Criteria
Establish explicit rules for what counts as 'grounded' evidence in your domain. Specify when a piece of evidence directly supports a claim, when it contradicts the claim, and when it is irrelevant or insufficient to decide either way.
Construct Verification Datasets
Create or augment datasets with claims, corresponding evidence snippets, and clear labels indicating the evidence-claim relationship (e.g., 'supports', 'contradicts', 'insufficient'). Annotate specific spans of evidence that justify the label.
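As a rough illustration of what one annotated record could look like, the sketch below stores a claim, its evidence, a label, and the character spans that justify the label. The class name `VerificationExample`, the helper `find_span`, and the choice of character offsets are assumptions made for this example, not a schema defined by the framework.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Label set taken from the step description above.
LABELS = {"supports", "contradicts", "insufficient"}


@dataclass
class VerificationExample:
    """Hypothetical record layout: one claim, one evidence snippet, one label."""
    claim: str
    evidence: str
    label: str
    # Character offsets (start, end) into `evidence` that justify the label.
    spans: List[Tuple[int, int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")
        for start, end in self.spans:
            if not (0 <= start < end <= len(self.evidence)):
                raise ValueError(f"span ({start}, {end}) is outside the evidence")

    def span_texts(self) -> List[str]:
        """Return the annotated evidence phrases as strings."""
        return [self.evidence[start:end] for start, end in self.spans]


def find_span(evidence: str, phrase: str) -> Tuple[int, int]:
    """Locate the first occurrence of `phrase` in `evidence` as (start, end)."""
    start = evidence.find(phrase)
    if start < 0:
        raise ValueError(f"phrase not found in evidence: {phrase!r}")
    return start, start + len(phrase)


evidence = "Paris is the capital and most populous city of France."
example = VerificationExample(
    claim="The capital of France is Berlin.",
    evidence=evidence,
    label="contradicts",
    spans=[find_span(evidence, "Paris is the capital")],
)
print(example.span_texts())
```

Validating labels and spans at construction time catches annotation slips early, before they reach training.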
Implement Evidence-Sensitive Training
Integrate training methodologies that penalize models for making claims not directly supported by the provided evidence. This could involve specific loss functions, negative sampling, or multi-task learning objectives focused on evidence verification.
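One of the options mentioned above, negative sampling, can be sketched as a data-construction step: pair each claim with evidence borrowed from a different example and label the pair 'insufficient', so that a model cannot score well while ignoring the evidence. The function name `add_mismatched_negatives` and the dictionary layout are assumptions for illustration, and the sketch assumes that borrowed evidence is genuinely unrelated to the claim, which should be checked on real data.

```python
from typing import Dict, List

Example = Dict[str, str]  # keys: "claim", "evidence", "label"


def add_mismatched_negatives(examples: List[Example], shift: int = 1) -> List[Example]:
    """Return the original examples plus one mismatched negative per claim.

    Each negative keeps the claim but takes its evidence from the example
    `shift` positions later (wrapping around). Pairs whose borrowed evidence
    is identical to the original evidence are skipped.
    """
    n = len(examples)
    if n < 2:
        return list(examples)

    negatives: List[Example] = []
    for i, ex in enumerate(examples):
        donor = examples[(i + shift) % n]
        if donor["evidence"] == ex["evidence"]:
            continue  # borrowing the same evidence would not be a negative
        negatives.append(
            {
                "claim": ex["claim"],
                "evidence": donor["evidence"],
                # Assumption: evidence from another example does not bear on this claim.
                "label": "insufficient",
            }
        )
    return list(examples) + negatives


train = [
    {"claim": "A", "evidence": "e1", "label": "supports"},
    {"claim": "B", "evidence": "e2", "label": "contradicts"},
]
augmented = add_mismatched_negatives(train)
print(len(augmented))
```

A deterministic shift is used here instead of random sampling only to keep the sketch reproducible; a real pipeline might sample donors at random and filter out topically related evidence.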
Evaluate Evidence Reliance
Develop and apply evaluation metrics that quantify the model's dependence on evidence. This often involves perturbing or removing key evidence phrases and measuring how consistently the model's prediction or justification changes.
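A simple way to quantify the perturbation test described above is a flip rate: the fraction of examples whose prediction changes once the key evidence phrase is removed. The sketch below is a minimal version under stated assumptions: `predict` is any callable mapping a claim and evidence to a label, `key_phrase` is a hypothetical field holding the annotated phrase, and `toy_predict` is a keyword-matching stand-in for a real model.

```python
from typing import Callable, Dict, List

Predictor = Callable[[str, str], str]  # (claim, evidence) -> label


def ablate(evidence: str, phrase: str, mask: str = "[REMOVED]") -> str:
    """Replace every occurrence of the key phrase with a mask token."""
    return evidence.replace(phrase, mask)


def flip_rate(predict: Predictor, examples: List[Dict[str, str]]) -> float:
    """Fraction of examples whose prediction changes after ablating the key phrase."""
    if not examples:
        raise ValueError("need at least one example")
    flips = 0
    for ex in examples:
        original = predict(ex["claim"], ex["evidence"])
        perturbed = predict(ex["claim"], ablate(ex["evidence"], ex["key_phrase"]))
        if original != perturbed:
            flips += 1
    return flips / len(examples)


def toy_predict(claim: str, evidence: str) -> str:
    """Stand-in model: 'supports' only if every claim word occurs in the evidence."""
    claim_words = set(claim.lower().rstrip(".").split())
    evidence_words = set(evidence.lower().rstrip(".").split())
    return "supports" if claim_words <= evidence_words else "insufficient"


examples = [
    {
        "claim": "Paris is the capital of France.",
        "evidence": "Paris is the capital and most populous city of France.",
        "key_phrase": "Paris is the capital",
    }
]

sensitive = flip_rate(toy_predict, examples)
blind = flip_rate(lambda claim, evidence: "supports", examples)
print(sensitive, blind)
```

A predictor that ignores the evidence scores a flip rate of zero on this test, which is the failure mode the evaluation is meant to expose. A high flip rate alone does not show the changes are correct, so it is worth pairing with label accuracy on the perturbed inputs.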
Iterate and Refine
Continuously refine your grounding criteria, annotation guidelines, training objectives, and evaluation methods based on model performance, identified grounding failures, and human feedback.

Starter code

{
  "claim": "The capital of France is Berlin.",
  "evidence": "Paris is the capital and most populous city of France.",
  "question": "Does the evidence provided directly support the claim? Explain your reasoning, citing specific phrases from the evidence if applicable. If not, state why."
}
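The record above can be turned into a model prompt with a few lines of Python. This is only a sketch: the `build_prompt` helper and the "Claim:" / "Evidence:" layout are assumptions, not a format prescribed by the source.

```python
import json

STARTER_RECORD = """
{
  "claim": "The capital of France is Berlin.",
  "evidence": "Paris is the capital and most populous city of France.",
  "question": "Does the evidence provided directly support the claim? Explain your reasoning, citing specific phrases from the evidence if applicable. If not, state why."
}
"""


def build_prompt(record: dict) -> str:
    """Format a claim/evidence/question record as a plain-text prompt."""
    return (
        f"Claim: {record['claim']}\n"
        f"Evidence: {record['evidence']}\n\n"
        f"{record['question']}"
    )


record = json.loads(STARTER_RECORD)
prompt = build_prompt(record)
print(prompt)
```

Note that the starter record is a deliberately mismatched pair: the evidence contradicts the claim, so a grounded model should answer that the claim is not supported and cite the phrase about Paris.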

Source

Paper: arxiv.org