hallucination-detection

Here are 465 public repositories matching this topic...

uptrain-ai / uptrain

UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.

machine-learning monitoring evaluation experimentation jailbreak-detection autoevaluation root-cause-analysis prompt-engineering llmops openai-evals llm-prompting llm-eval llm-test hallucination-detection

Updated Aug 18, 2024
Python

chrisryugj / korean-law-mcp

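National Law Information MCP v4.4 | 42 Korean Ministry of Government Legislation APIs → 9 MCP tools. Statutes, case law, local ordinances, and treaties + multi-step research (legal_research) + in-depth analysis (legal_analysis: citation verification, precedent validity status, law in force at the time of the act, impact graph)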

law legal typescript mcp mermaid claude precedent legal-ai legal-tech hallucination-detection citation-verification llm-hallucination model-context-protocol mcp-server korean-law law-diff legal-rag impact-graph citizen-guide

Updated Jun 11, 2026
TypeScript

cvs-health / uqlm

UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection.

uncertainty-quantification uncertainty-estimation ai-safety confidence-score hallucination confidence-estimation ai-evaluation llm llm-evaluation llm-safety hallucination-evaluation hallucination-detection hallucination-mitigation llm-hallucination

Updated Jun 8, 2026
Python

MigoXLab / dingo

Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool

Updated Jun 10, 2026
Python

KRLabsOrg / LettuceDetect

Lightweight hallucination detection framework for RAG applications

python nlp pytorch information-extraction bert token-classification hallucination-evaluation hallucination-detection

Updated Jun 7, 2026
Python

ifixai-ai / iFixAi

Catch your AI's mistakes and blind spots before your customers or regulators do. iFixAi runs 45 inspections, 32 graded core plus 13 extended for frontier risks like sabotage, sandbagging, and oversight evasion. It returns a letter grade in under 5 minutes. Industry and model agnostic.

Updated Jun 9, 2026
Python

juyterman1000 / entroly

Cut your Claude / OpenAI / Gemini bill 70–95% on AI coding. Local proxy that compresses context, keeps provider caches hot, and verifies LLM output ($0 hallucination guard). Drop-in for Cursor, Claude Code, Codex, Aider + 34 more and custom providers — 30s, no code changes

rust productivity open-source ai mcp cursor ai-agents claude rag llm chatgpt anthropic hallucination-detection context-compression mcp-server claude-code token-optimization llm-grounding ai-hallucination

Updated Jun 10, 2026
Python

NishilBalar / Awesome-LVLM-Hallucination

Up-to-date curated list of state-of-the-art research, papers, and resources on hallucination in large vision-language models (LVLMs)

mlm hallucination large-language-models llm mllm large-vision-language-models multimodal-large-language-models hallucination-evaluation hallucination-detection vision-language-models lvlm hallucination-mitigation hallucination-survey hallucination-research hallucination-benchmark multimodal-language-model

Updated Feb 8, 2026

IAAR-Shanghai / UHGEval

[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.

benchmark evaluation dataset openai hallucination huggingface huggingface-transformers ceval gpt-3 openai-api hallucinations gpt-4 large-language-models llm chatgpt qwen hallucination-evaluation hallucination-detection

Updated Jun 7, 2025
Python

voidism / Lookback-Lens

Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps"

text-generation factuality hallucinations large-language-models hallucination-detection

Updated Oct 13, 2025
Python

sinewaveai / agent-security-scanner-mcp

Security scanner MCP server for AI coding agents. Prompt injection firewall, package hallucination detection (4.3M+ packages), 1000+ vulnerability rules with AST & taint analysis, auto-fix.

Updated Jun 8, 2026
JavaScript

Alsace08 / Chain-of-Embedding

[ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation"

interpretability trustworthy-ai large-language-models mechanistic-interpretability self-evaluation hallucination-detection iclr-2025

Updated Dec 19, 2024
Python

nikolamilosevic86 / verifAI

The VerifAI initiative aims to build an open-source, easy-to-deploy generative question-answering engine that can reference its answers and verify them for correctness (using a posteriori model)

Updated Oct 5, 2025
Jupyter Notebook

fabio-rovai / brain-in-the-fish

Score any document. Prove every claim.

rust ai sparql rdf owl ontology multi-agent knowledge-graph fact-checking owl-ontology audit-trail document-evaluation llm-evaluation hallucination-detection mcp-server evidence-verification anti-hallucination tender-evaluation

Updated Apr 5, 2026
Rust

open-compass / ANAH

[ACL 2024] ANAH & [NeurIPS 2024] ANAH-v2 & [ICLR 2025] Mask-DPO

acl alignment gpt iclr neurips llms hallucination-detection hallucination-mitigation

Updated Apr 30, 2025
Python

OpenKG-ORG / EasyDetect

An Easy-to-use Hallucination Detection Framework for LLMs.

natural-language-processing knowledge-graph generation hallucinations aigc large-language-models multimodal-large-language-models genrative-ai easydetect hallucination-detection

Updated Apr 21, 2024
Python

Gianthard-cyh / ValiRef

Detect hallucinated citations in academic papers.

agent academic citations papers hallucination-detection

Updated Jun 10, 2026
Python

QWED-AI / qwed-verification

AISecOps (AI Security Operations) framework for deterministic verification of AI systems. QWED verifies LLM outputs using math, logic, and symbolic execution — creating an auditable trust boundary for agentic AI systems. Not generation. Verification.

Updated Jun 6, 2026
Python

Ruiyang-061X / VL-Uncertainty

🔎Official code for our paper: "VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation".

uncertainty uncertainty-quantification multi-modal uncertainty-estimation uncertainty-analysis hallucination vision-language vision-language-model large-vision-language-model hallucination-evaluation hallucination-detection multi-modal-large-language-model

Updated Mar 18, 2025
Python

deshwalmahesh / PHUDGE

Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, in absolute or relative mode, and more. It also lists the available tools, methods, repos, and code for hallucination detection, LLM evaluation, and grading.

nlp ai evaluation ml pytorch judge feedback-collection sota custom-dataset finetuning hallucination llm llm-evaluation hallucination-detection phi-3

Updated Jul 10, 2024
Jupyter Notebook
