The Contradiction Benchmark for AI and enterprises

Contradictions create instability, hallucinations, and policy risk. Contradish provides a neutral score and an audit trail that teams can trust.


Contradish Index preview

One view that compares models on contradiction risk. Lower is better.

| Model | Risk | Contradiction density | Status |
| GPT-4 | Low | 2.3% | Preview |
| Claude-3.5 | Low | 2.7% | Preview |
| Gemini Pro | Medium | 4.1% | Preview |
| LLaMA-3 | Medium | 5.2% | Preview |

Method preview only. Full Index v0.1 will include datasets, confidence intervals, and version tracking.

Why teams use Contradish

Metrics

Contradiction density

Share of evaluated outputs that contain A vs not-A conflicts under a reference oracle.
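
As a rough illustration of the definition above, contradiction density can be computed as a simple ratio. This is a minimal sketch, not the Index's actual method: the function name `contradiction_density` and the sample verdicts are hypothetical, and it assumes the reference oracle has already labeled each output as containing a conflict (`True`) or not (`False`).

```python
from typing import Iterable


def contradiction_density(flags: Iterable[bool]) -> float:
    """Share of evaluated outputs flagged as containing an A vs not-A conflict.

    `flags` holds one oracle verdict per evaluated output (assumed input format).
    """
    flags = list(flags)
    if not flags:
        raise ValueError("no evaluated outputs")
    # True counts as 1, so the sum is the number of flagged outputs.
    return sum(flags) / len(flags)


# Hypothetical oracle verdicts for 8 outputs; one contains a contradiction.
verdicts = [False, False, True, False, False, False, False, False]
print(f"{contradiction_density(verdicts):.1%}")  # 12.5%
```

Lower values mean fewer flagged outputs, which matches the "lower is better" reading of the Index.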

Contradiction entropy

Disagreement across annotators or models. High entropy marks uncertain regions worth review.
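
One plausible way to quantify this disagreement is Shannon entropy over the labels that annotators or models assign to the same item. The sketch below assumes that reading; the page does not specify the formula, and the function name `contradiction_entropy` and the label strings are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def contradiction_entropy(labels: Sequence[str]) -> float:
    """Shannon entropy, in bits, of the labels given to one item.

    0.0 means every annotator (or model) agreed; higher means more disagreement.
    """
    if not labels:
        raise ValueError("no labels for this item")
    counts = Counter(labels)
    total = len(labels)
    # Sum of p * log2(1/p) over the observed label frequencies.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical labels from four annotators on two items.
print(contradiction_entropy(["conflict", "conflict", "conflict", "conflict"]))  # 0.0
print(contradiction_entropy(["conflict", "conflict", "consistent", "consistent"]))  # 1.0
```

Items with entropy near the maximum are the uncertain regions worth sending to review.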

Coherence delta

Gap between a model’s reasoning and its final answer. Large gaps signal fragile explanations.
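
The page does not define how the gap is measured. As one hedged interpretation, the sketch below treats coherence delta as the share of samples where the answer supported by the reasoning differs from the final answer. The `Sample` type, its field names, and the assumption that a separate step has already extracted the reasoning's implied answer are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    reasoning_answer: str  # answer implied by the reasoning (assumed pre-extracted)
    final_answer: str      # answer the model actually returned


def _normalize(text: str) -> str:
    # Ignore case and surrounding whitespace when comparing answers.
    return text.strip().lower()


def coherence_delta(samples: Sequence[Sample]) -> float:
    """Share of samples whose reasoning and final answer disagree (assumed definition)."""
    if not samples:
        raise ValueError("no samples")
    mismatches = sum(
        1 for s in samples
        if _normalize(s.reasoning_answer) != _normalize(s.final_answer)
    )
    return mismatches / len(samples)


# Hypothetical samples: the third one reasons toward "no" but answers "yes".
samples = [
    Sample("yes", "Yes"),
    Sample("no", "no"),
    Sample("no", "yes"),
    Sample("42", " 42 "),
]
print(coherence_delta(samples))  # 0.25
```

Exact string comparison is a deliberate simplification; a real audit would likely need a semantic comparison of the two answers.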

Compression-Aware Intelligence

Definition: Compression-Aware Intelligence (CAI) is the framework that treats hallucinations and contradictions not as errors to eliminate, but as measurable signals of representation strain inside any cognitive system, and uses those signals to guide stability, coherence, and self-correction.

CAI describes how intelligent systems form and maintain coherence under compression. Every reasoning system must reduce high-detail inputs into lower-dimensional representations. Compression creates loss, and loss produces structural tension between representations. CAI provides the method for identifying, measuring, and working with this tension.

A system is compression-aware when it can detect where loss has occurred, represent the contradictions that emerge from that loss, and stabilize reasoning without collapsing into oversimplification or hallucination. The presence, shape, and routing of these contradictions determine whether a system is coherent, brittle, delusional, or aligned with reality.

Contradictions are signals that compression failed or that two constraints conflict. Measuring these signals improves stability and trust.

Join the waitlist

Get the Index preview and updates.

FAQ

Is this different from accuracy?

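Yes. Accuracy alone can hide instability: a model can score well on average and still give conflicting answers to equivalent questions. Contradiction metrics measure consistency and policy alignment directly.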

Why an external score?

Neutral measurement is more credible to regulators, clients, and the public.

Can audits use private data?

Yes. Audits can run locally or with strict deletion terms. The public Index uses open sets.