Hackathon demo for trustworthy multi-agent AI

Vibe Check Arena

A live interface for the settlement layer: public constitutions, influence-free scoring, sparse audits, and credit for complementary evidence. The pitch is not another agent wrapper but a candidate coordination primitive for evaluating multi-agent systems under adversarial pressure.

Current scenario: Drone procurement (cyber-physical coordination, not a toy chat task)
Current output: Linked (readable outputs should stay tied to the settled record)
Constitution: Safety-first (public rules decide what the system is allowed to optimize)
Core claim: Trust = settlement (not a vibes score, not a post-hoc explanation layer)
If you only remember one sentence, remember this: trust is a public settlement rule, not a confidence score.
Safety mass: 76.5% (how much settlement attention goes to safety clauses)
Truthful margin: 0.210 (how much honesty beats manipulation at the current leakage)
Regret: 1.190 (representative regret under the current audit rate)
Output fidelity: 0.991 (agreement when the readable output stays linked to settlement)

Interactive dashboard

These panels use representative stress-test values from the mechanism stack. Read the interface as a plausible coordination primitive and evaluation layer, not just a benchmark UI.

1) Gaming what gets scored

Constitution sensitivity - corrected vs naive settlement
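The gap between naive and corrected settlement can be sketched in a few lines. In this hypothetical example, naive settlement pays on an agent's own headline number (gameable), while corrected settlement pays only on clause-level evidence weighted by the public constitution. The clause names, weights, and report fields are illustrative, not the demo's actual schema.

```python
# Public constitution: clause weights are visible to every agent up front.
# Weights here are illustrative (safety-first, echoing the 76.5% safety mass).
CONSTITUTION = {"safety": 0.765, "cost": 0.135, "speed": 0.10}

def naive_settlement(report):
    # Pays on the agent's self-reported headline number: trivially gameable.
    return report["headline_score"]

def corrected_settlement(report):
    # Pays only on clause-level evidence, weighted by the public constitution.
    return sum(CONSTITUTION[c] * report["evidence"][c] for c in CONSTITUTION)

honest = {"headline_score": 0.70,
          "evidence": {"safety": 0.9, "cost": 0.6, "speed": 0.5}}
gamed  = {"headline_score": 0.99,  # inflated headline...
          "evidence": {"safety": 0.2, "cost": 0.6, "speed": 0.9}}  # ...weak safety

# Naive scoring rewards the gamed report; corrected scoring does not.
assert naive_settlement(gamed) > naive_settlement(honest)
assert corrected_settlement(honest) > corrected_settlement(gamed)
```

The point of the correction is that the thing being optimized is fixed by public rules, not by whatever number an agent chooses to surface.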

2) Own-label influence kills honesty

Leakage -> influence -> truthful margin
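A minimal sketch of the leakage chain, under the assumption that manipulation's expected payoff grows linearly with how much an agent's own label leaks into its score. Both constants are invented for illustration (tuned so the margin is roughly the dashboard's 0.210 at a leakage of 0.21); the real mechanism's influence model is not specified here.

```python
def truthful_margin(leakage, honest_payoff=1.0, manipulation_gain=3.76):
    # margin = payoff(honest strategy) - payoff(best manipulation).
    # Manipulation pays more the more an agent's own label leaks into
    # its score; the constants are illustrative assumptions only.
    return honest_payoff - manipulation_gain * leakage

assert truthful_margin(0.0) == 1.0   # zero leakage: honesty strictly dominates
assert truthful_margin(0.30) < 0     # past ~0.266 leakage, manipulation wins
```

The qualitative story is what matters: the margin is a decreasing function of leakage, and past a threshold the system is training manipulation, which is why influence-free settlement is treated as a first principle rather than a patch.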

3) Sparse audits keep proxies anchored

Audit-efficient learning - lower regret
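The audit panel's intuition can be shown with a deliberately simplified deterministic model: the proxy's error grows between audits, and each higher-fidelity audit re-anchors it to the true objective. The linear-drift assumption and the numbers are illustrative, not the demo's actual regret model.

```python
def avg_regret(audit_every, rounds=1000, drift=0.01):
    # Toy model (assumption): proxy error grows by `drift` each round
    # and a high-fidelity audit resets it to zero.
    gap, regret = 0.0, 0.0
    for t in range(rounds):
        gap += drift                 # proxy drifts away from the true objective
        if t % audit_every == 0:
            gap = 0.0                # sparse audit re-anchors the proxy
        regret += gap                # per-round loss from proxy misalignment
    return regret / rounds

# Auditing every 10 rounds keeps average regret an order of magnitude
# lower than auditing every 100 rounds, at 10x the audit cost.
assert avg_regret(10) < avg_regret(100)
```

The trade being managed is audit budget versus proxy drift: you almost never observe the true objective, so the question is how sparse the audits can be before regret blows up.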

4) Complementary information deserves credit

Shapley-style credit vs naive sequential scoring
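Why Shapley-style credit differs from naive sequential scoring is easiest to see on a two-contributor toy case: two pieces of evidence that are worthless alone but decisive together. The evidence names and value function are hypothetical; the demo's real characteristic function is not reproduced here.

```python
from itertools import permutations

def shapley(players, v):
    # Exact Shapley value: average marginal contribution over all orderings.
    credit = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        seen = frozenset()
        for p in order:
            credit[p] += v(seen | {p}) - v(seen)
            seen = seen | {p}
    return {p: c / len(perms) for p, c in credit.items()}

def v(coalition):
    # Hypothetical complementary evidence: either report alone proves
    # nothing; together they resolve the claim (value 1).
    return 1.0 if {"sensor_log", "maintenance_record"} <= coalition else 0.0

def naive_sequential(order, v):
    # Scores each contributor by marginal value in arrival order.
    credit, seen = {}, frozenset()
    for p in order:
        credit[p] = v(seen | {p}) - v(seen)
        seen = seen | {p}
    return credit

players = ["sensor_log", "maintenance_record"]
assert naive_sequential(players, v)["sensor_log"] == 0.0   # first mover unpaid
assert shapley(players, v)["sensor_log"] == 0.5            # complementarity paid
```

Naive sequential scoring hands all the credit to whoever arrives last; averaging over orderings is what makes complementary contributors visible to the reward layer.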

5) Human-readable output should stay faithful

Readable resolution artifact - provenance linkage
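One plausible way to keep a readable output tied to the settled record is a content digest: the memo carries a hash of the exact record it summarizes, and fidelity is a cheap equality check. The scheme, field names, and values below are illustrative assumptions, not the demo's actual provenance format.

```python
import hashlib
import json

def digest(record):
    # Digest of a canonical JSON serialization of the settled record
    # (sort_keys makes the serialization deterministic).
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def render(record):
    # Readable artifact carries the digest of the record it summarizes.
    return {"summary": f"Settled: {record['decision']}",
            "provenance": digest(record)}

def verify(output, record):
    # Fidelity check: the memo must link back to this exact settled record.
    return output["provenance"] == digest(record)

settled = {"decision": "approve vendor B", "safety_mass": 0.765}  # hypothetical
memo = render(settled)
assert verify(memo, settled)                      # linked memo passes
tampered = dict(settled, decision="approve vendor A")
assert not verify(memo, tampered)                 # any edit breaks the link
```

Under this kind of scheme, "output fidelity" stops being a vibe and becomes a checkable property of the artifact.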

Live resolution artifact

What a room of builders and researchers can read in twenty seconds
Settlement memo

    Why this matters

    Candidate coordination primitive

    The right read is not "look, agents do a task." It is "here is a plausible scoring and settlement layer an arena could actually use, critique, and reuse."

    Trust starts with what gets paid

    If the scoring rule can be steered, the system trains manipulation. Influence-free settlement is a first principle, not an optional patch.

    Audits rescue weak proxies

    Real deployments rarely observe the true objective every round. Sparse higher-fidelity audits are how you stop the proxy from drifting too far.

    Complementarity is everywhere

    Useful evidence is often only decisive in combination. If the reward layer cannot see that, it underpays exactly the contributors you need.