Skip to content

Latest commit

 

History

History
101 lines (77 loc) · 6.13 KB

File metadata and controls

101 lines (77 loc) · 6.13 KB

ECHIDNA — Extensible Cognitive Hybrid Intelligence — Show Me The Receipts

The README makes claims. This file backs them up with evidence.

Key Claims From README

ECHIDNA orchestrates 30 theorem provers, SMT solvers, first-order ATPs, and constraint solvers through a unified Rust core. Every proof result passes through a trust-hardening pipeline that checks solver integrity, tracks axiom usage, verifies proof certificates, and assigns a 5-level confidence score.

— README section "Overview"

Evidence: The project implements 48 prover backends across 10 tiers (lines 29-57): - Tier 1: Agda, Coq/Rocq, Lean 4, Isabelle/HOL, Idris2, F* - Tier 2: Z3, CVC5, Alt-Ergo - Tier 3: Dafny, Why3 - Tier 4: Metamath, HOL Light, Mizar, HOL4, PVS, ACL2, TLAPS, Twelf, Nuprl, Minlog, Imandra - Tier 5: Vampire, E Prover, SPASS - Tier 6: GLPK, SCIP, MiniZinc, Chuffed, OR-Tools

All 48 implement the ProverBackend trait (line 54-56). The trust pipeline (lines 59-114) applies 13 checks: solver integrity, portfolio solving, certificate checking, axiom tracking, sandboxing, 5-level confidence hierarchy, mutation testing, proof exchange, Pareto optimization, statistical tracking, and dispatch orchestration.

Caveat: All 48 backends have ProverBackend trait implementations with no unimplemented!()/todo!()/unreachable!() macros (sample verified 2026-04-05: Coq 1121 LoC, Lean 4 1648, Agda 673, Isabelle 360, Idris2 1253, Z3 840, CVC5 841). The README claims 30 but the system supports 48 total. "Operational" here means trait-implemented and shells out to a real external binary (e.g. coqc, lean, agda, z3) — it does NOT mean the live subprocess path has been exercised in CI against non-trivial goals. See testing caveat below.

Caveat (testing): The 30 e2e prover tests (tests/e2e_prover_test.rs) and most integration tests are in-process and mock-based (common::mock_prover). They exercise the dispatch infrastructure and trust pipeline with mocked prover results; they do NOT invoke Tier 1 binaries (coqc, lean, agda, z3, etc.) against non-trivial goals. The verify command on the included proofs/coq/basic.v fixture does shell out to coqc — but via a round-trip through echidna’s own re-serialization (parser fidelity untested on non-trivial proofs). The Z3 prove on proofs/z3/basic.smt2 short-circuits before subprocess spawn (the fixture is (assert true)). Live subprocess correctness for the 48 backends is not covered by the current test suite.

Caveat (binaries): Running Tier 1 backends requires the corresponding external binaries installed on the host (coqc, lean, agda, isabelle, idris2, z3, cvc5). These are not bundled. On a reference Fedora 43 system (2026-04-05), Coq/Lean/Agda/Idris2/Z3 were installed via opam/elan/local; Isabelle and CVC5 were absent.

306+ tests passing across the codebase: 232 unit tests, 38 integration tests, 21 property-based tests

— README section "Test Suite"

Evidence: The test metrics (lines 255-262) report: - 232 unit tests (lib) - 38 integration tests - 21 property-based tests (PropTest) - Additional interface and neural integration tests - Total: 638+ tests passing (v2.1 status, line 296)

This is verifiable with cargo test --lib (unit) and cargo test --test integration_tests.

Caveat: The 638+ number (v2.1) is higher than the README’s 306+ (v1.5). The README was written for v1.5; v2.1 added GNN integration with 28+ new tests.

Technology Choices

Technology Learn More

Rust

https://www.rust-lang.org (core logic, 48 prover backends, trust pipeline)

Julia 1.10+

https://julialang.org (ML inference: tactic prediction, premise selection)

ReScript + Deno

UI components (33 files, zero TypeScript)

Chapel

Optional parallel proof dispatch (compute-heavy operations)

Idris2 ABI

https://www.idris-lang.org (formal proofs, 16 modules, zero believe_me)

Zig FFI

https://ziglang.org (C-compatible API for all 48 provers)

Dogfooded Across The Account

Uses the hyperpolymath ABI/FFI standard (Idris2 + Zig). Same pattern across proven, burble, and gossamer.

ECHIDNA-specific: - src/abi/ — Idris2 formal proofs (16 modules, type soundness proofs) - ffi/zig/ — Zig FFI layer (7 exported functions, C ABI compatibility) - generated/abi/ — Auto-generated C headers from Idris2

Agda meta-checker (30+ formally verified trust pipeline properties) provides independent formal verification of the Rust trust pipeline correctness.

File Map: Key Modules

Path What’s There

src/rust/provers/

48 prover backend implementations (ProverBackend trait)

src/rust/verification/

Trust pipeline (portfolio, certificates, axioms, confidence, mutation, pareto, statistics)

src/rust/integrity/

Solver binary integrity (SHAKE3-512, BLAKE3)

src/rust/executor/

Sandboxed solver execution (Podman, bubblewrap)

src/rust/exchange/

Cross-prover proof exchange (OpenTheory, Dedukti)

src/rust/dispatch.rs

Full trust-hardening dispatch pipeline

src/rust/agent/

Agentic proof search (actor model)

src/rust/gnn/

GNN integration (graph construction, embeddings, guided search)

src/rust/neural.rs

Neural premise selection (tactic prediction)

src/rust/core.rs

Core types (Term, ProofState, Tactic, Goal, Context, Theorem)

src/rust/server.rs

HTTP API server (GraphQL, gRPC, REST)

src/rust/repl.rs

Interactive REPL

src/interfaces/

API interfaces (GraphQL, gRPC, REST with real prover invocation)

src/julia/

ML layer (logistic regression, MRR 0.66)

src/rescript/

UI components (33 files, proof exploration)

src/abi/

Idris2 formal specifications (16 modules)

ffi/zig/

Zig FFI implementation (7 functions)

proofs/agda/

Agda meta-checker (30+ formally verified trust properties)

tests/

638+ tests (unit, integration, property, interface)

Questions?

Open an issue in the hyperpolymath/echidna repository for questions about prover backend implementation, trust pipeline architecture, formal verification approach, or GNN-guided proof search integration.