AI Challenges & Blinded Evaluation
AI Challenges: Secure Collaborative Model Testing
In CENTAURON, an AI Challenge provides the mechanism for training or validating AI models across multiple institutions without transferring whole-slide images or revealing ground-truth annotations. Instead of pooling data, models are dispatched to participating nodes, executed locally, and evaluated against encrypted annotations inside each institution's environment.
This structure enables meaningful, multi-center AI benchmarking under strict data-sovereignty guarantees.
Core Principles
- Model-to-Data Execution
  AI algorithms are sent to the institutions; data never leaves the clinical environment.
- Blinded Evaluation
  Ground-truth labels remain encrypted and are decrypted only locally for evaluation; model outputs never reveal the underlying annotations (see the sketch after this list).
- Cross-Institution Validation
  Model performance metrics are computed independently at each site, then aggregated to produce unbiased, multi-center evidence.
- Transparent Attribution
  All Challenge submissions and evaluation events are recorded on the permissioned blockchain, creating an immutable audit trail.
- Controlled Participation
  Institutions choose which models may be executed locally and can set cryptographic usage constraints via smart contracts.
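A minimal sketch of what blinded local evaluation could look like, assuming labels are stored as a Fernet-encrypted JSON file and the model exposes a `predict()` method. The names `evaluate_locally`, `site_key`, and the metric payload are hypothetical illustrations, not CENTAURON's actual API:

```python
# Hypothetical sketch of a blinded local evaluation step.
# run_model/predict and the metric payload are illustrative only.
import json
from cryptography.fernet import Fernet

def evaluate_locally(model, encrypted_labels: bytes, site_key: bytes, cases):
    """Decrypt ground truth only in memory, score the model, and return
    aggregate metrics -- never the labels or the raw predictions."""
    # Labels exist in plaintext only inside the institutional environment.
    labels = json.loads(Fernet(site_key).decrypt(encrypted_labels))
    correct = 0
    for case_id, image in cases:
        prediction = model.predict(image)  # model executes at the institution
        correct += int(prediction == labels[case_id])
    # Only the aggregate metric leaves the site.
    return {"n_cases": len(cases), "accuracy": correct / len(cases)}
```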
Evaluation Pipeline
- A clinical Project defines the evaluation protocol and quality criteria
- Institutions register eligible cases locally to the Project
- The AI model is deployed across participating nodes
- Evaluation runs locally using encrypted ground truth
- Only performance metrics and cryptographic proofs are returned (see the sketch after this list)
- No raw data or annotation content is ever exposed
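A sketch of the last two steps under stated assumptions: the HMAC attestation below stands in for whatever proof scheme the platform actually uses, and `attest_metrics` / `aggregate` are hypothetical helper names. Each site returns only its attested metrics; a coordinator combines them into a multi-center result:

```python
# Illustrative sketch: a site returns metrics plus a cryptographic proof,
# and the coordinator aggregates across sites. The HMAC here is a stand-in
# for the platform's actual proof scheme.
import hashlib
import hmac
import json

def attest_metrics(metrics: dict, site_secret: bytes) -> dict:
    """Bind the reported metrics to the site with a keyed hash."""
    payload = json.dumps(metrics, sort_keys=True).encode()
    proof = hmac.new(site_secret, payload, hashlib.sha256).hexdigest()
    return {"metrics": metrics, "proof": proof}

def aggregate(site_reports: list[dict]) -> dict:
    """Case-weighted mean accuracy across sites; no raw data involved."""
    total = sum(r["metrics"]["n_cases"] for r in site_reports)
    accuracy = sum(r["metrics"]["accuracy"] * r["metrics"]["n_cases"]
                   for r in site_reports) / total
    return {"sites": len(site_reports), "total_cases": total,
            "accuracy": accuracy}
```

Weighting by case count keeps large and small cohorts from being treated identically, while the per-site proofs let the aggregated result be audited without ever exposing annotations.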
Outcome
AI Challenges allow institutions to participate in collaborative model development and validation while preserving data sovereignty and annotation confidentiality. The result is a scalable, verifiable, and bias-resistant framework for generating real-world evidence for medical AI systems without centralized data aggregation.