Pentagon, IC want industry to provide an ‘evaluation harness’ to standardize testing of AI systems

https://defensescoop.com/2026/03/11/ai-system-testing-dod-intelligence-agencies/

Publish Date: 2026-03-11 11:57:00

Source Domain: defensescoop.com

  • The Defense Department and the Intelligence Community are seeking an “evaluation harness” to test AI technologies from various vendors for government use.
  • This effort, known as “MYSTIC DEPOT,” is run by the Pentagon’s Defense Innovation Unit and will be pursued through a commercial solutions opening contracting mechanism.
  • The initiative is spearheaded by Defense Secretary Pete Hegseth and Pentagon CTO Emil Michael as part of their push to integrate advanced AI capabilities across military and office functions.
  • The effort aims to create rigorous, reproducible, and vendor-agnostic assessments of AI systems against government-defined criteria, so that evaluations keep pace with rapid advances in AI technology.
  • The government is looking for evaluation benchmarks that apply to workflows across classification levels, including unclassified, secret and top secret environments, so that multiple programs can use them.
  • Officials are seeking an advanced evaluation harness that can test AI models in mission-critical settings and in denied, degraded, intermittent or limited (DDIL) environments, and that offers automated red-teaming capabilities.
  • The envisioned evaluation harness should also provide interfaces for subject matter experts to assess human workload, usability, and mission performance in human-only, AI-only, and human-AI team scenarios.
  • Responses to the solicitation, which is part of broader efforts to modernize military technology and operations, are due by March 24.