AI models appear to recognize moral complexity — then ignore it, new study by researchers affiliated with Harvard Kennedy School’s Allen Lab finds

When faced with complex ethical dilemmas, AI models express uncertainty yet reach the same decisions consistently, which points to an implicit value hierarchy rather than true ethical deliberation.
The study, “Crocodile Tears: Can the Ethical-Moral Intelligence of AI Models Be Trusted?”, found that leading AI models resolved tragic ethical tradeoffs uniformly, favoring options related to worker safety over environmental protections or vocational training.
The authors introduce an ethical-moral intelligence framework focusing on expertise, sensitivity, coherence, and transparency, and suggest existing benchmarks are inadequate for evaluating moral reasoning in AI.
The research calls for greater transparency in AI models’ ethical reasoning, suggesting that systems should alert users when values conflict, and proposes a “badging” system in place of monolithic benchmarks for model evaluation.
Sarah Hubbard warns against trusting AI models as genuine ethical-moral agents and stresses the need for high standards of ethical-moral intelligence before these models are entrusted with decisions carrying moral weight.