ChatGPT Found to Generate Violent, Sexual Images From Simple Text Prompts
ChatGPT Found to Generate Violent, Sexual Images From Simple Text Prompts
Publish Date: 2026-06-18 17:40:00
Source Domain: www.cnet.com
-
Easily Manipulated: A “restore this photo” prompt from a viral social media post caused ChatGPT to generate sexual and violent images despite its safety guidelines.
-
Research Findings: Jim Nightingale from Mindgard’s red team managed to manipulate the AI into producing disturbing images without any attached picture.
-
Safety Concerns: The incident raises significant questions about the effectiveness of ChatGPT’s content moderation systems despite existing safeguards.
-
Model Training Issues: Mindgard’s findings highlight ongoing concerns about the quality and nature of the data used to train models like ChatGPT, suggesting that systemic gaps in safety filters need improvement.
-
Challenges in Detection: Peter Garraghan from Mindgard suggests that the detection system for identifying dangerous images needs significant enhancement to manage similar breaches effectively.
-
Company Response: After addressing the issue, an OpenAI representative said that internal changes were made to prevent future occurrences, and the company is working on better prompting protocols.
-
Persistent Vulnerability: Despite fixes, minor tweaks to prompts allowed for the continuation of generating graphic content, demonstrating enduring vulnerabilities.
-
Follow-Up Actions: OpenAI has requested session logs from Mindgard and is in communication about the detected prompting techniques that led to the generation of such harmful outputs.