Anthropic Fable AI: Cybersecurity Experts Call Out Flaws in Anthropic’s Fable AI Model Safety Measures, ETEnterpriseai

https://enterpriseai.economictimes.indiatimes.com/news/industry/cybersecurity-experts-call-out-flaws-in-anthropics-fable-ai-model-safety-measures/131652598

Publish Date: 2026-06-11 03:28:00

Source Domain: enterpriseai.economictimes.indiatimes.com

Author:

Using an unordered list, summarize the following article with between 4 and 8 key points.

Despite acknowledging the rationale behind the safeguards, some cybersecurity experts said the implementation remains problematic.

Anthropic’s newly launched AI model Fable is drawing criticism from cybersecurity researchers, who say the model’s safety restrictions are overly broad and block even simple tasks that trip its guardrails, according to TechCrunch.

The company unveiled Fable on Tuesday as a limited public version of its cybersecurity-focused AI model, Mythos. Anthropic said the model includes guardrails designed to prevent misuse for cyberattacks or biological threats.

However, several security professionals have argued that the restrictions are too aggressive. Valentina “Chompie” Palmiotti, a security researcher at IBM X-Force, said Fable rejects requests that are only loosely related to cybersecurity.

“Fable rejects any request that could be tangentially cyber-related. Even innocuous tasks like reading a blog post,” said Palmiotti.

When a request triggers the model’s safeguards, Fable pauses the conversation and displays a message indicating that its safety systems have flagged the content as potentially cybersecurity- or biology-related.

Anthropic introduced the restrictions to reduce the risk that the model would be used to develop malware, exploit software vulnerabilities, or assist in biological weapon development. Similar concerns have shaped the company’s approach to its more advanced Mythos model.

Mythos was initially made available only to a limited group of organisations through Anthropic’s Project Glasswing initiative, which focuses on securing critical software and infrastructure. Last week, the company expanded access to Mythos to hundreds of organisations across 15 countries.

Matt Suiche, a cybersecurity veteran and member of the technical staff at AI security startup Tolmo, told TechCrunch that Fable can mistakenly classify software engineering tasks as cybersecurity activities.

“If you ask it to write secure code, it assumes it is cybersecurity-related work instead of software engineering best practices, and you get downgraded,” Suiche said.

When a guardrail is triggered, Fable automatically falls back to Anthropic’s Claude Opus 4.8 model. Suiche suggested that the filtering system appears to rely heavily on keywords related to cybersecurity.

“It seems to be keyword-based, so anything in the lexical field of ‘cybersecurity’ triggers the guardrails,” he said.

However, Suiche also defended Anthropic’s cautious approach, noting that the company is still refining its safeguards.

“It’s better to catch more people than not enough when you do such a release and to relax the guardrails over time,” Suiche added.

Other researchers echoed similar concerns, with one security professional posting on X that even requests for a code review were being flagged by the system.

Anthropic did not immediately respond to TechCrunch’s request for comment.

In addition to model-level restrictions, Anthropic requires security professionals seeking broader cybersecurity-related access to apply through its Cyber Verification Program. Approved users face fewer limitations when using Claude for security research.

Published On Jun 11, 2026 at 12:58 PM IST


