Cybersecurity

Cyber: Anthropic Says Chinese AI Firms Used 16 Million Claude Queries To...

2026-02-24 0 views admin

Anthropic on Monday said it identified "industrial-scale campaigns" mounted by three artificial intelligence (AI) companies, DeepSeek, Moonshot AI, and MiniMax, to illegally extract Claude's capabilities to improve their own models.

The distillation attacks generated over 16 million exchanges with its large language model (LLM) through about 24,000 fraudulent accounts in violation of its terms of service and regional access restrictions. All three companies are based in China, where the use of its services is prohibited due to "legal, regulatory, and security risks."

Distillation refers to a technique where a less capable model is trained on the outputs generated by a stronger AI system. While distillation is a legitimate way for companies to produce smaller, cheaper versions of their own frontier models, it's illegal for competitors to leverage it to acquire such capabilities from other AI companies at a fraction of the time and cost that would take them if they were to develop them on their own.

"Illicitly distilled models lack necessary safeguards, creating significant national security risks," Anthropic said. "Models built through illicit distillation are unlikely to retain those safeguards, meaning that dangerous capabilities can proliferate with many protections stripped out entirely."

Foreign AI companies that distill American models can weaponize these unprotected capabilities to facilitate malicious activities, cyber-related or otherwise, thereby serving as a foundation for military, intelligence, and surveillance systems that authoritarian governments can deploy for offensive cyber operations, disinformation campaigns, and mass surveillance.

The campaigns detailed by AI upstart entail the use of fraudulent accounts and commercial proxy services to access Claude at scale while avoiding detection. Anthropic said it was able to attribute each campaign to a specific AI lab based on request metadata, IP address correlation, request metadata, and infrastructure indicators.

The details of the three distillation attacks are below -

"The volume, structure, and focus of the prompts were distinct from normal usage patterns, reflecting deliberate capability extraction rather than legitimate use," Anthropic added. "Each campaign targeted Claude's most differentiated capabilities: agentic reasoning, tool use, and coding."

The company also pointed out that the attacks relied on commercial proxy services that resell access to Claude and other frontier AI

Source: The Hacker News