Researchers Find Chatgpt Vulnerabilities That Let Attackers Trick...
Cybersecurity researchers have disclosed a new set of vulnerabilities impacting OpenAI's ChatGPT artificial intelligence (AI) chatbot that could be exploited by an attacker to steal personal information from users' memories and chat histories without their knowledge.
The seven vulnerabilities and attack techniques, according to Tenable, were found in OpenAI's GPT-4o and GPT-5 models. OpenAI has since addressed some of them.
These issues expose the AI system to indirect prompt injection attacks, allowing an attacker to manipulate the expected behavior of a large language model (LLM) and trick it into performing unintended or malicious actions, security researchers Moshe Bernstein and Liv Matan said in a report shared with The Hacker News.
The disclosure comes close on the heels of research demonstrating various kinds of prompt injection attacks against AI tools that are capable of bypassing safety and security guardrails -
The findings show that exposing AI chatbots to external tools and systems, a key requirement for building AI agents, expands the attack surface by presenting more avenues for threat actors to conceal malicious prompts that end up being parsed by models.
"Prompt injection is a known issue with the way that LLMs work, and, unfortunately, it will probably not be fixed systematically in the near future," Tenable researchers said. "AI vendors should take care to ensure that all of their safety mechanisms (such as url_safe) are working properly to limit the potential damage caused by prompt injection."
The development comes as a group of academics from Texas A&M, the University of Texas, and Purdue University found that training AI models on "junk data" can lead to LLM "brain rot," warning "heavily relying on Internet data leads LLM pre-training to the trap of content contamination."
Last month, a study from Anthropic, the U.K. AI Security Institute, and the Alan Turing Institute also discovered that it's possible to successfully backdoor AI models of different sizes (600M, 2B, 7B, and 13B parameters) using just 250 poisoned documents, upending previous assumptions that attackers needed to obtain control of a certain percentage of training data in order to tamper with a model's behavior.
From an attack standpoint, malicious actors could attempt to poison web content that's scraped for training LLMs, or they could create and distribute their own poisoned versions of open-source models.
"If attackers only need to inject a fixed, small nu
Source: The Hacker News