Hackers Compromised ChatGPT Model with Indirect Prompt Injection

Recently at the Black Hat event, the following cybersecurity researchers demoed how they compromised the chatGPT model with indirect prompt injection:- Kai Greshake from Saarland University and Sequire Technology GmbH Sahar Abdelnabi from CISPA Helmholtz Center for Information Security Shailesh Mishra from Saarland University Christoph Endres from Sequire Technology GmbH Thorsten Holz from CISPA Helmholtz Center for Information Security Mario Fritz from CISPA Helmholtz Center for Information Security LLM-integrated applications (Source – BlackHat) Indirect Prompt Injection Indirect Prompt Injection challenges LLMs, blurring data-instruction lines, as adversaries can remotely manipulate systems via injected prompts. Retrieval of such prompts indirectly controls the models, raising concerns about recent incidents revealing behaviors that are unwanted. This shows that how adversaries could deliberately alter LLM behavior in apps, impacting millions of users. The unknown attack vector brings diverse threats, prompting the development of a comprehensive taxonomy to assess these vulnerabilities from a security perspective.

Source: GBHackers