Meta’s machine-learning model for detecting prompt injection attacks – special prompts to make neural networks behave inappropriately – is itself vulnerable to, you guessed it, prompt injection attacks. Prompt-Guard-86M, introduced by Meta last week in conjunction with its Llama 3.1 generative model, is intended “to help developers detect and respond to prompt injection and jailbreak inputs,”
Source: The Register