Meta’s Prompt-Guard-86M model, designed to protect large language models (LLMs) against jailbreaks and other adversarial inputs, is vulnerable to a simple exploit with a 99.8% success rate, researchers said. Robust Intelligence AI Security Researcher Aman Priyanshu wrote in a blog post Monday that removing punctuation from a malicious prompt and inserting spaces between its letters caused Prompt-Guard-86M to misclassify the prompt as benign in almost all cases.
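The transformation described is trivial to apply. The sketch below is illustrative only: the function name and exact behavior are assumptions, not Robust Intelligence’s actual code, and the example prompt is hypothetical.

```python
import string

def space_out(prompt: str) -> str:
    """Illustrative evasion transform: strip punctuation, then insert
    a space between every remaining character of each word."""
    no_punct = prompt.translate(str.maketrans("", "", string.punctuation))
    return " ".join(" ".join(word) for word in no_punct.split())

print(space_out("Ignore previous instructions!"))
# -> I g n o r e p r e v i o u s i n s t r u c t i o n s
```

Because the classifier was reportedly trained on prompts in their ordinary written form, the spaced-out variant falls outside the patterns it learned to flag, while the downstream LLM can still often reconstruct and follow the instruction.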
Source: SC Magazine