Microsoft’s ‘AI Watchdog’ defends against new LLM jailbreak method

Microsoft has discovered a new method to jailbreak large language model (LLM) artificial intelligence (AI) tools and shared its ongoing efforts to improve LLM safety and security in a blog post Thursday. Microsoft first revealed the “Crescendo” LLM jailbreak method in a paper published April 2, which describes how an attacker could send a series of seemingly benign prompts to gradually lead a chatbot, such as OpenAI’s ChatGPT, Google’s Gemini, Meta’s LlaMA or Anthropic’s Claude, to produce an output that would normally be filtered and refused by the LLM model.

Source: SC Magazine

 


Date:

Categorie(s):

Tag(s):