Archive | LLM Security Group's Notes

All posts - 114

2025

2025-11-10

A Wolf in Sheep’s Clothing: Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily

2025-11-10

Open Sesame! Universal Black Box Jailbreaking of Large Language Models

2025-11-04

PAPILLON: Efficient and Stealthy Fuzz Testing-Powered Jailbreaks for LLMs

2025-11-04

SNIS: A Signal Noise Separation-Based Network for Post-Processed Image Forgery Detection

2025-11-04

AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models

2025-11-04

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks

2025-11-03

Identification of image global processing operator chain based on feature decoupling

2025-11-03

Is Artificial Intelligence Generated Image Detection a Solved Problem?

2025-11-03

AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models

2025-11-03

GPT-4 Is Too Smart to Be Safe: Stealthy Chat with LLMs via Cipher
