文章
106
标签
103
分类
28
首页
归档
分类
类别分类
作者分类
标签
友链
关于
LLM Security Group 's Notes
搜索
首页
归档
分类
类别分类
作者分类
标签
友链
关于
越狱攻击
标签 - 越狱攻击
2025
2025-11-17
Distract Large Language Models for Automatic Jailbreak Attack
2025-11-17
GeneShift: Impact of Different Scenario Shift on Jailbreaking LLM
2025-11-10
A Wolf in Sheep’s Clothing Generalized Nested Jailbreak Prompts can Fool Large Language Models Easily
2025-11-10
Open Sesame! Universal Black Box Jailbreaking of Large Language Models
2025-11-03
AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models
2025-10-27
Jailbreaking? One Step Is Enough
2025-10-26
Images are Achilles’ Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models
2025-10-25
Distraction is All You Need for Multimodal Large Language Model Jailbreaking
2025-09-14
Align is not Enough: Multimodal Universal Jailbreak Attack against Multimodal Large Language Models
2025-09-14
Safety Misalignment Against Large Language Models
1
2
LLM Security Group
分享知识,认识世界
文章
106
标签
103
分类
28
Follow Me
公告
This is my Blog
最新文章
PLeak: Prompt Leaking Attacks against Large Language Model Applications
2025-11-24
Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction
2025-11-23
BaitAttack: Alleviating Intention Shift in Jailbreak Attacks via Adaptive Bait Crafting
2025-11-23
Salience-Aware Face Presentation Attack Detection via Deep Reinforcement Learning
2025-11-20
Towards Universal AI-Generated Image Detection by Variational Information Bottleneck Network
2025-11-20
分类
ADVERSARIAL DEFENSE
1
AI系统优化
1
Adversarial
2
Adversarial Text Generation
1
Adversarial attack
1
Attack
1
BLACK BOX ATTACKS
1
High Confidence Predictions for Unrecognizable Images
1
标签
噪声表示学习
GCG优化
人脸伪造检测
基本迭代法
基于CatmullRom样条回归
算法
补丁攻击
模型安全
大型多模态模型
PUZZLED
微调
成对排序学习
多智能体协作
对抗提示
自适应感知模块
面部伪装攻击检测
特征增强
注意力分散
adversarial example
Image Recognition
密码攻击
上下文学习
越狱攻击防御
特征融合
对抗样本
编码器解码器
Search-R1
LLM辅助越狱
进化算法
Adversarial Text Generation
信噪分离
注意力机制
MASTERKEY
多轮越狱
频域特征
可学习干预
BaitAttack
数据集创建(自动标注)
PAPILLON
梯度上升
归档
十一月 2025
23
十月 2025
25
九月 2025
13
八月 2025
45
网站信息
文章数目 :
106
本站访客数 :
本站总浏览量 :
最后更新时间 :
搜索
数据加载中