Anti-Diffusion: Preventing Abuse of Modifications of Diffusion-Based Models

Li Zheng1*, Liangbin Xie1,2*, Jiantao Zhou1†, Xintao Wang3, Haiwei Wu1, Jinyu Tian4
1 University of Macau
2 Shenzhen Institute of Advanced Technology
3 Kuaishou Technology
4 Macau University of Science and Technology
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

arXiv:2503.05595v1 [cs.CV] 7 Mar 2025

Abstract

Although diffusion-based techniques have shown remarkable success in image generation and editing tasks, their abuse can lead to severe negative social impacts. Recently, some works have been proposed to provide defense against the abuse of diffusion-based methods. However, their protection may be limited in specific scenarios by manually defined prompts or the stable diffusion (SD) version. Furthermore, these methods solely focus on tuning methods, overlooking editing methods that could also pose a significant threat. In this work, we propose Anti-Diffusion, a privacy protection system designed for general diffusion-based methods, applicable to both tuning and editing techniques. To mitigate the limitations of manually defined prompts on defense performance, we introduce the prompt tuning (PT) strategy, which enables precise expression of original images. To provide defense against both tuning and editing methods, we propose the semantic disturbance loss (SDL) to disrupt the semantic information of protected images. Given the limited research on defense against editing methods, we develop a dataset named Defense-Edit to assess the defense performance of various methods. Experiments demonstrate that our Anti-Diffusion achieves superior defense performance across a wide range of diffusion-based techniques in different scenarios.

Figure 1: Our defense system, called Anti-Diffusion, can provide defense against both tuning and editing methods.

Code — https://github.com/whulizheng/Anti-Diffusion

Introduction

The field of text-to-image synthesis (Li et al. 2023; Ramesh et al. 2021; Gafni et al. 2022; Ding et al. 2021) has experienced significant advancements, primarily driven by diffusion models (Ho, Jain, and Abbeel 2020; Song, Meng, and Ermon 2020). Numerous diffusion models have demonstrated their ability to generate images of exceptional quality, such as SD (Rombach et al. 2022; Yang et al. 2023) and Pixel-Art (Chen et al. 2023, 2024). Based on these diffusion models, controllable generation methods (ControlNet (Zhang, Rao, and Agrawala 2023), T2I-Adapter (Mou et al. 2023)) and personalized methods (DreamBooth (Ruiz et al. 2023), LoRA (Hu et al. 2021), Textual Inversion (Gal et al. 2022)) have also been proposed. With the rapid advancement of text-to-image techniques, many industry professionals and even ordinary users can create images or train personalized models based on their ideas.

However, technology is a double-edged sword. Individuals can easily utilize images to train personalized models (e.g., DreamBooth, LoRA) and manipulate images using editing methods such as MasaCtrl (Cao et al. 2023) and DiffEdit (Couairon et al. 2023). Similar to DeepFake (Liu et al. 2023; Rana et al. 2022), when these methods are abused by malicious users to create fake news, plagiarize artistic creations, or violate personal privacy, they can have severe negative impacts on both individuals and society (Wang et al. 2023). Hence, finding ways to protect images from the potential abuse of these methods is a pressing issue that requires immediate attention.

Anti-DreamBooth (Anti-DB) (Van Le et al. 2023) has made attempts to address this issue. By adding subtle adversarial noise to images, Anti-DB forces the personalized model trained on them to produce outputs with significant visual artifacts. However, Anti-DB demands additional substitute data and manually defined prompts, which increases its complexity of use. Moreover, in practical scenarios, it is challenging to anticipate the prompts that malicious users might utilize, thereby limiting its defense performance. Additionally, existing methods (Truong, Dang, and Le 2024) focus solely on defending against personalized generative models, overlooking another crucial scenario: defense against editing models. Editing models have the capability to directly modify the content of input images during inference using prompts, thereby presenting a significant security and privacy threat if abused.

Defense         Test  FDFR↑  ISM↓  BRISQUE↑
Anti-DB(c1)     c1    0.60   0.24  37.41
Anti-DB(c2)     c1    0.48   0.20  37.21
Anti-Diffusion  c1    0.62   0.15  40.46
Anti-DB(c1)     c2    0.37   0.27  36.37
Anti-DB(c2)     c2    0.40   0.25  36.96
Anti-Diffusion  c2    0.60   0.17  40.66

Table 1: Defense performance on the DreamBooth model with different prompts. c1 ("a photo of sks person"), c2 ("a dslr portrait of sks person").

In this work, we propose Anti-Diffusion, a privacy protection system to prevent images from being abused by general diffusion-based methods. This system adds subtle adversarial noise (Goodfellow, Shlens, and Szegedy 2014) to users' images before publishing in order to disrupt the tuning and editing processes of diffusion-based methods. To mitigate the impact of the prompts used during defense and malicious use, and to overcome the limitations of manually defined prompts in achieving optimal performance, as shown in Tab. 1, we propose the prompt tuning (PT) strategy. This strategy optimizes a text embedding that more accurately captures the information of protected images. Our method with PT does not require manual selection of prompts during the defense phase and still provides good protection against malicious users training with unknown prompts. Furthermore, as SD achieves semantic control of images through cross-attention (Vaswani et al. 2017), we introduce the semantic disturbance loss (SDL) to disrupt the semantic information of protected images. By minimizing the distance between the cross-attention map and a zero-filled map, it maximizes the semantic distance between clean images and protected images. Equipped with these two designs, our Anti-Diffusion achieves robust defense against both tuning and editing methods, as shown in Fig. 1. To better evaluate the effectiveness of current defense methods against diffusion-based editing methods, we further construct a dataset named Defense-Edit. We hope this dataset can draw attention to the privacy protection challenges posed by diffusion-based image editing models. In summary, our contributions are as follows:

1) We expand the defense to include both tuning-based and editing-based methods, while other baselines focus only on tuning-based methods.
2) We introduce the PT strategy to ensure a better representation of protected images and to provide more generalized protection against unexpected prompts.
3) We integrate the SDL to disrupt the semantic information of protected images, enhancing the performance of defense against both tuning-based and editing-based methods.
4) We contribute a dataset called Defense-Edit for evaluating the defense performance against editing-based methods.

Based on both quantitative and qualitative results, our proposed method, Anti-Diffusion, achieves superior defense effects across several diffusion-based techniques, including tuning methods (such as DreamBooth/LoRA) and editing methods (such as MasaCtrl/DiffEdit).

* These authors contributed equally.
† Corresponding author.
Copyright © 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Preliminary

Stable Diffusion

Stable diffusion is a Latent Diffusion Model (LDM) that has been trained on large-scale data. The LDM is a generative model capable of synthesizing high-quality images from Gaussian noise. Unlike traditional diffusion models, the diffusion process in LDM occurs in the latent space. Consequently, in addition to a diffusion model, an autoencoder comprising an encoder $\mathcal{E}$ and a decoder $\mathcal{D}$ is required. For an image $x$ and an encoder $\mathcal{E}$, the diffusion process introduces noise to the encoded latent variable $z = \mathcal{E}(x)$, resulting in a noisy latent variable $z_t$, with the noise level escalating over timesteps $t \in T$. Subsequently, a UNet $\epsilon_\theta$ is trained to predict the noise added to the noisy latent variable $z_t$, given the text embedding instruction $f$. The loss function of latent diffusion is as follows:

$\mathcal{L}_{ldm} := \mathbb{E}_{z \sim \mathcal{E}(x),\, f,\, \epsilon \sim \mathcal{N}(0,1),\, t} \left[ \|\epsilon - \epsilon_\theta(z_t, t, f)\|_2^2 \right]$  (1)

Cross Attention Mechanism

The attention mechanism allows models to refer to another related sequence when processing one sequence. It is an important part of diffusion models, introducing conditional information into the denoising process and thereby guiding the generated image. Many editing methods, such as MasaCtrl and DiffEdit, also use attention mechanisms to edit images. Cross-attention in diffusion can be expressed as:

$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d}}\right) \cdot V$  (2)

where $Q = W_Q \cdot \varphi(z_t)$, $K = W_K \cdot f$, and $V = W_V \cdot f$. Here $\varphi(z_t)$ denotes an intermediate representation of the UNet implementing $\epsilon_\theta$, $d$ is used to normalize the input of the softmax layer, and each $W$ represents a learnable weight matrix.

Methods

In this work, we aim to protect images by adding adversarial noise. We first provide a detailed definition of this problem. Subsequently, we introduce the overall framework of Anti-Diffusion, which primarily encompasses three stages of iterative optimization. The first stage involves PT, the second stage focuses on the optimization of adversarial noise, resulting in adversarial samples, and the final stage involves updating the UNet with these adversarial samples.

Problem Definition

Recalling that our aim is to prevent the malicious use of diffusion-based image generation models on private images, we achieve this by adding adversarial noise to those images.
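The scaled dot-product cross-attention of Eq. (2) can be sketched numerically. The following NumPy toy (shapes and variable names are our own illustration, not the paper's code) computes softmax(QKᵀ/√d)·V for a handful of image-side queries against text-side keys and values:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d))  # (n_img_tokens, n_text_tokens)
    return attn @ v, attn

rng = np.random.default_rng(0)
n_img, n_txt, d = 4, 3, 8
q = rng.standard_normal((n_img, d))  # stands in for W_Q . phi(z_t)
k = rng.standard_normal((n_txt, d))  # stands in for W_K . f
v = rng.standard_normal((n_txt, d))  # stands in for W_V . f
out, attn = cross_attention(q, k, v)
print(out.shape, attn.shape)  # (4, 8) (4, 3)
```

Each row of `attn` sums to 1 and plays the role of the cross-attention map $M$ that the SDL later pushes toward a zero-filled target.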
This adversarial noise disrupts the functionality of the malicious models while minimizing the visual impact on the images. Let $x$ represent the image that requires protection. An adversarial noise $\delta$ is added, resulting in a protected image $\hat{x} = x + \delta$. The optimization of this adversarial noise $\delta$ can be described as a min-max optimization problem. The minimization simulates the actions of malicious users attempting to overcome the adversarial noise added to the protected images. The maximization aims to degrade the performance of the malicious model by adding adversarial noise under the constraint of maximal perturbation of the protected images. This min-max problem P.1 can be described as:

$\mathrm{P.1}: \min_\theta \max_\delta \; \mathcal{L}(\epsilon_\theta, \hat{x}, f) + \mathcal{C}(\epsilon_\theta, \hat{x}, f), \quad \text{s.t.}\ \|\delta\|_p \leq \eta$,  (3)

where $\eta$ controls the $L_p$-norm perturbation magnitude of the adversarial noise $\delta$. $\mathcal{L}$ is the loss function of the generation model trained on the modified images. $\mathcal{C}$ measures the feature dissimilarity between the images generated by the diffusion-based generation model $\epsilon_\theta$, the input image, and the target prompt. $f$ is the text embedding of the input prompt. We generate the adversarial noise by maximizing the objective function P.1. Then we optimize the model $\epsilon_\theta$ to minimize this function following the original training process of SD.

Figure 2: The overview framework of Anti-Diffusion under the jth epoch. Here $x_j$ represents the image to be protected. In stage (1), the text embedding $f_j$ undergoes fine-tuning with $\mathcal{L}_{LDM}$. Subsequently, in stage (2), adversarial noise is optimized and added to $x_j$ using PGD with our proposed loss functions $\mathcal{L}_{URL}$ and $\mathcal{L}_{SDL}$ to obtain the adversarial sample $\hat{x}_j$. In stage (3), the UNet is updated with $\mathcal{L}_{UNet}$ using the adversarial sample $\hat{x}_j$ and text embedding $\hat{f}_j$ to simulate the tuning process of malicious users. This process repeats cyclically, returning to stage (1) in the next epoch.

Overview Framework

To solve the min-max problem P.1, we apply alternating optimization over multiple epochs. In each epoch, we divide this optimization into three stages: (1) prompt tuning, (2) adversarial noise optimization, and (3) UNet update, as illustrated in Fig. 2. Specifically, stage 2 corresponds to the maximization of P.1, while stage 3 corresponds to the minimization of P.1. Given that an accurate text embedding $f$ is crucial for P.1, we include stage 1 to train the text embedding $f$ at the beginning of each epoch.

Fig. 2 illustrates the optimization path under the jth epoch. In the jth epoch, $x_j$ and $f_j$ are first input into the prompt tuning stage. At this point, the parameters of the image encoder and UNet are fixed. We only optimize $f_j$ to obtain a better $\hat{f}_j$ that corresponds to the semantic information of the input image. Subsequently, $x_j$ and the optimized $\hat{f}_j$ are incorporated into the adversarial noise optimization stage. In this stage, $x_j$ is continually optimized by utilizing the PGD algorithm with the loss functions $\mathcal{L}_{URL}$ and $\mathcal{L}_{SDL}$. The adversarial sample $\hat{x}_j$ and $\hat{f}_j$ are input into the next stage, the UNet update, to facilitate the update of the UNet parameters. After the jth epoch, the updated $\hat{x}_j$, $\hat{f}_j$, and $\hat{\theta}$ serve as $x_{j+1}$, $f_{j+1}$, and $\theta_{j+1}$. Note that in the first stage, the image $x_0$ is initialized with a clean image, and the text embedding $f_0$ is the embedding of an empty prompt. After N epochs, we obtain the final protected image $x_N$.

Prompt Tuning Strategy

Due to the inability to predict what prompts malicious users will utilize to train their models, it is challenging for Anti-DB to manually define a prompt that provides the best protection on different metrics. Therefore, we propose the PT strategy to address this issue. As shown in Fig. 2 (1), we iteratively optimize $f_j$ under each epoch to obtain a more accurate representation corresponding to $x_j$. Initially, the image $x_j$ undergoes processing through the image encoder before being combined with the noise map to generate the noisy latent $z_t$. This noisy latent is then fed into the UNet, where it interacts with $f_j$ via cross-attention. We optimize $f_j$ to obtain $\hat{f}_j$ by using the loss function $\mathcal{L}_{LDM}$ of the latent diffusion model. The parameters of the image encoder and UNet are fixed. By continuously optimizing the text embedding $f$ so that the model can predict the correct noise, the semantics of $\hat{f}$ are expected to gradually align with the feature content of the images.

Adversarial Noise Optimization

Following the maximization of the function P.1 in the Problem Definition, we employ the projected gradient descent (PGD) algorithm (Madry et al. 2018) to optimize the adversarial noise. The PGD algorithm is chosen for its convenience and efficiency. We introduce $\mathcal{L}_{URL}$ and $\mathcal{L}_{SDL}$ as the loss functions of PGD to interfere with the training process of SD and disturb the semantic information of protected images.

PGD Optimization  The PGD algorithm is used to optimize the adversarial noise added to images. With the two designed loss functions $\mathcal{L}_{URL}$ and $\mathcal{L}_{SDL}$, the cost function is as follows:

$\mathcal{C} = \mathcal{L}_{URL}(x, \hat{f}_j, \epsilon_\theta, \mathcal{E}) + \mathcal{L}_{SDL}(x, \hat{f}_j, M_{target}, \epsilon_\theta, \mathcal{E})$,  (4)

Using $p$ to represent the number of iterations of the current PGD, the gradient based on the cost function $\mathcal{C}$ for the current $x_p$ can be calculated as:

$g_p = \nabla_{x_p} \mathcal{C}(x_p, \hat{f}_j, M_{target}, \epsilon_\theta, \mathcal{E})$,  (5)

Therefore, the updated image $x_{p+1}$ with adversarial noise can be calculated as follows:

$x_{p+1} = \prod_{S}\big(x_p - |\alpha| \cdot \mathrm{sign}(g_p)\big)$,  (6)

where $\prod_S$ denotes the projection onto $S = \{x_p \,|\, D(x_p, x_{p+1}) \leq \epsilon\}$ and $\alpha$ represents the step size. After all the iterations of the PGD attack, the adversarial samples $\hat{x}_j$ are updated from the clean images $x$ or the adversarial samples $\hat{x}_{j-1}$ from the previous epoch.

UNet Reverse Loss  Diffusion models generate or edit images by predicting noise from $z_t$, or learn the distribution of targets by predicting the added noise $\epsilon$ from the sampled $z_t$. To interfere with the prediction of noise by the model $\epsilon_\theta$, the UNet Reverse Loss is designed as follows:

$\mathcal{L}_{URL} := \mathbb{E}_{z \sim \mathcal{E}(x),\, \hat{f}_j,\, \epsilon \sim \mathcal{N}(0,1),\, t} \left[ -\|\epsilon - \epsilon_\theta(t, z_t, \hat{f}_j)\|_2^2 \right]$  (7)

Semantic Disturbance Loss  As depicted in Fig. 3, the cross-attention map represents the similarity between the relevant areas of the image and a token. We design the SDL to interfere with the original semantic information of the protected image, rendering the editing method ineffective on the protected image. The $\mathcal{L}_{SDL}$ is designed as follows:

$\mathcal{L}_{SDL} := \mathbb{E}_{z \sim \mathcal{E}(x),\, \hat{f}_j,\, \epsilon \sim \mathcal{N}(0,1),\, t} \left[ \|M_{target} - M(\epsilon_\theta, t, z_t, \hat{f}_j)\|_2^2 \right]$,  (8)

where $M_{target}$ is the target attention map. In our experiments, we set it as a zero matrix with the same size as $M$.

Figure 3: Visualization results of how $\mathcal{L}_{SDL}$ works. The editing method is DiffEdit.

UNet Update

Following the minimization of the function P.1 in the Problem Definition, we optimize the UNet model to simulate the behavior of malicious users training their tuning-based methods. This optimization is conducted with $\mathcal{L}_{UNet}$ to further improve the defense performance of the proposed method against these tuning-based methods. Similar to the loss function $\mathcal{L}_{LDM}$, we optimize the UNet $\epsilon_\theta$ with the adversarial sample $\hat{x}_j$ and text embedding $\hat{f}_j$ based on the loss:

$\mathcal{L}_{UNet} := \mathbb{E}_{z \sim \mathcal{E}(\hat{x}_j),\, \hat{f}_j,\, \epsilon \sim \mathcal{N}(0,1),\, t} \left[ \|\epsilon - \epsilon_\theta(t, z_t, \hat{f}_j)\|_2^2 \right]$  (9)

Benchmark for Editing Methods

Existing research on defense against diffusion models primarily concentrates on personalized diffusion models like DreamBooth and LoRA, overlooking diffusion-based image editing methods such as MasaCtrl and DiffEdit. Diffusion-based editing methods, commonly used within the community, raise privacy protection concerns similar to personalized tuning models. Therefore, we have collected a dataset named Defense-Edit to additionally evaluate the defense performance against diffusion-based editing methods. The Defense-Edit dataset comprises a total of 50 pairs of images and prompts, including 30 pairs collected from CelebA-HQ, VGGFace2, and TEdBench (Kawar et al. 2023), and 20 pairs generated from SD. Additional details about Defense-Edit can be found in the supplementary materials.
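The three-stage loop of the Methods section can be caricatured with scalars. In this toy sketch (entirely our own simplification, not the released code), the quadratic $L(\theta, x, f) = (\theta x - f)^2$ stands in for the latent-diffusion loss: stage 1 fits the "text embedding" $f$, stage 2 runs signed-gradient PGD on the "image" $x$ inside an $L_\infty$ ball of radius $\eta$, and stage 3 lets a simulated attacker re-fit the "UNet" $\theta$:

```python
# Toy scalar stand-ins (our own simplification): theta is the "UNet",
# x the "image", f the "text embedding"; L(theta, x, f) = (theta*x - f)^2.
def loss(theta, x, f):       return (theta * x - f) ** 2
def grad_f(theta, x, f):     return -2 * (theta * x - f)
def grad_x(theta, x, f):     return 2 * (theta * x - f) * theta
def grad_theta(theta, x, f): return 2 * (theta * x - f) * x

def sign(v):
    return (v > 0) - (v < 0)

x0, eta, alpha = 1.0, 0.05, 0.01   # clean image, noise budget, PGD step
x, f, theta = x0, 0.0, 0.5

for epoch in range(20):
    # Stage 1: prompt tuning -- fit f to the current protected image.
    for _ in range(10):
        f -= 0.1 * grad_f(theta, x, f)
    # Stage 2: PGD -- signed-gradient step on x, projected to the eta-ball.
    for _ in range(10):
        x += alpha * sign(grad_x(theta, x, f))
        x = min(max(x, x0 - eta), x0 + eta)   # L_inf projection
    # Stage 3: UNet update -- the simulated attacker re-fits theta.
    theta -= 0.1 * grad_theta(theta, x, f)

print(abs(x - x0) <= eta + 1e-9)  # perturbation stays within budget: True
```

The projection in stage 2 mirrors Eq. (6): however many PGD steps run, the protected image never leaves the budgeted neighborhood of the clean image.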
Dataset    Method          PSNR↑  FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
VGGFace2   no defense      —      0.10   0.66  0.73      17.43     144.02  4.12
           MIST            34.35  0.03   0.60  0.85      26.46     204.35  4.51
           Photo Guard     34.40  0.01   0.62  0.67      27.58     181.53  4.32
           PID             34.62  0.42   0.51  0.53      32.57     301.53  4.75
           Anti-DB         34.55  0.60   0.24  0.31      37.41     436.34  5.05
           Anti-Diffusion  35.91  0.62   0.15  0.18      40.46     457.13  5.27
CelebA-HQ  no defense      —      0.07   0.63  0.73      17.00     147.82  4.72
           MIST            35.73  0.01   0.58  0.72      32.75     258.54  4.74
           Photo Guard     35.35  0.08   0.49  0.69      24.34     217.58  4.68
           PID             35.24  0.24   0.42  0.52      35.25     286.65  4.78
           Anti-DB         35.76  0.54   0.41  0.39      38.34     336.12  5.56
           Anti-Diffusion  36.76  0.58   0.26  0.38      40.93     352.83  5.96

Table 2: Comparing the defense performance of different methods on the DreamBooth model. The inference prompt adopted in DreamBooth is "a photo of sks person". The best-performing defense under each metric is marked in bold.

Method          PSNR↑  FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
no defense      —      0.06   0.54  0.74      17.15     201.00  4.12
Photo Guard     34.40  0.06   0.47  0.70      17.53     233.64  4.78
MIST            34.35  0.07   0.43  0.58      16.24     256.26  4.95
PID             34.62  0.15   0.46  0.61      20.62     295.15  5.42
Anti-DB         34.55  0.21   0.37  0.46      37.47     319.75  6.85
Anti-Diffusion  35.91  0.21   0.35  0.45      39.26     326.28  7.18

Table 3: Comparing the defense performance of different methods on the LoRA model on VGGFace2. The inference prompt adopted in LoRA is "a photo of sks person".
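The PSNR↑ column in the tables above follows the standard definition PSNR = 10·log10(MAX²/MSE). A minimal sketch (the 4-pixel grayscale "images" are our own toy data, not the paper's evaluation code):

```python
import math

def psnr(a, b, peak=255.0):
    """Peak Signal-to-Noise Ratio between two equally sized images."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

clean = [0, 64, 128, 255]
noisy = [1, 63, 129, 254]            # every pixel off by one -> MSE = 1
print(round(psnr(clean, noisy), 2))  # 48.13
```

Higher PSNR between the clean and protected image means the adversarial noise is less visible, which is why Anti-Diffusion's higher PSNR in Tables 2 and 3 indicates a gentler perturbation at the same defense strength.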

Experiment

Implementation Details

Datasets. To train the DreamBooth/LoRA models, we follow the dataset usage of Anti-DB. Specifically, we conduct experiments using the 100 unique identifiers (IDs) gathered from the VGGFace2 (Cao et al. 2018) and CelebA-HQ (Karras et al. 2017) datasets. For the MasaCtrl/DiffEdit methods, we run experiments on our own collected Defense-Edit dataset.

Evaluation Metrics. To measure the defense performance on the DreamBooth and LoRA models, following Anti-DB, we adopt four metrics: BRISQUE (Mittal, Moorthy, and Bovik 2012), SER-FQA (Terhorst et al. 2020), FDFR (Deng et al. 2020), and ISM (Deng et al. 2019). We further introduce two additional IQA metrics, Fréchet Inception Distance (FID) (Heusel et al. 2017) and Natural Image Quality Evaluator (NIQE) (Mittal, Soundararajan, and Bovik 2012). In addition, to measure the degradation of the visual quality of the original image caused by the addition of adversarial noise, we employ the Peak Signal-to-Noise Ratio (PSNR) metric (Korhonen and You 2012). The CLIP Score (Hessel et al. 2021) measures the degree of alignment between a specific image and its target textual description. In our evaluation of editing methods like MasaCtrl and DiffEdit, the CLIP Score is calculated between the edited images and the target prompts. BRISQUE is also used to measure the image quality of edited images. In our experiments, we aim for a lower CLIP Score and a higher BRISQUE score.

Comparison with State-of-the-art Methods

We compare Anti-Diffusion with state-of-the-art defense methods, namely Photo Guard (Salman et al. 2023), MIST (Liang et al. 2023), Anti-DB, and PID (Li et al. 2024). To ensure a fair comparison, following Anti-DB, we adopt a noise budget of η = 0.05 for all these methods. During the evaluation process, for each trained DreamBooth/LoRA model, we generate 16 images under 5 different seeds, totaling 80 images, to evaluate the corresponding results, thereby eliminating the variability associated with a single seed. For non-trainable diffusion-based image editing methods like MasaCtrl/DiffEdit, we evaluate the defense performance on our Defense-Edit dataset.

Comparison on DreamBooth/LoRA. The quantitative results for the DreamBooth model are shown in Tab. 2. It can be observed that the personalized effect of DreamBooth can be disrupted to some extent when noise is introduced to clean images using the various methods. Among these methods, the images protected by Anti-Diffusion are visually closest to the original images, as shown by the highest PSNR. In addition, Anti-Diffusion achieves the best defense performance against DreamBooth. It causes DreamBooth to generate more meaningless images (the highest FDFR and the lowest SER-FQA values) and disrupts DreamBooth's ability to learn the image's ID (the lowest ISM value). Additionally, DreamBooth, when trained with images protected by Anti-Diffusion, tends to generate images of the lowest quality (the highest BRISQUE, FID, and NIQE values). In summary, for face IDs on VGGFace2 and CelebA-HQ, Anti-Diffusion provides superior defense performance.

Figure 4: Qualitative defense results of different methods on the DreamBooth model. The specific prompt adopted in DreamBooth is "a photo of sks person". The instance is from VGGFace2.

The qualitative results in Fig. 4 further support this conclusion. While methods like Photo Guard, MIST, PID, and Anti-DB offer some level of protection by reducing the visual quality of the generated images, Anti-Diffusion significantly degrades the image quality generated by the disrupted DreamBooth model and also disturbs the identities. As shown in Tab. 3, we also present the quantitative defense results of different methods for LoRA. Anti-Diffusion achieves the best results on all metrics. This effectively demonstrates the good generalization ability of Anti-Diffusion against different tuning methods.

Method      PSNR↑  MasaCtrl       DiffEdit
                   BRI↑   CLI↓    BRI↑   CLI↓
no defense  —      22.18  27.44   16.55  27.65
Photo       35.57  20.40  27.41   18.76  26.55
MIST        34.87  21.11  27.38   21.77  26.45
PID         35.37  22.67  27.73   23.62  26.47
Anti-DB     33.44  25.72  27.42   24.61  26.69
Anti-DF     36.73  25.82  26.44   25.26  25.25

Table 4: Comparing the defense performance against MasaCtrl and DiffEdit on the Defense-Edit dataset. "Photo" and "Anti-DF" denote Photo Guard and Anti-Diffusion. "BRI" and "CLI" are BRISQUE and CLIP Score.

Comparison on MasaCtrl/DiffEdit. We also compare the defense performance of different methods on MasaCtrl and DiffEdit. The quantitative results are shown in Tab. 4, where Anti-Diffusion achieves the best performance on all three metrics. Specifically, Anti-Diffusion has the lowest CLIP Score, indicating that when images are protected by Anti-Diffusion, neither MasaCtrl nor DiffEdit can modify them according to the instructions. This is further validated by the qualitative results in Fig. 5. Specifically, for the image "dog", when no noise is added, MasaCtrl can successfully change it from a standing posture to a jumping posture. For the protected images obtained from Photo Guard, MIST, PID, and Anti-DB, MasaCtrl can still successfully edit them. Only the images protected by Anti-Diffusion can effectively prevent MasaCtrl from editing. The same phenomenon is observed with DiffEdit, where Anti-Diffusion can effectively prevent DiffEdit from changing the "apples" in the image to "oranges".

Ablation Studies

PT  LSDL  FDFR↑  ISM↓  BRISQUE↑  FID↑
          0.50   0.22  37.63     432.25
✓         0.52   0.19  40.34     441.43
    ✓     0.53   0.22  38.45     432.53
✓   ✓     0.62   0.15  40.46     457.13

Table 5: Comparing the defense performance on DreamBooth with or without PT and LSDL.

Target    FDFR↑  ISM↓  BRISQUE↑  FID↑
Zero      0.62   0.15  40.46     457.13
Noise     0.58   0.17  38.92     412.56
Diagonal  0.59   0.15  39.44     424.19

Table 6: Comparing the defense performance with different attention targets. Here, "Noise" means a random noise map as the target attention map, and "Diagonal" means a diagonal matrix whose diagonal values are set to one.

To validate the effectiveness of the PT and the SDL, we conduct comparative experiments based on DreamBooth. The details are presented in Tab. 5. The first experiment is a baseline with a fixed prompt (i.e., "a photo of a person"), which incorporates neither PT nor LSDL. In the second row, we replace the fixed prompt with PT. For the
third row, we add LSDL based on the first row. The fourth row is the final Anti-Diffusion equipped with both PT and LSDL. The quantitative results of these experiments reveal that PT and LSDL play complementary roles in enhancing the defense performance.

In the experiments, we used a zero map as the target attention map for LSDL. Since cross-attention represents semantic similarity, zero attention maps result in semantic dissimilarity between perturbed and original images. We also explored the use of random or diagonal matrices as targets. As Tab. 6 shows, they are not as effective as zero attention maps in defense performance.

Figure 5: Qualitative defense results of different defense methods on MasaCtrl and DiffEdit. The instance is from our proposed dataset Defense-Edit.

Unexpected Scenarios

In practical scenarios, the specific utilization of the SD models by malicious users is unpredictable. Therefore, in this section, we assess the defense capabilities of Anti-Diffusion in various unexpected scenarios. More results for unexpected scenarios can be found in the supplementary materials.

Unexpected Version. To evaluate the robustness of Anti-Diffusion across diverse versions of SD, we apply it to the VGGFace2 dataset using various versions of SD models, including v2.1 and v1.5. As shown in Tab. 7, Anti-Diffusion can provide sufficient protection even when the versions of the SD models do not match.

Def.  Test  FDFR↑  ISM↓  BRISQUE↑  FID↑
v2.1  v2.1  0.62   0.15  40.46     457.13
v2.1  v1.5  0.89   0.03  43.24     489.45
v1.5  v2.1  0.61   0.16  36.45     442.23
v1.5  v1.5  0.82   0.04  37.24     486.56
no    v2.1  0.10   0.66  17.43     144.02
no    v1.5  0.06   0.45  21.43     134.76

Table 7: Comparing the defense performance on different versions of SD. The terms "Def." and "Test" refer to the SD version used for defending with Anti-Diffusion and for training DreamBooth by malicious users.

Unexpected Prompts. For DreamBooth, different prompts can be used to generate various content. As illustrated in Tab. 8, we introduce three additional prompts, p1, p2, and p3, namely "a photo of sks person with sad face", "facial close up of sks person", and "a photo of sks person yawning in a speech", to evaluate the performance. We can see that Anti-Diffusion also provides defense against different prompts in various scenarios.

P   Def.  FDFR↑  ISM↓  BRISQUE↑  FID↑
p1  yes   0.53   0.18  39.40     457.27
p1  no    0.09   0.56  16.34     169.35
p2  yes   0.81   0.08  27.22     346.21
p2  no    0.05   0.42  15.67     145.76
p3  yes   0.63   0.05  37.81     440.53
p3  no    0.02   0.31  18.35     189.21

Table 8: Comparing the defense performance on different prompts. "P" and "Def." refer to prompt and defense.

Conclusion

In conclusion, this paper presents Anti-Diffusion, a defense system designed to protect images from the abuse of both tuning-based and editing-based methods. During the generation of the protected images, we incorporate the PT strategy to enhance defense performance, eliminating the need for manually defined prompts. Additionally, we introduce the SDL to disrupt the semantic information of the protected images, enhancing the performance of defense against both tuning-based and editing-based methods. We also introduce the Defense-Edit dataset to evaluate the defense performance of current defense methods against diffusion-based editing methods. A broad range of experiments shows that Anti-Diffusion excels in defense performance when dealing with various diffusion-based techniques in different scenarios.
Acknowledgments

This work was supported in part by the Macau Science and Technology Development Fund under SKLIOTSC-2021-2023, 0022/2022/A1, and 0014/2022/AFJ; in part by the Research Committee at the University of Macau under MYRG-GRG2023-00058-FST-UMDF and MYRG2022-00152-FST; and in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2024A1515012536.

References

Cao, M.; Wang, X.; Qi, Z.; Shan, Y.; Qie, X.; and Zheng, Y. 2023. MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 22560–22570.
Cao, Q.; Shen, L.; Xie, W.; Parkhi, O. M.; and Zisserman, A. 2018. VGGFace2: A dataset for recognising faces across pose and age. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 67–74. IEEE.
Chen, J.; Wu, Y.; Luo, S.; Xie, E.; Paul, S.; Luo, P.; Zhao, H.; and Li, Z. 2024. PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models. arXiv:2401.05252.
Chen, J.; Yu, J.; Ge, C.; Yao, L.; Xie, E.; Wu, Y.; Wang, Z.; Kwok, J.; Luo, P.; Lu, H.; and Li, Z. 2023. PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis. arXiv:2310.00426.
Couairon, G.; Verbeek, J.; Schwenk, H.; and Cord, M. 2023. DiffEdit: Diffusion-based semantic image editing with mask guidance. In The Eleventh International Conference on Learning Representations.
Deng, J.; Guo, J.; Ververas, E.; Kotsia, I.; and Zafeiriou, S. 2020. RetinaFace: Single-shot multi-level face localisation in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5203–5212.
Deng, J.; Guo, J.; Xue, N.; and Zafeiriou, S. 2019. ArcFace: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4690–4699.
Ding, M.; Yang, Z.; Hong, W.; Zheng, W.; Zhou, C.; Yin,
Hessel, J.; Holtzman, A.; Forbes, M.; Le Bras, R.; and Choi, Y. 2021. CLIPScore: A Reference-free Evaluation Metric for Image Captioning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 7514–7528. Association for Computational Linguistics.
Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; and Hochreiter, S. 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems, 30.
Ho, J.; Jain, A.; and Abbeel, P. 2020. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33: 6840–6851.
Hu, E. J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; and Chen, W. 2021. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685.
Karras, T.; Aila, T.; Laine, S.; and Lehtinen, J. 2017. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196.
Kawar, B.; Zada, S.; Lang, O.; Tov, O.; Chang, H.; Dekel, T.; Mosseri, I.; and Irani, M. 2023. Imagic: Text-based real image editing with diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6007–6017.
Korhonen, J.; and You, J. 2012. Peak signal-to-noise ratio revisited: Is simple beautiful? In 2012 Fourth International Workshop on Quality of Multimedia Experience, 37–38. IEEE.
Li, A.; Mo, Y.; Li, M.; and Wang, Y. 2024. PID: Prompt-Independent Data Protection Against Latent Diffusion Models. arXiv preprint arXiv:2406.15305.
Li, Y.; Liu, H.; Wu, Q.; Mu, F.; Yang, J.; Gao, J.; Li, C.; and Lee, Y. J. 2023. GLIGEN: Open-set grounded text-to-image generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 22511–22521.
Liang, C.; Wu, X.; Hua, Y.; Zhang, J.; Xue, Y.; Song, T.; Xue, Z.; Ma, R.; and Guan, H. 2023. Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, 20763–20786. PMLR.
D.; Lin, J.; Zou, X.; Shao, Z.; Yang, H.; et al. 2021. Liu, K.; Perov, I.; Gao, D.; Chervoniy, N.; Zhou, W.; and
Cogview: Mastering text-to-image generation via transform- Zhang, W. 2023. Deepfacelab: Integrated, flexible and ex-
ers. Advances in Neural Information Processing Systems, tensible face-swapping framework. Pattern Recognition,
34: 19822–19835. 141: 109628.
Gafni, O.; Polyak, A.; Ashual, O.; Sheynin, S.; Parikh, D.; Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; and
and Taigman, Y. 2022. Make-a-scene: Scene-based text-to- Vladu, A. 2018. Towards Deep Learning Models Resis-
image generation with human priors. In European Confer- tant to Adversarial Attacks. In International Conference on
ence on Computer Vision, 89–106. Springer. Learning Representations.
Gal, R.; Alaluf, Y.; Atzmon, Y.; Patashnik, O.; Bermano, Mittal, A.; Moorthy, A. K.; and Bovik, A. C. 2012. No-
A. H.; Chechik, G.; and Cohen-Or, D. 2022. An image is reference image quality assessment in the spatial domain.
worth one word: Personalizing text-to-image generation us- IEEE Transactions on image processing, 21(12): 4695–
ing textual inversion. arXiv preprint arXiv:2208.01618. 4708.
Goodfellow, I. J.; Shlens, J.; and Szegedy, C. 2014. Explain- Mittal, A.; Soundararajan, R.; and Bovik, A. C. 2012. Mak-
ing and harnessing adversarial examples. arXiv preprint ing a “completely blind” image quality analyzer. IEEE Sig-
arXiv:1412.6572. nal processing letters, 20(3): 209–212.
Mou, C.; Wang, X.; Xie, L.; Wu, Y.; Zhang, J.; Qi, Z.; Shan,
Y.; and Qie, X. 2023. T2i-adapter: Learning adapters to
dig out more controllable ability for text-to-image diffusion
models. arXiv preprint arXiv:2302.08453.
Ramesh, A.; Pavlov, M.; Goh, G.; Gray, S.; Voss, C.; Rad-
ford, A.; Chen, M.; and Sutskever, I. 2021. Zero-shot text-to-
image generation. In International Conference on Machine
Learning, 8821–8831. PMLR.
Rana, M. S.; Nobi, M. N.; Murali, B.; and Sung, A. H. 2022.
Deepfake detection: A systematic literature review. IEEE
access, 10: 25494–25513.
Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; and Om-
mer, B. 2022. High-resolution image synthesis with latent
diffusion models. In Proceedings of the IEEE/CVF confer-
ence on computer vision and pattern recognition, 10684–
10695.
Ruiz, N.; Li, Y.; Jampani, V.; Pritch, Y.; Rubinstein, M.; and
Aberman, K. 2023. Dreambooth: Fine tuning text-to-image
diffusion models for subject-driven generation. In Proceed-
ings of the IEEE/CVF Conference on Computer Vision and
Pattern Recognition, 22500–22510.
Salman, H.; Khaddaj, A.; Leclerc, G.; Ilyas, A.; and Madry,
A. 2023. Raising the Cost of Malicious AI-Powered Image
Editing. arXiv preprint arXiv:2302.06588.
Song, J.; Meng, C.; and Ermon, S. 2020. Denoising diffusion
implicit models. arXiv preprint arXiv:2010.02502.
Terhorst, P.; Kolf, J. N.; Damer, N.; Kirchbuchner, F.; and
Kuijper, A. 2020. SER-FIQ: Unsupervised estimation of
face image quality based on stochastic embedding robust-
ness. In Proceedings of the IEEE/CVF conference on com-
puter vision and pattern recognition, 5651–5660.
Truong, V. T.; Dang, L. B.; and Le, L. B. 2024. Attacks and
Defenses for Generative Diffusion Models: A Comprehen-
sive Survey. arXiv preprint arXiv:2408.03400.
Van Le, T.; Phung, H.; Nguyen, T. H.; Dao, Q.; Tran, N. N.;
and Tran, A. 2023. Anti-DreamBooth: Protecting users from
personalized text-to-image synthesis. In Proceedings of the
IEEE/CVF International Conference on Computer Vision,
2116–2127.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones,
L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017. At-
tention is all you need. Advances in neural information pro-
cessing systems, 30.
Wang, T.; Zhang, Y.; Qi, S.; Zhao, R.; Xia, Z.; and Weng,
J. 2023. Security and privacy on generative data in aigc: A
survey. arXiv preprint arXiv:2309.09435.
Yang, L.; Zhang, Z.; Song, Y.; Hong, S.; Xu, R.; Zhao, Y.;
Zhang, W.; Cui, B.; and Yang, M.-H. 2023. Diffusion mod-
els: A comprehensive survey of methods and applications.
ACM Computing Surveys, 56(4): 1–39.
Zhang, L.; Rao, A.; and Agrawala, M. 2023. Adding condi-
tional control to text-to-image diffusion models. In Proceed-
ings of the IEEE/CVF International Conference on Com-
puter Vision, 3836–3847.
Supplementary Materials for Anti-Diffusion: Preventing Abuse of Modifications of Diffusion-Based Models

Copyright © 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

In this supplementary material for Anti-Diffusion, we first supplement related works and the detailed training configurations of Anti-Diffusion, DreamBooth, and LoRA. Then we provide a comprehensive introduction to our constructed dataset, Defense-Edit. Subsequently, we conduct a series of experiments to present both quantitative and qualitative analyses. These experiments encompass the performance evaluation on LoRA (Hu et al. 2021), the exploration of various Stable Diffusion (SD) (Rombach et al. 2022) versions in Anti-Diffusion, and the examination of different terms, prompts, and SD versions for DreamBooth (Ruiz et al. 2023). Finally, we offer additional qualitative analyses to illustrate the efficacy of our method across diverse editing tasks.

Related Works

Diffusion Models
Text-to-image generation models (Li et al. 2023; Ramesh et al. 2021; Gafni et al. 2022; Ding et al. 2021) aim to synthesize realistic images from natural language descriptions. Recently, diffusion models have emerged as a promising alternative to GANs (Dhariwal and Nichol 2021; Goodfellow et al. 2014), generating high-quality images by reversing a stochastic diffusion process. ControlNet (Zhang, Rao, and Agrawala 2023) controls diffusion models by adding extra conditions such as pose, edge detection, and depth maps; it enhances the controllability of the SD model by creating "frozen" and "trainable" copies of its weights. Similar to ControlNet, T2I-Adapter (Mou et al. 2023) aligns internal knowledge with external control signals by learning lightweight adapters, thereby enhancing the controllability of the model without affecting the original network topology or generation capability, and it runs efficiently. GLIGEN (Li et al. 2023) models the relationship between text descriptions, spatial locations, and images by introducing additional trainable self-attention layers (Vaswani et al. 2017) on pretrained SD models, which allows it to accurately control image generation based on given text descriptions and positional information. Meanwhile, a number of personalized customization methods based on SD have emerged, which can make the generated images exhibit certain characteristics from only a small number of samples. Textual Inversion achieves personalized generation by learning the representation of new "words" in the embedding space of the text encoder; it only needs 3-5 related images provided by the user to learn these concepts and guide the generation. DreamBooth can fine-tune the SD model with a small number of images to bind the desired concept to a rarely used word. Similarly, with only a small number of samples required, LoRA accelerates the training process and significantly reduces the storage cost of model files by adding small low-rank updates to the weight layers of the original base model.

In addition to generating images, editing methods powered by SD can also quickly edit input images through instructions, changing, for example, the actions of characters or the number of items in the image. MasaCtrl is a tuning-free method that achieves consistent image generation and editing simultaneously: it converts self-attention in the diffusion model into mutual attention to query relevant local content and textures in the source image, achieving consistency between the edited image and the original image. DiffEdit can replace the target in the image according to different text descriptions: it automatically generates a mask that highlights the region of the input image that needs to be edited, and then uses this mask together with a text-guided diffusion model to achieve semantic image editing.

Among these works, DreamBooth and LoRA are widely used due to their excellent generation ability and few-shot learning ability. Moreover, image editing methods cannot be ignored, since MasaCtrl can edit images immediately without additional tuning. Therefore, we focus on protecting against the malicious use of these methods.

Defense Against SD
While SD brings efficient personalized generation and editing capabilities, defenses against SD are also being studied to protect images from unauthorized learning. PhotoGuard (Salman et al. 2023) generates adversarial noise targeting the encoder and the denoising process in SD. Similar to PhotoGuard, MIST (Liang et al. 2023) interferes with both processes; in addition, MIST constructs a specific watermark that not only disturbs the image generation process of SD but also stamps the watermark onto the generated image. Yet, for DreamBooth, whose training process constantly updates the parameters of SD, these two methods are prone to failure. On the basis of interfering with the denoising process of SD, Anti-DreamBooth (Van Le et al. 2023) simulates the training of DreamBooth on a constructed dataset, dynamically updating the parameters of SD; through this alternation of defense and training, it achieves better defense performance against DreamBooth. PID (Li et al. 2024) generates adversarial noise to disturb the VAE encoder of SD, which has the advantage of ignoring the impact of prompts and providing broader protection than Anti-DreamBooth. However, PID relies solely on the VAE and does not participate in the diffusion process, so its defense performance is strongly affected by the VAE version. Given the unpredictability of both prompts and model versions in practical applications, a more comprehensive defense strategy is necessary. At the same time, these methods mainly focus on tuning-based models and overlook diffusion-based editing methods, emphasizing the necessity of a defense that covers both threats. In summary, existing methods can protect images from unauthorized learning and editing, but they still have shortcomings. PhotoGuard and MIST generate adversarial examples against fixed SD parameters, which results in poor protection against methods such as DreamBooth that require tuning, while Anti-DreamBooth requires users to construct additional datasets, which is impractical for real applications. Moreover, Anti-DreamBooth only interferes with the denoising process of SD under a manually selected prompt, without disturbing the semantic features of the adversarial examples, which limits both its privacy protection ability and its resistance to editing methods.

Details of the Dataset Defense-Edit
To evaluate the performance of defense methods on SD-based editing, we collect a dataset named Defense-Edit with 50 pairs of images and prompts. In order to simulate different scenarios, we collect 30 images from real photos and 20 images generated using SD. For real photos, we select them from CelebA-HQ (Karras et al. 2017), VGGFace2 (Cao et al. 2018), and TEdBench (Kawar et al. 2023) to cover categories such as faces, animals (e.g., dogs and horses), and objects (e.g., apples and boxes). For the generated images, we adopt SD v2.1 to generate them with our constructed prompts. In the process of creating instructions, to cover a wide range of conceivable scenarios, we incorporate transformations that include changes in facial expressions, item substitutions, additions, deletions, and pose adjustments. To avoid cases where the editing effect itself is not ideal, we select data that can be edited well by MasaCtrl (Cao et al. 2023) or DiffEdit (Couairon et al. 2023). Some representative images with their instructions from our Defense-Edit dataset, together with their edited versions using DiffEdit and MasaCtrl, can be found in Fig. 1.
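The low-rank weight update that LoRA applies to a frozen base layer, as described in Related Works above, can be sketched in a few lines. This is a minimal illustration rather than the actual Anti-Diffusion or diffusers implementation; the class name and initialization scale are our own choices, following the zero-initialized B and (alpha / r) scaling convention of Hu et al. (2021):

```python
import numpy as np

class LoRALinear:
    """Sketch of a LoRA-adapted linear layer (Hu et al. 2021).

    The pretrained weight W stays frozen; only the low-rank factors
    A and B are trained, adding (alpha / r) * B @ A to W. Because B
    is initialized to zero, the adapted layer starts out identical
    to the base layer.
    """

    def __init__(self, W, r=4, alpha=4, seed=0):
        d_out, d_in = W.shape
        rng = np.random.default_rng(seed)
        self.W = W                                      # frozen base weight
        self.A = 0.01 * rng.standard_normal((r, d_in))  # trainable down-projection
        self.B = np.zeros((d_out, r))                   # trainable up-projection
        self.scale = alpha / r

    def __call__(self, x):
        # Effective weight = W + scale * B @ A
        return x @ (self.W + self.scale * self.B @ self.A).T
```

Only r * (d_in + d_out) parameters are tuned instead of d_in * d_out, which is why LoRA checkpoints are so small compared to a full fine-tune.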

Training Configurations
During the Anti-Diffusion training process, it takes a total of 5 epochs to obtain the final perturbed images. Each epoch consists of 30 iterations for Prompt Tuning, 10 iterations of PGD for adversarial noise optimization, and 20 iterations for the UNet update. The adversarial noise δ(i) is optimized using the Projected Gradient Descent (PGD) scheme, with a step size of α = 0.002 and a default noise budget of η = 0.05. The whole training process, when executed on an NVIDIA A6000 GPU, takes about 3 minutes to complete. During the optimization of the Prompt Tuning strategy, the dimension of the optimized token embedding is 77 × 1024. For the Semantic Disturbance Loss, we only extract cross-attention maps at a scale of 16 × 16 to save memory.

Figure 1: Examples of Defense-Edit with original images, instructions, and editing effects.
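The PGD step used above for the adversarial noise δ can be sketched as follows. This is a simplified stand-in: in the real system the gradient comes from backpropagating the defense loss through the SD UNet, which we abstract here into a caller-supplied `grad_fn` (a hypothetical surrogate); only the signed L∞ step and the projections onto the noise budget (α = 0.002, η = 0.05, 10 iterations) are shown:

```python
import numpy as np

def pgd_perturb(x, grad_fn, alpha=0.002, eta=0.05, steps=10):
    """L-infinity PGD (Madry et al. 2018) for the adversarial noise delta.

    grad_fn(x_adv) must return the gradient of the defense loss with
    respect to the perturbed image x_adv; the diffusion-model backward
    pass is omitted here.
    """
    delta = np.zeros_like(x, dtype=np.float64)
    for _ in range(steps):
        g = grad_fn(x + delta)
        delta = delta + alpha * np.sign(g)        # signed ascent step
        delta = np.clip(delta, -eta, eta)         # project onto the L-inf budget
        delta = np.clip(x + delta, 0.0, 1.0) - x  # keep pixels in [0, 1]
    return x + delta
```

With α = 0.002 and 10 steps, the perturbation can move at most 0.02 per pixel here, well inside the η = 0.05 budget.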
DreamBooth is trained with a batch size of 2 and a learning rate of 5 × 10−7 across 1,000 training steps. The most recent version of SD (v2.1) is employed as the pretrained generator by default. Unless otherwise indicated, "a photo of sks person" and "a photo of person" are set as the training instance prompt and prior prompt, respectively. The training of LoRA takes 400 epochs with a learning rate of 1e−5; the settings for the training instance prompt and prior prompt are the same as for DreamBooth.

Additional Quantitative Results
In the main paper, we analyze the performance of Anti-Diffusion utilizing the most recent SD version, v2.1. The term "sks" is employed during the training phase of DreamBooth, and the prompt "a photo of sks person" is used during the inference stage. In this section, we provide more defense results on LoRA. Then we explore various terms, prompts, and versions of SD on DreamBooth for evaluating the performance of Anti-Diffusion.
Defense Performance on LoRA
In addition to applying LoRA on VGGFace2, we further investigate the effectiveness of different defense methods by applying LoRA on the CelebA-HQ dataset. Other baselines, including Anti-DreamBooth, PID, PhotoGuard, and MIST, are also adopted for comparison. As demonstrated in Tab. 1, Anti-Diffusion makes LoRA generate more meaningless images (the highest FDFR (Deng et al. 2020) value) and disrupts LoRA's ability to learn the image's ID (the lowest ISM (Deng et al. 2019) and SER-FQA (Terhorst et al. 2020) values). In addition, LoRA, when trained with images perturbed by Anti-Diffusion, tends to generate images of the lowest quality (the highest BRISQUE (Mittal, Moorthy, and Bovik 2012), FID (Heusel et al. 2017), and NIQE (Mittal, Soundararajan, and Bovik 2012) values). This suggests that Anti-Diffusion generalizes well to another diffusion-based method.

Performance across Different Terms for DreamBooth
In the process of training DreamBooth, a term (e.g., "sks") is required to encapsulate the semantic information of the target object, and the choice of term may influence the generative capacity of the model. To ensure a comprehensive analysis of robustness to different terms and validate the efficacy of our method, we experiment with a broader range of terms. Specifically, we evaluate the effectiveness of our method on DreamBooth using the terms "sks", "t@t", and "dbz" on both the CelebA-HQ and VGGFace2 datasets. Anti-DreamBooth, which performs best among the baselines, is selected for the comparison. The quantitative results are shown in Tab. 2 and Tab. 3. With the defense of Anti-Diffusion, DreamBooth generates more nonsensical images, as indicated by the highest FDFR value on both CelebA-HQ and VGGFace2. It also limits DreamBooth's capacity to learn the image's ID, as evidenced by the lowest ISM value. Furthermore, DreamBooth trained with images protected by Anti-Diffusion tends to generate lower quality images, as shown by the highest BRISQUE, FID, and NIQE values. Overall, Anti-Diffusion demonstrates superior defense performance against DreamBooth under various terms, on both the CelebA-HQ and VGGFace2 datasets.

Performance across Different Prompts for DreamBooth
For DreamBooth, different prompts can be used to generate different content. As illustrated in Tab. 4, we introduce two additional prompts, "a photo of sks person with sad face" and "a photo of sks person with angry face", to assess the performance. When compared to Anti-DreamBooth, Anti-Diffusion achieves superior defense performance in these varied prompt scenarios.

Performance across Different SD Versions
In real-world situations, it is impractical to know the specific version of SD that will be employed for training DreamBooth. To evaluate the robustness of our defense method across diverse versions of SD, we apply Anti-Diffusion on the VGGFace2 dataset with various versions of SD models, including v2.1, v1.5, and v1.4. The defense effects of images protected with different versions of SD are listed in Tab. 5, Tab. 6, and Tab. 7, respectively. Compared to images with no defense, Anti-Diffusion proves effective across various SD versions, and it remains effective when the SD version used for defense differs from that used by DreamBooth. Notably, our method outperforms Anti-DreamBooth on all the metrics, especially on ISM, suggesting that it is more difficult for DreamBooth to imitate the facial features of images protected by Anti-Diffusion.
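The identity-related metrics used throughout these comparisons reduce to simple computations once a face detector and an embedding network have been run. A minimal sketch, assuming ArcFace-style embeddings and RetinaFace-style detector counts are already available (the function names are ours, not from the paper's code):

```python
import numpy as np

def identity_score(e_ref, e_gen):
    """ISM-style score: cosine similarity between two face embeddings
    (e.g. from ArcFace). Lower means the generated face resembles the
    protected identity less."""
    e_ref = np.asarray(e_ref, dtype=np.float64)
    e_gen = np.asarray(e_gen, dtype=np.float64)
    return float(e_ref @ e_gen /
                 (np.linalg.norm(e_ref) * np.linalg.norm(e_gen)))

def fdfr(num_without_face, num_generated):
    """Face Detection Failure Rate: fraction of generated images in which
    a detector (e.g. RetinaFace) finds no face; higher means a stronger
    defense."""
    return num_without_face / num_generated
```

Averaging `identity_score` over all generated/reference pairs and `fdfr` over the generated set yields per-method numbers comparable to the ISM and FDFR columns in the tables.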
Method           PSNR↑  FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
no defense       —      0.02   0.67  0.74      17.14     142.67  4.22
PhotoGuard       34.24  0.04   0.62  0.65      17.71     143.58  5.48
MIST             35.31  0.03   0.53  0.54      29.96     239.86  4.53
PID              34.72  0.18   0.51  0.55      36.42     243.58  6.29
Anti-DreamBooth  35.76  0.10   0.47  0.49      38.67     257.73  6.68
Anti-Diffusion   36.76  0.14   0.43  0.43      42.27     275.59  7.11

Table 1: Defense performance comparison when applying LoRA on CelebA-HQ.
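The PSNR column measures how visible the protective perturbation itself is: it compares each protected image against its clean original, which is why the "no defense" row has no value. A sketch of the standard definition (Korhonen and You 2012), assuming 8-bit images:

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means the protective
    perturbation is less visible. `peak` is the maximum pixel value."""
    ref = np.asarray(ref, dtype=np.float64)
    img = np.asarray(img, dtype=np.float64)
    mse = np.mean((ref - img) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

PSNR values in the mid-30s, as in Tab. 1, correspond to perturbations that are essentially imperceptible to the eye.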

Terms  Method           FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
t@t    Anti-Diffusion   0.46   0.23  0.31      40.66     388.58  5.55
t@t    Anti-DreamBooth  0.40   0.26  0.49      36.96     358.93  5.37
t@t    no defense       0.10   0.68  0.72      15.37     139.33  4.24
dbz    Anti-Diffusion   0.62   0.14  0.20      40.35     430.70  5.30
dbz    Anti-DreamBooth  0.57   0.23  0.34      37.32     413.54  4.98
dbz    no defense       0.11   0.69  0.74      16.47     142.57  4.46
sks    Anti-Diffusion   0.62   0.15  0.18      40.46     457.13  5.27
sks    Anti-DreamBooth  0.60   0.24  0.31      37.41     436.34  5.05
sks    no defense       0.10   0.66  0.73      17.43     144.02  4.12

Table 2: Defense performance comparison across different terms for DreamBooth on VGGFace2.

Terms  Method           FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
t@t    Anti-Diffusion   0.22   0.43  0.57      39.65     280.38  5.98
t@t    Anti-DreamBooth  0.17   0.56  0.64      37.64     263.76  5.51
t@t    no defense       0.04   0.67  0.73      19.29     155.84  4.53
dbz    Anti-Diffusion   0.29   0.25  0.50      41.25     332.59  5.99
dbz    Anti-DreamBooth  0.21   0.44  0.54      38.08     316.04  5.49
dbz    no defense       0.13   0.70  0.61      21.22     139.56  4.52
sks    Anti-Diffusion   0.58   0.26  0.38      40.93     352.83  5.96
sks    Anti-DreamBooth  0.54   0.41  0.29      38.34     336.12  5.56
sks    no defense       0.07   0.63  0.73      17.00     147.82  4.72

Table 3: Defense performance comparison across different terms for DreamBooth on CelebA-HQ.
Method           PSNR↑  FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
"a photo of sks person with sad face"
Anti-Diffusion   35.91  0.53   0.18  0.43      39.40     457.27  5.13
Anti-DreamBooth  34.55  0.47   0.23  0.45      36.32     425.75  5.01
no defense       —      0.09   0.56  0.71      16.34     169.35  4.11
"a photo of sks person with angry face"
Anti-Diffusion   35.91  0.55   0.16  0.41      39.51     432.54  5.16
Anti-DreamBooth  34.55  0.49   0.17  0.50      37.42     412.93  5.12
no defense       —      0.11   0.59  0.75      15.36     173.69  4.08

Table 4: Defense performance comparison across different prompts for DreamBooth.

SD    Method           PSNR↑  FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
v2.1  Anti-Diffusion   35.91  0.62   0.15  0.18      40.46     457.13  5.27
v2.1  Anti-DreamBooth  34.55  0.60   0.24  0.31      37.41     436.34  5.05
v1.5  Anti-Diffusion   36.35  0.61   0.16  0.21      36.45     442.23  5.01
v1.5  Anti-DreamBooth  34.59  0.57   0.28  0.34      35.19     425.27  4.98
v1.4  Anti-Diffusion   35.90  0.60   0.16  0.21      35.68     431.64  5.23
v1.4  Anti-DreamBooth  34.57  0.54   0.29  0.33      35.24     412.76  5.01
—     no defense       —      0.10   0.66  0.73      17.43     144.02  4.12

Table 5: Defense performance comparison on SD v2.1 based DreamBooth.


SD    Method           PSNR↑  FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
v2.1  Anti-Diffusion   35.91  0.89   0.03  0.11      43.24     489.45  5.95
v2.1  Anti-DreamBooth  34.55  0.71   0.13  0.22      30.53     432.86  5.34
v1.5  Anti-Diffusion   36.35  0.82   0.04  0.07      37.24     486.56  5.89
v1.5  Anti-DreamBooth  34.59  0.67   0.18  0.24      28.72     401.39  5.11
v1.4  Anti-Diffusion   35.90  0.82   0.05  0.09      36.76     473.36  5.73
v1.4  Anti-DreamBooth  34.57  0.62   0.20  0.22      29.56     392.64  5.01
—     no defense       —      0.06   0.45  0.67      21.43     134.76  4.67

Table 6: Defense performance comparison on SD v1.5 based DreamBooth.

SD    Method           PSNR↑  FDFR↑  ISM↓  SER-FQA↓  BRISQUE↑  FID↑    NIQE↑
v2.1  Anti-Diffusion   35.91  0.94   0.02  0.09      41.54     496.74  6.13
v2.1  Anti-DreamBooth  34.55  0.81   0.07  0.12      26.42     456.78  5.54
v1.5  Anti-Diffusion   36.35  0.83   0.04  0.08      35.62     454.41  5.75
v1.5  Anti-DreamBooth  34.59  0.73   0.16  0.24      30.35     435.24  5.38
v1.4  Anti-Diffusion   35.90  0.81   0.05  0.07      37.45     489.82  5.91
v1.4  Anti-DreamBooth  34.57  0.77   0.17  0.21      28.45     442.54  5.23
—     no defense       —      0.06   0.49  0.65      19.67     112.84  4.31

Table 7: Defense performance comparison on SD v1.4 based DreamBooth.


Additional Qualitative Results

Defense Performance on DreamBooth
As depicted in Fig. 2, we demonstrate the effectiveness of various defense methods when employing DreamBooth on the CelebA-HQ dataset. Under different character identities, with the defense of Anti-Diffusion, the images generated by DreamBooth contain more noise and their facial features are more difficult to recognize. This indicates that Anti-Diffusion achieves superior defense performance against DreamBooth.

Defense Performance on LoRA
As depicted in Fig. 3, we demonstrate the effectiveness of various defense methods when employing LoRA on the VGGFace2 and CelebA-HQ datasets. Under different datasets and character identities, with the defense of Anti-Diffusion, the images generated by LoRA contain more noise and their facial features are more difficult to recognize. This indicates that Anti-Diffusion achieves superior defense performance against LoRA.

Performance across Different Terms for DreamBooth
We provide more visual results of the performance across different terms for DreamBooth. As depicted in Fig. 4, the top two rows display the results on the CelebA-HQ dataset, while the bottom two rows present the results on the VGGFace2 dataset. It can be seen that the defense of Anti-Diffusion is effective across different terms. Compared to Anti-DreamBooth, the defense of Anti-Diffusion forces DreamBooth to generate images with more noise and less recognizable faces.

Performance across Different Prompts for DreamBooth
Here we show the visual results of our defense method across different prompts for DreamBooth. As illustrated in Fig. 5, with the defense of Anti-Diffusion, DreamBooth generates images with increased perturbations. This indicates that our method can defend effectively across a variety of prompts and outperforms Anti-DreamBooth.

Performance across Different SD Versions
We present more visual results across different SD versions. As illustrated in Fig. 6, images generated by DreamBooth under the defense of Anti-Diffusion have more notable artifacts. This shows the robustness of our defense method across different versions of the SD model.

Performance on Editing Methods
Here we show the defense performance of Anti-Diffusion against MasaCtrl and DiffEdit on the Defense-Edit dataset. As illustrated in Fig. 7, MasaCtrl and DiffEdit can follow the instructions to edit images that have no defense or are defended by Anti-DreamBooth, whereas with the defense of Anti-Diffusion these editing methods fail. For example, in the third row of Fig. 7, with no defense or with Anti-DreamBooth, the tennis ball is successfully transformed into a green tomato; in contrast, with Anti-Diffusion, the tennis ball remains unchanged. This means that Anti-Diffusion has better defense performance against these editing methods.
Figure 2: Defense performance comparison when applying DreamBooth on dataset VGGFace2.

Figure 3: Defense performance comparison when applying LoRA on dataset VGGFace2 (the first two rows) and dataset CelebA-HQ (the last two rows).
Figure 4: Defense performance comparison across different terms for DreamBooth.

Figure 5: Defense performance comparison across different prompts for DreamBooth.


Figure 6: Defense performance comparison across different SD versions for DreamBooth.
Figure 7: Qualitative results across different editing methods.
References

Cao, M.; Wang, X.; Qi, Z.; Shan, Y.; Qie, X.; and Zheng, Y. 2023. MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 22560–22570.
Cao, Q.; Shen, L.; Xie, W.; Parkhi, O. M.; and Zisserman, A. 2018. VGGFace2: A dataset for recognising faces across pose and age. In 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 67–74. IEEE.
Couairon, G.; Verbeek, J.; Schwenk, H.; and Cord, M. 2023. DiffEdit: Diffusion-based semantic image editing with mask guidance. In The Eleventh International Conference on Learning Representations.
Deng, J.; Guo, J.; Ververas, E.; Kotsia, I.; and Zafeiriou, S. 2020. RetinaFace: Single-shot multi-level face localisation in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5203–5212.
Deng, J.; Guo, J.; Xue, N.; and Zafeiriou, S. 2019. ArcFace: Additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 4690–4699.
Dhariwal, P.; and Nichol, A. 2021. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34: 8780–8794.
Ding, M.; Yang, Z.; Hong, W.; Zheng, W.; Zhou, C.; Yin, D.; Lin, J.; Zou, X.; Shao, Z.; Yang, H.; et al. 2021. CogView: Mastering text-to-image generation via transformers. Advances in Neural Information Processing Systems, 34: 19822–19835.
Gafni, O.; Polyak, A.; Ashual, O.; Sheynin, S.; Parikh, D.; and Taigman, Y. 2022. Make-A-Scene: Scene-based text-to-image generation with human priors. In European Conference on Computer Vision, 89–106. Springer.
Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio, Y. 2014. Generative adversarial nets. Advances in Neural Information Processing Systems, 27.
Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; and Hochreiter, S. 2017. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. Advances in Neural Information Processing Systems, 30.
Hu, E. J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; and Chen, W. 2021. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685.
Karras, T.; Aila, T.; Laine, S.; and Lehtinen, J. 2017. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196.
Kawar, B.; Zada, S.; Lang, O.; Tov, O.; Chang, H.; Dekel, T.; Mosseri, I.; and Irani, M. 2023. Imagic: Text-based real image editing with diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 6007–6017.
Li, A.; Mo, Y.; Li, M.; and Wang, Y. 2024. PID: Prompt-Independent Data Protection Against Latent Diffusion Models. arXiv preprint arXiv:2406.15305.
Li, Y.; Liu, H.; Wu, Q.; Mu, F.; Yang, J.; Gao, J.; Li, C.; and Lee, Y. J. 2023. GLIGEN: Open-set grounded text-to-image generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 22511–22521.
Liang, C.; Wu, X.; Hua, Y.; Zhang, J.; Xue, Y.; Song, T.; Xue, Z.; Ma, R.; and Guan, H. 2023. Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, 20763–20786. PMLR.
Mittal, A.; Moorthy, A. K.; and Bovik, A. C. 2012. No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing, 21(12): 4695–4708.
Mittal, A.; Soundararajan, R.; and Bovik, A. C. 2012. Making a “completely blind” image quality analyzer. IEEE Signal Processing Letters, 20(3): 209–212.
Mou, C.; Wang, X.; Xie, L.; Wu, Y.; Zhang, J.; Qi, Z.; Shan, Y.; and Qie, X. 2023. T2I-Adapter: Learning adapters to dig out more controllable ability for text-to-image diffusion models. arXiv preprint arXiv:2302.08453.
Ramesh, A.; Pavlov, M.; Goh, G.; Gray, S.; Voss, C.; Radford, A.; Chen, M.; and Sutskever, I. 2021. Zero-shot text-to-image generation. In International Conference on Machine Learning, 8821–8831. PMLR.
Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; and Ommer, B. 2022. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10684–10695.
Ruiz, N.; Li, Y.; Jampani, V.; Pritch, Y.; Rubinstein, M.; and Aberman, K. 2023. DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 22500–22510.
Salman, H.; Khaddaj, A.; Leclerc, G.; Ilyas, A.; and Madry, A. 2023. Raising the Cost of Malicious AI-Powered Image Editing. arXiv preprint arXiv:2302.06588.
Terhorst, P.; Kolf, J. N.; Damer, N.; Kirchbuchner, F.; and Kuijper, A. 2020. SER-FIQ: Unsupervised estimation of face image quality based on stochastic embedding robustness. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5651–5660.
Van Le, T.; Phung, H.; Nguyen, T. H.; Dao, Q.; Tran, N. N.; and Tran, A. 2023. Anti-DreamBooth: Protecting users from personalized text-to-image synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2116–2127.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; and Polosukhin, I. 2017. Attention is all you need. Advances in Neural Information Processing Systems, 30.
Zhang, L.; Rao, A.; and Agrawala, M. 2023. Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 3836–3847.
