SAGI: Semantically Aligned and Uncertainty Guided AI Image Inpainting

Giakoumoglou, Paschalis; Karageorgiou, Dimitrios; Papadopoulos, Symeon; Petrantonakis, Panagiotis C.

arXiv:2502.06593 (cs)

[Submitted on 10 Feb 2025 (v1), last revised 22 May 2025 (this version, v2)]

Abstract: Recent advancements in generative AI have made text-guided image inpainting (adding, removing, or altering image regions using textual prompts) widely accessible. However, generating semantically correct photorealistic imagery typically requires carefully crafted prompts and iterative refinement based on evaluating the realism of the generated content, tasks commonly performed by humans. To automate the generative process, we propose Semantically Aligned and Uncertainty Guided AI Image Inpainting (SAGI), a model-agnostic pipeline that samples prompts from a distribution closely aligned with human perception, evaluates the generated content, and discards any content that deviates from that distribution, which we approximate using pretrained Large Language Models and Vision-Language Models. By applying this pipeline to multiple state-of-the-art inpainting models, we create the SAGI Dataset (SAGI-D), currently the largest and most diverse dataset of AI-generated inpaintings, comprising over 95k inpainted images and a human-evaluated subset. Our experiments show that semantic alignment significantly improves image quality and aesthetics, while uncertainty guidance effectively identifies realistic manipulations: human accuracy in distinguishing inpainted images from real ones drops from 74% to 35% after applying our pipeline. Moreover, using SAGI-D to train several image forensic approaches increases in-domain detection performance by 37.4% on average and out-of-domain generalization by 26.1% in terms of IoU, demonstrating its utility in countering malicious exploitation of generative AI. Code and dataset are available at this https URL
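The forensic results in the abstract are reported in terms of IoU (Intersection over Union) between a predicted manipulation mask and the ground-truth inpainting mask. The following is a minimal sketch of the standard IoU definition for binary masks, not the authors' evaluation code: the function name `mask_iou`, the nested-list mask representation, and the convention that two empty masks score 1.0 are assumptions made for illustration, and the abstract does not state how masks are thresholded or how scores are averaged.

```python
def mask_iou(predicted, ground_truth):
    """Intersection over Union for two same-shaped binary masks.

    Masks are nested lists of 0/1 values (a hypothetical representation
    chosen to keep the sketch dependency-free).
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("masks must have the same height")

    intersection = 0
    union = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        if len(pred_row) != len(true_row):
            raise ValueError("masks must have the same width")
        for p, t in zip(pred_row, true_row):
            p, t = bool(p), bool(t)
            intersection += p and t  # pixel flagged in both masks
            union += p or t          # pixel flagged in at least one mask

    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy example: the masks agree on 2 pixels out of 4 flagged overall.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 0],
                [0, 1, 1]]
print(mask_iou(predicted, ground_truth))  # 0.5
```

Under this definition, a detector that localizes the inpainted region exactly scores 1.0, and one whose prediction does not overlap it at all scores 0.0.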

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2502.06593 [cs.CV]
	(or arXiv:2502.06593v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2502.06593

Submission history

From: Paschalis Giakoumoglou
[v1] Mon, 10 Feb 2025 15:56:28 UTC (5,179 KB)
[v2] Thu, 22 May 2025 18:13:28 UTC (19,862 KB)
