This paper addresses three limitations of current image manipulation localization methods: features are extracted at a single scale, small tampered regions are missed or misdetected because they are confused with the background, and the prediction results are uncertain. An image manipulation localization method based on edge uncertainty learning is proposed. A pyramid vision transformer extracts the base features of the tampered image, and multi-level interactive coarse localization branches then generate a coarse localization map. A small target-aware refinement branch improves the detection of small tampered regions. A dedicated multi-scale feature fusion module lets features at different scales interact and integrate fully. In addition, an entropy-based perceptual loss supervises boundary uncertainty, which reduces the uncertainty of the prediction results. The method is evaluated on five widely used public image tampering datasets in both in-domain and cross-domain experiments. The results show that it localizes tampered regions effectively and outperforms existing approaches.
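The global attention branch helps the model capture rich contextual information and strengthens its semantic representation. Three convolutional layers generate the query feature Q, the key feature K, and the value feature V. Let the input feature map have size c × h × w, where c, h, and w are the number of channels, the height, and the width, respectively. To save computation, the number of channels of Q and K is reduced; Q and K are then reshaped into matrices with the spatial dimensions flattened, and V is reshaped in the same way. The transpose of Q is multiplied by K, and a Softmax function turns the product into a global spatial attention map. V is then combined with this attention map by matrix multiplication to obtain the semantically enhanced feature.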
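The computation described above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: it assumes the three convolutions are 1 × 1, so each is written as a matrix product over channels, and the names `global_attention`, `w_q`, `w_k`, `w_v` and the reduced channel count used in the example are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(x, w_q, w_k, w_v):
    """x: (c, h, w). w_q, w_k: (c_r, c) and w_v: (c, c) stand in for 1x1 convolutions."""
    c, h, w = x.shape
    flat = x.reshape(c, h * w)            # flatten the spatial dimensions: (c, n)
    q = w_q @ flat                        # query with reduced channels: (c_r, n)
    k = w_k @ flat                        # key with reduced channels:   (c_r, n)
    v = w_v @ flat                        # value:                       (c, n)
    attn = softmax(q.T @ k, axis=-1)      # global spatial attention map: (n, n)
    out = v @ attn.T                      # every position aggregates all values
    return out.reshape(c, h, w), attn

# Example with arbitrary sizes (c = 8, reduced to 2 channels for Q and K).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 5))
out, attn = global_attention(
    x,
    rng.standard_normal((2, 8)),
    rng.standard_normal((2, 8)),
    rng.standard_normal((8, 8)),
)
print(out.shape, attn.shape)
```

Each row of the attention map sums to 1, so every output position is a weighted average of the value vectors over all spatial positions.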
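First, pooling kernels of two different sizes are applied to the input feature, performing global average pooling and a linear operation, which yields two features of different dimensions. Second, the two features are cropped and fused to obtain an intermediate feature. Third, the input feature is split into H slices along the H dimension and W slices along the W dimension to generate K and V, and computing their similarity gives the attention matrix. Finally, the attention matrix is multiplied by V so that each pixel perceives information from the other pixels in its horizontal and vertical directions; the result is multiplied by a scale parameter and added to the original input feature to give the final enhanced feature.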
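The last two steps can be illustrated with a simplified NumPy sketch. It makes several assumptions that are not taken from the text: the pooling and fusion steps are omitted, the slices themselves serve as both K and V, the similarity is a dot product scaled by the square root of the slice length, and the names `slice_attention`, `directional_enhance`, and `gamma` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def slice_attention(x, axis):
    """Attention among the slices of x (c, h, w) taken along `axis` (1 = H, 2 = W)."""
    moved = np.moveaxis(x, axis, 0)                  # one leading entry per slice
    slices = moved.reshape(moved.shape[0], -1)       # (s, d): flattened slices, used as K and V
    scores = slices @ slices.T / np.sqrt(slices.shape[1])
    attn = softmax(scores, axis=-1)                  # (s, s) slice-to-slice attention matrix
    mixed = (attn @ slices).reshape(moved.shape)     # each slice aggregates the others
    return np.moveaxis(mixed, 0, axis)               # restore the (c, h, w) layout

def directional_enhance(x, gamma=0.5):
    # Combine attention along H and W, scale it, and add the residual input.
    enhanced = slice_attention(x, 1) + slice_attention(x, 2)
    return gamma * enhanced + x

x = np.random.default_rng(0).standard_normal((4, 6, 7))
print(directional_enhance(x).shape)
```

With `gamma = 0` the function returns its input unchanged, which shows the role of the scale parameter: it controls how much directional context is mixed into the original feature.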
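For the k-th stage, the module first concatenates the output of its next stage with the feature obtained by the small target-aware refinement branch at the current stage. Second, the fused result is fed into a convolutional layer, an element-wise summation is applied, and a further convolutional layer refines the result. Finally, position information is obtained from the coarse localization map M generated by the multi-level interactive coarse localization branches. The module uses a conditional normalization method [13] to inject this position information into the fused feature: the coarse localization map M is fed into convolutional layers to generate two modulation parameters. Because these parameters are generated dynamically by convolutional layers, they adapt the scale and position information of the feature to the input. To embed the position information better in the feature space, the module uses the two parameters as affine parameters to modulate the feature. This process can be expressed as: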
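As a rough illustration of this kind of modulation, the NumPy sketch below normalizes a feature and then applies a per-pixel scale and shift derived from the coarse map. It is an assumption-laden sketch rather than the paper's formula: the convolutions that produce the two parameters are taken to be 1 × 1, the normalization is per channel, and the names `conditional_modulation`, `gamma`, `beta`, and the weight arguments are hypothetical.

```python
import numpy as np

def conditional_modulation(feat, coarse_map, w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """feat: (c, h, w); coarse_map: (1, h, w); weights and biases: (c,)."""
    # Normalize each channel of the fused feature (assumed normalization scheme).
    mean = feat.mean(axis=(1, 2), keepdims=True)
    std = feat.std(axis=(1, 2), keepdims=True)
    normed = (feat - mean) / (std + eps)
    # A 1x1 convolution on a single-channel map is a per-channel scale and shift of M.
    gamma = w_gamma[:, None, None] * coarse_map + b_gamma[:, None, None]   # (c, h, w)
    beta = w_beta[:, None, None] * coarse_map + b_beta[:, None, None]      # (c, h, w)
    # Affine modulation: the coarse map decides the scale and offset at each pixel.
    return gamma * normed + beta

rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 6, 6))
coarse_map = rng.random((1, 6, 6))
out = conditional_modulation(
    feat, coarse_map,
    rng.standard_normal(4), np.ones(4),
    rng.standard_normal(4), np.zeros(4),
)
print(out.shape)
```

Because `gamma` and `beta` vary with the coarse map at every pixel, regions that the coarse branch marks as likely tampered are scaled and shifted differently from the background.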
Zhong Hui, Kang Heng, Ying-da Lyu, et al. Image manipulation localization algorithm based on channel attention convolutional neural networks[J]. Journal of Jilin University (Engineering and Technology Edition), 2021, 51(5): 1838-1844.
[3]
Shi Z, Chen H, Zhang D. Transformer-auxiliary neural networks for image manipulation localization by operator inductions[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2023, 33(9): 4907-4920.
Shi Ze-nan, Chen Hai-peng, Zhang Dong, et al. Pretraining-driven multimodal boundary-aware vision transformer[J]. Journal of Software, 2023, 34(5): 2051-2067.
[6]
Liu Y, Zhu X, Zhao X, et al. Adversarial learning for constrained image splicing detection and localization based on atrous convolution[J]. IEEE Transactions on Information Forensics and Security, 2019, 14(10): 2551-2566.
[7]
Chen B, Tan W, Coatrieux G, et al. A serial image copy-move forgery localization scheme with source/target distinguishment[J]. IEEE Transactions on Multimedia, 2020, 23: 3506-3517.
[8]
Zhang Y, Fu Z, Qi S, et al. Localization of inpainting forgery with feature enhancement network[J]. IEEE Transactions on Big Data, 2022, 9(3): 936-948.
[9]
Popescu A C, Farid H. Exposing digital forgeries in color filter array interpolated images[J]. IEEE Transactions on Signal Processing, 2005, 53(10): 3948-3959.
[10]
Bianchi T, Piva A. Image forgery localization via block-grained analysis of JPEG artifacts[J]. IEEE Transactions on Information Forensics and Security, 2012, 7(3): 1003-1017.
[11]
Mahdian B, Saic S. Using noise inconsistencies for blind image forensics[J]. Image and Vision Computing, 2009, 27(10): 1497-1503.
[12]
Chen X, Dong C, Ji J, et al. Image manipulation detection by multi-view multi-scale supervision[C]∥Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, QC, Canada, 2021: 14165-14173.
[13]
Shi C, Wang C, Zhou X, et al. DAE-Net: Dual attention mechanism and edge supervision network for image manipulation detection and localization[J]. IEEE Transactions on Instrumentation and Measurement, 2024, 73: No.5028112.
[14]
Wang W, Xie E, Li X, et al. PVT v2: Improved baselines with pyramid vision transformer[J]. Computational Visual Media, 2022, 8(3): 415-424.
[15]
Park T, Liu M Y, Wang T C, et al. Semantic image synthesis with spatially-adaptive normalization[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA, 2019: 2337-2346.
[16]
Pang Y, Zhao X, Xiang T Z, et al. Zoom in and out: A mixed-scale triplet network for camouflaged object detection[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, LA, USA, 2022: 2160-2170.
[17]
Dong J, Wang W, Tan T. CASIA image tampering detection evaluation database[C]∥2013 IEEE China Summit and International Conference on Signal and Information Processing. Beijing, China, 2013: 422-426.
[18]
Guan H, Kozak M, Robertson E, et al. MFC datasets: Large-scale benchmark datasets for media forensic challenge evaluation[C]∥2019 IEEE Winter Applications of Computer Vision Workshops (WACVW). Waikoloa, HI, USA, 2019: 63-72.
[19]
Wen B, Zhu Y, Subramanian R, et al. COVERAGE—a novel database for copy-move forgery detection[C]∥2016 IEEE International Conference on Image Processing (ICIP). Phoenix, AZ, USA, 2016: 161-165.
[20]
Hsu Y F, Chang S F. Detecting image splicing using geometry invariants and camera characteristics consistency[C]∥2006 IEEE International Conference on Multimedia and Expo. Toronto, ON, Canada, 2006: 549-552.
[21]
Novozamsky A, Mahdian B, Saic S. IMD2020: A large-scale annotated dataset tailored for detecting manipulated images[C]∥Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops. Snowmass Village, CO, USA, 2020: 71-80.
[22]
Wu Y, AbdAlmageed W, Natarajan P. ManTra-Net: Manipulation tracing network for detection and localization of image forgeries with anomalous features[C]∥Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, CA, USA, 2019: 9543-9552.
[23]
Hu X, Zhang Z, Jiang Z, et al. SPAN: Spatial pyramid attention network for image manipulation localization[C]∥Proceedings of the European Conference on Computer Vision (ECCV). Glasgow, UK, 2020: 312-328.
[24]
Zhuang P, Li H, Tan S, et al. Image tampering localization using a dense fully convolutional network[J]. IEEE Transactions on Information Forensics and Security, 2021, 16: 2986-2999.
[25]
Zhuo L, Tan S, Li B, et al. Self-adversarial training incorporating forgery attention for image forgery localization[J]. IEEE Transactions on Information Forensics and Security, 2022, 17: 819-834.
[26]
Xia X, Su L C, Wang S P, et al. DMFF-Net: Double-stream multilevel feature fusion network for image forgery localization[J]. Engineering Applications of Artificial Intelligence, 2024, 127: No.107200.