To reduce the heavy manual effort required to review and screen videos for tampered content, this paper proposes a technique for identifying tampered content in videos. First, key frames are extracted by calculating inter-frame differences. Then, the tampered locations are determined through difference analysis between the key frames and the corresponding original frames. Subsequently, because tampering is random in both location and content structure, the detected regions are dispersed in location and in semantics; drawing on the phenomenon of location association and semantic aggregation, the DBSCAN algorithm is used to resolve this dispersion. Finally, optical character recognition (OCR) is applied to read the specific content that has been altered. The proposed method thus identifies both the spatio-temporal position and the content of tampered video material, providing a solid technical foundation for video content inspection in fields such as public safety, media, and business.
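In recent years, video content tampering identification has attracted considerable research attention, and researchers in China and abroad have continued to explore more advanced identification techniques. The inter-frame tampering detection scheme based on optical flow consistency (A novel video inter-frame forgery model detection scheme based on optical flow consistency) [2] computes optical flow features between frames using a sliding window. When a video has been tampered with, the optical flow between adjacent frames becomes highly inconsistent, which makes it possible to locate the tampered frames. Optical flow estimates the speed and direction of motion by tracking feature points or pixels in an image, and is commonly used in computer vision for tasks such as object motion analysis. The consistency detection method based on quotients of the mean structural similarity index (MSSIM) [3,4] observes that the MSSIM quotients of adjacent frames vary continuously, whereas in a tampered video the MSSIM quotient changes abruptly at the tampered position, which enables tampering identification. This algorithm is highly accurate, but its effectiveness depends heavily on expert experience and it is not adaptive. Clustering-based [5] tampering detection algorithms cluster the extracted video frame features using a variety of methods and then locate the tampered regions, thereby identifying the tampering. The image features used include color features (color histograms, color moments), statistical features (Hu moments), contour features (SIFT, SURF), and spatial relationship features. Clustering-based tampering detection algorithms generalize well: frame features can be chosen to suit the video content at hand, which yields detection results of high accuracy.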
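The MSSIM-quotient idea described above can be illustrated with a short sketch. It is only one plausible reading of the method, under these assumptions: the MSSIM value between each pair of adjacent frames has already been computed elsewhere and is positive, the quotient is taken between consecutive MSSIM values, and the function names and the `tolerance` value are hypothetical. The hand-picked tolerance is also where the dependence on expert experience shows up.

```python
def mssim_quotients(mssim):
    """Quotient of each adjacent-frame MSSIM value and the one before it."""
    return [mssim[i + 1] / mssim[i] for i in range(len(mssim) - 1)]


def abrupt_changes(quotients, tolerance=0.1):
    """Indices whose quotient deviates from 1 by more than `tolerance`."""
    return [i for i, q in enumerate(quotients) if abs(q - 1.0) > tolerance]


# Hypothetical MSSIM values between adjacent frames; the dip to 0.60
# stands for a position where frames were inserted or deleted.
mssim = [0.97, 0.96, 0.97, 0.60, 0.96, 0.97]
suspects = abrupt_changes(mssim_quotients(mssim))
print(suspects)
```

With these values the quotients stay close to 1 except around the dip, so only the two quotients that involve the 0.60 entry are flagged.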
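To address the above problems, this paper proposes a video content tampering identification technique based on key frame extraction and DBSCAN. The algorithm proceeds as follows. First, the image features of the tampered video are analyzed [6,7] to obtain the key frames that capture the main content of the video. Second, difference analysis is performed between each key frame of the tampered video and the original frame at the same position in the source video, which gives the spatial location of the tampered content. Third, based on the semantic-neighbor phenomenon and the spatial location information, the DBSCAN algorithm is used to aggregate content-related tampered regions, giving the spatial location of semantically complete tampered content. Finally, the tampered regions are cropped according to their spatial locations, and optical character recognition (OCR) [8,9] is used to extract the tampered content.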
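The first two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: frames are small grayscale grids stored as nested lists, a frame becomes a key frame when its mean absolute difference from the previous key frame exceeds a threshold, and the helper names and threshold values are hypothetical.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def key_frame_indices(frames, threshold):
    """Keep frame 0, then every frame that differs enough from the last key frame."""
    keys = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[keys[-1]]) > threshold:
            keys.append(i)
    return keys


def changed_pixels(tampered, original, threshold):
    """(row, col) positions where the two frames differ by more than `threshold`."""
    return [(r, c)
            for r, (row_t, row_o) in enumerate(zip(tampered, original))
            for c, (pt, po) in enumerate(zip(row_t, row_o))
            if abs(pt - po) > threshold]


# Toy video: two dark frames followed by two bright frames (a scene change).
dark = [[0, 0, 0], [0, 0, 0]]
bright = [[200, 200, 200], [200, 200, 200]]
keys = key_frame_indices([dark, dark, bright, bright], threshold=50)

# Toy key frame pair: two pixels were overwritten in the tampered copy.
original = [[10, 10, 10], [10, 10, 10]]
tampered = [[10, 255, 10], [10, 10, 255]]
changed = changed_pixels(tampered, original, threshold=30)
print(keys, changed)
```

The positions returned by `changed_pixels` are the kind of scattered location data that the clustering step then has to group into semantically complete regions before cropping and OCR.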
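The density-based clustering algorithm DBSCAN (Density-based spatial clustering of applications with noise) [18] clusters data according to two parameters, the search radius (Eps) and the minimum number of neighborhood points (MinPts). It is suitable for data sets of arbitrary shape, especially manifold-like data. When the data are sparsely distributed or their density varies widely, however, misjudgments occur easily. To ensure clustering accuracy, the density of the data set must also be analyzed in advance in order to set these two parameters. The DBSCAN procedure involves three important concepts. ① Core point: a sample point p_i is a core point if the number of data points within Eps of p_i is at least MinPts. ② Non-core point: a sample point p_i is a non-core point if the number of data points within its search radius is smaller than MinPts but p_i lies within the neighborhood of a core point. ③ Outlier: every remaining sample point that is neither a core point nor a non-core point is marked as an outlier.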
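The three concepts can be made concrete with a small, dependency-free DBSCAN sketch. It is an illustration rather than the implementation used in this paper. It assumes Euclidean distance, counts a point as a member of its own neighborhood, and treats a point as core when its neighborhood holds at least `min_pts` points; the sample coordinates, which stand in for the centers of detected tampered regions, are hypothetical.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Return (labels, kinds): a cluster id or -1 per point, and its point type."""
    n = len(points)
    # Neighborhood of each point: indices within distance eps (the point itself included).
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]

    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != -1:
            continue  # clusters are only started from unlabeled core points
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for j in neighbors[p]:
                if labels[j] == -1:
                    labels[j] = cluster
                    if is_core[j]:
                        stack.append(j)  # only core points keep expanding the cluster
        cluster += 1

    kinds = ["core" if is_core[i] else "non-core" if labels[i] != -1 else "outlier"
             for i in range(n)]
    return labels, kinds


# Hypothetical region centers: a tight group, one point on its edge, one far away.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1), (10, 10)]
labels, kinds = dbscan(points, eps=1.0, min_pts=3)
print(labels)
print(kinds)
```

Here the four points of the unit square are core points, `(2, 1)` joins their cluster as a non-core point because it lies within Eps of the core point `(1, 1)`, and `(10, 10)` is left as an outlier.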
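This section presents the comparison with other algorithms. The hardware environment used in the experiments is an Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz with 16 GB of memory. For the experimental data, 100 videos were selected as source videos from the HowTo100M data set publicly released by Facebook AI Research. Text was then added at arbitrary positions in each video to produce the tampered videos.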
Cui Xue-bing, Feng Qiao-juan, Cui Ping-fei. MPEG video authentication scheme based on content feature[J]. Journal of Computer Applications, 2010, 30(1): 214-216.
[3]
Sun T F, Jiang X H, Chao J. A novel video inter-frame forgery model detection scheme based on optical flow consistency[C]∥International Workshop on Digital Watermarking, Berlin, Germany, 2012: 261-281.
Zhang Zhen-zhen, Hou Jian-jun, Li Zhao-hong, et al. Video-frame insertion and deletion detection based on consistency of quotients of MSSIM[J]. Journal of Beijing University of Posts and Telecommunications, 2015, 38(4): 84-88.
[6]
Lv C H, Huang Y. Effective keyframe extraction from personal video by using nearest neighbor clustering[C]∥11th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, Beijing, China, 2018: 1-4.
[7]
Valognes J, Amer M A, Dastjerdi N S. Effective keyframe extraction from RGB and RGB-D video sequences[C]∥Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), Montreal, Canada, 2017: 1-5.
[8]
Duan L, Xiong D, Lee J, et al. A local density based spatial clustering algorithm with noise[C]∥IEEE International Conference on Systems, Man and Cybernetics, Taipei, China, 2006: 4061-4066.
Qin Xu-jia, Wang Hui-ling, Du Yi-cheng, et al. Structured light image enhancement algorithm based on Retinex in HSV color space[J]. Journal of Computer-Aided Design & Computer Graphics, 2013, 25(4): 488-493.
[13]
Sabu A M, Das A S. A survey on various optical character recognition techniques[C]∥Conference on Emerging Devices and Smart Systems (ICEDSS), Tiruchengode, India, 2018: 152-155.
[14]
Sarika N, Sirisala N, Velpuru M S. CNN based optical character recognition and applications[C]∥6th International Conference on Inventive Computation Technologies, Coimbatore, India, 2021: 666-672.
[15]
Jun L. An improved DBSCAN clustering algorithm[J]. Computer and Communications, 2008, 8: 47468-47476.
Gu Yi-jun, Xie Yi, Xia Tian. Keyframe extraction based on representative evaluation of contents[J]. Computer Science, 2014, 41(8): 286-288.
[18]
Decombas M, Dufaux F, Renan E, et al. A new object based quality metric based on SIFT and SSIM[C]∥19th IEEE International Conference on Image Processing, Orlando, USA, 2012: 1493-1496.
[19]
Gupta P, Srivastava P, Bhardwaj S, et al. A modified PSNR metric based on HVS for quality assessment of color images[C]∥International Conference on Communication and Industrial Application, Kolkata, India, 2011: 1-4.
[20]
Alain H, Ziou D. Image quality metrics: PSNR vs. SSIM[C]∥20th International Conference on Pattern Recognition, Istanbul, Turkey, 2010: 2366-2369.
Jin Li-na, Yu Jiong, Du Xu-sheng, et al. Generative adversarial network and variational auto-encoder based outlier detection[J]. Application Research of Computers, 2022, 39(3): 774-779.
[25]
Evans A N. Morphological gradient operators for colour images[C]∥International Conference on Image Processing, Singapore, 2004: 3089-3092.
[26]
Smiti A, Eloudi Z. Soft DBSCAN: improving DBSCAN clustering method using fuzzy set theory[C]∥6th International Conference on Human System Interactions, Sopot, Poland, 2013: 380-385.