In recent years, deep learning has made remarkable progress in optical flow estimation: a sufficiently trained neural network can predict inter-frame optical flow directly, avoiding the complex optimization problems of traditional algorithms, although the results still suffer from blur and noise. SIMONYAN et al.[11] introduced variational methods and achieved a significant performance gain. ILG et al.[12] adopted a multi-network stacking strategy to improve performance. RANJAN et al.[13] combined the pyramid concept from traditional algorithms with optical flow estimation, using a coarse-to-fine scheme to handle both large and small displacements. On this basis, SUN et al.[14] introduced correlation-volume processing, improving network performance and enabling end-to-end training. YANG et al.[15] improved correlation-volume processing with a 4D convolution strategy, markedly increasing the accuracy of flow estimation. HUI et al.[16] proposed cascaded flow prediction with feature regularization, further optimizing estimation performance. All of these deep learning methods adopt a coarse-to-fine pyramid with iterative refinement, but this strategy has a drawback: small, fast-moving objects may disappear at coarse pyramid levels. RAFT (recurrent all-pairs field transforms for optical flow)[17] instead maintains and updates a single high-resolution flow field, which shows a clear advantage on small, fast-moving objects and offers a new route to more accurate and stable flow estimation.
On this basis, a lookup operation is defined for the iterative update of the optical flow. Let the flow in the x and y directions computed in the previous iteration be (f1, f2), where f denotes the matrix holding the flow of every pixel. From (f1, f2), the pixel x = (u, v) in I1 is mapped to its corresponding position x' in I2, x' = (u + f1(u), v + f2(v)), where u and v are the coordinates of each pixel in the x and y directions. The neighborhood point set L(x')_r of x' is

$$L(\mathbf{x}')_r = \left\{ \mathbf{x}' + \mathbf{dx} \mid \mathbf{dx} \in \mathbb{Z}^2,\ \lVert \mathbf{dx} \rVert_1 \le r \right\}$$

where r is the lookup radius.
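As a concrete illustration, the following is a minimal PyTorch sketch of such a lookup, not the paper's code: given the current flow it computes x' for every pixel and bilinearly samples a correlation volume over the neighborhood of radius r. The names (`corr`, `flow`, `radius`), the (B·H·W, 1, H, W) layout assumed for the correlation volume, and the use of a square (2r+1)×(2r+1) window in place of the L1 ball are all assumptions made for this sketch.

```python
# Hypothetical lookup sketch; tensor layout and names are assumptions.
import torch
import torch.nn.functional as F

def lookup(corr, flow, radius):
    """Sample a correlation volume around x' = x + f for every pixel.

    corr   : (B*H*W, 1, H, W) - one 2D correlation slice per source pixel.
    flow   : (B, 2, H, W)     - current flow estimate (f1, f2).
    radius : int r            - neighborhood L(x')_r, sampled here on a
                                square (2r+1) x (2r+1) window.
    """
    B, _, H, W = flow.shape
    # Target positions x' = (u + f1, v + f2) for every pixel x = (u, v).
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float().to(flow.device)  # (2, H, W)
    xprime = (base + flow).permute(0, 2, 3, 1).reshape(B * H * W, 1, 1, 2)

    # Integer offsets dx covering the neighborhood window.
    d = torch.arange(-radius, radius + 1, device=flow.device).float()
    dy, dx = torch.meshgrid(d, d, indexing="ij")
    delta = torch.stack((dx, dy), dim=-1).view(1, 2 * radius + 1,
                                               2 * radius + 1, 2)

    # Normalize sample points to [-1, 1], as required by grid_sample.
    pts = xprime + delta                        # (B*H*W, 2r+1, 2r+1, 2)
    pts[..., 0] = 2.0 * pts[..., 0] / max(W - 1, 1) - 1.0
    pts[..., 1] = 2.0 * pts[..., 1] / max(H - 1, 1) - 1.0

    sampled = F.grid_sample(corr, pts, align_corners=True)
    return sampled.view(B, H, W, -1).permute(0, 3, 1, 2)  # (B, (2r+1)^2, H, W)
```

The bilinear interpolation inside grid_sample is what allows the lookup to evaluate the correlation volume at the sub-pixel positions x'.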
The output C_export of the convolutional filters and the current flow estimate f_k are concatenated along the feature dimension to form the output L_output of the base encoding module. L_output is then merged along the feature dimension with the encoded feature I_3 to form the input to the ConvGRU, completing the iterative update of the optical flow. This approach captures finer feature details, brings the prediction closer to the true flow, and improves the precision and detail of the output flow field. In addition, because the feature maps produced by the feature-extraction stage have only 1/8 of the original resolution, the initial flow predictions generated during iterative updating remain at this lower resolution; an upsampling operation is therefore applied to obtain a high-resolution flow field matching the original image.
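To make the update step concrete, below is a minimal sketch of one iteration, with assumed channel sizes and names rather than the paper's implementation: correlation features and the current flow f_k are concatenated into a motion encoding (playing the role of L_output above), merged with the encoded context features along the channel dimension, and passed through a ConvGRU whose hidden state predicts a residual flow.

```python
# Hypothetical update-step sketch; channel sizes and names are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvGRUCell(nn.Module):
    def __init__(self, hidden=128, inp=192):
        super().__init__()
        self.convz = nn.Conv2d(hidden + inp, hidden, 3, padding=1)
        self.convr = nn.Conv2d(hidden + inp, hidden, 3, padding=1)
        self.convq = nn.Conv2d(hidden + inp, hidden, 3, padding=1)

    def forward(self, h, x):
        hx = torch.cat([h, x], dim=1)
        z = torch.sigmoid(self.convz(hx))                      # update gate
        r = torch.sigmoid(self.convr(hx))                      # reset gate
        q = torch.tanh(self.convq(torch.cat([r * h, x], dim=1)))
        return (1 - z) * h + z * q                             # new hidden state

class UpdateStep(nn.Module):
    def __init__(self, corr_ch=81, ctx_ch=64, hidden=128):
        super().__init__()
        self.enc = nn.Conv2d(corr_ch + 2, hidden, 3, padding=1)  # [C, f_k]
        self.gru = ConvGRUCell(hidden, hidden + ctx_ch)
        self.flow_head = nn.Conv2d(hidden, 2, 3, padding=1)

    def forward(self, h, corr_feat, flow_k, context):
        # Concatenate correlation features with the current flow, encode,
        # then merge with context features along the channel dimension.
        motion = F.relu(self.enc(torch.cat([corr_feat, flow_k], dim=1)))
        h = self.gru(h, torch.cat([motion, context], dim=1))
        return h, flow_k + self.flow_head(h)                   # residual update
```

Since everything above runs at 1/8 of the input resolution, the final prediction can be restored to full resolution with, for example, 8.0 * F.interpolate(flow, scale_factor=8, mode="bilinear", align_corners=True), scaling the flow values by the same factor as the spatial size.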
[5] YANG Hua, WANG Jiao, ZHANG Weijun, et al. Lightweight video frame interpolation algorithm based on optical flow estimation[J]. Journal of Shenyang Aerospace University, 2022, 39(6): 57-64.
[6] LI Zhihui, HU Yongli, ZHAO Yonghua, et al. Locating moving pedestrian from running vehicle[J]. Journal of Jilin University (Engineering and Technology Edition), 2018, 48(3): 694-703.
[7] HORN B K P, SCHUNCK B G. Determining optical flow[J]. Artificial Intelligence, 1981, 17(1/2/3): 185-203.
[8] BLACK M J, ANANDAN P. A framework for the robust estimation of optical flow[C]//1993 (4th) International Conference on Computer Vision. May 11-14, 1993, Berlin, Germany. IEEE, 1993: 231-236.
[9] ZACH C, POCK T, BISCHOF H. A duality based approach for realtime TV-L1 optical flow[C]//Pattern Recognition. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007: 214-223.
[10] WEINZAEPFEL P, REVAUD J, HARCHAOUI Z, et al. DeepFlow: large displacement optical flow with deep matching[C]//2013 IEEE International Conference on Computer Vision. December 1-8, 2013, Sydney, NSW, Australia. IEEE, 2013: 1385-1392.
[11] SIMONYAN K, ZISSERMAN A. Two-stream convolutional networks for action recognition in videos[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 1. December 8-13, 2014, Montreal, Canada. ACM, 2014: 568-576.
[12] ILG E, MAYER N, SAIKIA T, et al. FlowNet 2.0: evolution of optical flow estimation with deep networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. July 21-26, 2017, Honolulu, HI, USA. IEEE, 2017: 1647-1655.
[13] RANJAN A, BLACK M J. Optical flow estimation using a spatial pyramid network[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. July 21-26, 2017, Honolulu, HI, USA. IEEE, 2017: 2720-2729.
[14] SUN D Q, YANG X D, LIU M Y, et al. PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. June 18-23, 2018, Salt Lake City, UT, USA. IEEE, 2018: 8934-8943.
[15] YANG G, RAMANAN D. Volumetric correspondence networks for optical flow[C]//Annual Conference on Neural Information Processing Systems. December 2019, Vancouver, BC, Canada: NeurIPS, 2019: 793-803.
[16] HUI T W, TANG X O, LOY C C. LiteFlowNet: a lightweight convolutional neural network for optical flow estimation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. June 18-23, 2018, Salt Lake City, UT, USA. IEEE, 2018: 8981-8989.
[17] TEED Z, DENG J. RAFT: recurrent all-pairs field transforms for optical flow[C]//Computer Vision - ECCV 2020. Cham: Springer International Publishing, 2020: 402-419.
[18] BUTLER D J, WULFF J, STANLEY G B, et al. A naturalistic open source movie for optical flow evaluation[C]//Computer Vision - ECCV 2012. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012: 611-625.
[19] ZHANG Shuifa, ZHANG Wensheng, DING Huan, et al. Background modeling and object detecting based on optical flow velocity field[J]. Journal of Image and Graphics, 2011, 16(2): 236-243.
[20] XU Guangfu, ZENG Jichao, LIU Xixiang. Visual odometer based on optical flow method and feature matching[J]. Laser & Optoelectronics Progress, 2020, 57(20): 270-278.