To accurately describe and parameterize scene light sources and achieve high-precision illumination estimation from a single image, an illumination representation based on linearly transformed cosine spherical distributions is proposed. A regression neural network is designed to infer the parametric distribution and intensity of light sources from a single image, and a novel loss function based on singular value decomposition is introduced. This loss precisely and concisely measures the distance between two parameterized light sources, significantly improving the accuracy of the regression network. Experimental results demonstrate that, compared with existing methods, the proposed method performs well under complex illumination conditions and, in particular, captures anisotropic illumination information markedly better.
The structure of the regression network is shown in Fig. 1. The network takes an LDR image as input and outputs the key parameters of the parameterized illumination map: the light distribution matrix M, light intensity I, light color temperature T, and ambient term A. The model first generates an HDR illuminance map from the per-channel contribution weights of the HDR pixel illuminance. In this illuminance map, the brightest 2.5% of pixels are selected as light-source regions, and their connected components are computed to separate them into N light sources. For each connected component, the light intensity I is given by the component's maximum luminance; the color temperature T is computed from the component's mean RGB value; and the ambient term A is obtained as the mean of the remaining pixels after all connected components are excluded. For each light-source region, the first 8 elements of the matrix M, corresponding to the linearly transformed cosine spherical distribution that best fits that region, are obtained by solving an optimization problem. After all light-source regions are processed, 8N parameters are obtained. However, since N varies across illumination maps, the number of matrix elements would differ between maps; to standardize training, N is fixed at 10, i.e., only the 10 brightest light sources in each illumination map are considered. When fewer than 10 light sources are present, the intensities of the remaining slots are set to 0. In typical indoor HDR illumination maps, the few brightest light sources contribute the vast majority (>95%) of the energy, while sources ranked beyond 10 usually contribute less than 5% and have little effect on the overall illumination. Setting N = 10 therefore focuses the model on the key light sources that dominate scene lighting.
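The parameter-extraction stage above can be sketched as follows. This is a minimal illustration, assuming Rec. 709 luminance weights as a stand-in for the paper's channel contribution weights and returning the mean RGB in place of a full color-temperature conversion; function and variable names are ours, not the paper's.

```python
import numpy as np
from scipy import ndimage

def extract_light_sources(hdr, n_max=10, top_pct=2.5):
    """Extract per-source parameters from an HDR illuminance map (sketch)."""
    # Scalar luminance via per-channel contribution weights (Rec. 709 here).
    lum = hdr @ np.array([0.2126, 0.7152, 0.0722])
    # The brightest 2.5% of pixels form the candidate light-source mask.
    mask = lum >= np.percentile(lum, 100.0 - top_pct)
    # Split the mask into connected components, one per light source.
    labels, n = ndimage.label(mask)
    sources = []
    for k in range(1, n + 1):
        comp = labels == k
        intensity = lum[comp].max()        # I: peak luminance of the component
        mean_rgb = hdr[comp].mean(axis=0)  # used to derive color temperature T
        sources.append((intensity, mean_rgb))
    # Ambient term A: mean of all pixels outside every connected component.
    ambient = hdr[labels == 0].mean(axis=0)
    # Keep the n_max brightest sources; pad missing slots with zero intensity.
    sources.sort(key=lambda s: -s[0])
    sources = sources[:n_max]
    while len(sources) < n_max:
        sources.append((0.0, np.zeros(3)))
    return sources, ambient
```

Fitting the 8 elements of M per region is the separate optimization step described in the text and is omitted here.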
The regression network in this work is based on a modified ResNet-101 architecture with 4 independent branches that separately regress the light distribution matrix M, light intensity I, color temperature T, and ambient term A. For I, T, and A, a standard L2 loss is used. However, for the light distribution matrix M, which represents an actual linear transformation, a plain L2 loss cannot fully exploit the spatial information of the spherical distribution or the geometric properties of the linear transformation. To address this, we propose a novel loss function based on singular value decomposition (SVD), designed specifically for regressing the light distribution matrix so as to capture its geometric meaning and structural characteristics more accurately.
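The branch layout can be sketched with plain linear heads over a shared backbone feature, which suffices to show the output dimensions implied by the parameterization above (8 matrix elements per light, scalar I and T per light, one RGB ambient term). The feature dimension and head shapes are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

feat_dim = 2048   # stand-in for the pooled ResNet-101 feature size
n_lights = 10     # fixed number of light sources, N = 10

# One independent linear head per output group.
heads = {
    "M": rng.normal(size=(feat_dim, 8 * n_lights)) * 0.01,
    "I": rng.normal(size=(feat_dim, n_lights)) * 0.01,
    "T": rng.normal(size=(feat_dim, n_lights)) * 0.01,
    "A": rng.normal(size=(feat_dim, 3)) * 0.01,
}

def forward(features):
    """Map a backbone feature vector to the four parameter groups."""
    return {name: features @ w for name, w in heads.items()}

def l2_loss(pred, target):
    """Standard L2 loss, as used for the I, T and A branches."""
    return float(np.mean((pred - target) ** 2))
```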
1.4 Design of the SVD loss function
The spherical distribution of a light source can be fitted by applying a single 3D linear transformation to a standard cosine spherical distribution; measuring the distance between two light distributions therefore reduces to measuring the distance between the two corresponding linear transformations. When designing the loss function for the light distribution matrix parameters, a common simplification is to use a matrix-norm loss on M, or to treat M as a vector and regress it with an L2 loss, but both approaches have limitations. First, a matrix-norm loss reflects only the magnitude of the linear transformation and cannot capture its directional information, which is the key factor determining the effect of the transformation. Second, regressing with an L2 loss ignores the spatial structure of the matrix. Moreover, regressing the 80 parameters with an L2 loss implicitly assumes that they are equally weighted and mutually independent, which does not hold in practice: in the LTC setting, the matrix elements that determine the directional properties of the illumination carry different weights, and different light sources in the illumination map contribute different amounts of energy.
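The first limitation is easy to demonstrate: two transforms can have identical Frobenius norm yet act in entirely different directions, so a norm-based loss cannot separate them.

```python
import numpy as np

# Two scaling transforms with identical Frobenius norm but different
# principal directions: one stretches along x, the other along y.
M1 = np.diag([2.0, 1.0, 1.0])
M2 = np.diag([1.0, 2.0, 1.0])

# A norm-based loss sees no difference between them ...
assert np.linalg.norm(M1) == np.linalg.norm(M2)

# ... yet they map the same direction to very different vectors.
v = np.array([1.0, 0.0, 0.0])
assert not np.allclose(M1 @ v, M2 @ v)  # [2,0,0] vs. [1,0,0]
```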
To fully exploit the transformation properties of the linear transformation matrix and the spatial information of the spherical distribution, the role of the matrix M as a spatial transformation must be considered. For a 3×3 real matrix M, singular value decomposition yields M = UΣV^T, where Σ is a diagonal matrix containing the singular values of M, and the columns of U and V are the left and right singular vectors, respectively. Fig. 3 illustrates the rotation and scaling implied by the SVD: the singular values represent the scaling factors of the transformation, while the left and right singular vectors represent rotation operations.
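The decomposition can be verified numerically: U and V^T are orthogonal (pure rotations, possibly with reflections), and the singular values carry all of the scaling.

```python
import numpy as np

# Decompose a 3x3 transform into rotate-scale-rotate factors, M = U Σ Vᵀ.
M = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.5],
              [0.0, 0.0, 1.0]])
U, s, Vt = np.linalg.svd(M)

# U and Vᵀ are orthogonal; s holds the scaling factors, sorted descending.
assert np.allclose(U @ U.T, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(3))
assert np.allclose(U @ np.diag(s) @ Vt, M)
```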
This SVD loss function has clear advantages for regressing spherical light distribution matrices. First, it fully exploits the geometric meaning of M in representing the light distribution, makes the combined rotation and scaling effects of the linear transformation explicit, and effectively penalizes predictions that deviate from the true linear transformation. Second, it provides a finer-grained way to quantify differences between light distributions, which promotes more precise updates of the network weights, especially in complex lighting scenes.
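One plausible form of such a loss, given only the description above, compares the scaling via the singular values and the rotations via the orthogonal factors. This is an illustrative sketch of the idea, not the paper's exact formula; the weights and the rotation penalty are our assumptions.

```python
import numpy as np

def svd_loss(M_pred, M_true, w_scale=1.0, w_rot=1.0):
    """Illustrative SVD-based distance between two 3x3 light
    distribution matrices: penalize mismatched scaling (singular
    values) and mismatched rotation (singular vectors) separately."""
    Up, sp, Vtp = np.linalg.svd(M_pred)
    Ut, st, Vtt = np.linalg.svd(M_true)
    # Scaling term: squared gap between the sorted singular values.
    scale_term = np.sum((sp - st) ** 2)
    # Rotation term: squared gap between the orthogonal factors.
    rot_term = np.sum((Up - Ut) ** 2) + np.sum((Vtp - Vtt) ** 2)
    return w_scale * scale_term + w_rot * rot_term
```

Note that SVD factors are only unique up to sign flips (and permutations among equal singular values), so a production version of such a loss would need to handle those ambiguities; the sketch ignores them.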
[1] Debevec P E. Rendering synthetic objects into real scenes: bridging traditional and image-based graphics with global illumination and high dynamic range photography[C]//Special Interest Group on Graphics and Interactive Techniques. New York: Association for Computing Machinery, 1998: 189-198.
[2] Karsch K, Hedau V, Forsyth D, et al. Rendering synthetic objects into legacy photographs[J]. ACM Transactions on Graphics, 2011, 30(6): 1-12.
[3] Marschner S R, Greenberg D P. Inverse lighting for photography[C]//Color and Imaging Conference. Scottsdale, Arizona: International Society for Optical Engineering, 1997: 262-265.
[4] Tocci M D, Kiser C, Tocci N, et al. A versatile HDR video production system[J]. ACM Transactions on Graphics, 2011, 30(4): 1-10.
[5] Barron J T, Malik J. Intrinsic scene properties from a single RGB-D image[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(4): 690-703.
[6] Wu C L, Wilburn B, Matsushita Y, et al. High-quality shape from multi-view stereo and shading under general illumination[C]//CVPR 2011. Colorado Springs, 2011: 969-976.
[7] Liu B, Xu K, Martin R R. Static scene illumination estimation from videos with applications[J]. Journal of Computer Science and Technology, 2017, 32(3): 430-442.
[8] Lombardi S, Nishino K. Reflectance and illumination recovery in the wild[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(1): 129-141.
[9] Maier R, Kim K, Cremers D, et al. Intrinsic3D: high-quality 3D reconstruction by joint appearance and geometry optimization with spatially-varying lighting[C]//2017 IEEE International Conference on Computer Vision (ICCV). Venice, 2017: 3133-3141.
[10] Zhan F N, Yu Y C, Zhang C G, et al. GMLight: lighting estimation via geometric distribution approximation[J]. IEEE Transactions on Image Processing, 2022, 31: 2268-2278.
[11] Wang G C, Yang Y N, Loy C C, et al. StyleLight: HDR panorama generation for lighting estimation and editing[C]//Computer Vision-ECCV 2022. Cham: Springer Nature Switzerland, 2022: 477-492.
[12] Karimi D M R, Eisenmann J, Hold-Geoffroy Y, et al. EverLight: indoor-outdoor editable HDR lighting estimation[C]//2023 IEEE/CVF International Conference on Computer Vision (ICCV). Paris, 2023: 7386-7395.
[13] Garon M, Sunkavalli K, Hadap S, et al. Fast spatially-varying indoor lighting estimation[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 6901-6910.
[14] Gardner M A, Hold-Geoffroy Y, Sunkavalli K, et al. Deep parametric indoor lighting estimation[C]//2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, 2019: 7175-7183.
[15] Li M T, Guo J, Cui X F, et al. Deep spherical Gaussian illumination estimation for indoor scene[C]//Proceedings of the ACM Multimedia Asia. Beijing, 2019: 1-6.
[16] Heitz E, Dupuy J, Hill S, et al. Real-time polygonal-light shading with linearly transformed cosines[J]. ACM Transactions on Graphics, 2016, 35(4): 1-8.
[17] Gardner M A, Sunkavalli K, Yumer E, et al. Learning to predict indoor illumination from a single image[J]. ACM Transactions on Graphics, 2017, 36(6): 1-14.
[18] LeGendre C, Ma W C, Fyffe G, et al. DeepLight: learning illumination for unconstrained mobile mixed reality[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, 2019: 5911-5921.
[19] Zhan F N, Zhang C G, Yu Y C, et al. EMLight: lighting estimation via spherical distribution approximation[C]//Proceedings of the AAAI Conference on Artificial Intelligence. California: AAAI, 2021: 3287-3295.