To address the insufficient robustness and poor reconstruction quality in weakly textured regions that affect indoor 3D reconstruction, a new indoor 3D reconstruction method, M-HashRecon, is proposed based on the neural radiance field (NeRF) principle. The algorithm uses a point cloud selection module to extract key point cloud information and introduces multi-resolution hash encoding to index point cloud features at multiple scales. A residual module is designed to optimize model performance and improve the training efficiency of the deep network. Experiments are carried out on four typical scenes from the ScanNet dataset, and the experimental results and model convergence are analyzed. The results show that the algorithm's F-score, used as the comprehensive metric, is significantly better than that of the comparison algorithms, and that reconstruction accuracy is high and stable across multiple scenes. The conclusions can serve as a reference for the design of subsequent high-precision indoor 3D reconstruction systems.
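The overall structure of the M-HashRecon network is shown in Fig. 1. Viewpoint information is obtained from the rotation matrix R and the translation matrix T. Along each ray direction, the point cloud selection module selects representative points from the point cloud. Depth information d, taken from the depth matrix, guides the sampling so that near-surface points are selected uniformly around the depth value. Multi-resolution hash encoding then indexes the selected points at different resolutions. The encoded points are fed into the following network modules: a signed distance prediction network, which predicts the signed distance (SDF) of each point; a color rendering network, which predicts the color of each point; and a geometric constraint network, which strengthens scene understanding through semantic segmentation. These outputs are combined to generate the final 3D model.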
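The two front-end steps described above, depth-guided sampling near the surface and multi-resolution hash indexing, can be illustrated with a minimal NumPy sketch. It is not the M-HashRecon implementation. The function names, the sampling half-width, the number of levels, the table size, and the hashing primes (borrowed from common spatial-hash encodings) are all assumptions made for illustration. The depth d is assumed to be measured along the ray, and points are assumed to lie in the unit cube. Only the hash slot of the voxel containing each point is computed; the feature lookup and interpolation that a full encoding performs are omitted.

```python
import numpy as np

# Hashing primes commonly used in spatial-hash encodings (an assumption here,
# not taken from the paper).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)


def sample_near_surface(origin, direction, depth, n_samples=8, half_width=0.05):
    """Place n_samples points uniformly on the ray within depth +/- half_width.

    `depth` is assumed to be the distance along the unit-length `direction`.
    """
    t = np.linspace(depth - half_width, depth + half_width, n_samples)
    return origin[None, :] + t[:, None] * direction[None, :]


def hash_indices(points, n_levels=4, base_res=16, growth=2.0, table_size=2**14):
    """Return an (n_levels, n_points) array of hash-table slots.

    Each row corresponds to one grid resolution. Points are assumed to lie
    in the unit cube [0, 1)^3.
    """
    out = np.empty((n_levels, len(points)), dtype=np.int64)
    for level in range(n_levels):
        res = int(base_res * growth ** level)            # grid resolution at this level
        cell = np.floor(points * res).astype(np.uint64)  # integer voxel coordinates
        h = cell * PRIMES                                # wraps modulo 2**64
        h = h[:, 0] ^ h[:, 1] ^ h[:, 2]                  # XOR the three axes together
        out[level] = (h % np.uint64(table_size)).astype(np.int64)
    return out


# Hypothetical ray looking along +z with a measured depth of 0.5.
origin = np.array([0.5, 0.5, 0.0])
direction = np.array([0.0, 0.0, 1.0])
pts = sample_near_surface(origin, direction, depth=0.5)
idx = hash_indices(pts)
print(pts.shape, idx.shape)
```

In a full pipeline, each slot would index a table of learnable feature vectors, and the per-level features would be concatenated before being passed to the SDF and color networks. Restricting samples to a narrow band around d concentrates network capacity near the surface instead of spending it on empty space.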