Objective To construct a bone tumor classification model based on feature decoupling and fusion that handles missing modalities and fuses multimodal information to improve classification accuracy.
Methods A decoupling completion module was designed to extract local and global bone tumor image features from the available modalities. These features were decomposed into shared and modality-specific features, which were then used to complete the features of the missing modalities, reducing the completion bias caused by modality differences. Because modality differences also hinder multimodal information fusion, a cross-attention-based fusion module was introduced to strengthen the model's learning of cross-modal information and to fully integrate the modality-specific features, thereby improving the accuracy of bone tumor classification.
Results The model was trained and tested on a bone tumor dataset collected at the Third Affiliated Hospital of Southern Medical University. Across the 7 available modality combinations, the proposed method achieved an average AUC, accuracy, and specificity of 0.766, 0.621, and 0.793, respectively, improvements of 2.6%, 3.5%, and 1.7% over existing methods for handling missing modalities. Performance was best when all modalities were available (AUC 0.837), and the AUC still reached 0.826 with MRI alone.
Conclusion The proposed method effectively handles missing modalities and integrates multimodal information, and shows robust performance in bone tumor classification under various complex missing-modality scenarios.
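The cross-attention idea behind the fusion module described in Methods can be illustrated with a minimal sketch. It shows generic scaled dot-product cross-attention, in which tokens from one modality act as queries over the tokens of another. It is not the authors' implementation: the learned projections, multi-head structure, and feature dimensions are omitted or assumed, and the names `cross_attention`, `mri_tokens`, and `other_tokens` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    Each query token (from one modality) is replaced by a weighted
    average of the value tokens (from another modality); the weights
    come from query-key similarity.
    """
    dim = len(keys[0])
    out_dim = len(values[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(out_dim)
        ])
    return fused


# Hypothetical toy features: 2 MRI tokens and 3 tokens from another modality.
mri_tokens = [[1.0, 0.0], [0.0, 1.0]]
other_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# MRI tokens query the other modality, yielding MRI-aligned fused features.
mri_enriched = cross_attention(mri_tokens, other_tokens, other_tokens)
```

Because the attention weights are non-negative and sum to 1, every fused token is a convex combination of the other modality's tokens. A trained model would normally add learned query, key, and value projections and apply the attention in both directions.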