Advancements in Multimodal Image Fusion and Deep Learning-based Segmentation Techniques for Gliomas: A Comprehensive Review

Authors

Lirong Chen, School of Electronic Engineering, Tianjin University of Technology and Education, Tianjin 300222, China
Liqiang Wang, School of Electronic Engineering, Tianjin University of Technology and Education, Tianjin 300222, China; Tianjin Engineering Research Center of Fieldbus Control Technology, Tianjin 300202, China
Wei Wang, Xuanwu Hospital of Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing 100053, China

DOI:

https://doi.org/10.53469/wjimt.2025.08(07).13

Keywords:

Deep learning, Glioma, Medical imaging, Multimodal, Segmentation

Abstract

This paper reviews recent developments in deep learning techniques for multimodal image fusion and segmentation of brain tumors. Gliomas, the most common primary malignant tumors of the central nervous system in adults, require accurate image segmentation to support effective diagnosis and treatment. Multimodal image fusion integrates information from different imaging modalities, offering a more comprehensive and precise characterization of tumors. In this review, we introduce the characteristics of gliomas, outline preprocessing and fusion methods for multimodal images, and summarize commonly used deep learning models for glioma segmentation. We also highlight the benefits of integrating attention mechanisms and multiscale features into deep learning architectures. In addition, we discuss current evaluation metrics and publicly available datasets. Finally, we address key challenges such as data management, protection of surrounding organs, and model interpretability, aiming to provide researchers with a valuable reference for future studies in multimodal brain tumor segmentation.

