Research and application of low-loss compression coding algorithms for fine features in medical imaging

Ruiqing WANG ¹,², Kunlun HE ¹,³,⁴, Hua CHEN ⁵, Desen CAO ⁶, Jianan LI ⁷, Jun MA ⁸

Academic Journal of Chinese PLA Medical School (解放军医学院学报), 2025, Vol. 46, Issue (10): 982-987. DOI: 10.12435/j.issn.2095-5227.24113002

Medical Artificial Intelligence

Abstract

Background Current medical image compression techniques primarily optimize for mean squared error (MSE), which does not fully capture human subjective perception of image quality and often fails to preserve the structural features required for clinical diagnosis. Objective To propose a low-loss compression coding algorithm for fine features in medical images that reduces transmission bandwidth without lowering subjective image quality. Methods CT image sequences from 14 orthopedic surgeries at Chinese PLA General Hospital were collected. First, the structural similarity index measure (SSIM) was reformulated around key visual features of medical images (luminance, contrast, and fine texture), with the luminance exponent set to α = 1.15 and the contrast and structure exponents set to β = γ = 0.95. Next, a relationship between SSIM and MSE was derived from a linear distortion model and the law of large numbers. Then 1/SSIM was adopted as the distortion measure, giving an SSIM-based distortion metric suitable for rate-distortion optimization (RDO). On this basis, an SSIM-based RDO framework was built by minimizing the distortion metric under a target bitrate constraint. Finally, the method was implemented on the x264 platform and its rate-distortion performance was compared with that of the standard encoder. Results Compared with the standard x264 encoder, the proposed method achieved average rate-distortion gains of -5.2% under constant quantization parameter (QP) and -4.8% under constant rate factor (RF). In terms of subjective quality, the SSIM between the images before and after encoding remained above 0.95, while the bitrate fell by 372 kbps on average. No increase in encoding time complexity was observed. Conclusion The proposed method preserves the high perceptual quality of medical images while keeping computational complexity under control, offering an improved compression coding scheme for medical image transmission.


Key words

medical imaging / video compression / rate-distortion optimization / subjective quality assessment / structural similarity index measure (SSIM) / perceptual quality / telemedicine

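Cite this article

Ruiqing WANG, Kunlun HE, Hua CHEN, Desen CAO, Jianan LI, Jun MA. Research and application of low-loss compression coding algorithms for fine features in medical imaging[J]. Academic Journal of Chinese PLA Medical School (解放军医学院学报), 2025, 46(10): 982-987. DOI: 10.12435/j.issn.2095-5227.24113002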


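Medical imaging is an essential part of modern medicine and a key basis for diagnosis and treatment [1-2]. Its large data volume, however, strains transmission bandwidth and storage capacity. With limited network resources, achieving efficient compression and transmission while preserving image quality has become a pressing problem [3-4]. Conventional video encoders typically use distortion measures based on mean squared error (MSE); MSE is simple to compute but correlates only weakly with subjective quality as perceived by the human eye. Researchers have therefore proposed metrics that correlate better with perceptual quality, such as the structural similarity index measure (SSIM), the multi-scale structural similarity index measure (MS-SSIM), and video multi-method assessment fusion (VMAF) [5-8]. Among these, SSIM has attracted particular attention because it correlates closely with human visual perception at moderate computational cost. In this study, SSIM replaces MSE as the distortion measure in rate-distortion optimization, with the aim of improving the rate-distortion performance of medical image compression.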

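1 Subjects and methods

1.1 Study subjects

Fourteen representative CT image videos were selected as test sequences from the CT imaging data of patients who underwent remote orthopedic surgery at Chinese PLA General Hospital between October 2023 and October 2024. The study was approved by the Medical Ethics Committee of the Fourth Medical Center of Chinese PLA General Hospital (No. 2023KY147-KS001).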

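1.2 Experimental platform

The experiments were based on the open-source codec software x264 (encoding parameters are listed in Table 1). The test sequences were encoded with both the standard x264 encoder and the x264 encoder modified with SSIM-based rate-distortion optimization, under constant quantization parameter (QP) and constant rate factor (RF) settings.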

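1.3 Evaluation metrics

Peak signal-to-noise ratio (PSNR) and SSIM were used as quality metrics, and the subjective and objective performance of the algorithm was verified by comparing the original and encoded images. Subjective quality refers to the visual impression obtained through human observation and perception; it reflects the viewer's assessment of the overall visual effect of the encoded image and is usually based on a combined judgment of sharpness, detail reproduction, noise, and distortion [9]. Objective quality was measured by the Bjøntegaard delta rate (BD-Rate) under the constant QP and constant RF conditions [10]. BD-Rate is computed from the bitrate of the encoded video together with its PSNR and SSIM; the larger the reduction, the fewer bits are needed to encode the video at the same SSIM, and the better the rate-distortion performance.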
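The BD-Rate comparison can be sketched as follows. This is a minimal illustration of the commonly used Bjøntegaard procedure (a cubic fit of log-bitrate against quality, followed by the mean gap over the shared quality range), not the authors' evaluation script. The function name `bd_rate` and the example rate and quality points are hypothetical.

```python
import numpy as np


def bd_rate(rate_ref, quality_ref, rate_test, quality_test):
    """Average bitrate difference (%) of the test codec relative to the reference.

    Negative values mean the test codec needs fewer bits at equal quality.
    The quality axis can be PSNR (dB) or SSIM; each curve needs at least 4 points.
    """
    log_ref = np.log(np.asarray(rate_ref, dtype=float))
    log_test = np.log(np.asarray(rate_test, dtype=float))
    q_ref = np.asarray(quality_ref, dtype=float)
    q_test = np.asarray(quality_test, dtype=float)

    # Fit log-bitrate as a cubic polynomial of quality for each codec.
    poly_ref = np.polyfit(q_ref, log_ref, 3)
    poly_test = np.polyfit(q_test, log_test, 3)

    # Integrate both fits over the quality interval the two curves share.
    lo = max(q_ref.min(), q_test.min())
    hi = min(q_ref.max(), q_test.max())
    int_ref = np.polyval(np.polyint(poly_ref), hi) - np.polyval(np.polyint(poly_ref), lo)
    int_test = np.polyval(np.polyint(poly_test), hi) - np.polyval(np.polyint(poly_test), lo)

    # Mean log-rate gap, converted back to a percentage.
    avg_log_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0


# Hypothetical rate points (kbps) and PSNR values (dB), for illustration only.
psnr = [32.0, 35.0, 38.0, 41.0]
anchor_kbps = [500.0, 1000.0, 2000.0, 4000.0]
test_kbps = [0.9 * r for r in anchor_kbps]  # 10% fewer bits at every quality level
print(round(bd_rate(anchor_kbps, psnr, test_kbps, psnr), 2))
```

A test curve that spends 10% fewer bits at every quality level should come out at a BD-Rate of about -10%. Passing SSIM values instead of PSNR on the quality axis gives the SSIM-based variant.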

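1.4 Methods

Computing SSIM requires the covariance $\sigma_{xy}$ between the original and reconstructed images, which adds to its computational cost. Inside an encoder the MSE is already readily available, so deriving SSIM from the MSE is the most economical way to keep complexity low. We therefore derived a relationship between SSIM and MSE, took 1/SSIM as the distortion measure to build an SSIM-based distortion metric usable in rate-distortion optimization, and then minimized this metric under a rate constraint to obtain the SSIM-based rate-distortion optimization framework.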

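(1) Relationship between SSIM and MSE. SSIM evaluates an image along three dimensions, namely luminance, contrast, and structure, and is expressed as:

$$\mathrm{SSIM}(x,y) = \left[\frac{2\mu_x\mu_y + C_1}{\mu_x^2+\mu_y^2+C_1}\right]^{\alpha} \cdot \left[\frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2+\sigma_y^2+C_2}\right]^{\beta} \cdot \left[\frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}\right]^{\gamma} \tag{1}$$

where $x$ is the original image, $y$ is the distorted image, $\mu$ is the mean of the pixels in the current region, $\sigma^2$ is the variance of the pixels in the current region, $\sigma_{xy}$ is the covariance between the original and distorted images in the current region, $C_1$, $C_2$, and $C_3$ are constants, and $\alpha$, $\beta$, and $\gamma$ are three weighting exponents [10]. For practical use, SSIM can be simplified to:

$$\mathrm{SSIM} = \frac{2\mu_x\mu_y + C_1}{\mu_x^2+\mu_y^2+C_1} \cdot \frac{2\sigma_{xy}+C_2}{\sigma_x^2+\sigma_y^2+C_2} \tag{2}$$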
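Eq. (1) can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the encoder code: statistics are taken over a whole block instead of a sliding window, the constants follow the common choice $C_1 = (0.01L)^2$, $C_2 = (0.03L)^2$, $C_3 = C_2/2$ with $L$ the pixel range, and the function names are hypothetical. With all exponents equal to 1 and $C_3 = C_2/2$ the product reduces to Eq. (2).

```python
import numpy as np


def ssim_components(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """Return the luminance, contrast and structure terms of Eq. (1).

    Statistics are taken over the whole block rather than a sliding window.
    k1 and k2 are commonly used SSIM defaults (an assumption here).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    c3 = c2 / 2.0

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    sd_x, sd_y = np.sqrt(var_x), np.sqrt(var_y)
    cov_xy = np.mean((x - mu_x) * (y - mu_y))

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    return luminance, contrast, structure


def weighted_ssim(x, y, alpha=1.0, beta=1.0, gamma=1.0, **kwargs):
    """Combine the three terms with the exponents alpha, beta and gamma."""
    lum, con, struct = ssim_components(x, y, **kwargs)
    struct = max(struct, 0.0)  # avoid a negative base under a fractional exponent
    return lum ** alpha * con ** beta * struct ** gamma
```

Keeping the three terms separate makes it easy to try different exponents by passing `alpha`, `beta`, and `gamma` explicitly.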

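In quality assessment of natural images, SSIM is usually applied with $\alpha = \beta = \gamma = 1$ and $C_3 = C_2/2$ [11-12]. In this study, the exponents of the individual components were adjusted to the characteristics of CT images, in particular the factors that determine their diagnostic value. CT images contain abundant fine texture and pronounced edge regions, and lesions are often embedded in large areas of fine texture, standing out through differences in regional contrast and structure. Changes in the overall luminance of a CT image therefore have little effect on diagnostic accuracy, whereas changes in contrast and structural features have a large effect. Accordingly, the luminance exponent $\alpha$ was set to 1.15, and the contrast and structure exponents $\beta$ and $\gamma$ were set to 0.95. The final form of the SSIM metric for CT image quality assessment is:

$$\mathrm{SSIM}(x,y) = \frac{(2\mu_x\mu_y + C_1)^{1.15}\,(2\sigma_{xy}+C_2)^{0.95}}{(\mu_x^2+\mu_y^2+C_1)^{1.15}\,(\sigma_x^2+\sigma_y^2+C_2)^{0.95}} \tag{3}$$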
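For readers checking how Eq. (3) follows from Eq. (1), the step can be written out as below. This is a reconstruction that assumes the usual choice $C_3 = C_2/2$ is retained and uses $\beta = \gamma = 0.95$, so the contrast and structure terms combine into a single factor.

```latex
\begin{align*}
\left[\frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2+\sigma_y^2+C_2}\right]^{0.95}
\left[\frac{\sigma_{xy} + C_2/2}{\sigma_x\sigma_y + C_2/2}\right]^{0.95}
 &= \left[\frac{2\,(\sigma_x\sigma_y + C_2/2)\,(\sigma_{xy} + C_2/2)}
               {(\sigma_x^2+\sigma_y^2+C_2)\,(\sigma_x\sigma_y + C_2/2)}\right]^{0.95} \\
 &= \left[\frac{2\sigma_{xy} + C_2}{\sigma_x^2+\sigma_y^2+C_2}\right]^{0.95}
\end{align*}
```

Multiplying this factor by the luminance term raised to the power 1.15 gives Eq. (3).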

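The relationship between the original and reconstructed frames can be written as $y = x + e$, where $e$ is the residual. The MSE can be expressed as:

$$\mathrm{MSE} = \frac{1}{N}\,\mathrm{SSE} = \frac{1}{N}\sum_i (y_i - x_i)^2 = \frac{1}{N}\sum_i e_i^2 \tag{4}$$

where SSE is the sum of squared errors. Assume that $e$ is a random variable with zero mean and variance $\sigma_e^2$ that is independent of $x$. When $N$ is sufficiently large, the MSE approaches $\sigma_e^2$. In the high-resolution case, these assumptions further give:

$$\mu_x \approx \mu_y \tag{5}$$

$$\sigma_y^2 \approx \sigma_x^2 + \sigma_e^2 \tag{6}$$

$$\sigma_{xy} \approx \sigma_x^2 \tag{7}$$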

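Substituting into Eq. (3) gives the relationship between MSE and SSIM, and between MSE and the reciprocal of SSIM, denoted dSSIM:

$$\mathrm{SSIM} \approx \frac{(2\mu_x\mu_y + C_1)^{1.15}\,(2\sigma_{xy}+C_2)^{0.95}}{(\mu_x^2+\mu_y^2+C_1)^{1.15}\,(\sigma_x^2+\sigma_y^2+C_2)^{0.95}} \approx \frac{(2\sigma_x^2+C_2)^{0.95}}{(2\sigma_x^2+\mathrm{MSE}+C_2)^{0.95}} \tag{8}$$

$$\mathrm{dSSIM} = \frac{1}{\mathrm{SSIM}} \approx 1 + \frac{0.95 \cdot \mathrm{MSE}}{2\sigma_x^2+C_2} \tag{9}$$

By Eq. (5) the luminance term is approximately 1, so its exponent drops out of Eq. (8). Eq. (9) then follows from the first-order expansion $(1+t)^{0.95} \approx 1 + 0.95\,t$ with $t = \mathrm{MSE}/(2\sigma_x^2+C_2) \ll 1$.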
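The quality of the approximation in Eq. (9) can be checked numerically. The sketch below uses synthetic Gaussian data that satisfies the stated assumptions (zero-mean residual, independent of the source, large N); the constants $C_1$ and $C_2$ take commonly used default values, and the function names are hypothetical.

```python
import numpy as np

C1 = (0.01 * 255.0) ** 2  # assumed SSIM constants (common defaults)
C2 = (0.03 * 255.0) ** 2


def dssim_exact(x, y, alpha=1.15, beta=0.95):
    """1/SSIM from Eq. (3), using statistics over the whole signal."""
    mu_x, mu_y = x.mean(), y.mean()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    luminance = (2 * mu_x * mu_y + C1) / (mu_x ** 2 + mu_y ** 2 + C1)
    contrast_structure = (2 * cov_xy + C2) / (x.var() + y.var() + C2)
    return 1.0 / (luminance ** alpha * contrast_structure ** beta)


def dssim_from_mse(x, y, beta=0.95):
    """Eq. (9): first-order approximation of 1/SSIM from the MSE."""
    mse = np.mean((y - x) ** 2)
    return 1.0 + beta * mse / (2 * x.var() + C2)


rng = np.random.default_rng(0)
x = rng.normal(128.0, 40.0, size=100_000)   # synthetic "original" samples
y = x + rng.normal(0.0, 4.0, size=x.shape)  # zero-mean residual, independent of x
print(dssim_exact(x, y), dssim_from_mse(x, y))
```

The two values should agree closely when the MSE is small relative to $2\sigma_x^2 + C_2$; the gap widens for flat regions (small $\sigma_x^2$) or heavy distortion, where the first-order expansion is less accurate.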

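(2) SSIM-based rate-distortion optimization model. The original rate-distortion optimization can be written as:

$$J_{\mathrm{SSE}} = \mathrm{SSE} + \lambda_{\mathrm{SSE}} R = N \cdot \mathrm{MSE} + \lambda_{\mathrm{SSE}} R \tag{10}$$

where $\lambda_{\mathrm{SSE}}$ is the Lagrange multiplier and $R$ is the bitrate [13].

Similarly, SSIM-based rate-distortion optimization can be written as:

$$J_{\mathrm{SSIM}} = N \cdot \mathrm{dSSIM} + \lambda_{\mathrm{SSIM}} R = N\left(1 + \frac{0.95 \cdot \mathrm{MSE}}{2\sigma_x^2+C_2}\right) + \lambda_{\mathrm{SSIM}} R = N + \frac{1}{2\sigma_x^2+C_2}\left[0.95 \cdot \mathrm{SSE} + (2\sigma_x^2+C_2)\,\lambda_{\mathrm{SSIM}} R\right] \tag{11}$$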

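To minimize $J_{\mathrm{SSIM}}$, the optimal $\lambda_{\mathrm{SSIM}}$ must be found. Over the whole encoding process, rate-distortion optimization minimizes the total distortion for a given total bitrate:

$$\min\ \mathrm{SSE} = \sum_i d_i \quad \text{s.t.} \quad R = \sum_i r_i \le R_c \tag{12}$$

where $d_i$ is the SSE of each macroblock and $r_i$ is the number of bits consumed by each macroblock. With the method of Lagrange multipliers this becomes:

$$\min\ J_{\mathrm{SSE}} = \sum_i d_i + \lambda_{\mathrm{SSE}} \sum_i r_i = \sum_i \left(d_i + \lambda_{\mathrm{SSE}}\, r_i\right) \tag{13}$$

Macroblocks are usually assumed to be uncorrelated, so each macroblock can be optimized separately:

$$\min\ d_i + \lambda_{\mathrm{SSE}}\, r_i \tag{14}$$

Taking the partial derivative with respect to $d_i$ gives:

$$\frac{\partial J_{\mathrm{SSE}}}{\partial d_i} = 1 + \lambda_{\mathrm{SSE}} \frac{\partial r_i}{\partial d_i} = 0 \tag{15}$$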

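According to the widely used rate-distortion model:

$$r(d) = N\alpha \log\frac{\sigma^2}{d/N} \tag{16}$$

(here $\alpha$ denotes a parameter of the rate-distortion model, not the luminance exponent of Eq. (1)), substituting into Eq. (15) yields the optimal $d_{\mathrm{opt}}$ and $r_{\mathrm{opt}}$:

$$d_{\mathrm{opt}} = N\alpha\lambda_{\mathrm{SSE}} \tag{17}$$

$$r_{\mathrm{opt}} = N\alpha \log\frac{\sigma_i^2}{\alpha\lambda_{\mathrm{SSE}}} \tag{18}$$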
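The step from Eqs. (15) and (16) to Eqs. (17) and (18) can be spelled out as follows. The derivation assumes the logarithm in Eq. (16) is the natural logarithm and that $\sigma^2$ is the per-macroblock variance $\sigma_i^2$.

```latex
\begin{align*}
\frac{\partial r_i}{\partial d_i}
  &= \frac{\partial}{\partial d_i}\, N\alpha\left[\log\sigma_i^2 + \log N - \log d_i\right]
   = -\frac{N\alpha}{d_i}, \\
1 - \lambda_{\mathrm{SSE}}\,\frac{N\alpha}{d_i} = 0
  &\;\Longrightarrow\; d_{\mathrm{opt}} = N\alpha\lambda_{\mathrm{SSE}}, \\
r_{\mathrm{opt}}
  &= N\alpha \log\frac{\sigma_i^2}{d_{\mathrm{opt}}/N}
   = N\alpha \log\frac{\sigma_i^2}{\alpha\lambda_{\mathrm{SSE}}}.
\end{align*}
```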

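It then follows that:

$$R_{\mathrm{SSE}} = N\alpha \sum_i \log\frac{\sigma_i^2}{\alpha\lambda_{\mathrm{SSE}}} \tag{19}$$

The same derivation for SSIM-based rate-distortion optimization gives:

$$R_{\mathrm{SSIM}} = N\alpha \sum_i \log\frac{0.95 \cdot \sigma_i^2}{\alpha\,(2\sigma_{x_i}^2+C_2)\,\lambda_{\mathrm{SSIM}}} \tag{20}$$

Whichever distortion measure is used, the total bitrate must satisfy $R_{\mathrm{SSIM}} = R_{\mathrm{SSE}}$, which gives:

$$\lambda_{\mathrm{SSIM}} = \lambda_{\mathrm{SSE}} \cdot \exp\left(-\frac{1}{M}\sum_{i=1}^{M} \log\frac{2\sigma_{x_i}^2+C_2}{0.95}\right) \tag{21}$$

where $M$ is the number of macroblocks. For each macroblock:

$$\lambda_i = \frac{(2\sigma_{x_i}^2+C_2)/0.95}{\exp\left(\frac{1}{M}\sum_{j=1}^{M} \log\frac{2\sigma_{x_j}^2+C_2}{0.95}\right)}\,\lambda_{\mathrm{SSE}} \tag{22}$$

Macroblock-level SSIM-based rate-distortion optimization is then achieved by adjusting $\lambda$.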
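The per-macroblock scaling of Eq. (22) can be illustrated with a short standalone sketch. It is not x264 code and does not use any x264 API; the function name `macroblock_lambdas`, the value of $C_2$, and the example variances are assumptions made for illustration.

```python
import numpy as np


def macroblock_lambdas(lambda_sse, source_variances, c2=(0.03 * 255.0) ** 2, beta=0.95):
    """Per-macroblock Lagrange multipliers following Eq. (22).

    source_variances holds the variance of the original pixels of each macroblock.
    c2 is an assumed SSIM constant; beta is the contrast/structure exponent.
    """
    weights = (2.0 * np.asarray(source_variances, dtype=float) + c2) / beta
    # exp(mean(log(.))) is the geometric mean of the weights over all M macroblocks.
    geometric_mean = np.exp(np.mean(np.log(weights)))
    return lambda_sse * weights / geometric_mean


# Hypothetical variances for four macroblocks, from flat to heavily textured.
print(macroblock_lambdas(lambda_sse=100.0, source_variances=[5.0, 50.0, 400.0, 2000.0]))
```

Low-variance macroblocks receive a multiplier below $\lambda_{\mathrm{SSE}}$, so rate is penalized less and they are given more bits, while highly textured macroblocks receive a larger multiplier. Because the weights are normalized by their geometric mean, the ratios $\lambda_i/\lambda_{\mathrm{SSE}}$ have a geometric mean of 1, in line with the equal-bitrate condition behind Eq. (21).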

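2 Results

2.1 Objective evaluation

The experimental results are shown in Table 2. Compared with the standard x264 algorithm, the proposed algorithm showed a slight increase in PSNR-based BD-Rate on some sequences but achieved coding gains in SSIM-based BD-Rate on all sequences. Under constant QP, the test sequences achieved a maximum gain of -10.3% and an average gain of -5.2% in SSIM-based BD-Rate. Under constant RF, the coding gains followed a similar trend, with a maximum of -7.8% and an average of -4.8% in SSIM-based BD-Rate. Figure 1 shows the rate-distortion curves of the proposed and reference algorithms for the F444 sequence; the proposed algorithm outperformed the reference algorithm over the entire tested bitrate range.

2.2 Subjective evaluation

The subjective quality of the test sequences is shown in Figures 2 to 9, where encoding results are presented for representative picture order counts (POC) selected from the test sequences. The proposed method produced a marked reduction in bitrate with almost no effect on subjective quality.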

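3 Discussion

Medical images usually contain large amounts of high-resolution raw data and have large file sizes, which makes sharing and transmitting data between medical institutions and expert teams challenging [14-16]. In remote areas and developing countries in particular, network bandwidth is limited and unstable, which compounds the difficulty of transmitting high-quality medical images. Because the diagnostic value of medical images depends heavily on image detail and structural integrity, compression must preserve image quality while also accounting for bandwidth limits and transmission efficiency, which places higher demands on the overall performance of compression algorithms [17-19].

This study proposed an SSIM-based rate-distortion optimization method aimed at better preserving fine features and diagnostically relevant features in medical image compression. The experimental results show that using SSIM as the optimization objective improves the subjective quality of the images without a notable increase in computational complexity. Compared with the conventional MSE-optimized x264 encoder, the proposed algorithm achieved an average bitrate saving of 5.2% under constant quantization parameter (QP=32) and a bitrate reduction of about 4.8% under constant rate factor (RF=22). In addition, the encoded images were subjectively very similar to the originals (SSIM above 0.95 in all cases), and the average transmitted bitrate was 372 kbps lower than with the conventional algorithm.

As video coding technology advances, recent standards such as High Efficiency Video Coding (HEVC), the next-generation standard Versatile Video Coding (VVC), and AV1 have shown better compression efficiency and visual quality in medical image and video compression [20-22]. These studies indicate that VVC outperforms both HEVC and AV1 in PSNR gain and bitrate saving, and that AV1 outperforms HEVC at higher video resolutions, whereas HEVC outperforms the AV1 encoder at the effective resolutions used in ultrasound video datasets. Owing to limits on experimental resources and environment, this study did not compare the method directly with these three advanced encoders. Nevertheless, because the method builds on a conventional coding framework and adds an SSIM-based perceptual optimization strategy, it balances algorithmic complexity against coding efficiency and is feasible and readily deployable in resource-constrained practical settings. Future work will bring in the HEVC, VVC, and AV1 coding frameworks.

This study also has some limitations. First, the experiments were conducted only on a specific dataset; future studies should verify the generality of the method on larger and more diverse datasets. Second, although SSIM performs well in subjective quality assessment, some types of medical images may require additional quality metrics for a more complete assessment. Finally, future studies could introduce the SSIM-based optimization into advanced coding standards such as HEVC, VVC, and AV1 to further validate and improve its performance [23-24].

In summary, rate-distortion optimization based on the structural similarity index offers an innovative and efficient solution for medical image compression coding. Compared with conventional methods that rely solely on objective metrics such as peak signal-to-noise ratio, SSIM-based optimization is closer to human visual perception and preserves the structural and detail information of images more accurately, thereby improving the subjective quality and diagnostic value of compressed images. While maintaining compression efficiency, the method reduces the loss of important medical features and the interference of reconstruction artifacts with diagnosis. It also opens a new path for the storage, transmission, and clinical diagnostic use of medical images, with good application value and broad research prospects.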

Data sharing statement The data for this article are available from the corresponding author. Email: majun@butel.com.

References


[1]	Ali Abumalloh R, Nilashi M, Yousoof Ismail M, et al. Medical image processing and COVID-19: A literature review and bibliometric analysis[J]. J Infect Public Health, 2022, 15(1): 75-93.

[2]	Tay YX, Kothan S, Kada S, et al. Challenges and optimization strategies in medical imaging service delivery during COVID-19[J]. World J Radiol, 2021, 13(5): 102-121.

[3]	Alenoghena CO, Ohize HO, Adejo AO, et al. Telemedicine: a survey of telecommunication technologies, developments, and challenges[J]. J Sens Actuator Networks, 2023, 12(2): 20.

[4]	Wang YL, Xiong HL, Sun KC, et al. Toward general text-guided multimodal brain MRI synthesis for diagnosis and medical image analysis[J]. Cell Rep Med, 2025, 6(6): 102182.

[5]	Chikkerur S, Sundaram V, Reisslein M, et al. Objective video quality assessment methods: a classification, review, and performance comparison[J]. IEEE Trans Broadcast, 2011, 57(2): 165-182.

[6]	Luu HM, van Walsum T, Franklin D, et al. Efficiently compressing 3D medical images for teleinterventions via CNNs and anisotropic diffusion[J]. Med Phys, 2021, 48(6): 2877-2890.

[7]	Usman MA, Martini MG. On the suitability of VMAF for quality assessment of medical videos: Medical ultrasound & wireless capsule endoscopy[J]. Comput Biol Med, 2019, 113: 103383.

[8]	Tan HL, Li ZG, Tan YH, et al. A perceptually relevant MSE-based image quality metric[J]. IEEE Trans Image Process, 2013, 22(11): 4447-4459.

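[9]	Han GC, Li WS, Wang GF, et al. Image quality assessment for multimodal medical image fusion[J]. Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2024, 36(3): 591-600. (in Chinese)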

[10]	Venugopal G, Muller K, Pfaff J, et al. Region-based template matching prediction for intra coding[J]. IEEE Trans Image Process, 2023, 32: 779-790.

[11]	Wang Z, Bovik AC, Sheikh HR, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Trans Image Process, 2004, 13(4): 600-612.

[12]	Zhang XX, Sisniega A, Zbijewski WB, et al. Combining physics-based models with deep learning image synthesis and uncertainty in intraoperative cone-beam CT of the brain[J]. Med Phys, 2023, 50(5): 2607-2624.

[13]	Harshalatha Y, Biswas PK. Rate distortion optimization using SSIM for 3D video coding[C]//2016 23rd International Conference on Pattern Recognition (ICPR). December 4-8, 2016, Cancun, Mexico. IEEE, 2017: 1261-1266.

[14]	Sho M. Properties of the SSIM metric in medical image assessment: correspondence between measurements and the spatial frequency spectrum[J]. Phys Eng Sci Med, 2023, 46(3): 1131-1141.

[15]	Pareek PK, Sridhar C, Kalidoss R, et al. IntOPMICM: intelligent medical image size reduction model[J]. J Healthc Eng, 2022, 2022: 5171016.

[16]	Pourasad Y, Cavallaro F. A novel image processing approach to enhancement and compression of X-ray images[J]. Int J Environ Res Public Health, 2021, 18(13): 6724.

[17]	Panayides AS, Amini A, Filipovic ND, et al. AI in medical imaging informatics: current challenges and future directions[J]. IEEE J Biomed Health Inform, 2020, 24(7): 1837-1857.

[18]	Kesner A, Laforest R, Otazo R, et al. Medical imaging data in the digital innovation age[J]. Med Phys, 2018, 45(4): e40-e52.

[19]	Liu F, Hernandez-Cabronero M, Sanchez V, et al. The current role of image compression standards in medical imaging[J]. Information, 2017, 8(4): 131.

[20]	Panayides AS, Pattichis MS, Pantziaris M, et al. The battle of the video codecs in the healthcare domain - a comparative performance evaluation study leveraging VVC and AV1[J]. IEEE Access, 2020, 8: 11469-11481.

[21]	Wang Y, Tohidypour HR, Pourazad MT, et al. Comparison of modern compression standards on medical images for telehealth applications[C]//2023 IEEE International Conference on Consumer Electronics (ICCE). January 6-8, 2023, Las Vegas, NV, USA. IEEE, 2023: 1-6.

[22]	Bui V, Chang LC, Li DL, et al. Comparison of lossless video and image compression codecs for medical computed tomography datasets[C]//2016 IEEE International Conference on Big Data (Big Data). December 5-8, 2016, Washington, DC, USA. IEEE, 2017: 3960-3962.

[23]	Pambrun JF, Noumeir R. Limitations of the SSIM quality metric in the context of diagnostic imaging[C]//2015 IEEE International Conference on Image Processing (ICIP). September 27-30, 2015, Quebec City, QC, Canada. IEEE, 2015: 2960-2963.

[24]	Mudeng V, Kim M, Choe SW. Prospects of structural similarity index for medical image analysis[J]. Appl Sci, 2022, 12(8): 3754.