South-Central Minzu University, a. College of Electronic and Information Engineering; b. Hubei Key Lab of Intelligent Wireless Communication; c. College of Computer Science, Wuhan 430074, China
Continuous super-resolution of remote sensing images is critical for tasks such as multi-scale ground object recognition, change detection and semantic analysis. However, existing methods struggle to balance local detail reconstruction with global semantic consistency under complex background interference and large scale variations. To address this problem, a continuous super-resolution method for remote sensing images based on a cross-scale Transformer with global-local interaction is proposed. First, a multi-scale parameter generator integrated with a Contextual Attention Mechanism (CAM) is designed to selectively enhance local high-frequency features at different scales. Second, a cross-scale Transformer interaction module is built, which leverages the self-attention mechanism to achieve global semantic modeling and local feature fusion. Third, a dual-branch global-local parser is proposed that jointly optimizes coordinate-aware positional encoding and context-dependent semantic decoding to ensure reconstruction accuracy at different scaling factors. Experimental results demonstrate that the proposed method achieves a PSNR gain of 0.17 dB compared to state-of-the-art continuous super-resolution approaches.
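To put the reported PSNR gain in context, the following is a minimal sketch of how PSNR is commonly computed and what a 0.17 dB improvement implies for mean squared error. The abstract does not state the peak value or the evaluation protocol, so the 8-bit peak of 255, the helper names `psnr` and `mse_ratio_for_gain`, and the synthetic pixel data are all assumptions made for illustration only.

```python
import math
import random


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    peak=255.0 assumes 8-bit imagery; this is an assumption, not taken from the paper.
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Ratio MSE(new) / MSE(old) implied by a PSNR gain of gain_db decibels."""
    return 10.0 ** (-gain_db / 10.0)


# Synthetic example (hypothetical data): a flattened "ground truth" image,
# a baseline estimate with Gaussian error, and an improved estimate whose
# error is scaled so that its MSE matches a 0.17 dB PSNR gain.
random.seed(0)
hr = [float(random.randint(0, 255)) for _ in range(64 * 64)]
noise = [random.gauss(0.0, 5.0) for _ in hr]
scale = math.sqrt(mse_ratio_for_gain(0.17))

baseline = [p + n for p, n in zip(hr, noise)]
improved = [p + n * scale for p, n in zip(hr, noise)]

gain = psnr(hr, improved) - psnr(hr, baseline)
print(f"PSNR gain: {gain:.2f} dB, MSE ratio: {mse_ratio_for_gain(0.17):.4f}")
```

Under these assumptions, a 0.17 dB gain corresponds to roughly a 3.8% reduction in mean squared error, since PSNR is logarithmic in the MSE.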