CN112084901A

CN112084901A - Automatic detection method and system of airport runway area in high resolution SAR image based on GCAM

Info

Publication number: CN112084901A
Application number: CN202010871235.3A
Authority: CN
Inventors: 陈立福; 谭思雨; 潘舟浩; 邢进; 李振洪; 袁志辉; 邢学敏
Original assignee: Changsha University of Science and Technology
Current assignee: Changsha University of Science and Technology
Priority date: 2020-08-26
Filing date: 2020-08-26
Publication date: 2020-12-15
Anticipated expiration: 2040-08-26
Also published as: CN112084901B

Abstract

The invention discloses a GCAM-based method and system for automatically detecting airport runway areas in high-resolution SAR images. The method comprises: downsampling a high-resolution SAR image to generate a medium-resolution image; inputting the medium-resolution image into the geospatial contextual attention mechanism network (GCAM) to extract the runway area; and applying coordinate mapping to the extracted runway area to obtain the final runway-area detection result for the high-resolution SAR image. Experiments show that, compared with the DeepLabv3+, RefineNet and MDDA networks, the method achieves higher accuracy in less time, fully learns the geospatial information of airports in SAR images, and enables high-precision, fast and automatic extraction of airport runway areas from high-resolution SAR images.

Description

GCAM-based method and system for automatic detection of airport runway areas in high-resolution SAR images

Technical field

The invention relates to automatic detection of airport runway areas, and in particular to a GCAM-based method and system for automatic detection of airport runway areas in high-resolution SAR images.

Background

Airports are important transportation hubs and military facilities, and detecting airport targets in synthetic aperture radar (SAR) images has become an important application. SAR offers all-day, all-weather imaging and can penetrate cloud and fog, but SAR images are harder to read and more complex to interpret than optical images, so most airport detection work has been based on optical remote sensing imagery. As SAR image resolution rises and data volumes grow, research on extracting airports from SAR images has increased in recent years and is steadily deepening. Traditional airport extraction methods are time-consuming and labor-intensive, and most of them work well only on optical imagery and perform poorly on SAR images. Automatic and fast extraction of airport runway areas from high-resolution SAR images is therefore of far-reaching and urgent practical significance. In addition, using the airport runway area as a mask for aircraft detection can greatly reduce false alarms and improve aircraft detection accuracy.

Airport detection is widely used in terminal navigation, accident search and rescue, and aircraft positioning. The runway area is one of the main components of an airport, and most research on airport detection has been carried out on optical remote sensing imagery. Some prior art detects airports with the traditional approach of extracting airport edge line segments, but line-segment extraction requires the airport to have pronounced linear features, so it is unsuitable for large civil airports with many terminal buildings and weak runway linearity. Other schemes use a sparse-reconstruction saliency model (SRS) and a target-aware active contour model (TAACM) for airport detection, which strengthens the extraction of airport details. Still others combine a visual saliency analysis model, a bidirectional complementary saliency analysis module and a saliency active contour model (SOACM) to extract airport contours, an approach applicable to most optical remote sensing images. SAR has strong penetration capability, can operate without interference and captures rich ground-object information, and these advantages have gradually made SAR images a subject of airport detection experiments. Some schemes combine traditional line-segment grouping with a saliency analysis model to detect airports in small SAR images, but this approach is not suitable for large SAR images. Other schemes propose a PolSAR airport runway detection algorithm that combines optimized polarimetric features with a random forest, but it can effectively extract only parallel runway features.

In recent years, deep learning has achieved very good results in semantic segmentation, a deep learning approach that learns features at the pixel level in order to partition an image into classes. Airport detection requires extracting all airport features, which is consistent with the idea of semantic segmentation, so methods combining deep learning with airport detection have begun to appear. For example, one prior-art method combines the YOLO deep learning model with a saliency analysis model; another combines the Goole-LF network with a support vector machine (SVM); another combines the Faster-CNN network with spatial analysis to extract airports; and another builds an end-to-end deep transferable convolutional network to detect airports. All of these, however, apply deep learning to optical remote sensing imagery, and because airport sample data are scarce, deep learning models tend to overfit during training. For airport extraction from high-resolution SAR images, one prior-art work proposes MDDA (multi-level and densely dual attention), a deep learning network for runway areas in high-resolution SAR images, which achieves high-precision airport extraction but requires a large dataset and a long training time. A deep learning method that suits small-sample datasets and extracts airports efficiently is therefore of real practical value.

Deep learning networks are developing very rapidly, and the DeepLab series performs particularly well in semantic segmentation. DeepLabv1, proposed in 2014, was the first to adopt atrous convolution (Atrous Conv), which addressed the signal downsampling and spatial invariance problems that traditional CNNs face in pixel labeling, and it used a conditional random field (CRF) to improve the model's ability to capture fine details; DeepLabv1 took second place in the PASCAL semantic segmentation challenge. DeepLabv2, proposed in 2016, added the ASPP (atrous spatial pyramid pooling) module to capture contextual semantic information at multiple scales and replaced the VGG-16 backbone with ResNet, overcoming the loss of feature resolution caused by pooling in traditional CNNs. DeepLabv3 appeared in 2017 and improved ASPP for better network performance. In 2018, DeepLabv3+ introduced an encoder-decoder structure that uses DeepLabv3 as the encoder, adds a simple and effective decoder block, and incorporates depthwise separable convolution into the backbone, effectively reducing computation and parameter count while maintaining performance.

Therefore, given the problems in airport extraction from SAR images, how to use deep learning to achieve high-precision, fast and automatic extraction of airport runway areas from high-resolution SAR images has become a key technical problem in urgent need of a solution.

Summary of the invention

Technical problem to be solved by the present invention: in view of the above problems of the prior art, a GCAM-based method and system for automatic detection of airport runway areas in high-resolution SAR images are provided. The present invention can fully learn the geospatial information of airports in SAR images and achieve high-precision, fast and automatic extraction of airport runway areas from high-resolution SAR images.

To solve the above technical problem, the present invention adopts the following technical solution:

A GCAM-based method for automatic detection of airport runway areas in high-resolution SAR images, comprising:

1) downsampling the high-resolution SAR image to generate a medium-resolution image;

2) inputting the medium-resolution image into the geospatial contextual attention mechanism network GCAM to extract the runway area;

3) applying coordinate mapping to the extracted runway area to obtain the final detection result for the high-resolution SAR image.

Optionally, downsampling the high-resolution SAR image in step 1) specifically means applying 5× downsampling to the SAR image by pixel-value decimation.
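As an illustration of steps 1) and 3), the following sketch decimates an image by a factor of 5 and maps a medium-resolution mask back onto the original pixel grid. It is a minimal sketch under stated assumptions: the function names `decimate` and `map_mask_to_full` are hypothetical, the thresholded mask merely stands in for the GCAM output, and nearest-neighbour index mapping is only one plausible reading of the coordinate mapping, which the text does not specify in detail.

```python
import numpy as np

def decimate(image: np.ndarray, factor: int = 5) -> np.ndarray:
    """Pixel-value decimation: keep every `factor`-th pixel on both axes."""
    return image[::factor, ::factor]

def map_mask_to_full(mask: np.ndarray, full_shape, factor: int = 5) -> np.ndarray:
    """Assumed coordinate mapping: full-resolution pixel (r, c) takes the
    label of medium-resolution pixel (r // factor, c // factor)."""
    rows = np.arange(full_shape[0]) // factor
    cols = np.arange(full_shape[1]) // factor
    return mask[np.ix_(rows, cols)]

# Toy "high-resolution" image; a real SAR scene would be far larger.
high_res = np.arange(12 * 17, dtype=np.float32).reshape(12, 17)
medium = decimate(high_res)                      # shape (3, 4)
mask = (medium > 50).astype(np.uint8)            # placeholder for the GCAM output
full_mask = map_mask_to_full(mask, high_res.shape)
```

Decimation without prior filtering is cheap, which matches the stated goal of fast processing; whether any smoothing precedes it is not stated in the text.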

Optionally, the geospatial contextual attention mechanism network GCAM comprises an encoding block and a decoding block. The encoding block comprises a residual network ResNet, a multi-scale squeeze pyramid MSP and an edge refinement module EDM. The residual network ResNet extracts preliminary features from the input dataset; the multi-scale squeeze pyramid MSP applies different pooling and convolution operations to the preliminary features at different resolutions to obtain global context information; the edge refinement module EDM strengthens the network's edge extraction capability on the preliminary features; and the outputs of the MSP and the EDM are fused to obtain multi-level features. The decoding block combines the preliminary features and the multi-level features to perform semantic segmentation of the airport runway area and thereby extract the runway area.

Optionally, the residual network ResNet is an improved residual network obtained from ResNet_101 by replacing ordinary two-dimensional convolutions with atrous convolutions with dilation rates of 2, 4, 8 and 16.

Optionally, the multi-scale squeeze pyramid MSP comprises a multi-receptive-field parallel pooling layer and an effective attention module eSE. The multi-receptive-field parallel pooling layer is built in parallel from one 1×1 convolution with a dilation rate of 1, three 3×3 convolutions with dilation rates of 6, 12 and 18, a global average pooling module GAP and a stripe pooling module SP. For an input two-dimensional feature tensor of size H×W, the stripe pooling module SP pools in the horizontal direction with a band-shaped window H×1 and in the vertical direction with a band-shaped window 1×W, averaging the element values inside each pooling kernel to obtain the horizontal and vertical stripe pooling outputs. A one-dimensional convolution is then applied to each output to expand it in the left-right and up-down directions respectively, so that the two expanded feature maps have the same size; the two expanded feature maps are fused, and finally the original data are multiplied by the Sigmoid of the fused result to give the final H×W two-dimensional feature tensor output. For an input feature map X_i, the effective attention module eSE first learns a feature F_avg by global average pooling, passes F_avg through a fully connected layer to obtain the weight matrix W_C, rescales W_C with the Sigmoid function to obtain the channel attention feature A_eSE, applies A_eSE to the input feature map X_i to obtain the refined feature map X_refine, and finally re-screens the features of X_refine to obtain global context information.

Optionally, the edge refinement module EDM comprises a global convolution module GCB, which strengthens the connection between the feature maps and the pixel classification layer and the ability to handle feature maps of different resolutions so as to obtain global information, and an edge refinement module BR, which uses this global information to improve the encoding block's edge extraction capability. The global convolution module GCB comprises a k×k large convolution kernel and a feature combination module. The k×k large convolution kernel has two branches: one consists of a k×1×c×c convolution and a 1×k×c×c convolution, and the other of a 1×k×c×c convolution and a k×1×c×c convolution, where c is the number of channels. The outputs of the two branches are fed together into the feature combination module to obtain the feature Sum_{W×H×C}. The edge refinement module BR passes the feature Sum_{W×H×C} through a small convolution kernel, an activation function and another small convolution kernel in sequence, and adds the result to the original feature Sum_{W×H×C}, finally yielding a feature map with refined runway-area edges.
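The two-branch large kernel and the residual refinement described above can be sketched for a single channel (c = 1) as follows. This is an illustrative sketch only: the helper names `conv2d_same`, `gcb`, `br` and `weight_counts` are hypothetical, applying the two convolutions of each branch one after the other and combining the branches by addition are assumptions suggested by the name Sum, ReLU is assumed for the unspecified activation function, and all kernel values are placeholders rather than learned weights.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Single-channel 2-D correlation with zero padding (odd kernel sizes)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def gcb(x, col_a, row_a, row_b, col_b):
    """Two branches (k×1 then 1×k, and 1×k then k×1), combined by addition."""
    branch_a = conv2d_same(conv2d_same(x, col_a), row_a)
    branch_b = conv2d_same(conv2d_same(x, row_b), col_b)
    return branch_a + branch_b

def br(x, k1, k2):
    """Small conv -> activation (ReLU assumed) -> small conv, added to the input."""
    hidden = np.maximum(conv2d_same(x, k1), 0.0)
    return x + conv2d_same(hidden, k2)

def weight_counts(k, c):
    """Weights of a dense k×k convolution versus the two separable branches."""
    return {"dense_kxk": k * k * c * c, "gcb_two_branches": 4 * k * c * c}

k = 3
x = np.ones((5, 6))
col, row = np.ones((k, 1)), np.ones((1, k))
summed = gcb(x, col, row, row, col)
refined = br(summed, np.zeros((3, 3)), np.zeros((3, 3)))  # zero kernels: identity
```

The weight count shows why the factorised form is attractive for large k: each branch costs 2kc² weights, so the two branches together grow linearly in k rather than quadratically.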

Optionally, the decoding block reduces the dimensionality of the encoding block's output features with a 1×1 convolution, decodes edge information from the feature map with refined runway-area edges obtained by the edge refinement module EDM, and applies 4× bilinear upsampling. The preliminary features output by the residual network ResNet are reduced in dimensionality by a 1×1 convolution and concatenated with the 4× upsampled result; a 3×3 convolution is applied to the concatenated features to refine them; and a final simple 4× bilinear upsampling produces the segmentation result.

In addition, the present invention provides a GCAM-based system for automatic detection of airport runway areas in high-resolution SAR images, comprising:

a downsampling program unit for downsampling the high-resolution SAR image to generate a medium-resolution image;

a runway area extraction program unit for inputting the medium-resolution image into the geospatial contextual attention mechanism network GCAM to extract the runway area;

a coordinate mapping program unit for applying coordinate mapping to the extracted runway area to obtain the final detection result.

In addition, the present invention provides a GCAM-based system for automatic detection of airport runway areas in high-resolution SAR images, comprising a computer device that includes an interconnected microprocessor and memory, wherein the microprocessor is programmed or configured to execute the steps of the GCAM-based method for automatic detection of airport runway areas in high-resolution SAR images, or the memory stores a computer program programmed or configured to execute that method.

In addition, the present invention provides a computer-readable storage medium storing a computer program programmed or configured to execute the GCAM-based method for automatic detection of airport runway areas in high-resolution SAR images.

Compared with the prior art, the present invention has the following advantages: it downsamples a high-resolution SAR image to generate a medium-resolution image, inputs the medium-resolution image into the geospatial contextual attention mechanism network GCAM to extract the runway area, and applies coordinate mapping to the extracted runway area to obtain the final detection result for the high-resolution SAR image. By combining deep learning with runway area extraction from SAR images, it can fully learn the geospatial information of airports in SAR images and achieve high-precision, fast and automatic extraction of airport runway areas from high-resolution SAR images.

Description of the drawings

FIG. 1 is a schematic diagram of the basic principle of the method according to an embodiment of the present invention.

FIG. 2 is a schematic structural diagram of the improved residual network in an embodiment of the present invention.

FIG. 3 is a schematic structural diagram of the stripe pooling module SP in an embodiment of the present invention.

FIG. 4 is a schematic structural diagram of the effective attention module eSE in an embodiment of the present invention.

FIG. 5 is a schematic structural diagram of the global convolution module GCB and the edge refinement module BR in an embodiment of the present invention.

FIG. 6 shows the SAR image, label and optical remote sensing image of an airport sample in an embodiment of the present invention.

FIG. 7 shows the runway extraction result for Airport I in an embodiment of the present invention.

FIG. 8 shows the runway extraction result for Airport II in an embodiment of the present invention.

FIG. 9 shows the runway extraction result for Airport III in an embodiment of the present invention.

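Detailed description

As shown in FIG. 1, the GCAM-based method for automatic detection of airport runway areas in high-resolution SAR images in this embodiment comprises: 1) downsampling the high-resolution SAR image to generate a medium-resolution image; 2) inputting the medium-resolution image into the geospatial contextual attention mechanism network GCAM to extract the runway area; and 3) applying coordinate mapping to the extracted runway area to obtain the final detection result for the high-resolution SAR image.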

In this embodiment, downsampling the high-resolution SAR image in step 1) specifically means applying 5× downsampling to the SAR image by pixel-value decimation. Two kinds of data are downsampled: the sample images of the dataset, and the three high-resolution test SAR images, which become medium-resolution SAR images after downsampling.

To extract airport runway areas from SAR images quickly, this embodiment proposes a geospatial contextual attention mechanism network, GCAM (Geospatial Contextual Attention Mechanism). As shown in FIG. 2, GCAM comprises an encoding block and a decoding block. The encoding block comprises a residual network ResNet, a multi-scale squeeze pyramid MSP (Multi-scale Squeeze Pyramid) and an edge refinement module EDM (Edge Detection Module). The residual network ResNet extracts preliminary features from the input dataset; the MSP applies different pooling and convolution operations to the preliminary features at different resolutions to obtain global context information; the EDM strengthens the network's edge extraction capability on the preliminary features; and the outputs of the MSP and the EDM are fused to obtain multi-level features. The decoding block combines the preliminary features and the multi-level features to perform semantic segmentation of the airport runway area and thereby extract the runway area. In operation, the encoding block first uses the residual network ResNet to extract preliminary features from the input dataset. The MSP and the EDM each re-extract features from these preliminary features, and the results are fused: the MSP obtains global context information through different pooling and convolution operations at different resolutions, while the EDM strengthens the network's edge extraction capability and further fuses multi-level features. The decoding block uses edge-refinement decoding; one part of it receives the multi-level high-level features from the encoding block and another part receives the preliminary features from the residual network ResNet, which achieves semantic segmentation of the airport runway area.

The residual network ResNet is the backbone of GCAM. With its skip connections and residual optimization, its structure speeds up training and improves model accuracy, which makes it well suited to building semantic segmentation networks. To address the tendency of pooling operations to lose detail features, as shown in FIG. 2, the residual network ResNet used in this embodiment is an improved residual network obtained from ResNet_101 by replacing ordinary two-dimensional convolutions with atrous convolutions with dilation rates of 2, 4, 8 and 16. Atrous convolution mitigates the loss of detail features, adds no extra parameters to the residual network ResNet, and lets later convolutional layers keep larger feature maps, which benefits the detection of target pixels and improves overall model performance. With atrous convolution, for any position j in the image, applying the filter ω(k) to the input feature x[j + r·k] gives the output y(j):

$$y(j) = \sum_{k} x[j + r \cdot k]\,\omega(k)$$

Here the rate r introduces r−1 zeros between the sampling points of the filter, effectively enlarging the receptive field of a k×k kernel to k + (k−1)(r−1) along each dimension without increasing the parameter count or the computation. FIG. 2 shows the modified part of the improved residual network. The last block of ResNet_101 is copied four times and the copies are arranged in parallel. Simply running the blocks in parallel, however, does not help the network capture deep semantic information: features become concentrated in the small feature maps of the last few layers, and consecutive strided convolutions are detrimental to semantic segmentation. This embodiment therefore replaces ordinary two-dimensional convolutions with atrous convolutions with dilation rates of 2, 4, 8 and 16, which improves the final output stride. Adding atrous convolution changes the resolution of some feature maps, so the final output of ResNet_101 contains not only high-dimensional low-resolution feature maps but also some low-dimensional high-resolution features, achieving thorough extraction of multi-scale features.
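The receptive-field arithmetic and the atrous sum can be checked with a one-dimensional sketch. The function names `effective_kernel_size` and `atrous_conv1d` are hypothetical, the filter values are placeholders, and only positions where the dilated filter fits entirely inside the input are computed, which is a simplification relative to the padded two-dimensional convolutions of a real network.

```python
import numpy as np

def effective_kernel_size(k: int, r: int) -> int:
    """Extent covered by a k-tap filter at dilation rate r: k + (k-1)(r-1)."""
    return k + (k - 1) * (r - 1)

def atrous_conv1d(x, w, r):
    """y[j] = sum_k x[j + r*k] * w[k], for positions where the filter fits."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    k = len(w)
    n_out = len(x) - effective_kernel_size(k, r) + 1
    return np.array([sum(x[j + r * i] * w[i] for i in range(k))
                     for j in range(n_out)])

# A 3-tap filter at the four rates used in the improved residual network.
extents = [effective_kernel_size(3, r) for r in (2, 4, 8, 16)]

signal = np.arange(10)
y = atrous_conv1d(signal, [1.0, 1.0, 1.0], r=2)  # three weights at any rate
```

The filter keeps three weights at every rate while the covered extent grows from 5 to 33 samples, which is the sense in which the receptive field grows without extra parameters.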

Referring to FIG. 1, the multi-scale squeeze pyramid MSP comprises a multi-receptive-field parallel pooling layer and an effective attention module eSE.

Referring to FIG. 1, the multi-receptive-field parallel pooling layer is built in parallel from one 1×1 convolution with a dilation rate of 1, three 3×3 convolutions with dilation rates of 6, 12 and 18, a global average pooling module GAP and a stripe pooling module SP. For an input two-dimensional feature tensor of size H×W, the stripe pooling module SP pools in the horizontal direction with a band-shaped window H×1 and in the vertical direction with a band-shaped window 1×W, averaging the element values inside each pooling kernel to obtain the horizontal and vertical stripe pooling outputs. A one-dimensional convolution is then applied to each output to expand it in the left-right and up-down directions respectively, so that the two expanded feature maps have the same size; the two expanded feature maps are fused, and finally the original data are multiplied by the Sigmoid of the fused result to give the final H×W two-dimensional feature tensor output. In this embodiment, the feature map produced by the improved residual network contains 256 channels and rich semantic information and is first fed into the multi-receptive-field parallel pooling layer. The four atrous convolutions with different dilation rates effectively capture multi-scale information from different receptive fields; global average pooling downsamples the features to prevent the network from overfitting; stripe pooling captures local information of the features; and together the layer achieves multi-scale feature fusion.

The stripe pooling module SP (Stripe Pooling) overcomes the tendency of ordinary pooling to produce false alarms. As shown in FIG. 3, for an input two-dimensional feature tensor x ∈ R^{H×W}, SP pools in the horizontal and vertical directions with band-shaped windows H×1 and 1×W respectively, averages the element values inside the pooling kernel, and uses that average as the pooling output. The horizontal stripe pooling output y^h ∈ R^H is:

$$y^h_i = \frac{1}{W}\sum_{0 \le j < W} x_{i,j}$$

In the above formula, y^h_i is the i-th element of the horizontal stripe pooling output, and x_{i,j} are the matrix elements inside the pooling kernel.

垂直方向上条纹池化的输出y^v＝R^W为：The output y ^v = R ^W of stripe pooling in the vertical direction is:

上式中，

为是垂直方向上条纹池化对任意矩阵元素的输出，x_i,j为池化核内所有的矩阵元素。In the above formula,

is the output of stripe pooling for any matrix element in the vertical direction, and x _i,j are all matrix elements in the pooling kernel.

After pooling with the H×1 and 1×W kernels, two one-dimensional convolutions expand the outputs in the left-right and up-down directions. The two expanded feature maps have the same size and are fused; the fused result is passed through a Sigmoid function and multiplied with the original data to produce the output. In the horizontal and vertical stripe pooling layers, dependencies are easily established between discretely distributed pixel regions and band-shaped pixel regions. Because the kernel is long along one dimension and narrow along the other, it also captures the local information of the features easily. These properties make stripe pooling preferable to average pooling with a square kernel.
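The data flow of the stripe pooling module can be sketched in a few lines of plain Python. This is a simplified illustration under two assumptions that are not taken from the patent: the two one-dimensional convolutions are replaced by the identity, and the fusion of the two expanded maps is an element-wise addition.

```python
import math


def strip_pooling(x):
    """Gate an H x W feature map (list of lists) with its row and column averages.

    Assumptions for illustration only: the 1-D convolutions are omitted
    (identity) and the two expanded maps are fused by addition.
    """
    h, w = len(x), len(x[0])
    # Horizontal stripe pooling: one average per row (H values).
    y_h = [sum(row) / w for row in x]
    # Vertical stripe pooling: one average per column (W values).
    y_v = [sum(x[i][j] for i in range(h)) / h for j in range(w)]
    out = []
    for i in range(h):
        out_row = []
        for j in range(w):
            fused = y_h[i] + y_v[j]                # expand both strips to H x W and fuse
            gate = 1.0 / (1.0 + math.exp(-fused))  # Sigmoid
            out_row.append(x[i][j] * gate)         # multiply with the original data
        out.append(out_row)
    return out


feature = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]]
gated = strip_pooling(feature)
```

The output has the same H×W size as the input; each element is scaled by a weight in (0, 1) that depends on the whole row and the whole column it belongs to.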

The effective attention module eSE (effective Squeeze-and-Excitation module) receives the multi-scale features and screens them according to channel information. Referring to Figure 4, for an input feature map $X_i$, eSE first learns the feature $F_{avg}$ through global average pooling, passes $F_{avg}$ through a fully connected (FC) layer to obtain the weight matrix $W_C$, and rescales the result with a Sigmoid function to obtain the channel attention feature $A_{eSE}$. Multiplying $A_{eSE}$ with the input feature map $X_i$ gives the refined feature map $X_{refine}$, which assigns a weight to every pixel of each input $X_i$ and thereby re-screens the features. The fully connected layer and the Sigmoid function together rescale the input feature map so that useful channel information is extracted.

When the input feature map is $X_i \in \mathbb{R}^{C \times W \times H}$, the effective channel attention map $A_{eSE}(X_i) \in \mathbb{R}^{C \times 1 \times 1}$ is computed as follows:

$$A_{eSE}(X_i) = \sigma(W_C(F_{gap}(X_i)))$$

In the above formula, $A_{eSE}(X_i)$ is the channel attention feature extracted from the input feature map $X_i$, $\sigma$ is the Sigmoid function, $W_C$ is the weight matrix, and $F_{gap}(X_i)$ is the feature $F_{avg}$ obtained by global average pooling of the input feature map $X_i$, given by:

$$F_{gap}(X_i) = \frac{1}{WH} \sum_{i,j=1}^{W,H} X_{i,j}$$

In the above formula, $X_{i,j}$ ranges over all elements of the matrix of the feature map $X_i$.

Applying the channel attention feature $A_{eSE}$ to the input feature map $X_i$ gives the refined feature map $X_{refine}$:

$$X_{refine} = A_{eSE}(X_i) \otimes X_i$$

In the above formula, $\otimes$ denotes element-wise multiplication. The input feature map $X_i$ is the multi-scale feature map output by the multi-scale squeeze pyramid MSP. Applying $A_{eSE}(X_i)$ to the multi-scale feature map as channel attention makes the multi-scale features more informative. Finally, the attention weights are applied to the feature map element by element to give the refined feature map $X_{refine}$, which completes the feature re-screening.
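The eSE computation described above (global average pooling, one fully connected layer, Sigmoid, channel-wise rescaling) can be written as a minimal sketch. The function name, the bias-free C×C weight matrix and the example values are assumptions made for illustration.

```python
import math


def ese_attention(x, w_c):
    """x: C x H x W nested lists; w_c: C x C weights of the fully connected layer.

    Returns (attention, refined), where attention holds one weight per channel
    and refined is x with every channel scaled by its weight.
    The FC bias is omitted here for brevity (an assumption).
    """
    c = len(x)
    h, w = len(x[0]), len(x[0][0])
    # F_gap: global average pooling gives one scalar per channel.
    f_gap = [sum(sum(row) for row in channel) / (h * w) for channel in x]
    # W_C: fully connected layer mapping C channel descriptors to C values.
    fc = [sum(w_c[o][i] * f_gap[i] for i in range(c)) for o in range(c)]
    # Sigmoid rescales every channel weight into (0, 1).
    attention = [1.0 / (1.0 + math.exp(-v)) for v in fc]
    # Channel-wise multiplication with the input feature map.
    refined = [[[attention[k] * v for v in row] for row in x[k]] for k in range(c)]
    return attention, refined


features = [[[2.0, 4.0]], [[6.0, 8.0]]]       # C = 2, H = 1, W = 2
weights = [[0.0, 0.0], [0.0, 0.0]]            # zero weights give attention 0.5 per channel
attention, refined = ese_attention(features, weights)
```

With all-zero weights every channel receives the neutral weight 0.5; trained weights would instead raise the weight of informative channels and lower the others.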

As Figure 1 shows, the multi-scale squeeze pyramid MSP and the edge refinement module EDM work in parallel, and both receive the output feature map of the improved residual network. The EDM consists of a global convolution module GCB (Global Convolutional Block) and a boundary refinement module BR (Boundary Refinement). The GCB strengthens the connection between the feature map and the pixel classification layer and improves the handling of feature maps of different resolutions, so that global information is obtained; the BR then improves the edge extraction capability of the coding block on the basis of that global information. The EDM addresses both the pixel classification problem and the pixel localization problem in semantic segmentation: the GCB enlarges the convolution kernel toward the spatial size of the feature map, which keeps the feature map closely connected to the pixel classification layer and strengthens the ability to handle different features, and the BR is then introduced to further improve the edge extraction capability of the network.

As shown in Figure 5, the global convolution module GCB in this embodiment comprises a large k×k convolution kernel and a feature combination module. The large k×k kernel is implemented as two branches: one branch consists of a k×1×c×c convolution followed by a 1×k×c×c convolution, and the other consists of a 1×k×c×c convolution followed by a k×1×c×c convolution, where c is the number of channels. The outputs of the two branches are fed together into the feature combination module to obtain the feature $Sum_{W \times H \times C}$. The boundary refinement module BR processes $Sum_{W \times H \times C}$ through a small convolution kernel, an activation function and another small convolution kernel in sequence, and then adds the result to the original feature $Sum_{W \times H \times C}$, finally giving the feature map with refined runway-area edges.

Referring to Figure 5, the GCB is built entirely from convolutions and makes full use of the multi-channel information of the features. For the pixel classification problem, the GCB uses a large convolution kernel, so that the semantic information associated with each pixel does not change under image transformations (translation, flipping and so on) and the connections between pixels are closer. For the pixel localization problem, the GCB is fully convolutional and, following the principle of matrix decomposition, replaces the large k×k convolution with the combinations 1×k plus k×1 and k×1 plus 1×k. This reduces the number of parameters and the amount of computation while allowing each pixel to be matched to its correct class, giving accurate pixel-level segmentation. Because the GCB contains no batch normalization (BN) layer and no activation function, the boundary refinement module BR with small convolution kernels is introduced to prevent misclassification of pixels at object boundaries, so that both classification and localization are accurate.
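The parameter saving from the decomposition can be checked with simple arithmetic. The sketch below counts weights (biases ignored) for a dense k×k×c×c convolution and for the two-branch GCB layout of four strip convolutions described above. The values k = 7 and c = 256 are hypothetical; the patent does not state k.

```python
def dense_params(k: int, c: int) -> int:
    """Weights of a single dense k x k convolution with c input and c output channels."""
    return k * k * c * c


def gcb_params(k: int, c: int) -> int:
    """Weights of the two GCB branches: (k x 1 then 1 x k) and (1 x k then k x 1)."""
    per_strip_conv = k * c * c
    return 4 * per_strip_conv


k, c = 7, 256  # hypothetical values chosen only for illustration
print(dense_params(k, c), gcb_params(k, c))
```

The ratio between the two counts is 4/k, so the decomposition only pays off for k greater than 4 and saves more as the kernel grows.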

As shown in Figure 1, the decoding block takes two inputs: the output features of the coding block and the preliminary features output by the residual network ResNet. The output features of the coding block first pass through a 1×1 convolution for dimensionality reduction, are then decoded for edge information using the feature map with refined runway-area edges obtained by the EDM, and are upsampled bilinearly by a factor of 4; this reduces the number of feature channels while fully decoding the edge information. The result is concatenated with the corresponding features of the same spatial resolution from the backbone network. Because the backbone features include low-level features, which usually contain a large number of channels, a 1×1 convolution is likewise applied to them to reduce the channel count and avoid unnecessary channel computation. A 3×3 convolution then refines the concatenated features, and a final simple bilinear 4× upsampling yields the segmentation result.
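The two 4× upsampling steps fix the relative spatial sizes inside the decoder. The sketch below traces those sizes under one assumption that the text does not state explicitly: the final segmentation map has the same size as the network input, which implies a coding-block output at 1/16 of the input size and ResNet preliminary features at 1/4.

```python
def decoder_spatial_trace(input_hw, up_first=4, up_last=4):
    """Trace (height, width) through the decoder for a given input size.

    Assumes the final output matches the input size (not stated in the source).
    """
    h, w = input_hw
    total = up_first * up_last
    if h % total or w % total:
        raise ValueError("input size must be divisible by the total upsampling factor")
    encoder_out = (h // total, w // total)        # implied coding-block output size
    after_first_up = (encoder_out[0] * up_first, encoder_out[1] * up_first)
    low_level = (h // up_last, w // up_last)      # size the ResNet features need for concatenation
    final = (after_first_up[0] * up_last, after_first_up[1] * up_last)
    return {
        "encoder_out": encoder_out,
        "after_first_upsample": after_first_up,
        "low_level": low_level,
        "final": final,
    }


trace = decoder_spatial_trace((480, 480))
print(trace)
```

For the 480×480 training crops this gives 30×30 at the coding-block output, 120×120 at the concatenation, and 480×480 at the output.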

Step 3) of this embodiment performs coordinate mapping on the extracted runway area to obtain the final detection result. Because the coordinate mapping is the same as in existing methods, it is not described in detail here. After the geospatial context attention mechanism network GCAM has segmented the airport runway area in the medium-resolution SAR image, the result map is processed by coordinate mapping to obtain the result map for the original high-resolution SAR image. Finally, the result map is overlaid on the original image for visualization, which completes the runway-area extraction from the high-resolution SAR image.
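Since the text leaves the coordinate mapping to existing methods, the following is only one plausible sketch of steps 1) and 3): downsampling by keeping every fifth pixel, and mapping the medium-resolution mask back by giving each high-resolution pixel the label of the medium-resolution pixel it was derived from. Both function names and the nearest-neighbour choice are assumptions, not the patent's stated procedure.

```python
def downsample_by_extraction(image, factor=5):
    """Keep every `factor`-th pixel in both directions (pixel extraction)."""
    return [row[::factor] for row in image[::factor]]


def map_mask_to_high_res(mask, high_h, high_w, factor=5):
    """Assign high-res pixel (r, c) the label of mask[r // factor][c // factor]."""
    mh, mw = len(mask), len(mask[0])
    return [
        [mask[min(r // factor, mh - 1)][min(c // factor, mw - 1)] for c in range(high_w)]
        for r in range(high_h)
    ]


high_res = [[r * 10 + c for c in range(10)] for r in range(10)]   # toy 10 x 10 image
medium = downsample_by_extraction(high_res)                        # 2 x 2
medium_mask = [[1, 0], [0, 1]]                                     # toy segmentation result
full_mask = map_mask_to_high_res(medium_mask, 10, 10)              # back to 10 x 10
```

Each label in the medium-resolution mask ends up covering a 5×5 block of the high-resolution result.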

The GCAM-based high-resolution SAR image airport runway area automatic detection method of this embodiment is verified experimentally below. The experimental environment is as follows: CPU, Intel Xeon Gold 5120; GPU (single), NVIDIA RTX 2080Ti. The dataset consists of SAR images from the Gaofen-3 system. First, 10 airport sample images are downsampled by a factor of 5 using pixel extraction; the LaberImage software is then used for pixel-level annotation with two classes, runway area and background. The 10 downsampled medium-resolution SAR images are cut arbitrarily into images larger than 480×480 to form a small-sample dataset of 466 images, which is split into a training set and a validation set at a ratio of 4:1. Subfigures (a) to (c) of Figure 6 show the SAR image, the label and the optical remote sensing image of one airport sample, respectively. The connected region marked a is the runway area, which includes the runways, taxiways, aprons and aircraft; the remaining separate connected regions are background.

The parameters are set as follows. During network training, the learning rate is 0.00001 and the weight decay coefficient is 0.995. The batch size of the input images is 1; the network is trained for 100 iterations, and the model is saved once every 5 iterations. During training, the input images are randomly cropped with a crop window of 480×480.
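The random cropping step can be sketched as follows. The function is a generic illustration using Python's standard `random` module; its name and signature are assumptions, and the patent does not describe how its own cropping is implemented.

```python
import random


def random_crop(image, crop_h=480, crop_w=480, rng=random):
    """Return a crop_h x crop_w window taken at a uniformly random position.

    image is a list of rows; rng is any object with a randint(a, b) method.
    """
    h, w = len(image), len(image[0])
    if h < crop_h or w < crop_w:
        raise ValueError("image is smaller than the crop window")
    top = rng.randint(0, h - crop_h)     # randint is inclusive at both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


# Toy example with a small window; training would use the default 480 x 480.
toy = [[r * 10 + c for c in range(8)] for r in range(6)]
patch = random_crop(toy, crop_h=4, crop_w=5, rng=random.Random(0))
```

Because the dataset images are all larger than 480×480, every training image admits at least one valid crop position.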

In this embodiment, PA (pixel accuracy) and IOU (intersection over union) are used to measure runway extraction accuracy. PA is the proportion of correctly labeled pixels among all pixels; IOU is the ratio of the intersection to the union of the segmentation result and the label; MPA (mean pixel accuracy) is the proportion of correctly classified pixels computed per class and averaged over the classes; MIOU (mean intersection over union) is the average of the per-class IOU.

Assume there are k+1 classes (including one background class). In these definitions, $P_{ij}$ is the number of pixels that belong to class i but are predicted as class j (false positives), $P_{ji}$ is the number of pixels that belong to class j but are predicted as class i (false negatives), and $P_{ii}$ is the number of pixels of class i that are predicted correctly.
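The four metrics can be computed from a confusion matrix whose entry [i][j] is the pixel count $P_{ij}$. The sketch below uses the commonly used definitions of PA, MPA, IOU and MIOU, which match the verbal descriptions above; whether they coincide symbol for symbol with the patent's own formulas is an assumption. The example counts are invented.

```python
def segmentation_metrics(conf):
    """conf[i][j]: number of pixels of true class i predicted as class j.

    Assumes every class occurs at least once, so no denominator is zero.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    diag = [conf[i][i] for i in range(n)]                            # P_ii
    row_sum = [sum(conf[i]) for i in range(n)]                       # sum_j P_ij
    col_sum = [sum(conf[i][j] for i in range(n)) for j in range(n)]  # sum_j P_ji
    # PA = sum_i P_ii / sum_i sum_j P_ij
    pa = sum(diag) / total
    # MPA = mean over classes of P_ii / sum_j P_ij
    mpa = sum(diag[i] / row_sum[i] for i in range(n)) / n
    # IOU_i = P_ii / (sum_j P_ij + sum_j P_ji - P_ii)
    iou = [diag[i] / (row_sum[i] + col_sum[i] - diag[i]) for i in range(n)]
    miou = sum(iou) / n
    return {"PA": pa, "MPA": mpa, "IOU": iou, "MIOU": miou}


# Invented two-class example: class 0 = background, class 1 = runway area.
confusion = [[50, 10],
             [5, 35]]
metrics = segmentation_metrics(confusion)
```

In this toy example 85 of 100 pixels are correct, so PA is 0.85, while the runway-class IOU is 35 / (40 + 45 − 35) = 0.7, which illustrates why IOU is the stricter of the two measures.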

To verify the efficiency of the method of this embodiment in extracting airport runway areas from SAR images, three groups of comparative experiments were carried out against DeepLabV3+, RefineNet and MDDA. Three airports were tested: Airport I (12000×15000), Airport II (9600×9600) and Airport III (15000×17500); none of them was used in the dataset. MDDA is a deep learning network we proposed earlier for extracting airport runway areas from SAR images, while DeepLabV3+ and RefineNet are mainstream semantic segmentation networks. The dataset used in the experiments is our manually annotated small-sample dataset of 466 images. Because the network outputs downsampled medium-resolution images, the sizes of Airports I, II and III after downsampling are 2400×3000, 2000×2000 and 3000×3500, respectively. Finally, coordinate mapping is applied to the result maps to obtain the result maps at the original, non-downsampled resolution. We analyzed the network training time, the image test time, and the runway-area extraction accuracy before and after sampling.

Figures 7 to 9 show the runway-area extraction results for Airports I, II and III, respectively. In each figure, (a) is the original high-resolution SAR image; (b) is the medium-resolution SAR image after 5× downsampling; (c) is the class annotation, with the runway area in red and the non-runway area (background) in black; (d), (e), (f) and (g) are the extraction results on the medium-resolution SAR image from RefineNet, MDDA, DeepLabV3+ and the method of this embodiment (GCAM), respectively; (h), (i), (j) and (k) are the fusions of results (d), (e), (f) and (g), respectively, with the medium-resolution SAR image (b); and (l)-(o) are the fusions of results (d)-(g), after coordinate mapping, with the original high-resolution image (a). Regions labeled 1 are runway areas; boxes labeled 2 mark false detections, that is, background wrongly detected as runway area; boxes labeled 3 mark missed detections, that is, parts of the runway area that were not detected.

1. Experimental results and analysis for Airport I.

As shown in subfigure (a) of Figure 7, Airport I consists mainly of a large area of long runways and aircraft aprons. The airport holds many aircraft, which appear as distinct bright spots; the background contains clustered residential areas and intricate traffic routes.

The medium-resolution SAR image of Airport I, of size 2400×3000, was tested. As shown in subfigures (d)-(g) of Figure 7, the result of the method of this embodiment is the closest to the label; MDDA extracts some edges of the runway area incompletely; DeepLabV3+ misses small parts of the runway area; and RefineNet performs worst. The main missed-detection boxes are marked in the fused images (h)-(k). The method of this embodiment has no large missed regions, MDDA has 2 main missed regions, DeepLabV3+ has 4 fairly obvious missed regions, and RefineNet has the most false-detection boxes together with many missed edges; the missed regions all lie along the edges of the runway area. Comparing the result of the method of this embodiment (k) with the DeepLabV3+ result (j) shows that adding the edge refinement module EDM strengthens the learning of edge features.

2. Experimental results and analysis for Airport II.

Airport II has simpler features than Airport I. Its runway area consists mainly of long, straight runways. Apart from a small group of buildings near the edge of the airport there are no large residential areas, but there is a considerable amount of water nearby. In SAR imagery, water shows the same dark features as runways, which interferes with the network's ability to tell the two apart.

The medium-resolution SAR image of Airport II, of size 2000×2000, was tested. Figure 8 shows the runway-area extraction results for Airport II. Comparing subfigures (d), (e), (f) and (g) with (c) in Figure 8 shows that the method of this embodiment produces no false alarms and gives the best extraction. Subfigures (h)-(k) of Figure 8 show that the method of this embodiment has only one very small missed-detection box; MDDA has 1 false-detection box and 4 fairly obvious missed-detection boxes; DeepLabV3+ has many false alarms and the most missed-detection boxes; and RefineNet misses several regions along the left edge of Airport II. The extraction quality of the three comparison networks leaves room for improvement. The method of this embodiment thus has the best edge extraction and false-alarm suppression capability, which reflects the advantage of its multi-scale squeeze pyramid MSP.

3. Experimental results and analysis for Airport III.

Airport III has the most complex runway-area structure and surrounding ground objects, with many runways, taxiways, rest stations and aprons. It is a civil airport whose runway area consists mostly of short runways, without large areas of long, straight runway. The SAR features of the surrounding ground objects are mostly gray with bright spots, in clear contrast to the runway area, which lowers the probability of misclassification by the network. However, the edge features of Airport III are complex and contain the most edge information, so the network must learn global semantic information well and decode edge information effectively.

The medium-resolution SAR image of Airport III, of size 3000×3500, was tested. Comparing subfigures (d)-(l) of Figure 9, the method of this embodiment again gives the best extraction, missing only a few small regions; MDDA produces two obvious false alarms; DeepLabV3+ has a large number of missed detections, indicating a weak ability to learn edge information; and RefineNet produces many false alarms and performs worst. This demonstrates the effectiveness of the edge decoding in the method of this embodiment.

To show the efficiency of the method of this embodiment more directly, Table 1 gives the extraction accuracy of the different algorithms on the medium-resolution SAR images of the three airports. Over the three airport runway areas, the method of this embodiment reaches an average extraction accuracy of 0.9823 and an average IOU of 0.9665, both higher than MDDA, DeepLabV3+ and RefineNet. According to Table 1, the PA and IOU values of the method of this embodiment differ very little for the same airport, which indicates that it extracts the runway area almost completely and without false alarms. DeepLabV3+ is prone to false alarms, which lower the IOU of the runway area, so its PA and IOU values for the same airport differ noticeably. MDDA gives a reasonable overall result but falls short in learning details from a small-sample dataset. RefineNet has the lowest PA and IOU values.

Table 1: Extraction accuracy of the different networks.

Table 2 gives the training time of the different algorithms on the small-sample dataset and their test time on the medium-resolution SAR images of the three airports. In terms of training time on the small-sample dataset, our network needs only about 2 hours; MDDA takes the longest, nearly 8 hours, and clearly performs worse when trained on a small sample than on a large one; DeepLabV3+ and RefineNet take about as long to train as the method of this embodiment, but their accuracy is far lower. In terms of test time on the three medium-resolution airport images, the smaller the image, the shorter the test time. The average test time is 16.95 s for the method of this embodiment, 16.69 s for RefineNet and 15.89 s for DeepLabV3+, while MDDA takes roughly 2.5 times as long as the method of this embodiment. The MSP and EDM add parameters to the network, which is why the training and test times of the method of this embodiment are slightly longer than those of DeepLabV3+. Shorter training and test times improve efficiency in practical engineering. Overall, for small-sample SAR image datasets, the method of this embodiment achieves fast, high-accuracy extraction and is therefore efficient.

Table 2: Dataset training time and test time on medium-resolution airport images for the different networks.

It follows that the method of this embodiment achieves fast, automatic extraction of airport runway areas from high-resolution SAR images. Our network is lightweight, which greatly shortens the iteration time of the network layers and reduces both the training time and the image test time. The MSP lets the network learn global features at multiple scales and encode the effective ones; running the EDM and the MSP in parallel strengthens the learning of contextual semantic information; and the EDM allows edge information to be fully decoded and extracted. Our network is also well suited to training on small-sample datasets. There is currently no large public SAR airport dataset for semantic segmentation, so annotation has to be done manually, and a method that works with small samples saves labor time and cost. Overall, considering extraction accuracy together with dataset training time and image test time, our network compares favorably with the mainstream algorithm DeepLabV3+, and GCAM outperforms our previously proposed algorithm MDDA, achieving efficient automatic extraction.

In summary, to achieve fast, automatic airport extraction from high-resolution SAR images, this embodiment proposes a GCAM-based high-resolution SAR image airport runway area automatic detection method. The method has three parts: downsampling the original high-resolution SAR image, extracting the airport runway area with GCAM, and coordinate mapping of the result map produced by GCAM. Downsampling lets a single training sample contain more airport information, which helps in building a small-sample dataset. In the MSP, stripe pooling works together with four parallel convolutions to learn features at multiple scales, and the eSE module screens for useful features. The EDM helps the network learn edge semantic information, and coordinate mapping yields the extraction result for the original high-resolution SAR image. In the detection experiments on three airport runway areas, our network performs best compared with DeepLabV3+, RefineNet and MDDA, with MPA reaching 0.98 and MIOU reaching 0.96. In addition, training on the dataset takes only 2.25 h, and the average test time per image is only 16.94 s. The extraction results show that GCAM produces no false alarms and few missed detections, so it extracts airport runway areas efficiently. GCAM can also improve detection efficiency in practical engineering: once the runway area has been extracted, the search range for subsequent aircraft detection is reduced, which saves time.

In addition, this embodiment provides a GCAM-based high-resolution SAR image airport runway area automatic detection system, comprising:

a coordinate mapping program unit, configured to perform coordinate mapping on the extracted runway area to obtain the final detection result for the high-resolution SAR image.

In addition, this embodiment provides a GCAM-based high-resolution SAR image airport runway area automatic detection system comprising a computer device. The computer device includes a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to perform the steps of the aforementioned GCAM-based high-resolution SAR image airport runway area automatic detection method, or the memory stores a computer program programmed or configured to perform that method.

In addition, this embodiment provides a computer-readable storage medium storing a computer program programmed or configured to perform the aforementioned GCAM-based high-resolution SAR image airport runway area automatic detection method.

Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM and optical storage) containing computer-usable program code. The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to its embodiments; instructions executed by a processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams. These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The above are only preferred embodiments of the present invention. The protection scope of the present invention is not limited to the above embodiments, and all technical solutions that fall under the idea of the present invention belong to its protection scope. It should be pointed out that, for those of ordinary skill in the art, improvements and modifications made without departing from the principle of the present invention should also be regarded as falling within the protection scope of the present invention.

Claims

1. A GCAM-based high-resolution SAR image airport runway area automatic detection method is characterized by comprising the following steps:

1) down-sampling the high-resolution SAR image to generate a medium-resolution image;

2) inputting the medium-resolution image into a geographic space context attention mechanism network GCAM to extract a runway area;

3) carrying out coordinate mapping on the extracted runway area to obtain the final detection result of the high-resolution SAR image.

2. The GCAM-based high-resolution SAR image airport runway area automatic detection method according to claim 1, characterized in that the down-sampling of the high-resolution SAR image in step 1) is specifically a 5-fold down-sampling of the SAR image by a pixel value extraction method.
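The three steps of claim 1, with the 5-fold pixel value extraction of claim 2, can be illustrated with a minimal NumPy sketch. It assumes that "pixel value extraction" means keeping every fifth pixel along each axis and that coordinate mapping assigns each full-resolution pixel the label of the medium-resolution pixel it falls on; neither detail is spelled out in the claims. The function names are illustrative, and `placeholder_segmenter` is a hypothetical stand-in for the GCAM network, not the network itself.

```python
import numpy as np

def downsample_by_extraction(image: np.ndarray, factor: int = 5) -> np.ndarray:
    """Keep every `factor`-th pixel along rows and columns (no averaging)."""
    return image[::factor, ::factor]

def map_mask_to_full_resolution(mask: np.ndarray, full_shape, factor: int = 5) -> np.ndarray:
    """Assumed coordinate mapping: full-resolution pixel (r, c) takes the
    label of medium-resolution pixel (r // factor, c // factor)."""
    rows = np.arange(full_shape[0]) // factor
    cols = np.arange(full_shape[1]) // factor
    return mask[np.ix_(rows, cols)]

def placeholder_segmenter(image: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for GCAM: marks pixels darker than the mean."""
    return (image < image.mean()).astype(np.uint8)

def detect_runway_area(image: np.ndarray, segmenter=placeholder_segmenter,
                       factor: int = 5) -> np.ndarray:
    medium = downsample_by_extraction(image, factor)          # step 1
    mask = segmenter(medium)                                  # step 2
    return map_mask_to_full_resolution(mask, image.shape, factor)  # step 3
```

Because the network only ever sees the medium-resolution image, its cost drops by roughly the square of the down-sampling factor, while the returned mask still has the shape of the original image.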

3. The GCAM-based high-resolution SAR image airport runway area automatic detection method of claim 1, wherein the geospatial context attention mechanism network GCAM comprises a coding block and a decoding block; the coding block comprises a residual network ResNet, a multi-scale extrusion pyramid MSP and an edge refinement module EDM, wherein the residual network ResNet is used for performing feature extraction on an input data set to obtain preliminary features, the multi-scale extrusion pyramid MSP is used for obtaining global context information at different resolutions by applying different pooling and convolution layer operations to the preliminary features, the edge refinement module EDM is used for enhancing the edge extraction capability of the network on the preliminary features, and the outputs of the multi-scale extrusion pyramid MSP and the edge refinement module EDM are further fused to obtain multi-level features; the decoding block is used for performing semantic segmentation of the airport runway area by combining the preliminary features and the multi-level features, so as to extract the runway area.

4. The GCAM-based high-resolution SAR image airport runway area automatic detection method according to claim 3, wherein the residual network ResNet is an improved residual network obtained by replacing ordinary two-dimensional convolutions with dilated (hole) convolutions with dilation rates of 2, 4, 8 and 16 on the basis of the residual network ResNet_101.
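To make the effect of the dilation rates concrete, the following sketch implements a single-channel dilated convolution in NumPy and computes the effective extent of a 3 × 3 kernel at each rate. It is an illustration only: the 3 × 3 kernel size, the zero padding, and the helper names are assumptions, and the sketch does not reproduce the modified ResNet_101 itself.

```python
import numpy as np

def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Spatial extent covered by a kernel whose taps are `rate` pixels apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)

def dilated_conv2d(x: np.ndarray, kernel: np.ndarray, rate: int) -> np.ndarray:
    """Single-channel dilated cross-correlation with zero padding.

    Assumes a square kernel of odd size; the output has the same shape as x.
    """
    k = kernel.shape[0]
    pad = rate * (k // 2)
    padded = np.pad(x, pad)
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            # Tap (i, j) reads the input shifted by (i - k//2, j - k//2) * rate.
            out += kernel[i, j] * padded[i * rate:i * rate + h,
                                         j * rate:j * rate + w]
    return out

# Extent of a 3 x 3 kernel at the dilation rates named in the claim.
extents = [effective_kernel_size(3, r) for r in (2, 4, 8, 16)]
```

The number of weights stays at nine per kernel for every rate; only the spacing between taps grows, which is why dilation enlarges the receptive field without adding parameters or reducing feature-map resolution.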

5. The GCAM-based high-resolution SAR image airport runway area automatic detection method of claim 3, wherein the multi-scale extrusion pyramid MSP comprises a multi-field parallel pooling working layer and an effective attention module eSE, wherein the multi-field parallel pooling working layer is built in parallel from a 1 × 1 convolution with a dilation rate of 1, three 3 × 3 convolutions with dilation rates of 6, 12 and 18 respectively, a global average pooling module GAP and a stripe pooling module SP; for an input two-dimensional feature tensor of size H × W, the stripe pooling module SP performs a pooling operation in the horizontal direction using a stripe pooling window H × 1 and a pooling operation in the vertical direction using a stripe pooling window H × 1, averaging the element values within each pooling kernel to obtain the horizontal stripe pooling output and the vertical stripe pooling output, then uses two one-dimensional convolutions to expand the horizontal stripe pooling output in the left-right direction and the vertical stripe pooling output in the up-down direction so that the two expanded feature maps have the same size, then fuses the two expanded feature maps, and finally multiplies the original data by the fused data after Sigmoid processing to obtain the output H × W two-dimensional feature tensor; for an input feature map Xi, the effective attention module eSE first learns a feature F_avg by global average pooling, processes the feature F_avg with a fully connected layer to obtain a weight matrix W_C, rescales the weight matrix W_C through a Sigmoid function to obtain the channel attention feature A_eSE, then applies the channel attention feature A_eSE to the input feature map Xi to obtain a refined feature map X_refine, and finally performs feature re-screening on the refined feature map X_refine to obtain global context information.
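The stripe pooling and eSE operations described in claim 5 can be sketched in NumPy as follows. Several details are assumptions made for illustration: the horizontal pooling is taken as one average per row and the vertical pooling as one average per column (the usual convention for stripe pooling), fusion is taken as element-wise addition, and the one-dimensional kernels `k_h`, `k_w` and the fully connected parameters `w_c`, `b_c` are hypothetical inputs rather than trained weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stripe_pooling(x, k_h, k_w):
    """x: (H, W) feature tensor; k_h, k_w: odd-length 1-D kernels.

    np.convolve flips its kernel, which makes no difference for the
    symmetric kernels assumed here.
    """
    h, w = x.shape
    row_strip = x.mean(axis=1)                            # one average per row, shape (H,)
    col_strip = x.mean(axis=0)                            # one average per column, shape (W,)
    row_strip = np.convolve(row_strip, k_h, mode="same")  # 1-D convolution along H
    col_strip = np.convolve(col_strip, k_w, mode="same")  # 1-D convolution along W
    row_map = np.repeat(row_strip[:, None], w, axis=1)    # expand left-right to (H, W)
    col_map = np.repeat(col_strip[None, :], h, axis=0)    # expand up-down to (H, W)
    fused = row_map + col_map                             # assumed fusion: addition
    return x * sigmoid(fused)                             # gate the original data

def ese_attention(x, w_c, b_c):
    """x: (C, H, W) feature map; w_c: (C, C) weights; b_c: (C,) bias."""
    f_avg = x.mean(axis=(1, 2))                # global average pooling, shape (C,)
    a_ese = sigmoid(w_c @ f_avg + b_c)         # channel attention, shape (C,)
    return x * a_ese[:, None, None]            # refined feature map X_refine
```

Both operations end in a Sigmoid gate that multiplies the input, so each output has the same shape as its input and can be dropped into a parallel branch without reshaping.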

6. The GCAM-based high-resolution SAR image airport runway area automatic detection method of claim 3, characterized in that the edge refinement module EDM comprises a global convolution module GCB and an edge refinement module BR, the global convolution module GCB being used for tightening the connection between the feature maps and the pixel classification layer and for enhancing the ability to process feature maps of different resolutions so as to obtain global information, and the edge refinement module BR being used for enhancing the edge extraction ability of the coding block on the basis of the global information; the global convolution module GCB comprises a k × k large convolution kernel and a feature combination module, wherein the k × k large convolution kernel comprises two paths, one path consisting of a k × 1 × c × c convolution and a 1 × k × c × c convolution and the other path consisting of a 1 × k × c × c convolution and a k × 1 × c × c convolution, where c is the number of channels, and the output results of the two paths are input together into the feature combination module to obtain the feature Sum_W×H×C; the edge refinement module BR processes the feature Sum_W×H×C sequentially through a small convolution kernel, an activation function and a small convolution kernel, then adds the processing result to the original feature Sum_W×H×C, and finally obtains a feature map with refined runway area edges.
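The two-path structure of the global convolution module and the residual form of the BR module can be illustrated on a single channel. In this sketch the feature combination module is assumed to be an element-wise sum (suggested by the name of the feature Sum_W×H×C), the activation function is assumed to be ReLU, and all kernels are hypothetical inputs; the claim fixes none of these.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Single-channel cross-correlation with zero padding (odd kernel sizes)."""
    kh, kw = kernel.shape
    h, w = x.shape
    padded = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((h, w), dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

def global_conv_block(x, col_a, row_a, row_b, col_b):
    """Two paths built from length-k 1-D kernels, combined by summation."""
    path_1 = conv2d_same(conv2d_same(x, col_a[:, None]), row_a[None, :])  # k x 1 then 1 x k
    path_2 = conv2d_same(conv2d_same(x, row_b[None, :]), col_b[:, None])  # 1 x k then k x 1
    return path_1 + path_2

def edge_refine(x, k1, k2):
    """Residual branch: small conv, ReLU (assumed), small conv, added to the input."""
    residual = conv2d_same(np.maximum(conv2d_same(x, k1), 0.0), k2)
    return x + residual
```

Each path covers a full k × k window with only 2k weights per channel pair instead of k², which is the usual motivation for factorizing a large kernel in this way.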

7. The GCAM-based high-resolution SAR image airport runway area automatic detection method of claim 3, wherein the decoding block performs 1 × 1 convolution dimensionality reduction on the output features of the coding block, performs edge information decoding on the feature map with refined runway area edges obtained by the edge refinement module EDM, performs bilinear 4-fold upsampling, then concatenates the upsampled result with the result obtained by performing 1 × 1 convolution dimensionality reduction on the preliminary features output by the residual network ResNet, then applies a 3 × 3 convolution to the concatenated features to refine them, and finally performs a simple bilinear 4-fold upsampling to obtain the final segmentation result.
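The shape flow of the decoding block can be followed in a small NumPy sketch. Many details here are assumptions for illustration: the encoder output is taken to be at 1/16 of the input resolution and the preliminary features at 1/4, the bilinear interpolation uses a corner-aligned sampling grid, the separate edge-information decoding step is omitted, and all weights and channel counts are hypothetical.

```python
import numpy as np

def bilinear_upsample(x, factor=4):
    """x: (C, H, W) -> (C, H*factor, W*factor), corner-aligned grid."""
    c, h, w = x.shape
    rows = np.linspace(0, h - 1, h * factor)
    cols = np.linspace(0, w - 1, w * factor)
    r0 = np.floor(rows).astype(int)
    c0 = np.floor(cols).astype(int)
    r1 = np.minimum(r0 + 1, h - 1)
    c1 = np.minimum(c0 + 1, w - 1)
    fr = (rows - r0)[None, :, None]          # vertical interpolation weights
    fc = (cols - c0)[None, None, :]          # horizontal interpolation weights
    top = x[:, r0][:, :, c0] * (1 - fc) + x[:, r0][:, :, c1] * fc
    bottom = x[:, r1][:, :, c0] * (1 - fc) + x[:, r1][:, :, c1] * fc
    return top * (1 - fr) + bottom * fr

def conv1x1(x, weight):
    """weight: (C_out, C_in); mixes channels at every pixel."""
    return np.einsum("oc,chw->ohw", weight, x)

def conv3x3(x, weight):
    """weight: (C_out, C_in, 3, 3); zero-padded, output keeps H and W."""
    _, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    for i in range(3):
        for j in range(3):
            out += conv1x1(padded[:, i:i + h, j:j + w], weight[:, :, i, j])
    return out

def decode(encoder_out, low_level, w_enc, w_low, w_refine):
    high = bilinear_upsample(conv1x1(encoder_out, w_enc), 4)  # reduce, then 4x up
    low = conv1x1(low_level, w_low)                           # reduce preliminary features
    fused = np.concatenate([high, low], axis=0)               # connect along channels
    return bilinear_upsample(conv3x3(fused, w_refine), 4)     # refine, then 4x up
```

Under these assumptions the two 4-fold upsampling steps together restore the 16-fold reduction of the encoder, so the segmentation map matches the size of the medium-resolution input.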

8. A GCAM-based high-resolution SAR image airport runway area automatic detection system, characterized by comprising:

a down-sampling program unit for down-sampling the high-resolution SAR image to generate a medium-resolution image;

a runway area extraction program unit for inputting the medium-resolution image into a geographic space context attention mechanism network GCAM to extract a runway area;

and a coordinate mapping program unit for carrying out coordinate mapping on the extracted runway area to obtain the final detection result of the high-resolution SAR image.

9. A GCAM-based high-resolution SAR image airport runway area automatic detection system comprising a computer device, the computer device comprising a microprocessor and a memory connected to each other, wherein the microprocessor is programmed or configured to perform the steps of the GCAM-based high-resolution SAR image airport runway area automatic detection method according to any one of claims 1 to 7, or the memory has stored therein a computer program programmed or configured to perform the GCAM-based high-resolution SAR image airport runway area automatic detection method according to any one of claims 1 to 7.

10. A computer-readable storage medium having stored thereon a computer program programmed or configured to perform the GCAM-based high-resolution SAR image airport runway area automatic detection method of any one of claims 1 to 7.