CN119649135A - A classification method for multi-source remote sensing data - Google Patents


Info

Publication number
CN119649135A
Authority
CN
China
Prior art keywords
features
frequency
remote sensing
feature
sensing data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202411798370.4A
Other languages
Chinese (zh)
Inventor
涂兵
陈卓宇
刘博�
李军
方乐缘
陈云云
曹兆楼
贺燕
刘立成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology
Priority to CN202411798370.4A
Publication of CN119649135A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The present invention provides a multi-source remote sensing data classification method, belonging to the field of hyperspectral image processing within remote sensing. The method comprises: acquiring multiple types of remote sensing data of a target object; extracting shallow features from each type of remote sensing data; obtaining multiple frequency features for each type of remote sensing data; fusing the features of the same frequency across all the remote sensing data to obtain multiple same-frequency fusion features; concatenating the same-frequency fusion features to obtain a multi-source fusion feature; passing the multi-source fusion feature sequentially through stacked frequency modulation layers and attention layers to obtain fused global and local features; and weighting the global and local features along the spectral dimension to obtain the predicted classification result for the target object. The invention achieves directional frequency feature decomposition and fusion of multi-source data and uses the multi-source fusion feature extracted from multi-source remote sensing data as the basis for classification, which provides more comprehensive information for classifying the target object, captures complex ground-object features more fully, and makes the classification result more accurate.

Description

Multi-source remote sensing data classification method
Technical Field
The invention relates to a multi-source remote sensing data classification method, and belongs to the field of hyperspectral image processing in the remote sensing field.
Background
At present, remote sensing data from many sources coexist, including high-, medium- and low-resolution imagery, multispectral and hyperspectral imagery, synthetic aperture radar (SAR), street view imagery and LiDAR laser point clouds. Together they provide the basic data for remote sensing monitoring and for applications in many other fields.
Existing remote sensing images mainly comprise visible-light RGB images, panchromatic images, multispectral and hyperspectral images, infrared images, LiDAR images and synthetic aperture radar (SAR) images.
The visible-light RGB remote sensing image is a special case of the multispectral image that combines three channels: the red, green and blue bands. It is the remote sensing image most commonly used in everyday applications and is usually used to distinguish terrain and ground objects.
The panchromatic remote sensing image differs from the RGB image: it is a black-and-white image of the whole visible band acquired by a remote sensor. It is displayed as a grayscale picture and generally has high spatial resolution, but it cannot show the colors of ground objects.
Multispectral remote sensing combines tens to hundreds of spectral bands and therefore carries more color information, which helps in judging the properties of surface materials, but its spatial resolution is lower.
Hyperspectral imaging generates rich spectral data by capturing reflectance spectra in many bands, and can detect distinct spectral signatures at different spatial locations of a single object. It can therefore finely analyze and distinguish different materials, such as plant types, soil types and water quality. Each band provides its own spectral characteristics, so materials that cannot be told apart visually can still be detected, which gives hyperspectral images a unique advantage in material identification and change detection.
The infrared remote sensing image is obtained by sensing the infrared radiation reflected and emitted by ground objects. Because infrared radiation has a long wavelength and penetrates the atmosphere well, imaging is not affected by darkness or smoke; however, such images suffer from low resolution, low contrast, low signal-to-noise ratio and a blurred visual appearance.
LiDAR remote sensing images are obtained by computing the ground coordinates of each laser spot from the angle at which the laser is emitted from an airborne or spaceborne platform and from the measured range. LiDAR obtains three-dimensional spatial information about a target by emitting laser pulses and measuring their return time, producing high-precision point cloud data. LiDAR data provide not only ground height but also the shape and structure of ground objects, and are therefore valuable for forest monitoring, city modeling, terrain analysis and other applications.
Synthetic aperture radar (SAR) is a technique that achieves the measurement performance of a large-aperture radar with a small-aperture antenna through platform motion and mathematical processing. It is a high-resolution imaging radar system: the motion of the antenna on a mobile platform (such as a satellite or an aircraft) synthesizes an aperture larger than the physical antenna aperture, which improves imaging resolution, and the ground can be imaged accurately under all weather and illumination conditions. Each pixel of the resulting image contains not only the intensity of the microwaves reflected by the surface, the so-called gray value, but also a phase value related to the radar slant range. The phase, however, appears highly random, is generally regarded as noise, and complicates interferometric analysis.
In summary, each single-source remote sensing image has its own advantages, but because a single source carries a limited amount of information, complex ground-object features may not be captured sufficiently. It is therefore necessary to combine different remote sensing data to judge the nature of surface materials, and to use the information provided by multi-source data for forest monitoring, land use and above-ground biomass estimation.
Disclosure of Invention
The invention aims to overcome the defects in the prior art, provides a multi-source remote sensing data classification method, and solves the problem that complex features cannot be fully captured by utilizing single-source remote sensing data.
In order to achieve the above purpose, the invention is realized by adopting the following technical scheme:
The invention provides a multi-source remote sensing data classification method comprising: acquiring multiple types of remote sensing data of a target object; extracting the shallow features of each type of remote sensing data; inputting the shallow features into a pre-built frequency feature decomposition module to obtain a plurality of preset frequency features of each type of remote sensing data; inputting the frequency features of all the remote sensing data into a pre-built same-frequency feature fusion module, which fuses the features of the same frequency to obtain a plurality of corresponding same-frequency fusion features; concatenating and fusing the same-frequency fusion features to obtain a multi-source fusion feature; passing the multi-source fusion feature sequentially through stacked frequency modulation layers and attention layers to obtain fused global and local features; and weighting the fused global and local features along the spectral dimension to extract depth information and thereby obtain the predicted classification result of the target object.
Further, extracting the shallow features of the remote sensing data comprises: dividing the remote sensing data into patch blocks; performing a multi-scale convolution operation on the patch blocks to obtain a plurality of scale features; in any preset channel, stacking the plurality of scale features along the channel dimension to obtain a channel fusion feature; performing element-wise summation, averaging and max pooling on the channel fusion feature along the channel dimension to obtain the corresponding sum feature map, average feature map and max-pooling feature map; concatenating the three feature maps and applying a convolution for further channel fusion to obtain the low-dimensional channel feature of that channel; and concatenating the low-dimensional channel features of all channels to obtain the shallow features of the remote sensing data.
Further, the multi-scale convolution operation comprises the steps of respectively carrying out 3×3 convolution, 5×5 convolution and 7×7 convolution on the partitioned patch blocks, and carrying out batch normalization operation and ReLU operation after each convolution operation in sequence to obtain corresponding 3 scale features.
Further, the frequency feature decomposition module is built on a frequency-domain Transformer. Obtaining the plurality of preset frequency features of each shallow feature comprises: convolving and normalizing the input shallow feature to obtain the processed shallow feature $X$; dividing $X$ evenly along the spectral dimension into $h$ heads; and dividing the $h$ heads into $n$ equal parts, where $n$ is the number of frequency features, so that each part is used to compute one frequency feature, and $h$ is a preset value.
Each frequency feature corresponds to a preset window form. For any one of the equal parts, each head is divided into non-overlapping windows using the window form of the corresponding frequency feature; the attention of each window is then computed to obtain the frequency feature of each head; finally, the frequency features of the heads are concatenated to obtain the frequency feature corresponding to that part.
Further, the frequency features comprise a low-frequency feature, a high-frequency feature, a vertical feature and a horizontal feature. Each has its own window form, defined in terms of a window size index that takes a positive integer value, and a token is the smallest unit within a window. The window form corresponding to the low-frequency feature is: [expression missing in source]; in a head used to compute the low-frequency feature, each window contains [number missing in source] tokens.
The window form corresponding to the high-frequency feature is: [expression missing in source]; in a head used to compute the high-frequency feature, each window contains [number missing in source] tokens. The window form corresponding to the vertical feature is: [expression missing in source]; in a head used to compute the vertical feature, each window contains [number missing in source] tokens. The window form corresponding to the horizontal feature is: [expression missing in source]; in a head used to compute the horizontal feature, each window contains [number missing in source] tokens.
Further, the vertical feature is obtained as follows. The $k$-th head is uniformly divided into $N$ non-overlapping windows, where $N$ is the number of windows, determined by the length and width of the processed shallow feature $X$ and by the vertical window form, and $C$ is the spectral dimension of $X$.
The query tensor $Q_k$, key tensor $K_k$ and value tensor $V_k$ of the $k$-th head have the same dimension. The attention of each window in the $k$-th head is computed, and the attention of the $i$-th window is:
$$Y_k^i = \mathrm{Attention}(Q_k^i, K_k^i, V_k^i)$$
where $Y_k^i$ is the attention result of the $i$-th window, $\mathrm{Attention}(\cdot)$ is the attention operation, and $Q_k^i$, $K_k^i$, $V_k^i$ are the query matrix, key matrix and value matrix of the $i$-th window in the $k$-th head.
From the attention of each window in the $k$-th head, the vertical feature of the $k$-th head is obtained as:
$$Y_k = [Y_k^1, Y_k^2, \ldots, Y_k^N]$$
where $Y_k$ is the vertical feature of the $k$-th head, and $Y_k^1$, $Y_k^2$ and $Y_k^N$ are the attention results of the 1st, 2nd and $N$-th windows.
The vertical features of all the heads used to compute the vertical feature are concatenated to obtain the vertical feature of the shallow feature:
$$F_v = \mathrm{Concat}(Y_1, Y_2, \ldots, Y_m)$$
where $F_v$ is the vertical feature of the shallow feature, $\mathrm{Concat}(\cdot)$ is the concatenation operation, $m$ is the number of heads assigned to the vertical feature, and $Y_1$, $Y_2$ and $Y_m$ are the vertical features of the 1st, 2nd and $m$-th of these heads.
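The window-wise attention described above can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the window shape `(wh, ww)`, the scaling by the square root of the head width, and the function names are all assumptions introduced here for illustration. A tall, narrow window such as `(8, 2)` is used as a stand-in for a "vertical" window form.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_partition(x, wh, ww):
    """Split an (H, W, d) head into non-overlapping windows: (num_windows, wh*ww, d)."""
    h, w, d = x.shape
    x = x.reshape(h // wh, wh, w // ww, ww, d).transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, wh * ww, d)

def window_merge(windows, h, w, wh, ww):
    """Inverse of window_partition: put the windows back into an (H, W, d) map."""
    d = windows.shape[-1]
    x = windows.reshape(h // wh, w // ww, wh, ww, d).transpose(0, 2, 1, 3, 4)
    return x.reshape(h, w, d)

def window_attention(q, k, v, wh, ww):
    """Scaled dot-product attention computed independently inside each window."""
    h, w, d = q.shape
    qw, kw, vw = (window_partition(t, wh, ww) for t in (q, k, v))
    scores = softmax(qw @ kw.transpose(0, 2, 1) / np.sqrt(d))  # (N, tokens, tokens)
    return window_merge(scores @ vw, h, w, wh, ww)

# Hypothetical head of size 8 x 8 with 4 channels and a tall 8 x 2 window.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 8, 4)) for _ in range(3))
vertical_head_feature = window_attention(q, k, v, wh=8, ww=2)
```

Concatenating the outputs of all heads in the vertical group along the last axis would then give the vertical feature of the shallow feature.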
Further, the same-frequency feature fusion module fuses the features of the same frequency to obtain the plurality of corresponding same-frequency fusion features as follows. For any one frequency component, the components from all the remote sensing data are added element-wise to obtain the same-frequency component $F$. Global average pooling is applied to $F$ in the channel dimension, and the channel weight $W_c$ is then obtained through a channel attention module:
$$W_c = C_{1\times1}(\max(0, C_{1\times1}(F_{GAP}^{c})))$$
where $W_c$ is the channel weight, $C_{1\times1}$ is a 1×1 convolution layer, $\max$ takes the maximum value, and $F_{GAP}^{c}$ is the output of global average pooling of $F$ in the channel dimension.
Global average pooling and global max pooling are applied to $F$ in the spatial dimension, and the spatial weight $W_s$ is then obtained through a spatial attention module:
$$W_s = C_{7\times7}([F_{GAP}^{s}, F_{GMP}^{s}])$$
where $W_s$ is the spatial weight, $C_{7\times7}$ is a 7×7 convolution layer, $F_{GAP}^{s}$ is the output of global average pooling of $F$ in the spatial dimension, and $F_{GMP}^{s}$ is the output of global max pooling of $F$ in the spatial dimension.
Following the broadcasting rule, the channel weight $W_c$ and the spatial weight $W_s$ are fused by addition to obtain the coarse weight $W_{coa}$. The channels of $F$ and $W_{coa}$ are then rearranged by a channel rearrangement operation to obtain the fine weight:
$$W = \sigma(GC(CS([F, W_{coa}])))$$
where $W$ is the fine weight, $\sigma$ is the sigmoid function, $GC$ is a group convolution whose number of groups is set to the number of channels, $CS$ is the channel rearrangement operation, and $W_{coa}$ is the coarse weight.
From the frequency components of all the remote sensing data and the fine weight, combined with a residual connection, the same-frequency fusion feature of that frequency is obtained by weighted summation.
Further, the frequency modulation layers and the attention layers adopt a staged architecture in which the stacked frequency modulation layers are followed in series by the stacked attention layers. A factor $\alpha$ is introduced to control the numbers of frequency modulation layers and attention layers within the total number of layers, where $\alpha$ is the proportion of frequency modulation layers in the total number of layers.
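As a small arithmetic illustration of this staged layout, the sketch below turns a total layer count and a ratio into an ordered list of layer types. The function name `stage_layout`, the argument name `alpha` and the use of rounding are assumptions made for illustration; the source only states that a factor sets the proportion of frequency modulation layers.

```python
def stage_layout(total_layers, alpha):
    """Frequency modulation layers first, attention layers after, split by ratio alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    n_fm = round(alpha * total_layers)          # number of frequency modulation layers
    return ["fm"] * n_fm + ["attn"] * (total_layers - n_fm)

layout = stage_layout(total_layers=4, alpha=0.5)
print(layout)  # ['fm', 'fm', 'attn', 'attn']
```

Raising `alpha` shifts capacity toward local, frequency-domain processing; lowering it shifts capacity toward global attention.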
Further, the frequency modulation layer is used to capture local features. A block-based fast Fourier transform first transforms the input feature to the frequency domain; a learnable matrix is then introduced to suppress or amplify every frequency component by element-wise multiplication in the frequency domain, giving the frequency modulation feature; an inverse Fourier transform and reconstruction then give the refined output feature. The expressions are:
$$X' = \delta(C_{1\times1}(\mathrm{LN}(X)))$$
$$X_f = \mathcal{P}^{-1}(\mathrm{IFFT}(W \odot \mathrm{FFT}(\mathcal{P}(X'))))$$
$$Y = \mathrm{MLP}(X_f)$$
where $X$ is the input feature of the frequency modulation layer, $X'$ is the feature obtained through the feed-forward network, $X_f$ is the frequency modulation feature, $Y$ is the output feature of the frequency modulation layer, $\mathrm{LN}$ is layer normalization, $C_{1\times1}$ is a 1×1 convolution layer, $\delta$ is the activation function, $\mathcal{P}$ is the block partition operation, $\odot$ is element-wise multiplication, $W$ is the learnable matrix, $\mathcal{P}^{-1}$ is the block merging operation, $\mathrm{FFT}$ is the fast Fourier transform, $\mathrm{IFFT}$ is the inverse fast Fourier transform, and $\mathrm{MLP}$ is the multi-layer perceptron operation.
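The frequency-domain step of this layer (block partition, FFT, element-wise multiplication by a learnable matrix, inverse FFT, block merge) can be sketched in NumPy as follows. This is an illustration under assumptions: the block size `b`, the use of a real-valued weight shared by all blocks, and the use of the real-input FFT (`rfft2`) are choices made here, not details confirmed by the source. The surrounding normalization, 1×1 convolution and MLP are omitted.

```python
import numpy as np

def block_partition(x, b):
    """(H, W, C) -> (H/b, W/b, b, b, C) non-overlapping b x b blocks."""
    h, w, c = x.shape
    return x.reshape(h // b, b, w // b, b, c).transpose(0, 2, 1, 3, 4)

def block_merge(blocks):
    """Inverse of block_partition."""
    nh, nw, b, _, c = blocks.shape
    return blocks.transpose(0, 2, 1, 3, 4).reshape(nh * b, nw * b, c)

def frequency_modulate(x, weight, b):
    """Scale every frequency component of every block by a learnable weight.

    weight has shape (b, b // 2 + 1, C), matching the rfft2 spectrum of one block.
    """
    blocks = block_partition(x, b)
    spectrum = np.fft.rfft2(blocks, axes=(2, 3))      # per-block 2-D FFT
    spectrum = spectrum * weight                        # suppress or amplify components
    restored = np.fft.irfft2(spectrum, s=(b, b), axes=(2, 3))
    return block_merge(restored)

# Hypothetical 8 x 8 feature with 2 channels, 4 x 4 blocks, and a weight that
# keeps only the zero-frequency (DC) component of each block.
rng = np.random.default_rng(0)
feature = rng.normal(size=(8, 8, 2))
dc_only = np.zeros((4, 3, 2))
dc_only[0, 0, :] = 1.0
smoothed = frequency_modulate(feature, dc_only, b=4)
```

With the DC-only weight every block collapses to its per-channel mean, which shows how the learnable matrix acts as a per-frequency filter inside each block.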
The attention layer is used to capture global attributes, that is, semantic features. The input feature of the attention layer passes through layer normalization and a multi-head attention operation, then through layer normalization and a multi-layer perceptron operation, and is then output; the multi-layer perceptron operation mixes the channels within the attention layer.
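A minimal NumPy sketch of such a layer is given below, following the stated order: layer normalization, multi-head attention, layer normalization, multi-layer perceptron. The weight shapes, the ReLU inside the perceptron and the absence of residual connections and learnable normalization parameters are simplifying assumptions; the source does not specify them.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, wq, wk, wv, wo, num_heads):
    """x: (tokens, C). Every token attends to every other token (global context)."""
    n, c = x.shape
    d = c // num_heads

    def split(t):  # (tokens, C) -> (heads, tokens, d)
        return t.reshape(n, num_heads, d).transpose(1, 0, 2)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))
    merged = (attn @ v).transpose(1, 0, 2).reshape(n, c)
    return merged @ wo

def attention_layer(x, p, num_heads):
    y = multi_head_attention(layer_norm(x), p["wq"], p["wk"], p["wv"], p["wo"], num_heads)
    hidden = np.maximum(layer_norm(y) @ p["w1"], 0.0)   # MLP mixes channels per token
    return hidden @ p["w2"]

# Hypothetical sizes: 16 tokens, 8 channels, 2 heads, random weights.
rng = np.random.default_rng(0)
params = {name: rng.normal(size=(8, 8)) * 0.1 for name in ("wq", "wk", "wv", "wo")}
params["w1"] = rng.normal(size=(8, 16)) * 0.1
params["w2"] = rng.normal(size=(16, 8)) * 0.1
tokens = rng.normal(size=(16, 8))
output = attention_layer(tokens, params, num_heads=2)
```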
Further, weighting the fused global and local features along the spectral dimension to obtain the depth feature, and from it the classification result of the target object, comprises: first learning the key information through a one-dimensional convolution, then highlighting the salient features through an activation function, and finally obtaining the prediction through a fully connected operation followed by a second activation function. The expressions are:
$$F_d = \delta_1(\mathrm{Conv1D}(F))$$
$$\hat{y} = \delta_2(\mathrm{FC}(F_d))$$
where $F$ is the fused global and local feature, $F_d$ is the depth feature, $\hat{y}$ is the predicted classification result, $\mathrm{FC}$ is the fully connected operation, and $\delta_1$ and $\delta_2$ are two activation functions.
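A minimal sketch of such a classification head is shown below. ReLU and softmax are used as the two activation functions; these specific choices, the kernel length and all names are assumptions made for illustration, since the source does not name the functions in this text.

```python
import numpy as np

def conv1d_same(x, kernel):
    """1-D convolution (cross-correlation) along the spectral axis with zero padding."""
    r = len(kernel) // 2
    padded = np.pad(x, r)
    return np.array([padded[i:i + len(kernel)] @ kernel for i in range(len(x))])

def classify(feature, kernel, fc_weight, fc_bias):
    """feature: (C,) spectral vector -> class probabilities of shape (num_classes,)."""
    depth = np.maximum(conv1d_same(feature, kernel), 0.0)   # assumed ReLU
    logits = depth @ fc_weight + fc_bias                     # fully connected layer
    e = np.exp(logits - logits.max())                        # assumed softmax
    return e / e.sum()

# Hypothetical sizes: 32 spectral channels, 5 classes, random weights.
rng = np.random.default_rng(0)
feature = rng.normal(size=32)
probs = classify(feature, rng.normal(size=3), rng.normal(size=(32, 5)), np.zeros(5))
predicted_class = int(np.argmax(probs))
```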
Compared with the prior art, the invention has the beneficial effects that:
(1) In the multi-source remote sensing data classification method provided by the invention, multi-scale convolution and channel-level fusion are applied to the patch cubes of each single-source remote sensing image to extract and fuse shallow features. A frequency feature decomposition module built on multi-head self-attention then uses different window shapes to capture the different frequency features of each single-source image, and the same-frequency feature fusion module fuses the features across sources. This achieves directional frequency feature decomposition and fusion of multi-source data, and the resulting multi-source fusion feature serves as the basis for classifying the target object. In addition, frequency modulation layers are combined with attention layers: local and global feature learning is achieved through frequency modulation, features of different frequency components are obtained, depth features are extracted from them, and the target object is classified accurately;
(2) The multi-source remote sensing data classification method provided by the invention adopts a deep learning classification approach with strong representation and generalization ability to extract deeper image features. It learns the spatial and spectral features of the remote sensing images from few training samples and obtains more discriminative features, which improves classification accuracy and yields a good classification result;
(3) The invention introduces a factor $\alpha$, the proportion of frequency modulation layers in the total number of layers, to control the numbers of frequency modulation layers and attention layers. Flexibly changing these numbers helps to capture global and local features accurately.
Drawings
FIG. 1 is a flow chart of a multi-source remote sensing data classification method provided in embodiment 1 of the present invention;
FIG. 2 is a schematic diagram of shallow features of remote sensing data extraction in embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of obtaining frequency characteristics by using a frequency characteristic decomposition module in embodiment 2 of the present invention;
FIG. 4 is a schematic diagram of obtaining same-frequency fusion features by using a same-frequency feature fusion module in embodiment 1 of the present invention;
FIG. 5 is a schematic diagram of a frequency modulation layer according to embodiment 1 of the present invention;
FIG. 6 is a schematic view of the attention layer provided in example 1 of the present invention;
FIG. 7 is a graph showing the classification results of the different methods of example 3 of the present invention on the Houston2013 dataset.
Detailed Description
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments.
Remote sensing is a technology for detecting and recognizing an object at a distance by sensing the electromagnetic waves, visible light and infrared radiation that it reflects or emits. A remote sensing satellite carries remote sensing sensors that collect and record the electromagnetic information radiated or reflected by the earth or by atmospheric targets. A signal transmission device sends the information back to the ground, where it is converted and interpreted into a visible image, namely the satellite image that people are generally familiar with.
In remote sensing digital image processing, the spatial characteristics of ground objects are mainly represented by changes in their spectral characteristics. Multispectral technology is a spectral detection technology that acquires several optical spectral bands simultaneously and extends from visible light toward the infrared and the ultraviolet.
Example 1
The invention provides a multi-source remote sensing data classification method, as shown in figure 1, which comprises the following steps:
Step 1, acquire multiple types of remote sensing data of the target object.
In a specific embodiment, two or three kinds of remote sensing images of the target object are acquired, chosen according to the actual requirements of the image study.
It should be noted that, after a hyperspectral image is acquired, PCA dimensionality reduction must be applied to it to extract the principal features of the image and reduce redundant information.
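The PCA step can be sketched as follows with NumPy. This is a generic eigendecomposition-based PCA along the band axis, offered as an illustration; the number of retained components and the function name `pca_reduce` are assumptions, since the source does not state them.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project an (H, W, B) hyperspectral cube onto its leading principal components."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)   # one row per pixel (astype copies)
    flat -= flat.mean(axis=0)                        # centre every band
    cov = flat.T @ flat / (flat.shape[0] - 1)        # (B, B) band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]   # indices of the largest ones
    return (flat @ eigvecs[:, top]).reshape(h, w, n_components)

# Illustrative input: a random 8 x 8 cube with 20 bands, reduced to 5 components.
cube = np.random.default_rng(0).normal(size=(8, 8, 20))
reduced = pca_reduce(cube, 5)
print(reduced.shape)  # (8, 8, 5)
```

The retained components are ordered by decreasing variance, so the first few channels of `reduced` carry most of the spectral information.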
Step 2, extract the shallow features of each type of remote sensing data, comprising:
Step 21, divide the remote sensing data into patch blocks.
Specifically, patch block division splits an image, one patch at a time, into its smallest processing units.
In embodiments where hyperspectral and LiDAR images are acquired, patch blocks are taken one by one from the dimension-reduced hyperspectral image $I_H$ and the LiDAR image $I_L$ to obtain the hyperspectral image patch blocks $P_H$ and the LiDAR image patch blocks $P_L$, where $s \times s$ is the length and width, in pixels, of each patch.
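One common way to take patch blocks one by one is to cut a square neighbourhood around every pixel. The sketch below does this with NumPy; the per-pixel centring, the odd patch size and the reflect padding at the borders are assumptions made for illustration, not details given in the source.

```python
import numpy as np

def extract_patches(image, size):
    """Return one (size, size, C) patch centred on every pixel of an (H, W, C) image."""
    if size % 2 != 1:
        raise ValueError("size must be odd so that each patch has a centre pixel")
    r = size // 2
    padded = np.pad(image, ((r, r), (r, r), (0, 0)), mode="reflect")
    h, w, c = image.shape
    patches = np.empty((h * w, size, size, c), dtype=image.dtype)
    for i in range(h):
        for j in range(w):
            patches[i * w + j] = padded[i:i + size, j:j + size]
    return patches

# Hypothetical inputs: a 5-band reduced hyperspectral image and a 1-band LiDAR raster.
rng = np.random.default_rng(0)
hsi_patches = extract_patches(rng.normal(size=(6, 7, 5)), size=3)
lidar_patches = extract_patches(rng.normal(size=(6, 7, 1)), size=3)
```

Because both sources are cut with the same grid, patch `n` of the hyperspectral image and patch `n` of the LiDAR image describe the same ground location.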
Step 22, perform a multi-scale convolution operation on the divided patch blocks to obtain the corresponding multi-scale features.
The multi-scale convolution operation applies several different convolutions to the divided patch blocks, each followed in sequence by batch normalization and ReLU, to obtain the corresponding features at each scale.
As shown in FIG. 2, in embodiments where hyperspectral and LiDAR images are acquired, the hyperspectral image patch block $P_H$ and the LiDAR image patch block $P_L$ each undergo 3×3 convolution + batch normalization + ReLU, 5×5 convolution + batch normalization + ReLU, and 7×7 convolution + batch normalization + ReLU, expressed as:
$$F_H^k = \mathrm{ReLU}(\mathrm{BN}_k(\mathrm{Conv}_{k\times k}(P_H)))$$
$$F_L^k = \mathrm{ReLU}(\mathrm{BN}_k(\mathrm{Conv}_{k\times k}(P_L)))$$
where $k$ is the convolution kernel size, set to 3, 5 and 7; $\mathrm{BN}_k$ is the batch normalization function at the $k$-th scale; $\mathrm{ReLU}$ is a commonly used activation function; and $F_H^k$ and $F_L^k$ are the output features of the hyperspectral image and the LiDAR image at the $k$-th scale.
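The three branches (convolution, batch normalization, ReLU at kernel sizes 3, 5 and 7) can be sketched in NumPy as below. The zero "same" padding, the random kernels and the single-sample stand-in for batch statistics are assumptions made to keep the example self-contained.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Zero-padded 'same' convolution. x: (H, W, C_in); kernel: (k, k, C_in, C_out)."""
    k = kernel.shape[0]
    r = k // 2
    padded = np.pad(x, ((r, r), (r, r), (0, 0)))
    h, w, _ = x.shape
    out = np.zeros((h, w, kernel.shape[3]))
    for di in range(k):
        for dj in range(k):
            out += padded[di:di + h, dj:dj + w, :] @ kernel[di, dj]
    return out

def batch_norm(x, eps=1e-5):
    """Normalise each channel; spatial statistics stand in for batch statistics here."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def multi_scale_features(patch, kernels):
    """Apply Conv_kxk -> BN -> ReLU for every kernel size k in `kernels`."""
    return {k: np.maximum(batch_norm(conv2d_same(patch, w)), 0.0)
            for k, w in kernels.items()}

# Hypothetical 11 x 11 patch with 4 input channels and 8 output channels per scale.
rng = np.random.default_rng(0)
patch = rng.normal(size=(11, 11, 4))
kernels = {k: rng.normal(size=(k, k, 4, 8)) * 0.1 for k in (3, 5, 7)}
features = multi_scale_features(patch, kernels)
```

All three outputs keep the spatial size of the patch, so they can be stacked along the channel dimension in the next step.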
Step 23, in any preset channel, first stack the multiple scale features along the channel dimension to obtain the channel fusion feature.
Specifically, the channel fusion feature is expressed as:
$$G_c = \mathrm{Stack}(F_c^3, F_c^5, F_c^7)$$
where $c$ denotes the $c$-th channel; $F_c^3$, $F_c^5$ and $F_c^7$ are the three output features of the $c$-th channel at the different scales; $G_c$ is the channel fusion feature of the $c$-th channel; and $\mathrm{Stack}(\cdot)$ is the channel stacking operation.
In embodiments where hyperspectral and LiDAR images are acquired, the three different-scale output features are computed separately for the hyperspectral image and for the LiDAR image, and the expression above is used in each preset channel to compute the channel fusion features of the hyperspectral image and those of the LiDAR image.
Then, element-wise summation, averaging and max pooling are applied to each channel fusion feature along the channel dimension to extract channel-level attributes, giving the corresponding sum feature map, average feature map and max-pooling feature map.
Specifically, the expressions are:
$$G_c^{sum} = \mathrm{Sum}(G_c)$$
$$G_c^{avg} = \mathrm{Avg}(G_c)$$
$$G_c^{max} = \mathrm{Max}(G_c)$$
where $G_c$ is the channel fusion feature of the $c$-th channel; $G_c^{sum}$ is the feature map obtained by the element summation operation $\mathrm{Sum}(\cdot)$; $G_c^{avg}$ is the feature map obtained by the averaging operation $\mathrm{Avg}(\cdot)$; and $G_c^{max}$ is the feature map obtained by the max-pooling operation $\mathrm{Max}(\cdot)$.
Then, the three feature maps are concatenated, and a 3×3 convolution performs channel fusion to obtain the low-dimensional channel feature.
Specifically, the expression is:
$$Z_c = \mathrm{Conv}_{3\times3}(\mathrm{Concat}(G_c^{sum}, G_c^{avg}, G_c^{max}))$$
where $Z_c$ is the low-dimensional channel feature of the $c$-th channel, and $G_c^{sum}$, $G_c^{avg}$ and $G_c^{max}$ are the sum, average and max-pooling feature maps of that channel.
In embodiments where hyperspectral and LiDAR images are acquired, the low-dimensional channel feature of each channel is obtained for the hyperspectral image and likewise for the LiDAR image.
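The channel-level fusion of Step 23 (stack, then sum, average and max along the channel axis, then concatenate and apply a 3×3 convolution) can be sketched as follows. The input shapes, the zero padding and the kernel values are assumptions made for illustration.

```python
import numpy as np

def channel_level_fusion(scale_features, kernel):
    """scale_features: list of (H, W, D) maps; kernel: (3, 3, 3, C_out).

    Returns the three pooled maps, shape (H, W, 3), and the fused low-dimensional feature.
    """
    stacked = np.concatenate(scale_features, axis=-1)       # channel fusion feature
    maps = np.concatenate([
        stacked.sum(axis=-1, keepdims=True),                # sum feature map
        stacked.mean(axis=-1, keepdims=True),               # average feature map
        stacked.max(axis=-1, keepdims=True),                # max-pooling feature map
    ], axis=-1)
    h, w, _ = maps.shape
    padded = np.pad(maps, ((1, 1), (1, 1), (0, 0)))         # zero 'same' padding
    fused = np.zeros((h, w, kernel.shape[3]))
    for di in range(3):
        for dj in range(3):
            fused += padded[di:di + h, dj:dj + w, :] @ kernel[di, dj]
    return maps, fused

# Toy input: constant maps with values 1, 2 and 3 stand in for the three scales.
scales = [np.full((4, 4, 1), v) for v in (1.0, 2.0, 3.0)]
kernel = np.zeros((3, 3, 3, 2))
kernel[1, 1, :, 0] = 1.0            # output channel 0 simply adds the three maps
maps, fused = channel_level_fusion(scales, kernel)
```

For this toy input the pooled maps are 6 (sum), 2 (average) and 3 (max) everywhere, so the first output channel equals 11.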
Step 24, concatenate the low-dimensional channel features of all channels to obtain the shallow feature of the remote sensing data.
Specifically, the expression is:
$$F_s = \mathrm{Concat}(Z_1, Z_2, \ldots, Z_M)$$
where $F_s$ is the shallow feature of the remote sensing data, $M$ is the number of channels, and $Z_1$, $Z_2$ and $Z_M$ are the low-dimensional channel features of the 1st, 2nd and $M$-th channels.
In embodiments where hyperspectral and LiDAR images are acquired, this method is used to concatenate the low-dimensional channel features of every channel of the hyperspectral image to obtain the hyperspectral shallow feature $F_s^H$, and likewise to concatenate those of the LiDAR image to obtain the LiDAR shallow feature $F_s^L$.
Because each remote sensing technology has its own strengths and weaknesses, the shallow features of each type of remote sensing data also carry the data characteristics of the technology that produced them. Next, feature fusion is performed across the different types of remote sensing data.
Step 3, input the shallow features into the frequency feature decomposition module to obtain a plurality of preset frequency features of each type of remote sensing data.
It should be noted that the frequency feature decomposition module is built on a frequency-domain Transformer and uses different windows to extract features of different frequencies from the remote sensing data, so that the data can be analyzed more comprehensively.
Specifically, before frequency decomposition, the shallow features input into the frequency feature decomposition module are convolved and normalized to obtain the processed shallow features.
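In embodiments where hyperspectral and LiDAR images are acquired, the expressions are:
$$X_H = \mathrm{LN}(\mathrm{Conv}(F_s^H))$$
$$X_L = \mathrm{LN}(\mathrm{Conv}(F_s^L))$$
where $F_s^H$ and $F_s^L$ are the shallow features of the hyperspectral image and the LiDAR image, $X_H$ is the processed shallow feature of the hyperspectral image, $X_L$ is the processed shallow feature of the LiDAR image, $\mathrm{Conv}$ is the convolution, and $\mathrm{LN}$ is layer normalization.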
Obtaining the multiple frequency features from a processed shallow feature comprises dividing the processed shallow feature evenly along the spectral dimension into $h$ heads and dividing the $h$ heads into $n$ equal parts, where $n$ is the number of frequency features, so that each part is used to compute one frequency feature, and $h$ is a preset value.
Each frequency feature corresponds to a preset window form. For any one of the equal parts, each head is uniformly divided into non-overlapping windows using the window form of the corresponding frequency feature; the attention of each window is then computed to obtain the frequency feature of each head; finally, the frequency features of the heads are concatenated to obtain the frequency feature corresponding to that part.
In embodiments where hyperspectral and LiDAR images are acquired, the preset frequency features comprise a low-frequency feature, a high-frequency feature, a vertical feature and a horizontal feature. In these embodiments, the low-frequency, high-frequency, vertical and horizontal features of the hyperspectral image, and those of the LiDAR image, are obtained by the method described above.
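The head bookkeeping described above can be sketched as follows: the processed shallow feature is split along the spectral axis into heads, and the heads are dealt out in equal groups, one group per frequency feature. The window shapes in `WINDOW_SHAPES` are purely hypothetical placeholders chosen for illustration; the actual window forms are not reproduced in this text.

```python
import numpy as np

# Hypothetical (height, width) window shapes in tokens, one per frequency feature.
WINDOW_SHAPES = {"low": (8, 8), "high": (2, 2), "vertical": (8, 2), "horizontal": (2, 8)}

def split_heads_by_frequency(x, num_heads, names=tuple(WINDOW_SHAPES)):
    """Split (H, W, C) along the spectral axis into heads, then group the heads."""
    heads = np.split(x, num_heads, axis=-1)          # h heads of C / h channels each
    per_group = num_heads // len(names)              # heads per frequency feature
    return {name: heads[g * per_group:(g + 1) * per_group]
            for g, name in enumerate(names)}

# Hypothetical processed shallow feature: 8 x 8 spatial size, 16 channels, 8 heads.
x = np.arange(8 * 8 * 16, dtype=float).reshape(8, 8, 16)
groups = split_heads_by_frequency(x, num_heads=8)
tokens_per_window = {name: wh * ww for name, (wh, ww) in WINDOW_SHAPES.items()}
```

Each group would then be partitioned with its own window shape, attended window by window, and concatenated to give that group's frequency feature.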
Step 4: The frequency features of all the remote sensing data are input into the same-frequency feature fusion module to obtain the corresponding same-frequency fusion features.
The same-frequency feature fusion module fuses identical frequency features through the following steps:
Step 41: For any frequency component, the frequency components of that kind from all the remote sensing data are added element-wise to obtain the same-frequency component $X$. Global average pooling is applied to $X$ in the channel dimension, and the result is passed through a channel attention module to obtain the channel weight:
$W_c = C_{1\times 1}(\max(0, C_{1\times 1}(X^{c}_{GAP})))$;
where $W_c$ is the channel weight, $C_{1\times 1}$ is a 1×1 convolution layer, $\max$ takes the maximum, and $X^{c}_{GAP}$ is the output of global average pooling of $X$ in the channel dimension.
Step 42: Global average pooling and global maximum pooling are applied to $X$ in the spatial dimension, and the results are passed through a spatial attention module to obtain the spatial weight $W_s$:
$W_s = C_{7\times 7}([X^{s}_{GAP}, X^{s}_{GMP}])$;
where $W_s$ is the spatial weight, $C_{7\times 7}$ is a 7×7 convolution layer, $X^{s}_{GAP}$ is the output of global average pooling of $X$ in the spatial dimension, and $X^{s}_{GMP}$ is the output of global maximum pooling of $X$ in the spatial dimension.
Step 43: Following the broadcasting rule, the channel weight $W_c$ and the spatial weight $W_s$ are fused by addition to obtain the coarse weight $W_{coa} = W_c + W_s$. Each channel of $W_{coa}$ and $X$ is then rearranged by a channel rearrangement operation to obtain the fine weight $W$:
$W = \sigma(GC(CS([X, W_{coa}])))$;
where $W$ is the fine weight, $\sigma$ is the sigmoid function, $GC$ is a group convolution whose number of groups is set to the number of channels, $CS$ is the channel rearrangement operation, and $W_{coa}$ is the coarse weight.
Step 44: Based on the frequency components of that kind from all the remote sensing data and the fine weight, the same-frequency fusion feature for that frequency feature is obtained by weighted summation combined with a residual connection.
In the embodiment of acquiring hyperspectral and LiDAR images, as shown in FIG. 4, the same-frequency feature I from the hyperspectral image and the same-frequency feature II from the LiDAR image are fused by the method to obtain the same-frequency fusion feature, wherein the same-frequency fusion feature comprises a high-frequency fusion feature, a low-frequency fusion feature, a vertical fusion feature and a horizontal fusion feature.
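The data flow of Steps 41 to 44 can be sketched for two sources in NumPy. This is a heavily simplified sketch, not the patented module: the 7×7 convolution, the channel rearrangement, and the group convolution are replaced by trivial stand-ins, channel pooling is interpreted as one descriptor per channel, and the complementary weighting `w` and `1 - w` is an assumption. The names `fuse_same_frequency` and `channel_proj` are hypothetical. The main point is how a per-channel weight and a per-pixel weight broadcast into one coarse weight.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fuse_same_frequency(x_a, x_b, channel_proj):
    """Fuse two (C, H, W) features of the same frequency (simplified sketch)."""
    x = x_a + x_b                                      # Step 41: element-wise sum
    gap_c = x.mean(axis=(1, 2))                        # one descriptor per channel (assumed reading)
    # A 1x1 convolution on a (C, 1, 1) tensor is a matrix product over channels.
    w_channel = (channel_proj @ gap_c).reshape(-1, 1, 1)
    avg_s = x.mean(axis=0, keepdims=True)              # Step 42: (1, H, W) average map
    max_s = x.max(axis=0, keepdims=True)               # Step 42: (1, H, W) maximum map
    w_spatial = 0.5 * (avg_s + max_s)                  # stand-in for the 7x7 convolution
    w_coarse = w_channel + w_spatial                   # Step 43: broadcast to (C, H, W)
    w_fine = sigmoid(w_coarse)                         # stand-in for rearrangement + group conv
    # Step 44: residual connection plus weighted sum (complementary weights assumed).
    fused = x + w_fine * x_a + (1.0 - w_fine) * x_b
    return fused, w_coarse, w_fine


rng = np.random.default_rng(0)
x_hsi = rng.normal(size=(3, 4, 4))      # toy same-frequency feature from source I
x_lidar = rng.normal(size=(3, 4, 4))    # toy same-frequency feature from source II
channel_proj = rng.normal(size=(3, 3))  # toy 1x1 convolution weights
fused, w_coarse, w_fine = fuse_same_frequency(x_hsi, x_lidar, channel_proj)
```

Under the complementary-weight assumption, feeding the same tensor as both sources returns three times that tensor, which gives a quick sanity check of the residual path.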
Step 5: The same-frequency fusion features are concatenated and fused to obtain the multi-source fusion feature.
In the embodiment of acquiring hyperspectral and LiDAR images, the acquired low-frequency fusion features, high-frequency fusion features, vertical fusion features and horizontal fusion features are spliced to obtain multi-source fusion features.
Step 6: The multi-source fusion feature is passed sequentially through the stacked frequency modulation layers and attention layers to obtain the fused global features and local features.
Specifically, the frequency modulation layer and the attention layer are laid out in a staged architecture, i.e. a plurality of frequency modulation layers and a plurality of attention layers are respectively stacked in series, and then the two parts are connected in series.
The frequency modulation layer is used to capture local features. As shown in FIG. 5, the input feature is first transformed to the frequency domain by a block-based fast Fourier transform; a learnable matrix is then introduced, and all frequency components are suppressed or amplified by element-wise multiplication in the frequency domain to obtain the frequency modulation feature; finally, an inverse Fourier transform and reconstruction yield the refined output feature. The expressions for this layer relate the input feature of the frequency modulation layer, the feature obtained through the feed-forward network, the frequency modulation feature, and the output feature of the layer, and are built from the following operations: a 1×1 convolution layer, an activation function, a block partitioning operation, element-wise multiplication, the learnable matrix, a block merging operation, the fast Fourier transform, the inverse fast Fourier transform, and a multilayer perceptron operation.
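The core of the frequency modulation layer described above (block partition, FFT, element-wise scaling by a learnable matrix, inverse FFT, block merge) can be sketched in NumPy. The sketch assumes a real-valued matrix shared by all blocks and an `(H, W, C)` layout, and it leaves out the 1×1 convolution, activation, and multilayer perceptron; the function names are hypothetical.

```python
import numpy as np


def block_partition(x, block):
    """Cut an (H, W, C) tensor into non-overlapping block x block tiles."""
    height, width, channels = x.shape
    tiles = x.reshape(height // block, block, width // block, block, channels)
    return tiles.transpose(0, 2, 1, 3, 4)   # (H/b, W/b, b, b, C)


def block_merge(tiles):
    """Inverse of block_partition."""
    n_h, n_w, block, _, channels = tiles.shape
    return tiles.transpose(0, 2, 1, 3, 4).reshape(n_h * block, n_w * block, channels)


def frequency_modulate(x, weight, block):
    """Scale every frequency bin of every block by a matrix of shape (b, b, C)."""
    tiles = block_partition(x, block)
    spectrum = np.fft.fft2(tiles, axes=(2, 3))   # block-based 2-D FFT
    spectrum = spectrum * weight                 # suppress or amplify frequency components
    # Keep the real part; for a general real weight the inverse may be complex.
    restored = np.fft.ifft2(spectrum, axes=(2, 3)).real
    return block_merge(restored)


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 3))

# A matrix of ones leaves the input unchanged.
identity_out = frequency_modulate(x, np.ones((4, 4, 3)), block=4)

# Keeping only the zero-frequency bin replaces each block by its mean.
dc_only = np.zeros((4, 4, 3))
dc_only[0, 0, :] = 1.0
smooth_out = frequency_modulate(x, dc_only, block=4)
```

The two example matrices show the two extremes: passing every frequency reproduces the input, while passing only the zero frequency acts as a strong low-pass filter inside each block.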
The attention layer is used to capture global attributes or semantic features. As shown in FIG. 6, it is a standard attention layer: the input feature of the attention layer first undergoes layer normalization and a multi-head attention operation in sequence, then layer normalization and a multilayer perceptron operation in sequence, and is finally output. The expressions for this layer relate the input feature of the attention layer, the feature obtained through layer normalization and multi-head attention, and the output feature of the attention layer; the multilayer perceptron operation is used for channel mixing in the attention layer.
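A standard attention layer of the kind described above can be sketched in NumPy as follows. The residual connections and the ReLU inside the multilayer perceptron are assumptions borrowed from common Transformer practice; the text only states the order layer normalization, multi-head attention, layer normalization, multilayer perceptron. All parameter names are hypothetical.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x):
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def multi_head_attention(x, wq, wk, wv, wo, num_heads):
    tokens, dim = x.shape
    head_dim = dim // num_heads

    def split(t):  # (tokens, dim) -> (heads, tokens, head_dim)
        return t.reshape(tokens, num_heads, head_dim).transpose(1, 0, 2)

    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(head_dim))
    merged = (attn @ v).transpose(1, 0, 2).reshape(tokens, dim)
    return merged @ wo


def attention_layer(x, params, num_heads):
    # Layer normalization followed by multi-head attention (residual assumed).
    z = x + multi_head_attention(layer_norm(x), params["wq"], params["wk"],
                                 params["wv"], params["wo"], num_heads)
    # Layer normalization followed by an MLP for channel mixing (residual assumed).
    hidden = np.maximum(0.0, layer_norm(z) @ params["w1"])
    return z + hidden @ params["w2"]


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))   # 5 tokens, 8 channels
params = {name: rng.normal(scale=0.1, size=shape) for name, shape in {
    "wq": (8, 8), "wk": (8, 8), "wv": (8, 8), "wo": (8, 8),
    "w1": (8, 16), "w2": (16, 8)}.items()}
y = attention_layer(x, params, num_heads=2)
```

Because every token attends to every other token, this layer mixes information globally, which matches its stated role of capturing global attributes.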
In some specific embodiments, a ratio factor is introduced to control the numbers of frequency modulation layers and attention layers within the total number of layers, where the factor is the proportion of frequency modulation layers in the total number of layers.
It is known that the frequency modulation layer cannot accurately handle global attributes or semantic features, whereas the attention layer cannot accurately capture local features. Combining the two and introducing the ratio factor to flexibly adjust the numbers of frequency modulation layers and attention layers helps capture both global features and local features accurately.
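The staged layout controlled by the ratio factor can be sketched in plain Python. The names `build_layer_schedule` and `fm_ratio` are hypothetical, and rounding the product to the nearest integer is an assumption; the text does not say how a fractional layer count is handled.

```python
def build_layer_schedule(total_layers, fm_ratio):
    """Return the layer order of a staged stack: all FM layers, then all attention layers.

    fm_ratio is the share of frequency modulation layers in the total layer count.
    Rounding with round() is an assumption made for this sketch.
    """
    if total_layers < 0:
        raise ValueError("total_layers must be non-negative")
    if not 0.0 <= fm_ratio <= 1.0:
        raise ValueError("fm_ratio must lie in [0, 1]")
    num_fm = round(total_layers * fm_ratio)
    num_attention = total_layers - num_fm
    # Staged architecture: the two stacks are each built in series, then chained.
    return ["frequency_modulation"] * num_fm + ["attention"] * num_attention


schedule = build_layer_schedule(total_layers=8, fm_ratio=0.5)
```

Setting the ratio to 0 yields a pure attention stack and setting it to 1 yields a pure frequency modulation stack, so the factor interpolates between the two designs.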
Step 7: The fused global features and local features are weighted in the spectral dimension to obtain the depth feature, from which the predicted classification result of the target object is obtained. Key information is first learned through a one-dimensional convolution, salient features are then highlighted through an activation function, and the prediction result is finally obtained through a second activation function. The expressions for this step relate the depth feature and the predicted classification result, and are built from a fully connected operation and the two activation functions.
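Step 7 can be illustrated with a small NumPy sketch. The text does not recover which activation functions are used, so the sigmoid gate and the softmax output below are assumptions, as are the function name `spectral_weighting_head`, the zero padding, and the toy sizes.

```python
import numpy as np


def spectral_weighting_head(feature, kernel, fc_weight, fc_bias):
    """Weight a (C,) feature along the spectral axis, then classify it.

    Assumed for this sketch: sigmoid as the first activation and softmax as
    the second; the kernel length must be odd so the output keeps length C.
    """
    pad = len(kernel) // 2
    padded = np.pad(feature, pad)                       # zero padding on both ends
    # One-dimensional convolution (cross-correlation form) along the spectral axis.
    conv = np.array([padded[i:i + len(kernel)] @ kernel for i in range(len(feature))])
    gate = 1.0 / (1.0 + np.exp(-conv))                  # highlight salient channels
    depth_feature = feature * gate                      # spectral weighting
    logits = fc_weight @ depth_feature + fc_bias        # fully connected operation
    exp = np.exp(logits - logits.max())
    return depth_feature, exp / exp.sum()               # class probabilities


rng = np.random.default_rng(0)
feature = rng.normal(size=32)              # toy fused feature with 32 channels
kernel = rng.normal(size=3)                # toy 1-D convolution kernel
fc_weight = rng.normal(size=(15, 32))      # 15 classes, as in the Houston2013 dataset
fc_bias = np.zeros(15)
depth_feature, probs = spectral_weighting_head(feature, kernel, fc_weight, fc_bias)
predicted_class = int(np.argmax(probs))
```

The gate lets the network emphasize or damp individual spectral channels before the final classifier sees them.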
Example 2
On the basis of embodiment 1, this embodiment provides a process of obtaining frequency characteristics of two remote sensing data by using a frequency characteristic decomposition module in the embodiment of obtaining hyperspectral and LiDAR images.
In this embodiment, as shown in FIG. 3, the frequency characteristics include a low frequency characteristic, a high frequency characteristic, a vertical characteristic, and a horizontal characteristic.
Specifically, the low-frequency feature, the high-frequency feature, the vertical feature, and the horizontal feature each correspond to their own preset window shape, and each window shape is parameterized by a window size index that takes a positive integer value.
It should be noted that the window size index characterizes the size of the window, and its value determines the number of windows obtained by the division.
First, the processed hyperspectral shallow features and the processed LiDAR shallow features are each divided evenly into the preset number of heads along the spectral dimension, and the heads are divided into 4 equal groups.
The first group is used to compute the low-frequency feature: each head in the first group is divided into windows using the low-frequency window shape, and each window contains a fixed number of tokens, where a token is the smallest unit in a window.
The second group is used to compute the high-frequency feature: each head in the second group is divided into windows using the high-frequency window shape, and each window contains a fixed number of tokens.
The third group is used to compute the vertical feature: each head in the third group is divided into windows using the vertical window shape, and each window contains a fixed number of tokens.
The fourth group is used to compute the horizontal feature: each head in the fourth group is divided into windows using the horizontal window shape, and each window contains a fixed number of tokens.
Taking the vertical feature computed from the third group of the processed LiDAR shallow features as an example, the specific calculation process comprises the following steps:
S1: Using the vertical window shape, the $i$-th head is divided uniformly into $N$ non-overlapping windows, where the number of windows $N$ follows from the window shape and from the height and width of the processed LiDAR shallow features, and $C$ is the spectral dimension of the processed LiDAR shallow features. The query tensor, key tensor, and value tensor of the $i$-th head have the same dimensions.
S2: The attention of each window in the $i$-th head is computed, where the attention of the $j$-th window is:
$A^{i}_{j} = \mathrm{Attention}(Q^{i}_{j}, K^{i}_{j}, V^{i}_{j})$;
where $A^{i}_{j}$ is the attention result of the $j$-th window, $\mathrm{Attention}$ is the attention operation, and $Q^{i}_{j}$, $K^{i}_{j}$, $V^{i}_{j}$ are the query matrix, key matrix, and value matrix of the $i$-th head within the $j$-th window.
S3: The vertical feature of the $i$-th head is obtained from the attention of each window in the $i$-th head:
$Y_i = [A^{i}_{1}, A^{i}_{2}, \ldots, A^{i}_{N}]$;
where $Y_i$ is the vertical feature of the $i$-th head, and $A^{i}_{1}$, $A^{i}_{2}$, $A^{i}_{N}$ are the attention results of the 1st, 2nd, and $N$-th windows.
S4: The vertical features of all heads in the third group are concatenated to obtain the vertical feature of the processed LiDAR shallow features:
$F_{ver} = \mathrm{Concat}(Y_a, Y_{a+1}, \ldots, Y_b)$;
where $F_{ver}$ is the vertical feature of the processed LiDAR shallow features, $\mathrm{Concat}$ is the concatenation operation, and $Y_a$, $Y_{a+1}$, $Y_b$ are the vertical features of the first, second, and last heads of the third group.
It should be noted that the process of calculating the low frequency feature, the high frequency feature, and the horizontal feature is similar to the above method of calculating the vertical feature.
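Steps S1 to S4 can be sketched in NumPy as windowed attention followed by concatenation over heads. The actual window shapes are not recoverable from the text, so the tall one-column window used for the vertical feature below is purely an assumption, as are the function names and the `(H, W, d)` layout.

```python
import numpy as np


def softmax(x):
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def window_attention(q, k, v, win_h, win_w):
    """Attention inside non-overlapping win_h x win_w windows of one (H, W, d) head."""
    height, width, dim = q.shape
    assert height % win_h == 0 and width % win_w == 0

    def to_windows(x):    # S1: (H, W, d) -> (num_windows, tokens_per_window, d)
        x = x.reshape(height // win_h, win_h, width // win_w, win_w, dim)
        return x.transpose(0, 2, 1, 3, 4).reshape(-1, win_h * win_w, dim)

    def from_windows(x):  # S3: place the window results back on the (H, W) grid
        x = x.reshape(height // win_h, width // win_w, win_h, win_w, dim)
        return x.transpose(0, 2, 1, 3, 4).reshape(height, width, dim)

    qw, kw, vw = to_windows(q), to_windows(k), to_windows(v)
    scores = qw @ kw.transpose(0, 2, 1) / np.sqrt(dim)   # S2: attention per window
    return from_windows(softmax(scores) @ vw)


def directional_feature(heads_q, heads_k, heads_v, win_h, win_w):
    """S4: concatenate the per-head results of one group along the channel axis."""
    outputs = [window_attention(q, k, v, win_h, win_w)
               for q, k, v in zip(heads_q, heads_k, heads_v)]
    return np.concatenate(outputs, axis=-1)


rng = np.random.default_rng(0)
heads_q = [rng.normal(size=(8, 8, 4)) for _ in range(2)]
heads_k = [rng.normal(size=(8, 8, 4)) for _ in range(2)]
heads_v = [rng.normal(size=(8, 8, 4)) for _ in range(2)]
# Assumed "vertical" window: 8 rows tall, 1 column wide.
vertical = directional_feature(heads_q, heads_k, heads_v, win_h=8, win_w=1)
```

With a one-column window, each output column depends only on the tokens of that column, which is one plausible way for a window shape to isolate a direction.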
The frequency feature decomposition module of the invention uses the above method to extract different frequency features with different window shapes. Approaching the problem from the perspective of frequency-domain decomposition, it makes good use of multiple directional frequency features through a multi-head attention mechanism to improve the accuracy of the classification results.
Example 3
The present example provides an experimental procedure to verify the classification effect of the proposed method.
The experimental hardware platform is a high-performance computer with an eight-core Intel Core i9-11900K CPU running at 3.50 GHz, 32 GB of memory, and an NVIDIA GeForce RTX 3070 Ti graphics card. The software platform is Python 3.8 under Windows 11, and the proposed method is implemented in the PyTorch framework.
1. Experimental data and sample partitioning
To evaluate the classification effect of the proposed method, the Houston2013 dataset was selected. The dataset was acquired by the National Center for Airborne Laser Mapping (NCALM) over the University of Houston campus and the surrounding area, and consists of a hyperspectral image (HSI) and a LiDAR-derived digital surface model (DSM) of the same region. Both the HSI and the LiDAR data have a spatial resolution of 2.5 m and contain 349×1905 pixels. The HSI has 144 bands covering wavelengths from 0.38 μm to 1.05 μm.
The dataset contains 15,029 labeled ground-truth samples from 15 categories. The number of samples in each category and the split into training and test samples used in the experiment are shown in Table 1.
Classification accuracy is measured with three common evaluation indices: overall accuracy (OA), average accuracy (AA), and the Kappa coefficient.
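The three indices can be computed from a confusion matrix with their standard definitions. The sketch below assumes rows are true classes and columns are predicted classes; the function name and the toy matrix are illustrative and are not results from the experiment.

```python
import numpy as np


def classification_metrics(confusion):
    """Return (OA, AA, Kappa) for a confusion matrix with true classes on the rows."""
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    overall_accuracy = np.trace(confusion) / total
    # Per-class accuracy: correct predictions divided by true samples of that class.
    per_class = np.diag(confusion) / confusion.sum(axis=1)
    average_accuracy = per_class.mean()
    # Agreement expected by chance, from the row and column marginals.
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / total ** 2
    kappa = (overall_accuracy - expected) / (1.0 - expected)
    return overall_accuracy, average_accuracy, kappa


# Toy two-class example with an imbalanced class distribution.
toy_confusion = [[50, 0],
                 [5, 5]]
oa, aa, kappa = classification_metrics(toy_confusion)
```

In the toy example OA is about 0.917 while AA is 0.75 and Kappa is 0.625, which shows why AA and Kappa are reported alongside OA when class sizes differ.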
TABLE 1 training set and test set sample numbers for Houston2013 dataset
2. Parameter setting
In the experiment, three parameters, namely the learning rate, the spatial size, and the dropout rate, have a significant influence on the results. Taking the Houston2013 dataset as an example, these parameters were evaluated in detail.
1) Learning rate: A higher learning rate may make the model converge quickly but can also cause unstable training or even divergence of the loss function, whereas a lower learning rate generally makes convergence more stable but slows training. The learning rate also affects the model's ability to escape local optima: too large a learning rate may cause oscillation and failure to converge, while too small a learning rate may leave the model stuck in a local optimum. The experiment therefore tested the following learning rates: 0.01, 0.005, 0.001, 0.0007, 0.0005, 0.0003, 0.0001, 0.00007, 0.00005, 0.00003, and 0.00001. The results show that classification is best at a learning rate of 0.0001.
2) Spatial size: The extraction of image spatial features depends heavily on the size of the spatial neighborhood. A smaller spatial size captures finer local features and makes the model more sensitive to small objects and changes, which suits the recognition of complex ground-object types; however, too small a spatial size leads to sparse information and more noise interference, which reduces classification accuracy. Conversely, a larger spatial size provides richer context information and strengthens the learning of global features, which suits the classification of larger areas, but it may overlook details and cause class confusion. Choosing an appropriate spatial size is therefore important for classification performance. With the number of spectral channels fixed, the optimal learning rate, a batch size of 64, and 100 training iterations, the classification accuracy at different spatial sizes is shown in Table 2.
3) Dropout rate: As can be seen from Table 3, when the spatial size of the input data is 8×8, the classification effect is best at a dropout rate of 0.5, so this dropout rate was chosen in the experiment to optimize classification performance.
TABLE 2 Classification accuracy at different spatial sizes
TABLE 3 Classification accuracy at different dropout rates
3. Experimental results
To ensure the accuracy of the experimental results, the experiment was repeated 10 times and then averaged.
In order to verify the effectiveness and superiority of the proposed method, the invention is experimentally compared with some traditional methods and mainstream deep learning methods.
The comparison methods are the linear self-attention fusion algorithm LSAF, the coupled CNN fusion algorithm CoupledCNN, the classification algorithm based on coupled adversarial learning CALC, the hierarchical CNN and Transformer algorithm HCT, and the global and local feature fusion network method g2.
The classification results of the different methods on the Houston2013 dataset are compared in Table 4.
As can be seen from Table 4, on the Houston2013 dataset the OA, AA, and Kappa coefficient of the proposed method are all higher than those of the other mainstream deep learning classification methods.
The OA is 4.26% higher than LSAF, 2.07% higher than CoupledCNN, 2.02% higher than CALC, 1.89% higher than HCT, and 0.30% higher than g2.
The AA is 3.45% higher than LSAF, 1.48% higher than CoupledCNN, 1.62% higher than CALC, 1.63% higher than HCT, and 0.42% higher than g2.
The Kappa coefficient is 4.61% higher than LSAF, 2.24% higher than CoupledCNN, 2.18% higher than CALC, 2.06% higher than HCT, and 0.37% higher than g2.
All three indexes show that the method provided by the invention is superior to other methods in classification performance.
TABLE 4 Classification performance of different methods on the Houston2013 dataset
In addition, the classification maps of the different methods on the Houston2013 dataset are shown in FIG. 7. As can be seen from the figure, the final classification maps of the linear self-attention fusion algorithm (LSAF), the coupled CNN fusion algorithm (CoupledCNN), the classification algorithm based on coupled adversarial learning (CALC), the hierarchical CNN and Transformer algorithm (HCT), and the global and local feature fusion network method (g2) all contain a large number of cluttered spots, and some areas are misclassified relative to the ground truth (GT). The g2 method classifies well, but a small amount of clutter remains in the lower-middle part of its map. The classification map predicted by the proposed method is almost entirely correct, shows almost no spots, and is relatively smooth in homogeneous regions.
Therefore, the proposed method does not require a complex and large multi-stage network. It makes good use of multiple directional frequency features through a multi-head attention mechanism, performs multi-source same-frequency fusion adaptively, comprehensively extracts local and global features, and further weights the spectral information to extract depth feature information. It thereby achieves a better classification effect, effectively improves the accuracy of joint classification of multi-source remote sensing data, and outperforms several advanced classification methods.
The foregoing embodiments are merely for illustrating the technical solution of the present application, but not for limiting the same, and although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that modifications may be made to the technical solution described in the foregoing embodiments or equivalents may be substituted for parts of the technical features thereof, and that such modifications or substitutions do not depart from the spirit and scope of the technical solution of the embodiments of the present application in essence.

Claims (10)

1. A multi-source remote sensing data classification method, characterized by comprising:
obtaining multiple types of remote sensing data of a target object;
extracting shallow features of each type of remote sensing data respectively;
inputting the shallow features into a pre-built frequency feature decomposition module to obtain a plurality of preset frequency features of each type of remote sensing data;
inputting the frequency features of all the remote sensing data into a pre-built same-frequency feature fusion module, the same-frequency feature fusion module performing feature fusion on identical frequency features to obtain a corresponding plurality of same-frequency fusion features;
concatenating and fusing the plurality of same-frequency fusion features to obtain a multi-source fusion feature;
passing the multi-source fusion feature sequentially through stacked frequency modulation layers and attention layers to obtain fused global features and local features; and
weighting the fused global features and local features in the spectral dimension to extract depth information, and thereby obtaining a predicted classification result of the target object.

2. The multi-source remote sensing data classification method according to claim 1, characterized in that extracting the shallow features of each type of remote sensing data comprises:
dividing the remote sensing data into image patches;
performing multi-scale convolution operations on the divided patches to obtain corresponding features at multiple scales;
within any preset channel, stacking the features at multiple scales along the channel dimension to obtain a channel fusion feature;
performing element-wise summation, averaging, and max pooling on the channel fusion feature along the channel dimension to obtain a corresponding sum feature map, average feature map, and max-pooled feature map;
concatenating the sum feature map, the average feature map, and the max-pooled feature map, and applying a convolution for further channel fusion to obtain a low-dimensional channel feature of that channel; and
concatenating the low-dimensional channel features of all channels to obtain the shallow features of the remote sensing data.

3. The multi-source remote sensing data classification method according to claim 2, characterized in that the multi-scale convolution operations comprise: performing a 3×3 convolution, a 5×5 convolution, and a 7×7 convolution on the divided patches respectively, each convolution being followed in sequence by a batch normalization operation and a ReLU operation, to obtain the corresponding features at three scales.

4. The multi-source remote sensing data classification method according to claim 1, characterized in that the frequency feature decomposition module is built on a frequency-domain Transformer, and obtaining the plurality of preset frequency features of each shallow feature comprises:
passing the input shallow features through a convolution and a normalization layer to obtain processed shallow features;
dividing the processed shallow features evenly into $h$ heads along the spectral dimension, and dividing the $h$ heads into $n$ equal groups, where $n$ is the number of frequency features, so that each group is used to compute one frequency feature, and $h$ is a preset value;
each frequency feature corresponding to a preset window shape; and
for any group, first dividing each head uniformly into non-overlapping windows using the window shape of the corresponding frequency feature, then computing the attention of each window to obtain the frequency feature of each head, and finally concatenating the frequency features of the heads to obtain the frequency feature corresponding to that group.

5. The multi-source remote sensing data classification method according to claim 4, characterized in that the frequency features comprise a low-frequency feature, a high-frequency feature, a vertical feature, and a horizontal feature;
the low-frequency feature, the high-frequency feature, the vertical feature, and the horizontal feature each correspond to their own preset window shape, each window shape being parameterized by a window size index that takes a positive integer value; and
in the heads used to compute each of these features, each window contains a fixed number of tokens, a token being the smallest unit in a window.

6. The multi-source remote sensing data classification method according to claim 5, characterized in that the process of obtaining the vertical feature comprises:
dividing the $i$-th head uniformly into $N$ non-overlapping windows, where $N$ is the number of windows and follows from the window shape and from the length and width of the processed shallow features, and $C$ is the spectral dimension of the processed shallow features, the query tensor, key tensor, and value tensor of the $i$-th head having the same dimensions;
computing the attention of each window in the $i$-th head, the attention result of the $j$-th window being:
$A^{i}_{j} = \mathrm{Attention}(Q^{i}_{j}, K^{i}_{j}, V^{i}_{j})$;
where $A^{i}_{j}$ is the attention result of the $j$-th window, $\mathrm{Attention}$ is the attention operation, and $Q^{i}_{j}$, $K^{i}_{j}$, $V^{i}_{j}$ are the query matrix, key matrix, and value matrix of the $i$-th head within the $j$-th window;
obtaining the vertical feature of the $i$-th head from the attention of each window in the $i$-th head:
$Y_i = [A^{i}_{1}, A^{i}_{2}, \ldots, A^{i}_{N}]$;
where $Y_i$ is the vertical feature of the $i$-th head, and $A^{i}_{1}$, $A^{i}_{2}$, $A^{i}_{N}$ are the attention results of the 1st, 2nd, and $N$-th windows; and
concatenating the vertical features of all heads used to compute the vertical feature to obtain the vertical feature of the shallow features:
$F_{ver} = \mathrm{Concat}(Y_a, Y_{a+1}, \ldots, Y_b)$;
where $F_{ver}$ is the vertical feature of the shallow features, $\mathrm{Concat}$ is the concatenation operation, and $Y_a$, $Y_{a+1}$, $Y_b$ are the vertical features of the first, second, and last of those heads.

7. The multi-source remote sensing data classification method according to claim 1, characterized in that the same-frequency component fusion module performing feature fusion on identical frequency components to obtain the corresponding plurality of same-frequency fusion features comprises:
for any frequency component, adding the frequency components of that kind from all the remote sensing data element-wise to obtain the same-frequency component $X$;
applying global average pooling to $X$ in the channel dimension and passing the result through a channel attention module to obtain the channel weight:
$W_c = C_{1\times 1}(\max(0, C_{1\times 1}(X^{c}_{GAP})))$;
where $W_c$ is the channel weight, $C_{1\times 1}$ is a 1×1 convolution layer, $\max$ takes the maximum, and $X^{c}_{GAP}$ is the output of global average pooling of $X$ in the channel dimension;
applying global average pooling and global maximum pooling to $X$ in the spatial dimension and passing the results through a spatial attention module to obtain the spatial weight:
$W_s = C_{7\times 7}([X^{s}_{GAP}, X^{s}_{GMP}])$;
where $W_s$ is the spatial weight, $C_{7\times 7}$ is a 7×7 convolution layer, $X^{s}_{GAP}$ is the output of global average pooling of $X$ in the spatial dimension, and $X^{s}_{GMP}$ is the output of global maximum pooling of $X$ in the spatial dimension;
fusing the channel weight and the spatial weight by addition according to the broadcasting rule to obtain the coarse weight $W_{coa}$;
rearranging each channel of the coarse weight and the same-frequency component through a rearrangement operation to obtain the fine weight:
$W = \sigma(GC(CS([X, W_{coa}])))$;
where $W$ is the fine weight, $\sigma$ is the sigmoid function, $GC$ is a group convolution whose number of groups is set to the number of channels, $CS$ is the channel rearrangement operation, and $W_{coa}$ is the coarse weight; and
based on the frequency components of that kind from all the remote sensing data and the fine weight, obtaining the same-frequency fusion feature of that frequency feature by weighted summation combined with a residual connection.

8. The multi-source remote sensing data classification method according to claim 1, characterized in that the frequency modulation layers and the attention layers adopt a staged architecture in which the stacked frequency modulation layers and the stacked attention layers are connected in series, and a ratio factor is introduced to control the numbers of frequency modulation layers and attention layers within the total number of layers, the factor being the proportion of frequency modulation layers in the total number of layers.

9. The multi-source remote sensing data classification method according to claim 8, characterized in that the frequency modulation layer is used to capture local features, comprising: first transforming the input feature to the frequency domain by applying a block-based fast Fourier transform; then introducing a learnable matrix and suppressing or amplifying all frequency components by element-wise multiplication in the frequency domain to obtain the frequency modulation feature; and then applying an inverse Fourier transform and reconstruction to obtain the refined output feature, the expressions relating the input feature, the feature obtained through the feed-forward network, the frequency modulation feature, and the refined output feature being built from layer normalization, a 1×1 convolution layer, an activation function, a block partitioning operation, element-wise multiplication, the learnable matrix, a block merging operation, the fast Fourier transform, the inverse fast Fourier transform, and a multilayer perceptron operation;
the attention layer is used to capture global attributes or semantic features, comprising: performing layer normalization and a multi-head attention operation in sequence on the features input to the attention layer, then performing layer normalization and a multilayer perceptron operation in sequence, and finally outputting the result, wherein the multilayer perceptron operation is used for channel mixing in the attention layer.

10. The multi-source remote sensing data classification method according to claim 1, characterized in that weighting the fused global features and local features in the spectral dimension to obtain the depth feature, and thereby obtaining the classification result of the target object, comprises: first learning key information through a one-dimensional convolution, then highlighting salient features through an activation function, and finally obtaining the prediction result through a second activation function, the expressions relating the depth feature and the predicted classification result being built from a fully connected operation and the two activation functions.
CN202411798370.4A 2024-12-09 2024-12-09 A classification method for multi-source remote sensing data Pending CN119649135A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411798370.4A CN119649135A (en) 2024-12-09 2024-12-09 A classification method for multi-source remote sensing data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411798370.4A CN119649135A (en) 2024-12-09 2024-12-09 A classification method for multi-source remote sensing data

Publications (1)

Publication Number Publication Date
CN119649135A true CN119649135A (en) 2025-03-18

Family

ID=94958092

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411798370.4A Pending CN119649135A (en) 2024-12-09 2024-12-09 A classification method for multi-source remote sensing data

Country Status (1)

Country Link
CN (1) CN119649135A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120088295A (en) * 2025-04-27 2025-06-03 南京信息工程大学 A snow distribution prediction method and system based on a dual-branch model
CN120495663A (en) * 2025-05-08 2025-08-15 耕宇牧星(北京)空间科技有限公司 Remote sensing image segmentation method based on hyperspectral nucleation and space-time feature fusion
CN120997066A (en) * 2025-08-11 2025-11-21 安徽大学 An Arbitrary Size Image Enhancement Method Based on Frequency Aggregation Self-Attention

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116863247A (en) * 2023-08-22 2023-10-10 南京信息工程大学 Multi-mode remote sensing data classification method integrating global information and local information
CN117876890A (en) * 2024-03-11 2024-04-12 成都信息工程大学 A multi-source remote sensing image classification method based on multi-level feature fusion

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BING TU ET AL.: "NCGLF2: Network combining global and local features for fusion of multisource remote sensing data", INFORMATION FUSION, vol. 104, 15 December 2023 (2023-12-15), pages 1 - 17 *
YANG XI ET AL.: "多模态数据融合与检索技术" (Multimodal Data Fusion and Retrieval Technology), vol. 1, 30 June 2021, 西安电子科技大学出版社 (Xidian University Press), pages: 81 - 82 *

Similar Documents

Publication Publication Date Title
CN112381013B (en) Urban vegetation inversion method and system based on high-resolution remote sensing images
Wang et al. Global feature-injected blind-spot network for hyperspectral anomaly detection
CN119649135A (en) A classification method for multi-source remote sensing data
CN112101271A (en) Hyperspectral remote sensing image classification method and device
CN108229551B (en) Hyperspectral remote sensing image classification method based on compact dictionary sparse representation
CN115909112A (en) A method and system for detecting changes in farmland utilization in UAV hyperspectral remote sensing images based on multi-scale differential depth feature fusion
CN113421198A (en) Hyperspectral image denoising method based on subspace non-local low-rank tensor decomposition
Xu et al. Vegetation information extraction in karst area based on UAV remote sensing in visible light band
CN106529472B (en) Target detection method and device based on large-scale high-resolution hyperspectral images
Liu et al. Hyperspectral real-time local anomaly detection based on finite Markov via line-by-line processing
CN114926694B (en) Hyperspectral image classification method, device, electronic device and storage medium
Yin et al. Multibranch 3D-dense attention network for hyperspectral image classification
CN114067217A (en) SAR image target identification method based on non-downsampling decomposition converter
Liu et al. Hyperspectral real-time online processing local anomaly detection via multiline multiband progressing
Wang et al. Expansion spectral–spatial attention network for hyperspectral image classification
Zhang et al. Feature-band-based unsupervised hyperspectral underwater target detection near the coastline
Zhao et al. Exploring an application-oriented land-based hyperspectral target detection framework based on 3D–2D CNN and transfer learning
Wang et al. A novel BH3DNet method for identifying pine wilt disease in Masson pine fusing UAS hyperspectral imagery and LiDAR data
CN117975266A (en) Change detection method for multimodal remote sensing images based on boundary extraction and constraints
CN116188830B (en) Cross-domain classification method of hyperspectral images based on multi-level feature alignment
CN117115669B (en) Object-level ground object sample self-adaptive generation method and system with double-condition quality constraint
CN117690039A (en) A deep learning cloud detection method that integrates texture information, spectral information, polarization information, and multi-angle information
CN118628821A (en) A wavelet-guided network object classification method for on-site security scenarios
CN117115675A (en) A cross-temporal lightweight spatial spectrum feature fusion hyperspectral change detection method, system, equipment and medium
CN118887547B (en) Cross-domain small sample SAR oil spill detection method based on category perception distance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination