Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
To this end, it is an object of the present invention to propose a high-texture image classification method based on visual saliency detection, which can automatically distinguish various textures in an image (field of view) so that subsequent scene description or object recognition is possible.
Another object of the present invention is to provide a high-texture image classification apparatus based on visual saliency detection.
In order to achieve the above object, an embodiment of an aspect of the present invention provides a high-texture image classification method based on visual saliency detection, including the following steps: segmenting a saliency region by an image pixel saliency value detection method of color contrast; extracting texture features and gradient features of the salient region through a complete local binary pattern operator and a direction gradient histogram algorithm, and jointly representing image detail information through an effective series fusion strategy; and classifying the extracted fusion vector through a nearest neighbor classifier to obtain the recognition rate.
According to the high-texture image classification method based on visual saliency detection, a saliency region is segmented by using an image pixel saliency value detection method based on color contrast, then the texture and gradient features of the saliency region are extracted by using a complete local binary pattern algorithm and a direction gradient histogram algorithm, an effective series fusion strategy is carried out, image detail information is jointly represented, and finally a nearest neighbor classifier is used for classifying the extracted texture feature vectors to obtain an identification rate, so that the purpose that a computer automatically classifies the extracted textures is achieved.
In addition, the high-texture image classification method based on visual saliency detection according to the above embodiment of the present invention may also have the following additional technical features:
Further, in an embodiment of the present invention, segmenting the salient region by the color-contrast-based image pixel saliency value detection method further comprises: quantizing the color space to obtain a set of representative colors; acquiring the frequency of occurrence in the input image of the colors corresponding to each representative color, and forming a histogram; calculating a saliency value for each representative color according to the difference between that representative color and the other representative colors; and assigning the saliency value of each representative color to the corresponding pixels.
Further, in an embodiment of the present invention, the color space quantization method is octree color quantization, which is divided into establishing a color octree, generating a color palette, and generating a quantized file. Pixel colors are read in sequentially and a color octree whose number of leaf nodes does not exceed the number of quantized colors is established. The color octree is traversed, and if any color in the image does not exist in the color octree, a new leaf node is inserted to represent that color. If the number of leaf nodes of the color octree exceeds the number of quantized colors after a pixel color is inserted, the leaf nodes are merged according to a merging strategy, so that after all pixels are inserted only colors not exceeding the number of quantized colors are saved as the color palette. The file is then scanned again and each color is mapped to the color palette to generate a new quantized image.
Further, in an embodiment of the present invention, the texture features of the salient region are extracted by the complete local binary pattern (CLBP) algorithm. The complete local binary pattern describes the texture features of a pixel point by three components: the gray-value magnitude relation feature CLBP_S (CLBP-Sign), the gray-value difference magnitude feature CLBP_M (CLBP-Magnitude), and the feature CLBP_C (CLBP-Center) relating the pixel gray value to the global average gray value, so as to extract the gray texture information of a single pixel point to the greatest extent. The mathematical description of the complete local binary pattern features is as follows:
wherein g_i (i = 1, 2, …, N) denotes the gray value of a neighborhood pixel point centered on g_c, R is the neighborhood radius, m_N represents the difference value between the central pixel point and the neighborhood pixel point, c represents the average value of m_N in the local image, and c_l represents the global gray mean.
Further, in an embodiment of the present invention, the gradient features of the salient region are extracted by the histogram of oriented gradients (HOG) algorithm, and the formula by which the HOG algorithm calculates the gradient of a pixel point is as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y),
Gy(x,y)=H(x,y+1)-H(x,y-1),
wherein Gx(x,y), Gy(x,y) and H(x,y) respectively represent the horizontal gradient, the vertical gradient and the pixel value at the pixel point (x,y) in the input image;
the gradient magnitude G(x,y) and the gradient direction α(x,y) at the pixel point are respectively:
G(x,y)=sqrt(Gx(x,y)^2+Gy(x,y)^2),
α(x,y)=arctan(Gy(x,y)/Gx(x,y)).
further, in one embodiment of the present invention, the texture feature vector and the gradient feature vector are fused in series to form the fused feature vector, and the formula is as follows:
H=[Hclbp,Hhog],
wherein H, Hclbp and Hhog respectively represent the fused feature vector, the CLBP feature vector and the HOG feature vector.
Further, in an embodiment of the present invention, the classifying the extracted fusion vector by the nearest neighbor classifier further includes: calculating, by the nearest neighbor classifier, a similarity and a dissimilarity between the two histograms.
In order to achieve the above object, another embodiment of the present invention provides a high-texture image classification system based on visual saliency detection, including: the detection module is used for segmenting a saliency region according to an image pixel saliency value detection method of the color contrast; the extraction module is used for extracting the texture features and the gradient features of the salient region through a complete local binary pattern operator and a direction gradient histogram algorithm, and jointly representing image detail information through an effective series fusion strategy; and the classification module is used for classifying the extracted fusion vector through a nearest neighbor classifier so as to obtain the recognition rate.
According to the high-texture image classification system based on visual saliency detection, a saliency region is segmented by using an image pixel saliency value detection method based on color contrast, then the texture and gradient features of the saliency region are extracted by using a complete local binary pattern algorithm and a direction gradient histogram algorithm, an effective series fusion strategy is carried out, image detail information is jointly represented, and finally a nearest neighbor classifier is used for classifying the extracted texture feature vectors to obtain an identification rate, so that the purpose that a computer automatically classifies the extracted textures is achieved.
In addition, the high-texture image classification system based on visual saliency detection according to the above embodiment of the present invention may also have the following additional technical features:
Further, in an embodiment of the present invention, the texture features of the salient region are extracted by the complete local binary pattern algorithm. The complete local binary pattern describes the texture features of a pixel point by the gray-value magnitude relation feature CLBP_S, the gray-value difference magnitude feature CLBP_M, and the feature CLBP_C relating the pixel gray value to the global average gray value, so as to extract the gray texture information of a single pixel point to the greatest extent. The mathematical description of the complete local binary pattern features is as follows:
wherein g_i (i = 1, 2, …, N) denotes the gray value of a neighborhood pixel point centered on g_c, R is the neighborhood radius, m_N represents the difference value between the central pixel point and the neighborhood pixel point, c represents the average value of m_N in the local image, and c_l represents the global gray mean.
Further, in an embodiment of the present invention, the gradient feature of the significant region is extracted through the direction gradient histogram algorithm HOG, and a formula for calculating a gradient of a pixel point by the direction gradient histogram algorithm is as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y),
Gy(x,y)=H(x,y+1)-H(x,y-1),
wherein Gx(x,y), Gy(x,y) and H(x,y) respectively represent the horizontal gradient, the vertical gradient and the pixel value at the pixel point (x,y) in the input image;
the gradient magnitude G(x,y) and the gradient direction α(x,y) at the pixel point are respectively:
G(x,y)=sqrt(Gx(x,y)^2+Gy(x,y)^2),
α(x,y)=arctan(Gy(x,y)/Gx(x,y)).
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer throughout to the same or similar elements or to elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and intended to explain the invention, and are not to be construed as limiting the invention.
The high-texture image classification method and system based on visual saliency detection proposed according to an embodiment of the present invention will be described below with reference to the accompanying drawings, and first, the high-texture image classification method based on visual saliency detection proposed according to an embodiment of the present invention will be described with reference to the accompanying drawings.
Fig. 1 is a flowchart of a high-texture image classification method based on visual saliency detection according to an embodiment of the present invention.
As shown in fig. 1, the high-texture image classification method based on visual saliency detection according to the embodiment of the present invention includes the following steps:
in step S101, a saliency region is segmented by an image pixel saliency value detection method of color contrast.
Further, in an embodiment of the present invention, segmenting out the saliency region by the image pixel saliency value detection method of the color contrast, as shown in fig. 2, may further include: quantizing the color space to obtain a set of representative colors; acquiring the occurrence frequency of colors corresponding to the representative colors in the input image, and forming a histogram; calculating a significance value of the representative color according to the difference between each representative color and other representative colors; the saliency value of each representative color is assigned to the corresponding pixel.
Further, in one embodiment of the present invention, the color space quantization method is octree color quantization, which is divided into establishing a color octree, generating a color palette, and generating a quantized file. Pixel colors are read in sequentially and a color octree whose number of leaf nodes does not exceed the number of quantized colors is established. The color octree is traversed, and if any color in the image does not exist in the color octree, a new leaf node is inserted to represent that color. If the number of leaf nodes of the color octree exceeds the number of quantized colors after a pixel color is inserted, the leaf nodes are merged according to a merging strategy, so that after all pixels are inserted only colors not exceeding the number of quantized colors are saved as the color palette. The file is then scanned again and each color is mapped to the color palette to generate a new quantized image.
Specifically, as shown in fig. 3, the color space is quantized using an octree structure to obtain a set of representative colors. The pixel colors (R, G, B) are read in sequentially and a color octree with no more than 256 leaf nodes (256 being the number of colors after quantization) is built. The octree is traversed for each pixel color, and if that color does not yet exist in the octree, a new leaf node is inserted to represent it, which removes the influence of repeated colors. If the number of leaf nodes of the octree exceeds 256 after a pixel color is inserted, leaf nodes are merged according to a certain merging strategy. Therefore, after all pixels have been inserted, no more than 256 colors are saved as the color palette. Finally, the file is scanned again and each color is mapped to the color palette to generate the new quantized image.
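The octree quantization described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the class and function names (`OctreeQuantizer`, `quantize`) are hypothetical, and because the text does not specify the merging strategy, the sketch assumes the common choice of merging the children of a node at the deepest level first.

```python
import numpy as np

MAX_DEPTH = 8  # one tree level per bit of an 8-bit color channel


class _Node:
    def __init__(self):
        self.children = [None] * 8
        self.is_leaf = False
        self.count = 0             # number of pixels accumulated in this node
        self.rgb_sum = [0, 0, 0]   # per-channel sums, used to average the color
        self.palette_index = -1


def _child_index(color, level):
    # Combine bit (7 - level) of R, G and B into a child index in 0..7.
    bit = 7 - level
    r, g, b = color
    return (((r >> bit) & 1) << 2) | (((g >> bit) & 1) << 1) | ((b >> bit) & 1)


class OctreeQuantizer:
    def __init__(self, max_colors=256):
        self.max_colors = max_colors  # assumed to be at least 1
        self.root = _Node()
        self.leaf_count = 0
        # levels[d] holds the internal (non-leaf) nodes at depth d.
        self.levels = [[] for _ in range(MAX_DEPTH)]
        self.levels[0].append(self.root)

    def insert(self, color):
        node = self.root
        for level in range(MAX_DEPTH):
            if node.is_leaf:  # path already ends in a merged leaf
                break
            idx = _child_index(color, level)
            if node.children[idx] is None:
                child = _Node()
                node.children[idx] = child
                if level == MAX_DEPTH - 1:
                    child.is_leaf = True
                    self.leaf_count += 1
                else:
                    self.levels[level + 1].append(child)
            node = node.children[idx]
        node.count += 1
        for k in range(3):
            node.rgb_sum[k] += color[k]
        while self.leaf_count > self.max_colors:
            self._merge_once()

    def _merge_once(self):
        # Assumed strategy: fold the children of one deepest internal node.
        level = max(d for d in range(MAX_DEPTH) if self.levels[d])
        node = self.levels[level].pop()
        for child in node.children:
            if child is not None:
                node.count += child.count
                for k in range(3):
                    node.rgb_sum[k] += child.rgb_sum[k]
                self.leaf_count -= 1
        node.children = [None] * 8
        node.is_leaf = True
        self.leaf_count += 1

    def palette(self):
        # Average color of every leaf; also records each leaf's palette index.
        colors, stack = [], [self.root]
        while stack:
            node = stack.pop()
            if node.is_leaf:
                node.palette_index = len(colors)
                colors.append([s // node.count for s in node.rgb_sum])
            else:
                stack.extend(c for c in node.children if c is not None)
        return np.array(colors, dtype=np.uint8)

    def lookup(self, color):
        # Valid only for colors that were inserted, after palette() was called.
        node, level = self.root, 0
        while not node.is_leaf:
            node = node.children[_child_index(color, level)]
            level += 1
        return node.palette_index


def quantize(image, max_colors=256):
    """Return (palette, per-pixel palette indices) for an HxWx3 uint8 image."""
    pixels = image.reshape(-1, 3).tolist()  # plain Python ints, no overflow
    tree = OctreeQuantizer(max_colors)
    for color in pixels:                    # first pass: build the octree
        tree.insert(color)
    palette = tree.palette()
    indices = np.array([tree.lookup(c) for c in pixels])  # second pass: map
    return palette, indices.reshape(image.shape[:2])
```

Indexing the returned palette with the index image (`palette[indices]`) yields the quantized image. The two passes over the pixels mirror the "build the tree, then scan again and map" order given in the text.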
Next, the frequency of occurrence of each representative color in the input image is calculated to form a histogram. The frequency of occurrence of a representative color is the proportion of pixels in the input image that correspond to that color; each representative color has one such frequency, and the set of frequencies over all representative colors is called the histogram. To save computing resources, usually only the representative colors with high frequencies of occurrence are retained, and the frequency of each remaining representative color is added to that of the retained representative color closest to it in color. To select the high-frequency representative colors, the representative colors are sorted by frequency of occurrence in descending order and then selected from front to back until they cover a given proportion of the image pixels. In the experiments this proportion is usually set to 95%.
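The histogram construction and the 95% coverage rule can be illustrated with the short sketch below. It assumes the quantized image is available as a palette plus a per-pixel index array; the function name `reduce_histogram` and its parameters are hypothetical, and closeness in color is measured here by Euclidean distance between palette entries, which is an assumption.

```python
import numpy as np


def reduce_histogram(palette, indices, coverage=0.95):
    """Keep the most frequent colors covering `coverage` of the pixels.

    palette: (K, 3) array of representative colors.
    indices: integer array giving the palette index of every pixel.
    Returns (kept colors, their frequencies, re-mapped per-pixel indices).
    """
    counts = np.bincount(indices.ravel(), minlength=len(palette))
    freq = counts / counts.sum()                 # frequency of occurrence

    order = np.argsort(-freq, kind="stable")     # sort from large to small
    cumulative = np.cumsum(freq[order])
    n_keep = int(np.searchsorted(cumulative, coverage)) + 1
    n_keep = min(n_keep, len(palette))
    kept, dropped = order[:n_keep], order[n_keep:]

    kept_freq = freq[kept].copy()
    remap = np.empty(len(palette), dtype=np.intp)
    remap[kept] = np.arange(n_keep)

    pal = palette.astype(np.float64)             # avoid uint8 wrap-around
    for j in dropped:
        # Add the dropped color's frequency to the closest retained color.
        nearest = int(np.argmin(np.linalg.norm(pal[kept] - pal[j], axis=1)))
        kept_freq[nearest] += freq[j]
        remap[j] = nearest

    return palette[kept], kept_freq, remap[indices]
```

Because the dropped frequencies are folded into retained colors, the returned frequencies still sum to 1.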
The saliency value S of each representative color is calculated from the difference between that representative color and the other representative colors, using the following formula:
S(c_l)=Σ_{j≠l} f_j·D(c_l,c_j),
wherein c_j denotes a representative color other than c_l, f_j is the frequency of occurrence of c_j, and D(c_l,c_j) is the Euclidean distance between c_l and c_j in the color space.
For each representative color, its saliency value is assigned to the corresponding pixel.
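As a sketch of the last two steps, the saliency of each representative color can be computed as the frequency-weighted sum of its Euclidean distances to the other representative colors and then looked up per pixel. The function names are hypothetical, and the rescaling of the map to [0, 1] is an assumption added for convenience; the text does not specify it.

```python
import numpy as np


def color_saliency(colors, freq):
    """Saliency of each representative color: sum_j f_j * D(c_l, c_j)."""
    colors = np.asarray(colors, dtype=np.float64)
    diff = colors[:, None, :] - colors[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))   # pairwise Euclidean distances
    # The diagonal of dist is zero, so the j == l term contributes nothing.
    return dist @ np.asarray(freq, dtype=np.float64)


def saliency_map(colors, freq, labels):
    """Assign every pixel the saliency of its representative color.

    labels: integer array with the representative-color index of each pixel.
    The map is rescaled to [0, 1] (an assumption, not stated in the text).
    """
    s = color_saliency(colors, freq)
    span = s.max() - s.min()
    s = (s - s.min()) / span if span > 0 else np.zeros_like(s)
    return s[labels]
```

A salient region could then be obtained by thresholding the map, for example `saliency_map(...) >= 0.5`; the threshold value is hypothetical and would need to be chosen for the data at hand.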
For example, fig. 4 is a schematic diagram of the results of applying the visual saliency detection method to a plant picture.
In step S102, texture features and gradient features of the salient region are extracted through a complete local binary pattern operator and a directional gradient histogram algorithm, and image detail information is jointly represented through an effective series fusion strategy.
Further, in an embodiment of the present invention, the texture features of the salient region are extracted by the complete local binary pattern algorithm. The complete local binary pattern describes the texture features of a pixel point by the gray-value magnitude relation feature CLBP_S, the gray-value difference magnitude feature CLBP_M, and the feature CLBP_C relating the pixel gray value to the global average gray value, so as to extract the gray texture information of a single pixel point to the greatest extent. The mathematical description of the complete local binary pattern features is as follows:
wherein g_i (i = 1, 2, …, N) denotes the gray value of a neighborhood pixel point centered on g_c, R is the neighborhood radius, m_N represents the difference value between the central pixel point and the neighborhood pixel point, c represents the average value of m_N in the local image, and c_l represents the global gray mean.
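For reference, the three CLBP operators as they are commonly defined in the literature can be written in the notation used here as shown below. This is a sketch of the standard definitions and not necessarily the exact form of the formula referred to above; the index i on m_i (the text writes m_N) and the threshold functions s and t are assumptions introduced for the illustration.

```latex
\begin{aligned}
\mathrm{CLBP\_S}_{N,R} &= \sum_{i=1}^{N} s(g_i - g_c)\,2^{\,i-1},
  & s(x) &= \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \\
\mathrm{CLBP\_M}_{N,R} &= \sum_{i=1}^{N} t(m_i, c)\,2^{\,i-1},
  & t(x, c) &= \begin{cases} 1, & x \ge c \\ 0, & x < c \end{cases} \\
\mathrm{CLBP\_C}_{N,R} &= t(g_c, c_l),
  & m_i &= \lvert g_i - g_c \rvert
\end{aligned}
```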
Further, in an embodiment of the present invention, the gradient feature of the significant region is extracted by a direction gradient histogram algorithm, and a formula for calculating the gradient of the pixel point by the direction gradient histogram algorithm is as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y),
Gy(x,y)=H(x,y+1)-H(x,y-1),
wherein Gx(x,y), Gy(x,y) and H(x,y) respectively represent the horizontal gradient, the vertical gradient and the pixel value at the pixel point (x,y) in the input image;
the gradient magnitude G(x,y) and the gradient direction α(x,y) at the pixel point are respectively:
G(x,y)=sqrt(Gx(x,y)^2+Gy(x,y)^2),
α(x,y)=arctan(Gy(x,y)/Gx(x,y)).
further, in one embodiment of the present invention, the texture feature vector and the gradient feature vector are fused in series to form a fused feature vector, and the formula is as follows:
H=[Hclbp,Hhog],
wherein H, Hclbp and Hhog respectively represent the fused feature vector, the CLBP feature vector and the HOG feature vector.
Specifically, as shown in fig. 5, the complete local binary pattern algorithm is used to extract the texture features of the salient region. The complete local binary pattern describes the texture features of a pixel point by the gray-value magnitude relation feature CLBP_S, the gray-value difference magnitude feature CLBP_M, and the feature CLBP_C relating the pixel gray value to the global average gray value, and extracts the gray texture information of a single pixel point to the greatest extent. The mathematical description of the CLBP features is as follows:
wherein g_i (i = 1, 2, …, N) denotes the gray value of a neighborhood pixel point centered on g_c; R is the neighborhood radius; m_N denotes the magnitude of the difference between the central pixel point and the neighborhood pixel point; c represents the mean value of m_N in the local image; and c_l represents the global gray mean.
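A minimal sketch of computing the three CLBP codes is given below. It assumes the commonly used definitions (CLBP_S thresholds the signed difference g_i - g_c at zero, CLBP_M thresholds its magnitude at the mean magnitude c, and CLBP_C thresholds the center pixel at the global gray mean), and it fixes N = 8 neighbors at radius R = 1 without interpolation. The function names, and the choice to concatenate the three histograms into one descriptor, are assumptions made for the illustration.

```python
import numpy as np

# Eight neighbours at radius 1 as (row, column) offsets; an assumed pattern.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def clbp(gray):
    """Return the CLBP_S, CLBP_M and CLBP_C code maps of a 2-D gray image.

    Border pixels are skipped, so each map has shape (H - 2, W - 2).
    """
    g = np.asarray(gray, dtype=np.float64)
    h, w = g.shape
    center = g[1:-1, 1:-1]
    # diffs[i] holds g_i - g_c for neighbour i at every interior pixel.
    diffs = np.stack([g[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] - center
                      for dy, dx in OFFSETS])
    weights = (1 << np.arange(len(OFFSETS))).reshape(-1, 1, 1)

    magnitudes = np.abs(diffs)
    c = magnitudes.mean()                      # mean magnitude over the region
    clbp_s = ((diffs >= 0) * weights).sum(axis=0)
    clbp_m = ((magnitudes >= c) * weights).sum(axis=0)
    clbp_c = (center >= g.mean()).astype(np.int64)
    return clbp_s, clbp_m, clbp_c


def clbp_histogram(gray):
    """Concatenate the normalized histograms of the three code maps."""
    s, m, c = clbp(gray)
    n_codes = 1 << len(OFFSETS)
    parts = [np.bincount(s.ravel(), minlength=n_codes),
             np.bincount(m.ravel(), minlength=n_codes),
             np.bincount(c.ravel(), minlength=2)]
    hist = np.concatenate(parts).astype(np.float64)
    return hist / hist.sum()
```

With eight neighbors each of CLBP_S and CLBP_M takes 256 possible values, so the concatenated descriptor has 256 + 256 + 2 = 514 bins.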
The gradient features of the salient region are extracted using the HOG algorithm. The formula by which the HOG algorithm calculates the gradient at the pixel point (x, y) is as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y)
Gy(x,y)=H(x,y+1)-H(x,y-1)
wherein Gx(x,y), Gy(x,y) and H(x,y) respectively represent the horizontal gradient, the vertical gradient and the pixel value at the pixel point (x,y) in the input image. The gradient magnitude G(x,y) and the gradient direction α(x,y) at the pixel point (x,y) are respectively:
G(x,y)=sqrt(Gx(x,y)^2+Gy(x,y)^2)
α(x,y)=arctan(Gy(x,y)/Gx(x,y))
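The central-difference gradients, together with the usual magnitude sqrt(Gx^2 + Gy^2) and direction arctan(Gy/Gx), can be sketched as follows. This is a simplified illustration with hypothetical function names: it builds a single magnitude-weighted orientation histogram over the whole region rather than the cell and block layout of a full HOG descriptor, and it assumes unsigned directions folded into [0, 180) degrees with 9 bins.

```python
import numpy as np


def gradients(image):
    """Central-difference gradients, magnitude and direction of a 2-D image.

    x is taken as the column index and y as the row index; border pixels
    keep a zero gradient.
    """
    h = np.asarray(image, dtype=np.float64)
    gx = np.zeros_like(h)
    gy = np.zeros_like(h)
    gx[:, 1:-1] = h[:, 2:] - h[:, :-2]     # H(x+1, y) - H(x-1, y)
    gy[1:-1, :] = h[2:, :] - h[:-2, :]     # H(x, y+1) - H(x, y-1)
    magnitude = np.hypot(gx, gy)
    # arctan2 avoids division by zero; the angle is folded to [0, 180).
    direction = np.degrees(np.arctan2(gy, gx)) % 180.0
    return gx, gy, magnitude, direction


def orientation_histogram(magnitude, direction, bins=9):
    """Magnitude-weighted histogram of gradient directions, L2-normalized."""
    idx = np.minimum((direction / (180.0 / bins)).astype(int), bins - 1)
    hist = np.bincount(idx.ravel(), weights=magnitude.ravel(), minlength=bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```

For a full descriptor the same histogram would be computed per cell and normalized per block; those layout parameters are not given in the text and are left out here.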
Then the extracted CLBP feature histogram Hclbp and the HOG feature histogram Hhog are connected in series to form the fused feature vector H, according to the following formula:
H=[Hclbp,Hhog]
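Serial fusion amounts to concatenating the two histograms end to end, as the small sketch below shows. The function name `fuse_features` is hypothetical, and no re-weighting or re-normalization of the two parts is applied because the text does not describe any.

```python
import numpy as np


def fuse_features(h_clbp, h_hog):
    """Serially fuse two feature histograms: H = [H_clbp, H_hog]."""
    h_clbp = np.asarray(h_clbp, dtype=np.float64).ravel()
    h_hog = np.asarray(h_hog, dtype=np.float64).ravel()
    return np.concatenate([h_clbp, h_hog])
```

The fused vector has length len(Hclbp) + len(Hhog), with the CLBP entries first.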
in step S103, the extracted fusion vector is classified by the nearest neighbor classifier to obtain a recognition rate.
Further, in an embodiment of the present invention, classifying the extracted fusion vector by a nearest neighbor classifier may further include: the similarity and difference between the two histograms is calculated by the nearest neighbor classifier.
That is, nearest neighbor is a simple and effective classification criterion. A nearest neighbor classifier can quickly calculate the similarity or difference between two histograms using a distance measure, the most common being the Euclidean distance and the Mahalanobis distance. The Mahalanobis distance is used as the measure in the experiments described herein, as shown below:
wherein the training feature vector is that of the kth class of images in the training sample set, H'test is the feature vector of the image to be identified in the test sample set, and D is the Mahalanobis distance between the two feature vectors.
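A nearest neighbor classifier with the Mahalanobis distance can be sketched as below, using the standard definition D(a, b) = sqrt((a - b)^T Σ^{-1} (a - b)). The function names are hypothetical. How the covariance matrix Σ is obtained is not stated in the text, so this sketch assumes it is estimated from the pooled training vectors, with a small ridge term added to keep it invertible.

```python
import numpy as np


def fit_inverse_covariance(train, ridge=1e-6):
    """Inverse covariance of the training vectors (rows are samples)."""
    cov = np.atleast_2d(np.cov(np.asarray(train, dtype=np.float64), rowvar=False))
    cov = cov + ridge * np.eye(cov.shape[0])   # assumed regularization
    return np.linalg.inv(cov)


def mahalanobis(a, b, inv_cov):
    """Mahalanobis distance between two feature vectors."""
    d = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.sqrt(max(d @ inv_cov @ d, 0.0)))


def nearest_neighbor(test_vec, train, train_labels, inv_cov):
    """Label of the training vector closest to test_vec."""
    dists = [mahalanobis(test_vec, t, inv_cov) for t in train]
    return train_labels[int(np.argmin(dists))]


def recognition_rate(test, test_labels, train, train_labels):
    """Fraction of test vectors whose predicted label is correct."""
    inv_cov = fit_inverse_covariance(train)
    predicted = [nearest_neighbor(x, train, train_labels, inv_cov) for x in test]
    return float(np.mean(np.array(predicted) == np.array(test_labels)))
```

With an identity matrix in place of the inverse covariance, the same code reduces to the Euclidean nearest neighbor rule, which is the other measure named in the text.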
According to the high-texture image classification method based on visual saliency detection provided by the embodiment of the invention, a saliency region is segmented by using an image pixel saliency value detection method based on color contrast, then the texture and gradient features of the saliency region are extracted by using a complete local binary pattern algorithm and a direction gradient histogram algorithm, an effective series fusion strategy is carried out, image detail information is jointly represented, and finally the extracted texture feature vectors are classified by using a nearest neighbor classifier to obtain an identification rate, so that the purpose of automatically classifying the extracted texture by using a computer is realized.
Next, a high-texture image classification system based on visual saliency detection proposed according to an embodiment of the present invention is described with reference to the drawings.
Fig. 6 is a schematic structural diagram of a high-texture image classification system based on visual saliency detection according to an embodiment of the present invention.
As shown in fig. 6, the high-texture image classification system 10 based on visual saliency detection includes: a detection module 100, an extraction module 200 and a classification module 300.
The detection module 100 is configured to segment a saliency region according to an image pixel saliency value detection method of color contrast. The extraction module 200 is configured to extract texture features and gradient features of the salient region through a complete local binary pattern operator and a directional gradient histogram algorithm, and collectively represent image detail information through an effective series fusion strategy. The classification module 300 is configured to classify the extracted fusion vector by a nearest neighbor classifier to obtain a recognition rate. The system 10 of embodiments of the present invention enables automatic differentiation of various textures in an image (field of view) so that subsequent scene description or object recognition is possible.
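The division into three modules can be pictured as a small pipeline object. The sketch below is purely illustrative: the class name, the callable signatures and the choice of a boolean mask as the interface between detection and extraction are all assumptions, not part of the described system.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np


@dataclass
class TextureClassificationSystem:
    # Detection module: image -> boolean mask of the salient region.
    detect: Callable[[np.ndarray], np.ndarray]
    # Extraction module: (image, mask) -> fused CLBP + HOG feature vector.
    extract: Callable[[np.ndarray, np.ndarray], np.ndarray]
    # Classification module: fused feature vector -> class label.
    classify: Callable[[np.ndarray], int]

    def run(self, image: np.ndarray) -> int:
        mask = self.detect(image)
        features = self.extract(image, mask)
        return self.classify(features)
```

Keeping the modules as interchangeable callables lets each stage be tested or replaced on its own.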
Further, in an embodiment of the present invention, the texture features of the salient region are extracted by the complete local binary pattern algorithm. The complete local binary pattern describes the texture features of a pixel point by the gray-value magnitude relation feature CLBP_S, the gray-value difference magnitude feature CLBP_M, and the feature CLBP_C relating the pixel gray value to the global average gray value, so as to extract the gray texture information of a single pixel point to the greatest extent. The mathematical description of the complete local binary pattern features is as follows:
wherein g_i (i = 1, 2, …, N) denotes the gray value of a neighborhood pixel point centered on g_c, R is the neighborhood radius, m_N represents the difference value between the central pixel point and the neighborhood pixel point, c represents the average value of m_N in the local image, and c_l represents the global gray mean.
Further, in an embodiment of the present invention, the gradient feature of the significant region is extracted by a direction gradient histogram algorithm, and a formula for calculating the gradient of the pixel point by the direction gradient histogram algorithm is as follows:
Gx(x,y)=H(x+1,y)-H(x-1,y),
Gy(x,y)=H(x,y+1)-H(x,y-1),
wherein Gx(x,y), Gy(x,y) and H(x,y) respectively represent the horizontal gradient, the vertical gradient and the pixel value at the pixel point (x,y) in the input image;
the gradient magnitude G(x,y) and the gradient direction α(x,y) at the pixel point are respectively:
G(x,y)=sqrt(Gx(x,y)^2+Gy(x,y)^2),
α(x,y)=arctan(Gy(x,y)/Gx(x,y)).
It should be noted that the foregoing explanation of the embodiment of the high-texture image classification method based on visual saliency detection also applies to the system of this embodiment and is not repeated here.
According to the high-texture image classification system based on visual saliency detection provided by the embodiment of the invention, a saliency region is segmented by using an image pixel saliency value detection method based on color contrast, then the texture and gradient features of the saliency region are extracted by using a complete local binary pattern algorithm and a direction gradient histogram algorithm, an effective series fusion strategy is carried out, image detail information is jointly represented, and finally the extracted texture feature vectors are classified by using a nearest neighbor classifier to obtain an identification rate, so that the purpose of automatically classifying the extracted texture by using a computer is realized.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic uses of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, those skilled in the art may combine the various embodiments or examples described in this specification, and the features of different embodiments or examples, provided they do not contradict one another.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.