CN112036232B - Image table structure identification method, system, terminal and storage medium - Google Patents
- Publication number
- CN112036232B (application CN202010662891.2A)
- Authority
- CN
- China
- Prior art keywords
- line
- longitudinal
- transverse
- lines
- alignment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/412—Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/457—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by analysing connectivity, e.g. edge linking, connected component analysis or slices
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Probability & Statistics with Applications (AREA)
- Image Analysis (AREA)
Abstract
The application relates to an image table structure identification method, system, terminal and storage medium. The method comprises: performing frame line detection on a form image to be identified by using an LSD algorithm to obtain the transverse line and longitudinal line detection results of the table structure in the image; checking each straight line in the transverse and longitudinal detection results against a set transverse threshold and a set longitudinal threshold to find the straight lines that are collinear common segments, and merging two or more such straight lines belonging to the same frame line to obtain the complete transverse and longitudinal lines of the table structure; and merging the complete transverse and longitudinal lines and aligning them to obtain the table structure in the form image to be identified. Compared with the prior art, the method requires less image preprocessing, recognizes faster, and yields more accurate results.
Description
Technical Field
The application belongs to the technical field of image processing, and particularly relates to an image table structure identification method, an image table structure identification system, a terminal and a storage medium.
Background
As the most compact way to summarize textual data records and the most common format for data statistics and result analysis, the table is a basic tool in data analysis. At present, a great deal of form data circulates on the network, but many forms are provided as pictures, such as scanned files and PDF files. Automatically identifying such image-based tables and restoring the picture-type table content to digital data is the basis for fast data processing and analysis. Because of its structural characteristics, table data in images is more difficult to identify than ordinary image text data.
The table image recognition method adopted in the prior art comprises the following steps:
1. Table frame line detection based on the projection method. Its defects are: the image preprocessing workload is large, the image must be morphologically eroded, and the requirement on the erosion structuring element is high, which directly affects the final recognition result; in addition, because every pixel in the picture must be examined and the foreground pixels accumulated and summed to decide whether they belong to a table frame line, the algorithm performs poorly and the detection time becomes too long when the image resolution is low or the table is complex.
2. Table image recognition based on the Hough transform. Its defects are: the frame line detection time cost is too high, the exact coordinates of a detected straight line cannot be determined, and adjacent pixels or parallel lines close to the same straight line cause false detections and missed detections.
3. Table frame line detection based on the run-length method. It places high demands on picture resolution, performs poorly on low-resolution images, handles frame lines made up of different segments of the same line badly, and has low accuracy in engineering practice.
Disclosure of Invention
The application provides an image table structure identification method, system, terminal and storage medium, aiming to alleviate, at least to a certain extent, the technical problems of the prior art: large image preprocessing workload, long detection time, and poor frame line detection in low-resolution images.
In order to solve the above problems, the present application provides the following technical solutions:
an image table structure identification method comprises the following steps:
step a: performing frame line detection on a form image to be identified by using an LSD algorithm to respectively obtain transverse line detection results and longitudinal line detection results of a form structure in the form image to be identified;
Step b: detecting each straight line in the transverse line and longitudinal line detection results according to a set transverse threshold value and a set longitudinal threshold value respectively to obtain straight lines belonging to a collinear common section in the transverse line and longitudinal line detection results, and combining two or more straight lines belonging to the same frame line and the collinear common section to obtain complete transverse lines and longitudinal lines in the table structure;
step c: and merging the complete transverse lines and the complete longitudinal lines, and aligning the merged transverse lines and the merged longitudinal lines to obtain the table structure in the table image to be identified.
The technical scheme adopted by the embodiment of the application further comprises: in the step a, the performing frame line detection on the form image to be identified by using the LSD algorithm comprises:
calculating the level-line angle of each pixel point in the form image to be identified;
defining an error value of the to-be-identified table image, calculating an error between a level-line angle of each pixel point and a current region angle, carrying out region merging on the pixel points with the error smaller than the error value, and updating the merged region;
constructing an circumscribed matrix for each updated region, calculating an NFA value of each updated region, and judging a matrix with the NFA value meeting a set threshold value as an output straight line;
The technical scheme adopted by the embodiment of the application further comprises the following steps: in the step a, the detecting the frame line of the form image to be identified by using the LSD algorithm further includes:
and screening all the straight lines by using the set parameter threshold value, and removing useless frame lines in all the straight lines to obtain the transverse line and longitudinal line detection results of the table structure in the table image to be identified.
The technical scheme adopted by the embodiment of the application further comprises the following steps: in the step b, the detecting each straight line in the transverse line and longitudinal line detection results according to the set transverse threshold and longitudinal threshold respectively includes:
let the coordinates of two adjacent straight lines be (x0, y0, x1, y1) and (x0′, y0′, x1′, y1′), let a transverse threshold lineWidth and a longitudinal threshold lineHeight be given, and judge whether the straight lines need to be combined by a straight line judging rule based on double thresholds; the straight line judging rule based on the double thresholds is specifically:
two straight lines satisfying ((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight) are judged to be transversely collinear;
two straight lines satisfying (((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight)) ∧ ((x0′ − x1) ≤ lineWidth) are judged to be transversely collinear common segments;
two straight lines satisfying ((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth) are judged to be longitudinally collinear;
two straight lines satisfying (((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth)) ∧ ((y0′ − y1) ≤ lineHeight) are judged to be longitudinally collinear common segments.
The technical scheme adopted by the embodiment of the application further comprises the following steps: in the step b, the merging the two or more straight lines belonging to the same common collinear section of the frame line further comprises:
sorting the transverse line coordinate set in the transverse line detection result from small to large according to the longitudinal coordinates of the transverse lines to obtain a transverse line set, and sorting the longitudinal line coordinate set in the longitudinal line detection result from small to large according to the transverse coordinates of the longitudinal lines to obtain a longitudinal line set;
traversing each transverse line and each longitudinal line in the transverse line set and the longitudinal line set respectively, and detecting to obtain a transverse line colinear set and a longitudinal line colinear set by utilizing the straight line judging rule based on the double threshold values;
traversing each transverse line and each longitudinal line in the transverse line colinear collection and the longitudinal line colinear collection respectively, and detecting the transverse line and the longitudinal line of the colinear common section by utilizing the straight line judging rule based on the double threshold values;
combining and refining the transverse lines and the longitudinal lines of the collinear common section respectively;
and screening the transverse lines and the longitudinal lines after the merging and refining according to a set length threshold value to obtain a final transverse line merging and refining result set heng_final and a longitudinal line merging and refining result set zong_final.
The technical scheme adopted by the embodiment of the application further comprises the following steps: in the step c, the aligning the combined transverse line and the combined longitudinal line includes:
Introducing heng_left, heng_right and y_is_visual into the transverse line merging and refining result set heng_final, wherein the heng_left, heng_right and y_is_visual are respectively used for judging whether each transverse line meets the left coordinate alignment, right coordinate alignment and longitudinal coordinate alignment;
traversing each transverse line in heng_final, and performing left coordinate alignment, right coordinate alignment and longitudinal coordinate alignment operations on each transverse line by means of heng_left, heng_right and y_is_visual respectively, to obtain the aligned heng_final;
introducing zong_up, zong_below and x_is_visual into the longitudinal line merging and refining result set zong_final, wherein zong_up, zong_below and x_is_visual are respectively used for judging whether each longitudinal line meets upper coordinate alignment, lower coordinate alignment and abscissa alignment;
and traversing each longitudinal line in zong_final, and performing upper coordinate alignment, lower coordinate alignment and abscissa alignment operations on each longitudinal line by means of zong_up, zong_below and x_is_visual respectively, to obtain the coordinate-aligned zong_final.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the left coordinate alignment, right coordinate alignment and longitudinal coordinate alignment operation for each transverse line comprises the following steps:
respectively obtaining the left coordinate, right coordinate and vertical coordinate of the current transverse line; if heng_left, heng_right and y_is_visual of the current transverse line are 0, meaning that the transverse line has not undergone left alignment, right alignment and vertical alignment, taking the transverse line as a reference transverse line, traversing the remaining transverse lines, finding all transverse lines meeting preset conditions, respectively calculating the left coordinate mean, right coordinate mean and vertical coordinate mean of all transverse lines meeting the preset conditions, and using these mean values to respectively update the left coordinates, right coordinates and vertical coordinates of the reference transverse line and of all transverse lines meeting the preset conditions, to obtain heng_final with aligned transverse lines;
The preset conditions are as follows: the left coordinate difference value, the right coordinate difference value and the longitudinal coordinate difference value of the reference transverse line are all within a set alignment threshold value, and the left alignment, the right alignment and the longitudinal alignment are not performed.
The technical scheme adopted by the embodiment of the application further comprises the following steps: the performing the upper coordinate alignment, the lower coordinate alignment and the abscissa alignment on each longitudinal line comprises the following steps:
respectively obtaining the upper coordinate, lower coordinate and abscissa of the current longitudinal line; if zong_up, zong_below and x_is_visual are 0, meaning that the longitudinal line has not undergone upper alignment, lower alignment and transverse alignment, taking the longitudinal line as a reference longitudinal line, traversing the remaining longitudinal lines, finding all longitudinal lines meeting preset conditions, respectively calculating the upper coordinate mean, lower coordinate mean and abscissa mean of all longitudinal lines meeting the preset conditions, and using these mean values to respectively update the upper coordinates, lower coordinates and abscissas of the reference longitudinal line and of all longitudinal lines meeting the preset conditions, to obtain zong_final with aligned longitudinal lines;
the preset conditions are as follows: the upper coordinate difference value, the lower coordinate difference value and the horizontal coordinate difference value of the reference vertical line are within a set alignment threshold value, and the upper alignment, the lower alignment and the horizontal alignment are not performed.
The technical scheme adopted by the embodiment of the application further comprises: after obtaining heng_final with aligned transverse lines, the method further comprises the following steps:
traversing each transverse line in the heng_final after the transverse lines are aligned, taking the current transverse line as a reference transverse line, traversing each longitudinal line in the zong_final, finding out all longitudinal lines of which the transverse coordinates are within an alignment threshold value with the transverse coordinates of the reference transverse line, and assigning the transverse coordinates of the reference transverse line as the transverse coordinates of all longitudinal lines within the alignment threshold value; if no vertical line is found, the horizontal line is judged to be a pseudo frame line, and the pseudo frame line is removed.
The technical scheme adopted by the embodiment of the application further comprises: after obtaining zong_final with aligned longitudinal lines, the method further comprises the following steps:
traversing each longitudinal line in the zong_final after the longitudinal lines are aligned, taking the current longitudinal line as a reference longitudinal line, traversing each transverse line in the heng_final, finding out all transverse lines of which the longitudinal coordinates are within an alignment threshold value with the longitudinal coordinates of the reference longitudinal line, and assigning the longitudinal coordinates of the reference longitudinal line as the longitudinal coordinates of all transverse lines within the alignment threshold value; if a transverse line of which the ordinate is within the alignment threshold value with the ordinate of the reference longitudinal line is not found, the longitudinal line is judged to be a pseudo frame line, and the pseudo frame line is removed.
The embodiment of the application adopts another technical scheme that: an image table structure identification system, comprising:
the frame line detection module: used for performing frame line detection on a form image to be identified by using an LSD algorithm to obtain the transverse line detection result and the longitudinal line detection result of the table structure in the form image to be identified;
a threshold detection module: used for detecting each straight line in the transverse line and longitudinal line detection results according to a set transverse threshold and a set longitudinal threshold respectively, to obtain the straight lines belonging to a collinear common segment in the transverse and longitudinal line detection results, and combining two or more straight lines belonging to the same frame line and collinear common segment to obtain the complete transverse lines and longitudinal lines in the table structure;
a frame line combining module: used for combining the complete transverse lines and the complete longitudinal lines;
and a frame line alignment module: used for aligning the combined transverse lines and longitudinal lines to obtain the table structure in the form image to be identified.
The embodiment of the application adopts the following technical scheme: a terminal comprising a processor, a memory coupled to the processor, wherein,
the memory stores program instructions for implementing the image table structure identification method;
The processor is configured to execute the program instructions stored by the memory to control image table structure identification.
The embodiment of the application adopts the following technical scheme: a storage medium storing program instructions executable by a processor for performing the image table structure identification method.
Compared with the prior art, the beneficial effects produced by the embodiments of the present application are as follows. In the image table structure identification method, system, terminal and storage medium, the LSD algorithm is used to detect the frame lines of the table structure, a straight line judging rule based on double thresholds is adopted to merge and refine multiple line segments belonging to the same frame line into a complete frame line, and alignment and correction operations are carried out on the complete frame lines to obtain the final table frame lines. Compared with the prior art, the embodiments of the present application have at least the following advantages:
1. The algorithm parameters have strong universality and generalization; in actual experiments, for images with similar resolution and similar table structures, the parameters hardly need to be adjusted, so the algorithm can be packaged and used directly.
2. Compared with the traditional Hough-transform frame line detection method, the projection-based frame line detection method and the run-length-based line detection method, applying the LSD line detection algorithm, widely used on remote sensing images, to image table structure identification requires less image preprocessing, recognizes faster, and gives more accurate results.
Drawings
FIG. 1 is a flow chart of an image table structure identification method according to an embodiment of the present application;
FIG. 2 is a schematic diagram of the detection results of the horizontal and vertical lines of the table structure according to the embodiment of the present application;
FIGS. 3 (a) and 3 (b) are diagrams showing the merged refinement results of the collinear common segment straight lines in the embodiment of the present application;
fig. 4 (a) and fig. 4 (b) are schematic diagrams of rough table structures obtained after merging the transverse lines and the longitudinal lines in the embodiment of the present application;
FIG. 5 is a schematic diagram of a table structure with aligned transverse lines and longitudinal lines according to an embodiment of the present application;
FIG. 6 (a) is a schematic diagram of a non-strictly defined form image, and FIG. 6 (b) is a schematic diagram of a recognition result of recognizing the non-strictly defined form image by using the algorithm of the embodiment of the present application;
FIG. 7 is a schematic diagram of an image table structure recognition system according to an embodiment of the present application;
fig. 8 is a schematic diagram of a terminal structure according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a storage medium according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
Referring to fig. 1, a flowchart of an image table structure recognition method according to an embodiment of the present application is shown. The image table structure identification method of the application embodiment comprises the following steps:
step 100: acquiring a form image to be identified;
step 200: performing frame line (including transverse lines and longitudinal lines of the table) detection on the table image to be identified by using an LSD (Line Segment Detector) algorithm to obtain transverse line and longitudinal line detection results of the table structure;
in step 200, the LSD algorithm is an image straight-line processing algorithm. Because of its strong sensitivity to straight lines, it is widely applied to the detection and identification of geometric objects in remote sensing images. The LSD algorithm merges pixel points based on an error value set for each image; the merged regions of multiple pixel points are the detected frame lines, so detection of the image frame lines can be completed in linear time.
Specifically, in the embodiment of the present application, the table frame line detection algorithm based on the LSD algorithm includes the following steps:
step 201: calculating the level-line angle LLA and the gradient value of each pixel point in the form image I to be identified. Let i(x, y) be the gray value at (x, y) in the form image I to be identified, I_x(x, y) the gradient value of pixel point i(x, y) in the x direction, I_y(x, y) the gradient value of pixel point i(x, y) in the y direction, and ∇I(x, y) the total gradient value, calculated as follows:
∇I(x, y) = √( I_x(x, y)² + I_y(x, y)² )   (1)
step 202: pixel region growing; defining an error value τ of the table image to be identified, calculating the error between the level-line angle of each pixel point and the current region angle, merging into the region those pixel points whose error is smaller than τ, and updating the merged region.
Further, let S_x denote the accumulated cos(θ_region) of the current region and S_y the accumulated sin(θ_region) of the current region; when a pixel point i(x, y) is merged, the current region angle is updated as follows:
S_x = S_x + cos(LLA(i(x, y)))
S_y = S_y + sin(LLA(i(x, y)))   (2)
step 203: respectively constructing a circumscribed matrix for each updated region, calculating the NFA value of each circumscribed matrix, and judging whether the NFA value meets a set threshold ε; if so, the region is judged to be an output straight line, otherwise it is not.
Further, assume that the numbers of rows and columns of the to-be-identified table image I are M and N respectively, that n represents the number of pixels in the circumscribed matrix, and that k represents the number of aligned points in the circumscribed matrix (an aligned point is a pixel whose level-line angle is within the error value τ of the main direction angle of the matrix). The NFA value of the matrix rect is calculated as follows:
NFA(rect) = (N·M)^(5/2) · γ · Σ_{j=k..n} C(n, j) · p^j · (1 − p)^(n − j)   (3)
In formula (3), p is the probability that a pixel of the matrix is an aligned point, the parameter γ indicates the number of possible p values, and τ is the error value.
step 204: detecting all straight lines in the output frame line areas by using the LSD straight line detection algorithm, screening the straight line detection result by using a set parameter threshold, and removing the useless frame lines in the frame line detection result to obtain the transverse line and longitudinal line detection results of the table structure in the table image to be identified;
Specifically, it is assumed that (x0, y0, x1, y1) represents the coordinates of each frame line, where (x0, y0) represents the starting position of the straight line and (x1, y1) represents its termination position. Because the table image contains text information and frame lines that do not belong to the table structure, after all straight lines are detected by the LSD algorithm, the straight line detection result is further screened through the set parameter threshold and the useless frame lines are removed, yielding the transverse line detection result and the longitudinal line detection result of the table structure. Referring to fig. 2, a schematic diagram of the detection results of the transverse and longitudinal lines of the table structure according to an embodiment of the present application is shown.
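For illustration only, a minimal Python sketch of this detection and screening step is given below. It assumes an OpenCV build in which the LSD implementation (cv2.createLineSegmentDetector) is available; the angle tolerance and the way segments are split into near-horizontal and near-vertical sets are illustrative choices rather than details taken from this text (the minimum length of 70 echoes the transverse line length threshold mentioned later in step 406).

```python
import cv2
import numpy as np

def detect_frame_lines(gray, min_len=70, angle_tol_deg=5.0):
    """Illustrative sketch: LSD line detection plus screening of useless frame lines,
    splitting the survivors into transverse (heng) and longitudinal (zong) candidates."""
    lsd = cv2.createLineSegmentDetector()          # requires an OpenCV build that ships LSD
    segments = lsd.detect(gray)[0]                 # shape (K, 1, 4): x0, y0, x1, y1
    heng_lines, zong_lines = [], []
    if segments is None:
        return heng_lines, zong_lines
    for x0, y0, x1, y1 in segments.reshape(-1, 4):
        dx, dy = x1 - x0, y1 - y0
        if np.hypot(dx, dy) < min_len:             # drop short strokes (e.g. text)
            continue
        angle = abs(np.degrees(np.arctan2(dy, dx)))
        if angle < angle_tol_deg or angle > 180 - angle_tol_deg:
            # near-horizontal: store with x0 <= x1 (y values are almost equal)
            heng_lines.append((min(x0, x1), y0, max(x0, x1), y1))
        elif abs(angle - 90) < angle_tol_deg:
            # near-vertical: store with y0 <= y1
            zong_lines.append((x0, min(y0, y1), x1, max(y0, y1)))
    return heng_lines, zong_lines
```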
Step 300: performing binarization processing on the frame line result of the table image detected by the LSD based on a maximum inter-class variance method of the global threshold;
in step 300, since the background and the foreground of the table image differ greatly, in the embodiment of the present application the inter-class variance between the background and the foreground is calculated first, and the threshold at which this inter-class variance reaches its maximum value is used as the global threshold for image binarization.
Further, assume that the form image to be recognized has M gray levels. An initial gray value t0 divides the form image to be identified into two groups G1 and G2, where G1 contains all pixels with gray value less than t0 and G2 contains all pixels with gray value greater than or equal to t0. Let N represent the total number of pixels in the form image to be identified and N_i the number of pixels with gray value i; the probability of each gray value i appearing in the to-be-identified table image is:
p_i = N_i / N
The occurrence probabilities ω1 and ω2 and the mean gray values μ1 and μ2 of G1 and G2 are, respectively:
ω1 = Σ_{i=0..t0−1} p_i,   ω2 = Σ_{i=t0..M−1} p_i
μ1 = Σ_{i=0..t0−1} i·p_i / ω1,   μ2 = Σ_{i=t0..M−1} i·p_i / ω2
The inter-class variance σ(t0)² is:
σ(t0)² = ω1 · ω2 · (μ1 − μ2)²
The maximum inter-class variance is determined by argmax σ(t0)², t0 ∈ [0, M−1].
Based on the above, according to the embodiment of the application, the binarization processing is performed on the table image frame line result detected by the LSD through the maximum inter-class variance method based on the global threshold, and the threshold selected by the maximum inter-class variance method is relatively stable, so that the segmentation efficiency is relatively high, and a better binarization effect can be obtained.
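For illustration only, a compact Python sketch of this global-threshold search is given below, written to mirror the formulas above; an equivalent result can also be obtained with OpenCV's built-in Otsu thresholding.

```python
import numpy as np

def otsu_threshold(gray):
    """Illustrative sketch: maximum inter-class variance (Otsu) threshold search on an
    8-bit grayscale image; returns the global threshold t0 and the binarized image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()                          # p_i: probability of gray value i
    best_t, best_var = 0, -1.0
    for t0 in range(1, 256):
        w1, w2 = p[:t0].sum(), p[t0:].sum()        # occurrence probabilities of G1, G2
        if w1 == 0 or w2 == 0:
            continue
        mu1 = (np.arange(t0) * p[:t0]).sum() / w1  # mean gray value of G1
        mu2 = (np.arange(t0, 256) * p[t0:]).sum() / w2
        var = w1 * w2 * (mu1 - mu2) ** 2           # inter-class variance sigma(t0)^2
        if var > best_var:
            best_var, best_t = var, t0
    binary = (gray >= best_t).astype(np.uint8) * 255
    return best_t, binary
```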
Step 400: combining and refining two or more straight lines belonging to the same frame line in the transverse line and longitudinal line detection results by adopting a straight line judgment rule based on double threshold values to obtain complete transverse lines and longitudinal lines in a table structure;
in step 400, due to the printing process, human interference or other external factors, the table frame lines in the scanned form image to be identified may be thickened, broken or even partly missing, so that one frame line is detected as several different line segments. Therefore, each line segment needs to be examined to distinguish collinear co-segment lines from collinear non-co-segment lines, and the multiple line segments belonging to the same frame line are merged and refined into one complete frame line.
Further, the embodiment of the application adopts a straight line judging rule based on double thresholds to judge whether straight lines need to be combined. Let the coordinates of two adjacent straight lines be (x0, y0, x1, y1) and (x0′, y0′, x1′, y1′), and let a transverse threshold lineWidth and a longitudinal threshold lineHeight be given (preferably, in the embodiment of the present application the lineWidth value is set to 15 and the lineHeight value to 20, which can be adjusted according to the practical application). The straight line judging rule based on the double thresholds is defined as follows:
Definition 1: if ((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight) is satisfied, the two line segments are transversely collinear.
Definition 2: if (((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight)) ∧ ((x0′ − x1) ≤ lineWidth) is satisfied, the two line segments are transversely collinear common segments.
Definition 3: if ((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth) is satisfied, the two line segments are longitudinally collinear.
Definition 4: if (((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth)) ∧ ((y0′ − y1) ≤ lineHeight) is satisfied, the two line segments are longitudinally collinear common segments.
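For illustration only, the four definitions can be transcribed directly as Python predicates. The sketch below assumes each segment is stored as an (x0, y0, x1, y1) tuple and that segments are already sorted as in steps 401 and 402 below, so the coordinate differences are non-negative.

```python
# Transverse threshold lineWidth and longitudinal threshold lineHeight
# (the values 15 and 20 are those given in the text above).
LINE_WIDTH = 15
LINE_HEIGHT = 20

def is_heng_collinear(a, b, line_height=LINE_HEIGHT):
    # Definition 1: a = (x0, y0, x1, y1), b = (x0', y0', x1', y1') are transversely collinear
    return (b[1] - a[1]) <= line_height or (b[3] - a[3]) <= line_height

def is_heng_cosegment(a, b, line_width=LINE_WIDTH, line_height=LINE_HEIGHT):
    # Definition 2: transversely collinear common segment (b starts near where a ends)
    return is_heng_collinear(a, b, line_height) and (b[0] - a[2]) <= line_width

def is_zong_collinear(a, b, line_width=LINE_WIDTH):
    # Definition 3: the two segments are longitudinally collinear
    return (b[0] - a[0]) <= line_width or (b[2] - a[2]) <= line_width

def is_zong_cosegment(a, b, line_width=LINE_WIDTH, line_height=LINE_HEIGHT):
    # Definition 4: longitudinally collinear common segment (b starts near where a ends)
    return is_zong_collinear(a, b, line_width) and (b[1] - a[3]) <= line_height
```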
Definitions 1 and 2 detect, in the transverse line detection result, the collinear co-segment lines and the collinear non-co-segment lines; definitions 3 and 4 do the same for the longitudinal line detection result. When two or more straight lines are collinear common segments, i.e. they satisfy the transverse or longitudinal threshold, the frame line refinement algorithm can merge and refine them.
Further, taking the case of merging and refining straight lines of transverse collinear common segments as an example, the merging and refining algorithm specifically comprises the following steps:
step 401: sequencing a transverse line coordinate set in a transverse line detection result from small to large according to the longitudinal coordinate of the transverse line coordinate set to obtain a transverse line set heng_lines;
step 402: traversing each transverse line in heng_lines, judging whether two adjacent transverse lines are collinear by using definition 1, adding the transverse lines with the collinear judgment result into a transverse line collineation set linewidth_lines, and sequencing according to the initial transverse coordinates of each transverse line from small to large;
step 403: traversing each transverse line in the transverse line collineation set linewidth_lines, judging whether two adjacent transverse lines are collineation common segments or not by using definition 2, and if so, executing step 404; otherwise, step 405 is performed;
step 404: combining and refining the transverse lines of the collinear common sections according to formulas (13) - (15), and adding the combined and refined results into an output result set lines_location;
step 405: the transverse lines which are not collinear and common in section are directly added into an output result set lines_location;
step 406: screening each transverse line in the output result set lines_location according to the set transverse line length threshold value to obtain a final transverse line merging and refining result set heng_final;
The cross line length threshold set in the embodiment of the present application is 70, and it can be understood that the threshold may be determined according to a table structure in the table image.
The vertical line merging refinement algorithm is similar to the horizontal line merging refinement algorithm, and a final vertical line merging refinement result set is zong_final, which is not described herein.
Referring to fig. 3(a) and 3(b), the results of merging and refining collinear co-segment straight lines are shown, where fig. 3(a) shows the transverse line merging and refining result and fig. 3(b) shows the longitudinal line merging and refining result. In fig. 3(a), (new_x0, new_x1, new_y) represents the new coordinates of the merged transverse line; in fig. 3(b), (new_y0, new_y1, new_x) represents the new coordinates of the merged longitudinal line. The coordinates are calculated as follows:
new_x0 = min(x0, x0′)
new_x1 = max(x1, x1′)
new_y0 = min(y0, y0′)
new_y1 = max(y1, y1′)
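For illustration only, a minimal Python sketch of the transverse line merge-and-refine procedure (steps 401 to 406) is given below, reusing the predicates sketched after the double-threshold definitions; the grouping details are an illustrative reading of the steps rather than the exact patented implementation.

```python
def merge_heng_lines(heng_detections, min_length=70):
    """Illustrative sketch of steps 401-406 for transverse lines; the longitudinal
    procedure is symmetric. Each segment is an (x0, y0, x1, y1) tuple."""
    # Step 401: sort transverse lines by their longitudinal coordinate
    heng_lines = sorted(heng_detections, key=lambda s: s[1])

    # Step 402: group collinear transverse lines, each group ordered by starting x
    groups, current = [], []
    for seg in heng_lines:
        if current and not is_heng_collinear(current[-1], seg):
            groups.append(sorted(current, key=lambda s: s[0]))
            current = []
        current.append(seg)
    if current:
        groups.append(sorted(current, key=lambda s: s[0]))

    lines_location = []
    for group in groups:
        merged = group[0]
        for seg in group[1:]:
            if is_heng_cosegment(merged, seg):
                # Steps 403-404: merge co-segment lines (new_x0 = min, new_x1 = max)
                merged = (min(merged[0], seg[0]), merged[1],
                          max(merged[2], seg[2]), merged[3])
            else:
                # Step 405: non-co-segment lines are added to the output as they are
                lines_location.append(merged)
                merged = seg
        lines_location.append(merged)

    # Step 406: drop merged lines shorter than the length threshold (70 in the text)
    return [s for s in lines_location if (s[2] - s[0]) >= min_length]
```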
Based on the above, the embodiment of the application detects collinear co-segment straight lines and collinear non-co-segment straight lines with the double-threshold straight line judging rule, and merges the collinear co-segment lines belonging to the same frame line with the frame line refinement algorithm, so that complex tables with merged cells and many cells can be processed.
Step 500: combining the horizontal line combining and refining result and the vertical line combining and refining result to obtain a rough table structure;
In step 500, as shown in fig. 4(a) and fig. 4(b): in fig. 4(a), a pseudo frame line (i.e., a longitudinal line not bounded by any two transverse lines, or a transverse line not bounded by any two longitudinal lines) can be seen in the marked area at the lower right; in fig. 4(b), missing or over-long line segments can be seen at the intersections of transverse and longitudinal lines in several marked areas. The table structure therefore requires further refinement.
Step 600: respectively aligning and correcting transverse lines and longitudinal lines in the rough table structure to obtain a final table structure in the table image to be identified;
in step 600, the length of any one frame line in the table structure is determined by two other frame lines. Taking a transverse line as an example, any transverse line is delimited by two longitudinal lines: the transverse coordinates of those two longitudinal lines directly determine the transverse coordinates and the length of every transverse line that starts and ends at them, and all transverse lines sharing the same start and end longitudinal lines have the same transverse coordinates; the same holds for longitudinal lines. If a frame line exists alone, it must be a falsely detected pseudo frame line. Therefore, the embodiment of the application uses these properties to correct the missing or over-long line segments at the intersections of transverse and longitudinal lines in the rough table structure and to eliminate pseudo frame lines, obtaining the complete table structure.
Further, taking alignment and correction of the transverse line as an example, the algorithm specifically includes:
step 601: three identifiers are respectively introduced into each frame line in the transverse line merging and refining result set heng_final and the vertical line merging and refining result set zong_final, and the three identifiers introduced in the transverse line merging and refining result set heng_final are as follows: the heng_left, heng_right and y_is_visual are respectively used for judging whether each transverse line meets the left coordinate alignment, the right coordinate alignment and the longitudinal coordinate alignment; the three identifiers introduced in the vertical line merging refinement result set zong_final are: zong_up, zong_below and x_is_visual are used for judging whether each vertical line meets the upper coordinate alignment, the lower coordinate alignment and the horizontal coordinate alignment or not, and the initial assignment of the six identifiers is 0.
Step 602: traversing each transverse line in heng_final, and respectively performing left coordinate alignment, right coordinate alignment and vertical coordinate pair Ji Cao on each transverse line by utilizing heng_left, heng_right and y_is_visual to obtain heng_final after coordinate alignment;
further, taking alignment of left coordinates of a transverse line as an example, the specific algorithm is as follows: and obtaining the left coordinates of the current transverse line, if the heng_left is 0, namely, representing that the transverse line is not subjected to left alignment processing, taking the transverse line as a reference transverse line, traversing the rest transverse lines, finding out all transverse lines meeting the preset condition, calculating the left coordinate mean value of all transverse lines meeting the preset condition, updating the left coordinates of the reference transverse line and all transverse lines meeting the preset condition by using the left coordinate mean value, and assigning heng_left to be 1. The preset conditions for finding all the transverse lines are as follows: the difference value between the left coordinate of the horizontal line and the reference horizontal line is within a set alignment threshold (preferably, the alignment threshold is set to 15 in the embodiment of the present application, which may be specifically set according to practical application), and the left alignment process is not performed.
It will be appreciated that the right and vertical alignment of the horizontal lines are the same as the left alignment algorithm and will not be described in detail herein.
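For illustration only, a minimal Python sketch of the left-coordinate alignment of step 602 is given below; the dictionary layout used for a transverse line is an illustrative assumption rather than the patent's data structure, and the other five alignment operations follow the same pattern. The alignment threshold of 15 is the value mentioned above.

```python
def align_left_coordinates(heng_final, align_threshold=15):
    """Illustrative sketch of left-coordinate alignment for transverse lines.
    Each line is assumed to be a dict like {"x0", "x1", "y", "heng_left", ...}."""
    for i, ref in enumerate(heng_final):
        if ref["heng_left"]:                       # already left-aligned
            continue
        # the reference line plus all unaligned lines whose left coordinate lies
        # within the alignment threshold of the reference line's left coordinate
        cluster = [ref] + [h for h in heng_final[i + 1:]
                           if not h["heng_left"]
                           and abs(h["x0"] - ref["x0"]) <= align_threshold]
        mean_x0 = sum(h["x0"] for h in cluster) / len(cluster)
        for h in cluster:                          # snap the whole cluster to the mean
            h["x0"] = mean_x0
            h["heng_left"] = 1
    return heng_final
```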
Step 603: traversing each transverse line in the heng_final after coordinate alignment, taking the current transverse line as a reference transverse line, traversing each longitudinal line in the zong_final, finding out all longitudinal lines of which the transverse coordinates are within an alignment threshold value with the transverse coordinates of the reference transverse line, and assigning the transverse coordinates of the reference transverse line as the transverse coordinates of all longitudinal lines within the alignment threshold value; if no vertical line is found, the horizontal line is judged to be a pseudo frame line, and the pseudo frame line is removed.
Step 604: traversing each longitudinal line in the zong_final, and respectively performing upper coordinate alignment, lower coordinate alignment and transverse coordinate pair Ji Cao on each longitudinal line by utilizing zong_up, zong_below and x_is_visual to obtain a zong_final after coordinate alignment;
the vertical line alignment algorithm is the same as the horizontal line alignment algorithm, and will not be described here again.
Based on the above, the final table structure obtained after the alignment and correction operations on the horizontal lines and the vertical lines is shown in fig. 5.
Table 1 shows the recognition results and recognition times for table images of different resolutions and complexities using the embodiments of the present application. The more cells a table has, the more complex it is and the higher the probability of a recognition error. The frame line recognition rate is (N/M) × 100%, where N is the number of recognized frame lines and M is the actual number of frame lines.
Table 1 different types of table image recognition results
As can be seen from the recognition results in Table 1, in high-resolution table images the recognition rate of both simple tables (few cells) and complex tables (many cells) reaches one hundred percent, and the time for processing one image is at the millisecond level. In low-resolution table images, the recognition rate of the simple table is still one hundred percent; the complex table is not completely recognized, but the recognition rate still reaches 78.3%, and the recognition time of both is very short. Therefore, the table frame line recognition is fast and highly accurate.
The embodiment of the application can also correct the table image within a certain inclination angle range. As shown in table 2, the processing results and the recognition time of the inclined table image according to the embodiment of the present application are:
TABLE 2 different Tilt Angle Table image detection results
As can be seen from Table 2, as the image tilt angle increases, the difficulty of table identification is also greater, and the maximum tolerable tilt angle in the embodiment of the present application is ±1.7 degrees, and the table structure frame line can be accurately identified within the tilt angle range, and the identification speed is fast.
In order to verify the feasibility and effectiveness of the embodiment of the application, the table image recognition algorithm of the embodiment was applied, for testing, to the community-ballot portion of the Sichuan Province grassroots "two delegation" election intelligent service management platform key technology research and industrialization demonstration project. In the actual community-ballot project, a higher image resolution leads to longer processing time and raises the cost of the vote-counting machine, so high-resolution images such as 300 dpi are not used. However, too low a ballot image resolution (for example 100 dpi) increases the recognition error rate, so in the actual project the ballot image resolution is generally between 100 dpi and 300 dpi, which ensures recognition accuracy while reducing equipment cost. Test results show that the table image recognition algorithm of the embodiment of the application adapts well to this project and has the following advantages:
1. A highly accurate recognition result can be obtained at the lower ballot image resolutions, and the recognition speed is high. Since grassroots community ballots are generally simple tables with few cells, the recognition time for such a table with the embodiment of the application is less than 0.2 seconds.
2. Ballot inclination within a certain range can be tolerated, reducing manual processing. When a ballot passes through the counting machine, the scanned image may be skewed; in the project an image inclination angle limit can be set, and skewed images exceeding that angle are handled manually. With the embodiment of the application the tolerable inclination angle is enlarged, so the algorithm can automatically handle and correct skewed images within that angle and manual work is reduced.
3. The method has strong universality and can quickly and accurately identify the table structure even in a non-strictly defined table image. In the community vote mode, there may be a non-strictly defined form image, where the non-strictly defined form image is shown in fig. 6 (a), and for such an image, accurate recognition may also be performed by the embodiment of the present application, and the recognition result is shown in fig. 6 (b).
Fig. 7 is a schematic structural diagram of an image table structure recognition system according to an embodiment of the present application. The image table structure identification system of the embodiment of the application comprises:
The frame line detection module: used for performing frame line detection on a form image to be identified by using an LSD algorithm to obtain the transverse line detection result and the longitudinal line detection result of the table structure in the form image to be identified;
a threshold detection module: used for detecting each straight line in the transverse line and longitudinal line detection results according to a set transverse threshold and a set longitudinal threshold respectively, to obtain the straight lines belonging to a collinear common segment in the transverse and longitudinal line detection results, and combining two or more straight lines belonging to the same frame line and collinear common segment to obtain the complete transverse lines and longitudinal lines in the table structure;
a frame line combining module: used for combining the complete transverse lines and the complete longitudinal lines;
and a frame line alignment module: used for aligning the combined transverse lines and longitudinal lines to obtain the table structure in the form image to be identified.
Fig. 8 is a schematic diagram of a terminal structure according to an embodiment of the present application. The terminal 50 includes a processor 51, a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the image table structure identification method described above.
The processor 51 is operative to execute program instructions stored in the memory 52 to control the image table structure identification.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capabilities. The processor 51 may also be a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
Fig. 9 is a schematic structural diagram of a storage medium according to an embodiment of the present application. The storage medium of the embodiment of the present application stores a program file 61 capable of implementing all of the methods described above. The program file 61 may be stored in the storage medium in the form of a software product and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, or a terminal device such as a computer, server, mobile phone or tablet.
According to the image table structure identification method, system, terminal and storage medium of the embodiments of the present application, the LSD algorithm is used to detect the frame lines of the table structure, a straight line judging rule based on double thresholds is adopted to merge and refine multiple line segments belonging to the same frame line into a complete frame line, and alignment and correction operations are carried out on the complete frame lines to obtain the final table frame lines. Compared with the prior art, the embodiments of the present application have at least the following advantages:
1. The algorithm parameters have strong universality and generalization; in actual experiments, for images with similar resolution and similar table structures, the parameters hardly need to be adjusted, so the algorithm can be packaged and used directly.
2. Compared with the traditional Hough-transform frame line detection method, the projection-based frame line detection method and the run-length-based line detection method, applying the LSD line detection algorithm, widely used on remote sensing images, to image table structure identification requires less image preprocessing, recognizes faster, and gives more accurate results.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (13)
1. An image table structure identification method is characterized by comprising the following steps:
step a: performing frame line detection on a form image to be identified by using an LSD algorithm to respectively obtain transverse line detection results and longitudinal line detection results of a form structure in the form image to be identified;
step b: detecting each straight line in the transverse line and longitudinal line detection results according to a set transverse threshold value and a set longitudinal threshold value respectively to obtain straight lines belonging to a collinear common section in the transverse line and longitudinal line detection results, and combining two or more straight lines belonging to the same frame line and the collinear common section to obtain complete transverse lines and longitudinal lines in the table structure;
step c: and merging the complete transverse lines and the complete longitudinal lines, and aligning the merged transverse lines and the merged longitudinal lines to obtain the table structure in the table image to be identified.
2. The method for recognizing an image table structure according to claim 1, wherein in the step a, the performing frame line detection on the table image to be recognized by using the LSD algorithm comprises:
calculating the level-line angle of each pixel point in the form image to be identified;
defining an error value of the to-be-identified table image, calculating an error between a level-line angle of each pixel point and a current region angle, carrying out region merging on the pixel points with the error smaller than the error value, and updating the merged region;
Constructing an circumscribed matrix for each updated region, calculating the NFA value of each updated region, and judging the matrix with the NFA value meeting a set threshold value as an output straight line.
3. The method for recognizing an image table structure according to claim 2, wherein in the step a, the performing frame line detection on the table image to be recognized using the LSD algorithm further comprises:
and screening all the straight lines by using the set parameter threshold value, and removing useless frame lines in all the straight lines to obtain the transverse line and longitudinal line detection results of the table structure in the table image to be identified.
4. The image table structure recognition method according to claim 1, wherein in the step b, the detecting each straight line in the detection results of the horizontal line and the vertical line according to the set horizontal threshold value and the set vertical threshold value, respectively, includes:
let the coordinates of two adjacent straight lines be (x0, y0, x1, y1) and (x0′, y0′, x1′, y1′); given a transverse threshold lineWidth and a longitudinal threshold lineHeight, judging whether the straight lines need to be combined by adopting a straight line judging rule based on double thresholds; the straight line judging rule based on the double thresholds is specifically:
judging two straight lines satisfying ((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight) to be transversely collinear;
judging two straight lines satisfying (((y0′ − y0) ≤ lineHeight) ∨ ((y1′ − y1) ≤ lineHeight)) ∧ ((x0′ − x1) ≤ lineWidth) to be transversely collinear common segments;
judging two straight lines satisfying ((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth) to be longitudinally collinear;
judging two straight lines satisfying (((x0′ − x0) ≤ lineWidth) ∨ ((x1′ − x1) ≤ lineWidth)) ∧ ((y0′ − y1) ≤ lineHeight) to be longitudinally collinear common segments.
5. The method according to claim 4, wherein in the step b, the merging two or more straight lines belonging to a common collinear section of a same frame line further comprises:
sorting the transverse line coordinate set in the transverse line detection result from small to large according to the longitudinal coordinates of the transverse lines to obtain a transverse line set, and sorting the longitudinal line coordinate set in the longitudinal line detection result from small to large according to the transverse coordinates of the longitudinal lines to obtain a longitudinal line set;
traversing each transverse line and each longitudinal line in the transverse line set and the longitudinal line set respectively, and detecting a transverse line collinear set and a longitudinal line collinear set by using the straight line judging rule based on the double thresholds;
traversing each transverse line and each longitudinal line in the transverse line collinear set and the longitudinal line collinear set respectively, and detecting the transverse lines and the longitudinal lines of collinear common sections by using the straight line judging rule based on the double thresholds;
combining and refining the transverse lines and the longitudinal lines of the collinear common sections respectively;
and screening the transverse lines and the longitudinal lines after the merging and refining according to a set length threshold value to obtain a final transverse line merging and refining result set heng_final and a longitudinal line merging and refining result set zong_final.
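A simplified single-pass sketch of the merging and screening above for the transverse direction, assuming the judging functions from the previous sketch; the two-stage collinear-set / common-section traversal of the claim is collapsed into one sorted sweep, and min_len stands in for the unspecified length threshold.

```python
def merge_transverse(lines, line_width, line_height, min_len=20.0):
    """Sort transverse lines, merge collinear common-section runs, drop short results."""
    lines = sorted(lines, key=lambda l: (l[1], l[0]))   # by ordinate, then left abscissa
    merged = []
    for line in lines:
        if merged and transversely_cosection(merged[-1], line, line_width, line_height):
            px0, py0, px1, py1 = merged[-1]
            merged[-1] = (min(px0, line[0]), py0, max(px1, line[2]), py1)  # extend along x
        else:
            merged.append(tuple(line))
    # length screening: keep only merged lines longer than min_len (the heng_final set)
    return [l for l in merged if l[2] - l[0] >= min_len]
```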
6. The method according to claim 5, wherein in the step c, the aligning the combined horizontal line and vertical line includes:
introducing heng_left, heng_right and y_is_visual into the transverse line merging and refining result set heng_final, wherein heng_left, heng_right and y_is_visual are respectively used for indicating whether each transverse line has been subjected to left coordinate alignment, right coordinate alignment and longitudinal coordinate alignment;
traversing each transverse line in heng_final, and respectively performing left coordinate alignment, right coordinate alignment and longitudinal coordinate alignment operation on each transverse line by utilizing heng_left, heng_right and y_is_visual to obtain aligned heng_final;
introducing zong_up, zong_below and x_is_visual into the longitudinal line merging and refining result set zong_final, wherein zong_up, zong_below and x_is_visual are respectively used for indicating whether each longitudinal line has been subjected to upper coordinate alignment, lower coordinate alignment and transverse coordinate alignment;
and traversing each longitudinal line in zong_final, and respectively performing upper coordinate alignment, lower coordinate alignment and transverse coordinate alignment on each longitudinal line by utilizing zong_up, zong_below and x_is_visual to obtain zong_final after coordinate alignment.
7. The method of claim 6, wherein the performing left coordinate alignment, right coordinate alignment and longitudinal coordinate alignment operations on each transverse line comprises:
respectively obtaining the left coordinate, the right coordinate and the longitudinal coordinate of the current transverse line; if the heng_left, heng_right and y_is_visual of the current transverse line are 0, meaning that the transverse line has not been subjected to left alignment, right alignment and longitudinal alignment, taking the transverse line as a reference transverse line and traversing the remaining transverse lines to find all transverse lines meeting preset conditions; calculating the left coordinate mean value, the right coordinate mean value and the longitudinal coordinate mean value of all transverse lines meeting the preset conditions; and updating the left coordinates, the right coordinates and the longitudinal coordinates of the reference transverse line and of all transverse lines meeting the preset conditions with the respective mean values, to obtain heng_final after transverse line alignment;
the preset conditions are as follows: the left coordinate difference value, the right coordinate difference value and the longitudinal coordinate difference value with respect to the reference transverse line are all within a set alignment threshold value, and the transverse line has not yet undergone left alignment, right alignment or longitudinal alignment.
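A sketch of the transverse line alignment described above, assuming heng_final holds mutable lists [x_left, y, x_right, y] and collapsing the three flags heng_left, heng_right and y_is_visual into a single "aligned" marker; align_th stands for the unspecified alignment threshold.

```python
def align_transverse(heng_final, align_th):
    """Group transverse lines whose left, right and y coordinates all lie within
    align_th of a reference line, then snap every group member to the group means."""
    aligned = [False] * len(heng_final)
    for i, ref in enumerate(heng_final):
        if aligned[i]:
            continue
        group = [i]
        for j in range(i + 1, len(heng_final)):
            if aligned[j]:
                continue
            cand = heng_final[j]
            if (abs(cand[0] - ref[0]) <= align_th and
                    abs(cand[2] - ref[2]) <= align_th and
                    abs(cand[1] - ref[1]) <= align_th):
                group.append(j)
        left = sum(heng_final[k][0] for k in group) / len(group)
        right = sum(heng_final[k][2] for k in group) / len(group)
        y = sum(heng_final[k][1] for k in group) / len(group)
        for k in group:
            heng_final[k][:] = [left, y, right, y]      # update with the mean values
            aligned[k] = True
    return heng_final
```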
8. The method of claim 6, wherein the performing upper coordinate alignment, lower coordinate alignment and transverse coordinate alignment operations on each longitudinal line comprises:
respectively obtaining the upper coordinate, the lower coordinate and the transverse coordinate of the current longitudinal line; if the zong_up, zong_below and x_is_visual of the current longitudinal line are 0, meaning that the longitudinal line has not been subjected to upper alignment, lower alignment and transverse alignment, taking the longitudinal line as a reference longitudinal line and traversing the remaining longitudinal lines to find all longitudinal lines meeting preset conditions; calculating the upper coordinate mean value, the lower coordinate mean value and the transverse coordinate mean value of all longitudinal lines meeting the preset conditions; and updating the upper coordinates, the lower coordinates and the transverse coordinates of the reference longitudinal line and of all longitudinal lines meeting the preset conditions with the respective mean values, to obtain zong_final after longitudinal line alignment;
the preset conditions are as follows: the upper coordinate difference value, the lower coordinate difference value and the transverse coordinate difference value with respect to the reference longitudinal line are all within a set alignment threshold value, and the longitudinal line has not yet undergone upper alignment, lower alignment or transverse alignment.
9. The method for identifying an image table structure according to claim 7, wherein the obtaining heng_final after the alignment of the transverse lines further comprises:
traversing each transverse line in heng_final after transverse line alignment, taking the current transverse line as a reference transverse line, and traversing each longitudinal line in zong_final to find all longitudinal lines whose transverse coordinates are within an alignment threshold value of the transverse coordinates of the reference transverse line; assigning the transverse coordinates of the reference transverse line as the transverse coordinates of the longitudinal lines found within the alignment threshold value; and if no such longitudinal line is found, judging the transverse line to be a pseudo frame line and removing it.
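A sketch of the pseudo frame line removal above, assuming transverse lines [x_left, y, x_right, y] and longitudinal lines [x, y_top, x, y_bottom]; snapping each endpoint of the transverse line to the abscissa of the first nearby longitudinal line is this sketch's reading of "assigning the transverse coordinates", and align_th is again an assumed parameter.

```python
def remove_pseudo_transverse(heng_final, zong_final, align_th):
    """Snap transverse line endpoints to nearby longitudinal lines; drop transverse
    lines with no longitudinal line near either endpoint (pseudo frame lines)."""
    kept = []
    for h in heng_final:
        near_left = [v for v in zong_final if abs(v[0] - h[0]) <= align_th]
        near_right = [v for v in zong_final if abs(v[0] - h[2]) <= align_th]
        if not near_left and not near_right:
            continue                        # no supporting longitudinal line: pseudo line
        if near_left:
            h[0] = near_left[0][0]          # snap left endpoint
        if near_right:
            h[2] = near_right[0][0]         # snap right endpoint
        kept.append(h)
    return kept
```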
10. The method for identifying an image table structure according to claim 8, wherein the obtaining the zong_final after the alignment of the vertical lines further comprises:
traversing each longitudinal line in zong_final after longitudinal line alignment, taking the current longitudinal line as a reference longitudinal line, and traversing each transverse line in heng_final to find all transverse lines whose longitudinal coordinates are within an alignment threshold value of the longitudinal coordinates of the reference longitudinal line; assigning the longitudinal coordinates of the reference longitudinal line as the longitudinal coordinates of the transverse lines found within the alignment threshold value; and if no such transverse line is found, judging the longitudinal line to be a pseudo frame line and removing it.
11. An image table structure identification system, comprising:
a frame line detection module: used for performing frame line detection on the form image to be identified by using an LSD algorithm to obtain the transverse line detection results and the longitudinal line detection results of the table structure in the form image to be identified;
a threshold detection module: used for detecting each straight line in the transverse line and longitudinal line detection results according to a set transverse threshold value and a set longitudinal threshold value respectively to obtain the straight lines belonging to a collinear common section in the transverse line and longitudinal line detection results, and combining two or more straight lines belonging to the same frame line collinear common section to obtain complete transverse lines and complete longitudinal lines in the table structure;
a frame line merging module: used for merging the complete transverse lines and the complete longitudinal lines;
and a frame line alignment module: used for aligning the merged transverse lines and the merged longitudinal lines to obtain the table structure in the form image to be identified.
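An illustrative wiring of the four modules above, reusing the sketches from the earlier claims; the class name is hypothetical, the longitudinal merge is obtained by feeding axis-swapped lines to the transverse merger, and only the transverse alignment and pseudo-line steps sketched earlier are shown.

```python
class TableStructureRecognizer:
    """Illustrative composition of detection, threshold-merging and alignment modules."""

    def __init__(self, line_width, line_height, align_th):
        self.line_width = line_width      # transverse threshold (lineWidth)
        self.line_height = line_height    # longitudinal threshold (lineHeight)
        self.align_th = align_th          # alignment threshold

    def recognize(self, image_path):
        # frame line detection module
        horizontal, vertical = detect_table_lines(image_path)
        # threshold detection module + frame line merging module (transverse)
        heng_final = [list(l) for l in
                      merge_transverse(horizontal, self.line_width, self.line_height)]
        # longitudinal merge: reuse the transverse merger with the axes swapped
        swapped = [(y0, x0, y1, x1) for (x0, y0, x1, y1) in vertical]
        zong_final = [[x0, y0, x1, y1] for (y0, x0, y1, x1) in
                      merge_transverse(swapped, self.line_height, self.line_width)]
        # frame line alignment module (transverse part only, as sketched above)
        heng_final = align_transverse(heng_final, self.align_th)
        heng_final = remove_pseudo_transverse(heng_final, zong_final, self.align_th)
        return heng_final, zong_final
```

A symmetric longitudinal alignment and pseudo-line removal (claims 8 and 10) would follow the same pattern with the roles of the coordinate axes exchanged.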
12. A terminal, comprising a processor and a memory coupled to the processor, wherein,
the memory stores program instructions for implementing the image table structure identification method of any one of claims 1 to 10;
the processor is configured to execute the program instructions stored by the memory to control image table structure identification.
13. A storage medium storing program instructions executable by a processor for performing the image table structure identification method of any one of claims 1 to 10.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010662891.2A CN112036232B (en) | 2020-07-10 | 2020-07-10 | Image table structure identification method, system, terminal and storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112036232A CN112036232A (en) | 2020-12-04 |
| CN112036232B true CN112036232B (en) | 2023-07-18 |
Family
ID=73579041
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010662891.2A Active CN112036232B (en) | 2020-07-10 | 2020-07-10 | Image table structure identification method, system, terminal and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN112036232B (en) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112700407B (en) * | 2020-12-14 | 2022-08-19 | 贝壳技术有限公司 | Method and device for determining image definition and storage medium |
| CN113139399B (en) * | 2021-05-13 | 2024-04-12 | 阳光电源股份有限公司 | Image wire frame identification method and server |
| CN113989821B (en) * | 2021-11-09 | 2025-09-05 | 中国建设银行股份有限公司 | Table image processing method, system, device, medium and program product |
| CN118657803B (en) * | 2024-08-16 | 2024-11-22 | 中南大学 | Method, device and system for extracting image frame line of drilling histogram |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103377177B (en) * | 2012-04-27 | 2016-03-30 | 北大方正集团有限公司 | Method and the device of form is identified in a kind of digital layout files |
| WO2016207875A1 (en) * | 2015-06-22 | 2016-12-29 | Photomyne Ltd. | System and method for detecting objects in an image |
Patent Citations (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2006000103A1 (en) * | 2004-06-29 | 2006-01-05 | Universite De Sherbrooke | Spiking neural network and use thereof |
| CN101657825A (en) * | 2006-05-11 | 2010-02-24 | 普莱姆传感有限公司 | Modeling Human Figures from Depth Maps |
| CN107784301A (en) * | 2016-08-31 | 2018-03-09 | 百度在线网络技术(北京)有限公司 | Method and apparatus for identifying character area in image |
| CN107491730A (en) * | 2017-07-14 | 2017-12-19 | 浙江大学 | A kind of laboratory test report recognition methods based on image procossing |
| CN109446487A (en) * | 2018-11-01 | 2019-03-08 | 北京神州泰岳软件股份有限公司 | A kind of method and device parsing portable document format document table |
| CN109583324A (en) * | 2018-11-12 | 2019-04-05 | 武汉大学 | A kind of pointer meters reading automatic identifying method based on the more box detectors of single-point |
| CN110428414A (en) * | 2019-08-02 | 2019-11-08 | 杭州睿琪软件有限公司 | The method and device of bill quantity in a kind of identification image |
| CN110516208A (en) * | 2019-08-12 | 2019-11-29 | 深圳智能思创科技有限公司 | A kind of system and method extracted for PDF document table |
| CN111027429A (en) * | 2019-11-29 | 2020-04-17 | 陈韬文 | A data preprocessing method and system for intelligent identification of electrical drawings |
| CN111091090A (en) * | 2019-12-11 | 2020-05-01 | 上海眼控科技股份有限公司 | Bank report OCR recognition method, device, platform and terminal |
Non-Patent Citations (5)
| Title |
|---|
| Extracting Major Lines by Recruiting Zero-Threshold Canny Edge Links along Sobel Highlights; Jaewoong Kim et al.; IEEE Signal Processing Letters; Vol. 22, No. 10; 1689-1692 * |
| Research on citation sentiment recognition based on citation content analysis; Liao Junhua; Liu Ziqiang; Bai Rujiang; Chen Junying; Library and Information Service, No. 15; 113-122 * |
| A multi-threshold table frame line extraction algorithm for images based on a line segment detector; Liu Yunkai et al.; Journal of Computer Applications; Vol. 41, No. S1; 250-254 * |
| An edge-preserving binarization algorithm for unevenly illuminated text images; Zeng Fanfeng; Guo Yuyang; Xiao Ke; Computer Engineering and Design, No. 03; 148-152 * |
| Research and implementation of an intelligent recognition system for medical laboratory report data; Li Hang; China Master's Theses Full-text Database, Medicine and Health Sciences, No. 9; E054-48 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112036232A (en) | 2020-12-04 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN112036232B (en) | Image table structure identification method, system, terminal and storage medium | |
| CN109753838B (en) | Two-dimensional code identification method, device, computer equipment and storage medium | |
| EP3309703B1 (en) | Method and system for decoding qr code based on weighted average grey method | |
| JP6000455B2 (en) | Form recognition method and form recognition apparatus | |
| CN110781885A (en) | Text detection method, device, medium and electronic equipment based on image processing | |
| CN111461133B (en) | Express delivery surface single item name identification method, device, equipment and storage medium | |
| CN110619333A (en) | Text line segmentation method, text line segmentation device and electronic equipment | |
| CN108256445B (en) | Lane line detection method and system | |
| CN112926564A (en) | Picture analysis method, system, computer device and computer-readable storage medium | |
| CN113313092B (en) | Handwritten signature recognition method, and claims settlement automation processing method, device and equipment | |
| CN114913350B (en) | Material duplicate checking method, device, equipment and storage medium | |
| CN114384073B (en) | Subway tunnel crack detection method and system | |
| CN110580481A (en) | A key position detection method of light field image based on EPI | |
| CN111797796A (en) | Construction method of parking specification detection model, parking specification detection method, system, terminal and medium | |
| CN111340837A (en) | Image processing method, device, equipment and storage medium | |
| CN109447022A (en) | A kind of lens type recognition methods and device | |
| CN112052702A (en) | Method and device for identifying two-dimensional code | |
| CN114943729A (en) | Cell counting method and system for high-resolution cell image | |
| CN110728276B (en) | A license plate recognition method and device | |
| CN104809460B (en) | Method for generating crystal center position map and method for generating crystal pixel lookup table | |
| CN109035285B (en) | Image boundary determining method and device, terminal and storage medium | |
| CN114092542A (en) | A two-dimensional vision-based bolt measuring method and system | |
| CN113468928B (en) | Rotating background video recognition method, device, computer equipment and storage medium | |
| CN109740337B (en) | Method and device for realizing identification of slider verification code | |
| CN114627457A (en) | A method and device for identifying face information |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |