CN119418139B - Test paper structure acquisition method based on artificial intelligence and graph attention neural network - Google Patents


Info

Publication number
CN119418139B
CN119418139B CN202510018727.0A CN202510018727A CN119418139B CN 119418139 B CN119418139 B CN 119418139B CN 202510018727 A CN202510018727 A CN 202510018727A CN 119418139 B CN119418139 B CN 119418139B
Authority
CN
China
Prior art keywords
answer
picture
node
neural network
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510018727.0A
Other languages
Chinese (zh)
Other versions
CN119418139A (en
Inventor
于丁
田艳莉
王晨太
卢衡
曹翠妙
马铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Hejuli Intelligent Technology Co.,Ltd.
Original Assignee
Beijing Heqi Juli Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Heqi Juli Education Technology Co ltd
Priority to CN202510018727.0A
Publication of CN119418139A
Application granted
Publication of CN119418139B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/042 Knowledge-based neural networks; Logical representations of neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the application provide a test paper structure acquisition method based on artificial intelligence and a graph attention neural network, relating to the technical field of intelligent information extraction. The method comprises: acquiring a set number of answer sheet pictures; partitioning each answer sheet picture into blocks based on a set partitioning rule to obtain a partitioning result; converting the partitioning result into set vectors through a convolutional neural network and normalizing them to obtain a node feature map; labeling the node feature map according to set divided regions to obtain a labeling result; training a graph attention model based on a set answer sheet construction rule in combination with the labeling result; and predicting new answer sheets with the trained graph attention model. The method automatically and accurately extracts global structure information from the answer sheet without labeling a large amount of category information, is applicable to the processing of various subsequent test papers, improves generalization capability, and adapts well to noise and complex typesetting.

Description

Test paper structure acquisition method based on artificial intelligence and graph attention neural network
Technical Field
The application relates to the technical field of intelligent information extraction, and in particular to a test paper structure acquisition method based on artificial intelligence and a graph attention neural network.
Background
Conventional test paper structure analysis methods are generally based on traditional image processing or on a single deep learning model, and both have shortcomings. Traditional methods adapt poorly to noise and complex typesetting. Deep learning models are good at extracting local features but have difficulty handling global structural dependencies such as the order of the questions. In addition, the charts, formulas, and mixed typesetting found in test papers place high demands on the generalization capability of the model.
Disclosure of Invention
The embodiments of the application aim to provide a test paper structure acquisition method based on artificial intelligence and a graph attention neural network, in order to solve the problems that existing test paper structure analysis methods have difficulty handling global structural dependencies, generalize insufficiently, and adapt poorly to noise and complex typesetting.
In a first aspect, an embodiment of the present application provides a test paper structure acquisition method based on artificial intelligence and a graph attention neural network, the method comprising:
acquiring a set number of answer sheet pictures;
partitioning each answer sheet picture into blocks based on a set partitioning rule to obtain a partitioning result;
converting the partitioning result into set vectors through a convolutional neural network and normalizing them to obtain a node feature map;
labeling the node feature map according to set divided regions to obtain a labeling result, wherein the labeling result comprises the divided-region label information of each node;
training a graph attention model based on a set answer sheet construction rule in combination with the labeling result;
and predicting new answer sheets with the trained graph attention model.
In the above implementation, a set number of answer sheet pictures are acquired and partitioned into blocks based on a set partitioning rule; the partitioning result is converted into set vectors through a convolutional neural network and normalized to obtain a node feature map; the node feature map is labeled according to the set divided regions to obtain a labeling result that includes the divided-region label information of each node; a graph attention model is trained based on the set answer sheet construction rule in combination with the labeling result; and new answer sheets are predicted with the trained model. By training the graph attention model, global structure information on the answer sheet can be extracted automatically and accurately without labeling a large amount of category information. The method is also applicable to the processing of various subsequent answer sheets, improves generalization capability, and adapts well to noise and complex typesetting.
Further, partitioning each answer sheet picture into blocks based on the set partitioning rule to obtain the partitioning result comprises:
obtaining, through Gabor transformation, feature information of each answer sheet picture at different scales and in different directions in the frequency domain, wherein the feature information comprises text lines;
cutting each answer sheet picture into blocks, and resizing the corresponding region images to the same size;
and obtaining the position of each line of text in all answer sheet pictures according to the relative positions in the first answer sheet picture, and taking each line of text as a node, wherein each node comprises the Gabor feature information of each answer sheet picture, the difference between the Gabor transformations of the first and second answer sheet pictures, and the difference between the Gabor transformations of the first and third answer sheet pictures.
In the above implementation, the global structure information of the test paper is extracted automatically and accurately, and partitioning each answer sheet picture into blocks facilitates node processing.
Further, obtaining, through Gabor transformation, the feature information of each answer sheet picture at different scales and in different directions in the frequency domain comprises:
processing the first answer sheet picture with Gabor filters of specific scales and specific directions to extract feature values, so as to obtain a set number of feature images;
selecting the feature image of the first scale and the first direction, partitioning it into blocks by a projection method, and then performing vertical projection and normalization on the partitioned feature image;
partitioning the feature image into blocks along the second direction according to the normalization result, and performing binarization;
cutting the feature image into blocks along the first direction according to the binarization result, and invoking a region detection algorithm on each block to obtain the text lines of each block;
applying the same processing to the remaining feature images to obtain the text lines of all feature images of the first answer sheet picture;
and applying the same processing to the remaining answer sheet pictures to obtain the text lines of each answer sheet picture at different scales and in different directions in the frequency domain.
In the above implementation, the Gabor filter captures local texture and edge information of the image, yields stable features, and adapts well to noise and complex typesetting.
Further, converting the partitioning result into set vectors through the convolutional neural network and normalizing them to obtain the node feature map comprises:
combining the region images of the first answer sheet picture into an input of set dimensions, converting the input into set vectors through a convolutional neural network (CNN), normalizing the information of the same node across the answer sheet pictures, and then concatenating all nodes to obtain the node feature map.
In the above implementation, normalizing the node features yields a highly accurate node feature map for labeling and training.
Further, labeling the node feature map according to the set divided regions to obtain the labeling result comprises:
labeling each node of the node feature map as belonging to the question area, the answer area, or the filling area, to obtain the divided-region label information of each node.
In the above implementation, the node feature map is divided into regions as required, which facilitates training of the graph attention model.
Further, training the graph attention model based on the set answer sheet construction rule in combination with the labeling result comprises:
parsing and loading all answer sheet construction files, wherein the answer sheet construction files comprise a file storing the edge relations of nodes with the same spatial frequency, the region images to which the nodes belong, the sequence numbers of the region images, the divided-region label information of the nodes, and the attribute features of the nodes;
assigning the nodes to different region images according to the region images to which they belong;
constructing a graph structure from the file storing the edge relations of nodes with the same spatial frequency;
selecting a graph neural network module according to the set task type;
training the graph attention model with the node attribute features and the graph structure as inputs;
performing graph-level supervised learning on the graph attention model using the graph labels;
concatenating the nodes to fuse the information of different spatial frequencies;
predicting an unseen answer sheet picture with the trained graph attention model;
judging, based on the prediction result, whether the feature map meets the requirements;
and if so, using the graph attention model to predict subsequent answer sheet pictures.
In the above implementation, training the graph attention model facilitates subsequent marking of answer sheets and allows the global structure information of the test paper to be extracted automatically and accurately.
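The attention computation at the core of a graph attention model can be sketched as follows. This is a minimal single-head layer in NumPy that follows the standard graph attention formulation rather than any architecture disclosed in the source; the function name `gat_layer`, the weight shapes, the LeakyReLU slope of 0.2, and the added self-loops are all assumptions made for illustration.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(h, edges, W, a, slope=0.2):
    """Single-head graph attention layer (illustrative sketch).

    h:     (N, F) node attribute features
    edges: list of 0-based (i, j) node pairs, treated as undirected
    W:     (F, F_out) shared linear transform (assumed shape)
    a:     (2 * F_out,) attention vector (assumed shape)
    Returns the updated node features and the attention matrix.
    """
    n = h.shape[0]
    z = h @ W
    f_out = z.shape[1]

    # Adjacency mask with self-loops, so every node attends to itself.
    adj = np.eye(n, dtype=bool)
    for i, j in edges:
        adj[i, j] = True
        adj[j, i] = True

    # e[i, j] = LeakyReLU(a^T [z_i || z_j]), split into two dot products.
    e = leaky_relu((z @ a[:f_out])[:, None] + (z @ a[f_out:])[None, :], slope)

    # Softmax over each node's neighbours only; non-edges get weight 0.
    e = np.where(adj, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ z, alpha
```

Because attention is restricted to the edges of the graph, a node only aggregates information from nodes it is connected to, which is how edge relations between text-line nodes would influence the prediction for each node.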
In a second aspect, an embodiment of the present application further provides a test paper structure acquisition device based on artificial intelligence and a graph attention neural network, the device comprising:
an answer sheet acquisition module, configured to acquire a set number of answer sheet pictures;
a block processing module, configured to partition each answer sheet picture into blocks based on a set partitioning rule to obtain a partitioning result;
a node processing module, configured to convert the partitioning result into set vectors through a convolutional neural network and normalize them to obtain a node feature map;
a result labeling module, configured to label the node feature map according to set divided regions to obtain a labeling result, wherein the labeling result comprises the divided-region label information of each node;
a model training module, configured to train a graph attention model based on a set answer sheet construction rule in combination with the labeling result;
and a model prediction module, configured to predict new answer sheets with the trained graph attention model.
In a third aspect, an embodiment of the present application provides an electronic device, comprising:
a processor, a memory, and a bus, wherein the processor is connected to the memory through the bus, the memory stores computer-readable instructions, and the computer-readable instructions, when executed by the processor, implement the test paper structure acquisition method based on artificial intelligence and a graph attention neural network described above.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a server, implements the test paper structure acquisition method based on artificial intelligence and a graph attention neural network described above.
In a fifth aspect, an embodiment of the present invention provides a computer program product comprising instructions that, when executed by a computer, cause the computer to implement the test paper structure acquisition method based on artificial intelligence and a graph attention neural network described above.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope, and other related drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of the test paper structure acquisition method based on artificial intelligence and a graph attention neural network provided by an embodiment of the application;
FIG. 2 is a schematic diagram of the 12 feature images obtained after Gabor transformation in the method provided by an embodiment of the application;
FIG. 3 is a schematic diagram of a partitioned feature image in the method provided by an embodiment of the application;
FIG. 4 is a schematic diagram of a feature image after vertical projection and normalization in the method provided by an embodiment of the application;
FIG. 5 is a schematic diagram of the feature image of the left region in the method provided by an embodiment of the application;
FIG. 6 is a schematic diagram of the vertical cutting regions of the feature image of the left region in the method provided by an embodiment of the application;
FIG. 7 is a schematic diagram of one cut region of the feature image after cutting along the vertical direction in the method provided by an embodiment of the application;
FIG. 8 is a schematic diagram of the text lines of one cut region in the method provided by an embodiment of the application;
FIG. 9 is a node feature map in the method provided by an embodiment of the application;
FIG. 10 is a schematic flow chart of the test paper structure acquisition device based on artificial intelligence and a graph attention neural network provided by an embodiment of the application;
FIG. 11 is a schematic structural diagram of an electronic device provided by an embodiment of the application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Referring to FIG. 1, FIG. 1 is a schematic flow chart of the test paper structure acquisition method based on artificial intelligence and a graph attention neural network provided by an embodiment of the present application. The method comprises the following steps:
100. Acquiring a set number of answer sheet pictures.
Optionally, if the answer sheets are submitted through an online questionnaire or examination platform, the answer sheets, or screenshots of them, are downloaded directly from the platform; if the answer sheets are on paper, they are converted into digital pictures with a scanner or camera and stored. All answer sheet pictures are downloaded, and a set number of answer sheet pictures of the same set of test papers are taken.
200. Partitioning each answer sheet picture into blocks based on a set partitioning rule to obtain a partitioning result.
210. Obtaining, through Gabor transformation, feature information of each answer sheet picture at different scales and in different directions in the frequency domain, wherein the feature information comprises text lines.
It should be noted that the Gabor wavelet is a multi-scale, multi-directional filter that captures local texture and edge information of an image. It is particularly suitable for describing strongly directional features (such as lines, edges, and textures) and yields stable features.
Specifically, Gabor transformation is used to obtain feature information of the pictures at different scales and in different directions in the frequency domain, which works well for recognizing densely textured content such as text. FIG. 2 shows the information extracted by the two-dimensional Gabor transformation at 3 spatial frequencies and in 4 directions, where the 4 directions include the horizontal and vertical directions, giving 12 feature images in total. The information extracted in the vertical and horizontal directions is then used to partition the test paper into blocks.
211. Processing the first answer sheet picture with Gabor filters of specific scales and specific directions to extract feature values, so as to obtain a set number of feature images.
Specifically, the first answer sheet picture is processed by the two-dimensional Gabor transformation, and feature images are extracted at 3 spatial frequencies and in 4 directions, giving 12 feature images. For example, referring to FIG. 2, "original image" is the initial image; the label "scale1, dir1" of the first feature image denotes the first scale and the first direction; the label "scale1, dir2" of the second feature image denotes the first scale and the second direction; and so on, so that, for example, the label "scale2, dir3" of the seventh feature image denotes the second scale and the third direction. In the embodiment of the present application, the first direction Dir1 is set to vertical, and the fourth direction Dir4 is set to horizontal.
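A bank of 3 spatial frequencies by 4 directions can be sketched as follows. This is a minimal NumPy illustration, not the implementation used in the source: the frequency values, the Gaussian width `sigma`, the kernel size, the evenly spaced angles, and the use of circular FFT convolution are all assumptions, and the `(scale, direction)` keys are simple loop indices that need not match the Dir1 to Dir4 labels of FIG. 2.

```python
import numpy as np

def gabor_kernel(frequency, theta, sigma=4.0, size=21):
    """Real-valued 2-D Gabor kernel: Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    kernel = envelope * np.cos(2 * np.pi * frequency * x_rot)
    return kernel - kernel.mean()  # remove the DC component

def gabor_bank_responses(image, frequencies=(0.05, 0.1, 0.2), n_dirs=4):
    """Return {(scale_index, dir_index): response magnitude}, 1-based keys.

    The image must be at least as large as the kernel in both dimensions.
    """
    img = image.astype(float)
    img_fft = np.fft.fft2(img)
    responses = {}
    for s, freq in enumerate(frequencies, start=1):
        for d in range(n_dirs):
            theta = d * np.pi / n_dirs  # assumed angles: 0, 45, 90, 135 degrees
            k = gabor_kernel(freq, theta)
            kh, kw = k.shape
            # Zero-pad the kernel to the image size and centre it at (0, 0).
            padded = np.zeros_like(img)
            padded[:kh, :kw] = k
            padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
            # Circular convolution via the FFT.
            resp = np.real(np.fft.ifft2(img_fft * np.fft.fft2(padded)))
            responses[(s, d + 1)] = np.abs(resp)
    return responses
```

Calling `gabor_bank_responses(picture)` on a grayscale array yields 12 response images, one per frequency and direction pair, which play the role of the 12 feature images described above.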
212. Selecting the feature image of the first scale and the first direction, partitioning it into blocks by a projection method, and then performing vertical projection and normalization on the partitioned feature image.
For example, the first feature image (scale1, Dir1) is selected to obtain the Gabor feature values in the vertical direction, and the first feature image is partitioned by a projection method to obtain the partitioned feature image shown in FIG. 3.
Illustratively, the partitioned first feature image is projected vertically and then normalized, giving the feature image after vertical projection and normalization shown in FIG. 4.
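The projection and normalization step can be sketched as follows. The source does not state how the projection is normalized, so this sketch assumes that "vertical projection" means summing each column of the feature image and that min-max scaling to the range [0, 1] is used; the function name `vertical_projection` is hypothetical.

```python
import numpy as np

def vertical_projection(feature_image):
    """Sum each column of the image, then min-max normalize to [0, 1].

    A flat profile (all columns equal) is mapped to all zeros to avoid
    dividing by zero.
    """
    profile = feature_image.astype(float).sum(axis=0)
    span = profile.max() - profile.min()
    if span == 0:
        return np.zeros_like(profile)
    return (profile - profile.min()) / span
```

Peaks in the resulting profile mark columns with a strong Gabor response, which is the kind of signal a projection method can use to locate block boundaries.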
213. Partitioning the feature image into blocks along the second direction according to the normalization result, and then performing binarization.
Illustratively, the normalized first feature image is divided in the horizontal direction into a left region and a right region, specifically at the position with the highest value. Referring to FIG. 5, the feature image of the left region is binarized. After binarization, the projection values in the vertical direction are used to divide the region in the vertical direction according to a set projection value; optionally, the set projection value is 255 × 0.5 × the picture height. Referring to FIG. 6, the feature image is cut into 4 vertical blocks.
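The binarization and threshold-based cutting can be sketched as follows. Only the cut value of 255 × 0.5 × picture height comes from the source; the binarization threshold (the image mean), the function names, and the choice to return runs of columns at or above the cut value are assumptions. The source does not say whether such runs are the content blocks or the separators between them.

```python
import numpy as np

def binarize(image, threshold=None):
    """Map pixels above the threshold to 255 and the rest to 0.

    The threshold defaults to the image mean, which is an assumption.
    """
    img = image.astype(float)
    if threshold is None:
        threshold = img.mean()
    return np.where(img > threshold, 255, 0).astype(np.int64)

def split_columns(binary, ratio=0.5):
    """Return (start, end) column ranges whose projection reaches the cut value.

    The cut value is 255 * ratio * height; `end` is exclusive.
    """
    height = binary.shape[0]
    cut_value = 255 * ratio * height
    mask = binary.sum(axis=0) >= cut_value
    runs, start = [], None
    for x, on in enumerate(mask):
        if on and start is None:
            start = x
        elif not on and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(mask)))
    return runs
```

With `ratio=0.5`, a column qualifies when at least half of its pixels are white after binarization.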
214. Cutting the feature image into blocks along the first direction according to the binarization result, and invoking a region detection algorithm on each block to obtain the text lines of each block.
For example, FIG. 7 shows one block of the cut region after the feature image is cut along the vertical direction. The MSER algorithm is then invoked to obtain the text lines; FIG. 8 shows the text lines of one block of the cut region.
215. Applying the same processing to the remaining feature images to obtain the text lines of all feature images of the first answer sheet picture.
Specifically, the same processing is applied to the remaining 11 feature images to obtain the text lines of all feature images of the first answer sheet picture. Illustratively, the fourth feature image (scale1, Dir4) is selected to obtain the Gabor feature values in the horizontal direction, and is then projected and cut horizontally in the same manner as the first feature image (scale1, Dir1) to obtain the text lines.
216. Applying the same processing to the remaining answer sheet pictures to obtain the text lines of each answer sheet picture at different scales and in different directions in the frequency domain.
The second and third answer sheet pictures are processed in the same manner as the first answer sheet picture, so that the text lines of each answer sheet picture at different scales and in different directions in the frequency domain are obtained.
220. Cutting each answer sheet picture into blocks, and resizing the corresponding region images to the same size.
It can be understood that the three answer sheet pictures are each cut into blocks, and the corresponding regions of the three pictures are all resized to the same size after cutting.
230. Obtaining the position of each line of text in all answer sheet pictures according to the relative positions in the first answer sheet picture, and taking each line of text as a node, wherein each node comprises the Gabor feature information of each answer sheet picture, the difference between the Gabor transformations of the first and second answer sheet pictures, and the difference between the Gabor transformations of the first and third answer sheet pictures.
It can be understood that the position of each line of text in the three answer sheet pictures is obtained from the relative positions, and each line of text is taken as a node. Each node contains the following information: the feature information of the 12 Gabor feature images at the corresponding position in each answer sheet picture; the differences between the 12 Gabor feature images of the first and second answer sheet pictures (computed between feature images of the same spatial frequency and direction); and the differences between the 12 Gabor feature images of the first and third answer sheet pictures (computed in the same way).
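The content of one node can be sketched as follows. The source specifies what a node contains (the Gabor information of each picture plus the two differences relative to the first picture) but not how a text-line region is summarized, so this sketch assumes each of the 12 feature images is reduced to its mean over the text-line box; the function name `node_features` and the box format are hypothetical.

```python
import numpy as np

def node_features(gabor_stacks, box):
    """Build the feature vector of one text-line node.

    gabor_stacks: list of arrays, one per answer sheet picture, each of
                  shape (12, H, W) holding that picture's 12 feature images
    box:          (top, bottom, left, right) pixel bounds of the text line,
                  with bottom and right exclusive
    """
    top, bottom, left, right = box
    # One 12-element summary per picture (mean pooling is an assumption).
    per_picture = [
        stack[:, top:bottom, left:right].mean(axis=(1, 2))
        for stack in gabor_stacks
    ]
    # Differences against the first picture, matched by frequency and direction.
    first = per_picture[0]
    diffs = [first - other for other in per_picture[1:]]
    return np.concatenate(per_picture + diffs)
```

With three pictures this gives 3 × 12 per-picture values followed by 2 × 12 difference values, 60 values in total per node.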
Therefore, the embodiment of the application extracts the global structure information of the test paper automatically and accurately, and partitioning each answer sheet picture into blocks facilitates node processing.
300. Converting the partitioning result into set vectors through a convolutional neural network and normalizing them to obtain a node feature map.
Specifically, the region images of the first answer sheet picture are combined into an input of set dimensions, the input is converted into set vectors through a convolutional neural network (CNN), the information of the same node across the answer sheet pictures is normalized, and all nodes are then concatenated to obtain the node feature map.
Illustratively, the region images of the first answer sheet picture are combined into an n × width × height input, where n is the number of pictures and width and height are the width and height of each picture. The input is converted into a 1 × 1024 vector through the CNN, the information of the same node across the answer sheet pictures is normalized, and all nodes are then concatenated to obtain the node feature map shown in FIG. 9.
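The normalize-and-concatenate step can be sketched as follows. The CNN itself is not described in the source, so it is replaced here by a clearly labeled placeholder (a fixed random projection with a ReLU) that only reproduces the 1 × 1024 output shape. The L2 normalization and the names `placeholder_encoder` and `build_node_feature_map` are likewise assumptions.

```python
import numpy as np

def placeholder_encoder(region_stack, out_dim=1024, seed=0):
    """Stand-in for the CNN: maps an (n, height, width) stack to a vector.

    This is NOT a convolutional network; it is a fixed random projection
    used only so that the rest of the sketch can run.
    """
    rng = np.random.default_rng(seed)
    flat = region_stack.astype(float).ravel()
    weights = rng.normal(size=(flat.size, out_dim)) / np.sqrt(flat.size)
    return np.maximum(flat @ weights, 0.0)

def build_node_feature_map(region_stacks):
    """Encode each node's region stack, L2-normalize it, and stack the rows."""
    rows = []
    for stack in region_stacks:
        vec = placeholder_encoder(stack)
        norm = np.linalg.norm(vec)
        rows.append(vec / norm if norm > 0 else vec)
    return np.vstack(rows)  # shape: (number of nodes, 1024)
```

Each row of the result corresponds to one node, so the matrix can serve directly as the node attribute input of a graph model.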
400. Labeling the node feature map according to set divided regions to obtain a labeling result, wherein the labeling result comprises the divided-region label information of each node.
Specifically, each node of the node feature map is labeled as belonging to the question area (printed part), the answer area (handwritten part), or the filling area (student number, multiple-choice questions, absence mark), to obtain the divided-region label information of each node. Optionally, the nodes of the node feature map can be labeled automatically or manually.
500. Train a graph attention model based on the set answer sheet construction rules in combination with the labeling result.
All answer sheet construction files are parsed and loaded. The answer sheet construction files comprise a file storing the edge relations of nodes with the same spatial frequency, the region image to which each node belongs, the sequence numbers of the region images, the divided-region label information of the nodes, and the attribute features of the nodes.
The edge relations of nodes with the same spatial frequency are stored in the file DS_A.txt, where each edge is expressed as a pair of nodes. Nodes with different spatial frequencies have no edges between them; edges exist only between nodes with the same spatial frequency. The different spatial frequencies can therefore be computed separately at first, and their information is fused by concatenation (concat) when the node information is finally predicted.
The region image to which each node belongs is stored in DS_graph_indicator.txt; the image sequence numbers run from 1 to X.
The sequence numbers of the region images are stored in DS_graph_labels.txt, which represents the sequence number of each graph, for example from 1 to X.
The divided-region label information of the nodes is stored in DS_node_labels, which holds the label of each node in the graph, that is, the class number of the node.
The attribute features of the nodes are stored in DS_node_attributes.txt.
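The following sketch shows one way these five files could be parsed. The source does not specify the row format, so the layout is assumed: one row per node in the per-node files, 1-based node and graph IDs, and comma-separated values. This mirrors the plain-text layout widely used for graph-classification benchmark datasets. The `.txt` suffix on `DS_node_labels`, the label codes 0, 1 and 2, and the function name `load_graphs` are likewise illustrative assumptions.

```python
from collections import defaultdict


def load_graphs(files):
    """Parse the five construction files, given as {file name: text}."""
    def rows(name):
        return [ln.strip() for ln in files[name].splitlines() if ln.strip()]

    # Each row of DS_A.txt is assumed to be one edge written as "src, dst".
    edges = [tuple(int(x) for x in r.split(",")) for r in rows("DS_A.txt")]
    indicator = [int(r) for r in rows("DS_graph_indicator.txt")]
    graph_labels = [int(r) for r in rows("DS_graph_labels.txt")]
    node_labels = [int(r) for r in rows("DS_node_labels.txt")]
    node_attrs = [[float(x) for x in r.split(",")]
                  for r in rows("DS_node_attributes.txt")]
    if not (len(indicator) == len(node_labels) == len(node_attrs)):
        raise ValueError("per-node files must have one row per node")

    # Assign every node, and every edge, to the graph it belongs to.
    graphs = defaultdict(lambda: {"nodes": [], "edges": []})
    for node_id, graph_id in enumerate(indicator, start=1):
        graphs[graph_id]["nodes"].append(node_id)
    for src, dst in edges:
        if indicator[src - 1] != indicator[dst - 1]:
            raise ValueError("an edge may not connect two different graphs")
        graphs[indicator[src - 1]]["edges"].append((src, dst))
    for graph_id, g in graphs.items():
        g["label"] = graph_labels[graph_id - 1]
    return dict(graphs), node_labels, node_attrs


# Hypothetical file contents: two graphs with two nodes each.
example = {
    "DS_A.txt": "1, 2\n2, 1\n3, 4\n4, 3\n",
    "DS_graph_indicator.txt": "1\n1\n2\n2\n",
    "DS_graph_labels.txt": "1\n2\n",
    "DS_node_labels.txt": "0\n1\n2\n0\n",
    "DS_node_attributes.txt": "0.1, 0.2\n0.3, 0.4\n0.5, 0.6\n0.7, 0.8\n",
}
graphs, node_labels, node_attrs = load_graphs(example)
print(graphs[1]["edges"], graphs[2]["label"])
```

Rejecting edges that cross graphs makes the rule stated above explicit in code: edges may only connect nodes that belong together.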
The nodes are assigned to the different region images according to the region image to which each node belongs; a graph structure is constructed from the file storing the edge relations of nodes with the same spatial frequency; a graph neural network module is selected according to the set task type; the graph attention model is trained with the node attribute features and the graph structure as input; graph-level supervised learning is performed on the graph attention model using the graph labels; and the nodes are spliced to fuse the information of the different spatial frequencies.
Further, the trained graph attention model is used to predict unknown answer sheet pictures, and whether the resulting feature maps meet the requirements is judged from the prediction result. If so, the graph attention model is used to predict subsequent answer sheet pictures; if not, the graph attention model is trained again.
Training the graph attention model in this way facilitates subsequent answer sheet grading and allows the global structure information of the test paper to be extracted automatically and accurately.
600. Predict a new answer sheet using the trained graph attention model.
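To make the attention computation and the per-frequency fusion concrete, the sketch below implements the forward pass of a single-head graph attention layer in NumPy and applies it once per spatial frequency before concatenating the results. It is an illustration under stated assumptions, not the patented model: the source does not give the layer architecture, the weights here are random rather than trained, gradient-based training is omitted, and the sizes, the chain-shaped adjacency, and the names `gat_layer`, `W` and `a` are hypothetical.

```python
import numpy as np


def gat_layer(h, adj, W, a, slope=0.2):
    """Single-head graph attention layer; returns (new features, attention)."""
    z = h @ W                                    # (N, F_out) projected features
    f_out = z.shape[1]
    # e[i, j] = LeakyReLU(a^T [z_i || z_j]), split into two dot products.
    e = (z @ a[:f_out])[:, None] + (z @ a[f_out:])[None, :]
    e = np.where(e > 0, e, slope * e)
    # Each node attends only to its neighbours and to itself.
    mask = (adj + np.eye(len(h))) > 0
    e = np.where(mask, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)         # numerical stability
    w = np.exp(e)
    alpha = w / w.sum(axis=1, keepdims=True)     # softmax over neighbours
    return alpha @ z, alpha


rng = np.random.default_rng(0)
n_nodes, f_in, f_out, n_freq = 4, 6, 3, 3
# Hypothetical graph: a chain of four text lines, 0-1, 1-2, 2-3.
adj = np.zeros((n_nodes, n_nodes))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

per_freq = []
for _ in range(n_freq):                          # one pass per spatial frequency
    h = rng.normal(size=(n_nodes, f_in))         # stand-in node attributes
    W = rng.normal(size=(f_in, f_out))
    a = rng.normal(size=2 * f_out)
    out, alpha = gat_layer(h, adj, W, a)
    per_freq.append(out)

fused = np.concatenate(per_freq, axis=1)         # concat fusion across frequencies
print(fused.shape)
```

Because no edges connect nodes of different spatial frequencies, each frequency can be processed independently as above, and the concatenation is the only point where their information meets.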
The embodiment of the application acquires a set number of answer sheet pictures; blocks each answer sheet picture based on a set blocking rule to obtain a blocking result; normalizes the set vectors obtained by transforming the blocking result through a convolutional neural network to obtain a node feature map; labels the node feature map according to set divided regions to obtain a labeling result that comprises the divided-region label information of each node; trains a graph attention model based on set answer sheet construction rules in combination with the labeling result; and predicts a new answer sheet using the trained graph attention model. Because a graph attention model is trained, a large amount of category information does not need to be labeled, the global structure information of the answer sheet is extracted automatically and accurately, the method is suitable for processing a variety of subsequent answer sheets, the generalization capability is improved, and the adaptability to noise and complex layouts is strong.
Through multi-modal feature fusion and global relation modeling over multi-level blocks of the pictures, the embodiment of the application accurately extracts key information of the test paper such as the question area (printed part), the answer area (handwritten part), and the fill-in area (student number, multiple-choice questions, absence mark). The question area is extracted and the test paper template content is produced from the position information, which provides point-location information for the subsequent recognition of identical test papers and feature-region information for assisted grading, and serves as the basic information for grading the test paper.
By combining the advantages of the Gabor transform and the graph neural network (GNN), the embodiment of the application extracts the global structure information of the test paper automatically and accurately and provides a basis for subsequent test paper analysis and intelligent evaluation. A large amount of category information does not need to be labeled, the method is suitable for processing a variety of subsequent test papers, and the generalization capability is improved.
The above steps are not necessarily performed strictly in the numbered order, and should be understood as an overall scheme.
In a second aspect, based on the above embodiment, an embodiment of the application further provides a test paper structure acquisition device based on artificial intelligence and a graph attention neural network. Referring to FIG. 10, the device provided in this embodiment specifically comprises an answer sheet acquisition module 201, a blocking processing module 202, a node processing module 203, a result labeling module 204, a model training module 205, and a model prediction module 206.
The answer sheet acquisition module 201 is used for acquiring a set number of answer sheet pictures; the blocking processing module 202 is used for blocking each answer sheet picture based on a set blocking rule to obtain a blocking result; the node processing module 203 is used for normalizing the set vectors obtained by transforming the blocking result through a convolutional neural network to obtain a node feature map; the result labeling module 204 is used for labeling the node feature map according to set divided regions to obtain a labeling result, wherein the labeling result comprises the divided-region label information of each node; the model training module 205 is used for training a graph attention model based on set answer sheet construction rules in combination with the labeling result; and the model prediction module 206 is used for predicting a new answer sheet using the trained graph attention model.
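The data flow between the six modules can be sketched as a simple composition. The class name `StructureAcquisitionDevice`, the callable signatures, and the stub functions below are hypothetical and only illustrate the order in which the modules hand results to one another; they do not implement the modules themselves.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class StructureAcquisitionDevice:
    acquire: Callable[[], List[Any]]         # answer sheet acquisition module 201
    block: Callable[[List[Any]], Any]        # blocking processing module 202
    to_nodes: Callable[[Any], Any]           # node processing module 203
    label: Callable[[Any], Any]              # result labeling module 204
    train: Callable[[Any, Any], Any]         # model training module 205
    predict: Callable[[Any, Any], Any]       # model prediction module 206

    def fit(self):
        """Run modules 201 to 205 in order and return the trained model."""
        pictures = self.acquire()
        blocks = self.block(pictures)
        nodes = self.to_nodes(blocks)
        labels = self.label(nodes)
        return self.train(nodes, labels)

    def run(self, new_sheet):
        """Train, then predict the structure of a new answer sheet."""
        model = self.fit()
        return self.predict(model, new_sheet)


calls = []


def stub(name, result):
    """Build a placeholder module that records its call and returns result."""
    def fn(*args):
        calls.append(name)
        return result
    return fn


device = StructureAcquisitionDevice(
    acquire=stub("201", ["p1", "p2", "p3"]),
    block=stub("202", "blocks"),
    to_nodes=stub("203", "node feature map"),
    label=stub("204", "labels"),
    train=stub("205", "model"),
    predict=stub("206", "structure"),
)
result = device.run("new sheet")
print(calls, result)
```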
In a third aspect, an embodiment of the application further provides an electronic device, which may integrate the test paper structure acquisition device based on artificial intelligence and a graph attention neural network of the user mode polling mechanism provided by the embodiment of the application. FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the application. Referring to FIG. 11, the electronic device comprises an input device 43, an output device 44, a memory 42, and one or more processors 41. The memory 42 is configured to store one or more programs; when the one or more programs are executed by the one or more processors 41, the one or more processors 41 implement the test paper structure acquisition method based on artificial intelligence and a graph attention neural network of the user mode polling mechanism provided in the above embodiment. The input device 43, the output device 44, the memory 42, and the processor 41 may be connected by a bus or in another manner; connection by a bus is taken as an example in FIG. 11.
The processor 41 executes the various functional applications and data processing of the device by running the software programs, instructions, and modules stored in the memory 42, that is, it implements the above test paper structure acquisition method based on artificial intelligence and a graph attention neural network of the user mode polling mechanism.
The electronic device provided by this embodiment can be used to execute the test paper structure acquisition method based on artificial intelligence and a graph attention neural network of the user mode polling mechanism provided by the above embodiment, and has the corresponding functions and beneficial effects.
In a fourth aspect, an embodiment of the application further provides a computer-readable storage medium comprising a stored computer program. When the computer program runs, the device in which the computer-readable storage medium is located is controlled to execute the test paper structure acquisition method based on artificial intelligence and a graph attention neural network described above, and the same beneficial effects as the method can be achieved.
Of course, the computer-executable instructions contained in the storage medium provided in the embodiments of the application are not limited to the test paper structure acquisition method based on artificial intelligence and a graph attention neural network described above, and may also perform related operations in the test paper structure acquisition method based on artificial intelligence and a graph attention neural network provided in any embodiment of the application.
In a fifth aspect, embodiments of the application further provide a computer program product. The methods according to the embodiments of the application may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on a computer, the processes or functions described in the various embodiments of the application are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user device, a core network device, an OAM (Open Application Model) device, or another programmable device.
The computer programs or instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium such as a floppy disk, a hard disk, or a magnetic tape; an optical medium such as a digital video disc; or a semiconductor medium such as a solid-state drive. The computer-readable storage medium may be a volatile or nonvolatile storage medium, or may include both volatile and nonvolatile types of storage medium.
In the several embodiments provided in the application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods, and computer program products according to various embodiments of the application. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application. It should be noted that like reference numerals and letters refer to like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.

Claims (9)

1. A test paper structure acquisition method based on artificial intelligence and a graph attention neural network, characterized by comprising the following steps:
acquiring a set number of answer sheet pictures;
blocking each answer sheet picture based on a set blocking rule to obtain a blocking result;
normalizing the set vectors obtained by transforming the blocking result through a convolutional neural network to obtain a node feature map;
labeling the node feature map according to set divided regions to obtain a labeling result, wherein the labeling result comprises the divided-region label information of each node;
training a graph attention model based on set answer sheet construction rules in combination with the labeling result;
predicting a new answer sheet using the trained graph attention model;
wherein blocking each answer sheet picture based on the set blocking rule to obtain the blocking result comprises:
obtaining, through Gabor transformation, feature information of each answer sheet picture at different scales and in different directions in the frequency domain, wherein the feature information comprises text lines;
cutting each answer sheet picture into blocks, and adjusting the corresponding region images to the same size;
obtaining the position of each text line in all the answer sheet pictures according to the relative positions of the first answer sheet picture, and taking each text line as a node, wherein each node comprises the Gabor-transformed feature information of each answer sheet picture, the difference between the Gabor transformations of the first answer sheet picture and the second answer sheet picture, and the difference between the Gabor transformations of the first answer sheet picture and the third answer sheet picture.
2. The test paper structure acquisition method based on artificial intelligence and a graph attention neural network according to claim 1, wherein obtaining, through Gabor transformation, the feature information of each answer sheet picture at different scales and in different directions in the frequency domain comprises the following steps:
processing the first answer sheet picture with Gabor filters of specific scales and specific directions to extract feature values, so as to obtain a set number of feature images;
selecting the feature image of a first scale and a first direction, blocking the feature image by a projection method, and then performing vertical projection and normalization on the blocked feature image;
dividing the feature image into blocks along a second direction according to the normalization result, and performing binarization;
cutting the feature image into blocks along the first direction according to the binarization result, and calling a region detection algorithm to process each block to obtain the text lines of each block;
obtaining the text lines of all feature images of the first answer sheet picture after the remaining feature images have been processed in the same way;
obtaining the text lines of each answer sheet picture at different scales and in different directions in the frequency domain after the remaining answer sheet pictures have been processed in the same way.
3. The test paper structure acquisition method based on artificial intelligence and a graph attention neural network according to claim 1, wherein normalizing the set vectors obtained by transforming the blocking result through the convolutional neural network to obtain the node feature map comprises the following steps:
combining the region images of the first answer sheet picture into an input of set dimensions, converting the input into set vectors by a convolutional neural network (CNN), normalizing the information of the same node in each answer sheet picture, and then splicing all nodes to obtain the node feature map.
4. The test paper structure acquisition method based on artificial intelligence and a graph attention neural network according to claim 1, wherein labeling the node feature map according to the set divided regions to obtain the labeling result comprises the following steps:
labeling each node of the node feature map according to the question area, the answer area, and the fill-in area to obtain the divided-region label information of each node.
5. The test paper structure acquisition method based on artificial intelligence and a graph attention neural network according to claim 4, wherein training the graph attention model based on the set answer sheet construction rules in combination with the labeling result comprises the following steps:
parsing and loading all answer sheet construction files, wherein the answer sheet construction files comprise a file storing the edge relations of nodes with the same spatial frequency, the region image to which each node belongs, the sequence numbers of the region images, the divided-region label information of the nodes, and the attribute features of the nodes;
assigning the nodes to the different region images according to the region image to which each node belongs;
constructing a graph structure from the file storing the edge relations of nodes with the same spatial frequency;
selecting a graph neural network module according to the set task type;
training the graph attention model with the node attribute features and the graph structure as input;
performing graph-level supervised learning on the graph attention model using the graph labels;
splicing the nodes to fuse the information of the different spatial frequencies;
predicting unknown answer sheet pictures using the trained graph attention model;
judging, based on the prediction result, whether the feature maps meet the requirements;
if so, using the graph attention model to predict subsequent answer sheet pictures.
6. A test paper structure acquisition device based on artificial intelligence and a graph attention neural network, characterized in that the device comprises:
an answer sheet acquisition module, used for acquiring a set number of answer sheet pictures;
a blocking processing module, used for blocking each answer sheet picture based on a set blocking rule to obtain a blocking result;
a node processing module, used for normalizing the set vectors obtained by transforming the blocking result through a convolutional neural network to obtain a node feature map;
a result labeling module, used for labeling the node feature map according to set divided regions to obtain a labeling result, wherein the labeling result comprises the divided-region label information of each node;
a model training module, used for training a graph attention model based on set answer sheet construction rules in combination with the labeling result;
a model prediction module, used for predicting a new answer sheet using the trained graph attention model;
wherein blocking each answer sheet picture based on the set blocking rule to obtain the blocking result comprises:
obtaining, through Gabor transformation, feature information of each answer sheet picture at different scales and in different directions in the frequency domain, wherein the feature information comprises text lines;
cutting each answer sheet picture into blocks, and adjusting the corresponding region images to the same size;
obtaining the position of each text line in all the answer sheet pictures according to the relative positions of the first answer sheet picture, and taking each text line as a node, wherein each node comprises the Gabor-transformed feature information of each answer sheet picture, the difference between the Gabor transformations of the first answer sheet picture and the second answer sheet picture, and the difference between the Gabor transformations of the first answer sheet picture and the third answer sheet picture.
7. An electronic device, comprising:
a processor, a memory, and a bus, wherein the processor is connected to the memory through the bus, and the memory stores computer-readable instructions which, when executed by the processor, implement the test paper structure acquisition method based on artificial intelligence and a graph attention neural network according to any one of claims 1 to 5.
8. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a server, the test paper structure acquisition method based on artificial intelligence and a graph attention neural network according to any one of claims 1 to 5 is implemented.
9. A computer program product, characterized in that it comprises instructions which, when executed by a computer, cause the computer to carry out the method according to any one of claims 1-5.
CN202510018727.0A 2025-01-07 2025-01-07 Test paper structure acquisition method based on artificial intelligence and graph attention neural network Active CN119418139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510018727.0A CN119418139B (en) 2025-01-07 2025-01-07 Test paper structure acquisition method based on artificial intelligence and graph attention neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510018727.0A CN119418139B (en) 2025-01-07 2025-01-07 Test paper structure acquisition method based on artificial intelligence and graph attention neural network

Publications (2)

Publication Number Publication Date
CN119418139A CN119418139A (en) 2025-02-11
CN119418139B true CN119418139B (en) 2025-04-01

Family

ID=94469865

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510018727.0A Active CN119418139B (en) 2025-01-07 2025-01-07 Test paper structure acquisition method based on artificial intelligence and graph attention neural network

Country Status (1)

Country Link
CN (1) CN119418139B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113095270A (en) * 2021-04-23 2021-07-09 山东大学 Unsupervised cross-library micro-expression identification method
CN113761217A (en) * 2021-04-20 2021-12-07 腾讯科技(深圳)有限公司 Artificial intelligence-based question set data processing method and device and computer equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108133233A (en) * 2017-12-18 2018-06-08 中山大学 A kind of multi-tag image-recognizing method and device
BE1028347B1 (en) * 2021-08-12 2022-11-08 Oncoradiomics METHODS, SYSTEMS, STORAGE MEDIA AND DEVICES FOR INCREASING THE ACCURACY OF CLASSIFICATION, PREDICTION OR SEGMENTATION OF MEDICAL CONDITIONS IN BIOMEDICAL IMAGES THROUGH ATTENTION MAPS
CN114420309B (en) * 2021-09-13 2023-11-21 北京百度网讯科技有限公司 Method for establishing medicine synergistic effect prediction model, prediction method and corresponding device
CN115034224B (en) * 2022-01-26 2024-10-25 华东师范大学 A news event detection method and system integrating multiple text semantic structure graph representations
CN118334431A (en) * 2024-04-23 2024-07-12 河南大学 Alzheimer's disease classification method and system based on high-order attention and graph convolution network

Also Published As

Publication number Publication date
CN119418139A (en) 2025-02-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room 2363, Building 1, No. 17 Cangjingguan Hutong, Dongcheng District, Beijing, 100007

Patentee after: Beijing Hejuli Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: Room 2363, Building 1, No. 17 Cangjingguan Hutong, Dongcheng District, Beijing, 100007

Patentee before: BEIJING HEQI JULI EDUCATION TECHNOLOGY Co.,Ltd.

Country or region before: China