CN118038104A - Method for determining target model, method and system for determining image category - Google Patents



Publication number
CN118038104A
Authority
CN
China
Prior art keywords
model
data set
category
online
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211354700.1A
Other languages
Chinese (zh)
Inventor
王耀平
李昭月
张美娟
赵小慧
柴栋
王洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BOE Technology Group Co Ltd
Beijing Zhongxiangying Technology Co Ltd
Original Assignee
BOE Technology Group Co Ltd
Beijing Zhongxiangying Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOE Technology Group Co Ltd and Beijing Zhongxiangying Technology Co Ltd
Priority to CN202211354700.1A
Publication of CN118038104A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a method and a device for determining image categories. The method comprises: obtaining a sample data set comprising a training data set and a validation data set, each of which comprises a plurality of sample images labeled with a category; training a deep learning model using the training data set, and obtaining at least two trained models corresponding to different numbers of training rounds; testing the at least two trained models with the validation data set to generate validation test results; generating, based on the validation test results, validation test metrics including at least one of: a confusion matrix, accuracy, recall, and F1 score; and determining a target model from the at least two trained models based on the validation test metrics.

Description

Method for determining target model, method and system for determining image category
Technical Field
The present disclosure relates to the field of image processing, and more particularly, to a method of determining a target model, a method of determining an image category, a system of determining an image category, a computing device, a computer-readable storage medium, and a computer program product.
Background
With the development of computer technology, techniques such as artificial intelligence and deep learning are widely used. For example, a deep learning model can classify images, which reduces manual workload and improves efficiency. In screen production, for instance, problems with equipment, parameters, operation, environmental interference, and the like can result in defective products, and the category of a screen defect can be detected by classifying screen images with a deep learning model. In actual production, however, deploying a trained model to a production line must be done with great care, because the predictive performance of the model directly affects the production process.
Disclosure of Invention
In view of the above, the present disclosure provides methods of determining a target model, methods of determining an image class, systems of determining an image class, computing devices, computer-readable storage media, and computer program products, which are expected to overcome some or all of the above-referenced drawbacks, as well as other possible drawbacks.
According to an aspect of the present disclosure, there is provided a method of determining a target model, including: obtaining a sample data set, the sample data set comprising a training data set and a validation data set, each of the training data set and the validation data set comprising a plurality of sample images that have been labeled with a category; training a deep learning model using the training data set, and obtaining at least two trained models corresponding to different numbers of training rounds; testing the at least two trained models with the validation data set to generate validation test results; generating, based on the validation test results, validation test metrics including at least one of: a confusion matrix, accuracy, recall, and F1 score; and determining a target model from the at least two trained models based on the validation test metrics.
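The selection step described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes each trained model is identified by its number of training rounds, that its validation predictions are already available as a list of labels, that the per-category "accuracy" corresponds to what is commonly called precision, and that the models are ranked by macro-averaged F1. The function names and the defect labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true categories, columns are predicted categories."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_metrics(matrix):
    """Return one (precision, recall, f1) tuple per category."""
    n = len(matrix)
    metrics = []
    for k in range(n):
        tp = matrix[k][k]
        predicted = sum(matrix[r][k] for r in range(n))  # column sum
        actual = sum(matrix[k])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-category F1 scores."""
    metrics = per_class_metrics(confusion_matrix(y_true, y_pred, classes))
    return sum(m[2] for m in metrics) / len(metrics)


def select_target_model(predictions_by_rounds, y_true, classes):
    """Pick the training-round count whose model scores the highest macro F1."""
    scores = {
        rounds: macro_f1(y_true, preds, classes)
        for rounds, preds in predictions_by_rounds.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical validation labels and predictions of two checkpoints.
classes = ["scratch", "stain"]
y_true = ["scratch", "scratch", "stain", "stain"]
predictions = {
    10: ["scratch", "stain", "stain", "stain"],    # model after 10 rounds
    20: ["scratch", "scratch", "stain", "stain"],  # model after 20 rounds
}
best_rounds, scores = select_target_model(predictions, y_true, classes)
print(best_rounds, scores)
```

Ranking by a single scalar keeps the choice automatic; the confusion matrix can still be inspected afterwards to see which categories the chosen model confuses.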
In some embodiments, the method further comprises: acquiring an offline test data set comprising at least one of: a subset partitioned from the sample data set, and an input sample data set provided by a user, the input sample data set comprising a plurality of sample images labeled with a category; and testing the target model using the offline test data set to generate an offline test result.
In some embodiments, the method further comprises: generating an accuracy curve and a recall curve for at least one category based on the verification test result or the offline test result, wherein the accuracy curve reflects a relationship between the accuracy and the confidence threshold, and the recall curve reflects a relationship between the recall and the confidence threshold; and updating the confidence threshold for at least one category according to the accuracy rate curve and the recall rate curve.
In some embodiments, updating the confidence threshold for at least one category according to the accuracy curve and the recall curve comprises: updating the confidence threshold for the at least one category according to the intersection point of the accuracy curve and the recall curve.
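One way to locate such an intersection point is sketched below for a single category. The sketch assumes that the curves are sampled on a finite grid of candidate thresholds, that the "accuracy" of a category is its precision among predictions kept at a given threshold, and that the intersection is approximated by the grid point where the two values are closest. The data and function names are hypothetical.

```python
def precision_recall_curves(scored, n_actual, thresholds):
    """Sample both curves for one category.

    scored:   (confidence, is_correct) pairs for images predicted as the category
    n_actual: number of images whose true label is the category
    """
    curve = []
    for t in thresholds:
        kept = [ok for conf, ok in scored if conf >= t]
        tp = sum(kept)
        # Convention (an assumption): precision is 1.0 when nothing is kept.
        precision = tp / len(kept) if kept else 1.0
        recall = tp / n_actual if n_actual else 0.0
        curve.append((t, precision, recall))
    return curve


def intersection_threshold(curve):
    """Return the first grid threshold where precision and recall are closest."""
    return min(curve, key=lambda row: abs(row[1] - row[2]))[0]


# Hypothetical test results for one defect category.
scored = [(0.95, True), (0.9, True), (0.8, True),
          (0.7, False), (0.6, True), (0.4, False)]
grid = [i / 10 for i in range(1, 10)]
curve = precision_recall_curves(scored, n_actual=5, thresholds=grid)
print(intersection_threshold(curve))
```

A finer grid gives a closer approximation of the crossing point, at the cost of more evaluations per category.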
In some embodiments, the method further comprises: determining a recommended confidence threshold for each category according to the accuracy or the recall, based on the validation test result or the offline test result; and updating the confidence threshold for at least one category based on the recommended confidence threshold.
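The text does not state how the recommended threshold is derived from the accuracy or the recall. As one plausible reading, the sketch below recommends the lowest candidate threshold at which a category reaches a target precision, which keeps recall as high as possible under that constraint. The target value, the candidate grid, and the function name are assumptions made for illustration.

```python
def recommend_threshold(scored, target_precision, thresholds):
    """Return the lowest threshold whose precision meets the target, or None.

    scored: (confidence, is_correct) pairs for images predicted as the category
    """
    for t in sorted(thresholds):
        kept = [ok for conf, ok in scored if conf >= t]
        if kept and sum(kept) / len(kept) >= target_precision:
            return t
    return None  # no candidate threshold reaches the target


# Hypothetical test results for one defect category.
scored = [(0.95, True), (0.9, True), (0.8, True),
          (0.7, False), (0.6, True), (0.4, False)]
print(recommend_threshold(scored, 0.9, [0.5, 0.6, 0.7, 0.8, 0.9]))
```

A recall-driven variant would instead return the highest threshold at which recall stays above a chosen floor.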
In some embodiments, the method further comprises: based on the offline test results, generating offline test metrics, the offline test metrics including at least one of: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and real quantity for each category, confidence distribution for each category; and updating the confidence threshold for at least one category according to the offline test index.
In some embodiments, the method further comprises: generating at least one of the following based on the offline test results according to the updated confidence threshold: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and true quantity for each category, confidence distribution for each category.
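Because the offline test results already contain the predicted category and confidence of each image, the metrics can be regenerated under new thresholds without running the model again. The sketch below assumes that a prediction whose confidence falls below its category threshold is relabeled with a fallback category (here "unknown"); that fallback rule, the record layout, and the function names are assumptions rather than details from the text.

```python
from collections import Counter


def apply_thresholds(results, thresholds, fallback="unknown"):
    """results: (true_label, predicted_label, confidence) triples.

    Predictions below the threshold of their category become `fallback`.
    """
    return [
        (true, pred if conf >= thresholds.get(pred, 0.0) else fallback)
        for true, pred, conf in results
    ]


def category_report(pairs):
    """Per-category output count, real count, precision, and recall."""
    real = Counter(t for t, _ in pairs)
    output = Counter(p for _, p in pairs)
    hits = Counter(t for t, p in pairs if t == p)
    report = {}
    for cat in real:
        report[cat] = {
            "output": output[cat],
            "real": real[cat],
            "precision": hits[cat] / output[cat] if output[cat] else 0.0,
            "recall": hits[cat] / real[cat],
        }
    return report


# Hypothetical stored offline test results.
results = [
    ("scratch", "scratch", 0.90),
    ("scratch", "scratch", 0.55),
    ("stain", "scratch", 0.60),
    ("stain", "stain", 0.80),
]
before = category_report(apply_thresholds(results, {"scratch": 0.5, "stain": 0.5}))
after = category_report(apply_thresholds(results, {"scratch": 0.7, "stain": 0.5}))
print(before["scratch"], after["scratch"])
```

In this example, raising the "scratch" threshold trades recall for precision, which is the effect the updated metrics are meant to make visible.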
In some embodiments, the method further comprises: acquiring an online test data set, wherein the online test data set comprises a plurality of images with unlabeled categories; testing the target model using the online test data set to generate an online test index, wherein the online test index comprises at least one of: accuracy, recall, a confusion matrix, a distribution of model output quantity and manual review quantity for each category, and a confidence distribution for each category; and one of the following: bringing the target model online in response to the online test index meeting a preset standard; or bringing the target model online in response to at least some of the online test metrics being higher than corresponding test metrics of a related online model, wherein the corresponding test metrics are derived based on output results of the related online model for the online test data set.
In some embodiments, the method further comprises one of: updating the target model by retraining or adjusting the confidence threshold in response to the online test index not meeting the preset criteria; and updating the target model by retraining or adjusting the confidence threshold in response to the online test index not being higher than the corresponding test index of the associated online model.
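The two alternative gating rules can be sketched as a small decision function. This is an illustrative reading only: the preset standard is modeled as a minimum value per metric, "at least some of the online test metrics" is modeled as a caller-chosen tuple of metric names that must all be strictly higher than those of the related online model, and the returned action strings are hypothetical.

```python
def meets_preset_standard(metrics, standard):
    """True if every metric named in `standard` reaches its minimum value."""
    return all(metrics.get(name, 0.0) >= minimum
               for name, minimum in standard.items())


def beats_online_model(candidate, online, compared):
    """True if the candidate is strictly higher on every metric in `compared`."""
    return bool(compared) and all(candidate[name] > online[name]
                                  for name in compared)


def next_action(candidate, standard=None, online=None, compared=()):
    """Apply one of the two rules, depending on which reference is supplied."""
    if standard is not None:
        ok = meets_preset_standard(candidate, standard)
    else:
        ok = beats_online_model(candidate, online, compared)
    return "bring_online" if ok else "retrain_or_adjust_threshold"


# Hypothetical online test metrics.
candidate = {"precision": 0.93, "recall": 0.88}
print(next_action(candidate, standard={"precision": 0.90, "recall": 0.90}))
print(next_action(candidate,
                  online={"precision": 0.90, "recall": 0.85},
                  compared=("precision", "recall")))
```

The first call fails the preset standard on recall and therefore returns the retrain-or-adjust action, while the second call passes because the candidate is higher than the related online model on both compared metrics.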
In some embodiments, acquiring the online test dataset includes: establishing communication connection with an image acquisition device, wherein the image acquisition device is configured to acquire an image to be inspected; receiving an image to be inspected from an image acquisition device via a communication connection; based on the received image to be inspected, an online test dataset is acquired.
In some embodiments, acquiring the online test dataset includes: an online test dataset is acquired based on images received by a relevant online model, wherein the relevant online model is configured to receive images to be inspected from an image acquisition device and predict a category of the images to be inspected based on the received images to be inspected.
In some embodiments, bringing the target model online includes: obtaining a user's audit result for the target model; and, in response to the audit result indicating that the target model is to go online, bringing the target model online such that the target model is configured to receive the image to be inspected from the image acquisition device and predict a category of the image to be inspected based on the received image to be inspected.
In some embodiments, the method further comprises: after the target model is online, acquiring online spot check data based on an image to be checked from an image acquisition device; receiving a manual rechecking result aiming at the online spot check data, wherein the manual rechecking result comprises the category of the image to be checked obtained by manual rechecking; based on the manual rechecking result and the category predicted by the target model, generating an online spot check index, wherein the online spot check index comprises at least one of the following items: accuracy, recall, confusion matrix, distribution of model output number and manual review number for each category, confidence distribution for each category.
In some embodiments, acquiring the online spot check data based on the image to be inspected from the image acquisition device includes at least one of: randomly extracting some of the images to be inspected from the image acquisition device, and generating the online spot check data based on the extracted images; and receiving a screening condition for the images to be inspected from the image acquisition device, screening those images based on the screening condition, and generating the online spot check data based on the screened images.
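Both ways of drawing spot check data can be sketched in a few lines. The sketch assumes each inspected image is represented by a record holding an identifier, the predicted category, and the confidence; the sampling fraction, the screening condition, and the function names are hypothetical.

```python
import random


def random_spot_check(images, fraction, seed=None):
    """Randomly extract roughly `fraction` of the images (at least one)."""
    if not images:
        return []
    rng = random.Random(seed)  # a seed makes the draw reproducible
    k = min(len(images), max(1, round(len(images) * fraction)))
    return rng.sample(images, k)


def filtered_spot_check(images, condition):
    """Keep only the images that satisfy a user-supplied screening condition."""
    return [img for img in images if condition(img)]


# Hypothetical records of images predicted by the online model.
images = [
    {"id": "a", "predicted": "scratch", "confidence": 0.97},
    {"id": "b", "predicted": "stain", "confidence": 0.55},
    {"id": "c", "predicted": "scratch", "confidence": 0.58},
    {"id": "d", "predicted": "stain", "confidence": 0.91},
]
sampled = random_spot_check(images, fraction=0.5, seed=0)
low_conf = filtered_spot_check(images, lambda img: img["confidence"] < 0.6)
print(len(sampled), [img["id"] for img in low_conf])
```

Random extraction gives an unbiased estimate of online performance, whereas a screening condition such as low confidence concentrates manual review on the predictions most likely to be wrong.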
In some embodiments, determining the target model from the at least two trained models based on the validation test metrics comprises: determining the target model from among the at least two trained models according to the F1 score.
In some embodiments, determining the target model from the at least two trained models further comprises: judging whether the determined target model meets preset requirements according to the confusion matrix; in response to the determined target model not meeting the preset requirements, the target model is updated by retraining or adjusting the confidence threshold.
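The text does not specify what the preset requirement on the confusion matrix is. As one hypothetical example, the check below requires that no true category be mistaken for any single other category at more than a maximum rate; the rule, the limit, and the function names are assumptions made for illustration.

```python
def worst_confusion(matrix, classes):
    """Return (rate, true_category, predicted_category) of the largest
    off-diagonal share of any row; rows are true categories."""
    worst = (0.0, None, None)
    for i, row in enumerate(matrix):
        total = sum(row)
        for j, count in enumerate(row):
            if i != j and total and count / total > worst[0]:
                worst = (count / total, classes[i], classes[j])
    return worst


def meets_requirement(matrix, classes, max_rate):
    """True if no pair of categories is confused more often than `max_rate`."""
    return worst_confusion(matrix, classes)[0] <= max_rate


# Hypothetical confusion matrix of the selected model.
classes = ["scratch", "stain"]
matrix = [[8, 2],   # true scratch: 8 correct, 2 predicted as stain
          [1, 9]]   # true stain:   1 predicted as scratch, 9 correct
print(worst_confusion(matrix, classes), meets_requirement(matrix, classes, 0.25))
```

If the check fails, the pair of categories it reports indicates where retraining data or a confidence threshold adjustment is most needed.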
In some embodiments, the sample image is an image of the target product and the category is a product defect category of the target product.
According to another aspect of the present disclosure, there is provided a method of determining an image category, comprising: predicting an image to be detected by using a target model to obtain the category of the image to be detected, wherein the target model is determined from at least two trained models according to a verification test index, the at least two trained models have different training rounds, and the verification test index comprises at least one of the following: confusion matrix, accuracy, recall, and F1 score.
According to yet another aspect of the present disclosure, there is provided a method of determining a target model, comprising: in response to a user configuration operation on a sample data set, obtaining the sample data set, the sample data set comprising a training data set and a validation data set, each of the training data set and the validation data set comprising a plurality of sample images labeled with a category; configuring training parameters according to characteristic information of the sample data set, and generating a training parameter display interface, wherein the training parameters displayed by the training parameter display interface comprise a test strategy, the test strategy comprises at least two training round numbers, and the characteristic information comprises the number of samples in the sample data set; training a deep learning model using the training data set according to the training parameters to obtain at least two trained models, wherein the at least two trained models correspond one-to-one to the at least two training round numbers; testing the at least two trained models with the validation data set to generate validation test results; generating, based on the validation test results, a validation test index display interface for displaying at least one of: a confusion matrix, accuracy, recall, and F1 score; and, in response to a user selection operation among the at least two trained models, determining the model selected by the user to be the target model.
In some embodiments, the method further comprises: generating an offline test parameter configuration interface in response to a user's operation of creating an offline test task for the target model; creating the offline test task according to the user's configuration input on the offline test parameter configuration interface; acquiring an offline test data set according to the configuration input, wherein the offline test data set comprises a plurality of sample images labeled with a category; and testing the target model using the offline test data set to generate an offline test result.
In some embodiments, updating the confidence threshold for the at least one category based on the offline test results comprises: generating a curve display interface based on the verification test result or the offline test result, wherein the curve display interface is used for displaying an accuracy rate curve and a recall rate curve of each category, the accuracy rate curve reflects the relationship between the accuracy rate and the confidence threshold, and the recall rate curve reflects the relationship between the recall rate and the confidence threshold; the confidence threshold for the at least one category is updated in response to a user modifying the confidence threshold for the at least one category.
In some embodiments, the method further comprises: determining a recommendation confidence threshold for each category according to the accuracy rate or the recall rate based on the verification test result or the offline test result; updating the confidence threshold for the at least one category based on the recommended confidence threshold; and responding to the checking operation of the user on the confidence threshold value, and generating a confidence threshold value display interface.
In some embodiments, the method further comprises: based on the offline test result, generating an offline test index display interface, wherein the offline test index display interface is used for displaying at least one of the following: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and real quantity for each category, confidence distribution for each category; the confidence threshold for the at least one category is updated in response to a user modifying the confidence threshold for the at least one category.
In some embodiments, the method further comprises: based on the offline test result, generating an error result display interface, wherein the error result display interface is used for displaying data of which the model output category is inconsistent with the real category.
In some embodiments, the method further comprises: in response to the confidence threshold being updated, generating an index presentation interface based on the offline test results, the index presentation interface for presenting at least one of: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and true quantity for each category, confidence distribution for each category.
In some embodiments, the offline test parameter configuration interface includes a data set selection option, and acquiring the offline test data set according to the configuration input includes: in response to a selection operation performed by a user through the data set selection option, acquiring the offline test data set based on the sample data set corresponding to the selection operation.
In some embodiments, the offline test parameter configuration interface includes a dataset upload option, and wherein obtaining the offline test dataset further includes, in accordance with the configuration input: receiving an input sample dataset in response to an upload operation performed by a user via a dataset upload option; based on the input sample dataset, an offline test dataset is obtained.
In some embodiments, the method further comprises: generating an online test parameter configuration interface in response to a user's operation of creating an online test task for the target model; creating the online test task according to the user's configuration input on the online test parameter configuration interface; acquiring an online test data set based on the configuration input, wherein the online test data set comprises a plurality of images with unlabeled categories; testing the target model using the online test data set to generate an online test index, wherein the online test index comprises at least one of: accuracy, recall, a confusion matrix, a distribution of model output quantity and manual review quantity for each category, and a confidence distribution for each category; and one of the following: presenting an option to bring the target model online in response to the online test index meeting a preset standard; or presenting the option to bring the target model online in response to at least some of the online test metrics being higher than corresponding test metrics of a related online model, wherein the corresponding test metrics are derived based on output results of the related online model for the online test data set.
In some embodiments, the online test parameter configuration interface includes a new model online option, and acquiring the online test data set based on the configuration input includes: in response to a user's selection of the new model online option, establishing a communication connection with an image acquisition device, wherein the image acquisition device is configured to acquire an image to be inspected; receiving the image to be inspected from the image acquisition device via the communication connection; and acquiring the online test data set based on the received image to be inspected.
In some embodiments, the online test parameter configuration interface includes a model update option, and acquiring the online test data set based on the configuration input includes: in response to a user's selection of the model update option, acquiring the online test data set based on images received by a related online model, wherein the related online model is configured to receive images to be inspected from an image acquisition device and predict a category of the images to be inspected based on the received images.
In some embodiments, the method further comprises: generating an online audit interface in response to the user selecting the option to bring the target model online; and, in response to the user's confirmation operation for the online audit, bringing the target model online such that the target model is configured to receive the image to be inspected from the image acquisition device and predict a category of the image to be inspected based on the received image to be inspected.
In some embodiments, the method further comprises: generating a monitoring task parameter configuration interface in response to a user's operation of creating a monitoring task for the target model that has been brought online; creating the monitoring task according to the user's configuration input on the monitoring task parameter configuration interface; acquiring online spot check data based on the image to be inspected from the image acquisition device according to the configuration input; receiving a manual review result for the online spot check data, wherein the manual review result comprises the category of the image to be inspected as determined by manual review; and generating, based on the manual review result and the category predicted by the target model, an online spot check index comprising at least one of: accuracy, recall, a confusion matrix, a distribution of model output quantity and manual review quantity for each category, and a confidence distribution for each category.
According to yet another aspect of the present disclosure, there is provided a system for determining an image category, comprising: a data management module configured to store and manage sample data; a training and testing management module configured to perform the method of determining a target model or the method of determining an image class as described in various embodiments of the foregoing aspects; and the model management module is configured to store, display and manage the target model.
According to yet another aspect of the present disclosure, there is provided a computing device comprising: a memory configured to store computer-executable instructions; a processor configured to perform the method described in accordance with the various embodiments of the preceding aspect when the computer-executable instructions are executed by the processor.
According to yet another aspect of the present disclosure, there is provided a computer-readable storage medium storing computer-executable instructions that, when executed, perform a method described in accordance with various embodiments of the preceding aspects.
According to yet another aspect of the present disclosure, there is provided a computer program product comprising computer instructions which, when executed by a processor, implement the steps of the method described in accordance with the various embodiments of the preceding aspects.
Drawings
Embodiments of the present disclosure will now be described in more detail and with reference to the accompanying drawings, in which:
FIG. 1 illustrates a schematic diagram of a system architecture of an exemplary application environment in which the technical solutions of embodiments of the present disclosure may be applied;
FIG. 2 illustrates a schematic diagram of a computing device to which embodiments of the present disclosure may be applied;
FIG. 3 illustrates a schematic flow diagram of a method of determining image categories according to one embodiment of the present disclosure;
FIG. 4 illustrates an exemplary flowchart of a method for determining confidence thresholds for an image classification model in predicting various categories using a validation data set in accordance with one embodiment of the present disclosure;
FIG. 5 illustrates an exemplary implementation flow chart of a method of determining confidence thresholds for an image classification model in predicting various categories using a validation data set in accordance with an embodiment of the present disclosure;
FIG. 6 illustrates an exemplary flowchart of a method for determining confidence thresholds for an image classification model in predicting various categories using a validation data set in accordance with one embodiment of the present disclosure;
FIG. 7 illustrates an exemplary implementation flow chart of a method of determining confidence thresholds for an image classification model in predicting various categories using a validation data set in accordance with an embodiment of the present disclosure;
FIG. 8 illustrates a schematic flow diagram of a method of determining a confidence threshold in accordance with one embodiment of the present disclosure;
FIG. 9 illustrates a schematic flow chart of a method of determining a confidence threshold in accordance with another embodiment of the disclosure;
FIG. 10 illustrates a graphical interface on which a user may configure parameters of a sample dataset according to another embodiment of the present disclosure;
FIG. 11 illustrates an exemplary parameter configuration interface in accordance with one embodiment of the present disclosure;
FIG. 12 illustrates a confidence threshold presentation interface in which confidence thresholds for respective categories are visually presented, according to one embodiment of the present disclosure;
FIG. 13 illustrates a presentation interface of a training schedule according to one embodiment of the present disclosure;
FIG. 14 illustrates an exemplary block diagram of an apparatus for determining image categories according to one embodiment of the present disclosure;
FIG. 15 illustrates an exemplary block diagram of an apparatus for determining a confidence threshold in accordance with one embodiment of the present disclosure;
FIG. 16 illustrates an exemplary block diagram of an apparatus for determining a confidence threshold in accordance with one embodiment of the present disclosure;
FIG. 17 illustrates a schematic flow diagram of a method of determining a target model according to one embodiment of the present disclosure;
FIG. 18 illustrates a schematic flow diagram of an offline testing scheme, according to one embodiment of the present disclosure;
FIG. 19 illustrates a schematic flow diagram of an online testing scheme according to one embodiment of the present disclosure;
FIG. 20 illustrates a schematic flow diagram of an auditing scheme, according to an embodiment of the present disclosure;
FIG. 21 illustrates a schematic flow diagram of an online review scheme, according to one embodiment of the present disclosure;
FIG. 22 illustrates a schematic flow diagram of a scheme for training and evaluating an image classification model according to an embodiment of the present disclosure;
FIG. 23 illustrates a schematic flow diagram of a method of determining image categories according to one embodiment of the present disclosure;
FIG. 24 illustrates a schematic flow diagram of a method of determining a target model according to one embodiment of the present disclosure;
FIGS. 25A and 25B illustrate schematic diagrams of an index presentation interface according to one embodiment of the present disclosure;
FIG. 26 illustrates a schematic diagram of a confidence threshold presentation interface in accordance with one embodiment of the present disclosure;
FIG. 27 illustrates a schematic diagram of an offline test task creation interface, according to one embodiment of the present disclosure;
FIG. 28 illustrates a schematic diagram of a curve presentation interface according to one embodiment of the present disclosure;
FIG. 29 illustrates a schematic diagram of a confidence threshold setting interface, according to one embodiment of the present disclosure;
FIG. 30 illustrates a schematic diagram of a result distribution interface according to one embodiment of the present disclosure;
FIG. 31 illustrates a schematic diagram of a confidence distribution interface in accordance with one embodiment of the present disclosure;
FIG. 32 illustrates a schematic diagram of a confusion matrix presentation interface, according to an embodiment of the present disclosure;
FIG. 33 illustrates a schematic diagram of an error result presentation interface according to one embodiment of the present disclosure;
FIGS. 34A and 34B illustrate schematic diagrams of an online test task creation interface, according to one embodiment of the present disclosure;
FIG. 35 illustrates a schematic diagram of an audit interface according to one embodiment of the present disclosure;
FIG. 36 illustrates a schematic diagram of an online model management interface, according to one embodiment of the present disclosure;
FIG. 37 illustrates a schematic diagram of a model monitoring task interface, according to one embodiment of the present disclosure;
FIG. 38 illustrates a schematic diagram of a monitoring task creation interface according to one embodiment of the present disclosure;
FIG. 39 illustrates an exemplary block diagram of an apparatus for determining a target model according to one embodiment of the disclosure;
FIG. 40 illustrates an exemplary block diagram of an apparatus for determining image categories according to one embodiment of the present disclosure;
FIG. 41 illustrates an exemplary block diagram of an apparatus for determining a target model according to one embodiment of the disclosure;
FIG. 42 illustrates an exemplary block diagram of a system for determining image categories according to one embodiment of the present disclosure.
Detailed Description
The following description provides specific details of various embodiments of the disclosure so that those skilled in the art may fully understand and practice the various embodiments of the disclosure. It should be understood that the technical solutions of the present disclosure may be practiced without some of these details. In some instances, well-known structures or functions have not been shown or described in detail to avoid obscuring the description of embodiments of the present disclosure with such unnecessary description. The terminology used in the present disclosure should be understood in its broadest reasonable manner, even though it is being used in conjunction with a particular embodiment of the present disclosure.
FIG. 1 illustrates a schematic diagram of a system architecture of an exemplary application environment in which the methods of determining image categories, the methods of determining confidence thresholds, and the methods of determining a target model of embodiments of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of the terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others. The terminal devices 101, 102, 103 may be a variety of computing devices having computing or processing capabilities, including but not limited to desktop computers, portable computers, smartphones, tablet computers, and the like. It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, the server 105 may be a server cluster formed by a plurality of servers.
The methods provided by the embodiments of the present disclosure, including one or all of the methods of determining image categories, the methods of determining confidence thresholds, and the methods of determining target models, are generally performed by the terminal devices 101, 102, 103. Accordingly, the means for determining image categories, the means for determining confidence thresholds, and the means for determining target models provided by embodiments of the present disclosure may also be implemented (e.g., as computer applications or products) in the terminal devices 101, 102, 103. It will be readily understood by those skilled in the art that the methods provided in the embodiments of the present disclosure may also be performed by the server 105, or by the terminal devices 101, 102, 103 and the server 105 in combination, which is not particularly limited in the present exemplary embodiment. For example, in one exemplary embodiment, a user acquires, through the terminal devices 101, 102, 103, a sample data set comprising a plurality of sample images labeled with categories, and then uploads the sample data set to the server 105. The server divides the sample data set into a training data set and a verification data set, trains the deep learning model by using the training data set to obtain an image classification model, and determines, by using the verification data set, a confidence threshold of the image classification model for predicting each category. Then, the user obtains a target image through the terminal devices 101, 102, 103 and transmits the target image to the server 105. The server predicts the target image by using the image classification model and based on the determined confidence thresholds to obtain the category of the target image, and then transmits the category of the target image to the terminal devices 101, 102, 103, or the like.
Alternatively, the server divides the sample data set into a training data set and a verification data set, trains the deep learning model by using the training data set to obtain at least two trained models, generates, by using the verification data set, a verification test index comprising at least one of a confusion matrix, accuracy, recall, and F1 score, and determines a target model among the at least two trained models according to the generated verification test index. Then, the user acquires an image to be inspected of the target object through the terminal devices 101, 102, 103 and transmits it to the server 105. The server predicts the image to be inspected by using the determined target model to obtain its category, and then transmits the category to the terminal devices 101, 102, 103, or the like.
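As a minimal sketch of selecting a target model from a verification test index, the following Python example builds a confusion matrix from the labeled and output categories of the verification data set, derives a macro-averaged F1 score, and picks the trained model with the highest score. The selection rule (highest macro F1), the function names, and the toy data are assumptions made for illustration; the disclosure only states that the target model is determined according to the generated index.

```python
from collections import Counter

def confusion_matrix(labeled, predicted):
    """Count how often each (labeled category, output category) pair occurs."""
    return Counter(zip(labeled, predicted))

def macro_f1(labeled, predicted):
    """Average the per-category F1 scores derived from the confusion matrix."""
    matrix = confusion_matrix(labeled, predicted)
    categories = sorted(set(labeled) | set(predicted))
    scores = []
    for category in categories:
        correct = matrix[(category, category)]
        output_as = sum(n for (_, p), n in matrix.items() if p == category)
        truly = sum(n for (y, _), n in matrix.items() if y == category)
        precision = correct / output_as if output_as else 0.0
        recall = correct / truly if truly else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)

# Hypothetical verification results for two trained models.
labeled = ["dust", "dust", "residue", "residue"]
outputs_by_model = {
    "model_a": ["dust", "dust", "residue", "dust"],
    "model_b": ["dust", "dust", "residue", "residue"],
}

# Assumed selection rule: keep the model with the highest macro F1.
target_model = max(
    outputs_by_model,
    key=lambda name: macro_f1(labeled, outputs_by_model[name]),
)
```

Other indices named in the text, such as per-category recall, could be substituted for the macro F1 in the `key` function without changing the structure of the selection.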
Exemplary embodiments of the present disclosure provide a computing device for implementing the above method, which may be the terminal device 101, 102, 103 or the server 105 in fig. 1. The computing device includes at least a processor and a memory for storing executable instructions of the processor, the processor configured to perform the above-described method via execution of the executable instructions.
In an exemplary embodiment of the present disclosure, the system architecture may be a distributed system, which may be a system formed by a group of computers that are interconnected over a network, exchange messages and communications, and coordinate their behavior. The network may be an Internet of Things (IoT) based on the Internet and/or a telecommunications network, and may be a wired network or a wireless network; for example, it may be an electronic network capable of implementing information exchange functions, such as a local area network (LAN), metropolitan area network (MAN), wide area network (WAN), or cellular data communications network. The distributed system may have software components, such as software objects or other types of individually addressable isolated entities, such as distributed objects, agents, actors, virtual components, and so forth. Typically, each such component is individually addressable and has a unique identity (such as an integer, GUID, string, or opaque data structure) in the distributed system. In a distributed system that allows for geographic distribution, applications may be deployed to reside in a cluster. There are various systems, components, and network configurations that support distributed computing environments.
Distributed systems provide sharing of computer resources and services through communication exchanges between computing devices and systems. These resources and services include information exchange for objects (e.g., files), cache storage, and disk storage. These resources and services also include sharing of processing power across multiple processing units for load balancing, resource expansion, specialization of processing, and the like. For example, a distributed system may include hosts with network topologies and network infrastructures, such as client devices/servers, peer-to-peer or hybrid architectures.
The construction of a computing device is exemplified below by the mobile terminal 200 of fig. 2. It will be appreciated by those skilled in the art that, apart from components specifically intended for mobile purposes, the configuration of fig. 2 can also be applied to stationary devices. In other embodiments, the mobile terminal 200 may include more or fewer components than illustrated, certain components may be combined or split, or the components may be arranged differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware. The interfacing relationship between the components is shown schematically only and does not constitute a structural limitation of the mobile terminal 200. In other embodiments, the mobile terminal 200 may also employ an interfacing manner different from that of fig. 2, or a combination of interfacing manners.
As shown in fig. 2, the mobile terminal 200 may specifically include: processor 210, internal memory 221, external memory interface 222, universal serial bus (Universal Serial Bus, USB) interface 230, charge management module 240, power management module 241, battery 242, antenna 1, antenna 2, mobile communication module 250, wireless communication module 260, audio module 270, speaker 271, receiver 272, microphone 273, headset interface 274, sensor module 280, display screen 290, camera module 291, indicator 292, motor 293, keys 294, and subscriber identity module (subscriber identification module, SIM) card interface 295, among others. Wherein the sensor module 280 may include a depth sensor 2801, a pressure sensor 2802, a gyro sensor 2803, and the like.
The processor 210 may include one or more processing units. For example, the processor 210 may include an application processor (Application Processor, AP), a modem processor, a graphics processor (Graphics Processing Unit, GPU), an image signal processor (Image Signal Processor, ISP), a controller, a video codec, a digital signal processor (Digital Signal Processor, DSP), a baseband processor, and/or a neural network processor (Neural-Network Processing Unit, NPU), and the like. The different processing units may be separate devices or may be integrated in one or more processors.
The NPU is a neural network (Neural-Network, NN) computing processor that rapidly processes input information by drawing on the structure of biological neural networks, for example the way signals are transmitted between neurons in the human brain, and that can also learn continuously. Applications such as intelligent cognition of the mobile terminal 200 may be implemented by the NPU, for example: image recognition, face recognition, speech recognition, text understanding, etc.
The processor 210 has a memory disposed therein. The memory may store instructions for implementing six modular functions: detection instructions, connection instructions, information management instructions, analysis instructions, data transfer instructions, and notification instructions, the execution of which is controlled by the processor 210.
The charge management module 240 is configured to receive a charge input from a charger. The power management module 241 is used for connecting the battery 242, the charge management module 240 and the processor 210. The power management module 241 receives input from the battery 242 and/or the charge management module 240 and provides power to the processor 210, the internal memory 221, the display 290, the camera module 291, the wireless communication module 260, and the like.
The wireless communication function of the mobile terminal 200 may be implemented by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, a modem processor, a baseband processor, and the like. The antenna 1 and the antenna 2 are used for transmitting and receiving electromagnetic wave signals; the mobile communication module 250 may provide wireless communication solutions, including 2G/3G/4G/5G, applied to the mobile terminal 200; the modem processor may include a modulator and a demodulator; the wireless communication module 260 may provide wireless communication solutions applied to the mobile terminal 200, including wireless local area networks (Wireless Local Area Networks, WLAN), such as a wireless fidelity (Wireless Fidelity, Wi-Fi) network, Bluetooth (BT), etc. In some embodiments, the antenna 1 of the mobile terminal 200 is coupled to the mobile communication module 250, and the antenna 2 is coupled to the wireless communication module 260, so that the mobile terminal 200 may communicate with a network and other devices through wireless communication techniques.
The mobile terminal 200 implements display functions through a GPU, a display screen 290, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display screen 290 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 210 may include one or more GPUs that execute program instructions to generate or change display information.
The mobile terminal 200 may implement a photographing function through an ISP, a camera module 291, a video codec, a GPU, a display screen 290, an application processor, and the like. The ISP is used for processing the data fed back by the camera module 291; the camera module 291 is used for capturing still images or videos; the digital signal processor is used for processing digital signals, and can process other digital signals besides digital image signals; video codec is used to compress or decompress digital video, and the mobile terminal 200 may also support one or more video codecs.
The external memory interface 222 may be used to connect an external memory card, such as a Micro SD card, to enable expansion of the memory capabilities of the mobile terminal 200. The external memory card communicates with the processor 210 via an external memory interface 222 to implement data storage functions. For example, files such as music, video, etc. are stored in an external memory card.
The internal memory 221 may be used to store computer executable program code that includes instructions. The internal memory 221 may include a storage program area and a storage data area. The storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like. The storage data area may store data (e.g., audio data, phonebook, etc.) created during use of the mobile terminal 200, and the like. In addition, the internal memory 221 may include a high-speed random access memory, and may further include a nonvolatile memory such as at least one magnetic disk storage device, a flash memory device, a universal flash storage (Universal Flash Storage, UFS), and the like. The processor 210 performs various functional applications and data processing of the mobile terminal 200 by executing instructions stored in the internal memory 221 and/or instructions stored in a memory provided in the processor.
The mobile terminal 200 may implement audio functions through an audio module 270, a speaker 271, a receiver 272, a microphone 273, an earphone interface 274, an application processor, and the like. Such as music playing, recording, etc.
The depth sensor 2801 is used to acquire depth information of a scene. In some embodiments, a depth sensor may be provided at the camera module 291.
The pressure sensor 2802 is used to sense a pressure signal, and may convert the pressure signal into an electrical signal. In some embodiments, pressure sensor 2802 may be disposed on display 290. The pressure sensor 2802 is of various types, such as a resistive pressure sensor, an inductive pressure sensor, a capacitive pressure sensor, and the like.
The gyro sensor 2803 may be used to determine the motion attitude of the mobile terminal 200. In some embodiments, the angular velocity of the mobile terminal 200 about three axes (i.e., x, y, and z axes) may be determined by the gyro sensor 2803. The gyro sensor 2803 can be used for anti-shake during shooting, navigation, motion-sensing game scenarios, and the like.
In addition, sensors for other functions, such as an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, etc., may be provided in the sensor module 280 according to actual needs.
Other devices that provide auxiliary functionality may also be included in the mobile terminal 200. For example, the keys 294 include a power-on key, a volume key, etc., by which a user can generate key signal inputs related to user settings and function controls of the mobile terminal 200. As another example, the mobile terminal 200 may include the indicator 292, the motor 293, the SIM card interface 295, and the like.
In the related art, in the field of screen production, problems in links such as equipment, parameters, operation, and environmental interference cause defects in the produced products. After each process is inspected by automated optical inspection (AOI), a large amount of image data is produced, and professional operators are required to judge or classify the defects in these images. With the rise of artificial intelligence algorithms typified by deep learning, introducing AI algorithms into this process can greatly improve processing efficiency and accuracy.
Fig. 3 illustrates a schematic flow diagram of a method 300 of determining image categories according to one embodiment of the present disclosure. The method of determining the image category may be implemented, for example, by the terminal device 101, 102, 103, the server 105, or a combination thereof as shown in fig. 1. As shown in fig. 3, the method 300 includes the following steps.
At step 310, a sample dataset is acquired, the sample dataset comprising a plurality of sample images that have been labeled with a category. The plurality of sample images comprise images corresponding to sample products, and the marked categories are categories of product defects. The product described herein may be, for example, a display screen, a display panel, etc., and the noted categories may refer to categories of display screen defects, such as the presence of residue, the presence of dust, too fine or too coarse of a circuit line, etc.
At step 320, the sample data set is divided into a training data set and a validation data set. The training data set is used to train the deep learning model to obtain an image classification model. The validation data set is used to determine confidence thresholds for the image classification model in predicting the respective categories. In some embodiments, the sample data set may be divided into a training data set and a validation data set at a ratio of 9:1, i.e., the training data set includes 90% of the sample images in the sample data set and the validation data set includes 10% of the sample images in the sample data set. Of course, such a division ratio is not mandatory, and other ratios may be employed as needed. Generally, a 9:1 split ratio balances training against determining confidence thresholds well, so that both the trained model and the determined confidence thresholds perform well in use.
In step 330, the deep learning model is trained using the training data set to obtain an image classification model. The image classification model is used for outputting, based on an image input to the image classification model, the category of the input image and the confidence corresponding to the output category.
In some embodiments, the deep learning model may be a convolutional neural network (CNN) model, an object detection convolutional neural network (Faster R-CNN) model, a recurrent neural network (RNN) model, or a generative adversarial network (GAN) model, but is not limited thereto, and other neural network models known to those skilled in the art may be employed.
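The 9:1 division can be sketched in a few lines of Python. This is an illustrative sketch only: the random shuffle, the fixed seed, the function name `split_sample_dataset`, and the toy file names are assumptions, since the text specifies the ratio but not how images are assigned to each set.

```python
import random

def split_sample_dataset(samples, train_ratio=0.9, seed=0):
    """Shuffle labeled samples and split them into (training, validation) sets."""
    shuffled = list(samples)
    # Shuffling is an assumption; the source only gives the division ratio.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical sample data set: (image file, labeled defect category).
samples = [
    (f"img_{i:03d}.png", "dust" if i % 2 else "residue") for i in range(100)
]
train_set, val_set = split_sample_dataset(samples)
```

With 100 labeled images, this yields 90 training images and 10 validation images; passing a different `train_ratio` gives the other division ratios the text allows.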
In some embodiments, training parameters for training the deep learning model may be configured in advance, where the training parameters include at least a class of training targets, a type of the deep learning model, a total number of training rounds, a learning rate reduction strategy, and a test strategy. The training parameters may also include the size of the image input to the deep learning model.
It should be noted that, in this document, a reference to the category of an image output by the image classification model means the category of the image predicted by the image classification model. In other words, "output" and "prediction" of the image classification model are used interchangeably, with the same meaning. For example, the statement that the image classification model is used to output a category of an input image based on an image input to the image classification model may also be expressed as the image classification model being used to predict a category of an input image based on an image input to the image classification model.
In step 340, the confidence threshold of the image classification model for outputting each category is determined by using the verification data set, so that the image classification model satisfies a preset accuracy or a preset recall at output. The accuracy represents the ratio of the number of images correctly output as the corresponding category to the number of images output as the corresponding category, and the recall represents the ratio of the number of images correctly output as the corresponding category to the actual number of images of the corresponding category.
In practice, in order for the image classification model to reach a particular accuracy or recall at the time of prediction, or to balance accuracy against recall so that the classification model predicts more efficiently, a confidence threshold needs to be set for the classification model for each predicted category. In the related art, confidence is generally set in a simple one-size-fits-all manner, that is, the same confidence threshold is set for all training categories. In the embodiments of the present disclosure, after the image classification model is obtained by training with the training data set, the confidence threshold of the image classification model for predicting each category can be determined by using the verification data set, so that the image classification model meets a preset accuracy or a preset recall at the time of prediction. The confidence thresholds can therefore be calculated quickly, and because the calculation is integrated into the training process, the confidence threshold of each category is directly available once the model has been trained. Where the sample category distribution is unbalanced, setting a separate confidence threshold for each category controls the prediction or classification accuracy of each category, which helps balance the prediction accuracy and recall of the deep learning model.
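The two ratios can be made concrete with a short Python sketch. Note that the per-category accuracy as defined here (correct outputs of a category divided by all outputs of that category) is the quantity usually called precision. The function name and the toy labels below are assumptions for illustration.

```python
def per_category_accuracy_and_recall(labeled, predicted, category):
    """Return (accuracy, recall) for one category as defined in step 340."""
    # Images the model output as this category.
    output_as = sum(1 for p in predicted if p == category)
    # Images whose labeled (actual) category is this category.
    truly = sum(1 for y in labeled if y == category)
    # Images output as this category that really belong to it.
    correct = sum(1 for y, p in zip(labeled, predicted) if y == p == category)
    accuracy = correct / output_as if output_as else 0.0
    recall = correct / truly if truly else 0.0
    return accuracy, recall

# Hypothetical labeled and output categories for five validation images.
labeled = ["dust", "dust", "residue", "dust", "residue"]
predicted = ["dust", "residue", "residue", "dust", "dust"]
acc, rec = per_category_accuracy_and_recall(labeled, predicted, "dust")
```

In this toy data, three images are output as "dust" and two of them are correct, while three images are actually "dust" and two of them are found, so both ratios equal 2/3.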
It should be noted that any suitable method may be used to determine the confidence threshold for the image classification model in predicting the respective categories using the validation dataset such that the image classification model meets a preset accuracy or a preset recall at the time of prediction, which is not limiting herein.
At step 350, the target image is input into the image classification model and a class of the target image is derived based on the determined confidence threshold. The target image described herein is the same type of image as the sample image described above. For example, the target image is also an image corresponding to the target product. The target product described herein may be, for example, a display screen, a display panel, or the like. The category of the target image may be a category of display screen defects, such as the presence of residue, the presence of dust, too fine or too coarse of a circuit line, etc.
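One possible way to use the per-category thresholds at this step is sketched below in Python. The text does not state what happens when the confidence falls below the threshold; returning `None` (for example, to route the image to manual judgment) is purely an assumption of this sketch, as are the function name, the default threshold, and the example values.

```python
def categorize(output_category, confidence, thresholds, default_threshold=0.5):
    """Accept the model's output category only if it clears that category's threshold."""
    threshold = thresholds.get(output_category, default_threshold)
    # Assumption: a below-threshold output yields no category (None).
    return output_category if confidence >= threshold else None

# Hypothetical per-category confidence thresholds from the validation data set.
thresholds = {"dust": 0.65, "residue": 0.80}

accepted = categorize("dust", 0.70, thresholds)      # clears 0.65
rejected = categorize("residue", 0.70, thresholds)   # below 0.80
```

Because each category has its own threshold, the same confidence of 0.70 is accepted for one category and rejected for another.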
In the method for determining the image category according to the embodiment of the disclosure, a labeled sample data set is divided into a training data set and a verification data set, and after an image classification model is obtained by training with the training data set, the confidence threshold of the image classification model for predicting each category is determined with the verification data set, so that the image classification model meets a preset accuracy or a preset recall at the time of prediction. The confidence thresholds can therefore be calculated quickly, and because the calculation is integrated into the training process, the confidence threshold of each category is directly available once the model has been trained. In addition, where the sample category distribution is unbalanced, this technical solution controls the prediction or classification accuracy of each category through a separate confidence threshold for each category, handles the deep learning model's recognition of each category well, helps balance the prediction accuracy and recall of the deep learning model, and improves the efficiency of classifying target images.
FIG. 4 illustrates an exemplary flowchart of a method 400 of determining confidence thresholds for an image classification model in predicting various categories using a validation data set according to one embodiment of the disclosure. The method 400 may be implemented, for example, by the terminal device 101, 102, 103, the server 105, or a combination thereof, as shown in fig. 1. As shown in fig. 4, the method 400 includes the following steps.
At step 410, each sample image in the validation data set is input into the image classification model to obtain an output category and a corresponding confidence for each sample image in the validation data set. The image classification model may be the image classification model obtained above in step 330, which may predict the category of the input image and the confidence corresponding to the predicted category.
At step 420, each of the output categories is selected in turn, and the sample images output as the selected category are ranked from high to low according to the corresponding confidence. As described above, the accuracy represents the ratio of the number of images correctly predicted as the corresponding category to the number of images predicted as the corresponding category. Ranking the sample images predicted as the selected category from high to low by corresponding confidence therefore makes it easier to examine the prediction accuracy of the image classification model.
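Steps 410 and 420 amount to grouping the validation results by output category and sorting each group by confidence. A minimal Python sketch follows; the tuple layout `(labeled_category, output_category, confidence)` and the function name `group_and_rank` are assumptions, and the model's outputs are supplied as toy values rather than produced by a real classifier.

```python
from collections import defaultdict

def group_and_rank(predictions):
    """Group validation results by output category, ranked by confidence (high to low).

    predictions: iterable of (labeled_category, output_category, confidence).
    Returns {output_category: [(confidence, is_correct), ...]}.
    """
    by_category = defaultdict(list)
    for labeled, output, confidence in predictions:
        by_category[output].append((confidence, labeled == output))
    for items in by_category.values():
        items.sort(key=lambda item: item[0], reverse=True)
    return dict(by_category)

# Hypothetical model outputs on four validation images.
predictions = [
    ("dust", "dust", 0.70),
    ("residue", "dust", 0.90),
    ("residue", "residue", 0.80),
    ("dust", "dust", 0.95),
]
ranked = group_and_rank(predictions)
```

Each ranked list records, for every image output as that category, whether the output matched the label, which is the information the later traversal needs.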
In step 430, a target sample image is determined among the sample images of the selected category ordered from high to low by corresponding confidence, such that the ratio of the number of correctly output sample images in a designated sample image set to the number of sample images in the designated sample image set is smaller than the preset accuracy. Here, the designated sample image set consists of the target sample image and those sample images of the selected category whose corresponding confidence is higher than that of the target sample image, and a correctly output sample image is one whose output category is the same as its labeled category. In some embodiments, the sample images predicted as the selected category may be traversed sequentially in order of confidence from high to low until the ratio of the number of correctly predicted sample images to the number of traversed sample images is less than the preset accuracy, where a correctly predicted sample image is one whose predicted category is the same as its labeled category, and the traversed sample images include the currently traversed sample image and the sample images traversed before it. As an example, if, among 1000 sample images that have been traversed in order of confidence from high to low (all 1000 being predicted as the same category), 900 sample images are correctly predicted (i.e., the predicted category is the same as the labeled category) and 100 are not, the ratio of the number of correctly predicted sample images to the number of traversed sample images is 0.9. By this step, the sample images at which the preset accuracy is no longer met can be found. The preset accuracy can be determined according to actual needs, and represents the accuracy index that the image classification model needs to reach in prediction. As an example, if the current traversal is the fifth traversal, the traversals preceding the current traversal are the first to fourth traversals, and the previous traversal is the fourth traversal.
At step 440, a confidence threshold corresponding to the selected category is determined based on the confidences corresponding to the sample images in the designated sample image set. For example, the confidence threshold corresponding to the selected category may be determined by calculating a weighted sum of the confidences corresponding to the sample images in the designated sample image set, or the confidence corresponding to the sample image immediately preceding the target sample image, among the sample images of the selected category ranked from high to low by corresponding confidence, may be directly determined as the confidence threshold corresponding to the selected category. The weights used in the weighted sum may be set as desired.
In some embodiments, the confidence threshold corresponding to the selected category may be determined based on the confidence corresponding to the target sample image and the confidence corresponding to the sample image immediately preceding the target sample image among the sample images of the selected category ranked from high to low by corresponding confidence. For example, a weighted sum of the confidence corresponding to the target sample image and the confidence corresponding to the preceding sample image may be determined as the confidence threshold corresponding to the selected category. Specifically, the average of the confidence corresponding to the target sample image and the confidence corresponding to the preceding sample image may be determined as the confidence threshold corresponding to the selected category.
The above embodiment of the present disclosure provides a method for dynamically calculating confidence thresholds of image classification models in predicting each class, which can maximize recall rate of the models on the premise of ensuring preset accuracy of the models.
FIG. 5 illustrates an exemplary implementation flow chart 500 of the method 400 of determining confidence thresholds for the image classification model in predicting respective categories using a validation data set according to an embodiment of the disclosure. As shown in flow chart 500, after step 420 described above, i.e., after a predicted category has been selected and the sample images predicted as the selected category have been ordered from high to low by corresponding confidence, the parameters may be initialized at 510: the number of traversed images total_nums = 0, the number of correct images correct_nums = 0, the confidence of the currently traversed image curScore = 1, the confidence of the previous image lastScore = 1, and the preset accuracy correctRate_metric = 0.95.
Then, the sample images predicted as the selected category are traversed sequentially in order of confidence from high to low, and a determination is made at 520 as to whether the predicted category of the sample image is the same as the labeled category. If so, the parameters are updated at 530: correct_nums += 1, total_nums += 1, lastScore = curScore, and curScore = the confidence of the currently traversed sample image. If not, the parameters are updated at 540: total_nums += 1, lastScore = curScore, and curScore = the confidence of the currently traversed sample image, while the number of correct images correct_nums remains unchanged.
Then, at 550, it is determined whether the accuracy is greater than the preset accuracy correctRate_metric. The accuracy is the ratio of the number of correct images correct_nums to the number of traversed images total_nums. To avoid a zero denominator, the accuracy can be expressed as correct_nums / (total_nums + 0.0001). If the accuracy is greater than the preset accuracy correctRate_metric, the traversal continues with the next image.
If the accuracy <= the preset accuracy correctRate_metric, the traversal stops. Then, at 560, the average of the confidence corresponding to the currently traversed sample image and the confidence corresponding to the previously traversed sample image is determined as the confidence threshold corresponding to the selected category, i.e., Confidence_th = (lastScore + curScore) / 2.
The confidence thresholds for the remaining categories are then calculated in the same manner until all categories have been processed.
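As a non-limiting illustration, the traversal of flow chart 500 may be sketched in Python as follows. The function name, the snake_case variable names, and the representation of each sample as a (confidence, is_correct) pair are assumptions made for this sketch. The behavior when every image is traversed without the accuracy dropping to the preset value is not specified above; here the last two confidences are simply averaged.

```python
from typing import List, Tuple


def precision_threshold(samples: List[Tuple[float, bool]],
                        correct_rate_metric: float = 0.95) -> float:
    """Confidence threshold for one predicted category.

    samples: (confidence, predicted category == labeled category) for every
    validation image predicted as the selected category.
    """
    # Step 420: order the images from high to low confidence.
    ordered = sorted(samples, key=lambda s: s[0], reverse=True)

    # 510: initialize the parameters.
    total_nums = 0
    correct_nums = 0
    cur_score = 1.0
    last_score = 1.0

    for confidence, is_correct in ordered:
        # 520 -> 530 / 540: count the image, and count it as correct if it is.
        if is_correct:
            correct_nums += 1
        total_nums += 1
        last_score = cur_score
        cur_score = confidence

        # 550: stop once the running accuracy no longer exceeds the target.
        if correct_nums / (total_nums + 0.0001) <= correct_rate_metric:
            break

    # 560: average of the current and the previous confidence.
    return (last_score + cur_score) / 2
```

For example, with samples (0.99, correct), (0.9, correct), (0.8, incorrect) and (0.7, correct), the traversal stops at the third image and the sketch returns the midpoint of 0.9 and 0.8.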
In this way, a simple and efficient method is provided for dynamically calculating the confidence threshold of the image classification model for each predicted category, which can maximize the recall of the model while ensuring the preset accuracy of the model. With the confidence thresholds obtained by this dynamic calculation, most categories can meet the requirements of production line inference. However, the confidence threshold obtained for some categories may be very low (e.g., below 0.5), which may cause images of other categories to be misjudged as that category, so that the category is over-detected. In this case, post-processing may be added to raise any confidence threshold below 0.5 to 0.5.
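As a non-limiting illustration, the post-processing described above may be sketched as follows. The function name and the dictionary mapping category names to thresholds are assumptions made for this sketch; the category names in the usage note are placeholders.

```python
from typing import Dict


def apply_threshold_floor(thresholds: Dict[str, float],
                          floor: float = 0.5) -> Dict[str, float]:
    """Raise every per-category confidence threshold below `floor` to `floor`."""
    return {category: max(value, floor) for category, value in thresholds.items()}
```

For instance, a threshold of 0.32 for one category would be raised to 0.5, while a threshold of 0.81 for another category would be left unchanged.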
It should be noted that in some cases, a sample product may have multiple product defects, and thus a sample image corresponding to the sample product may be labeled with multiple categories. In this case, in some embodiments, the multiple categories may be given different priorities such that more important product defects are given higher priority. Meanwhile, a category with higher priority has a lower confidence threshold, so that the image classification model can better meet the preset accuracy when outputting. Of course, if the confidence thresholds corresponding to two categories differ considerably (e.g., by more than a preset gap threshold, such as 0.5), the priorities of the two categories are not considered.
FIG. 6 illustrates an exemplary flowchart of a method 600 for determining confidence thresholds for an image classification model in predicting various categories using a validation data set according to one embodiment of the disclosure. The method 600 may be implemented, for example, by the terminal device 101, 102, 103, the server 105, or a combination thereof, as shown in fig. 1. As shown in fig. 6, the method 600 includes the following steps.
At step 610, each sample image in the validation data set is input into the image classification model to obtain the output category and corresponding confidence level for each sample image in the validation data set. The image classification model may be the image classification model obtained above in step 330, which may predict the confidence that the category of the input image corresponds to the predicted category of the input image.
At step 620, each of the annotated categories is selected separately, and the sample images annotated as the selected categories are ranked from top to bottom according to the corresponding confidence. As previously described, the recall represents the ratio of the number of images of the corresponding category that were correctly predicted to the actual number of images of the corresponding category. The actual number of images of the corresponding category is herein the number of sample images marked as the corresponding category. In this way, therefore, sample images labeled as the respective category may be ranked from top to bottom according to the corresponding confidence, thereby facilitating review of recall for the image classification model.
In step 630, among the sample images of the selected category ordered from high to low according to the corresponding confidence, a target sample image is determined such that the ratio of the number of sample images whose category is correctly output in a designated sample image set to the number of sample images in the designated sample image set is smaller than the preset recall. Here, a sample image whose category is correctly output is one whose output category is the same as its labeled category, and the designated sample image set includes the target sample image and the sample images labeled as the selected category whose confidence is higher than that of the target sample image. In some embodiments, the sample images labeled as the selected category may be traversed sequentially in order of confidence from high to low until the ratio of the number of correctly predicted sample images among the traversed sample images to the number of traversed sample images is less than the preset recall, where the traversed sample images include the currently traversed sample image and the sample images traversed before it. As an example, if 1000 sample images labeled as the same category have been traversed in order of confidence from high to low, of which 900 are correctly predicted (i.e., the predicted category is the same as the labeled category) and 100 are not, the ratio of the number of correctly predicted sample images to the number of traversed sample images is 0.9. By this step, the two sample images at which the preset recall is no longer met (the target sample image and the sample image before it) can be found. The preset recall can be determined according to actual needs, and represents the recall index that the image classification model needs to reach in prediction.
As an example, the current traversal is the fifth traversal, and the traversal preceding the current traversal is the first to fourth traversals, and the last traversal is the fourth traversal.
At step 640, a confidence threshold corresponding to the selected category is determined based on the confidences corresponding to the sample images in the designated sample image set. For example, the confidence threshold corresponding to the selected category may be determined by calculating a weighted sum of the confidences corresponding to the sample images in the designated sample image set, or the confidence corresponding to the sample image immediately preceding the target sample image, among the sample images of the selected category ranked from high to low by confidence, may be directly determined as the confidence threshold corresponding to the selected category. The weights used in the weighted sum may be set as desired.
In some embodiments, the confidence threshold corresponding to the selected category may be determined based on the confidence corresponding to the target sample image and the confidence corresponding to the sample image immediately preceding the target sample image among the sample images of the selected category ranked from high to low by confidence. For example, a weighted sum of the confidence corresponding to the target sample image and the confidence corresponding to the previous sample image may be determined as the confidence threshold corresponding to the selected category. Specifically, the average of these two confidences may be determined as the confidence threshold corresponding to the selected category.
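As a non-limiting illustration, the weighted combination of the two confidences may be sketched as follows. The function and parameter names are assumptions made for this sketch, and the description above does not require the weights to sum to 1.

```python
def weighted_threshold(prev_confidence: float,
                       target_confidence: float,
                       prev_weight: float = 0.5,
                       target_weight: float = 0.5) -> float:
    """Weighted sum of the previous and target sample confidences.

    The default weights (0.5, 0.5) give the plain average; weights of
    (1.0, 0.0) use the previous sample's confidence directly.
    """
    return prev_weight * prev_confidence + target_weight * target_confidence
```

With the default weights, confidences of 0.9 and 0.8 give a threshold at their midpoint; with weights of 1.0 and 0.0, the threshold equals the confidence of the previous sample image.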
The above embodiment of the present disclosure provides a method for dynamically calculating confidence thresholds of image classification models in predicting each class, which can maximize the accuracy of the model on the premise of ensuring the preset recall rate of the model.
FIG. 7 illustrates an exemplary implementation flow chart 700 of the method 600 of determining confidence thresholds for the image classification model in predicting respective categories using a validation data set according to an embodiment of the disclosure. As shown in flow chart 700, after step 620 described above, i.e., after selecting a labeled category and ordering the sample images labeled as the selected category from high to low according to the corresponding confidence, the parameters may be initialized at 710: the number of traversed images total_nums = 0, the number of recalled images recall_nums = 0, the confidence of the currently traversed image curScore = 1, the confidence of the last image lastScore = 1, and the preset recall RECALLRATE_metric = 0.95.
Then, the sample images labeled as the selected category are traversed in order of confidence from high to low, and a determination is made at 720 as to whether the predicted category of the sample image is the same as the labeled category. If so, the parameters are updated at 730: recall_nums += 1, total_nums += 1, lastScore = curScore, and curScore = the confidence of the currently traversed sample image. If not, the parameters are updated at 740: total_nums += 1, lastScore = curScore, and curScore = the confidence of the currently traversed sample image, while the number of recalled images recall_nums is unchanged.
Then, at 750, it is determined whether the current recall is greater than the preset recall RECALLRATE_metric. The current recall is the ratio of the number of recalled images recall_nums to the number of traversed images total_nums. To avoid a zero denominator, the current recall may be expressed as recall_nums/(total_nums + 0.0001). If the current recall is greater than the preset recall RECALLRATE_metric, traversal continues with the next image.
If the current recall is less than or equal to the preset recall RECALLRATE_metric, traversal stops. Then, at 760, the average of the confidence corresponding to the currently traversed sample image and the confidence corresponding to the last traversed sample image is determined as the confidence threshold corresponding to the selected category, i.e., confidence threshold Confidence_th = (lastScore + curScore)/2.
The confidence thresholds for the remaining categories are then calculated in the same manner until all categories have been processed.
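As a non-limiting illustration, the traversal of flow chart 700 may be sketched in Python as follows. The record layout (labeled category, predicted category, confidence), the function name, and the snake_case variable names are assumptions made for this sketch; the case in which all images are traversed without the recall dropping to the preset value is not specified above and is handled here by averaging the last two confidences.

```python
from typing import List, Tuple

# (labeled category, predicted category, confidence of the prediction)
Record = Tuple[str, str, float]


def recall_threshold(records: List[Record], category: str,
                     recall_rate_metric: float = 0.95) -> float:
    """Confidence threshold for one labeled category."""
    # Step 620: keep the images labeled as the selected category and
    # order them from high to low confidence.
    selected = sorted((r for r in records if r[0] == category),
                      key=lambda r: r[2], reverse=True)

    # 710: initialize the parameters.
    total_nums = 0
    recall_nums = 0
    cur_score = 1.0
    last_score = 1.0

    for labeled, predicted, confidence in selected:
        # 720 -> 730 / 740: an image is recalled when prediction matches label.
        if predicted == labeled:
            recall_nums += 1
        total_nums += 1
        last_score = cur_score
        cur_score = confidence

        # 750: stop once the running recall no longer exceeds the target.
        if recall_nums / (total_nums + 0.0001) <= recall_rate_metric:
            break

    # 760: average of the current and the previous confidence.
    return (last_score + cur_score) / 2
```

Unlike the accuracy-oriented traversal, the images are selected here by their labeled category rather than by their predicted category.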
In this way, a simple and efficient method is provided for dynamically calculating the confidence threshold of the image classification model for each predicted category, which can maximize the accuracy of the model while ensuring the preset recall of the model. With the confidence thresholds obtained by this dynamic calculation, most categories can meet the requirements of production line inference. However, the confidence threshold obtained for some categories may be very low (e.g., below 0.5), which may cause images of other categories to be misjudged as that category, so that the category is over-detected. In this case, post-processing may be added to raise any confidence threshold below 0.5 to 0.5.
It should be noted that in some cases, a sample product may have multiple product defects, and thus a sample image corresponding to the sample product may be labeled with multiple categories. In this case, in some embodiments, the multiple categories may be given different priorities such that more important product defects are given higher priority. Meanwhile, a category with higher priority has a lower confidence threshold, so that the image classification model can better meet the preset recall when outputting. Of course, if the confidence thresholds corresponding to two categories differ considerably (e.g., by more than a preset gap threshold, such as 0.5), the priorities of the two categories are not considered.
Fig. 8 illustrates a schematic flow diagram of a method 800 of determining a confidence threshold in accordance with one embodiment of the present disclosure. The method of determining the confidence threshold may be implemented, for example, by the terminal device 101, 102, 103, the server 105, or a combination thereof, as shown in fig. 1. As shown in fig. 8, the method 800 includes the following steps.
At step 810, a sample dataset is acquired, the sample dataset comprising a plurality of sample images that have been labeled with a category. The plurality of sample images comprise images corresponding to sample products, and the marked categories are categories of product defects. The product described herein may be, for example, a display screen, a display panel, etc., and the noted categories may refer to categories of display screen defects, such as the presence of residue, the presence of dust, too fine or too coarse of a circuit line, etc.
At step 820, the sample data set is divided into a training data set and a validation data set. The training data set is to be used to train the deep learning model to obtain an image classification model. The validation data set will be used to determine confidence thresholds for the image classification model in predicting the respective categories. In some embodiments, the sample data set may be divided into a training data set and a validation data set at a ratio of 9:1, i.e. the training data includes 90% of the sample images in the sample data set and the validation data set includes 10% of the sample images in the sample data set. Of course, such a division ratio is not necessary, and other ratios of division may be employed as needed. Generally, a 9:1 split ratio can better balance training and determining confidence thresholds, so that the trained model and the determined confidence thresholds have better use effects.
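As a non-limiting illustration, the 9:1 division may be sketched as follows. The function name, the random shuffling before the split, and the fixed seed are assumptions made for this sketch; the description above specifies only the ratio.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Sequence[T], train_ratio: float = 0.9,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Split labeled samples into a training set and a validation set."""
    shuffled = list(samples)
    # Assumption: shuffle so both subsets follow a similar class distribution.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

With 10000 sample images and the default ratio, this yields 9000 training images and 1000 validation images.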
In step 830, the deep learning model is trained using the training data set to obtain an image classification model. The image classification model is used for outputting, based on an image input to it, the category of the input image and the confidence corresponding to the output category. In some embodiments, the deep learning model may be a Convolutional Neural Network (CNN) model, an object detection convolutional neural network (Faster-RCNN) model, a Recurrent Neural Network (RNN) model, or a Generative Adversarial Network (GAN) model, but is not limited thereto, and other neural network models known to those skilled in the art may be employed.
In some embodiments, training parameters for training the deep learning model may be configured in advance, where the training parameters include at least a class of training targets, a type of the deep learning model, a total number of training rounds, a learning rate reduction strategy, and a test strategy. The training parameters may also include the size of the image input to the deep learning model.
In step 840, a confidence threshold for the image classification model in outputting each category is determined using the validation data set such that the image classification model meets a preset accuracy or a preset recall when outputting. The accuracy represents the ratio of the number of images correctly output as the corresponding category to the number of images output as the corresponding category, and the recall represents the ratio of the number of images correctly output as the corresponding category to the actual number of images of the corresponding category. It should be noted that any suitable method may be used to determine, using the validation data set, the confidence threshold of the image classification model in outputting the respective categories such that the image classification model meets the preset accuracy or the preset recall at the time of prediction, which is not limited herein.
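As a non-limiting illustration, the two ratios may be computed per category as follows. The function name and the list-based inputs are assumptions made for this sketch, which reads the accuracy as the fraction of images output as a category that are actually labeled with that category; returning 0.0 for an empty denominator is likewise an assumption.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(labeled: Sequence[str], predicted: Sequence[str],
                        category: str) -> Tuple[float, float]:
    """Per-category accuracy (precision) and recall on a validation set."""
    # Images both labeled and output as the category.
    correct = sum(1 for l, p in zip(labeled, predicted)
                  if l == category and p == category)
    output_count = sum(1 for p in predicted if p == category)  # output as category
    real_count = sum(1 for l in labeled if l == category)      # labeled as category

    accuracy = correct / output_count if output_count else 0.0
    recall = correct / real_count if real_count else 0.0
    return accuracy, recall
```

For example, if three images are output as a category and two of them are labeled with it, the accuracy for that category is 2/3; if three images are labeled with the category and two of them are output as it, the recall is also 2/3.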
In the method for determining the confidence threshold according to the embodiment of the disclosure, a labeled sample data set is divided into a training data set and a validation data set; after an image classification model is obtained by training with the training data set, the confidence threshold of the image classification model in predicting each category is determined with the validation data set, so that the image classification model meets a preset accuracy or a preset recall in prediction. The confidence thresholds can therefore be calculated quickly, and because the calculation is integrated with the training process, the confidence threshold of each category can be seen intuitively once the model is trained. In addition, where the sample class distribution is unbalanced, the technical scheme can control the prediction or classification accuracy of each category through its own confidence threshold, can handle the recognition behavior of the deep learning model on each category well, helps balance the prediction accuracy and recall of the deep learning model, and improves the classification efficiency of the image classification model.
It should be appreciated that the present embodiment is directed to the process of determining confidence thresholds for an image classification model, whereas the embodiment described with reference to fig. 3 is directed to the process of constructing and using an image classification model. Accordingly, the related schemes described above with reference to fig. 3-7 are also applicable to the present embodiment, and specific details are not described here.
Thus, in some embodiments, determining confidence thresholds for the image classification model in outputting the respective categories using the validation data set includes: inputting each sample image in the validation data set into the image classification model to obtain an output category and a corresponding confidence for each sample image in the validation data set; selecting each of the output categories separately, and ordering the sample images output as the selected category from high to low according to the corresponding confidence; determining a target sample image among the sample images of the selected category ranked from high to low according to the corresponding confidence, such that the ratio of the number of sample images whose category is correctly output in a designated sample image set to the number of sample images in the designated sample image set is smaller than the preset accuracy, wherein the sample images whose category is correctly output are the sample images whose output category is the same as the labeled category, and the designated sample image set includes the target sample image and the sample images of the selected category whose confidence is higher than that of the target sample image; and determining a confidence threshold corresponding to the selected category based on the confidences corresponding to the sample images in the designated sample image set.
In some embodiments, determining confidence thresholds for the image classification model in outputting the respective categories using the validation dataset includes: inputting each sample image in the verification data set into the image classification model to obtain an output category and a corresponding confidence level of each sample image in the verification data set; selecting each of the labeled categories separately, and ordering the sample images labeled as the selected categories from top to bottom according to the corresponding confidence levels; determining target sample images in the sample images of the selected categories which are ranked from high to low according to the corresponding confidence, so that the ratio of the number of the sample images of the correctly output categories in a designated sample image set to the number of the sample images in the designated sample image set is smaller than the preset recall, wherein the sample images of the correctly output categories comprise the sample images of which the output categories are the same as the marked categories, and the sample images in the designated sample image set comprise the target sample images and the sample images marked as the selected categories, and the corresponding confidence of the sample images is higher than the confidence of the target sample images; a confidence threshold corresponding to the selected category is determined based on the confidence level corresponding to the sample images in the designated sample image set.
Determining the confidence threshold corresponding to the selected category based on the confidences corresponding to the sample images in the designated sample image set may comprise: determining the confidence threshold corresponding to the selected category based on the confidence corresponding to the target sample image and the confidence corresponding to the sample image immediately preceding the target sample image among the sample images of the selected category ranked from high to low according to the corresponding confidence.
Determining the confidence threshold corresponding to the selected category based on the confidence corresponding to the target sample image and the confidence corresponding to the previous sample image may in turn comprise: determining the average of the confidence corresponding to the currently traversed sample image and the confidence corresponding to the previous sample image as the confidence threshold corresponding to the selected category.
Fig. 9 illustrates a schematic flow diagram of a method 900 of determining a confidence threshold in accordance with one embodiment of the present disclosure. The method of determining the confidence threshold may be implemented, for example, by the terminal device 101, 102, 103, the server 105, or a combination thereof, as shown in fig. 1. As shown in fig. 9, the method 900 includes the following steps.
In step 910, a sample dataset is obtained that includes a plurality of sample images that have been labeled with a class in response to a user's configuration operation on parameters of the sample dataset. The plurality of sample images comprise images corresponding to sample products, and the marked categories are categories of product defects. The product described herein may be, for example, a display screen, a display panel, etc., and the noted categories may refer to categories of display screen defects, such as the presence of residue, the presence of dust, too fine or too coarse of a circuit line, etc.
As an example, a user may configure parameters of a sample data set on a graphical interface as shown in fig. 10, which may include, for example, a training type (e.g., image classification), a selected sample data set (e.g., l6lbf_main code), a version number of the sample data set (e.g., V1), and so on.
At step 920, the sample data set is divided into a training data set and a validation data set. The training data set is to be used to train the deep learning model to obtain an image classification model. The validation data set will be used to determine confidence thresholds for the image classification model in predicting the respective categories. As shown in fig. 10, the sample data set (including 10000 sample images) may be divided into a training data set and a verification data set at a ratio of 9:1, that is, the training data includes 90% of the sample images (i.e., 9000 sample images) in the sample data set, and the verification data set includes 10% of the sample images (i.e., 1000 sample images) in the sample data set.
In step 930, the deep learning model is trained using the training data set to obtain an image classification model. The image classification model is used for outputting, based on an image input to it, the category of the input image and the confidence corresponding to the output category. In some embodiments, the deep learning model may be a Convolutional Neural Network (CNN) model, an object detection convolutional neural network (Faster-RCNN) model, a Recurrent Neural Network (RNN) model, or a Generative Adversarial Network (GAN) model, but is not limited thereto, and other neural network models known to those skilled in the art may be employed.
In some embodiments, training parameters for training the deep learning model may be configured in advance, where the training parameters include at least a class of training targets, a type of the deep learning model, a total number of training rounds, a learning rate reduction strategy, and a test strategy. The training parameters may also include the size of the image input to the deep learning model.
As an example, a training task may be established in response to a task establishment operation of a user, and a parameter configuration interface may be generated for configuring training parameters when training the deep learning model, where the training parameters include at least a class for which training is aimed, a type of the deep learning model, a total number of training rounds, a learning rate decline strategy, and a test strategy; and training the deep learning model according to the training parameters and by utilizing a training data set to obtain an image classification model.
As shown in fig. 11, in response to a task building operation of a user, a training task is built, and a parameter configuration interface is generated, where the user can configure training parameters when training a deep learning model in the parameter configuration interface, where the training parameters may include a class for which training is performed, a type of the deep learning model, a total number of training rounds, a learning rate drop strategy (e.g., a learning rate drop round number, etc.), a test strategy (e.g., a test round number, etc.), a size of an image input to the deep learning model, and so on. In addition, the training parameters may optionally include some custom parameters, such as departments, sites, model keywords, and the like, which are not limiting. And after the user clicks the configuration, training the deep learning model according to the training parameters and by utilizing a training data set.
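As a non-limiting illustration, the training parameters collected by the parameter configuration interface may be held in a structure such as the following. The class name, field names, default image size, and the validation rule are all assumptions made for this sketch and are not taken from fig. 11.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TrainingConfig:
    """Hypothetical container for the training parameters described above."""
    target_classes: List[str]      # classes the training is aimed at
    model_type: str                # type of the deep learning model
    total_rounds: int              # total number of training rounds
    lr_drop_rounds: List[int]      # learning rate decline strategy
    test_rounds: List[int]         # test strategy
    image_size: Tuple[int, int] = (224, 224)               # placeholder input size
    custom: Dict[str, str] = field(default_factory=dict)   # e.g. department, site

    def validate(self) -> None:
        # Assumed sanity checks; the description does not prescribe them.
        if not self.target_classes:
            raise ValueError("at least one target class is required")
        if any(r > self.total_rounds for r in self.test_rounds + self.lr_drop_rounds):
            raise ValueError("scheduled rounds must not exceed total_rounds")
```

A configuration with test rounds of 120000, 150000 and 200000 and a total of 300000 training rounds would pass the assumed check, whereas a test round beyond the total number of training rounds would be rejected.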
In some embodiments, a plurality of classification models are obtained during training according to the test strategy, and the classification model corresponding to the maximum test round number is taken as the image classification model. As shown in fig. 11, if the test strategy specifies three test rounds (120000, 150000 and 200000), the classification model corresponding to the maximum test round (200000) may be taken as the image classification model.
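As a non-limiting illustration, the selection of the classification model corresponding to the maximum test round may be sketched as follows. The function name and the dictionary mapping test rounds to saved models are assumptions made for this sketch; the model values in the usage note are placeholders.

```python
from typing import Dict, Tuple, TypeVar

M = TypeVar("M")


def select_final_model(models_by_round: Dict[int, M]) -> Tuple[int, M]:
    """Return the largest test round and the model saved at that round."""
    best_round = max(models_by_round)
    return best_round, models_by_round[best_round]
```

For test rounds of 120000, 150000 and 200000, the model saved at round 200000 is returned.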
In some embodiments, the total number of training rounds is positively correlated with the number of sample images in the sample data set. For example, if the number of sample images is 10000 or less, the total number of training rounds is 300000; if the number of sample images is greater than 10000, the total number of training rounds is configured using the following formula for determining the total number of rounds; wherein the formula for determining the total number of rounds is:
Wherein Y represents the total number of training rounds, X represents the number of sample images, X is 10000 or more, INT is a rounding function, and b represents a growth factor with a fixed value of 30000 or more and 70000 or less. In the present example, the value of b may be 50000 or 60000 in practice, which is not particularly limited in this embodiment.
In step 940, a confidence threshold for the image classification model in outputting each category is determined using the validation data set such that the image classification model meets a preset accuracy or a preset recall when outputting. The accuracy represents the ratio of the number of images correctly output as the corresponding category to the number of images output as the corresponding category, and the recall represents the ratio of the number of images correctly output as the corresponding category to the actual number of images of the corresponding category. It should be noted that any suitable method may be used to determine, using the validation data set, the confidence threshold of the image classification model in outputting the respective categories such that the image classification model meets the preset accuracy or the preset recall at the time of output, which is not limited herein.
At step 950, a confidence presentation interface is generated for presenting confidence thresholds for the image classification model in outputting the respective categories. As shown in FIG. 12, a confidence threshold presentation interface may be generated in which confidence thresholds for each category are visually presented to facilitate user acquisition and correction of the confidence thresholds.
In the method for determining the confidence threshold according to the embodiment of the disclosure, a labeled sample data set is divided into a training data set and a validation data set; after an image classification model is obtained by training with the training data set, the confidence threshold of the image classification model in predicting each category is determined with the validation data set, so that the image classification model meets a preset accuracy or a preset recall in prediction. The confidence thresholds can therefore be calculated quickly, and because the calculation is integrated with the training process, the confidence threshold of each category can be seen intuitively once the model is trained. In addition, where the sample class distribution is unbalanced, the technical scheme can control the prediction or classification accuracy of each category through its own confidence threshold, can handle the recognition behavior of the deep learning model on each category well, helps balance the prediction accuracy and recall of the deep learning model, and improves the classification efficiency of the image classification model.
In some embodiments, during the training process, a training schedule may be generated and presented that includes a task cancellation identification and a task detail identification, as shown in FIG. 13. After the user triggers the task cancellation identification, training of the deep learning model can be stopped, which makes it easier for the user to control the training process. After the user triggers the task detail identification, a loss curve of the training process is generated and displayed; the training parameters may then be updated according to the loss curve. The loss curve is usually a two-dimensional curve whose abscissa is the number of training rounds and whose ordinate is the loss value. During model training, the loss curve is updated in real time according to the training state, so that the loss curve can be observed and the training parameters adjusted according to its shape. Specifically, if the loss curve remains disordered and shows no descending trend as the abscissa increases, the training parameters are not properly configured; the training should be stopped and the parameters reconfigured for retraining. If the loss curve drops slowly, it should be observed further, and training may be stopped or the learning rate increased at the next training. If the loss curve still shows a descending trend when training is completed (normally it should eventually flatten), the total number of training rounds should be increased and the model retrained.
It should be appreciated that the present embodiment focuses on the construction process of the image classification model, whereas the embodiment described with reference to fig. 3 focuses on the construction and use process of the image classification model. Accordingly, the related schemes described above with reference to fig. 3-7 are also applicable to the present embodiment, and specific details are not described here.
For example, in some embodiments, determining confidence thresholds for the image classification model in outputting the respective categories using the validation dataset includes: inputting each sample image in the verification data set into the image classification model to obtain an output category and a corresponding confidence for each sample image in the verification data set; selecting each of the output categories in turn, and sorting the sample images output as the selected category from high to low by corresponding confidence; determining a target sample image among the sorted sample images of the selected category such that the ratio of the number of correctly classified sample images in a designated sample image set to the total number of sample images in the designated sample image set is smaller than the preset accuracy, wherein a correctly classified sample image is one whose output category is the same as its labeled category, and the designated sample image set comprises the target sample image and those sample images of the selected category whose corresponding confidence is higher than that of the target sample image; and determining a confidence threshold corresponding to the selected category based on the confidences corresponding to the sample images in the designated sample image set.
In some embodiments, determining confidence thresholds for the image classification model in outputting the respective categories using the validation dataset includes:
Inputting each sample image in the verification data set into the image classification model to obtain an output category and a corresponding confidence for each sample image in the verification data set; selecting each of the labeled categories in turn, and sorting the sample images labeled as the selected category from high to low by corresponding confidence; determining a target sample image among the sorted sample images of the selected category such that the ratio of the number of correctly classified sample images in a designated sample image set to the total number of sample images in the designated sample image set is smaller than the preset recall, wherein a correctly classified sample image is one whose output category is the same as its labeled category, and the designated sample image set comprises the target sample image and those sample images labeled as the selected category whose corresponding confidence is higher than that of the target sample image; and determining a confidence threshold corresponding to the selected category based on the confidences corresponding to the sample images in the designated sample image set.
Determining a confidence threshold corresponding to the selected category based on the confidences corresponding to the sample images in the designated sample image set comprises: determining the confidence threshold corresponding to the selected category based on the confidence corresponding to the target sample image and the confidence corresponding to the sample image immediately preceding the target sample image in the sample images of the selected category sorted from high to low by corresponding confidence.
Determining the confidence threshold based on the confidence corresponding to the target sample image and the confidence corresponding to the immediately preceding sample image comprises: determining the average of the confidence corresponding to the target sample image (i.e., the currently traversed sample image) and the confidence corresponding to the preceding sample image as the confidence threshold corresponding to the selected category.
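The traversal described above can be illustrated with a minimal sketch. The function name `confidence_threshold`, the `(confidence, is_correct)` input layout, and the handling of the two boundary cases (the very first image already falls below the preset value, or the ratio never falls below it) are assumptions made for illustration and are not specified by the disclosure.

```python
from typing import List, Tuple


def confidence_threshold(samples: List[Tuple[float, bool]], preset: float) -> float:
    """Return a confidence threshold for one selected category.

    samples: (confidence, is_correct) pairs for the validation images of the
             selected category; is_correct means output category == labeled category.
    preset:  the preset accuracy (or preset recall) to be met.
    """
    # Sort from high to low by confidence.
    ranked = sorted(samples, key=lambda s: s[0], reverse=True)
    correct = 0
    for i, (conf, ok) in enumerate(ranked):
        correct += int(ok)
        # Ratio over the designated set: the current image plus all higher-ranked ones.
        ratio = correct / (i + 1)
        if ratio < preset:
            if i == 0:
                # Assumption: there is no preceding image, so use this confidence as is.
                return conf
            # Average of the target image's confidence and the preceding image's.
            return (conf + ranked[i - 1][0]) / 2
    # Assumption: the ratio never dropped below the preset value, so the lowest
    # observed confidence (or 0.0 for an empty list) is returned.
    return ranked[-1][0] if ranked else 0.0


if __name__ == "__main__":
    demo = [(0.9, True), (0.8, True), (0.7, False), (0.6, True)]
    print(confidence_threshold(demo, preset=0.8))
```

In the demo, the running ratio first drops below 0.8 at the third image, so the threshold is the mean of 0.7 and 0.8. The same routine can be fed either the images output as the selected category (preset accuracy) or the images labeled as the selected category (preset recall).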
Fig. 14 illustrates an exemplary block diagram of an apparatus 1400 for determining image categories according to one embodiment of the disclosure. As shown in fig. 14, the apparatus 1400 for determining image categories includes an acquisition module 1410, a division module 1420, a training module 1430, a determination module 1440, and an output module 1450.
The acquisition module 1410 is configured to acquire a sample dataset comprising a plurality of sample images that have been labeled with a class. The partitioning module 1420 is configured to partition the sample data set into a training data set and a validation data set. The training module 1430 is configured to train the deep learning model with the training data set to obtain an image classification model for outputting a category of the input image and a confidence level corresponding to the output category of the input image based on the image input to the image classification model. The determining module 1440 is configured to determine, using the validation dataset, a confidence threshold for the image classification model when outputting each category such that the image classification model when output meets a preset accuracy or a preset recall, wherein accuracy represents a ratio of a number of images of the respective category that are correctly output to a number of images of the respective category that are output, and recall represents a ratio of a number of images of the respective category that are correctly output to a true number of images of the respective category. The output module 1450 is configured to input the target image into the image classification model and derive a class of the target image based on the determined confidence threshold.
Fig. 15 illustrates an exemplary block diagram of an apparatus 1500 for determining a confidence threshold in accordance with one embodiment of the present disclosure. As shown in fig. 15, the apparatus 1500 for determining a confidence threshold includes an acquisition module 1510, a partitioning module 1520, a training module 1530, and a determination module 1540.
The acquisition module 1510 is configured to acquire a sample dataset comprising a plurality of sample images that have been labeled with a class. The partitioning module 1520 is configured to partition the sample data set into a training data set and a validation data set. The training module 1530 is configured to train the deep learning model with the training data set to obtain an image classification model for outputting a category of the input image and a confidence level corresponding to the output category of the input image based on the image input to the image classification model. The determination module 1540 is configured to determine a confidence threshold for the image classification model when outputting each category using the validation dataset such that the image classification model when output meets a preset accuracy or a preset recall, wherein accuracy represents a ratio of a number of images of a respective category that are correctly output to a number of images of the respective category that are output, and recall represents a ratio of a number of images of the respective category that are correctly output to a true number of images of the respective category.
Fig. 16 illustrates an exemplary block diagram of an apparatus 1600 for determining a confidence threshold according to one embodiment of the present disclosure. As shown in fig. 16, the apparatus 1600 for determining a confidence threshold includes an acquisition module 1610, a division module 1620, a training module 1630, a determination module 1640, and a generation module 1650.
The acquisition module 1610 is configured to acquire a sample data set comprising a plurality of sample images that have been labeled with a class. The partitioning module 1620 is configured to partition the sample data set into a training data set and a validation data set. The training module 1630 is configured to train the deep learning model with the training data set to obtain an image classification model for outputting a category of the input image and a confidence level corresponding to the output category of the input image based on the image input to the image classification model. The determination module 1640 is configured to determine a confidence threshold for the image classification model when outputting each category using the validation dataset such that the image classification model when output meets a preset accuracy or a preset recall, wherein accuracy represents a ratio of a number of images of a respective category that are correctly output to a number of images of the respective category that are output, and recall represents a ratio of a number of images of the respective category that are correctly output to a true number of images of the respective category. The generation module 1650 is configured to generate a confidence presentation interface for presenting confidence thresholds for the image classification model in outputting the respective categories.
The specific details of each module in the above apparatuses have already been described in the method section; for details not disclosed here, reference may be made to the embodiments of the method section, and they are not repeated.
Further, by analyzing the related art, the applicant found that existing deep learning-based product defect detection schemes generally fall into two classes. In the first class of schemes, after algorithm engineers train the model, the model is embedded into system software and deployed online directly. Such software generalizes poorly: when the acquired images change because of changes in the production process or the product model, the algorithm's precision tends to degrade noticeably, so the detection effect of the product defect detection system is poor. Such schemes are widely used in the related art because their development and deployment processes are simpler. In the second class of schemes, a model training function may be provided. Specifically, when a production process or product model changes, or according to other specific requirements, operation and maintenance personnel or other management personnel can perform simple sample data collection and model training, and the trained model can be deployed online. Such schemes are currently less widely used. Compared with the first class, the second class helps mitigate the change in model effect caused by changes in the production process or the product model; however, in these schemes the trained model is often deployed online after only a simple test, so there is a risk that the online effect is inconsistent with the offline effect.
Fig. 17 illustrates a schematic flow diagram of a method 1700 of determining a target model according to one embodiment of the disclosure. The method 1700 of determining the target model may be implemented, for example, by the terminal device 101, 102, 103, the server 105, or a combination thereof, as shown in fig. 1. As shown in fig. 17, method 1700 includes the following steps.
At step 1710, a sample dataset is acquired, the sample dataset comprising a training dataset and a validation dataset, each of the training dataset and the validation dataset comprising a plurality of sample images that have been labeled with a class. The sample image may be an image of a target object, for example. Further by way of example, the target object may be a target product and the category may be a product defect category of the target product. As mentioned above, the target product may be, for example, a display screen, a display panel, etc., and the noted category may refer to a category of display screen defect, such as the presence of residue, the presence of dust, too fine or too coarse a circuit line, etc.
The training data set and the verification data set may be acquired separately, or may be obtained by dividing the sample data set, for example. The training data set may be used to train the deep learning model to obtain a trained model. The validation data set may be used to validate the training effect of the deep learning model; for example, a Loss curve, i.e., a curve of the change in the Loss function value, may be generated from the validation data set during training to assess the training effect and to help determine when to stop training. In some embodiments, the numbers of sample images in the training data set and the validation data set may conform to a predetermined ratio. For example, the ratio of training data set to validation data set may be 9:1, 8:1, 8:2, etc. Illustratively, the sample data set may include other data sets in addition to the training data set and the validation data set. Further by way of example, the sample data set may first be divided into a training set and other data sets, and the training set may then be divided during training into the foregoing training data set and verification data set.
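As an illustration of dividing a sample data set by a predetermined ratio, the following minimal sketch shuffles the samples and cuts them at the ratio boundary. The function name `split_dataset`, the fixed random seed, and the use of shuffling are assumptions for illustration; the disclosure does not prescribe how the division is performed.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(samples: Sequence, ratio: Tuple[int, int] = (8, 2),
                  seed: int = 0) -> Tuple[List, List]:
    """Split samples into (training, validation) lists according to ratio."""
    items = list(samples)
    # Assumption: shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(items)
    n_train = len(items) * ratio[0] // sum(ratio)
    return items[:n_train], items[n_train:]


if __name__ == "__main__":
    train, val = split_dataset(range(100), ratio=(9, 1))
    print(len(train), len(val))
```

The same helper could be applied twice to obtain a training set and other data sets first, and then a training data set and a verification data set from the training set.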
At step 1720, the deep learning model is trained using the training dataset, resulting in at least two trained models according to different training rounds. By way of example, the trained model may be the image classification model mentioned in the previous embodiments, which may be used to predict a class based on an input image and derive a confidence level corresponding to the class.
In some embodiments, the deep learning model may be a Convolutional Neural Network (CNN) model, an object detection convolutional neural network (Faster R-CNN) model, a Recurrent Neural Network (RNN) model, a Generative Adversarial Network (GAN) model, or a self-attention model, but is not limited thereto, and other neural network models known to those skilled in the art may be employed.
In some embodiments, training parameters for training the deep learning model may be configured in advance, where the training parameters include at least a test strategy. For example, two or more numbers of training rounds may be specified in the training parameters, so that the trained model at each specified number of training rounds is saved as a candidate model and a better model among these candidates can be selected as the target model during subsequent testing. The numbers of training rounds may be specified empirically by the person performing the model training, or determined indirectly from certain parameters, such as the decrease of the learning rate. For example, assuming the model is considered to perform better around the second decrease of the learning rate, several numbers of training rounds may be selected around the round at which the learning rate decreases for the second time. It should be understood that the numbers of training rounds at which models are saved can also be specified according to other indexes, depending on actual requirements. Then, when the deep learning model is trained according to the training parameters, the models whose number of training rounds reaches a test round specified by the test strategy are output for testing.
For example, the test strategy may include the number of tests and the number of rounds at the time of the test, and the training parameters may include a learning rate reduction strategy, a training total number of rounds, and the like, in addition to the test strategy. And, illustratively, configuring training parameters according to the characteristic information of the sample dataset may refer to configuring a learning rate reduction strategy, a total number of training rounds, a test strategy, and the like according to the number of samples in the characteristic information.
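The idea of saving candidate models at the round numbers listed in the test strategy can be sketched as follows. The callback `train_one_round` and the dictionary of candidates are hypothetical names used only for illustration; a real training framework would save checkpoints in its own format.

```python
from typing import Callable, Dict, Iterable


def train_with_test_rounds(train_one_round: Callable[[int], object],
                           total_rounds: int,
                           test_rounds: Iterable[int]) -> Dict[int, object]:
    """Run training and keep a candidate model at each configured test round.

    train_one_round is a hypothetical callback that performs one round of
    training and returns an independent snapshot of the model at that round.
    """
    wanted = set(test_rounds)
    candidates: Dict[int, object] = {}
    for rnd in range(1, total_rounds + 1):
        snapshot = train_one_round(rnd)
        if rnd in wanted:
            # Save this model as a candidate for later testing.
            candidates[rnd] = snapshot
    return candidates


if __name__ == "__main__":
    # Stand-in "model": just a string naming the round it was saved at.
    saved = train_with_test_rounds(lambda r: f"model@{r}", 10, [4, 8, 12])
    print(saved)
```

Test rounds beyond the total number of training rounds (12 in the demo) are simply never reached, so no candidate is saved for them.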
Specifically, the total number of training rounds and the number of samples may be positively correlated. For example, if the number of samples is 10000 or less, the total number of training rounds may be 300000; if the number of samples is greater than 10000, the following formula may be used to configure the total number of training rounds. The formula for determining the total number of rounds is:
Wherein Y represents the total number of training rounds, X represents the number of samples, X is greater than or equal to 10000, INT is a rounding function, and b represents a growth factor with a fixed value, where b is greater than or equal to 30000 and less than or equal to 70000. In the present example embodiment, the value of b may be 50000 or 60000, which is not particularly limited. The mapping between the number of samples and the total number of training rounds may be an optimal result obtained through multiple tests, or may be customized according to user requirements, which is likewise not specifically limited in this example embodiment.
In this example embodiment, the number of rounds at which the learning rate is lowered is positively correlated with the total number of training rounds, and the number of rounds at the time of testing is greater than or equal to the number of rounds at which the learning rate is lowered for the first time and less than or equal to the total number of training rounds. The learning rate is lowered a plurality of times, and at least two tests are performed within a preset range of rounds around the second lowering of the learning rate; two, three, or more tests may be performed, which is not particularly limited in this example embodiment. Because the learning rate is lowered several times during training, the lowering with the best result can be selected, which improves the accuracy of the obtained target model and, in turn, the accuracy of defect detection. Furthermore, models with different numbers of training rounds can be tested multiple times during training, and the model with the best test result is selected as the target model, which improves model performance.
In the present exemplary embodiment, the learning rate decrease method may be a piecewise constant decrease method, an exponential decrease method, a natural exponential decrease method, a cosine decrease method, or the like, and is not particularly limited. The magnitude of the learning rate decrease may depend on the decrease method, may be configured directly as a constant such as 0.1 or 0.05, or may depend on the parameters of the decrease method, and is likewise not particularly limited in the present exemplary embodiment.
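The formula itself is not reproduced in this text. Purely as an illustration of how the quantities defined above could fit together, the sketch below assumes a hypothetical form Y = 300000 + INT(X / 10000) × b; this form is an assumption that is merely consistent with the stated symbols and with the positive correlation, and it should not be read as the formula of the disclosure.

```python
def total_training_rounds(num_samples: int, b: int = 50000) -> int:
    """Configure the total number of training rounds from the sample count.

    b is the growth factor; the text bounds it to [30000, 70000].
    """
    if not 30000 <= b <= 70000:
        raise ValueError("growth factor b must lie in [30000, 70000]")
    if num_samples <= 10000:
        # Stated in the text: 300000 rounds for 10000 samples or fewer.
        return 300000
    # ASSUMED form (hypothetical): Y = 300000 + INT(X / 10000) * b.
    return 300000 + (num_samples // 10000) * b


if __name__ == "__main__":
    for x in (5000, 25000, 100000):
        print(x, total_training_rounds(x))
```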
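For reference, the four decrease methods named above are commonly written as follows. This is a minimal sketch of their usual textbook forms; the function names and parameter names (`boundaries`, `decay_rate`, `decay_steps`, `min_lr`) are assumptions for illustration, and the disclosure does not fix any particular parameterization.

```python
import math
from typing import Sequence


def piecewise_constant(step: int, boundaries: Sequence[int],
                       values: Sequence[float]) -> float:
    """values holds one more entry than boundaries; pick the active segment."""
    for boundary, value in zip(boundaries, values):
        if step < boundary:
            return value
    return values[-1]


def exponential_decay(step: int, base_lr: float, decay_rate: float,
                      decay_steps: int) -> float:
    """lr = base_lr * decay_rate ** (step / decay_steps)."""
    return base_lr * decay_rate ** (step / decay_steps)


def natural_exp_decay(step: int, base_lr: float, decay_rate: float,
                      decay_steps: int) -> float:
    """lr = base_lr * exp(-decay_rate * step / decay_steps)."""
    return base_lr * math.exp(-decay_rate * step / decay_steps)


def cosine_decay(step: int, base_lr: float, total_steps: int,
                 min_lr: float = 0.0) -> float:
    """Half-cosine decrease from base_lr at step 0 to min_lr at total_steps."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 150, 250):
        print(s, piecewise_constant(s, [100, 200], [0.1, 0.01, 0.001]))
```

With the piecewise constant method, each boundary corresponds to one lowering of the learning rate, which matches the description of lowering the learning rate a plurality of times by a constant factor such as 0.1.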
In an example embodiment of the present disclosure, the above feature information may further include a size, a type, and the like of a picture in the sample data set, and configuring the training parameters according to the above feature information may further include configuring the training parameters according to the size, the type, and the like of the picture, for example, configuring a size and the like of an input image input to a deep learning model to be trained. Specifically, if the type of the picture is an AOI color picture or a DM image, the size of the input image may be a first preset multiple of the size of the picture; if the type of the picture of the defective product is a TDI image, the size of the input image can be a second preset multiple of the size of the picture; wherein the first preset multiple is less than or equal to 1, and the second preset multiple is greater than or equal to 1.
In an example embodiment of the present disclosure, in the above-mentioned example scenario involving product defects, the feature information may further include a defect level of the defective product and a number of samples corresponding to various defects, and the training parameter may further include a confidence, where the confidence in the training process may be configured according to the number of samples corresponding to various defects and the defect level. Specifically, a preset number may be set first, and the size relationship between the number of samples corresponding to the various defects and the preset number may be determined, and if the number of samples corresponding to a defect is greater than the preset number, the confidence may be configured according to the defect level of the defect. For example, the defect level may include a first defect level and a second defect level, and if the defect level of the defect is the first defect level, the corresponding confidence is configured as a first confidence; if the defect level of the defect is a second defect level, the corresponding confidence level is configured as a second confidence level, wherein the second confidence level may be greater than the first confidence level. Further exemplary, the preset number may be 50, 100, etc., or may be further customized according to the user requirement, which is not specifically limited in this exemplary embodiment. Wherein the first confidence level is greater than or equal to 0.6 and less than or equal to 0.7; the second confidence level is greater than or equal to 0.8 and less than or equal to 0.9; the specific values of the first confidence and the second confidence may be customized according to the needs of the user, which is not specifically limited in this exemplary embodiment.
For example, defects with a high incidence and a low importance, i.e. defects with a low defect level, may be configured with a lower confidence. For instance, for the defect-free image category PI820 and the slight-defect category PI800, the confidence is set to 0.6; that is, if the probability score of an image for PI800 or PI820 exceeds 0.6, the image is judged to belong to that category. Defects with a lower incidence but higher importance, i.e. defects with a higher defect level, may be configured with a higher confidence. For instance, for the severe defects GT011 and SD011, the default confidence is set to 0.85; that is, if the probability score of an image for GT011 or SD011 exceeds 0.85, the image is judged to have that defect. The remaining images, whose scores fall below the configured confidence, are judged to be unknown (not recognized by the AI) and are processed manually to prevent missed judgments.
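The per-category decision described above can be sketched as follows, using the category codes and confidence values given in the text. The function name `decide`, the choice of the highest-scoring category as the candidate, and the fallback of treating categories without a configured confidence as never accepted are assumptions for illustration.

```python
from typing import Dict

# Confidence values taken from the example in the text.
THRESHOLDS: Dict[str, float] = {
    "PI820": 0.6,   # defect-free
    "PI800": 0.6,   # slight defect
    "GT011": 0.85,  # severe defect
    "SD011": 0.85,  # severe defect
}


def decide(scores: Dict[str, float],
           thresholds: Dict[str, float] = THRESHOLDS) -> str:
    """Return the accepted category code, or "unknown" for manual handling."""
    # Assumption: the candidate is the category with the highest probability score.
    best = max(scores, key=scores.get)
    # Assumption: a category with no configured confidence is never accepted.
    if scores[best] > thresholds.get(best, 1.0):
        return best
    return "unknown"


if __name__ == "__main__":
    print(decide({"PI800": 0.7, "GT011": 0.3}))
    print(decide({"GT011": 0.7, "PI800": 0.3}))
```

In the demo, a score of 0.7 is enough for the low-level category PI800 but not for the severe defect GT011, so the second image is routed to manual processing.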
At step 1730, the at least two trained models obtained at step 1720 are tested using the validation dataset, generating validation test results. The validation test results may include the class of model output and the confidence of the corresponding class. The images in the validation dataset may be input to each trained model in turn and the categories and corresponding confidence levels of the model outputs saved.
At step 1740, based on the validation test results, validation test metrics are generated, the validation test metrics including at least one of: confusion matrix, accuracy, recall, and F1 score. The accuracy represents the ratio of the number of images correctly predicted for the corresponding category to the number of images predicted as the corresponding category, the recall represents the ratio of the number of images correctly predicted for the corresponding category to the true number of images of the corresponding category, and the F1 score may be regarded as the harmonic mean of accuracy and recall, i.e., (2 x accuracy x recall)/(accuracy + recall). Assuming that the model has n prediction categories, the confusion matrix may be an n×n matrix in which each column represents a category predicted by the model, and the total of each column is the number of pictures predicted as that category; each row represents the true category of the picture, i.e. the category in the label, and the total of each row is the true number of pictures of that category. In addition, the confusion matrix may also include an unknown column, which holds the number of pictures for which the model did not predict a category. One or more of the above validation test indicators may be generated based on the categories and their confidences in the validation test results and the categories in the picture labels of the validation dataset.
In step 1750, a target model is determined from the at least two trained models based on the validation test index generated in step 1740. For example, the verification test indexes can be displayed to the user, and the user can refer to them to select one of the at least two trained models as the target model according to actual application requirements; or one of the at least two trained models can be selected automatically as the target model based on the verification test indexes according to a preset selection mechanism. It should be understood that in various embodiments of the present disclosure, expressions like that of step 1750 should be understood similarly. In other words, unless otherwise stated in this disclosure, the expression "update/determine ... according to ..." is to be understood as meaning that the updating, determining, or similar action may be performed automatically by the system according to one or more metrics, curves, etc., or may be performed by the system presenting one or more metrics, curves, etc. to the user and acting in response to received user input.
As will be appreciated by those skilled in the art, accuracy indicators are very important in the training of deep learning models. However, the applicant has found that for a particular application scenario it is often necessary to consider the performance of the model along different dimensions, rather than only a single index such as accuracy. For example, in an application scenario of screen defect detection, different categories of defects may have different priorities. For example, a scratched screen may have a much higher priority than a stained screen, because stains can be rinsed off, whereas a scratched screen must be scrapped. Therefore, generating the multi-dimensional verification test index based on the verification data set makes full use of the verification data set, shows the training effect of the model more comprehensively, assists the selection of the target model more effectively, and improves how well the selected target model matches the requirements of the specific application scenario.
For example, the target model may be determined from the at least two trained models preferentially according to the F1 score. As mentioned above, the F1 score is an index that takes both accuracy and recall into account; in general, the closer the F1 score is to 1, the better the training effect of the model, and the closer it is to 0, the worse the training effect. In addition, for example, whether the determined target model meets the preset requirement can be judged according to the confusion matrix; in response to the determined target model not meeting the preset requirement, the target model may be updated by retraining or by adjusting the confidence threshold. The confusion matrix shows the test results of each category in more detail, so that the training situation of the model for different categories can be understood more comprehensively and carefully. For example, according to the confusion matrix, the prediction result distribution, prediction accuracy, recall, F1 score, and the like of the model for some or all categories can be checked; in the screen defect detection application scenario, it can be judged whether the prediction accuracy, recall, F1 score, and the like of the model for some higher-priority defect categories (or all defect categories) are higher than a preset threshold.
If the predictions of the target model for one or more categories do not meet the preset requirements, an attempt may be made to update the target model by adjusting the confidence threshold of the corresponding category, or by supplementing sample data of the corresponding category and training the target model again. Alternatively, if the predictions of the target model fail the preset requirements for many categories, or the model cannot be brought to meet the requirements by adjusting confidence thresholds, supplementing sample data, and the like, the target model may be trained again after supplementing the training data set or optimizing the model parameters, or the initial deep learning model may be trained again and the target model reselected.
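The indicators defined above can be computed from labels and predictions as in the following minimal sketch. The function names and the convention that a prediction of `None` means "no category predicted" (counted in the unknown column) are assumptions for illustration. Note that "accuracy" as defined in this text is what is more commonly called precision.

```python
from typing import Dict, List, Optional, Sequence


def confusion_matrix(labels: Sequence[str], preds: Sequence[Optional[str]],
                     classes: Sequence[str]) -> List[List[int]]:
    """Rows are true categories, columns are predicted categories.

    One extra last column counts pictures with no predicted category (unknown).
    """
    index = {c: i for i, c in enumerate(classes)}
    n = len(classes)
    matrix = [[0] * (n + 1) for _ in range(n)]
    for true_cls, pred_cls in zip(labels, preds):
        col = index[pred_cls] if pred_cls in index else n
        matrix[index[true_cls]][col] += 1
    return matrix


def per_class_metrics(matrix: List[List[int]],
                      classes: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Accuracy (precision), recall, and F1 score per category."""
    n = len(classes)
    result = {}
    for i, cls in enumerate(classes):
        correct = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column total
        actual = sum(matrix[i])                          # row total, incl. unknown
        acc = correct / predicted if predicted else 0.0
        rec = correct / actual if actual else 0.0
        f1 = 2 * acc * rec / (acc + rec) if acc + rec else 0.0
        result[cls] = {"accuracy": acc, "recall": rec, "f1": f1}
    return result


if __name__ == "__main__":
    y_true = ["a", "a", "a", "b", "b"]
    y_pred = ["a", "a", "b", "b", None]
    cm = confusion_matrix(y_true, y_pred, ["a", "b"])
    print(cm)
    print(per_class_metrics(cm, ["a", "b"]))
```

Because the row total includes the unknown column, pictures that the model declined to classify lower the recall of their true category without affecting the accuracy of any category.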
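An automatic selection mechanism of the kind mentioned above could look like the following minimal sketch. Ranking the candidate models by the mean of their per-category F1 scores, and then checking higher-priority categories against a minimum F1 score, are assumptions chosen for illustration; the function name `select_target_model` and the input layout are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def select_target_model(f1_by_model: Dict[str, Dict[str, float]],
                        priority_classes: Sequence[str],
                        min_f1: float) -> Tuple[str, List[str]]:
    """Pick the model with the best mean F1 and list failing priority categories.

    f1_by_model maps a model name to its per-category F1 scores.
    """
    def mean_f1(scores: Dict[str, float]) -> float:
        return sum(scores.values()) / len(scores)

    # Assumption: prefer the candidate with the highest mean per-category F1.
    best = max(f1_by_model, key=lambda name: mean_f1(f1_by_model[name]))
    # Priority categories that do not reach the preset requirement.
    failing = [c for c in priority_classes
               if f1_by_model[best].get(c, 0.0) < min_f1]
    return best, failing


if __name__ == "__main__":
    candidates = {
        "round_4000": {"scratch": 0.90, "stain": 0.70},
        "round_8000": {"scratch": 0.85, "stain": 0.95},
    }
    print(select_target_model(candidates, ["scratch"], min_f1=0.9))
```

A non-empty list of failing categories would correspond to the case in which the confidence threshold is adjusted, sample data is supplemented, or the model is retrained.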
In the method 1700 of determining a target model, a sample dataset may include a training dataset and a validation dataset, after training with the training dataset to obtain two or more trained models, testing the two or more trained models with the validation dataset, generating a multi-dimensional validation test index, and selecting a target model from the two or more trained models based on the validation test index. Therefore, through the multi-dimensional verification test index, the training effect of each model can be evaluated and displayed to the user more comprehensively, and therefore the target model which is more suitable for the requirements of specific application scenes can be selected better in an auxiliary mode. Therefore, the prediction performance and the reliability of the prediction function of the finally obtained target model are improved, and the classification performance of the whole classification system is improved. For example, when the method is applied to the field of product defect detection, a more suitable target model can be selected more flexibly according to specific requirements, so that the detection effect of a product defect detection system is effectively improved.
Furthermore, the method 1700 in the present disclosure may make more efficient use of the validation data set than the related art schemes. In particular, in related art solutions, as mentioned above, the validation data set is often used only to generate a Loss curve during the training process to help determine whether model training is proceeding as expected, and is not used afterwards. However, generating the sample data set often requires a lot of manpower and time, and in fields involving specialized knowledge much of the data labeling work must be completed by technicians with the relevant expertise; for example, labeling sample data involving product defects generally requires a technician with some knowledge of the relevant product to ensure the accuracy of labeling. Therefore, if the labeled sample data is not fully utilized, the labor and time spent on labeling are wasted to a certain extent. The method 1700 of the present disclosure additionally generates a multi-dimensional verification test index based on the verification data set, which lets the verification data set play a fuller role in the process of determining the target model and thereby helps improve the utilization of the verification data set.
In some embodiments, the method 1700 of determining a target model described with reference to FIG. 17 may also include the offline test process 1800 shown in FIG. 18. As shown in fig. 18, the offline test procedure 1800 may include the following steps.
At step 1810, an offline test dataset is acquired, the offline test dataset comprising a plurality of sample images that have been labeled with a class. Likewise, the sample image may be an image of the aforementioned target object, which may be a target product such as a display screen, a display panel, or the like, and the noted category may be the same as the noted category of the sample image in the aforementioned sample data set, and may be a category of display screen defect such as the presence of residue, the presence of dust, too fine or too coarse of a circuit, or the like.
Illustratively, the offline test dataset may include at least one of: a subset partitioned from the sample dataset, and an input sample dataset provided by the user, wherein the input sample dataset may comprise a plurality of sample images that have been labeled with a category. Thus, the offline test dataset may be acquired in at least one of the following ways. In a first approach, the offline test dataset may be obtained directly from a subset of the sample dataset used in the model training phase. For example, the sample dataset may be divided into a training dataset, a validation dataset, and an offline test dataset at a preset ratio; or the sample dataset may first be divided into a training set and an offline test dataset at a preset ratio, and the training set may then be further divided into a training dataset and a validation dataset at a preset ratio. In a second approach, the user may be allowed to provide a new labeled dataset. An input sample dataset provided by the user, including a plurality of sample images labeled with categories, may be received, and the offline test dataset may be acquired based on the received input sample dataset. Alternatively, the two approaches may be combined to obtain an offline test dataset based on both a subset of the original sample dataset and the newly provided input sample dataset. By providing a new input sample dataset, the generalization ability of the target model can be tested. If the provided input sample dataset includes data extracted from the actual production line, the predictive performance of the target model on production line data may also be tested more intuitively, in preparation for deployment to the actual production line.
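The first and third approaches above can be sketched as follows. This is a minimal illustration only: the function name `split_dataset`, the 7:1:2 ratio, and the fixed random seed are assumptions chosen for the example and do not come from the disclosure.

```python
import random

def split_dataset(samples, parts=(7, 1, 2), seed=0):
    """Shuffle samples and split them into training, validation, and
    offline test subsets according to an integer ratio (assumed 7:1:2)."""
    total = sum(parts)
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = len(items) * parts[0] // total
    n_val = len(items) * parts[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    offline_test = items[n_train + n_val:]  # remainder goes to offline testing
    return train, val, offline_test

# Sample images are represented by integer ids here for brevity.
train, val, offline_test = split_dataset(range(100))

# Combined approach: append a user-provided labeled set (hypothetical ids).
user_provided = [1000, 1001, 1002]
combined_offline_test = offline_test + user_provided
```

Using an integer ratio avoids floating-point rounding when computing subset sizes; any remainder falls into the offline test subset.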
In step 1820, the target model is tested using the offline test dataset, generating offline test results. The offline test results may include the categories of the target model outputs and the confidence of the corresponding categories. The images in the offline test data set can be sequentially input to the target model, and the output category and the corresponding confidence level of the target model are saved.
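As a rough sketch of step 1820, the loop below feeds each image to a model in order and saves the output category and its confidence. The model interface (a callable returning a mapping from category to confidence), the `toy_model` stand-in, and the record field names are all assumptions made for illustration.

```python
def run_offline_test(model, dataset):
    """Feed each image to the model in order and save the highest-scoring
    category together with its confidence."""
    results = []
    for image_id, image, label in dataset:
        scores = model(image)  # assumed interface: dict of category -> confidence
        category = max(scores, key=scores.get)
        results.append({
            "image_id": image_id,
            "label": label,
            "category": category,
            "confidence": scores[category],
        })
    return results

def toy_model(image):
    """Stand-in for a trained classifier; not a real model."""
    return {"dust": image["brightness"], "residue": 1.0 - image["brightness"]}

dataset = [
    ("a.png", {"brightness": 0.9}, "dust"),
    ("b.png", {"brightness": 0.2}, "residue"),
]
offline_results = run_offline_test(toy_model, dataset)
```

Saving the raw confidence alongside the category allows later steps to re-evaluate the results under different confidence thresholds without running the model again.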
In some embodiments, the confidence threshold for at least one category may be updated based on the aforementioned validation test results or offline test results. Illustratively, the verification test result or the offline test result may be displayed to the user in various manners, and the user may adjust the confidence threshold of at least one category of the target model according to the actual application requirement with reference to the displayed result; or the confidence threshold of at least one category may be automatically adjusted based on the validation test results or the offline test results according to a preset adjustment mechanism.
In some embodiments, an accuracy curve and a recall curve may be generated for at least one category based on the verification test result or the offline test result, where the accuracy curve reflects the relationship between the accuracy and the confidence threshold, and the recall curve reflects the relationship between the recall and the confidence threshold; the confidence threshold for at least one category may then be updated based on the accuracy curve and the recall curve. For example, the intersection of the accuracy curve and the recall curve is often a suitable confidence threshold, and thus the confidence threshold for at least one category may be updated according to this intersection, either automatically or based on user input. Alternatively, the accuracy curve and the recall curve may be presented to the user, optionally together with other reference metrics, so that the user may adjust the confidence threshold more accurately based on this content in light of the actual production and business scenario. For example, the accuracy curve and the recall curve may be generated in embodiments in which the recommended confidence threshold is not automatically determined, in order to provide a more detailed reference for the user to adjust the confidence threshold.
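The relationship between the two curves and the confidence threshold can be illustrated with a short sketch. It assumes that per-category accuracy is computed as the fraction of images output as the category (at or above the threshold) that are actually labeled with it, and that an empty output set counts as accuracy 1.0; both are assumptions of this example, as are the function names and the sample data.

```python
def accuracy_recall_at(results, category, threshold):
    """results: list of (output_category, confidence, labeled_category).
    Returns (accuracy, recall) for one category at one confidence threshold."""
    kept = [r for r in results if r[0] == category and r[1] >= threshold]
    hits = sum(1 for r in kept if r[2] == category)
    labeled = sum(1 for r in results if r[2] == category)
    accuracy = hits / len(kept) if kept else 1.0  # convention assumed here
    recall = hits / labeled if labeled else 0.0
    return accuracy, recall

def crossing_threshold(results, category, thresholds):
    """Return the candidate threshold at which the two curves are closest."""
    def gap(t):
        accuracy, recall = accuracy_recall_at(results, category, t)
        return abs(accuracy - recall)
    return min(thresholds, key=gap)

results = [
    ("dust", 0.95, "dust"),
    ("dust", 0.90, "dust"),
    ("dust", 0.80, "residue"),
    ("dust", 0.70, "dust"),
    ("dust", 0.60, "residue"),
    ("residue", 0.85, "dust"),
    ("residue", 0.75, "residue"),
]
candidates = [0.5, 0.65, 0.75, 0.85, 0.92]
curve = [(t, *accuracy_recall_at(results, "dust", t)) for t in candidates]
recommended = crossing_threshold(results, "dust", candidates)
```

On this sample data the two curves meet at a threshold of 0.65, where accuracy and recall for the "dust" category are both 0.75. In practice the curves rarely cross exactly on a sampled threshold, which is why the sketch picks the candidate with the smallest gap.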
In some embodiments, the recommended confidence thresholds for the respective categories may be determined based on the validation test results or the offline test results, according to accuracy or recall; based on the recommended confidence threshold, a confidence threshold for at least one category is updated. For example, at the beginning of model training, confidence thresholds may be set in advance for the respective categories, and the confidence thresholds at this time may be default values or empirically set values. When at least two trained models are tested using the validation data set or the offline test data set, a recommendation confidence threshold for each category may be determined for each model according to preset logic based on the validation test results or the offline test results. Various verification test indexes can be generated based on the recommended confidence threshold, so that the prediction performance of each model is reflected more accurately, and the performance of the selected target model is improved. For example, the recommendation confidence threshold herein may be determined according to the scheme of determining confidence thresholds in the various embodiments described above with reference to fig. 3-16. Thus, an adapted confidence threshold value may be relatively easily and accurately determined for each model, which may provide an adjustment range and direction for a user to subsequently manually adjust the confidence threshold value, or may be ready for model online.
In some embodiments, an offline test index may be generated based on the offline test results, the offline test index including at least one of: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and real quantity for each category, confidence distribution for each category; the confidence threshold for the at least one category may then be updated based on the offline test index. The distribution of the model output quantity and the real quantity for each category refers, for each category, to the number of images that the target model outputs as that category and the number of images labeled as that category, and may be represented by a histogram, a line graph, or the like. The confidence distribution for each category refers to the confidences that the target model outputs for the input images of each category, and may be represented by a scatter diagram or the like. For example, the offline test index may be generated in an embodiment in which the recommended confidence threshold is automatically determined, so that the user can intuitively observe the prediction effect of the target model under the recommended confidence threshold and further fine-tune the recommended confidence threshold with reference to the current effect and the actual requirements.
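Several of the indices listed above can be derived from a single confusion matrix, as the following sketch shows. The function names and the sample data are illustrative assumptions; per-category accuracy is again assumed to mean the fraction of images output as a category that are labeled with it.

```python
from collections import Counter

def confusion_matrix(pairs, categories):
    """pairs: list of (labeled_category, output_category).
    Rows correspond to labeled categories, columns to model outputs."""
    counts = Counter(pairs)
    return [[counts[(t, p)] for p in categories] for t in categories]

def per_category_indices(matrix, categories):
    """Derive accuracy, recall, F1, and the output/labeled counts per category."""
    indices = {}
    for i, name in enumerate(categories):
        hits = matrix[i][i]
        output_count = sum(row[i] for row in matrix)  # column sum
        labeled_count = sum(matrix[i])                # row sum
        accuracy = hits / output_count if output_count else 0.0
        recall = hits / labeled_count if labeled_count else 0.0
        denom = accuracy + recall
        f1 = 2 * accuracy * recall / denom if denom else 0.0
        indices[name] = {
            "accuracy": accuracy,
            "recall": recall,
            "f1": f1,
            "output_count": output_count,
            "labeled_count": labeled_count,
        }
    return indices

categories = ["dust", "residue"]
pairs = ([("dust", "dust")] * 3 + [("dust", "residue")]
         + [("residue", "dust")] + [("residue", "residue")] * 3)
matrix = confusion_matrix(pairs, categories)
indices = per_category_indices(matrix, categories)
```

The `output_count` and `labeled_count` fields correspond to the two series that would be plotted in the histogram or line graph of model output quantity versus real quantity.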
In some embodiments, after the confidence threshold is updated, at least one of the following may also be generated based on the offline test results described above according to the updated confidence threshold: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and true quantity for each category, confidence distribution for each category. Therefore, the prediction effect of the target model based on the updated confidence coefficient threshold value can be displayed for the user in real time, so that whether the updated confidence coefficient threshold value is selected to be used or whether the confidence coefficient threshold value needs to be continuously adjusted or not can be determined more intuitively and conveniently, and the most suitable confidence coefficient threshold value can be found quickly.
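Because the offline test results already store the output category and confidence of each image, the indices can be refreshed under an updated threshold without running the model again. The sketch below assumes that an output whose confidence falls below its category threshold is relabeled with a placeholder category `"undetermined"`; how such images are actually handled is not specified in this passage, so that choice, like the function and field names, is an assumption.

```python
def apply_thresholds(saved_results, thresholds, fallback="undetermined"):
    """Re-derive final outputs from saved (category, confidence) records.
    Outputs below their category threshold become the fallback (assumed)."""
    final = []
    for record in saved_results:
        limit = thresholds.get(record["category"], 0.0)
        final.append(record["category"] if record["confidence"] >= limit else fallback)
    return final

def accuracy_for(saved_results, final_outputs, category):
    """Fraction of images finally output as `category` that are labeled with it."""
    kept = [r for r, out in zip(saved_results, final_outputs) if out == category]
    if not kept:
        return None
    return sum(r["label"] == category for r in kept) / len(kept)

saved = [
    {"category": "dust", "confidence": 0.9, "label": "dust"},
    {"category": "dust", "confidence": 0.6, "label": "residue"},
    {"category": "residue", "confidence": 0.8, "label": "residue"},
]
before = apply_thresholds(saved, {"dust": 0.5})
after = apply_thresholds(saved, {"dust": 0.7})
accuracy_before = accuracy_for(saved, before, "dust")
accuracy_after = accuracy_for(saved, after, "dust")
```

In this small example, raising the "dust" threshold from 0.5 to 0.7 filters out the one mislabeled output, which is the kind of immediate feedback the paragraph above describes.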
In some embodiments, the method 1700 of determining a target model described with reference to FIG. 17 may also include the online testing process 1900 shown in FIG. 19. As shown in FIG. 19, the online testing process 1900 may include the following steps.
At step 1910, an online test dataset is acquired, the online test dataset comprising a plurality of images that have not been labeled with a category. Likewise, the images in the online test dataset may be images of the aforementioned target object, which may be a target product such as a display screen, a display panel, or the like, and which may exhibit a category of display screen defect such as the presence of residue, the presence of dust, circuit lines that are too fine or too coarse, or the like. The images in the online test dataset may be images acquired at various production stages on an actual production line. Testing with the online test dataset therefore helps prevent the situation in which the target model performs well in earlier tests but performs poorly after formally going online, and accordingly provides a guarantee for the stable online operation of the target model.
In some embodiments, the online test dataset may be obtained in the following manner. First, a communication connection may be established with an image acquisition device configured to acquire an image to be inspected. The image to be inspected from the image acquisition device may then be received via a communication connection. In this manner, an online test dataset may be acquired based on the received images to be inspected. By way of example, the image acquisition device may be a camera or other device that may be controlled by a control device of the production system, for example, when it grabs an image, where the image is stored, etc. The communication connection with the image acquisition device may be established via a communication connection with such a production system, for example, such a communication connection may be established via a gateway or other structure. Subsequently, the target object (e.g., product name, model, etc. of the target product) to which the target model corresponds may be notified via such a communication connection, and the production system may provide image data of the target object applicable to the target model, e.g., a storage address or the like of the related image data may be notified via the communication connection so as to acquire the related image data, which may be used as the aforementioned image to be inspected, based on the address.
Or in some embodiments, the online test dataset may be obtained by: an online test dataset may be acquired based on images received by a relevant online model, wherein the relevant online model is configured to receive images to be inspected from the image acquisition device and predict a category of the images to be inspected based on the received images to be inspected. For example, as described in the preceding paragraph, the relevant online model may establish a communication connection with the production system via a structure such as a gateway, and acquire the image to be inspected of the target object acquired by the image acquisition device via the communication connection. And then, after the image to be detected is predicted, the prediction result can be fed back to the production system again through the communication connection for subsequent storage and analysis processing. For example, a structure such as a gateway may sort the predictions output by the relevant online model and send to the production system. For example, in embodiments where the target model is used to replace a related online model, the online test dataset may be obtained based on image data at the related online model.
At step 1920, the target model is tested with the online test dataset, and an online test index is generated, the online test index comprising at least one of: accuracy, recall, confusion matrix, distribution of model output number and manual review number for each category, confidence distribution for each category. For example, the target model may be tested using the online test dataset, and an online test result may be derived from the output of the target model. The online test result may include the categories output by the target model and the confidences of the corresponding categories. For example, images in the online test dataset may be sequentially input to the target model, and the categories and corresponding confidences output by the target model may be saved. Subsequently, a manual review result for the online test dataset may be received. For example, each picture in the online test dataset may be presented to a designated person, who may determine the category of each picture based on experience, e.g., the defect category of the target product shown in it, and the picture category entered by the designated person may be saved as the manual review result of the corresponding picture. The online test index can then be generated based on the online test result and the manual review result. The user can evaluate the online prediction performance of the target model based on the online test index, to further ensure that the target model achieves the expected effect after going online.
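Since the online test dataset is unlabeled, the manual review result takes the place of the labels when the index is computed. The following sketch shows one possible way to compute an overall accuracy and the per-category distribution of model output number and manual review number; the function name, dictionary layout, and sample data are assumptions.

```python
from collections import Counter

def online_test_indices(model_outputs, review_results):
    """Both arguments map image id -> category. Only images present in
    both mappings are compared."""
    ids = sorted(set(model_outputs) & set(review_results))
    agree = sum(model_outputs[i] == review_results[i] for i in ids)
    return {
        "accuracy": agree / len(ids) if ids else 0.0,
        "model_output_count": dict(Counter(model_outputs[i] for i in ids)),
        "manual_review_count": dict(Counter(review_results[i] for i in ids)),
    }

model_outputs = {"1": "dust", "2": "dust", "3": "residue", "4": "residue"}
review_results = {"1": "dust", "2": "residue", "3": "residue", "4": "residue"}
online_indices = online_test_indices(model_outputs, review_results)
```

Restricting the comparison to images that appear in both mappings keeps the index well defined when review of some images is still pending.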
In step 1930, whether to bring the target model online may be determined based on whether the online test index meets an online standard. Illustratively, the target model may be brought online in response to the online test index meeting a preset criterion; or the target model may be brought online in response to at least some of the online test indices being higher than the corresponding test indices of the relevant online model, wherein the corresponding test indices are derived from the output results of the relevant online model for the online test dataset. By way of example, the relevant online model may be an online model whose detected product model, categories (such as product defect categories), and the like are the same as or similar to those of the target model. An online model may be understood as a model currently being used to predict the category of an image to be inspected; for example, in a product defect detection scenario, the online model may be a model deployed on a production line to detect images of a target product and predict the product defect category. Illustratively, the corresponding test indices of the relevant online model may be obtained as follows: the output results of the relevant online model for the pictures in the online test dataset may be obtained, and the corresponding test indices may be generated based on these output results and the manual review result. Similarly, the corresponding test indices may include at least one of: accuracy, recall, confusion matrix, distribution of model output number and manual review number for each category, confidence distribution for each category.
In some embodiments, for a brand-new model to be brought online, whether to bring it online may be determined according to whether at least some of the online test indices reach a preset threshold, and for an iterative model to be brought online, whether to bring it online may be determined according to whether at least some of the online test indices are higher than the corresponding test indices of the relevant online model. Here, a brand-new model may be understood as a model for which no relevant online model currently exists, and an iterative model may be understood as an updated iteration of a relevant model that is currently online. The brand-new model and the iterative model apply to different situations. For example, in the application scenario of product defect detection, defect detection for a brand-new product usually requires developing a brand-new model, although in some cases, if similar products existed before, the brand-new product may also be detected by iteratively updating the relevant model for the similar products. For defect detection of existing products, when there is a large change in the production process or in the camera arrangement, it is often necessary to develop a brand-new model to detect the product under the new process or to process the image data acquired under the new camera arrangement; when there is only a small change in the production line or in the camera arrangement, these changes can be accommodated by iteratively updating the current online model.
In addition, as mentioned in the foregoing embodiments, the manner of acquiring the online test dataset may differ between the brand-new model and the iterative model. For the brand-new model, it is generally necessary to first establish a new communication connection with the image acquisition device and receive the related image data from the image acquisition device. For the iterative model, the online test data may instead be obtained directly based on the image data at the relevant online model; that is, part or all of the image data at the relevant online model may be split or duplicated into two copies, one used by the relevant online model for conventional prediction and the other used as online test data for the iterative model. By providing different online test data acquisition methods and different online standards, the different requirements of brand-new models and iterative models can be met, and the flexibility of the test flow is improved.
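The two online standards described above can be expressed as a single check, sketched below. The function name, the choice of which indices to compare (the `keys` parameter), and the example numbers are assumptions; the disclosure only states that at least some of the indices are compared.

```python
def meets_online_standard(candidate, preset=None, baseline=None,
                          keys=("accuracy", "recall")):
    """Decide whether a candidate model may go online.
    - Iterative model: every selected index must be higher than the
      corresponding index of the relevant online model (baseline).
    - Brand-new model: every selected index must reach its preset threshold."""
    if baseline is not None:
        return all(candidate[k] > baseline[k] for k in keys)
    if preset is not None:
        return all(candidate[k] >= preset[k] for k in keys)
    raise ValueError("either preset thresholds or baseline indices are required")

candidate = {"accuracy": 0.93, "recall": 0.90}

# Brand-new model: compare against preset thresholds (hypothetical values).
new_ok = meets_online_standard(candidate, preset={"accuracy": 0.90, "recall": 0.90})

# Iterative model: compare against the relevant online model (hypothetical values).
online_model = {"accuracy": 0.95, "recall": 0.85}
iter_ok = meets_online_standard(candidate, baseline=online_model)
iter_recall_only = meets_online_standard(candidate, baseline=online_model,
                                         keys=("recall",))
```

Restricting `keys` to a subset models the "at least some of the online test indices" wording: a candidate with better recall but lower accuracy passes only if recall alone is the agreed criterion.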
In some embodiments, the online testing process 1900 may further include one of the following: updating the target model by retraining or adjusting the confidence threshold in response to the online test index not meeting the preset criteria; and updating the target model by retraining or adjusting the confidence threshold in response to the online test index not being higher than the corresponding test index of the associated online model. For example, as mentioned previously, for a completely new model to be online, when the online test index does not meet the preset criteria, e.g., if part or all of the index does not reach the preset threshold, the target model may be updated by retraining or adjusting the confidence threshold; for the iterative model to be online, when the online test index is not higher than the corresponding test index of the related online model, the target model can be updated by retraining or adjusting the confidence threshold. The manner in which the target model is updated by training or adjusting the confidence threshold has been described in the embodiments relating to step 1750 and will not be described in detail herein.
In some embodiments, the method 1700 of determining a target model described with reference to FIG. 17 may also include the auditing process 2000 shown in FIG. 20. As shown in fig. 20, the auditing process 2000 may include the following steps.
In step 2010, the user's audit result for the target model is obtained. After the target model has passed the earlier tests, it can be prepared for online deployment. To avoid misoperation, and to prevent a model from being brought online by mistake and affecting production, an audit flow can be set up before the model goes online. The target model to be brought online may be presented to the relevant user, who may confirm whether to bring the target model online. Optionally, to further ensure the reliability of the system, two or more levels of audit may be provided; for example, the target model may be audited by relevant personnel at two levels in turn before it is brought online.
In step 2020, in response to the auditing result indicating that the target model may be online, the target model is online such that the target model is configured to receive the image to be inspected from the image acquisition device and predict a category of the image to be inspected based on the received image to be inspected. After the verification is passed, the target model can be formally online, and can be used as a new online model or can be used as an update of an original related online model. For example, the online target model may receive the image to be inspected via the communication connection and feed back the prediction result via the communication connection.
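A two-level audit gate of the kind described in steps 2010 and 2020 might be sketched as follows. The representation of an audit result as a `(level, approved)` pair and the rule that any rejection blocks the go-live are assumptions of this example.

```python
def may_go_online(audit_results, required_levels=2):
    """audit_results: list of (level, approved) pairs, e.g. (1, True).
    The model may go online only if every required level has approved
    and no auditor has rejected it."""
    approved_levels = {level for level, approved in audit_results if approved}
    rejected = any(not approved for _, approved in audit_results)
    all_levels_passed = all(level in approved_levels
                            for level in range(1, required_levels + 1))
    return all_levels_passed and not rejected

both_approved = may_go_online([(1, True), (2, True)])
only_first = may_go_online([(1, True)])
second_rejected = may_go_online([(1, True), (2, False)])
```

Keeping the gate as a pure function of recorded audit results makes it straightforward to log each decision and to raise the number of required levels later.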
In some embodiments, the method 1700 of determining a target model described with reference to fig. 17 may further include the online review process 2100 shown in fig. 21. As shown in fig. 21, the online review process 2100 may include the following steps.
At step 2110, after the target model is online, online spot check data is acquired based on the images to be inspected from the image acquisition device. A portion of the images to be inspected that are provided to the target model for prediction may be extracted to form the online spot check data. The online spot check data may be packaged and stored, or sent to the relevant review personnel.
In some embodiments, the online spot check data may be obtained by at least one of the following means. In the first mode, an automatic task dispatch mode may be adopted, that is, a partial image may be randomly extracted from the image to be inspected from the image acquisition device, and on-line spot check data may be generated based on the extracted image. For example, a fixed number of images to be inspected may be randomly extracted within a preset period to generate online spot check data. The preset period may be, for example, one day, two days, one week, etc. In the second mode, a manual task extraction mode may be adopted, that is, screening conditions for the image to be inspected from the image acquisition device may be received, the image to be inspected from the image acquisition device may be screened based on the screening conditions, and online spot check data may be generated based on the screened image. For example, the screening conditions may be manually set by a person concerned, and the screening conditions may be, for example, time, product model number, product defect type, number, and the like. The images to be detected can be screened according to the set screening conditions, and spot check data can be generated based on the screened images to be detected. Optionally, spot check data may be generated directly based on the screened images to be detected, or the spot check data may be generated by randomly extracting images with a preset proportion from the screened images to be detected. The two modes can be respectively used for meeting the requirements of daily monitoring and spot check of specific conditions aiming at the target model.
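The two extraction modes above can be sketched as follows. The image records are represented as dictionaries with hypothetical metadata fields (`id`, `product`, `category`), and the function names are likewise assumptions made for illustration.

```python
import random

def auto_dispatch(images, count, seed=None):
    """Automatic mode: randomly extract a fixed number of images
    (fewer if not enough images are available)."""
    rng = random.Random(seed)
    return rng.sample(images, min(count, len(images)))

def manual_extract(images, conditions, proportion=1.0, seed=None):
    """Manual mode: keep images matching every screening condition, then
    optionally draw a random subset at a preset proportion."""
    selected = [img for img in images
                if all(img.get(key) == value for key, value in conditions.items())]
    if proportion >= 1.0:
        return selected
    rng = random.Random(seed)
    return rng.sample(selected, round(len(selected) * proportion))

# Hypothetical image records collected during one preset period.
images = [{"id": i, "product": "A" if i % 2 == 0 else "B", "category": "dust"}
          for i in range(20)]

daily_batch = auto_dispatch(images, 5, seed=1)
product_a = manual_extract(images, {"product": "A"})
product_a_half = manual_extract(images, {"product": "A"}, proportion=0.5, seed=1)
```

The automatic mode would typically be invoked once per preset period by a scheduler, while the manual mode would be triggered by a reviewer who enters the screening conditions.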
In step 2120, a manual review result for the online spot check data is received, where the manual review result includes a category of the image to be inspected obtained by manual review. Each picture in the online spot check data can be presented to related check personnel, check input aiming at each picture is received, and the check input can comprise the category of the picture and can be stored as a manual check result.
In step 2130, generating an online spot check index based on the manual review result and the category predicted by the target model, the online spot check index including at least one of: accuracy, recall, confusion matrix, distribution of model output number and manual review number for each category, confidence distribution for each category. The user can monitor the online prediction performance of the target model based on the online spot check index so as to monitor the health degree of the target model from the data angle and make adjustment on the target model in time.
In some embodiments, the entire model training and evaluation process may be implemented according to the flow 2200 shown in FIG. 22. As shown in fig. 22, the deep learning model may first be trained based on a training data set, and at least two trained models are obtained according to the training parameters. Next, the at least two trained models may be tested based on the validation data set, and a target model is determined among the at least two trained models based on the test results. The target model may then be tested offline based on the offline test data set, and the confidence threshold of at least one category of the target model is adjusted based on the offline test result. Next, the target model may be tested online based on the online test data set, so as to test the predictive performance of the target model on production line data. Here, the test for a brand-new model and the test for an updated iterative model may be distinguished: as described above, for a brand-new model, a communication connection may be established with the production system and the online test data set may be obtained via the communication connection, whereas for an updated iterative model, the online test data set may be obtained directly based on the data at the relevant online model. After passing the online test, the target model may be audited and brought online, and, as described above, a two-level audit may be adopted. Finally, after the target model is online, online review may be carried out periodically or on demand based on extracted production line data, so as to monitor the online performance of the target model. The above processes have been described in detail in the foregoing embodiments with reference to fig. 17 to 21 and are not repeated here.
The flow 2200 shown in fig. 22 provides users with a complete and detailed model testing and go-live scheme in addition to the training function. On the one hand, this addresses the problem that a model can no longer be adapted after the production process, the product model, or the like changes. On the other hand, it provides a complete model evaluation mechanism that covers five model test stages, namely verification evaluation, offline test, online test, audit and go-live, and production line review, with matching evaluation indices provided at each stage, so that a user can carry out model testing and go-live simply and clearly according to the flow and evaluation indices provided by the present disclosure, without the assistance of algorithm personnel. In other words, the embodiments of the present disclosure provide a standard operating procedure for the important work of testing a model and bringing it online, which safeguards the subsequent maintenance, extensibility, and robustness of a product defect detection system or other similar systems, and avoids the risks that model replacement would otherwise introduce into such systems.
Fig. 23 illustrates a schematic flow diagram of a method 2300 of determining an image category, according to one embodiment of the disclosure. The method 2300 of determining an image category may be implemented by the terminal devices 101, 102, 103, the server 105, or a combination thereof, as shown in fig. 1, for example. As shown in fig. 23, method 2300 includes the following steps.
At step 2310, the image to be inspected is predicted using a target model to obtain a class of the image to be inspected, wherein the target model is determined from at least two trained models having different numbers of training rounds according to a verification test index, the verification test index including at least one of: confusion matrix, accuracy, recall, and F1 score. The image to be inspected here and the aforementioned sample image may be similar images, which may display various defects existing in the target object. The images to be inspected here may be, for example, images of the target object taken from the production line at one or more production flows.
In some embodiments, the target model used in step 2310 may be a target model obtained by the method 1700 described in various embodiments above. By utilizing the target model to predict the category of the image to be detected, better prediction effects can be realized, such as higher indexes of accuracy, recall rate and the like, or prediction effects which more meet the requirements of specific applications and the like.
It should be appreciated that the method 2300 of determining image categories focuses on the acquisition and use process of the target model, while the various embodiments described with reference to fig. 17-22 focus on the acquisition and evaluation process of the target model. Accordingly, the various embodiments described above with reference to fig. 17-22 are also applicable to the method 2300 of determining image categories, the specific details of which are not described in detail herein.
Fig. 24 illustrates a schematic flow diagram of a method 2400 of determining a target model according to an embodiment of the disclosure. The method 2400 of determining a target model may be implemented, for example, by the terminal devices 101, 102, 103, the server 105, or a combination thereof, as shown in fig. 1. As shown in fig. 24, the method 2400 includes the following steps.
In step 2410, in response to a user configuration operation on the sample data set, a sample data set is acquired, the sample data set comprising a training data set and a validation data set, each of the training data set and the validation data set comprising a plurality of sample images that have been labeled with a class. For example, the sample image may be an image of a target object, the target object may be a target product, and the category may be a product defect category of the target product. As mentioned above, the target product may be, for example, a display screen, a display panel, etc., and the noted category may refer to a category of display screen defect, such as the presence of residue, the presence of dust, too fine or too coarse a circuit line, etc.
As an example, a user may configure parameters of a sample data set on a graphical interface as shown in fig. 10, which may include, for example, a training type (e.g., image classification), a selected sample data set (e.g., l6lbf_main code), a version number of the sample data set (e.g., V1), and so on. Illustratively, the user may also select two or more data sets in succession in the graphical interface as shown in FIG. 10, and merge the two or more data sets as the sample data set used in step 2410 by clicking, for example, a "ok" or other button.
For example, the validation data set may be used to validate training effects of the deep learning model, e.g., a Loss curve, i.e., a curve of change in Loss function values, may be generated during the training process based on the validation data set to determine model training effects and to help determine when to stop training. In some embodiments, the number of sample images in the training data set and the validation data set may correspond to a preset ratio. For example, the ratio of training data set to validation data set may be 9:1, 8:1, 8:2, etc. Illustratively, the sample data set may include other data sets in addition to the training data set and the validation data set. Further by way of example, the sample data set may be divided into a training set and other data sets prior to dividing the training set into the foregoing training data set and verification data set during training.
For example, after the appropriate sample data set has been selected, the sample data set may be automatically divided into the training data set and the verification data set according to a predetermined ratio, or manually divided by a user into the training data set and the verification data set according to a desired ratio. As a further example, the sample data set may first be divided into a training set and other data sets. After this division is completed, the user may select a training set in a graphical interface such as that shown in fig. 10; the selected training set may then be automatically divided into a training data set and a verification data set according to a preset ratio, and the respective data amounts of the two may be presented.
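The automatic division by a preset ratio can be sketched as follows. This is only a minimal illustration under stated assumptions, not the disclosed implementation: the function name `split_samples`, the seeded shuffle, and the example sample tuples are all hypothetical.

```python
import random


def split_samples(samples, train_parts=8, val_parts=2, seed=0):
    """Shuffle labeled samples and split them train:validation by a preset ratio."""
    if train_parts <= 0 or val_parts <= 0:
        raise ValueError("ratio parts must be positive")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs (an assumption).
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_parts // (train_parts + val_parts)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical labeled samples: (image path, defect category).
samples = [(f"img_{i:03d}.png", "dust" if i % 2 else "residue") for i in range(100)]
train_set, val_set = split_samples(samples, 8, 2)
print(len(train_set), len(val_set))  # prints "80 20"
```

The two returned lists are disjoint and together cover the input, so their sizes can be presented to the user directly after the split.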
In step 2420, training parameters are configured according to the characteristic information of the sample data set, and a training parameter display interface is generated, wherein the training parameters displayed by the training parameter display interface comprise a test strategy, the test strategy comprises at least two training round numbers, and the characteristic information comprises the number of samples in the sample data set. Illustratively, the test strategy may include the number of tests and the round numbers at which testing is performed, and the training parameters may include, in addition to the test strategy, a learning rate reduction strategy, a total number of training rounds, and the like. Also illustratively, configuring the training parameters according to the characteristic information of the sample data set may include one or more of the following: configuring the learning rate reduction strategy, the total number of training rounds, and the test strategy according to the number of samples in the characteristic information; configuring, for example, the size of the input image fed to the deep learning model to be trained according to the size, kind, etc. of the pictures in the sample data set; configuring the confidence used in the training process according to the number of samples and the defect level corresponding to each kind of defect (in the above-mentioned example scenario involving product defects); and so on. For a specific embodiment, please refer to the description of step 1720, which is not repeated here.
In an example embodiment of the present disclosure, after the training parameters have been configured according to the characteristic information, a training parameter display interface may further be generated, with a parameter modification identifier provided on it. After the user triggers the parameter modification identifier, the modifiable parameters may be displayed, and the user may modify the configured training parameters on the modification interface.
In step 2430, the deep learning model is trained using the training data set according to the training parameters to obtain at least two trained models, the at least two trained models corresponding one-to-one to the at least two training rounds. By way of example, the trained model may be the image classification model mentioned in the previous embodiments, which may be used to predict a class based on an input image and derive a confidence level corresponding to the class.
In some embodiments, the deep learning model may be a Convolutional Neural Network (CNN) model, an object detection convolutional neural network (Faster R-CNN) model, a Recurrent Neural Network (RNN) model, a Generative Adversarial Network (GAN) model, or a self-attention model, but is not limited thereto, and other neural network models known to those skilled in the art may be employed.
In some embodiments, as described above, the training parameters used when training the deep learning model may be configured in advance, and may include the categories targeted by the training, the type of the deep learning model, the total number of training rounds, a learning rate reduction strategy, a test strategy, the size of the image input to the deep learning model, and the like. Furthermore, two or more training round numbers may be specified in the training parameters, so that the trained model at each specified training round number is saved as a candidate model, and a better model among these candidate models can be selected as the target model during subsequent testing. The training round numbers may, for example, be specified empirically by the person performing the model training, or determined indirectly from certain parameters, such as the decline of the learning rate. For example, assuming that the model is considered to perform better around the second learning rate decline, several training round numbers may be selected around the round at which the second learning rate decline occurs. It should be understood that the training round numbers of the models to be saved may also be specified according to other indicators, depending on actual requirements.
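As a rough illustration of choosing the training round numbers to save around a learning rate decline, the sketch below picks a small window of rounds centered on a chosen decline. The function name `rounds_to_save`, the window size, and the decline schedule are assumptions made for the example and do not come from the disclosure.

```python
def rounds_to_save(lr_drop_rounds, total_rounds, drop_index=1, window=2):
    """Pick candidate checkpoint rounds around one learning rate decline.

    lr_drop_rounds: rounds at which the learning rate declines (hypothetical schedule).
    drop_index:     which decline to center on (1 = the second decline).
    window:         how many rounds to keep on each side of that decline.
    """
    center = lr_drop_rounds[drop_index]
    candidates = range(center - window, center + window + 1)
    # Discard rounds that fall outside the training run.
    return [r for r in candidates if 1 <= r <= total_rounds]


selected = rounds_to_save(lr_drop_rounds=[30, 60, 90], total_rounds=100)
print(selected)  # prints "[58, 59, 60, 61, 62]"
```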
At step 2440, at least two trained models are tested using the validation dataset, generating validation test results. The validation test results may include the class of model output and the confidence of the corresponding class. The images in the validation dataset may be input to each trained model in turn and the categories and corresponding confidence levels of the model outputs saved.
At step 2450, based on the validation test result, a validation test index presentation interface is generated, the validation test index presentation interface being used for presenting at least one of: confusion matrix, accuracy, recall, and F1 score. The accuracy of a category (a quantity commonly also called precision) represents the ratio of the number of images correctly predicted as that category to the number of images predicted as that category; the recall represents the ratio of the number of images correctly predicted as that category to the true number of images of that category; and the F1 score may be regarded as the harmonic mean of accuracy and recall, i.e., (2 × accuracy × recall)/(accuracy + recall). Assuming that the model has n prediction categories, the confusion matrix may be an n×n matrix, in which each column represents a category predicted by the model, and the total of each column represents the number of pictures predicted as that category; each row represents the true category of the picture, i.e., the category in the label, and the total of each row represents the true number of pictures of that category. In addition, the confusion matrix may also include an unknown column, which may contain the number of pictures for which the model did not predict a category. One or more of the above-described validation test indicators may be generated based on the categories and their confidences in the validation test results, together with the categories in the picture labels of the validation data set.
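The indicators defined above can be computed from true and predicted categories as in the following sketch. It is a minimal pure-Python illustration, not the disclosed implementation; the function names, the use of `None` for "no category predicted", and the example labels are assumptions.

```python
def confusion_matrix(true_labels, predicted_labels, categories):
    """Rows: true categories. Columns: predicted categories plus 'unknown'."""
    columns = list(categories) + ["unknown"]
    matrix = {t: {p: 0 for p in columns} for t in categories}
    for true_cat, pred_cat in zip(true_labels, predicted_labels):
        # A prediction outside the known categories (e.g. None) counts as unknown.
        column = pred_cat if pred_cat in categories else "unknown"
        matrix[true_cat][column] += 1
    return matrix


def category_metrics(matrix, category):
    """Accuracy (i.e. precision), recall and F1 score for one category."""
    correct = matrix[category][category]
    predicted_total = sum(row[category] for row in matrix.values())  # column sum
    true_total = sum(matrix[category].values())                      # row sum
    accuracy = correct / predicted_total if predicted_total else 0.0
    recall = correct / true_total if true_total else 0.0
    denom = accuracy + recall
    f1 = 2 * accuracy * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


categories = ["dust", "scratch", "residue"]
true_labels = ["dust", "dust", "dust", "scratch", "scratch", "residue"]
predicted = ["dust", "dust", "scratch", "scratch", None, "residue"]
matrix = confusion_matrix(true_labels, predicted, categories)
scratch = category_metrics(matrix, "scratch")
print(scratch)  # prints "{'accuracy': 0.5, 'recall': 0.5, 'f1': 0.5}"
```

Because the unknown column belongs to the row total but to no category's column total, an unpredicted picture lowers recall for its true category without affecting any category's accuracy.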
As an example, the user may view the accuracy, recall, and/or F1 score of each trained model on graphical interface 2500A as shown in fig. 25A. Alternatively, the accuracy, recall, F1 score, etc. of the model may be presented more intuitively through pie charts, ring charts, etc. For example, the user may click on a corresponding identification (e.g., UID in the interface) associated with the trained model to view the accuracy, recall, F1 score, etc. of the graphical presentation of the corresponding model.
As an example, a user may view the confusion matrix for one or more trained models on graphical interface 2500B as shown in fig. 25B. Optionally, in addition to the confusion matrix, the graphical interface 2500B may also present other index parameters such as accuracy, recall, etc. of the corresponding model.
In step 2460, the model selected by the user is determined as the target model in response to a user selection operation for at least two trained models. For example, the verification test indexes can be presented to the user, and the user can select one model from at least two trained models as a target model according to actual application requirements by referring to the verification test indexes. Furthermore, it is also possible, for example, to automatically select one of the at least two trained models as the target model based on the verification test indicators according to a preset selection mechanism.
As will be appreciated by those skilled in the art, accuracy indicators are very important in the training of deep learning models. However, the applicants have found that for a particular application scenario, it is often necessary to consider the performance of the model from several dimensions, rather than considering only a single index such as accuracy. For example, in an application scenario of screen defect detection, different categories of defects may have different priorities. A scratch on a screen may have a much higher priority than a stain, because a stain can be rinsed off, whereas a scratched screen must be scrapped. Therefore, generating multi-dimensional verification test indexes based on the verification data set makes fuller use of the verification data set, displays the training effect of the model more comprehensively, assists more effectively in selecting the target model, and improves the match between the selected target model and the requirements of the specific application scenario.
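One possible preset selection mechanism that takes category priority into account is a priority-weighted score, sketched below. The disclosure does not specify such a mechanism; the weighting scheme, the function name `select_target_model`, and the per-category recall values are purely illustrative assumptions.

```python
def select_target_model(model_metrics, category_weights):
    """Return the model id with the highest priority-weighted metric.

    model_metrics:    {model_id: {category: metric value, e.g. recall}}
    category_weights: {category: priority weight} (hypothetical values)
    """
    total_weight = sum(category_weights.values())

    def score(metrics):
        weighted = sum(category_weights[c] * metrics[c] for c in category_weights)
        return weighted / total_weight

    return max(model_metrics, key=lambda model_id: score(model_metrics[model_id]))


# Hypothetical per-category recall of two candidate models saved at different rounds.
model_metrics = {
    "round_58": {"scratch": 0.90, "stain": 0.95},
    "round_60": {"scratch": 0.96, "stain": 0.88},
}
# Scratches are weighted higher because a scratched screen must be scrapped.
prioritized = select_target_model(model_metrics, {"scratch": 3.0, "stain": 1.0})
unweighted = select_target_model(model_metrics, {"scratch": 1.0, "stain": 1.0})
print(prioritized, unweighted)  # prints "round_60 round_58"
```

The example shows why a single averaged index can be misleading: with equal weights the first model wins, while prioritizing scratches favors the second.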
As an example, an interface may be provided for selecting a target model among at least two trained models, such that a user may select it as the target model by clicking on the corresponding model.
In the method 2400 of determining a target model, a labeled sample dataset is partitioned to obtain a training dataset and a validation dataset, after two or more trained models are trained using the training dataset, the two or more trained models are tested using the validation dataset, and a validation test index presentation interface for presenting a multi-dimensional validation test index is generated and presented, thereby allowing a target model to be selected from the two or more trained models according to the validation test index. Therefore, through the display of the multi-dimensional verification test indexes, compared with the scheme in the related art, more full utilization of the verification data set can be realized, and meanwhile, the training effect of each model can be evaluated more comprehensively and displayed to the user more intuitively, so that the target model which is more suitable for the requirements of specific application scenes can be better selected in an auxiliary mode. Therefore, the prediction performance and the reliability of the prediction function of the finally obtained target model are improved, and the classification performance of the whole classification system is improved. For example, when the method is applied to the field of product defect detection, a more suitable target model can be selected more flexibly according to specific requirements, so that the detection effect of a product defect detection system is effectively improved.
In some embodiments, the method 2400 of determining a target model shown in fig. 24 may further include: generating an offline test parameter configuration interface in response to an offline test task creation operation performed by a user for the target model; creating an offline test task according to the configuration input of the user on the offline test parameter configuration interface; acquiring an offline test data set according to the configuration input, wherein the offline test data set comprises a plurality of sample images labeled with categories; and testing the target model using the offline test data set to generate an offline test result.
For example, buttons for creating offline test tasks may be provided on a graphical interface, or corresponding physical buttons or keys may also be designed. When the user clicks on the button, an offline test parameter configuration interface may be presented. As an example, an offline test task creation interface 2600 as shown in fig. 26 may be presented. In this interface 2600, a user can configure offline test parameters, such as selecting a task site, a product model, a dataset for offline testing, an identification of a model for which offline testing is intended, and so forth. After the user clicks the submit task button, an offline test task may be created according to the user configuration, and a corresponding offline test dataset may be used to test the target model and generate an offline test result.
In some embodiments, the confidence threshold for at least one category may be updated as follows: determining a recommended confidence threshold for each category according to the accuracy or the recall, based on the verification test result or the offline test result; updating the confidence threshold for the at least one category based on the recommended confidence threshold; and generating a confidence threshold display interface in response to a viewing operation of the user on the confidence threshold. As an example, the user may view the recommended confidence thresholds or current confidence thresholds for the respective trained models in a graphical interface 2700 as shown in fig. 27. As shown, confidence thresholds for the various categories may be presented in a tabular manner. Alternatively, the confidence thresholds may be presented in other ways.
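The disclosure does not state how the recommended confidence threshold is derived from the accuracy or recall. One plausible rule, shown here purely as an assumption, is to recommend the lowest threshold on a fixed grid at which the accuracy of a category reaches a target value. The function name, the target of 0.95, and the test records are hypothetical.

```python
def recommend_threshold(records, category, target_accuracy=0.95, step=0.05):
    """Lowest grid threshold at which the category's accuracy meets the target.

    records: (true category, predicted category, confidence) per picture.
    Returns None if no threshold on the grid reaches the target.
    """
    for i in range(int(round(1 / step))):
        threshold = round(i * step, 2)
        # Keep only predictions of this category that pass the threshold.
        kept = [t for t, p, c in records if p == category and c >= threshold]
        if kept and sum(t == category for t in kept) / len(kept) >= target_accuracy:
            return threshold
    return None


records = [
    ("dust", "dust", 0.95),
    ("dust", "dust", 0.85),
    ("scratch", "dust", 0.62),   # wrong prediction with medium confidence
    ("dust", "dust", 0.70),
    ("residue", "dust", 0.40),   # wrong prediction with low confidence
]
recommended = recommend_threshold(records, "dust")
print(recommended)  # prints "0.65"
```

Choosing the lowest qualifying threshold keeps recall as high as possible while still meeting the accuracy target; a recall-driven rule would scan in the opposite direction.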
In some embodiments, the confidence threshold for at least one category may be updated according to the following: generating a curve display interface based on the verification test result or the offline test result, wherein the curve display interface is used for displaying an accuracy rate curve and a recall rate curve of each category, the accuracy rate curve reflects the relationship between the accuracy rate and the confidence threshold, and the recall rate curve reflects the relationship between the recall rate and the confidence threshold; the confidence threshold for the at least one category is updated in response to a user modifying the confidence threshold for the at least one category.
As an example, the user may view the accuracy curves and recall curves for each category in a graphical interface 2800 as shown in fig. 28. The accuracy curve and the recall curve may be displayed in the same graph, or in two separate graphs. For example, for each category, 20 different confidence thresholds may be set at intervals of 0.05, and the accuracy and recall of the category may be calculated at each threshold from the categories in the real labels of the pictures in the offline test data set and the categories output by the target model, thereby drawing the accuracy curve and recall curve shown in fig. 28 (plotted there as line graphs). Thus, by observing the accuracy and recall curves, a user may determine appropriate confidence thresholds for each category based on business needs and experience, and modify the confidence thresholds for one or more categories accordingly. As a further example, the intersection of the recall curve and the accuracy curve is often the most appropriate confidence threshold, such as the intersection A of the accuracy curve and the recall curve of category A, and the intersection B of the accuracy curve and the recall curve of category B, shown in fig. 28. The user can observe the accuracy curves and recall curves of different categories and make the final adjustment to the confidence thresholds in light of the business background.
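The curve computation described above, with 20 thresholds at intervals of 0.05, can be sketched as follows. The function names, the record format, and the rule of taking the grid point where the two curves are closest as an approximation of their intersection are assumptions made for illustration.

```python
def accuracy_recall_curves(records, category, step=0.05, count=20):
    """Return [(threshold, accuracy, recall)] for one category.

    records: (true category, predicted category, confidence) per picture.
    """
    true_total = sum(1 for t, _, _ in records if t == category)
    curve = []
    for i in range(count):
        threshold = round(i * step, 2)
        kept = [t for t, p, c in records if p == category and c >= threshold]
        correct = sum(1 for t in kept if t == category)
        # Accuracy is undefined when no prediction passes the threshold.
        accuracy = correct / len(kept) if kept else None
        recall = correct / true_total if true_total else 0.0
        curve.append((threshold, accuracy, recall))
    return curve


def crossing_threshold(curve):
    """Grid threshold at which the accuracy and recall curves are closest."""
    defined = [point for point in curve if point[1] is not None]
    return min(defined, key=lambda point: abs(point[1] - point[2]))[0]


records = [
    ("A", "A", 0.95), ("A", "A", 0.90), ("A", "A", 0.80), ("B", "A", 0.75),
    ("A", "A", 0.60), ("B", "A", 0.55), ("B", "A", 0.30), ("A", "B", 0.85),
]
curve = accuracy_recall_curves(records, "A")
print(crossing_threshold(curve))  # prints "0.6"
```

In this example accuracy and recall are both 0.8 at a threshold of 0.6; below it accuracy drops, and above it recall drops, which is the trade-off the user inspects on the curves.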
As an example, the user may modify the confidence thresholds for the various categories in the graphical interface 2900 as shown in fig. 29. For example, the user may be provided with a variety of ways to modify the confidence threshold, such as direct input, modification by increasing or decreasing buttons, and so forth.
In some embodiments, the confidence threshold for at least one category may also be updated according to the following: based on the offline test result, generating an offline test index display interface, wherein the offline test index display interface is used for displaying at least one of the following: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and real quantity for each category, confidence distribution for each category; the confidence threshold for the at least one category is updated in response to a user modifying the confidence threshold for the at least one category.
As an example, the user can view the distribution of the model output number and the real number for each category in the graphical interface 3000 as shown in fig. 30. As shown in fig. 30, the model output number and the real number of each category may be presented in the form of a histogram, wherein, for a given category, the model output number refers to the number of pictures that the model predicts as that category, and the real number refers to the number of pictures labeled as that category. Optionally, an unknown class may also be presented, i.e., the number of pictures that the target model does not assign to any category. Through such a histogram, the user can intuitively observe the degree of difference between the model output number and the real number for each category.
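The data behind such a histogram is a pair of counts per category. A minimal sketch follows; the function name and the convention of using `None` for pictures to which the model assigns no category are assumptions.

```python
from collections import Counter


def output_vs_true_counts(true_labels, predicted_labels):
    """Return {category: (model output number, real number)}."""
    true_counts = Counter(true_labels)
    # Pictures with no predicted category are gathered under 'unknown'.
    output_counts = Counter("unknown" if p is None else p for p in predicted_labels)
    categories = sorted(set(true_counts) | set(output_counts))
    return {c: (output_counts[c], true_counts[c]) for c in categories}


true_labels = ["A", "A", "A", "B", "B"]
predicted = ["A", "B", "A", "B", None]
counts = output_vs_true_counts(true_labels, predicted)
print(counts)  # prints "{'A': (2, 3), 'B': (2, 2), 'unknown': (1, 0)}"
```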
As an example, a user may view confidence distributions for various categories in a graphical interface 3100 as shown in fig. 31. As shown in fig. 31, the confidence distributions for the respective categories may be presented in the form of a scatter plot, where each dot may represent a confidence score for the respective category that the target model outputs for a certain picture. Through such a scatter diagram, the user can intuitively observe the confidence distribution situation of each category, and can effectively assist in setting a proper confidence threshold.
As an example, a user may view accuracy, recall, confusion matrix, etc. in graphical interface 3200 as shown in fig. 32.
Similarly, as an example, the user may modify confidence thresholds for the various categories in graphical interface 2900 as shown in fig. 29.
In some embodiments, an error result presentation interface may also be provided for facilitating user viewing of the picture data for prediction errors. That is, an error result display interface may be generated based on the offline test result, where the error result display interface is configured to display data in which the model output class is inconsistent with the real class.
As an example, a user may view mispredicted data in a graphical interface 3300 as shown in fig. 33. Illustratively, as shown in fig. 33, the amount of mispredicted data in each category may be presented on the left side of the interface, e.g., the amount of data that is labeled as category A but predicted as other categories. When the user selects a category to view, the specific mispredicted pictures under that category can be presented on the right side of the interface. By viewing the mispredicted pictures, the user can learn which categories are more likely to be misjudged, the possible reasons for the misjudgment, and so on, which provides more data support for fine adjustment of the confidence threshold and further improves the prediction performance of the target model.
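The data for such an error result display can be obtained by grouping mispredicted pictures by their labeled category, as in this sketch. The function name and the record format are hypothetical.

```python
from collections import defaultdict


def group_errors(results):
    """Group mispredicted pictures by their labeled (true) category.

    results: (image id, true category, model output category) per picture.
    """
    errors = defaultdict(list)
    for image_id, true_cat, output_cat in results:
        if output_cat != true_cat:
            errors[true_cat].append((image_id, output_cat))
    return dict(errors)


results = [
    ("img1", "A", "A"), ("img2", "A", "B"), ("img3", "A", "C"),
    ("img4", "B", "A"), ("img5", "B", "B"),
]
errors = group_errors(results)
# Left side of the interface: error count per category.
print({category: len(items) for category, items in errors.items()})  # prints "{'A': 2, 'B': 1}"
```

The per-category lists in `errors` would back the right-hand picture view once the user selects a category.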
In some embodiments, to demonstrate in real-time the impact of updating the confidence threshold on the model predictive effect, to assist the user in deciding whether to employ the updated confidence threshold, or whether to further modify the confidence threshold, a new indicator presentation interface may be generated based on offline test results after updating the confidence threshold to present at least one of: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and true quantity for each category, confidence distribution for each category.
By way of example, one or more of the various indicators described above may be presented in a graphical interface similar to that shown in fig. 30-32.
In some embodiments, the offline test parameter configuration interface may include a data set selection option. In such an embodiment, the offline test data set may be acquired based on the sample data set corresponding to a selection operation performed by the user through the data set selection option.
Illustratively, existing datasets may be selected in a graphical interface 2700 as shown in fig. 27 by selecting a dataset name, dataset version, and the like. For example, a subset of the sample data sets used in the target model training process may be selected for use to form the offline test data set.
In some embodiments, the offline test parameter configuration interface may include a data set upload option. In such an embodiment, an input sample data set may be received in response to an upload operation performed by the user through the data set upload option, and an offline test data set may be obtained based on the input sample data set.
Illustratively, a data set upload option may be provided in a graphical interface 2700 as shown in fig. 27, such that a user may construct an offline test data set by uploading a sample data set. Alternatively, a separate data set upload option may be provided, after which the user may select the uploaded data set in graphical interface 2700 as shown in fig. 27. For example, the input sample data set may be constructed from annotated production line image data, so as to test the generalization ability of the target model while testing its prediction effect on production line data.
In some embodiments, the method 2400 of determining a target model shown in fig. 24 may further include: generating an online test parameter configuration interface in response to an online test task creation operation performed by a user for the target model; creating an online test task according to the configuration input of the user on the online test parameter configuration interface; acquiring an online test data set based on the configuration input, wherein the online test data set comprises a plurality of images whose categories are unlabeled; testing the target model using the online test data set to generate online test metrics, wherein the online test metrics comprise at least one of the following: accuracy, recall, confusion matrix, distribution of model output quantity and manual review quantity for each category, and confidence distribution for each category; and performing one of the following two items: presenting an option for bringing the target model online in response to the online test metrics meeting a preset standard; or presenting the option for bringing the target model online in response to at least some of the online test metrics being higher than the corresponding test metrics of a relevant online model, wherein the corresponding test metrics are derived from the output results of the relevant online model for the online test data set.
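The two alternative conditions for presenting the option of bringing the target model online can be sketched as simple checks. The function names, the metric names, and the numeric values are assumptions; the disclosure does not define the preset standard.

```python
def meets_preset_standard(candidate, preset_standard):
    """True if every metric named in the standard reaches its minimum value."""
    return all(candidate[name] >= minimum for name, minimum in preset_standard.items())


def beats_online_model(candidate, online_model):
    """True if at least one shared metric is higher than the online model's."""
    shared = [name for name in candidate if name in online_model]
    return any(candidate[name] > online_model[name] for name in shared)


# Hypothetical metrics measured on the same online test data set.
candidate = {"accuracy": 0.93, "recall": 0.90}
standard = {"accuracy": 0.90, "recall": 0.85}
current_online = {"accuracy": 0.95, "recall": 0.88}

show_for_new_model = meets_preset_standard(candidate, standard)
show_for_update = beats_online_model(candidate, current_online)
print(show_for_new_model, show_for_update)  # prints "True True"
```

The second check deliberately uses "any" rather than "all", mirroring the wording "at least some of the online test metrics"; a stricter deployment could require all metrics to improve.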
By way of example, one or more of the online test metrics described above may be presented in a graphical interface similar to that shown in fig. 30-32.
As an example, the user may be presented with graphical interfaces 3400A, 3400B for creating online test tasks as shown in fig. 34A, 34B. The user may enter various configurations on the illustrated interface, such as task name, test type, task site, product model number, product code, amount of data used for testing, and the like. The amount of data used for the test may be selected, for example, by LOT_ID, which may represent a batch of data, and GLS_COUNT, which may represent data captured for a glass panel.
In some embodiments, similar to that shown in FIG. 34A, the online test parameter configuration interface may include a new model online option. In such an embodiment, a communication connection may be established with an image acquisition device in response to a user selection of a new model on-line option, wherein the image acquisition device is configured to acquire an image to be inspected; receiving an image to be inspected from an image acquisition device via a communication connection; based on the received image to be inspected, an online test dataset is acquired.
In some embodiments, similar to that shown in FIG. 34B, the online test parameter configuration interface may include model update options. In such an embodiment, the online test dataset may be acquired based on the images received by the relevant online model in response to a user selection of the model update option, wherein the relevant online model is configured to receive the images to be inspected from the image acquisition device and to predict the category of the images to be inspected based on the received images to be inspected.
For example, different configuration options may be presented for the model online test and the model update test. For example, as shown in fig. 34A, for the model online test, a main code model (mainCode model) and a sub code model (subCode model) to be tested need to be selected; as shown in fig. 34B, for the model update test, the main code model (mainCode model) and the sub code model (subCode model) may be selectively tested; etc. In the above description, mainCode models may be directed to defect forms determined in terms of pattern types, and subCode models may be directed to defect forms determined in terms of repair processes. In addition, other types of model configuration options may be set according to specific application requirements.
How to determine whether to bring the target model online has been described with reference to fig. 19 in the previous embodiment, and will not be repeated here.
In some embodiments, the method 2400 of determining a target model shown in fig. 24 may further include: generating an online audit interface in response to an operation of the user selecting to bring the target model online; and, in response to a confirmation operation of the user for the online audit, bringing the target model online, such that the target model is configured to receive the image to be inspected from the image acquisition device and predict the category of the image to be inspected based on the received image.
As an example, the user may be presented with an interface 3500 as shown in fig. 35, and the user may select information such as the model number to be brought online, the corresponding product code, and the corresponding site to initiate an online audit request. The associated auditor may then view the online audit interface associated with the model and approve or reject the request. For example, there may be a two-level audit: after the first-level auditor approves the request, the second-level auditor audits it, and the model is formally brought online only after both levels of audit have passed.
As an example, a user may view information for all models that are online in a graphical interface 3600 as shown in fig. 36, and may operate on one or more of the models as desired, such as replacing the model, viewing a confidence threshold, modifying a confidence threshold, rolling back to a previous version, taking the model offline, synchronizing data, and so forth.
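The two-level audit flow can be modeled as a small state machine, as in the sketch below. The state names and the `review` function are hypothetical and only illustrate the order of approvals described above.

```python
from enum import Enum


class AuditStatus(Enum):
    PENDING_FIRST = "pending first-level audit"
    PENDING_SECOND = "pending second-level audit"
    ONLINE = "online"
    REJECTED = "rejected"


def review(status, approved):
    """Advance an online audit request by one auditor decision."""
    if status not in (AuditStatus.PENDING_FIRST, AuditStatus.PENDING_SECOND):
        raise ValueError(f"request already closed: {status.value}")
    if not approved:
        return AuditStatus.REJECTED
    if status is AuditStatus.PENDING_FIRST:
        return AuditStatus.PENDING_SECOND
    # Second-level approval: the model is formally brought online.
    return AuditStatus.ONLINE


status = AuditStatus.PENDING_FIRST
status = review(status, approved=True)   # first-level auditor approves
status = review(status, approved=True)   # second-level auditor approves
print(status.value)  # prints "online"
```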
In some embodiments, the method 2400 of determining a target model shown in fig. 24 may further include: generating a monitoring task parameter configuration interface in response to a monitoring task creation operation performed by the user for the online target model; creating a monitoring task according to the configuration input of the user on the monitoring task parameter configuration interface; acquiring online spot check data, according to the configuration input, based on the images to be inspected from the image acquisition device; receiving a manual recheck result for the online spot check data, wherein the manual recheck result comprises the categories of the images to be inspected obtained by manual recheck; and generating online spot check indexes based on the manual recheck result and the categories predicted by the target model, wherein the online spot check indexes comprise at least one of the following: accuracy, recall, confusion matrix, distribution of model output number and manual recheck number for each category, and confidence distribution for each category.
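A minimal sketch of the spot check follows: a random sample of inspected images is drawn, and the categories predicted by the target model are compared against the manual recheck result. The sampling strategy, the function names, and the image identifiers are assumptions made for illustration.

```python
import random


def draw_spot_check(image_ids, sample_size, seed=0):
    """Randomly pick images for manual recheck (sampling rule is an assumption)."""
    ids = list(image_ids)
    return random.Random(seed).sample(ids, min(sample_size, len(ids)))


def spot_check_metrics(model_categories, recheck_categories, category):
    """Accuracy and recall of one category, using manual recheck as ground truth.

    Both arguments map image id -> category; only rechecked images are scored.
    """
    ids = [i for i in recheck_categories if i in model_categories]
    predicted = [i for i in ids if model_categories[i] == category]
    actual = [i for i in ids if recheck_categories[i] == category]
    correct = [i for i in predicted if recheck_categories[i] == category]
    accuracy = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(actual) if actual else 0.0
    return accuracy, recall


model_categories = {"g1": "A", "g2": "A", "g3": "B", "g4": "A"}
recheck_categories = {"g1": "A", "g2": "B", "g3": "B", "g4": "A"}
picked = draw_spot_check(model_categories, sample_size=3)
accuracy, recall = spot_check_metrics(model_categories, recheck_categories, "A")
print(round(accuracy, 3), recall)  # prints "0.667 1.0"
```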
By way of example, one or more of the above-described on-line spot check indicators may be presented in a graphical interface similar to that shown in fig. 30-32.
As an example, a user may view and manage all created monitoring tasks in graphical interface 3700 as shown in fig. 37. Illustratively, in graphical interface 3700 the user may perform one or more operations such as viewing monitoring tasks, entering or viewing model evaluations, performing rechecks, viewing result distributions, viewing recheck results, and deleting tasks. The user may also create a new monitoring task through a button such as "create task". As an example, the user may be presented with a monitoring task creation interface 3800 as shown in fig. 38. For example, a user may configure various parameters of the monitoring task in the interface, such as a time period, a site, a product model, a product code, a tag group, a choice of whether to obtain an administrator (OP) result, a choice of whether to obtain a model result, a choice of data for which the monitoring task is intended, a code, a task creation purpose, and so forth. The corresponding monitoring task may then be created based on the user's configuration input. The created monitoring tasks may be viewed and managed, for example, in graphical interface 3700 shown in fig. 37.
In addition to the various embodiments described with reference to the graphical interfaces in figs. 25 to 38, the method 2400 for determining a target model shown in fig. 24 may have embodiments that are the same as or similar to those described with reference to figs. 17 to 23, which will not be repeated here.
Fig. 39 illustrates an exemplary block diagram of an apparatus 3900 for determining a target model according to one embodiment of the disclosure. As shown in fig. 39, the apparatus 3900 for determining a target model includes an acquisition module 3910, a training module 3920, a testing module 3930, a generating module 3940, and a determining module 3950.
The acquisition module 3910 may be configured to acquire a sample data set including a training data set and a validation data set, each of the training data set and the validation data set including a plurality of sample images that have been labeled with a class; the training module 3920 may be configured to train the deep learning model with a training dataset, resulting in at least two trained models according to different training round numbers; the test module 3930 may be configured to test the at least two trained models with the validation dataset, generating a validation test result; the generation module 3940 may be configured to generate verification test metrics based on the verification test results, the verification test metrics including at least one of: confusion matrix, accuracy, recall, and F1 score; the determination module 3950 may be configured to determine a target model from at least two trained models based on the validation test metrics.
Fig. 40 illustrates an exemplary block diagram of an apparatus 4000 for determining an image category according to one embodiment of the disclosure. As shown in fig. 40, the apparatus 4000 for determining an image category includes a prediction module 4010.
The prediction module 4010 may be configured to predict the image to be detected using a target model to obtain a class of the image to be detected, wherein the target model is determined from at least two trained models having different training rounds according to a verification test index, the verification test index comprising at least one of: confusion matrix, accuracy, recall, and F1 score.
Fig. 41 illustrates an exemplary block diagram of an apparatus 4100 for determining a target model according to one embodiment of this disclosure. As shown in fig. 41, the apparatus 4100 for determining a target model includes an acquisition module 4110, a configuration module 4120, a training module 4130, a test module 4140, a generation module 4150, and a determination module 4160.
The acquisition module 4110 may be configured to acquire a sample data set in response to a user's configuration operation on the sample data set, the sample data set including a training data set and a validation data set, each of the training data set and the validation data set including a plurality of sample images that have been labeled with a category; the configuration module 4120 may be configured to configure training parameters according to the characteristic information of the sample data set and generate a training parameter presentation interface, wherein the training parameters presented by the training parameter presentation interface comprise a test strategy comprising at least two numbers of training rounds, and the characteristic information comprises the number of samples of the sample data set; the training module 4130 may be configured to train the deep learning model with the training data set according to the training parameters, obtaining at least two trained models that correspond one-to-one to the at least two numbers of training rounds; the test module 4140 may be configured to test the at least two trained models with the validation data set, generating validation test results; the generation module 4150 may be configured to generate, based on the validation test results, a validation test metric presentation interface for presenting at least one of: confusion matrix, accuracy, recall, and F1 score; the determination module 4160 may be configured to determine the user-selected model as the target model in response to a user's selection operation on the at least two trained models.
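The one-to-one correspondence between the training rounds in the test strategy and the trained models can be sketched as follows. The sketch assumes that the models are obtained as snapshots of a single training run, which is only one possible reading of the description; `train_with_test_strategy`, `train_one_round`, and the dictionary used as a stand-in model are hypothetical.

```python
import copy

def train_with_test_strategy(model, train_one_round, test_rounds):
    """Train up to max(test_rounds) and keep a snapshot at each listed round."""
    if len(set(test_rounds)) < 2:
        raise ValueError("the test strategy needs at least two training rounds")
    snapshots = {}
    for current_round in range(1, max(test_rounds) + 1):
        train_one_round(model)
        if current_round in test_rounds:
            # Deep copy so later rounds do not modify the stored snapshot.
            snapshots[current_round] = copy.deepcopy(model)
    return snapshots

# Stand-in "model": a dict that only counts completed training rounds.
model = {"rounds_done": 0}

def train_one_round(m):
    m["rounds_done"] += 1

snapshots = train_with_test_strategy(model, train_one_round, test_rounds=[2, 4, 5])
```

Each key of `snapshots` is a number of training rounds from the test strategy, and each value is the model as it stood after that many rounds, ready to be tested with the validation data set.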
Fig. 42 illustrates an exemplary block diagram of a system 4200 for determining image categories according to one embodiment of the present disclosure. As shown in fig. 42, the system 4200 for determining image categories includes a data management module 4210, a training and testing management module 4220, and a model management module 4230.
The data management module 4210 may be configured to store and manage sample data; the training and testing management module 4220 may be configured to perform the method 1700 for determining a target model, the method 2300 for determining an image category, or the method 2400 for determining a target model according to the various embodiments described above; the model management module 4230 may be configured to store, present, and manage the target model.
The specific details of each module in the above apparatuses and system have already been described in the method embodiments; for details not disclosed here, reference may be made to the method embodiments, and they will not be described again.
Those skilled in the art will appreciate that the various aspects of the present disclosure may be implemented as a system, method, or program product. Accordingly, various aspects of the disclosure may be embodied in the following forms, namely: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, each of which may be referred to herein as a "circuit," "module," or "system."
The present disclosure provides a computer readable storage medium having stored thereon computer readable instructions that when executed implement any of the methods described above.
The present disclosure provides a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read from the computer-readable storage medium by a processor of a computing device, and executed by the processor, cause the computing device to perform any of the methods provided in the various alternative implementations described above.
It should be noted that the computer readable medium shown in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Furthermore, the program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or a server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
It should be understood that for clarity, embodiments of the present disclosure have been described with reference to different functional units. However, it will be apparent that the functionality of each functional unit may be implemented in a single unit, in a plurality of units or as part of other functional units without departing from the present disclosure. For example, functionality illustrated to be performed by a single unit may be performed by multiple different units. Thus, references to specific functional units are only to be seen as references to suitable units for providing the described functionality rather than indicative of a strict logical or physical structure or organization. Thus, the present disclosure may be implemented in a single unit or may be physically and functionally distributed between different units and circuits.
It will be understood that, although the terms first, second, third, etc. may be used herein to describe various devices, elements, components or sections, these devices, elements, components or sections should not be limited by these terms. These terms are only used to distinguish one device, element, component, or section from another device, element, component, or section.
Although the present disclosure has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present disclosure is limited only by the appended claims. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. The order of features in the claims does not imply any specific order in which the features must be worked. Furthermore, in the claims, the word "comprising" does not exclude other elements, and the term "a" or "an" does not exclude a plurality. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.

Claims (36)

1.一种确定目标模型的方法,包括:1. A method for determining a target model, comprising: 获取样本数据集,所述样本数据集包括训练数据集和验证数据集,所述训练数据集和所述验证数据集中的每一个包括已被标注类别的多个样本图像;Acquire a sample data set, wherein the sample data set includes a training data set and a verification data set, each of the training data set and the verification data set includes a plurality of sample images of labeled categories; 利用所述训练数据集对深度学习模型进行训练,根据不同训练轮数,得到至少两个经训练的模型;Using the training data set to train the deep learning model, and obtaining at least two trained models according to different numbers of training rounds; 利用所述验证数据集测试所述至少两个经训练的模型,生成验证测试结果;Testing the at least two trained models using the validation data set to generate validation test results; 基于所述验证测试结果,生成验证测试指标,所述验证测试指标包括以下各项中的至少一项:混淆矩阵、准确率、召回率和F1分数;Based on the validation test results, generating validation test indicators, the validation test indicators including at least one of the following: confusion matrix, precision, recall and F1 score; 根据所述验证测试指标,在所述至少两个经训练的模型中确定目标模型。According to the validation test indicator, a target model is determined among the at least two trained models. 2.根据权利要求1所述的方法,还包括:2. The method according to claim 1, further comprising: 获取离线测试数据集,所述离线测试数据集包括以下中的至少一项:由所述样本数据集划分得到的子集、由用户提供的输入样本数据集,所述输入样本数据集包括已被标注类别的多个样本图像;Acquire an offline test data set, wherein the offline test data set includes at least one of the following: a subset obtained by dividing the sample data set, an input sample data set provided by a user, and the input sample data set includes a plurality of sample images with labeled categories; 利用所述离线测试数据集测试所述目标模型,生成离线测试结果。The target model is tested using the offline test data set to generate an offline test result. 3.根据权利要求1或2所述的方法,还包括:3. 
The method according to claim 1 or 2, further comprising: 基于所述验证测试结果或者所述离线测试结果,针对至少一个类别,生成准确率曲线和召回率曲线,其中,所述准确率曲线反映准确率和置信度阈值之间的关系,所述召回率曲线反映召回率与置信度阈值之间的关系;Based on the verification test result or the offline test result, generating an accuracy curve and a recall curve for at least one category, wherein the accuracy curve reflects the relationship between the accuracy and the confidence threshold, and the recall curve reflects the relationship between the recall rate and the confidence threshold; 根据所述准确率曲线和召回率曲线,更新针对至少一个类别的置信度阈值。A confidence threshold for at least one category is updated according to the precision curve and the recall curve. 4.根据权利要求3所述的方法,其中,所述根据所述准确率曲线和召回率曲线,更新针对至少一个类别的置信度阈值包括:4. The method according to claim 3, wherein updating the confidence threshold for at least one category according to the precision curve and the recall curve comprises: 根据所述准确率曲线和召回率曲线的交点,更新针对至少一个类别的置信度阈值。According to the intersection of the precision curve and the recall curve, a confidence threshold for at least one category is updated. 5.根据权利要求1或2所述的方法,还包括:5. The method according to claim 1 or 2, further comprising: 基于所述验证测试结果或者所述离线测试结果,根据准确率或召回率,确定针对各个类别的推荐置信度阈值;Based on the verification test result or the offline test result, determining a recommendation confidence threshold for each category according to the accuracy or the recall rate; 基于所述推荐置信度阈值,更新针对至少一个类别的置信度阈值。Based on the recommendation confidence threshold, a confidence threshold for at least one category is updated. 6.根据权利要求5所述的方法,还包括:6. 
The method according to claim 5, further comprising: 基于所述离线测试结果,生成离线测试指标,所述离线测试指标包括以下各项中的至少一项:准确率、召回率、F1分数、混淆矩阵、针对各个类别的模型输出数量和真实数量的分布、针对各个类别的置信度分布;Based on the offline test results, generate offline test indicators, wherein the offline test indicators include at least one of the following: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and true quantity for each category, and confidence distribution for each category; 根据所述离线测试指标,更新针对至少一个类别的置信度阈值。A confidence threshold for at least one category is updated according to the offline test indicator. 7.根据权利要求6所述的方法,还包括:7. The method according to claim 6, further comprising: 根据更新后的置信度阈值,基于所述离线测试结果,生成以下各项中的至少一项:准确率、召回率、F1分数、混淆矩阵、针对各个类别的模型输出数量和真实数量的分布、针对各个类别的置信度分布。According to the updated confidence threshold, based on the offline test results, at least one of the following items is generated: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and true quantity for each category, and confidence distribution for each category. 8.根据权利要求1或2所述的方法,还包括:8. 
The method according to claim 1 or 2, further comprising: 获取在线测试数据集,所述在线测试数据集包括未被标注类别的多个图像;Acquire an online test data set, wherein the online test data set includes a plurality of images of unlabeled categories; 利用所述在线测试数据集测试所述目标模型,生成在线测试指标,所述在线测试指标包括以下各项中的至少一项:准确率、召回率、混淆矩阵、针对各个类别的模型输出数量和人工复核数量的分布、针对各个类别的置信度分布;以及Testing the target model using the online test data set to generate online test indicators, wherein the online test indicators include at least one of the following: accuracy, recall, confusion matrix, distribution of the number of model outputs and the number of manual reviews for each category, and confidence distribution for each category; and 以下两项中的一项:One of the following: 响应于所述在线测试指标满足预设标准,上线所述目标模型;以及In response to the online test indicator satisfying a preset standard, launching the target model; and 响应于所述在线测试指标中的至少部分指标高于相关在线模型的对应测试指标,上线所述目标模型,其中,所述对应测试指标是基于所述相关在线模型针对所述在线测试数据集的输出结果得到的。In response to at least some of the online test indicators being higher than corresponding test indicators of the relevant online model, the target model is put online, wherein the corresponding test indicators are obtained based on output results of the relevant online model for the online test data set. 9. 根据权利要求8所述的方法,还包括以下两项中的一项:9. The method according to claim 8, further comprising one of the following: 响应于所述在线测试指标不满足所述预设标准,通过重新训练或调整置信度阈值来更新所述目标模型;以及In response to the online test indicator not satisfying the preset standard, updating the target model by retraining or adjusting the confidence threshold; and 响应于所述在线测试指标不高于所述相关在线模型的对应测试指标,通过重新训练或调整置信度阈值来更新所述目标模型。In response to the online test indicator being not higher than a corresponding test indicator of the related online model, the target model is updated by retraining or adjusting a confidence threshold. 10.根据权利要求8所述的方法,其中,所述获取在线测试数据集包括:10. 
The method according to claim 8, wherein the acquiring of an online test data set comprises: 与图像获取装置建立通讯连接,其中,所述图像获取装置被配置为获取待检图像;Establishing a communication connection with an image acquisition device, wherein the image acquisition device is configured to acquire an image to be inspected; 经由所述通讯连接,接收来自所述图像获取装置的待检图像;receiving the image to be inspected from the image acquisition device via the communication connection; 基于所接收的待检图像,获取所述在线测试数据集。Based on the received image to be inspected, the online test data set is acquired. 11.根据权利要求8所述的方法,其中,所述获取在线测试数据集包括:11. The method according to claim 8, wherein the acquiring of an online test data set comprises: 基于相关在线模型所接收的图像,获取所述在线测试数据集,其中,所述相关在线模型被配置为接收来自图像获取装置的待检图像,并基于所接收的待检图像预测待检图像的类别。The online test data set is acquired based on the image received by the relevant online model, wherein the relevant online model is configured to receive the image to be inspected from the image acquisition device and predict the category of the image to be inspected based on the received image to be inspected. 12.根据权利要求8所述的方法,其中,所述上线所述目标模型包括:12. The method according to claim 8, wherein the bringing the target model online comprises: 获取用户针对所述目标模型的审核结果;Obtaining the user's review result for the target model; 响应于所述审核结果指示所述目标模型可上线,上线所述目标模型,使得所述目标模型被配置为接收来自图像获取装置的待检图像,并基于所接收的待检图像预测待检图像的类别。In response to the audit result indicating that the target model can be put online, the target model is put online so that the target model is configured to receive the image to be inspected from the image acquisition device and predict the category of the image to be inspected based on the received image to be inspected. 13.根据权利要求8所述的方法,还包括:13. 
The method according to claim 8, further comprising: 在所述目标模型上线后,基于来自图像获取装置的待检图像,获取在线抽查数据;After the target model is online, obtaining online spot check data based on the image to be inspected from the image acquisition device; 接收针对所述在线抽查数据的人工复核结果,所述人工复核结果包括人工复核得到的待检图像的类别;receiving a manual review result for the online spot check data, wherein the manual review result includes a category of the image to be inspected obtained by manual review; 基于所述人工复核结果和所述目标模型预测得到的类别,生成在线抽查指标,所述在线抽查指标包括以下各项中的至少一项:准确率、召回率、混淆矩阵、针对各个类别的模型输出数量和人工复核数量的分布、针对各个类别的置信度分布。Based on the manual review results and the categories predicted by the target model, online spot check indicators are generated, and the online spot check indicators include at least one of the following items: accuracy, recall rate, confusion matrix, distribution of the number of model outputs and the number of manual reviews for each category, and confidence distribution for each category. 14.根据权利要求13所述的方法,其中,所述基于来自图像获取装置的待检图像,获取在线抽查数据包括以下中的至少一项:14. The method according to claim 13, wherein the step of acquiring online spot check data based on the image to be inspected from the image acquisition device comprises at least one of the following: 在所述来自图像获取装置的待检图像中,随机抽取部分图像,并基于所抽取的图像,生成所述在线抽查数据;Randomly extracting a portion of images from the images to be inspected from the image acquisition device, and generating the online spot check data based on the extracted images; 接收针对所述来自图像获取装置的待检图像的筛选条件,基于所述筛选条件,对所述来自图像获取装置的待检图像进行筛选,并基于筛选出的图像,生成所述在线抽查数据。Receive screening conditions for the images to be inspected from the image acquisition device, screen the images to be inspected from the image acquisition device based on the screening conditions, and generate the online spot check data based on the screened images. 15.根据权利要求1所述的方法,其中,所述根据所述验证测试指标,在所述至少两个经训练的模型中确定目标模型包括:15. 
The method according to claim 1, wherein determining a target model from among the at least two trained models according to the validation test indicator comprises: 根据F1分数在所述至少两个经训练的模型中确定所述目标模型。The target model is determined among the at least two trained models according to the F1 score. 16.根据权利要求15所述的方法,其中,所述根据所述验证测试指标,在所述至少两个经训练的模型中确定目标模型还包括:16. The method according to claim 15, wherein determining a target model from among the at least two trained models according to the validation test indicator further comprises: 根据混淆矩阵判断所确定的目标模型是否满足预设需求;Judging whether the determined target model meets the preset requirements according to the confusion matrix; 响应于所确定的目标模型不满足预设需求,通过重新训练或调整置信度阈值来更新所述目标模型。In response to the determined target model not meeting the preset requirement, the target model is updated by retraining or adjusting the confidence threshold. 17.根据权利要求1所述的方法,其中,所述样本图像为目标产品的图像,所述类别为所述目标产品的产品缺陷类别。17 . The method according to claim 1 , wherein the sample image is an image of a target product, and the category is a product defect category of the target product. 18.一种确定图像类别的方法,包括:18. A method for determining an image category, comprising: 利用目标模型对待检图像进行预测,以得到所述待检图像的类别,其中,所述目标模型是根据验证测试指标从至少两个经训练的模型中确定的,所述至少两个经训练的模型具有不同训练轮数,所述验证测试指标包括以下各项中的至少一项:混淆矩阵、准确率、召回率和F1分数。A target model is used to predict an image to be inspected to obtain a category of the image to be inspected, wherein the target model is determined from at least two trained models according to a validation test metric, the at least two trained models have different numbers of training rounds, and the validation test metric includes at least one of the following: confusion matrix, accuracy, recall rate, and F1 score. 19.一种确定目标模型的方法,包括:19. 
A method for determining a target model, comprising: 响应于用户对样本数据集的配置操作,获取样本数据集,所述样本数据集包括训练数据集和验证数据集,所述训练数据集和所述验证数据集中的每一个包括已被标注类别的多个样本图像;In response to a configuration operation of a sample data set by a user, a sample data set is acquired, wherein the sample data set includes a training data set and a verification data set, each of the training data set and the verification data set includes a plurality of sample images of labeled categories; 根据所述样本数据集的特征信息配置训练参数,并生成训练参数展示界面,其中,所述训练参数展示界面展示的训练参数包括测试策略,所述测试策略包括至少两个训练轮数,所述特征信息包括所述样本数据集的样本数量;Configure training parameters according to the characteristic information of the sample data set, and generate a training parameter display interface, wherein the training parameters displayed on the training parameter display interface include a test strategy, the test strategy includes at least two training rounds, and the characteristic information includes the number of samples in the sample data set; 根据所述训练参数利用所述训练数据集对深度学习模型进行训练,得到至少两个经训练的模型,所述至少两个经训练的模型与所述至少两个训练轮数一一对应;Training the deep learning model using the training data set according to the training parameters to obtain at least two trained models, wherein the at least two trained models correspond one-to-one to the at least two training rounds; 利用所述验证数据集测试所述至少两个经训练的模型,生成验证测试结果;Testing the at least two trained models using the validation data set to generate validation test results; 基于所述验证测试结果,生成验证测试指标展示界面,所述验证测试指标展示界面用于展示以下各项中的至少一项:混淆矩阵、准确率、召回率和F1分数;Based on the verification test results, generating a verification test indicator display interface, wherein the verification test indicator display interface is used to display at least one of the following items: confusion matrix, accuracy, recall rate and F1 score; 响应于用户针对所述至少两个经训练的模型的选择操作,将用户所选择的模型确定为目标模型。In response to a selection operation by a user on the at least two trained models, the model selected by the user is determined as a target model. 20.根据权利要求19所述的方法,还包括:20. 
The method according to claim 19, further comprising: 响应于用户针对所述目标模型的离线测试任务建立操作,生成离线测试参数配置界面;In response to a user's offline test task establishment operation for the target model, generating an offline test parameter configuration interface; 根据用户在所述离线测试参数配置界面的配置输入,建立离线测试任务;Establishing an offline test task according to the configuration input by the user in the offline test parameter configuration interface; 根据所述配置输入,获取离线测试数据集,所述离线测试数据集包括已被标注类别的多个样本图像;According to the configuration input, an offline test data set is obtained, wherein the offline test data set includes a plurality of sample images of labeled categories; 利用所述离线测试数据集测试所述目标模型,生成离线测试结果。The target model is tested using the offline test data set to generate an offline test result. 21.根据权利要求19或20所述的方法,其中,所述根据所述离线测试结果,更新针对至少一个类别的置信度阈值包括:21. The method according to claim 19 or 20, wherein updating the confidence threshold for at least one category according to the offline test result comprises: 基于所述验证测试结果或者所述离线测试结果,生成曲线展示界面,所述曲线展示界面用于展示各个类别的准确率曲线和召回率曲线,其中,所述准确率曲线反映准确率和置信度阈值之间的关系,所述召回率曲线反映召回率与置信度阈值之间的关系;Based on the verification test result or the offline test result, a curve display interface is generated, wherein the curve display interface is used to display the accuracy curve and the recall curve of each category, wherein the accuracy curve reflects the relationship between the accuracy and the confidence threshold, and the recall curve reflects the relationship between the recall rate and the confidence threshold; 响应于用户对至少一个类别的置信度阈值的修改操作,更新针对至少一个类别的置信度阈值。In response to a user modification operation on the confidence threshold of at least one category, the confidence threshold for at least one category is updated. 22.根据权利要求19或20所述的方法,还包括:22. 
The method according to claim 19 or 20, further comprising: 基于所述验证测试结果或者所述离线测试结果,根据准确率或召回率,确定针对各个类别的推荐置信度阈值;Based on the verification test result or the offline test result, determining a recommendation confidence threshold for each category according to accuracy or recall; 基于所述推荐置信度阈值,更新针对至少一个类别的置信度阈值;Based on the recommendation confidence threshold, updating a confidence threshold for at least one category; 响应于用户对所述置信度阈值的查看操作,生成置信度阈值展示界面。In response to a user's viewing operation on the confidence threshold, a confidence threshold display interface is generated. 23.根据权利要求20所述的方法,还包括:23. The method according to claim 20, further comprising: 基于所述离线测试结果,生成离线测试指标展示界面,所述离线测试指标展示界面用于展示以下各项中的至少一项:准确率、召回率、F1分数、混淆矩阵、针对各个类别的模型输出数量和真实数量的分布、针对各个类别的置信度分布;Based on the offline test results, an offline test indicator display interface is generated, wherein the offline test indicator display interface is used to display at least one of the following items: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and true quantity for each category, and confidence distribution for each category; 响应于用户对至少一个类别的置信度阈值的修改操作,更新针对至少一个类别的置信度阈值。In response to a user modification operation on the confidence threshold of at least one category, the confidence threshold for at least one category is updated. 24.根据权利要求20所述的方法,还包括:24. The method of claim 20, further comprising: 基于所述离线测试结果,生成错误结果展示界面,所述错误结果展示界面用于展示模型输出类别与真实类别不一致的数据。Based on the offline test results, an error result display interface is generated, where the error result display interface is used to display data whose model output categories are inconsistent with real categories. 25.根据权利要求23所述的方法,还包括:25. 
The method of claim 23, further comprising: 响应于所述置信度阈值被更新,基于所述离线测试结果,生成指标展示界面,所述指标展示界面用于展示以下各项中的至少一项:准确率、召回率、F1分数、混淆矩阵、针对各个类别的模型输出数量和真实数量的分布、针对各个类别的置信度分布。In response to the confidence threshold being updated, an indicator display interface is generated based on the offline test results, and the indicator display interface is used to display at least one of the following items: accuracy, recall, F1 score, confusion matrix, distribution of model output quantity and true quantity for each category, and confidence distribution for each category. 26.根据权利要求20所述的方法,其中,所述离线测试参数配置界面包括数据集选择选项,并且其中,所述根据所述配置输入,获取离线测试数据集包括:26. The method of claim 20, wherein the offline test parameter configuration interface includes a data set selection option, and wherein acquiring the offline test data set according to the configuration input comprises: 响应于用户通过所述数据集选择执行的选择操作,基于与所述选择操作对应的样本数据集,获取所述离线测试数据集。In response to a selection operation performed by a user through the data set selection, the offline test data set is acquired based on a sample data set corresponding to the selection operation. 27.根据权利要求20所述的方法,其中,所述离线测试参数配置界面包括数据集上传选项,并且其中,所述根据所述配置输入,获取离线测试数据集还包括:27. The method according to claim 20, wherein the offline test parameter configuration interface includes a data set upload option, and wherein the acquiring the offline test data set according to the configuration input further comprises: 响应于用户通过所述数据集上传选项执行的上传操作,接收输入样本数据集;In response to an upload operation performed by a user through the dataset upload option, receiving an input sample dataset; 基于所述输入样本数据集,获取所述离线测试数据集。Based on the input sample data set, the offline test data set is acquired. 28.根据权利要求19或20所述的方法,还包括:28. 
The method according to claim 19 or 20, further comprising: 响应于用户针对所述目标模型的在线测试任务建立操作,生成在线测试参数配置界面;In response to a user's online test task establishment operation for the target model, generating an online test parameter configuration interface; 根据用户在所述在线测试参数配置界面的配置输入,建立在线测试任务;Establishing an online test task according to the configuration input by the user in the online test parameter configuration interface; 基于所述配置输入,获取在线测试数据集,所述在线测试数据集包括未被标注类别的多个图像;Based on the configuration input, an online test data set is obtained, wherein the online test data set includes a plurality of images of unlabeled categories; 利用所述在线测试数据集测试所述目标模型,生成在线测试指标,所述在线测试指标包括以下各项中的至少一项:准确率、召回率、混淆矩阵、针对各个类别的模型输出数量和人工复核数量的分布、针对各个类别的置信度分布;以及Testing the target model using the online test data set to generate online test indicators, wherein the online test indicators include at least one of the following: accuracy, recall, confusion matrix, distribution of the number of model outputs and the number of manual reviews for each category, and confidence distribution for each category; and 以下两项中的一项:One of the following: 响应于所述在线测试指标满足预设标准,呈现上线所述目标模型的选项;以及In response to the online test indicator satisfying a preset standard, presenting an option of launching the target model; and 响应于所述在线测试指标中的至少部分指标高于相关在线模型的对应测试指标,呈现上线所述目标模型的选项,其中,所述对应测试指标是基于所述相关在线模型针对所述在线测试数据集的输出结果得到的。In response to at least some of the online test indicators being higher than corresponding test indicators of a related online model, an option of launching the target model is presented, wherein the corresponding test indicators are obtained based on output results of the related online model for the online test data set. 29.根据权利要求28所述的方法,其中,所述在线测试参数配置界面包括新模型上线选项,并且所述基于所述配置输入,获取在线测试数据集包括:29. 
The method according to claim 28, wherein the online test parameter configuration interface includes a new model online option, and the acquiring of the online test data set based on the configuration input comprises:
in response to a user's selection of the new model online option, establishing a communication connection with an image acquisition device, wherein the image acquisition device is configured to acquire images to be inspected;
receiving the images to be inspected from the image acquisition device via the communication connection; and
acquiring the online test data set based on the received images to be inspected.

30. The method according to claim 28, wherein the online test parameter configuration interface includes a model update option, and the acquiring of the online test data set based on the configuration input comprises:
in response to a user's selection of the model update option, acquiring the online test data set based on images received by a relevant online model, wherein the relevant online model is configured to receive images to be inspected from an image acquisition device and to predict the category of each image to be inspected based on the received image.

31. The method according to claim 28, further comprising:
in response to a user operation selecting to put the target model online, generating an online review interface; and
in response to a user's confirmation operation for the online review, putting the target model online, so that the target model is configured to receive images to be inspected from an image acquisition device and to predict the category of each image to be inspected based on the received image.

32. The method according to claim 31, further comprising:
in response to a user operation of establishing a monitoring task for the online target model, generating a monitoring task parameter configuration interface;
establishing the monitoring task according to the user's configuration input in the monitoring task parameter configuration interface;
acquiring online spot check data according to the configuration input and based on the images to be inspected from the image acquisition device;
receiving a manual review result for the online spot check data, wherein the manual review result includes the category of each image to be inspected as determined by manual review; and
generating online spot check indicators based on the manual review result and the categories predicted by the target model, wherein the online spot check indicators include at least one of the following: accuracy, recall, a confusion matrix, a distribution of the number of model outputs and the number of manual reviews for each category, and a confidence distribution for each category.

33. A system for determining an image category, comprising:
a data management module configured to store and manage sample data;
a training and testing management module configured to execute the method for determining a target model according to any one of claims 1 to 17, the method for determining an image category according to claim 18, or the method for determining a target model according to any one of claims 19 to 32; and
a model management module configured to store, display and manage the target model.

34. A computing device, comprising:
a memory configured to store computer-executable instructions; and
a processor configured to perform the method according to any one of claims 1 to 32 when the computer-executable instructions are executed by the processor.

35. A computer-readable storage medium storing computer-executable instructions which, when executed, perform the method according to any one of claims 1 to 32.

36. A computer program product comprising computer instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 32.
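As an illustration only, the online spot check indicators recited in claim 32 could be computed along the following lines. This is a minimal sketch, not the applicant's implementation: the function name `spot_check_indicators`, its parameters, and the category names are hypothetical, "accuracy" is assumed to mean overall accuracy across all spot-checked images, and the confidence distribution is represented simply as the list of confidence values grouped by predicted category.

```python
from collections import Counter, defaultdict


def spot_check_indicators(reviewed, predicted, confidences):
    """Compare manual review labels with model predictions (hypothetical helper).

    reviewed:    categories assigned by manual review, one per image.
    predicted:   categories predicted by the online model, one per image.
    confidences: model confidence for each predicted category.
    """
    if not (len(reviewed) == len(predicted) == len(confidences)):
        raise ValueError("inputs must have the same length")
    if not reviewed:
        raise ValueError("inputs must be non-empty")

    categories = sorted(set(reviewed) | set(predicted))

    # confusion[r][p] counts images reviewed as r and predicted as p.
    confusion = {r: {p: 0 for p in categories} for r in categories}
    for r, p in zip(reviewed, predicted):
        confusion[r][p] += 1

    # Overall accuracy: share of images where prediction matches review.
    correct = sum(confusion[c][c] for c in categories)
    accuracy = correct / len(reviewed)

    # Per-category counts of manual review labels and model outputs.
    reviewed_counts = Counter(reviewed)
    predicted_counts = Counter(predicted)

    # Per-category recall: correct predictions / images reviewed as that category.
    recall = {
        c: confusion[c][c] / reviewed_counts[c]
        for c in categories
        if reviewed_counts[c] > 0
    }

    # Confidence values grouped by predicted category.
    confidence_by_category = defaultdict(list)
    for p, score in zip(predicted, confidences):
        confidence_by_category[p].append(score)

    return {
        "accuracy": accuracy,
        "recall": recall,
        "confusion_matrix": confusion,
        "model_output_counts": dict(predicted_counts),
        "manual_review_counts": dict(reviewed_counts),
        "confidence_by_category": dict(confidence_by_category),
    }


# Example with made-up defect categories.
result = spot_check_indicators(
    reviewed=["scratch", "scratch", "dust", "ok"],
    predicted=["scratch", "dust", "dust", "ok"],
    confidences=[0.9, 0.6, 0.8, 0.95],
)
print(result["accuracy"])  # 0.75
```

Keeping the confusion matrix as a nested dictionary keyed by category name avoids a dependency on any numerical library; a deployed system would likely bin the confidence values into a histogram for display.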
CN202211354700.1A 2022-11-01 2022-11-01 Method for determining target model, method and system for determining image category Pending CN118038104A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211354700.1A CN118038104A (en) 2022-11-01 2022-11-01 Method for determining target model, method and system for determining image category

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211354700.1A CN118038104A (en) 2022-11-01 2022-11-01 Method for determining target model, method and system for determining image category

Publications (1)

Publication Number Publication Date
CN118038104A true CN118038104A (en) 2024-05-14

Family

ID=90991914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211354700.1A Pending CN118038104A (en) 2022-11-01 2022-11-01 Method for determining target model, method and system for determining image category

Country Status (1)

Country Link
CN (1) CN118038104A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN121029628A (en) * 2025-10-30 2025-11-28 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) Methods and apparatus for building large models for basic software testing

Similar Documents

Publication Publication Date Title
US12602910B2 (en) Method for detecting defect and method for training model
EP4141786B1 (en) Defect detection method and apparatus, model training method and apparatus, and electronic device
CN109711548B (en) Hyperparameter selection method, use method, device and electronic device
CN108830837B (en) Method and device for detecting steel ladle corrosion defect
CN112183166A (en) Method and device for determining training sample and electronic equipment
CN111699499A (en) Inspection system, image recognition system, recognizer generation system, and learning data generation device
CN114004313B (en) Methods, devices, electronic equipment, and storage media for predicting faulty GPUs
CN112613569B (en) Image recognition method, image classification model training method and device
CN112950567B (en) Quality evaluation method, device, electronic device and storage medium
CN113449773A (en) Model updating method and device, storage medium and electronic equipment
CN110286938B (en) Method and apparatus for outputting evaluation information for user
CN114862832A (en) Method, device and equipment for optimizing defect detection model and storage medium
CN110517247A (en) Obtain the method and device of information
CN110069997B (en) Scene classification method and device and electronic equipment
JP2019075078A (en) Construction site image determination device and construction site image determination program
CN118038104A (en) Method for determining target model, method and system for determining image category
CN118038105A (en) Method and device for determining image category and confidence threshold
CN115546218B (en) Confidence Threshold Determination Method and Device, Electronic Equipment and Storage Medium
WO2022147003A1 (en) An adaptive machine learning system for image-based biological sample constituent analysis
WO2019073615A1 (en) Worksite image assessment device and worksite image assessment program
CN118520055B (en) A data cockpit system based on virtual-real fusion
CN116996527B (en) Method for synchronizing data of converging current divider and storage medium
CN115641470B (en) Classification models and training methods, devices and equipment for vehicle image classification models
CN116319386A (en) Availability and fault prediction method and device, electronic equipment and medium
CN111767988B (en) Fusion method and device of neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination