CN119260738A

CN119260738A - 3D visual recognition method for industrial robots based on deep learning

Info

Publication number: CN119260738A
Application number: CN202411722240.2A
Authority: CN
Inventors: 陆杰
Original assignee: Suzhou Berkeley Technology Co ltd
Current assignee: Suzhou Berkeley Technology Co ltd
Priority date: 2024-11-28
Filing date: 2024-11-28
Publication date: 2025-01-07

Abstract

The present invention discloses a deep learning-based 3D visual recognition method for industrial robots, comprising the following steps: step one, data collection and preprocessing; step two, data feature extraction; step three, selection and construction of a deep learning model; step four, model training and optimization; step five, model evaluation and selection; step six, model deployment and application; step seven, continuous improvement and maintenance of the model. The method combines the advantages of deep learning algorithms and 3D vision technology. Together, the steps form a complete process, from data collection through model deployment and application to continuous improvement and maintenance, that is intended to give industrial robots faster and more accurate recognition and greater adaptability. The technology is expected to play an increasingly important role in the future development of industrial automation and intelligent manufacturing.

Description

Industrial robot 3D visual recognition method based on deep learning

Technical Field

The invention relates to the technical field of industrial robot recognition, in particular to a 3D visual recognition method of an industrial robot based on deep learning.

Background

Industrial robot vision gives an industrial robot 'eyes', so that it can see objects clearly and without fatigue and can take over the inspection and detection work otherwise done by the human eye. It is very important in highly automated mass production. A machine vision system uses an image capture device to convert the object being photographed into an image signal and transmits that signal to a dedicated image processing system, which obtains the morphological information of the object and converts it into a digital signal according to pixel distribution, brightness, color and other information. The image system then performs various operations on these signals to extract the features of the object, and controls the action of on-site equipment according to the result of the judgment.

In existing industrial robot 3D visual recognition methods, the 3D vision technology acquires three-dimensional information about objects, including shape, size and position, through devices such as cameras and sensors; a computer processing system then extracts useful feature information from the data and classifies and recognizes it. However, because the data volume is large and the data features are similar, the efficiency and accuracy of data processing are limited when a conventional computer processing system is used. The present application therefore proposes a 3D visual recognition method for industrial robots based on deep learning.

Disclosure of Invention

The invention aims to overcome the shortcomings of the prior art described in the background, and provides a 3D visual recognition method for industrial robots based on deep learning.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

The 3D visual recognition method for industrial robots based on deep learning comprises the following steps:

Step one, performing data acquisition and preprocessing;

Step two, extracting data features;

Step three, selecting and constructing a deep learning model;

Step four, training and optimizing the model;

Step five, evaluating and selecting the model;

Step six, deploying and applying the model;

Step seven, continuously improving and maintaining the model.

Step one comprises data acquisition, data preprocessing and data enhancement: the data is first acquired, then preprocessed, and then enhanced.

Step two comprises feature design and deep learning feature extraction: 3D features are designed before deep learning, and deep learning features are then extracted using a deep learning model.

Step three comprises model selection and model construction: a suitable deep learning model architecture is first selected according to the task requirements and data characteristics, and a specific deep learning model is then constructed on the basis of the selected architecture.

Step four comprises loss function design, data division, a training process, and validation and tuning: a suitable loss function is first designed to measure the difference between the model's predictions and the actual labels, the preprocessed data set is then divided into a training set, a validation set and a test set, the model is trained on the preprocessed 3D data set, its performance is evaluated on the validation set, and the model is tuned according to the evaluation results.

Step five comprises evaluation metrics, a testing process and model selection: the performance of the model is first evaluated on a validation set or a test set, the model is then tested with test set data to evaluate its generalization ability and actual recognition effect, and the best-performing model is then selected as the final model according to the evaluation results.

Step six comprises model deployment, system integration, and on-site testing and tuning: the trained model is first exported into a format suitable for use by the industrial robot and deployed to the robot's visual recognition system to form a 3D visual recognition module, the 3D visual recognition module is then integrated with the other systems of the industrial robot so that they work together, and the system is then tested in the actual working environment, with any necessary tuning and adjustment made according to the test results.

Step seven comprises data collection and feedback, model optimization and system maintenance: new 3D data is continuously collected during actual operation, the model is further optimized and adjusted according to the data feedback and performance evaluation results from actual application in order to improve recognition accuracy and generalization ability, and the system is regularly maintained and serviced to ensure stable operation and long-term reliability.

Compared with the prior art, the invention has the beneficial effects that:

The deep learning-based 3D visual recognition method for industrial robots is an advanced industrial automation technology that combines deep learning algorithms with 3D vision technology. From data acquisition, through model deployment and application, to continuous improvement and maintenance, it forms a closed-loop system. Three-dimensional information about the surrounding environment can be captured to generate data such as depth maps and point clouds, and the deep learning algorithm can automatically and rapidly extract useful features from this data and classify and recognize them. Through continuous optimization and improvement, the visual recognition capability of the industrial robot can be increased, so that the robot can perform various tasks more accurately and flexibly and meet the requirements of practical applications.

Detailed Description

The contents of the present invention can be more easily understood by referring to the following detailed description of preferred embodiments of the present invention and the examples included. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In case of conflict, the present specification, including definitions, will control.

The term "prepared from" as used herein is synonymous with "comprising". The terms "comprising," "including," "having," "containing," or any other variation thereof, as used herein, are intended to cover a non-exclusive inclusion. For example, a composition, step, method, article, or apparatus that comprises a list of elements is not necessarily limited to only those elements but may include other elements not expressly listed or inherent to such composition, step, method, article, or apparatus.

The term "consisting of" excludes any unspecified element, step or component. If used in a claim, such phrase will cause the claim to be closed, such that it does not include materials other than those described, except for conventional impurities associated therewith. When the phrase "consisting of" appears in a clause of the claim body rather than immediately following the subject, it is limited to only the elements described in that clause, and other elements are not excluded from the claim as a whole.

When an equivalent, concentration, or other value or parameter is expressed as a range, preferred range, or a range bounded by a list of upper preferable values and lower preferable values, this is to be understood as specifically disclosing all ranges formed from any pair of any upper range limit or preferred value and any lower range limit or preferred value, regardless of whether ranges are separately disclosed. For example, when ranges of "1 to 5" are disclosed, the described ranges should be construed to include ranges of "1 to 4", "1 to 3", "1 to 2 and 4 to 5", "1 to 3 and 5", and the like. When a numerical range is described herein, unless otherwise indicated, the range is intended to include its endpoints and all integers and fractions within the range.

The singular forms include plural referents unless the context clearly dictates otherwise. "Optional" or "any" means that the subsequently described event or circumstance may or may not occur, and that the description includes both cases where the event occurs and cases where it does not.

Approximating language, in the specification and claims, may be applied to modify an amount, meaning that the application is not limited to the specific amount but also includes an acceptable portion close to that amount that does not result in a change in the basic function involved. Accordingly, the modification of a numerical value with "about", "approximately" or the like means that the present application is not limited to the precise numerical value. In some examples, the approximating language may correspond to the precision of an instrument for measuring the value. In the description and claims of the application, the range limitations may be combined and/or interchanged, if not otherwise specified, including all the sub-ranges subsumed therein.

Furthermore, the indefinite articles "a" and "an" preceding an element or component of the invention do not limit the number of that element or component (i.e. the number of occurrences). Thus, "a" or "an" should be interpreted as including one or at least one, and the singular form of an element or component also includes the plural unless the number is obviously intended to be singular.

The technical solutions of the embodiments of the present invention will be clearly and completely described in the following in conjunction with the embodiments of the present invention, and it is obvious that the described embodiments are only some embodiments of the present invention, but not all embodiments.

Deep learning is a branch of machine learning that simulates the learning process of the human brain by constructing deep neural networks. In 3D visual recognition for industrial robots, a deep learning algorithm can automatically extract features from images and learn to classify, recognize and locate objects according to those features.

3D vision technology obtains three-dimensional information about objects, including shape, size and position, through devices such as cameras and sensors. Compared with two-dimensional vision, 3D vision technology provides richer spatial information, so that an industrial robot can understand its surroundings more accurately.

In the deep learning-based 3D visual recognition method for industrial robots, deep learning is combined with 3D vision technology, so that the recognition accuracy and adaptability of the industrial robot can be remarkably improved. The combination brings the following advantages:

Feature extraction: the deep learning algorithm can automatically extract key features such as edges, corner points and curved surfaces from the 3D image. These features are critical to subsequent recognition and localization.

Model training: a deep learning model is trained on a large amount of 3D image data, so that the model learns features such as the shape, color and texture of objects and establishes the mapping between these features and object categories.

Real-time recognition: in industrial production, an industrial robot needs to recognize and locate objects in real time.

The deep learning algorithm can process 3D image data rapidly and recognize and locate objects in real time, thereby meeting the real-time requirements of industrial production.

Adaptive capability: the deep learning algorithm has strong adaptive capability and can cope with objects of different shapes, sizes and materials, which means that the industrial robot can adapt flexibly to different production tasks without a great deal of programming and debugging.

S1, data acquisition and preprocessing

S1.1, data acquisition: capture 3D data of the target object using a 3D sensor (such as a structured light camera, a stereo vision camera or LiDAR); the 3D data may comprise depth images, point cloud data, voxel grids and the like;

S1.2, data preprocessing: preprocess the acquired 3D data, for example by denoising, filtering, registration (if data from several viewing angles is involved) and segmentation (separating the target object from the background), so as to improve data quality and reduce the complexity of subsequent processing;

S1.3, data enhancement: to increase data diversity and improve the generalization ability of the model, apply transformations such as rotation, scaling and translation to the preprocessed data.
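As an illustration of S1.2 and S1.3, the following is a minimal sketch for point cloud data using only the Python standard library. The function names, the centroid-distance outlier heuristic, and the default ranges for scaling and translation are assumptions made for this example and are not specified by the method; a production system would more likely rely on a dedicated point cloud library.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Arithmetic mean of all points."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def remove_outliers(points: Sequence[Point], factor: float = 2.0) -> List[Point]:
    """Crude denoising (assumed heuristic): drop points farther from the
    centroid than `factor` times the mean centroid distance."""
    c = centroid(points)
    dists = [math.dist(p, c) for p in points]
    limit = factor * sum(dists) / len(dists)
    return [p for p, d in zip(points, dists) if d <= limit]


def normalize(points: Sequence[Point]) -> List[Point]:
    """Center the cloud on its centroid and scale it into the unit sphere."""
    c = centroid(points)
    centered = [(x - c[0], y - c[1], z - c[2]) for x, y, z in points]
    radius = max(math.dist(p, (0.0, 0.0, 0.0)) for p in centered) or 1.0
    return [(x / radius, y / radius, z / radius) for x, y, z in centered]


def augment(points: Sequence[Point], rng: random.Random,
            scale_range: Tuple[float, float] = (0.9, 1.1),
            max_shift: float = 0.05) -> List[Point]:
    """Data enhancement: random rotation about the z axis, uniform
    scaling, and translation (ranges are illustrative defaults)."""
    theta = rng.uniform(-math.pi, math.pi)
    scale = rng.uniform(*scale_range)
    tx, ty, tz = (rng.uniform(-max_shift, max_shift) for _ in range(3))
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [
        (scale * (cos_t * x - sin_t * y) + tx,
         scale * (sin_t * x + cos_t * y) + ty,
         scale * z + tz)
        for x, y, z in points
    ]
```

Calling `augment(normalize(remove_outliers(cloud)), random.Random(0))` several times with different seeds would yield several training variants of one captured cloud.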

S2, feature extraction

S2.1, feature design: before deep learning, it may be necessary to design some 3D features manually, such as geometric features (points, lines, faces) and statistical features (histograms, distributions). In modern deep learning methods this step is gradually being replaced by automatic feature learning;

S2.2, deep learning feature extraction: extract features from the preprocessed 3D data using a deep learning model (such as a 3D convolutional neural network or a point cloud neural network). The model can automatically learn and extract useful feature representations from the data; these features may include shapes, textures, edges and the like, and describe the three-dimensional surface characteristics of an object;

S2.3, classification and recognition: select a suitable classifier, such as a support vector machine (SVM) or a neural network, to classify and recognize the extracted features.
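The manual feature design of S2.1 and the classification of S2.3 can be sketched as follows. The particular features (bounding-box extents and mean distance to the centroid) and the nearest-prototype classifier are assumptions chosen to keep the example short; the nearest-prototype rule is a simple stand-in for the SVM or neural network named above, and the prototype values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def geometric_features(points: Sequence[Point]) -> List[float]:
    """Hand-designed descriptor: [extent_x, extent_y, extent_z, mean_radius]."""
    n = len(points)
    # Axis-aligned bounding-box size along each axis.
    extents = [
        max(p[i] for p in points) - min(p[i] for p in points)
        for i in range(3)
    ]
    # Mean distance of the points from their centroid.
    c = [sum(p[i] for p in points) / n for i in range(3)]
    mean_radius = sum(math.dist(p, c) for p in points) / n
    return extents + [mean_radius]


def nearest_prototype(features: Sequence[float],
                      prototypes: Dict[str, Sequence[float]]) -> str:
    """Stand-in classifier: return the label whose prototype feature
    vector is closest (Euclidean distance) to `features`."""
    return min(prototypes, key=lambda label: math.dist(features, prototypes[label]))
```

For example, a flat square of points yields a near-zero z extent, which separates it from an elongated part whose x extent dominates.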

S3, deep learning model selection and construction

S3.1, model selection: select a suitable deep learning model architecture according to the task requirements (such as classification, detection or segmentation) and the data characteristics (such as point cloud or depth image). For example, PointNet, PointNet++ or other models may be selected for point cloud data, and 3D convolutional neural networks (3D CNNs) or the like may be selected for depth images;

S3.2, model construction: construct a specific deep learning model on the basis of the selected model architecture, including defining the network layers and setting hyperparameters (such as learning rate and batch size).
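The selection logic of S3.1 and the hyperparameter setting of S3.2 might be captured in a small configuration object, as sketched below. The `ModelConfig` class, the `select_model` function, the data-type keys and the default hyperparameter values are all hypothetical names and values introduced for illustration; only the pairing of point clouds with PointNet-style models and depth images with 3D CNNs comes from the description.

```python
from dataclasses import dataclass

# Mapping taken from the examples in S3.1; the keys are illustrative names.
ARCHITECTURE_BY_DATA = {
    "point_cloud": "PointNet++",
    "depth_image": "3D CNN",
}


@dataclass(frozen=True)
class ModelConfig:
    """Architecture choice plus hyperparameters (defaults are assumed)."""
    architecture: str
    learning_rate: float = 1e-3
    batch_size: int = 32


def select_model(data_type: str, **hyperparameters) -> ModelConfig:
    """Pick an architecture for the given data type and attach any
    hyperparameter overrides."""
    try:
        architecture = ARCHITECTURE_BY_DATA[data_type]
    except KeyError:
        raise ValueError(f"no architecture registered for {data_type!r}") from None
    return ModelConfig(architecture=architecture, **hyperparameters)
```

Keeping the choice in a frozen dataclass makes each trained model traceable to the exact configuration that produced it, which helps when several candidates are compared in S5.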

S4, model training and optimization

S4.1, loss function design: design a suitable loss function to measure the difference between the model's predictions and the actual labels, such as cross-entropy loss for a classification task or mean squared error for a regression task;

S4.2, data division: divide the preprocessed data set into a training set, a validation set and a test set for training, validating and testing the model;

S4.3, training process: train the model on the preprocessed 3D data set, optimizing the model parameters through the backpropagation algorithm to reduce the loss function value and improve model performance;

S4.4, validation and tuning: evaluate the performance of the model on the validation set and tune the model according to the evaluation results, for example by adjusting hyperparameters or adding regularization terms.
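The two loss functions named in S4.1 and the data division of S4.2 can be written out as a short sketch. The 70/15/15 split ratio, the fixed random seed and the function names are assumptions for this example; the description does not prescribe them.

```python
import math
import random
from typing import List, Sequence, Tuple


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one sample; `label` is the true class index.
    Uses the log-sum-exp trick for numerical stability."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mean_squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Average squared difference, as used for regression tasks."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def split_dataset(samples: Sequence, train: float = 0.7, val: float = 0.15,
                  seed: int = 0) -> Tuple[List, List, List]:
    """Shuffle once with a fixed seed, then cut into train / validation /
    test parts; the test set receives whatever remains."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

Shuffling before the cut avoids a split that follows capture order, and fixing the seed keeps the three sets identical across training runs so that validation results stay comparable.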

S5, model evaluation and selection

S5.1, evaluation metrics: evaluate the performance of the model on a validation set or a test set; common evaluation metrics include accuracy, recall, F1 score and AUC;

S5.2, testing process: test the model with test set data and evaluate its generalization ability and actual recognition effect;

S5.3, model selection: select the best-performing model as the final model according to the evaluation results.
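The metrics of S5.1 and the selection step of S5.3 can be illustrated for a binary task as follows. Using the F1 score as the selection criterion is an assumption of this sketch (the description only says the best-performing model is chosen), and AUC is omitted because it requires predicted scores rather than hard labels.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def select_best(scores: Dict[str, Dict[str, float]], metric: str = "f1") -> str:
    """Return the name of the candidate with the highest value of `metric`."""
    return max(scores, key=lambda name: scores[name][metric])
```

In practice the candidates would be compared on the validation set and the chosen model confirmed once on the held-out test set, so that the test result remains an unbiased estimate of generalization.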

S6, deployment and application

S6.1, model deployment: deploy the trained model on the industrial robot. This usually requires converting the deep learning model into a format suitable for running on the industrial robot, such as TensorFlow Lite or ONNX;

S6.2, system integration: integrate the exported model into the visual recognition system of the industrial robot to form a 3D visual recognition module, integrate the 3D visual recognition module with the other systems of the industrial robot (such as motion control and path planning) so that they work together, and perform an overall test to ensure that the system runs stably and meets the expected requirements;

S6.3, on-site testing and tuning: test the system in the actual working environment and perform any necessary tuning and adjustment according to the test results.
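One way to picture the 3D visual recognition module of S6.2 is as a thin wrapper around the exported model that the motion control and path planning systems can call. The sketch below is hypothetical throughout: `Detection`, `RecognitionModule`, the confidence threshold and the latency budget are illustrative names and values, and `infer` stands for whatever callable the chosen runtime provides for the exported model.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class Detection:
    label: str
    confidence: float
    position: Tuple[float, float, float]  # object position in the robot frame


class RecognitionModule:
    """Wraps an inference callable and exposes a gated result to other
    robot subsystems (all thresholds are assumed values)."""

    def __init__(self, infer: Callable[[Sequence], Detection],
                 min_confidence: float = 0.8, budget_s: float = 0.1):
        self.infer = infer
        self.min_confidence = min_confidence
        self.budget_s = budget_s
        self.last_latency_s = 0.0

    def recognize(self, frame: Sequence) -> Optional[Detection]:
        """Run inference, record latency, and suppress weak detections so
        downstream motion control never acts on an uncertain result."""
        start = time.perf_counter()
        detection = self.infer(frame)
        self.last_latency_s = time.perf_counter() - start
        if detection.confidence < self.min_confidence:
            return None
        return detection

    def within_budget(self) -> bool:
        """True if the last inference met the real-time latency budget."""
        return self.last_latency_s <= self.budget_s
```

During the on-site testing of S6.3, the recorded latency and the rate of suppressed detections are two quantities that could guide tuning of the threshold and the model.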

S7, continuous improvement and maintenance

S7.1, data collection and feedback: continuously collect new 3D data during actual operation for continuous improvement and updating of the model;

S7.2, model optimization: further optimize and adjust the model according to the data feedback and performance evaluation results from actual application, so as to improve recognition accuracy and generalization ability;

S7.3, system maintenance: maintain the system regularly to ensure its stable operation and long-term reliability.
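The feedback loop of S7.1 and S7.2 could be driven by a simple rolling accuracy monitor that flags when the deployed model should be retrained on newly collected data. The `AccuracyMonitor` class, the window size and the accuracy threshold below are assumptions for illustration; the description does not define a specific trigger.

```python
from collections import deque
from typing import Optional


class AccuracyMonitor:
    """Tracks whether recent recognitions were correct and signals when
    accuracy over a full window drops below a threshold."""

    def __init__(self, window: int = 100, threshold: float = 0.9):
        self.results = deque(maxlen=window)  # oldest entries fall off
        self.threshold = threshold

    def record(self, correct: bool) -> None:
        """Store the outcome of one verified recognition."""
        self.results.append(bool(correct))

    def accuracy(self) -> Optional[float]:
        """Fraction of correct results in the window, or None if empty."""
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def needs_retraining(self) -> bool:
        """Only decide once the window is full, to avoid reacting to a
        handful of early samples."""
        if len(self.results) < self.results.maxlen:
            return False
        return self.accuracy() < self.threshold
```

The outcomes fed to `record` would come from operator confirmation or downstream checks on the production line, and the samples behind the failures are natural candidates for the next training set.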

The above steps together form the complete flow of the deep learning-based 3D visual recognition method for industrial robots. From data acquisition, through model deployment and application, to continuous improvement and maintenance, they form a closed-loop system: three-dimensional information about the surrounding environment is captured to generate data such as depth maps and point clouds, the deep learning algorithm automatically extracts useful features from the data and classifies and recognizes them, and continuous optimization and improvement increase the visual recognition capability of the industrial robot so that it meets the requirements of practical applications.

The examples referred to herein are illustrative only and are intended to explain some of the features of the method described herein. The appended claims are intended to claim the broadest possible scope, and the examples presented herein merely illustrate selected implementations drawn from all possible combinations of examples. It is therefore not the applicant's intention that the appended claims be limited by the choice of examples illustrating the features of the invention. Numerical ranges used in the claims also include the sub-ranges within them, and variations within these ranges should, where possible, also be construed as covered by the appended claims.

Claims

1. A 3D visual recognition method for industrial robots based on deep learning, characterized in that it comprises the following steps:

Step 1: Data collection and preprocessing;

Step 2: Extract data features;

Step 3: Selection and construction of deep learning model;

Step 4: Train and optimize the model;

Step 5: Evaluate and select the model;

Step 6: Deploy and apply the model;

Step 7: Continuously improve and maintain the model.

2. The deep learning-based industrial robot 3D visual recognition method according to claim 1, characterized in that step 1 includes data acquisition, data preprocessing and data enhancement; the specific steps are to first acquire the data, then preprocess the data, and then enhance the data.

3. The deep learning-based industrial robot 3D visual recognition method according to claim 1, characterized in that step 2 includes feature design and deep learning feature extraction; the specific steps are to perform 3D feature design before deep learning, and then use the deep learning model to extract the deep learning features.

4. The deep learning-based industrial robot 3D visual recognition method according to claim 1, characterized in that step 3 includes model selection and model construction; the specific steps are to first select a suitable deep learning model architecture according to task requirements and data characteristics, and then build a specific deep learning model based on the selected model architecture.

5. The deep learning-based industrial robot 3D visual recognition method according to claim 1, characterized in that step 4 includes loss function design, data division, training process, validation and tuning; the specific steps are to first design a suitable loss function to measure the difference between the model prediction result and the actual label, then divide the preprocessed data set into a training set, a validation set and a test set, then use the preprocessed 3D data set to train the model, then evaluate the performance of the model on the validation set, and tune the model according to the evaluation results.

6. The deep learning-based industrial robot 3D visual recognition method according to claim 1, characterized in that step 5 includes evaluation metrics, testing process and model selection; the specific steps are to first use a validation set or a test set to evaluate the performance of the model, then use the test set data to test the model and evaluate the generalization ability and actual recognition effect of the model, and then select the model with the best performance as the final model based on the evaluation results.

7. The deep learning-based industrial robot 3D visual recognition method according to claim 1, characterized in that step 6 includes model deployment, system integration and on-site testing and tuning; the specific steps are to first export the trained model into a format suitable for use by the industrial robot and deploy it to the industrial robot's visual recognition system to form a 3D visual recognition module, then integrate the 3D visual recognition module with other systems of the industrial robot to work together, and then test the system in an actual working environment and perform necessary tuning and adjustments based on the test results.

8. The deep learning-based industrial robot 3D visual recognition method according to claim 1, characterized in that step 7 includes data collection and feedback, model optimization and system maintenance; the specific steps are to first continuously collect new 3D data during actual operation, then further optimize and adjust the model according to the data feedback and performance evaluation results in actual applications to improve the recognition accuracy and generalization ability, and finally to regularly maintain and service the system to ensure its stable operation and long-term reliability.