CN119299332B - A safety monitoring system and monitoring method based on machine vision

Info

Publication number: CN119299332B
Application number: CN202411404721.9A
Authority: CN
Inventors: 金涛; 庄会富
Original assignee: Kunming Institute of Botany of CAS
Current assignee: Kunming Institute of Botany of CAS
Priority date: 2024-10-09
Filing date: 2024-10-09
Publication date: 2025-06-17
Anticipated expiration: 2044-10-09
Also published as: CN119299332A

Abstract

The invention relates to the field of network security monitoring and image processing, and in particular to a security monitoring system and a security monitoring method based on machine vision. The method comprises: collecting visual image data of server hardware in real time; performing multi-frequency-domain decomposition on the visual image data; filtering and recombining the resulting signals of the different frequency domains to generate a denoised reconstructed image; obtaining a global feature representation with a depth network; constructing a space-time diagram and analyzing behavior to obtain a comprehensive space-time feature representation; fusing the comprehensive space-time feature representation with sensor data; and performing risk assessment to determine an early warning strategy. The invention addresses the following problems: false alarms or missed alarms caused by noise interference that cannot be effectively separated and suppressed; difficulty in identifying complex abnormal behaviors or dangerous operations; incomplete risk assessment because the visual data are not fused with sensor data in a multi-modal manner; and delayed response or inaccurate response measures.

Description

Safety monitoring system and monitoring method based on machine vision

Technical Field

The invention relates to the field of network security monitoring and image processing, in particular to a security monitoring system and a security monitoring method based on machine vision.

Background

With the rapid development of data centers and cloud computing technology, security monitoring of server hardware has become increasingly important. In a data center, monitoring of server hardware is an important link in ensuring stable operation of the system. Conventional monitoring of server hardware relies mainly on manual inspection and basic hardware monitoring tools, and suffers from problems such as untimely monitoring and a high false alarm rate. As technology has developed, some server hardware security monitoring systems based on machine vision have been applied in real scenarios.

However, existing machine-vision-based server hardware security monitoring systems still have drawbacks. First, when these systems process images in the complex environment inside a server cabinet, they often cannot effectively separate and suppress interference from noise sources such as server equipment operation and light changes inside the cabinet, so the hardware state detection result is easily affected by noise and false alarms or missed alarms occur. Second, for server hardware anomaly detection and maintenance operation analysis, existing systems have difficulty identifying complex hardware failure modes or potentially dangerous maintenance operations; in multi-server, high-density deployments in particular, their performance often cannot meet actual requirements. Moreover, many systems do not make full use of data from the different sensors inside the server for multi-modal fusion, so the risk assessment of the running state of the server is not comprehensive enough and certain potential hardware failures or safety hazards may be overlooked. Finally, the early warning mechanism of existing systems often lacks flexibility, so that in practice the response to an abnormal server state lags or the response measures are not accurate enough, which ultimately affects the safe operation of the data center.

Disclosure of Invention

The invention provides a safety monitoring system and a monitoring method based on machine vision, which address the following problems: detection results are easily affected by noise, producing false positives or false negatives, because interference from noise sources such as equipment operation and ambient light changes cannot be effectively separated and suppressed; complex abnormal behaviors or dangerous operations are difficult to identify effectively; data from different sensors are not fully used for multi-modal fusion, so risk assessment is not comprehensive and potential hazards in the environment may be overlooked; and the early warning mechanism lacks flexibility, so that in practice the response lags or the response measures are inaccurate, which ultimately affects the safety guarantee.

The invention discloses a safety monitoring system and a monitoring method based on machine vision, which specifically comprise the following technical scheme:

A machine vision based security monitoring method comprising the steps of:

S1, acquiring visual image data of server hardware in real time, performing multi-frequency domain decomposition on the visual image data through generalized Fourier-Bessel transformation to obtain signals of different frequency domains, performing adaptive nonlinear filtering processing on the signals of the different frequency domains to obtain frequency domain signals after filtering processing;

S2, obtaining a global feature representation from the denoised reconstructed image through a depth network with nonlinear superposition of high-order polynomials, constructing a space-time diagram based on the global feature representation and analyzing behavior to obtain a comprehensive space-time feature representation, fusing the comprehensive space-time feature representation with sensor data to obtain fused fuzzy decision data, and performing risk assessment based on the fused fuzzy decision data to generate an early warning strategy.

Preferably, the S1 specifically includes:

performing adaptive nonlinear filtering on the signals of the different frequency domains by introducing a nonlinear transformation, to obtain the filtered frequency domain signals.

Preferably, the S1 specifically includes:

recombining the filtered frequency domain signals through weighted fusion to obtain a denoised reconstructed image.

Preferably, the S2 specifically includes:

based on the denoised reconstructed image, introducing a depth network with nonlinear superposition of high-order polynomials, extracting the feature representation of each layer of the depth network, and weighting and fusing the feature representations of all layers of the depth network to obtain a global feature representation.

Preferably, the S2 specifically includes:

constructing a space-time diagram based on the global feature representation, introducing a joint transformation of a high-order Bessel function and the Laplacian operator, analyzing the space-time diagram to obtain a space-time feature representation, performing a convolution operation on the space-time feature representation, and aggregating the space-time feature information at different moments to obtain a comprehensive space-time feature representation.

Preferably, the S2 specifically includes:

fusing the comprehensive space-time feature representation with the sensor data through fuzzy logic and convolution integration to obtain fused fuzzy decision data.

Preferably, the S2 specifically includes:

generating an early warning strategy through nonlinear mapping and a fuzzy reasoning model based on the risk assessment result.

A machine vision based security monitoring system comprising:

the system comprises an image acquisition module, a data preprocessing module, a target detection module, a behavior analysis module, an early warning response module and a database;

the image acquisition module captures visual image data of the server hardware in real time and transmits the acquired visual image data to the data preprocessing module;

the data preprocessing module performs multi-frequency-domain decomposition on the visual image data through generalized Fourier-Bessel transformation to obtain signals of different frequency domains, performs adaptive nonlinear filtering on the signals of the different frequency domains to obtain filtered frequency domain signals, recombines the filtered frequency domain signals through weighted fusion to generate a denoised reconstructed image, and transmits the denoised reconstructed image to the target detection module and the database;

the target detection module is used for extracting each layer of characteristic representation of the depth network from the denoised reconstructed image based on the depth network with nonlinear superposition of the high-order polynomial, and carrying out weighted fusion on the characteristic representations of all layers of the depth network to obtain a global characteristic representation;

the behavior analysis module is used for constructing a space-time diagram based on the global feature representation by utilizing a high-order Bessel transformation network, extracting the space-time feature representation, and aggregating the space-time features through a convolution operation to form a comprehensive space-time feature representation;

The early warning response module is used for carrying out fusion processing on the comprehensive space-time characteristic representation and the sensor data to obtain fused fuzzy decision data, carrying out risk assessment based on the fused fuzzy decision data to obtain a risk assessment result, and generating an early warning strategy through nonlinear mapping and a fuzzy reasoning model based on the risk assessment result;

The database is used for storing the data transmitted by the data preprocessing module, the target detection module and the behavior analysis module.

The technical scheme of the invention has the beneficial effects that:

1. Environmental noise is effectively separated and suppressed through the combination of generalized Fourier-Bessel transformation and adaptive nonlinear filtering, while key visual image features are preserved. The adaptive nonlinear filtering takes the characteristics of local noise into account and enhances the contrast of the visual image signals through nonlinear transformation, which improves the accuracy and robustness of visual image processing and ensures the reliability of subsequent target detection;

2. Multi-level feature extraction and fusion are performed on the denoised reconstructed image by a depth network with nonlinear superposition of high-order polynomials, which effectively captures complex features, in particular high-order features, in the denoised reconstructed image;

3. The machine vision system can accurately capture the behavior mode and the interrelation of the target object in the space-time dimension by constructing a space-time diagram and combining the high-order Bessel function and the Laplace operator joint transformation, and the aggregation of the space-time characteristics further enhances the capability of the machine vision system in the aspects of analyzing the target behavior and identifying abnormal operation, so that the potential safety risk can be effectively detected;

4. The invention not only relies on the data of the machine vision system, but also combines the data from various sensors (such as temperature, pressure, vibration and the like), and the multi-mode data are fused in a fuzzy logic and convolution integral mode to form unified risk assessment data;

5. Based on the fused fuzzy decision data, the machine vision system realizes dynamic assessment of risks through second derivative analysis, and triggers early warning response mechanisms of different levels according to risk assessment results. The machine vision system is provided with a plurality of early warning thresholds, so that different early warning levels can be flexibly controlled, and the machine vision system can respond timely and accurately under various risk conditions from low-level early warning (such as sound warning) to high-level early warning (such as automatic shutdown).

Drawings

FIG. 1 is a block diagram of a machine vision based security monitoring system according to the present invention;

FIG. 2 is a flow chart of a machine vision-based security monitoring method according to the present invention;

FIG. 3 is a topology diagram of a visual monitoring scene of a server according to the present invention;

FIG. 4 is a design diagram of a safety monitoring system architecture based on machine vision according to the present invention.

Detailed Description

In order to further illustrate the technical means and effects adopted by the present invention to achieve the preset purpose, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following specifically describes a specific scheme of a machine vision-based safety monitoring system and a monitoring method provided by the invention with reference to the accompanying drawings.

Referring to FIG. 1, a block diagram of a machine vision based security monitoring system according to one embodiment of the present invention is shown, the system comprising:

the image acquisition module captures visual image data of the server hardware in real time through the camera, and transmits the captured image data to the data preprocessing module;

The data preprocessing module carries out multi-frequency domain decomposition on the visual image data by adopting generalized Fourier-Bessel transformation to obtain signals of different frequency domains; noise suppression and characteristic enhancement are carried out on signals in different frequency domains through a self-adaptive nonlinear filter, and finally, reconstructed image data are fused through weighting to obtain a denoised reconstructed image;

The target detection module is used for extracting multi-layer features from the denoised reconstructed image based on a depth network of the nonlinear superposition of the high-order polynomials, generating a global feature representation for identifying potential dangerous targets or abnormal equipment states;

The behavior analysis module is used for constructing a space-time diagram based on the global feature representation by utilizing a high-order Bessel transformation network, extracting the space-time feature representation, aggregating the space-time features through a convolution operation to form a comprehensive space-time feature representation, and analyzing the behavior pattern of a target to identify dangerous behaviors or abnormal operation;

the early warning response module is used for fusing the comprehensive space-time feature representation with other sensor data to obtain fused fuzzy decision data, performing risk assessment based on the fused fuzzy decision data to obtain a risk assessment result, and generating, based on the risk assessment result, an early warning signal through nonlinear mapping and a fuzzy reasoning model, the early warning signal being used to trigger corresponding safety response measures. The machine vision system is provided with a plurality of early warning thresholds, each of which corresponds to an early warning or response mechanism of a different level.

Referring to fig. 2, a flow chart of a machine vision based security monitoring method according to an embodiment of the present invention is shown, the method includes the following steps:

Referring to FIG. 3 and FIG. 4, the client captures visual image data of the server hardware through the image acquisition module and transmits the visual image data to the server side. The server side comprises the data preprocessing module, the target detection module, the behavior analysis module and the early warning response module. The captured visual image data is stored in a standard format, transmitted to the data preprocessing module in real time through a high-speed data transmission interface (such as GigE or USB 3.0), and stored and managed in a server cabinet, which ensures the integrity and ready availability of the data.

The collected visual image data contains abundant environmental information, but also a large amount of complex noise, which may come from equipment operation, ambient light changes and other uncertain factors. Preprocessing is therefore needed to improve the accuracy of subsequent analysis. The data preprocessing module in the server-side elastic computing device performs multi-frequency-domain decomposition on the visual image data using the generalized Fourier-Bessel transformation. The generalized Fourier-Bessel transform formula is:

Wherein F_i(x, y) represents the value of the visual image signal in the i-th frequency domain, that is, the frequency domain component of the input visual image data after the generalized Fourier-Bessel transformation (the signal in the i-th frequency domain); the complex exponential term represents the contributions of different frequency components in the Fourier transform; ω_{m,n} is the angular frequency; j is the imaginary unit; J_v(αr) is a Bessel function representing the amplitude distribution in the frequency domain; v is the order of the Bessel function; α is a parameter regulating the frequency amplitude; and r is the spatial distance.
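The two ingredients named above, a complex exponential term and a Bessel amplitude J_v(αr), can be sketched as follows. This is a minimal illustration only: the transform formula itself is not reproduced in this text, so the way the two factors are combined in `basis_term`, the phase form `exp(-j·ω·(x + y))`, and the function names are assumptions rather than the patented formula. The Bessel function is evaluated with Bessel's integral for integer order.

```python
import cmath
import math


def bessel_j(order: int, x: float, steps: int = 400) -> float:
    """Bessel function of the first kind J_order(x) for integer order,
    from (1/pi) * integral over [0, pi] of cos(order*t - x*sin(t)) dt."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid rule end points
        total += weight * math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def basis_term(x: float, y: float, omega: float, order: int, alpha: float) -> complex:
    """One hypothetical Fourier-Bessel basis term at pixel (x, y)."""
    r = math.hypot(x, y)                       # spatial distance from the origin
    phase = cmath.exp(-1j * omega * (x + y))   # complex exponential term (assumed form)
    return phase * bessel_j(order, alpha * r)  # Bessel amplitude J_v(alpha * r)
```

A decomposition into frequency domains would sum such terms against the image for each band; that summation is left out here because its exact form is not given.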

After the multi-frequency-domain decomposition of the visual image data is completed, the data preprocessing module performs adaptive nonlinear filtering on the signals F_i(x, y) of the different frequency domains to suppress noise and enhance key features. The adaptive nonlinear filter not only takes the characteristics of local noise into account, but also introduces a nonlinear transformation to further optimize the processing effect. The specific formula of the adaptive nonlinear filtering is as follows:

Wherein F'_i(x, y) represents the visual image signal in the i-th frequency domain after adaptive nonlinear filtering, that is, the filtered frequency domain signal; the noise estimate of the visual image signal at position (x, y) in the i-th frequency domain is obtained by statistically analyzing the gray-scale variation of a local area of the visual image signal; λ_i is the adjustment parameter of the adaptive nonlinear filter, which controls the filter strength so that the useful signal is retained while noise is suppressed; the nonlinear function tanh(γ·F_i(x, y)) further enhances the contrast of the visual image signal in the i-th frequency domain, where γ is a nonlinear coefficient that adjusts the strength of the enhancement; and ω is a frequency component describing the distribution of the visual image signal in the i-th frequency domain.
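A filter of this kind can be sketched as below. The filtering formula is not reproduced in this text, so the combination used here, tanh(γ·F_i) divided by (1 + λ_i · local variance), is an assumed form chosen only to show how a local noise estimate, the adjustment parameter λ_i and the tanh contrast term could interact. The window radius and the function names are likewise hypothetical.

```python
import math


def local_variance(band, x, y, radius=1):
    """Noise estimate: variance of the values in the window around (x, y)."""
    h, w = len(band), len(band[0])
    window = [band[i][j]
              for i in range(max(0, x - radius), min(h, x + radius + 1))
              for j in range(max(0, y - radius), min(w, y + radius + 1))]
    mean = sum(window) / len(window)
    return sum((v - mean) ** 2 for v in window) / len(window)


def adaptive_filter(band, lam=1.0, gamma=1.0):
    """Assumed form: tanh(gamma * F) / (1 + lam * local noise estimate).

    Flat regions (variance 0) keep the tanh-enhanced value unchanged;
    regions with strong local variation are attenuated more.
    """
    return [[math.tanh(gamma * band[x][y]) / (1.0 + lam * local_variance(band, x, y))
             for y in range(len(band[0]))]
            for x in range(len(band))]
```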

And recombining the filtered frequency domain signals, and weighting and fusing the results of different frequency domains to form a denoised reconstructed image. The formula is as follows:

Wherein I'(x, y) is the value of the denoised reconstructed image at the spatial coordinates (x, y); α_i is the weighting coefficient of the i-th frequency domain, representing the importance of the i-th frequency domain in the final image reconstruction; cos(βω) is a cosine function used for frequency domain fusion; β is a parameter that adjusts the frequency of the cosine function; and ω is a frequency component in the frequency domain. In the denoised reconstructed image I'(x, y), noise is effectively suppressed while important image features are retained.
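The weighted fusion can be illustrated with a short sketch. It assumes that each frequency domain is summarized by a single representative frequency ω_i, so that the reconstruction reduces to I'(x, y) = Σ_i α_i · cos(β·ω_i) · F'_i(x, y); the original formula is not reproduced in this text and may treat ω differently (for example by integrating over it). The function name `fuse_bands` is hypothetical.

```python
import math


def fuse_bands(bands, alphas, omegas, beta=1.0):
    """Weighted recombination of filtered bands into one image.

    bands  : list of 2-D lists F'_i, all of the same shape
    alphas : weighting coefficient alpha_i per band
    omegas : one representative frequency per band (an assumption)
    """
    h, w = len(bands[0]), len(bands[0][0])
    image = [[0.0] * w for _ in range(h)]
    for band, alpha, omega in zip(bands, alphas, omegas):
        gain = alpha * math.cos(beta * omega)  # alpha_i * cos(beta * omega_i)
        for x in range(h):
            for y in range(w):
                image[x][y] += gain * band[x][y]
    return image
```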

The target detection module takes the denoised reconstructed image as input, and uses a high-order polynomial nonlinear superimposed depth network to perform target detection, so that a potential dangerous target or abnormal equipment state is accurately identified from the denoised reconstructed image.

Specifically, the multi-layer characteristic representation is extracted through a depth network of nonlinear superposition of a high-order polynomial, and the formula is as follows:

Wherein G_l is the feature representation of the l-th layer of the depth network; Y_l is the input feature of the l-th layer; Y₁ = I'(x, y) is the input feature of the 1st layer, that is, the input denoised reconstructed image, and the feature representation generated from it by successive nonlinear transformations serves as the input feature of the subsequent layers; Y_l^q is the q-th power of the input feature of the l-th layer; V_l is the weight matrix of the l-th layer; b_l is the bias term of the l-th layer; q is the order of the polynomial, and taking the q-th order derivative with respect to a specific dimension z of the input feature of the l-th layer (such as a spatial coordinate) further captures the higher-order features in the denoised reconstructed image; p is the highest polynomial order, that is, the depth of the extracted features, which determines the nonlinear expression capability of the depth network; and the superposition coefficient controls the influence of the feature representation of the previous layer on the feature representation of the current layer. The feature representations G_l of the layers of the depth network are superimposed step by step over multiple levels, forming a more expressive global feature representation.
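A single layer of such a network can be sketched as follows. The layer formula is not reproduced in this text, so this is an assumed, simplified form: the weight matrix is replaced by one scalar weight per polynomial order, the derivative term is omitted, tanh is used as the nonlinearity, and the superposition coefficient is called `kappa`. All of these choices are illustrative.

```python
import math


def poly_layer(y, weights, bias, prev=None, kappa=0.5):
    """Assumed form: G_l = tanh(sum_q V_q * Y_l**q + b_l) + kappa * G_{l-1}.

    y       : input feature vector Y_l
    weights : one scalar per polynomial order q = 1..p (simplification)
    prev    : feature representation of the previous layer, or None
    """
    out = []
    for i, value in enumerate(y):
        s = bias + sum(v * value ** q for q, v in enumerate(weights, start=1))
        g = math.tanh(s)
        if prev is not None:
            g += kappa * prev[i]  # superposition of the previous layer
        out.append(g)
    return out
```

Stacking calls, with each output passed both as `y` and as `prev` of the next call, gives the step-by-step superposition described above.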

The generation of the global feature representation is accomplished by the following formula:

Wherein G_global is the global feature representation; δ_l is the fusion weight of the feature representation of the l-th layer of the depth network; and L is the total number of layers of the depth network with nonlinear superposition of high-order polynomials. A global feature representation that integrates the information of all layers of the depth network is obtained by weighting and fusing the feature representations of all layers and integrating them over the time dimension r.
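The layer fusion can be illustrated with a minimal sketch, assuming the weighted sum G_global = Σ_l δ_l · G_l over equal-length feature vectors. The integration over the time dimension mentioned above is not included, and the function name is hypothetical.

```python
def fuse_layers(layer_features, deltas):
    """G_global = sum over layers of delta_l * G_l, element by element."""
    size = len(layer_features[0])
    return [sum(d * g[i] for d, g in zip(deltas, layer_features))
            for i in range(size)]
```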

The behavior analysis module constructs a space-time diagram G based on the global feature representation G_global. The purpose is to capture the interrelationships and behavior patterns of target objects in time and space through the structured representation of the diagram, and thereby identify dangerous behavior or abnormal operation. The specific implementation steps are as follows:

In the global feature representation G_global, the feature representation of each detection target (that is, of each layer of the depth network) serves as a node of the space-time diagram. The feature vector of a node comes from a specific subset of the global feature representation and represents the spatial position, speed, appearance features and the like of the detection target. The edges between nodes represent the space-time relationships between different detection targets. The features of an edge are calculated from the feature vectors of its two nodes; common calculation methods include Euclidean distance and cosine similarity. The weight of an edge depends on the physical distance and relative speed between the detection targets: it is an exponential with the base of the natural logarithm, e, whose exponent is the Euclidean distance between the two nodes divided by an adjustment parameter that controls how fast the edge weight decays. The space-time diagrams at different moments are connected along the time dimension to form a time-sequence diagram: the nodes at each moment are connected by edges to the nodes at the next moment, which ensures continuity in time and space.
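The edge weighting can be sketched as below. Because the adjustment parameter is described as controlling attenuation, the exponent is assumed here to be negative, giving w = exp(-d / ρ); the sign, the parameter name `rho` and the fully connected layout within one moment are assumptions made for illustration.

```python
import math


def edge_weight(a, b, rho=1.0):
    """Weight that decays with the Euclidean distance between two nodes."""
    d = math.dist(a, b)          # Euclidean distance between feature vectors
    return math.exp(-d / rho)    # negative exponent is an assumption


def build_edges(nodes, rho=1.0):
    """Edges between every pair of nodes at one moment: {(i, j): weight}."""
    return {(i, j): edge_weight(nodes[i], nodes[j], rho)
            for i in range(len(nodes))
            for j in range(i + 1, len(nodes))}
```

Nodes that coincide get weight 1, and the weight falls toward 0 as targets move apart; a larger `rho` makes the decay slower.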

After the time-space diagram G is built, the time-space diagram G is analyzed, and in order to better capture the time-space characteristics, a high-order Bessel function and Laplacian joint transformation is introduced, wherein the formula is as follows:

H₀ = σ(W₀·G_global + b₀)

Wherein H_{t+1} is the space-time feature representation at the next moment, which combines the second-derivative features of the space-time diagram with the space-time feature representation H_t of the current moment; H₀ is the initial space-time feature representation; the Laplacian operator extracts local second-derivative features of the space-time diagram and strengthens the representation of space-time dependencies; J_k is the k-th order Bessel function, taken up to a highest order; the weight corresponding to the k-th order Bessel function controls the contribution of space-time features of different orders in the space-time diagram; an adjustment coefficient and a frequency parameter adjust the amplitude and frequency with which the space-time features change in the time dimension; cosh is the hyperbolic cosine function, which further enhances the expressive capability of the space-time features; W₀ is the weight matrix that linearly transforms the global feature representation; and b₀ is the bias vector added to the result of the linear transformation, which shifts the center of the space-time feature representation and avoids over-reliance on zero values.
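The role of the Laplacian operator on a graph can be illustrated with the standard weighted graph Laplacian, (L h)_i = Σ_j w_ij · (h_i − h_j). This sketch covers only that one ingredient: the Bessel and cosh terms of the update are not reconstructed, and the use of this particular Laplacian is an assumption, since the text does not state which discretization is meant.

```python
def graph_laplacian_apply(weights, values):
    """Apply the weighted graph Laplacian to one scalar feature per node.

    weights : {(i, j): w_ij} for undirected edges, each listed once
    values  : list with one feature value h_i per node
    """
    out = [0.0] * len(values)
    for (i, j), w in weights.items():
        diff = values[i] - values[j]
        out[i] += w * diff   # contribution of neighbour j to node i
        out[j] -= w * diff   # symmetric contribution to node j
    return out
```

On a chain of nodes with unit weights, the negated result at an interior node equals the usual second difference h_{i-1} − 2·h_i + h_{i+1}, which is why the operator is described as extracting second-derivative features.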

The extracted space-time characteristic representations are aggregated by convolution operation to form a comprehensive space-time characteristic representation, and the formula is as follows:

Wherein H_final is the comprehensive space-time feature representation; a weighting coefficient of the time dimension is applied to each moment; W_c is a convolution kernel, and the convolution operation on the space-time feature representations aggregates the space-time feature information at different moments into the comprehensive space-time feature representation; and T is the length of the time sequence, that is, the total number of time steps.
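The aggregation can be sketched as a time-weighted sum of convolved features, H_final = Σ_t η_t · (W_c applied to H_t). The aggregation formula is not reproduced in this text, so this form, the name `eta` for the time weights (passed as `time_weights`), and the zero-padded one-dimensional sliding window are assumptions. The window is applied without flipping the kernel, as is common in deep learning code.

```python
def conv1d_same(signal, kernel):
    """Zero-padded sliding-window filter; output has the input's length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):   # positions outside count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aggregate(features_over_time, time_weights, kernel):
    """Sum over time steps of eta_t * conv(H_t, W_c)."""
    size = len(features_over_time[0])
    total = [0.0] * size
    for h_t, eta in zip(features_over_time, time_weights):
        conv = conv1d_same(h_t, kernel)
        for i in range(size):
            total[i] += eta * conv[i]
    return total
```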

The comprehensive space-time characteristic representation is used as a comprehensive video stream capable of reflecting the current risk state, the server side displays the processed video stream through the front end, and an operator can check real-time data, adjust safety monitoring system parameters and check a historical alarm record through a configuration page.

The early warning response module takes the comprehensive space-time characteristic representation generated by the behavior analysis module as input, and performs fusion processing on the comprehensive space-time characteristic representation and data acquired by other sensors (such as a temperature sensor, a pressure sensor, a vibration sensor, a humidity sensor and the like), wherein the other sensor data are used for supplementing image data of a machine vision system, and more dimension data are provided to help a safety monitoring system to evaluate the environment state and potential safety risks more accurately. The fusion process is realized by fuzzy logic and convolution integral, and the formula is as follows:

Wherein D_f is the fused fuzzy decision data, which serves as the basic data for risk assessment; u is the index of the other sensor data; N is the total number of other sensor data; the membership function of the fuzzy set gives the degree of membership calculated from the feature value extracted from S_u; S_u is the u-th sensor data matrix, which is fused with the comprehensive space-time feature representation through a convolution operation; and dω denotes integration over the frequency variable. The integral ensures that all frequency domain components of the sensor data are processed, that is, the sensor data are processed comprehensively in the frequency domain.
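The membership-weighted part of the fusion can be sketched as below. This is a strongly simplified illustration: the fusion formula is not reproduced in this text, so the frequency integral and the convolution are left out, each sensor is reduced to one reading, the visual side is reduced to one score, and a linear ramp is assumed as the membership function of a fuzzy set "abnormal". The names and the example ranges are hypothetical.

```python
def ramp_membership(value, low, high):
    """Degree in [0, 1] to which a reading belongs to the set 'abnormal'."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def fuse(visual_score, readings, ranges):
    """Assumed form: D_f = (1/N) * sum_u mu(S_u) * visual_score.

    readings : {sensor name: current reading}
    ranges   : {sensor name: (low, high)} bounds of the membership ramp
    """
    total = 0.0
    for name, value in readings.items():
        low, high = ranges[name]
        total += ramp_membership(value, low, high) * visual_score
    return total / len(readings)
```

For example, a temperature reading halfway up its ramp combined with a visual score of 0.8 yields a fused value of 0.4.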

And performing risk assessment on the fused fuzzy decision data, wherein the formula is as follows:

Wherein R(t) is the risk level at the current moment t, that is, the risk value evaluated from the input data of the current moment (the risk assessment result); the second derivative of the fused fuzzy decision data in the time dimension reflects how the fused fuzzy decision data change dynamically over time; and R(t-1) is the risk level at the previous moment.
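A recursive risk update driven by the second derivative can be sketched as follows. The risk formula is not reproduced in this text, so the form used here, R(t) = ρ·R(t-1) + (1 − ρ)·(D_f(t) + κ·|D_f''(t)|), is an assumption, as are the parameter names `rho` and `kappa`. The second derivative is approximated by a finite difference with unit time step.

```python
def second_difference(series):
    """Discrete second derivative at the latest of three or more samples."""
    return series[-1] - 2.0 * series[-2] + series[-3]


def update_risk(prev_risk, fused_history, rho=0.7, kappa=1.0):
    """Blend the previous risk level with the current fused value and
    the magnitude of its second derivative (assumed form)."""
    change = abs(second_difference(fused_history))
    return rho * prev_risk + (1.0 - rho) * (fused_history[-1] + kappa * change)
```

With this form a steadily rising fused value (zero second difference) raises the risk only through its level, while an accelerating one raises it further.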

According to the risk assessment result, an early warning strategy is determined through nonlinear mapping and a fuzzy reasoning model, and the formula is as follows:

Wherein D_output(t) is the decision output signal at time t; θ is a decision weight representing the influence of the risk assessment result on the early warning decision; and τ is a normalization coefficient that adjusts the scale of the risk level R(t) and ensures the smoothness of the decision output signal. The final output D_output(t) is the early warning signal of the alarm system and is used to trigger corresponding safety measures, such as alarms, notifications or automatic shutdown.

The alarm system sets a plurality of early warning thresholds, each corresponding to an early warning or response mechanism of a different level. When D_output(t) reaches or exceeds a given early warning threshold, the alarm system triggers the corresponding early warning mechanism and, according to the triggered mechanism, decides which early warning measures to start. For example, the early warning levels may be divided into low, medium and high, corresponding to different risk levels: a low-level early warning may trigger an audible alarm or display a warning message on a monitor screen to draw the operator's attention; a medium-level early warning may send a short message or email notifying security managers to investigate the potential risk further; and a high-level early warning may immediately execute emergency measures such as automatic shutdown and security isolation to prevent serious accidents.
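The mapping from risk level to graded early warning can be sketched as below. The decision formula is not reproduced in this text, so D_output = θ · tanh(R / τ) is an assumed nonlinear mapping, the fuzzy reasoning model is omitted, and the threshold values 0.2, 0.5 and 0.8 are hypothetical examples rather than values from the source.

```python
import math

# Hypothetical thresholds, checked from the highest level downwards.
LEVELS = [(0.8, "high"), (0.5, "medium"), (0.2, "low")]


def decision_signal(risk, theta=1.0, tau=1.0):
    """Assumed mapping: theta * tanh(risk / tau), smooth and bounded."""
    return theta * math.tanh(risk / tau)


def warning_level(signal):
    """Return the highest level whose threshold the signal reaches."""
    for threshold, name in LEVELS:
        if signal >= threshold:
            return name
    return "none"
```

A caller would then dispatch on the returned level, for instance an audible alarm for "low", a message to security managers for "medium", and automatic shutdown for "high".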

Therefore, flexible control of different early warning levels is realized, and the machine vision system can make timely and accurate response under various risk conditions, so that the safety of a network environment is ensured.

In summary, this completes the description of the safety monitoring system and monitoring method based on machine vision.

The sequence of the embodiments of the invention is merely for description and does not represent the advantages or disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

In this specification, the embodiments are described in a progressive manner; for parts that are the same or similar, the embodiments may be referred to one another, and each embodiment mainly describes its differences from the other embodiments.

The foregoing embodiments are merely for illustrating the technical solution of the present invention, but not for limiting the same, and although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that the technical solution described in the foregoing embodiments may be modified or substituted for some of the technical features thereof, and that such modifications or substitutions do not depart from the spirit and scope of the technical solution of the embodiments of the present invention and are intended to be included in the scope of the present invention.

Claims

1. A machine vision based security monitoring method, comprising the steps of:

S2, extracting feature representations of each layer of the depth network through a depth network with nonlinear superposition of high-order polynomials based on the denoised reconstructed image, weighting and fusing the feature representations of all layers of the depth network to obtain global feature representations, constructing a space-time diagram based on the global feature representations, analyzing the space-time diagram to obtain space-time feature representations, carrying out convolution operation based on the space-time feature representations, aggregating space-time feature information at different moments to obtain comprehensive space-time feature representations, fusing the comprehensive space-time feature representations with sensor data to obtain fused fuzzy decision data, and carrying out risk assessment based on the fused fuzzy decision data to generate an early warning strategy.

2. The machine vision-based safety monitoring method according to claim 1, wherein in S1, adaptive nonlinear filtering processing is performed on signals in different frequency domains to obtain filtered frequency domain signals, and the method specifically comprises:

3. The machine vision based security monitoring method according to claim 2, wherein in S1, the generating a denoised reconstructed image based on the filtered frequency domain signal specifically comprises:

4. The machine vision-based safety monitoring method according to claim 1, wherein in S2, the space-time diagram is analyzed to obtain a space-time feature representation, and the method specifically comprises:

introducing a joint transformation of a high-order Bessel function and the Laplacian operator, and analyzing the space-time diagram to obtain the space-time feature representation.

5. The machine vision based safety monitoring method according to claim 4, wherein in S2, the integrated space-time feature representation and the sensor data are fused to obtain the fused fuzzy decision data, and the method specifically comprises:

6. The machine vision-based safety monitoring method according to claim 5, wherein in S2, risk assessment is performed based on the fused fuzzy decision data to generate an early warning strategy, and the method specifically comprises:

7. A machine vision based security monitoring system for use in a machine vision based security monitoring method as defined in claim 1, comprising:

the database is used for storing the data transmitted by the data preprocessing module, the target detection module and the behavior analysis module;

The early warning response module is used for carrying out fusion processing on the comprehensive space-time characteristic representation and the sensor data to obtain fused fuzzy decision data, carrying out risk assessment based on the fused fuzzy decision data to obtain a risk assessment result, and generating an early warning strategy through nonlinear mapping and a fuzzy reasoning model based on the risk assessment result.