Disclosure of Invention
In view of the above problems, the present invention provides a projection-based visualization method for abnormal component inspection. The method not only helps predictive maintenance personnel quickly find abnormal components in the data, but also supports abnormal pattern discovery and anomaly attribution for component data instances, and can guide the subsequent deployment of components so as to prolong their service life. The specific technical scheme is as follows:
A projection-based abnormal component inspection visualization method, comprising the steps of:
S1, data acquisition and processing
After the data of the aviation gas turbofan engine are obtained, the data are cleaned, processed and organized into structured data that can be used directly in the visual views;
S2, visual mapping
Visually mapping the data obtained in step S1 through visual channels:
designing a component projection view that uses a scatter plot and a contour map to present an overview of the data distribution, wherein the proximity of points in the overview represents the similarity of the component degradation modes;
displaying the real data of the monitored variables with area charts, and stacking the area charts vertically so that an analyst can find relations among the variables;
S3, visual layout and realization
Laying out and implementing the visual modules mapped in step S2:
in the component projection view, mapping each component to a point in the scatter plot and preserving the relative positions among the points, wherein, for the glyph, an improved radar chart in the middle records the operating conditions of the component, and a plurality of stacked rings are arranged on the outside;
in the data inspection view, a timeline layout is adopted, wherein a dot-line chart at the top of the view shows the working condition sequence in the current time period, and a timeline module with area charts is arranged at the bottom of the view;
S4, interactive design
In the scatter view, when the mouse hovers over a data point, the point is highlighted; the user can zoom to inspect a local area of the scatter plot in detail; by drawing a closed free-form shape (lasso) over the scatter points, the enclosed points are replaced by glyphs for in-depth exploration of the data; and the length of the time slice encoded by each ring of the glyph is controlled with a slider.
Further, in step S1, the data processing includes:
(1) Working condition identification:
dividing the working state according to 3 operating conditions, namely altitude, Mach number and sea level temperature, and clustering the working conditions using k-means;
(2) Data dimension reduction:
The component is characterized by multidimensional time series data, namely:
$X_k = [t_1, t_2, \dots, t_i, \dots, t_T],\quad k = 1, 2, \dots, n,\quad t_i \in \mathbb{R}^M$
wherein n is the number of components, M is the number of selected features, T is the length of the life cycle, t_i is the vector of feature values at time i, and k is the component number;
the data are aggregated with a sliding time window to obtain:
wherein w is the size of the time window, and x_{w_i} is the value of each feature after aggregation over window w_i;
for the resulting x_{w_i}, t-distributed stochastic neighbor embedding (t-SNE) is used to project each x_{w_i} vector into a one-dimensional space, so that the component is represented as:
$X_k \in \mathbb{R}^{T-w+1},\quad k = 1, 2, \dots, n$
the degradation curve obtained after this dimension reduction is then reduced to a two-dimensional space by t-SNE, and X_k is expressed as:
$X_k \in \mathbb{R}^{2},\quad k = 1, 2, \dots, n$
wherein n is the number of components;
(3) Outlier calculation:
Anomalies are defined using the Pearson coefficient. For the two time series of the s-th monitored variable of the components numbered k_i and k_j, the correlation is defined as follows:
Wherein the quantities are, in order: the value of the s-th monitored variable of component k_i at time point t; the value of the s-th monitored variable of component k_j at time point t; TS, the length of the current time interval; and the average value of each series over the time interval TS;
wherein the outlier is defined as:
Wherein n is the number of components.
Further, in step S2, the visual mapping is specifically:
S21, mapping the shape, position and color of the scatter plot in the component projection view, wherein the position of a point maps the dimension reduction result of the component;
S22, carrying out color and area mapping on the contour map in the component projection view, namely filling the blank regions between contour lines with a specific color, mapping the density of components projected in a local area to the depth of the fill color, and mapping the number of behaviors to the area of the petals;
S23, carrying out shape, size and color mapping on the improved radar chart of the special glyph in the component projection view, wherein the colors of the points map the different working condition categories;
S24, mapping position, arc and color on the improved radar chart of the special glyph in the component projection view, wherein different ring positions encode different monitoring indexes;
S25, mapping the colors and positions of the data inspection view, wherein six symbols map the six working conditions, connecting lines between dots of different colors represent the temporal order in which the component works under different working conditions, and, for highly correlated monitored variables, the left side of the area charts is connected with a cubic Bézier curve.
Further, the visual layout scheme of the component projection view specifically includes:
S3a, mapping the scatter plot and the contour lines onto the screen according to the screen width and height allocated by the system, and mapping the distribution density of components in different areas to the line density and transparency of the contour lines, thereby completing the basic layout of the view;
S3b, for the component glyph view, selecting each slice range and the monitored variables to be displayed in a control bar, and calculating the glyph size according to the number of monitored variables;
S3c, after the size is determined, spreading the glyphs apart with a collision detection algorithm based on force-directed layout, so as to solve the visual occlusion caused by data that are dense in a local area and sparse elsewhere;
S3d, counting the number of operations of each component under the six working conditions, and determining the distance from each point in the improved radar chart to the center point through a linear scale;
S3e, for the stacked ring chart, drawing the rings from inside to outside in the order in which the monitored variables were selected, cutting each ring into a plurality of segments according to the time slice length, and drawing each segment according to the outlier value of its time slice.
Further, mapping the distribution density of components in different areas to the line density and transparency of the contour lines in S3a specifically includes:
S3a1: two-dimensional scalar field computation
1a) For a given set of data points (x_1, x_2, …, x_n; y_1, y_2, …, y_n), determining the minimum and maximum observations, i.e., (x_1, y_1) and (x_n, y_n), where n is the number of components;
1b) Estimating the grid spacing h to be used for the two-dimensional grid division, thereby obtaining the grid demarcation points (a_0, a_1, …, a_Ve) and (b_0, b_1, …, b_He) of the data, wherein a_{ve+1} - a_{ve} = h, b_{he+1} - b_{he} = h, ve = 0, 1, ...;
the grid spacing h is calculated as follows:
Wherein IQR(x) is the difference between the upper quartile and the lower quartile of the sample, and n is the number of components;
1c) Counting the frequency of scatter points in each grid cell, namely (g_{0,0}, g_{0,1}, …, g_{Ve,He});
S3a2: contour calculation
2a) Determining an initial threshold, comparing the data value in each grid cell with the threshold, and marking cells above the threshold as 1 and all other cells as 0, thereby obtaining a binary image;
2b) Constructing a binary index from the four corner values of each cell by scanning around the cell clockwise, starting with the upper-left corner as the most significant bit and ending with the least significant bit, and combining the bits with bitwise OR into a four-bit index;
2c) Accessing a pre-built look-up table with the cell index, the table listing the edges needed to represent the cell;
2d) Applying linear interpolation between the raw data values to find the exact position of the contour along the cell edges;
S3a3: scale mapping and color mapping
The contour fill color transitions smoothly from white to a dark color, and the system represents the area density as a value from 0 to 1.
Further, in step S3c, the glyphs are spread apart as follows:
S3c1, obtaining the maximum and minimum x and y coordinates of all data selected by the lasso, mapping the difference between the maximum and the minimum to the width and height of the canvas, recalculating the real coordinates of the data in the canvas with a linear scale, and reprojecting the data points onto a new canvas;
S3c2, treating the data points on the canvas as point particles with the same mass and radius, and making the particles move by applying different force models;
S3c3, adding a collision force to each particle, and calculating the position and velocity of the particle after time Δt using Verlet integration;
S3c4, iterating repeatedly until the velocity of each particle is 0, the resulting particle positions being the final positions of the data points, at which the particles are separated from one another and the layout is stable.
Further, S3d specifically is:
S3d1, mapping the operation statistics under each working condition to the distance cRadius from the center of the radar chart along the axis, calculated as follows:
Wherein the scale uses the maximum and the minimum number of operations under working condition o; sum_o is the number of operations of the current component under working condition o; radarRadius is the radius of the radar chart; and O is the total number of working condition categories;
S3d2, calculating the X-axis coordinate xpos_o and the Y-axis coordinate ypos_o of the point in the chart from the distance between the point and the center, specifically:
S3d3, mapping the life cycle length of the component to the size of the center circle, and mapping the number of operations under each working condition to the size of the point on the corresponding radar axis, which is consistent with the point's distance from the center and is therefore a secondary encoding, calculated as follows:
Wherein rul_min is the shortest life cycle of the component, rul_max is the longest life cycle of the component, R is the maximum radius of the central circle, R is the maximum radius of the coordinate axis circle, rulRadius is the radius of the central circle, and dotArea_o is the distance from the point represented by working condition o to the center;
S3d4, mapping the outlier value of each subsequence of each monitored variable to the length of an arc, the angle occupied by the arc being calculated as follows:
wherein RULLen is the life cycle length of the component degradation, and the remaining term represents the length of the s-th monitored variable of component k within the time interval TS;
S3d5, the distance between the ring representing the s-th monitored variable and the center is:
$rRadius_s = rDis + (rGap + rBandWidth) \cdot (s - 1),\quad s = 1, 2, 3, \dots, M$
Where M is the number of selected features, rDis is the radius of the innermost ring, rGap is the spacing between rings, and rBandWidth is the width of a ring.
Further, in step S3, the specific process of the visual layout and implementation of the data inspection view is as follows:
S31, determining the life cycle range to be displayed, designing the area charts, and connecting each pair of monitored variables that the calculation shows to be highly correlated with a cubic Bézier curve;
S32, determining the color corresponding to each working condition, determining the display form of the dots, including the layout positions and the way the connecting lines are drawn, and laying out and implementing the dot-line chart.
The beneficial effects of the invention are as follows:
1) The invention overcomes the shortcomings of existing methods in analyzing abnormal data patterns of components. A traditional alarm system often cannot identify the specific cause of a data anomaly. With this visualization method, abnormal data can be found intuitively through dimension reduction projection, and a special glyph is designed to locate the cause of the anomaly in an abnormal data subset. Juxtaposing the special glyphs displays the abnormal pattern clearly and intuitively.
2) The invention overcomes the drawbacks of traditional systems, which only allow the raw data to be inspected, impose a heavy cognitive load and make anomalies hard to locate. The visualization method subdivides the working conditions of the components into six types, and a user can directly inspect the sequence of working conditions under which a component operated in order to explain component anomalies caused by an abnormal working condition sequence. Meanwhile, the cubic Bézier connections help the user find highly correlated monitoring indexes, so that anomalies can be located conveniently.
Detailed Description
The invention is described in further detail below with reference to the drawings and specific data.
Through an effective information visualization method combined with a multi-view linkage strategy and flexible interaction, the invention realizes multi-angle analysis of anomalies in aviation gas turbofan engine data, helping predictive maintenance personnel find anomalies in the data and analyze their patterns and causes. The technical scheme comprises four steps: data acquisition and processing, visual mapping, visual layout and realization, and interactive design. The specific steps are as follows:
Step one, data acquisition and processing
Effective information is screened from the aviation gas turbofan engine data provided by the U.S. National Aeronautics and Space Administration (NASA), and the data are stored.
1. Data acquisition: the FD004 data set is selected, which contains data of 249 engines running from the initial state to complete failure. The maximum life cycle of an engine is 543 and the minimum life cycle is 128. The data comprise 26 dimensions. Dimensions 1-2 are the machine number and the time point. Dimensions 3-5 are the working conditions of the machine, namely altitude, Mach number and sea level temperature. Dimensions 6-26 are the monitoring indexes of the engine, namely fan inlet total temperature, low-pressure compressor temperature, high-pressure compressor temperature, low-pressure turbine temperature, fan inlet pressure, bypass duct pressure, high-pressure compressor pressure, physical fan speed, physical core speed, engine pressure ratio, high-pressure compressor static pressure, ratio of fuel flow to high-pressure compressor static pressure, corrected fan speed, corrected core speed, bypass ratio, fuel-air ratio in the combustion chamber, bleed enthalpy, demanded fan speed, demanded corrected fan speed, high-pressure turbine coolant bleed and low-pressure turbine coolant bleed.
2. Data processing, which comprises working condition identification and data dimension reduction.
(1) Working condition identification: because the engine works under different working conditions, the working conditions can be divided according to 3 operating conditions, namely altitude, Mach number and sea level temperature. The working conditions are clustered using k-means. The main flow of the algorithm is: 1) specify k, i.e. the number of categories into which the data are divided; 2) randomly select k points from the data as the centroids of the clusters; 3) measure the distance between each data point and the centroids with a chosen distance measure, and assign each data point to the cluster with the nearest centroid; 4) recalculate the centroid of each cluster; 5) end the iteration if any one of three conditions holds, namely the centroids of the newly formed clusters no longer change, the points remain in the same clusters, or the maximum number of iterations is reached; otherwise repeat steps 3 to 5. Experiments show that the effect is best when k=6, so the working state of the engine is divided into working condition 1, working condition 2, working condition 3, working condition 4, working condition 5 and working condition 6.
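The five k-means steps listed above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used by the invention: the function name `kmeans`, the fixed random seed and the sample settings are assumptions, and the stopping test checks only "no point changes cluster" and the iteration cap.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on equal-length tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # step 2: k random points as initial centroids
    labels = [None] * len(points)
    for _ in range(max_iter):              # stop when the iteration cap is reached ...
        # Step 3: assign every point to its nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:           # ... or when no point changes cluster
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:                    # an empty cluster keeps its old centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical (altitude, Mach number, sea level temperature) settings from two regimes.
settings = [(0, 0, 100), (0, 1, 100), (1, 0, 100), (10, 10, 60), (10, 11, 60), (11, 10, 60)]
centroids, labels = kmeans(settings, k=2)
```

In practice the three operating conditions have very different scales, so they would normally be standardized before clustering; the sketch omits that step.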
(2) Data dimension reduction: the component can be characterized by multidimensional time series data, namely:
$X_k = [t_1, t_2, \dots, t_i, \dots, t_T],\quad k = 1, 2, \dots, n,\quad t_i \in \mathbb{R}^M$
Wherein n is the number of components, M is the number of selected features, T is the life cycle length, t_i is the vector of feature values at time i, and k is the component number.
To reduce the amount of computation and the deviation caused by inaccurate data acquisition, the data are aggregated with a sliding time window to obtain:
Wherein w is the size of the time window, and x_{w_i} is the value of each feature after aggregation over window w_i;
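A minimal sketch of the sliding-window aggregation follows. The aggregation formula itself is not reproduced in this text, so the use of the window mean is an assumption, as is the function name `sliding_window_mean`. A series of length T yields T - w + 1 windows, which matches the dimension R^{T-w+1} used later.

```python
def sliding_window_mean(series, w):
    """series: list of T feature vectors, each of length M.

    Returns T - w + 1 aggregated vectors x_{w_i}; the mean is an assumed aggregate.
    """
    if not 1 <= w <= len(series):
        raise ValueError("window size must be between 1 and the series length")
    m = len(series[0])
    return [
        [sum(row[f] for row in series[i:i + w]) / w for f in range(m)]
        for i in range(len(series) - w + 1)
    ]

# Four time points with two features each, window size 2.
windows = sliding_window_mean([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]], w=2)
```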
For the obtained x_{w_i}, t-distributed stochastic neighbor embedding (t-SNE) is applied. The algorithm steps are as follows:
a) The similarity between vectors is described using the Euclidean distance and expressed as a conditional probability:
Wherein σ_{w_i} is the variance.
b) The similarity between points in the low-dimensional space to be projected into is calculated and likewise expressed as a conditional probability:
Where y_{w_i} is the position of the data point in the low-dimensional space.
c) The Kullback-Leibler (K-L) divergence is commonly used to measure the distance between two probability distributions; the sum of the K-L divergences is minimized:
Wherein n is the number of components.
d) To alleviate the crowding of data points in the low-dimensional space, the similarity between data points there is calculated using the t-distribution:
Where y_{w_i} is the position of the data point in the low-dimensional space.
Through the above steps, each x_{w_i} vector is projected into a one-dimensional space, and the component can then be represented as:
$X_k \in \mathbb{R}^{T-w+1},\quad k = 1, 2, \dots, n$
The one-dimensional time series obtained after this dimension reduction is then reduced to a two-dimensional space using t-SNE, and X_k can be expressed as:
$X_k \in \mathbb{R}^{2},\quad k = 1, 2, \dots, n$
Wherein n is the number of components.
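The quantities named in steps a) to d) can be illustrated with a small sketch. It follows the standard t-SNE definitions (Gaussian conditional probabilities, symmetrized joint probabilities, Student-t similarities with one degree of freedom, K-L cost) and is not taken from the formulas of this document, which are not reproduced here. A fixed σ replaces the usual perplexity search, no gradient descent is performed, and all function names are illustrative.

```python
import math

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def gaussian_conditional(points, sigma=1.0):
    """Row i holds p_{j|i}: Gaussian similarity of j to i, normalised over j != i."""
    n = len(points)
    rows = []
    for i in range(n):
        w = [0.0 if j == i else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
             for j in range(n)]
        total = sum(w)
        rows.append([v / total for v in w])
    return rows

def symmetrize(cond):
    """Joint p_ij = (p_{j|i} + p_{i|j}) / (2n), as in standard t-SNE."""
    n = len(cond)
    return [[(cond[i][j] + cond[j][i]) / (2 * n) for j in range(n)] for i in range(n)]

def student_t_joint(embedding):
    """q_ij from a Student-t kernel with one degree of freedom, normalised over all pairs."""
    n = len(embedding)
    w = [[0.0 if i == j else 1.0 / (1.0 + sq_dist(embedding[i], embedding[j]))
          for j in range(n)] for i in range(n)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def kl_divergence(p, q):
    """Sum of p_ij * log(p_ij / q_ij) over pairs with p_ij > 0."""
    return sum(pij * math.log(pij / qij)
               for prow, qrow in zip(p, q) for pij, qij in zip(prow, qrow) if pij > 0)

# Three hypothetical window vectors: two close together, one far away.
vectors = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
p = symmetrize(gaussian_conditional(vectors))
# A 1-D embedding that keeps the neighbours together costs less than one that splits them.
cost_good = kl_divergence(p, student_t_joint([(0.0,), (0.1,), (5.0,)]))
cost_bad = kl_divergence(p, student_t_joint([(0.0,), (5.0,), (0.1,)]))
```

A full t-SNE run would move the low-dimensional positions by gradient descent until this cost stops decreasing.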
(3) Outlier calculation: anomalies are defined using the Pearson coefficient. For the two time series of the s-th monitored variable of components k_i and k_j, the correlation is defined as follows:
Wherein the quantities are, in order: the value of the s-th monitored variable of component k_i at time point t; the value of the s-th monitored variable of component k_j at time point t; TS, the length of the current time interval; and the average value of each series over the time interval TS;
wherein the outlier is defined as:
Wherein n is the number of components.
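The Pearson correlation between two series is standard and can be sketched directly. The outlier formula is not reproduced in this text, so the score below (one minus the mean correlation of a component with all other components over the same interval) is purely an assumption used for illustration, as are the function names.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length series over one time interval."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0                      # assumption: a constant series counts as uncorrelated
    return cov / math.sqrt(var_a * var_b)

def outlier_scores(series_by_component):
    """Assumed score: 1 - mean correlation with every other component (higher = more unusual)."""
    n = len(series_by_component)
    scores = []
    for i, si in enumerate(series_by_component):
        others = [pearson(si, sj) for j, sj in enumerate(series_by_component) if j != i]
        scores.append(1.0 - sum(others) / (n - 1))
    return scores

# One monitored variable for three hypothetical components; the third trends the other way.
scores = outlier_scores([[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]])
```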
Step two, visual mapping
After data acquisition and processing, the visual mapping scheme is designed for the component projection view (shown in fig. 2) and the data inspection view (shown in fig. 5) of the invention.
1. Component projection view
(1) Scatter diagram
Shape, position and color mapping is carried out on the scatter plot in the component projection view: the position of a point maps the dimension reduction result of the component, the color maps the clustering result of the component, and the transparency of the point maps the life cycle length of the component. Specifically:
Position: each point in the scatter plot maps one component, the two-dimensional coordinates of the point in the view represent the dimension reduction result of the component, and the distance between points maps the similarity between the degradation trajectories of the components;
Color: the components are clustered into 6 classes according to their degradation trajectories, and each class is labeled with a different color. The more tightly points of the same color gather together, the better the clustering effect;
Transparency: the transparency of a point characterizes the life cycle length of the component; the longer the life cycle, the lower the transparency.
(2) Contour map
Color mapping is carried out on the contour map in the component projection view, namely the blank regions between contour lines are filled with a specific color, and the darkness of the fill color maps the density of the components projected in the local area.
The area of the petals maps the number of behaviors. Specifically:
Contour color: the system fills the blank regions between contour lines with green. The higher the density of projected components in a local area, the darker the fill color, which helps an analyst understand the real distribution of the data;
Area: the larger the area, the larger the number; the smaller the area, the smaller the number.
(3) Special glyph
The glyph adopts a radial layout comprising two parts, InnerArea and OuterArea, as shown in fig. 4 (a). In InnerArea, the number of runs under each working condition is mapped using an improved radar chart. In OuterArea, stacked ring charts represent the degree of abnormality of the different monitoring indexes over the various time periods.
InnerArea
Shape, size and color mapping is carried out on the improved radar chart of the special glyph in the component projection view: the colors of the points map the different working condition categories, the size of the center circle maps the life cycle length of the component, and the shape of the radar chart encodes the number of runs of the component under the six working conditions, so that the operating conditions of components during degradation can be compared by comparing the shapes of the hexagons. Specifically:
Color: the different working condition categories are mapped with 6 colors.
Size: the size of the center circle encodes the life cycle length of the component.
Shape: six axes are arranged symmetrically, and the number of runs under each of the six working conditions is encoded by the distance between the point on the axis and the center. The points on the axes are connected end to end to form an irregular hexagon, and the operating conditions of components during degradation can be roughly compared by comparing the shapes of the hexagons.
OuterArea
Position, arc and color mapping is carried out on the rings of the special glyph in the component projection view: different ring positions encode different monitoring indexes, the arc encodes the length of a time slice, and the color of the ring segment encodes the degree of abnormality of the time slice. Specifically:
Position: OuterArea includes a plurality of rings, each ring representing one monitoring index. In fig. 4 (a), the four rings from inside to outside are the monitored data of the fan inlet total temperature, the high-pressure compressor temperature, the low-pressure turbine temperature and the fan inlet pressure, respectively. The number of rings and the monitoring indexes they represent can be selected by the user.
Arc: the monitored data are sliced in time, the length of the time slice is selected by the user, and the arc maps the length of the time slice; in fig. 4 (a) the degradation period is divided into five time slices;
Color: each ring segment has its own color, and the color encodes the degree of abnormality of the time slice.
2. Data inspection view
The color and position mapping of the data inspection view comprises: mapping the six working conditions with six symbols, representing the temporal order in which the component works under different working conditions by connecting lines between dots of different colors, and, for highly correlated monitored variables, connecting the left side of the area charts with a cubic Bézier curve. Specifically:
(1) Dot-line chart
Color: six strongly contrasting colors, namely red, orange, yellow, green, cyan and blue, map the working conditions.
Length: the length of a rectangle encodes the number of runs of the component under the working condition.
(2) Area chart
Height: the height of the area chart maps the magnitude of the real data.
Connecting line: the two ends of a red connecting line are highly correlated monitored variables; when the Pearson coefficient is greater than 0.5, the system automatically connects the corresponding variables.
Step three, visual layout and realization
1. Component projection view visualization layout and implementation
When the number of components is too large, the points in the scatter plot occlude one another, so that an analyst cannot effectively perceive the data density in the current area. By mapping the distribution density of components in different areas to the line density and transparency of the contour map (shown in fig. 3), a user can clearly find dense or sparse areas in the data, which is important for finding outlier islands in the data. The implementation method is as follows:
(1) Two-dimensional scalar field computation
a) For a given set of data points (x_1, x_2, …, x_n; y_1, y_2, …, y_n), its minimum and maximum observations, i.e., (x_1, y_1) and (x_n, y_n), are determined, where n is the number of components.
b) The grid spacing h to be used for the two-dimensional grid division is estimated, yielding the grid demarcation points (a_0, a_1, …, a_Ve) and (b_0, b_1, …, b_He) of the data, where a_{ve+1} - a_{ve} = h, b_{he+1} - b_{he} = h, ve = 0, 1, ... The grid spacing h is calculated as follows:
Where IQR(x) is the difference between the upper and lower quartiles of the sample, and n is the number of components.
c) The frequency of scatter points in each grid cell is counted, i.e., (g_{0,0}, g_{0,1}, …, g_{Ve,He}).
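The gridding steps a) to c) can be sketched as follows. The formula for h is not reproduced in this text; because it is described in terms of IQR(x) and n, the sketch assumes the Freedman-Diaconis rule h = 2·IQR(x)·n^(-1/3), which is an assumption rather than the confirmed formula. The function names are illustrative.

```python
import statistics

def grid_spacing(values):
    """Assumed Freedman-Diaconis spacing: h = 2 * IQR(x) * n ** (-1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2 * (q3 - q1) * len(values) ** (-1 / 3)

def grid_counts(xs, ys, h):
    """Count scatter points per h-by-h cell; result is indexed as grid[row][col]."""
    if h <= 0:
        raise ValueError("grid spacing must be positive")
    x0, y0 = min(xs), min(ys)
    cols = int((max(xs) - x0) // h) + 1
    rows = int((max(ys) - y0) // h) + 1
    grid = [[0] * cols for _ in range(rows)]
    for x, y in zip(xs, ys):
        grid[int((y - y0) // h)][int((x - x0) // h)] += 1
    return grid

h = grid_spacing([1, 2, 3, 4, 5, 6, 7, 8])
# Four hypothetical projected components, counted on a grid with spacing 1.0.
counts = grid_counts([0.0, 0.5, 1.5, 3.9], [0.0, 0.2, 1.1, 2.5], 1.0)
```

The resulting count grid is the two-dimensional scalar field from which the contour lines are extracted.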
(2) Contour line calculation
a) An initial threshold is determined, the data value in each grid cell is compared with the threshold, and cells above the threshold are marked as 1 and all other cells as 0, thereby obtaining a binary image.
b) A 2×2 pixel block in the binary image is taken as one contour cell. A binary index is constructed from the four corner values of each cell by scanning around the cell clockwise, starting with the upper-left corner as the most significant bit and ending with the least significant bit, and combining the bits with bitwise OR into a four-bit index.
c) A pre-built look-up table is accessed using the cell index; the table contains 16 entries listing the edges required to represent the cell.
d) Linear interpolation is applied between the raw data values to find the exact position of the contour along the cell edges. These four steps yield the contour line for one threshold; repeating them with num% as the density interval yields num contour lines.
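Steps a) to d) describe the marching squares technique, which can be sketched for a single cell as follows. The edge labels, the unit-cell coordinate convention and the way the two ambiguous saddle cases (indices 5 and 10) are resolved are illustrative choices, not details taken from this document.

```python
def cell_index(tl, tr, br, bl):
    """Pack four corner bits clockwise from the top-left corner (most significant bit)."""
    return (tl << 3) | (tr << 2) | (br << 1) | bl

# 16-entry look-up table: which cell edges (Top, Right, Bottom, Left) each segment joins.
EDGE_TABLE = {
    0: [], 1: [("L", "B")], 2: [("B", "R")], 3: [("L", "R")],
    4: [("T", "R")], 5: [("T", "R"), ("L", "B")], 6: [("T", "B")], 7: [("T", "L")],
    8: [("T", "L")], 9: [("T", "B")], 10: [("T", "L"), ("B", "R")], 11: [("T", "R")],
    12: [("L", "R")], 13: [("B", "R")], 14: [("L", "B")], 15: [],
}

def crossing(v0, v1, threshold):
    """Linear interpolation: fraction of the way from v0 to v1 where the threshold is met."""
    return (threshold - v0) / (v1 - v0)

def cell_segments(tl, tr, br, bl, threshold):
    """Contour segments inside a unit cell (x to the right, y downward, top-left at (0, 0))."""
    idx = cell_index(*(int(v > threshold) for v in (tl, tr, br, bl)))
    # Evaluated lazily, so only edges whose corners straddle the threshold are interpolated.
    points = {
        "T": lambda: (crossing(tl, tr, threshold), 0.0),
        "R": lambda: (1.0, crossing(tr, br, threshold)),
        "B": lambda: (crossing(bl, br, threshold), 1.0),
        "L": lambda: (0.0, crossing(tl, bl, threshold)),
    }
    return [(points[a](), points[b]()) for a, b in EDGE_TABLE[idx]]

# Only the top-left corner exceeds the threshold, so the contour cuts off that corner.
segments = cell_segments(4, 0, 0, 0, threshold=1)
```

Running this over every cell of the count grid and joining the segments gives the contour line for one threshold.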
(3) Scale mapping and color mapping
According to the screen width and height allocated by the system, the scatter plot and the contour lines are mapped onto the screen with a linear scale, completing the basic layout of the view.
Contour fill color: the contour color transitions smoothly from white to dark green. The system expresses the area density as a value from 0 to 1, and the final color is then obtained by the following formula:
Wherein hexColor is the hexadecimal RGB color value, data_i is the density of the current area, data_min is the minimum density, and data_max is the maximum density.
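One possible reading of this color mapping is sketched below. The formula itself is not reproduced in this text, so the min-max normalisation, the linear interpolation per RGB channel and the particular dark green (0, 100, 0) are assumptions, as is the function name `density_to_hex`.

```python
def density_to_hex(data_i, data_min, data_max, dark=(0, 100, 0)):
    """Map a density to a hex color between white (lowest) and an assumed dark green (highest)."""
    if data_max == data_min:
        t = 0.0
    else:
        t = (data_i - data_min) / (data_max - data_min)   # density normalised to 0..1
    t = min(1.0, max(0.0, t))
    rgb = [round(255 + (channel - 255) * t) for channel in dark]
    return "#{:02x}{:02x}{:02x}".format(*rgb)

lightest = density_to_hex(0, 0, 10)
darkest = density_to_hex(10, 0, 10)
```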
After the user selects a set of components through interaction, the view preserves the relative positions between the components and expands them into special glyphs, as shown in the right half of FIG. 2. The time periods and monitored variables in which a component's degradation is abnormal can be found by inspecting the glyphs. Meanwhile, the degradation pattern of abnormal components can be found by judging the degree of similarity between glyphs. The special glyph is implemented as follows:
(1) Spatial layout implementation
After the abnormal components to be analyzed are selected, a canvas must be regenerated for the circled area to display the glyphs of the component set. Because the shape of the user's lasso selection cannot be known in advance, serious visual occlusion occurs when the data are dense in a local area and sparse elsewhere, as shown in fig. 2. A collision detection algorithm based on force-directed layout is therefore implemented to prevent the glyphs from intersecting one another, as follows:
a) The maximum and minimum x and y coordinates of all data selected by the lasso are obtained, the difference between the maximum and the minimum is mapped to the width and height of the canvas, the real coordinates of the data in the canvas are recalculated with a linear scale, and the data points are reprojected onto a new canvas.
b) The data points on the canvas are treated as point particles with the same mass and radius, and the particles are made to move by applying different force models.
c) A collision force is added to each particle, and the position and velocity of the particle after time Δt are calculated using Verlet integration.
d) The iteration is repeated until the velocity of each particle is 0; the resulting particle positions are the final positions of the data points, at which the particles are separated from one another and the layout is stable. At this point the relative positions between the scatter points are well preserved, and no overlap occurs between the glyphs.
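Steps b) to d) can be illustrated with a simplified sketch. It uses position-based Verlet integration (velocity is implied by the difference between the current and previous positions) and one collision force that pushes overlapping circles apart. The function name `spread_glyphs`, the damping factor and the stopping tolerance are assumptions, and the actual system may combine further forces.

```python
import math

def spread_glyphs(points, radius, iterations=300, damping=0.6, eps=1e-6):
    """Push circles of equal radius apart until none overlap and motion has died out."""
    pos = [list(p) for p in points]
    prev = [list(p) for p in points]       # previous positions; implicit velocity = pos - prev
    for _ in range(iterations):
        speed = 0.0
        # Verlet step: advance each particle by its damped implicit velocity.
        for p, q in zip(pos, prev):
            for d in (0, 1):
                v = (p[d] - q[d]) * damping
                q[d] = p[d]
                p[d] += v
                speed = max(speed, abs(v))
        # Collision force: move every overlapping pair apart along the line joining them.
        collided = False
        for i in range(len(pos)):
            for j in range(i + 1, len(pos)):
                dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy)
                if dist >= 2 * radius:
                    continue
                ux, uy = (dx / dist, dy / dist) if dist > 0 else (1.0, 0.0)
                push = (2 * radius - dist) / 2
                pos[i][0] -= ux * push
                pos[i][1] -= uy * push
                pos[j][0] += ux * push
                pos[j][1] += uy * push
                collided = True
        if not collided and speed < eps:   # velocities are effectively 0: layout is stable
            break
    return [tuple(p) for p in pos]

# Two overlapping glyphs and one isolated glyph, each with radius 1.
layout = spread_glyphs([(0, 0), (1, 0), (10, 10)], radius=1)
```

Because each pair is pushed along the line joining its centers, the left-to-right and top-to-bottom order of neighbouring glyphs tends to be kept, which is what preserves the relative positions.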
(2) Glyph design implementation
For a single component, the data include the operation statistics of the component under each working condition during its degradation, defined as follows:
$D = \{ I_o \mid o \in [1, O] \}$
wherein O is the number of working condition types, and I_o is the number of operations under working condition o.
As shown in fig. 4 (b), the operation statistics under each working condition are mapped to the distance cRadius from the center of the radar chart to the point on the corresponding axis, and the calculation process is as follows:
Wherein: the two bounds of the mapping are the maximum and the minimum number of operations under working condition o; sum_o is the number of operations of the current component under working condition o; radarRadius is the radius of the radar chart; and O is the total number of working-condition categories.
The X coordinate xpos_o and the Y coordinate ypos_o of the point in the chart are then calculated from the distance between the point and the circle center, specifically as follows:
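A minimal sketch of this mapping is given below. It assumes min-max scaling for cRadius and evenly spaced radar axes converted from polar to Cartesian coordinates. Both choices, as well as the function names `c_radius` and `axis_point`, are assumptions made for illustration and are not formulas taken from this description.

```python
import math

def c_radius(sum_o, min_o, max_o, radar_radius):
    """Assumed min-max scaling of an operation count to a distance along its axis."""
    if max_o == min_o:
        return 0.0  # degenerate range: place the point at the center (assumption)
    return (sum_o - min_o) / (max_o - min_o) * radar_radius

def axis_point(c_rad, o, n_conditions):
    """Assumed placement: axis o (1-based) lies at angle 2*pi*(o - 1)/O from the x axis."""
    angle = 2 * math.pi * (o - 1) / n_conditions
    return c_rad * math.cos(angle), c_rad * math.sin(angle)

# Example: a component ran 5 times under condition 2 of 4, where the counts range from 0 to 10.
r = c_radius(5, 0, 10, 100)
xpos, ypos = axis_point(r, 2, 4)
```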
The size of the center circle encodes the life-cycle length of the component. The size of each point on a radar axis encodes the number of operations under that working condition, which is consistent with the point's distance from the center, so the size is a secondary (redundant) encoding. The calculation is as follows:
Wherein rul_min is the shortest life cycle of the component, rul_max is the longest life cycle of the component, R is the maximum radius of the central circle, R is the maximum radius of the coordinate axis circle, rulRadius is the radius of the central circle, and dotArea_o is the distance from the point represented by working condition o to the center of the circle.
The subsequence anomalies of each monitored variable are mapped to the length of an arc, and the angle occupied by the arc is calculated as follows:
wherein RULLen is the length of the component's degradation life cycle, and the remaining term represents the length of the s-th monitored variable of component k within the time interval TS.
The distance from the s-th ring to the center of the circle is as follows:
rRadius_s = rDis + (rGap + rBandWidth) · (s - 1), s = 1, 2, 3, ..., M
where M is the number of selected features, rDis is the radius of the innermost ring, rGap is the spacing between adjacent rings, and rBandWidth is the width of each ring.
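The ring radius formula translates directly into code. The sketch below only restates that formula; the function name `ring_radii` and the sample values are illustrative assumptions.

```python
def ring_radii(r_dis, r_gap, r_band_width, m):
    """Return rRadius_s = rDis + (rGap + rBandWidth) * (s - 1) for s = 1..M."""
    return [r_dis + (r_gap + r_band_width) * (s - 1) for s in range(1, m + 1)]

# Three monitored variables, innermost ring at 40 px, 6 px bands separated by 2 px gaps.
radii = ring_radii(40, 2, 6, 3)
```

Each additional monitored variable therefore moves outward by one band width plus one gap, so the rings never touch as long as rGap is positive.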
2. Data inspection view visualization layout and implementation:
Connection-line drawing: on the left side of the area charts, highly correlated variables are marked with red connection lines, as shown in fig. 5. When the Pearson coefficient is greater than 0.5, the system automatically draws a cubic Bezier curve connecting the corresponding variables. Drawing a cubic Bezier curve requires four points: a start point spoint, an end point epoint and two control points cpoint1 and cpoint2; the drawn result is shown in FIG. 4. The two control-point coordinates are determined as follows:
Wherein spoint_x is the x-axis coordinate of the start point, epoint_x is the x-axis coordinate of the end point, epoint_y is the y-axis coordinate of the end point, and spoint_y is the y-axis coordinate of the start point.
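The correlation test and the curve evaluation can be sketched as follows. The Pearson coefficient and the cubic Bezier polynomial are standard; the control-point rule in `control_points` (both control points shifted left by a fixed `bulge`) is purely a hypothetical placeholder, since the rule used by the method is not reproduced here. All function names are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series has no defined correlation; treat as uncorrelated
    return cov / (sx * sy)

def should_link(xs, ys, threshold=0.5):
    """Draw a connection line only when the coefficient exceeds the threshold."""
    return pearson(xs, ys) > threshold

def control_points(spoint, epoint, bulge=30.0):
    """Hypothetical rule: place both control points `bulge` units left of the endpoints."""
    (sx, sy), (ex, ey) = spoint, epoint
    x = min(sx, ex) - bulge
    return (x, sy), (x, ey)

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate the cubic Bezier curve defined by four points at parameter t in [0, 1]."""
    u = 1 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return x, y
```

With this placeholder rule the curve leaves the start point horizontally, bulges to the left of the stacked area charts, and arrives at the end point horizontally.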
Step four, interactive design
The component projection view supports highlighting, zooming, lasso, time-slicing and reconfiguration operations:
Highlighting: in the scatter view, to help the user quickly learn the basic information of a component, the system provides a highlight interaction. When the analyst hovers the mouse over a data point, the point is highlighted with color, and a tooltip displays information such as the component's number, life-cycle length and working-condition operation statistics.
Zooming: a large number of data points may cause points to overlap and produce visual clutter. The user can zoom in to inspect a local area in detail, which also makes the subsequent selection of data points easier.
Lasso: users often need to examine a portion of the data after finding an outlier or a point of interest. The method supports drawing a closed irregular shape on the scatter plot; when the user completes the lasso, every data point lying inside the shape is selected, and the glyph corresponding to each selected data point is displayed in a pop-up view.
Time slicing: analysts can control the extent of the time slices with the sliders above the view. This provides greater flexibility, since the user may select overlapping time slices of unequal length to explore the correlation between the monitored variables of a component during degradation.
Reconfiguration: the visualization method allows the user to switch between the scatter plot and the contour plot via tabs. When the component life cycle is too long, the transparency of the dots may be too low, so that dots hide one another. In addition, uneven data distribution or an excessive amount of data can cause severe occlusion in parts of the view; in that case, switching to the contour plot through reconfiguration relieves the visual clutter.
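The lasso selection described above amounts to a point-in-polygon test on the closed shape drawn by the user. The sketch below uses the standard ray-casting test; the function names and the representation of the lasso as a list of (x, y) vertices are assumptions made for illustration, not details given in this description.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count how many polygon edges a rightward ray from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # the last vertex connects back to the first
        if (y1 > y) != (y2 > y):       # edge straddles the horizontal line through the point
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def lasso_select(points, polygon):
    """Return the indices of the data points that fall inside the lasso polygon."""
    return [i for i, p in enumerate(points) if point_in_polygon(p, polygon)]

# A square lasso and three data points; only the first lies inside.
lasso = [(0, 0), (10, 0), (10, 10), (0, 10)]
selected = lasso_select([(5, 5), (15, 5), (-1, -1)], lasso)
```

The selected indices can then be used to look up the corresponding components and render their glyphs in the pop-up view.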