CN110727560B - Cloud service alarm methods and devices - Google Patents

Cloud service alarm methods and devices

Info

Publication number
CN110727560B
Authority
CN
China
Prior art keywords
cloud service
abnormal data
preset
service type
time period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910966336.6A
Other languages
Chinese (zh)
Other versions
CN110727560A (en)
Inventor
刘曾超前
董灵芝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910966336.6A
Publication of CN110727560A
Application granted
Publication of CN110727560B

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

An embodiment of this application discloses a cloud service alarm method. First, abnormal data generated during the operation of a cloud service is acquired. Then, based on the data interface that generated the abnormal data, the cloud service type to which the abnormal data belongs is identified. Finally, in response to determining that the number of abnormal data generated by the cloud service type exceeds a preset number threshold within a first preset time period, alarm information is sent. Embodiments of the disclosure can be applied to the cloud computing field. By collecting online abnormal data and performing frequency analysis on it, the method achieves proactive and rapid perception of online faults in cloud services, so that cloud service providers can repair them quickly and stop losses in time. It also provides a unified fault perception capability for services on the cloud, which can be offered to the cloud service systems of various cloud service providers.

Description

Cloud service alarm method and device
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to a cloud service alarm method and device.
Background
Cloud service providers need to provide stable online services to meet customers' business needs. However, in some cases, such as a network outage in a machine room or an internal abnormality in a cloud service, a large amount of online abnormal data may be generated and the service may become unavailable. At present, providers generally learn of such situations in one of two ways: by waiting for users to report faults through a work order system or by telephone, or by establishing a separate monitoring system for each service on the cloud and monitoring the online faults of each service in real time.
When abnormal conditions are learned through user feedback, cloud service providers cannot perceive service abnormalities in time to follow up quickly and stop losses, and user experience is seriously affected. When they are learned through monitoring systems built separately for each service, each team must carry out its own research and development work and establish its own mechanisms, so a large amount of work is repeated across teams and labor and material costs are wasted.
Disclosure of Invention
The embodiment of the application provides a cloud service alarm method and a cloud service alarm system.
In a first aspect, an embodiment of the present application provides a cloud service alarm method, where the method includes acquiring abnormal data generated in a cloud service operation process, identifying a cloud service type to which the abnormal data belongs based on a data interface generating the abnormal data, and sending alarm information in response to determining that the number of the abnormal data generated by the cloud service type exceeds a preset number threshold in a first preset time period.
In some embodiments, the sending the alarm information in response to determining that the number of the abnormal data generated by the cloud service type exceeds the preset number threshold in the first preset time period includes:
storing a timestamp of generation of the abnormal data in an abnormal data list corresponding to the cloud service type to which the abnormal data belongs, and, in response to determining that the number of timestamps stored in the abnormal data list exceeds the preset number threshold in the first preset time period, sending the alarm information through a preset alarm channel, wherein the alarm channel represents the communication mode by which the alarm information is sent to a receiver.
In some embodiments, before the identifying the cloud service type to which the abnormal data belongs based on the data interface generating the abnormal data, the method further includes deleting the obtained abnormal data in response to identifying that the abnormal data belongs to the preset abnormal type.
In some embodiments, the first preset time period takes a certain historical time as its starting time and the current time as its ending time, and the method further comprises deleting abnormal data outside the first preset time period in response to determining that the number of the abnormal data generated by the cloud service type does not exceed the preset number threshold in the first preset time period.
In some embodiments, after sending the alarm information in response to determining that the number of abnormal data generated by the cloud service type exceeds the preset number threshold in the first preset time period, the method further includes deleting all abnormal data generated by the acquired cloud service type.
In some embodiments, the method further comprises determining, based on the abnormal data, whether the abnormal data is acquired for the first time within a second preset time period, where first acquisition indicates that the same abnormal data has not been acquired before, and sending an abnormality prompt signal in response to determining that the abnormal data is acquired for the first time.
In some embodiments, the method further comprises updating the preset number threshold based on the amount of access to the cloud service type in response to reaching a preset update time.
In a second aspect, an embodiment of the present application provides a cloud service alarm device, where the device includes an acquisition unit configured to acquire abnormal data generated during a cloud service operation process, an identification unit configured to identify a cloud service type to which the abnormal data belongs based on a data interface generating the abnormal data, and an alarm unit configured to send alarm information in response to determining that the number of the abnormal data generated by the cloud service type exceeds a preset number threshold in a first preset time period.
In some embodiments, the alarm unit is further configured to store a timestamp for generating the abnormal data in an abnormal data list corresponding to a cloud service type to which the abnormal data belongs, and send alarm information through a preset alarm channel in response to determining that the number value of the timestamps stored in the abnormal data list exceeds a preset number threshold in a first preset time period, wherein the alarm channel is used for representing a communication mode for sending the alarm information to a receiver.
In some embodiments, the device further comprises a filtering unit configured to delete the acquired abnormal data in response to identifying that the abnormal data belongs to a preset abnormal type before identifying the cloud service type to which the abnormal data belongs based on the data interface generating the abnormal data.
In some embodiments, the first preset time period takes a certain historical moment as its starting moment and the current moment as its ending moment, and the device further comprises a deleting unit configured to delete the abnormal data outside the first preset time period in response to determining that the number of the abnormal data generated by the cloud service type does not exceed the preset number threshold in the first preset time period.
In some embodiments, the deleting unit is further configured to delete all abnormal data generated by the acquired cloud service type after sending the alarm information in response to determining that the number of abnormal data generated by the cloud service type exceeds the preset number threshold within the first preset time period.
In some embodiments, the alarm unit is further configured to determine, based on the abnormal data, whether the abnormal data is acquired for the first time within a second preset time period, where first acquisition indicates that the same abnormal data has not been acquired before, and to send an abnormality prompt signal in response to determining that the abnormal data is acquired for the first time.
In some embodiments, the apparatus further comprises an updating unit configured to update the preset number threshold based on the access amount of the cloud service type in response to reaching a preset update time.
In a third aspect, embodiments of the present application provide a computer readable medium having a computer program stored thereon, wherein the program when executed by a processor implements a method as described in any of the implementations of the first aspect.
In a fourth aspect, an embodiment of the present application provides an electronic device, including one or more processors, and a storage device having one or more programs stored thereon, which when executed by the one or more processors, cause the one or more processors to implement a method as described in any of the implementations of the first aspect.
The cloud service alarm method and system provided by the embodiments of the application first acquire abnormal data generated during cloud service operation, then identify the cloud service type to which the abnormal data belongs based on the data interface that generated it, and then send alarm information in response to determining that the number of abnormal data generated by the cloud service type exceeds a preset number threshold within a first preset time period. By collecting online abnormal data and performing frequency analysis on it, the technical scheme of the disclosure achieves proactive and rapid perception of online faults in cloud services, so that cloud service providers can repair them and stop losses in time. It also provides a unified fault perception capability for services on the cloud, which can be offered to the cloud service systems of all cloud service providers.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a cloud service alerting method in accordance with the present application;
FIG. 3 is a schematic diagram of an application scenario of the cloud service alerting method according to the present embodiment;
FIG. 4 is a flow chart of yet another embodiment of a cloud service alerting method in accordance with the present application;
FIG. 5 is a block diagram of one embodiment of a cloud service alerting device in accordance with the present application;
FIG. 6 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
In one exemplary configuration of the application, the terminal device and the server each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
Fig. 1 illustrates an exemplary architecture 100 of a cloud service alarm system to which the present application may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The terminal devices 101, 102, 103 may be hardware devices or software supporting network connections to provide various network services. When the terminal device 101, 102, 103 is hardware, it may be various electronic devices supporting network functions such as information interaction, network connection functions, etc., including but not limited to smart phones, tablet computers, electronic book readers, laptop and desktop computers, etc. When the terminal devices 101, 102, 103 are software, they can be installed in the above-listed electronic devices. It may be implemented as a plurality of software or software modules, for example, for providing distributed services, or as a single software or software module. The present invention is not particularly limited herein.
The server 105 may be a server providing various cloud services, such as a server providing cloud storage and cloud computing services to the terminal devices 101, 102, 103. The server can store or process the received various data and feed back the processing result to the terminal equipment.
It should be noted that, the cloud service alarm method provided by the embodiment of the present disclosure may be executed by the server 105. Accordingly, the cloud service alerting device may be provided in the server 105. The present invention is not particularly limited herein.
It should be noted that, the server may be hardware, or may be software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules, for example, for providing distributed services, or as a single software or software module. The present invention is not particularly limited herein.
It should be understood that the number of terminal devices and servers in fig. 1 is merely illustrative. There may be any number of terminal devices and servers, as desired for implementation.
With continued reference to fig. 2, a flow 200 of one embodiment of a cloud service alerting method in accordance with the present application is shown, comprising the steps of:
Step 201, acquiring abnormal data generated in the cloud service operation process.
In this embodiment, a cloud service is a model for adding, using, and interacting with internet-based services, generally involving the provision of dynamically scalable and often virtualized resources over the internet. The cloud service may be any of the various types of cloud services that cloud service providers offer to users based on requirements such as storage and computation, including but not limited to public clouds and private clouds.
A public cloud is the most basic service: multiple users share the system resources of a cloud service provider and can enjoy professional internet technical services without setting up any equipment or hiring equipment management personnel, which is a good way to reduce costs for entrepreneurs and small and medium-sized enterprises. Public clouds can be further subdivided into three categories: SaaS (Software-as-a-Service), PaaS (Platform-as-a-Service), and IaaS (Infrastructure-as-a-Service).
The private cloud is a private cloud network erected by a large enterprise for considering privacy of industries (such as finance and insurance industries) and privacy of users, and the enterprise needs to design a data center, a network and storage equipment by itself so as to have enough resources to ensure that the private cloud operates normally.
In this embodiment, the abnormal data is data generated by a fault during the operation of the cloud service. For example, when a machine room providing the cloud service loses its network connection, users of the cloud service may generate a large amount of abnormal data such as network connection failures, data request failures, and data storage failures. The abnormal data includes, but is not limited to, the URL (Uniform Resource Locator) of the abnormal data, the line number and column number of the abnormal data, the data interface that generated the abnormal data, and the stack information of the abnormal data.
In this embodiment, gateways, browsers, and apps (application programs) interact directly with cloud service users, so most of the abnormal data generated during cloud service operation can be collected through them. The execution body of the embodiment (such as the server in fig. 1) may use exception reporting to acquire the abnormal data through the gateway, browser, and app in the terminal device used by the user, where the gateway includes a console gateway and an API (Application Programming Interface) gateway. When abnormal data is generated during cloud service operation, the gateway, browser, or app receives the abnormal data, requests the exception handling service asynchronously, and reports the abnormality to the execution body, which thereby acquires the abnormal data.
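The reporting path described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `AbnormalData` field names, the `ExceptionReporter` class, and the example URL and interface path are all assumptions introduced here, and an in-process queue stands in for the asynchronous request to the exception handling service.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class AbnormalData:
    """One reported abnormality; field names are hypothetical."""
    url: str
    line: int
    column: int
    data_interface: str
    stack: str


class ExceptionReporter:
    """Queues reports so the reporting side (gateway, browser, app) does not block."""

    def __init__(self, handler):
        self._queue = queue.Queue()
        self._handler = handler  # stands in for the exception handling service
        threading.Thread(target=self._drain, daemon=True).start()

    def report(self, record):
        self._queue.put(record)  # returns immediately

    def _drain(self):
        while True:
            record = self._queue.get()
            try:
                self._handler(record)
            finally:
                self._queue.task_done()

    def flush(self):
        self._queue.join()  # block until every queued report has been handled


received = []  # what the execution body has acquired so far
reporter = ExceptionReporter(received.append)
reporter.report(AbnormalData(
    url="https://console.example.com/vm",
    line=12,
    column=7,
    data_interface="/api/vm/create",
    stack="TimeoutError: upstream timed out",
))
reporter.flush()
```

In a real deployment the handler would presumably be a network call to the execution body rather than a list append.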
Step 202, identifying the cloud service type to which the abnormal data belongs based on the data interface generating the abnormal data.
In this embodiment, cloud service types may be classified according to the cloud service products offered by a cloud service provider, for example product types such as virtual machines and network EIP (Enterprise Information Portal).
In this embodiment, the abnormal data includes a data interface for generating the abnormal data, and the cloud service type to which the abnormal data belongs can be known through the data interface for generating the abnormal data.
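One simple way to derive the cloud service type from the data interface is a prefix lookup. The sketch below assumes that interfaces are URL-like paths and that each product owns a path prefix; the table contents and the function name are hypothetical, since the source does not say how the lookup is performed.

```python
from typing import Optional

# Hypothetical prefix table; a real system would load its own mapping.
INTERFACE_PREFIX_TO_SERVICE = {
    "/api/vm/": "virtual_machine",
    "/api/eip/": "network_eip",
}


def identify_service_type(data_interface: str) -> Optional[str]:
    """Return the cloud service type owning the interface, or None if unknown."""
    for prefix, service_type in INTERFACE_PREFIX_TO_SERVICE.items():
        if data_interface.startswith(prefix):
            return service_type
    return None
```

Here `identify_service_type("/api/eip/bind")` returns `"network_eip"`, and an interface that matches no prefix returns `None` so the caller can decide how to treat it.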
Step 203, in response to determining that the number of abnormal data generated by the cloud service type exceeds a preset number threshold in a first preset time period, alarm information is sent.
In this embodiment, the first preset time period takes a certain historical time as its starting time and the current time as its ending time, and its length is set according to the cloud service type, which is not limited herein. In some alternative embodiments, the first preset time period may be set as a sliding time window. Specifically, a preset duration is taken as the length of the first preset time period and the current time is taken as the cut-off time, which together determine the starting time of the sliding time window. For example, suppose the length of the sliding time window for cloud service type A is set to 100 s and the minimum time unit the window can resolve is one second. If the current time is 11:08:50 on September 17, 2019, the cut-off time of the sliding time window is 11:08:50 on September 17, 2019 and its starting time is 11:07:10 on September 17, 2019. When the current time becomes 11:18:50 on September 17, 2019, the cut-off time becomes 11:18:50 and the starting time becomes 11:17:10 on the same day. In this way, the execution body can perform frequency analysis on the acquired abnormal data in real time.
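The window arithmetic in this example can be checked with a few lines of Python. The helper name `window_bounds` is an assumption; the 100 s length and the two cut-off times are taken from the example above.

```python
from datetime import datetime, timedelta


def window_bounds(cutoff: datetime, length_seconds: int):
    """Return (start, end) of a sliding window that ends at the cut-off time."""
    return cutoff - timedelta(seconds=length_seconds), cutoff


start, end = window_bounds(datetime(2019, 9, 17, 11, 8, 50), 100)
later_start, later_end = window_bounds(datetime(2019, 9, 17, 11, 18, 50), 100)

print(start.time(), end.time())              # 11:07:10 11:08:50
print(later_start.time(), later_end.time())  # 11:17:10 11:18:50
```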
In this embodiment, the preset number threshold may be set according to the cloud service type and the online access amount of that cloud service type, which is not limited herein. For example, if the average online access amount of cloud service type A is greater than that of cloud service type B, the number threshold for cloud service type A may accordingly be greater than that for cloud service type B. The preset number threshold is the reference threshold for sending alarm information, so setting the threshold according to the online access amount of each cloud service yields more accurate alarms with a higher reference value.
In some optional implementations of the present embodiment, the method may further include updating the preset number threshold based on a change in the access amount of the cloud service type in response to reaching a preset update time.
In the cloud service operation process, the service expansion and the service volume increase of the user may cause the variation of the access volume on the cloud service line. At this time, by updating the preset number threshold for the cloud service type, the updated threshold can be more matched with the current online access amount.
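The source does not give a formula for the update, so the following is only one plausible sketch: the threshold is recomputed as a fixed fraction of the recent access count, with a floor so that low-traffic services do not alarm on a handful of errors. The function name, the per-thousand rate, and the floor are all assumptions.

```python
def updated_threshold(access_count: int, abnormal_per_thousand: int = 10, minimum: int = 10) -> int:
    """Scale the preset number threshold with the access amount (hypothetical rule)."""
    scaled = access_count * abnormal_per_thousand // 1000  # integer arithmetic avoids rounding surprises
    return max(minimum, scaled)


# Called when the preset update time is reached, once per cloud service type.
thresholds = {
    "virtual_machine": updated_threshold(50_000),  # 500
    "network_eip": updated_threshold(200),         # 10, the floor applies
}
```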
In this embodiment, the alarm information may be sent in a preset alarm format, and the alarm information may include, for example, but not limited to, information indicating at least one of an alarm information receiving person, an abnormality occurrence period, the number of times of occurrence of abnormality, and abnormality data information occurring closest to a time of sending the alarm message.
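A preset alarm format carrying the fields listed above might look like the sketch below. The `AlarmInfo` class, its field names, and the rendered layout are illustrative assumptions, not a format defined by the source.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AlarmInfo:
    receiver: str          # alarm information receiver
    period_start: datetime # start of the abnormality occurrence period
    period_end: datetime   # end of the abnormality occurrence period
    occurrences: int       # number of times the abnormality occurred
    latest_abnormal: str   # abnormal data closest to the sending time

    def render(self) -> str:
        return (
            f"[ALARM] to={self.receiver} "
            f"period={self.period_start.isoformat()}..{self.period_end.isoformat()} "
            f"count={self.occurrences} latest={self.latest_abnormal}"
        )


alarm = AlarmInfo(
    receiver="oncall@example.com",
    period_start=datetime(2019, 9, 17, 11, 7, 10),
    period_end=datetime(2019, 9, 17, 11, 8, 50),
    occurrences=120,
    latest_abnormal="/api/eip/bind: connection refused",
)
message = alarm.render()
```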
The execution body of the embodiment counts the abnormal data generated by each cloud service type and performs frequency analysis on it over the first preset time period. When it determines that the number of abnormal data generated by a cloud service type exceeds the preset number threshold within the first preset time period, the execution body sends alarm information to the receiver, who may be a maintainer of the cloud service provider.
In some optional implementations of the present embodiment, the frequency analysis may be performed by presetting a corresponding abnormal data list for each cloud service type. The execution body stores the timestamps at which abnormal data was generated, in chronological order, in the abnormal data list corresponding to the cloud service type to which the abnormal data belongs, so the number of timestamps in a list equals the number of abnormal data generated by the corresponding cloud service type. In response to determining that the number of timestamps stored in the abnormal data list exceeds the preset number threshold within the first preset time period, alarm information is sent through a preset alarm channel, where the alarm channel represents the communication mode by which the alarm information is sent to the receiver, such as a real-time communication application, email, short message, or telephone.
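The timestamp-list approach can be sketched with one double-ended queue per cloud service type. The class and parameter names are assumptions, timestamps are plain seconds for brevity, and the sketch relies on timestamps arriving in chronological order, as the text describes. The `channel` callable stands in for whichever alarm channel is configured.

```python
from collections import defaultdict, deque


class FrequencyMonitor:
    def __init__(self, window_seconds, thresholds, channel):
        self.window_seconds = window_seconds
        self.thresholds = thresholds          # service type -> preset number threshold
        self.channel = channel                # callable standing in for the alarm channel
        self.timestamps = defaultdict(deque)  # service type -> abnormal data list

    def record(self, service_type, timestamp):
        """Store one timestamp; return True if an alarm was sent."""
        stamps = self.timestamps[service_type]
        stamps.append(timestamp)
        # Drop timestamps that fall before the start of the window ending now.
        while stamps and stamps[0] < timestamp - self.window_seconds:
            stamps.popleft()
        if len(stamps) > self.thresholds[service_type]:
            self.channel(f"{service_type}: {len(stamps)} abnormal records in {self.window_seconds}s")
            return True
        return False


sent = []
monitor = FrequencyMonitor(window_seconds=100, thresholds={"network_eip": 3}, channel=sent.append)
results = [monitor.record("network_eip", t) for t in (0, 10, 20, 30)]
# results == [False, False, False, True]: the fourth record exceeds the threshold of 3
```

A deque keeps both the append and the eviction of expired timestamps at constant cost, which matters when abnormal data arrives in bursts.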
In this embodiment, the execution body may further determine, based on the abnormal data, whether the abnormal data is acquired for the first time within a second preset time period, where first acquisition means that the same abnormal data has not been acquired before, and send an abnormality prompt signal to the receiver in response to determining that the abnormal data is acquired for the first time. The receiver may be a maintainer of the cloud service provider that provides the cloud service. After receiving the abnormality prompt signal, the maintainer can take corresponding repair measures, and the signal does not need to be sent again during the repair time; the second preset time period is therefore set so that the abnormality prompt signal is sent only once within it. In some alternative embodiments, an abnormality database may be established: abnormal data acquired for the first time is stored in the database, and whether newly acquired abnormal data is a first acquisition is determined by comparing it with the abnormal data already in the database.
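A minimal sketch of the first-acquisition check follows, with an in-memory dictionary standing in for the abnormality database. How two abnormal records are judged to be "the same" is not specified in the source; the sketch assumes a caller-supplied fingerprint (for example the data interface plus the stack information), and all names are hypothetical.

```python
class FirstOccurrenceNotifier:
    def __init__(self, second_period_seconds, send_prompt):
        self.second_period_seconds = second_period_seconds
        self.send_prompt = send_prompt  # callable standing in for the prompt signal
        self.last_prompt = {}           # fingerprint -> time of the last prompt

    def observe(self, fingerprint, now):
        """Send a prompt only if this abnormality was not seen within the period."""
        last = self.last_prompt.get(fingerprint)
        if last is not None and now - last < self.second_period_seconds:
            return False  # already prompted; maintainers are presumably repairing it
        self.last_prompt[fingerprint] = now
        self.send_prompt(fingerprint)
        return True


prompts = []
notifier = FirstOccurrenceNotifier(second_period_seconds=600, send_prompt=prompts.append)
fingerprint = ("/api/eip/bind", "ConnectionRefusedError")
outcomes = [notifier.observe(fingerprint, t) for t in (0, 100, 700)]
# outcomes == [True, False, True]: the repeat at t=100 is suppressed
```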
In the embodiment, the execution main body acquires the abnormal data generated in the cloud service operation process in real time, and carries out frequency analysis on the abnormal data in real time based on the cloud service type generating the abnormal data, so that the active and rapid perception of the on-line fault of the cloud service is realized, the cloud service providers can repair and stop the damage in time, the unified fault perception capability of the on-line service is realized, and the on-line fault perception capability can be provided for the cloud service systems of all cloud service providers.
Fig. 3 schematically shows an application scenario of the cloud service alerting method according to the present embodiment. Cloud service provider 301 provides cloud services for a number of users, including user 302 and user 303. The cloud service type provided to user 302 is a virtual machine service, and the cloud service types provided to user 303 are a virtual machine service and a network EIP service. While serving user 302 and user 303, the server of cloud service provider 301 collects, in real time, the abnormal data generated during cloud service operation through the browsers, gateways, and application programs used by the two users. By analyzing the data interface that generated the abnormal data, the server identifies the cloud service type to which the abnormal data belongs as the network EIP service, then performs frequency analysis on the abnormal data over the first preset time period. On determining that the number of abnormal data generated by the network EIP service exceeds the preset number threshold within the first preset time period, the server sends alarm information to maintainers 304 of the cloud service provider.
With continued reference to fig. 4, there is shown a schematic flow 400 of another embodiment of a cloud service alerting method according to the present application, comprising the steps of:
Step 401, obtaining abnormal data generated in the cloud service operation process.
In this embodiment, step 401 is performed in a similar manner to step 201, and will not be described here again.
Step 402, deleting the acquired abnormal data in response to identifying that the abnormal data belongs to a preset abnormal type.
In this embodiment, the preset abnormal type denotes abnormal data that does not need to be counted in the cloud service alarm process. Preset abnormal types include, but are not limited to, abnormal data generated because of a user's parameter input error, because the user has not completed real-name authentication, or because the user has not activated the service permission.
When the abnormal data belongs to a preset abnormal type, the abnormal data is not generated due to the abnormality of the cloud service, and the abnormal data is filtered when alarm analysis is performed. After the preset anomaly type filtering is carried out on the anomaly data, all the anomaly data for alarming are the anomaly data generated by the anomaly of the cloud service, so that the frequency analysis result of the anomaly data is more accurate, and the alarm information has more reference value.
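The filtering of step 402 can be sketched as follows. This is a minimal illustration only: the record fields and the three type labels are hypothetical names chosen to mirror the examples in the text, not identifiers defined by the application.

```python
from dataclasses import dataclass

# Hypothetical labels for the preset abnormal types named in the text.
PRESET_ABNORMAL_TYPES = frozenset({
    "INVALID_PARAMETER",       # user parameter input error
    "REAL_NAME_NOT_VERIFIED",  # user has not completed real-name authentication
    "SERVICE_NOT_ENABLED",     # user has not enabled the service permission
})


@dataclass(frozen=True)
class AbnormalRecord:
    interface: str    # data interface that generated the abnormal data
    error_type: str   # assumed classification label of the abnormal data
    timestamp: float  # time at which the abnormal data was generated


def filter_preset_types(records):
    """Keep only records that are not of a preset abnormal type."""
    return [r for r in records if r.error_type not in PRESET_ABNORMAL_TYPES]
```

Only the records that survive this filter would be passed on to cloud service type identification and frequency analysis.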
Step 403, identifying the cloud service type to which the abnormal data belongs based on the data interface generating the abnormal data.
In this embodiment, step 403 is performed in a similar manner to step 202, and will not be described here again.
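One possible way to identify the cloud service type from the data interface is a prefix lookup, sketched below. The interface paths and type names are assumptions made for illustration; the application does not specify how the interface is mapped to a type.

```python
from typing import Optional

# Hypothetical mapping from data-interface path prefixes to cloud service types.
INTERFACE_PREFIX_TO_TYPE = {
    "/api/vm/": "virtual_machine",
    "/api/eip/": "network_eip",
}


def identify_service_type(interface: str) -> Optional[str]:
    """Return the cloud service type for an interface, or None if unknown."""
    for prefix, service_type in INTERFACE_PREFIX_TO_TYPE.items():
        if interface.startswith(prefix):
            return service_type
    return None
```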
Step 404, sending alarm information in response to determining that the number of abnormal data items generated by the cloud service type exceeds a preset number threshold within a first preset time period.
In this embodiment, step 404 is performed in a similar manner to step 203, and will not be described here again.
Step 405, deleting all acquired abnormal data generated by the cloud service type.
In this embodiment, once the alarm information has been sent, maintenance personnel of the cloud service provider handle the abnormality it reports, and the acquired abnormal data no longer has any use. All acquired abnormal data generated by the cloud service type can therefore be deleted to save storage space and improve running performance.
Similarly, when it is determined that the number of abnormal data items generated by the cloud service type does not exceed the preset number threshold within the first preset time period, and the first preset time period is implemented as a sliding time window, abnormal data outside the first preset time period may be deleted. The sliding time window always ends at the current time and slides forward as the current time changes, so abnormal data outside the window has already been through frequency analysis. Provided that the number of abnormal data items generated by the cloud service type did not exceed the preset number threshold, abnormal data outside the sliding time window has no further value.
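The deletion of abnormal data outside a sliding time window can be sketched as below. The function name and the use of a deque are illustrative assumptions, and the sketch further assumes that timestamps are stored in ascending order.

```python
from collections import deque


def prune_outside_window(timestamps: deque, now: float, window_seconds: float) -> int:
    """Delete timestamps older than the window that ends at `now`.

    Returns the number of deleted entries. Assumes `timestamps` is sorted
    in ascending order, so the oldest entries sit at the left end.
    """
    cutoff = now - window_seconds
    removed = 0
    while timestamps and timestamps[0] < cutoff:
        timestamps.popleft()
        removed += 1
    return removed
```

Because entries are removed only from the left end, each timestamp is deleted at most once and the pruning cost stays proportional to the number of expired entries.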
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the cloud service alarm method in this embodiment specifically illustrates filtering the abnormal data before cloud service type identification is performed, and deleting abnormal data that has no reuse value in order to save storage space and improve running performance. After the preset abnormal type filtering, all abnormal data used for alarming is caused by abnormalities of the cloud service, so the frequency analysis result is more accurate and the alarm information has more reference value.
With continued reference to fig. 5, as an implementation of the method shown in the foregoing figures, the present disclosure provides an embodiment of a cloud service alarm apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 5, the cloud service alarm device includes an acquisition unit 501, a filtering unit 502, an identification unit 503, an alarm unit 504, a deletion unit 505, and an update unit 506.
The acquisition unit 501 is configured to acquire abnormal data generated during the operation of the cloud service. The filtering unit 502 is configured to delete the acquired abnormal data in response to identifying that the abnormal data belongs to a preset abnormal type. The identifying unit 503 is configured to identify a cloud service type to which the abnormal data belongs, based on a data interface that generates the abnormal data. The alarm unit 504 is configured to send alarm information in response to determining that the number of abnormal data items generated by the cloud service type exceeds a preset number threshold within a first preset time period. The deleting unit 505 is configured to delete the abnormal data outside the first preset time period in response to determining that the number of abnormal data items generated by the cloud service type does not exceed the preset number threshold within the first preset time period, and to delete all acquired abnormal data generated by the cloud service type after the alarm information has been sent in response to determining that the number exceeds the preset number threshold within the first preset time period. The updating unit 506 is configured to update the preset number threshold based on the access amount of the cloud service type in response to reaching a preset update time.
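The threshold update performed by the updating unit can be illustrated with a simple rule in which the threshold grows with the access amount. The proportional rule, the per-thousand rate and the lower bound are assumptions for this sketch; the application only requires that the threshold be updated based on the access amount.

```python
def updated_threshold(access_count: int, per_thousand: int = 10, floor: int = 5) -> int:
    """Return a number threshold that increases with the access amount.

    `per_thousand` is an assumed tolerated number of abnormal data items per
    1000 accesses; `floor` is an assumed lower bound for low-traffic services.
    Integer arithmetic is used so the result is a rounded-up whole number.
    """
    scaled = (access_count * per_thousand + 999) // 1000  # ceiling division
    return max(floor, scaled)
```

With these assumed defaults, a service type with 10000 accesses in the period gets a threshold of 100, while one with 100 accesses falls back to the lower bound of 5.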
In this embodiment, the alarm unit 504 is further configured to store the timestamp at which the abnormal data was generated in an abnormal data list corresponding to the cloud service type to which the abnormal data belongs, and to send the alarm information through a preset alarm channel in response to determining that the number of timestamps stored in the abnormal data list within a first preset time period exceeds a preset number threshold, where the alarm channel characterizes the communication manner used to send the alarm information to the receiver.
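A minimal sketch of this timestamp-list behaviour follows. The class name, the callable used as the alarm channel, and the choice to treat the newest timestamp as the current time are all assumptions for illustration rather than details taken from the application.

```python
from collections import defaultdict, deque
from typing import Callable, Dict


class FrequencyAlarm:
    """Per-service-type timestamp lists with a sliding-window threshold check."""

    def __init__(self, window_seconds: float, thresholds: Dict[str, int],
                 channel: Callable[[str], None]):
        self.window_seconds = window_seconds
        self.thresholds = thresholds      # preset number threshold per service type
        self.channel = channel            # assumed alarm channel: any callable taking a message
        self.lists = defaultdict(deque)   # abnormal data list per service type

    def record(self, service_type: str, timestamp: float) -> bool:
        """Store a timestamp; return True if alarm information was sent."""
        ts = self.lists[service_type]
        ts.append(timestamp)
        # Treat the newest timestamp as "now" and drop entries outside the window.
        cutoff = timestamp - self.window_seconds
        while ts and ts[0] < cutoff:
            ts.popleft()
        if len(ts) > self.thresholds[service_type]:
            self.channel(f"{service_type}: {len(ts)} abnormal data items "
                         f"within {self.window_seconds}s")
            ts.clear()  # delete all acquired abnormal data for this type after alarming
            return True
        return False
```

In use, `channel` could wrap an e-mail, SMS or instant-messaging sender; here it is left as a plain callable so the sketch stays self-contained.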
In this embodiment, the alarm unit 504 is further configured to determine, based on the abnormal data, whether the abnormal data is acquired for the first time within a second preset time period, where first-time acquisition indicates that the same abnormal data has not been acquired before, and to send an abnormality prompt signal in response to determining that the abnormal data is acquired for the first time.
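The first-time acquisition check can be sketched as below. How "the same abnormal data" is recognized is not specified here, so the sketch assumes a caller-supplied key (for example, the interface combined with an error code); the class and method names are likewise hypothetical.

```python
class FirstSeenDetector:
    """Detects whether abnormal data is acquired for the first time in a period."""

    def __init__(self, period_seconds: float):
        self.period_seconds = period_seconds  # the second preset time period
        self.last_seen = {}                   # key -> time of most recent acquisition

    def is_first(self, key: str, now: float) -> bool:
        """Return True if `key` was not acquired within the preceding period."""
        previous = self.last_seen.get(key)
        self.last_seen[key] = now
        return previous is None or now - previous > self.period_seconds
```

A caller would send the abnormality prompt signal whenever `is_first` returns True.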
Referring now to FIG. 6, there is illustrated a schematic diagram of a computer system 600 suitable for use with devices (e.g., devices 101, 102, 103, 105 shown in FIG. 1) implementing embodiments of the present application. The apparatus shown in fig. 6 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a processor (e.g., a central processing unit, CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The processor 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Connected to the I/O interface 605 are an input section 606 including a keyboard, a mouse, and the like, an output section 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like, a storage section 608 including a hard disk, and the like, and a communication section 609 including a network interface card such as a LAN card, a modem, and the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. Removable media 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on drive 610 so that a computer program read therefrom is installed as needed into storage section 608.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The above-described functions defined in the method of the application are performed when the computer program is executed by the processor 601.
The computer readable medium of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of a computer-readable storage medium may include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor, for example as a processor comprising an acquisition unit, a filtering unit, an identification unit, an alarm unit, a deletion unit and an update unit. The names of these units do not constitute a limitation on the unit itself in some cases, and for example, the acquisition unit may also be described as a unit that "acquires abnormal data generated during the operation of the cloud service".
As a further aspect, the application also provides a computer readable medium which may be comprised in the device described in the above embodiments or may be present alone without being fitted into the device. The computer readable medium carries one or more programs which, when executed by the apparatus, cause the computer device to acquire abnormal data generated during operation of the cloud service, identify a cloud service type to which the abnormal data belongs based on a data interface generating the abnormal data, and send alarm information in response to determining that the number of the abnormal data generated by the cloud service type exceeds a preset number threshold within a first preset time period.
The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. Persons skilled in the art will appreciate that the scope of the application is not limited to technical solutions formed by the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of those technical features or their equivalents without departing from the inventive concept described above, for example technical solutions in which the above features are interchanged with (but not limited to) technical features disclosed in the present application that have similar functions.

Claims (14)

1. A cloud service alerting method, wherein the method comprises:
acquiring abnormal data generated in the cloud service operation process by adopting an abnormal reporting mode through a gateway, a browser and an application corresponding to the cloud service;
deleting the abnormal data in response to identifying that the abnormal data belongs to a preset abnormal type characterizing an abnormality that is not an abnormality of the cloud service itself;
based on a data interface for generating the abnormal data, identifying the cloud service type to which the abnormal data belongs;
and in response to determining that the quantity of the abnormal data generated by the cloud service type exceeds a preset quantity threshold value determined according to the cloud service type and the access quantity corresponding to the cloud service type in a first preset time period, sending alarm information, wherein the preset quantity threshold value is positively correlated with the access quantity corresponding to the cloud service type, and in response to a change event of the access quantity of the cloud service, updating the preset quantity threshold value corresponding to the cloud service type to which the cloud service belongs based on the change of the access quantity of the cloud service.
2. The method of claim 1, wherein the sending the alert information in response to determining that the number of abnormal data generated by the cloud service type exceeds a preset number threshold within a first preset time period comprises:
storing the timestamp for generating the abnormal data into an abnormal data list corresponding to the cloud service type to which the abnormal data belongs;
and in response to determining that the number of timestamps stored in the abnormal data list exceeds a preset number threshold in a first preset time period, sending alarm information through a preset alarm channel, wherein the alarm channel is used for representing a communication mode for sending the alarm information to a receiver.
3. The method of claim 1, wherein the first preset time period is a time period with a certain historical time as a starting time and a current time as an ending time;
the method further comprises the steps of:
and deleting the abnormal data outside the first preset time period in response to determining that the quantity of the abnormal data generated by the cloud service type does not exceed a preset quantity threshold value in the first preset time period.
4. The method of claim 3, wherein after the sending the alert information in response to determining that the number of abnormal data generated by the cloud service type exceeds a preset number threshold within a first preset time period, the method further comprises:
and deleting all abnormal data generated by the acquired cloud service type.
5. The method of claim 1, wherein the method further comprises:
determining whether the abnormal data is acquired for the first time in a second preset time period based on the abnormal data, wherein the first acquisition is used for representing that the same abnormal data is not acquired before the abnormal data is acquired;
and sending an abnormality prompt signal in response to determining that the abnormal data is acquired for the first time.
6. The method according to claim 1 or 2, wherein the method further comprises:
and updating the preset quantity threshold value based on the access quantity of the cloud service type in response to reaching a preset updating moment.
7. A cloud service alerting device, wherein the device comprises:
an acquisition unit configured to acquire abnormal data generated in the cloud service operation process by adopting an abnormal reporting mode through a gateway, a browser and an application corresponding to the cloud service;
a filtering unit configured to delete the abnormal data in response to identifying that the abnormal data belongs to a preset abnormal type characterizing an abnormality that is not an abnormality of the cloud service itself;
an identification unit configured to identify a cloud service type to which the abnormal data belongs, based on a data interface that generates the abnormal data;
and an alarm unit configured to send alarm information in response to determining that the quantity of the abnormal data generated by the cloud service type exceeds a preset quantity threshold value determined according to the cloud service type and the access quantity corresponding to the cloud service type in a first preset time period, wherein the preset quantity threshold value is positively correlated with the access quantity corresponding to the cloud service type, and to update the preset quantity threshold value corresponding to the cloud service type to which the cloud service belongs based on the change of the cloud service access quantity in response to a change event of the access quantity of the cloud service.
8. The apparatus of claim 7, wherein,
the alarm unit is further configured to store the timestamp for generating the abnormal data to an abnormal data list corresponding to the cloud service type to which the abnormal data belongs, and to send alarm information through a preset alarm channel in response to determining that the number of timestamps stored in the abnormal data list exceeds a preset number threshold value in a first preset time period, wherein the alarm channel is used for representing a communication mode for sending the alarm information to a receiver.
9. The apparatus of claim 7, wherein the first preset time period is a time period with a certain historical time as a starting time and a current time as an ending time;
the apparatus further comprises:
a deleting unit configured to delete the abnormal data outside the first preset time period in response to determining that the quantity of the abnormal data generated by the cloud service type does not exceed a preset quantity threshold value within the first preset time period.
10. The apparatus of claim 9, wherein,
the deleting unit is further configured to delete all acquired abnormal data generated by the cloud service type after the alarm information is sent in response to determining that the number of the abnormal data generated by the cloud service type exceeds a preset number threshold in a first preset time period.
11. The apparatus of claim 7, wherein,
the alarm unit is further configured to determine whether the abnormal data is acquired for the first time in a second preset time period based on the abnormal data, wherein the first acquisition is used for representing that the same abnormal data is not acquired before the abnormal data is acquired, and to send an abnormality prompt signal in response to determining that the abnormal data is acquired for the first time.
12. The apparatus according to claim 7 or 8, wherein the apparatus further comprises:
an updating unit configured to update the preset quantity threshold value based on the access quantity of the cloud service type in response to reaching a preset updating moment.
13. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-6.
14. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-6.
CN201910966336.6A 2019-10-12 2019-10-12 Cloud service alarm methods and devices Active CN110727560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910966336.6A CN110727560B (en) 2019-10-12 2019-10-12 Cloud service alarm methods and devices

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910966336.6A CN110727560B (en) 2019-10-12 2019-10-12 Cloud service alarm methods and devices

Publications (2)

Publication Number Publication Date
CN110727560A CN110727560A (en) 2020-01-24
CN110727560B CN110727560B (en) 2026-01-09

Family

ID=69221070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910966336.6A Active CN110727560B (en) 2019-10-12 2019-10-12 Cloud service alarm methods and devices

Country Status (1)

Country Link
CN (1) CN110727560B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111586129A (en) * 2020-04-28 2020-08-25 北京奇艺世纪科技有限公司 Alarm method and device for data synchronization, electronic equipment and storage medium
CN112491858B (en) * 2020-11-20 2023-05-30 北京百度网讯科技有限公司 Method, device, equipment and storage medium for detecting abnormal information
CN112579399B (en) * 2020-12-28 2024-04-09 上海蓝云网络科技有限公司 Cloud service testing method and device, electronic equipment and computer storage medium
CN115145781B (en) * 2021-03-29 2025-06-20 中国移动通信集团安徽有限公司 Dynamic threshold calculation method, device, equipment and medium
CN113551297A (en) * 2021-07-27 2021-10-26 工大科雅(天津)能源科技有限公司 Heat supply network abnormal water replenishing monitoring method and heat supply network monitoring terminal
CN113660510A (en) * 2021-08-19 2021-11-16 杭州时趣信息技术有限公司 Video processing cloud manufacturer configuration method, device and system
CN114024831B (en) * 2021-11-08 2024-01-26 中国工商银行股份有限公司 Abnormal event early warning method, device and system
CN114139936A (en) * 2021-11-29 2022-03-04 合肥安达创展科技股份有限公司 Cloud intelligent early warning and repair reporting system based on exhibition item interactive data
CN115550093A (en) * 2022-09-13 2022-12-30 海尔优家智能科技(北京)有限公司 Applied research methods, storage media and electronic devices
CN116882946B (en) * 2023-09-06 2024-01-19 南京南自华盾数字技术有限公司 An intelligent management system and method based on power generation enterprise data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109726081A (en) * 2019-01-02 2019-05-07 深圳壹账通智能科技有限公司 Method, apparatus, computer equipment and the storage medium of service exception processing

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9336061B2 (en) * 2012-01-14 2016-05-10 International Business Machines Corporation Integrated metering of service usage for hybrid clouds
US9667470B2 (en) * 2012-09-07 2017-05-30 Oracle International Corporation Failure handling in the execution flow of provisioning operations in a cloud environment
CN103617038B (en) * 2013-11-28 2018-10-02 北京京东尚科信息技术有限公司 A kind of service monitoring method and device of distribution application system
CN105320585B (en) * 2014-07-08 2019-04-02 北京启明星辰信息安全技术有限公司 A kind of method and device for realizing application failure diagnosis
US9378079B2 (en) * 2014-09-02 2016-06-28 Microsoft Technology Licensing, Llc Detection of anomalies in error signals of cloud based service
CN107104999B (en) * 2016-02-23 2021-05-25 北京京东尚科信息技术有限公司 Method and device for processing service interface invocation request
CN106789158A (en) * 2016-11-11 2017-05-31 工业和信息化部电信研究院 Damage identification method and system are insured in a kind of cloud service
CN106411659A (en) * 2016-11-29 2017-02-15 福建中金在线信息科技有限公司 Business data monitoring method and apparatus
CN106874135B (en) * 2017-02-20 2020-09-04 北京百度网讯科技有限公司 Method, Apparatus and Equipment for Detecting Computer Room Failures
CN110008050B (en) * 2019-04-11 2023-06-30 北京百度网讯科技有限公司 Method and device for processing information
CN110162440A (en) * 2019-04-12 2019-08-23 平安普惠企业管理有限公司 Method, electronic device and the computer readable storage medium of fault location

Also Published As

Publication number Publication date
CN110727560A (en) 2020-01-24

Similar Documents

Publication Publication Date Title
CN110727560B (en) Cloud service alarm methods and devices
US10891560B2 (en) Supervised learning system training using chatbot interaction
US9280437B2 (en) Dynamically scalable real-time system monitoring
WO2019051948A1 (en) Method, apparatus, server, and storage medium for processing monitoring data
CN110727563B (en) Cloud service alarm method and device for preset customers
US10127093B1 (en) Method and apparatus for monitoring a message transmission frequency in a robot operating system
US20180159724A1 (en) Automatic task tracking
CN110928934A (en) Data processing method and device for business analysis
CN107644075B (en) Method and device for collecting page information
US10599505B1 (en) Event handling system with escalation suppression
CN111639086B (en) A data reconciliation method, device, equipment and storage medium
CN109560949B (en) Data processing method, management server and business equipment
CN112306723B (en) A method and device for obtaining operation information of a small program
CN113778780A (en) Application stability determination method, apparatus, electronic device and storage medium
US11811894B2 (en) Reduction of data transmissions based on end-user context
CN111290873B (en) Fault processing method and device
CN117149570A (en) Data processing method, device, equipment and storage medium
CN116366677A (en) Data processing method and device
US8606868B2 (en) Community based measurement of capabilities and availability
CN115499431A (en) Public cloud multi-resource pool operation and maintenance monitoring system
CN109508356B (en) Data abnormality early warning method, device, computer equipment and storage medium
CN114253797A (en) Fault processing method and related device of micro-service system
CN113761433A (en) Service processing method and device
US10296967B1 (en) System, method, and computer program for aggregating fallouts in an ordering system
CN113377629B (en) Method and device for monitoring abnormal codes of users

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant