CN102388570B - Single board running method and system under active-standby mode - Google Patents
- Publication number
- CN102388570B CN201180002282.3A CN201180002282A
- Authority
- CN
- China
- Prior art keywords
- board
- single board
- resource piece
- standby
- resource block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2035—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant without idle spare hardware
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Hardware Redundancy (AREA)
Abstract
Description
Technical field
The invention relates to the field of communication technology, and in particular to a method and system for operating a single board in active/standby mode.
Background
In a telecommunications system, a redundant standby board is usually added for each key board to improve system reliability, so that when a board fails the standby board can promptly continue the current services and the impact on users is reduced.
In the prior art, board redundancy generally uses either an N+1 resource pool mode or a 1+1 active/standby mode.
In the N+1 resource pool mode, N boards work and one board acts as backup. When any of the N boards fails, the backup board takes over its work. However, because multiple boards back up to a single board with the same amount of memory, the backup board cannot hold a complete copy of all call data. Only important data, such as base station and cell information, can be backed up, so users generally drop calls when a board fails.
In the 1+1 active/standby mode, each board is paired with a dedicated standby board: the main board works, and the standby board serves only as its backup. When the main board fails, the standby board is promoted to main through a switchover, switches its IP address to the main board's IP address, and takes over the main board's work. This mode reliably keeps existing users from dropping calls, but as long as the main board does not fail the standby board stays idle, so two boards do the work of one. Hardware redundancy is high and performance is poor.
Summary of the invention
Embodiments of the present invention provide a method and system for operating a single board in active/standby mode that reduce hardware redundancy and improve hardware performance while maintaining high reliability.
To solve the above technical problem, the technical solutions of the embodiments of the present invention are as follows:
A method for operating a single board in active/standby mode, in which resources are divided into at least two independently operable resource blocks and the resource blocks are allocated to a main board and its standby board, the method comprising:
the main board runs a first resource block to communicate with an external board, and the standby board runs a second resource block to communicate with the external board, wherein the first resource block is different from the second resource block;
when the standby board detects that the main board has failed, the standby board runs the second resource block and the backup of the first resource block stored on the standby board, and the standby board communicates with the external board;
wherein the standby board running the second resource block to communicate with the external board comprises:
the standby board runs the second resource block and communicates with the external board through a proxy communication module on the main board;
and wherein, when the standby board detects that the main board has failed, the standby board running the second resource block and the backup of the first resource block stored on the standby board to communicate with the external board comprises:
when the standby board detects that the main board has failed, the standby board runs the second resource block and the backup of the first resource block stored on the standby board, and communicates with the external board using the main board's IP address.
A board, comprising:
a communication unit, configured to run a second resource block to communicate with an external board, wherein the main board of the board runs a first resource block to communicate with the external board, the first resource block is different from the second resource block, and the first and second resource blocks are independently operable resource blocks obtained by dividing resources;
a fault handling unit, configured to, when a failure of the main board is detected, run the second resource block and the backup of the first resource block stored on the board, with the board communicating with the external board;
wherein the communication unit is specifically configured to run the second resource block and communicate with the external board through a proxy communication module on the main board;
and the fault handling unit is specifically configured to, when a failure of the main board is detected, run the second resource block and the backup of the first resource block stored on the board, and communicate with the external board using the main board's IP address.
A system for operating a single board in active/standby mode, comprising a main board and its standby board, wherein the resources of the system are divided into at least two independently operable resource blocks and the resource blocks are allocated to the main board and the standby board,
the main board being configured to run a first resource block to communicate with an external board;
the standby board being configured to run a second resource block to communicate with the external board, wherein the second resource block is different from the first resource block, and, when a failure of the main board is detected, to run the second resource block and the backup of the first resource block stored on the standby board, with the standby board communicating with the external board;
wherein running the second resource block to communicate with the external board comprises: running the second resource block and communicating with the external board through a proxy communication module on the main board;
and wherein, when a failure of the main board is detected, running the second resource block and the backup of the first resource block stored on the standby board, with the standby board communicating with the external board, comprises:
when a failure of the main board is detected, running the second resource block and the backup of the first resource block stored on the standby board, and communicating with the external board using the main board's IP address.
In the embodiments of the present invention, the resources are divided and run on the main and standby boards respectively, so that both boards run resources and communicate with external boards at the same time. This avoids the prior-art active/standby situation in which the main board runs while the standby board sits idle. Because the CPUs of both boards take part in the work, hardware redundancy is greatly reduced, hardware resources are fully used, and hardware performance improves: the processing capacity is twice that of the prior-art method. Moreover, when one board fails, the other board takes over the resources of the failed board, which avoids interruption of service communication and preserves the high reliability of the system.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for operating a single board in active/standby mode according to an embodiment of the present invention;
FIG. 2 is a flowchart of another method for operating a single board in active/standby mode according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of communication between the main and standby boards and an external board in the embodiment shown in FIG. 2;
FIG. 4 is a flowchart of another method for operating a single board in active/standby mode according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of communication between the main and standby boards and an external board in the embodiment shown in FIG. 4;
FIG. 6 is a schematic structural diagram of a board according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another board according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of another board according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a system for operating a single board in active/standby mode according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
FIG. 1 is a flowchart of a method for operating a single board in active/standby mode according to an embodiment of the present invention.
The method may include:
Step 101: divide the resources into at least two independently operable resource blocks, and allocate the resource blocks to the main board and its standby board.
In the prior art, the resources on a board are not divided; they run either on the main board or on the standby board. In this embodiment, the resources are first divided, the criterion being that each resulting resource block can run independently: running one resource block requires no reference to any other.
The division may be performed by a third party other than the main and standby boards, or by either of the two boards.
After the division, the third party or one of the two boards allocates the resource blocks. Specifically, at least two different resource blocks may be allocated to the main board and the standby board respectively, so that each board holds different resource blocks, whose resource amounts may be equal or unequal. Alternatively, all resource blocks may be allocated to both boards: for example, all resource blocks are replicated once per board and then distributed, so that every board stores all resource blocks.
The division step can be performed in advance; it need not be repeated each time the boards run.
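The allocation variant in which every board stores all resource blocks but runs only its own share can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Board` class, the `allocate` function, and the block names `"M1"` and `"M2"` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Board:
    name: str
    stored: set = field(default_factory=set)   # resource blocks held in memory
    running: set = field(default_factory=set)  # resource blocks actively executed


def allocate(blocks, run_on_main, main, standby):
    """Store every block on both boards; each board runs a disjoint subset."""
    all_blocks = set(blocks)
    main_blocks = set(run_on_main)
    if not main_blocks <= all_blocks:
        raise ValueError("run_on_main must be a subset of blocks")
    main.stored = set(all_blocks)      # full copy, so a takeover needs no transfer
    standby.stored = set(all_blocks)
    main.running = main_blocks
    standby.running = all_blocks - main_blocks


main = Board("main")
standby = Board("standby")
allocate(["M1", "M2"], ["M1"], main, standby)
```

Because allocation is a one-off operation, a function like this would be called once in advance rather than on every start of the boards.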
Step 102: the main board runs the first resource block to communicate with the external board, and the standby board runs the second resource block to communicate with the external board.
After the resource blocks are allocated to the main and standby boards, the main board runs some of the resource blocks, denoted the first resource block, and the standby board runs the others, denoted the second resource block. The first resource block is different from the second resource block.
While running their respective resource blocks, the main and standby boards may each communicate with the external board directly, or they may communicate through a shared communication module, for example with the standby board communicating through a proxy communication module on the main board. The specific communication method is not limited here.
Step 103: when the standby board detects that the main board has failed, the standby board runs the second resource block and the backup of the first resource block stored on the standby board, and communicates with the external board.
The standby board can monitor the main board through a software handshake signal between the two boards. When it learns that the main board has failed, the standby board, while continuing to run the second resource block, takes over the services of the main board, that is, it runs the backup of the first resource block held on the standby board and carries on service communication with other external boards.
The backup of the first resource block may have been stored on the standby board in advance, when the resource blocks were allocated, or the main board may back it up to the standby board at or before the time of failure.
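The detection-and-takeover behaviour of step 103 can be sketched as below. The text only says that a software handshake signal is used, so the timeout rule here is an assumption made for illustration, and `StandbyBoard`, `on_handshake`, `check_main`, and the 3-second timeout are hypothetical.

```python
class StandbyBoard:
    def __init__(self, own_block, backup_block, timeout=3.0):
        self.running = {own_block}        # the second resource block
        self.backup_block = backup_block  # stored backup of the first resource block
        self.timeout = timeout            # assumed: silence longer than this means failure
        self.last_handshake = 0.0

    def on_handshake(self, now):
        """Record the time of the latest handshake received from the main board."""
        self.last_handshake = now

    def check_main(self, now):
        """Return True if this check triggered a takeover of the backup block."""
        main_failed = now - self.last_handshake > self.timeout
        if main_failed and self.backup_block not in self.running:
            self.running.add(self.backup_block)  # keep own block, add the backup
            return True
        return False


standby = StandbyBoard("M2", "M1'")
standby.on_handshake(10.0)
took_over_early = standby.check_main(12.0)  # within the timeout: no takeover
took_over_late = standby.check_main(14.0)   # handshake missing for 4 s: takeover
```

The standby board never stops running its own block during the takeover, which matches the description that it runs both blocks after the failure.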
In another embodiment, when the failed board is restored, operation can switch back to the mode of step 102: once the main board recovers, the main board again runs the first resource block and communicates with the external board, and the standby board runs the second resource block and communicates with the external board.
In the embodiments of the present invention, "main board" and "standby board" are relative terms and do not denote a particular board: the method steps above work equally well with the two roles swapped, and that variant also falls within the protection scope of the present invention. Likewise, "first" and "second" serve only to distinguish different resource blocks.
In this embodiment, the resources are divided and run on the main and standby boards respectively, so that both boards run resources and communicate with external boards at the same time. This avoids the prior-art active/standby situation in which the main board runs while the standby board sits idle. Because the CPUs of both boards take part in the work, hardware redundancy is greatly reduced, hardware resources are fully used, and hardware performance improves: the processing capacity is twice that of the prior-art method. Moreover, when one board fails, the other board takes over the resources of the failed board, which avoids interruption of service communication and preserves the high reliability of the system.
FIG. 2 is a flowchart of another method for operating a single board in active/standby mode according to an embodiment of the present invention.
The method may include:
Step 201: divide the resource M into two independently operable resource blocks M1 and M2, and allocate the resource blocks to the main board and its standby board.
This embodiment is described using the example of a resource M divided into two independently operable resource blocks, a first resource block M1 and a second resource block M2.
The specific division method is to extract the global variables of each resource block and store them in one unified structure, called an environment control block. Each board uses its corresponding independent resource block through a different environment control block pointer.
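The environment control block idea can be illustrated with a short sketch: state that would otherwise live in global variables is gathered into one structure per resource block, and the service logic reaches that state only through the reference it is handed. The field names and the `handle_call_setup` function are hypothetical; the source does not list what the environment control block contains.

```python
from dataclasses import dataclass, field


@dataclass
class EnvControlBlock:
    """Holds what would otherwise be the global variables of one resource block."""
    block_id: str
    call_count: int = 0
    calls: dict = field(default_factory=dict)


def handle_call_setup(ecb, call_id, user):
    """Service logic touches state only through the control block it receives."""
    ecb.calls[call_id] = user
    ecb.call_count += 1


# Two independent resource blocks: same code, different control blocks.
ecb_m1 = EnvControlBlock("M1")
ecb_m2 = EnvControlBlock("M2")
handle_call_setup(ecb_m1, 1, "alice")
handle_call_setup(ecb_m2, 1, "bob")
```

Since no state is shared between `ecb_m1` and `ecb_m2`, either block can run on either board, and one board can run both after a failure simply by holding both references.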
After the division, the first resource block M1 and the second resource block M2 are backed up. The first resource block M1 and the backup M2' of the second resource block are allocated to the main board, and the backup M1' of the first resource block and the second resource block M2 are allocated to the standby board. Although the two boards store the same resource blocks, each board runs a different one, as described in step 202 below.
Step 202: the main board runs the first resource block M1 to communicate with the external board; the standby board runs the second resource block M2 and communicates with the external board through the proxy communication module on the main board.
After the resource blocks are allocated to the main and standby boards, the main board runs the first resource block M1 and the standby board runs the second resource block M2. The two resource blocks are different.
In this embodiment, the main board communicates with the external board through its own communication module, using its own IP address IP1. The standby board communicates with the external board through the proxy communication module on the main board, also using the main board's IP address IP1, as shown in FIG. 3. This proxy mode is transparent to other boards, so its impact is small.
Because the proxy mode adds overhead on the main board, the resource amounts run on the two boards need not be equal. For example, the second resource block M2 run on the standby board may be larger than the first resource block M1 run on the main board, with M2:M1 being 6:4.
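The proxy arrangement and the uneven split can be sketched together. In this illustration every outbound message carries the main board's address, and the main board counts the extra forwarding work it does for the standby board. The class and function names, the message tuples, and the total of 1000 resource units are assumptions made for the example.

```python
class MainBoard:
    def __init__(self, ip):
        self.ip = ip
        self.outbox = []   # (source_ip, block, payload) as seen by external boards
        self.proxied = 0   # forwarding work done on behalf of the standby board

    def send(self, block, payload):
        self.outbox.append((self.ip, block, payload))

    def proxy_send(self, block, payload):
        self.proxied += 1
        self.send(block, payload)


class StandbyBoard:
    def __init__(self, proxy):
        self.proxy = proxy  # the proxy communication module on the main board

    def send(self, block, payload):
        self.proxy.proxy_send(block, payload)


def split_by_ratio(total_units, standby_share, main_share):
    """Split a resource amount between standby and main, e.g. 6:4."""
    standby_units = total_units * standby_share // (standby_share + main_share)
    return standby_units, total_units - standby_units


main = MainBoard("IP1")
standby = StandbyBoard(main)
main.send("M1", "setup")
standby.send("M2", "setup")
m2_units, m1_units = split_by_ratio(1000, 6, 4)
```

External boards see only `IP1` as the source, which is why the proxy mode needs no awareness on their side; the `proxied` counter stands for the overhead that motivates giving the main board the smaller share.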
Step 203: when the standby board detects that the main board has failed, the standby board runs the second resource block M2 and the backup M1' of the first resource block, and communicates with the external board using the main board's IP address IP1.
The standby board monitors the main board through the software handshake signal between the two boards. When it learns that the main board has failed, the standby board takes over the main board's services, that is, it uses the backup M1' of the first resource block stored on the standby board to communicate with the external board. In this case the standby board runs the backup M1' of the first resource block and the second resource block M2 at the same time, adopts the main board's IP address IP1 as its own, communicates with the external boards using IP1, and sends ARP (Address Resolution Protocol) messages to the other external boards to update their ARP entries.
Because the standby board runs the backup M1' of the first resource block and the second resource block M2 at the same time, the traffic may exceed the standby board's CPU capacity. Flow control can then be started; it may include offloading calls to other boards or refusing certain services.
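One way to picture the flow control mentioned above is a simple admission decision on the surviving board. The source names only the two options (offloading calls to other boards, refusing some services); the thresholds, the unit cost per call, and the `admit_call` function are assumptions for illustration.

```python
def admit_call(current_load, capacity, other_boards_free):
    """Decide what to do with one new call on the board that took over.

    current_load and capacity are in the same assumed unit (one call = 1).
    """
    if current_load + 1 <= capacity:
        return "accept"    # still within CPU capacity
    if other_boards_free > 0:
        return "offload"   # hand the call to another board
    return "reject"        # refuse the service


decision_normal = admit_call(5, 10, 0)
decision_offload = admit_call(10, 10, 2)
decision_reject = admit_call(10, 10, 0)
```

A real system would likely prioritise by service type rather than treat every call alike; this sketch only shows the order in which the two flow-control options could be tried.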
Step 204: when the standby board detects that the main board has recovered, the standby board runs the second resource block M2 and communicates with the external board through the proxy communication module on the main board; the main board resumes running the first resource block M1 to communicate with the external board.
When the standby board detects that the main board has recovered, the two boards can switch back to the state of step 202: the standby board runs the second resource block M2 and communicates with the external board through the proxy communication module on the main board, and the main board resumes running the first resource block M1 to communicate with the external board.
In another embodiment, if the board that fails in step 203 is the standby board, the main board directly runs the first resource block M1 and the backup M2' of the second resource block and communicates with the external board using the main board's IP address IP1; after the standby board recovers, operation switches back to the state of step 202.
In this embodiment, the resources are divided and run on the main and standby boards respectively, so that both boards run resources and communicate with external boards at the same time. This greatly reduces hardware redundancy, makes full use of hardware resources, and improves hardware performance. Moreover, when one board fails, the other board takes over the resources of the failed board, which avoids interruption of service communication and preserves the high reliability of the system.
FIG. 4 is a flowchart of another method for operating a single board in active/standby mode according to an embodiment of the present invention.
The method may include:
Step 401: divide the resource M into two independently operable resource blocks M1 and M2, and allocate the resource blocks to the main board and its standby board.
This embodiment again uses the example of a resource M divided into a first resource block M1 and a second resource block M2. The division and allocation are similar to step 201 above and are not repeated here.
Step 402: the main board runs the first resource block M1 and communicates with the external board using the main board's IP address IP1; the standby board runs the second resource block M2 and communicates with the external board using the standby board's IP address IP2.
After the resource blocks are allocated to the main and standby boards, the main board runs the first resource block M1 and the standby board runs the second resource block M2. The two resource blocks are different.
In this embodiment, the main board communicates with the external board through its own communication module using its own IP address IP1, and the standby board communicates with the external board through its own communication module using its own IP address IP2, as shown in FIG. 5.
In this autonomous communication mode, the resource amounts run on the main and standby boards can be equal, for example with M1:M2 being 1:1.
Step 403: when the standby board detects that the main board has failed, the standby board adds the main board's IP address IP1 and sends an ARP message to the external board.
The standby board monitors the main board through the software handshake signal between the two boards. When it learns that the main board has failed, the standby board must take over the main board's services. Because the two boards each communicate with external boards through their own IP address, the standby board must take over both the main board's services and the main board's IP address IP1.
The standby board first adds the main board's IP address IP1 locally and sends ARP messages to the other external boards to update their ARP entries, replacing the main board's MAC address with the standby board's MAC address in the entry for IP1, so that the standby board can take over the main board's services and communicate with the external boards.
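The address takeover of step 403 can be modelled without any real networking: each external board's ARP table is a plain dictionary from IP address to MAC address, and the effect of the ARP message is a rewrite of the entry for IP1. The `StandbyBoard` class, the `take_over` method, and the placeholder MAC strings are hypothetical; no actual ARP frames are sent.

```python
class StandbyBoard:
    def __init__(self, ip, mac):
        self.ips = [ip]  # addresses this board answers on
        self.mac = mac

    def take_over(self, failed_ip, external_arp_tables):
        """Add the failed board's IP locally, then update the remote ARP entries."""
        if failed_ip not in self.ips:
            self.ips.append(failed_ip)
        for table in external_arp_tables:
            # Simulated effect of the ARP message on each external board.
            table[failed_ip] = self.mac


external = {"IP1": "mac-main", "IP2": "mac-standby"}
standby = StandbyBoard("IP2", "mac-standby")
standby.take_over("IP1", [external])
```

After the call, traffic that external boards address to IP1 resolves to the standby board's MAC address, while the standby board's own entry for IP2 is untouched.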
Step 404: the standby board runs the second resource block M2 and communicates with the external board using the standby board's IP address IP2, and the standby board runs the backup M1' of the first resource block and communicates with the external board using the main board's IP address IP1.
After adding the main board's IP address IP1 locally, the standby board continues to run the second resource block M2 and communicates with the external board using its original IP address IP2. For the services taken over from the main board, the standby board runs the backup M1' of the first resource block stored on the standby board and communicates with the external board using the main board's IP address IP1.
Because the standby board runs the second resource block M2 and the backup M1' of the first resource block at the same time, the traffic may exceed the standby board's CPU capacity. Flow control can then be started; it may include offloading calls to other boards or refusing certain services.
Step 405: when the standby board detects that the main board has recovered, the standby board runs the second resource block M2 and communicates with the external board using the standby board's IP address IP2.
When the standby board detects that the main board has recovered, the standby board switches back to the state of step 402: it runs the second resource block M2 and communicates with the external board using its IP address IP2. The main board's services that the standby board had taken over are switched back to the main board and executed there.
The switchback of the main board may proceed as follows:
After recovering, if the main board detects that its IP address IP1 is in use (the address was added to the standby board in step 403), it first starts with a spare IP address IP3. When it resumes running the first resource block M1, it switches from IP3 to IP1 and sends an ARP message to the external boards so that their ARP entries return to the state before the failure. The main board then runs M1 and communicates with the external boards using IP1.
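The switchback sequence can be sketched as a small function that returns the addresses the recovering main board uses, in order. The source does not spell out when the standby board drops IP1; this sketch assumes it releases the address at the moment the main board reclaims it. The function name, the list and dictionary representations, and the placeholder MAC strings are hypothetical.

```python
def recover_main(main_ip, spare_ip, standby_ips, arp_table, main_mac):
    """Return the sequence of IP addresses the recovering main board uses."""
    history = []
    if main_ip in standby_ips:
        history.append(spare_ip)     # IP1 is in use, so start on the spare address
        standby_ips.remove(main_ip)  # assumed: the standby board releases IP1
    arp_table[main_ip] = main_mac    # simulated effect of the ARP message
    history.append(main_ip)          # main board now communicates on IP1 again
    return history


standby_ips = ["IP2", "IP1"]
arp_table = {"IP1": "mac-standby", "IP2": "mac-standby"}
history = recover_main("IP1", "IP3", standby_ips, arp_table, "mac-main")
```

Starting on the spare address avoids two boards answering for IP1 at the same time while the main board is still coming up.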
在另一实施例中,如果在步骤403中出现故障的是备单板,则直接由主单板执行类似步骤404~405的步骤,备单板恢复后,备单板按照上述主单板的切换过程切换回步骤402的状态。In another embodiment, if the fault occurs in step 403 is the standby board, the main board will directly perform steps similar to steps 404-405. The switching process switches back to the state of
本发明实施例通过将资源进行划分,并分别在主、备单板上运行,使得主备单板同时参与运行资源与外部单板进行通信,大大降低了硬件的冗余度,充分利用了硬件资源,提高了硬件性能,而且,在某一单板出现故障时,由另一单板接替运行故障单板上的资源,避免了业务通信的中断,保证了系统的高可靠性。In the embodiment of the present invention, resources are divided and run on the active and standby boards respectively, so that the active and standby boards simultaneously participate in the communication between running resources and external boards, greatly reducing the redundancy of hardware and making full use of hardware resources, which improves the hardware performance, and when a single board fails, another single board takes over the resources on the faulty single board, avoiding the interruption of business communication and ensuring the high reliability of the system.
以上是对本发明方法实施例的描述,下面对实现上述方法的装置进行介绍。The above is the description of the method embodiment of the present invention, and the device for realizing the above method will be introduced below.
FIG. 6 is a schematic structural diagram of a board according to an embodiment of the present invention.
The board may include:
A communication unit 601, configured to run a second resource block to communicate with an external board, where the board's main board runs a first resource block to communicate with the external board, the first resource block is different from the second resource block, and the first and second resource blocks are independently runnable resource blocks obtained by dividing the resources;
A failure processing unit 602, configured to, when a failure of the main board is detected, run the second resource block and the backup of the first resource block stored on the board, to communicate with the external board.
In this embodiment of the present invention, the resources are first divided. The criterion for the division is that each resulting resource block can run independently, that is, running one resource block does not require consulting any other resource block. After the division, the resource blocks are allocated; both the division and the allocation may be performed by the board or by another device. Once the board and its main board have been allocated resources (for example, the first resource block is allocated to the main board and the second resource block to the board), the communication unit 601 can run the second resource block and carry out service communication with the external board, either through the proxy communication module on the main board or through the board's own communication module. When the failure processing unit 602 detects that the main board has failed, that unit keeps running the original second resource block and also takes over the main board's services, that is, it runs the backup of the first resource block stored on the board and carries out service communication with the other external boards.
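As a rough illustration of the division and allocation just described, the sketch below splits a set of independent work items into two resource blocks and shows which blocks the board runs before and after a main board failure. The item names, the `divide` helper, and the `StandbyBoard` class are hypothetical; the patent does not prescribe how resources are divided, and the alternating split used here is only one possible choice.

```python
def divide(resources):
    """Split resources into two blocks that share no items (one possible division)."""
    items = sorted(resources)
    return items[0::2], items[1::2]


class StandbyBoard:
    """Hypothetical model of the board of FIG. 6."""

    def __init__(self, own_block, main_block_backup):
        self.own_block = own_block                  # second resource block
        self.main_block_backup = main_block_backup  # stored backup of the first block

    def communication_unit(self):
        """Normal operation: run only the second resource block."""
        return {"M2": self.own_block}

    def failure_processing_unit(self, main_failed):
        """On main board failure, run the second block plus the backup of the first."""
        running = self.communication_unit()
        if main_failed:
            running["M1"] = self.main_block_backup
        return running


first_block, second_block = divide({"user-a", "user-b", "user-c", "user-d"})
board = StandbyBoard(own_block=second_block, main_block_backup=list(first_block))
normal = board.failure_processing_unit(main_failed=False)
failover = board.failure_processing_unit(main_failed=True)
print(normal)
print(failover)
```

Because the two blocks share no items, the board can start the backup of the first block without touching the state of the block it is already running.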
In this embodiment, through the above units the board runs resources and communicates with the external board at the same time as the main board does, which greatly reduces hardware redundancy, makes full use of hardware resources, and improves hardware performance. Moreover, when one board fails, the other board takes over the resources of the failed board, which avoids interruption of service communication and ensures high system reliability.
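FIG. 7 is a schematic structural diagram of another board according to an embodiment of the present invention.
In addition to a communication unit 701 and a failure processing unit 702, the board may further include a first recovery unit 703.
The communication unit 701 is specifically configured to run the second resource block and communicate with the external board through the proxy communication module on the main board.
The failure processing unit 702 is specifically configured to, when a failure of the main board is detected, run the second resource block and the backup of the first resource block stored on the board, and use the main board's IP address to communicate with the external board.
The first recovery unit 703 is configured to, when recovery of the main board is detected, run the second resource block and communicate with the external board through the proxy communication module on the main board; the main board, for its part, resumes running the first resource block to communicate with the external board.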
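The behaviour of the three units of FIG. 7 can be summarised as a two-state machine. The sketch below is an illustration built on assumptions: the state names, the event strings, and the `ProxyModeBoard` class are invented for this example, and the label IP1 for the main board's address is borrowed from the method embodiment.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"      # main board is healthy
    TAKEOVER = "takeover"  # main board has failed


class ProxyModeBoard:
    """Hypothetical model of the board of FIG. 7 (communication via the main board's proxy)."""

    def __init__(self):
        self.state = State.NORMAL

    def on_event(self, event):
        if event == "main_failed":       # handled by the failure processing unit
            self.state = State.TAKEOVER
        elif event == "main_recovered":  # handled by the first recovery unit
            self.state = State.NORMAL
        return self.describe()

    def describe(self):
        if self.state is State.NORMAL:
            # Only the second block runs; traffic goes through the main board's proxy module.
            return {"blocks": ["M2"], "path": "proxy on main board"}
        # Both blocks run here; the board talks to the external board directly with IP1.
        return {"blocks": ["M2", "M1 backup"], "path": "direct, using IP1"}


board = ProxyModeBoard()
before = board.describe()
during = board.on_event("main_failed")
after = board.on_event("main_recovered")
print(before, during, after, sep="\n")
```

In this mode the external board only ever sees the main board's address, which is why the takeover state reuses IP1 rather than introducing a second address.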
In this embodiment, through the above units the board runs resources and communicates with the external board at the same time as the main board does, which greatly reduces hardware redundancy, makes full use of hardware resources, and improves hardware performance. Moreover, when one board fails, the other board takes over the resources of the failed board, which avoids interruption of service communication and ensures high system reliability.
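FIG. 8 is a schematic structural diagram of another board according to an embodiment of the present invention.
In addition to a communication unit 801 and a failure processing unit 802, the board may further include a second recovery unit 803.
The communication unit 801 is specifically configured to run the second resource block and communicate with the external board using the board's own IP address.
The failure processing unit 802 may further include:
A message sending subunit 8021, configured to, when a failure of the main board is detected, add the main board's IP address to the board and send an ARP message to the external board;
A failure processing subunit 8022, configured to run the second resource block and communicate with the external board using the board's own IP address, and to run the backup of the first resource block stored on the board and communicate with the external board using the main board's IP address.
The second recovery unit 803 is configured to, when recovery of the main board is detected, run the second resource block and communicate with the external board using the board's own IP address. The main board, on detecting that its IP address is in use, starts up with a spare IP address; when it resumes running the first resource block, it switches the spare IP address to the main board's IP address, sends an ARP message to the external board, and communicates with the external board.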
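A minimal sketch of the subunits of FIG. 8 follows. It simulates the external board's ARP table with a dictionary instead of sending real ARP frames, and the labels IP1 and IP2 (main and standby board addresses), the class name, and the method names are assumptions made for illustration.

```python
class IpModeBoard:
    """Hypothetical model of the board of FIG. 8, which owns IP2 and can take over IP1."""

    def __init__(self, arp_table):
        self.arp_table = arp_table      # stands in for the external board's ARP table
        self.ips = ["IP2"]
        self.source_ip = {"M2": "IP2"}  # which address each running block uses

    def message_sending_subunit(self):
        """Add the main board's address and announce it (simulated ARP message)."""
        if "IP1" not in self.ips:
            self.ips.append("IP1")
        self.arp_table["IP1"] = "standby"

    def failure_processing_subunit(self):
        """Run M2 on the board's own address and the M1 backup on the main board's address."""
        self.source_ip = {"M2": "IP2", "M1": "IP1"}

    def second_recovery_unit(self):
        """After the main board recovers, go back to running only M2 on IP2."""
        # Assumption: IP1 is released here so that the main board can reclaim it.
        if "IP1" in self.ips:
            self.ips.remove("IP1")
        self.source_ip = {"M2": "IP2"}


arp_table = {"IP1": "main", "IP2": "standby"}
board = IpModeBoard(arp_table)
board.message_sending_subunit()
board.failure_processing_subunit()
takeover_view = (list(board.ips), dict(board.source_ip), dict(arp_table))
board.second_recovery_unit()
print(takeover_view)
print(board.ips, board.source_ip)
```

In this sketch the ARP entry for IP1 still points at the standby board after `second_recovery_unit` runs; per the description, it is the recovered main board's own ARP message that restores that entry.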
In this embodiment, through the above units the board runs resources and communicates with the external board at the same time as the main board does, which greatly reduces hardware redundancy, makes full use of hardware resources, and improves hardware performance. Moreover, when one board fails, the other board takes over the resources of the failed board, which avoids interruption of service communication and ensures high system reliability.
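FIG. 9 is a schematic structural diagram of a single board running system in active-standby mode according to an embodiment of the present invention.
The system includes a main board 901 and its standby board 902. The resources of the system are divided into at least two independently runnable resource blocks, and the resource blocks are allocated to the main board 901 and the standby board 902, where:
The main board 901 is configured to run the first resource block to communicate with the external board;
The standby board 902 is configured to run the second resource block to communicate with the external board, where the first resource block is different from the second resource block; and, when a failure of the main board is detected, to run the second resource block and the backup of the first resource block stored on the standby board, to communicate with the external board.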
In this embodiment, the system greatly reduces hardware redundancy, makes full use of hardware resources, and improves hardware performance. Moreover, when one board fails, the other board takes over the resources of the failed board, which avoids interruption of service communication and ensures high system reliability.
For the specific implementation of the units in the boards and the system described above, refer to the corresponding descriptions in the foregoing method embodiments; details are not repeated here.
The embodiments of the present invention described above do not limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the claims of the present invention.
Claims (7)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2011/080413 WO2012149785A1 (en) | 2011-09-30 | 2011-09-30 | Single board running method and system in main-standby mode |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN102388570A (en) | 2012-03-21 |
| CN102388570B (en) | 2014-03-26 |
Family
ID=45826508
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201180002282.3A, Expired - Fee Related, CN102388570B (en) | 2011-09-30 | 2011-09-30 | Single board running method and system under active-standby mode |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN102388570B (en) |
| WO (1) | WO2012149785A1 (en) |
Families Citing this family (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102629906A (en) * | 2012-03-30 | 2012-08-08 | 浪潮电子信息产业股份有限公司 | A design method of using cluster management nodes as dual machines to improve the availability of cluster services |
| CN104731656B (en) * | 2013-12-23 | 2018-10-30 | 华为软件技术有限公司 | A kind of resource allocation methods and device |
| CN103744755A (en) * | 2014-01-08 | 2014-04-23 | 烽火通信科技股份有限公司 | Implement system for primary and standby veneer single port shared protection and method thereof |
| CN105323873A (en) * | 2014-07-03 | 2016-02-10 | 中兴通讯股份有限公司 | Base station and base station load sharing device and method |
| CN104778098A (en) * | 2015-04-09 | 2015-07-15 | 浪潮电子信息产业股份有限公司 | Memory mirroring method and system and memory monitor |
| CN104994071B (en) * | 2015-05-28 | 2018-11-09 | 新华三技术有限公司 | The backup method and device of broadband remote access server equipment |
| CN107147511A (en) * | 2016-03-01 | 2017-09-08 | 深圳市深信服电子科技有限公司 | Data center's control method and device |
| CN113206698B (en) * | 2021-03-22 | 2022-08-16 | 深圳震有科技股份有限公司 | Satellite media resource redundancy protection method, intelligent terminal and storage medium |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1929324A (en) * | 2006-10-17 | 2007-03-14 | 杭州华为三康技术有限公司 | Master-slave switching method and system for mutual backup device |
| US20080163248A1 (en) * | 2006-12-29 | 2008-07-03 | Futurewei Technologies, Inc. | System and method for completeness of tcp data in tcp ha |
| CN101394260A (en) * | 2007-09-17 | 2009-03-25 | 华为技术有限公司 | A method and device for realizing active/standby switching and load sharing |
| CN101404519A (en) * | 2008-11-14 | 2009-04-08 | 华为技术有限公司 | Service board system and service handling method |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN100553241C (en) * | 2007-09-04 | 2009-10-21 | 武汉市中光通信公司 | Active-standby switching system and method for session initiation protocol gateway |
- 2011-09-30: CN application CN201180002282.3A, patent CN102388570B (en), not active, Expired - Fee Related
- 2011-09-30: WO application PCT/CN2011/080413, publication WO2012149785A1 (en), not active, Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| WO2012149785A1 (en) | 2012-11-08 |
| CN102388570A (en) | 2012-03-21 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | GR01 | Patent grant | |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20140326 |
| | CF01 | Termination of patent right due to non-payment of annual fee | |