CN113486194A - Anti-duplication method and device for knowledge graph - Google Patents

Publication number
CN113486194A
Authority
CN
China
Prior art keywords
knowledge
target
map
knowledge graph
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110849615.1A
Other languages
Chinese (zh)
Other versions
CN113486194B (en)
Inventor
万明霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd
Priority to CN202110849615.1A
Publication of CN113486194A
Application granted
Publication of CN113486194B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90348 Query processing by searching ordered data, e.g. alpha-numerically ordered data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses an anti-duplication method and device for knowledge graphs. A pre-established database holds a plurality of stored knowledge items; the knowledge graph corresponding to each stored knowledge item is converted, according to a first preset rule and a second preset rule, into a numeric string that is stored in a database table of the database as the unique identifier of that knowledge graph. When new knowledge is to be stored, the knowledge graph corresponding to the knowledge to be stored is acquired, and its graph labels are adjusted according to the first preset rule to obtain a target knowledge graph, which prevents the same knowledge graph from yielding different target numeric strings because of factors such as a different graph label order. The target knowledge graph is then converted into a target numeric string according to the second preset rule, so that a numeric query can determine whether the database table contains a numeric string identical to the target numeric string. If it does, writing the target knowledge graph into the database is refused.

Description

Anti-duplication method and device for knowledge graph
Technical Field
The invention relates to the technical field of data processing, and in particular to an anti-duplication method and device for a knowledge graph.
Background
Intelligent customer service can provide uninterrupted, high-standard service 24 hours a day and is increasingly widely used in modern enterprises. Its key component, the intelligent question-answering robot, provides consultation and business-handling services to customers through various channels. In the related art, intelligent question answering mainly relies on three technologies: retrieval, knowledge graphs, and deep learning.
A knowledge graph is a multi-relational graph obtained by connecting different kinds of information, and it is stored in a database. Based on the constructed knowledge graph, a user question is understood and converted into a query statement on the knowledge graph; the process of executing the query statement against the database to obtain an answer and returning that answer to the user can be called knowledge-graph-based question answering.
In practical applications, knowledge-graph-based question answering often suffers from low answer accuracy.
Disclosure of Invention
To solve the above problem, the present application provides an anti-duplication method and device for a knowledge graph, which address the low answer accuracy that knowledge-graph-based question answering often exhibits in practical applications.
On this basis, the embodiments of the present application disclose the following technical solutions:
In one aspect, an embodiment of the present application provides an anti-duplication method for a knowledge graph, the method including:
acquiring a knowledge graph corresponding to knowledge to be stored;
adjusting the graph labels in the knowledge graph according to a first preset rule to obtain a target knowledge graph;
converting the target knowledge graph into a target numeric string according to a second preset rule;
judging, by means of a numeric query, whether a database table includes a numeric string identical to the target numeric string, wherein the database table includes a plurality of numeric strings obtained from the knowledge graphs corresponding to a plurality of stored knowledge items according to the first preset rule and the second preset rule;
and if the database table includes a numeric string identical to the target numeric string, refusing to write the target knowledge graph into the database.
Optionally, converting the target knowledge graph into a target numeric string according to a second preset rule includes:
calculating a target code corresponding to the target knowledge graph by hash coding, and taking the target code as the target numeric string.
Optionally, converting the target knowledge graph into a target numeric string according to a second preset rule includes:
acquiring the index value corresponding to each graph label in the knowledge graph;
and combining the index values in the order of the graph labels in the target knowledge graph to obtain the target numeric string.
Optionally, the method further includes:
if the database table does not include a numeric string identical to the target numeric string, judging by means of a character string query whether the target knowledge graph is already stored in the database;
if the target knowledge graph is already stored in the database, refusing to write the target knowledge graph into the database;
and if the target knowledge graph is not stored in the database, writing the target knowledge graph into the database.
Optionally, adjusting the graph labels in the knowledge graph according to a first preset rule to obtain the target knowledge graph includes:
adjusting the order of the graph labels in the knowledge graph according to the order of the first letters of the graph labels to obtain the target knowledge graph.
In another aspect, an embodiment of the present application provides an anti-duplication device for a knowledge graph, which includes an acquisition unit, a first conversion unit, a second conversion unit, a judgment unit, and an execution unit;
the acquisition unit is configured to acquire a knowledge graph corresponding to knowledge to be stored;
the first conversion unit is configured to adjust the graph labels in the knowledge graph according to a first preset rule to obtain a target knowledge graph;
the second conversion unit is configured to convert the target knowledge graph into a target numeric string according to a second preset rule;
the judgment unit is configured to judge, by means of a numeric query, whether a database table includes a numeric string identical to the target numeric string, wherein the database table includes a plurality of numeric strings obtained from the knowledge graphs corresponding to a plurality of stored knowledge items according to the first preset rule and the second preset rule;
and the execution unit is configured to refuse to write the target knowledge graph into the database if the database table includes a numeric string identical to the target numeric string.
Optionally, the second conversion unit is configured to:
calculate a target code corresponding to the target knowledge graph by hash coding, and take the target code as the target numeric string.
Optionally, the second conversion unit is configured to:
acquire the index value corresponding to each graph label in the knowledge graph;
and combine the index values in the order of the graph labels in the target knowledge graph to obtain the target numeric string.
Optionally, the execution unit is further configured to:
if the database table does not include a numeric string identical to the target numeric string, judge by means of a character string query whether the target knowledge graph is already stored in the database;
if the target knowledge graph is already stored in the database, refuse to write the target knowledge graph into the database;
and if the target knowledge graph is not stored in the database, write the target knowledge graph into the database.
Optionally, the first conversion unit is configured to:
adjust the order of the graph labels in the knowledge graph according to the order of the first letters of the graph labels to obtain the target knowledge graph.
Compared with the prior art, the technical solution of the present application has the following advantages:
The pre-established database holds a plurality of stored knowledge items, and the knowledge graph corresponding to each stored knowledge item is converted into a numeric string according to the first preset rule and the second preset rule; that numeric string is stored in a database table of the database as the unique identifier of the knowledge graph. When new knowledge is to be stored, the knowledge graph corresponding to the knowledge to be stored is acquired, and its graph labels are adjusted according to the first preset rule to obtain a target knowledge graph, which prevents the same knowledge graph from yielding different target numeric strings because of factors such as a different graph label order. The target knowledge graph is then converted into a target numeric string according to the second preset rule, so that a numeric query can determine whether the database table includes a numeric string identical to the target numeric string. If it does, the knowledge graph corresponding to the knowledge to be stored is already associated with other knowledge, and writing the target knowledge graph into the database is refused. This prevents different knowledge items in the database from sharing the same knowledge graph, avoids the situation in which one knowledge graph matches several query statements and a wrong answer is returned, and thereby improves answer accuracy. In addition, a database performs numeric queries faster than character string queries, which improves the efficiency of the duplicate check.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of an anti-duplication method for a knowledge graph provided in the present application;
FIG. 2 is a schematic diagram of an anti-duplication device for a knowledge graph provided in the present application.
Detailed Description
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The keys to knowledge-graph-based question answering are the construction of the knowledge graph and the combing (curation) of knowledge. The stored elements of a knowledge item in the database generally include the knowledge identifier, knowledge title, knowledge directory, knowledge state (valid or invalid), knowledge creator, updater, creation time, update time, and the like. When a knowledge graph is used, graph labels must also be associated, and one knowledge item may need to be expressed by a knowledge graph formed from several graph labels.
Taking a scenario in the financial industry as an example, graph labels can be divided into the dimensions product, attribute, operation, channel, and condition. For example, when the knowledge is "telephone bank modifies credit card transaction password", the corresponding knowledge graph includes four graph labels: credit card (product), transaction password (attribute), modification (operation), and telephone bank (channel). When the knowledge is "credit card overseas ATM withdrawal handling fee", the corresponding knowledge graph includes five graph labels: credit card (product), handling fee (attribute), withdrawal (operation), ATM (channel), and overseas (condition).
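The two examples above can be written down as data. The following is a minimal sketch, assuming a knowledge graph is represented simply as a list of labels that each carry a dimension; the class name `GraphLabel` and the variable names are hypothetical and do not come from the application.

```python
from dataclasses import dataclass

# The five dimensions named in the financial-industry example.
DIMENSIONS = ("product", "attribute", "operation", "channel", "condition")


@dataclass(frozen=True)
class GraphLabel:
    """One graph label together with the dimension it belongs to (hypothetical type)."""
    name: str
    dimension: str

    def __post_init__(self):
        # Reject dimensions outside the five used in the example.
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension}")


# "telephone bank modifies credit card transaction password": four labels
phone_password_graph = [
    GraphLabel("credit card", "product"),
    GraphLabel("transaction password", "attribute"),
    GraphLabel("modification", "operation"),
    GraphLabel("telephone bank", "channel"),
]

# "credit card overseas ATM withdrawal handling fee": five labels
overseas_fee_graph = [
    GraphLabel("credit card", "product"),
    GraphLabel("handling fee", "attribute"),
    GraphLabel("withdrawal", "operation"),
    GraphLabel("ATM", "channel"),
    GraphLabel("overseas", "condition"),
]
```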
According to the combing process described above, the knowledge graphs corresponding to different knowledge items should be different. In actual practice, however, because the people combing the knowledge differ and understand the knowledge to different degrees, different knowledge items may end up with identical knowledge graphs. In actual question answering, one knowledge graph may then match several query statements and therefore several answers; if the returned answer is not the one the user needs, question-answering accuracy is low.
Therefore, to prevent the knowledge graph of new knowledge from already being associated with other knowledge, new knowledge must pass a duplicate check, so that duplicates do not impair question-answering accuracy.
In the related art, a character string query is used to search the database for a knowledge graph identical to the one to be stored. Specifically, the corresponding query statement is looked up from the knowledge record in the database table, the graph labels corresponding to that query statement are then found in the database by a character string query, and the graph labels of the knowledge graph corresponding to the knowledge to be stored are compared with them. Because databases search character strings inefficiently, performance is poor, and the duplicate check becomes very slow when there is a large amount of knowledge.
On this basis, the embodiments of the present application provide an anti-duplication method and device for a knowledge graph. The pre-established database holds a plurality of stored knowledge items, and the knowledge graph corresponding to each stored knowledge item is converted into a numeric string according to a first preset rule and a second preset rule; that numeric string is stored in a database table of the database as the unique identifying attribute of the knowledge graph. When new knowledge is to be stored, the knowledge graph corresponding to the knowledge to be stored is acquired, and its graph labels are adjusted according to the first preset rule to obtain a target knowledge graph, which prevents the same knowledge graph from yielding different target numeric strings because of factors such as a different graph label order. The target knowledge graph is then converted into a target numeric string according to the second preset rule, so that a numeric query can determine whether the database table includes a numeric string identical to the target numeric string. If it does, the knowledge graph corresponding to the knowledge to be stored is already associated with other knowledge, and writing the target knowledge graph into the database is refused. This prevents different knowledge items in the database from sharing the same knowledge graph, avoids the situation in which one knowledge graph matches several query statements and a wrong answer is returned, and thereby improves answer accuracy. In addition, a database performs numeric queries faster than character string queries, which improves the efficiency of the duplicate check.
An anti-duplication method for a knowledge graph according to an embodiment of the present application is described below with reference to FIG. 1, which is a flowchart of the method. The method may include the following steps S101-S105.
S101: Acquire a knowledge graph corresponding to the knowledge to be stored.
In practical applications, if a user wants to store a knowledge item in the database, the knowledge graph corresponding to the knowledge to be stored may be input into a terminal device. Either the terminal device performs the subsequent S102-S105 itself, or it sends a storage request carrying the knowledge graph corresponding to the knowledge to be stored to a server, and the server performs S102-S105.
The terminal device may be a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like, but is not limited thereto; the server may be an independent physical server, or a server cluster or distributed system formed by a plurality of physical servers.
S102: Adjust the graph labels in the knowledge graph according to a first preset rule to obtain the target knowledge graph.
A knowledge graph may correspond to a plurality of graph labels. To prevent graph labels that have the same substantive content but differ in order or expression from producing different target numeric strings later, the graph labels in the knowledge graph are adjusted according to a first preset rule. Three methods are described below as examples.
Method one: adjust the order of the graph labels in the knowledge graph according to the order of the first letters of the graph labels to obtain the target knowledge graph.
For example, the graph labels in the knowledge graph are credit card, handling fee, withdrawal, ATM, and overseas, whose corresponding initials are X, S, Q, A, and J. The graph label order of the target knowledge graph obtained with method one is ATM, overseas, withdrawal, handling fee, credit card.
Method two: adjust the order of the graph labels in the knowledge graph according to the order of the first letters of the dimensions of the graph labels to obtain the target knowledge graph.
For example, the graph labels in the knowledge graph are credit card (product), transaction password (attribute), modification (operation), and telephone bank (channel), and the initials corresponding to the dimensions are C, S, C, and Q. The graph label order of the target knowledge graph obtained with method two is credit card, modification, telephone bank, transaction password. If initials repeat, the labels may be ordered by the initial of the second word, and so on; the present application does not specifically limit this.
Method three: adjust the order of the graph labels in the knowledge graph according to a preset order of dimensions to obtain the target knowledge graph.
For example, the graph labels in the knowledge graph are credit card (product), transaction password (attribute), modification (operation), and telephone bank (channel), and the preset dimension order is product, attribute, operation, channel. The graph label order of the target knowledge graph obtained with method three is credit card, transaction password, modification, telephone bank.
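The ordering step can be illustrated with a short sketch covering the first-letter ordering and the preset dimension ordering. It assumes labels are passed as `(name, dimension)` tuples and that the initials for the first-letter ordering are supplied as a lookup table using the values quoted in the example; the function names and the table are hypothetical.

```python
# Assumed dimension order, following the financial-industry example.
DIMENSION_ORDER = ["product", "attribute", "operation", "channel", "condition"]

# Initials quoted in the example for first-letter ordering (assumed lookup table).
LABEL_INITIALS = {
    "credit card": "X",
    "handling fee": "S",
    "withdrawal": "Q",
    "ATM": "A",
    "overseas": "J",
}


def order_by_initial(names, initials=LABEL_INITIALS):
    """Sort label names by their initial letter."""
    return sorted(names, key=lambda name: initials[name])


def order_by_dimension(labels, dimension_order=DIMENSION_ORDER):
    """Sort (name, dimension) pairs by the preset dimension order."""
    rank = {dim: i for i, dim in enumerate(dimension_order)}
    return sorted(labels, key=lambda label: rank[label[1]])


by_initial = order_by_initial(
    ["credit card", "handling fee", "withdrawal", "ATM", "overseas"]
)
# by_initial == ["ATM", "overseas", "withdrawal", "handling fee", "credit card"]

by_dimension = order_by_dimension([
    ("telephone bank", "channel"),
    ("modification", "operation"),
    ("credit card", "product"),
    ("transaction password", "attribute"),
])
# names in by_dimension: credit card, transaction password, modification, telephone bank
```

Whatever rule is chosen, the point is that two inputs containing the same labels in a different order come out in the same order, so the later conversion to a numeric string sees identical input.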
S103: Convert the target knowledge graph into a target numeric string according to a second preset rule.
Databases search character strings inefficiently, so performance is poor, especially when a large amount of knowledge is stored in the database. The embodiments of the present application therefore propose searching by numeric query instead of character string query. To that end, the representation of the target knowledge graph needs to be converted from a string type to an integer type (an int value).
The present application does not limit the specific content of the second preset rule; two methods are described below as examples.
Method one: calculate a target code corresponding to the target knowledge graph by hash coding, and take the target code as the target numeric string.
Hash coding is simple to compute, and hash values have a low collision rate, so it is suitable for uniqueness checks. The target numeric string obtained as a hashcode can serve as the unique identifier of the target knowledge graph and converts the representation of the target knowledge graph from a string type to an integer type.
Because hashing is simple and has a low collision rate, using the hash value as the uniqueness criterion for the graph labels effectively solves the problem of inefficient duplicate checking of graph labels during knowledge maintenance.
Hashing means converting an input of arbitrary length into a fixed-length output by a hash algorithm; the output is the hash value. Different keys may yield the same hash value after being transformed by the hash algorithm, which is called a collision. Conversely, if two hash values differ (under the same hash algorithm), the original inputs corresponding to them must differ.
The hashcode is used to determine the storage address of an object in a hash storage structure.
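As an illustration of the hash-based conversion, the sketch below joins the ordered label names and reduces them to a signed 32-bit integer with a polynomial hash (multiplier 31). The application does not name a specific hash algorithm, so the algorithm, the `"|"` separator, and the function names are all assumptions. Python's built-in `hash()` is deliberately not used for strings, because it is salted per process and would not give stable keys for a database.

```python
def poly31_hash(text: str) -> int:
    """Signed 32-bit polynomial hash with multiplier 31 (assumed algorithm)."""
    h = 0
    for ch in text:
        # Keep only the low 32 bits after each step.
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed integer.
    return h - 0x100000000 if h >= 0x80000000 else h


def graph_hash(ordered_label_names) -> int:
    """Hash an already ordered list of label names into one integer key."""
    # "|" is an assumed separator; label names are assumed not to contain it.
    return poly31_hash("|".join(ordered_label_names))


key = graph_hash(
    ["credit card", "transaction password", "modification", "telephone bank"]
)
```

The same ordered labels always produce the same integer, while a different order generally produces a different one, which is why the ordering step has to run first.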
Method two: acquire the index value corresponding to each graph label in the knowledge graph, and combine the index values in the order of the graph labels in the target knowledge graph to obtain the target numeric string.
The correspondence between graph labels and index values can be preset. The index values corresponding to the graph labels in the knowledge graph are acquired and, once the target knowledge graph has been obtained by adjustment, combined in the order of the graph labels in the target knowledge graph to obtain the target numeric string.
For example, the graph labels in the knowledge graph are credit card, transaction password, modification, and telephone bank, with index values 11, 21, 31, and 41 respectively. If the graph label order of the target knowledge graph obtained with method three of S102 is credit card, transaction password, modification, telephone bank, the corresponding target numeric string is 11213141.
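The index-value example can be reproduced in a few lines. This is a minimal sketch that assumes the label-to-index table is a plain dictionary holding the values from the example; the names `LABEL_INDEX` and `index_numeric_string` are hypothetical.

```python
# Preset label-to-index correspondence, using the values from the example.
LABEL_INDEX = {
    "credit card": 11,
    "transaction password": 21,
    "modification": 31,
    "telephone bank": 41,
}


def index_numeric_string(ordered_label_names, label_index=LABEL_INDEX) -> int:
    """Concatenate the index values in label order and return them as an integer."""
    digits = "".join(str(label_index[name]) for name in ordered_label_names)
    return int(digits)


target = index_numeric_string(
    ["credit card", "transaction password", "modification", "telephone bank"]
)
# target == 11213141
```

One design note, offered as an observation rather than something the application states: if index values had different digit counts, two different label sequences could concatenate to the same digits, so fixed-width index values such as the two-digit ones above keep the result unambiguous.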
S104: Judge, by means of a numeric query, whether the database table includes a numeric string identical to the target numeric string.
The pre-established database holds a plurality of stored knowledge items. The knowledge graph corresponding to each stored knowledge item is first adjusted according to the first preset rule and then converted into a numeric string according to the second preset rule; the numeric string is stored in a database table of the database as the unique identifier of the knowledge graph. It should be noted that the database table includes a plurality of numeric strings, all of which have passed the anti-duplication check provided by the present application, and the numeric strings correspond one-to-one to knowledge graphs.
In addition, a database performs numeric queries faster than character string queries, which improves the efficiency of the duplicate check.
As a possible implementation, if the numeric strings obtained with method two of S103 differ in length, the length of the target numeric string may be obtained first; the numeric strings of the same length are then retrieved from the database table, and the numeric query determines whether any of them is identical to the target numeric string, which speeds up the judgment.
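The length pre-filter can be sketched in memory as follows. It assumes the stored numeric strings are held as non-negative integers and grouped by digit count; a real database would express the same idea with an extra column or index, which is not shown here. The function names are hypothetical.

```python
from collections import defaultdict


def build_length_buckets(stored_numbers):
    """Group stored numeric strings by their number of digits."""
    buckets = defaultdict(set)
    for number in stored_numbers:
        buckets[len(str(number))].add(number)
    return buckets


def contains(buckets, target) -> bool:
    """Look only at stored values with the same digit count as the target."""
    candidates = buckets.get(len(str(target)), set())
    return target in candidates


buckets = build_length_buckets([11213141, 1121, 1121314151])
found = contains(buckets, 11213141)   # same length and same value
missing = contains(buckets, 11213142)  # same length, different value
```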
S105: If the database table includes a numeric string identical to the target numeric string, refuse to write the target knowledge graph into the database.
A unique identification field representing the knowledge graph is added to the design of the knowledge database table. When knowledge is added or modified, this field is used to judge whether the knowledge graph already appears in the database table, which solves the problem of duplicated graph labels and effectively improves the efficiency of knowledge maintenance.
As a possible implementation, when the target numeric string is obtained with method one of S103 and the database table includes a numeric string identical to the target numeric string, the knowledge graph corresponding to the knowledge to be stored is already associated with other knowledge, and writing the target knowledge graph into the database is refused.
If the hashcodes are the same, whether the specific label contents are the same should be checked further, so as to rule out a hash collision. Because the probability of a collision is extremely low, the method still greatly improves the efficiency of the duplicate check. In addition, if the database table does not include a numeric string identical to the target numeric string, a character string query is used to judge whether the target knowledge graph is already stored in the database.
If the target knowledge graph is already stored in the database, the knowledge graph corresponding to the knowledge to be stored is already associated with other knowledge, and writing the target knowledge graph into the database is refused; if the target knowledge graph is not stored in the database, it is written into the database.
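Steps S102 to S105, together with the optional string-query fallback, can be put together in one sketch. It uses an in-memory SQLite database purely for illustration and assumes a fixed dimension order as the first preset rule and a 32-bit polynomial hash as the second; the table name, column names, `"|"` separator, and function names are hypothetical and are not taken from the application.

```python
import sqlite3

DIMENSION_ORDER = ["product", "attribute", "operation", "channel", "condition"]


def canonical_labels(labels):
    """First preset rule (assumed): order (name, dimension) pairs by dimension."""
    rank = {dim: i for i, dim in enumerate(DIMENSION_ORDER)}
    ordered = sorted(labels, key=lambda label: rank[label[1]])
    return "|".join(name for name, _ in ordered)


def poly31_hash(text):
    """Second preset rule (assumed): signed 32-bit polynomial hash."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def open_store():
    """Create an in-memory table with an indexed integer key column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE knowledge ("
        " id TEXT PRIMARY KEY,"
        " graph_labels TEXT NOT NULL,"
        " graph_key INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_graph_key ON knowledge (graph_key)")
    return conn


def try_store(conn, knowledge_id, labels):
    """Return True if the graph was written, False if it was refused."""
    canonical = canonical_labels(labels)   # S102
    key = poly31_hash(canonical)           # S103
    # S104: numeric query on the integer key.
    hit = conn.execute(
        "SELECT 1 FROM knowledge WHERE graph_key = ? LIMIT 1", (key,)
    ).fetchone()
    if hit is not None:
        return False                       # S105: refuse the write
    # Optional fallback: string query on the label text.
    dup = conn.execute(
        "SELECT 1 FROM knowledge WHERE graph_labels = ? LIMIT 1", (canonical,)
    ).fetchone()
    if dup is not None:
        return False
    conn.execute(
        "INSERT INTO knowledge (id, graph_labels, graph_key) VALUES (?, ?, ?)",
        (knowledge_id, canonical, key),
    )
    return True
```

For instance, storing the four telephone-bank labels once succeeds, while a second attempt with the same labels in a different order is refused, because both reduce to the same ordered text and therefore the same integer key.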
According to the above technical solution, the pre-established database holds a plurality of stored knowledge items, and the knowledge graph corresponding to each stored knowledge item is converted into a numeric string according to the first preset rule and the second preset rule; that numeric string is stored in a database table of the database as the unique identifier of the knowledge graph. When new knowledge is to be stored, the knowledge graph corresponding to the knowledge to be stored is acquired, and its graph labels are adjusted according to the first preset rule to obtain a target knowledge graph, which prevents the same knowledge graph from yielding different target numeric strings because of factors such as a different graph label order. The target knowledge graph is then converted into a target numeric string according to the second preset rule, so that a numeric query can determine whether the database table includes a numeric string identical to the target numeric string. If it does, the knowledge graph corresponding to the knowledge to be stored is already associated with other knowledge, and writing the target knowledge graph into the database is refused. This prevents different knowledge items in the database from sharing the same knowledge graph, avoids the situation in which one knowledge graph matches several query statements and a wrong answer is returned, and thereby improves answer accuracy. In addition, a database performs numeric queries faster than character string queries, which improves the efficiency of the duplicate check.
In addition to the deduplication method for a knowledge graph described above, an embodiment of the present application further provides a deduplication device for a knowledge graph. As shown in FIG. 2, the device includes an acquiring unit 201, a first conversion unit 202, a second conversion unit 203, a judging unit 204, and an executing unit 205;
the acquiring unit 201 is configured to acquire a knowledge graph corresponding to knowledge to be stored;
the first conversion unit 202 is configured to adjust the graph labels in the knowledge graph according to a first preset rule to obtain a target knowledge graph;
the second conversion unit 203 is configured to convert the target knowledge graph into a target number string according to a second preset rule;
the judging unit 204 is configured to determine, by a numeric query, whether a database table includes a number string identical to the target number string, where the database table includes a plurality of number strings obtained from the knowledge graphs corresponding to a plurality of pieces of stored knowledge according to the first preset rule and the second preset rule;
the executing unit 205 is configured to refuse to write the target knowledge graph into the database if the database table includes a number string identical to the target number string.
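As an illustrative, non-limiting sketch, the cooperation of the five units can be modelled as follows. The class name `DedupDevice`, the representation of a knowledge graph as a list of label strings, the `"|"` separator, and the in-memory set standing in for the database table are all assumptions made for illustration and are not taken from the embodiment. For brevity the sketch writes the graph whenever the numeric query finds no match; the optional string-query check is omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class DedupDevice:
    """Toy stand-in for units 201-205; a set plays the database table."""
    number_table: set = field(default_factory=set)  # stored number strings
    database: list = field(default_factory=list)    # stored target graphs

    def acquire(self, knowledge: dict) -> list:
        # Acquiring unit 201: fetch the graph (here, a list of labels).
        return list(knowledge["labels"])

    def first_convert(self, labels: list) -> list:
        # First conversion unit 202: put the labels into a canonical order.
        return sorted(labels)

    def second_convert(self, target_graph: list) -> int:
        # Second conversion unit 203: map the canonical graph to a number.
        # Placeholder encoding: the joined labels' bytes read as an integer.
        return int.from_bytes("|".join(target_graph).encode("utf-8"), "big")

    def judge(self, target_number: int) -> bool:
        # Judging unit 204: numeric membership test against the table.
        return target_number in self.number_table

    def execute(self, knowledge: dict) -> bool:
        # Executing unit 205: refuse the write on a match, otherwise store.
        target_graph = self.first_convert(self.acquire(knowledge))
        target_number = self.second_convert(target_graph)
        if self.judge(target_number):
            return False
        self.number_table.add(target_number)
        self.database.append(target_graph)
        return True
```

With this sketch, two pieces of knowledge whose graphs contain the same labels in a different order produce the same number, so the second write is refused.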
Optionally, the second conversion unit 203 is configured to:
calculate a target code corresponding to the target knowledge graph by hash coding, and use the target code as the target number string.
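The hash-coding option can be illustrated with the following minimal sketch. The embodiment does not name a hash function; SHA-256, the unit-separator character used to join labels, and the truncation to 63 bits (so that the value fits a signed 64-bit integer column) are assumptions chosen for illustration. The function name `hash_number_string` is hypothetical.

```python
import hashlib


def hash_number_string(target_graph: list[str]) -> int:
    """Hash an ordered list of graph labels to a non-negative integer."""
    # Join with a separator that is unlikely to occur inside a label
    # (assumption), so that ["ab", "c"] and ["a", "bc"] hash differently.
    joined = "\x1f".join(target_graph)
    digest = hashlib.sha256(joined.encode("utf-8")).digest()
    # Keep 63 bits so the result fits a signed 64-bit integer column.
    return int.from_bytes(digest[:8], "big") >> 1
```

Because the digest is truncated, two different graphs could in principle share a number, so a deployment along these lines may want to confirm a numeric match against the stored graph text.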
Optionally, the second conversion unit 203 is configured to:
acquire the index value corresponding to each graph label in the knowledge graph;
and combine the index values in the order of the graph labels in the target knowledge graph to obtain the target number string.
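The index-value option can be sketched as below. The label-to-index mapping, its example values, and the function name `index_number_string` are hypothetical. Zero-padding each index to a fixed width is also an assumption, added so that, for example, the index pairs (1, 23) and (12, 3) do not both collapse to "123".

```python
def index_number_string(target_graph, label_index, width=4):
    """Concatenate fixed-width label indices in target-graph order."""
    parts = []
    for label in target_graph:
        idx = label_index[label]  # raises KeyError for an unknown label
        if not 0 <= idx < 10 ** width:
            raise ValueError(f"index {idx} does not fit in {width} digits")
        parts.append(f"{idx:0{width}d}")
    return "".join(parts)


# Hypothetical index values for three labels.
label_index = {"annual fee": 7, "credit card": 12, "waiver": 305}
number_string = index_number_string(
    ["annual fee", "credit card", "waiver"], label_index
)
# number_string == "000700120305"
```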
Optionally, the executing unit 205 is further configured to:
if the database table does not contain a number string identical to the target number string, determine, by a string query, whether the target knowledge graph is stored in the database;
if the target knowledge graph is stored in the database, refuse to write the target knowledge graph into the database;
and if the target knowledge graph is not stored in the database, write the target knowledge graph into the database.
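The two-stage check (numeric query first, string query as a fallback) can be sketched with Python's built-in `sqlite3` module. The table name `graphs`, the column names, the function name `try_store`, and the "legacy row without a number string" scenario are assumptions made for illustration; the embodiment does not specify a schema or a database engine.

```python
import sqlite3


def try_store(conn, target_number, target_graph_text):
    """Return True if the graph was written, False if it was refused."""
    cur = conn.cursor()
    # Stage 1: numeric query against the indexed number column.
    cur.execute("SELECT 1 FROM graphs WHERE number_key = ? LIMIT 1",
                (target_number,))
    if cur.fetchone():
        return False
    # Stage 2: string query, reached only when no number string matched.
    cur.execute("SELECT 1 FROM graphs WHERE graph_text = ? LIMIT 1",
                (target_graph_text,))
    if cur.fetchone():
        return False
    cur.execute("INSERT INTO graphs (number_key, graph_text) VALUES (?, ?)",
                (target_number, target_graph_text))
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE graphs (number_key INTEGER, graph_text TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_number_key ON graphs (number_key)")
# Hypothetical legacy row that was stored without a number string.
conn.execute("INSERT INTO graphs (number_key, graph_text) VALUES (NULL, ?)",
             ("annual fee|credit card",))
```

In this sketch the string query catches the legacy row, which the numeric query alone would miss.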
Optionally, the first conversion unit 202 is configured to:
adjust the order of the graph labels in the knowledge graph according to the order of the first letters of the graph labels to obtain the target knowledge graph.
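A minimal sketch of the first-letter ordering follows, assuming the graph labels are Latin-script strings. The embodiment orders labels by first letter only; breaking ties between labels that share a first letter by comparing the full label is an assumption added here, since otherwise the result could still depend on the input order. The function name `to_target_graph` is hypothetical.

```python
def to_target_graph(labels):
    """Return the labels in a canonical, input-order-independent order."""
    # Primary key: lower-cased first letter. Secondary key (assumption):
    # the full label, so labels sharing a first letter order the same way.
    return sorted(labels, key=lambda label: (label[:1].lower(), label))


first = to_target_graph(["waiver", "credit card", "annual fee"])
second = to_target_graph(["credit card", "annual fee", "waiver"])
# Both calls yield ["annual fee", "credit card", "waiver"].
```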
In the deduplication device for a knowledge graph provided by the embodiment of the present application, the pre-established database contains a plurality of pieces of stored knowledge, and the knowledge graph corresponding to each piece of stored knowledge is converted into a number string according to the first preset rule and the second preset rule. The number strings are stored in the database table corresponding to the database and serve as unique identifiers of the knowledge graphs. When new knowledge is to be stored, the knowledge graph corresponding to the knowledge to be stored is acquired, and the graph labels in the knowledge graph are adjusted according to the first preset rule to obtain the target knowledge graph. This avoids the problem of the same knowledge graph producing different target number strings because of factors such as a different order of graph labels. The target knowledge graph is then converted into the target number string according to the second preset rule, so that a numeric query can determine whether the database table contains a number string identical to the target number string. If it does, the knowledge graph corresponding to the knowledge to be stored is already associated with other knowledge, and writing the target knowledge graph into the database is refused. This prevents different pieces of knowledge in the database from having duplicate knowledge graphs, avoids one knowledge graph matching multiple query statements and thereby returning a wrong answer, and improves the accuracy of question answering. In addition, for the database, a numeric query is faster than a string query, which improves the efficiency of the duplicate check.
The embodiments in this specification are described in a progressive manner. For parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the others. In particular, because the device embodiment is substantially similar to the method embodiment, it is described relatively briefly, and the relevant parts of the method embodiment may be consulted. The device embodiments described above are merely illustrative, and the units and modules described as separate components may or may not be physically separate. In addition, some or all of the units and modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. A person of ordinary skill in the art can understand and implement the embodiments without inventive effort.
The foregoing describes only embodiments of the present application. It should be noted that those skilled in the art may make numerous modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are also intended to fall within the scope of the present application.

Claims (10)

1. A deduplication method for a knowledge graph, the method comprising:
acquiring a knowledge graph corresponding to knowledge to be stored;
adjusting graph labels in the knowledge graph according to a first preset rule to obtain a target knowledge graph;
converting the target knowledge graph into a target number string according to a second preset rule;
determining, by a numeric query, whether a database table comprises a number string identical to the target number string, wherein the database table comprises a plurality of number strings obtained from the knowledge graphs corresponding to a plurality of pieces of stored knowledge according to the first preset rule and the second preset rule;
and if the database table comprises a number string identical to the target number string, refusing to write the target knowledge graph into the database.
2. The method of claim 1, wherein converting the target knowledge-graph into a target number string according to a second preset rule comprises:
calculating a target code corresponding to the target knowledge graph by hash coding, and using the target code as the target number string.
3. The method of claim 1, wherein converting the target knowledge-graph into a target number string according to a second preset rule comprises:
acquiring the index value corresponding to each graph label in the knowledge graph;
and combining the index values in the order of the graph labels in the target knowledge graph to obtain the target number string.
4. The method of claim 1, further comprising:
if the database table does not contain the numeric string which is the same as the target numeric string, judging whether the target knowledge graph is stored in the database or not in a character string query mode;
if the target knowledge graph is stored in the database, refusing to write the target knowledge graph into the database;
and if the target knowledge graph is not stored in the database, writing the target knowledge graph into the database.
5. The method of claim 1, wherein the adjusting the graph labels in the knowledge graph according to a first preset rule to obtain a target knowledge graph comprises:
adjusting the order of the graph labels in the knowledge graph according to the order of the first letters of the graph labels to obtain the target knowledge graph.
6. A deduplication device for a knowledge graph, comprising an acquiring unit, a first conversion unit, a second conversion unit, a judging unit, and an executing unit;
the acquiring unit is configured to acquire a knowledge graph corresponding to knowledge to be stored;
the first conversion unit is configured to adjust graph labels in the knowledge graph according to a first preset rule to obtain a target knowledge graph;
the second conversion unit is configured to convert the target knowledge graph into a target number string according to a second preset rule;
the judging unit is configured to determine, by a numeric query, whether a database table comprises a number string identical to the target number string, wherein the database table comprises a plurality of number strings obtained from the knowledge graphs corresponding to a plurality of pieces of stored knowledge according to the first preset rule and the second preset rule;
and the executing unit is configured to refuse to write the target knowledge graph into the database if the database table comprises a number string identical to the target number string.
7. The apparatus of claim 6, wherein the second conversion unit is configured to:
calculate a target code corresponding to the target knowledge graph by hash coding, and use the target code as the target number string.
8. The apparatus of claim 6, wherein the second conversion unit is configured to:
acquire the index value corresponding to each graph label in the knowledge graph;
and combine the index values in the order of the graph labels in the target knowledge graph to obtain the target number string.
9. The apparatus of claim 6, wherein the executing unit is further configured to:
if the database table does not contain the numeric string which is the same as the target numeric string, judging whether the target knowledge graph is stored in the database or not in a character string query mode;
if the target knowledge graph is stored in the database, refusing to write the target knowledge graph into the database;
and if the target knowledge graph is not stored in the database, writing the target knowledge graph into the database.
10. The apparatus of claim 6, wherein the first conversion unit is configured to:
adjust the order of the graph labels in the knowledge graph according to the order of the first letters of the graph labels to obtain the target knowledge graph.
CN202110849615.1A 2021-07-27 2021-07-27 A method and device for preventing duplication in knowledge graphs Active CN113486194B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110849615.1A CN113486194B (en) 2021-07-27 2021-07-27 A method and device for preventing duplication in knowledge graphs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110849615.1A CN113486194B (en) 2021-07-27 2021-07-27 A method and device for preventing duplication in knowledge graphs

Publications (2)

Publication Number Publication Date
CN113486194A true CN113486194A (en) 2021-10-08
CN113486194B CN113486194B (en) 2026-01-09

Family

ID=77943917

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110849615.1A Active CN113486194B (en) 2021-07-27 2021-07-27 A method and device for preventing duplication in knowledge graphs

Country Status (1)

Country Link
CN (1) CN113486194B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115271674A (en) * 2022-08-05 2022-11-01 深圳证券信息有限公司 Information disclosure text auditing method, computer equipment and computer storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104376014A (en) * 2013-08-15 2015-02-25 中国科学院声学研究所 Structured P2P network resource publishing and querying method
CN109033358A (en) * 2018-07-26 2018-12-18 李辰洋 News Aggreagation and the associated method of intelligent entity
CN109739854A (en) * 2018-12-27 2019-05-10 新华三大数据技术有限公司 A kind of date storage method and device
CN110020086A (en) * 2017-12-22 2019-07-16 中国移动通信集团浙江有限公司 A kind of user draws a portrait querying method and device
CN111552818A (en) * 2020-04-27 2020-08-18 中国银行股份有限公司 Customer service knowledge base query method and device
CN111930966A (en) * 2020-10-07 2020-11-13 杭州实在智能科技有限公司 Intelligent policy matching method and system for digital government affairs
CN112445889A (en) * 2020-11-30 2021-03-05 杭州海康威视数字技术股份有限公司 Method for storing data and retrieving data and related equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115271674A (en) * 2022-08-05 2022-11-01 深圳证券信息有限公司 Information disclosure text auditing method, computer equipment and computer storage medium
CN115271674B (en) * 2022-08-05 2026-02-03 深圳证券信息有限公司 Information disclosure auditing method, computer equipment and computer storage medium

Also Published As

Publication number Publication date
CN113486194B (en) 2026-01-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant