JP7141133B2

JP7141133B2 - Information processing device, information processing method and information processing program

Info

Publication number: JP7141133B2
Application number: JP2020087379A
Authority: JP
Inventors: 圭堀口; 豪宮坂; 智之植木
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-05-19
Filing date: 2020-05-19
Publication date: 2022-09-22
Anticipated expiration: 2040-05-19
Also published as: JP2021182249A

Description

Application of Article 30, Paragraph 2 of the Patent Act
1) Website publication date: May 20, 2019. 2) Website address: https://prtimes.jp/main/html/rd/p/000000005.000037680.html 3) Publisher: PR TIMES Co., Ltd.
1) Website publication date: May 29, 2019. 2) Website address: https://www.cloudsign.jp/media/20190529-lawgue/ 3) Publisher: Bengo4.com, Inc.
1) Website publication date: December 18, 2019. 2) Website address: https://jp.techcrunch.com/2019/12/18/jlsi-fundraising/ 3) Publisher: Verizon Media Japan K.K.
1) Website publication date: March 5, 2020. 2) Website address: https://lawgue.com/news/94 3) Publishers: Kei Horiguchi, Go Miyasaka, and Tomoyuki Ueki
1) Website publication date: May 20, 2019. 2) Website address: https://lawgue.com/ 3) Publishers: Kei Horiguchi, Go Miyasaka, and Tomoyuki Ueki

The present invention relates to an information processing device, an information processing method, and an information processing program.

Contracts are generally drawn up for transactions, and because a contract affects the transaction it is very important. For this reason, systems that support contract drafting have been developed. For example, Patent Literature 1 discloses a contract analysis system that executes: a process of generating a document vector for each of a plurality of legal provisions contained in a plurality of laws and regulations; a process of comparing the document vectors of the provisions with one another and generating provision groups, each combining a plurality of provisions whose similarity is equal to or greater than a predetermined threshold; a process of generating a document vector for each provision group; a process of generating a document vector for each clause of input contract data; a process of comparing the document vector of each clause with the document vector of each provision group and identifying the legal provisions in a similar provision group as provisions related to that contract clause; and a process of generating an analysis result screen that lists the related provisions for each contract clause.

In the above contract analysis system, when the user inputs contract data, the legal provisions closely related to each contract clause are presented automatically, so that even a user unfamiliar with the law can check the related provisions in advance. However, although the system automatically presents legal provisions closely related to the contract being drafted, it does not allow a contract to be drafted while comparing it with past contracts, and it therefore leaves room for improvement in convenience.

Patent Literature 1: JP 2014-238629 A

The present invention has been made in view of the above problem, and its object is to provide a highly convenient information processing device, information processing method, and information processing program.

To solve the above problem, an information processing device of the present invention includes: a calculation unit that calculates, in units of a predetermined region, the similarity between a first document and a second document different from the first document; a comparison unit that compares the first document and the second document in units of the predetermined region according to the similarity calculated by the calculation unit; and an output unit that outputs comparison information for displaying identical portions or differing portions of the first document and the second document in a manner different from other portions, according to the comparison result of the comparison unit.

According to the present invention, a highly convenient information processing device, information processing method, and information processing program can be provided.

A diagram showing an example of the schematic configuration of the information processing system according to the embodiment.
A diagram showing an example of the hardware configuration of the server according to the embodiment.
A diagram showing an example of the database stored in the storage device of the server according to the embodiment.
A diagram showing an example of the functional configuration of the server according to the embodiment.
A diagram showing an example of the hardware configuration and functional configuration of the user terminal according to the embodiment.
A diagram showing an example of a screen displayed on the display device of the user terminal according to the embodiment.
A diagram showing an example of a screen displayed on the display device of the user terminal according to the embodiment.
A diagram showing an example of a screen displayed on the display device of the user terminal according to the embodiment.
A flowchart showing an example of the user registration processing executed by the server according to the embodiment.
A flowchart showing an example of the document registration processing executed by the server according to the embodiment.
A flowchart showing an example of the counting processing executed by the server according to the embodiment.
A flowchart showing an example of the comparison processing executed by the server according to the embodiment.
A diagram showing an example of the functional configuration of the server according to Modification 1 of the embodiment.
A diagram showing an example of a screen displayed on the display device of the user terminal according to Modification 1 of the embodiment.

An embodiment of the present invention will be described below with reference to the drawings.
In the following embodiment, a contract is used as an example of a document, but the document is not limited to a contract. It may be, for example, a set of rules or regulations, an ordinance or treaty, an annual securities report, a summary of financial results, or a patent specification.
In the following embodiment, the vectors produced by the conversion unit 205 (described later) are stored for each clause of a contract. Alternatively, the conversion unit 205 may convert the clauses of the target contracts into vectors each time a comparison is made.
In the following embodiment, similarity is calculated from the vectors produced by the conversion unit 205. Alternatively, similarity may be calculated using sequence matching or edit distance.
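As an illustration of the two alternatives just mentioned, the following Python sketch computes a similarity from sequence matching (using the standard library's `difflib.SequenceMatcher`) and from edit distance (a plain Levenshtein implementation). The function names and the normalization of edit distance to the range 0 to 1 are assumptions made for this example; the specification does not say how either measure is implemented.

```python
from difflib import SequenceMatcher


def sequence_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] based on difflib's matching-block ratio."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 minus distance over the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest
```

Either function could stand in for the vector-based score when clauses are compared as raw text, at the cost of ignoring wording that differs in form but not in meaning.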

[Embodiment]
FIG. 1 is a diagram showing an example of the schematic configuration of an information processing system 1 according to the embodiment. First, the configuration of the information processing system 1 will be described with reference to FIG. 1. The information processing system 1 includes a server 2 (information processing device) and a user terminal 3 connected to the server 2 via a network 4. The numbers of servers 2 and user terminals 3 included in the information processing system 1 are arbitrary. Communication between the server 2 and the user terminal 3 may be wireless or wired.

(Server 2)
FIG. 2 is a diagram showing an example of the hardware configuration of the server 2 (information processing device). As shown in FIG. 2, the server 2 includes a communication IF 200A, a storage device 200B, a CPU 200C, and the like. The server 2 may also be provided with an input device (for example, a keyboard or touch panel) and a display device (for example, a liquid crystal monitor or organic EL monitor).

The communication IF 200A is an interface for communicating with an external terminal (in the embodiment, the user terminal 3).

The storage device 200B is, for example, an HDD or a semiconductor storage device. The storage device 200B stores the information processing program, databases, and other data used by the server 2. In the embodiment, the information processing program and the databases are stored in the storage device 200B of the server 2, but they may instead be stored in an external storage device such as a USB memory, or on an external server connected via a network, and be referenced or downloaded as needed.

FIG. 3 is an example of the databases stored in the storage device 200B. As shown in FIG. 3, the storage device 200B stores a user database 1 (hereinafter, user DB1) and a document database 2 (hereinafter, document DB2). How the information in the databases DB1 and DB2 described below is associated and stored in the storage device 200B is arbitrary and is not limited to the example shown in FIG. 3. The information also need not be stored in the storage device 200B in the form of a database.

FIG. 3 is a diagram showing an example of the information (data) stored in the storage device 200B as databases.
(User DB1)
The user DB1 stores user information, such as a password, name, gender, age, date of birth, contact information, and self-portrait image data for an icon, in association with a user ID. The user ID is identification information that differs for each user and also serves as the ID for logging in to the information processing system 1. The password is the password for logging in to the information processing system 1. The name, gender, age, and date of birth are those of the user. The contact information is the user's e-mail address, telephone number, postal address, and the like. What information is stored in the user DB1 in association with the user ID is arbitrary and is not limited to the above.

(Document DB2)
The document DB2 stores information such as contracts and various tables. The information stored in the document DB2 is described below.
The contract type master table M1 stores records each pairing a contract type ID with a clause name. Here, the contract type ID is an example of contract type identification information that identifies the type of a contract.
The type master table M2 stores records each pairing a clause type ID with a clause name. Here, the clause type ID is an example of clause type identification information that identifies the type of a clause.
The standard clause table T1 holds information on the clauses contained in standard contracts prepared in advance. It stores records each combining a contract type ID, a clause type ID, a vector, and a clause. The vector is the result of converting the clause, that is, the group of sentences making up one clause.
The clause table T2 stores the clauses contained in contracts input (uploaded) by users. It stores records each combining a contract type ID, a clause type ID, a clause, and the vector obtained by converting the clause.
In addition, in the document DB2, the type of contract is further stored in association with each pair consisting of the vector for a clause contained in a standard contract and the clause type identification information (here, the contract type ID) attached to that clause.
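The record layout described for tables T1 and T2 can be pictured with a small Python sketch. This is only one possible in-memory shape, not the schema used by the embodiment: the class name, field names, example IDs, and example values are all hypothetical, and the specification itself notes that the data need not be held as a database at all.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClauseRecord:
    """One row of the standard clause table T1 or the clause table T2."""
    contract_type_id: str  # key into the contract type master table M1
    clause_type_id: str    # key into the type master table M2
    vector: List[float]    # vector obtained by converting the clause text
    clause: str            # the clause text itself


# Master tables map an ID to a name (the IDs and names here are made up).
contract_type_master: Dict[str, str] = {"CT01": "Confidentiality"}  # M1
clause_type_master: Dict[str, str] = {"CL01": "Purpose"}            # M2

standard_clause_table: List[ClauseRecord] = []  # T1: clauses of standard contracts
clause_table: List[ClauseRecord] = []           # T2: clauses of uploaded contracts

clause_table.append(
    ClauseRecord("CT01", "CL01", [0.12, -0.40, 0.88], "第1条（目的）")
)
```

Keeping the vector next to the clause text matches the embodiment's choice to store vectors per clause rather than recompute them for every comparison.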

The CPU 200C controls the server 2 and includes a ROM (Read Only Memory) and a RAM (Random Access Memory), neither of which is shown.

FIG. 4 is a diagram showing an example of the functional configuration of the server 2 according to the embodiment. As shown in FIG. 4, the server 2 has functions such as a receiving unit 201 (first and second reception units), a transmitting unit 202 (output unit), a storage device control unit 203, a dividing unit 204, a conversion unit 205, a calculation unit 206, a classification unit 207, a counting unit 208, and a comparison unit 209. The functions shown in FIG. 4 are realized by the CPU 200C executing the information processing program stored in the ROM (not shown) of the server 2.

The receiving unit 201 receives information transmitted from the user terminal 3 via the network 4. For example, the receiving unit 201 receives a first instruction to display, group by group, the second contracts classified by the classification unit 207. The receiving unit 201 also receives, for example, a second instruction to display a region other than the predetermined region of a second contract compared by the comparison unit 209.

The storage device control unit 203 controls the storage device 200B. Specifically, the storage device control unit 203 writes information to and reads information from the storage device 200B.

The dividing unit 204 divides a contract into units of a predetermined region. In the embodiment, the dividing unit 204 divides the contract clause by clause. Specifically, the dividing unit 204 extracts the word 「条」 ("Article") from the contract and, treating each occurrence as a boundary, divides the contract into clauses. The unit used as the predetermined region is arbitrary. For example, the target contract may be divided by paragraph (「項」) or item (「号」). In that case, the dividing unit 204 extracts the word 「項」 or 「号」 from the target contract and, treating it as a boundary, divides the contract into paragraphs or items. Alternatively, instead of extracting the word 「条」, the dividing unit 204 may determine from the text of consecutive lines that a passage is an article-level paragraph and divide the contract accordingly. Likewise, for 「項」 and 「号」, instead of extracting those words, the dividing unit 204 may determine from the text of consecutive lines that a passage is a paragraph-level or item-level paragraph and divide the contract accordingly.
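A minimal sketch of clause-level splitting in Python follows. It assumes that each clause starts on its own line with a heading of the form 「第N条」, where N is written in ASCII digits, full-width digits, or simple kanji numerals; the specification only says that the word 「条」 is treated as a boundary, so the exact pattern, the function name, and the handling of the preamble are assumptions of this example.

```python
import re
from typing import List

# Zero-width match at the start of any line that begins with "第<number>条".
ARTICLE_START = re.compile(r"(?m)^(?=第[0-9０-９一二三四五六七八九十百]+条)")


def split_into_articles(contract_text: str) -> List[str]:
    """Split a contract at article headings; text before the first
    heading (title, preamble) is returned as its own leading chunk."""
    parts = ARTICLE_START.split(contract_text)
    return [part.strip() for part in parts if part.strip()]


sample = (
    "業務委託契約書\n"
    "第1条（目的）\n甲は乙に業務を委託する。\n"
    "第2条（報酬）\n報酬は別途定める。\n"
)
articles = split_into_articles(sample)
```

Anchoring the pattern to the start of a line keeps in-sentence cross-references such as 「第1条に定める」 from being treated as boundaries, although a line that happens to begin with such a reference would still be split.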

The conversion unit 205 converts the clauses of a contract into vectors. Specifically, the conversion unit 205 converts each clause produced by the dividing unit 204 into a vector. The vector is, for example, a high-dimensional real-valued vector. The conversion unit 205 converts clauses with similar meanings into vectors that are close to one another. By comparing the resulting vectors, the similarity of contracts can be calculated clause by clause.

<Conversion processing by the conversion unit 205>
Next, an example of processing for converting each sentence into a vector will be described.
First, the conversion unit 205 decomposes a sentence into morphemes. For example, the sentence 「今日はいい天気です」 ("The weather is nice today") is decomposed into 「今日」, 「は」, 「いい」, 「天気」, and 「です」.
Next, the conversion unit 205 defines each run of N consecutive morphemes (N is a natural number) as an n-gram. For example, when N is 2, the conversion unit 205 defines the n-grams as follows.
N=2: (今日, は), (は, いい), (いい, 天気), (天気, です)
Next, the conversion unit 205 computes matrices. Suppose there are three contracts S1, S2, and S3. The conversion unit 205 computes matrices U and V such that all n-grams of contract S1 appear most strongly in contract S1 while the different n-grams contained in contracts S2 and S3 do not appear, and likewise for contracts S2 and S3. The matrix U is a set of values for each sentence; the vectors (distributed representations) of the contracts are derived by running training (an optimization function) that optimizes the matrices U and V. The matrix U has the same size as the matrix V but is rotated by 90 degrees, and is a set of values for each sentence. Then, based on V, one of the two matrices, the conversion unit 205 averages the distributed representations of all the n-grams contained in a sentence to obtain the vector of the sentence (also called its distributed representation).
A concrete example of a technique for converting a target sentence S into a vector is briefly described below. During training, a representation of each word is obtained in the matrix V. The parameter U is also used in order to learn the patterns in which words appear within sentences. The matrices U and V are optimized using an error function. When deriving the distributed representation, the bag-of-words vector D of the sentence S is obtained. The vector D is multiplied by the matrix V, and the vector representations of the n-grams, weighted by their frequency of appearance, are averaged to obtain the vector representation of the sentence S.
A known library or known method may be used for the conversion into vectors by the conversion unit 205.
When a clause contains three sentences (a first through a third sentence), the conversion unit 205 converts each of the three sentences into a vector and takes the average of those vectors as the vector corresponding to the clause.
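The inference-time steps above (form n-grams, average their learned representations into a sentence vector, then average sentence vectors into a clause vector) can be sketched in Python as follows. The sketch assumes the sentence has already been split into morphemes by some tokenizer and that the n-gram representations (the rows of the matrix V) have already been trained; the dictionary `embeddings`, the function names, and the choice to skip n-grams that have no learned representation are assumptions of this example. Training of U and V is not shown.

```python
from typing import Dict, List, Sequence, Tuple

NGram = Tuple[str, ...]


def ngrams(morphemes: Sequence[str], n: int) -> List[NGram]:
    """All runs of n consecutive morphemes, in order of appearance."""
    return [tuple(morphemes[i:i + n]) for i in range(len(morphemes) - n + 1)]


def sentence_vector(morphemes: Sequence[str], n: int,
                    embeddings: Dict[NGram, List[float]], dim: int) -> List[float]:
    """Average the representations of the sentence's n-grams. Repeated
    n-grams are counted once per occurrence, so frequent ones weigh more."""
    known = [g for g in ngrams(morphemes, n) if g in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(embeddings[g][k] for g in known) / len(known) for k in range(dim)]


def clause_vector(sentence_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the sentence vectors of one clause, component by component."""
    count = len(sentence_vectors)
    return [sum(column) / count for column in zip(*sentence_vectors)]


bigrams = ngrams(["今日", "は", "いい", "天気", "です"], 2)
```

Here `bigrams` holds the four pairs listed for N=2 above, and `clause_vector` corresponds to the three-sentence averaging described in the last paragraph.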

The calculation unit 206 calculates, based on vectors, the similarity between the clauses of the contract being edited by the user (hereinafter also called the first contract) and the clauses of other contracts stored in the document DB2 (hereinafter also called second contracts). Specifically, the calculation unit 206 calculates the similarity (for example, the cosine similarity) between the vector of a clause of the first contract and the vector of a clause of a second contract. In the embodiment, the calculation unit 206 calculates the similarity between contract clauses from the cosine similarity between vectors, but any other technique may be used as long as it yields the similarity between contract clauses. For example, the similarity may be calculated using the Euclidean distance.
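For reference, the two measures named above can be written in a few lines of dependency-free Python. The function names are chosen for this example, and returning 0.0 for a zero-length vector is an assumed convention rather than anything stated in the specification.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Note the opposite orientation of the two measures: a larger cosine similarity means closer clauses, whereas a larger Euclidean distance means the clauses are further apart, so a distance would need to be inverted or rescaled before being compared against similarity thresholds.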

The classification unit 207 classifies the second contracts into predetermined groups (one or more groups) according to the similarity between contract clauses. Specifically, the classification unit 207 classifies a second contract whose similarity calculated by the calculation unit 206 is equal to or greater than a first predetermined value and less than a second predetermined value into a first group. It classifies a second contract whose similarity is equal to or greater than the second predetermined value and less than a third predetermined value into a second group, and a second contract whose similarity is equal to or greater than the third predetermined value and less than a fourth predetermined value into a third group. Although the embodiment uses three groups, the number of groups is arbitrary. The first to fourth predetermined values may also be set to any values.
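The half-open ranges described above map naturally onto a binary search over the sorted threshold values. The following Python sketch shows one way to do this; the threshold values are made-up examples (the specification leaves them open), and the function name and the use of `None` for scores outside every range are assumptions.

```python
from bisect import bisect_right
from typing import Optional, Sequence

# Hypothetical first to fourth predetermined values, in ascending order.
# The last one sits slightly above 1.0 so that a perfect score of 1.0
# still falls into the third group.
THRESHOLDS = [0.4, 0.6, 0.8, 1.01]


def classify(similarity: float, thresholds: Sequence[float] = THRESHOLDS) -> Optional[int]:
    """Return the 1-based group k such that
    thresholds[k-1] <= similarity < thresholds[k], or None if no range fits."""
    k = bisect_right(thresholds, similarity)
    if k == 0 or k == len(thresholds):
        return None
    return k
```

With these example values, a score of 0.6 lands in the second group because each range includes its lower bound and excludes its upper bound.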

When classifying contracts, the classification unit 207 may cluster the vectorized clauses in advance using, for example, a Gaussian mixture model, and decide the grouping according to the clustering result. The predetermined similarity values and clustering may also be combined, so that a group is decided on the condition that the similarity falls within the range of the predetermined values and that the clauses belong to the same cluster.

The counting unit 208 counts, for each group, the number of second contracts classified by the classification unit 207.
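Per-group counting can be illustrated with the standard library's `collections.Counter`. The mapping from a contract identifier to its group number, and the convention that unclassified contracts carry `None`, are assumptions of this sketch.

```python
from collections import Counter
from typing import Dict, Optional


def count_by_group(group_of_contract: Dict[str, Optional[int]]) -> Counter:
    """Count classified second contracts per group, skipping unclassified ones."""
    return Counter(g for g in group_of_contract.values() if g is not None)


counts = count_by_group({"c1": 1, "c2": 1, "c3": 3, "c4": None})
```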

The comparison unit 209 compares the first document and the second document in units of the predetermined region (in the embodiment, clauses) according to the similarity calculated by the calculation unit 206. Specifically, the comparison unit 209 compares a clause of the first contract with a clause of a second contract and detects the portions (characters) that differ between the two contracts.
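Character-level difference detection of this kind can be sketched with `difflib.SequenceMatcher` from the Python standard library. The specification does not name a diff algorithm, so the use of `difflib`, the function name, and the tuple layout of the result are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def diff_spans(first: str, second: str) -> List[Tuple[str, str, str]]:
    """Align two clauses character by character and return, for each aligned
    run, (tag, text_in_first, text_in_second). The tag is one of
    'equal', 'replace', 'delete', or 'insert'."""
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    return [
        (tag, first[i1:i2], second[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


spans = diff_spans("abc", "abd")
```

Runs tagged `equal` are the identical portions; every other tag marks a portion that differs between the two clauses.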

The transmitting unit 202 transmits information to the user terminal 3 via the network 4. For example, according to the comparison result of the comparison unit 209, the transmitting unit 202 transmits (outputs) comparison information that causes the identical portions and the differing portions of the compared contracts to be displayed in different manners.
When the receiving unit 201 receives the first instruction to display, group by group, the second contracts classified by the classification unit 207, the transmitting unit 202 transmits (outputs) information for displaying the second contracts that belong to the specified group.
When the receiving unit 201 receives the second instruction to display a region other than the predetermined region of a second contract compared by the comparison unit 209, the transmitting unit 202 transmits (outputs) information for displaying the region other than the predetermined region.
The transmitting unit 202 also transmits (outputs) the predetermined region of a second contract whose similarity is equal to or less than a predetermined value.
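One possible form of such comparison information is markup in which the differing text is wrapped in a highlighting element. The Python sketch below renders a second-contract clause as HTML and wraps the text that differs from the first-contract clause in `<mark>`. The HTML output format, the choice of `<mark>`, and the function name are assumptions; the specification only requires that identical and differing portions be displayed in different manners.

```python
from difflib import SequenceMatcher
from html import escape


def render_comparison(first: str, second: str) -> str:
    """Return the second clause as HTML, with text that is absent from or
    changed relative to the first clause wrapped in <mark> tags."""
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    pieces = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        segment = escape(second[j1:j2])
        if tag == "equal":
            pieces.append(segment)
        elif segment:  # 'replace' or 'insert'; 'delete' has no text in second
            pieces.append(f"<mark>{segment}</mark>")
    return "".join(pieces)
```

Escaping each segment before wrapping it keeps characters such as `<` in the contract text from being interpreted as markup by the terminal's browser.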

(User terminal 3)
FIG. 5 is a diagram showing an example of the hardware configuration and functional configuration of the user terminal 3 according to the embodiment. FIG. 5(a) shows an example of the hardware configuration of the user terminal 3, and FIG. 5(b) shows an example of its functional configuration. The user terminal 3 is a PC (Personal Computer), a mobile terminal (for example, a tablet terminal), or the like. As shown in FIG. 5(a), the user terminal 3 includes a communication IF 300A, a storage device 300B, an input device 300C, a display device 300D, a CPU 300E, and the like.

The communication IF 300A is an interface for communicating with another device (in the embodiment, the server 2).

The storage device 300B is, for example, an HDD (Hard Disk Drive) or a semiconductor storage device (SSD (Solid State Drive)). The storage device 300B stores an identifier (ID) of the user terminal 3, an information processing program, and the like. The identifier may be newly assigned to the user terminal 3 by the server 2, or an IP (Internet Protocol) address, a MAC (Media Access Control) address, or the like may be used.

The input device 300C is, for example, a keyboard or a touch panel. By operating the input device 300C, the user can input the information needed to use the information processing system 1.

The display device 300D is, for example, a liquid crystal monitor or an organic EL monitor. The display device 300D displays the screens needed to use the information processing system 1.

The CPU 300E controls the user terminal 3 and includes a ROM and a RAM, neither of which is shown.

As shown in FIG. 5(b), the user terminal 3 has functions such as a receiving unit 301, a transmitting unit 302, a storage device control unit 303, an operation reception unit 304, and a display device control unit 305. The functions shown in FIG. 5(b) are realized by the CPU 300E executing the information processing program stored in the storage device 300B.

The receiving unit 301 receives information transmitted from the server 2.

The transmitting unit 302 attaches the identifier to information input using the input device 300C and transmits it to the server 2. Because the identifier is attached to the information transmitted from the user terminal 3, the server 2 can recognize which user terminal 3 sent the information it receives.

The storage device control unit 303 controls the storage device 300B. Specifically, the storage device control unit 303 controls the storage device 300B to write and read information.

The operation reception unit 304 accepts input operations performed on the input device 300C.

The display device control unit 305 controls the display device 300D. Specifically, the display device control unit 305 controls the display device 300D to display the screens needed to use the information processing system 1 according to the embodiment.

(Example of display screens)
FIGS. 6 to 8 show examples of screens displayed on the display device 300D of the user terminal 3 according to the embodiment. The screens displayed on the display device 300D of the user terminal 3 are described below with reference to FIGS. 6 to 8. In the following description, identical components are denoted by identical reference numerals, and redundant description is omitted. Also in the following description, "display" includes modal display, which shows a window that blocks operations on the parent window until an operation is completed; tab display, which switches between tabs; and sub-window display, which opens a small window (sub-window) separate from the parent window.

FIG. 6 is an example of a screen displayed on the display device 300D of the user terminal 3. On the screen shown in FIG. 6, the first contract D1 specified by the user is displayed on the left side of the screen. Clause J1, the clause of the first contract D1 to be edited, is displayed in the center of the left side. When the similar-contract display button B1 is selected, the clauses of the second contracts, classified by the classification unit 207 according to the similarity calculated by the calculation unit 206, are displayed on the right side of the screen for each of the groups G1 to G3 (in the example shown in FIG. 6, the most similar second-contract clause in each of the groups G1 to G3 is displayed). In the example shown in FIG. 6, the clauses are displayed classified into three groups G1 to G3. In this example, the groups are displayed in descending order of similarity, from the top of the screen to the bottom.

The screen shown in FIG. 6 also displays the numbers N1 to N3 of second contracts, counted by the counting unit 208, that contain clauses classified into each group. In the example shown in FIG. 6, two second contracts are classified into group G1, three into group G2, and one into group G3. Selecting the button B2 shown in FIG. 6 displays the screen shown in FIG. 7, described later. Sliding the button B3 shown in FIG. 6 from left to right displays the screen shown in FIG. 8 (the difference display screen), described later.

On the screen shown in FIG. 6, the first contract D1 to be edited, as specified by the user, is displayed on the left side of the screen, and the second contracts classified by the classification unit 207 according to the similarity calculated by the calculation unit 206 are displayed in their classified state on the right side. The arrangement may be reversed, with the first contract D1 on the right side and the classified second contracts on the left side.

FIG. 7 is an example of a screen displayed on the display device 300D of the user terminal 3. FIG. 7 shows the second contracts classified into group G1 in FIG. 6. As shown in FIG. 7, the file names F1 and F2 of the second contracts are displayed on the left side of the screen. The center of the screen in FIG. 7 shows the content of the second contract with file name F1, selected on the right side of the screen. In the example shown in FIG. 7, the background of file name F1 is changed (highlighted) to indicate that file name F1 is selected. Also in this example, within the content of the second contract with file name F1, the background of the clause being compared in FIG. 6 is changed (highlighted). In addition, the upper center of FIG. 7 shows, for the second contract with the selected file name F1, the status S, the creation date D1, the number of days D2 elapsed since the creation date, the icons U of the creator and editors, and so on.

That is, when the button B2 shown in FIG. 6 is selected and the receiving unit 201 accepts an instruction (first instruction) to display, group by group, the second contracts classified by the classification unit 207, the transmission unit 202 (output unit) of the server 2 transmits to the user terminal 3 information for displaying the second contracts belonging to the designated group. The transmitted information is received by the receiving unit 301 of the user terminal 3, and the display device control unit 305 of the user terminal 3 displays the screen shown in FIG. 7 on the display device 300D.

FIG. 8 is an example of a screen displayed on the display device 300D of the user terminal 3. On the screen shown in FIG. 8, the first contract D1 specified by the user is displayed on the left side of the screen, together with clause J1, the clause of the first contract D1 to be edited. On the right side of the screen, the upper area shows clause J1 of the first contract D1 to be edited, and the lower area shows clause J2, the clause of the second contract to be compared that corresponds to clause J1. On the right side, clause J1 of the first contract D1 and the corresponding clause J2 of the second contract D2 are displayed in a manner that makes the portions where they differ (hereinafter also called differences) recognizable.

In the example shown in FIG. 8, the portions of clause J1 that differ from clause J2 are displayed in bold, and the portions of clause J2 that differ from clause J1 are displayed in italics. Any other style, such as highlighting, may be used as long as the differing portions (characters) can be recognized. In FIG. 8, the differing portions (characters) are underlined to make their positions easier to see, but no underline appears on the actual screen.

Thus, on the screen shown in FIG. 8, the user can edit clause J1 of the first contract D1 on the left side of the screen while checking, on the right side, the portions that differ from clause J2 of the second contract being compared. In FIG. 8, when clause J1 of the first contract D1 is edited on the left side, the edit is reflected in real time in clause J1 of the first contract D1 displayed on the right side, so the user can check the differences between the edited clause J1 and clause J2 of the second contract. This makes the screen highly convenient.

On the screen shown in FIG. 8, the first contract D1 to be edited is displayed on the left side of the screen, and the first contract D1 and the second contract are displayed on the right side in a manner that makes their differences recognizable. The arrangement may be reversed, with the first contract D1 to be edited on the right side and the difference display on the left side. Also, although the screen shown in FIG. 8 displays the differences clause by clause, one above the other, the differences may instead be displayed line by line, one above the other, or side by side. The differences may also be displayed inline, that is, as a markup display in which deleted text is struck through but remains visible.

(Processing executed by the information processing system 1)
FIGS. 9 to 12 are flowcharts showing examples of the processing executed by the server 2. The processing executed by the server 2 is described below with reference to FIGS. 9 to 12. Components identical to those described with reference to FIGS. 1 to 8 are denoted by identical reference numerals, and redundant description is omitted.

(User registration process)
FIG. 9 is a flowchart showing an example of the user registration process executed by the server 2. The user registration process executed by the server 2 is described below with reference to FIG. 9.

(Step S101)
The user operates the input device 300C of the user terminal 3 to enter user information such as a password, name, gender, age, date of birth, contact information, and self-portrait image data for an icon. The entered user information is accepted by the operation receiving unit 304 and transmitted from the transmission unit 302 to the server 2. The receiving unit 201 of the server 2 receives the user information transmitted from the user terminal 3.

(Step S102)
The storage device control unit 203 stores the user information received by the receiving unit 201 of the server 2 in the user DB1 of the storage device 200B in association with a user ID.

(Contract registration process)
FIG. 10 is a flowchart showing an example of the contract registration process executed by the server 2. The contract registration process executed by the server 2 is described below with reference to FIG. 10.

(Step S201)
The user operates the input device 300C of the user terminal 3 to specify a contract to be registered in the server 2. The contract is specified by dragging and dropping it onto a predetermined area displayed on the display device 300D, although other methods of specifying it may be used. The specified contract is transmitted from the transmission unit 302 to the server 2. The receiving unit 201 of the server 2 receives the contract transmitted from the user terminal 3.

(Step S202)
The dividing unit 204 of the server 2 divides the contract received by the receiving unit 201 into predetermined area units. In the embodiment, the dividing unit 204 divides the target contract into clauses.
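The clause-by-clause division in step S202 could be sketched as follows. The text does not say how clause boundaries are detected, so the heading pattern below (a line starting with "第N条" or "Article N") and the function name `split_into_clauses` are assumptions made for illustration only.

```python
import re

# Assumed heading pattern: a line that starts with "第N条" or "Article N".
# Real contracts may need a stricter or different rule.
CLAUSE_HEADING = re.compile(
    r"^(?:第[0-9０-９一二三四五六七八九十百]+条|Article\s+\d+)",
    re.MULTILINE,
)


def split_into_clauses(contract_text):
    """Split a contract into clause-sized pieces at each detected heading.

    Text before the first heading (for example a preamble) is kept as its
    own piece; empty pieces are dropped.
    """
    starts = [m.start() for m in CLAUSE_HEADING.finditer(contract_text)]
    bounds = [0] + starts + [len(contract_text)]
    pieces = [contract_text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [p for p in pieces if p]
```

A body line that happens to begin with "Article 5 ..." would be treated as a heading by this pattern, so a production rule would likely need more context than a single regular expression.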

(Step S203)
The conversion unit 205 of the server 2 converts the contract received by the receiving unit 201 into a vector for each predetermined area unit. Specifically, the conversion unit 205 converts each clause produced by the dividing unit 204 into a vector.

(Step S204)
The storage device control unit 203 of the server 2 stores the contract received by the receiving unit 201 and the per-clause vectors produced by the conversion unit 205 in the document DB2 of the storage device 200B in association with the user ID.

(Counting process)
FIG. 11 is a flowchart showing an example of the counting process executed by the server 2. The counting process executed by the server 2 is described below with reference to FIG. 11.

(Step S301)
The user operates the input device 300C of the user terminal 3 to specify the clause of the contract to be edited. This specification is transmitted from the transmission unit 302 to the server 2. The receiving unit 201 of the server 2 receives the specification transmitted from the user terminal 3.

(Step S302)
The calculation unit 206 of the server 2 refers to the document DB2 and extracts the other contracts associated with the same user ID as the user ID associated with the specified contract.

(Step S303)
The calculation unit 206 of the server 2 calculates the similarity between the specified clause of the contract and the clauses of the other contracts extracted in step S302. More specifically, it calculates the cosine similarity based on the clause vectors produced by the conversion unit 205.
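As a minimal sketch of the cosine similarity used in step S303: this section does not state how the conversion unit 205 builds its vectors, so plain word counts are used here as a stand-in. The function names `to_vector` and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def to_vector(clause):
    """Stand-in vectorization (an assumption): lowercase word counts."""
    return Counter(re.findall(r"\w+", clause.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty clause is treated as similar to nothing
    return dot / (norm_a * norm_b)
```

Japanese text is not separated by spaces, so a real implementation would need a tokenizer before counting terms. Any other vectorization can be substituted as long as both clauses are converted the same way.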

(Step S304)
The classification unit 207 of the server 2 classifies the contracts into predetermined groups according to the similarity between their clauses. The details of the classification by the classification unit 207 have already been described and are omitted here.

(Step S305)
The counting unit 208 of the server 2 counts, for each group, the number of second contracts classified by the classification unit 207.
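Steps S304 and S305 could be sketched as follows. The classification rule is described elsewhere in the specification and is not restated in this section, so the similarity lower bounds in `GROUP_BOUNDS` are purely hypothetical, as are the function names.

```python
from collections import defaultdict

# Hypothetical lower bounds, most similar group first (not from the source).
GROUP_BOUNDS = [("G1", 0.9), ("G2", 0.7), ("G3", 0.5)]


def classify(similarity):
    """Return the first group whose lower bound the similarity reaches."""
    for name, lower_bound in GROUP_BOUNDS:
        if similarity >= lower_bound:
            return name
    return None  # too dissimilar to belong to any group


def count_contracts_per_group(scored_clauses):
    """Count distinct contracts per group.

    scored_clauses: iterable of (contract_id, similarity) pairs, one pair
    per second-contract clause compared with the clause being edited.
    """
    members = defaultdict(set)
    for contract_id, similarity in scored_clauses:
        group = classify(similarity)
        if group is not None:
            members[group].add(contract_id)
    return {name: len(members[name]) for name, _ in GROUP_BOUNDS}
```

Using a set per group means a contract with several clauses in the same group is counted once for that group.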

(Step S306)
The transmission unit 202 of the server 2 transmits information on the number of second contracts counted for each group by the counting unit 208. This count information is received by the receiving unit 301 of the user terminal 3. Based on the received count information, the display device control unit 305 causes the display device 300D to display the number of second contracts counted for each group.

In the above description, in step S302 the calculation unit 206 of the server 2 refers to the document DB2 and extracts the other contracts associated with the same user ID as the user ID associated with the specified contract. Alternatively, contracts may be extracted per group of users (for example, a company, department, or task force). In this case, the same group ID is associated with the user IDs of the users belonging to the same group. Then, in step S302, the calculation unit 206 of the server 2 refers to the document DB2 and extracts the other contracts associated with the same group ID as the group ID associated with the user ID of the specified contract.
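The two extraction scopes for step S302 (same user ID, or same group ID) can be illustrated with a small in-memory sketch. The record layout, the `user_groups` mapping, and the function name are assumptions; the actual schema of the document DB is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractRecord:
    # Hypothetical record layout for one row of the document DB.
    contract_id: str
    user_id: str


def extract_candidates(records, user_groups, target_id, by_group=False):
    """Return IDs of the other contracts in the same scope as the target.

    user_groups maps user_id -> group_id; it is consulted only when
    by_group is True.
    """
    target = next(r for r in records if r.contract_id == target_id)
    if by_group:
        group_id = user_groups.get(target.user_id)

        def in_scope(record):
            return group_id is not None and user_groups.get(record.user_id) == group_id
    else:
        def in_scope(record):
            return record.user_id == target.user_id

    return [r.contract_id for r in records if r.contract_id != target_id and in_scope(r)]
```

Widening the scope from a user to a group only changes which records pass the filter; the later similarity, classification, and counting steps are unaffected.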

(Comparison process)
FIG. 12 is a flowchart showing an example of the comparison process executed by the server 2. The comparison process executed by the server 2 is described below with reference to FIG. 12.

(Step S401)
The user operates the input device 300C of the user terminal 3 to instruct a comparison of contracts. This instruction is transmitted from the transmission unit 302 to the server 2. The receiving unit 201 of the server 2 receives the instruction transmitted from the user terminal 3.

(Step S402)
The comparison unit 209 of the server 2 compares the first contract to be edited with the second contract in predetermined area units (clauses in the embodiment) according to the similarity calculated by the calculation unit 206. Specifically, the comparison unit 209 compares a clause of the target first contract with a clause of the second contract whose similarity, as calculated by the calculation unit 206, is equal to or greater than a predetermined value, and detects the portions (characters) where the two contracts differ.
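The character-level difference detection in step S402 could be approximated as follows. The source does not name a diff algorithm, so Python's standard `difflib.SequenceMatcher` is swapped in here as one possible choice; `diff_spans` and `mark` are hypothetical helper names.

```python
from difflib import SequenceMatcher


def diff_spans(first, second):
    """Return character ranges that differ between two clauses.

    The result is (first_spans, second_spans), each a list of (start, end)
    index pairs into the corresponding string.
    """
    first_spans, second_spans = [], []
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if i2 > i1:  # characters deleted from or replaced in the first clause
            first_spans.append((i1, i2))
        if j2 > j1:  # characters inserted into or replaced in the second clause
            second_spans.append((j1, j2))
    return first_spans, second_spans


def mark(text, spans, open_tag, close_tag):
    """Wrap each differing range in markers so it can be styled differently."""
    out, pos = [], 0
    for start, end in spans:
        out.append(text[pos:start])
        out.append(open_tag + text[start:end] + close_tag)
        pos = end
    out.append(text[pos:])
    return "".join(out)
```

For example, comparing "pay within 30 days" with "pay within 60 days" yields one differing range in each clause, covering the "3" and the "6". The ranges can then be rendered in bold, italics, or highlight on the terminal side.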

(Step S403)
The transmission unit 202 transmits (outputs) comparison information that, according to the comparison result from the comparison unit 209, causes the identical portions and the differing portions of the compared contracts to be displayed in different styles. The comparison information is received by the receiving unit 301 of the user terminal 3. Based on the received comparison information, the display device control unit 305 causes the display device 300D to display the identical portions and the differing portions of the compared contracts in different styles.

As described above, the server 2 (information processing device) according to the embodiment includes: the calculation unit 206, which calculates, in predetermined area units (clauses, paragraphs, items, and so on), the similarity between a first contract (first document) to be edited and a second contract (second document), which is a contract other than the first contract; the comparison unit 209, which compares the first contract and the second contract in the predetermined area units according to the similarity calculated by the calculation unit 206; and the transmission unit 202 (output unit), which transmits (outputs) comparison information that, according to the comparison result from the comparison unit 209, causes the identical portions or the differing portions of the first and second contracts to be displayed in a style different from the rest. Because the differences between contracts (documents) can be compared in predetermined area units rather than in units of whole contracts (documents), convenience is improved.

The server 2 according to the embodiment also includes the classification unit 207, which classifies the second contracts (second documents) into predetermined groups according to similarity, and the counting unit 208, which counts, for each group, the number of second contracts classified by the classification unit 207.
Because the user can thus know, for each group classified according to similarity, the number of second contracts containing a clause similar to the clause being edited, convenience is improved.

The server 2 according to the embodiment also includes the receiving unit 201 (first receiving unit), which accepts a first instruction to display, group by group, the second contracts (second documents) classified by the classification unit 207. When the receiving unit 201 (first receiving unit) accepts the first instruction, the transmission unit 202 (output unit) of the server 2 transmits (outputs) information for displaying the second contracts belonging to the designated group. Because the second contracts can be displayed for each group of similar contracts, convenience is improved.

The server 2 according to the embodiment also includes the receiving unit 201 (second receiving unit), which accepts a second instruction to display an area, other than the predetermined area, of the first contract (first document) compared by the comparison unit 209. When the receiving unit 201 (second receiving unit) accepts the second instruction, the transmission unit 202 (output unit) of the server 2 transmits (outputs) information for displaying the area other than the predetermined area. Because areas other than the predetermined area being compared can also be displayed and checked, convenience is improved.

In the embodiment, the predetermined area unit is the clause or the paragraph. In general, a contract is written clause by clause, and each clause consists of paragraphs. Using the clause or the paragraph as the predetermined area unit therefore makes contracts easier to compare, which improves convenience.

The server 2 according to the embodiment also includes the conversion unit 205, which converts the first contract (first document) and the second contract (second document) into vectors in predetermined area units. The calculation unit 206 of the server 2 calculates the similarity between the first contract (first document) and the second contract (second document) in predetermined area units based on those vectors. Because the similarity is calculated from vectors of the kind used to judge the similarity of text, the accuracy of the similarity is improved.

[Modification 1 of the Embodiment]
Modification 1 of the embodiment is described below. Components identical to those described with reference to FIGS. 1 to 12 are denoted by identical reference numerals, and redundant description is omitted. FIG. 13 shows an example of the functional configuration of the server (information processing device) according to Modification 1 of the embodiment. As shown in FIG. 13, the server 2 according to Modification 1 has, in addition to the receiving unit 201 (third receiving unit), the transmission unit 202 (output unit), the storage device control unit 203, the dividing unit 204, the conversion unit 205, the calculation unit 206, the classification unit 207, the counting unit 208, and the comparison unit 209, the functions of a replacement unit 210 and a search unit 211.

The replacement unit 210 of the server 2 (information processing device) according to Modification 1 of the embodiment rearranges the display order of the second contract D2 in area units according to its similarity to the target first contract D1 (first document). Specifically, the replacement unit 210 rearranges the clauses of the second contract to be compared according to their similarity to each clause of the first contract to be edited. In other words, the replacement unit 210 rearranges the clauses so that, for each clause of the first contract, the most similar clause of the second contract is positioned immediately to its right.

FIG. 14 is an example of a screen displayed on the display device 300D of the user terminal 3 according to Modification 1 of the embodiment. Because the replacement unit 210 rearranges the clauses of the second contract to be compared according to their similarity to each clause of the first contract to be edited, the second contract D2 is displayed on the display device 300D of the user terminal 3, as shown in FIG. 14, not in the order in which its clauses were originally written but rearranged to follow the clauses J1 to J3 of the first contract D1 that they resemble. In the example shown in FIG. 14, the second contract D2 is therefore displayed in the order clause J4, clause J5, clause J6. That is, clause J4, the clause of the second contract most similar to clause J1, is displayed immediately to the right of clause J1 of the first contract D1; clause J5, the clause most similar to clause J2, is displayed immediately to the right of clause J2; and clause J6, the clause most similar to clause J3, is displayed immediately to the right of clause J3.

As in the example shown in FIG. 8, in FIG. 14 the first contract D1 (first document) to be edited and the second contract D2 (second document) to be compared are displayed in a manner that makes the portions where they differ (hereinafter also called differences) recognizable. In the example shown in FIG. 14, the portions of the first contract D1 that differ from the second contract D2 are displayed in bold, and the portions of the second contract D2 that differ from the first contract D1 are displayed in italics. Any other style, such as highlighting, may be used as long as the differing portions (characters) can be recognized. In FIG. 14 as well, the differing portions (characters) are underlined to make their positions easier to see, but no underline appears on the actual screen.

With the replacement unit 210, the user can compare, on a single screen and without scrolling, clause J1 of the first contract with the similar clause J4 of the second contract, clause J2 with the similar clause J5, and clause J3 with the similar clause J6. The user does not need to scroll the screen to find the corresponding clause, as would otherwise be required, and can compare the wording of the clauses on the same screen, which improves convenience.

In the description with reference to FIG. 14, the clauses of the second contract D2 are rearranged to match the first contract D1 to be edited, but the clauses of the first contract D1 to be edited may instead be rearranged to match the second contract D2.
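The rearrangement performed by the replacement unit 210 could be sketched as follows, given a precomputed similarity matrix. The source does not say what happens when two clauses of the first contract are most similar to the same clause of the second contract, or where unmatched clauses go; this sketch assumes each clause is used once and leftovers are appended at the end. The function name is hypothetical.

```python
def reorder_second_contract(similarity):
    """Return the display order of the second contract's clauses.

    similarity[i][j] is the similarity between clause i of the first
    contract and clause j of the second contract. Position i of the result
    holds the index of the second-contract clause shown next to clause i.
    """
    n_second = len(similarity[0]) if similarity else 0
    used, order = set(), []
    for row in similarity:
        candidates = [j for j in range(n_second) if j not in used]
        if not candidates:
            break  # the second contract has fewer clauses than the first
        best = max(candidates, key=lambda j: row[j])
        used.add(best)
        order.append(best)
    # Assumption: clauses with no counterpart keep their relative order at the end.
    order.extend(j for j in range(n_second) if j not in used)
    return order
```

Swapping the roles of the two contracts, as the text allows, amounts to passing the transposed matrix.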

The search unit 211 searches the contracts based on a keyword (search word) received by the receiving unit 201. The transmission unit 202 outputs search information that causes the search word found by the search unit 211 to be displayed in a style different from other words (for example, highlighted). As a result, on the display device 300D of the user terminal 3, each occurrence of the search word is displayed in a style different from other words, which improves convenience.
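A minimal sketch of the keyword search and highlighting described above, assuming an exact, case-sensitive substring match and HTML-style `<mark>` tags as the "different style". Neither detail is specified in the source, and the function names are hypothetical.

```python
import re


def search_contracts(contracts, keyword):
    """Return the IDs of contracts whose text contains the keyword.

    contracts maps contract_id -> full text.
    """
    if not keyword:
        return []
    return [cid for cid, text in contracts.items() if keyword in text]


def highlight(text, keyword, open_tag="<mark>", close_tag="</mark>"):
    """Wrap every occurrence of the keyword in the given markers."""
    if not keyword:
        return text
    pattern = re.compile(re.escape(keyword))  # treat the keyword literally
    return pattern.sub(lambda m: open_tag + m.group(0) + close_tag, text)
```

Escaping the keyword keeps characters such as parentheses, which are common in clause headings, from being interpreted as regular-expression syntax.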

The transmission unit 202 may also be configured to output a predetermined area of the second contract D2 (this second contract may be a standard contract) whose calculated similarity is equal to or less than a predetermined value. With this configuration, clauses that are missing from the first contract D1 to be edited are displayed on the display device 300D of the user terminal 3.
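The detection of missing clauses can be sketched as below. The sketch assumes Jaccard similarity over word sets and a threshold of 0.2; both are placeholders chosen for illustration, since the passage above speaks only of a similarity and a predetermined value. The names `jaccard` and `missing_clauses` are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over word sets (an assumed similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def missing_clauses(
    first: list[str], standard: list[str], threshold: float = 0.2
) -> list[str]:
    """Return clauses of the standard contract whose best similarity to any
    clause of the first contract is at or below `threshold`."""
    missing = []
    for clause in standard:
        best = max((jaccard(clause, c) for c in first), default=0.0)
        if best <= threshold:
            missing.append(clause)
    return missing


first = ["payment is due within thirty days"]
standard = [
    "payment is due within sixty days",
    "disputes are resolved by arbitration in tokyo",
]
print(missing_clauses(first, standard))
```

In this example only the arbitration clause is reported, because the payment clause has a close counterpart in the first contract.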

[Modification 2 of Embodiment]
The editing history may also be stored to manage document revisions, and the identical portions or differing portions (differences) between specific revisions in the editing history of a document may be displayed in a manner different from the rest. In this case, the storage device control unit 203 of the server 2 stores the user's editing history of a contract (document) in DB2 in association with the document ID and revision information. The calculation unit 206 then calculates, in units of a predetermined area, the similarity between the editing history of a specific revision of the first contract (first document) to be edited and the editing history of the second contract (second document). Next, the comparison unit 209 compares the editing history of the first contract with the editing history of the second contract in units of the predetermined area (clauses in the embodiment) according to the similarity calculated by the calculation unit 206. The transmission unit 202 transmits (outputs) comparison information for displaying the identical portions and the differing portions of the compared editing histories in different manners, according to the comparison result of the comparison unit 209. The comparison information is received by the reception unit 301 of the user terminal 3. Based on the received comparison information, the display device control unit 305 causes the display device 300D to display the identical portions and the differing portions of the compared editing histories in different manners. Because differences can be compared in units of the predetermined area not only between the contracts (documents) themselves but also between their editing histories, convenience is improved.
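The revision management described in this modification can be sketched as follows. The in-memory `RevisionStore` class stands in for DB2, revisions are numbered from 1 per document ID, and clauses are paired by the `difflib` similarity ratio; all of these are assumptions made for illustration, and the names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional


class RevisionStore:
    """In-memory stand-in for a document database keyed by (doc ID, revision)."""

    def __init__(self) -> None:
        self._revisions: dict[tuple[str, int], list[str]] = {}

    def save(self, doc_id: str, clauses: list[str]) -> int:
        """Store a new revision of `doc_id` and return its revision number."""
        latest = max((r for d, r in self._revisions if d == doc_id), default=0)
        self._revisions[(doc_id, latest + 1)] = list(clauses)
        return latest + 1

    def get(self, doc_id: str, revision: int) -> list[str]:
        return self._revisions[(doc_id, revision)]


def compare_revisions(
    old: list[str], new: list[str]
) -> list[tuple[str, Optional[str], str]]:
    """Pair each clause of `old` with its most similar clause in `new` and
    label the pair "same", "different", or "missing"."""
    results = []
    for clause in old:
        best = max(
            new,
            key=lambda c: SequenceMatcher(None, clause, c).ratio(),
            default=None,
        )
        if best is None:
            status = "missing"
        else:
            status = "same" if best == clause else "different"
        results.append((clause, best, status))
    return results


store = RevisionStore()
r1 = store.save("D1", ["Term: one year.", "Fee: 100 yen."])
r2 = store.save("D1", ["Term: two years.", "Fee: 100 yen."])
for row in compare_revisions(store.get("D1", r1), store.get("D1", r2)):
    print(row)
```

The same `compare_revisions` call works across two different documents, which corresponds to comparing a revision of the first contract with the editing history of the second.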

[Modification 3 of Embodiment]
An operation button (hereinafter also called an expand button) for instructing the expanded display of clauses of the second contract other than the clauses being compared may be placed on the screen. When the expand button is selected and the reception unit 201 of the server 2 accepts an instruction (second instruction) to display clauses (areas) other than the clauses (predetermined areas) being compared in the second contract (second document) compared by the comparison unit 209, the transmission unit 202 of the server 2 transmits (outputs) information for displaying the clauses (areas) other than the clauses (predetermined areas) being compared. The transmitted information is received by the reception unit 301 of the user terminal 3, and the display device control unit 305 of the user terminal 3 expands and displays, on the display device 300D, the clauses of the second contract (second document) other than the clauses being compared.
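The response to the expand button can be sketched in a few lines. The sketch assumes the second contract is held as a mapping from a clause ID to its text and that the IDs of the clauses being compared are known; the function name `expand_other_clauses` and the clause ID `J7` are hypothetical.

```python
def expand_other_clauses(
    clauses: dict[str, str], compared_ids: set[str]
) -> dict[str, str]:
    """Return the clauses of the second contract that are not currently
    being compared, in their original order."""
    return {cid: text for cid, text in clauses.items() if cid not in compared_ids}


second_contract = {
    "J4": "Payment is due within sixty days.",
    "J5": "Either party may terminate this agreement.",
    "J7": "Disputes are resolved by arbitration.",
}
print(expand_other_clauses(second_contract, {"J4", "J5"}))
```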

The above embodiment and modifications are merely examples of how the present invention may be embodied, and the technical scope of the present invention must not be interpreted restrictively because of them. That is, the present invention can be implemented in various forms without departing from its gist or its main features.

1 Information processing system
2 Server (information processing device)
200A Communication IF
200B Storage device
200C CPU
201 Reception unit (first to third reception units)
202 Transmission unit (output unit)
203 Storage device control unit
204 Division unit
205 Conversion unit
206 Calculation unit
207 Classification unit
208 Counting unit
209 Comparison unit
210 Replacement unit
211 Search unit
3 User terminal
300A Communication IF
300B Storage device
300C Input device
300D Display device
300E CPU
301 Reception unit
302 Transmission unit
303 Storage device control unit
304 Operation reception unit
305 Display device control unit
4 Network
DB1 User database
DB2 Document database

Claims

1. An information processing device comprising:
a calculation unit that calculates a similarity between a first document and a second document different from the first document in units of a predetermined area;
a comparison unit that compares the first document and the second document in units of the predetermined area according to the similarity calculated by the calculation unit;
an output unit that outputs comparison information for displaying identical or differing portions of the first document and the second document in a manner different from the rest, according to the comparison result of the comparison unit;
a classification unit that classifies the second document into a predetermined group according to the similarity; and
a counting unit that counts, for each group, the number of second documents classified by the classification unit,
wherein the output unit outputs the number of second documents for each group counted by the counting unit.

2. An information processing device comprising:
a calculation unit that calculates a similarity between a first document and a second document different from the first document in units of a predetermined area;
a comparison unit that compares the first document and the second document in units of the predetermined area according to the similarity calculated by the calculation unit; and
an output unit that outputs comparison information for displaying identical or differing portions of the first document and the second document in a manner different from the rest, according to the comparison result of the comparison unit,
wherein the output unit outputs a predetermined area of the second document that is lacking in the first document and whose similarity with the first document is equal to or less than a predetermined value.

3. The information processing device according to claim 1, further comprising a replacement unit that rearranges the display order of the first document or the second document in units of the predetermined area according to the similarity.

4. The information processing device according to claim 1, further comprising a first reception unit that receives a first instruction to display the second documents classified by the classification unit for each group,
wherein, when the first reception unit receives the first instruction, the output unit outputs information for displaying the second documents belonging to a designated group.

5. The information processing device according to claim 1, further comprising a storage control unit that stores an editing history of at least one of the first document and the second document,
wherein the calculation unit calculates the similarity between the editing history of the first document and the editing history of the second document in units of the predetermined area, and
the comparison unit compares the editing history of the first document and the editing history of the second document in units of the predetermined area according to the similarity calculated by the calculation unit.

6. The information processing device according to any one of claims 1 to 5, further comprising a second reception unit that receives a second instruction to display an area other than the predetermined area in the second document compared by the comparison unit,
wherein, when the second reception unit receives the second instruction, the output unit outputs information for displaying the area other than the predetermined area.

7. The information processing device according to any one of claims 1 to 6, wherein the unit of the predetermined area is a clause or a claim.

8. The information processing device according to any one of claims 1 to 7, further comprising a conversion unit that converts the first document and the second document into vectors in units of the predetermined area,
wherein the calculation unit calculates the similarity between the first document and the second document in units of the predetermined area based on the vectors converted in units of the predetermined area.

9. The information processing device according to any one of claims 1 to 8, further comprising:
a third reception unit that receives a search word for searching the first document or the second document; and
a search unit that searches the first document or the second document based on the search word received by the third reception unit,
wherein the output unit outputs search information for displaying the search word found by the search unit in a manner different from other words.

10. An information processing method comprising:
a step in which a calculation unit calculates a similarity between a first document and a second document different from the first document in units of a predetermined area;
a step in which a comparison unit compares the first document and the second document in units of the predetermined area according to the similarity calculated by the calculation unit;
a step in which an output unit outputs comparison information for displaying identical or differing portions of the first document and the second document in a manner different from the rest, according to the comparison result of the comparison unit;
a step in which a classification unit classifies the second document into a predetermined group according to the similarity;
a step in which a counting unit counts, for each group, the number of second documents classified by the classification unit; and
a step in which the output unit outputs the number of second documents for each group counted by the counting unit.

11. An information processing program that causes a computer to function as:
a calculation unit that calculates a similarity between a first document and a second document different from the first document in units of a predetermined area;
a comparison unit that compares the first document and the second document in units of the predetermined area according to the similarity calculated by the calculation unit;
an output unit that outputs comparison information for displaying identical or differing portions of the first document and the second document in a manner different from the rest, according to the comparison result of the comparison unit;
a classification unit that classifies the second document into a predetermined group according to the similarity; and
a counting unit that counts, for each group, the number of second documents classified by the classification unit,
wherein the output unit outputs the number of second documents for each group counted by the counting unit.

12. An information processing method comprising:
a step in which a calculation unit calculates a similarity between a first document and a second document different from the first document in units of a predetermined area;
a step in which a comparison unit compares the first document and the second document in units of the predetermined area according to the similarity calculated by the calculation unit; and
a step in which an output unit outputs comparison information for displaying identical or differing portions of the first document and the second document in a manner different from the rest, according to the comparison result of the comparison unit,
wherein the output unit outputs a predetermined area of the second document that is lacking in the first document and whose similarity with the first document is equal to or less than a predetermined value.

13. An information processing program that causes a computer to function as:
a calculation unit that calculates a similarity between a first document and a second document different from the first document in units of a predetermined area;
a comparison unit that compares the first document and the second document in units of the predetermined area according to the similarity calculated by the calculation unit; and
an output unit that outputs comparison information for displaying identical or differing portions of the first document and the second document in a manner different from the rest, according to the comparison result of the comparison unit,
wherein the output unit outputs a predetermined area of the second document that is lacking in the first document and whose similarity with the first document is equal to or less than a predetermined value.