JPH04182788A

JPH04182788A - Method and device for recognizing character

Info

Publication number: JPH04182788A
Application number: JP2312074A
Authority: JP
Inventors: Toru Futaki; 徹二木
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1990-11-17
Filing date: 1990-11-17
Publication date: 1992-06-30

Abstract

PURPOSE: To segment characters accurately and to reduce the memory capacity required for a dictionary by applying a shift process to the data. CONSTITUTION: When an operator indicates via a character type selecting means 110 that the character type is slanted, or when a CPU 1 or an identification calculation part 6 judges that the character type is slanted, a shift process is applied to the data transferred from a temporary buffer 109 to an image memory 103. In the i-th transfer from the temporary buffer 109 to the image memory 103, the shift process transfers the contents of the low-order half of the i-th byte in the temporary buffer to the low-order half of the i-th byte in the (j+i×2)-th column of an image buffer 202, and transfers the contents of the high-order half to the high-order half of the i-th byte in the (j+i×2+1)-th column. The data for one character can thus be segmented accurately, and the memory capacity for the recognition dictionary is reduced.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a character recognition method and apparatus that can also recognize characters in slanted fonts.

[Prior Art]

A conventional character recognition device is generally configured as shown in FIG. 4. 401 is a scanner that converts a character image into an analog electrical signal; 402 is a binarization means that binarizes the analog signal from the scanner; 403 is an image memory that stores the binarized image data; 404 is a character segmentation means that extracts a character region for each character from the character string in the image memory; 405 is a feature extraction means that extracts character-specific features according to a predetermined algorithm; 407 is a recognition dictionary that stores in advance statistics, such as means and variances, obtained by applying the same feature extraction process as 405 to training data; 406 is a matching means that compares the features obtained from the input character image with the contents of the recognition dictionary and selects the candidate character at the smallest distance as the recognition result; and 408 is an output means that transfers the code of the candidate character, as the recognition result, to a display or an external device.
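As an illustration of the matching step only, the following is a minimal sketch of selecting the closest candidate from a dictionary of per-character means and variances. The text does not specify the distance measure; the variance-weighted squared distance used here, and the names `nearest_candidate`, `features`, and `dictionary`, are assumptions made for this example.

```python
def nearest_candidate(features, dictionary):
    """Return the character code whose dictionary entry is closest to `features`.

    `dictionary` maps a character code to a (means, variances) pair.
    The variance-weighted squared distance is an assumed metric, not one
    stated in the text. Variances are assumed to be strictly positive.
    """
    best_code, best_distance = None, float("inf")
    for code, (means, variances) in dictionary.items():
        distance = sum(
            (x - m) ** 2 / v for x, m, v in zip(features, means, variances)
        )
        if distance < best_distance:
            best_code, best_distance = code, distance
    return best_code
```

With one dictionary entry per character, reusing the same dictionary for slanted input only works if the slant is removed before feature extraction, which is the point of the shift process described later.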

[Problems to Be Solved by the Invention]

FIG. 3 shows examples of Western characters. Characters include not only upright fonts (for example, roman) such as 302 but also slanted fonts (for example, italic) such as 301, yet conventional character recognition devices had no technique for handling slanted fonts such as 301. The following problems therefore arise.

(1) Adjacent characters overlap, so a neighboring character may intrude into the data segmented for one character, causing misrecognition.

(2) A separate recognition dictionary must be held for each font, so a very large memory capacity is required for the recognition dictionaries.

(3) Typical feature extraction algorithms for character recognition place emphasis on capturing the local slope of character strokes.

Therefore, even if one tries to create a recognition dictionary for slanted fonts, an algorithm optimized for ordinary unslanted characters, which have many horizontal and vertical strokes, often cannot adequately capture the features of characters that have many slanted strokes.

[Means for Solving the Problems]

To solve the above problems, the present invention provides a character recognition method characterized by instructing that image information be shifted in a desired manner, storing the image information one column at a time, and, when shifting of the image information has been instructed, controlling the transfer so that the stored image information is shifted as it is transferred.

To solve the above problems, the present invention also provides a character recognition device characterized by having an instruction means for instructing that image information be shifted in a desired manner, a storage means for storing the image information one column at a time, and a transfer control means that, when the instruction means has instructed that the image information be shifted, controls the transfer so that the image information stored in the storage means is shifted as it is transferred.

[Embodiment]

FIG. 1(A) is a configuration diagram showing an embodiment of the present invention; 101 to 108 are the same as 401 to 408, respectively, of the conventional example in FIG. 4. In this embodiment, a temporary buffer 109 that stores one column of image data read by the scanner is provided between the binarization means 102 and the image buffer 103, and a font selection means 110 for selecting the font is also provided.

FIG. 1(B) is a block diagram showing the configuration of this embodiment; its correspondence with the configuration diagram of FIG. 1(A) is explained here.

1 is a CPU, which serves as the binarization means 102 that binarizes the image information read from the scanner 8, and which controls the other processing. 2 is a keyboard (KB) and 3 is a pointing device (P.D.); they are used as the font selection means 110, with which the operator specifies the font after looking at the typeface of the characters to be recognized, and are also used to give instructions for other operations. 4 is a read-only memory (ROM), which stores in advance the recognition dictionary 107 used in recognizing characters. 5 is a memory, which serves both as the temporary buffer 109, which temporarily stores the image information read from the scanner 101 one column at a time, and as the image memory 103.

6 is an identification calculation unit that performs the main computations of character recognition. It carries out the processing of the character segmentation means 104, which segments characters from the data in the image memory 103; the feature extraction means 105, which extracts the features of a character from the segmented data; and the matching means 106, which matches the extracted feature data against the recognition dictionary 107. 7 is a CRT, which is the output means 108 that outputs the recognition result and is also a display means for showing intermediate progress before the recognition result is obtained, data prompting the operator for instructions, and the like. 8 is a scanner (SCAN), corresponding to the scanner 101 in FIG. 1(A). 9 is the scanner interface (SCAN I/F), the interface to the scanner 8.

FIG. 2 shows an example of the data in the temporary buffer 109 and the image memory 103; the method of transferring data from the temporary buffer 109 to the image memory 103 is explained with reference to it.

First, the image data input by the scanner 101 is binarized by the binarization means 102. In this binarization, one pixel of the image data corresponds to one bit; a black pixel is converted to a bit value of 1 and a white pixel to a bit value of 0.
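The one-bit-per-pixel packing can be sketched as follows. This is a minimal illustration, not the device's circuit: the function name `binarize_column`, the 8-bit grayscale input, the threshold of 128, and the choice to place the topmost pixel of each group of eight in the least significant bit are all assumptions made for the example.

```python
def binarize_column(gray_column, threshold=128):
    """Pack one column of grayscale pixels into bytes, one bit per pixel.

    Assumptions (not from the text): pixels are 0..255 with dark values
    low, the threshold is 128, and within each byte the topmost pixel
    occupies the least significant bit.
    """
    if len(gray_column) % 8 != 0:
        raise ValueError("column height must be a multiple of 8")
    packed = bytearray(len(gray_column) // 8)
    for row, value in enumerate(gray_column):
        if value < threshold:  # dark pixel -> black -> bit 1
            packed[row // 8] |= 1 << (row % 8)
    return bytes(packed)
```

For example, a column whose top four pixels are black and next four are white packs to the single byte 0x0F under these assumptions.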

In FIG. 2, 201 represents the state of the data in the temporary buffer and 202 the state of the data in the image memory. The data in the temporary buffer 201 are called, in order from the top, the 0th byte, the 1st byte, ..., the i-th byte, and so on; within each byte, the upper side is the low-order half (4 bits) and the lower side is the high-order half.

First, the image data input from the scanner 101 and binarized by the binarization means 102 is sent sequentially to the temporary buffer 109.

When the operator gives no instruction via the font selection means 110 that the font is slanted, or when the CPU 1 or the identification calculation unit 6 determines that the font is not slanted, the data is transferred sequentially to the image buffer 103, beginning with its first column, each time one column of data has been stored in the temporary buffer 109.

However, when the operator gives an instruction via the font selection means 110 that the font is slanted, or when the CPU 1 or the identification calculation unit 6 determines that the font is slanted, a shift process is applied to the data transferred from the temporary buffer 109 to the image memory 103.

In the shift process, in the i-th transfer from the temporary buffer 109 to the image memory 103, the contents of the low-order half of the i-th byte of the temporary buffer are transferred to the low-order half of the i-th byte in the (j+i×2)-th column of the image buffer 202, and the contents of the high-order half are transferred to the high-order half of the i-th byte in the (j+i×2+1)-th column. Because a shift of one column is thus applied every 4 bits, a figure slanted by tan⁻¹(1/4) relative to the original image is stored in the image buffer. No data is transferred to the lower-left and upper-right corner regions of the image buffer, shown hatched in FIG. 2, so the bits in these regions are set to 0 as image margin.
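The transfer rule can be sketched in a few lines. This is an illustration under stated assumptions, not the patented implementation: the text does not define j, so it is taken here to be the index of the scanned column being transferred; the image buffer is modeled as a list of columns, each a `bytearray`; the topmost pixel of each byte is assumed to sit in the least significant bit; and the names `shift_transfer` and `deslant` are hypothetical.

```python
def shift_transfer(image, temp, j):
    """Copy column buffer `temp` into `image`, shifting one column per 4 bits.

    `j` is assumed to be the index of this transfer (the scanned column).
    The low-order half of byte i goes to column j + 2*i, the high-order
    half to column j + 2*i + 1, both at byte position i.
    """
    for i, byte in enumerate(temp):
        image[j + 2 * i][i] |= byte & 0x0F      # low-order half
        image[j + 2 * i + 1][i] |= byte & 0xF0  # high-order half


def deslant(columns):
    """Apply the shift transfer to every scanned column.

    All columns are assumed to have the same length in bytes. The image
    buffer starts zero-filled, so the corners that receive no data stay 0,
    matching the blank margins described in the text.
    """
    height = len(columns[0])               # bytes per column
    width = len(columns) + 2 * height - 1  # room for the largest shift
    image = [bytearray(height) for _ in range(width)]
    for j, temp in enumerate(columns):
        shift_transfer(image, temp, j)
    return image
```

As a check on the direction of the shift, consider a 16-pixel-high stroke that leans to the right by one column every four rows: its top four pixels lie in column 3 and its bottom four in column 0. Passing those four columns through `deslant` places all 16 pixels in column 3 of the output, that is, a vertical stroke.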

For slanted typefaces in general, such as italic characters, giving the original image a slant of tan⁻¹(1/4) removes the inherent slant of the font.

The slant given to the original image is not limited to tan⁻¹(1/4). The slant may be varied according to the font selected by the font selection means 110. Alternatively, the image data after the shift process may first be displayed on the CRT 7, the operator may then specify a slant, and the shift process may be performed once more so that the operator can check the result.
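To make the relation between the shift step and the resulting slant concrete, the following sketch computes the angle for a shift of one column every `rows_per_column` pixel rows, and the column offset applied to a given row. The function and parameter names are hypothetical, and restricting the slant to one column per whole number of rows is a simplifying assumption of this example.

```python
import math


def slant_angle_degrees(rows_per_column):
    """Angle from the vertical when one column of shift is applied
    every `rows_per_column` pixel rows."""
    return math.degrees(math.atan(1.0 / rows_per_column))


def column_offset(row, rows_per_column):
    """Number of columns by which pixel row `row` (0 = top) is shifted."""
    return row // rows_per_column
```

With `rows_per_column` set to 4, the angle is about 14 degrees, and rows 0 to 3 of byte i (rows 8i to 8i+3) receive offset 2i while rows 8i+4 to 8i+7 receive offset 2i+1, which agrees with the column indices j+i×2 and j+i×2+1 in the description.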

Because the input data is shifted before it is stored in the image memory, recognition can be performed on data in which the slanted font has been corrected. There is therefore no need to store data for slanted fonts in the recognition dictionary 107, and data from adjacent characters no longer intrudes when characters are segmented.

The description so far has covered an example in which the image data is divided vertically and stored in the image memory column by column (vertical direction), with the shift process performed by holding one column of data in the temporary buffer. It is also possible to divide the image data horizontally and store it in the image memory row by row (horizontal direction), performing the shift process by holding one row of data in the temporary buffer.
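A minimal sketch of the row-by-row variant follows. The text gives no details for this case, so the representation (each row as a list of 0/1 pixel values), the default step of one pixel every four rows, and the name `shift_rows` are assumptions made for illustration.

```python
def shift_rows(rows, rows_per_step=4):
    """Shift each pixel row to the right by one pixel per `rows_per_step` rows.

    `rows` is a list of equal-length lists of 0/1 pixels, top row first.
    The output is widened so that the most-shifted row still fits; the
    padding is filled with 0 (white), as for the blank margins.
    """
    height = len(rows)
    width = len(rows[0])
    padded = width + (height - 1) // rows_per_step
    image = []
    for r, row in enumerate(rows):
        offset = r // rows_per_step
        image.append([0] * offset + list(row) + [0] * (padded - width - offset))
    return image
```

Here the temporary buffer would hold a single row, and the whole row is displaced at once, so no splitting of bytes into halves is needed.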

[Effects of the Invention]

According to the present invention, the shift process applied to the data removes the slant from slanted fonts in the original image, so the following effects are obtained.

(1) Characters can be segmented accurately.

(2) The recognition dictionary for ordinary unslanted characters can be reused for characters in slanted fonts. As a result, the memory capacity required for the dictionary can be reduced.

(3) Because the slant is removed, the ordinary recognition algorithm for unslanted characters can be applied as is.

[Brief explanation of the drawing]

FIG. 1 is a configuration diagram showing a first embodiment of the present invention.
FIG. 2 is a diagram explaining the shift process in the first embodiment.
FIG. 3 shows examples of ordinary characters and italic characters.
FIG. 4 is a diagram showing the configuration of a conventional optical character recognition device.

Claims

[Claims]

(1) A character recognition method characterized by instructing that image information be shifted in a desired manner, storing the image information one column at a time, and, when shifting of the image information has been instructed, controlling the transfer so that the stored image information is shifted as it is transferred.

(2) A character recognition device characterized by having: an instruction means for instructing that image information be shifted in a desired manner; a storage means for storing the image information one column at a time; and a transfer control means that, when the instruction means has instructed that the image information be shifted, controls the transfer so that the image information stored in the storage means is shifted as it is transferred.