JP2007025902A

JP2007025902A - Image processing apparatus and image processing method

Info

Publication number: JP2007025902A
Application number: JP2005204738A
Authority: JP
Inventors: Yumi Watabe; 由美渡部
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2005-07-13
Filing date: 2005-07-13
Publication date: 2007-02-01

Abstract

PROBLEM TO BE SOLVED: To provide a technique for detecting a predetermined subject in an image more simply and with high accuracy, without requiring empirical thresholds.
SOLUTION: A face probability is obtained for the region inside a rectangle of a predetermined size placed at each position on a luminance image (S106), a probability distribution is obtained for each region based on its face probability, and first map data representing these probability distributions is created (S109). Second through N-th reduced images of the luminance image are generated (S103). For the n-th reduced image, a face probability is obtained for the region inside the rectangle placed at each position (S106), a probability distribution is obtained for each region based on its face probability, and the n-th map data representing these probability distributions is combined with the (n-1)-th map data. Repeating this processing for n = 2 to N creates composite map data, which is then used to select a representative pattern (S115).
[Selected drawing] Figure 2

Description

The present invention relates to a technique for detecting a predetermined subject in an image.

Image processing methods that automatically detect a specific subject pattern in an image are very useful; they can be used, for example, to detect human faces. Such methods are applicable in many fields, including teleconferencing, man-machine interfaces, security, monitoring systems that track human faces, and image compression. Non-Patent Document 1, for example, surveys various techniques for detecting faces in images. These include methods that detect human faces by using several salient features (the two eyes, the mouth, the nose, and so on) together with the characteristic geometric relationships among them, or by using the symmetry of the human face, skin color, template matching, neural networks, and the like.

For example, the method proposed in Non-Patent Document 2 detects face patterns in an image with a neural network. The face detection method of Non-Patent Document 2 is briefly described below.

First, an image containing a face is read into memory, and a predetermined region to be matched against a face is cut out of the image. The distribution of pixel values in the cut-out region is then fed to the neural network, which computes a single output.

The weights and thresholds of the neural network have been learned in advance from a very large number of face image patterns and non-face image patterns. With such a network, a region can be classified, for example, as a face if the network output is 0 or greater and as a non-face otherwise.

The cut-out position of the image pattern that is fed to the neural network for matching is then scanned across the entire image, for example sequentially in the vertical and horizontal directions, so that faces are detected in the image.

To handle faces of various sizes, the read image is successively reduced by a predetermined ratio, and the face detection scan described above is performed on each reduced image.

When faces are detected with the above method and the patterns classified as faces are output, overlapping detections occur frequently, for example from adjacent patterns or from patterns of slightly different sizes. For such cases, Non-Patent Document 2 narrows the detections down to the correct faces by combining several empirical algorithms, such as using predetermined thresholds to filter patterns by size and number of occurrences, and using the overlap between patterns and the positional relationship of their centroids.

Thus, the conventional approach relies on a complicated algorithm and requires several parameters to be set empirically. As a result, it can miss patterns that represent faces that should be detected (missed detections) or retain patterns that are not faces (false detections).
Non-Patent Document 1: "Detecting Faces in Images: A Survey", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, January 2002.
Non-Patent Document 2: "Neural network-based face detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 1, January 1998.

The present invention has been made in view of the above problems, and its object is to provide a technique for detecting a predetermined subject in an image more simply and with high accuracy, without requiring empirical thresholds.

To achieve this object, an image processing apparatus of the present invention has, for example, the following configuration.

That is, the apparatus comprises:
acquisition means for acquiring an image including a predetermined subject;
generation means for generating a luminance image composed of the luminance components of the image;
first probability calculation means for obtaining, when a rectangle of a predetermined size is placed at each position on the luminance image, the probability that the region inside the rectangle exhibits a pattern that appears to be the predetermined subject;
first probability distribution calculation means for obtaining a probability distribution for each region based on the probability calculated for that region by the first probability calculation means;
first creation means for creating first map data representing the probability distributions for the respective regions on the luminance image;
reduction means for generating second through N-th reduced images by recursively reducing the luminance image;
second probability calculation means for obtaining, when the rectangle is placed at each position on the n-th (2 ≤ n ≤ N) reduced image, the probability that the region inside the rectangle exhibits a pattern that appears to be the predetermined subject;
second probability distribution calculation means for obtaining a probability distribution for each region based on the probability calculated for that region by the second probability calculation means;
synthesis means for combining n-th map data, which represents the probability distributions for the respective regions on the n-th reduced image, with (n-1)-th map data;
repetition means for creating composite map data by repeating the processing of the second probability calculation means, the second probability distribution calculation means, and the synthesis means for n = 2 to N; and
detection means for detecting a representative pattern from among subject pattern candidates based on the composite map data.

To achieve the object of the present invention, an image processing method of the present invention has, for example, the following configuration.

That is, the method comprises:
an acquisition step of acquiring an image including a predetermined subject;
a generation step of generating a luminance image composed of the luminance components of the image;
a first probability calculation step of obtaining, when a rectangle of a predetermined size is placed at each position on the luminance image, the probability that the region inside the rectangle exhibits a pattern that appears to be the predetermined subject;
a first probability distribution calculation step of obtaining a probability distribution for each region based on the probability calculated for that region in the first probability calculation step;
a first creation step of creating first map data representing the probability distributions for the respective regions on the luminance image;
a reduction step of generating second through N-th reduced images by recursively reducing the luminance image;
a second probability calculation step of obtaining, when the rectangle is placed at each position on the n-th (2 ≤ n ≤ N) reduced image, the probability that the region inside the rectangle exhibits a pattern that appears to be the predetermined subject;
a second probability distribution calculation step of obtaining a probability distribution for each region based on the probability calculated for that region in the second probability calculation step;
a synthesis step of combining n-th map data, which represents the probability distributions for the respective regions on the n-th reduced image, with (n-1)-th map data;
a repetition step of creating composite map data by repeating the processing of the second probability calculation step, the second probability distribution calculation step, and the synthesis step for n = 2 to N; and
a detection step of detecting a representative pattern from among subject pattern candidates based on the composite map data.

With the configuration of the present invention, a predetermined subject in an image can be detected more simply and with high accuracy, without requiring empirical thresholds.

The present invention is described in detail below according to preferred embodiments, with reference to the accompanying drawings.

[First Embodiment]
The image processing apparatus according to this embodiment is implemented on a computer such as a PC (personal computer) or WS (workstation). It detects a predetermined subject contained in images input in various ways, such as images input from an imaging device such as a digital camera, images downloaded from external devices via a network such as the Internet, and images read from storage media such as CD-ROMs or DVD-ROMs. In this embodiment the subject is a human face, but other subjects may be used.

First, the image processing apparatus according to this embodiment, which performs such processing, is described. FIG. 3 shows the hardware configuration of a computer applicable to the image processing apparatus according to this embodiment.

Reference numeral 201 denotes a CPU, which controls the entire computer using programs and data stored in the RAM 202 and the ROM 203, and executes the processes, described later, that the computer performs.

Reference numeral 202 denotes a RAM, which provides various areas as needed: an area for temporarily storing programs and data read from the external storage device 207 or the storage medium drive device 208, an area for temporarily storing data received from outside via the I/F 209, and a work area used by the CPU 201 when executing various processes.

Reference numeral 203 denotes a ROM, which stores a boot program, the computer's setting data, and the like.

Reference numerals 204 and 205 denote a keyboard and a mouse, respectively, which the operator of the computer uses to input various instructions to the CPU 201.

Reference numeral 206 denotes a display unit, which consists of a CRT, a liquid crystal screen, or the like, and displays the results of processing by the CPU 201 as text, images, and so on.

Reference numeral 207 denotes an external storage device, a large-capacity information storage device such as a hard disk drive. It stores the OS (operating system) as well as the programs and data that cause the CPU 201 to execute the processes, described later, that the computer performs. These are read into the RAM 202 as needed under the control of the CPU 201.

Reference numeral 208 denotes a storage medium drive device, which reads programs and data recorded on a storage medium such as a CD-ROM or DVD-ROM and outputs them to the RAM 202, the external storage device 207, or the like. Some of the programs and data stored in the external storage device 207 may instead be recorded on the storage medium; in that case, when they are to be used, the storage medium drive device 208 reads them from the storage medium and outputs them to the RAM 202.

Reference numeral 209 denotes an I/F (interface), to which a digital camera or a network line such as the Internet or a LAN can be connected.

Reference numeral 210 denotes a bus that connects the above units.

The form in which images are input to the computer is not particularly limited, and various forms are conceivable.

FIG. 1 is a block diagram showing the functional configuration of a computer applicable to the image processing apparatus according to this embodiment. As shown in the figure, the apparatus consists of an image input unit 10, an image memory 20, an image reduction unit 30, a matching pattern extraction unit 40, a luminance normalization unit 50, a face discrimination unit 60, a face candidate list storage unit 70, a face probability distribution generation unit 80, a face probability map storage unit 90, a representative pattern detection unit 100, and a face area output unit 101.

The image input unit 10 receives image data output from a device such as a digital still camera or a film scanner and outputs it to the downstream image memory 20. As noted above, the form of image input is not particularly limited.

The image memory 20 is a memory for storing the image data output from the image input unit 10.

The image reduction unit 30 first generates a luminance image composed of the luminance components of the image data received from the image memory 20. It then generates a plurality of reduced images by recursively reducing the generated luminance image. The reduced images are output one after another to the downstream matching pattern extraction unit 40. (The original luminance image generated from the image data received from the image memory 20 can be regarded as a 1/1 reduced image and thus counted among the reduced images.)

On receiving a reduced image from the image reduction unit 30, the matching pattern extraction unit 40 moves a rectangle of a predetermined size over the reduced image, sequentially extracts the group of pixels inside the rectangle as a "matching target pattern", and outputs it to the downstream luminance normalization unit 50. This processing is performed for each reduced image received from the image reduction unit 30.

The luminance normalization unit 50 normalizes the luminance distribution of the pixels that make up the matching target pattern received from the matching pattern extraction unit 40.

The face discrimination unit 60 obtains the probability that the matching target pattern normalized by the luminance normalization unit 50 is a face pattern. If this probability is not 0, it registers in the face candidate list storage unit 70 the size of the pattern and its position on the image input to the image input unit 10 (that is, the position on the input image that preserves the positional relationship between the pattern and the reduced image from which it was extracted), and outputs data indicating the probability obtained for the pattern to the downstream face probability distribution generation unit 80.

The face candidate list storage unit 70 stores sets consisting of the position and size of each matching target pattern that appears to be a face pattern on the image input to the image input unit 10. These sets are registered one after another in data called list data, so the list data consists of one or more sets.

The face probability distribution generation unit 80 obtains a probability distribution for the matching target pattern and uses it to update the map data stored in the face probability map storage unit 90.

The face probability map storage unit 90 stores the map data. The map data is described later.

The representative pattern detection unit 100 uses the map data stored in the face probability map storage unit 90 and the list data registered in the face candidate list storage unit 70 to detect a representative face pattern (hereinafter, the representative pattern) on the image input to the image input unit 10. The representative pattern is described later.

The face area output unit 101 outputs the representative pattern detected by the representative pattern detection unit 100.

Each of the above units can be implemented by, for example, the CPU 201, the RAM 202, the external storage device 207, and so on.

Next, the processing performed when the CPU 201 operates as the units shown in FIG. 1, that is, the processing for detecting a subject contained in an image, is described with reference to FIG. 2, which shows a flowchart of this processing. The programs and data that cause the CPU 201 to execute the processing of this flowchart are stored in the external storage device 207 (or on a storage medium readable by the storage medium drive device 208). They are loaded into the RAM 202 as needed under the control of the CPU 201, and the CPU 201 uses them to execute the processing, so that the computer performs each of the processes described below.

When image data is input from the external storage device 207 or from outside via the I/F 209, the CPU 201 temporarily stores it in the area of the RAM 202 that corresponds to the image memory 20 (step S101). If the image input to the computer is compressed, it is decompressed before being temporarily stored in the RAM 202.

In this embodiment, each pixel of the input image data is assumed to be represented by R, G, and B. The CPU 201 therefore generates, from the image data stored in the RAM 202 in step S101, an image composed of the luminance components of that image (a luminance image), that is, an image in which the value of each pixel has been converted into that pixel's luminance value (step S102). If, however, each pixel of the image data stored in the RAM 202 in step S101 is represented in YCrCb, the luminance image is generated in step S102 using only the Y component.
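Step S102 can be sketched as follows. The source does not state which RGB-to-luminance formula is used, so the ITU-R BT.601 weights below are an assumption, and the function names `rgb_to_luminance` and `ycc_to_luminance` are hypothetical.

```python
def rgb_to_luminance(pixels):
    """Convert rows of (R, G, B) tuples (0-255) into rows of luminance values."""
    # Assumption: ITU-R BT.601 luma weights; the source does not name a formula.
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in pixels
    ]


def ycc_to_luminance(pixels):
    """For pixels already stored as (Y, Cr, Cb), keep only the Y component."""
    return [[y for (y, _cr, _cb) in row] for row in pixels]


# Tiny 2x2 example image: white, black, pure red, pure blue.
rgb = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
luma = rgb_to_luminance(rgb)
```

Either function yields a single-channel image, which is all that the later steps operate on.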

Next, the CPU 201 generates a plurality of reduced images by recursively reducing the generated luminance image (step S103). For example, it scales the height and width of the luminance image generated in step S102 (called reduced image 1 in the following description) by 1/1.2 to generate reduced image 2, then scales the height and width of reduced image 2 by 1/1.2 to generate reduced image 3, and so on. Detection is performed sequentially on image data of several sizes in this way so that faces of various sizes can be detected in the subsequent processing. The number of reduced images to generate is not particularly limited.
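The recursive reduction of step S103 might look like the sketch below. The 1/1.2 ratio comes from the text; nearest-neighbour resampling, the 20-pixel `window`, and the rule of stopping once a level can no longer hold one window are assumptions made for illustration, since the source leaves the number of reduced images open.

```python
def shrink(image, factor=1.2):
    """Return `image` (a list of rows) scaled by 1/factor using nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = int(src_h / factor), int(src_w / factor)
    return [
        [image[min(src_h - 1, int(y * factor))][min(src_w - 1, int(x * factor))]
         for x in range(dst_w)]
        for y in range(dst_h)
    ]


def build_pyramid(luminance, window=20, factor=1.2):
    """Reduced image 1 is the luminance image itself; each further level shrinks the previous one."""
    levels = [luminance]
    while True:
        next_h = int(len(levels[-1]) / factor)
        next_w = int(len(levels[-1][0]) / factor)
        # Assumed stopping rule: stop when the next level could not hold one window.
        if min(next_h, next_w) < window:
            break
        levels.append(shrink(levels[-1], factor))
    return levels


pyramid = build_pyramid([[0] * 60 for _ in range(48)])
```

Because each level is derived from the previous one rather than from the original, the reduction is recursive in the sense used in the text.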

From step S104 onward, processing is performed on each of the generated reduced images. In other words, the processing from step S104 onward is repeated as many times as there are reduced images.

In the following description, the generated reduced images are called reduced image 1, reduced image 2, ..., reduced image N in descending order of size, and the subsequent processing is first applied to reduced image 1. The order in which the images are selected for processing is not particularly limited.

First, the CPU 201 places a rectangle of a predetermined size on reduced image 1 and extracts the pixels inside it as a matching target pattern (step S104). The rectangle is placed at each position on reduced image 1 in turn to obtain the luminance distribution inside the rectangle at each position, so it is initially placed, for example, at the upper left corner of the image.

Next, the luminance distribution of the pixels in the matching target pattern extracted in step S104 is normalized (step S105), for example by a luminance correction such as histogram equalization. Because the luminance distribution of a captured subject pattern varies with the lighting conditions, this normalization suppresses the resulting loss of matching accuracy.
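As an illustration of the normalization in step S105, the sketch below applies a textbook histogram equalization to one extracted pattern. The source only gives this kind of correction as an example, so the exact mapping, the 256-level assumption, and the name `equalize_histogram` are illustrative choices rather than the patent's procedure.

```python
def equalize_histogram(patch, levels=256):
    """Spread the luminance values of `patch` (rows of ints in 0..levels-1) over the full range."""
    hist = [0] * levels
    for row in patch:
        for v in row:
            hist[v] += 1
    # Cumulative distribution of the luminance values.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Flat patch: every pixel has the same value, so there is nothing to stretch.
        return [row[:] for row in patch]
    scale = (levels - 1) / (total - cdf_min)
    return [[int(round((cdf[v] - cdf_min) * scale)) for v in row] for row in patch]


normalized = equalize_histogram([[10, 20], [30, 40]])
```

A dim, low-contrast pattern and a bright one with the same structure map to similar outputs, which is the lighting robustness the paragraph above describes.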

Next, the probability (face probability) that the matching target pattern (luminance pattern) whose luminance distribution was normalized in step S105 is a face pattern (a pattern that appears to be a face) is obtained (step S106).

FIG. 5 shows the operation of a neural network for identifying the pattern in a predetermined region. In the figure, R denotes, for example, the region on the image to be identified. In this embodiment, as shown in the figure, the region R is further divided into sub-regions in three different ways, and these sub-regions serve as the receptive fields of the neurons (denoted N). The luminance distribution of each sub-region is input to its neuron, which gives the outputs of the intermediate layer. The outputs of these neurons are then input to the neuron of the output layer, which gives the final output.

Each neuron computes the sum of products of the luminance distribution and weights obtained in advance by learning, and then applies a sigmoid function to the result. In this embodiment, the output value of the output-layer neuron is used as the face probability (for details of the neural network and the learning method, see Non-Patent Document 2 above). The way of obtaining the probability (face probability) that the matching target pattern normalized in step S105 is a face pattern is not limited to this. For example, the AdaBoost-based method proposed by Viola and Jones in "Rapid Object Detection using a Boosted Cascade of Simple Features", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2001, may be used instead.
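The per-neuron computation described above (a weighted sum followed by a sigmoid) can be sketched as below. The layer layout, the `bias` term standing in for a learned threshold, and all names are hypothetical; the actual network structure and weights are those of Non-Patent Document 2 and are not reproduced here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def neuron(inputs, weights, bias):
    """Sum of products of inputs and learned weights, passed through a sigmoid."""
    return sigmoid(sum(w * v for w, v in zip(weights, inputs)) + bias)


def face_probability(receptive_fields, hidden_params, output_params):
    """receptive_fields: one flat list of luminance values per intermediate-layer neuron.
    hidden_params: one (weights, bias) pair per intermediate-layer neuron (assumed layout).
    output_params: (weights, bias) of the single output-layer neuron."""
    hidden = [neuron(field, w, b) for field, (w, b) in zip(receptive_fields, hidden_params)]
    out_weights, out_bias = output_params
    return neuron(hidden, out_weights, out_bias)


# Two toy receptive fields with made-up weights, purely for illustration.
fields = [[0.2, 0.4], [0.1, 0.3, 0.5]]
p = face_probability(fields, [([0.0, 0.0], 0.0), ([0.0, 0.0, 0.0], 0.0)], ([1.0, 1.0], 0.0))
```

Since the output neuron ends in a sigmoid, its value lies between 0 and 1, which is what allows it to be treated as a face probability.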

FIG. 4 illustrates the processing for detecting face patterns in reduced images of various sizes (in this embodiment, reduced image 1, reduced image 2, ..., reduced image N). A rectangle of the same size is placed at each position on each reduced image, and it is determined whether the region inside the rectangle at each position is a face pattern. To do this, as shown on the left side of the figure, the rectangle is first placed at the upper left corner of the reduced image and then moved from left to right and from top to bottom. Each time the rectangle is moved, the pixels inside it are used as a matching pattern for face pattern discrimination.
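The scan order described for FIG. 4 can be sketched as a generator. The window size of 20 pixels and the one-pixel `step` are assumed values; the source only says that the rectangle has a predetermined size and is placed at each position.

```python
def iter_windows(image, size=20, step=1):
    """Yield (x, y, patch) for a size x size rectangle scanned left to right, top to bottom."""
    height, width = len(image), len(image[0])
    for y in range(0, height - size + 1, step):
        for x in range(0, width - size + 1, step):
            # Pixels inside the rectangle whose upper left corner is at (x, y).
            patch = [row[x:x + size] for row in image[y:y + size]]
            yield x, y, patch


# 3 rows x 4 columns, scanned with a 2 x 2 rectangle.
demo = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
windows = list(iter_windows(demo, size=2))
```

Running the same generator with the same `size` on every reduced image is what lets one fixed-size classifier respond to faces of different sizes.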

Returning to FIG. 2, if the probability that the matching target pattern (luminance pattern) normalized in step S105 is a face pattern (a pattern that appears to be a face) is greater than 0, that is, if the pattern is determined to be a face pattern, the process proceeds from step S107 to step S108. There, the size of the pattern and its position on the image acquired in step S101 are stored (registered) as a set in the face candidate list storage unit 70 provided in the RAM 202 or the external storage device 207 (step S108).
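A minimal sketch of steps S107 and S108 follows. It assumes that reduced image n is smaller than the input image by a factor of 1.2 to the power (n - 1), so window coordinates are scaled back by that amount; the dictionary layout and the names `to_input_image` and `register_candidate` are hypothetical.

```python
def to_input_image(x, y, level, window=20, factor=1.2):
    """Map a window at (x, y) on reduced image `level` (1-based) back onto the input image."""
    scale = factor ** (level - 1)
    return {
        "x": int(round(x * scale)),
        "y": int(round(y * scale)),
        "size": int(round(window * scale)),
    }


def register_candidate(candidates, x, y, level, probability, window=20, factor=1.2):
    """Append a (position, size) set only when the face probability is greater than 0."""
    if probability > 0:
        candidates.append(to_input_image(x, y, level, window, factor))
    return candidates


face_candidates = []
register_candidate(face_candidates, 10, 5, level=2, probability=0.7)
register_candidate(face_candidates, 3, 3, level=1, probability=0.0)  # not registered
```

Storing positions in input-image coordinates keeps candidates found on different reduced images directly comparable.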

Next, based on the face probability obtained in step S106 for the verification target pattern extracted in step S104, a probability distribution (face probability distribution) for this verification target pattern is obtained. The map data stored in the face probability map storage unit 90, provided in the RAM 202 or the external storage device 207, is then updated by adding the obtained face probability distribution to the data portion corresponding to this verification target pattern (step S109).

The processing in step S109 is now described in more detail.

First, taking the center position of the verification target pattern extracted in step S104 as the origin, a probability distribution is obtained that has its peak value at the origin and whose probability value decreases as the distance from the origin increases. For example, a two-dimensional Gaussian function is obtained whose peak value at the center position of the verification target pattern is the face probability and whose spread is the diagonal length of the verification target pattern. The obtained two-dimensional Gaussian function is then used to obtain a probability value for each pixel position within the verification target pattern. This yields a probability value for each pixel constituting the verification target pattern.
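
A minimal sketch of such a distribution for a square pattern follows. The text only says that the diagonal length is used as the spread; treating it directly as the standard deviation of an isotropic Gaussian is an assumption of this sketch, as are the function and parameter names.

```python
import math

def face_probability_distribution(size, face_probability):
    """Return a size x size grid of probability values for one pattern.

    The value at the pattern center equals face_probability and falls off
    as an isotropic 2D Gaussian. Using the diagonal length as the standard
    deviation is an assumption, not something the text specifies.
    """
    center = (size - 1) / 2.0
    sigma = math.hypot(size, size)  # diagonal length of the square pattern
    grid = []
    for y in range(size):
        row = []
        for x in range(size):
            d2 = (x - center) ** 2 + (y - center) ** 2
            row.append(face_probability * math.exp(-d2 / (2.0 * sigma ** 2)))
        grid.append(row)
    return grid

dist = face_probability_distribution(5, 0.9)
```

With an odd `size` the center falls exactly on a pixel, so that pixel receives the face probability itself and every other pixel receives a smaller value.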

Here, the map data is data of an image (map image) having a predetermined size, and the pixel value of each pixel constituting the map image is initialized to 0.

Accordingly, in step S109, the map data is updated as follows: to the pixel value of each pixel in the region on the map image (second region) that corresponds to the region on reduced image 1 from which the verification target pattern was extracted (first region), the probability value of the positionally corresponding pixel in the first region is added.

If the map image and reduced image 1 have the same size, the first region and the second region also have the same size. In that case, letting S(x, y) denote the probability value of a pixel in the first region, where x and y are coordinate values on reduced image 1, the map data is updated by adding S(x, y) to the probability value M(x, y) of the pixel located at coordinates (x, y) in the map image.
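
For the same-size case, the update M(x, y) ← M(x, y) + S(x, y) can be sketched as below. The function name and the list-of-rows map layout are assumptions for illustration; mapping coordinates from a smaller reduced image onto the map image is not covered here.

```python
def add_to_map(map_image, distribution, left, top):
    """Add a pattern's probability values to the map region at (left, top).

    map_image and distribution are lists of rows; (left, top) is the
    top-left corner of the pattern in map coordinates.
    """
    for dy, row in enumerate(distribution):
        for dx, value in enumerate(row):
            map_image[top + dy][left + dx] += value
    return map_image

# Map image initialized to 0, then two overlapping 2 x 2 patterns are added.
m = [[0.0] * 4 for _ in range(4)]
s = [[0.5, 0.5], [0.5, 0.5]]
add_to_map(m, s, 1, 1)
add_to_map(m, s, 2, 1)
```

Pixels covered by both patterns accumulate both contributions, which is the effect the map relies on to separate real faces from isolated false detections.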

Next, the process proceeds to step S110, where it is checked whether there is a position on reduced image 1 to which the rectangle can still be moved (step S110). That is, the rectangle is to be moved on reduced image 1 so that the portion (group of pixels) within the rectangle at the next position can be extracted as a verification target pattern. If there is no such destination, for example because the rectangle is already at the lower right corner of reduced image 1, the rectangle cannot be moved any further. If the rectangle is not yet at the lower right corner of reduced image 1, it can be moved.

Accordingly, if there is a destination, the process proceeds from step S110 to step S111, and the rectangle is moved on reduced image 1 (step S111). When the movement is complete, the process returns to step S104, the verification target pattern within the moved rectangle is extracted, and the subsequent processing is performed.

If, on the other hand, there is no destination for the rectangle, the process proceeds to step S112, where it is determined whether the above processing has been performed for all reduced images (step S112). If a reduced image remains unprocessed, the process proceeds to step S113, the position of the rectangle to be placed on the reduced image is initialized (for example, returned to the upper left corner of the reduced image) (step S113), and the processing from step S104 onward is performed for the next reduced image.

In this embodiment, reduced image 1 has just been processed, so reduced image 2 is processed next. In this case, the rectangle is placed at the upper left corner of reduced image 2 (step S113), and the processing from step S104 onward is performed on reduced image 2.

Thus, when the processing from step S104 onward is performed on reduced image n (n ≧ 2), the map data is updated by adding, to the probability value of each pixel in the region on the map image (m-th region) that corresponds to the region on reduced image n from which the verification target pattern was extracted (k-th region), the probability value of the positionally corresponding pixel in the k-th region.

As a result, when overlapping patterns are detected, such as adjacent patterns or patterns of slightly different sizes, as happens in a region that really is a face, the probability values are added one after another, so a relatively high probability is obtained in the map data. Conversely, when a pattern that is not actually a face happens to be determined to be a face, a high probability may be output in isolation, but no further high probability values are added to it, so the probability in the map data after the verification patterns have been scanned is relatively low.

Note that the above probability distribution need not be calculated for every verification target pattern. For example, the probability distribution may be generated only for verification patterns that output a face probability of 0 or more (that is, patterns determined to be face patterns).

FIG. 6 is a diagram showing how the probability distributions are generated. A represents a pattern determined to be a face pattern, and B represents the probability distribution generated from a single pattern, where darker shading indicates a higher probability value. C represents the map data obtained by adding the probability distributions.

Note that the method of generating the face probability map is not limited to this. For example, as shown in FIG. 13, the weight of the pixels inside the rectangle L indicated by the face pattern being verified may be set to 1 and the weight of all other pixels to 0, and these values added to the corresponding pixels of the face probability map. The numbers in face probability map B in FIG. 13 represent the weights. Even with this method, when overlapping patterns such as adjacent patterns or patterns of slightly different sizes are detected in a region that really is a face, the probability values are added one after another as shown in FIG. 14(b), and the position of the representative face pattern in image A has a relatively high probability in face probability map B. A representative pattern can therefore be detected by specifying the region where the map value is maximal, or a region whose values are at or above a certain threshold. For example, in FIG. 15, taking the region Q where the map value is maximal yields the rectangular region indicated by P in the input image.
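
This weight-of-1 variant can be sketched as a simple vote count. The rectangle format `(left, top, width, height)` and both function names are assumptions of this sketch, and the three rectangles are invented detections around one face.

```python
def add_rectangle_votes(map_image, rectangles):
    """Add weight 1 to every map pixel inside each (left, top, width, height) rectangle."""
    for left, top, width, height in rectangles:
        for y in range(top, top + height):
            for x in range(left, left + width):
                map_image[y][x] += 1
    return map_image

def max_region(map_image):
    """Return the maximum map value and the (x, y) pixels that hold it."""
    peak = max(v for row in map_image for v in row)
    cells = [(x, y) for y, row in enumerate(map_image)
             for x, v in enumerate(row) if v == peak]
    return peak, cells

# Three overlapping hypothetical detections on a 6 x 6 map.
votes = [[0] * 6 for _ in range(6)]
add_rectangle_votes(votes, [(1, 1, 3, 3), (2, 2, 3, 3), (2, 1, 3, 3)])
peak, cells = max_region(votes)
```

The pixels returned by `max_region` play the role of the maximal region Q in the description above: they are the pixels covered by the largest number of detected rectangles.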

Returning to FIG. 2, when the above processing has been performed for all reduced images, the process proceeds to step S114.

In step S114, the probability value of each pixel constituting the map image indicated by the map data is normalized to the range 0 to 1 (step S114). For example, the probability values of all pixels constituting the map image are scanned to find the maximum and minimum values, and a linear transformation is applied so that the maximum value becomes 1 and the minimum value becomes 0.
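
The linear normalization can be sketched as below. Returning an all-zero map when every pixel has the same value is an assumption added here to avoid division by zero; the text does not discuss that case.

```python
def normalize_map(map_image):
    """Linearly rescale map values so the minimum becomes 0 and the maximum 1."""
    flat = [v for row in map_image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Degenerate case not covered by the text: a flat map is mapped to all zeros.
        return [[0.0 for _ in row] for row in map_image]
    scale = hi - lo
    return [[(v - lo) / scale for v in row] for row in map_image]

normalized = normalize_map([[0.0, 1.0], [2.0, 4.0]])
```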

Then, using the map data generated in this way, a process of detecting a representative pattern from the image acquired in step S101 is performed (step S115).

FIG. 7 is a flowchart showing the details of the processing in step S115.

First, the list data held in the face candidate list storage unit 70 provided in the RAM 202 or the external storage device 207 is read (step S301). Next, the probability value of each pixel constituting the map image indicated by the map data is referenced, and the pixel position having the largest probability value is identified (step S302). FIG. 8(b) shows an example of a map image, in which A indicates the position of the pixel having the maximum probability value. This map image is assumed to be the one obtained when the image 801 shown in FIG. 8(a) is input in step S101.

Next, in step S303, the following processing is performed. First, each position registered in the list data (the position, on the image acquired in step S101, of a verification target pattern that appears to be a face pattern) is acquired. Then, among the acquired positions, the one closest to the position identified in step S302 is selected. The verification target pattern at the selected position becomes the representative pattern.

The processing in step S303 is described with reference to FIG. 8. In FIG. 8(a), let B be one of the positions registered in the list data, and let P be the region centered on position B and having the size registered in the list data as a set with position B (that is, the verification target pattern at position B). If, among the positions registered in the list data, position B is the one closest to position A shown in FIG. 8(b) (that is, closest to position A when image 801 and image 802 are scaled to the same size and superimposed), this verification target pattern P is selected as the representative pattern in step S303.
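
Steps S302 and S303 can be sketched as a peak search followed by a nearest-neighbor lookup. The dictionary layout of a list entry (`center`, `size`) and the function names are assumptions, and the sketch assumes the map image and the input image share one coordinate system.

```python
import math

def find_peak(map_image):
    """Return (x, y) of the first pixel holding the maximum map value (position A)."""
    best = None
    for y, row in enumerate(map_image):
        for x, v in enumerate(row):
            if best is None or v > best[0]:
                best = (v, x, y)
    return best[1], best[2]

def select_representative(candidates, peak):
    """Return the registered candidate whose center is closest to the peak."""
    return min(candidates, key=lambda c: math.dist(c["center"], peak))

# Hypothetical 5 x 5 map with its maximum at x=3, y=1.
map_image = [[0.0] * 5 for _ in range(5)]
map_image[1][3] = 0.9
map_image[3][1] = 0.4

# Hypothetical list data: center position and size of each face candidate.
candidates = [
    {"center": (1, 1), "size": 3},
    {"center": (3, 2), "size": 3},
]
peak = find_peak(map_image)
representative = select_representative(candidates, peak)
```

In this example the second candidate is selected because its center is one pixel from the peak, whereas the first is two pixels away.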

Returning to FIG. 7, the map data is then updated (step S304). For example, in the case of FIG. 8(b), the data in the map data corresponding to the region on map image 802 that corresponds to verification target pattern P is set to 0. FIG. 9 shows an example of the map image after the processing in step S304 has been performed on map image 802.

Next, the processing ends if the number of selected representative patterns has reached a predetermined number, or if the map data updated in step S304 contains no pixel having a probability value equal to or greater than a predetermined value. If the number of selected representative patterns has not reached the predetermined number and the map data updated in step S304 contains a pixel having a probability value equal to or greater than the predetermined value, the process returns via step S305 to step S302, and the subsequent processing is repeated.
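
The loop over steps S302 to S305 can be sketched as a greedy selection. The candidate format `(left, top, size)`, the parameter names `max_count` and `min_value`, and the sample data are assumptions. One simplification is that the threshold is also checked before the first selection, whereas the flow above checks it only after each update.

```python
def select_representatives(map_image, candidates, max_count, min_value):
    """Greedy selection loop sketching steps S302 to S305.

    candidates: list of (left, top, size) squares in map coordinates.
    Returns the selected candidates in the order they were chosen.
    """
    selected = []
    while len(selected) < max_count:
        # S302: locate the pixel with the largest map value.
        value, px, py = max(
            (v, x, y) for y, row in enumerate(map_image) for x, v in enumerate(row)
        )
        if value < min_value:
            break  # S305: no pixel at or above the predetermined value remains

        # S303: pick the candidate whose center is nearest to the peak.
        def dist2(c):
            left, top, size = c
            cx, cy = left + (size - 1) / 2.0, top + (size - 1) / 2.0
            return (cx - px) ** 2 + (cy - py) ** 2

        left, top, size = min(candidates, key=dist2)
        selected.append((left, top, size))

        # S304: set the map to 0 inside the selected pattern.
        for y in range(top, top + size):
            for x in range(left, left + size):
                map_image[y][x] = 0.0
    return selected

# Hypothetical 6 x 6 map with two strong peaks and one weak one.
grid = [[0.0] * 6 for _ in range(6)]
grid[1][1] = 1.0
grid[4][4] = 0.8
grid[1][4] = 0.1
chosen = select_representatives(grid, [(0, 0, 3), (3, 3, 3)], 5, 0.5)
```

The `max_count` bound also guarantees termination in this sketch, since the nearest candidate is not guaranteed to cover the peak pixel that triggered its selection.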

With this processing, when the second and subsequent representative patterns are selected, the regions of the map image corresponding to previously selected verification target patterns are avoided, and the pixel position having the second largest probability value can be found quickly.

Note that the determination condition used in step S305 is not limited to this. The number of representative patterns to be selected need not be limited; in that case, step S305 refers to the map data updated in step S304, ends the processing if no pixel has a probability value equal to or greater than the predetermined value, and returns the process to step S302 if such a pixel exists.

Since the representative pattern can be selected as described above, returning to FIG. 2, the image data of this representative pattern is then output (step S116). The representative pattern that is output may be either the pattern before the correction in step S105 or the pattern after it.

The output destination is not particularly limited; it may be a predetermined area in the RAM 202, the external storage device 207, or an external device capable of data communication via the I/F 209.

Also, since the updated map image is as shown in FIG. 9, the following processing is also conceivable: the verification target pattern selected as a representative pattern in step S303 is compared with the verification target patterns previously detected as representative patterns, and if their overlap is equal to or greater than a predetermined area, the pattern is not output as a representative pattern. This further improves accuracy. In this case, the update in step S304 is still performed on the region once detected as a representative pattern.
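
The overlap test can be sketched with axis-aligned rectangles. The rectangle format `(left, top, width, height)`, the function names, and the parameter `max_overlap` (the predetermined area) are assumptions introduced for illustration.

```python
def overlap_area(a, b):
    """Area of the intersection of two (left, top, width, height) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(w, 0) * max(h, 0)

def should_output(candidate, accepted, max_overlap):
    """True if the candidate overlaps every accepted pattern by less than max_overlap."""
    return all(overlap_area(candidate, r) < max_overlap for r in accepted)

accepted = [(0, 0, 4, 4)]
suppressed = not should_output((2, 2, 4, 4), accepted, 4)
kept = should_output((5, 5, 2, 2), accepted, 4)
```

Here the first candidate shares a 2 x 2 area with the accepted pattern, which reaches the limit of 4 and is therefore not output, while the second does not overlap at all.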

Step S115 above was described using a method based on the distance between the center of the verification target pattern and the pixel having the maximum probability value, but the method of determining the representative pattern is not limited to this. For example, the representative pattern may be a rectangular pattern that is centered on the pixel having the maximum probability value on the map image and inscribed in the region whose values are equal to or greater than a predetermined threshold.

That is, in the map image 802 shown in FIG. 8(b), if position A is the position of the pixel having the maximum probability value, then, as shown in FIG. 10, the probability distribution in the plane H passing through position A and parallel to the x-axis is as shown in the graph in the lower part of the figure. In this figure, the side parallel to the x-axis of the rectangular pattern that is centered on the position of the pixel having the maximum probability on the map image 802 and inscribed in the region having probability values equal to or greater than the predetermined threshold th has length m.

Similarly, the probability distribution in the plane I passing through position A and parallel to the y-axis is as shown in the graph of FIG. 11. In this figure, the side parallel to the y-axis of the rectangular pattern inscribed in the region having probability values equal to or greater than the predetermined threshold th has length n.

In this way, the length of each side of the rectangular pattern centered on the pixel position having the maximum probability on the map image 802 and inscribed in the region having probability values equal to or greater than the predetermined threshold is determined. As shown in FIG. 12, the region on the input image (the image acquired in step S101) corresponding to the rectangle R shown on the map image 802 is detected as the representative pattern P.
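
One way to obtain the side lengths m and n is sketched below. It examines only the row and the column passing through the peak, in the manner of the two profiles in FIGS. 10 and 11, and keeps the rectangle centered by taking the shorter of the two runs on each axis. That centering rule, the function names, and the sample map are assumptions of this sketch.

```python
def run_length(values, start, step, th):
    """Count consecutive entries >= th moving away from start (start itself excluded)."""
    count = 0
    i = start + step
    while 0 <= i < len(values) and values[i] >= th:
        count += 1
        i += step
    return count

def centered_rectangle(map_image, peak, th):
    """Return (left, top, m, n) of a rectangle centered on the peak pixel.

    m and n are side lengths in pixels along the x and y axes. Only the two
    profiles through the peak are checked, not every pixel of the rectangle.
    """
    px, py = peak
    row = map_image[py]
    col = [r[px] for r in map_image]
    half_w = min(run_length(row, px, -1, th), run_length(row, px, +1, th))
    half_h = min(run_length(col, py, -1, th), run_length(col, py, +1, th))
    return px - half_w, py - half_h, 2 * half_w + 1, 2 * half_h + 1

# Hypothetical 5 x 5 normalized map with its maximum at the center.
sample = [
    [0.0, 0.0, 0.2, 0.0, 0.0],
    [0.0, 0.3, 0.6, 0.3, 0.0],
    [0.1, 0.6, 1.0, 0.7, 0.6],
    [0.0, 0.3, 0.6, 0.3, 0.0],
    [0.0, 0.0, 0.2, 0.0, 0.0],
]
rect = centered_rectangle(sample, (2, 2), 0.5)
```

In the sample map the row through the peak stays at or above 0.5 for one pixel to the left and two to the right, so the centered width is limited to three pixels.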

The pattern extracted here is not limited to a rectangular pattern; it may be a circular pattern centered on the pixel position A having the maximum probability.

[Other Embodiments]
It goes without saying that the object of the present invention is also achieved by supplying a system or apparatus with a recording medium (or storage medium) on which the program code of software that realizes the functions of the above-described embodiment is recorded, and having a computer (or CPU or MPU) of the system or apparatus read and execute the program code stored on the recording medium. In this case, the program code itself read from the recording medium realizes the functions of the above-described embodiment, and the recording medium on which the program code is recorded constitutes the present invention.

It also goes without saying that the functions of the above-described embodiment may be realized not only by the computer executing the program code it has read, but also by an operating system (OS) or the like running on the computer performing part or all of the actual processing based on the instructions of the program code.

It further goes without saying that the present invention includes the case where, after the program code read from the recording medium is written to a memory provided in a function expansion card inserted into the computer or in a function expansion unit connected to the computer, a CPU or the like provided in the function expansion card or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiment are realized by that processing.

When the present invention is applied to such a recording medium, the recording medium stores program code corresponding to the flowcharts described above.

FIG. 1 is a block diagram showing the functional configuration of a computer applicable to the image processing apparatus according to the first embodiment of the present invention.
FIG. 2 is a flowchart of the processing for detecting a subject included in an image.
FIG. 3 is a diagram showing the hardware configuration of a computer applicable to the image processing apparatus according to the first embodiment of the present invention.
FIG. 4 is a diagram explaining the processing for determining, for reduced images of various sizes, whether a verification target pattern is a face pattern.
FIG. 5 is a diagram showing the operation of the neural network for identifying a pattern within a predetermined region.
FIG. 6 is a diagram showing how the probability distributions are generated.
FIG. 7 is a flowchart showing the details of the processing in step S115.
FIG. 8 is a diagram showing an example of the image acquired in step S101 and of a map image.
FIG. 9 is a diagram showing an example of the map image after the processing in step S304 has been performed on map image 802.
FIG. 10 is a diagram showing the probability distribution in the plane H passing through position A and parallel to the x-axis in the map image 802 shown in FIG. 8(b), where position A is the position of the pixel having the maximum probability value.
FIG. 11 is a diagram showing the probability distribution in the plane I passing through position A and parallel to the y-axis.
FIG. 12 is a diagram showing the region on the input image corresponding to the rectangle R shown on the map image 802.
FIG. 13 is a diagram explaining a method of generating the face probability map.
FIG. 14 is a diagram explaining a method of generating the face probability map.
FIG. 15 is a diagram explaining a method of generating the face probability map.

Claims

An image processing apparatus comprising:
acquisition means for acquiring an image including a predetermined subject;
generating means for generating a luminance image composed of luminance components of the image;
first probability calculating means for obtaining, when a rectangle of a predetermined size is placed at each position on the luminance image, a probability that the region within the rectangle shows a pattern that appears to be the predetermined subject;
first probability distribution calculating means for obtaining a probability distribution for each region based on the probability calculated for that region by the first probability calculating means;
first creation means for creating first map data indicating the probability distribution for each of the regions on the luminance image;
reduction means for generating a second reduced image through an Nth reduced image by recursively reducing the luminance image;
second probability calculating means for obtaining, when the rectangle is placed at each position on an nth (2 ≦ n ≦ N) reduced image, a probability that the region within the rectangle shows a pattern that appears to be the predetermined subject;
second probability distribution calculating means for obtaining a probability distribution for each region based on the probability calculated for that region by the second probability calculating means;
synthesizing means for synthesizing nth map data, indicating the probability distribution for each of the regions on the nth reduced image, with (n-1)th map data;
repetition means for creating composite map data by repeating the processing by the second probability calculating means, the second probability distribution calculating means, and the synthesizing means for n = 2 to N; and
detecting means for detecting a representative pattern from among subject pattern candidates based on the composite map data.

The image processing apparatus according to claim 1, wherein the detecting means includes:
first means for outputting, as the region of the predetermined subject, the region that, among the regions on the image acquired by the acquisition means that correspond to regions for which the probability calculated by the first probability calculating means is greater than 0, is closest to the pixel position having the largest value on the image indicated by the composite map data;
second means for updating the composite map data by setting to 0 the data in the composite map data corresponding to the region, on the image indicated by the composite map data, that corresponds to the region detected by the first means; and
third means for repeating the processing by the first and second means on the composite map data updated by the second means.

The image processing apparatus according to claim 2, wherein the third means repeats the processing by the first and second means if the largest value on the image indicated by the composite map data is equal to or greater than a predetermined threshold.

An image processing method comprising:
an acquisition step of acquiring an image including a predetermined subject;
a generation step of generating a luminance image composed of luminance components of the image;
a first probability calculating step of obtaining, when a rectangle of a predetermined size is placed at each position on the luminance image, a probability that the region within the rectangle shows a pattern that appears to be the predetermined subject;
a first probability distribution calculating step of obtaining a probability distribution for each region based on the probability calculated for that region in the first probability calculating step;
a first creation step of creating first map data indicating the probability distribution for each of the regions on the luminance image;
a reduction step of generating a second reduced image through an Nth reduced image by recursively reducing the luminance image;
a second probability calculating step of obtaining, when the rectangle is placed at each position on an nth (2 ≦ n ≦ N) reduced image, a probability that the region within the rectangle shows a pattern that appears to be the predetermined subject;
a second probability distribution calculating step of obtaining a probability distribution for each region based on the probability calculated for that region in the second probability calculating step;
a synthesis step of synthesizing nth map data, indicating the probability distribution for each of the regions on the nth reduced image, with (n-1)th map data;
a repetition step of creating composite map data by repeating the processing of the second probability calculating step, the second probability distribution calculating step, and the synthesis step for n = 2 to N; and
a detection step of detecting a representative pattern from among subject pattern candidates based on the composite map data.

A program causing a computer to execute the image processing method according to claim 4.

A computer-readable storage medium storing the program according to claim 5.