WO1996022595A1 - Speaker verification method - Google Patents
Speaker verification method
- Publication number
- WO1996022595A1 (PCT/US1996/000709)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- steps
- frequency domain
- prints
- voice
- voice print
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
Abstract
A method for comparing different electronic representations of sounds in an effort to determine a high level of confidence that the sounds, in fact, originated from the same source, is disclosed. Specifically, the method attempts to determine if two sounds were produced by the same human voice in an attempt to discriminate between allowed and non-allowed personnel seeking entry to a secure facility. A short word or phrase is converted into an electronic representation and is then compared with a data bank of stored similar representations of the same word or phrase made by the same person at some earlier time. Depending upon the comparison results, access is granted or denied. Sound prints comprising two portions, a time domain portion and a frequency domain portion, are created and stored. The manner in which the sound prints are created, the manner in which the sound prints, once created, are compared, and the admittance protocol are all the subject of the claims of this disclosure.
Description
TITLE: SPEAKER VERIFICATION METHOD
FIELD OF THE INVENTION
This invention relates generally to electronic security methods which provide for modeling or otherwise comparing human features such as fingerprints, voice patterns, and retina patterns, in order to distinguish between individuals, and, more particularly, to a security method and protocol for modeling and comparing voice prints by digital means.
BACKGROUND OF THE INVENTION
There is clearly a need for automated individual recognition in industry and elsewhere. The art is defined by Rabiner, Rosenberg and Soong of American Telephone and Telegraph Company in Canadian patent 1,252,567, entitled Individual Recognition by Voice Analysis. Of interest also are Koristka, in DDR patent 201,524, and two Russian references, 518,512 and 522,512.
The prior art is characterized by methods for forming a code book of word-grouping features that are independent of the temporal sequence of such features for each word, and then generating a set of acoustic feature signals from a test case for matching to the code book. However, the disadvantages of the above methods include the difficulty in rigorously removing the temporal features from speech, problems in obtaining repeatable results within a statistically satisfactory confidence level, problems introduced by equipment variability, and environmental noise.
Clearly, then, there is a need for an improved protocol for management of voice prints in a system which is expandable to virtually any size, and a method for comparing a challenge print in a large data base, or in producing print matching of very high confidence. Such a needed method is described in the following summary and detailed description and is based upon principles which are defined in the appended claims.
SUMMARY OF THE INVENTION
The present invention is a method for comparing different electronic representations of sounds in an effort to determine a high level of confidence that the sounds, in fact, originated from the same source, the source usually being human voice, the sounds being common speech. Specifically, the method attempts to determine if two sounds were produced by the same human voice in an attempt to discriminate between allowed and non-allowed personnel seeking entry to a secure facility. In the preferred embodiment of the method, a short word or phrase is spoken into a microphone and is then converted into an electronic representation of the word or phrase. This representation is then compared with a stored similar representation of the same word or phrase made by the same person at some earlier time. If the two representations meet a given similarity criterion, then a door or other security device, as an example, is unlocked. If the criterion is not met, the security device is not unlocked. The technique could be used for access to doors, computer files, file cabinets or other things that must be kept under "lock and key".
In the method of the present invention, each of the electronic representations is termed a "sound print" and consists of two portions: a time domain portion and a frequency domain portion. In the general case there are at least several different individuals who may be admitted, while possibly many others may not be. The method provides for the establishment of a data bank of "enrollment sound prints", one for each of the admittable individuals. The method further provides for the taking of a sample or challenge sound print from a prospective individual seeking admittance. The challenge sound print is compared with each of those stored in the data bank to see if a match exists. If it does, the individual is admitted; if not, he is withheld from admittance.
Sound prints, whether of the enrollment type or of the challenge type, are established in the same way so that they are able to be compared on a one-to-one basis. The manner in which the sound prints are created, the manner in which the sound prints, once created, are compared, and the admittance protocol are all the subject of the claims of this disclosure.
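The admittance protocol summarized above can be sketched as a search of the data bank. This is only an illustrative sketch: the names `find_match`, `admit`, and `toy_match`, the representation of a print as a list of numbers, and the 0.1 tolerance are all assumptions made for the example and do not come from the disclosure.

```python
def find_match(challenge, data_bank, is_match):
    """Return the name of the first enrolled print that matches the challenge, else None."""
    for name, enrolled in data_bank.items():
        if is_match(enrolled, challenge):
            return name
    return None


def admit(challenge, data_bank, is_match):
    """Admit the individual only if some enrollment print matches the challenge print."""
    return find_match(challenge, data_bank, is_match) is not None


def toy_match(a, b):
    """Hypothetical similarity criterion: every element agrees to within 0.1."""
    return len(a) == len(b) and all(abs(x - y) <= 0.1 for x, y in zip(a, b))


# Hypothetical data bank with one enrollment print per admittable individual.
data_bank = {"alice": [0.5, 0.2, 0.9], "bob": [0.1, 0.8, 0.3]}

print(find_match([0.12, 0.75, 0.33], data_bank, toy_match))  # bob
print(admit([0.9, 0.9, 0.9], data_bank, toy_match))          # False
```

The matcher is passed in as a parameter so that any similarity criterion can be substituted without changing the protocol itself.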
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawing, FIGURE 1, illustrates the invention. The drawing shows a method flow diagram summarizing the various steps of the invention method.
DETAILED DESCRIPTION OF THE DISCLOSURE
The method of creating any one of the sound prints in the invention includes, first, the step of converting a verbal utterance of a specific word or phrase of about two seconds duration into an electronic representation through the use of a microphone or other transducer. The representation is therefore a time domain analog signal. Next, this signal, which may contain pauses or portions with zero signal level, is reduced by eliminating the pauses. The reduced signal is then sampled to produce a sampling set. The sampling set of the reduced time domain signal is called the time domain sound print portion, and is one part of the total sound print. The time domain sound print portion is stored digitally for reference.
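As a rough illustration of the time domain steps just described, the following sketch removes pauses and then draws a fixed-size sampling set. The amplitude threshold used to detect a pause, the evenly spaced sampling rule, and all function names are assumptions for the example; the disclosure does not specify them.

```python
def remove_pauses(signal, threshold=0.02):
    """Drop samples whose magnitude is at or below the threshold (treated as pause)."""
    return [x for x in signal if abs(x) > threshold]


def sample_set(signal, n_points):
    """Pick n_points evenly spaced samples from the signal."""
    if not signal:
        return [0.0] * n_points
    step = len(signal) / n_points
    return [signal[int(i * step)] for i in range(n_points)]


def time_domain_portion(signal, n_points=8, threshold=0.02):
    """Reduce the signal by eliminating pauses, then sample the reduced signal."""
    return sample_set(remove_pauses(signal, threshold), n_points)


# Hypothetical digitized utterance with zero-level gaps.
utterance = [0.0, 0.0, 0.5, -0.4, 0.0, 0.0, 0.3, -0.2, 0.6, 0.0, -0.5, 0.1]
print(time_domain_portion(utterance, n_points=4))  # [0.5, -0.4, -0.2, -0.5]
```

Using the same `n_points` for every print keeps enrollment and challenge sampling sets the same length, which is what allows a later one-to-one comparison.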
The electronic representation is also processed in an analog to digital converter and then into a series of Fast Fourier Transformation digital filters, to produce a frequency domain electrical signal representation. Again, a sampling set of this signal is stored digitally and is called the frequency domain sound print portion, a second part of the sound print. Together the time domain sound print portion and the frequency domain sound print portion make up the total sound print of the spoken word or phrase.
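The frequency domain portion can be illustrated with a small transform. Note the substitution: the sketch below uses a plain discrete Fourier transform written with the standard library in place of the Fast Fourier Transformation filters named in the text, which yields the same magnitudes at higher computational cost. The function name and the choice to keep only the non-negative frequency bins are assumptions for the example.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each non-negative frequency bin of a real-valued signal."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitudes.append(abs(acc))
    return magnitudes


# Hypothetical test tone: two full cycles across 16 samples.
tone = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
spectrum = dft_magnitudes(tone)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin)  # 2
```

In this sketch the list of bin magnitudes itself plays the role of the frequency domain sampling set.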
An important part of the method includes repeating the above at least twice, and further, calculating the set of statistical variances for both the time domain and the frequency domain portions. A grand average of the variances is calculated and is used as a preliminary match criterion between each of the enrollment sound prints and the challenge sound print. The step of first calculating the grand average variance of the challenge sound print and comparing it with the grand average of each of the enrollment sound prints is an initial step in determining if a match exists. If no variance grand average match exists, then no further comparison need be executed. If a set of variance grand average matches is found, then the further comparison may be limited to those enrollment sound prints which have met this first requirement. Of course, in a system whereby the challenger is first identified, one need not search all of the enrollment data bank for a preliminary match. In this case only the enrollment print of the identified individual need be matched with the challenge print.
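One possible reading of the variance screening step is sketched below. It assumes that the variance is taken element by element across the repeated utterances and that the grand average is the mean of those variances; the text does not pin down either choice, so this is an interpretation rather than the disclosed computation. The function names and the tolerance value are likewise hypothetical.

```python
from statistics import mean, pvariance


def grand_average_variance(repetitions):
    """Mean of the per-element variances across repeated sampling sets.

    repetitions: list of equal-length sampling sets, one per repeated utterance.
    """
    per_element = [pvariance(column) for column in zip(*repetitions)]
    return mean(per_element)


def prefilter(challenge_gav, enrolled_gavs, tolerance):
    """Keep only the enrolled names whose grand average is close to the challenge's."""
    return [
        name
        for name, gav in enrolled_gavs.items()
        if abs(gav - challenge_gav) <= tolerance
    ]


# Two hypothetical repetitions of a two-element sampling set.
print(grand_average_variance([[1.0, 2.0], [3.0, 2.0]]))        # 0.5
print(prefilter(0.5, {"alice": 0.55, "bob": 0.9}, tolerance=0.1))  # ['alice']
```

An empty list from `prefilter` corresponds to the case where no further comparison need be executed.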
In producing the frequency domain electrical signal representation, the total spectrum energy content is determined as well as the energy content of each of the filter subsets. Ratios of each of the subset energy contents to the total energy content are calculated. These ratio numbers are compared between the challenge and each of the enrollment sound prints as a further preliminary measure in determining if a match exists.
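The energy ratio check can be sketched as follows. The sketch assumes that energy is the squared magnitude of each spectral element, that the filter subsets are contiguous, roughly equal-width bands, and that two prints agree when every ratio differs by no more than a tolerance. None of these details are given in the text, and the function names are hypothetical.

```python
def band_energy_ratios(magnitudes, n_bands):
    """Ratio of each band's energy to the total spectrum energy."""
    energies = [m * m for m in magnitudes]
    total = sum(energies)
    n = len(energies)
    ratios = []
    for b in range(n_bands):
        lo, hi = b * n // n_bands, (b + 1) * n // n_bands
        ratios.append(sum(energies[lo:hi]) / total if total else 0.0)
    return ratios


def ratios_agree(ratios_a, ratios_b, tolerance):
    """True when every pair of corresponding ratios is within the tolerance."""
    return all(abs(a - b) <= tolerance for a, b in zip(ratios_a, ratios_b))


print(band_energy_ratios([1, 1, 2, 2], n_bands=2))  # [0.2, 0.8]
```

Because the ratios are normalized by total energy, the check is less sensitive to how loudly the word or phrase was spoken than a comparison of raw band energies would be.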
The primary comparison between prints is one in which each of the elements of each of the sampling sets, both time domain and frequency domain, are compared element to element, on a one to one basis, between the two prints being compared, whereby a prescribed number of elemental matches must be found or the match process is taken as a failure.
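The element-to-element comparison can be sketched as a count of elemental matches against a prescribed minimum. The per-element tolerance, the dictionary layout with `"time"` and `"freq"` keys, and the function names are assumptions for the example; the text states only that a prescribed number of elemental matches must be found.

```python
def count_matches(a, b, tolerance):
    """Number of positions at which the two sampling sets agree within the tolerance."""
    return sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)


def prints_match(enrolled, challenge, tolerance, required):
    """Compare both portions one to one; succeed only with enough elemental matches."""
    found = count_matches(enrolled["time"], challenge["time"], tolerance)
    found += count_matches(enrolled["freq"], challenge["freq"], tolerance)
    return found >= required


# Hypothetical prints with three time domain and two frequency domain elements.
enrolled = {"time": [0.5, -0.4, 0.3], "freq": [8.0, 1.0]}
challenge = {"time": [0.52, -0.1, 0.31], "freq": [7.9, 1.05]}

print(prints_match(enrolled, challenge, tolerance=0.15, required=4))  # True
print(prints_match(enrolled, challenge, tolerance=0.15, required=5))  # False
```

Raising `required` trades more false rejections for fewer false admissions, so the prescribed number acts as the main security setting in this sketch.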
While the invention has been described with reference to a preferred embodiment, it is to be clearly understood by those skilled in the art that the invention is not limited thereto. Rather, the scope of the invention is to be interpreted only in conjunction with the appended claims.
CODE ENABLEMENT The following machine language code provides the software enablement for the method of the invention: S123080OBFAA02A326O421340120FAB623A47F20OCBFAA02A31F0120O4AEFFBFB0B72302BA S12-MK20A3060D.IOFD0 -2114BEAA8100201EOT24FAB62720D8B7270D24FD20E103A308A62C -n230840∞B724A67re724Ifc23CD15793FA3CC0C5ra784BBA7B7A7B6 S12-W86O0444444444AB30A139.^2AB07CTXmi81B673A43FαX^ S1230880CD0811CD095røD82270BA43FCD0851CD09^ S123(WA0115C20F5CD0899CIXtøE45FCD09A33F82CD08CC5CBF8 S123O8C0275EAD33CD0811CDO87E20E7A653CDO811A63DCD0811B674AB05CDO851815350A3 S12308E041584300A60DCI^11A60ACD08118131313148494E5A43BF8497B6A43F823F73B2 S1230900A3502604 CS2AB04A3582602AB03A341260.-AB02A343260--AB01B7749FBE848178 S1.130920AE40CD0899B6A4AB01B7743F73CD0950B784AEFFA608B7825CA62E38842403D S12-W94008EFCD08113A8226EF81B784A6C72002A6C61550B79EA612B79CA650B79DB673F6 S123O960B79FB674B7A0A681B7A1B684BD9C1350145O81A601BB74B7744FB973B77381A675 S123098001B784B674B084B774B673A20020EDB675B173250B2208B676B1742503A601816C S123røA04F-H)FC3-*73B6A4B7749FCD097581CD09A3CD09∞A43ra^ S12309C081A6FCB774A63FB77320E6A6F8ADF4AD42AE91CDOA09A6FAADE9AD3781CD09A34F S12309EOCD095081CIXI9A3B675CD094A81AE04ADBDAD2081AE04CD09A3B675CD094ACIXI951 S1230A0073B676CD094A81AE75B673A43FF7B674E70181AE75F6B773E601B774815F6F85B0 S1230A205CA30D23F981BE823C823C82E686B774E685B7732602E68681A608B7835FB6739C S1230A40E1852606B674E18627EE5C5CB38323EE815FA608B783BF82ADCC270D1650CD0969 S1230A605054E793A683CD094ABE82B38323E9CC1520AE084FBF82B783ADAB270654E693B6 S1230A8OCDO94AB682A004B782B1832AEC813FA83FA9CD0950B784A4OF97B684A4F02603E1 S12-. 
AAOCC0BB4A110276DA1202603CC0BB0A17022385D271BA14026O4A302273DA302230B S1230AC00CA3052708A30B2704A30E26033C5281A1302740A140270CA150270A16A9A1603F S1230AE02732201610A912A92010A1802745A190260CA30623D7A30E27D3A6012066A1A013 S1230B002614A30D270EA30727C3A30C27BFA30F27BB14A92074A1B02770A1C0273C16A939 S1_30B20A1D027-^A1E02762AD2926CEAE03CIX)9DD206E5A2B122714A3022606CIX)9C15F84 Sl.t-K)B40->038A30C238720B2AEC)92002AEC)6αX)9EF2014A30C2702A30D813CA83CA8AD S1230B60270BA603CIX)975AE8FCIX)A()9815FA1C02702AE03CIX)973CD09B14F5D2703CD )997 Sl-BOB∞DDB783CIX)A13B683-tøDA3CA8ADC52606AlA0271C220 S1230BA0B83F753F7620DA3F75AE03CD09B82CCAA6012(X)2A602B7A84CCDC)975AE91CD0ABC S12-røBC009CI-W7FCD095097CIX#739F4D2A953A732091CIM^ S1230BE01827FOA1082607A30027015A20EEE7545CA31E2704A10D26E3A60DE7543F7F8152 S1230COOBF84BE7FE6543C^FBE84B78381A1602304A020B783813F7D3F7ECDOCOOA124267E -n-3fJC20O3CD<X:θ0CIX)C0D3FAA3AAAA0302B2BAlrø S1230C407DBE7E5849584958495849B77D9FBAAAB77EBE843DA3260->20C73C^3813α2 S123O_30CD08E43D52270AAE25CTJ0899CIX)8E43F52Cro^ S123OC^ODECl- X 0D5CD61AE727D3A47FB183270C3F7F3C51D61AE72BE15C20F8D61AE72A32
S1230CA0D9CDOCO0A1OD272AA12026E4B651A10D272OA104271CCDOC16BE5358A3OA229D0C S1230CC0B67DE771B67EE772B683A12027E8A10D268BB6514897A6CCB79CD60CE6B79DD6F3 S1230CE00CE7B79EBC9C0EDA1429121C1276148A12E413D9126012AC136213F1166312D1F0 S1230D001448185A191F19AA1A0E196216A7AEA5CD0A09CD08E4CD0871A620AE1DE7545A4E S1230D202AFBCDOA8ECDOE55B6A8B784B753AE02BF7FCDOE5E3C7FCD09733A842AF4CDOE70 S1230D40553D522709CDO97F3CA8AE262038CDO95OA40F97CD095OA10F222DB784AE17BFC2 S1230D6O7FCDOE7FCDOE8A5FBF80AE12BF7FB684O184023C8O44CD0E71CD0E7FCDOE5B3FB9 S1230D80A8BE80DEOEB52041A11F2208A010B784AE0220D4A12F22059FAB05201EA17F2228 S1230DA005DE0E952023A19F220BA18F2602AE02DE0ECA2014A1AD260DA6O4B780AE12BF26 S1230DC07FCD0E8820B9DE0EA5BF513F845FD6114FA10F22035C20F6A40FB780B684B151EE S1230DE027043C8420EFD6114FA40FB1802218D610C7A47FBF82B784B680ABOC97B684E7BO S1230E00543A802B05BE825A20DCOOA90703A908A6582002A641B763AE12BF7F05A904A61F S1230E2023AD563DA82711AD5ACD09B83AA82B082702A43FAD2B20F107A908A62CAD3AA6B5 S1230E4058AD365FE654CD08115CA31D23F6AEOACDOA1E3F52AEA5CDOA1581CD0973CD0982 S1230E6050BE7FB783AD06B683A40F200444444444AB30A1392302AB07E7545CBF7F81A64C S1230E802CADF6A624ADF281ADF9B68FA43FADD1B690ADCD8130002F212E0035042D34239D S1230EA0002742001F3F203922020F2B3C25003201292A2C3E1A181B061C17190B11050798 S1230EC01508090A161312140EOD37384440000000411D3A1E3B36313D433A53263D3FABCF S1230EE0CDOD0E3FA8CDOBD6CDOC00A10D2608B6534CCDO97520E9A12E27223A7F3F8O3F44 S123OF0O51AEFFCD0C00CDOC0D5CD6114FA10F23O4A40F3C51B18O270722EE3C52CC0C5F33 S1230F20D610C727F6A47FB18322F026DCD610C72B043C8020CDD6114F44444444B782BEB6 S1230F40515AD611D7B751B682A104261CCDOCOOCDOCODA1412708A1582611A6202002A66F S1230F6010BB51B751A601B782CDOCOOA12E2704A10D2609B6824A26A23A7F2004A12026AB S1230F809AB68248BB8297DCOF87CC1080CCOFFCCCOFB6CC101CCC10B8CC1033CC1038ADCD S1230FA060CDOC163D7D2604B683A12C2632B67EB782A6022002A601B7A8CDOC16B6A84CC6 S1230FCOCD0975B67DB775B67EB776CD098F2613B676CD0981B675B0732605B674402B13CO 
S1230FE0CC0F1BB676B074B774B675B27326F1B6742BEDB77EB682B77DCC1083AD03CC1O47 S1231000BACDOC16B67EA40FA1002525A107222148BB51B751B683A12C267281CDOCOOA171 S12310202C26034F20343CA83A7FCDOC163D7D2727205ACDOC002007CDOCOOA123277BA1C6 S12310402C27153CA83A7FCDOC16A6103D7D27043CA8AB10BB51B751A610B782B683A12C55 S1231O602621CD0C(X)CD0C0DA1582621B68202A807AB20O0A802AB2OBB51B75120023AABE2 S1231080CDOC00B683AlOD2707A12E27πCCθπBCD0E55B651CD094ACD09733AA82B095F6C S1231OA0OOA8015CE67D20EDCDOE55CD0D0E3DAB2603CC0EEFCCOC5F20D3CD0C163D7D26CC S12310C0CC3A7F3CA820B94144C3C44EC453CCD24243C34CD2D345D14843C3D3C9D349C89E S12310EOCCD44CCFD34DC3C9D34EC55OCC52C1434CD2CE5345D45345D4D2434CC3C9D24D5D S1231100D04FCD50D84445C3D8454FD24643C2494EC3D84A4DD053D24C44C1D853CCD24DBD S123112055CC4E45C74FD04F52C1C7524FCCD253D054C9D35342C345C3C954C14FD0D8550C S1231140C257C95441D853D458C1574149D4000001727201720142420001320223320132AD S12311600102333332320132327201323201323232013201320132020384320203840102EB S1231180233200011212420172014201720001421200017200015200014212000162016230 S12311A0000172720142420001120001420112000172520001424201120112120001720162
S12311C01212OT620213620172OT12(XXni2014201120001021300A9ABA438372411252720 S12311E0282922242F2EA525232C2B2D262A2001210010AD989A3FA133A33A5AA8013C5C7A S1231200ACADA6AE383442309DAA0039369C8081A2999BA78EAFA083973D9F8F3A532B1243 S1231220CD0AlD5IΕ673E785E674E7865αC3A532AF2CD08E4AE13CE)0899A673CD0811A6DD S12312403DCD08π3F82O50A2627()8CD0871AE40CD0899BE82A3rø S12312603A532B0D26F6CI-iOA3926π6F856F8620ClCD0AlD20BC3A532B12264ACD0A07AE02 S1231- 8004CIX)9E4<-ΪX)973B676<-D094ACi-W^ S12312A08E3D522621AE0AA60CCC0A54BE535A2B1C2613BE74BF9B270DCD09EDAEA5CD0A9A S12312C009CD0A3927D13C523F9BCC0C5F5C5C20E4BE535A2B0A26EEBE74BF9A27E820B977 S12312E05<^C20F6BE535A2B5A5A27052A55CDOA07CD098F274F3F7F3F80CD08E4CD0871F2 S1231300AE40CD0899AD41CD09504D2B08A1202504A17F2502A62EBE7FCDOE79CD0950CD50 S123132(X)851A620CD0811CD09733C800980D6y^E41CD08995FADllE654CD08115CA30F2375 S1231340F420AE3C52CCOC5F3DB027153FBOB623A47FA11326070120FDB623A47FA1182713 S12313600981CD(tøE4AElFαXtøA4CCOC5F3C5220F9α^ S123138013D4272D-»*CB18326F6B67ECI )94A3D82270BCDW7FB67DCD094ACD0973BE80B638 S12313A083A13D2726A15E270DA10D2713A12E27023C52B68381CD097F5A2AF7AE0420F38B S12313C0CDO9735CA30423EB5F20E83D8227E4CD097F20DF5E3D2E0DO03A5326613F82CDB2 S12313ECXWE CD0871AD8A3D522655A12E26F0204FBE532649BF80CD09A35D260ACD08E4A4 S1231400CD08CCAD9820EEA30427015FBF82CD08E4BE80D608DECD08F7CD0811CD13713D72 S123142052261DA12E26C-Ε2017BE53A303260FCDrø8F270CB678CD094ACD097320F13C525F S1231440CC0C5F3C52CC0C5F3FABAE01B683A10D2704CDOC0097BF83CDO8E4AD18012OFB95 S123146004211D3FABB62397A47FB18327D70D24FDBF2720E6012407B627B7230D20FD81CF S1231480A6O0B724A680B72420D1CD08E4B683A10D27B0A12026ACCD0C00CD0C0DA15426F1 S12314A00410A32017A148269A12A33C7FCDOCOOA10D2705CD081120F4CD08E43FA8CD08FF S12314COC10A15326F9CD0800A1392706A13126EE20023CA83FA7AD30A003B780AD2AB7738F S12314EOAD26B7743A802BOAAD1ECD094ACD097320F2BEA7BF80AD103DA8260743B18027A7 S12315(X)BD3C-523FA3CCOC5F3F7EAD09AD07BBA7B7A7B67E81CD0800CDOC243DAA2BE28180 
S1231520B6A41F5OA1FF26O3105080A5012605AB031E5O83AD00AB0220E83C52CC0C5F207E S1231540F53F50A6FAB7A43F73AB01B774A6E8CD094A3FA33FBOCDOA1DAD39A60CB7ADB759 S1231560AECD15795FCD08E413A33F9B3F523F9A3FA23F5O1450CC13671E211E25A6E0B773 S123158021B7251F211F25A64097BAADB7219FBAAEB72581AEFFA6FF4A26FD5A26F88111DD S12315A0500Iϊ5O9BC6OlO-HAB7A49CCIX)9EDCDO97FCDOA07A60CCD0A3B-r7O9CD^ S12315C0277B1A50C»95∞EAEOCA60ACDOA75AEOACDOA1E1750075∞3CIDOA72C1M9F4CDOA86 S12315E0130A5O4F3F5O145OCD0A392620B69BA0012512B6A5B1732608B6A6B17426023A29 S12316009B3TOB262BAE13CIX»E4Cα5683DA2260FB69AA001250-t2-^ S12316203FA2CCOA513A9ACDODOECD08A7CC1299CC1297CD1579CD08E4AE1920CDB6A4AOB8 S123164CK)5B7A4AEC)6BF82C-X>09DDB7759FA00597CI )9E4BE825C^30823EBCD09CTCD09F40 S1231660CC15203A5326245FA3162736D61B21B17D27045C5C2θπ5CD61B21B17E26F554D9 S12316809FAB03B7AEAA40B7252019CD08E4B6AEA0C 897I^lB21CI-i0851D61B22CD085185 S12316A02∞23C32CCOC5F5FCD()8E4CD08E4D61BCA27()6CD08115C20F55FCD08E4D61CB66F S12316CO2706CTJ )8115C20F55FCD()8E4D61IMF27()6CI )8115C20F55FCD08E4D61E->827064
S12316EOCD08115C20F55FCD08E4D61F042706CD08115C20F5CCOC5F9D9D9D9D9D9D9D9DDB S12317O09D9DA6FFB7055FD63F00E7505CA30626F65FAD1EE7565CA3O526F73D562632AE3D S12317204EBF1CF7C70A003355CD3F85A655AD0820FEOB10FDB611810F10FDB71181B659F9 S1231740B151250B2206B65AB1522503A601814F81B657B751B702B658B752B700ADD3B725 S123176057BE56A301271CA3022704j 3032628A601B755B657AlFF2705CD3F853πCCD3FCB S1231780962006CD1F89CD3F B657B701ADA9ADAD26CAA655ADA120FEA1A6FF204DA6222B S12317AOB7073FB23FB3B6B1A1042620A602B700A607B70415001000A6FFAEFF21FC21FABC S12317C05A26F94A26F4CD1900CC19A6CD1A89CD1732B673CD1738B674CD17383DB126BAD8 S12317E0CD0871A6ODCD0811CD095OCD1738CD1732B783B6B12708A1012719A102271CCD49 S12318000950B1832705A601B7B281CD1AAFCD098F26BF81B683CD094A20FOB683A1FF27BO S1231820EAA6(πB7B381CCT8F5CCTA47AE8FCD0A09CD198F270FCD08E4AE64CDO899CD084C S123184000A10D26E1AE8FCDOA153FB1CD08E4CD179EAE8FCDOA15CC19F8CD18E5B651A103 S123186OB627C9CD08E4AE84CD0899A602B7A9CD19C0270FCD08E4AE64CD0899CD0800A1D2 S12318800D2654B68FB773B690B774CD1A52A601BE51A3C427064CA3C927014CCD17385FB3 S12318A0AD38CIX)8E4B610B611CI )950CD1738CD0871A60DCD()811CDlAAFCD1732CDO98FC7 S12318C026E7B68FB773B690B774B651A1C926043AA926AFCC1A17CC18F5E673CD17385C1D S12318EOA30426F681BE535A2716A3022605CD098F26233C521100A600B70FB704CCOC5F7D S1231900B673B777B674B7783F73AE20BF74AE3FBF75AEFFBF76B678B751B677B7AC81B660 S123192053A10126CEB674B751A1C4270BA1B626C2A604B7B1CC179ECD1A524FB7AFCD174D S1231940385FAD96CD1732CD1900CD08E4AE84CE)0899CD19COCC18F5A601B7B1CD179ECC78 S123196018F5CD18E5B651A1B627EDCD1A7BCD1732BE8FB373220B2506BE90B3742203CD70 S1231980094ACD1AAFCD098F26E43F52CC18F5A602B7B1CD08E4AE84CD0899CD179E3DB3A6 S12319A02703CD19ED81ADE72013CD18E5B651A1B627F3CD08E4AE84CD0899AD03CC18F5BA S12319COCD1A7BCD1732BE8FB37322172506BE90B374220FBE51A3C826054D26102004A121 S12319E0FF260ACD1AAFCDO98F26D84F81AE45CD08E4CD0899A6O181A603B7B1CDO8E4AE31 S1231A007ACD0899CD179E3DB22744CC1A47CD18E5B651A1B627E1CD08E4AE7ACD0899CD85 
S1231A201A7BCD1732BE8FB37322112506BE90B3742209B7AFCD0950B1AF260BCD1AAFCD06 S1231A40O98F26DECC198AAE54CD08E4CD0899CC18F53F00A6O3B704AD36A603AEFF5A2619 S1231A60FD4A26F8C61799AB02CD1738AEFF5CD616F8CD1738C3179926F481B673B78FB682 S1231A8074B79∞F73A620B774A606B7004CB7043FOEB651A1B62604A6C02002A630B70D83 S1231AA0160n00042424242140FB610B61181B673260ABE74A34F2604AEFFBF743C742657 S1231AC0254CBE51A3C42712A3B62608A1022604A6082010A13F260C2004A1112606A6F006 S1231AE0B774A63FB773814153CD42C642D2C74C4F41C44DC44DCD4E4F42D2D052C452CD62 S1231B0053504545C4D454CD50524FC742554CCB434843CB564552C6434F50D948454CD0C5 S1231B200O)150030006(X)120CX)0002400u00048000000 S1231B4037303543392∞.i2E33(X)42726B70740041626F7274(X)526567732000494C4C45E4 S1231B6O47414C2F494F 3554646494349454E5420454E545259002020202000504152546E S1231B80204E4F5420424C414E4B00444F4553204E4F542056455249465900484954205220 S1231BAO455455524E20544F2050524F4752414D0O564552494659494E470O424C414E4B7D S1231BC020434845434B494E4700425245414B203D2041626F727420636F6D6D616E642CA0 S1231BE0204354524C2D41203D2045786974207472616E73706172656E74206D6F64652C44
S1231C002M354524C2D482(BD2()4261636B73706163632C200DOA4354524C2D53203D2067 S1231C20467265657A65207363726 656E2C204354524C2D58203D2043616E63656C206353 S1-131C406F6D6D616E64206C696E650D0A41534D203C535441525420414444523E2D205341 S1-Ω1C60696E676C65206C6%E652()617373656D626C652F64^ S1231<-»00A4246203C5354415--5420414444523E203C454E4420414444523I-203C44^^ S1231CA0413E2D20426C6F636B20666*%C6C206D656-%F7279004--52205B3C414444523152 S1231CC0202D2D41444452353E5D2D20536574203120746F203520627265616B706F6%EAB -n231CE074730D0A4-554C4B2røC4445564943453E2D2042756C6B20657261736^ S1231D005520454550524F4IX)DOA4348434B205B3C535441525420414444523E2røC4 S1231D2044204144445231ϊ5D203C4445564943453E2D20426C616E6B2C)636-te5636B204D7B S1231D404355204F54.iF45-»F454550524F4IX)0434F5059205B-K^^ S1231I)6∞IΪ2∞C454E4420414444523E5D203C 4455o494-^3E2I>136F7079204D435520El S1231D804F542F452F454550524F4D20746F206D656DODOA47205B3C535441525420414487 S1231DAC)44523E5D2D204578ci56375746520757365722070726F6772616DODOA4C4F4144AO S1231DC02∞C504F52543E205B3D3C544558543IΪ5D2D20446I776E6C6F6164202848296F9F -n231DE07374206I*722028542965726D696E61-<:2C)706F7274-t0746F206D-^ S1231ECJ0203C53544152542M14444523E205B3C454E4420414444523E5D2D20446973701A S1231E206<:6179206D656D004mD20-K:414444524553533E2D204I^F64696679206D656DAB S1231E4∞I )A4E4F4252205B3C4144445231202D2041444452353E5D2D2052656D6F766520 S1231E6020627265616B706F696E74730D0A50205B3C434F554E543E5D2D2050726F636514 -π231E80656420312D46462074696D6573207468726F756768206120627265616B706F691A S1231EAC»E740IMA50524F47205B3C535441525420414444523E203C454E442041444452C1 S1231EC03E5D203C4445564943453E2D2050726F6772616D204D4355204F542F452F4545FF S1-31EE050524F4D2066726F6D206D656IWDOA52442D20526567697374657220646973705E S1231FCK)6C617900524D2D205265676973746572206D6F646966790DOA5350454544205B9B S1231F203C424155442(K24154453E5D2D20446973706 :61792F73656C65637420686I:7322 S1231F407420626175642072617465OD0A54205B3C434F554E543E5D2D20547261636520DF 
S1231F6O312D464620696E737472756374696F6E730DOA544D205B3C455849542043484189 S1231F805241435445523E5D2D20456E746572207472616E73706172656E74206D6F6465FF S1231FAOODOA5645524^205B3C53544152-j*420414444523E203C454E442(Ml*-444523E5IDOC Sl-»31Fα)203C4445564943453E2D566572696679204D4355204F542F452F454550524F4DDE S10D1FE020746F206D656DO0O00E83 S10D1FF6154115411541159F1541D1 S9030000FC
Claims
1. A method of creating and comparing voice prints for determining a level of comparability between the prints in order to provide an access step if the level of comparability is at least as great as a predetermined level, comprising the steps:
a) converting a verbal utterance of a specific word or phrase of about two seconds duration into an electronic representation through the use of a microphone or other transducer;
b) deleting portions of the electronic representation which have zero signal level;
c) sampling the electronic representation of the utterance with deleted portions to produce a time domain sampling set;
d) storing the time domain sampling set digitally;
e) converting the electronic representation from an analog to a digital form;
f) transforming the digital form of the electronic representation through a Fast Fourier transform device to produce a frequency domain electrical signal representation;
g) sampling the frequency domain electrical signal representation to produce a frequency domain sampling set;
h) forming a first voice print as a composite of the time domain and the frequency domain sampling sets;
i) repeating the steps (a) through (g) to form a second voice print as a composite of a second time and a second frequency domain sampling sets resulting from the repeated steps;
j) comparing corresponding elements from the first and the second sampling sets of the first with the second voice print to establish a level of comparability of the two prints;
k) enabling a security access if the level of comparability in step (j) is at least as great as a predetermined level of comparability, and denying said access otherwise.
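The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how "comparability" is computed, so the tolerance-based score below, the function names (`make_voice_print`, `comparability`, `grant_access`), the sampling-set sizes, and the threshold of 0.9 are all assumptions chosen for illustration. The sketch also starts from already digitized samples and truncates the signal to a power-of-two length before the transform, which the claim does not require.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def make_voice_print(samples, n_time=16, n_freq=16):
    """Steps (b)-(h): return (time_set, freq_set) for one utterance."""
    # (b) delete portions with zero signal level
    active = [s for s in samples if s != 0]
    if not active:
        raise ValueError("utterance contains no non-zero samples")
    # (c)/(d) sample the remaining signal into a fixed-size time domain set
    step = max(1, len(active) // n_time)
    time_set = active[::step][:n_time]
    # (f) transform the largest power-of-two prefix (assumption of this sketch)
    size = 1 << (len(active).bit_length() - 1)
    spectrum = [abs(c) for c in fft(active[:size])[: size // 2]]
    # (g) sample the magnitude spectrum into a frequency domain set
    fstep = max(1, len(spectrum) // n_freq)
    freq_set = spectrum[::fstep][:n_freq]
    # (h) the voice print is the composite of both sets
    return time_set, freq_set

def comparability(print_a, print_b, rel_tol=0.1):
    """Step (j), assumed metric: fraction of corresponding elements that agree."""
    matches = total = 0
    for set_a, set_b in zip(print_a, print_b):
        scale = max(max((abs(v) for v in set_a + set_b), default=0.0), 1e-12)
        for a, b in zip(set_a, set_b):
            total += 1
            if abs(a - b) <= rel_tol * scale:
                matches += 1
    return matches / total if total else 0.0

def grant_access(print_a, print_b, threshold=0.9):
    """Step (k): allow access only at or above the predetermined level."""
    return comparability(print_a, print_b) >= threshold

# Two synthetic "utterances": tones at different frequencies with leading silence.
tone_a = [0.0] * 8 + [math.sin(2 * math.pi * 4 * (i + 0.5) / 64) for i in range(64)]
tone_b = [0.0] * 8 + [math.sin(2 * math.pi * 12 * (i + 0.5) / 64) for i in range(64)]
print(grant_access(make_voice_print(tone_a), make_voice_print(tone_a)))
print(grant_access(make_voice_print(tone_a), make_voice_print(tone_b)))
```

A real system would also have to align utterances of different lengths before comparing corresponding elements; the sketch ignores that.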
2. The method of claim 1 including the further steps, after step (i) of:
j') calculating a first statistical variance of the elements of the time domain sampling set of the first voice print;
k') calculating a second statistical variance of the elements of the time domain sampling set of the second voice print;
l') taking the arithmetical difference between the first with the second statistical variances and continuing with steps (j) and (k) only if the difference is less than a selected number, access being denied otherwise, whereby steps (j'), (k') and (l') provide a quick method of discriminating between the first and second voice prints before execution of more time consuming steps.
3. The method of claim 1 including the further steps, after step (i) of:
j") calculating a first total spectrum energy content, a first band pass energy content of each of the filtered subsets of the frequency domain sampling set of the first voice print, and a statistical variance of said first filtered subsets;
k") calculating a second total spectrum energy content, a second band pass energy content of each of the filtered subsets of the frequency domain sampling set of the second voice print, and a statistical variance of said second filtered subsets;
l") taking the arithmetical difference between the first and the second total spectrum energy content, and the arithmetical difference between the variances of the first and the second filtered subsets, and continuing with steps (j) and (k) only if the difference is less than a selected number, access being denied otherwise, whereby steps (j"), (k") and (l") provide a quick method of discriminating between the first and second voice prints before execution of more time consuming steps.
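The quick pre-screens of claims 2 and 3 might look like the following sketch. Several details are assumptions: the "filtered subsets" are modeled as equal-width contiguous slices of the frequency domain sampling set, energy is taken as the sum of squared magnitudes, and the spectral test requires both differences to fall below their limits (the reading that claim 6 states explicitly). The function names and limits are hypothetical.

```python
import math
from statistics import pvariance

def passes_time_prescreen(time_a, time_b, max_diff):
    """Claim 2: compare the variances of the two time domain sampling sets."""
    return abs(pvariance(time_a) - pvariance(time_b)) < max_diff

def band_energies(freq_set, n_bands=4):
    """Energy in each contiguous band (assumed stand-in for band pass filtering)."""
    width = max(1, math.ceil(len(freq_set) / n_bands))
    return [sum(v * v for v in freq_set[i:i + width])
            for i in range(0, len(freq_set), width)]

def passes_spectral_prescreen(freq_a, freq_b, max_energy_diff, max_var_diff):
    """Claim 3: compare total spectrum energy and the variance of band energies."""
    energy_diff = abs(sum(v * v for v in freq_a) - sum(v * v for v in freq_b))
    var_diff = abs(pvariance(band_energies(freq_a)) - pvariance(band_energies(freq_b)))
    return energy_diff < max_energy_diff and var_diff < max_var_diff
```

Both checks are deliberately coarse. Two spectra with the same energy concentrated in different bands can still pass the spectral pre-screen, which is why the element-by-element comparison of steps (j) and (k) follows.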
4. A method of creating and comparing a challenge voice print with a collection of enrollment voice prints for determining if the challenge voice print meets a level of comparability with any one of the enrollment voice prints in order to provide an access step if the level of comparability is at least as great as a predetermined level, comprising the steps:
a) converting a verbal utterance of a specific word or phrase of about two seconds duration into an electronic representation through the use of a microphone or other transducer;
b) deleting portions of the electronic representation which have zero signal level;
c) sampling the electronic representation of the utterance with deleted portions to produce a time domain sampling set;
d) storing the time domain sampling set digitally;
e) converting the electronic representation from an analog to a digital form;
f) transforming the digital form of the electronic representation through a Fast Fourier Transform device to produce a frequency domain electrical signal representation;
g) sampling the frequency domain electrical signal representation to produce a frequency domain sampling set;
h) forming an enrollment voice print as a composite of the time domain and the frequency domain sampling sets;
i) repeating the steps (a) through (h) a plurality of times by a corresponding plurality of individuals to form a data base of enrollment voice prints, each said voice print representing one said individual;
j) repeating the steps (a) through (g) to form a challenge voice print of an unknown individual;
k) comparing corresponding elements in turn from each of the enrollment sampling sets, in order, with the challenge sampling set to establish a level of comparability between each of the enrollment voice prints with the challenge voice print;
l) enabling a security access if the level of comparability in step (k) with any said comparing of elements is at least as great as a predetermined level of comparability and denying said access otherwise.
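The one-against-many comparison of claim 4 can be sketched as a loop over a database of enrollment prints. The claim does not define the comparison metric, so the tolerance-based `comparability` score below is an assumption, as are the function names, the flat-list representation of a voice print, and the 0.9 threshold.

```python
def comparability(print_a, print_b, rel_tol=0.1):
    """Assumed metric: fraction of corresponding elements that agree."""
    scale = max(max(abs(v) for v in print_a + print_b), 1e-12)
    hits = sum(1 for a, b in zip(print_a, print_b)
               if abs(a - b) <= rel_tol * scale)
    return hits / max(len(print_a), len(print_b))

def verify_challenge(challenge, enrolled, threshold=0.9):
    """Steps (k)-(l): compare the challenge with each enrollment print in turn.

    Returns (granted, best_id, best_score); access is granted if any
    enrollment print reaches the predetermined level.
    """
    best_id, best_score = None, 0.0
    for speaker_id, enrolled_print in enrolled.items():
        score = comparability(enrolled_print, challenge)
        if score > best_score:
            best_id, best_score = speaker_id, score
    return best_score >= threshold, best_id, best_score

# Hypothetical enrollment database: one print per individual.
enrolled = {"alice": [1.0, 2.0, 3.0, 4.0], "bob": [4.0, 3.0, 2.0, 1.0]}
print(verify_challenge([1.0, 2.05, 3.0, 3.9], enrolled))
```

Tracking the best score is equivalent to the claim's "any one of the enrollment voice prints" test, and it also reports which enrolled individual matched.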
5. The method of claim 4 including the further steps, after step (i) of:
j') calculating a first statistical variance of the elements of the time domain sampling set of each of the enrollment voice prints;
k') calculating a second statistical variance of the elements of the time domain sampling set of the challenge voice print;
l') taking the arithmetical difference between the each of the first with the second statistical variances and continuing with steps (j) and (k) only if the difference is less than a selected number for at least one of the arithmetical differences, access being denied otherwise, whereby steps (j'), (k') and (l') provide a quick method of discriminating between the enrollment prints and the challenge voice prints before execution of more time consuming steps.
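Applied to an enrollment database, the variance test of claim 5 acts as a filter: only enrollment prints whose time domain variance is close to that of the challenge need the full comparison, and an empty result means access is denied immediately. The sketch below assumes a dictionary of time domain sampling sets keyed by a hypothetical speaker id; the name `shortlist` and the limit are illustrative.

```python
from statistics import pvariance

def shortlist(challenge_time, enrolled_time_sets, max_var_diff):
    """Return the ids of enrollment prints that survive the variance pre-screen."""
    target = pvariance(challenge_time)
    return [speaker_id
            for speaker_id, time_set in enrolled_time_sets.items()
            if abs(pvariance(time_set) - target) < max_var_diff]

# Hypothetical database of time domain sampling sets.
enrolled_time_sets = {"alice": [1, 2, 3, 4], "bob": [10, -10, 10, -10]}
print(shortlist([1.1, 2.0, 3.0, 4.1], enrolled_time_sets, 0.5))
```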
6. The method of claim 4 including the further steps, after step (i) of:
j") calculating a first total spectrum energy content, a first band pass energy content of each of the filtered subsets of the frequency domain sampling set of each of the enrollment voice prints, and a statistical variance of said first filtered subsets;
k") calculating a second total spectrum energy content, a second band pass energy content of each of the filtered subsets of the frequency domain sampling set of the challenge voice print, and a statistical variance of said second filtered subsets;
l") taking a first arithmetical difference between the first and the second total spectrum energy content, and a second arithmetical difference between the variances of the first and the second filtered subsets, and continuing with steps (j) and (k) only if each of the first and the second differences is less than selected numbers respectively, access being denied otherwise, whereby steps (j"), (k") and (l") provide a quick method of discriminating between the first and second voice prints before execution of more time consuming steps.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU47606/96A AU4760696A (en) | 1995-01-19 | 1996-01-18 | Speaker verification method |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US37539895A | 1995-01-19 | 1995-01-19 | |
| US37481195A | 1995-01-19 | 1995-01-19 | |
| US08/374,811 | 1995-01-19 | | |
| US08/375,398 | 1995-01-19 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO1996022595A1 (en) | 1996-07-25 |
Family
ID=27006763
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US1996/000709 Ceased WO1996022595A1 (en) | 1995-01-19 | 1996-01-18 | Speaker verification method |
Country Status (2)
| Country | Link |
|---|---|
| AU (1) | AU4760696A (en) |
| WO (1) | WO1996022595A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US3673331A (en) * | 1970-01-19 | 1972-06-27 | Texas Instruments Inc | Identity verification by voice signals in the frequency domain |
| US3896266A (en) * | 1971-08-09 | 1975-07-22 | Nelson J Waterbury | Credit and other security cards and card utilization systems therefore |
| US4449189A (en) * | 1981-11-20 | 1984-05-15 | Siemens Corporation | Personal access control system using speech and face recognition |
| US4833713A (en) * | 1985-09-06 | 1989-05-23 | Ricoh Company, Ltd. | Voice recognition system |
| US5313556A (en) * | 1991-02-22 | 1994-05-17 | Seaway Technologies, Inc. | Acoustic method and apparatus for identifying human sonic sources |
-
1996
- 1996-01-18 WO PCT/US1996/000709 patent/WO1996022595A1/en not_active Ceased
- 1996-01-18 AU AU47606/96A patent/AU4760696A/en not_active Abandoned
Also Published As
| Publication number | Publication date |
|---|---|
| AU4760696A (en) | 1996-08-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US5548647A (en) | Fixed text speaker verification method and apparatus | |
| US10476872B2 (en) | Joint speaker authentication and key phrase identification | |
| US5835894A (en) | Speaker and command verification method | |
| US6480825B1 (en) | System and method for detecting a recorded voice | |
| US6519565B1 (en) | Method of comparing utterances for security control | |
| JP3080388B2 (en) | Unknown person identity verification method | |
| US5893057A (en) | Voice-based verification and identification methods and systems | |
| JP2008509432A (en) | Method and system for verifying and enabling user access based on voice parameters | |
| WO2000077772A2 (en) | Speech and voice signal preprocessing | |
| US10957318B2 (en) | Dynamic voice authentication | |
| US6161094A (en) | Method of comparing utterances for security control | |
| JPS62502571A (en) | Personal identification through voice analysis | |
| Trysnyuk et al. | A method for user authenticating to critical infrastructure objects based on voice message identification | |
| WO1996022595A1 (en) | Speaker verification method | |
| Ahmad et al. | The impact of low-pass filter in speaker identification | |
| Corsi | Speaker recognition: A survey | |
| Markowitz | The many roles of speaker classification in speaker verification and identification | |
| Aliyu et al. | Development of a text-dependent speaker recognition system | |
| Acevedo et al. | Speaker Verification Using Pitch and Melspec Information | |
| Jin et al. | A high-performance text-independent speaker identification system based on BCDM. | |
| Feustel et al. | Voice-based security: identity verification over telephone lines | |
| CN118609563A (en) | A robot temporary control method and system based on dynamic password voiceprint authentication | |
| JPS58189700A (en) | Private collator | |
| Digavadekar et al. | Authentication of Fingerprint Recognition Using Natural Language Processing | |
| Khan et al. | Enhancing Security Via Speaker Recognition |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AM AT AU BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU JP KE KG KP KR KZ LK LR LT LU LV MD MG MN MW MX NO NZ PL PT RO RU SD SE SI SK TJ TT UA US UZ VN |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | REG | Reference to national code | Ref country code: DE. Ref legal event code: 8642 |
| | 122 | EP: PCT application non-entry in European phase | |