September 10, 2007

GritTec laboratory updates speaker identification technology.

   Automatic text-independent speaker identification technology identifies the unknown voice in a speech signal by comparing it pairwise with the 'speaker cards' stored in the system database. The comparison counts 'true' and 'false' spots (spots of correspondence) and then determines the probabilities of Acceptance and Rejection. Besides information about the speaker (first name, last name, birthday, gender, and so on), each speaker card contains sample audio files of the speaker's voice.
   Each sample audio file is described by an acoustic voice model, an error model (FAR, FRR, EER) and a noise model that describes the surrounding noise and channel distortion present in the file (see Fig.1). A full description of a speaker card requires 1 - 3 audio files of the speaker's voice, recorded over different telephone lines, each at least 60 sec long.
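   The speaker card layout described above can be sketched as a small data structure. This is only an illustration: the class names, field names and the `is_sufficient` check are assumptions made for this example and are not part of the GritTec SDK, and the voice and noise models are stood in for by plain lists of numbers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ErrorModel:
    """Hypothetical container for the per-sample error rates."""
    far: float  # False Acceptance Rate
    frr: float  # False Rejection Rate
    eer: float  # Equal Error Rate


@dataclass
class AudioSample:
    """One audio file attached to a speaker card (names are assumptions)."""
    path: str
    duration_sec: float
    voice_model: List[float]   # placeholder for the acoustic voice model
    error_model: ErrorModel
    noise_model: List[float]   # placeholder for noise / channel description


@dataclass
class SpeakerCard:
    first_name: str
    last_name: str
    birthday: str
    gender: str
    samples: List[AudioSample] = field(default_factory=list)

    def is_sufficient(self, min_duration_sec: float = 60.0) -> bool:
        """True if the card has at least one sample and every sample
        is at least min_duration_sec long (60 sec in the text above)."""
        return len(self.samples) >= 1 and all(
            s.duration_sec >= min_duration_sec for s in self.samples
        )


card = SpeakerCard("John", "Doe", "1970-01-01", "male")
card.samples.append(
    AudioSample("line1.wav", 75.0, [0.1, 0.2], ErrorModel(0.02, 0.03, 0.025), [0.0])
)
print(card.is_sufficient())
```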
   Tone and music detectors have been added to the algorithmic part of the speaker identification technology. The tone detector detects DTMF, CPTD, UMTD and other similar signals. The music detector detects the musical accompaniment played while telephone speakers wait to be connected.
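   The announcement does not say how the tone detector works internally. As a rough illustration of DTMF detection in general, the sketch below uses the well-known Goertzel algorithm to measure signal power at the eight standard DTMF frequencies. The function names, the 8000 Hz sample rate and the `min_share` threshold are assumptions chosen for this example, not details of the GritTec implementation.

```python
import math

LOW_FREQS = (697, 770, 852, 941)        # DTMF row frequencies, Hz
HIGH_FREQS = (1209, 1336, 1477, 1633)   # DTMF column frequencies, Hz
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, sample_rate, freq):
    """Squared magnitude of the signal at one frequency (Goertzel algorithm)."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_dtmf_key(samples, sample_rate=8000, min_share=0.1):
    """Return the DTMF key found in the frame, or None if no key is present.

    min_share is an assumed threshold on tone power relative to the
    total frame energy; it rejects silence, noise and single tones.
    """
    n = len(samples)
    energy = sum(x * x for x in samples)
    if n == 0 or energy == 0.0:
        return None
    low = [goertzel_power(samples, sample_rate, f) for f in LOW_FREQS]
    high = [goertzel_power(samples, sample_rate, f) for f in HIGH_FREQS]
    row = max(range(4), key=lambda i: low[i])
    col = max(range(4), key=lambda i: high[i])
    scale = n * energy
    if low[row] / scale < min_share or high[col] / scale < min_share:
        return None
    return KEYPAD[row][col]


def make_tone(freqs, sample_rate=8000, n=400):
    """Synthesize a test frame as a sum of unit-amplitude sine waves."""
    return [
        sum(math.sin(2.0 * math.pi * f * t / sample_rate) for f in freqs)
        for t in range(n)
    ]


print(detect_dtmf_key(make_tone((770, 1336))))  # the 770 Hz + 1336 Hz pair is key "5"
```

   A production detector would also check tone duration, twist between the two tones and harmonic content; those checks are left out here to keep the sketch short.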
   The technology for building statistical voice models and re-estimating them (with S-states) was updated in the speaker card module. Comparative analysis has shown that the updated voice models greatly increase the count of "true" and "false" spots and raise the probability of determining Acceptance and Rejection.
   The updated speaker identification technology was tested on real telephone recordings and on LDC96S61, a specialized speech corpus of English telephone recordings provided by the LDC (Linguistic Data Consortium).
   The architecture of the identification software modules was reworked and optimized for use in multi-threaded mode. During the rework, the modules were structured according to the functionality of each module. The new architecture allows end developers to build client-server applications and an identification server. In the identification server, each unknown speaker is identified in a separate thread, independently of the others.
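   The per-request threading model described above can be illustrated with a thread pool in which every identification request runs independently. This is a minimal sketch under stated assumptions: the function names are hypothetical, speaker cards are reduced to plain feature vectors, and cosine similarity with a fixed threshold stands in for the actual comparison, which the announcement does not detail.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine_score(a, b):
    """Placeholder similarity measure between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def identify(features, cards, threshold=0.8):
    """Compare one unknown voice against every card; accept the best
    match only if its score reaches the (assumed) threshold."""
    best_name, best_score = None, -1.0
    for name, model in cards.items():
        score = cosine_score(features, model)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


def identify_many(requests, cards, max_workers=4):
    """Run each identification request in its own worker thread.
    Results come back in the same order as the requests."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda f: identify(f, cards), requests))


cards = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
requests = [[0.9, 0.1], [0.1, 0.9], [1.0, 1.0]]
results = identify_many(requests, cards)
print([name for name, _ in results])
```

   The card database is only read during identification, so the worker threads need no locking in this sketch; a server that updates cards while identifying would have to add it.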
   At present, the automatic speaker identification technology is available for the Intel platform as an SDK library with example MS VC++ projects.

Fig.1. Structure of speaker card

   FRR - False Rejection Rate;
   FAR - False Acceptance Rate;
   EER - Equal Error Rate: EER = FRR = FAR;
   DTMF - Dual-Tone Multi-Frequency;
   UMTD - Universal Multi Tone Detection;
   CPTD - Call Progress Tone Detection.
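
   To make the error-rate definitions above concrete, the sketch below estimates FAR, FRR and EER from two lists of comparison scores. The function names and the sample scores are invented for illustration, and the threshold sweep is one common way to estimate the EER rather than the method used in the GritTec technology.

```python
def far_frr(genuine, impostor, threshold):
    """FAR: share of impostor scores accepted at this threshold.
    FRR: share of genuine scores rejected at this threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Sweep every observed score as a threshold and return
    (eer, threshold) at the point where FAR and FRR are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Invented scores: same-speaker trials and different-speaker trials.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)  # 0.25 0.6
```

   At the threshold 0.6, one of four impostor scores is accepted and one of four genuine scores is rejected, so FAR = FRR = 0.25, which is the EER for this toy data.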

About GritTec
GritTec Laboratory specializes in the research and development of algorithms and technologies in the field of speech and audio processing. GritTec's research is focused on speech enhancement, speech concealment, voice biometrics, speech recognition, speech synthesis and other speech and audio technologies.
Url: http://www.grittec.com