



Voice Transcription



The Automatic Voice Transcription System produces a phonetic transcription of speech from an unknown speaker and identifies the language being spoken.

The system consists of two modules:
  • A speaker-independent phoneme recognition system (40 phonemes in English);
  • A language identification system.
The phoneme recognition module is based on triphone Hidden Markov Models (HMMs) and uses continuous-density phoneme models.
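To illustrate what "triphone" means here, the following minimal sketch expands a phoneme sequence into context-dependent labels, one per phoneme, each naming its left and right neighbour. The `left-center+right` label notation, the `sil` boundary symbol, and the function name `to_triphones` are assumptions made for illustration; the actual model inventory of the product is not described on this page.

```python
def to_triphones(phonemes, boundary="sil"):
    """Expand a phoneme sequence into left-center+right context labels.

    The boundary symbol pads both ends so that the first and last
    phonemes also receive a left and a right context.
    """
    padded = [boundary] + list(phonemes) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    # Hypothetical phoneme sequence for the word "cat".
    print(to_triphones(["k", "ae", "t"]))
```

In a triphone system each such label would have its own HMM, which is how the same phoneme can be modelled differently depending on its neighbours.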

The language identification algorithm is based on a double bi-gram language model. This model tracks the probabilities of transitions between phonemes in the speech signal and compares them with the language matrix of each language in the system database. A language matrix consists of the transition probabilities between the phonemes of a given language.
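The idea of comparing observed phoneme transitions against per-language matrices can be sketched as follows. This page does not define the "double" bi-gram variant, so the sketch assumes an ordinary first-order bigram model with add-one smoothing and picks the language whose matrix gives the recognized phoneme sequence the highest log-likelihood. All function names, the toy phoneme set, and the toy training data are hypothetical.

```python
import math


def train_bigram_matrix(sequences, phonemes, smoothing=1.0):
    """Estimate P(next | current) from phoneme sequences with add-k smoothing."""
    counts = {a: {b: smoothing for b in phonemes} for a in phonemes}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1.0
    matrix = {}
    for a in phonemes:
        total = sum(counts[a].values())
        matrix[a] = {b: counts[a][b] / total for b in phonemes}
    return matrix


def log_likelihood(seq, matrix):
    """Sum of log transition probabilities along the sequence."""
    return sum(math.log(matrix[a][b]) for a, b in zip(seq, seq[1:]))


def identify_language(seq, language_matrices):
    """Return the best-scoring language and the score of every language."""
    scores = {
        lang: log_likelihood(seq, matrix)
        for lang, matrix in language_matrices.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    phonemes = ["a", "b", "c"]  # toy inventory, not the 40-phoneme set
    matrices = {
        "lang_x": train_bigram_matrix([["a", "b", "a", "b", "a", "b"]], phonemes),
        "lang_y": train_bigram_matrix([["a", "c", "a", "c", "a", "c"]], phonemes),
    }
    best, scores = identify_language(["a", "b", "a", "b"], matrices)
    print(best, scores)
```

Longer recordings contribute more transitions to the sum, which is one plausible reason identification becomes more reliable as the amount of speech grows.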

The system can be used effectively:
  • For automatic transcription of speech from unknown speakers in recordings of telephone conversations;
  • In security systems, where it is important to identify the language of an unknown speaker and produce a phonetic transcription of the speech;
  • In high-security applications, for instance where access to digital information is restricted to a defined group of people.

The language identification system is currently trained for English. Training for German, French, Chinese, Japanese and Russian is planned.

  • Operates at low SNR;
  • Fast adaptation to changes in channel distortion and external noise;
  • Speaker-independent;
  • Phoneme recognition accuracy approaching 75% on the training part of the TIMIT database (40 phonemes in English);
  • Language identification reliability approaching 95% for speech recordings of at least 10 seconds;
  • Real-time processing;
  • Easy integration with target applications.

Signal requirements
  • Signal format: 16-bit linear;
  • 8 kHz sampling rate;
  • SNR: at least 10 dB;
  • Frequency range: 300-3400 Hz or better.
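As an illustration, the two requirements that can be read from a file header (16-bit samples and an 8 kHz sampling rate) can be checked with Python's standard `wave` module before audio is passed on. This is a minimal sketch, not part of the product: `check_wav_format` and `make_silence_wav` are hypothetical helper names, and the SNR and frequency-range requirements cannot be verified from the header and are not covered here.

```python
import io
import wave

REQUIRED_SAMPLE_WIDTH_BYTES = 2  # 16-bit linear PCM
REQUIRED_SAMPLE_RATE_HZ = 8000   # 8 kHz


def check_wav_format(source):
    """Return a list of format problems; an empty list means the header matches.

    `source` may be a file path or a binary file-like object.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != REQUIRED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bits, expected 16")
        if wav.getframerate() != REQUIRED_SAMPLE_RATE_HZ:
            problems.append(f"sampling rate is {wav.getframerate()} Hz, expected 8000")
    return problems


def make_silence_wav(rate, sample_width, n_frames=800):
    """Build a small mono WAV file in memory, used only to demonstrate the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * n_frames * sample_width)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(check_wav_format(make_silence_wav(8000, 2)))
    print(check_wav_format(make_silence_wav(16000, 2)))
```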

  • DLL libraries for MS Windows;
  • PC demo for MS Windows is available on request.

For more information, please contact us via Online Request Form.