ISSN 0021-3454 (print version)
ISSN 2500-0381 (online version)
Summaries of the issue

SPEECH ANALYSIS SYSTEMS

7
Language features relevant to automatic language and speaker recognition are identified. A language recognition method is proposed, based on statistical parameters of the pitch stress pattern of a given language. Typical ranges of parameter variation are compared for languages of different language families.
DETERMINATION OF CHANNEL-INDEPENDENT INFORMATION INDICATORS Vitaliy V. Kiselyov, Andrey V. Tkachenia, Mikhail V. Khitrov
12
Information indicators of speech are analyzed with the aim of creating a channel-independent feature space that improves the efficiency of speaker recognition systems. For the problem of determining the similarity between several audio recordings, the optimal set of channel-independent information feature vectors is determined experimentally using dynamic time warping.
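As an illustration only, the dynamic time warping comparison mentioned in this summary can be sketched as follows. This is a minimal textbook version, not the authors' procedure: the function name `dtw_distance`, the Euclidean frame distance, and the list-of-lists feature format are all assumptions made for the example.

```python
def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature vectors.

    `a` and `b` are lists of equal-dimension float lists (hypothetical format).
    The Euclidean frame distance is an assumption of this sketch.
    """
    def frame_dist(x, y):
        return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) ** 0.5

    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_dist(a[i - 1], b[j - 1])
            # allowed steps: insertion, deletion, match
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

A lower accumulated cost indicates that two recordings are more similar under the chosen features; a sequence compared with a time-stretched copy of itself yields zero cost.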
17
The problem of data balancing in the training of acoustic models for an automatic speech recognition system is considered. A metric is proposed that explicitly accounts for the amount of data in a cluster during triphone clustering. The proposed approach is shown to improve the quality of speech recognition.
CROSS-VALIDATION STATE CONTROL IN ACOUSTIC MODEL TRAINING OF AUTOMATIC SPEECH RECOGNITION SYSTEM German A. Chernykh, Maksim L. Korenevsky, Levin Kirill E, Irina A. Ponomareva, Tomashenko Natalia A
23
A technique is presented for optimizing the size of Gaussian mixture models (GMM) during the training of hidden Markov models (HMM), an essential part of many automatic speech recognition systems. Applying the technique increases recognition accuracy by avoiding over-fitting and significantly reduces the computational load of the recognition procedure.
STATISTICAL METHODS FOR AUTOMATIC PROSODIC BREAK DETECTION IN A TEXT-TO-SPEECH SYSTEM Pavel Chistikov, Olga G. Khomitsevich, Sergey V. Rybin
28
The use of statistical methods for predicting the positions and durations of prosodic breaks in a text-to-speech system is proposed. The methods are shown to yield better results than a baseline rule-based system.

SYSTEMS OF SPEECH AND ACOUSTIC SIGNAL PROCESSING

TIME DELAY ESTIMATION OF AUDIO SIGNALS USING THEIR ENVELOPES Sergey A. Aleynikov, Mikhail B. Stolbov
33
The problem of time delay estimation for dual-channel acoustic signals (speech, music, etc.) recorded under reverberant conditions is investigated. A method of time delay estimation based on cross-correlation of temporal envelopes of the signals is presented. Comparison with other known methods of time delay estimation is provided.
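As an illustration only, the idea of estimating a delay from the cross-correlation of temporal envelopes can be sketched as follows. This is not the authors' algorithm: the envelope extractor (rectification plus moving average), the raw unnormalized correlation, and the names `envelope` and `estimate_delay` are assumptions made for the example.

```python
def envelope(signal, window):
    """Crude temporal envelope: full-wave rectify, then moving-average smooth."""
    rect = [abs(s) for s in signal]
    out = []
    acc = 0.0
    for i, v in enumerate(rect):
        acc += v
        if i >= window:
            acc -= rect[i - window]  # drop the sample leaving the window
        out.append(acc / min(i + 1, window))
    return out


def estimate_delay(x, y, max_lag, window=8):
    """Return the lag (in samples) by which y trails x.

    The lag is chosen to maximize the raw cross-correlation of the two
    envelopes over [-max_lag, max_lag]. No mean removal or normalization is
    applied, which is an assumption suited to signals that are silent near
    the edges of the analysis frame.
    """
    ex, ey = envelope(x, window), envelope(y, window)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(ex)):
            j = i + lag
            if 0 <= j < len(ey):
                score += ex[i] * ey[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Working on envelopes rather than raw waveforms makes the correlation peak depend on slow amplitude changes instead of fine waveform structure; this sketch makes no claim about how it compares with the methods evaluated in the article.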
SPEECH SIGNALS STOCHASTICITY AND ITS EVALUATION Sergey A. Aleynikov, Mikhail B. Stolbov
40
Known and newly proposed methods for evaluating speech signal stochasticity are analyzed. Results of statistical simulation demonstrate the advantages of the proposed approach over the known ones: the estimates obtained with the new method have lower variance and bias.
47
The functional safety of detecting the vibroacoustic signal of an arriving train with an energy detector is investigated. The lower bound of the detectability level is derived from a specified false alarm probability. The developed method is shown to be sufficient for detecting an arriving train in the case of long-welded rails.
53
A practical speech detection method for robust automatic speech recognition is proposed. The method employs a system of two symmetrical microphones oriented in opposite directions. The algorithm of signal processing allows for spatial filtering of speakers.

SPEAKER RECOGNITION SYSTEMS

EXPERT SYSTEMS AND METHODS FOR SPEAKER IDENTIFICATION Bulgakova Elena V, Krasnova Ekaterina V
58
Modern approaches to forensic phonographic examination are analyzed. Various software used for speaker identification is considered. The specialized phonographic editor SIS II, developed by the Speech Technology Center, is described.
CONCEPT OF THE NATIONAL VOICE ACCOUNTING AND VOICE BIOMETRIC SEARCH SYSTEM Dmitry V. Dyrmovsky, Sergey L. Koval, Mikhail V. Khitrov
63
The concept of a national voice accounting and voice biometric search system is presented.
70
The applicability of manifold learning methods, widely used in image recognition, to the problem of speaker identification is considered. Experimental studies are carried out and the results are analyzed.
EMPLOYMENT OF DTW-BASED HMM-GMM MULTI-SESSION TRAINING IN TEXT-DEPENDENT SPEAKER VERIFICATION Sergey A. Novoselov, Vladislav A. Sukhmel, Sholokhov Alexey Vladimirovich, Timur Pekhovsky
77
An HMM training procedure using several password utterances is proposed. The method is based on the Dynamic Time Warping algorithm and is shown to reduce verification system errors.
84
A method of spoofing a text-dependent voice verification system, based on the most popular TTS approaches (Unit Selection and HMM), is presented. The method is shown to produce a false acceptance error of 98–100% when the TTS database is sufficiently large. A distinctive feature of the method is that it can be fully automated if used in conjunction with a speech recognition system.