METHODOLOGICAL AND ALGORITHMIC FOUNDATIONS OF SPEECH AND ACOUSTIC SIGNAL PROCESSING AND ANALYSIS
A PHONETICALLY RICH TEXT FOR FUNDAMENTAL AND APPLIED RESEARCH ON RUSSIAN SPEECH VARIABILITY
Natalia S. Smirnova, Mikhail V. Khitrov
A phonetically rich text intended for research into regional and individual variability of Russian speech is presented. The text provides full coverage of the basic phonetic units of Russian, which makes it applicable to fundamental and applied studies of various kinds in speech science.
SUPPRESSION OF ACOUSTIC NOISE IN AUDIO DEVICE USING ASYNCHRONOUS REFERENCE SIGNAL
Sergey A. Aleynikov, Mikhail B. Stolbov
A semi-automatic technique for two-channel noise suppression with an asynchronous reference signal (i.e. one taken from an external source) is presented. The technique is described in detail, and its efficiency is compared with that of algorithms using synchronous noise recordings.
ALGORITHMS FOR DETECTION OF TYPICAL NOISES AND INTERFERING BURSTS IN SPEECH SIGNALS
Sergey A. Aleynikov, Konstantin Simonchik
Methods for detecting typical additive interfering noises and bursts in speech processing systems are analyzed and discussed. The influence of the detectors on the performance of a speaker verification system is investigated experimentally. New, improved algorithms for detecting typical noises are proposed.
MODERN MOBILE SYSTEM FOR TRACK WARNING
Sergey V. Bibikov, Maxim E. Markisonov, Sergey A. Panasyuk
Track warning systems for railway crews performing track repair are analyzed. An alternative mobile warning system is presented. The system is based on remote sensors that detect an approaching train. The proposed warning system is compared with foreign systems.
SPEECH SYNTHESIS SYSTEMS
AUTOMATION OF NEW VOICE CREATION PROCEDURE FOR A RUSSIAN TTS SYSTEM
Anna I. Solomennik, Pavel Chistikov, Sergey V. Rybin, Andrey O. Talanov, Natalia A. Tomashenko
An automatic system for creating a new voice for the VitalVoice TTS system is presented. The system covers text selection, speech recording and monitoring of the recordings, database labeling, and tuning of the unit selection parameters.
A HYBRID TECHNOLOGY FOR TTS SYSTEM BASED ON HIDDEN MARKOV MODELS AND UNIT SELECTION ALGORITHM
Pavel Chistikov, Evgeny A. Korolkov, Andrey O. Talanov, Anna I. Solomennik
An approach to building a Russian TTS system that integrates Hidden Markov Models with the Unit Selection algorithm is presented. A voice model creation method is developed for constructing a natural intonation contour. Experimental results confirm the improved quality of the synthesized speech.
ASSESSMENT OF SYNTHESIZED SPEECH QUALITY: PROBLEMS AND SOLUTIONS
Anna I. Solomennik, Andrey O. Talanov, Mikhail V. Solomennik, Olga G. Khomitsevich, Pavel Chistikov
Various aspects of quality assessment for speech synthesis systems and of the comparison of existing TTS systems are considered. A brief review of existing quality assessment methods is presented.
APPLICATION OF LINGUISTIC ANALYSIS FOR TEXT NORMALIZATION AND HOMONYMY RESOLUTION IN RUSSIAN TEXT-TO-SPEECH SYSTEM
Olga G. Khomitsevich, Sergey V. Rybin, I. M. Anichkin
A method based on automatic morphological and syntactic analysis is developed to resolve ambiguities that arise during text normalization and homonymy resolution in the VitalVoice Russian TTS system. A high degree of accuracy is demonstrated in experimental processing of Russian texts of various types.
SPEAKER RECOGNITION SYSTEMS
STUDY OF INFORMATIVE SPEECH FEATURES FOR AUTOMATIC SPEAKER IDENTIFICATION
Yuri N. Matveev
The most popular speech features used in automatic speaker recognition systems are studied. Results are reported for experiments on a speech database collected in different acoustic environments (a wide range of signal-to-noise ratios and reverberation times) and over different channels.
COMPARISON OF VARIOUS MIXTURES OF GAUSSIAN PLDA-MODELS IN THE PROBLEM OF TEXT-INDEPENDENT SPEAKER VERIFICATION
Timur Pekhovsky, Alexander Yu. Sizov
The applicability of unsupervised mixtures of PLDA models with Gaussian priors in an i-vector space to speaker verification is studied. The conditions under which such mixtures are advantageous are analyzed for existing training databases. A mixture of two PLDA models is shown to be more effective than a single PLDA model for a cross-channel task.
GINI CRITERION SVM FOR EMOTION CLASSIFICATION FRAMEWORK
Andrey V. Tkachenia, Andrey G. Davydov, Vitaliy V. Kiselyov, Mikhail V. Khitrov
The Gini criterion is applied to construct the feature space of an SVM classifier. An experimental study of the optimal set of informative features and of the classifier construction is presented.
FEATURES OF HUMAN-MACHINE INTERFACE OF MODERN BIOMETRIC IDENTIFICATION SYSTEMS
Dmitry V. Dyrmovsky, Sergey L. Koval
Modern systems designed for automated identification of individuals based on the analysis of biometric characteristics are considered. Requirements for the design of the human-machine interface of such systems are formulated.
EVALUATION OF THE CONFIDENCE INTERVAL FOR DECISION PREDICTION OF AN ENSEMBLE OF CLASSIFIERS
Yuri N. Matveev
An algorithm is proposed for evaluating the confidence interval of the decision prediction of an ensemble of classifiers, where each classifier in the ensemble returns its prediction as a log-likelihood ratio.