The SPECOM 2016 conference proceedings (LNAI 9811) are now available online. You can access the online version here.
Program at a glance
| Hours | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|
| 8.00 | | Registration | | | |
| 8.30 | | Opening ceremony | | | |
| 9.00 | | Keynote lecture of Ralf Schlueter | Keynote lecture of Attila Vékony | Keynote lecture of Nick Campbell | Budapest tour |
| 10.00 | | Coffee break | Coffee break | Coffee break | |
| 10.30 | | Speech recognition and understanding | Multimodal human-machine interaction | Natural language processing | |
| 12.30 | | Lunch | Lunch | Lunch | |
| 14.00 | | SPECOM Poster session I | ICR Poster session | SPECOM Poster session II | |
| 16.00 | Registration | Coffee break | Coffee break | Coffee break | |
| 16.30 | | Speech synthesis | Interactive collaborative robotics; Speech signal processing | Speaker and language recognition | |
| 18.30 | Welcome reception 18.30 - 20.00 | | | Closing ceremony | |
| 19.30 | | | Gala dinner on the Danube 19.30 - 21.30 | | |
 The detailed program can be downloaded here 
Detailed Technical Program
Tuesday, August 23rd
 16:00-18:00 Registration
 18:30-20:00 Welcome Reception
  
Wednesday, August 24th
 08:00-08:30 Registration
 08:30-09:00 Opening ceremony
 09:00-10:00 Keynote speech: Automatic Speech Recognition based on Neural Networks 
 Ralf Schlueter, RWTH Aachen University, Germany
 Chair: Géza Németh, Budapest University of Technology and Economics, Hungary
10:00-10:30 Coffee break
 10:30-12:30 Speech recognition and understanding 
 Chair: Alexey Karpov, SPIIRAS, Russia
 10:30-10:50 Adaptation of DNN Acoustic Models using KL-divergence Regularization and Multi-Task Training 
 László Tóth and Gábor Gosztolya
 10:50-11:10 Improving Automatic Speech Recognition Containing Additive Noise Using Deep Denoising Autoencoders of LSTM Networks 
 Marvin Coto, John Goddard and Fabiola Martinez
 11:10-11:30 Knowledge Transfer for Utterance Classification in Low-Resource Languages 
 Andrei Smirnov and Valentin Mendelev
 11:30-11:50 Designing Syllable Models for an HMM based Speech Recognition System
 Kseniya Proenca, Kris Demuynck and Dirk Van Compernolle
 11:50-12:10 In-document Adaptation for a Human Guided Automatic Transcription Service
 André Mansikkaniemi, Mikko Kurimo and Krister Lindén
 12:10-12:30 Automatic Summarization of Highly Spontaneous Speech 
 András Beke and György Szaszák
12:30-14:00 Lunch
 14:00-16:00 SPECOM Poster session I
 Chair: Ralf Schlueter, RWTH Aachen University, Germany
 P1: Exploring GMM-derived Features for Unsupervised Adaptation of Deep Neural Network Acoustic Models
 Natalia Tomashenko, Yuri Khokhlov, Anthony Larcher and Yannick Estève
 P2: DNN-based Acoustic Modeling for Russian Speech Recognition Using Kaldi
 Irina Kipyatkova and Alexey Karpov
 P3: Improving the Quality of Automatic Speech Recognition in Trucks
 Maxim Korenevsky, Ivan Medennikov and Vadim Shchemelinin
 P4: Feature Space VTS with Phase Term Modeling
 Maxim Korenevsky and Aleksei Romanenko
 P5: LSTM-based Language Models for Spontaneous Speech Recognition
 Ivan Medennikov and Anna Bulusheva
 P6: Speaker-dependent bottleneck features for Egyptian Arabic speech recognition
 Aleksei Romanenko and Valentin Mendelev
 P7: Advances in STC Russian Spontaneous Speech Recognition System
 Ivan Medennikov and Alexey Prudnikov
 P8: Combining Atom Decomposition of the F0 Track and HMM-based Phonological Phrase Modelling for Robust Stress Detection in Speech
 György Szaszák, Máté Ákos Tündik, Branislav Gerazov and Aleksandar Gjoreski
 P9: Improving Recognition of Dysarthric Speech Using Severity Based Tempo Adaptation
 Chitralekha Bhat, Bhavik Vachhani and Sunil Kumar Kopparapu
 P10: Comparison of Retrieval Approaches and Blind Relevance Feedback Methods within the Czech Speech Information Retrieval
 Lucie Skorkovska
 P11: A Phonetic Segmentation Procedure Based on Hidden Markov Models
 Edvin Pakoci, Branislav Popović, Nikša Jakovljević, Darko Pekar and Fathy Yassa
 P12: Stress, arousal, and stress detector trained on acted speech database
 Róbert Sabo, Milan Rusko, Andrej Ridzik and Jakub Rajčani
 P13: Improvements to Prosodic Variation in Long Short-Term Memory based Intonation Models Using Random Forest
 Bálint Pál Tóth, Balázs Szórádi and Géza Németh
 P14: Fusing various audio feature sets for detection of Parkinson's disease from sustained voice and speech recordings
 Evaldas Vaiciukynas, Antanas Verikas, Adas Gelzinis, Marija Bacauskiene, Kestutis Vaskevicius, Virgilijus Uloza, Evaldas Padervinskis and Jolita Ciceliene
 P15: Investigation of Speech Signal Parameters Reflecting the Truth of Transmitted Information
 Victor Budkov, Irina Vatamaniuk, Vladimir Basov and Daniyar Volf
 P16: Trade-off between speed and accuracy for Noise Variance Minimization (NVM) pitch estimation algorithm
 Andrey Barabanov and Aleksandr Melnikov
 P17: Study on the improvement of intelligibility for elderly speech using formant frequency shift method
 Yuto Tanaka, Mitsunori Mizumachi and Yoshihisa Nakatoh
 P18: Quality Assessment of two Fullband Audio Codecs Supporting Real-Time Communication
 Michael Maruschke, Oliver Jokisch, Martin Meszaros, Franziska Trojahn and Mario Hoffmann
 P19: A Deep Neural Networks (DNN) Based models for a Computer Aided Pronunciation Learning System
 Mohamed Elaraby, Mustafa Abdallah, Sherif Abdou and Mohsen Rashwan (in absentia)
 P20: Evaluation of Response Times on a Touch Screen using Stereo Panned Speech Command Auditory Feedback
 Hunor Nagy and György Wersényi
 P21: Speech Enhancement with Microphone Array Using a Multi Beam Adaptive Noise Suppressor
 Mikhail Stolbov and Alexander Lavrentyev
 P22: Microphone Array Directivity Improvement in Low-Frequency Domain for Speech Processing
 Sergei Aleinik and Mikhail Stolbov
 P23: Optimization of Zelinski post-filtering calculation
 Sergei Aleinik
 P24: Assessment of the relation between low-frequency features and velum opening by using real articulatory data
 Alexander Sepulveda-Sepulveda and German Castellanos-Dominguez
 P25: Evaluation of the speech quality during rehabilitation after surgical treatment of the cancer of oral cavity and oropharynx based on a comparison of the Fourier spectra
 Evgeny Kostyuchenko, Roman Mescheryakov, Dariya Ignatieva, Alexander Pyatkov, Evgeny Choynzonov and Lidiya Batatskaya
16:00-16:30 Coffee break
 16:30-18:30 Speech synthesis
 Chair: Géza Németh, Budapest University of Technology and Economics, Hungary
 16:30-16:50 Ensemble Deep Neural Network based Waveform-Driven Stress Model for Speech Synthesis 
 Bálint Pál Tóth, Kornél István Kiss, György Szaszák and Géza Németh
 16:50-17:10 DNN-Based Duration Modeling for Synthesizing Short Sentences 
 Péter Nagy and Géza Németh
 17:10-17:30 Experiments with One-Class Classifier as a Predictor of Spectral Discontinuities in Unit Concatenation 
 Daniel Tihelka, Martin Grůber and Markéta Jůzová
 17:30-17:50 Phonetic Aspects of High Level of Naturalness in Speech Synthesis 
 Vera Evdokimova, Pavel Skrelin, Andrey Barabanov and Karina Evgrafova
 17:50-18:10 An agonist-antagonist pitch production model 
 Branislav Gerazov and Philip N. Garner
 18:10-18:30 An UMP (Universal Melodic Portraits) Model of Pitch Contours Stylization for Analysis and Synthesis of Intonation
 Boris Lobanov
  
Thursday, August 25th
 09:00-10:00 Keynote speech: Speech Recognition Challenges in the Car Navigation Industry 
 Attila Vékony, NNG Software Developing and Commercial Llc., Hungary
 Chair: Andrey Ronzhin, SPIIRAS, Russia
 
 10:00-10:30 Coffee break
 10:30-12:30 Multimodal human-machine interaction
 Chair: Milos Zelezny, University of West Bohemia, Czech Republic
 10:30-10:50 Toward Sign Language Motion Capture Dataset Building
 Zdeněk Krňoul, Pavel Jedlička, Jakub Kanis and Milos Zelezny
 10:50-11:10 Selecting Keypoint Detector and Descriptor Combination for Augmented Reality Application 
 Lukáš Bureš and Luděk Müller
 11:10-11:30 Human-Robot Interaction using Brain-Computer Interface
 Lev Stankevich and Konstantin Sonkin
 11:30-11:50 Attention Training Game with Aldebaran Robotics NAO and Brain-Computer Interface 
 Evgeny Shandarov, Stepan Gomilko and Alina Zimina
 11:50-12:10 HAVRUS Corpus: High-speed Recordings of Audio-Visual Russian Speech 
 Vasilisa Verkhodanova, Alexander Ronzhin, Irina Kipyatkova, Denis Ivanko, Alexey Karpov and Milos Zelezny
 12:10-12:30 Speech Recognition combining MFCCs and Image Features (Skype) 
 Stamatis Karlos, Nikos Fazakis, Katerina Karanikola, Sotiris Kotsiantis and Kyriakos Sgarbas
12:30-14:00 Lunch
 14:00-16:00 ICR Poster session
 Chair: Eugene Larkin, Tula State University, Russia
 P1: Decentralized Approach to Control of Robot Groups During Execution of the Task Flow
 Igor Kalyaev, Anatoly Kalyaev and Iakov Korovin
 P2: A Recovery Method for the Robotic Decentralized Control System with Performance Redundancy
 Iakov Korovin, Eduard Melnik and Anna Klimenko
 P3: Control Algorithms for Heterogeneous Vehicle Groups Control in Obstructed 2-D Environments
 Viacheslav Pshikhopov, Mikhail Medvedev, Anatoly Gaiduk and Aleksandr Kolesnikov
 P4: Method of Spheres for Solving 3D Formation Task in a Group of Quadrotors
 Donat Ivanov, Sergey Kapustyan and Igor Kalyaev
 P5: Multi-Robot Exploration and Mapping Based on the Subdefinite Models
 Valery Karpov, Alexander Migalev, Anton Moscowsky, Maxim Rovbo and Vitaly Vorobiev
 P6: Simulation of Commands Execution by Mobile Robot
 Eugene Larkin, Alexey Ivutin, Vladislav Kotov and Alexander Privalov
 P7: The Effectiveness of Rescuing Casualties when Using Robotic Systems
 Anna Motienko, Igor Dorozhko, Anatoly Tarasov and Oleg Basov
 P8: Distributed Information System for Collaborative Robots and IoT Devices
 Siarhei Herasiuta, Uladzislau Sychou and Ryhor Prakapovich
 P9: Positioning Method Basing on External Reference Points for Surgical Robots
 Ekaterina Sinyavskaya, Elena Shestova, Mikhail Medvedev and Evgenij Kosenko
 P10: Hardware-Software Solution for Three-Dimensional Model Control in Volumetric Display Testing Unit for Visualization and Dispatching Applications
 Alexander Bolshakov, Arthur Sgibnev, Tatiana Chistyakova, Viktor Glazkov and Dmitry Lachugin
 P11: Educational Marine Robotics in SMTU
 Mikhail Chemodanov, Ryzhov Vladimir, Nickolay Semenov, Kirill Rozhdestvensky and Igor Kozhemyakin
 P12: Designing Simulation Model of Humanoid Robot to Study Servo Control System
 Alexander Denisov, Viktor Budkov and Daniil Mikhalchenko
 P13: Speech Dialog as a Part of Interactive "Human-Machine" Systems
 Rodmonga Potapova
 P14: Human-Machine Speech-Based Interfaces with Augmented Reality and Interactive Systems for Controlling Mobile Cranes
 Maciej J. Majewski and Wojciech Kacalak
 P15: Preprocessing Data for Facial Gestures Classifier on the Basis of the Neural Network Analysis of Biopotentials Muscle Signals
 Raisa Budko and Irina Starchenko
 P16: Mimic Recognition and Reproduction in Bilateral Human-Robot Speech Communication
 Arkady S. Yuschenko, Sergey Vorotnikov, Dmitry Konyshev and Andrey Zhonin
 P17: Interactive Collaborative Robotics and Natural Language Interface Based on Multi-Agent Recursive Cognitive Architectures
 Murat Anchokov, Zalimkhan Nagoev, Vladimir Denisenko, Boris Tazhev and Zaurbek Sundukov
 P18: An Analysis of Visual Faces Datasets
 Ivan Gruber, Miroslav Hlaváč, Marek Hrúz, Miloš Železný and Alexey Karpov
 P19: Voice Dialogue with a Collaborative Robot Driven by Multimodal Semantics
 Alexander Kharlamov and Konstantin Ermishin
 P20: Human-Smartphone Interaction for Dangerous Situation Detection & Recommendation Generation while Driving
 Alexander Smirnov, Alexey Kashevnik and Igor Lashkov
 P21: Conceptual Model of Cyberphysical Environment Based on Collaborative Work of Distributed Means and Mobile Robots
 Anton Saveliev, Oleg Basov and Andrey Ronzhin
 P22: The Humanoid Robot Assistant for a Preschool Children
 Evgeny Shandarov, Alina Zimina, Dmitry Rimer, Evgenia Sokolova and Olga Shandarova
16:00-16:30 Coffee break
 16:30-18:30 Interactive collaborative robotics
 Chair: Roman Meshcheryakov, TUSUR, Russia
 16:30-16:50 Development of Wireless Charging Robot for Indoor Environment based on Probabilistic Roadmap 
 Yi-Shiun Wu, Chi-Wei Chen and Hooman Samani
 16:50-17:10 Mechanical Leg Design of the Anthropomorphic Robot Antares 
 Nikita Pavluk, Victor Budkov, Andrey Kodyakov and Andrey Ronzhin
 17:10-17:30 YuMi, come and play with me! A Collaborative Robot for piecing together a Tangram Puzzle 
 David Kirschner, Rosemarie Velik, Saeed Yahyanejad, Mathias Brandstötter and Michael Hofbaur
 17:30-17:50 A Control Strategy for a Lower Limb Exoskeleton with a Toe Joint 
 Sergei Savin, Sergey Jatsun and Andrey Yatsun
 17:50-18:10 Robot Soccer Team for RoboCup Humanoid KidSize League 
 Evgeny Shandarov, Stepan Gomilko, Darya Zhulaeva, Dmitry Rimer, Dmitry Yakushin and Roman Meshcheryakov
 18:10-18:30 Smart M3-Based Robot Interaction Scenario for Coalition Work 
 Alexander Smirnov, Alexey Kashevnik, Sergey Mikhailov, Mikhail Mironov and Mikhail Petrov
 16:30-18:30 Speech signal processing
 Chair: László Tóth, University of Szeged
 16:30-16:50 Robust Speech Analysis Based on Source-Filter Model Using Multivariate Empirical Mode Decomposition in Noisy Environments 
 Surasak Boonkla, Masashi Unoki and Stanislav S. Makhanov
 16:50-17:10 An Algorithm for Phase Manipulation in a Speech Signal 
 Darko Pekar, Siniša Suzić, Robert Mak, Meir Friedlander and Milan Sečujski
 17:10-17:30 Detecting Laughter and Filler Events by Time Series Smoothing with Genetic Algorithms 
 Gábor Gosztolya
 17:30-17:50 Bio-Inspired Sparse Representation of Speech and Audio Using Psychoacoustic Adaptive Matching Pursuit 
 Alexey Petrovsky, Vadzim Herasimovich and Alexander Petrovsky
 17:50-18:10 Statistical analysis of acoustical parameters in the voice of children with juvenile dysphonia 
 Miklós Gábriel Tulics, Ferenc Kazinczi and Klára Vicsi
 18:10-18:30 Precise estimation of harmonic parameter trend and modification of a speech signal
 Andrey Barabanov, Evgenij Vikulov and Valentin Magerkin
 19:30-21:30 Gala dinner on the Danube
  
Friday, August 26th
 09:00-10:00 Keynote speech: Machine Processing of Dialogue States; Speculations on Conversational Entropy 
 Nick Campbell, Trinity College Dublin, Ireland
 Chair: Rodmonga Potapova, MSLU, Russia
 
 10:00-10:30 Coffee break
 10:30-12:30 Natural language processing
 Chair: Rodmonga Potapova, MSLU, Russia
 10:30-10:50 Text Classification in the Domain of Applied Linguistics as Part of a Pre-editing Module for Machine Translation Systems
 Ksenia Oskina
 10:50-11:10 Backchanneling via Twitter Data for Conversational Dialogue Systems 
 Michimasa Inaba and Kenichi Takahasi
 11:10-11:30 Measuring prosodic entrainment in Italian collaborative game-based dialogues 
 Michelina Savino, Loredana Lapertosa, Alessandro Caffò and Mario Refice
 11:30-11:50 A Preliminary Exploration of Group Social Engagement Level Recognition in Multiparty Casual Conversation 
 Yuyun Huang, Emer Gilmartin, Benjamin R. Cowan and Nick Campbell
 11:50-12:10 Interaction Quality as a Human-Human Task-Oriented Conversation Performance
 Anastasiia Spirina, Olesia Vaskovskaia, Maxim Sidorov and Alexander Schmitt
 12:10-12:30 A comparison of acoustic features of speech of typically developing children and children with autism spectrum disorders 
 Elena Lyakso, Olga Frolova and Aleksey Grigorev
12:30-14:00 Lunch
 14:00-16:00 SPECOM Poster session II
 Chair: Nick Campbell, Trinity College Dublin, Ireland
 P1: Polybasic Attribution of Social Network Discourse
 Rodmonga Potapova and Vsevolod Potapov
 P2: Detecting Filled Pauses and Lengthenings in Russian Spontaneous Speech using SVM
 Vasilisa Verkhodanova and Vladimir Shapranov
 P3: Multimodal Perception of Aggressive Behavior
 Rodmonga Potapova and Liliya Komalova
 P4: Designing High-Coverage Multi-Level Text Corpus for Non-Professional-Voice Conservation
 Markéta Jůzová, Daniel Tihelka and Jindřich Matoušek
 P5: A Linguistic Interpretation of the Atom Decomposition of Fundamental Frequency Contour for American English
 Tijana Delić, Branislav Gerazov, Branislav Popović and Milan Sečujski
 P6: Emotional speech of 3-years old children: norm-risk-deprivation
 Olga Frolova and Elena Lyakso
 P7: Profiling a Set of Personality Traits of a Text's Author: a Corpus-Based Approach
 Tatiana Litvinova, Olga Zagorovskaya, Olga Litvinova and Pavel Seredin
 P8: Unsupervised trained functional discourse parser for e-learning materials scaffolding
 Varvara Krayvanova and Svetlana Duka
 P9: Low Inter-Annotator Agreement in Sentence Boundary Detection and Personality
 Anton Stepikhov and Anastassia Loukina
 P10: Modeling Imperative Utterances in Russian Spoken Dialogue: Verb-Central Quantitative Approach
 Olga Blinova
 P11: An Exploratory Study on Sociolinguistic Variation of Spoken Russian
 Natalia Bogdanova-Beglarian, Tatiana Sherstinova, Olga Blinova and Gregory Martynenko
 P12: Speech Acts Annotation of Everyday Conversations in the ORD corpus of Spoken Russian
 Tatiana Sherstinova
 P13: Design of a Speech Corpus for Research on Cross-Lingual Prosody Transfer
 Milan Sečujski, Branislav Gerazov, Tamás Gábor Csapó, Vlado Delić, Philip Garner, Aleksandar Gjoreski, David Guennec, Zoran Ivanovski, Aleksandar Melov, Géza Németh, Ana Stojković and György Szaszák
 P14: Sociolinguistic Extension of the ORD Corpus of Russian Everyday Speech
 Natalia Bogdanova-Beglarian, Tatiana Sherstinova, Olga Blinova, Olga Ermolova, Ekaterina Baeva, Gregory Martynenko and Anastasia Ryko
 P15: Detecting state of aggression in sentences using CNN
 Denis Gordeev
 P16: Tonal Specification of Perceptually Prominent Non-Nuclear Pitch Accents in Russian
 Nina Volskaya and Tatiana Kachkovskaia
 P17: Lexical Stress in Punjabi and its Representation in PLS
 Swaran Lata, Swati Arora and Simerjeet Kaur
 P18: Comparative analysis of classifiers for automatic language recognition in spontaneous speech
 Konstantin Simonchik, Sergey Novoselov and Galina Lavrentyeva
 P19: Semi-automatic Speaker Verification System Based on Analysis of Formant, Durational and Pitch Characteristics
 Elena Bulgakova and Aleksei Sholohov
 P20: Scores Calibration in Speaker Recognition Systems
 Andrey Shulipa, Sergey Novoselov and Yuri Matveev
 P21: Speech Features Evaluation for Small Set Automatic Speaker Verification Using GMM-UBM System
 Ivan Rakhmanenko and Roman Meshcheryakov
 P22: Approaches for Out-of-Domain Adaptation to Improve Speaker Recognition Performance
 Andrey Shulipa, Sergey Novoselov and Aleksandr Melnikov
 P23: Prosody Analysis of Malay Language Storytelling Corpus
 Izzad Ramli, Noraini Seman, Norizah Ardi and Nursuriati Jamil
 P24: Finding speaker position under difficult acoustic conditions
 Evgeniy Shuranov, Alexander Lavrentyev, Alexey Kozlyaev and Valeriya Volkovaya
 P25: Scenarios of Multimodal Information Navigation Services for Users in Cyberphysical Environment
 Irina Vatamaniuk, Dmitriy Levonevskiy, Anton Saveliev and Alexander Denisov
16:00-16:30 Coffee break
 16:30-18:30 Speaker and language recognition
 Chair: Iosif Mporas, University of Hertfordshire, UK
 16:30-16:50 Investigation of Segmentation in i-Vector based Speaker Diarization of Telephone Speech 
 Zbynek Zajic, Marie Kunesova and Vlasta Radova
 16:50-17:10 Improving Robustness of Speaker Verification by Fusion of Prompted Text-Dependent & Text-Independent Operation Modalities 
 Iosif Mporas, Saeid Safavi and Reza Sotudeh
 17:10-17:30 Convolutional Neural Network in the Task of Speaker Change Detection 
 Marek Hruz and Marie Kunesova
 17:30-17:50 Online Biometric Identification With Face Analysis in Web Applications 
 Gerasimos Arvanitis, Konstantinos Moustakas and Nikos Fakotakis
 17:50-18:10 Language Identification using Time Delay Neural Network D-Vector on Short Utterances
 Maxim Tkachenko, Alexander Yamshinin, Nikolay Luibimov, Mikhail Kotov and Marina Nastasenko
 18:10-18:30 On Individual Polyinformativity of Speech and Voice Regarding Speaker's Auditive Attribution (Forensic Phonetic Aspect) 
 Rodmonga Potapova and Vsevolod Potapov
 18:30-18:40 Closing ceremony 
  
Saturday, August 27th
 09:00-15:00 Budapest tour
   