Dissertation "Методы, алгоритмы и программные средства распознавания русской телефонной спонтанной речи" (Methods, Algorithms, and Software for Recognition of Russian Spontaneous Telephone Speech), Candidate of Technical Sciences dissertation, СПбГУ — page 24 (references)
— Vol. 14, no. 2. — P. 115–135.
72. Sequence-discriminative training of deep neural networks [Text] / K. Veselý, A. Ghoshal, L. Burget, D. Povey // Proc. Annual Conference of International Speech Communication Association (INTERSPEECH). — 2013.
73. Error back propagation for sequence training of context-dependent deep networks for conversational speech transcription [Text] / H. Su, G. Li, D. Yu, F. Seide // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2013.
74. Li, B. Comparison of discriminative input and output transformations for speaker adaptation in the hybrid NN/HMM systems [Text] / B. Li, K. Sim // Proc. Annual Conference of International Speech Communication Association (INTERSPEECH). — 2010. — P. 526–529.
75. Linear hidden transformations for adaptation of hybrid ANN/HMM models [Text] / R. Gemello, F. Mana, S. Scanzio [et al.] // Speech Communication. — 2007. — Vol. 49, no. 10–11. — P. 827–835.
76. Adaptation of context-dependent deep neural networks for automatic speech recognition [Text] / K. Yao, D. Yu, F. Seide [et al.] // IEEE Spoken Language Technology Workshop (SLT). — 2012. — P. 366–369.
77. Feature engineering in context-dependent deep neural networks for conversational speech transcription [Text] / F. Seide, G. Li, X. Chen, D. Yu // Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). — 2011. — P. 24–29.
78. Speaker Adaptive Training Using Deep Neural Networks [Text] / T. Ochiai, S. Matsuda, X. Lu [et al.] // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2014. — P. 6349–6353.
79. Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network [Text] / J. Xue, J. Li, D. Yu [et al.] // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2014. — P. 6359–6363.
80. Li, X. Regularized adaptation of discriminative classifiers [Text] / X. Li, J. Bilmes // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2006.
81. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition [Text] / D. Yu, K. Yao, H. Su [et al.] // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2013. — P. 7893–7897.
82. Abdel-Hamid, O. Fast speaker adaptation of hybrid NN/HMM model for speech recognition based on discriminative learning of speaker code [Text] / O. Abdel-Hamid, H. Jiang // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2013. — P. 7942–7946.
83. Front-end factor analysis for speaker verification [Text] / N. Dehak, P. Kenny, R. Dehak [et al.] // IEEE Trans. Audio, Speech and Language Processing. — 2010. — Vol. 19, no. 4. — P. 788–798.
84. SVID Speaker Recognition System for NIST SRE 2012 [Text] / A. Kozlov, O. Kudashev, Y. Matveev [et al.] // Proc. SPECOM 2013, Lecture Notes in Artificial Intelligence. — 2013. — Vol. 8113. — P. 278–285.
85. Speaker adaptation of neural network acoustic models using i-vectors [Text] / G. Saon, H. Soltau, D. Nahamoo, M. Picheny // Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). — 2013. — P. 55–59.
86. Rouvier, M. Speaker adaptation of DNN-based ASR with i-vectors: Does it actually adapt models to speakers? [Text] / M. Rouvier, B. Favre // Proc. Annual Conference of International Speech Communication Association (INTERSPEECH). — 2014. — P. 3007–3011.
87. Senior, A. Improving DNN speaker independence with I-vector inputs [Text] / A. Senior, I. Lopez-Moreno // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2014. — P. 225–229.
88. Li, G. Factorized adaptation for deep neural network [Text] / G. Li, J.-T. Huang, Y. Gong // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2014. — P. 5537–5541.
89. Tomashenko, N. Speaker adaptation of context dependent deep neural networks based on MAP-adaptation and GMM-derived feature processing [Text] / N. Tomashenko, Y. Khokhlov // Proc. Annual Conference of International Speech Communication Association (INTERSPEECH). — 2014. — P. 2997–3001.
90. Liu, S. On Combining DNN and GMM with Unsupervised Speaker Adaptation for Robust Automatic Speech Recognition [Text] / S. Liu, K. Sim // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2014. — P. 195–199.
91. Jelinek, F. Continuous Speech Recognition by Statistical Methods [Text] / F. Jelinek // Proceedings of the IEEE. — 1976. — Vol. 64, no. 4. — P. 532–556.
92. Chen, S. An empirical study of smoothing techniques for language modeling [Text] / S. Chen, J. Goodman // Computer Speech and Language. — 1999. — Vol. 13. — P. 359–394.
93. Bell, T. Text Compression [Text] / T. Bell, J. Cleary, I. Witten. — Englewood Cliffs, NJ : Prentice Hall, 1990.
94. Chen, S. Evaluation metrics for language models [Text] / S. Chen, D. Beeferman, R. Rosenfeld // DARPA Broadcast News Transcription and Understanding Workshop. — 1998.
95. A neural probabilistic language model [Text] / Y. Bengio, R. Ducharme, P. Vincent, C. Jauvin // Journal of Machine Learning Research. — 2003. — Vol. 3. — P. 1137–1155.
96. Recurrent neural network based language model [Text] / T. Mikolov, M. Karafiát, L. Burget [et al.] // Proc. Annual Conference of International Speech Communication Association (INTERSPEECH). — 2010. — P. 1045–1048.
97. Bilmes, J. Factored language models and generalized parallel backoff [Text] / J. Bilmes, K. Kirchhoff // Proc. HLT/NAACL. — 2003. — P. 4–6.
98. Saon, G. Anatomy of an extremely fast LVCSR decoder [Text] / G. Saon, D. Povey, G. Zweig // Proc. Annual Conference of International Speech Communication Association (INTERSPEECH). — 2005. — P. 549–552.
99. Ortmanns, S. Look-ahead techniques for fast beam search [Text] / S. Ortmanns, H. Ney // Computer Speech and Language. — 2000. — Vol. 14. — P. 15–32.
100. The Hidden Markov Model Toolkit (HTK) [Electronic resource]. — 2016. — URL: http://htk.eng.cam.ac.uk/ (online; accessed: 22.01.2016).
101. Kaldi ASR Toolkit [Electronic resource]. — 2016. — URL: http://kaldi-asr.org/ (online; accessed: 22.01.2016).
102. The Kaldi Speech Recognition Toolkit [Text] / D. Povey, A. Ghoshal, G. Boulianne [et al.] // Proc. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU). — 2011. — P. 1–4.
103. CMU Sphinx [Electronic resource]. — 2016. — URL: http://cmusphinx.sourceforge.net/ (online; accessed: 22.01.2016).
104. Comparing Open-Source Speech Recognition Toolkits [Text] / C. Gaida, P. Lange, P. Proba [et al.]. — 2014. — URL: http://suendermann.com/su/pdf/oasis2014.pdf (online; accessed: 02.02.2016).
105. Ронжин, А. Л. Анализ вариативности спонтанной речи и способов устранения речевых сбоев [Текст] / А. Л. Ронжин, К. В. Евграфова // Известия высших учебных заведений: Гуманитарные науки. — 2011. — Т. 2, № 3. — С. 227–231.
106. Allison, B. Another Look at the Data Sparsity Problem [Text] / B. Allison, D. Guthrie, L. Guthrie // Text, Speech and Dialogue, Lecture Notes in Computer Science. — 2006. — Vol. 4188. — P. 327–334.
107. Бондарко, Л. В. Фонетическое описание языка и фонологическое описание речи [Текст] / Л. В. Бондарко. — Ленинград : Изд-во ЛГУ, 1981.
108. Кузнецов, В. И. Вокализм связной речи: Экспериментальное исследование на материале русского языка [Текст] / В. И. Кузнецов. — СПб : Изд-во СПбГУ, 1997.
109. Русская разговорная речь [Текст] / Под ред. Е. А. Земской. — Москва : Наука, 1973.
110. Фонетика спонтанной речи [Текст] / Под ред. Н. Д. Светозаровой. — Ленинград : Изд-во ЛГУ, 1988.
111. Phonetic properties of Russian Spontaneous Speech [Text] / L. Bondarko, N. Volskaya, S. Tananaiko, L. Vasilieva // Proc. 15th ICPhS. — 2003. — P. 2973–2976.
112. Кипяткова, И. С. Аналитический обзор систем распознавания русской речи с большим словарем [Текст] / И. С. Кипяткова, А. А. Карпов // Труды СПИИРАН. — 2010. — С. 7–20.
113. Кипяткова, И. С. Методы и программные средства фонетико-языкового моделирования в системах автоматического распознавания русской речи [Текст] : Диссертация на соискание ученой степени кандидата технических наук : 05.13.11 / И. С. Кипяткова ; СПИИРАН. — СПб : [б. и.], 2011.
114. Large vocabulary Russian speech recognition using syntactico-statistical language modeling [Text] / A. Karpov, K. Markov, I. Kipyatkova [et al.] // Speech Communication. — 2014. — Vol. 56. — P. 213–228.
115. Transcription of Russian conversational speech [Text] / L. Lamel, S. Courcinous, J.-L. Gauvain [et al.] // Proc. SLTU. — 2012. — P. 156–161.
116. OpenSubtitles [Electronic resource]. — 2016. — URL: http://www.opensubtitles.org/ (online; accessed: 02.02.2016).
117. Feature Learning in Deep Neural Networks – Studies on Speech Recognition Tasks [Text] / D. Yu, M. Seltzer, J. Li [et al.] // Proc. ICLR. — 2013.
118. Ratnaparkhi, A. A simple introduction to maximum entropy models for natural language processing [Text] / A. Ratnaparkhi // IRCS Technical Reports Series. — 1997.
119. Hermansky, H. Tandem connectionist feature extraction for conventional HMM systems [Text] / H. Hermansky, D. P. Ellis, S. Sharma // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2000. — Vol. 3. — P. 1635–1638.
120. Probabilistic and bottle-neck features for LVCSR of meetings [Text] / F. Grézl, M. Karafiát, S. Kontár, J. Černocký // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2007. — P. 757–760.
121. Grézl, F. Optimizing bottle-neck features for LVCSR [Text] / F. Grézl, P. Fousek // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2008. — P. 4729–4732.
122. Grézl, F. Hierarchical Neural Net Architectures for Feature Extraction in ASR [Text] / F. Grézl, M. Karafiát // Proc. Annual Conference of International Speech Communication Association (INTERSPEECH). — 2010. — P. 1201–1204.
123. Sainath, T. Auto-encoder bottleneck features using deep belief networks [Text] / T. Sainath, B. Kingsbury, B. Ramabhadran // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2012. — P. 4153–4156.
124. Yu, D. Improved Bottleneck Features Using Pretrained Deep Neural Networks [Text] / D. Yu, M. Seltzer // Proc. Annual Conference of International Speech Communication Association (INTERSPEECH). — 2011. — P. 237–240.
125. Extracting deep bottleneck features using stacked auto-encoders [Text] / J. Gehring, Y. Miao, F. Metze, A. Waibel // Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). — 2013. — P. 3377–3381.
126. Grézl, F. Semi-supervised bootstrapping approach for neural network feature extractor training [Text] / F. Grézl, M.