Institute of Information Technology
Dr. rer. nat. Rainer Gruhn

PhD Topic
Statistical Pronunciation Modeling for Non-Native Speech (completed)
Research Interests
- Acoustic Modelling
- Multilingual Speech Recognition
- Non-Native Speech Recognition
- Speech-Driven Systems
Publications
D. Vásquez, R. Gruhn and W. Minker
Hierarchical Neural Network Structures for Modeling Inter and Intra Phonetic Information for Phoneme Recognition
Springer Verlag, Heidelberg (Germany), 2012
M. Elmahdy, R. Gruhn and W. Minker
Novel Techniques for Dialectal Arabic Speech Recognition
Springer, Boston (USA), 2012
M. Elmahdy, R. Gruhn, S. Abdennadher and W. Minker
Rapid Phonetic Transcription using Everyday Life Natural Chat Alphabet Orthography for Dialectal Arabic Speech Recognition
International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Prague, Czech Republic, May 2011
R. Gruhn, W. Minker and S. Nakamura
Statistical Pronunciation Modeling for Non-Native Speech Processing
Springer Verlag, Heidelberg (Germany), 2011
M. Elmahdy, R. Gruhn, W. Minker and S. Abdennadher
Cross-Lingual Acoustic Modeling for Dialectal Arabic Speech Recognition
International Conference on Speech and Language Processing (Interspeech), Makuhari, Japan, September 2010
D. Vásquez, G. Aradilla, R. Gruhn and W. Minker
A Hierarchical Structure for Modeling Inter and Intra Phonetic Information for Phoneme Recognition
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Merano, Italy, December 2009
D. Vásquez, G. Aradilla, R. Gruhn and W. Minker
Isolated Word Recognition Based on Inter and Intra Phonetic Classifiers
First International Workshop on Spoken Dialogue Systems (IWSDS), Kloster Irsee, Germany, December 2009
D. Vásquez, G. Aradilla, R. Gruhn and W. Minker
On Speeding Phoneme Recognition in a Hierarchical MLP Structure
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Merano, Italy, December 2009
M. Elmahdy, R. Gruhn, W. Minker and S. Abdennadher
Effect of Gaussian Densities and Amount of Training Data on Grapheme-Based Acoustic Modeling for Arabic
IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE), Dalian, China, September 2009
D. Vásquez, G. Aradilla, R. Gruhn and W. Minker
On Expanding Context by Temporal Decomposition for Improving Phoneme Recognition
International Conference on Speech and Computer (SPECOM), St. Petersburg, Russia, June 2009
M. Elmahdy, R. Gruhn, W. Minker and S. Abdennadher
Survey on common Arabic language forms from a speech recognition point of view
International Conference on Acoustics (NAG-DAGA), Rotterdam, Netherlands, March 2009
T. Cincarek, R. Gruhn, C. Hacker, E. Nöth and S. Nakamura
Automatic Pronunciation Scoring of Words and Sentences Independent from the Non-native’s First Language
Computer Speech and Language, Vol. 23, Num. 1, pp. 65-88, January 2009
M. Raab, O. Schreiner, T. Herbig, R. Gruhn and E. Nöth
Optimal Projections between Gaussian Mixture Feature Spaces for Multilingual Speech Recognition
International Conference on Acoustics (NAG-DAGA), Rotterdam (The Netherlands), 2009
H. Lang, M. Raab, R. Gruhn and W. Minker
Comparing Acoustic Model Adaption Methods for Non-Native Speech Recognition
International Conference on Acoustics (NAG-DAGA), 2009
D. Vásquez, G. Aradilla, R. Gruhn and W. Minker
Improving Context Modeling for Phoneme Recognition
International Conference on Acoustics (NAG-DAGA), Rotterdam, The Netherlands, pp. 419-422, 2009
M. Elmahdy, R. Gruhn, W. Minker and S. Abdennadher
Modern Standard Arabic Based Multilingual Approach for Dialectal Arabic Speech Recognition
The Eighth International Symposium on Natural Language Processing (SNLP), Bangkok, Thailand, 2009
M. Raab, T. Herbig, R. Brueckner, R. Gruhn and E. Nöth
Adaptation of Frequency Band Influence for Non-native Speech Recognition
19. Konferenz Elektronische Sprachsignalverarbeitung (ESSV), Frankfurt a.M. (Germany), September 2008
M. Pfeil, D. Bühler, R. Gruhn and W. Minker
Evaluating Text Normalization for Speech-Based Media Selection
Proceedings of the 4th IEEE Tutorial and Research Workshop Perception and Interactive Technology for Speech-Based Systems (PIT08), Kloster Irsee (Germany), June 2008
M. Raab, R. Gruhn and E. Nöth
Codebook Design for Speech Guided Car Infotainment Systems
Proceedings of the 4th IEEE Tutorial and Research Workshop Perception and Interactive Technology for Speech-Based Systems (PIT08), pp. 44-51, 2008
D. Vásquez, R. Gruhn, R. Brueckner and W. Minker
Comparing Linear Feature Space Transformations for Correlated Features
Springer, In: Perception in Multimodal Dialogue Systems, Series: Lecture Notes in Computer Science, Vol. 5078, pp. 176-187, 2008
M. Raab, R. Gruhn and E. Nöth
Multilingual Weighted Codebooks
ICASSP, pp. 4257-4260, 2008
M. Raab, R. Gruhn and E. Nöth
Multilingual Weighted Codebooks for Non-native Speech Recognition
Proc. TSD, pp. 485-492, 2008
M. Raab, R. Gruhn and E. Nöth
Non-Native Speech Databases
2007 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Kyoto (Japan), December 2007
Y. Sun, D. Willett, R. Brueckner, R. Gruhn and D. Bühler
Experiments on Chinese Speech Recognition with Tonal Models and Pitch Estimation using the Mandarin SPEECON Data
International Conference on Speech and Language Processing (ICSLP), Pittsburgh (USA), 2006
C. Hacker, T. Cincarek, R. Gruhn, S. Steidl, E. Nöth and H. Niemann
Pronunciation Feature Extraction
Proc. DAGM-Symposium, pp. 141-148, 2005
R. Gruhn, T. Cincarek and S. Nakamura
A Multi-Accent Non-Native English Database
Proc. Acoust. Soc. Japan, pp. 195-196, September 2004
R. Gruhn and S. Nakamura
A statistical lexicon based on HMMs
Proc. Information Processing Society of Japan, Vol. 2, pp. 37f, 2004
R. Gruhn, K. Markov and S. Nakamura
A statistical lexicon for non-native speech recognition
Proc. Interspeech, pp. 1497-1500, 2004
W. Svojanovsky, R. Gruhn and S. Nakamura
Classification of Nonverbal Utterances in Japanese Spontaneous Speech
Proc. Information Processing Society of Japan, Vol. 2, pp. 277-278, 2004
R. Gruhn, K. Markov and S. Nakamura
Discrete HMMs for statistical pronunciation modeling
SLP 52 / HI 109, Series: Speech and Language Processing Workshop, pp. 123-128, 2004
T. Cincarek, R. Gruhn and S. Nakamura
Pronunciation scoring and extraction of mispronounced words for non-native speech
Proc. Acoustic Society of Japan, pp. 165-166, 2004
K. Markov, T. Matsui, R. Gruhn, J. Zhang and S. Nakamura
Noise and Channel Distortion Robust ASR System for DARPA SPINE2 Task
IEICE Transactions on Information and Systems, Vol. E86-D, Num. 3, pp. 497-504, March 2003
T. Nishiura, R. Gruhn and S. Nakamura
Automatic Steering of Microphone Array and Video Camera Toward Multi-Lingual Tele-Conference Through Speech-to-Speech Translation
Journal of the Information Processing Society of Japan, pp. 3617-3620, December 2002
T. Nishiura, R. Gruhn and S. Nakamura
A Prototype System Design of Distant Talking Speech Translation with a Microphone Array and Video Camera
Proc. Acoustic Society of Japan, Vol. 1, pp. 229-230, 2002
J. Zhang, K. Markov, T. Matsui, R. Gruhn and S. Nakamura
Developing Robust Baseline Acoustic Models for Noisy Speech Recognition in SPINE2 Project
Proc. Acoustic Society of Japan, Vol. 1, pp. 65-66, 2002
R. Gruhn, K. Markov and S. Nakamura
Probability Sustaining Phoneme Substitution for Non-Native Speech Recognition
Proc. Acoustic Society of Japan, pp. 195-196, 2002
N. Binder, R. Gruhn and S. Nakamura
Recognition of Non-native Speech Using Dynamic Phoneme Lattice Processing
Proc. Acoustic Society of Japan, pp. 203-204, 2002
K. Markov, T. Matsui, R. Gruhn and S. Nakamura
Robust Speech Recognition in Diverse Noisy Environment
Proc. Acoustic Society of Japan, Vol. 1, pp. 67-68, 2002
T. Nishiura, R. Gruhn and S. Nakamura
Collaborative Steering of Microphone Array and Video Camera Toward Multi-Lingual Tele-Conference Through Speech-to-Speech Translation
ASRU, December 2001
T. Nishiura, R. Gruhn and S. Nakamura
Automatic Steering of Microphone Array and Video Camera Toward Multi-Lingual Tele-Conference Through Speech-to-Speech Translation
ICME, pp. 569-572, August 2001
R. Gruhn, K. Takashima, T. Matsuda, A. Nishino and S. Nakamura
A CORBA Based Speech-to-Speech Translation System
Proc. Acoustic Society of Japan, pp. 225-226, 2001
A. Nakamura, M. Naito, H. Tsukada, R. Gruhn, E. Sumita, H. Kashioka, J. Nakajima, T. Shimizu and Y. Sagisaka
A Speech Translation System Applied to a Real-World Task/Domain and Its Evaluation Using Real-World Speech Data
IEICE Transactions on Information and Systems, Vol. E84-D, Num. 1, pp. 142-154, 2001
K. Markov, T. Matsui, J. Zhang, R. Gruhn and S. Nakamura
ATR System for Robust Speech Recognition in Real World Noisy and Channel Environments
IEICE technical report on natural language understanding and models of communication, Vol. 101, Num. 520, pp. 37-43, 2001
R. Gruhn, K. Takashima, T. Matsuda, A. Nishino and S. Nakamura
CORBA-based Speech to Speech Translation System
ASRU, pp. 355-358, 2001
R. Gruhn and S. Nakamura
Multilingual Speech Recognition with the CALLHOME Corpus
Proc. Acoustic Society of Japan, pp. 153-154, 2001
N. Binder, K. Markov, R. Gruhn and S. Nakamura
Speech Non-Speech Separations with GMMs
Proc. Acoustic Society of Japan, Vol. 1, pp. 141-142, 2001
R. Gruhn, H. Singer, H. Tsukada, A. Nakamura, M. Naito, A. Nishino, Y. Sagisaka and S. Nakamura
Cellular Phone Based Speech-to-Speech Translation System ATR-MATRIX
ICSLP, pp. 448-451, 2000
R. Gruhn, S. Nakamura and Y. Sagisaka
Towards a Cellular Phone Based Speech-To-Speech Translation Service
Proc. MSC workshop on Multilingual Speech Communication, Kyoto, 2000
K. Murai, R. Gruhn and S. Nakamura
Speech Start/End Point Detection Using Mouth Image
Proc. Information Processing Society of Japan, Vol. 2, pp. 169-170, 2000
R. Gruhn, H. Singer and Y. Sagisaka
Scalar Quantization of Cepstral Parameters for Low Bandwidth Client-Server Speech Recognition Systems
Proc. Acoustic Society of Japan, pp. 129-130, November 1999
H. Singer, R. Gruhn and Y. Sagisaka
Speech Translation Anywhere: Client-Server Based ATR-MATRIX
Proc. Acoustic Society of Japan, pp. 165-166, November 1999


