Chin-Hui Lee - Distinguished Lecture - Wednesday, September 8 - 11:30 - Room D5-003
13/06/2018
On Recent Progresses in the ASAT Project: Universal Speech Attribute Modeling
Summary
Automatic Speech Attribute Transcription (ASAT) is a recently proposed speech analysis paradigm that simulates the human auditory perception process: detecting acoustic and auditory cues, weighing and combining them to form theories, and then processing these cognitive hypotheses until a semantically and pragmatically consistent understanding is achieved. In contrast to the conventional HMM-based framework, which is top-down in nature, one major goal of the ASAT paradigm is to develop a bottom-up approach to automatic speech recognition via attribute detection and knowledge integration. These two key technologies can also be applied to other applications. In this talk we report on recent studies of universal speech attribute modeling in two related tasks: (i) language-universal attribute and phone recognition; and (ii) automatic spoken language recognition. We show that language-universal speech attribute models can perform better than language-specific attribute models for attribute detection and phone recognition. We also demonstrate that language recognition with two simple sets of manner and place of articulation models can achieve higher accuracy than state-of-the-art spoken language recognition systems. We anticipate that universal speech attribute modeling tools will provide new opportunities for future multilingual research.
Bio
Chin-Hui Lee is a professor at the School of Electrical and Computer Engineering, Georgia Institute of Technology. Dr. Lee received the B.S. degree in Electrical Engineering from National Taiwan University, Taipei, in 1973, the M.S. degree in Engineering and Applied Science from Yale University, New Haven, in 1977, and the Ph.D. degree in Electrical Engineering with a minor in Statistics from the University of Washington, Seattle, in 1981.
Dr. Lee started his professional career at Verbex Corporation, Bedford, MA, where he was involved in research on connected word recognition. In 1984, he became affiliated with Digital Sound Corporation, Santa Barbara, where he engaged in research and product development in speech coding, speech synthesis, speech recognition, and signal processing for the development of the DSC-2000 Voice Server. Between 1986 and 2001, he was with Bell Laboratories, Murray Hill, New Jersey, where he became a Distinguished Member of Technical Staff and Director of the Dialogue Systems Research Department. His research interests include multimedia communication, multimedia signal and information processing, speech and speaker recognition, speech and language modeling, spoken dialogue processing, adaptive and discriminative learning, biometric authentication, and information retrieval. From August 2001 to August 2002 he was a visiting professor at the School of Computing, The National University of Singapore. In September 2002, he joined the faculty of the Georgia Institute of Technology.
Prof. Lee has participated actively in professional societies. He is a member of the IEEE Signal Processing Society (SPS), Communication Society, and the International Speech Communication Association (ISCA). From 1991 to 1995, he was an associate editor for the IEEE Transactions on Signal Processing and Transactions on Speech and Audio Processing. During the same period, he served as a member of the ARPA Spoken Language Coordination Committee. From 1995 to 1998 he was a member of the Speech Processing Technical Committee, serving as its chairman from 1997 to 1998. In 1996, he helped promote the SPS Multimedia Signal Processing Technical Committee, of which he is a founding member.
Dr. Lee is a Fellow of the IEEE and has published more than 300 papers and 25 patents. He received the SPS Senior Award in 1994 and the SPS Best Paper Award in 1997 and 1999. In 1997, he was awarded the prestigious Bell Labs President's Gold Award for his contributions to the Lucent Speech Processing Solutions product. Dr. Lee often gives seminal lectures to a wide international audience. In 2000, he was named one of the six Distinguished Lecturers by the IEEE Signal Processing Society. He was also named one of ISCA's two inaugural Distinguished Lecturers for 2007-2008. Recently he won the SPS's 2006 Technical Achievement Award for "Exceptional Contributions to the Field of Automatic Speech Recognition".