The course will enable the student to understand and design advanced multi-modal
user interfaces, including speech-based interaction, which is one of the
primary goals of the VGIS programme.
Speech is the most natural means of human-human communication. As computing
machines become more and more capable and widespread, there is an increasing
demand to include speech as a key component in human-machine interfaces. This
course attempts to provide the students with a basic comprehension of the
methods and models applied in speech and multi-modal systems.
speech recognition and synthesis
of information from e.g. speech and visual modalities into advanced
Multi-modal interface design and evaluation methods
and platforms of MM systems
1 Slides (Introduction and speech
2 Slides (Speech recognition I)
and try out Sphinx-4, a speech recognizer
written entirely in the Java programming language. Some key
Walker, Paul Lamere, Philip Kwok, et al., "Sphinx-4:
A Flexible Open Source Framework for Speech Recognition,"
Technical Report, Sun Microsystems, Inc.
Paul Lamere, Philip Kwok, William Walker, Evandro Gouvêa, Rita Singh, Bhiksha Raj, Peter Wolf,
"Design of the CMU Sphinx-4
decoder," EUROSPEECH 2003.
3 Slides (Speech recognition II)
4 Slides (Language modeling)
5 Slides Part 1 & Part
2 by Rita Singh (System design and applications)
Link to Lecture 6-10 by Lars Bo Larsen.