SIPCom8-4 

Speech Processing

 

Course Description

 

The course is given by Per Rubak (PR), Mads Græsbøll Christensen (MGC), Søren Holdt Jensen (SHJ), Chunjian Li (CL), and Zheng-Hua Tan (ZT). The course coordinator is Zheng-Hua Tan.

 

 

Literature:

Deller, Hansen, Proakis, Discrete-Time Processing of Speech Signals, 2nd edition, Wiley-IEEE Press, 1999.

---------------------------------------------

Lecture 1:   

Time and place:

    Tuesday, February 7th 2006

    Lecture, 12:30 to 14:00 in A4-106

    Problem solving, 14:14 to 16:15

Topics:        Fundamentals of speech science, model of speech production

Lecturer:   ZT

Preparation: 

    Slides for MM1

    Literature: Deller, Hansen, Proakis, “Discrete-Time Processing of Speech Signals”, chapters 2 and 3 (pp. 81-85, 99-146, 151-201)

Exercise:

    Speech files associated with the textbook and speech files for this exercise. Tool: Speech Filing System.

    Exercise for MM1.

---------------------------------------------

Lecture 2:   

Time and place:

    Tuesday, February 14th 2006

    Lecture, 12:30 to 14:00 in A4-106

    Problem solving, 14:14 to 16:15

Topic:   

     Speech perception

Lecturer:   PR

Preparation:

    Slides

    Literature:

Exercise:
Solution

---------------------------------------------

Lecture 3:   

Time and place:

    Tuesday, February 21st 2006

    Lecture, 12:30 to 14:00 in A4-106

    Problem solving, 14:14 to 16:15

Topics:   

      Fundamentals of linear prediction-based speech coding.

Lecturer:
    MGC

Literature:
   
W. B. Kleijn and K. K. Paliwal, Eds., "Speech Coding and Synthesis", 1995, Chapters 1 (pp. 3-47) and 3 (pp. 79-119).
Exercises:

    Click here.

---------------------------------------------

Lecture 4:   

Time and place:

    Tuesday, February 28th 2006

    Lecture, 12:30 to 14:00 in A4-106

    Problem solving, 14:14 to 16:15 

Topics:

    Parametric speech and audio coding, relation to vector quantization, efficient implementation of perceptual distortion measures.

Lecturer:  

    MGC

Literature:
    M. G. Christensen and S. H. Jensen, "On perceptual distortion minimization and nonlinear least-squares frequency estimation", IEEE Trans. on Audio, Speech and Language Processing, Volume 14, Issue 1, Jan. 2006, pp. 99-109 (pdf).

Exercises:

    Click here.


---------------------------------------------

Lecture 5:   

Time and place:

    Tuesday, March 7th 2006

    Lecture, 12:30 to 14:00 in A4-106

    Problem solving, 14:14 to 16:15

Topics:
   
Rate-distortion optimization, rate-distortion optimized LPC.

Lecturer:

   MGC

Literature:

    P. Prandoni and M. Vetterli, "R/D optimal linear prediction", IEEE Trans. on Speech and Audio Processing, Volume 8,  Issue 6,  Nov. 2000, pp. 646-655 (pdf).

Exercises:

    Click here.

---------------------------------------------

Lecture 6:   

Time and place:

    Tuesday, March 21st 2006 

    Lecture, 12:30 to 14:00 in NJ14 3-119

    Problem solving, 14:14 to 16:15

Topics:   

     Speech enhancement: single-channel methods based on spectral subtraction

Lecturer:   SHJ / CL

Preparation: 

Slides for MM6
Literature:

Deller, Hansen, Proakis, “Discrete-Time Processing of Speech Signals”, chapter 8

Supplementary Literature:

D. Ealey, H. Kelleher, and D. Pearce, "Harmonic Tunneling: tracking non-stationary noise during speech", Eurospeech 2001

R. Martin, "Noise Power Spectral Density Estimation Based on Optimal Smoothing and Minimum Statistics", IEEE Trans. on Speech and Audio Processing, vol. 9, no. 5, July 2001, pp. 504-514

Exercise:      

Given at the lecture 

---------------------------------------------

Lecture 7:   

Time and place:

    Tuesday, March 28th 2006

    Lecture, 12:30 to 14:00 in NJ14 3-119

    Problem solving, 14:14 to 16:15

Topic:   

     Speech enhancement: multi-channel methods

Lecturer:   SHJ / CL

Preparation: 

Slides for MM7

Literature:

Deller, Hansen, Proakis, “Discrete-Time Processing of Speech Signals”, chapter 8

Exercise:      

Given at the lecture 

---------------------------------------------

Lecture 8:   

Time and place:

    Tuesday, April 4th 2006 

    Lecture, 12:30 to 14:00 in A4-108 (Note the room!)

    Problem solving, 14:14 to 16:15

Topic:   

     Speech synthesis and speech recognition (DTW)

Lecturer:   ZT

Preparation: 

Slides for MM8 

Literature: 

Exercise:

Exercise 1: Online demo.

Exercise 2: Use Matlab to implement Dynamic Time Warping to compare speech signals
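As a starting point for Exercise 2, the core recursion of Dynamic Time Warping can be sketched as below. This is only an illustrative sketch and not the course solution: it is written in Python rather than Matlab, it compares plain scalar sequences (in practice each element would be a feature vector per speech frame), and the function name dtw_distance, the absolute-difference local cost, and the three allowed step directions are assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two scalar sequences.

    Assumptions for this sketch: local cost is |x - y|, and the warping
    path may move one step in a, one step in b, or one step in both.
    """
    n, m = len(a), len(b)
    # D[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # step in a only
                                 D[i][j - 1],      # step in b only
                                 D[i - 1][j - 1])  # step in both
    return D[n][m]
```

For two utterances, a and b would be replaced by sequences of frame-level features and the local cost by a distance between feature vectors; the recursion itself stays the same.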

---------------------------------------------

Lecture 9:   

Time and place:

    Tuesday, April 11th 2006 

    Lecture, 12:30 to 14:00 in NJ14 3-119 (Note the room!)

    Problem solving, 14:14 to 16:15

Topic:   

     Speech recognition, HMM

Lecturer:   ZT

Preparation: 

Slides for MM9

Literature:

 

Exercise:

Exercise 1: Hidden Markov model and Viterbi decoding.

Exercise 2: HTK demo. 
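For orientation on Exercise 1, Viterbi decoding for a discrete-observation HMM can be sketched as follows. This is a minimal sketch under stated assumptions, not the exercise solution and not how HTK implements it: the function name viterbi and the dictionary-based model layout (start_p, trans_p, emit_p) are hypothetical, and plain probabilities are multiplied directly, whereas long sequences would normally be handled with log-probabilities.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most likely state sequence, its probability).

    Hypothetical model layout for this sketch:
      start_p[s]     initial probability of state s
      trans_p[r][s]  probability of moving from state r to state s
      emit_p[s][o]   probability of observing symbol o in state s
    """
    # V[t][s] is the probability of the best path that ends in s at time t.
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = []  # back[t][s] is the best predecessor of s at time t + 1
    for o in obs[1:]:
        prev = V[-1]
        col, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: prev[r] * trans_p[r][s])
            col[s] = prev[best_prev] * trans_p[best_prev][s] * emit_p[s][o]
            ptr[s] = best_prev
        V.append(col)
        back.append(ptr)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, V[-1][last]
```

The function takes the observation sequence and the three model tables and returns both the decoded state path and the probability of that path.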

---------------------------------------------

Lecture 10:   

Time and place:

    Tuesday, April 18th 2006

    Lecture, 12:30 to 14:00 in A4-106 (Note the room!)

    Problem solving, 14:14 to 16:15

Topic:   

     Large Vocabulary Continuous Speech Recognition (LVCSR) and More

Lecturer:   ZT

Preparation: 

Slides for MM10

Literature:

Exercise:

Given at the lecture.