

Online Resources


(Click the titles to get access to the resources)

rVAD
Noise-robust voice activity detection (rVAD) - source code, reference VAD for Aurora 2, based on the following paper:
Z.-H. Tan and B. Lindberg, "Low-complexity variable frame rate analysis for speech recognition and voice activity detection." IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.

Aurora 2 VAD
The VAD labels for the Aurora 2 database, generated by forced alignment, as presented in the following paper:
Z.-H. Tan and B. Lindberg, "Low-complexity variable frame rate analysis for speech recognition and voice activity detection." IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.

iSocioBot
The source code for iSocioBot, presented in the following paper:
Zheng-Hua Tan, Nicolai Bæk Thomsen, Xiaodong Duan, Evgenios Vlachos, Sven Ewan Shepstone, Morten H. Rasmussen and Jesper Lisby Højvang, "iSocioBot - A Multimodal Interactive Social Robot," accepted by International Journal of Social Robotics (Springer). PDF from Springer Nature Sharing.

Filter bank neural networks (FBNN.zip, 55 MB)
The source code of the filter bank neural network (FBNN), presented in the following paper:
Hong Yu, Zheng-Hua Tan, Yiming Zhang, Zhanyu Ma, and Jun Guo, "DNN Filter Bank Cepstral Coefficients for Spoofing Detection," accepted by IEEE Access. PDF from IEEE Xplore.

3D sensing
Three-Dimensional Adaptive Sensing of People - Code and Supplementary Video Examples

Crowd analysis
Crowd Analysis - Supplementary Video Examples for the following paper:
F. Santoro, S. Pedro, Z.-H. Tan and T.B. Moeslund, "Crowd Analysis by Using Optical Flow and Density Based Clustering," EUSIPCO 2010, the 18th European Signal Processing Conference, Aalborg, Denmark, August 2010.