(Click the titles to get access to the resources)
Noise-robust voice activity detection (rVAD) - source code, reference VAD for Aurora 2, based on the following paper:
Z.-H. Tan and B. Lindberg, "Low-complexity variable frame rate analysis
for speech recognition and voice activity detection." IEEE Journal of
Selected Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.
Aurora 2 VAD
The VAD labels for the Aurora 2 database, generated by forced alignment, as presented in the following paper:
Z.-H. Tan and B. Lindberg, "Low-complexity variable frame rate analysis
for speech recognition and voice activity detection." IEEE Journal of
Selected Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.
The source code for iSocioBot, presented in the following paper:
Nicolai Bæk Thomsen, Xiaodong Duan, Evgenios Vlachos, Sven Ewan
Shepstone, Morten H. Rasmussen and Jesper Lisby Højvang, "iSocioBot - A
Multimodal Interactive Social Robot," accepted by the International
Journal of Social Robotics (Springer). PDF from Springer Nature Sharing.
Filter bank neural networks (FBNN.zip, 55 MB)
The source code of the filter bank neural network (FBNN), presented in the following paper:
Zheng-Hua Tan, Yiming Zhang, Zhanyu Ma, and Jun Guo, "DNN Filter Bank
Cepstral Coefficients for Spoofing Detection," accepted by IEEE Access.
PDF from IEEEXplore.
Three-Dimensional Adaptive Sensing of People - Code and Supplementary Video Examples
Crowd Analysis - Supplementary Video Examples of the following paper:
Santoro, S. Pedro, Z.-H. Tan and T.B. Moeslund, "Crowd
Analysis by Using Optical Flow and Density Based Clustering," EUSIPCO
2010, the 18th European Signal Processing Conference, Aalborg,
Denmark, August 2010.