

Online Resources


(Click the titles to get access to the resources)

rVAD
Noise-robust voice activity detection (rVAD) - source code, reference VAD for Aurora 2, based on the following paper:
Z.-H. Tan and B. Lindberg, "Low-complexity variable frame rate analysis for speech recognition and voice activity detection." IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.

Aurora 2 VAD
The VAD labels for the Aurora 2 database, generated by forced alignment, as presented in the following paper:
Z.-H. Tan and B. Lindberg, "Low-complexity variable frame rate analysis for speech recognition and voice activity detection." IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.

Audio adversarial examples
Source code and datasets for generation and detection of attacks on deep speech recognition systems.

iSocioBot
The source code for iSocioBot, presented in the following paper:
Zheng-Hua Tan, Nicolai Bæk Thomsen, Xiaodong Duan, Evgenios Vlachos, Sven Ewan Shepstone, Morten H. Rasmussen and Jesper Lisby Højvang, "iSocioBot - A Multimodal Interactive Social Robot," accepted by International Journal of Social Robotics (Springer). PDF from Springer Nature Sharing.

Contextual TV Dataset
Miklas S. Kristoffersen, Sven E. Shepstone, and Zheng-Hua Tan. The Importance of Context When Recommending TV Content: Dataset and Algorithms. arXiv:1808.00337 [cs.IR].

Subjective annotations of attention
Andrea Coifman, Péter Rohoska, Miklas S. Kristoffersen, Sven E. Shepstone, and Zheng-Hua Tan. Subjective Annotations for Vision-Based Attention Level Estimation. VISAPP '19: 14th International Conference on Computer Vision Theory and Applications.

Feature learning for face recognition
Xiaodong Duan and Zheng-Hua Tan, A Spatial Self-Similarity Based Feature Learning Method for Face Recognition under Varying Poses, Pattern Recognition Letters, vol. 111, pp. 109-116, August 2018.

Filter bank neural networks (FBNN.zip, 55 MB)
The source code for the filter bank neural network (FBNN), presented in the following paper:
Hong Yu, Zheng-Hua Tan, Yiming Zhang, Zhanyu Ma, and Jun Guo, "DNN Filter Bank Cepstral Coefficients for Spoofing Detection," accepted by IEEE Access. PDF from IEEE Xplore.

3D sensing
Three-Dimensional Adaptive Sensing of People - Code and Supplementary Video Examples

Crowd analysis
Crowd Analysis - Supplementary Video Examples of the following paper:
F. Santoro, S. Pedro, Z.-H. Tan and T.B. Moeslund, "Crowd Analysis by Using Optical Flow and Density Based Clustering," EUSIPCO 2010, the 18th European Signal Processing Conference, Aalborg, Denmark, August 2010.