(Click the titles to get access to the resources)
Noise-robust voice activity detection (rVAD) - source code, reference VAD for Aurora 2, based on the following paper:
Z.-H. Tan and B. Lindberg, "Low-complexity variable frame rate analysis
for speech recognition and voice activity detection." IEEE Journal of
Selected Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.
Aurora 2 VAD
The VAD labels for the Aurora 2 database, generated by forced alignment, as presented in the following paper:
Z.-H. Tan and
B. Lindberg, "Low-complexity variable frame rate analysis for speech
recognition and voice activity detection." IEEE Journal of Selected
Topics in Signal Processing, vol. 4, no. 5, pp. 798-807, 2010.
Audio adversarial examples
Source code and datasets for generation and detection of attacks on deep speech recognition systems.
The source code for iSocioBot, presented in the following paper:
Nicolai Bæk Thomsen, Xiaodong Duan, Evgenios Vlachos, Sven Ewan
Shepstone, Morten H. Rasmussen and Jesper Lisby Højvang, "iSocioBot - A
Multimodal Interactive Social Robot," accepted by International
Journal of Social Robotics. (Springer). PDF from Springer Nature Sharing.
Contextual TV Dataset
Miklas S. Kristoffersen, Sven E. Shepstone, and Zheng-Hua Tan. The
Importance of Context When Recommending TV Content: Dataset and
Algorithms. arXiv:1808.00337 [cs.IR].
Subjective annotations of attention
Andrea Coifman, Péter Rohoska, Miklas S. Kristoffersen, Sven E.
Shepstone, and Zheng-Hua Tan. Subjective Annotations for Vision-Based
Attention Level Estimation. VISAPP '19: 14th International Conference
on Computer Vision Theory and Applications.
Feature learning for face recognition
Xiaodong Duan and Zheng-Hua Tan, “A Spatial Self-Similarity Based
Feature Learning Method for Face Recognition under Varying Poses,”
Pattern Recognition Letters, vol. 111, pp. 109-116, August 2018.
Filter bank neural networks (FBNN.zip, 55 MB)
The source code for the filter bank neural network (FBNN), presented in the following paper:
Zheng-Hua Tan, Yiming Zhang, Zhanyu Ma, and Jun Guo, “DNN Filter Bank
Cepstral Coefficients for Spoofing Detection,” accepted by IEEE Access.
PDF from IEEE Xplore.
Three-Dimensional Adaptive Sensing of People - Code and Supplementary Video Examples
Crowd Analysis - Supplementary Video Examples of the following paper:
Santoro, S. Pedro, Z.-H. Tan and T.B. Moeslund, "Crowd
Analysis by Using Optical Flow and Density Based Clustering," EUSIPCO
2010 – the 18th European Signal Processing Conference, Aalborg,
Denmark, August 2010.