Speech Technology and Research (STAR) Laboratory Seminar Series
Upcoming Talks
Abstract: Dan will talk about the open-source speech recognition toolkit "Kaldi". Topics covered include the history of the project, the overall design of the toolkit, the use of Weighted Finite State Transducers (WFSTs), mechanisms for dealing efficiently with large collections of data, the use of lattices, decoding-graph construction, and the algorithms used in the training recipes. He will also talk about lattice generation for speech recognition, and describe how to generate "perfect" lattices efficiently, using a special semiring.
Bio: Daniel Povey received his Bachelor's (Natural Sciences, 1997), Master's (Computer Speech and Language Processing, 1998) and PhD (Engineering, 2003) from Cambridge University. From 2003 to 2008 he worked as a researcher at IBM Research in Yorktown Heights, NY. He is best known for his work on discriminative training for HMM-GMM-based speech recognition: MMI, MPE, fMPE/fMMI, and boosted MMI. He is currently working at Microsoft Research.