The next TALC seminar will take place on Wednesday, January 16th at 2pm in room A008.
Md Sahidullah (Multispeech team) will give a presentation entitled “Speaker embeddings: from i-vector to x-vector and beyond”.
Abstract:
Speaker recognition is the task of recognizing a person from their voice. State-of-the-art speaker recognition technology uses a speaker embedding method to represent a speech utterance of arbitrary length as a fixed-dimensional vector. Recent advances in deep neural network (DNN) research have enabled the development of robust and efficient speaker embedding techniques. In this talk, I will first provide a brief overview of speaker recognition basics. This will be followed by a description of the conventional speaker embedding method popularly known as the i-vector. Then I will present various attempts to develop speech signal representations with DNN-based discriminative training. I will explain the recently introduced x-vector embedding, which has shown promising speaker recognition performance. The talk will end with a discussion of potential future directions in speaker embedding research, including our ongoing work.