ArabCeleb: Speaker Recognition in Arabic
In this paper we present ArabCeleb, a dataset collected in the wild that focuses specifically on the Arabic language. The dataset contains utterances from 100 celebrities, taken from videos on YouTube.com. It can be used for several speaker recognition tasks: identification, verification, and gender recognition, as well as multimodal recognition tasks that integrate the audio and video tracks.
To complete our study, we evaluate the most recent state-of-the-art speaker recognition methods and measure their robustness as the length of the utterances increases.
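As an illustration of what measuring robustness against utterance length could look like for the verification task, the sketch below groups verification trials by utterance duration and computes an error rate per group. It is a minimal sketch under stated assumptions, not the evaluation protocol of this paper: the choice of equal error rate (EER) as the metric, the trial format (duration in seconds, similarity score, same-speaker label), the duration bins, and the function names `equal_error_rate` and `eer_by_duration` are all hypothetical.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER of similarity scores.

    labels: 1 for a same-speaker trial, 0 for a different-speaker trial.
    The metric and this threshold sweep are assumptions made for illustration.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one trial of each class")

    best_gap, eer = None, None
    # Sweep every observed score as a candidate decision threshold.
    for threshold in sorted(set(scores)):
        # False acceptance: different-speaker trial scored at or above threshold.
        far = sum(s >= threshold for s in negatives) / len(negatives)
        # False rejection: same-speaker trial scored below threshold.
        frr = sum(s < threshold for s in positives) / len(positives)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def eer_by_duration(trials, bin_edges):
    """Compute the EER separately for each utterance-duration bin.

    trials: iterable of (duration_seconds, score, label) tuples (hypothetical format).
    bin_edges: increasing duration boundaries, e.g. [0, 5, 10].
    """
    trials = list(trials)
    results = {}
    for low, high in zip(bin_edges[:-1], bin_edges[1:]):
        subset = [(s, y) for d, s, y in trials if low <= d < high]
        scores = [s for s, _ in subset]
        labels = [y for _, y in subset]
        results[(low, high)] = equal_error_rate(scores, labels)
    return results
```

A method that is robust to utterance length would show a per-bin error rate that stays low for short utterances rather than only improving on long ones.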