Disentangled Speech Embeddings Using Cross-Modal Self-Supervision

The objective of this paper is to learn representations of speaker identity without access to manually annotated data. To do so, we develop a self-supervised learning objective that exploits the natural cross-modal synchrony between faces and audio in vid