Fully-Hierarchical Fine-Grained Prosody Modeling For Interpretable Speech Synthesis

This paper proposes a hierarchical, fine-grained and interpretable latent variable model for prosody based on the Tacotron 2 text-to-speech model. It achieves multi-resolution modeling of prosody by conditioning finer level representations on coarser leve