Learning Diverse Sub-Policies Via A Task-Agnostic Regularization On Action Distributions.

Automatic sub-policy discovery has recently received much attention in hierarchical reinforcement learning (HRL). Conventional approaches to learning sub-policies tend to collapse, with a single sub-policy dominating the whole task, lacking techn
