Parallelizing Adam Optimizer With Blockwise Model-Update Filtering

Recently, Adam has become a popular stochastic optimization method in deep learning. To parallelize Adam in a distributed system, the synchronous stochastic gradient (SSG) technique is widely used, but it is inefficient due to its heavy per-step communication cost.
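To make the communication pattern of SSG concrete, below is a minimal NumPy sketch: each of W workers produces a gradient for the current mini-batch, the gradients are averaged (standing in for the allreduce that dominates communication cost), and a single standard Adam update is applied to the shared model. The worker count, toy gradients, and parameter sizes are illustrative assumptions, not details from the paper.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update for parameters theta given gradient grad."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Synchronous stochastic gradient (SSG) parallelization: every step, each of the
# W workers computes a gradient on its own mini-batch; the gradients are averaged
# (the allreduce that incurs the per-step communication cost), and one Adam
# update is applied to the shared model replica.
W = 4                                  # number of workers (illustrative)
theta = np.zeros(10)                   # toy model parameters
m = np.zeros_like(theta)
v = np.zeros_like(theta)
rng = np.random.default_rng(0)

for t in range(1, 101):
    worker_grads = [rng.normal(size=theta.shape) for _ in range(W)]  # stand-in gradients
    avg_grad = np.mean(worker_grads, axis=0)                         # simulated allreduce
    theta, m, v = adam_step(theta, avg_grad, m, v, t)
```

Because this averaging happens at every optimizer step, communication volume grows with the number of steps rather than with the amount of useful work per worker, which is the inefficiency the abstract points to.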