Stochastic Multi-Armed Bandit With Knapsack & Distributing AP/Server Selection Problem

The multi-armed bandit (MAB) framework is a popular sequential decision-making technique, well suited to decisions made under uncertainty with no prior knowledge of the environment. It uses the history of previous decisions and observations, as well as side information if available, to arrive at the current decision.

Classic MAB algorithms, such as the upper confidence bound (UCB) algorithm, are concerned with learning the single optimal action among a set of candidate actions with unknown rewards. Unlike traditional bandits, bandits with knapsacks (BwK) can model more sophisticated distributed decision-making problems under global constraints.
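To make the classic setting concrete, here is a minimal sketch of the textbook UCB1 rule on simulated Bernoulli arms. It is not the algorithm presented in the talk, and it has no knapsack constraint. The function name `ucb1`, the arm means, and the horizon are hypothetical values chosen only for illustration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run textbook UCB1 on simulated Bernoulli arms.

    arm_means is a hypothetical list of success probabilities that the
    learner never sees directly. Returns (pull counts per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # cumulative reward observed per arm
    total = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialization: pull every arm once so each has an estimate.
            arm = t - 1
        else:
            # Pick the arm with the largest empirical mean plus
            # exploration bonus sqrt(2 ln t / n_a).
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        # Simulated environment: Bernoulli reward with the arm's hidden mean.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward

    return counts, total


if __name__ == "__main__":
    pulls, reward = ucb1([0.1, 0.5, 0.9], horizon=5000, seed=1)
    print("pulls per arm:", pulls, "total reward:", reward)
```

Over a long enough horizon the pulls should concentrate on the arm with the highest mean, which is the "single optimal action" behavior described above. A BwK formulation would additionally track consumption of shared resources and stop once a budget is exhausted, which this sketch does not attempt.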

In this talk, Ekram Hossain discusses a BwK model and shows its application to the distributed access point (AP) or server selection problem. Hossain also discusses a linear contextual bandit with a knapsack model for the same problem.

Speakers in this video