btb.selection package
Submodules
Module contents
class btb.selection.BestKReward(choices, k=2)
Bases: btb.selection.ucb1.UCB1
Best K reward selector
Computes the average reward from past scores using only the k highest scores. In the implementation, the remaining scores are replaced with ``nan`` so that they still count toward the number of arm pulls.
- Parameters
k (int) – number of best scores to consider
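The best-K averaging described above can be illustrated with a minimal, standard-library sketch. The helper names `best_k_rewards` and `nan_mean` are hypothetical and are not part of btb; the actual implementation may differ in details such as tie handling.

```python
import math


def best_k_rewards(scores, k=2):
    """Keep the k highest scores and replace the rest with NaN.

    Hypothetical helper for illustration; not part of the btb API.
    The list keeps its length, so the number of arm pulls is preserved.
    """
    if len(scores) <= k:
        return list(scores)
    # Indices of the k highest scores.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(order[:k])
    return [s if i in top else math.nan for i, s in enumerate(scores)]


def nan_mean(values):
    """Average the values that are not NaN (0.0 if none remain)."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else 0.0


rewards = best_k_rewards([0.56, 0.61, 0.33, 0.67], k=2)
pulls = len(rewards)          # still 4: masked scores count as pulls
average = nan_mean(rewards)   # mean of 0.61 and 0.67 only
```

Masking rather than dropping the lower scores is what lets a pull-count-based rule such as UCB1 still see every trial.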
class btb.selection.BestKVelocity(choices, k=2)
Bases: btb.selection.best.BestKReward
Best K velocity selector
class btb.selection.HierarchicalByAlgorithm(choices, by_algorithm)
Bases: btb.selection.ucb1.UCB1
Hierarchical selector
- Parameters
by_algorithm (Dict[str, List]) – mapping of ML algorithms to frozen set choices
class btb.selection.PureBestKVelocity(choices, k=3)
Bases: btb.selection.selector.Selector
Pure Best K Velocity Selector
Simply returns the choice with the best best-K velocity.
class btb.selection.RecentKReward(choices, k=2)
Bases: btb.selection.ucb1.UCB1
Recent K reward selector
- Parameters
k (int) – number of most recent scores to consider
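As a rough sketch, and assuming this selector mirrors BestKReward by masking everything except the k most recent scores (this page does not spell that out), the reward list could be prepared as follows. `recent_k_rewards` is a hypothetical name, not a btb function.

```python
import math


def recent_k_rewards(scores, k=2):
    """Keep the k most recent scores and replace older ones with NaN.

    Hypothetical helper for illustration; assumes scores are listed in
    ascending chronological order (earliest trial first).
    """
    if len(scores) <= k:
        return list(scores)
    cutoff = len(scores) - k
    # Older scores are masked, not dropped, so the pull count is unchanged.
    return [math.nan] * cutoff + list(scores[cutoff:])


masked = recent_k_rewards([0.56, 0.61, 0.33, 0.67], k=2)
```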
class btb.selection.RecentKVelocity(choices, k=2)
Bases: btb.selection.recent.RecentKReward
Recent K velocity selector
class btb.selection.UCB1(choices)
Bases: btb.selection.selector.Selector
UCB1 selector
Uses Upper Confidence Bound 1 algorithm (UCB1) for bandit selection.
See also:
Auer, Peter et al. "Finite-time Analysis of the Multiarmed Bandit Problem." Machine Learning 47 (2002): 235-256.
bandit(choice_rewards)
Multi-armed bandit method that chooses the arm for which the upper confidence bound (UCB) of expected reward is greatest.
If there are multiple arms with the same UCB1 index, then one is chosen at random.
An explanation is here: https://www.cs.bham.ac.uk/internal/courses/robotics/lectures/ucb1.pdf
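For orientation, here is a minimal sketch of the textbook UCB1 rule from Auer et al.: each arm is scored as its mean reward plus sqrt(2 ln N / n), where n is that arm's pull count and N is the total number of pulls. It is not btb's actual code. The function name `ucb1_bandit`, the handling of untried arms, and the treatment of NaN rewards are all assumptions made for this illustration.

```python
import math
import random


def ucb1_bandit(choice_rewards, rng=random):
    """Pick the arm with the greatest UCB1 index (illustrative sketch).

    choice_rewards maps each arm to its list of rewards. NaN entries are
    assumed to count as pulls but are excluded from the mean.
    """
    counts = {arm: len(rewards) for arm, rewards in choice_rewards.items()}

    # Assumption: any arm that has never been pulled is tried first.
    untried = [arm for arm, n in counts.items() if n == 0]
    if untried:
        return rng.choice(untried)

    total_pulls = sum(counts.values())

    def ucb_index(arm):
        valid = [r for r in choice_rewards[arm] if not math.isnan(r)]
        mean = sum(valid) / len(valid) if valid else 0.0
        bonus = math.sqrt(2.0 * math.log(total_pulls) / counts[arm])
        return mean + bonus

    indices = {arm: ucb_index(arm) for arm in choice_rewards}
    best = max(indices.values())
    # Ties on the UCB1 index are broken at random.
    return rng.choice([arm for arm, v in indices.items() if v == best])


choice = ucb1_bandit({
    1: [0.56, 0.61, 0.33, 0.67],
    2: [0.25, 0.58],
    3: [0.60, 0.65, 0.68],
})
```

With these inputs the sketch picks arm 2: its mean is the lowest, but it has the fewest pulls, so its exploration bonus is the largest.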
class btb.selection.Uniform(choices)
Bases: btb.selection.selector.Selector
Uniform selector
Selects a choice uniformly at random.
select(choice_scores)
Select the next best choice to make.
- Parameters
choice_scores (Dict[object, List[float]]) – Mapping of each possible choice to its list of scores. The caller must ensure that every choice that is currently possible appears in the dict, including choices with no scores yet. Score lists should be in ascending chronological order, that is, the score from the earliest trial comes first.
For example:
    {
        1: [0.56, 0.61, 0.33, 0.67],
        2: [0.25, 0.58],
        3: [0.60, 0.65, 0.68],
    }
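One way to build a dict that meets these requirements is sketched below. `build_choice_scores` and the `trials` log format are hypothetical and not part of btb. The commented-out call at the end assumes btb is installed and that `select` is available on the selector being used.

```python
def build_choice_scores(choices, trials):
    """Build the choice_scores mapping from a chronological trial log.

    Hypothetical helper for illustration. `trials` is an iterable of
    (choice, score) pairs ordered from earliest to latest.
    """
    # Every possible choice gets an entry, even if it has no scores yet.
    choice_scores = {choice: [] for choice in choices}
    for choice, score in trials:
        # Appending in log order keeps each list chronological.
        # A choice missing from `choices` raises KeyError here.
        choice_scores[choice].append(score)
    return choice_scores


trials = [(1, 0.56), (2, 0.25), (1, 0.61), (3, 0.60)]
choice_scores = build_choice_scores([1, 2, 3, 4], trials)

# Assumed usage with btb installed (not executed here):
# from btb.selection import UCB1
# selector = UCB1([1, 2, 3, 4])
# next_choice = selector.select(choice_scores)
```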