btb.selection.recent module

class btb.selection.recent.RecentKReward(choices, k=2)

Recent K reward selector

compute_rewards(scores): Retain the k most recent scores and replace the rest with zeros, so the returned list has the same length as the input.
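The reward rule described above can be sketched as a standalone function. This is an illustration only, not the library's implementation: the name `recent_k_rewards` is hypothetical, and the sketch assumes that `scores` is ordered from oldest to newest and that `k` is non-negative.

```python
def recent_k_rewards(scores, k=2):
    """Keep the k most recent scores and replace the rest with zeros.

    Assumption: `scores` is ordered oldest to newest, so the most
    recent scores sit at the end of the list, and k >= 0.
    """
    scores = list(scores)
    # Index of the first score that is kept; everything before it is zeroed.
    cutoff = max(len(scores) - k, 0)
    return [0.0] * cutoff + scores[cutoff:]


print(recent_k_rewards([0.1, 0.4, 0.3, 0.7, 0.6], k=2))
# prints [0.0, 0.0, 0.0, 0.7, 0.6]
```

Keeping the zeros rather than dropping the older scores preserves the number of trials per choice, which matters if the bandit uses that count.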

select(choice_scores): Use each choice's k most recent scores as the rewards for the bandit calculation.
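A rough sketch of how such a selection step might look follows. The page does not say which bandit is used, so the UCB1 formula here is an assumption, as are the function name `select_recent_k`, the oldest-to-newest ordering of each score list, and `k >= 0`.

```python
import math


def select_recent_k(choice_scores, k=2):
    """Pick a choice using recent-k rewards and a UCB1 bandit (assumed)."""

    def recent_k(scores):
        # Zero out all but the k most recent scores (assumed to be last).
        scores = list(scores)
        cutoff = max(len(scores) - k, 0)
        return [0.0] * cutoff + scores[cutoff:]

    choice_rewards = {c: recent_k(s) for c, s in choice_scores.items()}
    total_trials = sum(len(r) for r in choice_rewards.values())

    def ucb1(rewards):
        if not rewards:
            return float("inf")  # untried choices are explored first
        mean = sum(rewards) / len(rewards)
        return mean + math.sqrt(2 * math.log(total_trials) / len(rewards))

    return max(choice_rewards, key=lambda c: ucb1(choice_rewards[c]))


scores = {"a": [0.9, 0.9, 0.1, 0.1], "b": [0.1, 0.1, 0.8, 0.8]}
print(select_recent_k(scores, k=2))
# prints b
```

In this example both choices have the same number of trials, so the exploration bonus is equal and the choice with the better recent scores wins, even though "a" scored higher early on.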

class btb.selection.recent.RecentKVelocity(choices, k=2)

Recent K velocity selector

compute_rewards(scores)

Compute the velocity of the k+1 most recent scores.

The velocity is the average distance between consecutive scores; k+1 scores yield k such differences. Returns a list with those k velocities, padded with zeros so that its length matches the number of input scores.
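The velocity computation can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the name `recent_k_velocities` is hypothetical, `scores` is assumed to be ordered oldest to newest, `k` is assumed to be at least 1, and each velocity is taken as the newer score minus the older one, so improving scores give positive values.

```python
def recent_k_velocities(scores, k=2):
    """Return k velocities from the k+1 most recent scores, zero-padded.

    Assumptions: `scores` is ordered oldest to newest and k >= 1.
    """
    scores = list(scores)
    # The k+1 most recent scores, still in oldest-to-newest order.
    recent = scores[-(k + 1):]
    # Difference between each pair of consecutive scores (newer - older).
    velocities = [newer - older for older, newer in zip(recent, recent[1:])]
    # Pad with zeros so the output is as long as the input.
    padding = [0.0] * (len(scores) - len(velocities))
    return velocities + padding


print(recent_k_velocities([1.0, 2.0, 4.0, 7.0, 11.0], k=2))
# prints [3.0, 4.0, 0.0, 0.0, 0.0]
```

Here the three most recent scores (4.0, 7.0, 11.0) give two velocities, 3.0 and 4.0, whose mean of 3.5 is the average distance between those scores.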