Tuning¶
Defining a Tuning Problem¶
A tuning problem is the process of finding an optimal configuration of arguments, or hyperparameters, for a function that can be evaluated to produce a score.
What is a Hyperparameter?¶
A hyperparameter is each one of the arguments that can be optimized in our tuning problem. Hyperparameters can be of different types and can be defined with a set of constraints on the values that they can take.
In BTB, hyperparameters are represented using a family of classes called HyperParams. This is the list of the HyperParams that are currently implemented in BTB:
- BooleanHyperParam: boolean parameters, i.e. True or False.
- CategoricalHyperParam: categorical parameters, i.e. "foo", "bar".
- FloatHyperParam: float parameters, i.e. 0.0 - 1.0.
- IntHyperParam: int parameters, i.e. 0 - 1.
Creating a HyperParam¶
BooleanHyperParam¶
The BooleanHyperParam is used for parameters that represent boolean values. This HyperParam has the following arguments:
- default: default value for the hyperparameter. Defaults to False.
[2]:
from btb.tuning.hyperparams import BooleanHyperParam
bool_hp = BooleanHyperParam(default=True)
CategoricalHyperParam¶
The CategoricalHyperParam is used for parameters that use categorical values. This HyperParam accepts the following arguments:
- choices: list of values that the hyperparameter can take.
- default: default value for the hyperparameter to take. Defaults to the first item in choices.
[3]:
from btb.tuning.hyperparams import CategoricalHyperParam
values = ['a', 'b', 'c']
categorical_hp = CategoricalHyperParam(choices=values, default='b')
FloatHyperParam¶
The FloatHyperParam is used for parameters that use float values. This HyperParam accepts the following arguments:
- min (float): minimum value that this hyperparameter can take. Defaults to None, which uses the system's minimum possible float value.
- max (float): maximum value that this hyperparameter can take. Defaults to None, which uses the system's maximum possible float value.
- default (float): number that represents the default value for the hyperparameter. Defaults to self.min.
- include_min (bool): whether or not to include the minimum value. Defaults to True.
- include_max (bool): whether or not to include the maximum value. Defaults to True.
[4]:
from btb.tuning.hyperparams import FloatHyperParam
float_hp = FloatHyperParam(min=0, max=1, default=0.5)
IntHyperParam¶
The IntHyperParam is used for parameters that use int values. This HyperParam accepts the following arguments:
- min (int): minimum value that this hyperparameter can take. Defaults to None, which uses the system's minimum possible int value.
- max (int): maximum value that this hyperparameter can take. Defaults to None, which uses the system's maximum possible int value.
- default (int): number that represents the default value for the hyperparameter. Defaults to self.min.
- step (int): increase amount to take for each sample. Defaults to 1.
- include_min (bool): whether or not to include the minimum value. Defaults to True.
- include_max (bool): whether or not to include the maximum value. Defaults to True.
[5]:
from btb.tuning.hyperparams import IntHyperParam
int_hp = IntHyperParam(min=1, max=10, default=5, include_min=False, include_max=True)
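As a quick sanity check of the constraints above, hyperparameters can also generate random values. This is only a sketch, assuming the sample and inverse_transform methods of recent BTB versions, where sample draws points in the normalized search space and inverse_transform maps them back to the hyperparameter space:
samples = int_hp.sample(5)                  # 5 random points in the normalized search space
values = int_hp.inverse_transform(samples)  # mapped back to integers in (1, 10]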
What is Tunable?¶
In BTB, a tuning problem is represented using the class Tunable, which consists of a collection of HyperParams that will all be tuned at once to find the optimal solution to our tuning problem.
Creating a Tunable¶
Tunable instances can be created in two ways:
Using HyperParam instances¶
One way of creating a Tunable is to instantiate a HyperParam for each one of the hyperparameters that we want to tune and pass them to the Tunable as a dict:
[1]:
from btb.tuning.tunable import Tunable
from btb.tuning.hyperparams import (
    BooleanHyperParam, CategoricalHyperParam, IntHyperParam, FloatHyperParam)

hyperparams = {
    'bhp': BooleanHyperParam(default=False),
    'chp': CategoricalHyperParam(choices=['foo', 'bar'], default='foo'),
    'fhp': FloatHyperParam(min=0, max=1, default=0.5),
    'ihp': IntHyperParam(min=1, max=10, default=2),
}

tunable = Tunable(hyperparams)
Using a dict representation¶
Alternatively, the Tunable can be represented as a dictionary with all the details of each hyperparameter specified, which can then be stored as a JSON file or in any other non-Python format.
In this format, the keys of the dictionary are the given names of the hyperparameters and the values are dictionaries containing the following keys:
- type (str): bool for BooleanHyperParam, int for IntHyperParam, float for FloatHyperParam, str for CategoricalHyperParam.
- range or values (list): range / values that this hyperparameter can take. For a CategoricalHyperParam these will be used as the choices; for the numerical HyperParams (IntHyperParam and FloatHyperParam) the min value will be used as the minimum value and the max value will be used as the maximum value.
- default (str, bool, int, float or None): the default value for the hyperparameter.
Once this dict is written, it can be passed to the from_dict method.
The Tunable that we created previously can also be built from the following dictionary:
[7]:
hyperparams = {
    'bhp': {
        'type': 'bool',
        'default': False
    },
    'chp': {
        'type': 'str',
        'values': ['foo', 'bar'],
        'default': 'foo'
    },
    'fhp': {
        'type': 'float',
        'values': [0, 1],
        'default': 0.5
    },
    'ihp': {
        'type': 'int',
        'values': [1, 10],
        'default': 2
    }
}

tunable = Tunable.from_dict(hyperparams)
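Since this dictionary only contains plain Python types, it can be serialized to JSON and loaded back before building the Tunable. A minimal sketch using the standard json module (the specs.json file name is just an example):
import json

# store the specification as JSON (hypothetical file name)
with open('specs.json', 'w') as f:
    json.dump(hyperparams, f, indent=4)

# load it back and build the Tunable from it
with open('specs.json') as f:
    tunable = Tunable.from_dict(json.load(f))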
What is a Tuner?¶
Tuners are classes with a fit/predict/propose interface for suggesting sets of hyperparameters. They are specifically designed to speed up the process of selecting the optimal hyperparameter values for a specific tuning problem.
Using a Tuner¶
The BTB Tuners are used by following a Bayesian Optimization approach, iteratively:
- letting the tuner propose new sets of hyperparameters
- fitting and scoring the model with the proposed hyperparameters
- passing the obtained score back to the tuner
At each iteration the tuner will use the information already obtained to propose the set of hyperparameters that it considers most likely to obtain the best results.
Creating a Tuner¶
We will be using a GPTuner, which accepts the following arguments:
- tunable (btb.tuning.tunable.Tunable): instance of a Tunable class containing the hyperparameters to be tuned.
- num_candidates (int): number of candidate samples to generate for each proposal, from which the best one is selected. Defaults to 1000.
- maximize (bool): if True the tuner will understand that a bigger score is better, if False that a smaller score is better. Defaults to True.
- min_trials (int): number of recorded trials needed before fitting the meta-model. Defaults to 2.
Bear in mind that tunable is a required argument in order to create a Tuner.
[8]:
from btb.tuning import Tunable
from btb.tuning import hyperparams as hp
from btb.tuning.tuners import GPTuner

tunable = Tunable({
    'fhp': hp.FloatHyperParam(min=0, max=1),
    'ihp': hp.IntHyperParam(min=1, max=10)
})

tuner = GPTuner(tunable)
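The optional arguments can also be set explicitly. The following sketch uses purely illustrative values and a hypothetical variable name: 500 candidates per proposal, smaller scores treated as better, and 5 recorded trials required before fitting the meta-model:
# illustrative values for the optional arguments described above
custom_tuner = GPTuner(tunable, num_candidates=500, maximize=False, min_trials=5)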
Propose¶
This method will propose one or more new hyperparameter configurations using the following approach:
- Create num_candidates candidates.
- Use the acquisition function to select the best candidates.
- Return the best selected candidate(s) to be evaluated.
This method accepts the following arguments:
- n (int): number of candidates to propose. Defaults to 1.
- allow_duplicates (bool): if False, the tuner will only propose trials that have not been recorded yet; otherwise it may generate trials that can be repeated. Defaults to False.
[9]:
proposal = tuner.propose()
proposal
[9]:
{'fhp': 0.1546118292012293, 'ihp': 1}
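Several candidates can also be requested in a single call through the n argument described above, in which case propose is expected to return a list of configurations. A minimal sketch, assuming the behaviour described above:
# ask for 3 different (non-recorded) candidates at once
proposals = tuner.propose(n=3, allow_duplicates=False)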
Record¶
This method will record the result of one or more trials and then re-fit the meta-model (if min_trials is reached) in order to generate better subsequent proposals:
- Append the trial to the internal results store.
- Re-fit the meta-model if min_trials is reached.
Bear in mind that the proposals that we want to record must have the same parameter names as the tunable.
[10]:
score = 0.5
tuner.record(proposal, score)
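Several trials can also be recorded at once by passing lists of proposals and scores of equal length. A minimal sketch, assuming the list-based behaviour described above (the scores here are purely illustrative values):
proposals = tuner.propose(n=2)   # two new candidate configurations
scores = [0.6, 0.7]              # scores obtained by evaluating each proposal
tuner.record(proposals, scores)  # both trials are recorded in a single call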
Tuning loop example¶
The tuners are meant to be used in a loop that performs the following three steps over and over:
- Propose.
- Score the proposal.
- Record the proposal.
In this example we will use the wine dataset and tune an SGDClassifier that will attempt to solve it.
Next, we will load the dataset and split it into two partitions, train and test, which we will use later on to evaluate the performance of our machine learning model.
[11]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
dataset = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.3, random_state=0)
Now that we have our dataset ready, we will import our model and create the hyperparams for it:
[12]:
from sklearn.linear_model import SGDClassifier

from btb.tuning import hyperparams as hp

# define the hyperparameters of the SGDClassifier that we want to tune
hyperparams = {
    "alpha": hp.FloatHyperParam(min=0.0001, max=1, default=0.0001),
    "max_iter": hp.IntHyperParam(min=1, max=5000, default=1000),
    "tol": hp.FloatHyperParam(min=1e-3, max=1, default=1e-3),
    "shuffle": hp.BooleanHyperParam(default=True)
}
We can now import our tuner and instantiate it with a tunable using the previous hyperparams, which will tune the SGDClassifier:
[13]:
from btb.tuning import Tunable
from btb.tuning.tuners import GPTuner
tunable = Tunable(hyperparams)
tuner = GPTuner(tunable)
Finally, we start the tuning loop in which we iteratively:
- let the tuner propose new sets of hyperparameters
- fit and score the model with the proposed hyperparameters
- pass the obtained score back to the tuner
[14]:
best_score = 0

for _ in range(100):
    proposal = tuner.propose()
    model = SGDClassifier(**proposal)
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)
    if score > best_score:
        best_params = proposal
        best_score = score
    tuner.record(proposal, score)

print('Best score obtained: ', best_score)
print('Best parameters: ', best_params)
Best score obtained: 0.7962962962962963
Best parameters: {'alpha': 0.784797005036952, 'max_iter': 2135, 'tol': 0.002603875854038042, 'shuffle': True}
Now we can fit our model with the best parameters obtained:
[15]:
model = SGDClassifier(**best_params)
model.fit(dataset.data, dataset.target)
[15]:
SGDClassifier(alpha=0.784797005036952, average=False, class_weight=None,
early_stopping=False, epsilon=0.1, eta0=0.0, fit_intercept=True,
l1_ratio=0.15, learning_rate='optimal', loss='hinge',
max_iter=2135, n_iter_no_change=5, n_jobs=None, penalty='l2',
power_t=0.5, random_state=None, shuffle=True,
tol=0.002603875854038042, validation_fraction=0.1, verbose=0,
warm_start=False)
Implemented tuners¶
BTB has the following tuners available:
- UniformTuner: samples proposals randomly using a uniform distribution.
- GPTuner: a Bayesian tuner that optimizes proposals using a GaussianProcess meta-model.
- GPEiTuner: a Bayesian tuner that optimizes proposals using a GaussianProcess meta-model and an Expected Improvement acquisition function.
- GCPTuner: a Bayesian tuner that optimizes proposals using a GaussianCopulaProcess meta-model.
- GCPEiTuner: a Bayesian tuner that optimizes proposals using a GaussianCopulaProcess meta-model and an Expected Improvement acquisition function.
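All of these tuners are expected to share the propose / record interface shown above, so they can be swapped without changing the tuning loop. A minimal sketch, assuming they can be imported from btb.tuning.tuners just like GPTuner:
from btb.tuning.tuners import GCPTuner, UniformTuner

# any of the implemented tuners can be built from the same Tunable
uniform_tuner = UniformTuner(tunable)
gcp_tuner = GCPTuner(tunable)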
Leaderboard¶
Currently we have a benchmarking process that evaluates the performance of the tuners against each other. These are the latest results that we obtained for the BTB tuners.
tuner | with ties | without ties
--- | --- | ---
 | 220 | 32
 | 139 | 2
 | 252 | 90
 | 208 | 16
 | 213 | 24
 | 177 | 1
 | 186 | 6
 | 180 | 4
 | 220 | 31
 | 205 | 16
 | 221 | 35