Tuning

Defining a Tuning Problem

A tuning problem is the process of finding an optimal configuration of arguments, or hyperparameters, for a function that can be evaluated to produce a score.

What is a Hyperparameter?

A hyperparameter is each of the arguments that can be optimized in our tuning problem. Hyperparameters can be of different types and can be defined with a set of constraints on the values that they can take.

In BTB, hyperparameters are represented using a family of classes called HyperParams. This is the list of the HyperParams that are currently implemented in BTB:

  • BooleanHyperParam: boolean parameters, i.e. True or False.

  • CategoricalHyperParam: categorical parameters, e.g. “foo”, “bar”.

  • FloatHyperParam: float parameters, e.g. 0.0 - 1.0.

  • IntHyperParam: int parameters, e.g. 0 - 1.

Creating a HyperParam

BooleanHyperParam

The BooleanHyperParam is used for parameters that represent boolean values. This HyperParam has the following arguments:

  • default: default value for the hyperparameter. Defaults to False.

[2]:
from btb.tuning.hyperparams import BooleanHyperParam

bool_hp = BooleanHyperParam(default=True)

CategoricalHyperParam

The CategoricalHyperParam is used for parameters that use categorical values. This HyperParam accepts the following arguments:

  • choices: list of values that the hyperparameter can take.

  • default: default value for the hyperparameter to take. Defaults to the first item in choices.

[3]:
from btb.tuning.hyperparams import CategoricalHyperParam

values = ['a', 'b', 'c']
categorical_hp = CategoricalHyperParam(choices=values, default='b')

FloatHyperParam

The FloatHyperParam is used for parameters that use float values. This HyperParam accepts the following arguments:

  • min (float): minimum value that this hyperparameter can take. Defaults to None, in which case the system’s minimum possible float value is used.

  • max (float): maximum value that this hyperparameter can take. Defaults to None, in which case the system’s maximum possible float value is used.

  • default (float): number that represents the default value for the hyperparameter. Defaults to self.min.

  • include_min (bool): whether or not to include the minimum value. Defaults to True.

  • include_max (bool): whether or not to include the maximum value. Defaults to True.

[4]:
from btb.tuning.hyperparams import FloatHyperParam

float_hp = FloatHyperParam(min=0, max=1, default=0.5)

IntHyperParam

The IntHyperParam is used for parameters that use int values. This HyperParam accepts the following arguments:

  • min (int): minimum value that this hyperparameter can take. Defaults to None, in which case the system’s minimum possible int value is used.

  • max (int): maximum value that this hyperparameter can take. Defaults to None, in which case the system’s maximum possible int value is used.

  • default (int): number that represents the default value for the hyperparameter. Defaults to self.min.

  • step (int): increase amount to take for each sample. Defaults to 1.

  • include_min (bool): whether or not to include the minimum value. Defaults to True.

  • include_max (bool): whether or not to include the maximum value. Defaults to True.

[5]:
from btb.tuning.hyperparams import IntHyperParam

int_hp = IntHyperParam(min=1, max=10, default=5, include_min=False, include_max=True)

What is a Tunable?

In BTB, a tuning problem is represented using the class Tunable, which consists of a collection of HyperParams that will all be tuned at once to find the optimal solution to our tuning problem.

Creating a Tunable

Tunable instances can be created in two ways:

Using HyperParam instances

One way of creating a Tunable is to create a HyperParam instance for each of the hyperparameters that we want to tune and pass them as a dict to the Tunable:

[1]:
from btb.tuning.tunable import Tunable
from btb.tuning.hyperparams import (
    BooleanHyperParam, CategoricalHyperParam, IntHyperParam, FloatHyperParam)

hyperparams = {
    'bhp': BooleanHyperParam(default=False),
    'chp': CategoricalHyperParam(choices=['foo', 'bar'], default='foo'),
    'fhp': FloatHyperParam(min=0, max=1, default=0.5),
    'ihp': IntHyperParam(min=1, max=10, default=2),
}

tunable = Tunable(hyperparams)

Using a dict representation

Alternatively, the Tunable can be represented as a dictionary with all the details of each hyperparameter specified, which can then be stored as a JSON file or in another non-Python format.

In this Python dictionary format, each key is the name given to the hyperparameter and each value is a dictionary containing the following keys:

  • type (str): bool for BooleanHyperParam, int for IntHyperParam, float for FloatHyperParam, str for CategoricalHyperParam.

  • range or values (list): range / values that this hyperparameter can take. In the case of CategoricalHyperParam these will be used as the choices; for numerical hyperparameters, the minimum and maximum values of the range will be used as the min and max values.

  • default (str, bool, int, float or None): The default value for the hyperparameter.

Once this dict is written, it can be passed to the from_dict method.

The same Tunable that we created above can also be built from the following dictionary:

[7]:
hyperparams = {
    'bhp': {
        'type': 'bool',
        'default': False
    },
    'chp': {
        'type': 'str',
        'values': ['foo', 'bar'],
        'default': 'foo'
    },
    'fhp': {
        'type': 'float',
        'values': [0, 1],
        'default': 0.5
    },
    'ihp': {
        'type': 'int',
        'values': [1, 10],
        'default': 2
    }
}

tunable = Tunable.from_dict(hyperparams)
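
Since this representation is plain data, the same specification could also be loaded from a file. The following is a minimal sketch, assuming that the dictionary above has been saved to a hypothetical hyperparams.json file:

[ ]:
import json

from btb.tuning import Tunable

# Load the hyperparameter specification from a JSON file
# ('hyperparams.json' is a hypothetical filename used for illustration).
with open('hyperparams.json') as f:
    hyperparams = json.load(f)

tunable = Tunable.from_dict(hyperparams)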

What is a Tuner?

Tuners are classes with a fit/predict/propose interface for suggesting sets of hyperparameters. These are specifically designed to speed up the process of selecting the optimal hyperparameter values for a specific tuning problem.

Using a Tuner

The BTB Tuners are used by following a Bayesian Optimization approach and iteratively:

  • letting the tuner propose new sets of hyperparameters

  • fitting and scoring the model with the proposed hyperparameters

  • passing the obtained score back to the tuner

At each iteration the tuner will use the information already obtained to propose the set of hyperparameters that it considers most likely to obtain the best results.
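
The following is a minimal sketch of this loop, where score_model is a hypothetical function that fits a model with the proposed hyperparameters and returns its score, and tuner is a Tuner instance as created in the next section. A complete, runnable example can be found in the Tuning loop example section below.

[ ]:
# Minimal sketch of the Bayesian Optimization loop.
# `score_model` is a hypothetical function that fits and scores a model.
for _ in range(10):
    proposal = tuner.propose()       # 1. let the tuner propose hyperparameters
    score = score_model(proposal)    # 2. fit and score the model with them
    tuner.record(proposal, score)    # 3. pass the obtained score back to the tuner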

Creating a Tuner

We will be using a GPTuner that accepts the following arguments:

  • tunable (btb.tuning.tunable.Tunable): Instance of a tunable class containing hyperparameters to be tuned.

  • num_candidates (int): Number of candidate samples to generate for each proposal, from which the best one is selected. Defaults to 1000.

  • maximize (bool): If True, the tuner assumes that a bigger score is better; if False, a smaller score is better. Defaults to True.

  • min_trials (int): Number of recorded trials needed before fitting the meta-model. Defaults to 2.

Bear in mind that the tunable is a required argument in order to create a Tuner.

[8]:
from btb.tuning import Tunable
from btb.tuning import hyperparams as hp
from btb.tuning.tuners import GPTuner

tunable = Tunable({
    'fhp': hp.FloatHyperParam(min=0, max=1),
    'ihp': hp.IntHyperParam(min=1, max=10)
})

tuner = GPTuner(tunable)
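
If the score that we are going to record is a value that should be minimized instead of maximized, such as an error metric, we can set the maximize argument described above to False:

[ ]:
# Create a tuner for a minimization problem (e.g. an error metric),
# reusing the `tunable` defined above.
minimizing_tuner = GPTuner(tunable, maximize=False)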

Propose

This method will propose one or more new hyperparameter configurations by using the following approach:

  1. Create num_candidates amount of candidates.

  2. Use acquisition function to select the best candidates.

  3. Return the best selected candidate(s) to be evaluated.

This method accepts the following arguments:

  • n (int): Number of candidates to create. Defaults to 1.

  • allow_duplicates (bool): If False, the tuner will only propose trials that have not been recorded yet; otherwise, it may generate trials that can be repeated. Defaults to False.

[9]:
proposal = tuner.propose()
proposal
[9]:
{'fhp': 0.1546118292012293, 'ihp': 1}
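
We can also request several candidates at once by passing n to propose; in that case, the proposals are expected to come back as a list of configurations rather than a single dict:

[ ]:
# Propose multiple candidate configurations in a single call.
proposals = tuner.propose(n=3)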

Record

This method will record the result of one or more trials. Then it will re-fit the meta-model (once min_trials is reached) in order to improve subsequent proposals:

  1. Append trial to internal results store.

  2. Re-fit meta-model if the min_trials is reached.

Bear in mind that the proposals that we want to record must have the same parameter names as the tunable.

[10]:
score = 0.5
tuner.record(proposal, score)

Tuning loop example

The tuners are meant to be used in a loop that performs the following three steps over and over:

  1. Propose.

  2. Score the proposal.

  3. Record the proposal.

In this example we will use the wine dataset and tune an SGDClassifier that will attempt to solve it.

Next, we will load the dataset and split it into two partitions, train and test, which we will use later on to evaluate the performance of our machine learning model.

[11]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split


dataset = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.3, random_state=0)

Now that we have our dataset ready, we will import our model and create the hyperparams for it:

[12]:
from sklearn.linear_model import SGDClassifier

from btb.tuning import hyperparams as hp

# define the SGDClassifier hyperparameters
hyperparams = {
    "alpha": hp.FloatHyperParam(min=0.0001, max=1, default=0.0001),
    "max_iter": hp.IntHyperParam(min=1, max=5000, default=1000),
    "tol": hp.FloatHyperParam(min=1e-3, max=1, default=1e-3),
    "shuffle": hp.BooleanHyperParam(default=True)
}

We can now import our tuner and instantiate it with a tunable using the previous hyperparams, which will tune the SGDClassifier:

[13]:
from btb.tuning import Tunable
from btb.tuning.tuners import GPTuner

tunable = Tunable(hyperparams)
tuner = GPTuner(tunable)

Finally, we start the tuning loop in which we iteratively:

  1. let the tuner propose new sets of hyperparameters

  2. fit and score the model with the proposed hyperparameters

  3. pass the score obtained back to the tuner

[14]:
best_score = 0

for _ in range(100):
    proposal = tuner.propose()
    model = SGDClassifier(**proposal)
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)
    if score > best_score:
        best_params = proposal
        best_score = score

    tuner.record(proposal, score)

print('Best score obtained: ', best_score)
print('Best parameters: ', best_params)
Best score obtained:  0.7962962962962963
Best parameters:  {'alpha': 0.784797005036952, 'max_iter': 2135, 'tol': 0.002603875854038042, 'shuffle': True}

Now we can fit our model with the best parameters obtained:

[15]:
model = SGDClassifier(**best_params)
model.fit(dataset.data, dataset.target)
[15]:
SGDClassifier(alpha=0.784797005036952, average=False, class_weight=None,
              early_stopping=False, epsilon=0.1, eta0=0.0, fit_intercept=True,
              l1_ratio=0.15, learning_rate='optimal', loss='hinge',
              max_iter=2135, n_iter_no_change=5, n_jobs=None, penalty='l2',
              power_t=0.5, random_state=None, shuffle=True,
              tol=0.002603875854038042, validation_fraction=0.1, verbose=0,
              warm_start=False)

Implemented tuners

BTB has the following tuners available:

  • UniformTuner: Uses a Tuner that samples proposals randomly using a uniform distribution.

  • GPTuner: Uses a Bayesian Tuner that optimizes proposals using a GaussianProcess metamodel.

  • GPEiTuner: Uses a Bayesian Tuner that optimizes proposals using a GaussianProcess metamodel and an Expected Improvement acquisition function.

  • GCPTuner: Uses a Bayesian Tuner that optimizes proposals using a GaussianCopulaProcess metamodel.

  • GCPEiTuner: Uses a Bayesian Tuner that optimizes proposals using a GaussianCopulaProcess metamodel and an Expected Improvement acquisition function.
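
All of these tuners are created from a Tunable and share the same propose/record interface shown above, so switching between them only requires changing the class that is instantiated. A minimal sketch, reusing the tunable defined earlier:

[ ]:
from btb.tuning.tuners import GCPTuner, GPEiTuner, UniformTuner

# All tuners share the same propose / record interface.
uniform_tuner = UniformTuner(tunable)
gpei_tuner = GPEiTuner(tunable)
gcp_tuner = GCPTuner(tunable)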

Leaderboard

Currently we have a benchmarking process that evaluates the performance of the tuners against each other. These are the latest results that we obtained for the BTB tuners:

tuner                with ties   without ties
Ax.optimize          220         32
BTB.GCPEiTuner       139         2
BTB.GCPTuner         252         90
BTB.GPEiTuner        208         16
BTB.GPTuner          213         24
BTB.UniformTuner     177         1
HyperOpt.tpe         186         6
SMAC.HB4AC           180         4
SMAC.SMAC4HPO_EI     220         31
SMAC.SMAC4HPO_LCB    205         16
SMAC.SMAC4HPO_PI     221         35