mlprimitives.custom.feature_selection module

Feature selection built on scikit-learn's SelectFromModel. The example below, which uses an ExtraTreesClassifier as the estimator, shows the underlying scikit-learn approach:

>>> from sklearn.ensemble import ExtraTreesClassifier
>>> from sklearn.datasets import load_iris
>>> from sklearn.feature_selection import SelectFromModel
>>> iris = load_iris()
>>> X, y = iris.data, iris.target
>>> X.shape
(150, 4)
>>> clf = ExtraTreesClassifier()
>>> clf = clf.fit(X, y)
>>> clf.feature_importances_
array([ 0.04...,  0.05...,  0.4...,  0.4...])
>>> model = SelectFromModel(clf, prefit=True)
>>> X_new = model.transform(X)
>>> X_new.shape
(150, 2)
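Two of the four features survive because SelectFromModel, when no threshold is passed and the estimator is not L1-penalized, keeps the features whose importance is at least the mean importance. The following is a minimal pure-Python sketch of that rule. The helper name select_by_mean_threshold and the importance values are illustrative assumptions, not output from a real fit.

```python
def select_by_mean_threshold(importances):
    """Return the indices of features whose importance is >= the mean."""
    threshold = sum(importances) / len(importances)
    return [i for i, value in enumerate(importances) if value >= threshold]


# Illustrative importances only; fitted values vary from run to run.
importances = [0.05, 0.05, 0.45, 0.45]
selected = select_by_mean_threshold(importances)
print(selected)  # [2, 3]
```

With these values the mean is 0.25, so only the last two features pass the cut-off.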
class mlprimitives.custom.feature_selection.EstimatorFeatureSelector(estimator_class=None, bypass=False, threshold=None, norm_order=1, *args, **kwargs)

Bases: object

Feature Selector based on sklearn.feature_selection.SelectFromModel.

ESTIMATOR

Estimator class to use. To be defined in subclasses.

Type

class

selector

sklearn.feature_selection.SelectFromModel instance that performs the actual feature selection.

Type

SelectFromModel

Example

The example below shows simple usage with an ExtraTreesClassifier as the estimator, and the output when passing either a pandas.DataFrame or a numpy.ndarray.

>>> import pandas as pd
>>> df = pd.DataFrame([
...     {'a': 1, 'b': 1, 'c': 1},
...     {'a': 1, 'b': 2, 'c': 1},
...     {'a': 2, 'b': 1, 'c': 2},
...     {'a': 2, 'b': 2, 'c': 2}
... ])
>>> X = df[['a', 'b']]
>>> y = df.c
>>> from mlprimitives.custom.feature_selection import EstimatorFeatureSelector
>>> from sklearn.ensemble import ExtraTreesClassifier
>>> efs = EstimatorFeatureSelector(ExtraTreesClassifier)
>>> efs.fit(X, y)
>>> efs.transform(X)
   a
0  1
1  1
2  2
3  2
>>> efs.transform(X.values)
array([[1],
       [1],
       [2],
       [2]])
ESTIMATOR = None
fit(X, y)
fit_transform(X, y)
transform(X)
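The three methods above are listed without descriptions. The following is a rough sketch of how a thin wrapper around SelectFromModel could expose the same fit, transform and fit_transform interface. The class SketchFeatureSelector is a hypothetical stand-in written for illustration and is not the mlprimitives implementation. It leaves out the bypass parameter, whose behavior is not described on this page.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel


class SketchFeatureSelector:
    """Hypothetical wrapper for illustration; not the mlprimitives source."""

    def __init__(self, estimator_class, threshold=None, norm_order=1, **kwargs):
        # Instantiate the estimator and delegate the selection to SelectFromModel.
        self.selector = SelectFromModel(
            estimator_class(**kwargs), threshold=threshold, norm_order=norm_order
        )

    def fit(self, X, y):
        self.selector.fit(X, y)

    def transform(self, X):
        mask = self.selector.get_support()  # boolean mask of the kept features
        if hasattr(X, "loc"):
            # DataFrame-like input: keep the column labels and the index.
            return X.loc[:, X.columns[mask]]
        return np.asarray(X)[:, mask]

    def fit_transform(self, X, y):
        self.fit(X, y)
        return self.transform(X)


# Same toy data as the example above: y is fully determined by the first column.
X = np.array([[1, 1], [1, 2], [2, 1], [2, 2]])
y = np.array([1, 1, 2, 2])

selector = SketchFeatureSelector(ExtraTreesClassifier, n_estimators=10, random_state=0)
X_selected = selector.fit_transform(X, y)
```

Because the second column carries no information about y, its importance is zero and only the first column is kept, matching the documented example.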
class mlprimitives.custom.feature_selection.ExtraTreesClassifierFeatureSelector(estimator_class=None, bypass=False, threshold=None, norm_order=1, *args, **kwargs)

Bases: mlprimitives.custom.feature_selection.EstimatorFeatureSelector

EstimatorFeatureSelector based on ExtraTreesClassifier.

ESTIMATOR

alias of sklearn.ensemble._forest.ExtraTreesClassifier

class mlprimitives.custom.feature_selection.ExtraTreesRegressorFeatureSelector(estimator_class=None, bypass=False, threshold=None, norm_order=1, *args, **kwargs)

Bases: mlprimitives.custom.feature_selection.EstimatorFeatureSelector

EstimatorFeatureSelector based on ExtraTreesRegressor.

ESTIMATOR

alias of sklearn.ensemble._forest.ExtraTreesRegressor

class mlprimitives.custom.feature_selection.LassoFeatureSelector(estimator_class=None, bypass=False, threshold=None, norm_order=1, *args, **kwargs)

Bases: mlprimitives.custom.feature_selection.EstimatorFeatureSelector

EstimatorFeatureSelector based on Lasso.

ESTIMATOR

alias of sklearn.linear_model._coordinate_descent.Lasso
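For a Lasso estimator, selection is driven by the fitted coefficients rather than by tree importances. When no threshold is given, scikit-learn's SelectFromModel uses a cut-off of 1e-5 for Lasso, so features whose coefficient is shrunk to zero are dropped. The sketch below uses scikit-learn directly, not LassoFeatureSelector, and the data and the alpha value are illustrative assumptions.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

# Three mutually orthogonal, zero-mean columns (illustrative data).
X = np.array([
    [ 1,  1,  1],
    [-1,  1,  1],
    [ 1, -1,  1],
    [-1, -1,  1],
    [ 1,  1, -1],
    [-1,  1, -1],
    [ 1, -1, -1],
    [-1, -1, -1],
], dtype=float)
y = 3.0 * X[:, 0]  # only the first column drives the target

# Fit the Lasso inside SelectFromModel and read back the boolean support mask.
selector = SelectFromModel(Lasso(alpha=0.1)).fit(X, y)
support = selector.get_support().tolist()
print(support)  # [True, False, False]
```

The L1 penalty sets the coefficients of the two uninformative columns to exactly zero, so only the first column passes the threshold.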