Pipeline

class sklearn.pipeline.Pipeline（
steps，
memory = None 
）

按顺序应用transforms列表和最终估计器。流水线的中间步骤必须是“transforms”，即必须实现fit和transforms的方法（最后的估算器只需要实现fit）。管道中的变换器可以使用memory参数进行缓存。

管道的目的是组装几个可以一起交叉验证的步骤，同时设置不同的参数。为此，它可以使用名称和参数名以“__”分隔各个步骤来设置各个步骤。一个步骤的估算器可能完全根据参数名称替换为另一个估算器，或者通过设置为None来移除transformer。

参数：

steps：列表

链接的（名称，变换）元组的列表（实现拟合/变换），链接的顺序是最后一个对象是一个估计器。

memory：None，str或带有joblib.Memory接口的对象（可选）

用于缓存管道的拟合变压器。默认情况下，不执行缓存。如果给出了一个字符串，它就是缓存目录的路径。在启动缓存之前，启动一个克隆transformer。因此，给管道的transformer实例不能直接检查。使用该属性named_steps或steps检查管道内的估计器。如果fit十分费时，缓存transformers是有利的。

属性：

named_steps：束对象，一个具有属性访问的字典

只读属性根据用户给定的名称访问任何步骤参数。键是步骤名称，值是步骤参数。

方法：

decision_function(X)

Apply transforms, and decision_function of the final estimator

fit(X[, y])

Fit the model

fit_predict(X[, y])

Applies fit_predict of last step in pipeline after transforms.

fit_transform(X[, y])

Fit the model and transform with the final estimator

get_params([deep])

Get parameters for this estimator.

predict(X)

Apply transforms to the data, and predict with the final estimator

predict_log_proba(X)

Apply transforms, and predict_log_proba of the final estimator

predict_proba(X)

Apply transforms, and predict_proba of the final estimator

score(X[, y, sample_weight])

Apply transforms, and score with the final estimator

set_params(**kwargs)

Set the parameters of this estimator.

例子：

>>> from sklearn import svm
>>> from sklearn.datasets import samples_generator
>>> from sklearn.feature_selection import SelectKBest
>>> from sklearn.feature_selection import f_regression
>>> from sklearn.pipeline import Pipeline
>>> # generate some data to play with
>>> X, y = samples_generator.make_classification(
...     n_informative=5, n_redundant=0, random_state=42)
>>> # ANOVA SVM-C
>>> anova_filter = SelectKBest(f_regression, k=5)
>>> clf = svm.SVC(kernel='linear')
>>> anova_svm = Pipeline([('anova', anova_filter), ('svc', clf)])
>>> # You can set the parameters using the names issued
>>> # For instance, fit using a k of 10 in the SelectKBest
>>> # and a parameter 'C' of the svm
>>> anova_svm.set_params(anova__k=10, svc__C=.1).fit(X, y)
...                      
Pipeline(memory=None,
         steps=[('anova', SelectKBest(...)),
                ('svc', SVC(...))])
>>> prediction = anova_svm.predict(X)
>>> anova_svm.score(X, y)                        
0.829...
>>> # getting the selected features chosen by anova_filter
>>> anova_svm.named_steps['anova'].get_support()
... 
array([False, False,  True,  True, False, False, True,  True, False,
       True,  False,  True,  True, False, True,  False, True, True,
       False, False], dtype=bool)
>>> # Another way to get selected features chosen by anova_filter
>>> anova_svm.named_steps.anova.get_support()
... 
array([False, False,  True,  True, False, False, True,  True, False,
       True,  False,  True,  True, False, True,  False, True, True,
       False, False], dtype=bool)

Previousmake_union Nextmake_pipeline

Last updated 7 years ago