Returning coefficients from a sklearn Pipeline object

Question

I have fit a Pipeline object with RandomizedSearchCV:

pipe_sgd = Pipeline([('scl', StandardScaler()),
                     ('clf', SGDClassifier(n_jobs=-1))])

param_dist_sgd = {'clf__loss': ['log'],
                  'clf__penalty': [None, 'l1', 'l2', 'elasticnet'],
                  'clf__alpha': np.linspace(0.15, 0.35),
                  'clf__n_iter': [3, 5, 7]}

sgd_randomized_pipe = RandomizedSearchCV(estimator=pipe_sgd,
                                         param_distributions=param_dist_sgd,
                                         cv=3, n_iter=30, n_jobs=-1)

sgd_randomized_pipe.fit(X_train, y_train)

I would like to access the coef_ attribute of best_estimator_, but I am unable to. I tried accessing coef_ with the code below.

sgd_randomized_pipe.best_estimator_.coef_

However, I get the following AttributeError:

AttributeError: 'Pipeline' object has no attribute 'coef_'

The scikit-learn docs say that coef_ is an attribute of SGDClassifier, which is the class of my base_estimator_.

What am I doing wrong?

Vivek Kumar · Accepted Answer

You can always use the names you assigned to the steps when creating the pipeline, through the named_steps dict.

scaler = sgd_randomized_pipe.best_estimator_.named_steps['scl']
classifier = sgd_randomized_pipe.best_estimator_.named_steps['clf']

Then access any attribute that is available on the corresponding fitted estimator, such as coef_ or intercept_.

This is a formal attribute exposed by Pipeline, as specified in the documentation:

named_steps : dict

Read-only attribute to access any step parameter by user-given name. Keys are step names and values are step parameters.
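To make the approach above concrete, here is a minimal, self-contained sketch. It is not the asker's exact setup: the data is synthetic (make_classification stands in for X_train and y_train), the parameter grid is trimmed so the search runs quickly, and the variable names search and best_pipe are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic binary data with 5 features replaces X_train / y_train.
X_train, y_train = make_classification(n_samples=200, n_features=5, random_state=0)

pipe_sgd = Pipeline([
    ("scl", StandardScaler()),
    ("clf", SGDClassifier(random_state=0)),
])

# Assumption: a reduced search space, chosen only to keep the example fast.
param_dist_sgd = {
    "clf__penalty": ["l1", "l2", "elasticnet"],
    "clf__alpha": np.linspace(0.15, 0.35, 5),
}

search = RandomizedSearchCV(
    pipe_sgd, param_distributions=param_dist_sgd, n_iter=5, cv=3, random_state=0
)
search.fit(X_train, y_train)

# best_estimator_ is the refitted Pipeline, not the SGDClassifier itself,
# so the fitted steps have to be looked up by the names given at construction.
best_pipe = search.best_estimator_
scaler = best_pipe.named_steps["scl"]
classifier = best_pipe.named_steps["clf"]

coef = classifier.coef_            # one row of weights per binary problem
intercept = classifier.intercept_
```

Note that coef here refers to the standardized features, since the classifier only ever sees the output of the scaler step.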

Roozbeh Bakhshi · Answer

I think this should work:

sgd_randomized_pipe.best_estimator_.named_steps['clf'].coef_

spies006 · Answer

One way to do this is by chained indexing using the steps attribute:

sgd_randomized_pipe.best_estimator_.steps[1][1].coef_

Is this best practice, or is there another way?
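As a partial answer to that question, the sketch below compares positional access through steps with name-based access. It assumes a reasonably recent scikit-learn (Pipeline indexing such as pipe["clf"] and pipe[-1] was added in 0.21, as far as I know) and uses a small synthetic dataset and a plain fitted Pipeline in place of best_estimator_.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: toy data, used only so the pipeline can be fitted.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

pipe = Pipeline([
    ("scl", StandardScaler()),
    ("clf", SGDClassifier(random_state=0)),
]).fit(X, y)

by_position = pipe.steps[1][1]      # chained indexing: (name, estimator) tuple, then estimator
by_name = pipe.named_steps["clf"]   # lookup by the name given at construction
by_index = pipe["clf"]              # Pipeline indexing by step name
last_step = pipe[-1]                # Pipeline indexing by position

# All four expressions refer to the same fitted estimator object.
same_object = by_position is by_name is by_index is last_step
```

All of these return the same fitted estimator. The name-based forms tend to be more robust, because steps[1][1] silently points at a different step if a transformer is later inserted into the pipeline.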