How to plot multiple ROC curves in one plot with a legend and AUC scores in Python?

Question

I am building two models.

Model 1

    modelgb = GradientBoostingClassifier()
    modelgb.fit(x_train, y_train)
    predsgb = modelgb.predict_proba(x_test)[:, 1]
    metrics.roc_auc_score(y_test, predsgb, average='macro', sample_weight=None)

Model 2

    model = LogisticRegression()
    model = model.fit(x_train, y_train)
    predslog = model.predict_proba(x_test)[:, 1]
    metrics.roc_auc_score(y_test, predslog, average='macro', sample_weight=None)

How can I plot both ROC curves in a single plot, with a legend and text showing the AUC score of each model?

Julien · Accepted Answer

Try adapting this to your data:

    from sklearn import metrics
    import numpy as np
    import matplotlib.pyplot as plt

    plt.figure(0).clf()

    # First curve: random scores against random labels (placeholder data)
    pred = np.random.rand(1000)
    label = np.random.randint(2, size=1000)
    fpr, tpr, thresh = metrics.roc_curve(label, pred)
    auc = metrics.roc_auc_score(label, pred)
    plt.plot(fpr, tpr, label="data 1, auc=" + str(auc))

    # Second curve: same steps on a second set of placeholder data
    pred = np.random.rand(1000)
    label = np.random.randint(2, size=1000)
    fpr, tpr, thresh = metrics.roc_curve(label, pred)
    auc = metrics.roc_auc_score(label, pred)
    plt.plot(fpr, tpr, label="data 2, auc=" + str(auc))

    plt.legend(loc=0)  # loc=0 lets matplotlib pick the best legend position
    plt.show()
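The snippet above uses random placeholder data. As a minimal sketch of how it might be adapted to the two models from the question, the example below swaps in the predicted probabilities of each fitted model. It assumes a synthetic dataset from `make_classification` in place of the asker's `x_train`/`x_test`, and the helper name `plot_roc_curves` is hypothetical, introduced only for illustration.

```python
import matplotlib.pyplot as plt
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asker's data (an assumption, not from the question).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)


def plot_roc_curves(scores_by_name, y_true):
    """Plot one ROC curve per entry and return the figure and {name: auc}."""
    fig, ax = plt.subplots()
    aucs = {}
    for name, scores in scores_by_name.items():
        fpr, tpr, _ = metrics.roc_curve(y_true, scores)
        aucs[name] = metrics.roc_auc_score(y_true, scores)
        ax.plot(fpr, tpr, label=f"{name}, auc={aucs[name]:.3f}")
    ax.set_xlabel("False Positive Rate")
    ax.set_ylabel("True Positive Rate")
    ax.legend(loc="lower right")
    return fig, aucs


modelgb = GradientBoostingClassifier(random_state=0).fit(x_train, y_train)
modellog = LogisticRegression(max_iter=1000).fit(x_train, y_train)

# Column 1 of predict_proba holds the probability of the positive class.
fig, aucs = plot_roc_curves(
    {
        "Gradient Boosting": modelgb.predict_proba(x_test)[:, 1],
        "Logistic Regression": modellog.predict_proba(x_test)[:, 1],
    },
    y_test,
)
# plt.show()  # uncomment to display the figure
```

Passing scores rather than models keeps the helper independent of how the predictions were produced, so it should also work with scores from `decision_function`.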

Rudr · Answer

Just add your models to the list and their ROC curves will all be drawn on a single plot. Hopefully this works for you!

    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn import metrics
    import matplotlib.pyplot as plt

    plt.figure()

    # Add the models to the list that you want to view on the ROC plot
    models = [
        {
            'label': 'Logistic Regression',
            'model': LogisticRegression(),
        },
        {
            'label': 'Gradient Boosting',
            'model': GradientBoostingClassifier(),
        },
    ]

    # The loop below iterates through your models list
    for m in models:
        model = m['model']  # select the model
        model.fit(x_train, y_train)  # train the model
        # Probability of the positive class for the test data
        y_score = model.predict_proba(x_test)[:, 1]
        # Compute false positive rate and true positive rate
        fpr, tpr, thresholds = metrics.roc_curve(y_test, y_score)
        # Calculate the area under the curve to display on the plot.
        # Use the probabilities here, not model.predict(): hard 0/1 labels
        # would give an AUC that does not match the plotted curve.
        auc = metrics.roc_auc_score(y_test, y_score)
        # Now, plot the computed values
        plt.plot(fpr, tpr, label='%s ROC (area = %0.2f)' % (m['label'], auc))

    # Custom settings for the plot
    plt.plot([0, 1], [0, 1], 'r--')  # chance-level diagonal
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('1-Specificity (False Positive Rate)')
    plt.ylabel('Sensitivity (True Positive Rate)')
    plt.title('Receiver Operating Characteristic')
    plt.legend(loc="lower right")
    plt.show()  # display the plot