machine learning - Reducing the false positive rate of a scikit-learn random forest classifier


I am working with a scikit-learn random forest classifier and I want to reduce the false positive rate by increasing the number of trees needed for a successful vote from greater than 50% to 75%. After reading the documentation I am not sure how to do this. Does anyone have suggestions? (I think there should be a way, because according to the documentation the predict method of the classifier decides based on a majority vote.) Any help is appreciated, thanks!

Let's say you have a classifier that requires 75% agreement within its estimators. In that case, if it gets a new sample and the odds are 51%-49% in favour of one class, what do you want it to do?

The reason the 50% rule is used is that the decision rule you propose can lead to cases where the classifier has to say "I cannot predict the label of these samples".
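As a quick illustration of that abstention case (my own example, not from the original post):

probs = [0.51, 0.49]       # averaged tree probabilities for the two classes
print(max(probs) >= 0.75)  # False -> neither class clears the 75% bar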

What you can do is wrap the results of the classifier and apply whatever calculation you wish -

from sklearn.ensemble import RandomForestClassifier
from sklearn import datasets
import numpy as np

def my_decision_function(arr):
    # Difference between the two class probabilities; if it is more than 0.5,
    # one class has a predicted probability of more than 0.75
    diff = np.abs(arr[:, 0] - arr[:, 1])
    arr[diff < 0.5] = [-1, -1]  # mark samples where no class reaches 0.75
    return arr

x, y = datasets.make_classification(n_samples=100000, n_features=20,
                                     n_informative=2, n_redundant=2)
train_samples = 100  # samples used for training the model

x_train = x[:train_samples]
x_test = x[train_samples:]
y_train = y[:train_samples]
y_test = y[train_samples:]

clf = RandomForestClassifier().fit(x_train, y_train)
print(my_decision_function(clf.predict_proba(x_train)))
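If you specifically want the literal "at least 75% of the trees agree" rule from the question, rather than a threshold on the averaged probabilities, you can also count the votes of the individual trees yourself. This is only a rough sketch of that idea and not part of the original answer; vote_threshold_predict is my own helper name, and it reuses clf and x_test from the snippet above:

def vote_threshold_predict(clf, X, threshold=0.75):
    # Hard vote of every fitted tree: shape (n_trees, n_samples).
    # The sub-estimators are trained on encoded class indices, so the
    # results are mapped back through clf.classes_ (a no-op for 0/1 labels).
    votes = np.asarray([tree.predict(X) for tree in clf.estimators_]).astype(int)
    n_trees = votes.shape[0]
    preds = np.full(X.shape[0], -1)
    for i in range(X.shape[0]):
        counts = np.bincount(votes[:, i], minlength=len(clf.classes_))
        winner = counts.argmax()
        if counts[winner] / n_trees >= threshold:  # enough trees agree
            preds[i] = clf.classes_[winner]
    return preds

print(vote_threshold_predict(clf, x_test))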

Now, each sample where no class reaches a predicted probability of at least 0.75 will have a [-1,-1] prediction. Adjustments have to be made if you use multi-class or multi-label classification, but I hope the notion is clear.
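As a rough illustration of such an adjustment, here is a sketch of the same idea for an arbitrary number of classes, thresholding the highest averaged probability instead of the difference between two columns. predict_with_threshold is my own helper name (not from the original answer), reusing clf and x_test from above:

def predict_with_threshold(clf, X, threshold=0.75):
    proba = clf.predict_proba(X)              # shape (n_samples, n_classes)
    best = proba.argmax(axis=1)               # most probable class per sample
    confident = proba.max(axis=1) >= threshold
    # -1 marks samples where no class reaches the threshold
    return np.where(confident, clf.classes_[best], -1)

print(predict_with_threshold(clf, x_test))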

