The data have an outcome variable (healthy or cancer) and several binary predictors (yes or no). I tried logistic regression, SVM, KNN, XGBoost, LightGBM, and random forest, and found that the best model was logistic regression. Even after I tuned their parameters, XGBoost and LightGBM gave AUC and accuracy values close to those of logistic regression. So which machine learning method should I choose to predict a binary outcome from several binary predictors?
Is it appropriate to use SVM, KNN, XGBoost, LightGBM, or random forest in this case? Or is logistic regression the only suitable method?
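For concreteness, here is a minimal sketch of the kind of comparison described above, using scikit-learn. The data are synthetic stand-ins (an assumption, not my real data): eight yes/no predictors and an outcome drawn from an additive logistic model. The coefficients, sample size, and model settings are all hypothetical. When the true relationship is close to additive like this, more flexible learners usually have little room to beat logistic regression on AUC.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real data (assumption): 8 binary predictors,
# outcome generated from an additive logistic model with made-up coefficients.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 8))
coefs = np.array([1.5, -1.0, 0.8, 0.0, 0.5, -0.7, 0.0, 1.2])
p = 1.0 / (1.0 + np.exp(-(X @ coefs - 1.0)))
y = rng.binomial(1, p)

# Candidate models; the hyperparameters are illustrative, not tuned.
models = {
    "logistic": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=15),
}

# Same stratified folds for every model so the AUCs are comparable.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    auc_scores[name] = scores.mean()
    print(f"{name}: mean AUC = {scores.mean():.3f} (sd {scores.std():.3f})")
```

Any of these methods will run on binary predictors; the question is only whether the extra flexibility buys anything measurable.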
The roc_auc and accuracy values are equal to those obtained from logistic regression.
I mean, are other machine learning methods feasible for this kind of data?
Fair point. I mean, it seems you've pretty much exhausted the options if your primary concern is AUC.
I went through a similar exploration journey and found very little discrepancy in AUCs. But I ended up settling on something like an optimal decision tree for good visualisation/prediction, i.e. a more white-box method.
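To illustrate the white-box idea, here is a rough sketch with a shallow tree. Note that it uses scikit-learn's greedy CART `DecisionTreeClassifier`, not a true optimal-tree solver, and the data, feature names (`x0` to `x5`), and depth limit are all hypothetical choices for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic binary predictors and outcome (assumption, for illustration only).
rng = np.random.default_rng(0)
feature_names = [f"x{i}" for i in range(6)]  # hypothetical names
X = rng.integers(0, 2, size=(1000, 6))
logit = 1.5 * X[:, 0] + 1.0 * X[:, 1] - 1.2 * X[:, 2] - 0.5
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# A shallow tree keeps the rule set small enough to read at a glance.
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=20, random_state=0)
tree.fit(X_train, y_train)

# Held-out AUC from the predicted probability of the positive class.
test_auc = roc_auc_score(y_test, tree.predict_proba(X_test)[:, 1])
print(f"test AUC = {test_auc:.3f}")

# Plain-text rendering of the fitted rules.
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

The printed rules can be read directly as yes/no checks on the predictors, which is the main appeal when the AUCs are otherwise indistinguishable.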