Let's say there are
a) three factor covariates
b) a Y variable, either continuous or binary
If I had to keep just one covariate and run PCA on the other two (just for the sake of ideation, not real modelling) to predict Y, it would make sense to keep the covariate that predicts Y with the least error, correct (assuming no overfitting)?
So then what would be the best approach to find this one covariate? Correlation? Regression/logit after one-hot coding? Something else?
I would first check the VIFs in a model with all the variables. If the VIFs are all under 5, I would fit an OLS model (or a logit model if Y is binary) and report standard errors and p-values in my analysis.
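To make the VIF check concrete, here is a minimal sketch using only NumPy. Everything in it is an assumption for illustration: the data are simulated, `one_hot` and `vif` are hypothetical helper names, and the third factor is deliberately built to mostly copy the first so that a high VIF shows up. It computes one VIF per dummy column as 1 / (1 - R^2); for multi-level factors a generalized VIF per factor is sometimes preferred, which this sketch does not cover.

```python
import numpy as np

def one_hot(codes):
    """Dummy-code an integer-coded factor, dropping the first level as reference."""
    levels = np.unique(codes)[1:]
    return (codes[:, None] == levels[None, :]).astype(float)

def vif(X):
    """VIF of each column: 1 / (1 - R^2) from regressing it on all other columns."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        # Regress column j on an intercept plus the remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

# Simulated data (assumption): three factors with three levels each.
rng = np.random.default_rng(0)
n = 500
f1 = rng.integers(0, 3, size=n)
f2 = rng.integers(0, 3, size=n)
# f3 copies f1 about 95% of the time, so f1 and f3 are strongly related.
f3 = np.where(rng.random(n) < 0.95, f1, rng.integers(0, 3, size=n))

# Columns 0-1 belong to f1, columns 2-3 to f2, columns 4-5 to f3.
X = np.column_stack([one_hot(f1), one_hot(f2), one_hot(f3)])
vifs = vif(X)

print("VIF per dummy column:", np.round(vifs, 2))
print("Columns above the threshold of 5:", np.where(vifs > 5)[0])
```

In this simulated setup the dummies for the first and third factors should be flagged, while those for the second factor should stay close to 1.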
If the VIFs are too high, I would fit an elastic net model with k-fold cross-validation and report standardized coefficients, perhaps bootstrapping those coefficients to see how stable they are.
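A rough sketch of that elastic net route, assuming scikit-learn is available and a continuous Y. The data are simulated, the `l1_ratio` grid and the 200 bootstrap replicates are arbitrary choices, and `one_hot` is a hypothetical helper. "Standardized" here means the dummy columns are scaled to unit variance before fitting. The bootstrap keeps the tuned penalty fixed, which is a simplification that ignores the uncertainty from tuning.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, ElasticNetCV
from sklearn.preprocessing import StandardScaler

def one_hot(codes):
    """Dummy-code an integer-coded factor, dropping the first level as reference."""
    levels = np.unique(codes)[1:]
    return (codes[:, None] == levels[None, :]).astype(float)

# Simulated data (assumption): f3 mostly copies f1, so the dummies are collinear.
rng = np.random.default_rng(0)
n = 400
f1 = rng.integers(0, 3, size=n)
f2 = rng.integers(0, 3, size=n)
f3 = np.where(rng.random(n) < 0.95, f1, rng.integers(0, 3, size=n))
y = 1.5 * (f1 == 2) - 1.0 * (f2 == 1) + rng.normal(size=n)

# Scale each dummy column to mean 0 and unit variance so coefficients are comparable.
X = StandardScaler().fit_transform(
    np.column_stack([one_hot(f1), one_hot(f2), one_hot(f3)])
)

# 5-fold cross-validation over the penalty strength and the L1/L2 mix.
cv_model = ElasticNetCV(l1_ratio=[0.2, 0.5, 0.8], cv=5, random_state=0).fit(X, y)
std_coefs = cv_model.coef_

# Nonparametric bootstrap: resample rows, refit with the tuned penalty held fixed.
n_boot = 200
boot = np.empty((n_boot, X.shape[1]))
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)
    refit = ElasticNet(alpha=cv_model.alpha_, l1_ratio=cv_model.l1_ratio_)
    boot[b] = refit.fit(X[idx], y[idx]).coef_

# Percentile intervals for each standardized coefficient.
ci_low, ci_high = np.percentile(boot, [2.5, 97.5], axis=0)

print("standardized coefficients:", np.round(std_coefs, 3))
print("95% bootstrap interval, lower:", np.round(ci_low, 3))
print("95% bootstrap interval, upper:", np.round(ci_high, 3))
```

For a binary Y, a penalized logistic regression would be the analogous choice; the resampling loop would stay the same.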
As for the PCA on factor variables: you can't perform PCA on the factor variables themselves. You can perform it on their dummy variables, but that seems a little strange, and I'm not sure it will be fruitful for you. Having principal components in the model will certainly reduce its descriptive value. That may be acceptable if it improves the predictive value, but I'm not sure it will, assuming you use a model that can control overfitting (like elastic net).
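To illustrate what PCA on dummy variables looks like and why the components are hard to describe, here is a small NumPy-only sketch. The two factors are simulated, and `one_hot_full` is a hypothetical helper that keeps every level. PCA is done via an SVD of the centered indicator matrix; this is one common way to compute it, not the only one.

```python
import numpy as np

def one_hot_full(codes):
    """Indicator-code an integer-coded factor, keeping every level."""
    levels = np.unique(codes)
    return (codes[:, None] == levels[None, :]).astype(float)

# Simulated data (assumption): one factor with 3 levels, one with 4 levels.
rng = np.random.default_rng(0)
n = 400
f2 = rng.integers(0, 3, size=n)
f3 = rng.integers(0, 4, size=n)

D = np.column_stack([one_hot_full(f2), one_hot_full(f3)])  # 7 indicator columns
Dc = D - D.mean(axis=0)                                    # center each column

# PCA via SVD: rows of Vt are the loadings, s**2 is proportional to the variance.
U, s, Vt = np.linalg.svd(Dc, full_matrices=False)
explained = s**2 / np.sum(s**2)
scores = Dc @ Vt.T  # component scores that would enter the model

print("explained variance ratio:", np.round(explained, 3))
print("loadings of the first component:", np.round(Vt[0], 3))
```

Each factor's indicators sum to one, so only 5 of the 7 components carry any variance here. The first component's loadings also mix levels of both factors, which is the loss of descriptive value mentioned above.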