Factors and dummy variables

My general understanding is that, in R, nominal categorical variables (with 2 or more levels) must first be converted into factors and THEN into dummy variables (k-1 dummy variables for k levels, i.e. dummy encoding). Is that correct?

Once the categorical variable -> factor -> dummy variables transformation is done, we can use the dummy variables as independent or dependent variables in a statistical model. (P.S.: lm() in R does the dummy variable conversion automatically, but I am not sure whether that is true for other modeling functions.)

What if we converted the categorical variable straight into dummy variables, without the intermediate factor() step? Would that still work in R if we passed the dummy variables to a statistical model? I think so, which means we could skip the conversion to factors altogether.
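To make the question concrete, here is a minimal sketch in base R. The data frame and the column names (y, color, color_green, color_red) are made up for illustration. It fits the same model twice: once passing the character column straight to lm(), and once with hand-built 0/1 columns and no factor() call anywhere.

```r
# Toy data; the column names y and color are made up for illustration.
df <- data.frame(
  y = c(10, 12, 15, 11, 14, 16),
  color = c("red", "green", "blue", "red", "green", "blue"),
  stringsAsFactors = FALSE
)

# 1) Pass the character column directly: lm() treats it as a factor
#    internally and builds the k - 1 dummy columns itself.
fit_auto <- lm(y ~ color, data = df)

# 2) Build the k - 1 dummies by hand, with no factor() step at all.
#    "blue" is left out, so it acts as the reference level.
df$color_green <- as.integer(df$color == "green")
df$color_red   <- as.integer(df$color == "red")
fit_manual <- lm(y ~ color_green + color_red, data = df)

# Both fits describe the same model, so the fitted values should agree.
all.equal(unname(fitted(fit_auto)), unname(fitted(fit_manual)))
```

If this sketch is right, the factor() step is not strictly required for lm(). An explicit factor can still be handy when you want to choose the reference level yourself, for example with relevel().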



In fact, there is a function dummy_cols() in the fastDummies package to help you do exactly that.
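A minimal sketch of how that might look, assuming the fastDummies package is installed. The data frame and its column names are hypothetical, and it is worth checking the package documentation for the exact behavior of each argument in your installed version.

```r
# Assumes fastDummies is installed: install.packages("fastDummies")
library(fastDummies)

# Hypothetical data with a plain character column, as read from a CSV.
df <- data.frame(
  id = 1:4,
  color = c("red", "green", "blue", "red"),
  stringsAsFactors = FALSE
)

# select_columns names the column(s) to expand.
# remove_first_dummy = TRUE drops one level, so k levels give k - 1 dummies.
dummies <- dummy_cols(df, select_columns = "color", remove_first_dummy = TRUE)

# Inspect the new 0/1 columns that were appended to the original data.
names(dummies)
head(dummies)
```

With three colors and remove_first_dummy = TRUE, the result should have two extra dummy columns alongside the original ones.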

Thank you.
So there is no need to convert the categorical variables in the imported CSV dataset into factors. We can use dummy_cols() directly on the column data.

