Hi,
I have a large data set (5000*10).
Some columns are categorical and some are continuous.
I want to create a distance function that will treat different columns differently.
Let's say that for a categorical column the distance is 0 if the two observations have the same category and 1 otherwise.
For the continuous variables, I will calculate the regular Euclidean distance between the rows.
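To make this concrete, here is a rough sketch in Python of the kind of distance I have in mind. The function name `mixed_distance` and the index arguments are made up for illustration, and simply adding the categorical part to the continuous part is only an assumption on my side; I have not decided how the two should be combined or scaled.

```python
import numpy as np

def mixed_distance(a, b, categorical_idx, continuous_idx):
    """Distance between two rows with mixed column types (illustrative sketch)."""
    # Categorical columns: 0 if the categories match, 1 otherwise, summed up.
    cat_part = sum(a[i] != b[i] for i in categorical_idx)
    # Continuous columns: regular Euclidean distance on those columns only.
    x = np.array([a[i] for i in continuous_idx], dtype=float)
    y = np.array([b[i] for i in continuous_idx], dtype=float)
    cont_part = np.sqrt(np.sum((x - y) ** 2))
    # Assumption: the two parts are simply added, with no weighting or scaling.
    return float(cat_part + cont_part)

row1 = ["red", "small", 1.0, 2.0]
row2 = ["red", "large", 4.0, 6.0]
d = mixed_distance(row1, row2, categorical_idx=[0, 1], continuous_idx=[2, 3])
print(d)  # one mismatch (1) plus Euclidean distance 5.0 gives 6.0
```

Without scaling, the continuous part can easily dominate the 0/1 categorical part, which is one reason I am unsure about this combination.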
What is the best way to implement this?
I am open to hearing about other distance options for continuous or categorical variables.