Scaling nominal, ordinal and binary variables in a dataset with numeric variables

Hi All,

Is it meaningful to scale nominal, ordinal and binary data that has been converted into numeric form, alongside the other numeric variables? If not, in which scenarios does it fail, and what are the alternatives?


I assume you want to scale the data to make sure it's well prepared for machine learning. Here are my thoughts on how to handle the different types:

  • Numerical data: no problem, you can just use the usual normalization (min-max) or standardization (z-score) functions
  • Binary data: since there are only two categories, you can just convert them to 0 and 1, which already lie in the [0, 1] range
  • Nominal data: it cannot be scaled as is, because even after converting the categories to numbers, they do not bear a numerical relationship to each other. The way to solve this is to convert them into one-hot vectors, where each category becomes its own binary variable (see figure below)
  • Ordinal data: you can use the same procedure as for nominal data. In theory, there is an order relationship between the categories, and if you can treat it as linear (evenly spaced levels), you could convert the ordinal classes to numeric values and then normalize them (e.g. small, medium, large --> 1, 2, 3 --> 0, 0.5, 1)
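
To make the four cases above concrete, here is a minimal sketch in plain Python. The toy rows, the column names (height_cm, smoker, color, size) and the helper functions min_max and one_hot are all made up for illustration, and the ordinal part assumes the levels are evenly spaced. Libraries such as scikit-learn provide ready-made equivalents (MinMaxScaler, OneHotEncoder, OrdinalEncoder), so treat this only as a way to see what happens to the values.

```python
# Hypothetical toy dataset: column names and values are invented for illustration.
rows = [
    {"height_cm": 150.0, "smoker": "no",  "color": "red",   "size": "small"},
    {"height_cm": 175.0, "smoker": "yes", "color": "green", "size": "large"},
    {"height_cm": 200.0, "smoker": "no",  "color": "blue",  "size": "medium"},
]


def min_max(values):
    """Rescale a list of numbers to the [0, 1] range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Return the sorted category list and one 0/1 vector per input value."""
    categories = sorted(set(values))
    vectors = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, vectors


# Numerical: ordinary min-max normalization.
height_scaled = min_max([r["height_cm"] for r in rows])

# Binary: map the two categories to 0 and 1; no further scaling is applied here.
smoker = [1 if r["smoker"] == "yes" else 0 for r in rows]

# Nominal: one binary column per category, so no order is implied.
color_categories, color_one_hot = one_hot([r["color"] for r in rows])

# Ordinal: map levels to integers, then normalize.
# Assumption: the steps small -> medium -> large are equally large.
size_order = {"small": 1, "medium": 2, "large": 3}
size_scaled = min_max([size_order[r["size"]] for r in rows])

print(height_scaled)
print(smoker)
print(color_categories, color_one_hot)
print(size_scaled)
```

If the even-spacing assumption for the ordinal variable does not hold, falling back to one-hot encoding (as for nominal data) avoids imposing distances that are not really there.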

Hope this helps
