I would appreciate it if anyone could explain what Var1 means in data_centroid[, Clust := Var1]. Also, does Clust mean that a new column named Clust is created in data_centroids, which is a matrix?
I do not use the data.table package myself, so this confused me too. Var1 is a column name produced by the melt() function when it is applied to the dataset. To add to the confusion, data.table has its own melt function, but the one used here comes from the reshape2 package, which will be used if it is installed.
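When a matrix is melted (i.e., converted to long format), its row names (here the cluster numbers) become the column Var1 and its column names become the column Var2. The special assignment operator := in data.table then adds a new column named Clust that is a copy of Var1; Var1 itself is kept. So yes, Clust is the name of the new column, but at that point the object has to be a data.table rather than a plain matrix, because := only works inside a data.table.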
library(data.table)
#Generate some data
set.seed(1) #Only needed for reproducibility
myData = data.frame(
  x = runif(10),
  y = runif(10)
)
#Run k-means
test = kmeans(myData, 3)
#Show centers table
test$centers
#>           x         y
#> 1 0.8450967 0.4948642
#> 2 0.6168256 0.8394645
#> 3 0.2252752 0.4824545
#Melt table
test = data.table(reshape2::melt(test$centers))
test
#>    Var1 Var2     value
#> 1:    1    x 0.8450967
#> 2:    2    x 0.6168256
#> 3:    3    x 0.2252752
#> 4:    1    y 0.4948642
#> 5:    2    y 0.8394645
#> 6:    3    y 0.4824545
#Copy column Var1 into a new column named Clust (Var1 is kept)
test[, Clust := Var1]
test
#>    Var1 Var2     value Clust
#> 1:    1    x 0.8450967     1
#> 2:    2    x 0.6168256     2
#> 3:    3    x 0.2252752     3
#> 4:    1    y 0.4948642     1
#> 5:    2    y 0.8394645     2
#> 6:    3    y 0.4824545     3
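If the extra Var1 column is not wanted, there are a couple of ways to end up with a Clust column directly. The following is only a sketch, assuming reshape2 and data.table are both installed. The centers matrix is a hypothetical stand-in for test$centers with rounded values, and the column name Coord is made up for illustration.

```r
library(data.table)

# Hypothetical stand-in for test$centers: a 3 x 2 matrix with dimnames
centers <- matrix(
  c(0.85, 0.62, 0.23, 0.49, 0.84, 0.48),
  nrow = 3,
  dimnames = list(c("1", "2", "3"), c("x", "y"))
)

# Option 1: name the columns while melting, so Var1/Var2 never appear
# (varnames is an argument of reshape2's melt method for matrices)
named <- data.table(reshape2::melt(centers, varnames = c("Clust", "Coord")))
names(named)

# Option 2: melt with the default names, then rename Var1 in place
renamed <- data.table(reshape2::melt(centers))
setnames(renamed, "Var1", "Clust")
names(renamed)

# For contrast: := adds Clust as a copy and leaves Var1 in the table
copied <- data.table(reshape2::melt(centers))
copied[, Clust := Var1]
names(copied)
```

Both setnames() and := modify the data.table by reference, so there is no need to assign the result back to the variable.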