Dear colleagues,
I am trying to add rows to a data frame in R, but I don't want the column names to be modified. Here is an example that should help explain the problem. Since the data is preloaded in R, it should be easy to reproduce.
First, I am not sure why the column names were renamed. Is it possible to keep the column names as col_name and unique_cnt?
Second, I originally defined the column unique_cnt as a numeric data type, but because the values are coerced into a vector vec, the final data type turns out to be character. Is it possible to keep the data type that was set when the data frame was first defined?
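The example code itself is not shown above, so the following is only a minimal sketch of the likely cause, assuming each new row was built with c() (as the mention of "a vector vec" suggests). The column name "some_column" and the count 35L are made-up illustration values. An atomic vector can hold only one type, so a string and a number combined with c() both become character. Building the row as a one-row data frame with matching names keeps both the names and the types.

```r
# c() returns an atomic vector, so mixed inputs are coerced to a single type
vec <- c("some_column", 35L)   # hypothetical values, for illustration only
class(vec)                     # "character": the count is now a string

# Empty frame with the intended column names and types
var_unique <- data.frame(col_name = character(0), unique_cnt = integer(0),
                         stringsAsFactors = FALSE)

# Build the new row as a one-row data frame with the same column names
new_row <- data.frame(col_name = "some_column", unique_cnt = 35L,
                      stringsAsFactors = FALSE)

# rbind() matches columns by name, so names and types are kept
var_unique <- rbind(var_unique, new_row)
str(var_unique)
```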
Since you already know what the resulting data.frame should look like, I suggest you preallocate it and fill it with values. You shouldn't have any problems then.
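The preallocation idea could look roughly like the sketch below. It is not code from this thread: it uses the built-in iris data set as a stand-in for Soybean so that it runs without extra packages, and the names dat and cols are hypothetical helpers.

```r
# Stand-in data (assumption): built-in iris instead of Soybean from mlbench
dat  <- iris
cols <- names(dat)[2:5]

# Preallocate the result with its final size, column names, and types
var_unique <- data.frame(
  col_name   = character(length(cols)),
  unique_cnt = integer(length(cols)),
  stringsAsFactors = FALSE
)

# Fill one row per column; length() returns an integer, so the type is kept
for (i in seq_along(cols)) {
  var_unique$col_name[i]   <- cols[i]
  var_unique$unique_cnt[i] <- length(unique(dat[[cols[i]]]))
}

str(var_unique)
```

Because the frame is created once with its final shape, nothing is coerced or renamed while the loop runs, and no repeated rbind() copies are made.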
Yes, of course, but your code is really diffuse. You start with a data.frame that already has names and add rows to it, but then you discard it and overwrite your object. To keep it short, how about replacing it with the following:
library(mlbench)
data("Soybean")
library(tidyverse)

microbenchmark::microbenchmark(
  # first: grow the data frame with rbind(), adding a one-row data frame per column
  first = {
    var_unique <- data.frame(col_name = character(0), unique_cnt = integer(0))
    col_names <- names(Soybean)
    for (i in 2:5) {
      variable <- col_names[i]
      cnt <- length(unique(Soybean[, variable]))
      vec <- data.frame(col_name = variable, unique_cnt = cnt)
      var_unique <- rbind(var_unique, vec)
    }
  },
  # second: count unique values in every column, reshape to long format, keep rows 2 to 5
  second = {
    soyd <- summarise_all(Soybean, ~length(unique(.))) %>%
      pivot_longer(cols = everything(),
                   names_to = "col_name",
                   values_to = "unique_cnt") %>%
      slice(2:5)
  },
  unit = "s"
)
Unit: seconds
   expr       min         lq       mean     median        uq       max neval
  first 0.0093508 0.01260855 0.01612717 0.01476915 0.0186775 0.0387230   100
 second 0.0207036 0.02757900 0.03348093 0.03274780 0.0385336 0.0876993   100