'data.frame': 264833 obs. of 6 variables:
revenue : num 6 2.7 3.6 3 1.9 2.8 3.8 2.7 5.7 8.8 ...
pack_size : num 175 150 210 175 160 165 110 150 330 170 ...
life_stage: Factor w/ 7 levels "MIDAGE SINGLES/COUPLES",..: 7 7 6 6 4 1 7 7 2 7 ...
tier : Factor w/ 3 levels "Budget","Mainstream",..: 3 2 1 1 2 2 1 1 3 2 ...
month_year: Factor w/ 12 levels "Jul-2018","Aug-2018",..: 4 3 9 9 5 6 6 6 5 3 ...
brand : Factor w/ 29 levels "Burger","CCs",..: 14 18 9 14 29 3 11 19 7 7 ...
The data frame above is a sample; the original has around 2,000,000 rows. I have been fitting a linear regression model and running ANOVA on it.
For example, the linear regression model has the general form
lm_model <- lm(A ~ B * C * D * E + F, data = df)
which for my data is
lm_model <- lm(revenue ~ life_stage * tier * month_year * brand + pack_size, data = model_data)
This call takes forever to run.
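For scale, here is a rough back-of-the-envelope sketch of how large the design matrix behind this call could get. It assumes the default treatment contrasts and a dense double-precision matrix; the names `n_levels`, `n_coef`, and `design_gib` are made up for illustration, and the level counts are copied from the `str()` output above.

```r
# Factor level counts copied from the str() output above.
n_levels <- c(life_stage = 7, tier = 3, month_year = 12, brand = 29)

# With an intercept and treatment contrasts, the full interaction
# life_stage * tier * month_year * brand expands to one column per
# combination of levels; pack_size adds one more numeric column.
n_coef <- prod(n_levels) + 1

# Size in GiB of a dense double-precision matrix (8 bytes per cell)
# with n_rows rows and n_coef columns.
design_gib <- function(n_rows) n_rows * n_coef * 8 / 1024^3

print(n_coef)
print(round(design_gib(c(tenth = 26483, sampled = 264833, full = 2e6)), 1))
```

By this estimate the 264,833-row data frame alone would need a matrix on the order of 14 GiB, before counting any copies made while fitting, which may be part of why the call is so slow on a 16 GB machine.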
So I tested it on a 10% sample of that data frame:
'data.frame': 26483 obs. of 6 variables:
revenue : num 6.6 8.6 9.2 9.2 11.8 7.6 7.6 7.6 8.8 6.6 ...
pack_size : num 175 250 150 270 380 110 110 110 170 190 ...
life_stage: Factor w/ 7 levels "MIDAGE SINGLES/COUPLES",..: 5 6 4 6 5 7 1 4 6 6 ...
tier : Factor w/ 3 levels "Budget","Mainstream",..: 1 1 2 1 1 3 2 1 3 2 ...
month_year: Factor w/ 12 levels "Jul-2018","Aug-2018",..: 11 1 11 11 2 2 5 11 6 10 ...
brand : Factor w/ 29 levels "Burger","CCs",..: 24 26 13 26 21 12 12 11 7 3 ...
Then I ran the following on the sample:
1. lm_model <- lm(A ~ B * C * D * E, data = df_copy)
2. sample_indices <- sample(nrow(model_data), size = floor(0.1 * nrow(model_data)))
3. model_data_sample <- model_data[sample_indices, ]
4. lm_model_sample <- lm(revenue ~ life_stage * tier * month_year * brand + pack_size, data = model_data_sample)
5. predictions <- predict(lm_model_sample, newdata = model_data)
6. model_data$predicted_revenue <- predictions
The 4th line of code (the lm() call on the sample) took more than an hour to run. I don't know what's wrong, and I have been stuck on this for two days straight.
When I run the 5th line (the predict() call), RStudio hangs indefinitely, even though the model was fitted on the sample data only.
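Since predict() on the full model_data has to expand every row into the same wide set of columns, one thing that might reduce peak memory is predicting in chunks. This is only a minimal sketch on made-up toy data, not a tested fix for the model above; `predict_in_chunks` and `chunk_size` are hypothetical names, and it assumes `newdata` has at least one row.

```r
# Predict in row blocks so that only chunk_size rows are expanded
# into a model matrix at any one time.
predict_in_chunks <- function(model, newdata, chunk_size = 10000) {
  n <- nrow(newdata)
  out <- numeric(n)
  for (start in seq(1, n, by = chunk_size)) {
    idx <- start:min(start + chunk_size - 1, n)
    # Row subsetting keeps the factor levels, so each chunk is
    # encoded the same way as the full data frame.
    out[idx] <- predict(model, newdata = newdata[idx, , drop = FALSE])
  }
  out
}

# Toy data (invented for illustration): one factor and one numeric predictor.
set.seed(1)
toy <- data.frame(
  g = factor(sample(letters[1:3], 100, replace = TRUE)),
  x = runif(100)
)
toy$y <- 2 * toy$x + as.integer(toy$g) + rnorm(100)

fit <- lm(y ~ g + x, data = toy)
chunked <- predict_in_chunks(fit, toy, chunk_size = 30)
```

The chunked result should match a single predict() call on the whole data frame; the chunking changes how many rows are in memory at once and nothing else. It does not make the fit itself any cheaper.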
While this runs, the system uses up to 16 GB of RAM and up to 40 GB of virtual memory.
System configuration: 16 GB RAM at 2993 MHz, GPU RTX 2060, CPU AMD 2700X at 4 GHz.