[stuck] RStudio taking forever to execute

'data.frame': 264833 obs. of 6 variables:
 $ revenue   : num 6 2.7 3.6 3 1.9 2.8 3.8 2.7 5.7 8.8 ...
 $ pack_size : num 175 150 210 175 160 165 110 150 330 170 ...
 $ life_stage: Factor w/ 7 levels "MIDAGE SINGLES/COUPLES",..: 7 7 6 6 4 1 7 7 2 7 ...
 $ tier      : Factor w/ 3 levels "Budget","Mainstream",..: 3 2 1 1 2 2 1 1 3 2 ...
 $ month_year: Factor w/ 12 levels "Jul-2018","Aug-2018",..: 4 3 9 9 5 6 6 6 5 3 ...
 $ brand     : Factor w/ 29 levels "Burger","CCs",..: 14 18 9 14 29 3 11 19 7 7 ...
The df above is a sample; the original one has around 2,000,000 rows. I've been using a linear regression model and ANOVA.

For example, the linear regression model has this form:

lm_model <- lm(A ~ B * C * D * E + F, data = df)

With my variables:

lm_model <- lm(revenue ~ life_stage * tier * month_year * brand + pack_size, data = model_data)

The call above has been taking forever to run.
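One possible reason for the slowdown is the size of the dense design matrix that lm() has to build for a four-way interaction of these factors. The sketch below is only back-of-the-envelope arithmetic: the level counts and row counts are taken from the str() output and the text above, while the variable names (n_levels, n_coef, n_rows, gib) are made up for illustration. It assumes every column is stored as an 8-byte double and ignores the extra working copies that lm() makes.

```r
# Factor level counts, as reported by str() above
n_levels <- c(life_stage = 7, tier = 3, month_year = 12, brand = 29)

# A full interaction of the four factors has one coefficient per combination
# of levels (intercept included); pack_size adds one more column.
n_coef <- prod(n_levels) + 1
print(n_coef)  # 7309

# Row counts: the 10% sample, the posted df, and the approximate original
n_rows <- c(sample_10pct = 26483, posted_df = 264833, original_approx = 2e6)

# Dense double-precision design matrix size in GiB (rows x columns x 8 bytes)
gib <- n_rows * n_coef * 8 / 2^30
print(round(gib, 1))
```

If these assumptions hold, even the 10% sample needs a design matrix of more than a gigabyte, and the posted df alone approaches the 16 GB of RAM mentioned in the post.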

So I tested it on a 10% sample of the original df:

'data.frame': 26483 obs. of 6 variables:
 $ revenue   : num 6.6 8.6 9.2 9.2 11.8 7.6 7.6 7.6 8.8 6.6 ...
 $ pack_size : num 175 250 150 270 380 110 110 110 170 190 ...
 $ life_stage: Factor w/ 7 levels "MIDAGE SINGLES/COUPLES",..: 5 6 4 6 5 7 1 4 6 6 ...
 $ tier      : Factor w/ 3 levels "Budget","Mainstream",..: 1 1 2 1 1 3 2 1 3 2 ...
 $ month_year: Factor w/ 12 levels "Jul-2018","Aug-2018",..: 11 1 11 11 2 2 5 11 6 10 ...
 $ brand     : Factor w/ 29 levels "Burger","CCs",..: 24 26 13 26 21 12 12 11 7 3 ...
Then I ran it again:

lm_model <- lm(A ~ B * C * D * E, data = df_copy)

1. set.seed(123)
2. sample_indices <- sample(nrow(model_data), size = floor(0.1 * nrow(model_data)))
3. model_data_sample <- model_data[sample_indices, ]
4. lm_model_sample <- lm(revenue ~ life_stage * tier * month_year * brand + pack_size, data = model_data_sample)
5. predictions <- predict(lm_model_sample, newdata = model_data)
6. model_data$predicted_revenue <- predictions

The 4th line of code took more than an hour to run. I don't know what's wrong, and I've been stuck on this for 2 days straight.

When I run the 5th line of code, RStudio gets stuck indefinitely, even though the model was fitted on the sampled data.
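One workaround that might be worth trying, sketched here under stated assumptions and not as a drop-in fix: because the formula interacts all four factors fully and adds a single common pack_size slope, the fitted values should match a "one group per factor combination" model, which can be computed from group means without building the large design matrix. The data below is simulated with smaller factors (it is not the poster's data), and names such as d, cell, slope and fitted_fast are hypothetical.

```r
# Simulated stand-in data (hypothetical; fewer levels than the real df)
set.seed(1)
n <- 2000
d <- data.frame(
  life_stage = factor(sample(letters[1:3], n, replace = TRUE)),
  tier       = factor(sample(c("Budget", "Mainstream", "Premium"), n, replace = TRUE)),
  month_year = factor(sample(month.abb[1:4], n, replace = TRUE)),
  brand      = factor(sample(LETTERS[1:5], n, replace = TRUE)),
  pack_size  = sample(seq(110, 380, by = 10), n, replace = TRUE)
)
d$revenue <- 2 + 0.01 * d$pack_size + rnorm(n)

# One group per observed combination of the four factors
cell <- interaction(d$life_stage, d$tier, d$month_year, d$brand, drop = TRUE)

# Per-cell means of the response and of pack_size (ave() defaults to mean)
y_bar <- ave(d$revenue, cell)
x_bar <- ave(d$pack_size, cell)

# Common pack_size slope estimated from within-cell deviations
slope <- sum((d$pack_size - x_bar) * (d$revenue - y_bar)) /
  sum((d$pack_size - x_bar)^2)

# Fitted value = cell mean, shifted along the common slope
fitted_fast <- y_bar + slope * (d$pack_size - x_bar)

# Cross-check against lm() on this small data set
fit <- lm(revenue ~ life_stage * tier * month_year * brand + pack_size, data = d)
print(all.equal(unname(fitted(fit)), fitted_fast))
```

This only reproduces fitted values and the pack_size slope; it does not give the coefficient table or the ANOVA decomposition, and combinations that never appear in the fitting data have no cell mean to predict from.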

The system uses up to 16 GB of RAM and up to 40 GB of virtual memory.

System configuration: RAM 16 GB at 2993 MHz, GPU RTX 2060, CPU AMD 2700X at 4 GHz.

Can you send us some sample data?

A handy way to supply data is to use the dput() function. Do dput(mydata) where "mydata" is the name of your dataset. Probably dput(head(mydata, 100)) will do here. Paste the output between
```

```
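As a rough illustration of the dput() suggestion, the sketch below uses a tiny made-up data frame (mydata here is a hypothetical stand-in, not the poster's data) and shows that the printed text can be pasted back into R to rebuild the same object.

```r
# Hypothetical stand-in for the real data set
mydata <- data.frame(
  revenue   = c(6, 2.7, 3.6, 3, 1.9),
  pack_size = c(175, 150, 210, 175, 160),
  tier      = factor(c("Premium", "Mainstream", "Budget", "Budget", "Mainstream"))
)

# Print a text representation of the first few rows; this is what gets pasted
small <- head(mydata, 3)
dput(small)

# Anyone reading the post can rebuild the object from that text
dput_text <- capture.output(dput(small))
restored <- eval(parse(text = dput_text))
print(all.equal(restored, small))
```

For a large data set, dput(head(mydata, 100)) keeps the pasted output manageable while preserving column types and factor levels.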
