How to optimise code?

nbaes · April 18, 2022, 11:52pm

Context: I realise I am repeating the same thing three times below (computing a weighted average for three different variables) and am wondering if anyone has an idea of how to optimise it. I thought a for loop might be too complicated.
Broader question: Is this kind of repetition bad practice in code? I am looking to share this publicly and am cleaning up my script (beginner here). Any tips or advice would be much appreciated.

  df_word2 <- df_word2 %>% mutate(VA_prod=(repet*VA_mean_sum)) # prep: product of repeats * mean V+A ratings
  sumVAprod_word <- aggregate(df_word2[, 8], list(df_word2$year), sum) %>% rename(c("year"="Group.1", "sumVAprod_word"="x"))  # sum VA_prod by year
  df_word2 <- df_word2 %>% mutate(A_prod=(repet*A_mean_sum)) # prep: product of repeats * mean A ratings
  sumAprod_word <- aggregate(df_word2[, 9], list(df_word2$year), sum) %>% rename(c("year"="Group.1", "sumAprod_word"="x")) # sum A_prod by year
  df_word2 <- df_word2 %>% mutate(V_prod=(repet*V_mean_sum_r)) # prep: product of repeats * mean V ratings 
  sumVprod_word <- aggregate(df_word2[, 10], list(df_word2$year), sum) %>% rename(c("year"="Group.1", "sumVprod_word"="x")) # sum V_prod by year

I think these three lines may be the easiest to optimise:

  # compute standardisation
  word_year <- word_year %>% mutate(sev_word=(sumVAprod_word/sum_repet_word)) # V+A
  word_year <- word_year %>% mutate(aro_word=(sumAprod_word/sum_repet_word)) # A
  word_year <- word_year %>% mutate(val_word=(sumVprod_word/sum_repet_word)) # V

williaml · April 19, 2022, 12:40am

You could wrap the repeated step in a function and call it once per variable. You could even combine it with one of the purrr::map() functions to loop over the variables.
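
As a rough illustration of that idea, here is a minimal sketch. The helper name sum_prod_by_year and the toy values are made up for the example; only the column names come from the original post. It assumes df_word2 has the columns year, repet, VA_mean_sum, A_mean_sum and V_mean_sum_r, and it uses group_by()/summarise() in place of aggregate(), which is a swap of technique rather than the poster's own method.

```r
library(dplyr)
library(purrr)

# Toy stand-in for df_word2 (hypothetical values, real column names)
df_word2 <- data.frame(
  year         = c(2000, 2000, 2001, 2001),
  repet        = c(2, 3, 1, 4),
  VA_mean_sum  = c(1.0, 2.0, 3.0, 4.0),
  A_mean_sum   = c(0.5, 1.5, 2.5, 3.5),
  V_mean_sum_r = c(2.0, 1.0, 4.0, 3.0)
)

# Hypothetical helper: per year, sum repet * <rating column>
# and store the result in a column called out_name
sum_prod_by_year <- function(data, rating_col, out_name) {
  data %>%
    group_by(year) %>%
    summarise("{out_name}" := sum(repet * .data[[rating_col]]),
              .groups = "drop")
}

# Names are the output columns, values are the rating columns
rating_cols <- c(
  sumVAprod_word = "VA_mean_sum",
  sumAprod_word  = "A_mean_sum",
  sumVprod_word  = "V_mean_sum_r"
)

# Build one small table per rating, then join them all on year
sums_by_year <- imap(rating_cols, ~ sum_prod_by_year(df_word2, .x, .y)) %>%
  reduce(function(a, b) left_join(a, b, by = "year"))

sums_by_year
```

With the toy data, year 2000 gets sumVAprod_word = 2*1 + 3*2 = 8. Referring to columns by name also avoids the hard-coded positions (8, 9, 10), which would silently break if a column were added or reordered.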

Otherwise, for the last bit, you can put all three columns in a single mutate() call:

word_year <- word_year %>% mutate(sev_word=(sumVAprod_word/sum_repet_word), # V+A
                                  aro_word=(sumAprod_word/sum_repet_word), # A
                                  val_word=(sumVprod_word/sum_repet_word)) # V
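
Since each of these ratios is a sum of products divided by a sum of weights, the intermediate sums might not be needed at all. The sketch below computes the three weighted averages per year directly with base R's weighted.mean(). It assumes that sum_repet_word is the per-year sum of repet, which the thread does not confirm, and the toy values are hypothetical.

```r
library(dplyr)

# Toy stand-in for df_word2 (hypothetical values, real column names)
df_word2 <- data.frame(
  year         = c(2000, 2000, 2001, 2001),
  repet        = c(2, 3, 1, 4),
  VA_mean_sum  = c(1.0, 2.0, 3.0, 4.0),
  A_mean_sum   = c(0.5, 1.5, 2.5, 3.5),
  V_mean_sum_r = c(2.0, 1.0, 4.0, 3.0)
)

# Assumption: sum_repet_word is sum(repet) within each year
word_year <- df_word2 %>%
  group_by(year) %>%
  summarise(
    sum_repet_word = sum(repet),
    sev_word = weighted.mean(VA_mean_sum,  w = repet), # V+A
    aro_word = weighted.mean(A_mean_sum,   w = repet), # A
    val_word = weighted.mean(V_mean_sum_r, w = repet), # V
    .groups = "drop"
  )

word_year
```

For year 2000 in the toy data, sev_word is (2*1 + 3*2) / (2 + 3) = 1.6, the same value the sum-then-divide approach would give.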

Anyway, a reproducible example of df_word2 and word_year would help.

FAQ: How to do a minimal reproducible example (reprex) for beginners

A minimal reproducible example consists of the following items:

- A minimal dataset, necessary to reproduce the issue
- The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages.

Let's quickly go over each one of these with examples:

Minimal Dataset (Sample Data)

You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame

  head(iris)
  #>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
  #> 1          5.1         3.5          1.4         0.…
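
For this thread, one way to share such a sample is dput(), which prints R code that rebuilds an object exactly. A minimal sketch, using a made-up stand-in for df_word2 (the values are hypothetical; on the real data you would only run the last line):

```r
# Toy stand-in for df_word2 (hypothetical values, real column names)
df_word2 <- data.frame(
  year         = c(2000, 2000, 2001, 2001),
  repet        = c(2, 3, 1, 4),
  VA_mean_sum  = c(1.0, 2.0, 3.0, 4.0),
  A_mean_sum   = c(0.5, 1.5, 2.5, 3.5),
  V_mean_sum_r = c(2.0, 1.0, 4.0, 3.0)
)

# Print a copy-pasteable definition of the first few rows;
# paste the printed structure(...) call into the post
dput(head(df_word2, 3))
```

Anyone reading the post can then assign the pasted structure(...) call to df_word2 and run the code against the same rows.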
