Using tidyverse for paired calculations from multiple data frames

m.west · October 14, 2019, 2:18pm

Hello,

I am trying to use tidyverse (I hope this is the right term) to run some calculations. I need to first take the means of the control values (from df_1) and then use these means in the calculations for the treatments (df_2). The issue is that the controls are paired with specific treatments, and I'm not sure how to do this without very long if-else statements, so I thought I'd try it this way. Unfortunately, it's not working. Maybe if-else is the way to go? (But I would still need to use two separate data frames!) Many thanks in advance.

The if-else statement would go something like this (shown for two treatments; there are 10 treatments in total):

if (treatment == "A") {
  df_1$mean_value <- (control_A / reading_A) * (v / t)
} else if (treatment == "B") {
  df_1$mean_value <- (control_B / reading_B) * (v / t)
}

library(tidyverse)

df_1 <- df %>%
  group_by(p_control) %>%
  mutate(control_mean = mean(reading, na.rm = TRUE)) %>% # mean reading per control
  select(treatment, control_mean) %>%
  distinct() %>%
  as.data.frame()
df_1

t <- 7  # length of trial
v <- 10 # volume, L

df_2 <- df %>%
  arrange(treatment_df2, bio_rep, technical_rep) %>% # organize rows in this nested order
  group_by(treatment_df2) %>%
  # pseudocode: control_mean_of_paired_control is a placeholder for the
  # df_1$control_mean of the control paired with this treatment
  mutate(new_stat = (control_mean_of_paired_control / value_treatment_df2) * (v / t)) %>% # make a new column that calculates the wanted value
  group_by(treatment_df2) %>%
  mutate(n_fr = n()) %>% # count the rows per treatment
  filter(n_fr > 0, # keep only treatments with at least one row
         !is.na(value_treatment_df2)) %>% # drop rows with missing values
  mutate(mean_value_treatment_df2_per_treatment = mean(new_stat, na.rm = TRUE)) %>% # get the mean of the calculated value per treatment
  select(treatment_df2, mean_value_treatment_df2_per_treatment) %>% # select certain columns
  distinct() %>% # drop duplicates
  as.data.frame()
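
For reference, one way the pairing might be handled without a chain of if-else statements is a join: compute one mean per control, attach it to each treatment row through a column that names the paired control, and then do the calculation row by row. The sketch below is only an illustration under assumptions. The data frames `controls` and `treatments`, the columns `control`, `paired_control`, `reading` and `value`, and all the numbers are made up; only the formula (control mean / treatment value) * (v / t) is taken from the code above.

```r
library(dplyr)

# Hypothetical toy data (not the real experiment).
controls <- data.frame(
  control = c("ctrl_A", "ctrl_A", "ctrl_B", "ctrl_B"),
  reading = c(10, 12, 20, 22)
)
# Each treatment row names the control it is paired with.
treatments <- data.frame(
  treatment      = c("A", "A", "B", "B"),
  paired_control = c("ctrl_A", "ctrl_A", "ctrl_B", "ctrl_B"),
  value          = c(5, 6, 10, 11)
)

trial_length <- 7   # length of trial (t above)
volume       <- 10  # volume, L (v above)

# Step 1: one mean reading per control.
control_means <- controls %>%
  group_by(control) %>%
  summarise(control_mean = mean(reading, na.rm = TRUE))

# Step 2: attach the paired control mean to every treatment row,
# compute the statistic, then average it per treatment.
result <- treatments %>%
  left_join(control_means, by = c("paired_control" = "control")) %>%
  mutate(new_stat = (control_mean / value) * (volume / trial_length)) %>%
  group_by(treatment) %>%
  summarise(mean_new_stat = mean(new_stat, na.rm = TRUE))

result
```

With a lookup column like `paired_control`, going from 2 treatments to 10 would only mean more rows in the data, not more branches in the code.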

andresrcs · October 14, 2019, 2:19pm

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

FAQ: How to do a minimal reproducible example (reprex) for beginners

A minimal reproducible example consists of the following items:

- A minimal dataset, necessary to reproduce the issue
- The minimal runnable code necessary to reproduce the issue, which can be run on the given dataset, and including the necessary information on the used packages.

Let's quickly go over each one of these with examples:

Minimal Dataset (Sample Data)

You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue. Let's say, as an example, that you are working with the iris data frame

head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.…

m.west · October 14, 2019, 2:51pm

library(datapasta)
library(dplyr)

summary_df <- df %>%
  group_by(plate_control) %>%
  mutate(control_mean = mean(use_fluor_control, na.rm = TRUE)) %>%
  select(plate_control, control_mean) %>%
  distinct() %>%
  as.data.frame()
summary_df

m.west · October 14, 2019, 2:57pm

Sorry, I can't get datapasta to work...

andresrcs · October 14, 2019, 3:27pm

Well, if for some strange reason datapasta is not working, you can still use dput(). The thing is, we need sample data to reproduce your issue.
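
As a rough sketch of the dput() route: the data frame below is an invented stand-in. The column names are borrowed from the snippet posted earlier, but the values are made up, so swap in the real data frame.

```r
# Hypothetical stand-in for the real data; replace with your own data frame.
df <- data.frame(
  plate_control     = c("ctrl_A", "ctrl_A", "ctrl_B"),
  use_fluor_control = c(10.5, 11.2, 20.1)
)

# Print R code that recreates the first rows (at most 20 here);
# copy that printed structure(...) call into the post.
dput(head(df, 20))
```

Anyone reading the post can then assign the pasted structure(...) call to a variable to get the same sample data back.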

phiggins · October 15, 2019, 9:36am

Try this guide

https://cran.r-project.org/web/packages/datapasta/vignettes/how-to-datapasta.html

followed by this one
