I am working on a dataset and, after some discussion with my group, we suspect that one or more of our controls may differ from the others. The motivation is to see whether one or more controls have been affected differently by the solvent they were kept in.
It has been suggested that I use a bootstrap method. Suppose I have a dataset with 5 controls and 3 treated samples. I want to create 5 new data frames, where each one skips one of the 5 controls and fills its place by resampling with replacement from the remaining controls. I want to see how stable the controls are and whether skipping a specific control changes the DEG results.
Let us suppose that the original data frame is like this:
x <- round(matrix(rexp(480 * 10, rate=.1), ncol=8), 0)
rownames(x) <- paste("gene", 1:nrow(x))
colnames(x) <-c("control1","control2","control3","control4","control5","treated","treated","treated")
head(x)
I want to create 5 new data frames (as there are 5 controls in this study), where each data frame skips one specific control and replaces it with another control (which means some other control will be repeated).
For example, one of the 5 data frames could look like this:
x1 <- round(matrix(rexp(480 * 10, rate=.1), ncol=8), 0)
rownames(x1) <- paste("gene", 1:nrow(x1))
colnames(x1) <-c("control1","control1.1","control3","control4","control5","treated","treated","treated")
head(x1)
You can see that this new data frame replaces control2 with a copy of control1, called control1.1.
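To make the idea concrete, here is a minimal sketch of how I imagine the 5 data frames could be built in base R. It is only an illustration under my own assumptions: the helper `replace_control` is a hypothetical name, the donor control is picked at random from the remaining four, the seed is arbitrary, and I have given the treated columns unique names (`treated1` to `treated3`) for convenience.

```r
set.seed(42)  # assumption: arbitrary seed, only so the sketch is reproducible

# Toy count matrix shaped like the one above (600 genes x 8 samples)
x <- round(matrix(rexp(480 * 10, rate = 0.1), ncol = 8), 0)
rownames(x) <- paste("gene", 1:nrow(x))
colnames(x) <- c(paste0("control", 1:5), paste0("treated", 1:3))

control_cols <- grep("^control", colnames(x), value = TRUE)

# Hypothetical helper: overwrite one control column with a copy of a
# randomly chosen other control and rename it "<donor>.1"
replace_control <- function(mat, skip, controls) {
  donor <- sample(setdiff(controls, skip), 1)
  out <- mat
  out[, skip] <- mat[, donor]
  colnames(out)[colnames(out) == skip] <- paste0(donor, ".1")
  out
}

# One resampled matrix per skipped control
resampled <- lapply(control_cols, replace_control, mat = x, controls = control_cols)
names(resampled) <- paste0("skip_", control_cols)

# Column names of the matrix in which control2 was skipped
colnames(resampled[["skip_control2"]])
```

Each element of `resampled` keeps all 8 columns, so it could in principle be analysed in the same way as the original matrix.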
The motivation is to see how stable the controls are and whether one specific control is affecting the results when differential gene expression analysis is done with DESeq2.
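One way I could imagine measuring "stability" is to compare the DEG list from the full dataset with the DEG list from each run that skips a control, for example with a Jaccard index. The sketch below assumes the DEG lists have already been extracted from the DESeq2 results; the gene IDs are made-up placeholders, not real output, and `jaccard` is a hypothetical helper.

```r
# Hypothetical helper: Jaccard similarity between two sets of gene IDs
jaccard <- function(a, b) {
  u <- union(a, b)
  if (length(u) == 0) return(1)  # two empty lists count as identical
  length(intersect(a, b)) / length(u)
}

# Placeholder DEG lists (made-up IDs, not real results)
deg_full <- c("gene 1", "gene 2", "gene 3", "gene 4")
deg_skip <- list(
  skip_control1 = c("gene 1", "gene 2", "gene 3", "gene 4"),
  skip_control2 = c("gene 1", "gene 7")
)

# Similarity of each run to the full analysis; values near 1 mean the
# DEG list barely changed when that control was skipped
stability <- sapply(deg_skip, jaccard, b = deg_full)
stability
```

A control whose removal gives a much lower value than the others would be the one I would want to look at more closely.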
Thank you!