How to extract specific groups from multiple columns?

mtoufiq · October 6, 2021, 9:17pm

Hi,

I am working with a dataset to study group comparisons using statistical analysis (t-test). I am familiar with comparing groups from the same column of the sample metadata and, less frequently, groups defined by two different columns. Now I am trying to find out whether groups from three different columns can be combined or subset for the statistical analysis. Please assist me with this.

dput(Sample_metadata)

structure(list(
  Cell_Type = c("Neutrophils", "Neutrophils", "Neutrophils", "Neutrophils",
                "Neutrophils", "Neutrophils", "Neutrophils", "Neutrophils",
                "Neutrophils", "Neutrophils", "Neutrophils", "Neutrophils",
                "Neutrophils", "Neutrophils"),
  Treatment = c("Untreated", "Untreated",
                "Treated_1", "Treated_1", "Treated_1", "Treated_1", "Treated_1", "Treated_1",
                "Treated_2", "Treated_2", "Treated_2", "Treated_2", "Treated_2", "Treated_2"),
  Culture = c("No", "No",
              "Cp_0", "Cp_0", "Cp_1", "Cp_1", "Cp_2", "Cp_2",
              "Cp_0", "Cp_0", "Cp_1", "Cp_1", "Cp_2", "Cp_2")
), class = "data.frame", row.names = c("Sample_1", "Sample_2", "Sample_3", "Sample_4",
                                       "Sample_5", "Sample_6", "Sample_7", "Sample_8",
                                       "Sample_9", "Sample_10", "Sample_11", "Sample_12",
                                       "Sample_13", "Sample_14"))
#>             Cell_Type Treatment Culture
#> Sample_1  Neutrophils Untreated      No
#> Sample_2  Neutrophils Untreated      No
#> Sample_3  Neutrophils Treated_1    Cp_0
#> Sample_4  Neutrophils Treated_1    Cp_0
#> Sample_5  Neutrophils Treated_1    Cp_1
#> Sample_6  Neutrophils Treated_1    Cp_1
#> Sample_7  Neutrophils Treated_1    Cp_2
#> Sample_8  Neutrophils Treated_1    Cp_2
#> Sample_9  Neutrophils Treated_2    Cp_0
#> Sample_10 Neutrophils Treated_2    Cp_0
#> Sample_11 Neutrophils Treated_2    Cp_1
#> Sample_12 Neutrophils Treated_2    Cp_1
#> Sample_13 Neutrophils Treated_2    Cp_2
#> Sample_14 Neutrophils Treated_2    Cp_2
#################    #################     #################
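## Treatment groups to compare against the "Untreated" reference
## (the reference itself is dropped so it is not tested against itself)
group.test = setdiff(unique(Sample_metadata$Treatment), "Untreated")
group.test

## Matrix of p-values: one row per signature, one column per tested group
## (dat_log2 is the signatures x samples score matrix; it is not shown here)
tt_pval <- matrix(NA_real_, nrow = nrow(dat_log2), ncol = length(group.test),
                  dimnames = list(rownames(dat_log2), group.test))

for (k in 1:nrow(dat_log2)) {
  signature = rownames(dat_log2)[k]
  test.table <- Sample_metadata
  test.table$scores <- as.numeric(dat_log2[k,])   # one score per sample
  T1 <- test.table[test.table$Treatment == "Untreated",]   # reference group
  for (i in seq_along(group.test)) {
    group = group.test[i]
    T2 <- test.table[test.table$Treatment == group,]
    if(all(T1$scores == T2$scores)){
      ## Identical scores in both groups: report p = 1 instead of calling t.test()
      tt_pval[signature,group] = 1
    }else{
      tt_pval[signature,group] <- t.test(x =T1$scores,y=T2$scores,paired = FALSE)$p.value
    }
  }
}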

## A. Comparison of groups from the same column
T2 <- test.table[test.table$Treatment == group,]
T1 <- test.table[test.table$Treatment==c("Untreated"),]

## Outcome: This will compare:
## Treated_1 vs Untreated
## Treated_2 vs Untreated

#################    #################    #################

group.test = unique(Sample_metadata$Culture)
group.test 

## B. Comparison of groups from two columns
T2 <- test.table[test.table$Treatment==c("Treated_1")&test.table$Culture == group,]
T1 <- test.table[test.table$Treatment==c("Treated_1")&test.table$Culture==c("Cp_0"),]

## Outcome: This will compare:
## Treated_1 (Cp_1) vs Treated_1 (Cp_0)
## Treated_1 (Cp_2) vs Treated_1 (Cp_0)
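
The two-column comparison in B can be written out as a complete loop. The following is only a sketch: `dat_log2` is not shown in the post, so a small random matrix stands in for it, and the helper names `treatment`, `reference` and `in_trt` are introduced here purely for illustration.

```r
## Rebuild the sample metadata shown above
Sample_metadata <- data.frame(
  Cell_Type = rep("Neutrophils", 14),
  Treatment = c(rep("Untreated", 2), rep("Treated_1", 6), rep("Treated_2", 6)),
  Culture   = c("No", "No", rep(rep(c("Cp_0", "Cp_1", "Cp_2"), each = 2), 2)),
  row.names = paste0("Sample_", 1:14),
  stringsAsFactors = FALSE
)

## ASSUMPTION: placeholder scores (signatures x samples), simulated only
## so the example runs; replace with the real dat_log2
set.seed(42)
dat_log2 <- matrix(rnorm(2 * 14), nrow = 2,
                   dimnames = list(c("Sig_1", "Sig_2"), rownames(Sample_metadata)))

treatment <- "Treated_1"   # fixed level of the first column
reference <- "Cp_0"        # reference level of the second column
in_trt    <- Sample_metadata$Treatment == treatment

## Culture levels present within Treated_1, excluding the reference
group.test <- setdiff(unique(Sample_metadata$Culture[in_trt]), reference)

tt_pval <- matrix(NA_real_, nrow = nrow(dat_log2), ncol = length(group.test),
                  dimnames = list(rownames(dat_log2), group.test))

for (signature in rownames(dat_log2)) {
  scores <- dat_log2[signature, ]
  T1 <- scores[in_trt & Sample_metadata$Culture == reference]
  for (group in group.test) {
    T2 <- scores[in_trt & Sample_metadata$Culture == group]
    tt_pval[signature, group] <- t.test(x = T1, y = T2, paired = FALSE)$p.value
  }
}
tt_pval
```

Restricting `group.test` to the levels found within `Treated_1` keeps "No" (which only occurs for untreated samples) out of the loop, so no empty subset is ever passed to `t.test()`.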

#################  #################    #################

## C. Comparison of groups from the three different columns.
## Outcome: Basically, I would like to compare:

## Treated_1 (PsuedoGroup) vs Treated_1 (Cp_0)    # [PsuedoGroup = Cp_1 + Cp_2]

Does this require another column to be added to the dataframe, and then compare the groups?

Created on 2021-10-07 by the reprex package (v2.0.1)

Is there a way to do this?

Thank you,
Toufiq

mtoufiq · October 7, 2021, 9:55am

Hi,

I tried adding a new column to the existing Sample_metadata dataframe and used it for the purpose of analysis.

## Add a new column from an existing column
Sample_metadata$Culture_v1 <- Sample_metadata$Culture

## Rename the column values
Sample_metadata$Culture_v1[Sample_metadata$Culture_v1 =="Cp_1"] <- "PsuedoGroup"
Sample_metadata$Culture_v1[Sample_metadata$Culture_v1 =="Cp_2"] <- "PsuedoGroup"
unique(Sample_metadata$Culture_v1)

## Now, compare:
## Treated_1 (PsuedoGroup) vs Treated_1 (Cp_0)
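
For completeness, here is a minimal sketch of how the recoded column could then feed the t-test. It assumes a signatures x samples matrix like `dat_log2`; since that object is not shown in the thread, random placeholder scores are used, and the names `is_t1`, `pseudo` and `ref` are made up for this example.

```r
## Rebuild the sample metadata from the first post
Sample_metadata <- data.frame(
  Cell_Type = rep("Neutrophils", 14),
  Treatment = c(rep("Untreated", 2), rep("Treated_1", 6), rep("Treated_2", 6)),
  Culture   = c("No", "No", rep(rep(c("Cp_0", "Cp_1", "Cp_2"), each = 2), 2)),
  row.names = paste0("Sample_", 1:14),
  stringsAsFactors = FALSE
)

## ASSUMPTION: placeholder scores, simulated only so the example runs
set.seed(1)
dat_log2 <- matrix(rnorm(3 * 14), nrow = 3,
                   dimnames = list(paste0("Sig_", 1:3), rownames(Sample_metadata)))

## Collapse Cp_1 and Cp_2 into a single level in a new column
Sample_metadata$Culture_v1 <- ifelse(Sample_metadata$Culture %in% c("Cp_1", "Cp_2"),
                                     "PsuedoGroup", Sample_metadata$Culture)

## Logical masks for the two groups, both restricted to Treated_1
is_t1  <- Sample_metadata$Treatment == "Treated_1"
pseudo <- is_t1 & Sample_metadata$Culture_v1 == "PsuedoGroup"
ref    <- is_t1 & Sample_metadata$Culture_v1 == "Cp_0"

## One unpaired t-test per signature: Treated_1 (Cp_0) vs Treated_1 (PsuedoGroup)
tt_pval <- apply(dat_log2, 1, function(scores) {
  t.test(x = scores[ref], y = scores[pseudo], paired = FALSE)$p.value
})
tt_pval
```

The extra column is a convenience rather than a requirement: the same `pseudo` mask could be built directly with `Sample_metadata$Culture %in% c("Cp_1", "Cp_2")`. Note that the two groups are unbalanced here (4 pooled samples against 2 reference samples).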
