R re-order some rows in dataframe

Hi all,

I'm looking for a solution to permute specific rows in a dataframe. For example, how do I change this table:
ID sample target ratio
1 1 PIRLA_Clone15 POU5F1 52551.69
2 1 PIRLA_Clone15 NANOG 238806.25
3 1 PIRLA_Clone15 DNMT3B 874650.84
4 1 PIRLA_Clone15 SOX2 10358.24
5 2 PIRLA_Clone21 POU5F1 61847.60
6 2 PIRLA_Clone21 NANOG 238175.42
7 2 PIRLA_Clone21 DNMT3B 765661.66
8 2 PIRLA_Clone21 SOX2 11513.13
9 3 PIRLA_Clone6 POU5F1 52023.12
10 3 PIRLA_Clone6 NANOG 192596.55
11 3 PIRLA_Clone6 DNMT3B 704997.93
12 3 PIRLA_Clone6 SOX2 9004.55

to that:

ID sample target ratio
1 1 PIRLA_Clone15 POU5F1 52551.69
2 1 PIRLA_Clone15 NANOG 238806.25
3 1 PIRLA_Clone15 SOX2 10358.24
4 1 PIRLA_Clone15 DNMT3B 874650.84
5 2 PIRLA_Clone21 POU5F1 61847.60
6 2 PIRLA_Clone21 NANOG 238175.42
7 2 PIRLA_Clone21 SOX2 11513.13
8 2 PIRLA_Clone21 DNMT3B 765661.66
9 3 PIRLA_Clone6 POU5F1 52023.12
10 3 PIRLA_Clone6 NANOG 192596.55
11 3 PIRLA_Clone6 SOX2 9004.55
12 3 PIRLA_Clone6 DNMT3B 704997.93

As you can see, rows 3<->4, 7<->8 and 11<->12 are swapped.
Within each sample, the rows should follow this order of the target column: POU5F1, NANOG, SOX2, DNMT3B

Thanks for your help.

Are sample and target factors, and if so is the target factor specified in the desired order (meaning SOX2 is level 3 and DNMT3B is level 4)?

Yes, SOX2 is level 3 and DNMT3B is level 4 in each group of samples.

If your dataframe is df, the following should do the trick:

library(dplyr)
df <- arrange(df, sample, target)

That will sort rows first by sample, then within sample by target, both in ascending order.

wrong,
sorting by ascending order i get that:

SID sample target ratio
1 1 PIRLA_Clone15 DNMT3B 874650.84
2 1 PIRLA_Clone15 NANOG 238806.25
3 1 PIRLA_Clone15 POU5F1 52551.69
4 1 PIRLA_Clone15 SOX2 10358.24
5 2 PIRLA_Clone21 DNMT3B 765661.66
6 2 PIRLA_Clone21 NANOG 238175.42
7 2 PIRLA_Clone21 POU5F1 61847.60
8 2 PIRLA_Clone21 SOX2 11513.13
9 3 PIRLA_Clone6 DNMT3B 704997.93
10 3 PIRLA_Clone6 NANOG 192596.55
11 3 PIRLA_Clone6 POU5F1 52023.12
12 3 PIRLA_Clone6 SOX2 9004.55

but i need to have this level of order per group of sample: "POU5F1", "NANOG", "SOX2" , "DNMT3B"

Again assuming the dataframe is named df, run str(df[1, ]) to confirm that target is a factor and not a column of strings. If it is a factor, run levels(df$target) to see the order in which the levels are specified. Either you did not enter it as a factor or you did not specify an order of levels.
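
To illustrate the difference, here is a minimal base R sketch. The names target_chr, target_fct and gene_order are made up for the example and only stand in for df$target; the point is that a character vector sorts alphabetically, while a factor sorts by the order of its levels.

```r
# Hypothetical vector standing in for df$target (character, not factor)
target_chr <- c("POU5F1", "NANOG", "DNMT3B", "SOX2")

is.factor(target_chr)  # FALSE: it is a plain character vector
levels(target_chr)     # NULL: character vectors have no levels

# Character vectors sort alphabetically
sorted_chr <- sort(target_chr)

# A factor with explicit levels sorts by level order instead
gene_order <- c("POU5F1", "NANOG", "SOX2", "DNMT3B")
target_fct <- factor(target_chr, levels = gene_order)
sorted_fct <- sort(target_fct)

print(sorted_chr)
print(as.character(sorted_fct))
```

If levels() returns NULL on your own column, it is being sorted as text.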

str(df[1, ])
'data.frame': 1 obs. of 4 variables:
 $ SG_ID : chr "1"
 $ sample: chr "PIRLA_Clone15"
 $ target: chr "POU5F1"
 $ ratio : num 52552

levels(df$target)
NULL

and now ?

So it is a character variable and not a factor. To change to factor, do the following:

library(dplyr)

df |>
  mutate(
    sample=factor(sample, levels=c("POU5F1", "NANOG", "SOX2" , "DNMT3B"))
  ) |>
  arrange(sample, target)

Wrong, that code turns sample (not target) into a factor.
The correct command I found was:
df %>% arrange(as.numeric(ID),factor(target, levels = c("POU5F1", "NANOG", "SOX2" , "DNMT3B")))
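
The reason this kind of command works is that giving explicit levels assigns each gene a rank, and the rows are then sorted by that rank rather than alphabetically. As a rough base R equivalent (not the dplyr approach used in this thread), the same idea can be sketched with order() and match(). The names df_demo and gene_order are hypothetical, and the data is a cut-down copy of the table from the first post.

```r
# Cut-down, hand-typed copy of the example table (two samples only)
df_demo <- data.frame(
  ID     = c(1, 1, 1, 1, 2, 2, 2, 2),
  target = rep(c("POU5F1", "NANOG", "DNMT3B", "SOX2"), times = 2),
  stringsAsFactors = FALSE
)

# Desired within-sample order of the genes
gene_order <- c("POU5F1", "NANOG", "SOX2", "DNMT3B")

# match() returns each gene's position in gene_order (1 to 4);
# order() sorts by ID first and breaks ties with that position
idx <- order(df_demo$ID, match(df_demo$target, gene_order))
df_sorted <- df_demo[idx, ]

print(df_sorted)
```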

Please consider using the style of StatSteph's code. It is better coding practice: first, it uses R's native pipe (|>, which works without dplyr or magrittr, instead of %>%), and second, it separates the mutate and arrange steps, which helps if you collaborate with others or want to change something in your code later.

Hi,

ok, like that:

df |> mutate(target=factor(target, levels = c("POU5F1", "NANOG", "SOX2" , "DNMT3B"))) |> arrange(as.numeric(SG_ID), target)

I can't confirm that it is necessary to turn the ID column into numeric. If it is not numeric in the first place but you need it to be (IDs usually should be factors anyway, since they are discrete rather than continuous), your code is generally correct. However, I would always recommend not mixing your commands: in your code you again transform a variable inside the arrange() call. Better to do that beforehand in mutate() as well. It would then look like this:

df |>
  mutate(target = factor(target, levels = c("POU5F1", "NANOG", "SOX2", "DNMT3B")),
         SG_ID = as.numeric(SG_ID)) |>
  # sort by ID, then by sample, then by target within each sample group
  arrange(SG_ID, sample, target)

Please run this and let me know if the result is not in your preferred structure.
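
On whether as.numeric() matters: str() showed SG_ID as chr, and text sorting can misorder IDs once they reach two digits. A small sketch with made-up ID values (the data in this thread only has IDs 1 to 3, where both orderings agree):

```r
# Hypothetical IDs stored as text, as SG_ID is in this thread
id_chr <- c("1", "2", "10", "3")

# Sorted as text, "10" lands before "2"
sorted_as_text <- sort(id_chr)

# Converted to numeric first, the order is the expected one
sorted_as_number <- sort(as.numeric(id_chr))

print(sorted_as_text)
print(sorted_as_number)
```

So the conversion is harmless here and becomes important as soon as there are ten or more samples.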

Another hint for future posts: please put code inside code chunk marks, i.e. open with three backticks followed by {r}, then your code, and close with another three backticks.

yes, your command works fine and results are in a perfect order.

genelist= c("POU5F1", "NANOG", "SOX2" , "DNMT3B")
df  |>  mutate(target=factor(target, levels = genelist), SG_ID = as.numeric(SG_ID)) |> arrange(SG_ID, sample, target)
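
For anyone who wants to reproduce this, here is a self-contained sketch. The data frame is rebuilt by hand from the table in the first post, with SG_ID stored as character as reported by str(); the name result is made up for the example. It assumes dplyr is installed and R >= 4.1 for the |> pipe.

```r
library(dplyr)

# Example data retyped from the first post (SG_ID kept as character)
df <- data.frame(
  SG_ID  = rep(c("1", "2", "3"), each = 4),
  sample = rep(c("PIRLA_Clone15", "PIRLA_Clone21", "PIRLA_Clone6"), each = 4),
  target = rep(c("POU5F1", "NANOG", "DNMT3B", "SOX2"), times = 3),
  ratio  = c(52551.69, 238806.25, 874650.84, 10358.24,
             61847.60, 238175.42, 765661.66, 11513.13,
             52023.12, 192596.55, 704997.93, 9004.55),
  stringsAsFactors = FALSE
)

# Desired order of the genes within each sample
genelist <- c("POU5F1", "NANOG", "SOX2", "DNMT3B")

result <- df |>
  # convert first: target becomes an ordered-by-level factor, SG_ID a number
  mutate(target = factor(target, levels = genelist),
         SG_ID  = as.numeric(SG_ID)) |>
  # then sort by ID, sample, and target (in level order)
  arrange(SG_ID, sample, target)

print(result)
```

Note that the result has to be assigned (here to result) or the sorted table is only printed and df itself stays unchanged.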
