cook675
October 30, 2019, 12:36am
1
I have a data frame (data.frame) that looks like this (simplified):
    [1] [2] [3]
[A]  2   4   3
[B]  1   5   7
[C]  2   3   4
I want to keep only the columns whose value in row A is >= 3, so that the resulting data frame would look like this:
    [2] [3]
[A]  4   3
[B]  5   7
[C]  3   4
I can't get this to work for the life of me. I tried:
Test <- data.frame[, data.frame$"A" >= 3]
and I get back a 3x0 data frame.
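A note on why this attempt comes back as 3x0: `$` looks up a column by name, and here A is a row name, so the lookup returns NULL and the comparison yields an empty logical vector. The sketch below reproduces this on a small stand-in data frame (the name `df` and the column names are assumptions for illustration, not the poster's real data).

```r
# Stand-in data; column names one/two/three are illustrative only
df <- data.frame(
  one = c(2, 1, 2),
  two = c(4, 5, 3),
  three = c(3, 7, 4),
  row.names = c("A", "B", "C")
)

# `$` searches the columns; there is no column called "A", so this is NULL
missing_col <- df$A
is.null(missing_col)

# Comparing NULL with a number gives a zero-length logical vector
empty_mask <- missing_col >= 3
length(empty_mask)

# Selecting columns with a zero-length mask selects no columns at all
empty_result <- df[, empty_mask]
dim(empty_result)

# Indexing by row name instead gives one value per column
row_mask <- unlist(df["A", ]) >= 3
row_mask
```

Row A has to be reached with row indexing such as `df["A", ]`, which gives one value per column to compare against.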
Hi, there's a tidy solution to filter rows of data frames with dplyr::filter:
new_df <- old_df %>% filter(A < 3)
So I ran:
new_df <-old_df %>% dplyr::filter("A" > 3)
and I got back the same 3x3 data frame (none of the columns were removed), and now the row names have been erased and replaced with 1, 2, 3!
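One likely reason nothing was removed: a quoted "A" is a character constant, not a reference to row or column A. R compares a string with a number by converting the number to a string, so the condition is a single value (TRUE under the usual collation orders) rather than one value per row. A minimal base R sketch of that behaviour, using a stand-in `df` that is an assumption for illustration:

```r
# Stand-in data; names are illustrative only
df <- data.frame(
  one = c(2, 1, 2),
  two = c(4, 5, 3),
  three = c(3, 7, 4),
  row.names = c("A", "B", "C")
)

# The string "A" is compared with "3"; no data from df is involved
cond <- "A" > 3
cond
length(cond)

# Recycling that single value across all rows keeps every row
kept <- df[rep(cond, nrow(df)), , drop = FALSE]
dim(kept)
```

Since the condition never looks at the data, the filter keeps all three rows and all three columns.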
I think if you resort to the t() function (transpose) in base R, it will get you a version of what you want.
You may have to reassign names at the end.
See the example below.
library(tidyverse)
df <- data.frame(
  one = c(2,1,2),
  two = c(4,5,3),
  three = c(3,7,4)
)
row.names(df) <- c('a','b','c')
df
#>   one two three
#> a   2   4     3
#> b   1   5     7
#> c   2   3     4
t(df)
#>       a b c
#> one   2 1 2
#> two   4 5 3
#> three 3 7 4
t(df) %>% as.data.frame() %>%
  filter(a >= 3) %>%
  t() %>%
  as.data.frame()
#>   V1 V2
#> a  4  3
#> b  5  7
#> c  3  4
Created on 2019-10-29 by the reprex package (v0.3.0)
I just tried this and the output was the original data frame with the columns now named V1, V2, etc.
My actual data frame is 5x526, where the 5 row names are gene names ("B2m", "Isg15", etc.) and the columns are cell identifiers.
When I run this code, or the one I tried before, the result is still 5x526.
I'm not understanding why none of this is working!
Then it's time for a reproducible example, called a reprex, since I clearly misunderstood what you're trying to do.
While creating a reprex, I got it to work with phiggins's example. I'm not sure why; I must have made a mistake before.
Is there a way to retain the column names instead of having them erased and replaced with V1, V2, V3, etc.?
Also, this is a stupid question, but the example code just prints the output after as.data.frame().
How do I save this to a new variable?
There are no stupid questions on this forum; every one of us, whether asking or reading a question, has the chance to learn. OK?
@phiggins has a workable suggestion in his post:
library(tidyverse)
df <- data.frame(
  one = c(2,1,2),
  two = c(4,5,3),
  three = c(3,7,4)
)
row.names(df) <- c('a','b','c')
t(df)
t(df) %>% as.data.frame() %>%
  filter(a >= 3) %>%
  t() %>%
  as.data.frame()
The result of that pipeline can first be assigned to its own name:
new_df <- t(df) %>% as.data.frame() %>%
  filter(a >= 3) %>%
  t() %>%
  as.data.frame()
Then you need to decide whether you want a, b, c as row names or as a variable. If row names, you're done; if a variable:
new_df <- tibble::rownames_to_column(new_df)
To get more precise guidance, we still need some rows of your actual data frame in a reproducible example, called a reprex.
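On the two questions above (keeping the original column names and saving the result), here is a minimal base R sketch of the same transpose idea. It assumes the V1, V2 labels appear because the row names are lost during the filter() step; base R row subsetting keeps them. The names `tdf`, `kept` and `new_df` are hypothetical, chosen for illustration.

```r
df <- data.frame(
  one = c(2, 1, 2),
  two = c(4, 5, 3),
  three = c(3, 7, 4),
  row.names = c("a", "b", "c")
)

# Transpose so the original columns become rows named one, two, three
tdf <- as.data.frame(t(df))

# Base R row subsetting keeps the row names, so the labels survive
kept <- tdf[tdf$a >= 3, , drop = FALSE]

# Transpose back and assign the result to a new variable with `<-`
new_df <- as.data.frame(t(kept))
new_df
```

Assigning with `<-` stores the result instead of printing it; typing `new_df` on its own line then displays it.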
This is one of those cases where base R makes things much simpler. See this other solution:
df <- data.frame(
  one = c(2,1,2),
  two = c(4,5,3),
  three = c(3,7,4)
)
row.names(df) <- c('a','b','c')
new_df <- df[, df["a",] >= 3]
new_df
#>   two three
#> a   4     3
#> b   5     7
#> c   3     4
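One caveat with this indexing, sketched below under the assumption that only a single column might pass the threshold: by default `[` on a data frame drops a one-column result to a plain vector, which loses the row and column names. Adding `drop = FALSE` keeps it a data frame. The threshold of 4 and the names `one_col` and `one_col_df` are illustrative only.

```r
df <- data.frame(
  one = c(2, 1, 2),
  two = c(4, 5, 3),
  three = c(3, 7, 4),
  row.names = c("a", "b", "c")
)

# One logical value per column, taken from row "a"
keep <- unlist(df["a", ]) >= 4
keep

# Only column "two" qualifies; the default result is a bare vector
one_col <- df[, keep]
is.data.frame(one_col)

# drop = FALSE keeps the data frame structure and its names
one_col_df <- df[, keep, drop = FALSE]
dim(one_col_df)
colnames(one_col_df)
```

This matters for wide data such as a genes-by-cells table, where a strict threshold may leave just one matching column.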
cook675
October 30, 2019, 3:52pm
10
Thanks, andresrcs, this worked like gangbusters.
You guys are the best, always so helpful.
Scoco
November 1, 2019, 9:39am
11
@andresrcs has the best solution, but I'll offer a hint to simplify the data.frame creation by specifying row names directly:
df <- data.frame(
  one = c(2,1,2),
  two = c(4,5,3),
  three = c(3,7,4),
  row.names = c('A','B','C')
)
new_df <- df[, df["A",] >= 3]
new_df
#>   two three
#> A   4     3
#> B   5     7
#> C   3     4