filter() does not recognize the name of a column

I have this tibble

> DS90_2
# A tibble: 50 x 3
   Parametro           Tabla   Valor
   <chr>               <chr>   <dbl>
 1 Aluminio total (Al) Tabla 1   5  
 2 Aluminio total (Al) Tabla 2  10  
 3 Aluminio total (Al) Tabla 3   1  
 4 Aluminio total (Al) Tabla 4   1  
 5 Aluminio total (Al) Tabla 5  10  
 6 Arsénico total (As) Tabla 1   0.5
 7 Arsénico total (As) Tabla 2   1  
 8 Arsénico total (As) Tabla 3   0.1
 9 Arsénico total (As) Tabla 4   0.2
10 Arsénico total (As) Tabla 5   0.5
# ... with 40 more rows

When I use filter() to column "Parametro", does not work:

> filter(DS90_2,Parametro== "Aluminio total (Al)")
# A tibble: 0 x 3
# ... with 3 variables: Parametro <chr>, Tabla <chr>, Valor <dbl>

But, when I change the column, it works:

> filter(DS90_2,Tabla== "Tabla 1")
# A tibble: 10 x 3
   Parametro            Tabla   Valor
   <chr>                <chr>   <dbl>
 1 Aluminio total (Al)  Tabla 1  5   
 2 Arsénico total (As)  Tabla 1  0.5 
 3 Cadmio total (Cd)    Tabla 1  0.01
 4 Cinc total (Zn)      Tabla 1  3   
 5 Cobre total (Cu)     Tabla 1  1   
 6 Cromo total (Cr)     Tabla 1 NA   
 7 Manganeso total (Mn) Tabla 1  0.3 
 8 Níquel total (Ni)    Tabla 1  0.2 
 9 Plomo total (Pb)     Tabla 1  0.05
10 Selenio total (Se)   Tabla 1  0.01

I have checked spaces, letters and copied and pasted the name of the column and the selection to avoid making character errors.

Hello there,

Do a full reprex and I can have a look to see what is going on here for you (FAQ: How to do a minimal reproducible example ( reprex ) for beginners)

1 Like

1.- Minimal Dataset (Sample Data)

1.1.- Imported data from excel:

> head(DS90)
# A tibble: 6 x 6
  Parametro           `Tabla 1` `Tabla 2` `Tabla 3` `Tabla 4` `Tabla 5`
  <chr>                   <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
1 Aluminio total (Al)      5         10        1         1         10  
2 Arsénico total (As)      0.5        1        0.1       0.2        0.5
3 Cadmio total (Cd)        0.01       0.3      0.02      0.02       0.5
4 Cinc total (Zn)          3         20        5         5          5  
5 Cobre total (Cu)         1          3        0.1       1          3  
6 Cromo total (Cr)        NA         NA        2.5       2.5       10 

1.2.- Handled table

> head(DS90_2,10)
# A tibble: 10 x 3
   Parametro           Tabla   Valor
   <chr>               <chr>   <dbl>
 1 Aluminio total (Al) Tabla 1   5  
 2 Aluminio total (Al) Tabla 2  10  
 3 Aluminio total (Al) Tabla 3   1  
 4 Aluminio total (Al) Tabla 4   1  
 5 Aluminio total (Al) Tabla 5  10  
 6 Arsénico total (As) Tabla 1   0.5
 7 Arsénico total (As) Tabla 2   1  
 8 Arsénico total (As) Tabla 3   0.1
 9 Arsénico total (As) Tabla 4   0.2
10 Arsénico total (As) Tabla 5   0.5

2.- Runnable Code

2.1.- Import table from excel

> library(readxl)
> DS90 <- read_excel("DS90.xlsx")

2.2.- Manipulate columns and create another table

> library(tidyverse)
> library(dplyr)
> DS90_2<-DS90%>%pivot_longer(`Tabla 1`:`Tabla 5`,names_to="Tabla",values_to="Valor")
> head(DS90_2,8)
# A tibble: 8 x 3
  Parametro           Tabla   Valor
  <chr>               <chr>   <dbl>
1 Aluminio total (Al) Tabla 1   5  
2 Aluminio total (Al) Tabla 2  10  
3 Aluminio total (Al) Tabla 3   1  
4 Aluminio total (Al) Tabla 4   1  
5 Aluminio total (Al) Tabla 5  10  
6 Arsénico total (As) Tabla 1   0.5
7 Arsénico total (As) Tabla 2   1  
8 Arsénico total (As) Tabla 3   0.1

2.3.- Using filter() with "Parametro" column (without results):

> filter(DS90_2,Parametro=="Aluminio total (Al)")
# A tibble: 0 x 3
# ... with 3 variables: Parametro <chr>, Tabla <chr>, Valor <dbl>

Doing the same but with another column (the results are fine):

> filter(DS90_2,Tabla=="Tabla 1")
# A tibble: 10 x 3
   Parametro            Tabla   Valor
   <chr>                <chr>   <dbl>
 1 Aluminio total (Al)  Tabla 1  5   
 2 Arsénico total (As)  Tabla 1  0.5 
 3 Cadmio total (Cd)    Tabla 1  0.01
 4 Cinc total (Zn)      Tabla 1  3   
 5 Cobre total (Cu)     Tabla 1  1   
 6 Cromo total (Cr)     Tabla 1 NA   
 7 Manganeso total (Mn) Tabla 1  0.3 
 8 Níquel total (Ni)    Tabla 1  0.2 
 9 Plomo total (Pb)     Tabla 1  0.05
10 Selenio total (Se)   Tabla 1  0.01

I hope that reprex has been understood and used correctly

Respondí abajo. Muchas gracias.

The code is not runnable since we don't have access to your local files and you haven't provided sample data on a copy/paste friendly format, the reproducible example is supposed to be self contained so you have to provide code that runs with the sample data, not with a file in your computer.

Maybe if you read the reprex guide in Spanish?

3 Likes
library(readxl)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyverse)
DS90_prueba <- read_excel("C:\\Users\\Camilo Montes\\Documents\\R\\APPS\\EA_PRUEBA\\EA-SMA-02-20\\DS90.xlsx")
head(DS90_prueba)
#> # A tibble: 6 x 6
#>   Parametro           `Tabla 1` `Tabla 2` `Tabla 3` `Tabla 4` `Tabla 5`
#>   <chr>                   <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
#> 1 Aluminio total (Al)      5         10        1         1         10  
#> 2 Arsénico total (As)      0.5        1        0.1       0.2        0.5
#> 3 Cadmio total (Cd)        0.01       0.3      0.02      0.02       0.5
#> 4 Cinc total (Zn)          3         20        5         5          5  
#> 5 Cobre total (Cu)         1          3        0.1       1          3  
#> 6 Cromo total (Cr)        NA         NA        2.5       2.5       10
DS90_2<-DS90_prueba%>%pivot_longer(`Tabla 1`:`Tabla 5`,names_to="Tabla",values_to="Valor")
head(DS90_2)
#> # A tibble: 6 x 3
#>   Parametro           Tabla   Valor
#>   <chr>               <chr>   <dbl>
#> 1 Aluminio total (Al) Tabla 1   5  
#> 2 Aluminio total (Al) Tabla 2  10  
#> 3 Aluminio total (Al) Tabla 3   1  
#> 4 Aluminio total (Al) Tabla 4   1  
#> 5 Aluminio total (Al) Tabla 5  10  
#> 6 Arsénico total (As) Tabla 1   0.5
filter(DS90_2,Parametro=="Aluminio total (Al)")
#> # A tibble: 0 x 3
#> # ... with 3 variables: Parametro <chr>, Tabla <chr>, Valor <dbl>
Created on 2020-10-03 by the reprex package (v0.3.0)

The guide advises you not to do this.
You should be aware that you alone have your xlsx file, we dont. so no data is being communicated in a reproducible format.

try dput(DS90_prueba) , that will provide a copy/paste-able format.

Yes, I understand, thank you very much. As the data obtained from reprex was presented below, I thought it would be possible to work with it.

Here are the data obtained with dput(DS90_prueba)

structure(list(Parametro = c("Aluminio total (Al)", "Arsénico total (As)", 
"Cadmio total (Cd)", "Cinc total (Zn)", "Cobre total (Cu)", "Cromo total (Cr)", 
"Manganeso total (Mn)", "Níquel total (Ni)", "Plomo total (Pb)", 
"Selenio total (Se)"), `Tabla 1` = c(5, 0.5, 0.01, 3, 1, NA, 
0.3, 0.2, 0.05, 0.01), `Tabla 2` = c(10, 1, 0.3, 20, 3, NA, 3, 
3, 0.5, 0.1), `Tabla 3` = c(1, 0.1, 0.02, 5, 0.1, 2.5, 0.5, 0.5, 
0.2, 0.01), `Tabla 4` = c(1, 0.2, 0.02, 5, 1, 2.5, 2, 2, 0.2, 
0.01), `Tabla 5` = c(10, 0.5, 0.5, 5, 3, 10, 4, 4, 1, 0.03)), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

I'm not able to reproduce your issue with the sample data you have provided

# library calls
library(dplyr)
library(tidyr)

# Sample data on a copy/paste friendly format
DS90_prueba <- structure(list(Parametro = c("Aluminio total (Al)", "Arsénico total (As)", 
                                            "Cadmio total (Cd)", "Cinc total (Zn)", "Cobre total (Cu)", "Cromo total (Cr)", 
                                            "Manganeso total (Mn)", "Níquel total (Ni)", "Plomo total (Pb)", 
                                            "Selenio total (Se)"), `Tabla 1` = c(5, 0.5, 0.01, 3, 1, NA, 
                                                                                 0.3, 0.2, 0.05, 0.01), `Tabla 2` = c(10, 1, 0.3, 20, 3, NA, 3, 
                                                                                                                      3, 0.5, 0.1), `Tabla 3` = c(1, 0.1, 0.02, 5, 0.1, 2.5, 0.5, 0.5, 
                                                                                                                                                  0.2, 0.01), `Tabla 4` = c(1, 0.2, 0.02, 5, 1, 2.5, 2, 2, 0.2, 
                                                                                                                                                                            0.01), `Tabla 5` = c(10, 0.5, 0.5, 5, 3, 10, 4, 4, 1, 0.03)), row.names = c(NA, 
                                                                                                                                                                                                                                                        -10L), class = c("tbl_df", "tbl", "data.frame"))

# Relevant code
DS90_prueba %>%
    pivot_longer(`Tabla 1`:`Tabla 5`,
                 names_to = "Tabla",
                 values_to = "Valor") %>% 
    filter(Parametro == "Aluminio total (Al)")
#> # A tibble: 5 x 3
#>   Parametro           Tabla   Valor
#>   <chr>               <chr>   <dbl>
#> 1 Aluminio total (Al) Tabla 1     5
#> 2 Aluminio total (Al) Tabla 2    10
#> 3 Aluminio total (Al) Tabla 3     1
#> 4 Aluminio total (Al) Tabla 4     1
#> 5 Aluminio total (Al) Tabla 5    10

Created on 2020-10-03 by the reprex package (v0.3.0)
Try reinstalling/updating dplyr and tidyr, if the issues persist, share a proper reproducible example (like the one I have posted).

2 Likes

Thank you very much.

Regards,

Try

dplyr::filter(DS90_2,Parametro== "Aluminio total (Al)").

You might be using stats::filter, which does not handle data frames this way.

Hi, thank you, but but did not deliver any results. Finally I used:

DS90_2_df <- DS90_2 %>% as.data.frame()
nombres <- distinct(DS90_2_df, Parametro)
Al_DS90<-DS90_2_df%>%filter(Parametro==nombres[1,])

Regards,

Thank you very much!

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.