Subtle difference on rownames between two count()-ed tibbles

I've recently noticed some new-to-me behavior when printing count()-ed tibbles in html_notebook documents. In the code below, the bad object is a data frame that is coerced to a tibble and then count()-ed. When printed, this object generates rownames in the displayed table. The good object, which is count()-ed first and coerced to a tibble as a final step, does not display rownames.

Strangely, the class and attributes of both objects look identical, but they fail an identical(attrib.as.set = FALSE) check -- an option I've not used before and am at a loss to explain.
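
For readers meeting this option for the first time, here is a minimal base-R sketch of what attrib.as.set changes. The vectors x and y are made up for illustration and are unrelated to the tibbles in this post: they carry the same attributes, attached in a different order. By default identical() treats attributes as an unordered set; with attrib.as.set = FALSE they are compared as stored, order included.

```r
# Illustrative objects only (not from the reprex): same data and the same
# attributes, attached in opposite order.
x <- 1:3
attr(x, "a") <- 1
attr(x, "b") <- 2

y <- 1:3
attr(y, "b") <- 2
attr(y, "a") <- 1

loose  <- identical(x, y)                         # attributes compared as a set
strict <- identical(x, y, attrib.as.set = FALSE)  # attributes compared in stored order

loose
#> [1] TRUE
strict
#> [1] FALSE
```

So a FALSE from the strict check need not mean the visible contents differ; it can also point to a difference in how the attributes are stored.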

There's a subtle difference in the display of the two tibbles even in this rendered reprex, with the bad object showing an asterisk in the rownames column...another display quirk I'm not sure how to interpret.

I'm expecting both objects to behave like the good object: when printed, no rownames are displayed.

Clarifications and corrections greatly appreciated!

library(dplyr, warn.conflicts = FALSE)
bad <- mtcars %>% tibble::as_tibble() %>% count(disp)
good <- mtcars %>% count(disp) %>% tibble::as_tibble()
identical(bad, good, attrib.as.set = FALSE)
#> [1] FALSE
bad
#> # A tibble: 27 x 2
#>     disp     n
#>  * <dbl> <int>
#>  1  71.1     1
#>  2  75.7     1
#>  3  78.7     1
#>  4  79       1
#>  5  95.1     1
#>  6 108       1
#>  7 120.      1
#>  8 120.      1
#>  9 121       1
#> 10 141.      1
#> # … with 17 more rows
good
#> # A tibble: 27 x 2
#>     disp     n
#>    <dbl> <int>
#>  1  71.1     1
#>  2  75.7     1
#>  3  78.7     1
#>  4  79       1
#>  5  95.1     1
#>  6 108       1
#>  7 120.      1
#>  8 120.      1
#>  9 121       1
#> 10 141.      1
#> # … with 17 more rows

Created on 2021-02-03 by the reprex package (v1.0.0)

Session info
sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur 10.16
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices datasets  utils     methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.0.4
#> 
#> loaded via a namespace (and not attached):
#>  [1] knitr_1.31       magrittr_2.0.1   tidyselect_1.1.0 R6_2.5.0        
#>  [5] rlang_0.4.10     fansi_0.4.1      stringr_1.4.0    styler_1.3.2    
#>  [9] highr_0.8        tools_4.0.3      xfun_0.19        utf8_1.1.4      
#> [13] cli_2.3.0        DBI_1.1.0        htmltools_0.5.0  ellipsis_0.3.1  
#> [17] assertthat_0.2.1 yaml_2.2.1       digest_0.6.27    tibble_3.0.6    
#> [21] lifecycle_0.2.0  crayon_1.4.0     purrr_0.3.4      vctrs_0.3.6     
#> [25] fs_1.5.0         glue_1.4.2       evaluate_0.14    rmarkdown_2.6   
#> [29] reprex_1.0.0     stringi_1.5.3    compiler_4.0.3   pillar_1.4.7    
#> [33] generics_0.1.0   backports_1.2.1  renv_0.12.5      pkgconfig_2.0.3

I can't reproduce your failure. I can't offer any insight, other than "it did work for me"... I do note that I used dplyr 1.0.2. I'm currently updating my packages to see if that makes a difference.

I copy/pasted your code into a freshly restarted R session and got:

library(dplyr, warn.conflicts = FALSE)
bad <- mtcars %>% tibble::as_tibble() %>% count(disp)
good <- mtcars %>% count(disp) %>% tibble::as_tibble()
identical(bad, good, attrib.as.set = FALSE)
#> [1] TRUE
bad
#> # A tibble: 27 x 2
#>     disp     n
#>    <dbl> <int>
#>  1  71.1     1
#>  2  75.7     1
#>  3  78.7     1
#>  4  79       1
#>  5  95.1     1
#>  6 108       1
#>  7 120.      1
#>  8 120.      1
#>  9 121       1
#> 10 141.      1
#> # ... with 17 more rows
good
#> # A tibble: 27 x 2
#>     disp     n
#>    <dbl> <int>
#>  1  71.1     1
#>  2  75.7     1
#>  3  78.7     1
#>  4  79       1
#>  5  95.1     1
#>  6 108       1
#>  7 120.      1
#>  8 120.      1
#>  9 121       1
#> 10 141.      1
#> # ... with 17 more rows


sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
#> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
#> [5] LC_TIME=English_Australia.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.0.2
#> 
#> loaded via a namespace (and not attached):
#>  [1] knitr_1.30       magrittr_2.0.1   tidyselect_1.1.0 R6_2.5.0        
#>  [5] rlang_0.4.10     fansi_0.4.1      stringr_1.4.0    highr_0.8       
#>  [9] tools_4.0.3      xfun_0.20        utf8_1.1.4       cli_2.2.0       
#> [13] htmltools_0.5.0  ellipsis_0.3.1   yaml_2.2.1       digest_0.6.27   
#> [17] assertthat_0.2.1 tibble_3.0.4     lifecycle_0.2.0  crayon_1.3.4    
#> [21] purrr_0.3.4      vctrs_0.3.6      glue_1.4.2       evaluate_0.14   
#> [25] rmarkdown_2.6    stringi_1.5.3    compiler_4.0.3   pillar_1.4.7    
#> [29] generics_0.1.0   pkgconfig_2.0.3

Created on 2021-02-04 by the reprex package (v0.3.0)

How curious! Thanks for checking this out! I'd love to hear whether the package versions I'm running that differ from yours (dplyr, tibble, knitr, and xfun at first glance) make any difference for you. I can reproduce this both under RStudio and at the console. Next steps will probably involve firing up docker containers to track this down, which is annoying.

Yep. Now I get the same as you.

library(dplyr, warn.conflicts = FALSE)
bad <- mtcars %>% tibble::as_tibble() %>% count(disp)
good <- mtcars %>% count(disp) %>% tibble::as_tibble()
identical(bad, good, attrib.as.set = FALSE)
#> [1] FALSE
bad
#> # A tibble: 27 x 2
#>     disp     n
#>  * <dbl> <int>
#>  1  71.1     1
#>  2  75.7     1
#>  3  78.7     1
#>  4  79       1
#>  5  95.1     1
#>  6 108       1
#>  7 120.      1
#>  8 120.      1
#>  9 121       1
#> 10 141.      1
#> # ... with 17 more rows
good
#> # A tibble: 27 x 2
#>     disp     n
#>    <dbl> <int>
#>  1  71.1     1
#>  2  75.7     1
#>  3  78.7     1
#>  4  79       1
#>  5  95.1     1
#>  6 108       1
#>  7 120.      1
#>  8 120.      1
#>  9 121       1
#> 10 141.      1
#> # ... with 17 more rows
sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
#> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
#> [5] LC_TIME=English_Australia.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.0.4
#> 
#> loaded via a namespace (and not attached):
#>  [1] rstudioapi_0.13  knitr_1.31       magrittr_2.0.1   tidyselect_1.1.0
#>  [5] R6_2.5.0         rlang_0.4.10     fansi_0.4.1      stringr_1.4.0   
#>  [9] highr_0.8        tools_4.0.3      xfun_0.20        utf8_1.1.4      
#> [13] DBI_1.1.1        cli_2.3.0        htmltools_0.5.0  ellipsis_0.3.1  
#> [17] yaml_2.2.1       assertthat_0.2.1 digest_0.6.27    tibble_3.0.6    
#> [21] lifecycle_0.2.0  crayon_1.4.0     purrr_0.3.4      vctrs_0.3.6     
#> [25] fs_1.5.0         ps_1.5.0         glue_1.4.2       evaluate_0.14   
#> [29] rmarkdown_2.6    reprex_1.0.0     stringi_1.5.3    compiler_4.0.3  
#> [33] pillar_1.4.7     generics_0.1.0   pkgconfig_2.0.3

Created on 2021-02-04 by the reprex package (v1.0.0)

The things that are different in before/after are:

Before
#> other attached packages:
#> [1] dplyr_1.0.2

#> loaded via a namespace (and not attached):
#>  [1] knitr_1.30       
#>  [9] cli_2.2.0       
#> [17] tibble_3.0.4     crayon_1.3.4    

After
#> other attached packages:
#> [1] dplyr_1.0.4

#>  [1] rstudioapi_0.13  knitr_1.31       
#> [13] DBI_1.1.1        cli_2.3.0        
#> [17] tibble_3.0.6    
#> [21] crayon_1.4.0     
#> [25] fs_1.5.0         ps_1.5.0         
#> [29] reprex_1.0.0    

Looks like it could be a bug in dplyr.
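
A comparison like the one above can also be done programmatically instead of by eye. The sketch below is only an illustration: the two named vectors are typed in by hand from the session info in this thread (they are not produced by any sessionInfo() helper), and the variable names are made up.

```r
# Package versions copied by hand from the "Before" and "After" lists above.
before <- c(dplyr = "1.0.2", knitr = "1.30", cli = "2.2.0",
            tibble = "3.0.4", crayon = "1.3.4")
after  <- c(rstudioapi = "0.13", dplyr = "1.0.4", knitr = "1.31",
            DBI = "1.1.1", cli = "2.3.0", tibble = "3.0.6",
            crayon = "1.4.0", fs = "1.5.0", ps = "1.5.0", reprex = "1.0.0")

# Packages present in both sessions whose version string changed.
common  <- intersect(names(before), names(after))
changed <- common[before[common] != after[common]]

# Packages that only appear in the second session.
added <- setdiff(names(after), names(before))

data.frame(package = changed,
           before  = unname(before[changed]),
           after   = unname(after[changed]))
added
```

Any of the changed packages is a candidate, which is why pinning one package at a time (as done next with dplyr) is a reasonable way to narrow it down.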

With my newly updated packages in place, I reverted to dplyr 1.0.2:

require(devtools)
install_version("dplyr", version = "1.0.2")

and then restarted the session and tried again, and got TRUE, as per my first post.

I then installed dplyr 1.0.3, and got FALSE, as per your experience.

The dplyr release notes for v1.0.3 say that "count() and tally() are now generic."

library(dplyr, warn.conflicts = FALSE)
bad <- mtcars %>% tibble::as_tibble() %>% count(disp)
good <- mtcars %>% count(disp) %>% tibble::as_tibble()
identical(bad, good, attrib.as.set = FALSE)
#> [1] FALSE
bad
#> # A tibble: 27 x 2
#>     disp     n
#>  * <dbl> <int>
#>  1  71.1     1
#>  2  75.7     1
#>  3  78.7     1
#>  4  79       1
#>  5  95.1     1
#>  6 108       1
#>  7 120.      1
#>  8 120.      1
#>  9 121       1
#> 10 141.      1
#> # ... with 17 more rows
good
#> # A tibble: 27 x 2
#>     disp     n
#>    <dbl> <int>
#>  1  71.1     1
#>  2  75.7     1
#>  3  78.7     1
#>  4  79       1
#>  5  95.1     1
#>  6 108       1
#>  7 120.      1
#>  8 120.      1
#>  9 121       1
#> 10 141.      1
#> # ... with 17 more rows
sessionInfo()
#> R version 4.0.3 (2020-10-10)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
#> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
#> [5] LC_TIME=English_Australia.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.0.3
#> 
#> loaded via a namespace (and not attached):
#>  [1] rstudioapi_0.13  knitr_1.31       magrittr_2.0.1   tidyselect_1.1.0
#>  [5] R6_2.5.0         rlang_0.4.10     fansi_0.4.1      stringr_1.4.0   
#>  [9] highr_0.8        tools_4.0.3      xfun_0.20        utf8_1.1.4      
#> [13] DBI_1.1.1        cli_2.3.0        htmltools_0.5.0  ellipsis_0.3.1  
#> [17] yaml_2.2.1       assertthat_0.2.1 digest_0.6.27    tibble_3.0.6    
#> [21] lifecycle_0.2.0  crayon_1.4.0     purrr_0.3.4      vctrs_0.3.6     
#> [25] fs_1.5.0         ps_1.5.0         glue_1.4.2       evaluate_0.14   
#> [29] rmarkdown_2.6    reprex_1.0.0     stringi_1.5.3    compiler_4.0.3  
#> [33] pillar_1.4.7     generics_0.1.0   pkgconfig_2.0.3

Created on 2021-02-04 by the reprex package (v1.0.0)
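
The boundary found above (TRUE with dplyr 1.0.2, FALSE from 1.0.3 onward) can be turned into a quick check. This is a sketch under that assumption only; after_generic_count() is a hypothetical helper name, not something dplyr provides.

```r
# Hypothetical helper: is a given dplyr version at or after 1.0.3, the
# release whose notes say count() and tally() became generic?
after_generic_count <- function(version) {
  package_version(version) >= package_version("1.0.3")
}

after_generic_count("1.0.2")
#> [1] FALSE
after_generic_count("1.0.4")
#> [1] TRUE

# With dplyr installed, the same check for the current library would be:
# after_generic_count(as.character(utils::packageVersion("dplyr")))
```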


Really appreciate the debugging on this @rowlesmr! It also looks like the bad object returns TRUE from a tibble::has_rownames() check, even if I set rownames = FALSE in the explicit cast to a tibble. I've opened https://github.com/tidyverse/dplyr/issues/5737 to bring this up with the dplyr team.
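
For anyone who wants to look at the row names themselves without going through tibble, base R's .row_names_info() reports how they are stored. The helper below, row_name_summary(), is a hypothetical name used for illustration; it is shown on a plain data frame and on mtcars rather than on bad and good, since what those two contain depends on the installed dplyr version.

```r
# Hypothetical helper (not part of tibble or dplyr): summarise how row names
# are stored on a data frame, using base R only.
row_name_summary <- function(df) {
  n_signed <- .row_names_info(df, type = 1L)  # negative means automatic row names
  list(
    internal  = .row_names_info(df, type = 0L),  # the raw "row.names" attribute
    n_signed  = n_signed,
    automatic = n_signed < 0L
  )
}

# A freshly built data frame has automatic row names.
row_name_summary(data.frame(x = 1:3))$automatic
#> [1] TRUE

# mtcars carries the car names as real (character) row names.
row_name_summary(mtcars)$automatic
#> [1] FALSE
```

Running row_name_summary(bad) and row_name_summary(good) side by side would be one way to see whether the two objects store their row names differently.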
