How do you pronounce `tbl`?

I'm presenting on dplyr this week and I'm trying to make sure I have my naming sorted out before I start infecting people with my ideas.

flights_db <- tbl(con, "flights")

In the above example, flights_db is (I think) a tbl, although if I try to look up its class I get:

class(flights_db)
[1] "tbl_dbi"  "tbl_sql"  "tbl_lazy" "tbl"  

So is it a tbl or is it a tbl_dbi?

The thing we call a 'Tibble' is of the following class:

[1] "tbl_df"     "tbl"        "data.frame"

So what is the name for the thing returned by the function tbl? And how do we pronounce it? It's this pointer thing that is lazily evaluated and points to a table on the DB.

And related: once we run a tbl through a pipe flow, we end up with an object that is a pointer to an unevaluated blob of SQL. Is that blob also a tbl or does it get a different name? In other words, what do we call my_tbl in the example below:

flights_db %>% 
  group_by(dest) %>%
  summarise(delay = mean(dep_time)) ->
my_tbl

I refer to all tbl_* class objects as tbls, which I pronounce the same way I do "tibble" because I'm not able to make whatever noise would otherwise constitute tbl. For most intents and purposes they act the same way, except for the fact that a tbl_dbi is a remote source. By analogy, I might refer to a tbl as a data.frame more generically.

With what limited knowledge of classes I have, my understanding is as follows:

  • tbl_dbi is the S3 class you get back when you use tbl() with dbplyr (as you've done there); it is still a tbl (hence the class cascade, or whatever the word for that is), but it has extra information associated with it about the database source, etc.
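
One way to see the cascade in action is to ask R whether the object inherits from each class. This is only a minimal sketch: it assumes dplyr, dbplyr, DBI and RSQLite are installed, and the two-row flights table is made up for illustration rather than taken from the real data.

```r
library(dplyr)

# Assumption: an in-memory SQLite database stands in for the real one,
# and this tiny "flights" table is invented for the example.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con, data.frame(dest = c("IAH", "MIA"), dep_time = c(517, 542)), "flights")

flights_db <- tbl(con, "flights")
local_tibble <- tibble(x = 1)

# inherits() checks the whole class vector, not just its first element
remote_is_tbl  <- inherits(flights_db, "tbl")
remote_is_lazy <- inherits(flights_db, "tbl_lazy")
local_is_tbl   <- inherits(local_tibble, "tbl")
local_is_lazy  <- inherits(local_tibble, "tbl_lazy")

print(c(remote_is_tbl = remote_is_tbl, remote_is_lazy = remote_is_lazy,
        local_is_tbl = local_is_tbl, local_is_lazy = local_is_lazy))

DBI::dbDisconnect(con)
```

Both objects carry "tbl" somewhere in their class vector, which is why calling either one a tbl is defensible; the more specific classes say where the data lives.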

Lifting from the docs here:

flights_db <- tbl(con, "flights")

When you print it out, you’ll notice that it mostly looks like a regular tibble:

flights_db 
#> Source:     table<flights> [?? x 19]
#> Database:   sqlite 3.11.1 []
#> 
#> # S3: tbl_dbi
#>    year month   day dep_time sched_dep_time dep_delay arr_time
#>   <int> <int> <int>    <int>          <int>     <dbl>    <int>
#> 1  2013     1     1      517            515         2      830
#> 2  2013     1     1      533            529         4      850
#> 3  2013     1     1      542            540         2      923
#> 4  2013     1     1      544            545        -1     1004
#> 5  2013     1     1      554            600        -6      812
#> 6  2013     1     1      554            558        -4      740
#> # ... with more rows, and 12 more variables: sched_arr_time <int>,
#> #   arr_delay <dbl>, carrier <chr>, flight <int>, tailnum <chr>,
#> #   origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>, hour <dbl>,
#> #   minute <dbl>, time_hour <dbl>

The main difference is that you can see that it’s a remote source in a SQLite database.

Is an unevaluated blob of SQL a tbl? Not until you make it so… If it's unexecuted, it's like potential energy, or an egg, or something that hasn't actually become what it will ultimately become yet… Or, to borrow from sortals by way of Elijah Meeks (whom I pointed to sortals, so whose knowledge is it, really?!):
https://twitter.com/Elijah_Meeks/status/990103435205722112
Actually, this is totally a sortal question— not the pronunciation part, but (let's be real) you veered off from the pronunciation bit at the end, too…
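
For the philosophy-averse, one way to settle what my_tbl "is" in practice is to ask R directly and then force the query to run. A hedged sketch, again assuming dplyr, dbplyr, DBI and RSQLite are installed and using a made-up three-row flights table:

```r
library(dplyr)

# Assumption: in-memory SQLite with an invented "flights" table
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
copy_to(con,
        data.frame(dest = c("IAH", "IAH", "MIA"), dep_time = c(517, 533, 542)),
        "flights")
flights_db <- tbl(con, "flights")

my_tbl <- flights_db %>%
  group_by(dest) %>%
  summarise(delay = mean(dep_time, na.rm = TRUE))

class(my_tbl)       # what R calls the unevaluated pipeline
show_query(my_tbl)  # the SQL that would be sent to the database

# collect() runs the query and pulls the result into a local tibble
local_result <- collect(my_tbl)
class(local_result)

DBI::dbDisconnect(con)
```

Comparing the two class() outputs shows whether the pipeline object and the collected result go by different names, and show_query() makes the "blob of SQL" visible without executing it.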


This is really helpful... I will present the topic of tibbles as being analogous to werewolves and potential energy. That should click with folk ...

Or you could go with sortals. Your call! :stuck_out_tongue_winking_eye:

Sortals were totally new to me until you taught me about them. And while I like them, I'm not confident enough to discuss them in public. I'm afraid I'll botch them somehow and end up like the minister who told the congregation to prostate themselves before God.
