How can I retain attributes from an S3 class on subsetting (and should I)?

maxheld83 · August 8, 2018, 11:54am

Let's assume I'm building an S3 class for which a base type of character vector fits best.
Say,

x <- structure(
  .Data = c("zap", "zong"),
  class = c("myClass", "character")
)
x
# [1] "zap"  "zong"
# attr(,"class")
# [1] "myClass"   "character"

Let's further assume that I want to retain attributes (here: class = "myClass") on subsetting and similar (~ idempotent?) operations.

R's default behavior is, of course, to drop most attributes (all except names, dim and dimnames) on subsetting, but:

Attributes should generally be thought of as ephemeral (unless they’re formalised into an S3 class)
(emphasis added, @hadley's adv-r)

This is easy to see for factors, which do retain attributes (here: levels and class):

attributes(factor(c("foo", "bar"))[1])
# $levels
# [1] "bar" "foo"
# 
# $class
# [1] "factor"

But out-of-the-box S3 classes don't do this:

attributes(x[1])
# NULL

My hunch is that this is because base R implements an S3 subsetting method for factors, and indeed, there's an Extract.factor()/[.factor in base R (I couldn't find the source, probably because it's in C as an internal generic?).
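For what it's worth, a quick check along the following lines suggests that, at least in recent versions of R, [.factor is an ordinary R-level method in base rather than C code, so its source can simply be printed. This is only a sketch for inspecting the method; factor_subset and is_closure are names made up for the example.

```r
# `[.factor` lives in base, so it can be looked up by name.
factor_subset <- get("[.factor", envir = baseenv())

# A "closure" is a regular R function (as opposed to a primitive),
# which means printing it shows the R source.
is_closure <- typeof(factor_subset) == "closure"
is_closure

# Read how the method calls NextMethod() and then restores attributes.
print(factor_subset)
```

Only [ itself is the internal generic; the factor method it dispatches to is plain R.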

@hadley discusses the same thing with regard to dplyr, teaching it to retain attributes via sloop::reconstruct() (not on CRAN yet, sadly).

So far so good, but this seems like pretty major surgery (involving the internal generic [) just for teaching some class to retain its attributes on subsetting.

The alternatives are:

1. Implement [.myClass to make this happen (maybe by writing a reconstruct.myClass() method, though sloop is not on CRAN yet).
2. Use the promising sticky package by @ctbrown, which seems to implement everything required for 1) via its own class. (The package has not been very active recently and still has some bugs/limitations.)
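A minimal sketch of what the first alternative could look like for the myClass example from the top of this post. It assumes that restoring the class vector is all that is needed; a class with further attributes would have to copy those over as well.

```r
# Sketch: a `[` method that re-applies the class after the default
# subsetting has dropped it.
`[.myClass` <- function(x, i, ...) {
  out <- NextMethod()        # default subsetting: keeps names, drops class
  class(out) <- oldClass(x)  # put the original class vector back
  out
}

x <- structure(c("zap", "zong"), class = c("myClass", "character"))
y <- x[1]
class(y)
```

With this method defined, x[1] is still a myClass object, while the underlying data is the plain character element.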

What's the best practice to do this?
Is it wise to do this, or is (retaining attributes) like putting lipstick on a pig?

Ps.: Here's a related question on S-O.

Pps.: also related:

this issue for sloop::reconstruct()
this issue for adv-r

nwerth · August 8, 2018, 1:27pm

If the sticky package does what you want, that's great. But there are intentionally few rules around attributes, which lets them be flexible. Plenty of special cases (e.g., start and end for ts objects) require changes when subsetting. The default of dropping most attributes ^1 is a good thing; R doesn't presume to know how to handle your custom class.

So the general advice is to write a [.myClass method (maybe also [[.myClass and [<-.myClass methods). But, again, if sticky does what you want, then saving time is a good thing.
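A rough sketch of the two companion methods, using the myClass example from the question. The validation rule in the replacement method is an assumption made up for illustration; default subassignment already keeps the attributes of x, so a [<- method is mostly a place for checks like this.

```r
x <- structure(c("zap", "zong"), class = c("myClass", "character"))

# `[[` extracts a single element; re-apply the class to the result.
`[[.myClass` <- function(x, i, ...) {
  out <- NextMethod()
  class(out) <- oldClass(x)
  out
}

# `[<-` keeps the attributes of x by default, so this method only
# validates the replacement value (hypothetical rule: character only).
`[<-.myClass` <- function(x, i, ..., value) {
  if (!is.character(value)) {
    stop("replacement must be a character vector", call. = FALSE)
  }
  NextMethod()
}

first <- x[[1]]   # still a myClass object
x[2] <- "zing"    # passes validation, class is kept
```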

^1 Reading the code for sticky showed me the existence of the mostattributes() function, which should simplify some of my packages.
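To illustrate the footnote, here is a small sketch of how the mostattributes() replacement function can copy a class and custom attributes onto a subset. The unit attribute is made up for the example. As documented, names, dim and dimnames are only assigned when they are valid for the target object.

```r
# Hypothetical object with a class and one custom attribute.
x <- structure(
  c("zap", "zong"),
  class = c("myClass", "character"),
  unit = "m"
)

# Default subsetting of the unclassed data drops everything.
y <- unclass(x)[1]
attributes(y)

# Copy the attributes of x onto y in one step.
mostattributes(y) <- attributes(x)
class(y)
attr(y, "unit")
```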

maxheld83 · August 8, 2018, 1:31pm

thanks!
sticky seems great, it just a) is not on CRAN as of now, and b) appears to drop the names attribute on subsetting (though no others).

hadley · August 8, 2018, 3:27pm

Generally, you should provide a [ method along these lines:

#' @export
`[.binned` <- function(x, i, ...) {
  new_binned(NextMethod(), breaks = attr(x, "breaks"))
}

(assuming that your class is called binned, and you have a constructor called new_binned())
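To make that runnable end to end, here is a sketch with a made-up new_binned() constructor. The constructor and its checks are assumptions for illustration, not part of any package.

```r
# Hypothetical constructor: bin indices plus the break points that
# define the bins.
new_binned <- function(x = integer(), breaks = numeric()) {
  stopifnot(is.numeric(x), is.numeric(breaks))
  structure(x, breaks = breaks, class = "binned")
}

# Subset the underlying data, then rebuild the object so that class
# and breaks are restored through the constructor.
`[.binned` <- function(x, i, ...) {
  new_binned(NextMethod(), breaks = attr(x, "breaks"))
}

b <- new_binned(c(1L, 3L, 2L), breaks = c(0, 10, 20, 30))
sub <- b[2:3]
class(sub)
attr(sub, "breaks")
```

Going through the constructor rather than setting attributes by hand means any checks in new_binned() also run on every subset.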