Object-Oriented System in R

Central to any object-oriented system are the concepts of class and method. A class defines the behaviour of objects by describing their attributes and their relationship to other classes. The class is also used when selecting methods, functions that behave differently depending on the class of their input. Classes are usually organised in a hierarchy: the child inherits behaviour from the parent, if a method does not exist for a child, then the parent’s method is used instead.

There are different ways to define objects in R:

  1. Base types, the internal C-level types that underlie the other OO systems. Base types are mostly manipulated using C code, but they’re important to know about because they provide the building blocks for the other OO systems.

  2. S3 implements a style of OO programming called generic-function OO. S3 implements message-passing OO which is different from most programming languages. With message-passing, messages (methods) are sent to objects and the object determines which function to call. Typically, this object has a special appearance in the method call, usually appearing before the name of the method/message: e.g., canvas.drawRect("blue"). S3 is different. While computations are still carried out via methods, a special type of function called a generic function decides which method to call, e.g., drawRect(canvas, "blue"). S3 is a very casual system, has no formal definition of classes.

  3. S4 works similarly to S3, with two major differences: 1) S4 has formal class definitions, which describe the representation and inheritance for each class, and 2) S4 has special helper functions for defining generics and methods. S4 also has multiple dispatch, which means that generic functions can pick methods based on the class of any number of arguments, not just one.

  4. Reference classes, called RC for short, are quite different from S3 and S4. In RC, methods belong to classes, not functions. RC object is a combination of S4 and environments. $ is used to separate objects and methods, so method calls look like canvas$drawRect("blue"). RC objects are also mutable: they don’t use R’s usual copy-on-modify semantics, but are modified in place. This makes them harder to reason about, but allows them to solve problems that are difficult to solve with S3 or S4.

  5. R6 classes, defined in the R6 package, allow the creation of classes with reference semantics, similar to R’s built-in reference classes. Compared to reference classes, R6 classes are simpler and lighter-weight, and they are not built on S4 classes so they do not require the methods package. These classes allow public and private members, and they support inheritance, even when the classes are defined in different packages.

S3

S3 is R’s first and simplest OO system. It is the only OO system used in the base and stats packages, and it’s the most commonly used system in CRAN packages.

Objects, Generics, and Methods

Most objects that you encounter are S3 objects, you can use pryr::otype() to recognize the object.

library(pryr)
df <- data.frame(x = 1:10, y = letters[1:10])
otype(df)    # A data frame is an S3 class
#> [1] "S3"
otype(df$x)  # A numeric vector isn't
#> [1] "base"
otype(df$y)  # A factor is
#> [1] "S3"

In S3, methods belong to functions, called generic functions, or generics for short. S3 methods do not belong to objects or classes, which his is different from most other programming languages. Similar to otype(), pryr also provides ftype() which describes the object system:

ftype(mean)
#> [1] "s3"      "generic"

Given a class, the job of an S3 generic is to call the right S3 method. You can recognise S3 methods by their names, which look like generic.class(). For example, the Date method for the mean() generic is called mean.Date(), and the factor method for print() is called print.factor(). This is the reason that most modern style guides discourage the use of . in function names: it makes them look like S3 methods. For example, is t.test() the t method for test objects? Similarly, the use of . in class names can also be confusing: is print.data.frame() the print() method for data.frame class, or the print.data() method for frame class? pryr::ftype() knows about these exceptions, so you can use it to figure out if a function is an S3 method or generic:

ftype(t.data.frame) # data frame method for t()
#> [1] "s3"     "method"
ftype(t.test)       # generic function for t tests
#> [1] "s3"      "generic"

You can also get all the methods that belong to a generic or all generics that have a method for a given class:

methods("mean")
#> [1] mean.Date     mean.default  mean.difftime mean.POSIXct  mean.POSIXlt
#> see '?methods' for accessing help and source code
methods("t.test")
#> [1] t.test.default* t.test.formula*
#> see '?methods' for accessing help and source code

methods(class = "ts")
#>  [1] aggregate     as.data.frame cbind         coerce        cycle
#>  [6] diffinv       diff          initialize    kernapply     lines
#> [11] Math2         Math          monthplot     na.omit       Ops
#> [16] plot          print         show          slotsFromS3   time
#> [21] [<-           [             t             window<-      window
#> see '?methods' for accessing help and source code