The zeallot package defines an operator for unpacking
assignment, sometimes called parallel assignment or
destructuring assignment in other programming languages. The
operator is written as %<-% and used like this.
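A minimal sketch of the basic form, assuming the zeallot package is installed. The coordinate values are borrowed from the coords_list() example later in this document.

```r
library(zeallot)

# Unpack a two-element list into two variables in a single assignment.
c(lat, lng) %<-% list(38.061944, -122.643889)

lat
lng
```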
The result is that the list is unpacked into its elements, and the
elements are assigned to lat and lng.
You can also unpack the elements of a vector.
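As a small illustration, the same coordinates can be supplied as a numeric vector instead of a list. This sketch assumes zeallot is installed; the values are reused from the coords_list() example.

```r
library(zeallot)

# A numeric vector is unpacked element by element, just like a list.
c(lat, lng) %<-% c(38.061944, -122.643889)

lat
lng
```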
You can also unpack much longer structures, such as the six-part summary of a vector.
c(min_wt, q1_wt, med_wt, mean_wt, q3_wt, max_wt) %<-% summary(mtcars$wt)
min_wt
#> [1] 1.513
q1_wt
#> [1] 2.58125
med_wt
#> [1] 3.325
mean_wt
#> [1] 3.21725
q3_wt
#> [1] 3.61
max_wt
#> [1] 5.424
If the left-hand and right-hand sides do not match in length, an error is raised. This guards against missing or unexpected values.
c(stg1, stg2, stg3) %<-% list("Moe", "Donald")
#> Error: invalid `%<-%` right-hand side, incorrect number of values
c(stg1, stg2, stg3) %<-% list("Moe", "Larry", "Curley", "Donald")
#> Error: invalid `%<-%` right-hand side, incorrect number of values
A common use-case is when a function returns a list of values and you
want to extract the individual values. In this example, the list of
values returned by coords_list() is unpacked into the
variables lat and lng.
# A function that returns a list of 2 numeric values.
coords_list <- function() {
  list(38.061944, -122.643889)
}
c(lat, lng) %<-% coords_list()
lat
#> [1] 38.06194
lng
#> [1] -122.6439
In this next example, we call a function that returns a vector.
You can directly unpack the coefficients of a simple linear regression into the intercept and slope.
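One way this might look, assuming zeallot is installed. The model mpg ~ cyl on mtcars is the same one used later in this document, and the variable names inter and slope are illustrative.

```r
library(zeallot)

# coef() returns a numeric vector holding the intercept and the slope.
c(inter, slope) %<-% coef(lm(mpg ~ cyl, data = mtcars))

inter
slope
```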
The purrr package includes the safely function.
It wraps a given function to create a new, “safe” version of the
original function.
The safe version returns a list of two items. The first item is the
result of calling the original function, assuming no error occurred; or
NULL if an error did occur. The second item is the error,
if an error occurred; or NULL if no error occurred. Whether
or not the original function would have thrown an error, the safe
version will never throw an error.
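The safe_log function called next is not defined in this text. A likely definition, assuming the purrr package is installed, wraps the base log function; the variable name ok is illustrative.

```r
# Wrap log() so that failures are captured instead of thrown.
safe_log <- purrr::safely(log)

# A successful call: the result is filled in and the error slot is NULL.
ok <- safe_log(10)
ok$result
ok$error
```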
pair <- safe_log("donald")
pair$result
#> NULL
pair$error
#> <simpleError in .Primitive("log")(x, base): non-numeric argument to mathematical function>
You can tighten and clarify calls to the safe function by using %<-%.
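A sketch of that tightening, assuming zeallot and purrr are installed. It assumes safe_log is purrr::safely(log), and the names res and err are illustrative.

```r
library(zeallot)

# Assumed definition of the safe function used in this section.
safe_log <- purrr::safely(log)

# Unpack the result and the error in one step instead of using pair$result
# and pair$error.
c(res, err) %<-% safe_log(10)

res
err
```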
A data frame is simply a list of columns, so the zeallot assignment does what you expect. It unpacks the data frame into individual columns.
c(mpg, cyl, disp, hp) %<-% mtcars[, 1:4]
head(mpg)
#> [1] 21.0 21.0 22.8 21.4 18.7 18.1
head(cyl)
#> [1] 6 6 4 6 8 6
head(disp)
#> [1] 160 160 108 258 360 225
head(hp)
#> [1] 110 110 93 110 175 105
Bear in mind that a list of data frames is still just a list. The assignment will extract the list elements (which are data frames) but not unpack the data frames themselves.
quartet <- lapply(1:4, function(i) anscombe[, c(i, i + 4)])
c(an1, an2, an3, an4) %<-% lapply(quartet, head, n = 3)
an1
#>   x1   y1
#> 1 10 8.04
#> 2  8 6.95
#> 3 13 7.58
an2
#>   x2   y2
#> 1 10 9.14
#> 2  8 8.14
#> 3 13 8.74
an3
#>   x3    y3
#> 1 10  7.46
#> 2  8  6.77
#> 3 13 12.74
an4
#>   x4   y4
#> 1  8 6.58
#> 2  8 5.76
#> 3  8 7.71
The %<-% operator assigned four data frames to four
variables, leaving the data frames intact.
In addition to unpacking flat lists, you can unpack lists of lists.
c(a, c(b, d), e) %<-% list("begin", list("middle1", "middle2"), "end")
a
#> [1] "begin"
b
#> [1] "middle1"
d
#> [1] "middle2"
e
#> [1] "end"
Not only does this simplify extracting individual elements, it also adds a level of checking. If the described list structure does not match the actual list structure, an error is raised.
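To illustrate the structure check, this sketch supplies a one-element sublist where two elements are expected. It assumes zeallot is installed. The tryCatch wrapper and the outcome variable are only there so the script keeps running after the error.

```r
library(zeallot)

# The left-hand side expects a two-element sublist in the middle position,
# but the right-hand side supplies a sublist with only one element.
outcome <- tryCatch(
  {
    c(a, c(b, d), e) %<-% list("begin", list("middle1"), "end")
    "assigned"
  },
  error = function(err) "error"
)

outcome
```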
The previous examples dealt with unpacking a list or vector into its elements. You can also split certain kinds of individual values into subvalues.
You can assign individual characters of a string to variables.
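A short sketch of string unpacking, assuming a zeallot version that destructures character strings as described here (behavior may differ between releases). The variable names are illustrative.

```r
library(zeallot)

# Each character of the string is assigned to its own variable.
c(ch1, ch2, ch3) %<-% "abc"

ch1
ch2
ch3
```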
You can split a Date into its year, month, and day, and assign the parts to variables.
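A sketch of Date unpacking, assuming a zeallot version that destructures Date objects as described here (behavior may differ between releases). The fixed date and the variable names are chosen only for illustration.

```r
library(zeallot)

# A Date is split into year, month, and day, in that order.
c(yr, mon, dy) %<-% as.Date("2017-10-31")

yr
mon
dy
```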
zeallot includes implementations of destructure
for character strings, complex numbers, data frames, date objects, and
linear model summaries. However, because destructure is a
generic function, you can define new implementations for custom classes.
When defining a new implementation, keep in mind that it needs
to return a list so that values are properly unpacked.
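A possible custom implementation is sketched below, assuming a zeallot version that exports the destructure generic. The shape class and its constructor are hypothetical and exist only for this illustration. The method is registered explicitly with registerS3method so that dispatch does not depend on where the method is defined.

```r
library(zeallot)

# Hypothetical class for illustration; it is not part of zeallot.
shape <- function(sides = 4, color = "red") {
  structure(list(sides = sides, color = color), class = "shape")
}

# S3 method for the destructure generic. It must return a list.
destructure.shape <- function(x) {
  list(x$sides, x$color)
}

# Register the method against zeallot's generic (assumed to exist).
registerS3method("destructure", "shape", destructure.shape,
                 envir = asNamespace("zeallot"))

c(sides, color) %<-% shape()

sides
color
```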
In some cases, you want the first few elements of a list or vector
but do not care about the trailing elements. Calling summary
on an lm fit, for example, returns a list of 11 values,
and you might want only the first few. Fortunately, there is a way to
capture those first few and say “don’t worry about everything else”.
f <- lm(mpg ~ cyl, data = mtcars)
c(fcall, fterms, resids, ...rest) %<-% summary(f)
fcall
#> lm(formula = mpg ~ cyl, data = mtcars)
fterms
#> mpg ~ cyl
#> attr(,"variables")
#> list(mpg, cyl)
#> attr(,"factors")
#> cyl
#> mpg 0
#> cyl 1
#> attr(,"term.labels")
#> [1] "cyl"
#> attr(,"order")
#> [1] 1
#> attr(,"intercept")
#> [1] 1
#> attr(,"response")
#> [1] 1
#> attr(,".Environment")
#> <environment: R_GlobalEnv>
#> attr(,"predvars")
#> list(mpg, cyl)
#> attr(,"dataClasses")
#> mpg cyl
#> "numeric" "numeric"
head(resids)
#> Mazda RX4 Mazda RX4 Wag Datsun 710 Hornet 4 Drive
#> 0.3701643 0.3701643 -3.5814159 0.7701643
#> Hornet Sportabout Valiant
#> 3.8217446 -2.5298357
Here, rest will capture everything else.
str(rest)
#> List of 8
#> $ coefficients : num [1:2, 1:4] 37.885 -2.876 2.074 0.322 18.268 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:2] "(Intercept)" "cyl"
#> .. ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
#> $ aliased : Named logi [1:2] FALSE FALSE
#> ..- attr(*, "names")= chr [1:2] "(Intercept)" "cyl"
#> $ sigma : num 3.21
#> $ df : int [1:3] 2 30 2
#> $ r.squared : num 0.726
#> $ adj.r.squared: num 0.717
#> $ fstatistic : Named num [1:3] 79.6 1 30
#> ..- attr(*, "names")= chr [1:3] "value" "numdf" "dendf"
#> $ cov.unscaled : num [1:2, 1:2] 0.4185 -0.0626 -0.0626 0.0101
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:2] "(Intercept)" "cyl"
#> .. ..$ : chr [1:2] "(Intercept)" "cyl"
The assignment operator noticed that ...rest is prefixed
with ..., and it created a variable called rest
for the trailing values of the list. If you omitted
the “everything else” prefix, there would be an error because the
lengths of the left- and right-hand sides of the assignment would be
mismatched.
c(fcall, fterms, resids, rest) %<-% summary(f)
#> Error: invalid `%<-%` right-hand side, incorrect number of values
If multiple collector variables are specified at a particular depth, it is ambiguous which values to assign to which collector, and an error is raised.
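A sketch of the ambiguous case, assuming zeallot is installed. The collector names head_vals and tail_vals are illustrative, and tryCatch is used only so the script continues after the error.

```r
library(zeallot)

# Two collectors at the same depth: there is no unique way to split
# four values between them, so the assignment is rejected.
outcome <- tryCatch(
  {
    c(...head_vals, ...tail_vals) %<-% list(1, 2, 3, 4)
    "assigned"
  },
  error = function(err) "error"
)

outcome
```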
In addition to collecting trailing values, you can also collect initial values and assign specific remaining values.
c(...skip, e, f) %<-% list(1, 2, 3, 4, 5)
skip
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 2
#>
#> [[3]]
#> [1] 3
e
#> [1] 4
f
#> [1] 5
Or you can assign the first value, skip values, and then assign the last value.
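For instance, a collector in the middle position might be used like this. The sketch assumes zeallot is installed and the collector syntax described in this document.

```r
library(zeallot)

# Keep the first and last values; collect everything between them.
c(begin, ...middle, end) %<-% list(1, 2, 3, 4, 5)

begin
middle
end
```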
You can skip one or more values without raising an error by using a
period (.) instead of a variable name. For example, you
might care only about the min, mean, and max values of a vector’s
summary.
c(min_wt, ., ., mean_wt, ., max_wt) %<-% summary(mtcars$wt)
min_wt
#> [1] 1.513
mean_wt
#> [1] 3.21725
max_wt
#> [1] 5.424
By combining an anonymous element (.) with the collector
prefix (...), you can ignore whole sublists.
c(begin, ..., end) %<-% list("hello", "blah", list("blah"), "blah", "world!")
begin
#> [1] "hello"
end
#> [1] "world!"
You can mix periods and collectors together to selectively keep and discard elements.
c(begin, ., ...middle, end) %<-% as.list(1:5)
begin
#> [1] 1
middle
#> [[1]]
#> [1] 3
#>
#> [[2]]
#> [1] 4
end
#> [1] 5
It is important to note that although values are skipped, they are still expected: the right-hand side must supply a value for every skipped position. The next section touches on how to handle missing values.
You can specify a default value for a left-hand side variable using
=, similar to specifying the default value of a function
argument. This comes in handy when the number of elements returned by a
function cannot be guaranteed. tail, for example, may return
fewer elements than asked for.
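As a baseline, here is a sketch in which tail returns exactly as many values as requested. It assumes zeallot is installed and that nums is a two-element vector, which is an assumption consistent with the discussion that follows.

```r
library(zeallot)

# Assumed input: a vector with exactly two elements.
nums <- 1:2

# Two values requested, two values returned, so the lengths match.
c(x, y) %<-% tail(nums, 2)

x
y
```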
However, if we tried to get 3 elements and assign them, an error would
be raised because tail(nums, 3) still returns only 2
values when nums holds just 2 elements.
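The failing case can be sketched as follows, assuming zeallot is installed and that nums is a two-element vector. The tryCatch wrapper is only there so the script continues after the error.

```r
library(zeallot)

# Assumed input: a vector with exactly two elements.
nums <- 1:2

# Three variables on the left, but tail() can supply only two values.
outcome <- tryCatch(
  {
    c(x, y, z) %<-% tail(nums, 3)
    "assigned"
  },
  error = function(err) "error"
)

outcome
```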
We can fix the problem and resolve the error by specifying a default
value for z.
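A sketch of the fix, assuming zeallot is installed and that nums is a two-element vector. NULL is used as the default here, but any value could be chosen.

```r
library(zeallot)

# Assumed input: a vector with exactly two elements.
nums <- 1:2

# z falls back to its default because tail() supplies only two values.
c(x, y, z = NULL) %<-% tail(nums, 3)

x
y
z
```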
A handy trick is swapping values without the use of a temporary variable.
c(first, last) %<-% c("Ai", "Genly")
first
#> [1] "Ai"
last
#> [1] "Genly"
c(first, last) %<-% c(last, first)
first
#> [1] "Genly"
last
#> [1] "Ai"
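The same idea extends to more than two variables. The sketch below rotates three values, assuming zeallot is installed; the variable names p, q, and r are illustrative. It relies on the right-hand side being evaluated in full before any assignment takes place.

```r
library(zeallot)

p <- "north"
q <- "east"
r <- "south"

# The right-hand vector is built from the old values first,
# then unpacked into the same three variables.
c(p, q, r) %<-% c(q, r, p)

p
q
r
```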
The magrittr package provides a pipe operator, %>%,
which allows functions to be called in succession
instead of nested. The left operator %<-% does not work
well with these function chains. Instead, the right operator
%->% is recommended. The example below is adapted from
the magrittr readme.
library(magrittr)
mtcars %>%
  subset(hp > 100) %>%
  aggregate(. ~ cyl, data = ., FUN = . %>% mean() %>% round(2)) %>%
  transform(kpl = mpg %>% multiply_by(0.4251)) %->%
  c(cyl, mpg, ...rest)
cyl
#> [1] 4 6 8
mpg
#> [1] 25.90 19.74 15.10
rest
#> $disp
#> [1] 108.05 183.31 353.10
#>
#> $hp
#> [1] 111.00 122.29 209.21
#>
#> $drat
#> [1] 3.94 3.59 3.23
#>
#> $wt
#> [1] 2.15 3.12 4.00
#>
#> $qsec
#> [1] 17.75 17.98 16.77
#>
#> $vs
#> [1] 1.00 0.57 0.00
#>
#> $am
#> [1] 1.00 0.43 0.14
#>
#> $gear
#> [1] 4.50 3.86 3.29
#>
#> $carb
#> [1] 2.00 3.43 3.50
#>
#> $kpl
#> [1] 11.010090 8.391474 6.419010