Installation :
This package provides an improved assignment operator, :=, in which the shorthand .. stands for the left-hand side of the assignment.
library(dotdot)
x <- y <- iris
x$Sepal.Length[5] <- x$Sepal.Length[5] + 3
y$Sepal.Length[5] := .. + 3
identical(x,y)
#> [1] TRUE
z <- factor(letters[1:3])
levels(z) := c(.., "level4")
z
#> [1] a b c
#> Levels: a b c level4
You can think of the .. as the : of the := symbol laid horizontally.
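To make the mechanism more concrete, here is a minimal base R sketch of how an operator like := could be written. It is an illustration under the assumption that := simply replaces .. with the left-hand side and then performs a regular <- assignment in the caller's environment; it is not dotdot's actual source code, and it only stands in for the package's operator so the snippet runs without dotdot attached.

```r
# Sketch only: a stand-in for dotdot's `:=`, NOT the package's real implementation.
`:=` <- function(lhs, rhs) {
  lhs_expr <- substitute(lhs)  # unevaluated left-hand side, e.g. levels(z)
  rhs_expr <- substitute(rhs)  # unevaluated right-hand side, e.g. c(.., "level4")
  # Replace every `..` symbol in the right-hand side with the left-hand side.
  rhs_expr <- do.call(substitute, list(rhs_expr, setNames(list(lhs_expr), "..")))
  # Build the call `lhs <- rhs` and evaluate it where `:=` was called.
  invisible(eval.parent(call("<-", lhs_expr, rhs_expr)))
}

z <- factor(letters[1:3])
levels(z) := c(.., "level4")  # rewritten to: levels(z) <- c(levels(z), "level4")
levels(z)
```

Because the rewritten call is an ordinary assignment, replacement functions such as levels<- or [<- keep working on the left-hand side.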
data.table, tidyverse and other packages using :=
The operator := is used by the prominent packages data.table and rlang (the latter mostly through tidyverse functions), but they only use it to parse expressions, thanks to its convenient operator precedence. It is never actually called.
Thus dotdot is fully compatible with the tidyverse and data.table, and some ad hoc adjustments were made so that it works even when the latter are attached after dotdot.
library(data.table)
#>
#> Attaching package: 'data.table'
#> The following object is masked _by_ 'package:dotdot':
#>
#> :=
levels(z) := c(.., "level5")
z
#> [1] a b c
#> Levels: a b c level4 level5
data <- as.data.table(head(iris,2))
data[,new_col := 3] # `:= ` works as if dotdot wasn't attached
data
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species new_col
#> 1: 5.1 3.5 1.4 0.2 setosa 3
#> 2: 4.9 3.0 1.4 0.2 setosa 3
Here is an example where := is used by both dotdot and rlang (through dplyr) in the same statement:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:data.table':
#>
#> between, first, last
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
my_data_frame <- iris[3:5]
var = quo(Log.Petal.Width)
my_data_frame := .. %>% mutate(!!var := log(Petal.Width)) %>% head(2)
my_data_frame
#> Petal.Length Petal.Width Species Log.Petal.Width
#> 1 1.4 0.2 setosa -1.609438
#> 2 1.4 0.2 setosa -1.609438
If you have attached another package that exports :=, you can call dotdot_first() to make sure that dotdot's := is not masked. This seems to be rare, though, as I couldn't find an example.
%<>%
The package magrittr contains the operator %<>%, which serves a similar role to :=. Let's first see how they are similar, and then how they differ:
These calls have the same effect:
These as well, but here we see that magrittr is less compact and less readable:
iris$Sepal.Length[5] %<>% multiply_by(2) %>% add(3)
iris$Sepal.Length[5] %<>% {2*. + 3}
iris$Sepal.Length[5] := 2*.. + 3
Now for the differences, aside from compactness and readability:
Attaching magrittr often means masking functions such as extract or set_names. dotdot only exports its operator and the dotdot_first function.
magrittr operators deal with environments in a much less straightforward way, so this won't work:
library(magrittr)
test <- function(some_parameter) {
some_parameter %<>% {as.character(substitute(.))}
some_parameter
}
x <- try(test(foo))
#> Error in eval(lhs, parent, parent) : object 'foo' not found
inherits(x,"try-error")
#> [1] TRUE
While this will work fine:
test <- function(some_parameter) {
some_parameter := as.character(substitute(..))
some_parameter
}
test(foo)
#> [1] "foo"
:= is also faster than %<>%, though these operations are fast anyway and unlikely to ever be a bottleneck:
b <- x <- y <- z <- 1
microbenchmark::microbenchmark(
base = {b <- b + 1},
dotdot = {x := .. + 1},
magrittr = {y %<>% add(1)},
magrittr2 = {z %<>% {. + 1}},
times = 1e4
)
#> Unit: nanoseconds
#> expr min lq mean median uq max neval cld
#> base 200 302 493.4608 401 501 36501 10000 a
#> dotdot 10001 11802 14891.2715 12901 14001 2888200 10000 b
#> magrittr 61001 63502 79046.3541 65301 69700 4641102 10000 d
#> magrittr2 45601 47801 58116.3759 49100 51701 3128902 10000 c
:= is NOT meant to be a complete replacement for the <- operator. The latter makes it explicit that no .. is involved, so it is more readable. It is also faster (though we're talking about microseconds) and won't clutter your traceback() when debugging.
:= can be used several times in a statement such as z <- (x := .. + 1) + (y := .. + 1), but it never makes sense to use := several times in a single assignment such as x := (y := .. + 2), because every .. will be replaced by the variable on the left-hand side of the first evaluated := in any case. It can even produce counterintuitive output, see below.
This is all fine and explicit:
x <- 4
y <- 7
z <- (x := .. + 1) + (y := .. + 1)
x
#> [1] 5
y
#> [1] 8
z
#> [1] 13
But nesting several := is useless and potentially confusing: here the dots will be replaced by x, though one might have expected them to be replaced by y.
x <- 4
y <- 7
x := (y := .. + 2) # same as `x <- (y := x + 2)`
x
#> [1] 6
y
#> [1] 6
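The rewrite performed by the outer := in the example above can be reproduced by hand with base R's substitute(), without attaching dotdot. This is only a sketch, assuming that := substitutes .. throughout its right-hand side as described in this README; the variable names outer_lhs, outer_rhs and rewritten are made up for the illustration.

```r
# Sketch only: mimic the substitution done by the outer `:=`.
outer_lhs <- quote(x)               # left-hand side of the outer `:=`
outer_rhs <- quote((y := .. + 2))   # right-hand side, still containing `..`

# Replace every `..` in the right-hand side with the outer left-hand side.
rewritten <- do.call(substitute, list(outer_rhs, setNames(list(outer_lhs), "..")))

# The inner `:=` no longer contains any `..`, so y is assigned x + 2.
print(rewritten)
```

The substitution happens before the inner := is ever evaluated, which is why the dots end up referring to x rather than y.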
Good practice makes things unambiguous:
x <- 4
y <- 7
x <- (y := .. + 2)
x
#> [1] 9
y
#> [1] 9
x <- 4
y <- 7
x := (y <- .. + 2)
x
#> [1] 6
y
#> [1] 6