Introduction
Have you ever skimmed through the tidyverse's design guide by Hadley Wickham? Even though that book hasn't been completed yet, it already has useful content such as the pattern "Put ... after required arguments", that we have applied in several of our packages, and that we'd like to focus on in this post!
Why put ... after required arguments?
In a function with both
- Required arguments, i.e. arguments with no default value (e.g. the
graph argument of many igraph functions such as hits_score())
- Optional arguments, i.e. arguments with a default value (e.g. the
weights argument that defaults to NULL in many igraph functions such as hits_score())
You'd hope for
- Users to name optional arguments to make the code easier to read for
outsiders. Compare hits_score(x, TRUE, NA) to
hits_score(x, scales = TRUE, weights = NA): the latter is obvious even
to the casual reader; this may be you in a few months.
- Yourself to be able to change the order of optional arguments without
wreaking havoc.
Well, the pattern described in the tidy design guide allows you to do just that.
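To make the second hope concrete, here is a minimal base R sketch. The functions describe_v1() and describe_v2() are hypothetical and only serve as illustration: between the two versions, the optional arguments swap places.

```r
# Hypothetical function, version 1: optional arguments in one order.
describe_v1 <- function(x, ..., upper = FALSE, prefix = "") {
  out <- paste0(prefix, x)
  if (upper) toupper(out) else out
}

# Hypothetical version 2: same arguments, but `prefix` now comes first.
describe_v2 <- function(x, ..., prefix = "", upper = FALSE) {
  out <- paste0(prefix, x)
  if (upper) toupper(out) else out
}

# Arguments placed after `...` can only be matched by their full name,
# so the same call gives the same result in both versions.
describe_v1("a", upper = TRUE, prefix = "id-")
describe_v2("a", upper = TRUE, prefix = "id-")
```

Since callers cannot rely on the position of upper and prefix, reordering them in the signature does not change what existing calls do.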
Here is a mocked up definition of the make_wheel() function in igraph,
that we call make_wheel0(). The function creates a wheel graph (a
center connected to peripheral nodes, which when plotted looks like a
bike wheel). Its arguments are: the number of nodes n, the optional
mode which describes the direction of edges, and center which
indicates the ID of the central node.
library(igraph)
make_wheel0 <- function(
  n,
  ...,
  mode = c("in", "out", "mutual", "undirected"),
  center = 1
) {
  res <- wheel_impl(
    n = n,
    mode = mode,
    center = center - 1
  )
  if (igraph_opt("add.params")) {
    res$name <- switch(
      igraph_match_arg(mode),
      "in" = "In-wheel",
      "out" = "Out-wheel",
      "Wheel"
    )
    res$mode <- mode
    res$center <- center
  }
  res
}
With that make_wheel0() definition, not naming the mode argument
does not work: it is silently ignored! make_wheel0(10) and
make_wheel0(10, "mutual") return the same wheel graph with edges
pointing towards the center. Only make_wheel0(10, mode = "mutual")
returns a wheel graph with mutual (bidirectional) edges.
(default <- make_wheel0(10))
IGRAPH b632ad1 D--- 10 18 -- In-wheel
+ attr: name (g/c), mode (g/c), center (g/n)
+ edges from b632ad1:
[1] 2-> 1 3-> 1 4-> 1 5-> 1 6-> 1 7-> 1 8-> 1 9-> 1 10-> 1 2-> 3
[11] 3-> 4 4-> 5 5-> 6 6-> 7 7-> 8 8-> 9 9->10 10-> 2
plot(default, vertex.size = 35, vertex.label.cex = 2)
(mutual <- make_wheel0(10, mode = "mutual"))
IGRAPH 94442f5 D--- 10 36 -- Wheel
+ attr: name (g/c), mode (g/c), center (g/n)
+ edges from 94442f5:
[1] 1-> 2 2-> 1 1-> 3 3-> 1 1-> 4 4-> 1 1-> 5 5-> 1 1-> 6 6-> 1
[11] 1-> 7 7-> 1 1-> 8 8-> 1 1-> 9 9-> 1 1->10 10-> 1 2-> 3 3-> 4
[21] 4-> 5 5-> 6 6-> 7 7-> 8 8-> 9 9->10 10-> 2 2->10 10-> 9 9-> 8
[31] 8-> 7 7-> 6 6-> 5 5-> 4 4-> 3 3-> 2
plot(mutual, vertex.size = 35, vertex.label.cex = 2)
(surprise <- make_wheel0(10, "mutual"))
IGRAPH 1ce383c D--- 10 18 -- In-wheel
+ attr: name (g/c), mode (g/c), center (g/n)
+ edges from 1ce383c:
[1] 2-> 1 3-> 1 4-> 1 5-> 1 6-> 1 7-> 1 8-> 1 9-> 1 10-> 1 2-> 3
[11] 3-> 4 4-> 5 5-> 6 6-> 7 7-> 8 8-> 9 9->10 10-> 2
plot(surprise, vertex.size = 35, vertex.label.cex = 2)
This is not exactly the ideal behavior since it can be confusing for users.
But the actual definition of make_wheel() is the following. We use
rlang::check_dots_empty() to ensure no argument has been passed as
..., so any unnamed argument after n triggers an error.
make_wheel <- function(
  n,
  ...,
  mode = c("in", "out", "mutual", "undirected"),
  center = 1
) {
  rlang::check_dots_empty()
  res <- wheel_impl(
    n = n,
    mode = mode,
    center = center - 1
  )
  if (igraph_opt("add.params")) {
    res$name <- switch(
      igraph_match_arg(mode),
      "in" = "In-wheel",
      "out" = "Out-wheel",
      "Wheel"
    )
    res$mode <- mode
    res$center <- center
  }
  res
}
As a consequence, users have to name the second argument:
no_surprise <- make_wheel(10, "mutual")
Error in `make_wheel()`:
! `...` must be empty.
✖ Problematic argument:
• ..1 = "mutual"
ℹ Did you forget to name an argument?
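A side effect of the check is that a misspelled argument name is caught too, since it also ends up in .... The sketch below illustrates this with base R only. check_no_dots() is a hypothetical, much simplified stand-in for rlang::check_dots_empty() (not how rlang implements it), and shout() is a made-up function following the pattern.

```r
# Hypothetical stand-in for rlang::check_dots_empty(), for illustration only.
check_no_dots <- function(...) {
  if (...length() > 0L) {
    stop(
      "`...` must be empty. Did you forget to name an argument?",
      call. = FALSE
    )
  }
  invisible(NULL)
}

# Made-up function with `...` between required and optional arguments.
shout <- function(x, ..., times = 1) {
  check_no_dots(...)
  paste(rep(toupper(x), times), collapse = " ")
}

# Naming the optional argument works.
shout("hey", times = 2)

# An unnamed second argument lands in `...` and triggers the error.
unnamed <- tryCatch(shout("hey", 2), error = function(e) conditionMessage(e))

# So does a misspelled name: arguments after `...` are never
# partially matched, so `time` is not taken for `times`.
typo <- tryCatch(shout("hey", time = 2), error = function(e) conditionMessage(e))
```

Without the ellipsis, R's partial matching would have silently accepted time = 2 as times = 2.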
Adding the dots in later function life
In the dm and igraph packages, we decided to add the ellipsis, together with rlang's empty-dots check, to functions that had been around and exported for a while. This meant we were looking at the possibility of breaking users' code. 😱 While breaking changes might sometimes be unavoidable, in this case we resorted to a gentler solution that we will sketch out here.
Let's compare different versions of a fictional function.
In this first implementation, my_message() lets the user pass unnamed
optional arguments pretty and verbose.
my_message <- function(x, pretty = FALSE, verbose = TRUE) {
  if (!is.character(x)) {
    stop("x must be a character")
  }
  if (verbose) {
    message("Hello!")
  }
  if (pretty) {
    cat(sprintf("-- %s --", x))
  } else {
    cat(x)
  }
}
my_message("a")
Hello!
a
my_message("a", TRUE)
Hello!
-- a --
my_message("a", TRUE, FALSE)
-- a --
This new version of my_message() is stricter but breaks existing code!
my_message <- function(x, ..., pretty = FALSE, verbose = TRUE) {
  rlang::check_dots_empty()
  if (!is.character(x)) {
    stop("x must be a character")
  }
  if (verbose) {
    message("Hello!")
  }
  if (pretty) {
    cat(sprintf("-- %s --", x))
  } else {
    cat(x)
  }
}
my_message("a")
Hello!
a
my_message("a", TRUE)
Error in `my_message()`:
! `...` must be empty.
✖ Problematic argument:
• ..1 = TRUE
ℹ Did you forget to name an argument?
my_message("a", TRUE, FALSE)
Error in `my_message()`:
! `...` must be empty.
✖ Problematic arguments:
• ..1 = TRUE
• ..2 = FALSE
ℹ Did you forget to name an argument?
my_message("a", pretty = TRUE, verbose = FALSE)
-- a --
The gentler version of that new check recovers arguments by their position, and nudges the users towards adapting their code by emitting a lifecycle warning. A bit more work for us, less pain for users.
my_message <- function(x, ..., pretty = FALSE, verbose = TRUE) {
  if (!is.character(x)) {
    stop("x must be a character")
  }
  if (...length() > 0) {
    user_env <- rlang::caller_env()
    recovered_attrs <- recover_positional_attrs(
      dots = list(...),
      current = list(pretty = pretty, verbose = verbose),
      user_env = user_env
    )
    pretty <- recovered_attrs$pretty
    verbose <- recovered_attrs$verbose
  }
  if (verbose) {
    message("Hello!")
  }
  if (pretty) {
    cat(sprintf("-- %s --", x))
  } else {
    cat(x)
  }
}
recover_positional_attrs() definition
recover_positional_attrs <- function(
  dots,
  current,
  user_env = rlang::caller_env()
) {
  nms <- rlang::names2(dots)
  unnamed <- dots[!nzchar(nms)]
  # One unnamed logical scalar: it used to be `pretty`.
  # `verbose` keeps its current value.
  if (
    length(dots) == 1L &&
      length(unnamed) == 1L &&
      is.logical(unnamed[[1L]]) &&
      length(unnamed[[1L]]) == 1L
  ) {
    lifecycle::deprecate_soft(
      "3.0.0",
      "my_message(pretty = 'must be named')",
      user_env = user_env
    )
    return(list(pretty = unnamed[[1L]], verbose = current$verbose))
  }
  # Two unnamed logical scalars: they used to be `pretty` and `verbose`.
  if (
    length(dots) == 2L &&
      length(unnamed) == 2L &&
      all(is.logical(unlist(unnamed))) &&
      all(lengths(unnamed) == 1L)
  ) {
    lifecycle::deprecate_soft(
      "3.0.0",
      "my_message(verbose = 'must be named')",
      details = "Both pretty and verbose must be named",
      user_env = user_env
    )
    return(list(pretty = unnamed[[1L]], verbose = unnamed[[2L]]))
  }
  # Nothing could be recovered: keep the current values.
  current
}
my_message("a", TRUE, FALSE)
Warning: The `verbose` argument of `my_message()` must be named as of <NA> 3.0.0.
ℹ Both pretty and verbose must be named
-- a --
Note that the lifecycle message displays <NA> here because my_message() is not defined in a package; when emitted from a package, it displays the name of that package.
In igraph, we went for a similar gentle solution. Now, igraph being igraph which means big, we made it scalable from the get-go:
- In a function where we want to transition towards using the ellipsis
pattern, we add marker comments # BEGIN GENERATED ARG_HANDLE: <fn> and
# END GENERATED ARG_HANDLE.
- A script saved in tools/ contains the "configuration": for each
migrated function, the names and positions of arguments, from which
package version the migration started, etc.
- Another script saved in tools/ generates, between the marker comments
and based on the configuration, the call to a helper function called
migrate_recover_args().
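The workflow above could be sketched roughly as follows, in base R only. This is not igraph's actual tooling: the configuration format, the generate_block() and regenerate() helpers, and the arguments passed to migrate_recover_args() in the generated text are all assumptions made for illustration. Only the marker comments and the helper's name come from the description above.

```r
# Hypothetical configuration: one entry per migrated function.
# The field names are assumptions made for this sketch.
config <- list(
  my_message = list(args = c("pretty", "verbose"), since = "3.0.0")
)

# Build the lines to place between the markers for function `fn`.
# The arguments given to migrate_recover_args() are invented here;
# the real helper may well take different ones.
generate_block <- function(fn, entry) {
  c(
    "if (...length() > 0) {",
    sprintf(
      "  migrate_recover_args(fn = \"%s\", args = c(%s), since = \"%s\")",
      fn,
      paste0("\"", entry$args, "\"", collapse = ", "),
      entry$since
    ),
    "}"
  )
}

# Replace whatever sits between the two marker comments of `fn`.
regenerate <- function(lines, fn, entry) {
  begin <- which(trimws(lines) == paste("# BEGIN GENERATED ARG_HANDLE:", fn))
  ends <- which(trimws(lines) == "# END GENERATED ARG_HANDLE")
  stopifnot(length(begin) == 1L)
  end <- ends[ends > begin][1L]
  stopifnot(!is.na(end))
  c(lines[seq_len(begin)], generate_block(fn, entry), lines[end:length(lines)])
}

# A toy source file, as a character vector of lines.
source_lines <- c(
  "my_message <- function(x, ..., pretty = FALSE, verbose = TRUE) {",
  "# BEGIN GENERATED ARG_HANDLE: my_message",
  "# END GENERATED ARG_HANDLE",
  "  cat(x)",
  "}"
)

updated <- regenerate(source_lines, "my_message", config$my_message)
writeLines(updated)
```

Because the generator only rewrites the region between the markers, running it again on its own output leaves the file unchanged, which keeps the generated code easy to refresh when the configuration evolves.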
Conclusion
The ellipsis between required and optional arguments makes for more
readable code. Erroring when unnamed arguments are passed after the
required arguments, thanks to rlang::check_dots_empty(), avoids
surprising the users. However, for existing functions where passing
unnamed optional arguments used to be accepted, it is gentler to warn
than error, and to recover unnamed arguments based on their position.
The ellipsis pattern is fascinating because it means a package developer can improve the experience of people who'll read the code that package users write!