lmap(), lmap_at() and lmap_if() are similar to map(), map_at() and map_if(), with the difference that they operate exclusively on functions that take and return a list (or data frame). Thus, instead of mapping the elements of a list (as in .x[[i]]), they apply a function .f to each subset of size 1 of that list (as in .x[i]). We call these size-1 subsets list-elements.
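
The difference between mapping elements and mapping list-elements can be sketched as follows. This is a minimal illustration, assuming purrr is installed; the input list and the two anonymous functions are made up for the example and are not part of the package.

```r
library(purrr)

x <- list(a = 1:3, b = letters[1:2])

# map() passes each bare element x[[i]] to the function,
# so the function sees an integer or character vector.
seen_by_map <- map(x, function(el) class(el))

# lmap() passes each size-1 sublist x[i] to the function,
# so the function sees a list that still carries its name.
# The function must return a list.
seen_by_lmap <- lmap(x, function(le) {
  out <- list(paste(names(le), class(le)))
  names(out) <- names(le)
  out
})

seen_by_map$a   # "integer"
seen_by_lmap$a  # "a list"
```
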
    lmap(.x, .f, ...)
    lmap_if(.x, .p, .f, ..., .else = NULL)
    lmap_at(.x, .at, .f, ...)
.x | A list or data frame. |
---|---|
.f | A function that takes and returns a list or data frame. |
... | Additional arguments passed on to the mapped function. |
.p | A single predicate function, a formula describing such a predicate function, or a logical vector of the same length as .x. |
.else | A function applied to elements of .x for which .p returns FALSE. |
.at | A character vector of names, positive numeric vector of positions to include, or a negative numeric vector of positions to exclude. Only those elements corresponding to .at will be modified. |
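
As a rough illustration of how the selection arguments behave, the sketch below selects list-elements by name, by negative position, and by predicate with a fallback. It assumes purrr is installed; `double_it()` and `shout()` are hypothetical helpers written only for this example.

```r
library(purrr)

x <- list(a = 1:3, b = "text", c = 4:6)

# Hypothetical helper: doubles the content of a list-element.
# lapply() keeps the name, so a named list of size 1 comes back.
double_it <- function(le) lapply(le, function(v) v * 2)

# Hypothetical helper: upper-cases the content of a list-element.
shout <- function(le) lapply(le, toupper)

# .at as a character vector of names
by_name <- lmap_at(x, c("a", "c"), double_it)

# .at as a negative position: everything except the second element
by_negative <- lmap_at(x, -2, double_it)

# .p as a predicate function, with .else applied where .p is FALSE
by_pred <- lmap_if(x, is.numeric, double_it, .else = shout)
```

In this sketch `by_name` and `by_negative` select the same list-elements, and `.else` handles the one element that the predicate rejects.
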
If .x is a list, a list. If .x is a data frame, a data frame.
Mapping the list-elements .x[i] has several advantages. It makes it possible to work with functions that exclusively take a list or data frame. It enables .f to access the attributes of the encapsulating list, such as the names of the components it receives. It also enables .f to return a larger list than the list-element of size 1 it got as input. Conversely, .f can also return an empty list. In both cases, the output list has a different size than the input list .x.
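
A deterministic sketch of this resizing behaviour is shown below. It assumes purrr is installed; `split_values()` is a hypothetical helper written for this illustration. It reads the name of the list-element it receives, returns one entry per value, and returns an empty list when there are no values.

```r
library(purrr)

# Hypothetical helper: expands a list-element into one entry per value,
# and drops list-elements whose content is empty.
split_values <- function(le) {
  values <- le[[1]]
  if (length(values) == 0) {
    return(list())
  }
  out <- as.list(values)
  # The name of the list-element is available through names(le)
  names(out) <- paste0(names(le), seq_along(values))
  out
}

x <- list(a = 1:2, b = character(0), c = "z")
res <- lmap(x, split_values)

names(res)  # "a1" "a2" "c1"
```

Here `a` grows into two entries, `b` disappears, and `c` is kept as a single renamed entry, so the output has the same length as the input only by coincidence.
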
    library(purrr)

    # Let's write a function that returns a larger list or an empty list
    # depending on some condition. This function also uses the names
    # metadata available in the attributes of the list-element
    maybe_rep <- function(x) {
      n <- rpois(1, 2)
      out <- rep_len(x, n)
      if (length(out) > 0) {
        names(out) <- paste0(names(x), seq_len(n))
      }
      out
    }

    # The output size varies each time we map maybe_rep()
    x <- list(a = 1:4, b = letters[5:7], c = 8:9, d = letters[10])
    x %>% lmap(maybe_rep)
    #> $a1
    #> [1] 1 2 3 4
    #>
    #> $a2
    #> [1] 1 2 3 4
    #>
    #> $b1
    #> [1] "e" "f" "g"
    #>
    #> $b2
    #> [1] "e" "f" "g"
    #>
    #> $b3
    #> [1] "e" "f" "g"
    #>

    # We can apply maybe_rep() on a selected subset of x
    x %>% lmap_at(c("a", "d"), maybe_rep)
    #> $a1
    #> [1] 1 2 3 4
    #>
    #> $b
    #> [1] "e" "f" "g"
    #>
    #> $c
    #> [1] 8 9
    #>
    #> $d1
    #> [1] "j"
    #>

    # Or only where a condition is satisfied
    x %>% lmap_if(is.character, maybe_rep)
    #> $a
    #> [1] 1 2 3 4
    #>
    #> $c
    #> [1] 8 9
    #>
    #> $d1
    #> [1] "j"
    #>
    #> $d2
    #> [1] "j"
    #>

    # A more realistic example would be a function that takes discrete
    # variables in a dataset and turns them into disjunctive tables, a
    # form that is amenable to fitting some types of models.

    # A disjunctive table contains only 0 and 1 but has as many columns
    # as unique values in the original variable. Ideally, we want to
    # combine the names of each level with the name of the discrete
    # variable in order to identify them. Given these requirements, it
    # makes sense to have a function that takes a data frame of size 1
    # and returns a data frame of variable size.
    disjoin <- function(x, sep = "_") {
      name <- names(x)
      x <- as.factor(x[[1]])

      out <- lapply(levels(x), function(level) {
        as.numeric(x == level)
      })

      names(out) <- paste(name, levels(x), sep = sep)
      out
    }

    # Now, we are ready to map disjoin() on each categorical variable of a
    # data frame:
    iris %>% lmap_if(is.factor, disjoin)
    #> # A tibble: 150 x 7
    #>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species_setosa
    #>           <dbl>       <dbl>        <dbl>       <dbl>          <dbl>
    #>  1          5.1         3.5          1.4         0.2              1
    #>  2          4.9         3            1.4         0.2              1
    #>  3          4.7         3.2          1.3         0.2              1
    #>  4          4.6         3.1          1.5         0.2              1
    #>  5          5           3.6          1.4         0.2              1
    #>  6          5.4         3.9          1.7         0.4              1
    #>  7          4.6         3.4          1.4         0.3              1
    #>  8          5           3.4          1.5         0.2              1
    #>  9          4.4         2.9          1.4         0.2              1
    #> 10          4.9         3.1          1.5         0.1              1
    #> # … with 140 more rows, and 2 more variables: Species_versicolor <dbl>,
    #> #   Species_virginica <dbl>

    mtcars %>% lmap_at(c("cyl", "vs", "am"), disjoin)
    #> # A tibble: 32 x 15
    #>      mpg cyl_4 cyl_6 cyl_8  disp    hp  drat    wt  qsec  vs_0  vs_1  am_0  am_1
    #>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
    #>  1  21       0     1     0  160    110  3.9   2.62  16.5     1     0     0     1
    #>  2  21       0     1     0  160    110  3.9   2.88  17.0     1     0     0     1
    #>  3  22.8     1     0     0  108     93  3.85  2.32  18.6     0     1     0     1
    #>  4  21.4     0     1     0  258    110  3.08  3.22  19.4     0     1     1     0
    #>  5  18.7     0     0     1  360    175  3.15  3.44  17.0     1     0     1     0
    #>  6  18.1     0     1     0  225    105  2.76  3.46  20.2     0     1     1     0
    #>  7  14.3     0     0     1  360    245  3.21  3.57  15.8     1     0     1     0
    #>  8  24.4     1     0     0  147.    62  3.69  3.19  20       0     1     1     0
    #>  9  22.8     1     0     0  141.    95  3.92  3.15  22.9     0     1     1     0
    #> 10  19.2     0     1     0  168.   123  3.92  3.44  18.3     0     1     1     0
    #> # … with 22 more rows, and 2 more variables: gear <dbl>, carb <dbl>