The scoped variants of mutate() and transmute() make it easy to apply the same transformation to multiple variables. There are three variants:

  • _all affects every variable

  • _at affects variables selected with a character vector or vars()

  • _if affects variables selected with a predicate function.

mutate_all(.tbl, .funs, ...)

mutate_if(.tbl, .predicate, .funs, ...)

mutate_at(.tbl, .vars, .funs, ..., .cols = NULL)

transmute_all(.tbl, .funs, ...)

transmute_if(.tbl, .predicate, .funs, ...)

transmute_at(.tbl, .vars, .funs, ..., .cols = NULL)
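The signatures above come in two families that differ only in what they return. As a minimal sketch (the tibble `df` and the helper `double_it()` are made up for illustration and are not part of dplyr), the mutate_*() variants keep every column, while the transmute_*() variants keep only the transformed ones:

```r
library(dplyr)

# Hypothetical toy data and helper, for illustration only.
df <- tibble(id = c("a", "b", "c"), x = c(1, 2, 3), y = c(10, 20, 30))
double_it <- function(v) v * 2

# mutate_at() keeps every column and overwrites the selected ones.
kept <- df %>% mutate_at(c("x", "y"), double_it)

# transmute_at() returns only the transformed columns.
only <- df %>% transmute_at(c("x", "y"), double_it)

names(kept)
names(only)
```

Here `kept` still carries `id`, whereas `only` contains just `x` and `y`.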

Arguments

.tbl

A tbl object.

.funs

A function fun, a quosure style lambda ~ fun(.) or a list of either form.

...

Additional arguments for the function calls in .funs. These are evaluated only once, with tidy dots support.

.predicate

A predicate function to be applied to the columns or a logical vector. The variables for which .predicate is or returns TRUE are selected. This argument is passed to rlang::as_function() and thus supports quosure-style lambda functions and strings representing function names.

.vars

A list of columns generated by vars(), a character vector of column names, a numeric vector of column positions, or NULL.

.cols

This argument has been renamed to .vars to fit dplyr's terminology and is deprecated.
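The selection arguments described above accept several interchangeable forms. The following sketch, which assumes a small made-up tibble `df`, shows the forms listed for .vars and .predicate selecting the same two columns:

```r
library(dplyr)

# Hypothetical toy data, for illustration only.
df <- tibble(id = c("a", "b"), x = c(1, 4), y = c(9, 16))

# .vars: three ways to pick x and y for the _at() variants.
by_name <- df %>% mutate_at(c("x", "y"), sqrt)  # character vector of names
by_pos  <- df %>% mutate_at(c(2, 3), sqrt)      # numeric column positions
by_vars <- df %>% mutate_at(vars(x, y), sqrt)   # vars() selection

# .predicate: three ways to pick the numeric columns for the _if() variants.
by_fun    <- df %>% mutate_if(is.numeric, sqrt)            # predicate function
by_lambda <- df %>% mutate_if(~ is.numeric(.), sqrt)       # quosure-style lambda
by_lgl    <- df %>% mutate_if(c(FALSE, TRUE, TRUE), sqrt)  # one logical per column

by_name
```

All six calls should yield the same result for this data, since `x` and `y` are the only numeric columns.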

Value

A data frame. By default, the newly created columns have the shortest names needed to uniquely identify the output. To force inclusion of a name, even when not needed, name the input (see examples for details).

Grouping variables

If applied on a grouped tibble, these operations are not applied to the grouping variables. The behaviour depends on whether the selection is implicit (all and if selections) or explicit (at selections).

  • Grouping variables covered by explicit selections in mutate_at() and transmute_at() are always an error. Add -group_cols() to the vars() selection to avoid this:

    data %>% mutate_at(vars(-group_cols(), ...), myoperation)
    

    Or remove group_vars() from the character vector of column names:

    nms <- setdiff(nms, group_vars(data))
    data %>% mutate_at(nms, myoperation)
    
  • Grouping variables covered by implicit selections are ignored by mutate_all(), transmute_all(), mutate_if(), and transmute_if().
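The two cases above can be sketched on a small grouped tibble. The data and column names (`g`, `x`, `y`) are assumptions made for illustration:

```r
library(dplyr)

# Hypothetical toy data, for illustration only.
df <- tibble(g = c(1, 1, 2), x = c(1, 2, 3), y = c(4, 5, 6))
grouped <- df %>% group_by(g)

# Implicit selection: the grouping column g is skipped, not transformed.
all_out <- grouped %>% mutate_all(~ . * 10)

# Explicit selection: exclude the grouping columns with -group_cols() ...
at_out <- grouped %>% mutate_at(vars(-group_cols()), ~ . * 10)

# ... or remove them from a character vector of names.
nms <- setdiff(names(df), group_vars(grouped))
chr_out <- grouped %>% mutate_at(nms, ~ . * 10)

all_out
```

In all three results `g` keeps its original values while `x` and `y` are multiplied by 10.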

Naming

The names of the created columns are derived from the names of the input variables and the names of the functions.

  • If there is only one unnamed function, the names of the input variables are used to name the created columns.

  • If there is only one unnamed variable, the names of the functions are used to name the created columns.

  • Otherwise, in the most general case, the created names are formed by concatenating the names of the input variables and the names of the functions.

Here, "the names of the functions" means the names of the list of functions that is supplied. When a name is needed and not supplied, the name of a function is the prefix "fn" followed by the index of this function within the unnamed functions in the list. Ultimately, names are made unique.
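As a minimal sketch of the first and third rules, and of the "fn" fallback, the following uses a made-up two-column tibble and inspects only the resulting column names:

```r
library(dplyr)

# Hypothetical toy data, for illustration only.
df <- tibble(x = c(1, 2, 3), y = c(4, 5, 6))

# One unnamed function: the input names are reused, so columns change in place.
in_place <- names(df %>% mutate_at(vars(x, y), sqrt))

# One named function: variable and function names are concatenated.
named <- names(df %>% mutate_at(vars(x, y), list(root = sqrt)))

# Several unnamed functions: the "fn" prefix plus an index stands in for the name.
unnamed <- names(df %>% mutate_at(vars(x, y), list(sqrt, exp)))

in_place
named
unnamed
```

Naming the single function is what switches from overwriting `x` and `y` to adding `x_root` and `y_root` alongside them.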

Examples

iris <- as_tibble(iris)

# All variants can be passed functions and additional arguments,
# purrr-style. The _at() variants directly support strings. Here
# we'll scale the variables `height` and `mass`:
scale2 <- function(x, na.rm = FALSE) (x - mean(x, na.rm = na.rm)) / sd(x, na.rm)
starwars %>% mutate_at(c("height", "mass"), scale2)
#> # A tibble: 87 x 13
#>    name  height  mass hair_color skin_color eye_color birth_year gender
#>    <chr>  <dbl> <dbl> <chr>      <chr>      <chr>          <dbl> <chr>
#>  1 Luke…     NA    NA blond      fair       blue            19   male
#>  2 C-3PO     NA    NA <NA>       gold       yellow         112   <NA>
#>  3 R2-D2     NA    NA <NA>       white, bl… red             33   <NA>
#>  4 Dart…     NA    NA none       white      yellow          41.9 male
#>  5 Leia…     NA    NA brown      light      brown           19   female
#>  6 Owen…     NA    NA brown, gr… light      blue            52   male
#>  7 Beru…     NA    NA brown      light      blue            47   female
#>  8 R5-D4     NA    NA <NA>       white, red red             NA   <NA>
#>  9 Bigg…     NA    NA black      light      brown           24   male
#> 10 Obi-…     NA    NA auburn, w… fair       blue-gray       57   male
#> # … with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#> #   films <list>, vehicles <list>, starships <list>

# You can pass additional arguments to the function:
starwars %>% mutate_at(c("height", "mass"), scale2, na.rm = TRUE)
#> # A tibble: 87 x 13
#>    name   height    mass hair_color skin_color eye_color birth_year gender
#>    <chr>   <dbl>   <dbl> <chr>      <chr>      <chr>          <dbl> <chr>
#>  1 Luke… -0.0678 -0.120  blond      fair       blue            19   male
#>  2 C-3PO -0.212  -0.132  <NA>       gold       yellow         112   <NA>
#>  3 R2-D2 -2.25   -0.385  <NA>       white, bl… red             33   <NA>
#>  4 Dart…  0.795   0.228  none       white      yellow          41.9 male
#>  5 Leia… -0.701  -0.285  brown      light      brown           19   female
#>  6 Owen…  0.105   0.134  brown, gr… light      blue            52   male
#>  7 Beru… -0.269  -0.132  brown      light      blue            47   female
#>  8 R5-D4 -2.22   -0.385  <NA>       white, red red             NA   <NA>
#>  9 Bigg…  0.249  -0.0786 black      light      brown           24   male
#> 10 Obi-…  0.220  -0.120  auburn, w… fair       blue-gray       57   male
#> # … with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#> #   films <list>, vehicles <list>, starships <list>

# You can also pass formulas to create functions on the spot, purrr-style:
starwars %>% mutate_at(c("height", "mass"), ~scale2(., na.rm = TRUE))
#> # A tibble: 87 x 13
#>    name   height    mass hair_color skin_color eye_color birth_year gender
#>    <chr>   <dbl>   <dbl> <chr>      <chr>      <chr>          <dbl> <chr>
#>  1 Luke… -0.0678 -0.120  blond      fair       blue            19   male
#>  2 C-3PO -0.212  -0.132  <NA>       gold       yellow         112   <NA>
#>  3 R2-D2 -2.25   -0.385  <NA>       white, bl… red             33   <NA>
#>  4 Dart…  0.795   0.228  none       white      yellow          41.9 male
#>  5 Leia… -0.701  -0.285  brown      light      brown           19   female
#>  6 Owen…  0.105   0.134  brown, gr… light      blue            52   male
#>  7 Beru… -0.269  -0.132  brown      light      blue            47   female
#>  8 R5-D4 -2.22   -0.385  <NA>       white, red red             NA   <NA>
#>  9 Bigg…  0.249  -0.0786 black      light      brown           24   male
#> 10 Obi-…  0.220  -0.120  auburn, w… fair       blue-gray       57   male
#> # … with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#> #   films <list>, vehicles <list>, starships <list>

# You can also supply selection helpers to _at() functions but you have
# to quote them with vars():
iris %>% mutate_at(vars(matches("Sepal")), log)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>
#>  1         1.63        1.25          1.4         0.2 setosa
#>  2         1.59        1.10          1.4         0.2 setosa
#>  3         1.55        1.16          1.3         0.2 setosa
#>  4         1.53        1.13          1.5         0.2 setosa
#>  5         1.61        1.28          1.4         0.2 setosa
#>  6         1.69        1.36          1.7         0.4 setosa
#>  7         1.53        1.22          1.4         0.3 setosa
#>  8         1.61        1.22          1.5         0.2 setosa
#>  9         1.48        1.06          1.4         0.2 setosa
#> 10         1.59        1.13          1.5         0.1 setosa
#> # … with 140 more rows

# The _if() variants apply a predicate function (a function that
# returns TRUE or FALSE) to determine the relevant subset of
# columns. Here we scale all the numeric columns:
starwars %>% mutate_if(is.numeric, scale2, na.rm = TRUE)
#> # A tibble: 87 x 13
#>    name   height    mass hair_color skin_color eye_color birth_year gender
#>    <chr>   <dbl>   <dbl> <chr>      <chr>      <chr>          <dbl> <chr>
#>  1 Luke… -0.0678 -0.120  blond      fair       blue          -0.443 male
#>  2 C-3PO -0.212  -0.132  <NA>       gold       yellow         0.158 <NA>
#>  3 R2-D2 -2.25   -0.385  <NA>       white, bl… red           -0.353 <NA>
#>  4 Dart…  0.795   0.228  none       white      yellow        -0.295 male
#>  5 Leia… -0.701  -0.285  brown      light      brown         -0.443 female
#>  6 Owen…  0.105   0.134  brown, gr… light      blue          -0.230 male
#>  7 Beru… -0.269  -0.132  brown      light      blue          -0.262 female
#>  8 R5-D4 -2.22   -0.385  <NA>       white, red red           NA     <NA>
#>  9 Bigg…  0.249  -0.0786 black      light      brown         -0.411 male
#> 10 Obi-…  0.220  -0.120  auburn, w… fair       blue-gray     -0.198 male
#> # … with 77 more rows, and 5 more variables: homeworld <chr>, species <chr>,
#> #   films <list>, vehicles <list>, starships <list>

# mutate_if() is particularly useful for transforming variables from
# one type to another
iris %>% mutate_if(is.factor, as.character)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
#>  1          5.1         3.5          1.4         0.2 setosa
#>  2          4.9         3            1.4         0.2 setosa
#>  3          4.7         3.2          1.3         0.2 setosa
#>  4          4.6         3.1          1.5         0.2 setosa
#>  5          5           3.6          1.4         0.2 setosa
#>  6          5.4         3.9          1.7         0.4 setosa
#>  7          4.6         3.4          1.4         0.3 setosa
#>  8          5           3.4          1.5         0.2 setosa
#>  9          4.4         2.9          1.4         0.2 setosa
#> 10          4.9         3.1          1.5         0.1 setosa
#> # … with 140 more rows
iris %>% mutate_if(is.double, as.integer)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <int>       <int>        <int>       <int> <fct>
#>  1            5           3            1           0 setosa
#>  2            4           3            1           0 setosa
#>  3            4           3            1           0 setosa
#>  4            4           3            1           0 setosa
#>  5            5           3            1           0 setosa
#>  6            5           3            1           0 setosa
#>  7            4           3            1           0 setosa
#>  8            5           3            1           0 setosa
#>  9            4           2            1           0 setosa
#> 10            4           3            1           0 setosa
#> # … with 140 more rows

# Multiple transformations ----------------------------------------

# If you want to apply multiple transformations, pass a list of
# functions. When there are multiple functions, they create new
# variables instead of modifying the variables in place:
iris %>% mutate_if(is.numeric, list(scale2, log))
#> # A tibble: 150 x 13
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_fn1
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>              <dbl>
#>  1          5.1         3.5          1.4         0.2 setosa            -0.898
#>  2          4.9         3            1.4         0.2 setosa            -1.14
#>  3          4.7         3.2          1.3         0.2 setosa            -1.38
#>  4          4.6         3.1          1.5         0.2 setosa            -1.50
#>  5          5           3.6          1.4         0.2 setosa            -1.02
#>  6          5.4         3.9          1.7         0.4 setosa            -0.535
#>  7          4.6         3.4          1.4         0.3 setosa            -1.50
#>  8          5           3.4          1.5         0.2 setosa            -1.02
#>  9          4.4         2.9          1.4         0.2 setosa            -1.74
#> 10          4.9         3.1          1.5         0.1 setosa            -1.14
#> # … with 140 more rows, and 7 more variables: Sepal.Width_fn1 <dbl>,
#> #   Petal.Length_fn1 <dbl>, Petal.Width_fn1 <dbl>, Sepal.Length_fn2 <dbl>,
#> #   Sepal.Width_fn2 <dbl>, Petal.Length_fn2 <dbl>, Petal.Width_fn2 <dbl>

# The list can contain purrr-style formulas:
iris %>% mutate_if(is.numeric, list(~scale2(.), ~log(.)))
#> # A tibble: 150 x 13
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_sc…
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>              <dbl>
#>  1          5.1         3.5          1.4         0.2 setosa            -0.898
#>  2          4.9         3            1.4         0.2 setosa            -1.14
#>  3          4.7         3.2          1.3         0.2 setosa            -1.38
#>  4          4.6         3.1          1.5         0.2 setosa            -1.50
#>  5          5           3.6          1.4         0.2 setosa            -1.02
#>  6          5.4         3.9          1.7         0.4 setosa            -0.535
#>  7          4.6         3.4          1.4         0.3 setosa            -1.50
#>  8          5           3.4          1.5         0.2 setosa            -1.02
#>  9          4.4         2.9          1.4         0.2 setosa            -1.74
#> 10          4.9         3.1          1.5         0.1 setosa            -1.14
#> # … with 140 more rows, and 7 more variables: Sepal.Width_scale2 <dbl>,
#> #   Petal.Length_scale2 <dbl>, Petal.Width_scale2 <dbl>,
#> #   Sepal.Length_log <dbl>, Sepal.Width_log <dbl>, Petal.Length_log <dbl>,
#> #   Petal.Width_log <dbl>

# Note how the new variables include the function name, in order to
# keep things distinct. The default names are not always helpful
# but you can also supply explicit names:
iris %>% mutate_if(is.numeric, list(scale = scale2, log = log))
#> # A tibble: 150 x 13
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_sc…
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>              <dbl>
#>  1          5.1         3.5          1.4         0.2 setosa            -0.898
#>  2          4.9         3            1.4         0.2 setosa            -1.14
#>  3          4.7         3.2          1.3         0.2 setosa            -1.38
#>  4          4.6         3.1          1.5         0.2 setosa            -1.50
#>  5          5           3.6          1.4         0.2 setosa            -1.02
#>  6          5.4         3.9          1.7         0.4 setosa            -0.535
#>  7          4.6         3.4          1.4         0.3 setosa            -1.50
#>  8          5           3.4          1.5         0.2 setosa            -1.02
#>  9          4.4         2.9          1.4         0.2 setosa            -1.74
#> 10          4.9         3.1          1.5         0.1 setosa            -1.14
#> # … with 140 more rows, and 7 more variables: Sepal.Width_scale <dbl>,
#> #   Petal.Length_scale <dbl>, Petal.Width_scale <dbl>, Sepal.Length_log <dbl>,
#> #   Sepal.Width_log <dbl>, Petal.Length_log <dbl>, Petal.Width_log <dbl>

# When there's only one function in the list, it modifies existing
# variables in place. Give it a name to instead create new variables:
iris %>% mutate_if(is.numeric, list(scale2))
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>
#>  1       -0.898      1.02          -1.34       -1.31 setosa
#>  2       -1.14      -0.132         -1.34       -1.31 setosa
#>  3       -1.38       0.327         -1.39       -1.31 setosa
#>  4       -1.50       0.0979        -1.28       -1.31 setosa
#>  5       -1.02       1.25          -1.34       -1.31 setosa
#>  6       -0.535      1.93          -1.17       -1.05 setosa
#>  7       -1.50       0.786         -1.34       -1.18 setosa
#>  8       -1.02       0.786         -1.28       -1.31 setosa
#>  9       -1.74      -0.361         -1.34       -1.31 setosa
#> 10       -1.14       0.0979        -1.28       -1.44 setosa
#> # … with 140 more rows
iris %>% mutate_if(is.numeric, list(scale = scale2))
#> # A tibble: 150 x 9
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length_sc…
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>              <dbl>
#>  1          5.1         3.5          1.4         0.2 setosa            -0.898
#>  2          4.9         3            1.4         0.2 setosa            -1.14
#>  3          4.7         3.2          1.3         0.2 setosa            -1.38
#>  4          4.6         3.1          1.5         0.2 setosa            -1.50
#>  5          5           3.6          1.4         0.2 setosa            -1.02
#>  6          5.4         3.9          1.7         0.4 setosa            -0.535
#>  7          4.6         3.4          1.4         0.3 setosa            -1.50
#>  8          5           3.4          1.5         0.2 setosa            -1.02
#>  9          4.4         2.9          1.4         0.2 setosa            -1.74
#> 10          4.9         3.1          1.5         0.1 setosa            -1.14
#> # … with 140 more rows, and 3 more variables: Sepal.Width_scale <dbl>,
#> #   Petal.Length_scale <dbl>, Petal.Width_scale <dbl>