Convenience function to paste together multiple columns into one.

unite(data, col, ..., sep = "_", remove = TRUE, na.rm = FALSE)

Arguments

data

A data frame.

col

The name of the new column, as a string or symbol.

This argument is passed by expression and supports quasiquotation (you can unquote strings and symbols). The name is captured from the expression with rlang::ensym(). Note that this kind of interface, where symbols do not represent actual objects, is now discouraged in the tidyverse; it is supported here for backward compatibility.

...

A selection of columns. If empty, all variables are selected. You can supply bare variable names, select all variables between x and z with x:z, exclude y with -y. For more options, see the dplyr::select() documentation. See also the section on selection rules below.

sep

Separator to use between values.

remove

If TRUE, remove input columns from output data frame.

na.rm

If TRUE, missing values will be removed prior to uniting each value.
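
The arguments above can be illustrated with a minimal sketch. The data frame and the variable new_name below are hypothetical and chosen only for illustration; the sketch assumes tidyr is installed and shows col given as a string, as a symbol, and unquoted with !!, together with a custom sep and a negative selection.

```r
library(tidyr)

# Hypothetical example data, not taken from this page.
df <- data.frame(
  x = c("a", "b"),
  y = c("c", "d"),
  w = c("e", "f"),
  stringsAsFactors = FALSE
)

# `col` as a string, with a custom separator.
# The input columns x and y are dropped because remove = TRUE by default.
res_string <- unite(df, "z", x, y, sep = "-")

# `col` as a bare symbol names the same new column.
res_symbol <- unite(df, z, x, y, sep = "-")

# `col` unquoted from a string stored in a variable (hypothetical name).
new_name <- "z"
res_unquoted <- unite(df, !!new_name, x, y, sep = "-")

# Negative selection: unite every column except w.
res_negative <- unite(df, "z", -w, sep = "-")
```

All four calls should produce a column z built from x and y, leaving w untouched.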

Rules for selection

Arguments for selecting columns are passed to tidyselect::vars_select() and are treated specially. Unlike other verbs, selecting functions make a strict distinction between data expressions and context expressions.

  • A data expression is either a bare name like x or an expression like x:y or c(x, y). In a data expression, you can only refer to columns from the data frame.

  • Everything else is a context expression in which you can only refer to objects that you have defined with <-.

For instance, col1:col3 is a data expression that refers to data columns, while seq(start, end) is a context expression that refers to objects from the context.

If you really need to refer to contextual objects from a data expression, you can unquote them with the tidy eval operator !!. This operator evaluates its argument in the context and inlines the result in the surrounding function call. For instance, c(x, !! x) selects the x column within the data frame and the column referred to by the object x defined in the context (which can contain either a column name as string or a column position).
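
As a rough sketch of unquoting inside a selection, the following mixes a data expression with a context object. The data frame and the object y_col are hypothetical; a distinct name is used for the context object here, rather than reusing x as in the text, only to keep the two roles easy to tell apart.

```r
library(tidyr)

# Hypothetical example data, not taken from this page.
df <- data.frame(
  x = c("a", "b"),
  y = c("c", "d"),
  w = c("e", "f"),
  stringsAsFactors = FALSE
)

# A context object holding a column name as a string (hypothetical name).
y_col <- "y"

# `x` is a data expression referring to column x.
# `!!y_col` is evaluated in the context and inlined as the string "y".
res <- unite(df, "z", c(x, !!y_col))
```

Here the selection resolves to columns x and y, so w is left out of the united column.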

See also

separate(), the complement.

Examples

df <- expand_grid(x = c("a", NA), y = c("b", NA))
df
#> # A tibble: 4 x 2
#>   x     y
#>   <chr> <chr>
#> 1 a     b
#> 2 a     <NA>
#> 3 <NA>  b
#> 4 <NA>  <NA>

df %>% unite("z", x:y, remove = FALSE)
#> # A tibble: 4 x 3
#>   z     x     y
#>   <chr> <chr> <chr>
#> 1 a_b   a     b
#> 2 a_NA  a     <NA>
#> 3 NA_b  <NA>  b
#> 4 NA_NA <NA>  <NA>

# To remove missing values:
df %>% unite("z", x:y, na.rm = TRUE, remove = FALSE)
#> # A tibble: 4 x 3
#>   z     x     y
#>   <chr> <chr> <chr>
#> 1 "a_b" a     b
#> 2 "a"   a     <NA>
#> 3 "b"   <NA>  b
#> 4 ""    <NA>  <NA>

# Separate is almost the complement of unite
df %>%
  unite("xy", x:y) %>%
  separate(xy, c("x", "y"))
#> # A tibble: 4 x 2
#>   x     y
#>   <chr> <chr>
#> 1 a     b
#> 2 a     NA
#> 3 NA    b
#> 4 NA    NA
# (but note `x` and `y` contain now "NA" not NA)