If a variable contains observations with multiple delimited values, separate_rows() splits the values and places each one in its own row.

separate_rows(data, ..., sep = "[^[:alnum:].]+", convert = FALSE)

Arguments

data

A data frame.

...

A selection of columns. If empty, nothing happens. You can supply bare variable names, select all variables between x and z with x:z, exclude y with -y. For more selection options, see the dplyr::select() documentation.
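As a rough illustration of the selection forms described above, the sketch below selects the same two columns in three ways. The data frame df and its columns x, y and z are made up for this example.

```r
library(tidyr)

# Hypothetical data: y and z hold comma-delimited values,
# with the same number of pieces in each row.
df <- data.frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h"),
  z = c("1", "2,3,4", "5,6"),
  stringsAsFactors = FALSE
)

by_name  <- separate_rows(df, y, z)  # bare variable names
by_range <- separate_rows(df, y:z)   # every column from y through z
by_drop  <- separate_rows(df, -x)    # every column except x
```

All three calls should select y and z here. Note that columns separated together are expected to contain the same number of pieces in each row.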

sep

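Separator delimiting collapsed values, interpreted as a regular expression. The default matches any run of one or more characters that are neither alphanumeric nor a period.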
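Because the default pattern also matches whitespace, values that contain spaces may need a narrower separator. A minimal sketch, assuming a made-up data frame trips whose city column is delimited by semicolons:

```r
library(tidyr)

# Hypothetical data: city names contain spaces, which the default sep would split on.
trips <- data.frame(
  id = 1:2,
  city = c("New York; Los Angeles", "Boston"),
  stringsAsFactors = FALSE
)

# Split only on a semicolon followed by optional whitespace.
out <- separate_rows(trips, city, sep = ";\\s*")
out$city
#> [1] "New York"    "Los Angeles" "Boston"
```

With the default sep, "New York" would instead be broken into "New" and "York".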

convert

If TRUE, automatically runs type.convert() on the separated columns. This is useful if the separated values are actually numeric, integer, or logical.
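The effect of convert can be checked by comparing column classes. The sketch below uses a made-up data frame scores. The separated values stay character by default and become integer when convert = TRUE.

```r
library(tidyr)

# Hypothetical data: n holds comma-delimited integers stored as text.
scores <- data.frame(id = 1:2, n = c("1,2", "3"), stringsAsFactors = FALSE)

as_text <- separate_rows(scores, n)
as_int  <- separate_rows(scores, n, convert = TRUE)

class(as_text$n)
#> [1] "character"
class(as_int$n)
#> [1] "integer"
```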

Rules for selection

Arguments for selecting columns are passed to tidyselect::vars_select() and are treated specially. Unlike other verbs, selecting functions make a strict distinction between data expressions and context expressions.

  • A data expression is either a bare name like x or an expression like x:y or c(x, y). In a data expression, you can only refer to columns from the data frame.

  • Everything else is a context expression in which you can only refer to objects that you have defined with <-.

For instance, col1:col3 is a data expression that refers to data columns, while seq(start, end) is a context expression that refers to objects from the context.

If you really need to refer to contextual objects from a data expression, you can unquote them with the tidy eval operator !!. This operator evaluates its argument in the context and inlines the result in the surrounding function call. For instance, c(x, !! x) selects the x column within the data frame and the column referred to by the object x defined in the context (which can contain either a column name as string or a column position).
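A minimal sketch of unquoting, assuming a made-up data frame df and a context object target that holds a column name as a string (both names are hypothetical):

```r
library(tidyr)

# Hypothetical data: y holds comma-delimited values.
df <- data.frame(x = 1:2, y = c("a,b", "c"), stringsAsFactors = FALSE)

# Context object, defined with <-, holding the name of the column to split.
target <- "y"

# !! evaluates target in the context and inlines "y" into the selection.
out <- separate_rows(df, !!target)
out$y
#> [1] "a" "b" "c"
```

This pattern can be handy when the column to split is chosen programmatically, for example inside a wrapper function.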

Examples

df <- data.frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h"),
  z = c("1", "2,3,4", "5,6"),
  stringsAsFactors = FALSE
)
separate_rows(df, y, z, convert = TRUE)
#>   x y z
#> 1 1 a 1
#> 2 2 d 2
#> 3 2 e 3
#> 4 2 f 4
#> 5 3 g 5
#> 6 3 h 6