If a variable contains observations with multiple delimited values, separate_rows() separates the values and places each one in its own row.
separate_rows(data, ..., sep = "[^[:alnum:].]+", convert = FALSE)
Argument | Description |
---|---|
data | A data frame. |
... | A selection of columns. If empty, nothing happens. You can supply bare variable names, select all variables between |
sep | Separator delimiting collapsed values. |
convert | If |
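
As a rough illustration of `sep` and `convert`, the sketch below uses a made-up data frame (`readings` and its column names are hypothetical). It assumes that `sep` is interpreted as a regular expression, so the default pattern splits on any run of characters that are neither alphanumeric nor a period, and that `convert = TRUE` applies type conversion to the separated pieces.

```r
library(tidyr)

# Hypothetical data: `value` packs several decimals into one string,
# `tag` uses ";" as its delimiter.
readings <- data.frame(
  id = 1:2,
  value = c("1.5, 2.5", "3"),
  tag = c("a b;c", "d"),
  stringsAsFactors = FALSE
)

# Default `sep`: the ", " run counts as one separator and periods are kept,
# so "1.5, 2.5" becomes the two pieces "1.5" and "2.5".
by_value <- separate_rows(readings, value, convert = TRUE)

# Explicit `sep`: split only on ";", so "a b" stays in one piece.
by_tag <- separate_rows(readings, tag, sep = ";")
```

Passing an explicit `sep` is a reasonable choice whenever the values themselves may contain spaces or punctuation that the default pattern would also split on.
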
Arguments for selecting columns are passed to `tidyselect::vars_select()` and are treated specially. Unlike other verbs, selecting functions make a strict distinction between data expressions and context expressions.
A data expression is either a bare name like `x` or an expression like `x:y` or `c(x, y)`. In a data expression, you can only refer to columns from the data frame.
Everything else is a context expression in which you can only refer to objects that you have defined with `<-`.
For instance, `col1:col3` is a data expression that refers to data columns, while `seq(start, end)` is a context expression that refers to objects from the context.
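
The distinction can be sketched with a small data frame. In this example `start` and `end` are hypothetical objects defined in the calling context, and the sketch assumes they hold column positions; it is meant as an illustration of the two kinds of expression, not as the only way to write the selection.

```r
library(tidyr)

df <- data.frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h"),
  z = c("1", "2,3,4", "5,6"),
  stringsAsFactors = FALSE
)

# Data expression: `y:z` names columns of `df` directly.
by_name <- separate_rows(df, y:z)

# Context expression: `start` and `end` are ordinary objects (hypothetical
# names) holding column positions, and `seq()` is evaluated outside the data.
start <- 2
end <- 3
by_position <- separate_rows(df, seq(start, end))
```

Both calls should select the `y` and `z` columns, so the two results are expected to match.
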
If you really need to refer to contextual objects from a data expression, you can unquote them with the tidy eval operator `!!`. This operator evaluates its argument in the context and inlines the result in the surrounding function call. For instance, `c(x, !! x)` selects the `x` column within the data frame and the column referred to by the object `x` defined in the context (which can contain either a column name as a string or a column position).
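
A minimal sketch of unquoting, assuming the same kind of data frame as in the page's own example. The objects `target` and `col_pos` are hypothetical names introduced here for illustration; one holds a column name as a string, the other a column position.

```r
library(tidyr)

df <- data.frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h"),
  z = c("1", "2,3,4", "5,6"),
  stringsAsFactors = FALSE
)

# `target` lives in the context and holds a column name as a string.
target <- "y"
by_string <- separate_rows(df, !!target)

# `col_pos` lives in the context and holds a column position (3 is `z`).
col_pos <- 3
by_position <- separate_rows(df, !!col_pos)
```

In each call only the unquoted column is separated; the other delimited column is left as it was and its values are repeated across the new rows.
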
df <- data.frame(
  x = 1:3,
  y = c("a", "d,e,f", "g,h"),
  z = c("1", "2,3,4", "5,6"),
  stringsAsFactors = FALSE
)
separate_rows(df, y, z, convert = TRUE)
#>   x y z
#> 1 1 a 1
#> 2 2 d 2
#> 3 2 e 3
#> 4 2 f 4
#> 5 3 g 5
#> 6 3 h 6