R/separate.R
separate.Rd
Given either a regular expression or a vector of character positions, separate() turns a single character column into multiple columns.
separate(data, col, into, sep = "[^[:alnum:]]+", remove = TRUE, convert = FALSE, extra = "warn", fill = "warn", ...)
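As a minimal sketch of the default behaviour described above, the following splits one character column into two. The data frame and the names period, year and month are hypothetical and chosen only for illustration; the call relies on the default sep, which matches any run of non-alphanumeric characters.

```r
library(tidyr)

# Hypothetical example data: one character column holding "year-month" strings
df <- data.frame(period = c("2020-01", "2021-12"), stringsAsFactors = FALSE)

# No sep is given, so the default regular expression is used and "-" is
# treated as the separator
out <- separate(df, period, into = c("year", "month"))
out
#>   year month
#> 1 2020    01
#> 2 2021    12
```

Both new columns are character vectors here, because convert is left at its default of FALSE.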
Argument | Description |
---|---|
data | A data frame. |
col | Column name or position. This is passed to
This argument is passed by expression and supports quasiquotation (you can unquote column names or column positions). |
into | Names of new variables to create as a character vector.
Use |
sep | Separator between columns. If character, sep is interpreted as a regular expression. The default value is a regular expression that matches any sequence of non-alphanumeric characters. If numeric, sep is interpreted as character positions to split at. Positive values start
at 1 at the far-left of the string; negative values start at -1 at the
far-right of the string. The length of |
remove | If |
convert | If NB: this will cause string |
extra | If
|
fill | If
|
... | Additional arguments passed on to methods. |
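The numeric form of sep described in the table can be illustrated with a short sketch. The data frame and the column names head, tail, a, b and c are hypothetical; the example assumes fixed-width strings so that the split positions are easy to follow.

```r
library(tidyr)

# Hypothetical fixed-width strings, six characters each
df <- data.frame(x = c("abcdef", "uvwxyz"), stringsAsFactors = FALSE)

# Positive position: split after the 2nd character, counting from the left
left <- separate(df, x, into = c("head", "tail"), sep = 2)
# left$head is "ab" "uv"; left$tail is "cdef" "wxyz"

# Negative position: split 2 characters before the right-hand end
right <- separate(df, x, into = c("head", "tail"), sep = -2)
# right$head is "abcd" "uvwx"; right$tail is "ef" "yz"

# Two positions produce three pieces
three <- separate(df, x, into = c("a", "b", "c"), sep = c(2, 4))
# three$a is "ab" "uv"; three$b is "cd" "wx"; three$c is "ef" "yz"
```

In each call the number of names in into is one more than the number of split positions in sep.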
#>      A    B
#> 1 <NA> <NA>
#> 2    a    b
#> 3    a    d
#> 4    b    c
#>      B
#> 1 <NA>
#> 2    b
#> 3    d
#> 4    c

# If every row doesn't split into the same number of pieces, use
# the extra and fill arguments to control what happens
df <- data.frame(x = c("a", "a b", "a b c", NA))
df %>% separate(x, c("a", "b"))
#> Warning: Expected 2 pieces. Additional pieces discarded in 1 rows [3].
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [1].
#>      a    b
#> 1    a <NA>
#> 2    a    b
#> 3    a    b
#> 4 <NA> <NA>

# The same behaviour as above, but drops the c without warnings
df %>% separate(x, c("a", "b"), extra = "drop", fill = "right")
#>      a    b
#> 1    a <NA>
#> 2    a    b
#> 3    a    b
#> 4 <NA> <NA>

df %>% separate(x, c("a", "b"), extra = "merge", fill = "left")
#>      a    b
#> 1 <NA>    a
#> 2    a    b
#> 3    a  b c
#> 4 <NA> <NA>

df %>% separate(x, c("a", "b", "c"))
#> Warning: Expected 3 pieces. Missing pieces filled with `NA` in 2 rows [1, 2].
#>      a    b    c
#> 1    a <NA> <NA>
#> 2    a    b <NA>
#> 3    a    b    c
#> 4 <NA> <NA> <NA>

# If you only want to split a specified number of times, use extra = "merge"
df <- data.frame(x = c("x: 123", "y: error: 7"))
df %>% separate(x, c("key", "value"), ": ", extra = "merge")
#>   key    value
#> 1   x      123
#> 2   y error: 7

# Use regular expressions to separate on multiple characters:
df <- data.frame(x = c(NA, "a?b", "a.d", "b:c"))
df %>% separate(x, c("A","B"), sep = "([\\.\\?\\:])")
#>      A    B
#> 1 <NA> <NA>
#> 2    a    b
#> 3    a    d
#> 4    b    c

# convert = TRUE detects column classes
df <- data.frame(x = c("a:1", "a:2", "c:4", "d", NA))
df %>% separate(x, c("key","value"), ":") %>% str
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [4].
#> 'data.frame': 5 obs. of 2 variables:
#>  $ key  : chr "a" "a" "c" "d" ...
#>  $ value: chr "1" "2" "4" NA ...
df %>% separate(x, c("key","value"), ":", convert = TRUE) %>% str
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [4].
#> 'data.frame': 5 obs. of 2 variables:
#>  $ key  : chr "a" "a" "c" "d" ...
#>  $ value: int 1 2 4 NA NA

# Argument col can take quasiquotation to work with strings
var <- "x"
df %>% separate(!!var, c("key","value"), ":")
#> Warning: Expected 2 pieces. Missing pieces filled with `NA` in 1 rows [4].
#>    key value
#> 1    a     1
#> 2    a     2
#> 3    c     4
#> 4    d  <NA>
#> 5 <NA>  <NA>