Maturing lifecycle

Chopping and unchopping preserve the width of a data frame, changing its length. chop() makes df shorter by converting rows within each group into list-columns. unchop() makes df longer by expanding list-columns so that each element of the list-column gets its own row in the output.
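As a rough illustration of the paragraph above, the sketch below chops a small tibble and then unchops it again. The data and the column names (x, y, z) are made up for the example, and it assumes tidyr is installed (tidyr re-exports tibble() and %>%).

```r
library(tidyr)  # also re-exports tibble() and %>%

# Hypothetical example data: two groups in x
df <- tibble(x = c(1, 1, 2), y = 1:3, z = c("a", "b", "c"))

# chop(): one row per unique value of x; y and z become list-columns
chopped <- df %>% chop(c(y, z))

# unchop(): expand y and z in parallel, one row per list element
restored <- chopped %>% unchop(c(y, z))

dim(chopped)   # fewer rows, same number of columns
dim(restored)  # same shape as df
```

Both steps leave the number of columns unchanged; only the number of rows differs.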

chop(data, cols)

unchop(data, cols, keep_empty = FALSE, ptype = NULL)

Arguments

data

A data frame.

cols

Columns to chop or unchop (automatically quoted).

For unchop(), each of these should be a list-column containing generalised vectors (e.g. any mix of NULLs, atomic vectors, S3 vectors, lists, or data frames).

keep_empty

By default, you get one row of output for each element of the list you're unchopping/unnesting. This means that if there's a size-0 element (like NULL or an empty data frame), that entire row will be dropped from the output. If you want to preserve all rows, use keep_empty = TRUE to replace size-0 elements with a single row of missing values.

ptype

Optionally, supply a data frame prototype for the output cols, overriding the default that will be guessed from the combination of individual values.
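The keep_empty behaviour described above can be sketched with a list-column that contains a NULL element. The tibble and its column names (id, vals) are invented for illustration, and the sketch assumes tidyr is installed.

```r
library(tidyr)  # also re-exports tibble() and %>%

# Hypothetical data: the second element of vals has size 0
df <- tibble(id = 1:3, vals = list(1:2, NULL, 3L))

# Default: the row for id 2 is dropped because its element is empty
dropped <- df %>% unchop(vals)

# keep_empty = TRUE: id 2 is kept, with a missing value in vals
kept <- df %>% unchop(vals, keep_empty = TRUE)

dropped$id
kept$id
```

Use keep_empty = TRUE when every input row must be represented in the output, for example before joining back to the original data.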

Details

Generally, unchopping is more useful than chopping because it simplifies a complex data structure, and nest()ing is usually more appropriate than chop()ing since it better preserves the connections between observations.

Examples

# Chop ==============================================================
df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1)
# Note that we get one row of output for each unique combination of
# non-chopped variables
df %>% chop(c(y, z))
#> # A tibble: 3 x 3
#>       x y         z
#>   <dbl> <list>    <list>
#> 1     1 <int [3]> <int [3]>
#> 2     2 <int [2]> <int [2]>
#> 3     3 <int [1]> <int [1]>
# cf nest
df %>% nest(data = c(y, z))
#> # A tibble: 3 x 2
#>       x data
#>   <dbl> <list<df[,2]>>
#> 1     1 [3 × 2]
#> 2     2 [2 × 2]
#> 3     3 [1 × 2]
# Unchop ============================================================
df <- tibble(x = 1:4, y = list(integer(), 1L, 1:2, 1:3))
df %>% unchop(y)
#> # A tibble: 6 x 2
#>       x     y
#>   <int> <int>
#> 1     2     1
#> 2     3     1
#> 3     3     2
#> 4     4     1
#> 5     4     2
#> 6     4     3
df %>% unchop(y, keep_empty = TRUE)
#> # A tibble: 7 x 2
#>       x     y
#>   <int> <int>
#> 1     1    NA
#> 2     2     1
#> 3     3     1
#> 4     3     2
#> 5     4     1
#> 6     4     2
#> 7     4     3
# Incompatible types -------------------------------------------------
# If the list-col contains types that cannot be natively combined,
# unchop() fails unless you supply a ptype
df <- tibble(x = 1:2, y = list("1", 1:3))
try(df %>% unchop(y))
#> Error : No common type for `..1$y` <character> and `..2$y` <integer>.
df %>% unchop(y, ptype = tibble(y = integer()))
#> # A tibble: 4 x 2
#>       x     y
#>   <int> <int>
#> 1     1     1
#> 2     2     1
#> 3     2     2
#> 4     2     3
df %>% unchop(y, ptype = tibble(y = character()))
#> # A tibble: 4 x 2
#>       x y
#>   <int> <chr>
#> 1     1 1
#> 2     2 1
#> 3     2 2
#> 4     2 3
df %>% unchop(y, ptype = tibble(y = list()))
#> # A tibble: 4 x 2
#>       x y
#>   <int> <list>
#> 1     1 <chr [1]>
#> 2     2 <int [1]>
#> 3     2 <int [1]>
#> 4     2 <int [1]>
# Unchopping data frames ---------------------------------------------
# Unchopping a list-col of data frames must generate a df-col because
# unchop leaves the column names unchanged
df <- tibble(x = 1:3, y = list(NULL, tibble(x = 1), tibble(y = 1:2)))
df %>% unchop(y)
#> # A tibble: 3 x 2
#>       x   y$x    $y
#>   <int> <dbl> <int>
#> 1     2     1    NA
#> 2     3    NA     1
#> 3     3    NA     2
df %>% unchop(y, keep_empty = TRUE)
#> # A tibble: 4 x 2
#>       x   y$x    $y
#>   <int> <dbl> <int>
#> 1     1    NA    NA
#> 2     2     1    NA
#> 3     3    NA     1
#> 4     3    NA     2