Chop and unchop

Chopping and unchopping preserve the width of a data frame, changing its length. chop() makes df shorter by converting rows within each group into list-columns. unchop() makes df longer by expanding list-columns so that each element of the list-column gets its own row in the output.

chop(data, cols)

unchop(data, cols, keep_empty = FALSE, ptype = NULL)

Arguments

data	A data frame.
cols	Columns to chop or unchop (automatically quoted). For unchop(), each column should be a list-column containing generalised vectors (e.g. any mix of `NULL`s, atomic vectors, S3 vectors, lists, or data frames).
keep_empty	By default, you get one row of output for each element of the list you are unchopping/unnesting. This means that if there's a size-0 element (like `NULL` or an empty data frame), that entire row will be dropped from the output. If you want to preserve all rows, use `keep_empty = TRUE` to replace size-0 elements with a single row of missing values.
ptype	Optionally, supply a data frame prototype for the output `cols`, overriding the default that will be guessed from the combination of individual values.

Details

Generally, unchopping is more useful than chopping because it simplifies a complex data structure, and nest()ing is usually more appropriate than chop()ing since it better preserves the connections between observations.
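The point about connections between observations can be sketched as follows. This is a minimal illustration, not part of the package examples: the toy data frame and the names `chopped`, `both`, `only_y`, and `nested` are assumptions introduced here.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data, chosen only for illustration
df <- tibble(x = c(1, 1, 2), y = 1:3, z = 3:1)

# chop() stores y and z as two separate list-columns
chopped <- df %>% chop(c(y, z))

# Unchopping both columns in one call expands them in parallel,
# giving one row per original observation
both <- chopped %>% unchop(c(y, z))

# Unchopping only y expands y but leaves z as a list-column,
# so each row now carries the whole z vector for its group
only_y <- chopped %>% unchop(y)

# nest() keeps y and z together in one data frame per group
nested <- df %>% nest(data = c(y, z))
```

After chop(), nothing in the structure forces `y` and `z` to be expanded together, whereas nest() keeps each pair of values in the same row of the nested data frame.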

Examples

# Chop ==============================================================
df <- tibble(x = c(1, 1, 1, 2, 2, 3), y = 1:6, z = 6:1)
# Note that we get one row of output for each unique combination of
# non-chopped variables
df %>% chop(c(y, z))
#> # A tibble: 3 x 3
#>       x y         z        
#>   <dbl> <list>    <list>   
#> 1     1 <int [3]> <int [3]>
#> 2     2 <int [2]> <int [2]>
#> 3     3 <int [1]> <int [1]>
# cf nest
df %>% nest(data = c(y, z))
#> # A tibble: 3 x 2
#>       x           data
#>   <dbl> <list<df[,2]>>
#> 1     1        [3 × 2]
#> 2     2        [2 × 2]
#> 3     3        [1 × 2]

# Unchop ============================================================
df <- tibble(x = 1:4, y = list(integer(), 1L, 1:2, 1:3))
df %>% unchop(y)
#> # A tibble: 6 x 2
#>       x     y
#>   <int> <int>
#> 1     2     1
#> 2     3     1
#> 3     3     2
#> 4     4     1
#> 5     4     2
#> 6     4     3
df %>% unchop(y, keep_empty = TRUE)
#> # A tibble: 7 x 2
#>       x     y
#>   <int> <int>
#> 1     1    NA
#> 2     2     1
#> 3     3     1
#> 4     3     2
#> 5     4     1
#> 6     4     2
#> 7     4     3

# Incompatible types -------------------------------------------------
# If the list-col contains types that can not be natively combined,
# unchop() fails unless you supply a ptype
df <- tibble(x = 1:2, y = list("1", 1:3))
try(df %>% unchop(y))
#> Error : No common type for `..1$y` <character> and `..2$y` <integer>.
df %>% unchop(y, ptype = tibble(y = integer()))
#> # A tibble: 4 x 2
#>       x     y
#>   <int> <int>
#> 1     1     1
#> 2     2     1
#> 3     2     2
#> 4     2     3
df %>% unchop(y, ptype = tibble(y = character()))
#> # A tibble: 4 x 2
#>       x y    
#>   <int> <chr>
#> 1     1 1    
#> 2     2 1    
#> 3     2 2    
#> 4     2 3    
df %>% unchop(y, ptype = tibble(y = list()))
#> # A tibble: 4 x 2
#>       x y        
#>   <int> <list>   
#> 1     1 <chr [1]>
#> 2     2 <int [1]>
#> 3     2 <int [1]>
#> 4     2 <int [1]>

# Unchopping data frames -----------------------------------------------------
# Unchopping a list-col of data frames must generate a df-col because
# unchop leaves the column names unchanged
df <- tibble(x = 1:3, y = list(NULL, tibble(x = 1), tibble(y = 1:2)))
df %>% unchop(y)
#> # A tibble: 3 x 2
#>       x   y$x    $y
#>   <int> <dbl> <int>
#> 1     2     1    NA
#> 2     3    NA     1
#> 3     3    NA     2
df %>% unchop(y, keep_empty = TRUE)
#> # A tibble: 4 x 2
#>       x   y$x    $y
#>   <int> <dbl> <int>
#> 1     1    NA    NA
#> 2     2     1    NA
#> 3     3    NA     1
#> 4     3    NA     2
