Tidying methods for rowwise_dfs from dplyr, for tidying each row and recombining the results

Rowwise tidiers are deprecated and will be removed from an upcoming version of broom. We strongly recommend moving to a nest-map-unnest workflow over a rowwise-do workflow. See the vignettes for examples.

# S3 method for rowwise_df
tidy(x, object, ...)

# S3 method for rowwise_df
tidy_(x, object, ...)

# S3 method for rowwise_df
augment(x, object, ...)

# S3 method for rowwise_df
augment_(x, object, ...)

# S3 method for rowwise_df
glance(x, object, ...)

# S3 method for rowwise_df
glance_(x, object, ...)

# S3 method for tbl_df
tidy(x, ...)

# S3 method for tbl_df
augment(x, ...)

# S3 method for tbl_df
glance(x, ...)

Arguments

x	a rowwise_df
object	the name of the column containing the models to be tidied. For tidy, augment, and glance, supply the bare (unquoted) name; for the underscore-suffixed methods (tidy_, augment_, glance_), supply it as a quoted string.
...	additional arguments to pass on to the respective tidying method

Value

A "grouped_df", where the non-list columns of the original are used as grouping columns alongside the tidied outputs.

Details

These tidy, augment and glance methods are for performing tidying on each row of a rowwise data frame created by dplyr's group_by and do operations. They first group a rowwise data frame based on all columns that are not lists, then perform the tidying operation on the specified column. This greatly shortens a common idiom of extracting tidy/augment/glance outputs after a do statement.
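
For comparison, the longer idiom that these methods shorten might look roughly like the sketch below. It assumes dplyr and broom are installed and uses only do() and the lm method of tidy, not the rowwise tidiers themselves; the name longhand is illustrative and does not come from the package.

```r
library(dplyr)
library(broom)

# one lm per cylinder group, stored in the list column "mod"
regressions <- mtcars %>%
    group_by(cyl) %>%
    do(mod = lm(mpg ~ wt, data = .))

# longhand: tidy each row's model and re-attach the non-list column by hand
longhand <- regressions %>%
    do(data.frame(cyl = .$cyl, tidy(.$mod)))

longhand
```

The rowwise tidiers do the re-attachment step automatically for every non-list column, which is why only the model column has to be named.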

Note that this functionality is not currently implemented for data.tables, since the result of the do operation is difficult to distinguish from a regular data.table.

Examples


library(dplyr)
regressions <- mtcars %>%
    group_by(cyl) %>%
    do(mod = lm(mpg ~ wt, .))

regressions
#> Source: local data frame [3 x 2]
#> Groups: <by row>
#> 
#> # A tibble: 3 x 2
#>     cyl mod   
#> * <dbl> <list>
#> 1     4 <lm>  
#> 2     6 <lm>  
#> 3     8 <lm>  

regressions %>% tidy(mod)
#> # A tibble: 6 x 6
#> # Groups:   cyl [3]
#>     cyl term        estimate std.error statistic    p.value
#>   <dbl> <chr>          <dbl>     <dbl>     <dbl>      <dbl>
#> 1     4 (Intercept)    39.6      4.35       9.10 0.00000777
#> 2     4 wt             -5.65     1.85      -3.05 0.0137    
#> 3     6 (Intercept)    28.4      4.18       6.79 0.00105   
#> 4     6 wt             -2.78     1.33      -2.08 0.0918    
#> 5     8 (Intercept)    23.9      3.01       7.94 0.00000405
#> 6     8 wt             -2.19     0.739     -2.97 0.0118    
regressions %>% augment(mod)
#> # A tibble: 32 x 10
#> # Groups:   cyl [3]
#>      cyl   mpg    wt .fitted .se.fit  .resid   .hat .sigma   .cooksd .std.resid
#>    <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>  <dbl>  <dbl>     <dbl>      <dbl>
#>  1     4  22.8  2.32    26.5    1.01 -3.67   0.0913   3.26 0.0670       -1.16  
#>  2     4  24.4  3.19    21.6    1.95  2.84   0.343    3.31 0.289         1.05  
#>  3     4  22.8  3.15    21.8    1.89  1.02   0.321    3.51 0.0325        0.370 
#>  4     4  32.4  2.2     27.1    1.02  5.25   0.0932   2.95 0.141         1.66  
#>  5     4  30.4  1.62    30.5    1.60 -0.0513 0.230    3.53 0.0000457    -0.0175
#>  6     4  33.9  1.84    29.2    1.31  4.69   0.154    3.04 0.212         1.53  
#>  7     4  21.5  2.46    25.7    1.06 -4.15   0.101    3.18 0.0968       -1.31  
#>  8     4  27.3  1.94    28.6    1.20 -1.34   0.129    3.50 0.0138       -0.432 
#>  9     4  26    2.14    27.5    1.04 -1.49   0.0975   3.49 0.0119       -0.470 
#> 10     4  30.4  1.51    31.0    1.75 -0.627  0.275    3.52 0.00927      -0.221 
#> # … with 22 more rows
regressions %>% glance(mod)
#> # A tibble: 3 x 12
#> # Groups:   cyl [3]
#>     cyl r.squared adj.r.squared sigma statistic p.value    df logLik   AIC   BIC
#>   <dbl>     <dbl>         <dbl> <dbl>     <dbl>   <dbl> <int>  <dbl> <dbl> <dbl>
#> 1     4     0.509         0.454  3.33      9.32  0.0137     2 -27.7   61.5  62.7
#> 2     6     0.465         0.357  1.17      4.34  0.0918     2  -9.83  25.7  25.5
#> 3     8     0.423         0.375  2.02      8.80  0.0118     2 -28.7   63.3  65.2
#> # … with 2 more variables: deviance <dbl>, df.residual <int>

# we can provide additional arguments to the tidying function
regressions %>% tidy(mod, conf.int = TRUE)
#> # A tibble: 6 x 8
#> # Groups:   cyl [3]
#>     cyl term        estimate std.error statistic    p.value conf.low conf.high
#>   <dbl> <chr>          <dbl>     <dbl>     <dbl>      <dbl>    <dbl>     <dbl>
#> 1     4 (Intercept)    39.6      4.35       9.10 0.00000777    29.7     49.4  
#> 2     4 wt             -5.65     1.85      -3.05 0.0137        -9.83    -1.46 
#> 3     6 (Intercept)    28.4      4.18       6.79 0.00105       17.7     39.2  
#> 4     6 wt             -2.78     1.33      -2.08 0.0918        -6.21     0.651
#> 5     8 (Intercept)    23.9      3.01       7.94 0.00000405    17.3     30.4  
#> 6     8 wt             -2.19     0.739     -2.97 0.0118        -3.80    -0.582

# we can also keep each group's original data in a list column (here,
# "original") and pass it to augment as the "data" argument:
regressions <- mtcars %>%
    group_by(cyl) %>%
    do(mod = lm(mpg ~ wt, .), original = (.))

# without "data", augment returns only the columns used in the model;
# with data = original, all the original columns are included:
regressions %>% augment(mod)  # only the model columns (mpg, wt)
#> # A tibble: 32 x 10
#> # Groups:   cyl [3]
#>      cyl   mpg    wt .fitted .se.fit  .resid   .hat .sigma   .cooksd .std.resid
#>    <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>  <dbl>  <dbl>     <dbl>      <dbl>
#>  1     4  22.8  2.32    26.5    1.01 -3.67   0.0913   3.26 0.0670       -1.16  
#>  2     4  24.4  3.19    21.6    1.95  2.84   0.343    3.31 0.289         1.05  
#>  3     4  22.8  3.15    21.8    1.89  1.02   0.321    3.51 0.0325        0.370 
#>  4     4  32.4  2.2     27.1    1.02  5.25   0.0932   2.95 0.141         1.66  
#>  5     4  30.4  1.62    30.5    1.60 -0.0513 0.230    3.53 0.0000457    -0.0175
#>  6     4  33.9  1.84    29.2    1.31  4.69   0.154    3.04 0.212         1.53  
#>  7     4  21.5  2.46    25.7    1.06 -4.15   0.101    3.18 0.0968       -1.31  
#>  8     4  27.3  1.94    28.6    1.20 -1.34   0.129    3.50 0.0138       -0.432 
#>  9     4  26    2.14    27.5    1.04 -1.49   0.0975   3.49 0.0119       -0.470 
#> 10     4  30.4  1.51    31.0    1.75 -0.627  0.275    3.52 0.00927      -0.221 
#> # … with 22 more rows
regressions %>% augment(mod, data = original)  # includes all original columns
#> # A tibble: 32 x 18
#> # Groups:   cyl [3]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb .fitted
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
#>  1  22.8     4 108      93  3.85  2.32  18.6     1     1     4     1    26.5
#>  2  24.4     4 147.     62  3.69  3.19  20       1     0     4     2    21.6
#>  3  22.8     4 141.     95  3.92  3.15  22.9     1     0     4     2    21.8
#>  4  32.4     4  78.7    66  4.08  2.2   19.5     1     1     4     1    27.1
#>  5  30.4     4  75.7    52  4.93  1.62  18.5     1     1     4     2    30.5
#>  6  33.9     4  71.1    65  4.22  1.84  19.9     1     1     4     1    29.2
#>  7  21.5     4 120.     97  3.7   2.46  20.0     1     0     3     1    25.7
#>  8  27.3     4  79      66  4.08  1.94  18.9     1     1     4     1    28.6
#>  9  26       4 120.     91  4.43  2.14  16.7     0     1     5     2    27.5
#> 10  30.4     4  95.1   113  3.77  1.51  16.9     1     1     5     2    31.0
#> # … with 22 more rows, and 6 more variables: .se.fit <dbl>, .resid <dbl>,
#> #   .hat <dbl>, .sigma <dbl>, .cooksd <dbl>, .std.resid <dbl>
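
Because the rowwise tidiers are deprecated, the nest-map-unnest workflow recommended above may be the safer pattern for new code. The following is a minimal sketch of that workflow for the same per-cylinder regressions, assuming dplyr, tidyr (1.0.0 or later), purrr, and broom are installed. The names fits, coefs, and stats are illustrative only.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# nest: one row per cyl, with that group's rows in the list column "data"
# map:  fit one model per group, then tidy and glance each model
fits <- mtcars %>%
    nest(data = -cyl) %>%
    mutate(
        mod     = map(data, ~ lm(mpg ~ wt, data = .x)),
        tidied  = map(mod, tidy),
        glanced = map(mod, glance)
    )

# unnest: expand the per-model results back into regular columns
coefs <- fits %>%
    select(cyl, tidied) %>%
    unnest(tidied)

stats <- fits %>%
    select(cyl, glanced) %>%
    unnest(glanced)
```

Unlike the rowwise tidiers, this sketch returns ungrouped tibbles; add group_by(cyl) afterwards if a grouped result is needed.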
