R/dplyr-rowwise-df-tidiers.R
rowwise_df_tidiers.Rd
Rowwise tidiers are deprecated and will be removed from an upcoming version
of broom. We strongly recommend moving to a nest-map-unnest workflow over a
rowwise-do workflow. See the vignettes for examples.
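As a rough illustration of the nest-map-unnest workflow recommended above, the sketch below fits one model per `cyl` group of `mtcars` and tidies each fit. It assumes tidyr (1.0.0 or later, for the `nest(data = -cyl)` syntax), purrr, dplyr, and broom are installed; the names `regressions`, `tidied`, and `coefs` are illustrative choices, not part of broom.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Nest: collapse each cyl group into a list column of data frames.
# Map: fit one model per group, then tidy each model.
regressions <- mtcars %>%
  nest(data = -cyl) %>%
  mutate(
    mod = map(data, ~ lm(mpg ~ wt, data = .x)),
    tidied = map(mod, tidy)
  )

# Unnest: expand the tidied list column into one row per term per group.
coefs <- regressions %>%
  select(cyl, tidied) %>%
  unnest(tidied)

coefs
```

Swapping `tidy` for `glance` or `augment` in the `map()` call should give the per-group summaries or per-observation outputs in the same way.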
# S3 method for rowwise_df
tidy(x, object, ...)

# S3 method for rowwise_df
tidy_(x, object, ...)

# S3 method for rowwise_df
augment(x, object, ...)

# S3 method for rowwise_df
augment_(x, object, ...)

# S3 method for rowwise_df
glance(x, object, ...)

# S3 method for rowwise_df
glance_(x, object, ...)

# S3 method for tbl_df
tidy(x, ...)

# S3 method for tbl_df
augment(x, ...)

# S3 method for tbl_df
glance(x, ...)
x | a rowwise_df |
---|---|
object | the name of the column containing the models to be tidied. For tidy, augment, and glance it should be a bare name; for the _ methods (tidy_, augment_, glance_) it should be quoted. |
... | additional arguments to pass on to the respective tidying method |
A "grouped_df", where the non-list columns of the original are used as
grouping columns alongside the tidied outputs.
These tidy, augment and glance methods are for performing tidying on each
row of a rowwise data frame created by dplyr's group_by and do operations.
They first group a rowwise data frame based on all columns that are not
lists, then perform the tidying operation on the specified column. This
greatly shortens a common idiom of extracting tidy/augment/glance outputs
after a do statement.
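For context, the longer idiom that these methods shorten might look roughly like the sketch below: regroup the rowwise result by its non-list column, then tidy the single model held in each group. This is an assumption about the idiom meant here, not broom's internal implementation; `regressions` and `coefs` are illustrative names.

```r
library(dplyr)
library(broom)

# Fit one model per cyl group; do() stores each fit in a list column.
regressions <- mtcars %>%
  group_by(cyl) %>%
  do(mod = lm(mpg ~ wt, data = .))

# Long-form idiom: drop the rowwise structure, regroup by the non-list
# column, and tidy the one model stored in each group.
coefs <- regressions %>%
  ungroup() %>%
  group_by(cyl) %>%
  do(tidy(.$mod[[1]]))

coefs
```

The rowwise methods described above collapse the last three steps into a single call on the model column.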
Note that this functionality is not currently implemented for data.tables, since the result of the do operation is difficult to distinguish from a regular data.table.
#> Source: local data frame [3 x 2]
#> Groups: <by row>
#>
#> # A tibble: 3 x 2
#>     cyl mod
#> * <dbl> <list>
#> 1     4 <lm>
#> 2     6 <lm>
#> 3     8 <lm>

#> # A tibble: 6 x 6
#> # Groups:   cyl [3]
#>     cyl term        estimate std.error statistic    p.value
#>   <dbl> <chr>          <dbl>     <dbl>     <dbl>      <dbl>
#> 1     4 (Intercept)    39.6      4.35       9.10 0.00000777
#> 2     4 wt             -5.65     1.85      -3.05 0.0137
#> 3     6 (Intercept)    28.4      4.18       6.79 0.00105
#> 4     6 wt             -2.78     1.33      -2.08 0.0918
#> 5     8 (Intercept)    23.9      3.01       7.94 0.00000405
#> 6     8 wt             -2.19     0.739     -2.97 0.0118

#> # A tibble: 32 x 10
#> # Groups:   cyl [3]
#>      cyl   mpg    wt .fitted .se.fit  .resid   .hat .sigma   .cooksd .std.resid
#>    <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>  <dbl>  <dbl>     <dbl>      <dbl>
#>  1     4  22.8  2.32    26.5    1.01 -3.67   0.0913   3.26 0.0670       -1.16
#>  2     4  24.4  3.19    21.6    1.95  2.84   0.343    3.31 0.289         1.05
#>  3     4  22.8  3.15    21.8    1.89  1.02   0.321    3.51 0.0325        0.370
#>  4     4  32.4  2.2     27.1    1.02  5.25   0.0932   2.95 0.141         1.66
#>  5     4  30.4  1.62    30.5    1.60 -0.0513 0.230    3.53 0.0000457    -0.0175
#>  6     4  33.9  1.84    29.2    1.31  4.69   0.154    3.04 0.212         1.53
#>  7     4  21.5  2.46    25.7    1.06 -4.15   0.101    3.18 0.0968       -1.31
#>  8     4  27.3  1.94    28.6    1.20 -1.34   0.129    3.50 0.0138       -0.432
#>  9     4  26    2.14    27.5    1.04 -1.49   0.0975   3.49 0.0119       -0.470
#> 10     4  30.4  1.51    31.0    1.75 -0.627  0.275    3.52 0.00927      -0.221
#> # … with 22 more rows

#> # A tibble: 3 x 12
#> # Groups:   cyl [3]
#>     cyl r.squared adj.r.squared sigma statistic p.value    df logLik   AIC   BIC
#>   <dbl>     <dbl>         <dbl> <dbl>     <dbl>   <dbl> <int>  <dbl> <dbl> <dbl>
#> 1     4     0.509         0.454  3.33      9.32  0.0137     2 -27.7   61.5  62.7
#> 2     6     0.465         0.357  1.17      4.34  0.0918     2  -9.83  25.7  25.5
#> 3     8     0.423         0.375  2.02      8.80  0.0118     2 -28.7   63.3  65.2
#> # … with 2 more variables: deviance <dbl>, df.residual <int>

# we can provide additional arguments to the tidying function
regressions %>% tidy(mod, conf.int = TRUE)
#> # A tibble: 6 x 8
#> # Groups:   cyl [3]
#>     cyl term        estimate std.error statistic    p.value conf.low conf.high
#>   <dbl> <chr>          <dbl>     <dbl>     <dbl>      <dbl>    <dbl>     <dbl>
#> 1     4 (Intercept)    39.6      4.35       9.10 0.00000777    29.7     49.4
#> 2     4 wt             -5.65     1.85      -3.05 0.0137        -9.83    -1.46
#> 3     6 (Intercept)    28.4      4.18       6.79 0.00105       17.7     39.2
#> 4     6 wt             -2.78     1.33      -2.08 0.0918        -6.21     0.651
#> 5     8 (Intercept)    23.9      3.01       7.94 0.00000405    17.3     30.4
#> 6     8 wt             -2.19     0.739     -2.97 0.0118        -3.80    -0.582

# we can also include the original dataset as a "data" argument
# to augment:
regressions <- mtcars %>%
  group_by(cyl) %>%
  do(mod = lm(mpg ~ wt, .), original = (.))

# this allows all the original columns to be included:
regressions %>% augment(mod) # doesn't include all original
#> # A tibble: 32 x 10
#> # Groups:   cyl [3]
#>      cyl   mpg    wt .fitted .se.fit  .resid   .hat .sigma   .cooksd .std.resid
#>    <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>  <dbl>  <dbl>     <dbl>      <dbl>
#>  1     4  22.8  2.32    26.5    1.01 -3.67   0.0913   3.26 0.0670       -1.16
#>  2     4  24.4  3.19    21.6    1.95  2.84   0.343    3.31 0.289         1.05
#>  3     4  22.8  3.15    21.8    1.89  1.02   0.321    3.51 0.0325        0.370
#>  4     4  32.4  2.2     27.1    1.02  5.25   0.0932   2.95 0.141         1.66
#>  5     4  30.4  1.62    30.5    1.60 -0.0513 0.230    3.53 0.0000457    -0.0175
#>  6     4  33.9  1.84    29.2    1.31  4.69   0.154    3.04 0.212         1.53
#>  7     4  21.5  2.46    25.7    1.06 -4.15   0.101    3.18 0.0968       -1.31
#>  8     4  27.3  1.94    28.6    1.20 -1.34   0.129    3.50 0.0138       -0.432
#>  9     4  26    2.14    27.5    1.04 -1.49   0.0975   3.49 0.0119       -0.470
#> 10     4  30.4  1.51    31.0    1.75 -0.627  0.275    3.52 0.00927      -0.221
#> # … with 22 more rows

#> # A tibble: 32 x 18
#> # Groups:   cyl [3]
#>      mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb .fitted
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
#>  1  22.8     4 108      93  3.85  2.32  18.6     1     1     4     1    26.5
#>  2  24.4     4 147.     62  3.69  3.19  20       1     0     4     2    21.6
#>  3  22.8     4 141.     95  3.92  3.15  22.9     1     0     4     2    21.8
#>  4  32.4     4  78.7    66  4.08  2.2   19.5     1     1     4     1    27.1
#>  5  30.4     4  75.7    52  4.93  1.62  18.5     1     1     4     2    30.5
#>  6  33.9     4  71.1    65  4.22  1.84  19.9     1     1     4     1    29.2
#>  7  21.5     4 120.     97  3.7   2.46  20.0     1     0     3     1    25.7
#>  8  27.3     4  79      66  4.08  1.94  18.9     1     1     4     1    28.6
#>  9  26       4 120.     91  4.43  2.14  16.7     0     1     5     2    27.5
#> 10  30.4     4  95.1   113  3.77  1.51  16.9     1     1     5     2    31.0
#> # … with 22 more rows, and 6 more variables: .se.fit <dbl>, .resid <dbl>,
#> #   .hat <dbl>, .sigma <dbl>, .cooksd <dbl>, .std.resid <dbl>