This function is somewhat similar to tapply, but is designed for use in conjunction with id. It is simpler in that it only accepts a single grouping vector (use id if you have more) and uses vapply internally, using the .default value as the template.
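To illustrate what "a single grouping vector" means when the data has several grouping variables, here is a minimal base R sketch. It uses interaction() as a stand-in for id (an assumption made so the example runs without plyr; id itself returns integer codes and may differ in detail), and tapply as the aggregator.

```r
# Base R stand-in for combining several grouping variables into ONE grouping
# vector. This is not plyr's id(); interaction() is used only for illustration.
combined <- interaction(warpbreaks$wool, warpbreaks$tension, lex.order = TRUE)

# Integer codes of the combined groups (one code per wool/tension pair)
codes <- as.integer(combined)

# Aggregating over the single combined vector yields a flat, 1-d result
flat_sums <- tapply(warpbreaks$breaks, combined, sum)
flat_sums
```

With lex.order = TRUE the first variable (wool) varies slowest, so the flat result lists all tension levels for wool A before those for wool B.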
vaggregate(.value, .group, .fun, ..., .default = NULL, .n = nlevels(.group))
Argument | Description |
---|---|
.value | vector of values to aggregate |
.group | grouping vector |
.fun | aggregation function |
... | other arguments passed on to .fun |
.default | default value used for missing groups. This argument is also used as the template for function output. |
.n | total number of groups |
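The way these arguments interact can be sketched in base R. The helper below, sketch_vaggregate, is hypothetical and is not plyr's implementation; it assumes the grouping vector is a factor and only illustrates how a default value can serve both as the result for empty groups and as the vapply template.

```r
# Hypothetical helper, NOT plyr's actual code: a base R sketch of aggregating
# by one grouping factor with vapply and a template value.
sketch_vaggregate <- function(value, group, fun, ..., default = NULL,
                              n = nlevels(group)) {
  # Integer group codes in 1..n (assumes `group` is a factor)
  codes <- as.integer(group)
  # With no explicit default, derive one by calling fun on an empty input
  if (is.null(default)) default <- fun(value[0], ...)
  # Element indices for every group 1..n, including groups with no members
  indices <- split(seq_along(value), factor(codes, levels = seq_len(n)))
  one_group <- function(i) {
    if (length(i) == 0) return(default)  # missing group: use the default
    fun(value[i], ...)
  }
  # `default` doubles as the vapply template: it fixes the type and length
  # that every per-group result must have
  vapply(indices, one_group, default, USE.NAMES = FALSE)
}

n <- 17
fac <- factor(rep(1:3, length.out = n), levels = 1:5)
res_sum <- sketch_vaggregate(1:n, fac, sum)
res_na  <- sketch_vaggregate(1:n, fac, sum, default = NA_integer_)
```

Because levels 4 and 5 have no members, their entries come straight from the default: sum(integer(0)) in the first call and NA_integer_ in the second.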
vaggregate should be faster than tapply in most situations because it avoids making a copy of the data.
# Some examples of use borrowed from ?tapply
n <- 17; fac <- factor(rep(1:3, length.out = n), levels = 1:5)
table(fac)
#> fac
#> 1 2 3 4 5
#> 6 6 5 0 0
vaggregate(1:n, fac, sum)
#> [1] 51 57 45  0  0
vaggregate(1:n, fac, sum, .default = NA_integer_)
#> [1] 51 57 45 NA NA
vaggregate(1:n, fac, range)
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    2    3  Inf  Inf
#> [2,]   16   17   15 -Inf -Inf
vaggregate(1:n, fac, range, .default = c(NA, NA) + 0)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    2    3   NA   NA
#> [2,]   16   17   15   NA   NA
vaggregate(1:n, fac, quantile)
#>       [,1]  [,2] [,3] [,4] [,5]
#> 0%    1.00  2.00    3   NA   NA
#> 25%   4.75  5.75    6   NA   NA
#> 50%   8.50  9.50    9   NA   NA
#> 75%  12.25 13.25   12   NA   NA
#> 100% 16.00 17.00   15   NA   NA

# Unlike tapply, vaggregate does not support multi-d output:
tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
#>     tension
#> wool   L   M   H
#>    A 401 216 221
#>    B 254 259 169
vaggregate(warpbreaks$breaks, id(warpbreaks[,-1]), sum)
#> [1] 401 216 221 254 259 169

# But it is about 10x faster
x <- rnorm(1e6)
y1 <- sample.int(10, 1e6, replace = TRUE)
system.time(tapply(x, y1, mean))
#>    user  system elapsed
#>   0.037   0.004   0.042
system.time(vaggregate(x, y1, mean))
#>    user  system elapsed
#>   0.014   0.000   0.014