Vector aggregate.

This function is similar to tapply, but is designed for use in conjunction with id. It is simpler in that it accepts only a single grouping vector (use id to combine several grouping variables into one) and uses vapply internally, with the .default value as the template for the output.

vaggregate(.value, .group, .fun, ..., .default = NULL, .n = nlevels(.group))

Arguments

.value	vector of values to aggregate
.group	grouping vector
.fun	aggregation function
...	other arguments passed on to `.fun`
.default	default value used for missing groups. This argument is also used as the template for function output.
.n	total number of groups
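
Because .default doubles as the vapply template, its type and length matter. The following base R sketch (no plyr required; the `groups` list is made up for illustration) shows the underlying vapply rule: results may be promoted to the template's type, but never demoted. This is presumably why the examples below use `NA_integer_` and `c(NA, NA) + 0` rather than a plain `NA`.

```r
# Two hypothetical groups of integer values
groups <- list(1:3, 4:6)

# A double template accepts integer results (they are promoted to double)
as_double <- vapply(groups, sum, numeric(1))

# An integer template matches integer results exactly
as_integer <- vapply(groups, sum, NA_integer_)

# A plain NA is logical; integer results cannot be demoted, so vapply errors
outcome <- tryCatch(
  vapply(groups, sum, NA),
  error = function(e) "error"
)
```

The same rule applies to length: a function that returns two values per group, such as range, needs a template of length two.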

Details

vaggregate should be faster than tapply in most situations because it avoids making a copy of the data.
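
To make the description above concrete, here is a minimal base R sketch of the general idea: split the row indices by group, then apply the function with vapply, returning the default for empty groups. It is an illustration only, not the package's actual implementation; `vaggregate_sketch` and its argument names are hypothetical, and it does not handle id-style integer groups or the .n argument.

```r
# Hypothetical illustration; not the real vaggregate
vaggregate_sketch <- function(value, group, fun, ..., default = NULL) {
  group <- as.factor(group)

  # With no explicit default, assume the result on an empty input is the template
  if (is.null(default)) {
    default <- fun(value[0], ...)
  }

  # split() on a factor keeps empty levels, so missing groups get integer(0)
  indices <- unname(split(seq_along(value), group))

  one_group <- function(i) {
    if (length(i) == 0) return(default)
    fun(value[i], ...)
  }

  # The default value serves as the vapply template (type and length)
  vapply(indices, one_group, default)
}

n <- 17
fac <- factor(rep(1:3, length.out = n), levels = 1:5)
sums <- vaggregate_sketch(1:n, fac, sum)
ranges <- vaggregate_sketch(1:n, fac, range, default = c(NA, NA) + 0)
```

Under these assumptions, `sums` has one element per factor level (including the two empty levels), and `ranges` is a matrix with one column per level.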

Examples

# Some examples of use borrowed from ?tapply
n <- 17; fac <- factor(rep(1:3, length.out = n), levels = 1:5)
table(fac)
#> fac
#> 1 2 3 4 5 
#> 6 6 5 0 0 
vaggregate(1:n, fac, sum)
#> [1] 51 57 45  0  0
vaggregate(1:n, fac, sum, .default = NA_integer_)
#> [1] 51 57 45 NA NA
vaggregate(1:n, fac, range)
#> Warning: no non-missing arguments to min; returning Inf
#> Warning: no non-missing arguments to max; returning -Inf
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    2    3  Inf  Inf
#> [2,]   16   17   15 -Inf -Inf
vaggregate(1:n, fac, range, .default = c(NA, NA) + 0)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    2    3   NA   NA
#> [2,]   16   17   15   NA   NA
vaggregate(1:n, fac, quantile)
#>       [,1]  [,2] [,3] [,4] [,5]
#> 0%    1.00  2.00    3   NA   NA
#> 25%   4.75  5.75    6   NA   NA
#> 50%   8.50  9.50    9   NA   NA
#> 75%  12.25 13.25   12   NA   NA
#> 100% 16.00 17.00   15   NA   NA
# Unlike tapply, vaggregate does not support multi-dimensional output:
tapply(warpbreaks$breaks, warpbreaks[,-1], sum)
#>     tension
#> wool   L   M   H
#>    A 401 216 221
#>    B 254 259 169
vaggregate(warpbreaks$breaks, id(warpbreaks[,-1]), sum)
#> [1] 401 216 221 254 259 169

# But it is faster (about 3x by elapsed time in the run shown here)
x <- rnorm(1e6)
y1 <- sample.int(10, 1e6, replace = TRUE)
system.time(tapply(x, y1, mean))
#>    user  system elapsed 
#>   0.037   0.004   0.042 
system.time(vaggregate(x, y1, mean))
#>    user  system elapsed 
#>   0.014   0.000   0.014
