Select top (or bottom) n rows (by value)

This is a convenient wrapper that uses filter() and min_rank() to select the top or bottom entries in each group, ordered by wt.
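
As a rough illustration of what the wrapper does, the sketch below reproduces a top-2 and a bottom-2 selection by calling filter() and min_rank() directly. It assumes dplyr is attached; the data frame df and its column x are illustrative only, and the exact internals of top_n() may differ between dplyr versions.

```r
library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))

# Top 2 by x: rank in descending order and keep ranks 1 and 2
top2 <- df %>% filter(min_rank(desc(x)) <= 2)

# Bottom 2 by x: rank in ascending order; tied values share the minimum
# rank, so all three rows with x == 1 are kept
bottom2 <- df %>% filter(min_rank(x) <= 2)
```

Because min_rank() gives tied values the same rank, the result can contain more rows than requested.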

top_n(x, n, wt)

top_frac(x, n, wt)

Arguments

x	a `tbl()` to filter
n	number of rows to return for `top_n()`, fraction of rows to return for `top_frac()`. If `x` is grouped, this is the number (or fraction) of rows per group. Will include more rows if there are ties. If `n` is positive, selects the top rows. If negative, selects the bottom rows.
wt	(Optional). The variable to use for ordering. If not specified, defaults to the last variable in the tbl.
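
The per-group behaviour of n and the effect of ties can be seen in a small sketch. The data frame scores and its columns team and score are hypothetical names chosen for illustration, and dplyr is assumed to be attached.

```r
library(dplyr)

# Hypothetical data: two teams with three scores each
scores <- data.frame(
  team  = c("a", "a", "a", "b", "b", "b"),
  score = c(1, 5, 3, 2, 2, 9)
)

# n = 1 is applied within each team; wt is given explicitly
best <- scores %>% group_by(team) %>% top_n(1, score)

# n = -1 selects the bottom row(s) per team; team "b" has a tie at 2,
# so both tied rows are returned
worst <- scores %>% group_by(team) %>% top_n(-1, score)
```

Passing wt explicitly avoids relying on the default of the last variable in the tbl.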

Details

Both n and wt are automatically quoted and later evaluated in the context of the data frame. Both arguments support unquoting.
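
A minimal sketch of unquoting, assuming dplyr and rlang are installed. The variables by_col and k and the extra column y are hypothetical and exist only for this illustration.

```r
library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1), y = c(1, 2, 3, 4, 5, 6, 7))

# Hypothetical inputs: the ordering column stored as a symbol, and the
# row count stored in an ordinary variable
by_col <- rlang::sym("y")
k <- 2

# !! unquotes both arguments before they are evaluated in the data frame
res <- df %>% top_n(!!k, !!by_col)
```

Here res holds the two rows with the largest values of y, which is useful when the ordering column is chosen programmatically.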

Examples

library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))
df %>% top_n(2)
#> Selecting by x
#>    x
#> 1 10
#> 2  6

# half the rows
df %>% top_n(n() * .5)
#> Selecting by x
#>    x
#> 1 10
#> 2  4
#> 3  6
df %>% top_frac(.5)
#> Selecting by x
#>    x
#> 1 10
#> 2  4
#> 3  6

# Negative values select the bottom rows. Note that we get more
# than 2 rows here because there's a tie: top_n() either takes
# all rows with a given value, or none.
df %>% top_n(-2)
#> Selecting by x
#>   x
#> 1 1
#> 2 1
#> 3 1

if (require("Lahman")) {
  # Find 10 players with most games
  tbl_df(Batting) %>%
    group_by(playerID) %>%
    tally(G) %>%
    top_n(10)

  # Find year with most games for each player
  if (FALSE) {
    tbl_df(Batting) %>%
      group_by(playerID) %>%
      top_n(1, G)
  }
}
#> Selecting by n
