top_n() and top_frac() are convenience wrappers around filter() and min_rank() that select the top or bottom entries in each group, ordered by wt.

top_n(x, n, wt)

top_frac(x, n, wt)

Arguments

x

A tbl() to filter.

n

The number of rows to return for top_n(), or the fraction of rows to return for top_frac().

If x is grouped, this is the number (or fraction) of rows per group. More rows than requested are returned if there are ties.

If n is positive, selects the top rows. If negative, selects the bottom rows.

wt

(Optional). The variable to use for ordering. If not specified, defaults to the last variable in the tbl.
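The sign and tie behaviour of n can be illustrated with plain filter() and min_rank() calls. This is a rough sketch of the idea under the description above, not the package's actual source; the names top_2 and bottom_2 are made up for the illustration.

```r
library(dplyr)

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))

# Positive n: rank from largest to smallest and keep ranks 1..n.
top_2 <- df %>% filter(min_rank(desc(x)) <= 2)

# Negative n: rank from smallest to largest and keep ranks 1..abs(n).
# The three 1s all tie at rank 1, so all three rows are kept.
bottom_2 <- df %>% filter(min_rank(x) <= 2)
```

Because min_rank() gives tied values the same rank, a tie at the cut-off is either kept in full or dropped in full, which is why the result can contain more than n rows.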

Details

Both n and wt are automatically quoted and later evaluated in the context of the data frame. Both arguments support unquoting.
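As a small sketch of what unquoting looks like in practice, the values for n and wt can be held in ordinary variables and spliced in with `!!`. The data frame and the names k and col below are hypothetical, and the example assumes a dplyr version in which top_n() is available along with the rlang package.

```r
library(dplyr)

df <- data.frame(
  x = c(10, 4, 1, 6, 3, 1, 1),
  y = c(1, 2, 3, 4, 5, 6, 7)
)

k <- 2                   # number of rows to keep (hypothetical variable)
col <- rlang::sym("y")   # ordering column, stored as a symbol

# !! unquotes both values before top_n() quotes its arguments,
# so this behaves like top_n(df, 2, y).
res <- df %>% top_n(!!k, !!col)
```

Supplying wt explicitly also avoids the "Selecting by ..." message that appears when the ordering variable is chosen by default.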

Examples

df <- data.frame(x = c(10, 4, 1, 6, 3, 1, 1))
df %>% top_n(2)
#> Selecting by x
#>    x
#> 1 10
#> 2  6

# half the rows
df %>% top_n(n() * .5)
#> Selecting by x
#>    x
#> 1 10
#> 2  4
#> 3  6
df %>% top_frac(.5)
#> Selecting by x
#>    x
#> 1 10
#> 2  4
#> 3  6

# Negative values select bottom from group. Note that we get more
# than 2 values here because there's a tie: top_n() either takes
# all rows with a value, or none.
df %>% top_n(-2)
#> Selecting by x
#>   x
#> 1 1
#> 2 1
#> 3 1

if (require("Lahman")) {
  # Find 10 players with most games
  tbl_df(Batting) %>%
    group_by(playerID) %>%
    tally(G) %>%
    top_n(10)

  # Find year with most games for each player
  if (FALSE) {
    tbl_df(Batting) %>%
      group_by(playerID) %>%
      top_n(1, G)
  }
}
#> Selecting by n