Choose or rename variables from a tbl. select() keeps only the variables you mention; rename() keeps all variables.

select(.data, ...)

rename(.data, ...)

Arguments

.data

A tbl. All main verbs are S3 generics and provide methods for tbl_df(), dtplyr::tbl_dt() and dbplyr::tbl_dbi().

...

One or more unquoted expressions separated by commas. You can treat variable names as if they were positions, so expressions like x:y select ranges of variables.

Positive values select variables; negative values drop variables. If the first expression is negative, select() will automatically start with all variables.

Use named arguments, e.g. new_name = old_name, to rename selected variables.

The arguments in ... are automatically quoted and evaluated in a context where column names represent column positions. They also support unquoting and splicing. See vignette("programming") for an introduction to these concepts.

See select helpers for more details and examples about tidyselect helpers such as starts_with(), everything(), ...

Value

An object of the same class as .data.

Details

These functions work by column index, not value; thus, an expression like select(data.frame(x = 1:5, y = 10), z = x+1) does not create a variable with values 2:6. (In the current implementation, the expression z = x+1 wouldn't do anything useful.) To calculate using column values, see mutate()/transmute().
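As a minimal sketch of the alternative mentioned above, the snippet below reuses the same small data frame and computes z from the values of x with mutate() and transmute(). The object names with_z and only_z are introduced here purely for illustration.

```r
library(dplyr)

df <- data.frame(x = 1:5, y = 10)

# mutate() computes on column values and keeps the existing columns
with_z <- mutate(df, z = x + 1)

# transmute() computes on column values and keeps only the new columns
only_z <- transmute(df, z = x + 1)

names(with_z)
names(only_z)
```

Here with_z holds x, y and z, while only_z holds z alone; in both cases z is derived from the values of x rather than from its position.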

Useful functions

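As well as using existing functions like : and c(), there are a number of special functions that only work inside select(), such as starts_with(), ends_with(), contains(), matches(), num_range(), one_of(), everything() and group_cols().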

To drop variables, use -.

Note that except for :, - and c(), all complex expressions are evaluated outside the data frame context. This is to prevent accidental matching of data frame variables when you refer to variables from the calling context.
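A small sketch of what this can look like in practice, under stated assumptions: df and cols are made-up names for this illustration, and one_of() is used as the helper call. The character vector cols lives in the calling context, not in the data frame, and is passed to the helper. Exact evaluation rules may differ between dplyr versions.

```r
library(dplyr)

# Hypothetical data frame and a selection stored in the calling context
df <- data.frame(x = 1:3, y = 4:6, z = 7:9)
cols <- c("x", "z")

# `cols` is not a column of df; it is found in the calling context
# and handed to the helper, which matches the names it contains
picked <- select(df, one_of(cols))

names(picked)
```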

Scoped selection and renaming

The three scoped variants of select() (select_all(), select_if() and select_at()) and the three scoped variants of rename() (rename_all(), rename_if() and rename_at()) make it easy to apply a renaming function to a selection of variables.
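The following sketch shows one possible use of three of these variants on the iris data. The object names (iris_tbl, numeric_cols, lower_all, upper_petal) are chosen for illustration only, and tolower() and toupper() stand in for any renaming function.

```r
library(dplyr)

iris_tbl <- as_tibble(iris)

# select_if(): keep only the columns for which the predicate is TRUE
numeric_cols <- select_if(iris_tbl, is.numeric)

# rename_all(): apply a renaming function to every column
lower_all <- rename_all(iris_tbl, tolower)

# rename_at(): apply a renaming function to a chosen subset of columns
upper_petal <- rename_at(iris_tbl, vars(starts_with("Petal")), toupper)

names(numeric_cols)
names(lower_all)
names(upper_petal)
```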

Tidy data

When applied to a data frame, row names are silently dropped. To preserve them, convert them to an explicit variable with tibble::rownames_to_column().
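A short sketch of that conversion, assuming the mtcars data set; the column name "model" is an arbitrary choice made for this example.

```r
library(dplyr)

# mtcars stores the car names as row names; move them into a column first
cars <- tibble::rownames_to_column(mtcars, var = "model")

# The names now travel with the data like any other variable
selected <- select(as_tibble(cars), model, mpg, cyl)

names(selected)
```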

See also

Other single table verbs: arrange, filter, mutate, slice, summarise

Examples

iris <- as_tibble(iris) # so it prints a little nicer
select(iris, starts_with("Petal"))
#> # A tibble: 150 x 2
#>    Petal.Length Petal.Width
#>           <dbl>       <dbl>
#>  1          1.4         0.2
#>  2          1.4         0.2
#>  3          1.3         0.2
#>  4          1.5         0.2
#>  5          1.4         0.2
#>  6          1.7         0.4
#>  7          1.4         0.3
#>  8          1.5         0.2
#>  9          1.4         0.2
#> 10          1.5         0.1
#> # … with 140 more rows
select(iris, ends_with("Width"))
#> # A tibble: 150 x 2
#>    Sepal.Width Petal.Width
#>          <dbl>       <dbl>
#>  1         3.5         0.2
#>  2         3           0.2
#>  3         3.2         0.2
#>  4         3.1         0.2
#>  5         3.6         0.2
#>  6         3.9         0.4
#>  7         3.4         0.3
#>  8         3.4         0.2
#>  9         2.9         0.2
#> 10         3.1         0.1
#> # … with 140 more rows
# Move Species variable to the front
select(iris, Species, everything())
#> # A tibble: 150 x 5
#>    Species Sepal.Length Sepal.Width Petal.Length Petal.Width
#>    <fct>          <dbl>       <dbl>        <dbl>       <dbl>
#>  1 setosa           5.1         3.5          1.4         0.2
#>  2 setosa           4.9         3            1.4         0.2
#>  3 setosa           4.7         3.2          1.3         0.2
#>  4 setosa           4.6         3.1          1.5         0.2
#>  5 setosa           5           3.6          1.4         0.2
#>  6 setosa           5.4         3.9          1.7         0.4
#>  7 setosa           4.6         3.4          1.4         0.3
#>  8 setosa           5           3.4          1.5         0.2
#>  9 setosa           4.4         2.9          1.4         0.2
#> 10 setosa           4.9         3.1          1.5         0.1
#> # … with 140 more rows
# Move Sepal.Length variable to back
# first select all variables except Sepal.Length, then reselect Sepal.Length
select(iris, -Sepal.Length, Sepal.Length)
#> # A tibble: 150 x 5
#>    Sepal.Width Petal.Length Petal.Width Species Sepal.Length
#>          <dbl>        <dbl>       <dbl> <fct>          <dbl>
#>  1         3.5          1.4         0.2 setosa           5.1
#>  2         3            1.4         0.2 setosa           4.9
#>  3         3.2          1.3         0.2 setosa           4.7
#>  4         3.1          1.5         0.2 setosa           4.6
#>  5         3.6          1.4         0.2 setosa           5
#>  6         3.9          1.7         0.4 setosa           5.4
#>  7         3.4          1.4         0.3 setosa           4.6
#>  8         3.4          1.5         0.2 setosa           5
#>  9         2.9          1.4         0.2 setosa           4.4
#> 10         3.1          1.5         0.1 setosa           4.9
#> # … with 140 more rows
df <- as.data.frame(matrix(runif(100), nrow = 10))
df <- tbl_df(df[c(3, 4, 7, 1, 9, 8, 5, 2, 6, 10)])
select(df, V4:V6)
#> # A tibble: 10 x 8
#>       V4      V7     V1     V9     V8     V5     V2    V6
#>    <dbl>   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl>
#>  1 0.932 0.568   0.0717 0.410  0.0512 0.577  0.149  0.946
#>  2 0.197 0.614   0.0935 0.493  0.727  0.249  0.589  0.585
#>  3 0.181 0.00233 0.460  0.251  0.226  0.692  0.0815 0.338
#>  4 0.564 0.129   0.568  0.603  0.427  0.330  0.748  0.876
#>  5 0.685 0.261   0.353  0.465  0.0613 0.241  0.330  0.313
#>  6 0.475 0.971   0.634  0.876  0.515  0.602  0.246  0.259
#>  7 0.677 0.665   0.176  0.987  0.713  0.0220 0.0440 0.994
#>  8 0.849 0.874   0.925  0.540  0.673  0.426  0.216  0.372
#>  9 0.432 0.149   0.233  0.0661 0.152  0.104  0.768  0.302
#> 10 0.389 0.536   0.336  0.614  0.288  0.378  0.826  0.798
select(df, num_range("V", 4:6))
#> # A tibble: 10 x 3
#>       V4     V5    V6
#>    <dbl>  <dbl> <dbl>
#>  1 0.932 0.577  0.946
#>  2 0.197 0.249  0.585
#>  3 0.181 0.692  0.338
#>  4 0.564 0.330  0.876
#>  5 0.685 0.241  0.313
#>  6 0.475 0.602  0.259
#>  7 0.677 0.0220 0.994
#>  8 0.849 0.426  0.372
#>  9 0.432 0.104  0.302
#> 10 0.389 0.378  0.798
# Drop variables with -
select(iris, -starts_with("Petal"))
#> # A tibble: 150 x 3
#>    Sepal.Length Sepal.Width Species
#>           <dbl>       <dbl> <fct>
#>  1          5.1         3.5 setosa
#>  2          4.9         3   setosa
#>  3          4.7         3.2 setosa
#>  4          4.6         3.1 setosa
#>  5          5           3.6 setosa
#>  6          5.4         3.9 setosa
#>  7          4.6         3.4 setosa
#>  8          5           3.4 setosa
#>  9          4.4         2.9 setosa
#> 10          4.9         3.1 setosa
#> # … with 140 more rows
# Select the grouping variables:
starwars %>% group_by(gender) %>% select(group_cols())
#> # A tibble: 87 x 1
#> # Groups:   gender [5]
#>    gender
#>    <chr>
#>  1 male
#>  2 <NA>
#>  3 <NA>
#>  4 male
#>  5 female
#>  6 male
#>  7 female
#>  8 <NA>
#>  9 male
#> 10 male
#> # … with 77 more rows
# The .data pronoun is available:
select(mtcars, .data$cyl)
#>                     cyl
#> Mazda RX4             6
#> Mazda RX4 Wag         6
#> Datsun 710            4
#> Hornet 4 Drive        6
#> Hornet Sportabout     8
#> Valiant               6
#> Duster 360            8
#> Merc 240D             4
#> Merc 230              4
#> Merc 280              6
#> Merc 280C             6
#> Merc 450SE            8
#> Merc 450SL            8
#> Merc 450SLC           8
#> Cadillac Fleetwood    8
#> Lincoln Continental   8
#> Chrysler Imperial     8
#> Fiat 128              4
#> Honda Civic           4
#> Toyota Corolla        4
#> Toyota Corona         4
#> Dodge Challenger      8
#> AMC Javelin           8
#> Camaro Z28            8
#> Pontiac Firebird      8
#> Fiat X1-9             4
#> Porsche 914-2         4
#> Lotus Europa          4
#> Ford Pantera L        8
#> Ferrari Dino          6
#> Maserati Bora         8
#> Volvo 142E            4
select(mtcars, .data$mpg : .data$disp)
#>                      mpg cyl  disp
#> Mazda RX4           21.0   6 160.0
#> Mazda RX4 Wag       21.0   6 160.0
#> Datsun 710          22.8   4 108.0
#> Hornet 4 Drive      21.4   6 258.0
#> Hornet Sportabout   18.7   8 360.0
#> Valiant             18.1   6 225.0
#> Duster 360          14.3   8 360.0
#> Merc 240D           24.4   4 146.7
#> Merc 230            22.8   4 140.8
#> Merc 280            19.2   6 167.6
#> Merc 280C           17.8   6 167.6
#> Merc 450SE          16.4   8 275.8
#> Merc 450SL          17.3   8 275.8
#> Merc 450SLC         15.2   8 275.8
#> Cadillac Fleetwood  10.4   8 472.0
#> Lincoln Continental 10.4   8 460.0
#> Chrysler Imperial   14.7   8 440.0
#> Fiat 128            32.4   4  78.7
#> Honda Civic         30.4   4  75.7
#> Toyota Corolla      33.9   4  71.1
#> Toyota Corona       21.5   4 120.1
#> Dodge Challenger    15.5   8 318.0
#> AMC Javelin         15.2   8 304.0
#> Camaro Z28          13.3   8 350.0
#> Pontiac Firebird    19.2   8 400.0
#> Fiat X1-9           27.3   4  79.0
#> Porsche 914-2       26.0   4 120.3
#> Lotus Europa        30.4   4  95.1
#> Ford Pantera L      15.8   8 351.0
#> Ferrari Dino        19.7   6 145.0
#> Maserati Bora       15.0   8 301.0
#> Volvo 142E          21.4   4 121.0
# However it isn't available within calls since those are evaluated
# outside of the data context. This would fail if run:
# select(mtcars, identical(.data$cyl))

# Renaming -----------------------------------------
# * select() keeps only the variables you specify
select(iris, petal_length = Petal.Length)
#> # A tibble: 150 x 1
#>    petal_length
#>           <dbl>
#>  1          1.4
#>  2          1.4
#>  3          1.3
#>  4          1.5
#>  5          1.4
#>  6          1.7
#>  7          1.4
#>  8          1.5
#>  9          1.4
#> 10          1.5
#> # … with 140 more rows
# * rename() keeps all variables
rename(iris, petal_length = Petal.Length)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width petal_length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>
#>  1          5.1         3.5          1.4         0.2 setosa
#>  2          4.9         3            1.4         0.2 setosa
#>  3          4.7         3.2          1.3         0.2 setosa
#>  4          4.6         3.1          1.5         0.2 setosa
#>  5          5           3.6          1.4         0.2 setosa
#>  6          5.4         3.9          1.7         0.4 setosa
#>  7          4.6         3.4          1.4         0.3 setosa
#>  8          5           3.4          1.5         0.2 setosa
#>  9          4.4         2.9          1.4         0.2 setosa
#> 10          4.9         3.1          1.5         0.1 setosa
#> # … with 140 more rows
# * select() can rename variables in a group
select(iris, obs = starts_with('S'))
#> # A tibble: 150 x 3
#>     obs1  obs2 obs3
#>    <dbl> <dbl> <fct>
#>  1   5.1   3.5 setosa
#>  2   4.9   3   setosa
#>  3   4.7   3.2 setosa
#>  4   4.6   3.1 setosa
#>  5   5     3.6 setosa
#>  6   5.4   3.9 setosa
#>  7   4.6   3.4 setosa
#>  8   5     3.4 setosa
#>  9   4.4   2.9 setosa
#> 10   4.9   3.1 setosa
#> # … with 140 more rows
# Unquoting ----------------------------------------
# Like all dplyr verbs, select() supports unquoting of symbols:
vars <- list(
  var1 = sym("cyl"),
  var2 = sym("am")
)
select(mtcars, !!!vars)
#>                     var1 var2
#> Mazda RX4              6    1
#> Mazda RX4 Wag          6    1
#> Datsun 710             4    1
#> Hornet 4 Drive         6    0
#> Hornet Sportabout      8    0
#> Valiant                6    0
#> Duster 360             8    0
#> Merc 240D              4    0
#> Merc 230               4    0
#> Merc 280               6    0
#> Merc 280C              6    0
#> Merc 450SE             8    0
#> Merc 450SL             8    0
#> Merc 450SLC            8    0
#> Cadillac Fleetwood     8    0
#> Lincoln Continental    8    0
#> Chrysler Imperial      8    0
#> Fiat 128               4    1
#> Honda Civic            4    1
#> Toyota Corolla         4    1
#> Toyota Corona          4    0
#> Dodge Challenger       8    0
#> AMC Javelin            8    0
#> Camaro Z28             8    0
#> Pontiac Firebird       8    0
#> Fiat X1-9              4    1
#> Porsche 914-2          4    1
#> Lotus Europa           4    1
#> Ford Pantera L         8    1
#> Ferrari Dino           6    1
#> Maserati Bora          8    1
#> Volvo 142E             4    1
# For convenience it also supports strings and character
# vectors. This is unlike other verbs where strings would be
# ambiguous.
vars <- c(var1 = "cyl", var2 = "am")
select(mtcars, !!vars)
#>                     var1 var2
#> Mazda RX4              6    1
#> Mazda RX4 Wag          6    1
#> Datsun 710             4    1
#> Hornet 4 Drive         6    0
#> Hornet Sportabout      8    0
#> Valiant                6    0
#> Duster 360             8    0
#> Merc 240D              4    0
#> Merc 230               4    0
#> Merc 280               6    0
#> Merc 280C              6    0
#> Merc 450SE             8    0
#> Merc 450SL             8    0
#> Merc 450SLC            8    0
#> Cadillac Fleetwood     8    0
#> Lincoln Continental    8    0
#> Chrysler Imperial      8    0
#> Fiat 128               4    1
#> Honda Civic            4    1
#> Toyota Corolla         4    1
#> Toyota Corona          4    0
#> Dodge Challenger       8    0
#> AMC Javelin            8    0
#> Camaro Z28             8    0
#> Pontiac Firebird       8    0
#> Fiat X1-9              4    1
#> Porsche 914-2          4    1
#> Lotus Europa           4    1
#> Ford Pantera L         8    1
#> Ferrari Dino           6    1
#> Maserati Bora          8    1
#> Volvo 142E             4    1
rename(mtcars, !!vars)
#>                      mpg var1  disp  hp drat    wt  qsec vs var2 gear carb
#> Mazda RX4           21.0    6 160.0 110 3.90 2.620 16.46  0    1    4    4
#> Mazda RX4 Wag       21.0    6 160.0 110 3.90 2.875 17.02  0    1    4    4
#> Datsun 710          22.8    4 108.0  93 3.85 2.320 18.61  1    1    4    1
#> Hornet 4 Drive      21.4    6 258.0 110 3.08 3.215 19.44  1    0    3    1
#> Hornet Sportabout   18.7    8 360.0 175 3.15 3.440 17.02  0    0    3    2
#> Valiant             18.1    6 225.0 105 2.76 3.460 20.22  1    0    3    1
#> Duster 360          14.3    8 360.0 245 3.21 3.570 15.84  0    0    3    4
#> Merc 240D           24.4    4 146.7  62 3.69 3.190 20.00  1    0    4    2
#> Merc 230            22.8    4 140.8  95 3.92 3.150 22.90  1    0    4    2
#> Merc 280            19.2    6 167.6 123 3.92 3.440 18.30  1    0    4    4
#> Merc 280C           17.8    6 167.6 123 3.92 3.440 18.90  1    0    4    4
#> Merc 450SE          16.4    8 275.8 180 3.07 4.070 17.40  0    0    3    3
#> Merc 450SL          17.3    8 275.8 180 3.07 3.730 17.60  0    0    3    3
#> Merc 450SLC         15.2    8 275.8 180 3.07 3.780 18.00  0    0    3    3
#> Cadillac Fleetwood  10.4    8 472.0 205 2.93 5.250 17.98  0    0    3    4
#> Lincoln Continental 10.4    8 460.0 215 3.00 5.424 17.82  0    0    3    4
#> Chrysler Imperial   14.7    8 440.0 230 3.23 5.345 17.42  0    0    3    4
#> Fiat 128            32.4    4  78.7  66 4.08 2.200 19.47  1    1    4    1
#> Honda Civic         30.4    4  75.7  52 4.93 1.615 18.52  1    1    4    2
#> Toyota Corolla      33.9    4  71.1  65 4.22 1.835 19.90  1    1    4    1
#> Toyota Corona       21.5    4 120.1  97 3.70 2.465 20.01  1    0    3    1
#> Dodge Challenger    15.5    8 318.0 150 2.76 3.520 16.87  0    0    3    2
#> AMC Javelin         15.2    8 304.0 150 3.15 3.435 17.30  0    0    3    2
#> Camaro Z28          13.3    8 350.0 245 3.73 3.840 15.41  0    0    3    4
#> Pontiac Firebird    19.2    8 400.0 175 3.08 3.845 17.05  0    0    3    2
#> Fiat X1-9           27.3    4  79.0  66 4.08 1.935 18.90  1    1    4    1
#> Porsche 914-2       26.0    4 120.3  91 4.43 2.140 16.70  0    1    5    2
#> Lotus Europa        30.4    4  95.1 113 3.77 1.513 16.90  1    1    5    2
#> Ford Pantera L      15.8    8 351.0 264 4.22 3.170 14.50  0    1    5    4
#> Ferrari Dino        19.7    6 145.0 175 3.62 2.770 15.50  0    1    5    6
#> Maserati Bora       15.0    8 301.0 335 3.54 3.570 14.60  0    1    5    8
#> Volvo 142E          21.4    4 121.0 109 4.11 2.780 18.60  1    1    4    2