tibble deals with a few levels of name repair:
minimal names exist. The names attribute is not NULL. The name of an unnamed element is "" and never NA. Tibbles created by the tibble package have names that are, at least, minimal.
unique names are minimal, have no duplicates, and can be used where a variable name is expected. Empty names, and ... or .. followed by a sequence of digits are banned. All columns can be accessed by name via df[["name"]] and df$`name` and with(df, `name`).
universal names are unique and syntactic (see Details for more). Names work everywhere, without quoting: df$name and with(df, name) and lm(name1 ~ name2, data = df) and dplyr::select(df, name) all work.
universal implies unique, unique implies minimal. These levels are nested.
The .name_repair argument of tibble() and as_tibble() refers to these levels. Alternatively, the user can pass their own name repair function. It should anticipate minimal names as input and should, likewise, return names that are at least minimal.
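As a rough illustration of a user-supplied repair function, here is a minimal sketch. The helper snake_names() is hypothetical (it is not part of tibble). It accepts minimal names and returns names that are still minimal, that is, a character vector without NA.

```r
# Hypothetical helper, not part of tibble: lower-case the names and
# replace each run of non-alphanumeric characters with "_".
snake_names <- function(nms) {
  nms <- tolower(nms)
  nms <- gsub("[^a-z0-9]+", "_", nms)
  gsub("^_+|_+$", "", nms)  # drop leading and trailing underscores
}

snake_names(c("Year 1", "Total (USD)", ""))
# returns "year_1" "total_usd" ""

# Passing the function to tibble(), if the package is installed:
if (requireNamespace("tibble", quietly = TRUE)) {
  df <- tibble::tibble(`Year 1` = 1, `Total (USD)` = 2, .name_repair = snake_names)
  names(df)
}
```

A function like this only transforms the names it is given. It does not by itself guarantee that the result is unique or universal.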
The existing functions tidy_names(), set_tidy_names(), and repair_names() are soft-deprecated.
minimal names
minimal names exist. The names attribute is not NULL. The name of an unnamed element is "" and never NA.
Examples:
Original names of a vector with length 3: NULL
                           minimal names: "" "" ""

                          Original names: "x" NA
                           minimal names: "x" ""
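The rule for minimal names can be sketched in base R. The function minimal_names() is a hypothetical name used only for illustration. It is not tibble's implementation.

```r
# Sketch: return names that exist and contain "" instead of NA.
minimal_names <- function(x) {
  nms <- names(x)
  if (is.null(nms)) {
    # No names attribute at all: every element is unnamed
    return(rep("", length(x)))
  }
  nms[is.na(nms)] <- ""  # NA is never allowed as a name
  nms
}

minimal_names(1:3)
# returns "" "" ""

x <- 1:2
names(x) <- c("x", NA)
minimal_names(x)
# returns "x" ""
```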
Request .name_repair = "minimal" to suppress almost all name munging. This is useful when the first row of a data source (allegedly variable names) actually contains data and the resulting tibble is destined for reshaping with, e.g., tidyr::gather().
unique names
unique names are minimal, have no duplicates, and can be used (possibly with backticks) in contexts where a variable is expected. Empty names, and ... or .. followed by a sequence of digits are banned.
If a data frame has unique names, you can index it by name, and also access the columns by name. In particular, df[["name"]] and df$`name` and also with(df, `name`) always work.
There are many ways to make names unique. We append a suffix of the form ...j to any name that is "" or a duplicate, where j is the position. We also change ..# and ... to ...#.
Example:
Original names:     ""     "x"     "" "y"     "x"  "..2"  "..."
  unique names: "...1" "x...2" "...3" "y" "x...5" "...6" "...7"
Pre-existing suffixes of the form ...j are always stripped, prior to making names unique, i.e. reconstructing the suffixes. If this interacts poorly with your names, you should take control of name repair.
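The suffixing scheme described above can be approximated in a few lines of base R. This is a sketch under the rules as stated here, not tibble's actual code, and unique_names_sketch() is a hypothetical name.

```r
# Sketch of the "unique" repair rules described in the text.
unique_names_sketch <- function(nms) {
  nms[is.na(nms)] <- ""                           # start from minimal names
  nms <- sub("\\.\\.\\.[0-9]+$", "", nms)         # strip pre-existing ...j suffixes
  banned <- grepl("^\\.\\.[0-9]+$", nms) | nms == "..."
  nms[banned] <- ""                               # ..# and ... are treated as empty
  # Every empty name and every member of a duplicated group gets a suffix
  needs_suffix <- nms == "" |
    duplicated(nms) | duplicated(nms, fromLast = TRUE)
  nms[needs_suffix] <- paste0(nms[needs_suffix], "...", which(needs_suffix))
  nms
}

unique_names_sketch(c("", "x", "", "y", "x", "..2", "..."))
# returns "...1" "x...2" "...3" "y" "x...5" "...6" "...7"

# A pre-existing suffix is stripped even when the name was already unique:
unique_names_sketch(c("a...5", "b"))
# returns "a" "b"
```

The second call shows why names that legitimately end in ...j may call for a custom repair function.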
universal names
universal names are unique and syntactic, meaning they:
Are never empty (inherited from unique).
Have no duplicates (inherited from unique).
Are not ... and do not have the form ..i, where i is a number (inherited from unique).
Consist of letters, numbers, and the dot . or underscore _ characters.
Start with a letter or start with the dot . not followed by a number.
Are not a reserved word, e.g., if or function or TRUE.
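The conditions listed above can be checked with a small base R predicate. This is a sketch only: is_universal_sketch() is a hypothetical name, and it relies on base::make.names() to detect non-syntactic names and reserved words, which is an assumption of this example rather than how tibble checks names.

```r
# Sketch: TRUE for each name that satisfies the conditions listed above.
is_universal_sketch <- function(nms) {
  # make.names() rewrites names that are not syntactic or are reserved words,
  # so a name it leaves unchanged passes the character and reserved-word rules
  syntactic <- !is.na(nms) & nms == make.names(nms)
  # "..." and "..1", "..2", and so on are banned explicitly
  dotty <- !is.na(nms) & (nms == "..." | grepl("^\\.\\.[0-9]+$", nms))
  # every member of a duplicated group fails the check
  dup <- duplicated(nms) | duplicated(nms, fromLast = TRUE)
  syntactic & !dotty & !dup
}

is_universal_sketch(c("name", "a.1", "b_2", ".x"))
# returns TRUE TRUE TRUE TRUE

is_universal_sketch(c("a 1", "", "if", "..1", "...", "_z", ".2fa"))
# returns FALSE for every element

is_universal_sketch(c("x", "x", "y"))
# returns FALSE FALSE TRUE
```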
If a data frame has universal names, variable names can be used "as is" in code. They work well with nonstandard evaluation, e.g., df$name works.
Tibble has a different method of making names syntactic than base::make.names(). In general, tibble prepends one or more dots . until the name is syntactic.
Examples:
 Original names:     ""     "x"     NA     "x"
universal names: "...1" "x...2" "...3" "x...4"

 Original names: "(y)"  "_z"  ".2fa"  "FALSE"
universal names: ".y." "._z" "..2fa" ".FALSE"
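The dot-prepending idea can be sketched in base R as follows. The function syntactic_sketch() is a hypothetical name. The sketch assumes minimal input (no NA), treats only ASCII letters as letters, and uses base::make.names() merely as a test of whether a name is already syntactic. It does not handle empty or duplicated names, which belong to the unique step.

```r
# Sketch: make names syntactic by prepending dots rather than "X".
syntactic_sketch <- function(nms) {
  stopifnot(!anyNA(nms))                    # assumes minimal names
  nms <- gsub("[^A-Za-z0-9._]", ".", nms)   # invalid characters become "."
  repeat {
    bad <- make.names(nms) != nms           # changed by make.names() means not syntactic
    if (!any(bad)) break
    nms[bad] <- paste0(".", nms[bad])       # prepend one more dot and try again
  }
  nms
}

syntactic_sketch(c("(y)", "_z", ".2fa", "FALSE"))
# returns ".y." "._z" "..2fa" ".FALSE"

# For comparison, base R prepends "X" or appends "." instead:
make.names(c("_z", "FALSE"))
# returns "X_z" "FALSE."
```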
rlang::names2() returns the names of an object, after making them minimal.
The Names attribute section in the "tidyverse package development principles".
if (FALSE) {
## by default, duplicate names are not allowed
tibble(x = 1, x = 2)
}

## you can authorize duplicate names
tibble(x = 1, x = 2, .name_repair = "minimal")
#> # A tibble: 1 x 2
#>       x     x
#>   <dbl> <dbl>
#> 1     1     2

## or request that the names be made unique
tibble(x = 1, x = 2, .name_repair = "unique")
#> # A tibble: 1 x 2
#>   x...1 x...2
#>   <dbl> <dbl>
#> 1     1     2

## by default, non-syntactic names are allowed
df <- tibble(`a 1` = 1, `a 2` = 2)
## because you can still index by name
df[["a 1"]]
#> [1] 1
df$`a 1`
#> [1] 1

## syntactic names are easier to work with, though, and you can request them
df <- tibble(`a 1` = 1, `a 2` = 2, .name_repair = "universal")
df$a.1
#> [1] 1

## you can specify your own name repair function
tibble(x = 1, x = 2, .name_repair = make.unique)
#> # A tibble: 1 x 2
#>       x   x.1
#>   <dbl> <dbl>
#> 1     1     2

fix_names <- function(x) gsub("%", " percent", x)
tibble(`25%` = 1, `75%` = 2, .name_repair = fix_names)
#> # A tibble: 1 x 2
#>   `25 percent` `75 percent`
#>          <dbl>        <dbl>
#> 1            1            2

fix_names <- function(x) gsub("\\s+", "_", x)
tibble(`year 1` = 1, `year 2` = 2, .name_repair = fix_names)
#> # A tibble: 1 x 2
#>   year_1 year_2
#>    <dbl>  <dbl>
#> 1      1      2

## purrr-style anonymous functions and constants
## are also supported
tibble(x = 1, x = 2, .name_repair = ~ make.names(., unique = TRUE))
#> # A tibble: 1 x 2
#>       x   x.1
#>   <dbl> <dbl>
#> 1     1     2

tibble(x = 1, x = 2, .name_repair = ~ c("a", "b"))
#> # A tibble: 1 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2

## the names attribute will be non-NULL, with "" as the default element
df <- as_tibble(list(1:3, letters[1:3]), .name_repair = "minimal")
names(df)
#> [1] "" ""