Creates a data.table of feature importances in a model.
xgb.importance(feature_names = NULL, model = NULL, trees = NULL, data = NULL, label = NULL, target = NULL)
feature_names: character vector of feature names. If the model already contains feature names, those are used when feature_names = NULL (the default); a non-NULL value overrides the names stored in the model.

model: object of class xgb.Booster.

trees: (only for the gbtree booster) an integer vector of tree indices that should be included in the importance calculation. If set to NULL, all trees of the model are parsed. Tree indices are zero-based (e.g., trees = 0:4 selects the first five trees). This can be used, for instance, to compute importances separately for each class of a multiclass model.

data: deprecated.

label: deprecated.

target: deprecated.
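For illustration, a minimal sketch of the trees argument (bst is assumed to be an already fitted gbtree xgb.Booster; it is not defined here):

# `bst` is assumed to be a fitted xgb.Booster using the gbtree booster
xgb.importance(model = bst, trees = 0:2)   # only the first three trees; indices are zero-based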
For a tree model, a data.table with the following columns:

Features: names of the features used in the model;

Gain: fractional contribution of each feature to the model, based on the total gain of that feature's splits; a higher percentage means a more important predictive feature;

Cover: metric of the number of observations related to this feature;

Frequency: percentage representing the relative number of times a feature has been used in trees.
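The result is a regular data.table, so the usual data.table idioms apply. A minimal sketch (bst is assumed to be a fitted gbtree booster, as in the examples further below); for tree models the Gain, Cover and Frequency columns are each normalized to sum to one, as the example output below also shows:

library(data.table)
imp <- xgb.importance(model = bst)          # `bst`: a fitted gbtree model (assumed)
imp[order(-Gain)]                           # rank features by Gain, most important first
colSums(imp[, .(Gain, Cover, Frequency)])   # each column sums to (approximately) 1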
A linear model's importance data.table has the following columns:

Features: names of the features used in the model;

Weight: the linear coefficient of this feature;

Class: (only for multiclass models) class label.
If feature_names is not provided and the model doesn't contain feature names, the feature indices are used instead. Because the indices are extracted from the model dump (based on the C++ code), they start at 0 (as in C/C++ or Python) instead of 1 (as usual in R).
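In that situation the names can be supplied explicitly; a sketch, assuming the model was trained on a matrix X whose column names were not stored in the booster (X and bst are placeholders):

# `X` (training matrix) and `bst` (fitted booster) are placeholders
xgb.importance(feature_names = colnames(X), model = bst)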
This function works for both linear and tree models.
For linear models, the importance is the absolute magnitude of the linear coefficients. Consequently, to obtain a meaningful importance ranking for a linear model, the features need to be on the same scale (which is also advisable when using either L1 or L2 regularization).
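As a sketch of that advice (X is a placeholder numeric feature matrix and y a placeholder 0/1 label vector), one might standardize the features before fitting a gblinear booster and then rank by the absolute Weight:

# `X` and `y` are placeholders; scale() standardizes each column,
# as the iris example below also does
bst_lin <- xgboost(data = scale(X), label = y, booster = "gblinear",
                   nrounds = 20, objective = "binary:logistic")
xgb.importance(model = bst_lin)[order(-abs(Weight))]   # rank by absolute coefficient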
# binomial classification using gbtree:
data(agaricus.train, package = 'xgboost')
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
               max_depth = 2, eta = 1, nthread = 2, nrounds = 2,
               objective = "binary:logistic")
#> [1] train-error:0.046522
#> [2] train-error:0.022263
xgb.importance(model = bst)
#>                     Feature       Gain     Cover Frequency
#> 1:                odor=none 0.67615471 0.4978746       0.4
#> 2:          stalk-root=club 0.17135375 0.1920543       0.2
#> 3:        stalk-root=rooted 0.12317236 0.1638750       0.2
#> 4:  spore-print-color=green 0.02931918 0.1461960       0.2

# binomial classification using gblinear:
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
               booster = "gblinear", eta = 0.3, nthread = 1, nrounds = 20,
               objective = "binary:logistic")
#> [1] train-error:0.015507
#> [2] train-error:0.003992
#> [3] train-error:0.001996
#> [4] train-error:0.001228
#> [5] train-error:0.000768
#> [6] train-error:0.000461
#> [7] train-error:0.000461
#> [8] train-error:0.000461
#> [9] train-error:0.000461
#> [10] train-error:0.000461
#> [11] train-error:0.000000
#> [12] train-error:0.000000
#> [13] train-error:0.000000
#> [14] train-error:0.000000
#> [15] train-error:0.000000
#> [16] train-error:0.000000
#> [17] train-error:0.000000
#> [18] train-error:0.000000
#> [19] train-error:0.000000
#> [20] train-error:0.000000
xgb.importance(model = bst)
#>                            Feature  Weight
#>   1:       spore-print-color=green 8.34577
#>   2: stalk-color-above-ring=yellow 8.01066
#>   3:           cap-surface=grooves 7.88359
#>   4:             cap-shape=conical 7.68607
#>   5:              gill-color=green 7.36118
#>  ---
#> 122:        stalk-root=rhizomorphs 0.00000
#> 123:           veil-type=universal 0.00000
#> 124:            ring-type=cobwebby 0.00000
#> 125:           ring-type=sheathing 0.00000
#> 126:                ring-type=zone 0.00000

# multiclass classification using gbtree:
nclass <- 3
nrounds <- 10
mbst <- xgboost(data = as.matrix(iris[, -5]), label = as.numeric(iris$Species) - 1,
                max_depth = 3, eta = 0.2, nthread = 2, nrounds = nrounds,
                objective = "multi:softprob", num_class = nclass)
#> [1] train-merror:0.026667
#> [2] train-merror:0.026667
#> [3] train-merror:0.026667
#> [4] train-merror:0.026667
#> [5] train-merror:0.026667
#> [6] train-merror:0.026667
#> [7] train-merror:0.026667
#> [8] train-merror:0.026667
#> [9] train-merror:0.026667
#> [10] train-merror:0.026667

# all classes clumped together:
xgb.importance(model = mbst)
#>         Feature         Gain       Cover  Frequency
#> 1: Petal.Length 0.6997634669 0.688819638 0.64583333
#> 2:  Petal.Width 0.2990390357 0.304397511 0.33333333
#> 3:  Sepal.Width 0.0009753064 0.001342612 0.01041667
#> 4: Sepal.Length 0.0002221910 0.005440239 0.01041667

# inspect importances separately for each class:
xgb.importance(model = mbst, trees = seq(from = 0, by = nclass, length.out = nrounds))
#>         Feature Gain Cover Frequency
#> 1: Petal.Length    1     1         1
xgb.importance(model = mbst, trees = seq(from = 1, by = nclass, length.out = nrounds))
#>         Feature      Gain     Cover Frequency
#> 1:  Petal.Width 0.6012748 0.2952208      0.25
#> 2: Petal.Length 0.3987252 0.7047792      0.75
xgb.importance(model = mbst, trees = seq(from = 2, by = nclass, length.out = nrounds))
#>         Feature         Gain      Cover  Frequency
#> 1: Petal.Length 0.6700913471 0.54322257 0.47826087
#> 2:  Petal.Width 0.3262204472 0.44009200 0.47826087
#> 3:  Sepal.Width 0.0030038734 0.00330275 0.02173913
#> 4: Sepal.Length 0.0006843323 0.01338268 0.02173913

# multiclass classification using gblinear:
mbst <- xgboost(data = scale(as.matrix(iris[, -5])), label = as.numeric(iris$Species) - 1,
                booster = "gblinear", eta = 0.2, nthread = 1, nrounds = 15,
                objective = "multi:softprob", num_class = nclass)
#> [1] train-merror:0.180000
#> [2] train-merror:0.180000
#> [3] train-merror:0.173333
#> [4] train-merror:0.166667
#> [5] train-merror:0.160000
#> [6] train-merror:0.153333
#> [7] train-merror:0.146667
#> [8] train-merror:0.146667
#> [9] train-merror:0.140000
#> [10] train-merror:0.133333
#> [11] train-merror:0.120000
#> [12] train-merror:0.113333
#> [13] train-merror:0.106667
#> [14] train-merror:0.106667
#> [15] train-merror:0.093333
xgb.importance(model = mbst)
#>          Feature     Weight Class
#>  1: Petal.Length -1.4556900     0
#>  2:  Petal.Width -1.1274600     0
#>  3:  Sepal.Width  1.0594100     0
#>  4: Sepal.Length -1.0027000     0
#>  5:  Sepal.Width -0.6074810     1
#>  6:  Petal.Width -0.2977170     1
#>  7: Sepal.Length  0.1526370     1
#>  8: Petal.Length  0.0478541     1
#>  9:  Petal.Width  1.0795600     2
#> 10: Petal.Length  0.9327340     2
#> 11: Sepal.Length  0.4676620     2
#> 12:  Sepal.Width -0.0492139     2