Creates a data.table of feature importances in a model.
xgb.importance(feature_names = NULL, model = NULL, trees = NULL, data = NULL, label = NULL, target = NULL)
feature_names: character vector of feature names. If the model already contains feature names, those are used when feature_names = NULL (the default); a non-NULL value overrides the names stored in the model.

model: object of class xgb.Booster.

trees: (only for the gbtree booster) an integer vector of tree indices that should be included in the importance calculation. If set to NULL, all trees of the model are parsed. Tree indices are zero-based (e.g., trees = 0:4 selects the first five trees). This can be used, for instance, to compute importances separately for each class of a multiclass model.

data: deprecated.

label: deprecated.

target: deprecated.
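For illustration, a minimal sketch of the trees argument (bst is assumed to be an already fitted gbtree xgb.Booster; it is not defined here):

# `bst` is assumed to be a fitted xgb.Booster using the gbtree booster
xgb.importance(model = bst, trees = 0:2)   # only the first three trees; indices are zero-based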
For a tree model, a data.table with the following columns:

Features: names of the features used in the model;

Gain: fractional contribution of each feature to the model, based on the total gain of that feature's splits; a higher percentage means a more important predictive feature;

Cover: metric of the number of observations related to this feature;

Frequency: percentage representing the relative number of times a feature has been used in trees.
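The result is a regular data.table, so the usual data.table idioms apply. A minimal sketch (bst is assumed to be a fitted gbtree booster, as in the examples further below); for tree models the Gain, Cover and Frequency columns are each normalized to sum to one, as the example output below also shows:

library(data.table)
imp <- xgb.importance(model = bst)          # `bst`: a fitted gbtree model (assumed)
imp[order(-Gain)]                           # rank features by Gain, most important first
colSums(imp[, .(Gain, Cover, Frequency)])   # each column sums to (approximately) 1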
A linear model's importance data.table has the following columns:

Features: names of the features used in the model;

Weight: the linear coefficient of this feature;

Class: (only for multiclass models) class label.
If feature_names is not provided and the model doesn't contain feature names, the feature indices are used instead. Because the indices are extracted from the model dump (based on the C++ code), they start at 0 (as in C/C++ or Python) instead of 1 (as usual in R).
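In that situation the names can be supplied explicitly; a sketch, assuming the model was trained on a matrix X whose column names were not stored in the booster (X and bst are placeholders):

# `X` (training matrix) and `bst` (fitted booster) are placeholders
xgb.importance(feature_names = colnames(X), model = bst)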
This function works for both linear and tree models.
For linear models, the importance is the absolute magnitude of the linear coefficients. Consequently, to obtain a meaningful importance ranking for a linear model, the features need to be on the same scale (which is also advisable when using either L1 or L2 regularization).
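As a sketch of that advice (X is a placeholder numeric feature matrix and y a placeholder 0/1 label vector), one might standardize the features before fitting a gblinear booster and then rank by the absolute Weight:

# `X` and `y` are placeholders; scale() standardizes each column,
# as the iris example below also does
bst_lin <- xgboost(data = scale(X), label = y, booster = "gblinear",
                   nrounds = 20, objective = "binary:logistic")
xgb.importance(model = bst_lin)[order(-abs(Weight))]   # rank by absolute coefficient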
# binomial classification using gbtree:
data(agaricus.train, package = 'xgboost')
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
               max_depth = 2, eta = 1, nthread = 2, nrounds = 2,
               objective = "binary:logistic")
#> [1] train-error:0.046522
#> [2] train-error:0.022263
xgb.importance(model = bst)
#>                     Feature       Gain     Cover Frequency
#> 1:                odor=none 0.67615471 0.4978746       0.4
#> 2:          stalk-root=club 0.17135375 0.1920543       0.2
#> 3:        stalk-root=rooted 0.12317236 0.1638750       0.2
#> 4:  spore-print-color=green 0.02931918 0.1461960       0.2

# binomial classification using gblinear:
bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label,
               booster = "gblinear", eta = 0.3, nthread = 1, nrounds = 20,
               objective = "binary:logistic")
#> [1] train-error:0.015507
#> [2] train-error:0.003992
#> [3] train-error:0.001996
#> [4] train-error:0.001228
#> [5] train-error:0.000768
#> [6] train-error:0.000461
#> [7] train-error:0.000461
#> [8] train-error:0.000461
#> [9] train-error:0.000461
#> [10] train-error:0.000461
#> [11] train-error:0.000000
#> [12] train-error:0.000000
#> [13] train-error:0.000000
#> [14] train-error:0.000000
#> [15] train-error:0.000000
#> [16] train-error:0.000000
#> [17] train-error:0.000000
#> [18] train-error:0.000000
#> [19] train-error:0.000000
#> [20] train-error:0.000000
xgb.importance(model = bst)
#>                            Feature  Weight
#>   1:       spore-print-color=green 8.34577
#>   2: stalk-color-above-ring=yellow 8.01066
#>   3:           cap-surface=grooves 7.88359
#>   4:             cap-shape=conical 7.68607
#>   5:              gill-color=green 7.36118
#>  ---
#> 122:        stalk-root=rhizomorphs 0.00000
#> 123:           veil-type=universal 0.00000
#> 124:            ring-type=cobwebby 0.00000
#> 125:           ring-type=sheathing 0.00000
#> 126:                ring-type=zone 0.00000

# multiclass classification using gbtree:
nclass <- 3
nrounds <- 10
mbst <- xgboost(data = as.matrix(iris[, -5]), label = as.numeric(iris$Species) - 1,
                max_depth = 3, eta = 0.2, nthread = 2, nrounds = nrounds,
                objective = "multi:softprob", num_class = nclass)
#> [1] train-merror:0.026667
#> [2] train-merror:0.026667
#> [3] train-merror:0.026667
#> [4] train-merror:0.026667
#> [5] train-merror:0.026667
#> [6] train-merror:0.026667
#> [7] train-merror:0.026667
#> [8] train-merror:0.026667
#> [9] train-merror:0.026667
#> [10] train-merror:0.026667

# all classes clumped together:
xgb.importance(model = mbst)
#>         Feature         Gain       Cover  Frequency
#> 1: Petal.Length 0.6997634669 0.688819638 0.64583333
#> 2:  Petal.Width 0.2990390357 0.304397511 0.33333333
#> 3:  Sepal.Width 0.0009753064 0.001342612 0.01041667
#> 4: Sepal.Length 0.0002221910 0.005440239 0.01041667

# inspect importances separately for each class:
xgb.importance(model = mbst, trees = seq(from = 0, by = nclass, length.out = nrounds))
#>         Feature Gain Cover Frequency
#> 1: Petal.Length    1     1         1
xgb.importance(model = mbst, trees = seq(from = 1, by = nclass, length.out = nrounds))
#>         Feature      Gain     Cover Frequency
#> 1:  Petal.Width 0.6012748 0.2952208      0.25
#> 2: Petal.Length 0.3987252 0.7047792      0.75
xgb.importance(model = mbst, trees = seq(from = 2, by = nclass, length.out = nrounds))
#>         Feature         Gain      Cover  Frequency
#> 1: Petal.Length 0.6700913471 0.54322257 0.47826087
#> 2:  Petal.Width 0.3262204472 0.44009200 0.47826087
#> 3:  Sepal.Width 0.0030038734 0.00330275 0.02173913
#> 4: Sepal.Length 0.0006843323 0.01338268 0.02173913

# multiclass classification using gblinear:
mbst <- xgboost(data = scale(as.matrix(iris[, -5])), label = as.numeric(iris$Species) - 1,
                booster = "gblinear", eta = 0.2, nthread = 1, nrounds = 15,
                objective = "multi:softprob", num_class = nclass)
#> [1] train-merror:0.180000
#> [2] train-merror:0.180000
#> [3] train-merror:0.173333
#> [4] train-merror:0.166667
#> [5] train-merror:0.160000
#> [6] train-merror:0.153333
#> [7] train-merror:0.146667
#> [8] train-merror:0.146667
#> [9] train-merror:0.140000
#> [10] train-merror:0.133333
#> [11] train-merror:0.120000
#> [12] train-merror:0.113333
#> [13] train-merror:0.106667
#> [14] train-merror:0.106667
#> [15] train-merror:0.093333
xgb.importance(model = mbst)
#>          Feature     Weight Class
#>  1: Petal.Length -1.4556900     0
#>  2:  Petal.Width -1.1274600     0
#>  3:  Sepal.Width  1.0594100     0
#>  4: Sepal.Length -1.0027000     0
#>  5:  Sepal.Width -0.6074810     1
#>  6:  Petal.Width -0.2977170     1
#>  7: Sepal.Length  0.1526370     1
#>  8: Petal.Length  0.0478541     1
#>  9:  Petal.Width  1.0795600     2
#> 10: Petal.Length  0.9327340     2
#> 11: Sepal.Length  0.4676620     2
#> 12:  Sepal.Width -0.0492139     2