getTree.Rd
This function extracts the structure of a tree from a randomForest object.
getTree(rfobj, k=1, labelVar=FALSE)
| Argument | Description |
|---|---|
| rfobj | a randomForest object |
| k | which tree to extract? |
| labelVar | Should better labels be used for splitting variables and predicted class? |
A matrix (or data frame, if labelVar=TRUE) with six columns and number of rows equal to the total number of nodes in the tree. The six columns are:
left daughter: the row where the left daughter node is; 0 if the node is terminal
right daughter: the row where the right daughter node is; 0 if the node is terminal
split var: which variable was used to split the node; 0 if the node is terminal
split point: where the best split is; see Details for categorical predictors
status: is the node terminal (-1) or not (1)
prediction: the prediction for the node; 0 if the node is not terminal
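As a brief usage sketch (not part of this help page), the columns above can be inspected directly from the returned matrix; the forest rf, the seed, and ntree = 10 below are illustrative assumptions.

library(randomForest)
data(iris)
set.seed(1)                              # hypothetical seed, for reproducibility only
rf <- randomForest(iris[, -5], iris[, 5], ntree = 10)
tr <- getTree(rf, k = 1)                 # numeric matrix, one row per node
colnames(tr)                             # "left daughter", "right daughter", "split var", ...
sum(tr[, "status"] == -1)                # number of terminal nodes in this tree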
For numerical predictors, data with values of the variable less than or equal to the splitting point go to the left daughter node.
For categorical predictors, the splitting point is represented by an integer whose binary expansion gives the identities of the categories that go to the left or the right. For example, if a predictor has four categories and the split point is 13, the binary expansion of 13 is (1, 0, 1, 1) (because \(13 = 1*2^0 + 0*2^1 + 1*2^2 + 1*2^3\)), so cases with categories 1, 3, or 4 in this predictor are sent to the left, and the rest to the right.
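As a worked sketch of this encoding, the split point from the example can be decoded with base R bit utilities; split.point and n.cat are hypothetical names standing for the stored "split point" value and the number of categories of the splitting variable.

split.point <- 13                        # value stored in the "split point" column
n.cat <- 4                               # number of categories of the splitting variable
bits <- as.integer(intToBits(split.point))[seq_len(n.cat)]   # low-order bit first
which(bits == 1)                         # categories 1, 3 and 4 go to the left daughter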
data(iris)
## Look at the third tree in the forest.
getTree(randomForest(iris[,-5], iris[,5], ntree=10), 3, labelVar=TRUE)
#>    left daughter right daughter    split var split point status prediction
#> 1              2              3  Petal.Width        0.80      1       <NA>
#> 2              0              0         <NA>        0.00     -1     setosa
#> 3              4              5  Petal.Width        1.75      1       <NA>
#> 4              6              7 Petal.Length        5.35      1       <NA>
#> 5              0              0         <NA>        0.00     -1  virginica
#> 6              8              9 Sepal.Length        4.95      1       <NA>
#> 7              0              0         <NA>        0.00     -1  virginica
#> 8             10             11  Sepal.Width        2.45      1       <NA>
#> 9             12             13 Petal.Length        4.95      1       <NA>
#> 10             0              0         <NA>        0.00     -1 versicolor
#> 11             0              0         <NA>        0.00     -1  virginica
#> 12             0              0         <NA>        0.00     -1 versicolor
#> 13            14             15  Sepal.Width        2.45      1       <NA>
#> 14             0              0         <NA>        0.00     -1  virginica
#> 15             0              0         <NA>        0.00     -1 versicolor
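A further usage sketch, assuming only numeric predictors (as in iris): the helper descend_tree below is hypothetical, not part of randomForest, and simply walks the returned data frame from the root to a terminal node for one observation.

library(randomForest)
data(iris)
set.seed(1)                              # hypothetical seed, for reproducibility only
rf <- randomForest(iris[, -5], iris[, 5], ntree = 10)
tr <- getTree(rf, k = 3, labelVar = TRUE)

descend_tree <- function(tree, x) {
  node <- 1                              # start at the root (row 1)
  while (tree[node, "status"] != -1) {   # status -1 marks a terminal node
    v <- as.character(tree[node, "split var"])
    if (x[[v]] <= tree[node, "split point"]) {
      node <- tree[node, "left daughter"]   # numeric predictor: <= split point goes left
    } else {
      node <- tree[node, "right daughter"]
    }
  }
  tree[node, "prediction"]               # class label stored at the terminal node
}

descend_tree(tr, iris[1, -5])            # terminal-node prediction for the first iris row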