naiveBayes.Rd
Computes the conditional a posteriori probabilities of a categorical class variable given independent predictor variables, using Bayes' rule.
# S3 method for formula
naiveBayes(formula, data, laplace = 0, ..., subset, na.action = na.pass)

# S3 method for default
naiveBayes(x, y, laplace = 0, ...)

# S3 method for naiveBayes
predict(object, newdata, type = c("class", "raw"), threshold = 0.001, eps = 0, ...)
x | A numeric matrix, or a data frame of categorical and/or numeric variables. |
---|---|
y | Class vector. |
formula | A formula of the form |
data | Either a data frame of predictors (categorical and/or numeric) or a contingency table. |
laplace | positive double controlling Laplace smoothing. The default (0) disables Laplace smoothing. |
... | Currently not used. |
subset | For data given in a data frame, an index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.) |
na.action | A function to specify the action to be taken if |
object | An object of class |
newdata | A data frame with new predictors (with possibly fewer
columns than the training data). Note that the column names of
|
type | If |
threshold | Value replacing cells with probabilities within |
eps | double for specifying an epsilon-range to apply Laplace
smoothing (to replace zero or close-zero probabilities by |
An object of class "naiveBayes" including components:
Class distribution for the dependent variable.
A list of tables, one for each predictor variable. For each categorical variable a table giving, for each attribute level, the conditional probabilities given the target class. For each numeric variable, a table giving, for each target class, mean and standard deviation of the (sub-)variable.
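The conditional probability table for a categorical predictor can be reproduced by hand in base R. The sketch below does so for the `Class` predictor of the built-in `Titanic` table. The names `counts`, `cond`, and `smoothed` are illustrative only, and the smoothing formula is the common additive form, assumed here rather than taken from the package source.

```r
# Survived x Class counts, summed over Sex and Age.
# Titanic's dimensions are, in order: Class, Sex, Age, Survived.
counts <- margin.table(Titanic, c(4, 1))

# Row-wise proportions: P(Class = level | Survived = class)
cond <- prop.table(counts, 1)
round(cond, 4)

# Additive (Laplace) smoothing, ASSUMED form: add `laplace` to every cell
# and renormalise each row over the number of attribute levels.
laplace <- 1
smoothed <- (counts + laplace) / (rowSums(counts) + laplace * ncol(counts))
round(smoothed, 4)
```

With `laplace` set to 0 the smoothed table reduces to the plain row proportions, which is consistent with the default disabling smoothing.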
The standard naive Bayes classifier (at least this implementation) assumes independence of the predictor variables, and Gaussian distribution (given the target class) of metric predictors. For attributes with missing values, the corresponding table entries are omitted for prediction.
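To make the two assumptions concrete, the following base R sketch computes class posteriors for numeric predictors by multiplying a class prior with one Gaussian density per predictor (done in log space). The function `gaussian_nb_posterior` is a hypothetical helper written for illustration, not part of the package; it ignores missing values and the `threshold` and `eps` handling, and it expects `newdata` to have more than one row.

```r
# Hypothetical helper (not part of e1071): Gaussian naive Bayes by hand.
# x: data frame of numeric predictors, y: factor of classes,
# newdata: data frame with the same columns as x (more than one row).
gaussian_nb_posterior <- function(x, y, newdata) {
  classes <- levels(y)
  # One column per class: log prior plus the sum of per-predictor
  # log densities, which is where the independence assumption enters.
  log_post <- sapply(classes, function(cl) {
    rows <- y == cl
    ll <- log(mean(rows))  # prior = relative class frequency
    for (v in names(x)) {
      ll <- ll + dnorm(newdata[[v]], mean(x[rows, v]), sd(x[rows, v]),
                       log = TRUE)
    }
    ll
  })
  # Subtract the row maximum for numerical stability, then normalise
  # so that each row of posteriors sums to one.
  post <- exp(log_post - apply(log_post, 1, max))
  post / rowSums(post)
}

post <- gaussian_nb_posterior(iris[, -5], iris[, 5], iris[, -5])
pred <- factor(colnames(post)[max.col(post, ties.method = "first")],
               levels = levels(iris$Species))
table(pred, iris$Species)
```

Working in log space avoids underflow when many predictors each contribute a small density value.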
## Categorical data only:
data(HouseVotes84, package = "mlbench")
model <- naiveBayes(Class ~ ., data = HouseVotes84)
predict(model, HouseVotes84[1:10,])
#>  [1] republican republican republican democrat   democrat   democrat
#>  [7] republican republican republican democrat
#> Levels: democrat republican
predict(model, HouseVotes84[1:10,], type = "raw")
#>           democrat   republican
#>  [1,] 1.029209e-07 9.999999e-01
#>  [2,] 5.820415e-08 9.999999e-01
#>  [3,] 5.684937e-03 9.943151e-01
#>  [4,] 9.985798e-01 1.420152e-03
#>  [5,] 9.666720e-01 3.332802e-02
#>  [6,] 8.121430e-01 1.878570e-01
#>  [7,] 1.751512e-04 9.998248e-01
#>  [8,] 8.300100e-06 9.999917e-01
#>  [9,] 8.277705e-08 9.999999e-01
#> [10,] 1.000000e+00 5.029425e-11

pred <- predict(model, HouseVotes84)
table(pred, HouseVotes84$Class)
#>
#> pred         democrat republican
#>   democrat        238         13
#>   republican       29        155

## using laplace smoothing:
model <- naiveBayes(Class ~ ., data = HouseVotes84, laplace = 3)
pred <- predict(model, HouseVotes84[,-1])
table(pred, HouseVotes84$Class)
#>
#> pred         democrat republican
#>   democrat        237         12
#>   republican       30        156

## Example of using a contingency table:
data(Titanic)
m <- naiveBayes(Survived ~ ., data = Titanic)
m
#>
#> Naive Bayes Classifier for Discrete Predictors
#>
#> Call:
#> naiveBayes.formula(formula = Survived ~ ., data = Titanic)
#>
#> A-priori probabilities:
#> Survived
#>       No      Yes
#> 0.676965 0.323035
#>
#> Conditional probabilities:
#>         Class
#> Survived        1st        2nd        3rd       Crew
#>      No  0.08187919 0.11208054 0.35436242 0.45167785
#>      Yes 0.28551336 0.16596343 0.25035162 0.29817159
#>
#>         Sex
#> Survived       Male     Female
#>      No  0.91543624 0.08456376
#>      Yes 0.51617440 0.48382560
#>
#>         Age
#> Survived      Child      Adult
#>      No  0.03489933 0.96510067
#>      Yes 0.08016878 0.91983122
#>
predict(m, as.data.frame(Titanic))
#>  [1] Yes No  No  No  Yes Yes Yes Yes No  No  No  No  Yes Yes Yes Yes Yes No  No
#> [20] No  Yes Yes Yes Yes No  No  No  No  Yes Yes Yes Yes
#> Levels: No Yes

## Example with metric predictors:
data(iris)
m <- naiveBayes(Species ~ ., data = iris)
## alternatively:
m <- naiveBayes(iris[,-5], iris[,5])
m
#>
#> Naive Bayes Classifier for Discrete Predictors
#>
#> Call:
#> naiveBayes.default(x = iris[, -5], y = iris[, 5])
#>
#> A-priori probabilities:
#> iris[, 5]
#>     setosa versicolor  virginica
#>  0.3333333  0.3333333  0.3333333
#>
#> Conditional probabilities:
#>             Sepal.Length
#> iris[, 5]     [,1]      [,2]
#>   setosa     5.006 0.3524897
#>   versicolor 5.936 0.5161711
#>   virginica  6.588 0.6358796
#>
#>             Sepal.Width
#> iris[, 5]     [,1]      [,2]
#>   setosa     3.428 0.3790644
#>   versicolor 2.770 0.3137983
#>   virginica  2.974 0.3224966
#>
#>             Petal.Length
#> iris[, 5]     [,1]      [,2]
#>   setosa     1.462 0.1736640
#>   versicolor 4.260 0.4699110
#>   virginica  5.552 0.5518947
#>
#>             Petal.Width
#> iris[, 5]     [,1]      [,2]
#>   setosa     0.246 0.1053856
#>   versicolor 1.326 0.1977527
#>   virginica  2.026 0.2746501
#>
table(predict(m, iris), iris[,5])
#>
#>              setosa versicolor virginica
#>   setosa         50          0         0
#>   versicolor      0         47         3
#>   virginica       0          3        47