A modification of the system function glm() to include estimation of the additional parameter, theta, for a Negative Binomial generalized linear model.

glm.nb(formula, data, weights, subset, na.action,
       start = NULL, etastart, mustart,
       control = glm.control(...), method = "glm.fit",
       model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, ...,
       init.theta, link = log)

Arguments

formula, data, weights, subset, na.action, start, etastart, mustart, control, method, model, x, y, contrasts, ...

arguments for the glm() function. Note that these exclude family and offset (but offset() can be used).

init.theta

Optional initial value for the theta parameter. If omitted a moment estimator after an initial fit using a Poisson GLM is used.

link

The link function. Currently must be one of log, sqrt or identity.

Value

A fitted model object of class negbin inheriting from glm and lm. The object is like the output of glm but contains three additional components, namely theta for the ML estimate of theta, SE.theta for its approximate standard error (using observed rather than expected information), and twologlik for twice the log-likelihood function.

Details

An alternating iteration process is used. For given theta the GLM is fitted using the same process as used by glm(). For fixed means the theta parameter is estimated using score and information iterations. The two are alternated until convergence of both. (The number of alternations and the number of iterations when estimating theta are controlled by the maxit parameter of glm.control.)

Setting trace > 0 traces the alternating iteration process. Setting trace > 1 traces the glm fit, and setting trace > 2 traces the estimation of theta.

References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

See also

Examples

quine.nb1 <- glm.nb(Days ~ Sex/(Age + Eth*Lrn), data = quine) quine.nb2 <- update(quine.nb1, . ~ . + Sex:Age:Lrn) quine.nb3 <- update(quine.nb2, Days ~ .^4) anova(quine.nb1, quine.nb2, quine.nb3)
#> Likelihood ratio tests of Negative Binomial Models #> #> Response: Days #> Model #> 1 Sex/(Age + Eth * Lrn) #> 2 Sex + Sex:Age + Sex:Eth + Sex:Lrn + Sex:Eth:Lrn + Sex:Age:Lrn #> 3 Sex + Sex:Age + Sex:Eth + Sex:Lrn + Sex:Eth:Lrn + Sex:Age:Lrn + Sex:Age:Eth + Sex:Age:Eth:Lrn #> theta Resid. df 2 x log-lik. Test df LR stat. Pr(Chi) #> 1 1.597991 132 -1063.025 #> 2 1.686899 128 -1055.398 1 vs 2 4 7.627279 0.10622602 #> 3 1.928360 118 -1039.324 2 vs 3 10 16.073723 0.09754136
## PR#1695 y <- c(7, 5, 4, 7, 5, 2, 11, 5, 5, 4, 2, 3, 4, 3, 5, 9, 6, 7, 10, 6, 12, 6, 3, 5, 3, 9, 13, 0, 6, 1, 2, 0, 1, 0, 0, 4, 5, 1, 5, 3, 3, 4) lag1 <- c(0, 7, 5, 4, 7, 5, 2, 11, 5, 5, 4, 2, 3, 4, 3, 5, 9, 6, 7, 10, 6, 12, 6, 3, 5, 3, 9, 13, 0, 6, 1, 2, 0, 1, 0, 0, 4, 5, 1, 5, 3, 3) lag2 <- c(0, 0, 7, 5, 4, 7, 5, 2, 11, 5, 5, 4, 2, 3, 4, 3, 5, 9, 6, 7, 10, 6, 12, 6, 3, 5, 3, 9, 13, 0, 6, 1, 2, 0, 1, 0, 0, 4, 5, 1, 5, 3) lag3 <- c(0, 0, 0, 7, 5, 4, 7, 5, 2, 11, 5, 5, 4, 2, 3, 4, 3, 5, 9, 6, 7, 10, 6, 12, 6, 3, 5, 3, 9, 13, 0, 6, 1, 2, 0, 1, 0, 0, 4, 5, 1, 5) (fit <- glm(y ~ lag1+lag2+lag3, family=poisson(link=identity), start=c(2, 0.1, 0.1, 0.1)))
#> #> Call: glm(formula = y ~ lag1 + lag2 + lag3, family = poisson(link = identity), #> start = c(2, 0.1, 0.1, 0.1)) #> #> Coefficients: #> (Intercept) lag1 lag2 lag3 #> 2.6609 0.1573 0.1424 0.1458 #> #> Degrees of Freedom: 41 Total (i.e. Null); 38 Residual #> Null Deviance: 100.2 #> Residual Deviance: 90.34 AIC: 225.6
try(glm.nb(y ~ lag1+lag2+lag3, link=identity))
#> Warning: NaNs produced
#> Error : no valid set of coefficients has been found: please supply starting values
glm.nb(y ~ lag1+lag2+lag3, link=identity, start=c(2, 0.1, 0.1, 0.1))
#> #> Call: glm.nb(formula = y ~ lag1 + lag2 + lag3, start = c(2, 0.1, 0.1, #> 0.1), link = identity, init.theta = 4.406504429) #> #> Coefficients: #> (Intercept) lag1 lag2 lag3 #> 2.6298 0.1774 0.1407 0.1346 #> #> Degrees of Freedom: 41 Total (i.e. Null); 38 Residual #> Null Deviance: 55.07 #> Residual Deviance: 50.09 AIC: 215.9
glm.nb(y ~ lag1+lag2+lag3, link=identity, start=coef(fit))
#> #> Call: glm.nb(formula = y ~ lag1 + lag2 + lag3, start = coef(fit), link = identity, #> init.theta = 4.406504429) #> #> Coefficients: #> (Intercept) lag1 lag2 lag3 #> 2.6298 0.1774 0.1407 0.1346 #> #> Degrees of Freedom: 41 Total (i.e. Null); 38 Residual #> Null Deviance: 55.07 #> Residual Deviance: 50.09 AIC: 215.9
glm.nb(y ~ lag1+lag2+lag3, link=identity, etastart=rep(5, 42))
#> #> Call: glm.nb(formula = y ~ lag1 + lag2 + lag3, etastart = rep(5, 42), #> link = identity, init.theta = 4.406504429) #> #> Coefficients: #> (Intercept) lag1 lag2 lag3 #> 2.6298 0.1774 0.1407 0.1346 #> #> Degrees of Freedom: 41 Total (i.e. Null); 38 Residual #> Null Deviance: 55.07 #> Residual Deviance: 50.09 AIC: 215.9