Default.Rd
A simulated data set containing information on ten thousand customers. The aim here is to predict which customers will default on their credit card debt.
Default
A data frame with 10000 observations on the following 4 variables.
default
A factor with levels No and Yes indicating whether the customer defaulted on their debt
student
A factor with levels No and Yes indicating whether the customer is a student
balance
The average balance that the customer has remaining on their credit card after making their monthly payment
income
Income of customer
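The format described above can be checked programmatically. The following is a minimal sketch, not part of the original documentation: check_default_schema is a hypothetical helper name, and the assumption that the data ship with the ISLR package is ours, so the load step is skipped when that package is not installed.

```r
# Hypothetical helper: verifies that a data frame matches the documented layout
# (four columns, two No/Yes factors, two numeric variables).
check_default_schema <- function(df) {
  expected <- c("default", "student", "balance", "income")
  stopifnot(
    is.data.frame(df),
    identical(names(df), expected),
    is.factor(df$default), identical(levels(df$default), c("No", "Yes")),
    is.factor(df$student), identical(levels(df$student), c("No", "Yes")),
    is.numeric(df$balance),
    is.numeric(df$income)
  )
  invisible(TRUE)
}

# Assumption: the data set is provided by the ISLR package; skip if it is absent.
if (requireNamespace("ISLR", quietly = TRUE)) {
  data("Default", package = "ISLR")
  check_default_schema(Default)
  str(Default)
}
```

The helper stops with an error on the first condition that fails, which makes it convenient as a guard at the top of an analysis script.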
Simulated data
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer-Verlag, New York. www.StatLearning.com
summary(Default)
#>  default    student       balance           income
#>  No :9667   No :7056   Min.   :   0.0   Min.   :  772
#>  Yes: 333   Yes:2944   1st Qu.: 481.7   1st Qu.:21340
#>                        Median : 823.6   Median :34553
#>                        Mean   : 835.4   Mean   :33517
#>                        3rd Qu.:1166.3   3rd Qu.:43808
#>                        Max.   :2654.3   Max.   :73554
glm(default ~ student + balance + income, family = "binomial", data = Default)
#>
#> Call:  glm(formula = default ~ student + balance + income, family = "binomial",
#>     data = Default)
#>
#> Coefficients:
#> (Intercept)   studentYes      balance       income
#>  -1.087e+01   -6.468e-01    5.737e-03    3.033e-06
#>
#> Degrees of Freedom: 9999 Total (i.e. Null);  9996 Residual
#> Null Deviance:     2921
#> Residual Deviance: 1572     AIC: 1580
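To illustrate how the fitted coefficients above translate into default probabilities, here is a minimal sketch in base R. It copies the coefficients as printed (so they are rounded), and both the helper name predict_default and the customer profiles are hypothetical, chosen only for illustration.

```r
# Coefficients copied from the printed glm() fit above (rounded as printed).
coefs <- c("(Intercept)" = -1.087e+01, studentYes = -6.468e-01,
           balance = 5.737e-03, income = 3.033e-06)

# Hypothetical helper: estimated default probability for one customer profile.
predict_default <- function(student, balance, income) {
  # Linear predictor on the log-odds scale.
  eta <- coefs[["(Intercept)"]] +
    coefs[["studentYes"]] * as.numeric(student) +
    coefs[["balance"]] * balance +
    coefs[["income"]] * income
  plogis(eta)  # inverse logit: 1 / (1 + exp(-eta))
}

# Illustrative profiles (values chosen for the example, not taken from the data).
p_student     <- predict_default(student = TRUE,  balance = 1500, income = 40000)
p_non_student <- predict_default(student = FALSE, balance = 1500, income = 40000)
round(c(student = p_student, non_student = p_non_student), 3)
```

With these inputs the estimated probabilities come out at roughly 0.058 for the student and 0.105 for the non-student. The negative studentYes coefficient means that, at the same balance and income, a student has lower estimated odds of default. With the fitted model object itself, predict(fit, newdata, type = "response") would give the same kind of estimate without the rounding.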