downSample will randomly sample a data set so that all classes have the same frequency as the minority class. upSample samples with replacement so that all classes have the same frequency as the majority class.

downSample(x, y, list = FALSE, yname = "Class")

upSample(x, y, list = FALSE, yname = "Class")

Arguments

x

a matrix or data frame of predictor variables

y

a factor variable with the class memberships

list

should the function return list(x, y) or bind x and y together? If TRUE, the output will be coerced to a data frame.

yname

if list = FALSE, a label for the class column

Value

Either a data frame or a list with elements x and y.
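As a brief illustration of the two return shapes, the sketch below calls downSample once with a custom yname and once with list = TRUE. It assumes the caret package (which supplies the oil data used in the examples on this page) is installed; the object names as_df and as_list are made up for the illustration.

```r
# Sketch only: runs when caret is installed, otherwise does nothing.
if (requireNamespace("caret", quietly = TRUE)) {
  library(caret)
  data(oil)       # provides fattyAcids (predictors) and oilType (factor)
  set.seed(2)     # the sampling is random, so fix the seed for repeatability

  # Default list = FALSE: one data frame, class column named by yname
  as_df <- downSample(fattyAcids, oilType, yname = "oil")
  names(as_df)[ncol(as_df)]

  # list = TRUE: predictors and classes returned as separate elements
  as_list <- downSample(fattyAcids, oilType, list = TRUE)
  names(as_list)
  table(as_list$y)
}
```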

Details

Simple random sampling is used to down-sample the majority class(es). Note that the minority class data are left intact and that the samples will be re-ordered in the down-sampled version.
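The down-sampling idea can be sketched in base R as follows. This is an illustration of the technique described above, not the package's own code: down_sample_sketch and the toy data built from the built-in iris data are hypothetical, and the sketch assumes y is a factor with no empty levels.

```r
# Hypothetical helper illustrating down-sampling; not the package's implementation.
down_sample_sketch <- function(x, y) {
  stopifnot(is.factor(y), nrow(x) == length(y))
  n_min <- min(table(y))                    # size of the smallest class
  idx_by_class <- split(seq_along(y), y)    # row indices grouped by class
  keep <- unlist(lapply(idx_by_class, function(i) {
    # draw n_min rows without replacement; the minority class keeps all rows
    i[sample.int(length(i), n_min)]
  }), use.names = FALSE)
  out <- data.frame(x[keep, , drop = FALSE], Class = y[keep])
  rownames(out) <- NULL                     # rows are re-ordered by class
  out
}

set.seed(1)
# Imbalanced toy data: 50 setosa, 20 versicolor, 5 virginica
rows  <- c(1:50, 51:70, 101:105)
toy_x <- iris[rows, 1:4]
toy_y <- droplevels(iris$Species[rows])

down <- down_sample_sketch(toy_x, toy_y)
table(down$Class)   # every class now has the minority count
```

Sampling row indices rather than rows keeps the predictors and class labels aligned, and grouping by class is what causes the re-ordering mentioned above.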

For up-sampling, all the original data are left intact and additional samples are added to the minority classes with replacement.
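A matching base R sketch for up-sampling is shown here. Again, this only illustrates the idea in the paragraph above and is not the package's code: up_sample_sketch and the iris-based toy data are hypothetical, and the sketch assumes y is a factor with no empty levels.

```r
# Hypothetical helper illustrating up-sampling; not the package's implementation.
up_sample_sketch <- function(x, y) {
  stopifnot(is.factor(y), nrow(x) == length(y))
  n_max <- max(table(y))                    # size of the largest class
  idx_by_class <- split(seq_along(y), y)    # row indices grouped by class
  keep <- unlist(lapply(idx_by_class, function(i) {
    extra <- n_max - length(i)              # shortfall relative to the largest class
    # keep every original row, then draw the shortfall with replacement
    c(i, i[sample.int(length(i), extra, replace = TRUE)])
  }), use.names = FALSE)
  out <- data.frame(x[keep, , drop = FALSE], Class = y[keep])
  rownames(out) <- NULL
  out
}

set.seed(1)
# Imbalanced toy data: 50 setosa, 20 versicolor, 5 virginica
rows  <- c(1:50, 51:70, 101:105)
toy_x <- iris[rows, 1:4]
toy_y <- droplevels(iris$Species[rows])

up <- up_sample_sketch(toy_x, toy_y)
table(up$Class)     # every class now has the majority count
```

Because the extra rows are exact copies of existing ones, very small classes end up dominated by a handful of repeated observations, which is visible in the upSample output in the examples for classes C and G.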

Examples

## A ridiculous example...
data(oil)
table(oilType)
#> oilType
#>  A  B  C  D  E  F  G
#> 37 26  3  7 11 10  2
downSample(fattyAcids, oilType)
#>    Palmitic Stearic Oleic Linoleic Linolenic Eicosanoic Eicosenoic Class
#> 1      11.1     5.0  32.9     49.8       0.3        0.4        0.1     A
#> 2      10.5     5.0  31.8     51.3       0.4        0.4        0.1     A
#> 3       6.8     4.3  26.4     60.5       1.3        0.1        0.1     B
#> 4       6.0     4.4  27.1     60.9       0.9        0.1        0.1     B
#> 5       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 6       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 7       9.3     2.8  65.0     17.0       3.9        0.5        0.7     D
#> 8      12.0     2.7  75.1      8.5       0.8        0.1        0.1     D
#> 9      11.9     3.8  25.7     52.7       5.8        0.1        0.1     E
#> 10      9.6     3.5  30.3     49.2       5.9        0.4        0.3     E
#> 11      4.5     1.7  64.9     18.6       8.3        0.1        0.1     F
#> 12      5.4     2.0  53.2     28.9       7.3        0.6        1.3     F
#> 13     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 14     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
upSample(fattyAcids, oilType)
#>     Palmitic Stearic Oleic Linoleic Linolenic Eicosanoic Eicosenoic Class
#> 1        9.7     5.2  31.0     52.7       0.4        0.4        0.1     A
#> 2       11.1     5.0  32.9     49.8       0.3        0.4        0.1     A
#> 3       11.5     5.2  35.0     47.2       0.2        0.4        0.1     A
#> 4       10.0     4.8  30.4     53.5       0.3        0.4        0.1     A
#> 5       12.2     5.0  31.1     50.5       0.3        0.4        0.1     A
#> 6        9.8     4.2  43.0     39.2       2.4        0.4        0.5     A
#> 7       10.5     5.0  31.8     51.3       0.4        0.4        0.1     A
#> 8       10.5     5.0  31.8     51.3       0.4        0.4        0.1     A
#> 9       11.5     5.2  35.0     47.2       0.2        0.4        0.1     A
#> 10      10.0     4.8  30.4     53.5       0.3        0.4        0.1     A
#> 11      11.1     5.0  32.9     49.8       0.3        0.4        0.1     A
#> 12       9.3     4.4  43.3     39.2       2.3        0.4        0.5     A
#> 13      12.2     5.0  31.1     50.5       0.3        0.4        0.1     A
#> 14       9.7     5.6  35.2     47.8       0.5        0.4        0.2     A
#> 15      11.1     4.6  28.3     50.6       4.2        0.4        0.2     A
#> 16      11.0     4.7  29.8     49.6       3.4        0.4        0.2     A
#> 17      11.6     6.0  35.4     45.9       0.2        0.4        0.1     A
#> 18       9.8     5.3  31.7     51.3       0.8        0.4        0.2     A
#> 19      11.2     6.2  37.7     43.6       0.2        0.5        0.2     A
#> 20       7.6     4.7  29.1     56.1       0.7        0.4        0.2     A
#> 21       7.7     4.7  31.1     54.3       0.8        0.4        0.2     A
#> 22       9.9     5.2  37.4     44.2       2.1        0.4        0.3     A
#> 23      11.5     5.1  27.8     54.5       0.2        0.4        0.1     A
#> 24      11.3     5.8  35.2     46.7       0.2        0.4        0.1     A
#> 25      12.2     5.4  29.4     53.0       1.0        0.1        0.1     A
#> 26      11.0     5.3  35.0     45.2       1.3        1.3        0.7     A
#> 27      11.9     5.6  33.6     48.9       1.0        0.1        0.1     A
#> 28      11.4     5.8  34.5     48.3       1.0        0.1        0.1     A
#> 29      10.7     5.4  39.3     43.2       1.4        0.1        0.1     A
#> 30      11.2     6.2  35.8     44.1       0.7        1.1        0.1     A
#> 31      11.4     5.8  33.9     48.1       0.8        0.8        0.1     A
#> 32      11.5     6.2  39.7     42.6       0.8        0.1        0.1     A
#> 33      13.0     6.2  25.8     55.0       0.8        0.1        0.1     A
#> 34      13.0     6.7  29.6     50.8       0.5        0.1        0.1     A
#> 35      13.1     6.3  26.5     53.6       0.5        0.5        0.1     A
#> 36      11.6     6.5  38.4     42.8       0.5        0.7        0.1     A
#> 37      13.1     5.7  31.7     49.5       0.6        0.1        0.1     A
#> 38       6.1     4.1  24.0     64.3       0.1        0.3        0.1     B
#> 39       6.2     3.9  27.1     59.9       1.3        0.3        0.3     B
#> 40       6.0     4.9  22.8     64.4       0.3        0.3        0.2     B
#> 41       7.2     4.5  25.6     61.1       0.2        0.3        0.2     B
#> 42       6.2     4.1  26.8     61.4       0.1        0.3        0.2     B
#> 43       6.1     4.0  25.2     63.2       0.1        0.2        0.2     B
#> 44       6.1     4.1  26.7     61.0       0.6        0.3        0.2     B
#> 45       6.2     4.0  25.8     62.2       0.4        0.3        0.2     B
#> 46       6.0     3.8  29.6     57.7       1.2        0.3        0.3     B
#> 47       6.5     2.8  23.2     66.1       0.1        0.2        0.2     B
#> 48       7.0     3.4  23.2     64.9       0.1        0.2        0.2     B
#> 49       6.0     4.0  28.3     60.1       0.1        0.3        0.2     B
#> 50       6.1     4.1  25.1     63.5       0.5        0.1        0.1     B
#> 51       6.3     4.2  27.4     61.4       0.8        0.1        0.1     B
#> 52       6.2     4.2  27.1     61.8       0.8        0.1        0.1     B
#> 53       6.2     4.2  27.0     60.9       0.5        0.3        0.1     B
#> 54       6.2     4.0  28.3     59.7       0.9        0.1        0.1     B
#> 55       5.6     4.2  25.7     58.9       1.7        2.8        0.9     B
#> 56       6.4     3.9  26.0     63.7       0.5        0.1        0.1     B
#> 57       6.8     4.3  26.4     60.5       1.3        0.1        0.1     B
#> 58       6.0     4.4  27.1     60.9       0.9        0.1        0.1     B
#> 59       6.4     4.8  25.3     61.8       1.0        0.1        0.1     B
#> 60       5.9     4.5  24.1     61.7       0.9        0.6        0.6     B
#> 61       6.2     4.1  29.9     57.8       1.3        0.1        0.1     B
#> 62       6.6     4.7  24.5     62.8       0.3        0.4        0.1     B
#> 63       6.4     4.4  24.4     63.7       0.4        0.4        0.1     B
#> 64       6.1     4.1  25.1     63.5       0.5        0.1        0.1     B
#> 65       7.0     3.4  23.2     64.9       0.1        0.2        0.2     B
#> 66       6.2     4.2  27.1     61.8       0.8        0.1        0.1     B
#> 67       6.1     4.0  25.2     63.2       0.1        0.2        0.2     B
#> 68       7.2     4.5  25.6     61.1       0.2        0.3        0.2     B
#> 69       7.2     4.5  25.6     61.1       0.2        0.3        0.2     B
#> 70       6.5     2.8  23.2     66.1       0.1        0.2        0.2     B
#> 71       6.1     4.1  24.0     64.3       0.1        0.3        0.1     B
#> 72       6.5     2.8  23.2     66.1       0.1        0.2        0.2     B
#> 73       6.4     3.9  26.0     63.7       0.5        0.1        0.1     B
#> 74       6.5     2.8  23.2     66.1       0.1        0.2        0.2     B
#> 75       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 76      10.0     3.3  60.0     21.3       0.2        1.5        1.3     C
#> 77       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 78      10.0     3.3  60.0     21.3       0.2        1.5        1.3     C
#> 79       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 80      10.0     3.3  60.0     21.3       0.2        1.5        1.3     C
#> 81       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 82       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 83       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 84      10.0     3.3  60.0     21.3       0.2        1.5        1.3     C
#> 85       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 86       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 87       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 88       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 89       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 90       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 91       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 92       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 93       9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 94       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 95       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 96       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 97       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 98       9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 99      10.0     3.3  60.0     21.3       0.2        1.5        1.3     C
#> 100      9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 101      9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 102      9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 103      9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 104      9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 105      9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 106     10.0     3.3  60.0     21.3       0.2        1.5        1.3     C
#> 107      9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 108      9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 109      9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 110      9.7     3.4  59.3     20.5       0.1        1.5        1.2     C
#> 111      9.6     3.3  57.7     20.7       0.2        1.5        1.8     C
#> 112     14.9     2.6  68.2     12.8       0.6        0.4        0.3     D
#> 113      9.3     2.8  65.0     17.0       3.9        0.5        0.7     D
#> 114     10.9     2.7  76.7      7.9       0.8        0.1        0.1     D
#> 115     10.5     2.8  75.8      8.0       0.7        0.1        0.1     D
#> 116     12.0     2.7  75.1      8.5       0.8        0.1        0.1     D
#> 117     11.7     2.9  74.6     10.1       0.6        0.1        0.1     D
#> 118     11.4     3.0  73.0     10.6       0.7        0.1        0.1     D
#> 119     11.4     3.0  73.0     10.6       0.7        0.1        0.1     D
#> 120      9.3     2.8  65.0     17.0       3.9        0.5        0.7     D
#> 121     11.4     3.0  73.0     10.6       0.7        0.1        0.1     D
#> 122     11.4     3.0  73.0     10.6       0.7        0.1        0.1     D
#> 123     14.9     2.6  68.2     12.8       0.6        0.4        0.3     D
#> 124     12.0     2.7  75.1      8.5       0.8        0.1        0.1     D
#> 125      9.3     2.8  65.0     17.0       3.9        0.5        0.7     D
#> 126     14.9     2.6  68.2     12.8       0.6        0.4        0.3     D
#> 127     11.4     3.0  73.0     10.6       0.7        0.1        0.1     D
#> 128     11.4     3.0  73.0     10.6       0.7        0.1        0.1     D
#> 129     11.4     3.0  73.0     10.6       0.7        0.1        0.1     D
#> 130     11.4     3.0  73.0     10.6       0.7        0.1        0.1     D
#> 131     12.0     2.7  75.1      8.5       0.8        0.1        0.1     D
#> 132     11.7     2.9  74.6     10.1       0.6        0.1        0.1     D
#> 133     10.9     2.7  76.7      7.9       0.8        0.1        0.1     D
#> 134     14.9     2.6  68.2     12.8       0.6        0.4        0.3     D
#> 135     10.9     2.7  76.7      7.9       0.8        0.1        0.1     D
#> 136      9.3     2.8  65.0     17.0       3.9        0.5        0.7     D
#> 137     11.7     2.9  74.6     10.1       0.6        0.1        0.1     D
#> 138      9.3     2.8  65.0     17.0       3.9        0.5        0.7     D
#> 139     10.5     2.8  75.8      8.0       0.7        0.1        0.1     D
#> 140      9.3     2.8  65.0     17.0       3.9        0.5        0.7     D
#> 141      9.3     2.8  65.0     17.0       3.9        0.5        0.7     D
#> 142     12.0     2.7  75.1      8.5       0.8        0.1        0.1     D
#> 143     10.9     2.7  76.7      7.9       0.8        0.1        0.1     D
#> 144     12.0     2.7  75.1      8.5       0.8        0.1        0.1     D
#> 145     14.9     2.6  68.2     12.8       0.6        0.4        0.3     D
#> 146     10.5     2.8  75.8      8.0       0.7        0.1        0.1     D
#> 147     14.9     2.6  68.2     12.8       0.6        0.4        0.3     D
#> 148     10.9     2.7  76.7      7.9       0.8        0.1        0.1     D
#> 149     10.9     3.6  26.0     52.6       5.5        0.4        0.2     E
#> 150      9.6     3.5  30.3     49.2       5.9        0.4        0.3     E
#> 151     10.5     4.2  25.5     52.0       7.8        0.1        0.1     E
#> 152     10.0     4.2  24.9     53.2       6.9        0.4        0.1     E
#> 153     10.4     4.2  25.9     50.8       7.5        0.4        0.4     E
#> 154     10.5     4.2  24.4     52.1       7.5        0.4        0.1     E
#> 155     10.5     4.3  24.6     53.1       7.6        0.1        0.1     E
#> 156     10.2     4.0  23.1     55.1       7.1        0.5        0.1     E
#> 157     10.9     3.8  27.2     49.5       6.4        0.7        0.9     E
#> 158     11.9     3.8  25.7     52.7       5.8        0.1        0.1     E
#> 159      9.7     3.9  25.1     54.2       5.9        0.1        0.1     E
#> 160     10.2     4.0  23.1     55.1       7.1        0.5        0.1     E
#> 161      9.6     3.5  30.3     49.2       5.9        0.4        0.3     E
#> 162     10.9     3.6  26.0     52.6       5.5        0.4        0.2     E
#> 163     10.5     4.3  24.6     53.1       7.6        0.1        0.1     E
#> 164     10.2     4.0  23.1     55.1       7.1        0.5        0.1     E
#> 165     10.0     4.2  24.9     53.2       6.9        0.4        0.1     E
#> 166     10.0     4.2  24.9     53.2       6.9        0.4        0.1     E
#> 167      9.6     3.5  30.3     49.2       5.9        0.4        0.3     E
#> 168     10.5     4.2  24.4     52.1       7.5        0.4        0.1     E
#> 169     10.5     4.2  24.4     52.1       7.5        0.4        0.1     E
#> 170      9.7     3.9  25.1     54.2       5.9        0.1        0.1     E
#> 171     10.9     3.6  26.0     52.6       5.5        0.4        0.2     E
#> 172     10.5     4.2  25.5     52.0       7.8        0.1        0.1     E
#> 173     10.4     4.2  25.9     50.8       7.5        0.4        0.4     E
#> 174      9.6     3.5  30.3     49.2       5.9        0.4        0.3     E
#> 175     10.5     4.3  24.6     53.1       7.6        0.1        0.1     E
#> 176     10.9     3.6  26.0     52.6       5.5        0.4        0.2     E
#> 177     10.5     4.2  25.5     52.0       7.8        0.1        0.1     E
#> 178     10.2     4.0  23.1     55.1       7.1        0.5        0.1     E
#> 179      9.7     3.9  25.1     54.2       5.9        0.1        0.1     E
#> 180     10.5     4.3  24.6     53.1       7.6        0.1        0.1     E
#> 181     10.4     4.2  25.9     50.8       7.5        0.4        0.4     E
#> 182     10.9     3.6  26.0     52.6       5.5        0.4        0.2     E
#> 183     10.2     4.0  23.1     55.1       7.1        0.5        0.1     E
#> 184     10.2     4.0  23.1     55.1       7.1        0.5        0.1     E
#> 185      9.7     3.9  25.1     54.2       5.9        0.1        0.1     E
#> 186      5.1     2.3  55.9     27.4       6.8        0.5        0.5     F
#> 187      4.8     1.8  62.6     20.0       9.5        0.1        1.4     F
#> 188      5.5     1.7  59.0     21.3       9.3        0.6        1.5     F
#> 189      5.1     1.9  59.2     22.3       9.3        0.7        1.6     F
#> 190      4.8     1.9  61.6     20.9       8.0        0.8        1.5     F
#> 191      5.4     2.0  53.2     28.9       7.3        0.6        1.3     F
#> 192      5.1     1.9  59.2     22.4       9.3        0.6        1.5     F
#> 193      4.5     1.7  64.9     18.6       8.3        0.1        0.1     F
#> 194      5.7     2.1  54.6     26.8       8.0        0.1        0.1     F
#> 195      6.2     2.2  52.2     29.0       8.0        0.1        0.1     F
#> 196      5.7     2.1  54.6     26.8       8.0        0.1        0.1     F
#> 197      5.1     2.3  55.9     27.4       6.8        0.5        0.5     F
#> 198      4.5     1.7  64.9     18.6       8.3        0.1        0.1     F
#> 199      5.1     1.9  59.2     22.3       9.3        0.7        1.6     F
#> 200      4.8     1.8  62.6     20.0       9.5        0.1        1.4     F
#> 201      5.1     1.9  59.2     22.3       9.3        0.7        1.6     F
#> 202      5.4     2.0  53.2     28.9       7.3        0.6        1.3     F
#> 203      6.2     2.2  52.2     29.0       8.0        0.1        0.1     F
#> 204      5.5     1.7  59.0     21.3       9.3        0.6        1.5     F
#> 205      5.1     1.9  59.2     22.3       9.3        0.7        1.6     F
#> 206      5.1     2.3  55.9     27.4       6.8        0.5        0.5     F
#> 207      5.4     2.0  53.2     28.9       7.3        0.6        1.3     F
#> 208      5.4     2.0  53.2     28.9       7.3        0.6        1.3     F
#> 209      5.1     1.9  59.2     22.3       9.3        0.7        1.6     F
#> 210      5.7     2.1  54.6     26.8       8.0        0.1        0.1     F
#> 211      5.1     1.9  59.2     22.3       9.3        0.7        1.6     F
#> 212      5.4     2.0  53.2     28.9       7.3        0.6        1.3     F
#> 213      5.7     2.1  54.6     26.8       8.0        0.1        0.1     F
#> 214      4.8     1.8  62.6     20.0       9.5        0.1        1.4     F
#> 215      4.8     1.9  61.6     20.9       8.0        0.8        1.5     F
#> 216      6.2     2.2  52.2     29.0       8.0        0.1        0.1     F
#> 217      5.1     2.3  55.9     27.4       6.8        0.5        0.5     F
#> 218      5.1     1.9  59.2     22.4       9.3        0.6        1.5     F
#> 219      5.1     2.3  55.9     27.4       6.8        0.5        0.5     F
#> 220      4.5     1.7  64.9     18.6       8.3        0.1        0.1     F
#> 221      4.5     1.7  64.9     18.6       8.3        0.1        0.1     F
#> 222      5.5     1.7  59.0     21.3       9.3        0.6        1.5     F
#> 223     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 224     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 225     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 226     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 227     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 228     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 229     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 230     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 231     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 232     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 233     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 234     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 235     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 236     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 237     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 238     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 239     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 240     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 241     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 242     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 243     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 244     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 245     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 246     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 247     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 248     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 249     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 250     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 251     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 252     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 253     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 254     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 255     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 256     10.0     2.3  36.9     47.1       2.2        0.5        0.5     G
#> 257     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 258     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G
#> 259     10.7     1.8  30.2     55.5       0.9        0.5        0.3     G