read_table() and read_table2() are designed to read the type of textual data where each column is separated by one (or more) columns of space.

read_table2() is like read.table(): it allows any number of whitespace characters between columns, and the lines can be of different lengths.

read_table() is more strict: each line must be the same length, and each field must be in the same position in every line. It first finds empty columns and then parses like a fixed width file.

spec_table() and spec_table2() return the column specifications rather than a data frame.
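As a minimal sketch of the difference between reading data and returning only its specification, the snippet below passes literal data to read_table() and spec_table(). The sample text and the names `txt`, `scores` and `spec` are made up for illustration. The fields are deliberately aligned in every line, so the strict behaviour described above applies.

```r
library(readr)

# Hypothetical literal data. The embedded newlines mark it as data
# rather than a file path. Every line has the same length and the
# fields sit in the same positions.
txt <- "name   score\nalice     10\nbob        7\n"

# Read the aligned text into a data frame (tibble).
scores <- read_table(txt)

# Same input, but only the guessed column specification is returned.
spec <- spec_table(txt)
```

Printing `spec` shows the guessed type of each column without the data itself.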
read_table(file, col_names = TRUE, col_types = NULL,
  locale = default_locale(), na = "NA", skip = 0, n_max = Inf,
  guess_max = min(n_max, 1000), progress = show_progress(),
  comment = "", skip_empty_rows = TRUE)

read_table2(file, col_names = TRUE, col_types = NULL,
  locale = default_locale(), na = "NA", skip = 0, n_max = Inf,
  guess_max = min(n_max, 1000), progress = show_progress(),
  comment = "", skip_empty_rows = TRUE)
Argument | Description |
---|---|
file | Either a path to a file, a connection, or literal data (either a single string or a raw vector). Files ending in Literal data is most useful for examples and tests. It must contain at least one new line to be recognised as data (instead of a path) or be a vector of greater than length 1. Using a value of |
col_names | Either If If Missing ( |
col_types | One of If If a column specification created by Alternatively, you can use a compact string representation where each character represents one column: c = character, i = integer, n = number, d = double, l = logical, f = factor, D = date, T = date time, t = time, ? = guess, or |
locale | The locale controls defaults that vary from place to place. The default locale is US-centric (like R), but you can use |
na | Character vector of strings to interpret as missing values. Set this option to |
skip | Number of lines to skip before reading data. |
n_max | Maximum number of records to read. |
guess_max | Maximum number of records to use for guessing column types. |
progress | Display a progress bar? By default it will only display in an interactive session and not while knitting a document. The display is updated every 50,000 values and will only display if estimated reading time is 5 seconds or more. The automatic progress bar can be disabled by setting option |
comment | A string used to identify comments. Any text after the comment characters will be silently ignored. |
skip_empty_rows | Should blank rows be ignored altogether? i.e. If this option is |
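To illustrate two of the arguments above, here is a small sketch that combines the compact col_types string with a custom na marker. The data and the names `txt` and `dat` are hypothetical. The fields are aligned and all lines have the same length, so the input satisfies read_table()'s stricter requirements.

```r
library(readr)

# Hypothetical aligned data in which "." marks a missing value.
txt <- "id  value  grp\n 1    2.5  ctl\n 2      .  trt\n"

dat <- read_table(
  txt,
  col_types = "idc",  # i = integer, d = double, c = character
  na = "."            # treat a lone "." as missing
)
```

Because the types are given explicitly, no guessing takes place, and `dat$value[2]` is read as a missing value rather than as the string ".".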
See also read_fwf() to read fixed width files where each column is not separated by whitespace. read_fwf() is also useful for reading tabular data with non-standard formatting.
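For the case where fields are not separated by whitespace at all, a rough sketch with read_fwf() and fwf_widths() might look like the following. The record layout (a 2-character code, a 4-digit year and a 3-digit count) is invented for this example, as are the names `path` and `records`. The data is written to a temporary file so the example does not depend on how literal strings are handled.

```r
library(readr)

# Hypothetical fixed-width records with no separators between fields:
# 2-character code, 4-digit year, 3-digit count.
path <- tempfile(fileext = ".txt")
writeLines(c("AB2019142", "CD2020117"), path)

records <- read_fwf(
  path,
  fwf_widths(c(2, 4, 3), col_names = c("code", "year", "count")),
  col_types = "cii"  # character, integer, integer
)

unlink(path)  # remove the temporary file
```

Here the column boundaries come from the declared widths, since there are no empty columns for read_table() to detect.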
# One corner from http://www.masseyratings.com/cf/compare.htm
massey <- readr_example("massey-rating.txt")
cat(read_file(massey))
#> UCC PAY LAZ KPK  RT COF BIH DII ENG ACU Rank Team         Conf
#>   1   1   1   1   1   1   1   1   1   1    1 Ohio St      B10
#>   2   2   2   2   2   2   2   2   4   2    2 Oregon       P12
#>   3   4   3   4   3   4   3   4   2   3    3 Alabama      SEC
#>   4   3   4   3   4   3   5   3   3   4    4 TCU          B12
#>   6   6   6   5   5   7   6   5   6  11    5 Michigan St  B10
#>   7   7   7   6   7   6  11   8   7   8    6 Georgia      SEC
#>   5   5   5   7   6   8   4   6   5   5    7 Florida St   ACC
#>   8   8   9   9  10   5   7   7  10   7    8 Baylor       B12
#>   9  11   8  13  11  11  12   9  14   9    9 Georgia Tech ACC
#>  13  10  13  11   8   9  10  11   9  10   10 Mississippi  SEC

read_table(massey)
#> # A tibble: 10 x 13
#>      UCC   PAY   LAZ   KPK    RT   COF   BIH   DII   ENG   ACU  Rank Team  Conf
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
#>  1     1     1     1     1     1     1     1     1     1     1     1 Ohio… B10
#>  2     2     2     2     2     2     2     2     2     4     2     2 Oreg… P12
#>  3     3     4     3     4     3     4     3     4     2     3     3 Alab… SEC
#>  4     4     3     4     3     4     3     5     3     3     4     4 TCU   B12
#>  5     6     6     6     5     5     7     6     5     6    11     5 Mich… B10
#>  6     7     7     7     6     7     6    11     8     7     8     6 Geor… SEC
#>  7     5     5     5     7     6     8     4     6     5     5     7 Flor… ACC
#>  8     8     8     9     9    10     5     7     7    10     7     8 Bayl… B12
#>  9     9    11     8    13    11    11    12     9    14     9     9 Geor… ACC
#> 10    13    10    13    11     8     9    10    11     9    10    10 Miss… SEC

# Sample of 1978 fuel economy data from
# http://www.fueleconomy.gov/feg/epadata/78data.zip
epa <- readr_example("epa78.txt")
cat(read_file(epa))
#> ALFA ROMEO                                       ALFA ROMEO    78010003
#> ALFETTA       03  81   8  74   7  89   9         ALFETTA       78010053
#> SPIDER 2000   01                                 SPIDER 2000   78010103
#> AMC                                              AMC           78020002
#> GREMLIN       03  79   9                  79   9 GREMLIN       78020053
#> PACER         04  89  11                  89  11 PACER         78020103
#> PACER WAGON   07  90  26  91  26                 PACER WAGON   78020153
#> CONCORD       04  88  12  90  11  90  11  83  16 CONCORD       78020203
#> CONCORD WAGON 07  91  30          91  30         CONCORD WAGON 78020253
#> MATADOR COUPE 05  97  14  97  14                 MATADOR COUPE 78020303
#> MATADOR SEDAN 06 110  20         110  20         MATADOR SEDAN 78020353
#> MATADOR WAGON 09 112  50         112  50         MATADOR WAGON 78020403
#> ASTON MARTIN                                     ASTON MARTIN  78040002
#> ASTON MARTIN                                     ASTON MARTIN  78040053
#> AUDI                                             AUDI          78050002
#> FOX           03  84  11  84  11  84  11         FOX           78050053
#> FOX WAGON     07  83  40          83  40         FOX WAGON     78050103
#> 5000          04  90  15          90  15         5000          78050153
#> AVANTI                                           AVANTI        78065002
#> AVANTI II     02  75   8  75   8                 AVANTI II     78065053

read_table(epa, col_names = FALSE)
#> # A tibble: 20 x 12
#>    X1       X2       X3    X4    X5    X6    X7    X8    X9   X10 X11        X12
#>    <chr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>    <dbl>
#>  1 ALFA RO… ""       NA    NA    NA    NA    NA    NA    NA    NA ALFA R… 7.80e7
#>  2 ALFETTA  "03"     81     8    74     7    89     9    NA    NA ALFETTA 7.80e7
#>  3 SPIDER … "01"     NA    NA    NA    NA    NA    NA    NA    NA SPIDER… 7.80e7
#>  4 AMC      ""       NA    NA    NA    NA    NA    NA    NA    NA AMC     7.80e7
#>  5 GREMLIN  "03"     79     9    NA    NA    NA    NA    79     9 GREMLIN 7.80e7
#>  6 PACER    "04"     89    11    NA    NA    NA    NA    89    11 PACER   7.80e7
#>  7 PACER W… "07"     90    26    91    26    NA    NA    NA    NA PACER … 7.80e7
#>  8 CONCORD  "04"     88    12    90    11    90    11    83    16 CONCORD 7.80e7
#>  9 CONCORD… "07"     91    30    NA    NA    91    30    NA    NA CONCOR… 7.80e7
#> 10 MATADOR… "05"     97    14    97    14    NA    NA    NA    NA MATADO… 7.80e7
#> 11 MATADOR… "06"    110    20    NA    NA   110    20    NA    NA MATADO… 7.80e7
#> 12 MATADOR… "09"    112    50    NA    NA   112    50    NA    NA MATADO… 7.80e7
#> 13 ASTON M… ""       NA    NA    NA    NA    NA    NA    NA    NA ASTON … 7.80e7
#> 14 ASTON M… ""       NA    NA    NA    NA    NA    NA    NA    NA ASTON … 7.80e7
#> 15 AUDI     ""       NA    NA    NA    NA    NA    NA    NA    NA AUDI    7.81e7
#> 16 FOX      "03"     84    11    84    11    84    11    NA    NA FOX     7.81e7
#> 17 FOX WAG… "07"     83    40    NA    NA    83    40    NA    NA FOX WA… 7.81e7
#> 18 5000     "04"     90    15    NA    NA    90    15    NA    NA 5000    7.81e7
#> 19 AVANTI   ""       NA    NA    NA    NA    NA    NA    NA    NA AVANTI  7.81e7
#> 20 AVANTI … "02"     75     8    75     8    NA    NA    NA    NA AVANTI… 7.81e7