Set operations for data tables

Similar to the base R set functions union, intersect, setdiff and setequal, but for data.tables. An additional all argument controls how duplicated rows are handled. The functions fintersect, fsetdiff (MINUS or EXCEPT in SQL) and funion provide the functionality of the corresponding SQL operators. Unlike SQL, the data.table functions retain row order.

fintersect(x, y, all = FALSE)
fsetdiff(x, y, all = FALSE)
funion(x, y, all = FALSE)
fsetequal(x, y, all = TRUE)

Arguments

x, y

data.tables.

all

Logical. Default is FALSE, except for fsetequal where it is TRUE. When FALSE, duplicate rows are removed from the result. When TRUE, if there are xn copies of a particular row in x and yn copies of the same row in y, then:

fintersect will return min(xn, yn) copies of that row.
fsetdiff will return max(0, xn-yn) copies of that row.
funion will return xn+yn copies of that row.
fsetequal will return FALSE unless xn == yn.
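
The counting rules above can be checked on a small multi-column case. This is a minimal sketch; the tables and the column names id and grp are made up for illustration and are not part of the package.

```r
library(data.table)

# Hypothetical two-column tables with duplicated rows.
# Row (1, "a"): xn = 2, yn = 1
# Row (2, "b"): xn = 1, yn = 2
# Row (3, "c"): xn = 1, yn = 0
x <- data.table(id = c(1L, 1L, 2L, 3L), grp = c("a", "a", "b", "c"))
y <- data.table(id = c(1L, 2L, 2L),     grp = c("a", "b", "b"))

inter_all <- fintersect(x, y, all = TRUE)  # min(xn, yn) copies of each row
diff_all  <- fsetdiff(x, y, all = TRUE)    # max(0, xn - yn) copies of each row
union_all <- funion(x, y, all = TRUE)      # xn + yn copies of each row

# Count the copies of each distinct row in the union.
union_all[, .N, by = .(id, grp)]
```

Both tables must have the same column names and types for the row comparison to apply; the sketch keeps id as integer and grp as character in both.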

Details

Columns of class bit64::integer64 are supported. Columns of type complex and list are not supported, except by funion.
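
As a rough illustration of the funion exception, the sketch below combines two tables that carry a list column. It assumes list columns are accepted at least when all = TRUE; the tables and the column name tags are hypothetical.

```r
library(data.table)

# Hypothetical tables with a list column named "tags".
x <- data.table(id = 1:2, tags = list("a", c("b", "c")))
y <- data.table(id = 3L,  tags = list("d"))

# With all = TRUE every row of x and y is kept, so no duplicate
# detection on the list column is needed (assumption, see lead-in).
u <- funion(x, y, all = TRUE)
```

Whether the same call works with all = FALSE may depend on the installed data.table version, so it is worth testing before relying on it.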

Value

A data.table for fintersect, funion and fsetdiff. A logical TRUE or FALSE for fsetequal.

References

https://db.apache.org/derby/papers/Intersect-design.html

Examples

library(data.table)
x = data.table(c(1,2,2,2,3,4,4))
x2 = data.table(c(1,2,3,4)) # same set of rows as x
y = data.table(c(2,3,4,4,4,5))
fintersect(x, y)            # intersect
#>    V1
#> 1:  2
#> 2:  3
#> 3:  4
fintersect(x, y, all=TRUE)  # intersect all
#>    V1
#> 1:  2
#> 2:  3
#> 3:  4
#> 4:  4
fsetdiff(x, y)              # except
#>    V1
#> 1:  1
fsetdiff(x, y, all=TRUE)    # except all
#>    V1
#> 1:  1
#> 2:  2
#> 3:  2
funion(x, y)                # union
#>    V1
#> 1:  1
#> 2:  2
#> 3:  3
#> 4:  4
#> 5:  5
funion(x, y, all=TRUE)      # union all
#>     V1
#>  1:  1
#>  2:  2
#>  3:  2
#>  4:  2
#>  5:  3
#>  6:  4
#>  7:  4
#>  8:  2
#>  9:  3
#> 10:  4
#> 11:  4
#> 12:  4
#> 13:  5
fsetequal(x, x2, all=FALSE) # setequal
#> [1] TRUE
fsetequal(x, x2)            # setequal all
#> [1] FALSE
