Title: | Classes and Methods for Fast Memory-Efficient Boolean Selections |
---|---|
Description: | Provided are classes for boolean and skewed boolean vectors, fast boolean methods, fast unique and non-unique integer sorting, fast set operations on sorted and unsorted sets of integers, and foundations for ff (range index, compression, chunked processing). |
Authors: | Michael Chirico [aut, cre], Jens Oehlschlägel [aut], Brian Ripley [ctb] |
Maintainer: | Michael Chirico <[email protected]> |
License: | GPL-2 \| GPL-3 |
Version: | 4.5.99 |
Built: | 2024-11-27 03:08:53 UTC |
Source: | https://github.com/r-lib/bit |
Provided are classes for boolean and skewed boolean vectors, fast boolean methods, fast unique and non-unique integer sorting, fast set operations on sorted and unsorted sets of integers, and foundations for ff (range indices, compression, chunked processing).
For details view the vignette("bit-usage") and vignette("bit-performance").
Functions to allocate (and de-allocate) bit masks
.BITS
bit_init()
bit_done()
An object of class integer of length 1.
The C code operates with bit masks. The memory for these is allocated dynamically. bit_init is called by .First.lib() and bit_done is called by .Last.lib(). You don't need to care about these under normal circumstances.
Jens Oehlschlägel
bit_done()
bit_init()
Coercing to bit vector
## S3 method for class 'NULL'
as.bit(x, ...)
## S3 method for class 'bit'
as.bit(x, ...)
## S3 method for class 'logical'
as.bit(x, ...)
## S3 method for class 'integer'
as.bit(x, ...)
## S3 method for class 'double'
as.bit(x, ...)
## S3 method for class 'bitwhich'
as.bit(x, ...)
## S3 method for class 'which'
as.bit(x, length = attr(x, "maxindex"), ...)
## S3 method for class 'ri'
as.bit(x, ...)
as.bit(x = NULL, ...)
x |
an object of class |
... |
further arguments |
length |
the length of the new bit vector |
Coercing to bit is quite fast because we use a double loop that fixes each word in a processor register.
is.bit returns FALSE or TRUE, as.bit returns a vector of class 'bit'
as.bit(`NULL`): method to coerce to bit() (zero length) from NULL
as.bit(integer): method to coerce to bit() from integer() (0L and NA become FALSE, everything else becomes TRUE)
as.bit(double): method to coerce to bit() from double() (0 and NA become FALSE, everything else becomes TRUE)
as.bit(bitwhich): method to coerce to bit() from bitwhich()
Zero is coerced to FALSE, all other numbers including NA are coerced to TRUE. This differs from the NA-to-FALSE coercion in package ff and may change in the future.
Jens Oehlschlägel
CoercionToStandard, as.booltype(), as.bit(), as.bitwhich(), as.which(), as.ri(), ff::as.hi(), ff::as.ff()
as.bit(c(0L, 1L, 2L, -2L, NA))
as.bit(c(0, 1, 2, -2, NA))
as.bit(c(FALSE, NA, TRUE))
Functions to coerce to bitwhich
## S3 method for class 'NULL'
as.bitwhich(x, ...)
## S3 method for class 'bitwhich'
as.bitwhich(x, ...)
## S3 method for class 'which'
as.bitwhich(x, maxindex = attr(x, "maxindex"), ...)
## S3 method for class 'ri'
as.bitwhich(x, ...)
## S3 method for class 'integer'
as.bitwhich(x, poslength = NULL, ...)
## S3 method for class 'double'
as.bitwhich(x, poslength = NULL, ...)
## S3 method for class 'logical'
as.bitwhich(x, poslength = NULL, ...)
## S3 method for class 'bit'
as.bitwhich(x, range = NULL, poslength = NULL, ...)
as.bitwhich(x = NULL, ...)
x |
An object of class 'bitwhich', 'integer', 'logical' or 'bit' or an integer vector as resulting from 'which' |
... |
further arguments |
maxindex |
the length of the new bitwhich vector |
poslength |
the number of selected elements |
range |
a |
a value of class bitwhich()
as.bitwhich(`NULL`): method to coerce to bitwhich() (zero length) from NULL
as.bitwhich(bitwhich): method to coerce to bitwhich() from bitwhich()
as.bitwhich(which): method to coerce to bitwhich() from which()
as.bitwhich(ri): method to coerce to bitwhich() from ri()
as.bitwhich(integer): method to coerce to bitwhich() from integer() (0 and NA become FALSE, everything else becomes TRUE)
as.bitwhich(double): method to coerce to bitwhich() from double() (0 and NA become FALSE, everything else becomes TRUE)
as.bitwhich(logical): method to coerce to bitwhich() from logical()
as.bitwhich(bit): method to coerce to bitwhich() from bit()
Jens Oehlschlägel
CoercionToStandard, as.booltype(), as.bit(), as.bitwhich(), as.which(), as.ri(), ff::as.hi(), ff::as.ff()
as.bitwhich(c(0L, 1L, 2L, -2L, NA))
as.bitwhich(c(0, 1, 2, -2, NA))
as.bitwhich(c(NA, NA, NA))
as.bitwhich(c(FALSE, FALSE, FALSE))
as.bitwhich(c(FALSE, FALSE, TRUE))
as.bitwhich(c(FALSE, TRUE, TRUE))
as.bitwhich(c(TRUE, TRUE, TRUE))
Coerce to booltype (generic)
## Default S3 method:
as.booltype(x, booltype = "logical", ...)
as.booltype(x, booltype, ...)
x |
object to coerce |
booltype |
target |
... |
further arguments |
x coerced to booltype
as.booltype(default): default method for as.booltype
CoercionToStandard, booltypes(), booltype(), is.booltype()
as.booltype(0:1)
as.booltype(0:1, "logical")
as.booltype(0:1, "bit")
as.booltype(0:1, "bitwhich")
as.booltype(0:1, "which", maxindex=2)
as.booltype(0:1, "ri")
Coerce bit to character
## S3 method for class 'bit'
as.character(x, ...)
x |
a |
... |
ignored |
a character vector of zeroes and ones
as.character(bit(12))
Coerce bitwhich to character
## S3 method for class 'bitwhich'
as.character(x, ...)
x |
a |
... |
ignored |
a character vector of zeroes and ones
as.character(bitwhich(12))
Coerce to ri
## S3 method for class 'ri'
as.ri(x, ...)
## Default S3 method:
as.ri(x, ...)
as.ri(x, ...)
x |
object to coerce |
... |
further arguments |
an ri() object
Jens Oehlschlägel
CoercionToStandard, as.booltype(), as.bit(), as.bitwhich(), as.which(), as.ri(), ff::as.hi(), ff::as.ff()
as.ri(c(FALSE, TRUE, FALSE, TRUE))
Coercing to something like the result of which()
## S3 method for class 'which'
as.which(x, maxindex = NA_integer_, ...)
## S3 method for class 'NULL'
as.which(x, ...)
## S3 method for class 'numeric'
as.which(x, maxindex = NA_integer_, ...)
## S3 method for class 'integer'
as.which(x, maxindex = NA_integer_, is.unsorted = TRUE, has.dup = TRUE, ...)
## S3 method for class 'logical'
as.which(x, ...)
## S3 method for class 'ri'
as.which(x, ...)
## S3 method for class 'bit'
as.which(x, range = NULL, ...)
## S3 method for class 'bitwhich'
as.which(x, ...)
as.which(x, ...)
x |
an object of classes |
maxindex |
the length of the boolean vector which is represented |
... |
further arguments (passed to |
is.unsorted |
a logical scalar indicating whether the data may be unsorted |
has.dup |
a logical scalar indicating whether the data may have duplicates |
range |
a |
as.which.bit returns a vector of subscripts with class 'which'
a vector of class 'logical' or 'integer'
as.which(`NULL`): method to coerce to zero length which() from NULL
as.which(numeric): method to coerce to which() from numeric()
as.which(integer): method to coerce to which() from integer()
as.which(logical): method to coerce to which() from logical()
as.which(bitwhich): method to coerce to which() from bitwhich()
Jens Oehlschlägel
CoercionToStandard, as.booltype(), as.bit(), as.bitwhich(), as.which(), as.ri(), ff::as.hi(), ff::as.ff()
r <- ri(5, 20, 100)
x <- as.which(r)
x
stopifnot(identical(x, as.which(as.logical(r))))
stopifnot(identical(x, as.which(as.bitwhich(r))))
stopifnot(identical(x, as.which(as.bit(r))))
bbatch calculates batch sizes in 1..N so that the batches have balanced rather than very different sizes.
bbatch(N, B)
N |
total size in 0..integer_max |
B |
desired batch size in 1..integer_max |
Tries to have rb == 0 or rb as close to b as possible while guaranteeing that rb < b && (b - rb) <= min(nb, b)
a list with components:
b: the batch size
nb: the number of batches
rb: the size of the rest
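The balancing idea can be illustrated in base R. The following is only a minimal sketch under stated assumptions, not the package's implementation: the helper name balanced_batch is hypothetical, and it simply spreads N elements over the smallest number of batches that respects the desired size B, so its results need not agree with bbatch() for every input.

```r
# Hypothetical helper (not part of 'bit'): spread N elements over the
# smallest number of batches that keeps each batch at or below size B.
balanced_batch <- function(N, B) {
  N <- as.integer(N)
  B <- as.integer(B)
  stopifnot(N >= 0L, B >= 1L)
  if (N == 0L) return(list(b = B, nb = 0L, rb = 0L))
  nbatches <- (N + B - 1L) %/% B          # ceiling(N / B)
  b <- (N + nbatches - 1L) %/% nbatches   # ceiling(N / nbatches), never above B
  list(b = b, nb = N %/% b, rb = N %% b)  # full batches and the smaller rest
}

res <- balanced_batch(100, 24)
str(res)
```

For N = 100 and B = 24 this sketch yields five batches of 20 with no rest, rather than four batches of 24 followed by a rest of 4.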
Jens Oehlschlägel
bbatch(100, 24)
Bit vectors are a boolean type without NA that requires 32 times less RAM than logical().
For details on usage see vignette("bit-usage") and for details on performance see vignette("bit-performance").
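The factor of 32 follows from simple arithmetic, sketched here in base R only. The sketch assumes that a logical element occupies one 4-byte integer and that bits are packed into 32-bit words; the variable names are illustrative.

```r
n <- 1e6

# A logical vector stores each element in a 4-byte integer ...
bytes_logical <- n * 4

# ... while a bit vector needs one bit per element, packed into 32-bit words.
bytes_bit <- ceiling(n / 32) * 4

bytes_logical / bytes_bit

# Base R reports the logical side, including a small fixed header:
as.numeric(object.size(logical(n)))
```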
bit(length = 0L)
length |
length in bits |
bit returns a vector of integer sufficiently long to store 'length' bits
booltype(), bitwhich(), logical()
bit(12)
!bit(12)
str(bit(128))
Fast %in% for integers
bit_in(x, table, retFUN = as.bit)
x |
an integer vector of values to be looked-up |
table |
an integer vector used as lookup-table |
retFUN |
Determines the range of the integers and checks if the density justifies use of a bit vector; if yes, maps x or table, whichever is smaller, into a bit vector and searches the other of table or x in the bit vector; if no, falls back to %in%
a boolean vector coerced to retFUN
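The lookup strategy can be sketched in base R, using a plain logical vector as a stand-in for the bit vector. This is an illustration under assumptions, not the package's code: the helper in_via_mask is hypothetical, it always maps table (rather than the smaller of the two arguments), it skips the density check, and it treats NA as never matching.

```r
# Hypothetical helper: membership test via a presence mask over range(table).
in_via_mask <- function(x, table) {
  table <- table[!is.na(table)]
  if (length(table) == 0L) return(rep(FALSE, length(x)))
  lo <- min(table)
  hi <- max(table)
  mask <- logical(hi - lo + 1L)        # one flag per integer in lo..hi
  mask[table - lo + 1L] <- TRUE        # mark the values present in table
  inside <- !is.na(x) & x >= lo & x <= hi
  res <- logical(length(x))            # values outside the range stay FALSE
  res[inside] <- mask[x[inside] - lo + 1L]
  res
}

in_via_mask(1:2, 2:3)
```

The mask costs memory proportional to the range of table, which is why a density check before choosing this path is worthwhile.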
bit_in(1:2, 2:3)
bit_in(1:2, 2:3, retFUN=as.logical)
Fast version of setdiff(rx[1]:rx[2], y).
bit_rangediff(rx, y, revx = FALSE, revy = FALSE)
rx |
range of integers given as |
y |
an integer vector of elements to exclude |
revx |
|
revy |
|
Determines the range of the integers y and checks if the density justifies use of a bit vector; if yes, uses a bit vector for the set operation; if no, falls back to a quicksort and merge_rangediff()
an integer vector
bit_setdiff(), merge_rangediff()
bit_rangediff(c(1L, 6L), c(3L, 4L))
bit_rangediff(c(6L, 1L), c(3L, 4L))
bit_rangediff(c(6L, 1L), c(3L, 4L), revx=TRUE)
bit_rangediff(c(6L, 1L), c(3L, 4L), revx=TRUE, revy=TRUE)
Fast versions of union(), intersect(), setdiff(), symmetric difference and setequal() for integers.
bit_union(x, y)
bit_intersect(x, y)
bit_setdiff(x, y)
bit_symdiff(x, y)
bit_setequal(x, y)
x |
an integer vector |
y |
an integer vector |
Determines the range of the integers and checks if the density justifies use of a bit vector; if yes, uses a bit vector for finding duplicates; if no, falls back to union(), intersect(), setdiff(), union(setdiff(x, y), setdiff(y, x)) and setequal()
an integer vector
bit_union(): union
bit_intersect(): intersection
bit_setdiff(): asymmetric difference
bit_symdiff(): symmetric difference
bit_setequal(): equality
bit_union(1:2, 2:3)
bit_intersect(1:2, 2:3)
bit_setdiff(1:2, 2:3)
bit_symdiff(1:2, 2:3)
bit_setequal(1:2, 2:3)
bit_setequal(1:2, 2:1)
fast sorting of integers
bit_sort(x, decreasing = FALSE, na.last = NA, has.dup = TRUE)
x |
an integer vector |
decreasing |
(currently only |
na.last |
|
has.dup |
TRUE (the default) assumes that |
Determines the range of the integers and checks if the density justifies use of a bit vector; if yes, sorts the first occurrences of each integer in the range using a bit vector, sorts the rest and merges; if no, falls back to quicksort.
a sorted vector
sort(), ramsort(), bit_sort_unique()
bit_sort(c(2L, 1L, NA, NA, 1L, 2L))
bit_sort(c(2L, 1L, NA, NA, 1L, 2L), na.last=FALSE)
bit_sort(c(2L, 1L, NA, NA, 1L, 2L), na.last=TRUE)
## Not run:
x <- sample(1e7, replace=TRUE)
system.time(bit_sort(x))
system.time(sort(x))
## End(Not run)
Fast combination of sort() and unique() for integers
bit_sort_unique(x, decreasing = FALSE, na.last = NA, has.dup = TRUE, range_na = NULL)
x |
an integer vector |
decreasing |
|
na.last |
|
has.dup |
TRUE (the default) assumes that |
range_na |
|
Determines the range of the integers and checks if the density justifies use of a bit vector; if yes, creates the result using a bit vector; if no, falls back to sort(unique())
a sorted unique integer vector
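Why a bit vector yields sorted unique values directly can be sketched in base R: mark each value in a presence mask over the value range, then read the marked positions in order. The helper sort_unique_via_mask is hypothetical, uses a logical vector in place of a bit vector, drops NA, and omits the density check and the fallback.

```r
# Hypothetical helper: sorted unique integers via a presence mask.
sort_unique_via_mask <- function(x) {
  x <- x[!is.na(x)]                 # drop NAs, comparable to na.last = NA
  if (length(x) == 0L) return(x)
  lo <- min(x)
  mask <- logical(max(x) - lo + 1L) # one flag per integer in the range
  mask[x - lo + 1L] <- TRUE         # duplicates simply set the same flag again
  which(mask) + lo - 1L             # positions come out ascending
}

sort_unique_via_mask(c(2L, 1L, NA, NA, 1L, 2L))
```

No comparison-based sort is involved; the cost is one pass over x plus one pass over the range, which pays off only when the values are dense in that range.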
sort(), unique(), bit_sort(), bit_unique()
bit_sort_unique(c(2L, 1L, NA, NA, 1L, 2L))
bit_sort_unique(c(2L, 1L, NA, NA, 1L, 2L), na.last=FALSE)
bit_sort_unique(c(2L, 1L, NA, NA, 1L, 2L), na.last=TRUE)
bit_sort_unique(c(2L, 1L, NA, NA, 1L, 2L), decreasing = TRUE)
bit_sort_unique(c(2L, 1L, NA, NA, 1L, 2L), decreasing = TRUE, na.last=FALSE)
bit_sort_unique(c(2L, 1L, NA, NA, 1L, 2L), decreasing = TRUE, na.last=TRUE)
## Not run:
x <- sample(1e7, replace=TRUE)
system.time(bit_sort_unique(x))
system.time(sort(unique(x)))
x <- sample(1e7)
system.time(bit_sort_unique(x))
system.time(sort(x))
## End(Not run)
Fast versions of unique(), duplicated(), anyDuplicated() and sum(duplicated(x)) for integers.
bit_unique(x, na.rm = NA, range_na = NULL)
bit_duplicated(x, na.rm = NA, range_na = NULL, retFUN = as.bit)
bit_anyDuplicated(x, na.rm = NA, range_na = NULL)
bit_sumDuplicated(x, na.rm = NA, range_na = NULL)
x |
an integer vector |
na.rm |
|
range_na |
|
retFUN |
Determines the range of the integers and checks if the density justifies use of a bit vector; if yes, uses a bit vector for finding duplicates; if no, falls back to unique(), duplicated(), anyDuplicated() and sum(duplicated(x))
bit_unique returns a vector of unique integers, bit_duplicated returns a boolean vector coerced to retFUN, bit_anyDuplicated returns the position of the first duplicate (or zero if no duplicates), bit_sumDuplicated returns the number of duplicated values (as.integer)
bit_unique(): extracts unique elements
bit_duplicated(): determines duplicate elements
bit_anyDuplicated(): checks for existence of duplicate elements
bit_sumDuplicated(): counts duplicate elements
bit_unique(c(2L, 1L, NA, NA, 1L, 2L))
bit_unique(c(2L, 1L, NA, NA, 1L, 2L), na.rm=FALSE)
bit_unique(c(2L, 1L, NA, NA, 1L, 2L), na.rm=TRUE)
bit_duplicated(c(2L, 1L, NA, NA, 1L, 2L))
bit_duplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=FALSE)
bit_duplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=TRUE)
bit_anyDuplicated(c(2L, 1L, NA, NA, 1L, 2L))
bit_anyDuplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=FALSE)
bit_anyDuplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=TRUE)
bit_sumDuplicated(c(2L, 1L, NA, NA, 1L, 2L))
bit_sumDuplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=FALSE)
bit_sumDuplicated(c(2L, 1L, NA, NA, 1L, 2L), na.rm=TRUE)
In one pass over the vector NAs are handled according to parameter na.last by range_sortna(), then, if the vector is unsorted, bit sort is invoked.
bitsort(x, na.last = NA, depth = 1)
x |
an integer vector |
na.last |
|
depth |
an integer scalar giving the number of bit passes before switching to quicksort |
a sorted vector
bitsort(c(2L, 0L, 1L, NA, 2L))
bitsort(c(2L, 0L, 1L, NA, 2L), na.last=TRUE)
bitsort(c(2L, 0L, 1L, NA, 2L), na.last=FALSE)
A bitwhich object represents a boolean filter like a bit() object (NAs are not allowed) but uses a sparse representation suitable for very skewed (asymmetric) selections. Three extreme cases are represented with logical values: zero length with logical(), all TRUE with TRUE, and all FALSE with FALSE. All other selections are represented with positive or negative integers, whichever is shorter. This needs less RAM compared to logical() (and often less than bit() or which()). Logical operations are fast if the selection is asymmetric (only few or almost all selected).
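The choice of representation can be sketched in base R. This is an illustration, not the package's constructor: the helper sparse_repr is hypothetical, it starts from a plain logical vector, and its handling of ties and the ordering of negative subscripts are assumptions made here.

```r
# Hypothetical helper: pick the shortest encoding of a logical selection.
sparse_repr <- function(sel) {
  stopifnot(is.logical(sel), !anyNA(sel))
  n <- length(sel)
  if (n == 0L) return(logical())       # extreme case: zero length
  pos <- which(sel)
  if (length(pos) == n) return(TRUE)   # extreme case: everything selected
  if (length(pos) == 0L) return(FALSE) # extreme case: nothing selected
  # Otherwise keep whichever list of subscripts is shorter:
  # positive = included positions, negative = excluded positions.
  if (length(pos) <= n - length(pos)) pos else -which(!sel)
}

sparse_repr(c(TRUE, TRUE, FALSE, TRUE))    # mostly TRUE: one exclusion
sparse_repr(c(FALSE, FALSE, TRUE, FALSE))  # mostly FALSE: one inclusion
```

Either way the stored vector never exceeds half the length of the selection, which is what makes the encoding attractive for skewed filters.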
bitwhich(maxindex = 0L, x = NULL, xempty = FALSE, poslength = NULL, is.unsorted = TRUE, has.dup = TRUE)
maxindex |
length of the vector |
x |
Information about which positions are |
xempty |
what to assume about parameter |
poslength |
tuning: |
is.unsorted |
tuning: FALSE implies that |
has.dup |
tuning: FALSE implies that |
an object of class 'bitwhich' carrying two attributes:
maxindex: see above
poslength: see above
bitwhich_representation(), as.bitwhich(), bit()
bitwhich()
bitwhich(12)
bitwhich(12, x=TRUE)
bitwhich(12, x=3)
bitwhich(12, x=-3)
bitwhich(12, x=integer())
bitwhich(12, x=integer(), xempty=TRUE)
Diagnose representation of bitwhich
bitwhich_representation(x)
x |
a |
a scalar, one of logical(), FALSE, TRUE, -1 or 1
bitwhich_representation(bitwhich())
bitwhich_representation(bitwhich(12, FALSE))
bitwhich_representation(bitwhich(12, TRUE))
bitwhich_representation(bitwhich(12, -3))
bitwhich_representation(bitwhich(12, 3))
Specific methods for booltype are required, where non-unary methods can combine multiple boolean types, particularly boolean binary operators.
booltype(x)
x |
an R object |
Function booltype returns the boolean type of its argument.
There are currently six boolean types, booltypes is an ordered() vector with the following ordinal levels():
nobool: non-boolean type
logical(): for representing any boolean data including NA
bit(): for representing dense boolean data
bitwhich(): for representing sparse (skewed) boolean data
which(): for representing sparse boolean data with few TRUE
ri(): range-indexing, for representing sparse boolean data with a single range of TRUE
one scalar element of booltypes(); in case of 'nobool' it carries a name attribute with the data type.
Do not rely on the internal integer codes of these levels, we might add-in hi later
booltypes(), is.booltype(), as.booltype()
unname(booltypes)
str(booltypes)
sapply(
  list(double(), integer(), logical(), bit(), bitwhich(), as.which(), ri(1, 2, 3)),
  booltype
)
The ordered() factor booltypes ranks the boolean types.
booltypes
An object of class ordered (inherits from factor) of length 6.
There are currently six boolean types, booltypes is an ordered() vector with the following ordinal levels():
nobool: non-boolean type
logical(): for representing any boolean data including NA
bit(): for representing dense boolean data
bitwhich(): for representing sparse (skewed) boolean data
which(): for representing sparse boolean data with few TRUE
ri(): range-indexing, for representing sparse boolean data with a single range of TRUE
booltypes has a names() attribute such that elements can be selected by name.
Do not rely on the internal integer codes of these levels, we might add-in hi later
booltype(), is.booltype(), as.booltype()
Creating new boolean vectors by concatenating boolean vectors
## S3 method for class 'booltype'
c(...)
## S3 method for class 'bit'
c(...)
## S3 method for class 'bitwhich'
c(...)
... |
|
a vector with the lowest input booltype() (but not lower than logical())
Jens Oehlschlägel
c(), bit(), bitwhich(), which()
c(bit(4), !bit(4))
c(bit(4), !bitwhich(4))
c(bitwhich(4), !bit(4))
c(ri(1, 2, 4), !bit(4))
c(bit(4), !logical(4))
message("logical in first argument does not dispatch: c(logical(4), bit(4))")
c.booltype(logical(4), !bit(4))
Calls chunks() to create a sequence of range indexes along the object which causes the method dispatch.
chunk(x = NULL, ...)
## Default S3 method:
chunk(x = NULL, ..., RECORDBYTES = NULL, BATCHBYTES = NULL)
x |
the object along we want chunks |
... |
further arguments passed to |
RECORDBYTES |
integer scalar representing the bytes needed to process a single element of the boolean vector (default 4 bytes for logical) |
BATCHBYTES |
integer scalar limiting the number of bytes to be processed in one
chunk, default from |
chunk is generic, the default method is described here, other methods that automatically consider RAM needs are provided with package 'ff', see for example ff::chunk.ffdf()
returns a named list of ri() objects representing chunks of subscripts
chunk(default): default vector method
chunk.default, ff::chunk.ff_vector(), ff::chunk.ffdf()
Jens Oehlschlägel
chunks(), ri(), seq(), bbatch()
chunk(complex(1e7))
chunk(raw(1e7))
chunk(raw(1e7), length=3)
chunks(1, 10, 3)
# no longer do
chunk(1, 100, 10)
# but for backward compatibility this works
chunk(from=1, to=100, by=10)
creates a sequence of range indexes using a syntax not completely unlike 'seq'
chunks(
  from = NULL, to = NULL, by = NULL, length.out = NULL, along.with = NULL,
  overlap = 0L, method = c("bbatch", "seq"), maxindex = NA
)
from |
the starting value of the sequence. |
to |
the (maximal) end value of the sequence. |
by |
increment of the sequence |
length.out |
desired length of the sequence. |
along.with |
take the length from the length of this argument. |
overlap |
number of values to overlap (will lower the starting value of the sequence, the first range becomes smaller) |
method |
default 'bbatch' will try to balance the chunk size, see
|
maxindex |
passed to |
returns a named list of ri() objects representing chunks of subscripts
Jens Oehlschlägel
generic chunk(), ri(), seq(), bbatch()
chunks(1, 100, by=30)
chunks(1, 100, by=30, method="seq")
## Not run:
require(foreach)
m <- 10000
k <- 1000
n <- m*k
message("Four ways to loop from 1 to n. Slowest foreach to fastest chunk is 1700:1 on a dual core notebook with 3GB RAM\n")
z <- 0L
print(k*system.time({it <- icount(m); foreach (i = it) %do% { z <- i; NULL }}))
z
z <- 0L
print(system.time({i <- 0L; while (i < n) {i <- i + 1L; z <- i}}))
z
z <- 0L
print(system.time(for (i in 1:n) z <- i))
z
z <- 0L; n <- m*k
print(system.time(for (ch in chunks(1, n, by=m)) {for (i in ch[1]:ch[2]) z <- i}))
z
message("Seven ways to calculate sum(1:n). Slowest foreach to fastest chunk is 61000:1 on a dual core notebook with 3GB RAM\n")
print(k*system.time({it <- icount(m); foreach (i = it, .combine="+") %do% { i }}))
z <- 0
print(k*system.time({it <- icount(m); foreach (i = it) %do% { z <- z + i; NULL }}))
z
z <- 0; print(system.time({i <- 0L;while (i < n) {i <- i + 1L; z <- z + i}})); z
z <- 0; print(system.time(for (i in 1:n) z <- z + i)); z
print(system.time(sum(as.double(1:n))))
z <- 0; n <- m*k
print(system.time(for (ch in chunks(1, n, by=m)) {for (i in ch[1]:ch[2]) z <- z + i}))
z
z <- 0; n <- m*k
print(system.time(for (ch in chunks(1, n, by=m)) {z <- z + sum(as.double(ch[1]:ch[2]))}))
z
## End(Not run)
clone physically duplicates objects and can additionally change some features, e.g. length.
clone(x, ...)
## Default S3 method:
clone(x, ...)
x |
|
... |
further arguments to the generic |
clone is generic. clone.default handles ram objects. Further methods are provided in package 'ff'. still.identical returns TRUE if the two atomic arguments still point to the same memory.
an object that is a deep copy of x
clone(default): default method uses R's C-API 'duplicate()'
Jens Oehlschlägel
clone.ff, copy_vector()
x <- 1:12
y <- x
still.identical(x, y)
y[1] <- y[1]
still.identical(x, y)
y <- clone(x)
still.identical(x, y)
rm(x, y); gc()
Coercion from bit is quite fast because we use a double loop that fixes each word in a processor register.
## S3 method for class 'bit'
as.logical(x, ...)
## S3 method for class 'bit'
as.integer(x, ...)
## S3 method for class 'bit'
as.double(x, ...)
## S3 method for class 'bitwhich'
as.integer(x, ...)
## S3 method for class 'bitwhich'
as.double(x, ...)
## S3 method for class 'bitwhich'
as.logical(x, ...)
## S3 method for class 'ri'
as.logical(x, ...)
## S3 method for class 'ri'
as.integer(x, ...)
## S3 method for class 'ri'
as.double(x, ...)
## S3 method for class 'which'
as.logical(x, length = attr(x, "maxindex"), ...)
x |
an object of class |
... |
ignored |
length |
length of the boolean vector (required for |
as.logical() returns a vector of FALSE, TRUE, as.integer() and as.double() return a vector of 0,1.
Jens Oehlschlägel
CoercionToStandard, as.booltype(), as.bit(), as.bitwhich(), as.which(), as.ri(), ff::as.hi(), ff::as.ff()
x <- ri(2, 5, 10)
y <- as.logical(x)
y
stopifnot(identical(y, as.logical(as.bit(x))))
stopifnot(identical(y, as.logical(as.bitwhich(x))))
y <- as.integer(x)
y
stopifnot(identical(y, as.integer(as.logical(x))))
stopifnot(identical(y, as.integer(as.bit(x))))
stopifnot(identical(y, as.integer(as.bitwhich(x))))
y <- as.double(x)
y
stopifnot(identical(y, as.double(as.logical(x))))
stopifnot(identical(y, as.double(as.bit(x))))
stopifnot(identical(y, as.double(as.bitwhich(x))))
Creates a true copy of the underlying C-vector – dropping all attributes – and optionally reverses the direction of the elements.
copy_vector(x, revx = FALSE)
x |
an R vector |
revx |
default |
This can be substantially faster than duplicate(as.vector(unclass(x)))
copied R vector
clone()
, still.identical()
, reverse_vector()
x <- factor(letters)
y <- x
z <- copy_vector(x)
still.identical(x, y)
still.identical(x, z)
str(x)
str(y)
str(z)
In one pass over the vector NA
s are handled according to parameter
na.last
by range_sortna()
, then, if the vector is unsorted,
counting sort is invoked.
countsort(x, na.last = NA)
x |
an integer vector |
na.last |
|
a sorted vector
countsort(c(2L, 0L, 1L, NA, 2L))
countsort(c(2L, 0L, 1L, NA, 2L), na.last=TRUE)
countsort(c(2L, 0L, 1L, NA, 2L), na.last=FALSE)
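The counting-sort idea behind countsort() can be sketched in plain R. This is only an illustration under simplifying assumptions, not the package's C implementation: `counting_sort_sketch` is a hypothetical name, and NA handling is done in a separate step here rather than in a single pass.

```r
# Minimal counting sort for integer vectors (illustrative sketch only)
counting_sort_sketch <- function(x, na.last = NA) {
  stopifnot(is.integer(x))
  n_na <- sum(is.na(x))
  v <- x[!is.na(x)]
  if (length(v) > 0L) {
    lo <- min(v)
    hi <- max(v)
    # Count how often each value in lo..hi occurs
    counts <- tabulate(v - lo + 1L, nbins = hi - lo + 1L)
    # Emit each value as many times as it was counted
    sorted <- rep.int(lo:hi, counts)
  } else {
    sorted <- integer()
  }
  nas <- rep.int(NA_integer_, n_na)
  if (is.na(na.last)) {
    sorted            # NAs removed
  } else if (na.last) {
    c(sorted, nas)    # NAs at the end
  } else {
    c(nas, sorted)    # NAs at the beginning
  }
}

counting_sort_sketch(c(2L, 0L, 1L, NA, 2L))
counting_sort_sketch(c(2L, 0L, 1L, NA, 2L), na.last = TRUE)
```

Counting sort is attractive when the value range is small relative to the vector length, because the work is proportional to the length plus the range rather than to n log n.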
Operators acting on bit()
or bitwhich()
objects to extract or replace parts.
## S3 method for class 'bit'
x[[i]]
## S3 replacement method for class 'bit'
x[[i]] <- value
## S3 method for class 'bit'
x[i]
## S3 replacement method for class 'bit'
x[i] <- value
## S3 method for class 'bitwhich'
x[[i]]
## S3 replacement method for class 'bitwhich'
x[[i]] <- value
## S3 method for class 'bitwhich'
x[i]
## S3 replacement method for class 'bitwhich'
x[i] <- value
x |
a |
i |
preferably a positive integer subscript or a |
value |
new logical or integer values |
The typical use case for '[' and '[<-' is subscripting with positive integers.
Negative integers are allowed but slower.
Logical subscripts are allowed only as scalars.
The subscript can be given as a bitwhich()
object.
Also ri()
can be used as subscript.
Extracting from bit()
and bitwhich()
is faster than from logical()
if positive
subscripts are used. Integer subscripts make sense. Negative subscripts are
converted to positive ones, beware the RAM consumption.
The extractors [[
and [
return a logical scalar or
vector. The replacement functions return an object of class(x)
.
Jens Oehlschlägel
x <- as.bit(c(FALSE, NA, TRUE))
x[] <- c(FALSE, NA, TRUE)
x[1:2]
x[-3]
x[ri(1, 2)]
x[as.bitwhich(c(TRUE, TRUE, FALSE))]
x[[1]]
x[] <- TRUE
x[1:2] <- FALSE
x[[1]] <- TRUE
This is substantially faster than which.max(is.na(x))
firstNA(x)
x |
an R vector |
an integer scalar giving the position of the first NA
which.max()
, is.na()
, anyNA()
, anyDuplicated()
, bit_anyDuplicated()
x <- c(FALSE, NA, TRUE)
firstNA(x)
reverse_vector(x)
## Not run: 
x <- 1:1e7
system.time(rev(x))
system.time(reverse_vector(x))
## End(Not run)
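For comparison, here is a base R sketch of the which.max(is.na(x)) idiom that firstNA() is measured against. The helper name `first_na_sketch` is hypothetical, and returning 0L when no NA is present is an assumption made for this sketch rather than documented behaviour of firstNA().

```r
# Position of the first NA using base R only (illustrative sketch)
first_na_sketch <- function(x) {
  # Assumption for this sketch: report 0L if there is no NA at all
  if (!anyNA(x)) return(0L)
  # is.na() materialises a full logical vector, which.max() then scans it
  which.max(is.na(x))
}

first_na_sketch(c(FALSE, NA, TRUE))

# Pitfall of the bare idiom: without the anyNA() guard, a vector
# without any NA still yields position 1
which.max(is.na(c(1, 2)))
```

The bare idiom allocates a logical vector as long as the input and always scans to the end, which is why a dedicated C routine that stops at the first NA can be faster.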
Gets C length of a vector ignoring any length-methods dispatched by classes
get_length(x)
x |
a vector |
Queries the vector length using C-macro LENGTH
, this can be substantially faster than
length(unclass(x))
integer scalar
length(bit(12))
get_length(bit(12))
Function setattr
sets a single attribute and function
setattributes
sets a list of attributes.
getsetattr(x, which, value)
setattr(x, which, value)
setattributes(x, attributes)
x |
an R object |
which |
name of the attribute |
value |
value of the attribute, use NULL to remove this attribute |
attributes |
a named list of attribute values |
The attributes of 'x' are changed in place without copying x. Function
setattributes
changes only the named attributes; it does not
delete the non-named attributes like attributes()
does.
invisible(); the changed object is deliberately not returned, as a reminder that the function is called for its side effect of changing its input object.
Jens Oehlschlägel
Writing R extensions – System and foreign language interfaces – Handling R objects in C – Attributes (Version 2.11.1 (2010-06-03 ) R Development)
x <- as.single(runif(10))
attr(x, "Csingle")
f <- function(x) attr(x, "Csingle") <- NULL
g <- function(x) setattr(x, "Csingle", NULL)
f(x)
x
g(x)
x

## Not run: 
# restart R
library(bit)

mysingle <- function(length = 0) {
  ret <- double(length)
  setattr(ret, "Csingle", TRUE)
  ret
}

# show that mysingle gives exactly the same result as single
identical(single(10), mysingle(10))

# look at the speedup and memory-savings of mysingle compared to single
system.time(mysingle(1e7))
memory.size(max=TRUE)
system.time(single(1e7))
memory.size(max=TRUE)

# look at the memory limits
# on my win32 machine the first line fails
# because of not enough RAM, the second works
x <- single(1e8)
x <- mysingle(1e8)

# e.g. performance with factors
x <- rep(factor(letters), length.out=1e7)
x[1:10]
# look how fast one can do this
system.time(setattr(x, "levels", rev(letters)))
x[1:10]
# look at the performance loss in time caused by the non-needed copying
system.time(levels(x) <- letters)
x[1:10]

# restart R
library(bit)

simplefactor <- function(n) {
  factor(rep(1:2, length.out=n))
}

mysimplefactor <- function(n) {
  ret <- rep(1:2, length.out=n)
  setattr(ret, "levels", as.character(1:2))
  setattr(ret, "class", "factor")
  ret
}

identical(simplefactor(10), mysimplefactor(10))

system.time(x <- mysimplefactor(1e7))
memory.size(max=TRUE)
system.time(setattr(x, "levels", c("a", "b")))
memory.size(max=TRUE)
x[1:4]
memory.size(max=TRUE)
rm(x)
gc()

system.time(x <- simplefactor(1e7))
memory.size(max=TRUE)
system.time(levels(x) <- c("x", "y"))
memory.size(max=TRUE)
x[1:4]
memory.size(max=TRUE)
rm(x)
gc()
## End(Not run)
If the table is sorted, this can be much faster than %in%
in.bitwhich(x, table, is.unsorted = NULL)
x |
a vector of integer |
table |
a |
is.unsorted |
logical telling the function whether the table is (un)sorted. With
the default |
logical vector
x <- bitwhich(100)
x[3] <- TRUE
in.bitwhich(c(NA, 2, 3), x)
These C-coded utilities speed up index preprocessing considerably.
intrle(x)
intisasc(x, na.method = c("none", "break", "skip")[2])
intisdesc(x, na.method = c("none", "break", "skip")[1])
x |
an integer vector |
na.method |
one of "none", "break", "skip", see details. The strange defaults stem from the initial usage. |
intrle
is faster by a factor of 50 and needs less RAM (2x its input
vector) than rle()
, which needs 9x the RAM of its input
vector. This is possible because the C code of intrle
is allowed to
break off early when it turns out that rle-packing will not achieve a compression
factor of 3 or better.
intisasc
is a faster version of is.unsorted()
: it checks whether x
is sorted.
intisdesc
checks for descending order and by default assumes that the
input x
contains no NAs.
na.method="none"
treats NAs
(the smallest integer) like every other integer and
hence returns either TRUE
or FALSE
.
na.method="break"
checks for NAs
and
returns NA
as soon as an NA
is encountered. na.method="skip"
checks for
NAs
and skips over them, hence decides the return value only on the basis of
non-NA values.
intrle
returns an object of class rle()
or NULL, if rle-compression is not
efficient (compression factor <3 or length(x) < 3
).
intisasc
returns one of FALSE, NA, TRUE
intisdesc
returns one of FALSE, TRUE
(if the input contains NAs, the output is
undefined)
intisasc()
: check whether integer vector is ascending
intisdesc()
: check whether integer vector is descending
Jens Oehlschlägel
ff::hi()
, rle()
, is.unsorted()
,
ff::is.sorted.default()
intrle(sample(1:10))
intrle(diff(1:10))
intisasc(1:10)
intisasc(10:1)
intisasc(c(NA, 1:10))
intisdesc(1:10)
intisdesc(c(10:1, NA))
intisdesc(c(10:6, NA, 5:1))
intisdesc(c(10:6, NA, 5:1), na.method="skip")
intisdesc(c(10:6, NA, 5:1), na.method="break")
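The three na.method settings described above can be mimicked in base R. The sketch below is a simplified model for an ascending check, not the package's C code: `is_ascending_sketch` is a hypothetical name, ties are treated as sorted (an assumption), and "break" reports NA whenever any NA is present, regardless of where it occurs.

```r
# Simplified model of the na.method semantics (illustrative sketch only)
is_ascending_sketch <- function(x, na.method = "break") {
  na.method <- match.arg(na.method, c("none", "break", "skip"))
  switch(na.method,
    # "none": treat NA like the smallest possible value, always TRUE or FALSE
    none = {
      v <- as.double(x)
      v[is.na(x)] <- -Inf
      !is.unsorted(v)
    },
    # "break": any NA makes the answer NA
    "break" = if (anyNA(x)) NA else !is.unsorted(x),
    # "skip": decide on the non-NA values only
    skip = !is.unsorted(x[!is.na(x)])
  )
}

is_ascending_sketch(c(NA, 1:10))           # "break" is the default here
is_ascending_sketch(c(NA, 1:10), "none")
is_ascending_sketch(c(5L, NA, 6L), "skip")
is_ascending_sketch(10:1)
```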
All booltypes()
including logical()
except 'nobool' types are considered
'is.booltype'.
is.booltype(x)
is.bit(x)
is.bitwhich(x)
is.which(x)
is.hi(x)
is.ri(x)
x |
an R object |
logical scalar
is.bit()
: tests for bit()
is.bitwhich()
: tests for bitwhich()
is.which()
: tests for which()
is.hi()
: tests for hi
is.ri()
: tests for ri()
booltypes()
, booltype()
, as.booltype()
sapply(
  list(double(), integer(), logical(), bit(), bitwhich(), as.which(), ri(1, 2, 3)),
  is.booltype
)
Test for NA in bit and bitwhich
## S3 method for class 'bit'
is.na(x)
## S3 method for class 'bitwhich'
is.na(x)
x |
a |
vector of same type with all elements FALSE
is.na(bitwhich)
: method for is.na()
from bitwhich()
is.na(bit(6))
is.na(bitwhich(6))
Query the number of bits in a bit()
vector or change the number
of bits in a bit vector.
Query the number of bits in a bitwhich() vector or change the number of bits in a bit
vector.
## S3 method for class 'bit'
length(x)
## S3 replacement method for class 'bit'
length(x) <- value
## S3 method for class 'bitwhich'
length(x)
## S3 replacement method for class 'bitwhich'
length(x) <- value
## S3 method for class 'ri'
length(x)
x |
a |
value |
the new number of bits |
NOTE that the length does NOT reflect the number of selected (TRUE
)
bits, it reflects the sum of both, TRUE
and FALSE
bits.
Increasing the length of a bit()
object will set new bits to
FALSE
. The behaviour of increasing the length of a
bitwhich()
object is different and depends on the content of the
object:
TRUE – all included, new bits are set to TRUE
positive integers – some included, new bits are set to FALSE
negative integers – some excluded, new bits are set to TRUE
FALSE – all excluded, new bits are set to FALSE
Decreasing the length of bit or bitwhich removes any previous information about the status of bits above the new length.
the length; for the replacement methods, a bit vector with the new length
Jens Oehlschlägel
length()
, sum()
,
poslength()
, maxindex()
stopifnot(length(ri(1, 1, 32)) == 32)

x <- as.bit(ri(32, 32, 32))
stopifnot(length(x) == 32)
stopifnot(sum(x) == 1)
length(x) <- 16
stopifnot(length(x) == 16)
stopifnot(sum(x) == 0)
length(x) <- 32
stopifnot(length(x) == 32)
stopifnot(sum(x) == 0)

x <- as.bit(ri(1, 1, 32))
stopifnot(length(x) == 32)
stopifnot(sum(x) == 1)
length(x) <- 16
stopifnot(length(x) == 16)
stopifnot(sum(x) == 1)
length(x) <- 32
stopifnot(length(x) == 32)
stopifnot(sum(x) == 1)

x <- as.bitwhich(bit(32))
stopifnot(length(x) == 32)
stopifnot(sum(x) == 0)
length(x) <- 16
stopifnot(length(x) == 16)
stopifnot(sum(x) == 0)
length(x) <- 32
stopifnot(length(x) == 32)
stopifnot(sum(x) == 0)

x <- as.bitwhich(!bit(32))
stopifnot(length(x) == 32)
stopifnot(sum(x) == 32)
length(x) <- 16
stopifnot(length(x) == 16)
stopifnot(sum(x) == 16)
length(x) <- 32
stopifnot(length(x) == 32)
stopifnot(sum(x) == 32)

x <- as.bitwhich(ri(32, 32, 32))
stopifnot(length(x) == 32)
stopifnot(sum(x) == 1)
length(x) <- 16
stopifnot(length(x) == 16)
stopifnot(sum(x) == 0)
length(x) <- 32
stopifnot(length(x) == 32)
stopifnot(sum(x) == 0)

x <- as.bitwhich(ri(2, 32, 32))
stopifnot(length(x) == 32)
stopifnot(sum(x) == 31)
length(x) <- 16
stopifnot(length(x) == 16)
stopifnot(sum(x) == 15)
length(x) <- 32
stopifnot(length(x) == 32)
stopifnot(sum(x) == 31)

x <- as.bitwhich(ri(1, 1, 32))
stopifnot(length(x) == 32)
stopifnot(sum(x) == 1)
length(x) <- 16
stopifnot(length(x) == 16)
stopifnot(sum(x) == 1)
length(x) <- 32
stopifnot(length(x) == 32)
stopifnot(sum(x) == 1)

x <- as.bitwhich(ri(1, 31, 32))
stopifnot(length(x) == 32)
stopifnot(sum(x) == 31)
message("NOTE the change from 'some excluded' to 'all included' here")
length(x) <- 16
stopifnot(length(x) == 16)
stopifnot(sum(x) == 16)
length(x) <- 32
stopifnot(length(x) == 32)
stopifnot(sum(x) == 32)
For is.booltype()
objects the term length()
is ambiguous.
For example the length of which()
corresponds to the sum of logical()
.
The generic maxindex
gives length(logical)
for all booltype()
s.
The generic poslength
gives the number of positively selected elements, i.e.
sum(logical)
for all booltype()
s (and gives NA
if NAs
are present).
## Default S3 method:
maxindex(x, ...)
## Default S3 method:
poslength(x, ...)
## S3 method for class 'logical'
maxindex(x, ...)
## S3 method for class 'logical'
poslength(x, ...)
## S3 method for class 'bit'
maxindex(x, ...)
## S3 method for class 'bit'
poslength(x, ...)
## S3 method for class 'bitwhich'
maxindex(x, ...)
## S3 method for class 'bitwhich'
poslength(x, ...)
## S3 method for class 'which'
maxindex(x, ...)
## S3 method for class 'which'
poslength(x, ...)
## S3 method for class 'ri'
maxindex(x, ...)
## S3 method for class 'ri'
poslength(x, ...)
maxindex(x, ...)
poslength(x, ...)
x |
an R object, typically a |
... |
further arguments (ignored) |
an integer scalar
maxindex(default)
: default method for maxindex
maxindex(logical)
: maxindex
method for class logical()
maxindex(bit)
: maxindex
method for class bit()
maxindex(bitwhich)
: maxindex
method for class bitwhich()
maxindex(which)
: maxindex
method for class which()
maxindex(ri)
: maxindex
method for class ri()
poslength(default)
: default method for poslength
poslength(logical)
: poslength
method for class logical()
poslength(bit)
: poslength
method for class bit()
poslength(bitwhich)
: poslength
method for class bitwhich()
poslength(which)
: poslength
method for class which()
poslength(ri)
: poslength
method for class ri()
r <- ri(1, 2, 12)
i <- as.which(r)
w <- as.bitwhich(r)
b <- as.bit(r)
l <- as.logical(r)
u <- which(l)  # unclassed which
sapply(list(r=r, u=u, i=i, w=w, b=b, l=l), function(x) {
  c(length=length(x), sum=sum(x), maxindex=maxindex(x), poslength=poslength(x))
})
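The ambiguity of length() that motivates maxindex and poslength can be seen with base R alone. In this small sketch the two quantities are computed by hand for a logical vector and for its positional form; it does not call the package generics.

```r
# A selection of 2 out of 12 elements, as logical and as positions
l <- c(TRUE, TRUE, rep(FALSE, 10))
i <- which(l)

length(l)  # total number of elements: what maxindex reports for a logical
sum(l)     # number of selected elements: what poslength reports
length(i)  # for positions, length() equals the number of selected elements
```

The same selection therefore has length 12 in one representation and length 2 in the other, which is why two unambiguous generics are useful.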
The merge_
functions allow unary and binary operations on (ascending) sorted vectors
of integer()
.
merge_rev(x)
will do in one scan what costs two scans in -rev(x)
, see also
reverse_vector()
.
Many of these merge_
can optionally scan their input in reverse order (and switch the
sign), which again saves extra scans for calling merge_rev(x)
first.
merge_rev(x)
merge_match(x, y, revx = FALSE, revy = FALSE, nomatch = NA_integer_)
merge_in(x, y, revx = FALSE, revy = FALSE)
merge_notin(x, y, revx = FALSE, revy = FALSE)
merge_duplicated(x, revx = FALSE)
merge_anyDuplicated(x, revx = FALSE)
merge_sumDuplicated(x, revx = FALSE)
merge_unique(x, revx = FALSE)
merge_union(
  x, y, revx = FALSE, revy = FALSE,
  method = c("unique", "exact", "all")
)
merge_setdiff(x, y, revx = FALSE, revy = FALSE, method = c("unique", "exact"))
merge_symdiff(x, y, revx = FALSE, revy = FALSE, method = c("unique", "exact"))
merge_intersect(
  x, y, revx = FALSE, revy = FALSE,
  method = c("unique", "exact")
)
merge_setequal(x, y, revx = FALSE, revy = FALSE, method = c("unique", "exact"))
merge_rangein(rx, y, revx = FALSE, revy = FALSE)
merge_rangenotin(rx, y, revx = FALSE, revy = FALSE)
merge_rangesect(rx, y, revx = FALSE, revy = FALSE)
merge_rangediff(rx, y, revx = FALSE, revy = FALSE)
merge_first(x, revx = FALSE)
merge_last(x, revx = FALSE)
merge_firstin(rx, y, revx = FALSE, revy = FALSE)
merge_lastin(rx, y, revx = FALSE, revy = FALSE)
merge_firstnotin(rx, y, revx = FALSE, revy = FALSE)
merge_lastnotin(rx, y, revx = FALSE, revy = FALSE)
x |
a sorted set |
y |
a sorted set |
revx |
default |
revy |
default |
nomatch |
integer value returned for non-matched elements, see |
method |
one of "unique", "exact" (or "all") which governs how to treat ties, see the function descriptions |
rx |
range of integers given as |
These are low-level functions and hence do not check whether the set is
actually sorted.
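To illustrate why sorted input matters, here is a sketch of a merge-style membership test in plain R. It only models the general two-pointer technique and is not the package's implementation; `merge_in_sketch` is a hypothetical name, and the sketch assumes both inputs are ascending and free of NA.

```r
# Two-pointer membership scan over two ascending integer vectors
# (illustrative sketch only)
merge_in_sketch <- function(x, y) {
  out <- logical(length(x))
  j <- 1L
  for (i in seq_along(x)) {
    # Advance in y while its current element is smaller than x[i];
    # j never moves backwards, so each vector is scanned only once
    while (j <= length(y) && y[j] < x[i]) j <- j + 1L
    out[i] <- j <= length(y) && y[j] == x[i]
  }
  out
}

merge_in_sketch(1:7, 3:9)
identical(merge_in_sketch(1:7, 3:9), 1:7 %in% 3:9)
```

If either input were unsorted, the pointer into `y` could skip past a matching element, so the result would silently be wrong rather than raise an error.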
Note that the merge_*
and merge_range*
functions have no special treatment for
NA
.
If vectors with NA
are sorted with NA
in the first positions (na.last=FALSE
) and
arguments revx=
or revy=
have not been used, then NAs
are treated like ordinary
integers. NA
sorted elsewhere or using revx=
or revy=
can cause unexpected
results (note for example that revx=
switches the sign on all integers but NAs
).
The binary merge_*
functions have a method="exact"
which in both sets treats consecutive occurrences of the same value as if they were
different values, more precisely they are handled as if the identity of ties were
tuples of ties, rank(ties)
. method="exact"
delivers unique output if the input is
unique, and in this case works faster than method="unique"
.
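The tie rules of the different methods can be written down as arithmetic on per-value counts. The following base R sketch reproduces those rules for two small sorted sets; `tie_counts` is a hypothetical helper, and the sketch models the described results rather than the merging algorithm itself.

```r
# Count how often each distinct value occurs in x and in y
tie_counts <- function(x, y) {
  vals <- sort(unique(c(x, y)))
  list(
    vals = vals,
    nx = tabulate(match(x, vals), nbins = length(vals)),
    ny = tabulate(match(y, vals), nbins = length(vals))
  )
}

x <- c(1L, 2L, 2L, 2L)
y <- c(2L, 2L, 3L)
tc <- tie_counts(x, y)

# union: "exact" keeps the maximum number of ties, "all" keeps the sum
union_exact <- rep.int(tc$vals, pmax(tc$nx, tc$ny))
union_all <- rep.int(tc$vals, tc$nx + tc$ny)
# intersect, "exact": the minimum number of ties in either set
intersect_exact <- rep.int(tc$vals, pmin(tc$nx, tc$ny))
# setdiff, "exact": ties in x minus ties in y, never below zero
setdiff_exact <- rep.int(tc$vals, pmax(tc$nx - tc$ny, 0L))
# symdiff, "exact": absolute difference of the tie counts
symdiff_exact <- rep.int(tc$vals, abs(tc$nx - tc$ny))

union_exact
union_all
intersect_exact
setdiff_exact
symdiff_exact
```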
merge_rev(x)
returns -rev(x)
for integer()
and double()
and
!rev(x)
for logical()
merge_match()
: returns integer positions of sorted set x in sorted set y, see
match(x, y, ...)
merge_in()
: returns logical existence of sorted set x in sorted set y, see
x %in% y
merge_notin()
: returns logical in-existence of sorted set x in sorted set y, see
!(x %in% y)
merge_duplicated()
: returns the duplicated status of a sorted set x, see
duplicated()
merge_anyDuplicated()
: returns the anyDuplicated status of a sorted set x, see
anyDuplicated()
merge_sumDuplicated()
: returns the sumDuplicated status of a sorted set x, see
bit_sumDuplicated()
merge_unique()
: returns unique elements of sorted set x, see unique()
merge_union()
: returns union of two sorted sets.
Default method='unique'
returns a unique sorted set, see union()
;
method='exact'
returns a sorted set with the maximum number of ties in either
input set; method='all'
returns a sorted set with the sum of ties in both input
sets.
merge_setdiff()
: returns sorted set x minus sorted set y
Default method='unique'
returns a unique sorted set, see setdiff()
;
method='exact'
returns a sorted set with sum(x ties) minus sum(y ties);
merge_symdiff()
: returns those elements that are in sorted set x
xor()
in
sorted set y
Default method='unique'
returns the sorted unique set complement, see symdiff()
;
method='exact'
returns a sorted set complement with
abs(sum(x ties) - sum(y ties))
.
merge_intersect()
: returns the intersection of two sorted sets x and y
Default method='unique'
returns the sorted unique intersect, see intersect()
;
method='exact'
returns the intersect with the minimum number of ties in either set;
merge_setequal()
: returns TRUE
for equal sorted sets and FALSE
otherwise
Default method='unique'
compares the sets after removing ties, see setequal()
;
method='exact'
compares the sets without removing ties;
merge_rangein()
: returns logical existence of range rx in sorted set y, see
merge_in()
merge_rangenotin()
: returns logical in-existence of range rx in sorted set y, see
merge_notin()
merge_rangesect()
: returns the intersection of range rx and sorted set y, see
merge_intersect()
merge_rangediff()
: returns range rx minus sorted set y, see merge_setdiff()
merge_first()
: quickly returns the first element of a sorted set x (or NA
if
x is empty), hence x[1]
or merge_rev(x)[1]
merge_last()
: quickly returns the last element of a sorted set x, (or NA
if
x is empty), hence x[n]
or merge_rev(x)[n]
merge_firstin()
: quickly returns the first common element of a range rx and a
sorted set y, (or NA
if the intersection is empty), hence
merge_first(merge_rangesect(rx, y))
merge_lastin()
: quickly returns the last common element of a range rx and a
sorted set y, (or NA
if the intersection is empty), hence
merge_last(merge_rangesect(rx, y))
merge_firstnotin()
: quickly returns the first element of a range rx which is not in a
sorted set y (or NA
if all rx are in y), hence merge_first(merge_rangediff(rx, y))
merge_lastnotin()
: quickly returns the last element of a range rx which is not in a
sorted set y (or NA
if all rx are in y), hence merge_last(merge_rangediff(rx, y))
Optimization opportunity: these low-level functions could be optimized with an initial binary search (not findInterval, which coerces to double).
merge_rev(1:9)
merge_match(1:7, 3:9)
# merge_match(merge_rev(1:7), 3:9)
merge_match(merge_rev(1:7), 3:9, revx=TRUE)
merge_match(merge_rev(1:7), 3:9, revy=TRUE)
merge_match(merge_rev(1:7), merge_rev(3:9))
merge_in(1:7, 3:9)
merge_notin(1:7, 3:9)
merge_anyDuplicated(c(1L, 1L, 2L, 3L))
merge_duplicated(c(1L, 1L, 2L, 3L))
merge_unique(c(1L, 1L, 2L, 3L))
merge_union(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L))
merge_union(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L), method="exact")
merge_union(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L), method="all")
merge_setdiff(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L))
merge_setdiff(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L), method="exact")
merge_setdiff(c(1L, 2L, 2L), c(2L, 2L, 2L, 3L), method="exact")
merge_symdiff(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L))
merge_symdiff(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L), method="exact")
merge_symdiff(c(1L, 2L, 2L), c(2L, 2L, 2L, 3L), method="exact")
merge_intersect(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L))
merge_intersect(c(1L, 2L, 2L, 2L), c(2L, 2L, 3L), method="exact")
merge_setequal(c(1L, 2L, 2L), c(1L, 2L))
merge_setequal(c(1L, 2L, 2L), c(1L, 2L, 2L))
merge_setequal(c(1L, 2L, 2L), c(1L, 2L), method="exact")
merge_setequal(c(1L, 2L, 2L), c(1L, 2L, 2L), method="exact")
These generics are packaged here for methods in packages bit64
and
ff
.
is.sorted(x, ...)
is.sorted(x, ...) <- value
na.count(x, ...)
na.count(x, ...) <- value
nvalid(x, ...)
nunique(x, ...)
nunique(x, ...) <- value
nties(x, ...)
nties(x, ...) <- value
x |
some object |
... |
ignored |
value |
value assigned on responsibility of the user |
see help of the available methods
see help of the available methods
Jens Oehlschlägel
bit64::is.sorted.integer64()
, bit64::na.count.integer64()
,
bit64::nvalid.integer64()
, bit64::nunique.integer64()
, bit64::nties.integer64()
methods("na.count")
Compatibility functions (to package ff) for getting and setting physical and virtual attributes.
## Default S3 method:
physical(x)
## Default S3 replacement method:
physical(x) <- value
## Default S3 method:
virtual(x)
## Default S3 replacement method:
virtual(x) <- value
## S3 method for class 'physical'
print(x, ...)
## S3 method for class 'virtual'
print(x, ...)
physical(x)
physical(x) <- value
virtual(x)
virtual(x) <- value
x |
a ff or ram object |
value |
a list with named elements |
... |
further arguments |
ff objects have physical and virtual attributes, which have different
copying semantics: physical attributes are shared between copies of ff
objects while virtual attributes might differ between copies.
ff::as.ram()
will retain some physical and virtual attributes in
the ram clone, such that ff::as.ff()
can restore an ff object
with the same attributes.
physical
and virtual
return a list with named elements
Jens Oehlschlägel
ff::physical.ff()
, ff::physical.ffdf()
physical(bit(12))
virtual(bit(12))
Print method for bit
## S3 method for class 'bit' print(x, ...)
x |
a bit vector |
... |
passed to print |
a character vector showing first and last elements of the bit vector
print(bit(120))
Print method for bitwhich
## S3 method for class 'bitwhich' print(x, ...)
x |
a |
... |
ignored |
In one pass over the vector NA
s are handled according to parameter
na.last
by range_sortna()
, then, if the vector is unsorted,
binary quicksort is invoked.
quicksort2(x, na.last = NA)
x |
an integer vector |
na.last |
|
a sorted vector
quicksort2(c(2L, 0L, 1L, NA, 2L))
quicksort2(c(2L, 0L, 1L, NA, 2L), na.last=TRUE)
quicksort2(c(2L, 0L, 1L, NA, 2L), na.last=FALSE)
In one pass over the vector, NAs are handled according to parameter na.last
by range_sortna(); then, if the vector is unsorted, threeway quicksort is invoked.
quicksort3(x, na.last = NA)
x |
an integer vector |
na.last |
|
a sorted vector
quicksort3(c(2L, 0L, 1L, NA, 2L))
quicksort3(c(2L, 0L, 1L, NA, 2L), na.last=TRUE)
quicksort3(c(2L, 0L, 1L, NA, 2L), na.last=FALSE)
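The threeway variant splits the data around a pivot into values smaller than, equal to and larger than the pivot, which pays off when the input contains many ties. The following base-R sketch only illustrates that idea; `quicksort3_sketch` is a hypothetical name, the package's own implementation is in C and is not reproduced here, and the `na.last` handling is assumed to follow `sort()`.

```r
# Hypothetical illustration of threeway quicksort; not the package's C code.
quicksort3_sketch <- function(x, na.last = NA) {
  n_na <- sum(is.na(x))
  v <- x[!is.na(x)]                        # set NAs aside first
  qs <- function(v) {
    if (length(v) < 2L) return(v)
    pivot <- v[(length(v) + 1L) %/% 2L]    # middle element as pivot
    # three partitions: smaller, equal (already in place), larger
    c(qs(v[v < pivot]), v[v == pivot], qs(v[v > pivot]))
  }
  s <- qs(v)
  nas <- rep(NA_integer_, n_na)
  if (is.na(na.last)) {
    s                                      # NA: missing values are dropped
  } else if (na.last) {
    c(s, nas)                              # TRUE: missing values at the end
  } else {
    c(nas, s)                              # FALSE: missing values at the front
  }
}

x <- c(2L, 0L, 1L, NA, 2L)
quicksort3_sketch(x)
quicksort3_sketch(x, na.last = TRUE)
quicksort3_sketch(x, na.last = FALSE)
```

Elements equal to the pivot are never passed to a recursive call, which is the point of the third partition.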
Get range and number of NAs
range_na(x)
x |
an integer vector |
an integer vector with three elements:
min integer
max integer
number of NAs
range_nanozero() and range_sortna()
range_na(c(0L, 1L, 2L, NA))
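For comparison, the same three numbers can be obtained in base R with several passes over the data. This is a sketch only: `range_na_sketch` is a hypothetical helper, and it covers only vectors with at least one non-NA value.

```r
# Hypothetical base-R counterpart; the package computes this in a single pass.
range_na_sketch <- function(x) {
  stopifnot(is.integer(x), any(!is.na(x)))  # sketch covers non-empty ranges only
  r <- range(x, na.rm = TRUE)               # min and max of the non-NA values
  c(r[1L], r[2L], sum(is.na(x)))            # min, max, number of NAs
}

range_na_sketch(c(0L, 1L, 2L, NA))
```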
Remove zeros and get range and number of NAs
range_nanozero(x)
x |
an integer vector |
an integer vector without zeros and with an attribute range_na() with three elements:
min integer
max integer
number of NAs
range_nanozero(c(0L, 1L, 2L, NA))
In one pass over the vector, NAs are treated according to parameter na.last
exactly as sort() does, and the range(), the number of NAs and the
unsortedness are determined.
range_sortna(x, decreasing = FALSE, na.last = NA)
x |
an integer vector |
decreasing |
(currently only |
na.last |
|
an integer vector in which NAs have been treated, with an attribute
range_na() with four elements:
min integer
max integer
number of NAs
0 for sorted vector and 1 for is.unsorted()
range_na() and range_nanozero()
range_sortna(c(0L, 1L, NA, 2L))
range_sortna(c(2L, NA, 1L, 0L))
range_sortna(c(0L, 1L, NA, 2L), na.last=TRUE)
range_sortna(c(2L, NA, 1L, 0L), na.last=TRUE)
range_sortna(c(0L, 1L, NA, 2L), na.last=FALSE)
range_sortna(c(2L, NA, 1L, 0L), na.last=FALSE)
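The result described above can be sketched in base R as follows. `range_sortna_sketch` is a hypothetical helper; the attribute name "range_na" and the choice to test unsortedness on the non-NA values are assumptions made for illustration, and the sketch needs at least one non-NA value.

```r
# Hypothetical base-R sketch; attribute name and details are assumptions.
range_sortna_sketch <- function(x, na.last = NA) {
  is_na <- is.na(x)
  v <- x[!is_na]                           # non-missing values, original order
  nas <- x[is_na]
  out <- if (is.na(na.last)) v else if (na.last) c(v, nas) else c(nas, v)
  # min, max, number of NAs, and 0/1 flag for "unsorted"
  attr(out, "range_na") <- c(min(v), max(v), length(nas),
                             as.integer(is.unsorted(v)))
  out
}

range_sortna_sketch(c(0L, 1L, NA, 2L))
range_sortna_sketch(c(2L, NA, 1L, 0L), na.last = TRUE)
```

A sorting routine can use the fourth element to skip the sort entirely when the data are already in order.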
Creating new bit or bitwhich by recycling such vectors
## S3 method for class 'bit'
rep(x, times = 1L, length.out = NA, ...)
## S3 method for class 'bitwhich'
rep(x, times = 1L, length.out = NA, ...)
x |
bit or bitwhich object |
times |
number of replications |
length.out |
final length of replicated vector (dominates times) |
... |
not used |
An object of class 'bit' or 'bitwhich'
Jens Oehlschlägel
rep(), bit(), bitwhich()
rep(as.bit(c(FALSE, TRUE)), 2)
rep(as.bit(c(FALSE, TRUE)), length.out=7)
rep(as.bitwhich(c(FALSE, TRUE)), 2)
rep(as.bitwhich(c(FALSE, TRUE)), length.out=1)
Repeats timing expr until minSec is reached
repeat.time(expr, gcFirst = TRUE, minSec = 0.5, envir = parent.frame())
expr |
Valid expression to be timed. |
gcFirst |
Logical: should a garbage collection be performed
immediately before the timing? Default is TRUE. |
minSec |
minimum number of seconds for which the timing is repeated |
envir |
the environment in which to evaluate expr |
An object of class "proc_time": see proc.time() for details.
Jens Oehlschlägel
system.time(1 + 1)
repeat.time(1 + 1)
system.time(sort(runif(1e6)))
repeat.time(sort(runif(1e6)))
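The idea of repeating a timing until a minimum duration has passed can be sketched in base R. `repeat_time_sketch` is a hypothetical helper, not the package function; it omits `gcFirst` and assumes that the reported value is the average time per evaluation.

```r
# Hypothetical sketch of "repeat until minSec is reached"; averaging is an assumption.
repeat_time_sketch <- function(expr, minSec = 0.5, envir = parent.frame()) {
  ex <- substitute(expr)                 # capture the unevaluated expression
  n <- 0L
  start <- proc.time()
  repeat {
    eval(ex, envir)                      # evaluate once more
    n <- n + 1L
    elapsed <- proc.time() - start
    if (elapsed[["elapsed"]] >= minSec) break
  }
  # average time per evaluation, returned as a "proc_time" object
  structure(unclass(elapsed) / n, class = "proc_time")
}

repeat_time_sketch(1 + 1, minSec = 0.1)
```

For very fast expressions a single `system.time()` call mostly measures timer resolution, which is why averaging over many repetitions gives more useful numbers.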
repfromto virtually recycles object x and cuts out positions from .. to
repfromto(x, from, to)
repfromto(x, from, to) <- value
x |
an object from which to recycle |
from |
first position to return |
to |
last position to return |
value |
value to assign |
repfromto is a generalization of rep(), where
rep(x, length.out = n) == repfromto(x, 1, n). You can see this as an R-side
(vector) solution of the mod_iterate macro in arithmetic.c.
a vector of length to - from + 1
Jens Oehlschlägel
message("a simple example")
repfromto(0:9, 11, 20)
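Virtual recycling amounts to modular indexing into x. The following base-R sketch shows the read case only; `repfromto_sketch` is a hypothetical name and not the package implementation.

```r
# Hypothetical sketch: positions from..to of the virtually recycled x.
repfromto_sketch <- function(x, from, to) {
  pos <- seq.int(from, to)             # positions in the recycled vector
  x[(pos - 1L) %% length(x) + 1L]      # map each position back into x
}

repfromto_sketch(0:9, 11, 20)          # second "copy" of 0:9
repfromto_sketch(1:3, 1, 7)            # same as rep(1:3, length.out = 7)
```

No recycled copy of x is ever materialised; only the to - from + 1 requested elements are created.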
Creating new bit or bitwhich by reversing such vectors
## S3 method for class 'bit'
rev(x)
## S3 method for class 'bitwhich'
rev(x)
x |
bit or bitwhich object |
An object of class 'bit' or 'bitwhich'
Jens Oehlschlägel
rev(), bit(), bitwhich()
rev(as.bit(c(FALSE, TRUE)))
rev(as.bitwhich(c(FALSE, TRUE)))
Returns a reversed copy – with attributes retained.
reverse_vector(x)
x |
an R vector |
This is substantially faster than rev()
a reversed vector
x <- factor(letters)
rev(x)
reverse_vector(x)
## Not run:
x <- 1:1e7
system.time(rev(x))
system.time(reverse_vector(x))
## End(Not run)
A range index can be used to extract or replace a continuous ascending part of the data
ri(from, to = NULL, maxindex = NA)
## S3 method for class 'ri'
print(x, ...)
from |
first position |
to |
last position |
maxindex |
the maximal length of the object-to-be-subscripted (if known) |
x |
an object of class 'ri' |
... |
further arguments |
A two element integer vector with class 'ri'
Jens Oehlschlägel
bit(12)[ri(1, 6)]
Basic utilities for rle packing and unpacking and appropriate methods for
rev() and unique().
rlepack(x, ...)
## S3 method for class 'integer'
rlepack(x, pack = TRUE, ...)
rleunpack(x)
## S3 method for class 'rlepack'
rleunpack(x)
## S3 method for class 'rlepack'
rev(x)
## S3 method for class 'rlepack'
unique(x, incomparables = FALSE, ...)
## S3 method for class 'rlepack'
anyDuplicated(x, incomparables = FALSE, ...)
x |
in 'rlepack' an integer vector, in the other functions an object of class 'rlepack' |
... |
just to keep R CMD CHECK quiet (not used) |
pack |
FALSE to suppress packing |
incomparables |
just to keep R CMD CHECK quiet (not used) |
A list with components:
first: the first element of the packed sequence
dat: either an object of class rle() or the complete input vector x if
rle-packing is not efficient
last: the last element of the packed sequence
Jens Oehlschlägel
ff::hi(), intrle(), rle(), rev(), unique()
x <- rlepack(rep(0L, 10))
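To see why a first/dat/last structure can be compact, consider run-length encoding the differences between neighbouring elements. That the encoding is applied to differences is an assumption made here for illustration (the text above does not spell it out), and `rlepack_sketch` and `rleunpack_sketch` are hypothetical helpers built on base `rle()`.

```r
# Hypothetical sketch: run-length encode the neighbour differences of x.
rlepack_sketch <- function(x) {
  list(first = x[1L], dat = rle(diff(x)), last = x[length(x)])
}

# Undo the packing: expand the runs and accumulate from the first element.
rleunpack_sketch <- function(p) {
  cumsum(c(p$first, inverse.rle(p$dat)))
}

x <- c(1:100, 200:150)
p <- rlepack_sketch(x)
length(p$dat$lengths)                 # number of runs needed for 151 values
identical(rleunpack_sketch(p), x)     # round trip restores x
```

Regular index sequences collapse to a handful of runs, while irregular data gain nothing, which matches the fallback to the complete input vector described for dat.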
These are generic stubs for low-level sorting and ordering methods
implemented in packages 'bit64' and 'ff'. The ..sortorder methods do
sorting and ordering at once, which requires more RAM than ordering but is
(almost) as fast as sorting.
ramsort(x, ...)
ramorder(x, i, ...)
ramsortorder(x, i, ...)
mergesort(x, ...)
mergeorder(x, i, ...)
mergesortorder(x, i, ...)
quicksort(x, ...)
quickorder(x, i, ...)
quicksortorder(x, i, ...)
shellsort(x, ...)
shellorder(x, i, ...)
shellsortorder(x, i, ...)
radixsort(x, ...)
radixorder(x, i, ...)
radixsortorder(x, i, ...)
keysort(x, ...)
keyorder(x, i, ...)
keysortorder(x, i, ...)
x |
a vector to be sorted by |
... |
further arguments to the sorting methods |
i |
integer positions to be modified by |
The sort generics sort their argument 'x'; some methods need temporary RAM
of the same size as 'x'. The order generics order their argument 'i',
leaving 'x' as it was; some methods need temporary RAM of the same size as
'i'. The sortorder generics sort their argument 'x' and order their argument
'i'; this way of ordering is much faster, at the price of requiring temporary
RAM for both 'x' and 'i' if the method requires temporary RAM. The ram
generics are high-level functions containing an optimizer that chooses the
'best' algorithms given some context.
These functions return the number of NAs found or assumed during sorting
generic | ff | bit64 |
---|---|---|
ramsort | ff::ramsort.default() | bit64::ramsort.integer64() |
shellsort | ff::shellsort.default() | bit64::shellsort.integer64() |
quicksort | | bit64::quicksort.integer64() |
mergesort | ff::mergesort.default() | bit64::mergesort.integer64() |
radixsort | ff::radixsort.default() | bit64::radixsort.integer64() |
keysort | ff::keysort.default() | |
ramorder | ff::ramorder.default() | bit64::ramorder.integer64() |
shellorder | ff::shellorder.default() | bit64::shellorder.integer64() |
quickorder | | bit64::quickorder.integer64() |
mergeorder | ff::mergeorder.default() | bit64::mergeorder.integer64() |
radixorder | ff::radixorder.default() | bit64::radixorder.integer64() |
keyorder | ff::keyorder.default() | |
ramsortorder | | bit64::ramsortorder.integer64() |
shellsortorder | | bit64::shellsortorder.integer64() |
quicksortorder | | bit64::quicksortorder.integer64() |
mergesortorder | | bit64::mergesortorder.integer64() |
radixsortorder | | bit64::radixsortorder.integer64() |
keysortorder | | |
Note that these methods purposely violate the functional programming
paradigm: they are called for the side effect of changing some of their
arguments. The rationale is that sorting is very RAM-intensive, and in
certain situations we might not want to allocate additional memory unless it
is necessary. The sort methods change x, the order methods change i, and the
sortorder methods change both x and i. You as the user are responsible for
creating copies of the input data 'x' and 'i' if you need unmodified versions.
Jens Oehlschlägel
sort() and order() in base R, bitsort() for faster integer sorting
Test for C-level identity of two atomic vectors
still.identical(x, y)
x |
an atomic vector |
y |
an atomic vector |
logical scalar
x <- 1:2
y <- x
z <- copy_vector(x)
still.identical(y, x)
still.identical(z, x)
To actually view the internal structure use str(unclass(bit))
## S3 method for class 'bit'
str(
  object,
  vec.len = strO$vec.len,
  give.head = TRUE,
  give.length = give.head,
  ...
)
object |
any R object about which you want to have some information. |
vec.len |
numeric (>= 0) indicating how many ‘first few’ elements
are displayed of each vector. The number is multiplied by different
factors (from .5 to 3) depending on the kind of vector. Defaults to
the |
give.head |
logical; if |
give.length |
logical; if |
... |
potential further arguments (required for Method/Generic reasons). |
str(bit(120))
To actually view the internal structure use str(unclass(bitwhich))
## S3 method for class 'bitwhich'
str(
  object,
  vec.len = strO$vec.len,
  give.head = TRUE,
  give.length = give.head,
  ...
)
object |
any R object about which you want to have some information. |
vec.len |
numeric (>= 0) indicating how many ‘first few’ elements
are displayed of each vector. The number is multiplied by different
factors (from .5 to 3) depending on the kind of vector. Defaults to
the |
give.head |
logical; if |
give.length |
logical; if |
... |
potential further arguments (required for Method/Generic reasons). |
str(bitwhich(120))
Fast aggregation functions for booltype() vectors (such as bit()), namely
all(), any(), anyNA(), min(), max(), range(), sum() and summary().
All boolean summaries (except anyNA, because its generic does not allow it)
have an optional range argument to restrict the range of evaluation.
Note that the boolean summaries have meanings and return values that differ
from logical aggregation functions: they treat NA as FALSE; min, max and
range give the minimum and maximum positions of TRUE; summary returns counts
of FALSE and TRUE together with the range.
Note that you can force the boolean interpretation by calling the booltype
method explicitly on any booltypes input, e.g. min.booltype(), see the
examples.
## S3 method for class 'bit'
all(x, range = NULL, ...)
## S3 method for class 'bit'
any(x, range = NULL, ...)
## S3 method for class 'bit'
anyNA(x, recursive = FALSE)
## S3 method for class 'bit'
sum(x, range = NULL, ...)
## S3 method for class 'bit'
min(x, range = NULL, ...)
## S3 method for class 'bit'
max(x, range = NULL, ...)
## S3 method for class 'bit'
range(x, range = NULL, ...)
## S3 method for class 'bit'
summary(object, range = NULL, ...)
## S3 method for class 'bitwhich'
all(x, range = NULL, ...)
## S3 method for class 'bitwhich'
any(x, range = NULL, ...)
## S3 method for class 'bitwhich'
anyNA(x, recursive = FALSE)
## S3 method for class 'bitwhich'
sum(x, range = NULL, ...)
## S3 method for class 'bitwhich'
min(x, range = NULL, ...)
## S3 method for class 'bitwhich'
max(x, range = NULL, ...)
## S3 method for class 'bitwhich'
range(x, range = NULL, ...)
## S3 method for class 'bitwhich'
summary(object, range = NULL, ...)
## S3 method for class 'which'
all(x, range = NULL, ...)
## S3 method for class 'which'
any(x, range = NULL, ...)
## S3 method for class 'which'
anyNA(x, recursive = FALSE)
## S3 method for class 'which'
sum(x, range = NULL, ...)
## S3 method for class 'which'
min(x, range = NULL, ...)
## S3 method for class 'which'
max(x, range = NULL, ...)
## S3 method for class 'which'
range(x, range = NULL, ...)
## S3 method for class 'which'
summary(object, range = NULL, ...)
## S3 method for class 'booltype'
all(x, range = NULL, ...)
## S3 method for class 'booltype'
any(x, range = NULL, ...)
## S3 method for class 'booltype'
anyNA(x, ...)
## S3 method for class 'booltype'
sum(x, range = NULL, ...)
## S3 method for class 'booltype'
min(x, range = NULL, ...)
## S3 method for class 'booltype'
max(x, range = NULL, ...)
## S3 method for class 'booltype'
range(x, range = NULL, ...)
## S3 method for class 'booltype'
summary(object, range = NULL, ...)
## S3 method for class 'ri'
all(x, range = NULL, ...)
## S3 method for class 'ri'
any(x, range = NULL, ...)
## S3 method for class 'ri'
anyNA(x, recursive = FALSE)
## S3 method for class 'ri'
sum(x, ...)
## S3 method for class 'ri'
min(x, ...)
## S3 method for class 'ri'
max(x, ...)
## S3 method for class 'ri'
range(x, ...)
## S3 method for class 'ri'
summary(object, ...)
x |
an object of class bit or bitwhich |
range |
a |
... |
formally required but not used |
recursive |
formally required but not used |
object |
an object of class bit |
Summaries of bit() vectors are quite fast because we use a double loop that
fixes each word in a processor register. Furthermore we break out of looping
as soon as possible. Summaries of bitwhich() vectors are even faster if the
selection is very skewed.
as expected
Jens Oehlschlägel
l <- c(NA, FALSE, TRUE)
b <- as.bit(l)
all(l)
all(b)
all(b, range=c(3, 3))
all.booltype(l, range=c(3, 3))
min(l)
min(b)
sum(l)
sum(b)
summary(l)
summary(b)
summary.booltype(l)
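The difference between the logical and the boolean interpretation can be reproduced with plain logical vectors. This is a base-R sketch of the semantics described above, not of the package's C implementation; the variable names are illustrative.

```r
l <- c(NA, FALSE, TRUE)

# Logical interpretation: NA propagates
sum(l)                        # NA
min(l)                        # NA

# Boolean interpretation: NA counts as FALSE
b <- !is.na(l) & l
pos <- which(b)               # positions of TRUE
sum(b)                        # number of TRUE values
c(min = min(pos), max = max(pos))   # first and last TRUE position
all(b)                        # FALSE over the whole vector
all(b[3:3])                   # TRUE when evaluation is restricted to 3..3
```

Under the boolean interpretation min and max answer "where is the first or last selected element", which is what chunked processing needs.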
Symmetric set complement
symdiff(x, y)
x |
a vector |
y |
a vector |
union(setdiff(x, y), setdiff(y, x))
Note that symdiff(x, y) is not identical() to symdiff(y, x) unless sort()
is applied to the results
merge_symdiff() and xor()
symdiff(c(1L, 2L, 2L), c(2L, 3L))
symdiff(c(2L, 3L), c(1L, 2L, 2L))
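The returned value is stated above as union(setdiff(x, y), setdiff(y, x)), so the order dependence can be checked with base R alone. `symdiff_sketch` is a hypothetical stand-in that uses exactly that expression.

```r
# Stand-in built from the expression given above.
symdiff_sketch <- function(x, y) union(setdiff(x, y), setdiff(y, x))

a <- symdiff_sketch(c(1L, 2L, 2L), c(2L, 3L))
b <- symdiff_sketch(c(2L, 3L), c(1L, 2L, 2L))
a
b
identical(a, b)               # same elements, different order
identical(sort(a), sort(b))   # equal once both are sorted
```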
Returns object with attributes removed
unattr(x)
x |
any R object |
attribute removal copies the object as usual
a similar object with attributes removed
Jens Oehlschlägel
attributes(), setattributes(), unclass()
bit(2)[]
unattr(bit(2)[])
vecseq returns concatenated multiple sequences
vecseq(x, y = NULL, concat = TRUE, eval = TRUE)
x |
vector of sequence start points |
y |
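vector of sequence end points (if NULL, x are taken as end points, with all sequences starting at 1) |
concat |
logical; if TRUE the sequences are concatenated into a single result, if FALSE a list is returned |
eval |
logical; if TRUE the evaluated sequence(s) are returned, if FALSE the generating call(s) |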
This is a generalization of sequence() in that you can choose sequence
starts other than 1, and you also have options not to concatenate and/or to
return a call instead of the evaluated sequence.
if concat == FALSE and eval == FALSE: a list with n calls that generate sequences
if concat == FALSE and eval == TRUE: a list with n sequences
if concat == TRUE and eval == FALSE: a single call generating the concatenated sequences
if concat == TRUE and eval == TRUE: an integer vector of concatenated sequences
Angelo Canty, Jens Oehlschlägel
:, seq(), sequence()
sequence(c(3, 4))
vecseq(c(3, 4))
vecseq(c(1, 11), c(5, 15))
vecseq(c(1, 11), c(5, 15), concat=FALSE, eval=FALSE)
vecseq(c(1, 11), c(5, 15), concat=FALSE, eval=TRUE)
vecseq(c(1, 11), c(5, 15), concat=TRUE, eval=FALSE)
vecseq(c(1, 11), c(5, 15), concat=TRUE, eval=TRUE)
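The evaluated cases can be sketched in base R with Map(). `vecseq_sketch` is a hypothetical helper; it does not cover eval = FALSE (returning calls), and treating x as end points when y is NULL is an assumption based on the comparison with sequence().

```r
# Hypothetical sketch of the eval = TRUE cases only.
vecseq_sketch <- function(x, y = NULL, concat = TRUE) {
  if (is.null(y)) {            # assumption: x are end points, starts are all 1
    y <- x
    x <- rep(1, length(y))
  }
  seqs <- Map(function(from, to) seq.int(from, to), x, y)  # one sequence per pair
  if (concat) unlist(seqs) else seqs
}

vecseq_sketch(c(1, 11), c(5, 15))                  # 1..5 followed by 11..15
vecseq_sketch(c(1, 11), c(5, 15), concat = FALSE)  # list of two sequences
vecseq_sketch(c(3, 4))                             # compare with sequence(c(3, 4))
```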
Boolean NEGATION '!', AND '&', OR '|' and EXCLUSIVE OR 'xor', see Logic.
## Default S3 method:
xor(x, y)
## S3 method for class 'logical'
xor(x, y)
## S3 method for class 'bit'
!x
## S3 method for class 'bit'
e1 & e2
## S3 method for class 'bit'
e1 | e2
## S3 method for class 'bit'
e1 == e2
## S3 method for class 'bit'
e1 != e2
## S3 method for class 'bit'
xor(x, y)
## S3 method for class 'bitwhich'
!x
## S3 method for class 'bitwhich'
e1 & e2
## S3 method for class 'bitwhich'
e1 | e2
## S3 method for class 'bitwhich'
e1 == e2
## S3 method for class 'bitwhich'
e1 != e2
## S3 method for class 'bitwhich'
xor(x, y)
## S3 method for class 'booltype'
e1 & e2
## S3 method for class 'booltype'
e1 | e2
## S3 method for class 'booltype'
e1 == e2
## S3 method for class 'booltype'
e1 != e2
## S3 method for class 'booltype'
xor(x, y)
xor(x, y)
x |
a |
y |
a |
e1 |
a |
e2 |
a |
The binary operators and function xor can now combine any is.booltype()
vectors. They now recycle if vectors have different length. If the two
arguments have different booltypes() the return value corresponds to the
lower booltype() of the two.
Boolean operations on bit() vectors are extremely fast because they are
implemented using C's bitwise operators. Boolean operations on bitwhich()
vectors are even faster if they represent very skewed selections.
The xor function has been made generic and xor.default has been implemented
much faster than R's standard xor(). This was possible because the boolean
function xor and the comparison operator != actually do the same (even with
NAs), and != is much faster than the multiple calls in (x | y) & !(x & y).
An object of class booltype() or logical()
xor(default): default method for xor()
xor(bitwhich): bitwhich() method for xor()
xor(booltype): booltype() method for xor()
`!`(bitwhich): bitwhich() method for !
&: bitwhich() method for &
|: bitwhich() method for |
==: bitwhich() method for ==
!=: bitwhich() method for !=
&: booltype() method for &
|: booltype() method for |
==: booltype() method for ==
!=: booltype() method for !=
Jens Oehlschlägel
x <- c(FALSE, FALSE, FALSE, NA, NA, NA, TRUE, TRUE, TRUE)
y <- c(FALSE, NA, TRUE, FALSE, NA, TRUE, FALSE, NA, TRUE)
x | y
x | as.bit(y)
x | as.bitwhich(y)
x | as.which(y)
x | ri(1, 1, 9)
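The claim that xor and != agree on logical vectors, even with NAs, can be checked in base R over all nine combinations of FALSE, NA and TRUE. This sketch uses plain logical vectors only; the names `slow` and `fast` are illustrative.

```r
x <- c(FALSE, FALSE, FALSE, NA, NA, NA, TRUE, TRUE, TRUE)
y <- c(FALSE, NA, TRUE, FALSE, NA, TRUE, FALSE, NA, TRUE)

slow <- (x | y) & !(x & y)    # textbook definition: four vectorised operations
fast <- x != y                # a single comparison

identical(slow, fast)         # both give the same result, NAs included
```

The single comparison allocates one result vector instead of several intermediates, which is where the speed difference comes from.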