(discussion) NA handling #78
zhangyingmath
asked this question in
Ideas
The way NA behaves in R is interesting. First off, R has an integer type and a numeric type (numeric covers both integer and double), which is different from Python/numpy. In a base R vector or data.frame:
df = data.frame(f=c(1L, 2L, NA), g=c(1,2,NA), h=c('a', 'b', NA), m=c(TRUE, FALSE, NA))
A data.table object behaves almost the same. In fact, if you take dt$f, it behaves according to the above rules. If you take dt, then an NA in a mask is not selected, instead of returning NA. This seems desirable in my opinion.
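The two mask behaviours described above can be sketched in plain Python. This is only an illustration, not R or data.table code: NA is modelled with `None`, and the helper names `select_propagate` and `select_drop` are hypothetical.

```python
# NA is modelled with None; both helper names are hypothetical.
NA = None

def select_propagate(values, mask):
    """Base-R-vector style: an NA in the mask yields an NA in the result."""
    out = []
    for v, m in zip(values, mask):
        if m is NA:
            out.append(NA)
        elif m:
            out.append(v)
    return out

def select_drop(values, mask):
    """data.table style: an NA in the mask is treated as 'not selected'."""
    return [v for v, m in zip(values, mask) if m is not NA and m]

f = [1, 2, NA]
mask = [True, False, NA]  # e.g. the result of comparing f with 1

print(select_propagate(f, mask))  # [1, None]
print(select_drop(f, mask))       # [1]
```

The second form is the one the comment calls desirable: filtering on a condition never produces rows that were not actually matched.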
(This is not necessarily a request to riptide. I would just like to write down a wish list here to see if other people agree with me, or if I agree with myself three months from now.)
I wish we could create an NA object that has most of the NaN properties specified in IEEE 754 (https://en.wikipedia.org/wiki/NaN), i.e. NA does not equal itself in comparison, it propagates through arithmetic operations, etc., but is not required to be a float. (In other words, I guess I want an "atomic NA" that can fit in float, int, str, or object columns.)
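A minimal sketch of what such an "atomic NA" could look like in pure Python, next to the float NaN behaviour it imitates. The `NAType` class and the `NA` name are hypothetical and are not part of riptide or numpy; the sketch covers only equality and a handful of arithmetic operators.

```python
import math

class NAType:
    """Hypothetical type-agnostic missing value (not a riptide API)."""
    _instance = None

    def __new__(cls):
        # Keep a single shared instance so that `x is NA` is a reliable test.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return "NA"

    # Comparison follows the NaN convention: NA equals nothing, itself included.
    def __eq__(self, other):
        return False

    def __ne__(self, other):
        return True

    __hash__ = object.__hash__

    # Arithmetic propagates: any operation involving NA returns NA.
    def _propagate(self, other):
        return self

    __add__ = __radd__ = __sub__ = __rsub__ = _propagate
    __mul__ = __rmul__ = __truediv__ = __rtruediv__ = _propagate

NA = NAType()

# The same object can sit in int, float, or str columns.
ints, floats, strs = [1, 2, NA], [1.0, 2.0, NA], ["a", "b", NA]

print(NA == NA)              # False, like math.nan == math.nan
print(1 + NA, NA * 2.5)      # NA NA
print(math.nan == math.nan)  # False
```

Here the columns are plain Python lists; how a packed array would store such a value is deliberately left open.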
In addition, I want:
The NA is "congruent" whether I access it as part of an array or as a scalar extracted from the array. In other words, func(arr)[i] gives the same result as func(arr[i]) if the i-th position is NA. (I am sure I will think of something else to add.)
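To make the congruence property concrete, here is a small sketch of one way it can fail. It assumes a hypothetical scheme where a missing integer is stored as a sentinel value; `INT_NA`, `add_one_array`, `add_one_scalar`, and `is_congruent` are illustrative names only.

```python
INT_NA = -2**31  # hypothetical sentinel standing in for a missing int32

def add_one_array(arr):
    """Array-level function that knows about the sentinel and preserves it."""
    return [x if x == INT_NA else x + 1 for x in arr]

def add_one_scalar(x):
    """Scalar-level function that sees only a plain integer."""
    return x + 1

def is_congruent(array_func, scalar_func, arr, i):
    """Check that func(arr)[i] equals func(arr[i]) at position i."""
    return array_func(arr)[i] == scalar_func(arr[i])

arr = [10, INT_NA, 30]

print(is_congruent(add_one_array, add_one_scalar, arr, 0))  # True
print(is_congruent(add_one_array, add_one_scalar, arr, 1))  # False
```

Once the sentinel is extracted as a bare integer, nothing marks it as missing any more, so the scalar path silently computes on it. An NA that carries its own identity would avoid this.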
I don't care how it is represented internally, in RAM or on disk.
(Vague here:) Some people argue that an NA that does not equal itself loses reflexivity. I guess when we make comparisons, we might be comparing values (one by one), array objects, or certain pieces of memory. Somehow I feel we might get some mileage out of distinguishing == versus is. (As an analogy, I also wonder if we can distinguish the "view versus copy" behavior by supporting two different assignment operators: ":=" means copy, and "=" means be lazy about copying and most of the time just take a view/reference.) This is an independent problem, and I will post examples later.
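Plain Python already separates the two kinds of comparison for float NaN, which gives a feel for the == versus is distinction. This is a short illustration using only the standard library, not a proposal for any particular API:

```python
import math

x = math.nan

print(x == x)      # False: value comparison follows IEEE 754
print(x is x)      # True: identity comparison, same object
print([x] == [x])  # True: list comparison checks identity before equality
print(x in [x])    # True: containment uses the same identity shortcut
```

So reflexivity is lost only for value comparison. Identity comparison, and the container operations built on it, still treat the object as equal to itself.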