Logical condition while subsetting not giving correct values
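I want to subset the data frame I am working with, project, using logical conditions, and I am getting a contradictory result. The logical part before the ROLL.NO. condition is not relevant to the question. Sorry, I cannot give a reproducible example. Please tell me how to make this question reproducible without showing all 393 entries of the relevant columns of my data frame. D14 and DC31 are simple integer values, some of which are NA.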
culprits<-project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)]
culprits
[1] 3138 3129 3129 3135 3135 3136 3120 3126 3133 3125 3125 3125 3132 3132 3123 3123 3131
project$HOUSE.NO[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3131]
[1] "14/132" "14/176" "16/133" "14/111" "14/252"
project$HOUSE.NO[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3129]
[1] "14/132" "15/162" "14/176" "16/133" "14/111"
project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3136]
[1] 3129 3136 3120 3123 3123
project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3125]
[1] 3129 3120 3125 3125 3125 3123 3123
project$ROLL.NO.[project$ROLL.NO.==3136]
[1] 3136 3136 3136 3136 3136 3136 3136 3136 3136
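I am trying to understand what is wrong with my code, and I have included the results of these queries. Since project$ROLL.NO.==3136 is FALSE for every other ROLL.NO., I do not understand why other ROLL.NO. values are returned when the additional conditions are combined with &. Also, the same three entries are wrongly repeated no matter which ROLL.NO. is requested. The ROLL.NO. column has no NA values, and the logical vectors for each condition have the same length, so there is no recycling. Let me know if any other information is needed.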
Appendix
project <- structure(list(ROLL.NO. = c(3138L, 3138L, 3138L, 3138L, 3138L,
3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L,
3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3129L, 3129L,
3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L,
3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L,
3129L, 3129L, 3129L, 3121L, 3121L, 3121L, 3121L, 3121L, 3121L
), DC31 = c(2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L,
1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L,
2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
1L, 2L, 2L, 2L, 2L), D14 = c(2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L,
1L, 2L, 1L, 2L, 0L, 1L, 2L, 2L, 0L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L,
2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L), HOUSE.NO = c("14/274",
"14/259", "14/217", "14/258", "14/306", "14/300", "14/96", "14/166",
"14/69", "14/68", "14/16", "14/93", "14/130", "14/321", "14/324",
"14/139", "14/314", "14/323", "14/208", "14/78", "14/150", "14/155",
"14/102", "14/132", "14/159", "14/163", "14/165", "14/146", "14/148",
"14/104", "14/56", "14/53", "14/99", "14/48", "15/164", "15/148",
"15/158", "15/107", "15/160", "15/162", "15/243", "15/66", "15/249",
"15/86", "14/388", "14/396", "14/431", "14/401", "14/103", "15/36"
)), .Names = c("ROLL.NO.", "DC31", "D14", "HOUSE.NO"), row.names = c(NA,
50L), class = "data.frame")
From ?base::Logic, help('&'), help('|'), etc.:
See Syntax
for the precedence of these operators: unlike many other languages (including S) the AND and OR operators do not have the same precedence (the AND operators have higher precedence than the OR operators).
This explains why
TRUE | TRUE & FALSE
# [1] TRUE
is essentially
TRUE | (TRUE & FALSE)
which is also TRUE, and is a simplified version of what you are doing here:
(project$DC31==1&project$D14==2) |
(project$DC31==2&project$D14==1) &
!is.na(project$DC31) &
!is.na(project$D14) &
project$ROLL.NO. == 3131
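I assume you want the result to include only rows where project$ROLL.NO. == 3131. Because & binds more tightly than |, the ROLL.NO. test (and the is.na checks) apply only to the second parenthesised term. So even when project$ROLL.NO. == 3131 is FALSE, a row is still selected whenever the first ORed term is TRUE, and you can get rows whose ROLL.NO. is not 3131.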
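One possible fix, sketched on a small made-up data frame (toy and its values are hypothetical, not the asker's data), is to wrap the whole OR expression in parentheses so the remaining conditions apply to both branches:

```r
# Hypothetical toy data standing in for the asker's `project` data frame
toy <- data.frame(
  ROLL.NO. = c(3131L, 3131L, 3129L, 3129L, 3136L),
  DC31     = c(1L,    2L,    1L,    NA,    2L),
  D14      = c(2L,    2L,    2L,    1L,    1L)
)

# Rows where DC31 and D14 disagree (1 vs 2 or 2 vs 1); may contain NA
mismatch <- (toy$DC31 == 1 & toy$D14 == 2) | (toy$DC31 == 2 & toy$D14 == 1)

# Without outer parentheses, the ROLL.NO. test binds only to the second branch
loose <- (toy$DC31 == 1 & toy$D14 == 2) |
  (toy$DC31 == 2 & toy$D14 == 1) & toy$ROLL.NO. == 3131

# With the OR grouped first, the ROLL.NO. test filters every candidate row
strict <- mismatch & !is.na(mismatch) & toy$ROLL.NO. == 3131

toy$ROLL.NO.[which(loose)]   # includes a row that is not 3131
toy$ROLL.NO.[strict]         # only 3131
```

Computing the OR once and reusing it also keeps the NA handling in one place.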
Also note that ! has higher precedence than the logical AND and OR operators.
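A small illustration of that last point, using made-up vectors (x and keep are hypothetical names): the negation is applied to is.na(x) before the & is evaluated, which is why the !is.na(...) terms in the question need no extra parentheses of their own.

```r
# Hypothetical example vectors
x    <- c(1L, NA, 2L)
keep <- c(TRUE, TRUE, FALSE)

a <- !is.na(x) & keep        # parsed as (!is.na(x)) & keep
b <- (!is.na(x)) & keep      # explicit grouping, same result as `a`
d <- !(is.na(x) & keep)      # negating the whole AND gives something else

identical(a, b)
identical(a, d)
```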