How to remove rows that have NULL values in R

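Below is sample data and one manipulation. In the bigger picture, I am reading in a bunch of Excel files that are organized by year, then taking only select columns (14 out of 1000) and putting them into new data frames (df1 and df2, for example). From there, I combine these new data frames into one final data frame. My question is how to remove the rows in the final data frame that are populated with NULL values. I could filter them out each time, but I am hoping to simply remove them in R and be done with them.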

testyear <- c(2010, 2010, 2010, 2010, 2011, 2011, 2011, 2010)
teststate <- c("CA", "Co", "NV", "NE", "CA", "CO", "NV", "NE")
totalhousehold <- c(251, 252, 253, "NULL", 301, 302, 303, "NULL")
marriedhousehold <- c(85, 86, 87, "NULL", 158, 159, 245, "NULL")

test1 <- data.frame(testyear, teststate, totalhousehold, marriedhousehold)

testyear <- c(2012, 2012, 2012, 2012)
teststate <- c("WA", "OR", "WY", "UT")
totalhousehold <- c(654, 650, 646, 641)
marriedhousehold <- c(400, 399, 398, 395)

test2 <- data.frame(testyear, teststate, totalhousehold, marriedhousehold)

test3 <- rbind(test1, test2)

Because these are character columns, we can filter on only the character columns to return the rows that have no "NULL" elements, then use type.convert to change the column types.

library(dplyr)
test4 <- test3 %>%
  filter(across(where(is.character), ~ . != "NULL")) %>%
  type.convert(as.is = TRUE)

-output

> test4
   testyear teststate totalhousehold marriedhousehold
1      2010        CA            251               85
2      2010        Co            252               86
3      2010        NV            253               87
4      2011        CA            301              158
5      2011        CO            302              159
6      2011        NV            303              245
7      2012        WA            654              400
8      2012        OR            650              399
9      2012        WY            646              398
10     2012        UT            641              395
> str(test4)
'data.frame':   10 obs. of  4 variables:
 $ testyear        : int  2010 2010 2010 2011 2011 2011 2012 2012 2012 2012
 $ teststate       : chr  "CA" "Co" "NV" "CA" ...
 $ totalhousehold  : int  251 252 253 301 302 303 654 650 646 641
 $ marriedhousehold: int  85 86 87 158 159 245 400 399 398 395
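
As a side note, recent dplyr releases deprecate using across() inside filter() and offer if_all() for the same "every selected column passes" condition. The following is only a sketch of that variant, assuming a dplyr version that provides if_all() and R 4.0 or later (so that strings stay character rather than becoming factors). The name test4_if_all is made up for illustration, and the data is rebuilt from the question so the snippet runs on its own.

```r
library(dplyr)

# Rebuild the question's combined data frame (assumes R >= 4.0,
# where data.frame() keeps strings as character by default)
test1 <- data.frame(
  testyear = c(2010, 2010, 2010, 2010, 2011, 2011, 2011, 2010),
  teststate = c("CA", "Co", "NV", "NE", "CA", "CO", "NV", "NE"),
  totalhousehold = c(251, 252, 253, "NULL", 301, 302, 303, "NULL"),
  marriedhousehold = c(85, 86, 87, "NULL", 158, 159, 245, "NULL")
)
test2 <- data.frame(
  testyear = c(2012, 2012, 2012, 2012),
  teststate = c("WA", "OR", "WY", "UT"),
  totalhousehold = c(654, 650, 646, 641),
  marriedhousehold = c(400, 399, 398, 395)
)
test3 <- rbind(test1, test2)

# Keep a row only if every character column differs from the string "NULL",
# then let type.convert() turn the cleaned columns back into numbers
test4_if_all <- test3 %>%
  filter(if_all(where(is.character), ~ .x != "NULL")) %>%
  type.convert(as.is = TRUE)
```

The result should match test4 above; the only change is that if_all() states the row condition explicitly.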

Or in base R, use subset with rowSums to create the logical expression

type.convert(subset(test3, !rowSums(test3 == "NULL")), as.is = TRUE)
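
To see why that one-liner works, it may help to split it into steps. This is a sketch on a small made-up data frame (df, is_null, null_count, keep and cleaned are illustrative names, not part of the question's data).

```r
# Small stand-in data frame; the values are hypothetical
df <- data.frame(
  state = c("CA", "NE", "WA"),
  total = c("251", "NULL", "654"),
  married = c("85", "NULL", "400")
)

is_null <- df == "NULL"         # logical matrix: TRUE where a cell is "NULL"
null_count <- rowSums(is_null)  # how many "NULL" cells each row contains
keep <- null_count == 0         # same rows that !rowSums(is_null) selects

# Drop the flagged rows, then convert the remaining columns to proper types
cleaned <- type.convert(df[keep, ], as.is = TRUE)
```

Negating the row sums relies on 0 being treated as FALSE, so !rowSums(...) is TRUE only for rows without any "NULL" cell. If the data could also contain real NA values, the comparison would return NA for those cells, and rowSums(is_null, na.rm = TRUE) is one way to keep the row sums well defined.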

Why use dplyr when this can be done simply?

test3[test3 == "NULL"] <- NA
test3 <- na.omit(test3)
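
One thing to keep in mind with this approach: the affected columns remain character after the rows are dropped, so a conversion step is probably still wanted. A minimal sketch on a small made-up data frame (df and class_before are illustrative names):

```r
# Small stand-in data frame; the values are hypothetical
df <- data.frame(
  state = c("CA", "NE", "WA"),
  total = c("251", "NULL", "654")
)

df[df == "NULL"] <- NA   # replace the "NULL" strings with real missing values
df <- na.omit(df)        # drop every row that contains an NA

class_before <- class(df$total)       # the column is still character here
df <- type.convert(df, as.is = TRUE)  # now total becomes an integer column
```

Note that na.omit() also drops rows that had genuine NA values in any column, which may or may not be what is wanted.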