Why can't I refer to a column by its number after reading a large CSV with fread?

Question

When I use fread to read a CSV, for example:

library(data.table)
outcome4<-fread("outcome-of-care-measures.csv")

I then want to keep only the observations where column 11 ('Hospital 30-Day Death (Mortality) Rates from Heart Attack') is not 'Not Available'. So I wrote:

outcome5<-subset(outcome4, outcome4[,11]!="Not Available")

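But subset seems to have no effect: outcome5 has the same observations as outcome4, and outcome4[,11] gives just 11 instead of the values in column 11.

Why? If I use read.csv instead, everything works as expected.

Thanks in advance!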

Answer 1

Look at the result of outcome4[, 11].

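As you wrote, fread() returns a data.table. In turn, outcome4[, 11] returns 11, because data.table evaluates the second argument as an expression rather than as a column position (releases from 1.9.8 onward return the column instead). 11 is never equal to "Not Available", so you get the whole table back. Either use outcome4[, 11, with = FALSE] to get the 11th column of the data.table, or pass data.table = FALSE to fread() so that it returns a data frame instead of a data.table.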
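A minimal sketch of selecting by position, using a small made-up table in place of the real file. The column names `a`, `b`, `rate` and their values are assumptions for illustration, with the third column standing in for column 11. The `[[` extraction is an additional option not mentioned above; it behaves the same on data frames and data.tables.

```r
library(data.table)

# Hypothetical stand-in for outcome4; column 3 plays the role of column 11.
dt <- data.table(a = 1:3,
                 b = c("x", "y", "z"),
                 rate = c("14.3", "Not Available", "18.5"))

# with = FALSE makes data.table treat j as column positions, not an expression.
col_dt <- dt[, 3, with = FALSE]   # a one-column data.table

# [[ ]] extracts the column as a plain vector.
col_vec <- dt[[3]]

# Row filter by column position.
kept <- dt[dt[[3]] != "Not Available"]

# A plain data frame indexes by position the way read.csv output does.
df <- as.data.frame(dt)
kept_df <- subset(df, df[, 3] != "Not Available")
```

With the real data, `3` would be `11` and `dt` would be `outcome4`.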

But the data.table way is:

outcome4[<column name 11> != "Not Available"]

where <column name 11> is the unquoted name of column 11.
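Because the column name in the question contains spaces and parentheses, it has to be wrapped in backticks when used unquoted. The sketch below assumes a made-up two-column table (the `State` column and all values are hypothetical); the `get()` variant is one possible way to go from a position to a name, not something the answer prescribes.

```r
library(data.table)

# Hypothetical stand-in; the long name mirrors the one in the question.
dt <- data.table(
  State = c("AL", "AK", "AZ"),
  `Hospital 30-Day Death (Mortality) Rates from Heart Attack` =
    c("14.3", "Not Available", "18.5")
)

# Backticks let a non-syntactic column name be used unquoted inside [ ].
kept <- dt[`Hospital 30-Day Death (Mortality) Rates from Heart Attack` != "Not Available"]

# Alternative: look the name up by position, then fetch the column with get().
col_name <- names(dt)[2]   # position 2 here; it would be 11 in the real file
kept2 <- dt[get(col_name) != "Not Available"]
```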

Alternatively, convert "Not Available" to NA while reading the file:

outcome4 <- fread(file, na.strings = "Not Available")
outcome4[!is.na(<column name 11>)]
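The same idea as a self-contained sketch, assuming an inline CSV in place of the real file (the `State` and `Rate` columns and their values are made up for illustration):

```r
library(data.table)

# Inline CSV standing in for the real file (contents are hypothetical).
csv_text <- "State,Rate\nAL,14.3\nAK,Not Available\nAZ,18.5"

# na.strings turns the marker into NA while the data is parsed.
dt <- fread(text = csv_text, na.strings = "Not Available")

# Drop the rows whose Rate is missing.
kept <- dt[!is.na(Rate)]
```

A likely side benefit is that, once the marker is read as NA, the column no longer has to be stored as character, so numeric comparisons on it become possible.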
