What can a data frame do that a tibble cannot?

Question

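Fans of the Tidyverse often cite several advantages of using tibbles rather than data frames. Most of them seem designed to protect the user from making mistakes. For example, unlike data frames, tibbles: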

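Don't need a , drop = FALSE argument to avoid dropping dimensions from your data.
Won't let the $ operator do partial matching on column names.
Only recycle input vectors that are exactly of length one.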
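
The three behaviours listed above can be checked with a small sketch. The data frame `df`, its column names, and the sample values are made up for illustration; only the tibble package is assumed to be installed.

```r
library(tibble)

# Hypothetical toy data, used only for this comparison
df <- data.frame(value = 1:3, label = c("a", "b", "c"))
tb <- as_tibble(df)

# 1. Single-column subsetting: the data frame drops to a vector,
#    the tibble stays a tibble.
is.data.frame(df[, "value"])   # FALSE (an integer vector)
is.data.frame(tb[, "value"])   # TRUE  (a 3 x 1 tibble)

# 2. Partial matching with $: the data frame matches "val" to "value",
#    the tibble returns NULL and warns about an unknown column.
df$val                         # 1 2 3
suppressWarnings(tb$val)       # NULL

# 3. Recycling: data.frame() recycles any length that divides evenly,
#    tibble() raises an error unless the shorter vector has length 1.
nrow(data.frame(x = 1:4, y = 1:2))                    # 4
res <- try(tibble(x = 1:4, y = 1:2), silent = TRUE)
inherits(res, "try-error")                            # TRUE
```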

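I'm steadily becoming convinced that I should replace all of my data frames with tibbles. What are the primary disadvantages of doing so? More specifically, what can a data frame do that a tibble cannot?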

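Preemptively, I would like to make it clear that I am not asking about data.table or any big-picture objections to the Tidyverse. I am strictly asking about tibbles and data frames.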

Answer 1

As described here: https://cran.r-project.org/web/packages/tibble/vignettes/tibble.html

There are three key differences between tibbles and data frames:

Printing
Subsetting
Recycling rules

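Tibbles:

Never change an input's type (i.e., no more stringsAsFactors = FALSE!)
Never adjust the names of variables
Evaluate their arguments lazily and sequentially
Never use row.names()
Only recycle vectors of length 1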
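
A minimal sketch of these construction rules, assuming only the tibble package. The column names (`my var`, `n`, `double_n`) are invented for the example.

```r
library(tibble)

# Columns are evaluated in order, so double_n can refer to n.
tb <- tibble(`my var` = c("a", "b"), n = 1:2, double_n = n * 2)

# Input types are kept: the character column stays character.
# (Base R >= 4.0.0 also defaults to stringsAsFactors = FALSE.)
class(tb$`my var`)                 # "character"

# Names are kept as written; data.frame() rewrites them by default.
names(tb)[1]                       # "my var"
names(data.frame(`my var` = 1))    # "my.var"

# Sequential evaluation of arguments
tb$double_n                        # 2 4

# Row names are not carried over when converting a data frame
has_rownames(as_tibble(mtcars))    # FALSE
```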

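When printed, a large data frame shows as many rows as it can until the print limit (the max.print option) is reached. At that point R cuts the output off partway through the data.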

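A tibble, by contrast, prints only the first ten rows and as many columns as fit on screen. The data type of each column and the dimensions of the dataset are shown as well.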
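
If the ten-row default is too short or too long, the tibble print method accepts `n` and `width` arguments. The following is a small sketch using the built-in mtcars data, not a recommendation for any particular setting.

```r
library(tibble)

tb <- as_tibble(mtcars)

# Default: first ten rows, only the columns that fit the console
print(tb)

# Show five rows only
print(tb, n = 5)

# Show twenty rows and every column, however wide the output gets
print(tb, n = 20, width = Inf)

# Rough base R counterpart for a data frame
head(mtcars, 10)
```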

Answer 2

In The Trouble with Tibbles, you can read:

there isn’t really any trouble with tibbles

However,

Some older packages don’t work with tibbles because of their alternative subsetting method. They expect tib[,1] to return a vector, when in fact it will now return another tibble.

This was pointed out by @Henrik in the comments.

For example, the length function does not return the same result:

library(tibble)
tibblecars <- as_tibble(mtcars)
tibblecars[,"cyl"]
#> # A tibble: 32 x 1
#>      cyl
#>    <dbl>
#>  1     6
#>  2     6
#>  3     4
#>  4     6
#>  5     8
#>  6     6
#>  7     8
#>  8     4
#>  9     4
#> 10     6
#> # ... with 22 more rows
length(tibblecars[,"cyl"])
#> [1] 1
mtcars[,"cyl"]
#>  [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
length(mtcars[,"cyl"])
#> [1] 32
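
When a plain vector is what you want from a tibble, there are a few ways to ask for one explicitly. This sketch reuses the `tibblecars` object from the example above:

```r
library(tibble)

tibblecars <- as_tibble(mtcars)

# Each of these returns the column as a vector, so length() is 32
length(tibblecars[["cyl"]])                # 32
length(tibblecars$cyl)                     # 32
length(tibblecars[, "cyl", drop = TRUE])   # 32
```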

Other examples:

base::reshape not working with tibbles

Invariants for subsetting and subassignment explains where the behavior of tibbles differs from that of data.frame.

Given these limitations, the solution Hadley offers in Interacting with legacy code is:

A handful of functions don’t work with tibbles because they expect df[, 1] to return a vector, not a data frame. If you encounter one of these functions, use as.data.frame() to turn a tibble back to a data frame:
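
A short sketch of that round trip, reusing the mtcars example. The name `legacy_df` is made up here. Note that `as_tibble()` drops row names by default, so they are not restored by the conversion back; passing `rownames = "model"` (an arbitrary column name) to `as_tibble()` keeps them as a regular column.

```r
library(tibble)

tibblecars <- as_tibble(mtcars)

# Convert back before handing the data to code that expects a data frame
legacy_df <- as.data.frame(tibblecars)

class(legacy_df)            # "data.frame"
is.numeric(legacy_df[, 1])  # TRUE: single-column subsetting drops to a vector
length(legacy_df[, "cyl"])  # 32
```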
