How to import a character column as a factor?
I have the following dataset in a .txt file:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,Action
0,0,0,2,0,0,0,2,0,0,0,0,0,0,0,0,Up
2,0,0,0,2,0,0,0,0,0,0,2,0,0,0,0,Left
4,0,0,2,0,0,0,0,0,2,0,0,0,0,0,0,Left
4,2,0,2,0,2,0,0,0,0,0,0,0,0,0,0,Up
4,4,0,0,2,0,0,0,0,0,0,0,0,0,0,2,Up
8,0,0,0,2,0,0,0,2,0,0,0,2,0,0,0,Left
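When I load my dataset into RStudio, I want the last column, Action, to be imported as a Factor. However, RStudio treats it as Character.
I can force it to be treated as a factor, but then it asks me to insert a comma-separated list of factor levels (shown in the screenshot). I don't understand what it wants: should I enter the Levels of the factor, or a list of values that will replace the entire Action column?
How can I import the dataset with the last column as a Factor?
If I run str(dt), it gives me: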
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 2979 obs. of 17 variables:
$ 1 : int 0 2 4 4 4 8 8 8 8 8 ...
$ 2 : int 0 0 0 2 4 0 0 0 2 4 ...
$ 3 : int 0 0 0 0 0 0 0 0 0 2 ...
$ 4 : int 2 0 2 2 0 0 0 2 2 0 ...
$ 5 : int 0 2 0 0 2 2 4 4 4 4 ...
$ 6 : int 0 0 0 2 0 0 0 0 0 2 ...
$ 7 : int 0 0 0 0 0 0 0 0 2 0 ...
$ 8 : int 2 0 0 0 0 0 0 0 0 0 ...
$ 9 : int 0 0 0 0 0 2 2 2 2 2 ...
$ 10 : int 0 0 2 0 0 0 0 2 0 0 ...
$ 11 : int 0 0 0 0 0 0 0 0 0 0 ...
$ 12 : int 0 2 0 0 0 0 0 0 0 0 ...
$ 13 : int 0 0 0 0 0 2 0 0 0 0 ...
$ 14 : int 0 0 0 0 0 0 0 0 0 0 ...
$ 15 : int 0 0 0 0 0 0 0 0 0 0 ...
$ 16 : int 0 0 0 0 2 0 2 0 0 0 ...
$ Action: chr "Up" "Left" "Left" "Up" ...
- attr(*, "spec")=List of 2
..$ cols :List of 17
.. ..$ 1 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 2 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 3 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 4 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 5 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 6 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 7 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 8 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 9 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 10 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 11 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 12 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 13 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 14 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 15 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ 16 : list()
.. .. ..- attr(*, "class")= chr "collector_integer" "collector"
.. ..$ Action: list()
.. .. ..- attr(*, "class")= chr "collector_character" "collector"
..$ default: list()
.. ..- attr(*, "class")= chr "collector_guess" "collector"
..- attr(*, "class")= chr "col_spec"
We can specify col_types for a specific column or for all columns:
library(readr)
read_csv(file, col_types = cols(Action = col_factor(levels = c("Up", "Left"))))
# A tibble: 6 x 17
# `1` `2` `3` `4` `5` `6` `7` `8` `9` `10` `11` `12` `13` `14` `15` `16` Action
# <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <fct>
#1 0 0 0 2 0 0 0 2 0 0 0 0 0 0 0 0 Up
#2 2 0 0 0 2 0 0 0 0 0 0 2 0 0 0 0 Left
#3 4 0 0 2 0 0 0 0 0 2 0 0 0 0 0 0 Left
#4 4 2 0 2 0 2 0 0 0 0 0 0 0 0 0 0 Up
#5 4 4 0 0 2 0 0 0 0 0 0 0 0 0 0 2 Up
#6 8 0 0 0 2 0 0 0 2 0 0 0 2 0 0 0 Left
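On the question of what to type into the dialog: the comma-separated list appears to correspond to the `levels` argument of `col_factor()`, i.e. the distinct labels (`Up`, `Left`), not the full column of values. If you would rather not list the levels by hand, the sketch below shows two alternatives, assuming a reasonably recent readr in which `col_factor(levels = NULL)` infers the levels from the data. The temporary file and its three-column contents are made up for illustration and stand in for the real .txt file.

```r
library(readr)

# Hypothetical small file standing in for the real dataset
tmp <- tempfile(fileext = ".txt")
writeLines(c("1,2,Action",
             "0,0,Up",
             "2,0,Left",
             "4,0,Left"), tmp)

# Option 1: levels = NULL lets readr discover the levels,
# in the order in which they first appear in the file
dt <- read_csv(tmp, col_types = cols(Action = col_factor(levels = NULL)))
levels(dt$Action)

# Option 2: import as character, then convert with explicit levels
dt2 <- read_csv(tmp)
dt2$Action <- factor(dt2$Action, levels = c("Up", "Left"))
str(dt2$Action)
```

Listing the levels explicitly is the stricter choice: with readr, values in the file that are not among the given levels are reported as parsing problems and become `NA`, which can be a useful check on the data.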
With base R, this should also work:
read.table("FileName.txt", sep = ",", header = TRUE, stringsAsFactors = TRUE)
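A minimal sketch of the base R route, again using a made-up temporary file in place of `FileName.txt`. Two details are worth knowing: since R 4.0.0 the default is `stringsAsFactors = FALSE`, so the argument has to be set explicitly, and `read.table()` by default rewrites column names that are not syntactically valid (here `1` becomes `X1`) unless `check.names = FALSE` is passed.

```r
# Hypothetical small file standing in for FileName.txt
tmp <- tempfile(fileext = ".txt")
writeLines(c("1,2,3,Action",
             "0,0,0,Up",
             "2,0,0,Left",
             "4,0,2,Left"), tmp)

# Character columns are converted to factors on import
dt <- read.table(tmp, sep = ",", header = TRUE, stringsAsFactors = TRUE)
str(dt)

# Base R sorts the levels, rather than keeping file order
levels(dt$Action)

# Keep the original numeric column names instead of X1, X2, ...
dt_raw <- read.table(tmp, sep = ",", header = TRUE,
                     stringsAsFactors = TRUE, check.names = FALSE)
names(dt_raw)
```

If a specific level order matters (for example `Up` before `Left`), it can be set afterwards with `factor(dt$Action, levels = c("Up", "Left"))`.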