How to produce a scatterplot of a .txt file in R
I am currently trying to produce a scatterplot of a .txt file, which is structured as shown below and has 25 rows:
age income weight
33 63 180
25 72 220
However, when I try to convert it to a csv and then produce a scatterplot with the following code:
my_input <- read.csv2('dataInput.txt', sep = '\t', header = T)
plot(x = my_input$ageX, y = my_input$weightY)
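I get an error message. I also noticed that there is now a period between 'age', 'income' and 'weight', which I don't understand, since I expected a comma between them. The error message is as follows: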
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
Any ideas on how to actually get a scatterplot of the data?
Edit: running
head(my_input)
age. income. weight
1 56 63 185
2 38 72 156
3 28 75 178
4 49 59 205
5 69 65 235
6 19 70 195
Edit:
str(my_input)
age.income.weight: Factor w/ 18 levels "56 63 185",..: 1 2 3 4 5 6 7 8 9 10 ...
summary(my_input)
age.income.weight
56 63 185: 1
38 72 156: 1
28 75 178: 1
49 59 205: 1
69 65 235: 1
19 70 195: 1
(Other) :19
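Based on the edits to your question, the problem occurs when loading the txt file. Looking at the structure of your text file, the spacing between rows and columns appears to be inconsistent.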
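To see why the load fails, here is a minimal sketch that reproduces the symptom. It assumes the file is separated by spaces rather than tabs; `sample_text` and `broken` are hypothetical names, and the two data rows are copied from the question.

```r
# Hypothetical two-row sample mirroring the layout in the question
# (assumed to be space-separated, not tab-separated).
sample_text <- "age income weight\n33 63 180\n25 72 220"

# Same call as in the question, reading from a string instead of a file.
broken <- read.csv2(text = sample_text, sep = "\t", header = TRUE)

names(broken)         # one column only: the header line was never split
ncol(broken)
is.null(broken$ageX)  # there is no column called ageX, so `$` returns NULL
```

Because no tab is found, each line is read as a single field, and the spaces in the header are replaced by periods to form a valid column name. Passing `NULL` for both `x` and `y` is what leads `plot()` to complain about non-finite 'xlim' values.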
So, one way to make it work is to build the data frame from scratch by reading the file with readLines:
my_input <- readLines("crime_input.txt")
my_input <- unlist(strsplit(my_input," "))
Now you can see that the file contains a lot of spaces, which show up as empty strings:
> my_input
[1] "age" "income" "crimes" "16" "" "" "" "" "63" "" "" ""
[13] "" "23" "18" "" "" "" "" "72" "" "" "" ""
[25] "25" "18" "" "" "" "" "75" "" "" "" "" "22"
[37] "19" "" "" "" "" "59" "" "" "" "" "16" "19"
[49] "" "" "" "" "65" "" "" "" "" "19" "19" ""
[61] "" "" "" "70" "" "" "" "" "19" "20" "" ""
[73] "" "" "78" "" "" "" "" "18" "21" "" "" ""
[85] "" "35" "" "" "" "" "11" "21" "" "" "" ""
[97] "53" "" "" "" "" "15" "23" "" "" "" "" "28"
[109] "" "" "" "" "" "9" "27" "" "" "" "" "56"
[121] "" "" "" "" "16" "28" "" "" "" "" "52" ""
[133] "" "" "" "14" "29" "" "" "" "" "63" "" ""
[145] "" "" "25" "30" "" "" "" "" "46" "" "" ""
[157] "" "17" "30" "" "" "" "" "55" "" "" "" ""
[169] "19" "31" "" "" "" "" "29" "" "" "" "" ""
[181] "8" "32" "" "" "" "" "55" "" "" "" "" "22"
[193] "32" "" "" "" "" "62" "" "" "" "" "25"
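The empty strings appear because the split is done on a single space. As a hedged alternative, splitting on a run of whitespace avoids them in the first place. The sketch below uses a hypothetical `raw_lines` vector in place of the output of `readLines`.

```r
# Hypothetical lines with uneven spacing, similar to the file above.
raw_lines <- c("age income crimes",
               "16     63     23",
               "18     72     25")

# "\\s+" matches one or more whitespace characters, so consecutive
# spaces are treated as a single separator. trimws() drops leading and
# trailing whitespace, which would otherwise yield an empty first token.
tokens <- unlist(strsplit(trimws(raw_lines), "\\s+"))
tokens
```

The header names are still present in `tokens`, so the numeric conversion step that follows is still needed.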
So we can convert everything to numeric (the header names and the empty strings become NA) and remove the NAs:
my_input <- as.numeric(my_input)
my_input <- my_input[!is.na(my_input)]
which gives:
> my_input
[1] 16 63 23 18 72 25 18 75 22 19 59 16 19 65 19 19 70 19 20 78 18 21 35 11 21 53 15 23 28 9 27 56 16 28 52 14
[37] 29 63 25 30 46 17 30 55 19 31 29 8 32 55 22 32 62 25
Next, we can fill a matrix with this vector:
my_input <- matrix(my_input, nrow = 3, ncol = length(my_input)/3)
> my_input
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
[1,] 16 18 18 19 19 19 20 21 21 23 27 28 29 30 30 31 32 32
[2,] 63 72 75 59 65 70 78 35 53 28 56 52 63 46 55 29 55 62
[3,] 23 25 22 16 19 19 18 11 15 9 16 14 25 17 19 8 22 25
Now we can transpose the matrix, convert it to a data.frame and add column names:
my_input <- as.data.frame(t(my_input))
colnames(my_input) <- c("age","income","crimes")
Finally, you get:
> head(my_input)
age income crimes
1 16 63 23
2 18 72 25
3 18 75 22
4 19 59 16
5 19 65 19
6 19 70 19
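The fill and transpose steps can also be combined. This is a sketch rather than the method used above: `byrow = TRUE` fills the matrix one row at a time, so each record lands on its own row. The names `values`, `m` and `df` are hypothetical, and the six numbers are the first two records from the output above.

```r
# Values in file order: age, income, crimes for each record in turn.
values <- c(16, 63, 23, 18, 72, 25)

# byrow = TRUE fills row by row, so no transpose is needed;
# dimnames sets the column names in the same call.
m <- matrix(values, ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("age", "income", "crimes")))
df <- as.data.frame(m)
df
```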
If you check the structure of my_input:
> str(my_input)
'data.frame': 18 obs. of 3 variables:
$ age : num 16 18 18 19 19 19 20 21 21 23 ...
$ income: num 63 72 75 59 65 70 78 35 53 28 ...
$ crimes: num 23 25 22 16 19 19 18 11 15 9 ...
So now you can plot it:
my_input <- my_input[order(my_input$age), ]
plot(x = my_input$age, y = my_input$crimes, type = "b")
Now you can work with this file. Hope this helps you solve the problem.
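As a shorter alternative to the manual steps above, `read.table` with its default `sep = ""` treats any run of whitespace as a separator, so uneven spacing is usually not a problem. The sketch below assumes the file contains nothing but whitespace-separated numbers under a header line; `sample_text` and `crime` are hypothetical names, and the rows are taken from the output above.

```r
# Hypothetical sample with uneven spacing; with a real file, pass the
# path as the first argument instead of using `text =`.
sample_text <- "age income crimes
16     63     23
18     72     25
18     75     22"

# With the default sep = "", read.table() splits on runs of whitespace.
crime <- read.table(text = sample_text, header = TRUE)
str(crime)

# The default type = "p" draws points only, i.e. a plain scatterplot.
plot(x = crime$age, y = crime$crimes, xlab = "age", ylab = "crimes")
```

The original plot call uses `type = "b"`, which joins the points with lines; leaving `type` at its default gives the scatterplot the question asks for.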