Scatterplot of several columns of a data table

Question

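In a data table with 20 rows and 6 columns, I want to draw a scatter plot of column 2 against column 4. Column 1 holds the row IDs from 0 to 19. According to the table's description, columns 1 and 2 are factors with 20 levels, and column 4 is num.

I tried converting the individual columns with as.factor into separate objects, then merging them and plotting with ggplot. This did not work for me.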

BasinSize <- as.factor(Table_Barrow20$`Lake Size`) #column2 of table
Basinheight <- as.factor(Table_Barrow20$`Lake Mean`) #column 4 of table
scatterdata <- merge(Basinheight, BasinSize)

plot(scatterdata)
ggplot(scatterdata, aes(x=Basinheight, y=BasinSize), col=c("33FF00")) + 
  geom_point(shape=18)

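The problem is that the two columns are merged the wrong way: each of the 20 values is combined with all 20 values of the other column, instead of being matched by ID.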
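To see why the merge goes wrong: `merge()` joins on the columns its two inputs have in common, and two bare vectors share none, so it returns every pairing (the Cartesian product). The sketch below reproduces that with a small made-up data frame (`toy` and its rounded values are illustrative assumptions, not the full lake data) and shows that taking both columns from the same data frame keeps the rows aligned.

```r
# Toy stand-in for Table_Barrow20; only four rows, values rounded for illustration.
toy <- data.frame(
  Name = 0:3,
  `Lake Size` = c(2419723, 737345, 1904419, 633220),
  Mean = c(6.85, 1.17, 6.30, 0.69),
  check.names = FALSE
)

# Merging two vectors: there is no shared key column, so every value of one
# vector is paired with every value of the other.
crossed <- merge(toy$Mean, toy$`Lake Size`)
nrow(crossed)   # 4 * 4 = 16 rows

# Selecting both columns from the same data frame keeps them row-aligned.
aligned <- toy[, c("Lake Size", "Mean")]
nrow(aligned)   # 4 rows

plot(aligned$`Lake Size`, aligned$Mean,
     xlab = "Lake Size", ylab = "Mean", pch = 18)
```

With the real 20-row table, the same merge would produce 400 rows, which matches the behaviour described in the question.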

Here is the table, copied from the .txt file:

"Name""Lake Size""Max""Mean""Med""Min" “1” “0” “2419723” 9.37238597869873 6.85431201700351 6.79038763046265 5.5892276763916 “2” “1” “737345” 2.20990252494812 1.17229168051113 1.16918420791626 0.532729208469391 “3” “2” “1904419” 6.97486448287964 6.29653060932372 6.29239559173584 5.74258995056152 “4” “3” “633220” 2.94963598251343 0.693283292837505 0.566755801439285 -1.04891955852509 “5” “4” “3417157” 2.02893280982971 1.04370415649172 1.16990214586258 -0.615132451057434 “6” “5” “3046643” 2.39258670806885 0.612889545533382 0.621234953403473 -2.27862739562988 “7” “6” “3868608” 16.8747043609619 15.986930145805 15.9581031799316 14.7309837341309 “8” “7” “11952064” 4.12359857559204 3.50676135545307 3.50302672386169 2.70154309272766 “9” “8” “2431961” 6.02156400680542 4.79594737052494 4.82516670227051 3.39997673034668 “10” “9” “5624563” 7.80270195007324 6.76836155530465 6.72958827018738 5.68962478637695 “11” “10” “2430490” 4.87959337234497 3.43340588038286 3.3837513923645 2.91182518005371 “12” “12” “1436097” 3.67803716659546 2.49129957226396 2.47576546669006 1.17649579048157 “13” “13” “791941” 5.25690269470215 4.07207433426663 4.07166481018066 3.61373019218445 “14” “14” “3013737” 1.69542956352234 0.756966933677959 0.755697637796402 -2.0527184009552 “15” “15” “2594511” 5.87903642654419 2.43693244171563 2.44506788253784 0.725884079933167 “16” “16” “3105136” 12.6303310394287 9.71669491262446 9.67505931854248 8.92571830749512 “17” “17” “1985544” 9.32382488250732 8.25899538392204 8.30398368835449 6.08988952636719 “18” “18” “1800122” 12.424147605896 8.48729049871582 8.50036954879761 7.7384238243103 “19” “19” “2753803” 16.724292755127 15.7803085039918 15.7673816680908 14.8390283584595 “20” “11” “765907” 3.45813465118408 2.61115002320832 2.59490370750427 2.17101335525513

Answer 1

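Because both fields come from the same data table, there is probably no need to merge them separately; you can refer to them directly in the ggplot call, i.e.: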

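library(ggplot2)

Table_Barrow20 <- data.frame(LakeSize = rnorm(50, 2),
                             LakeMean = rnorm(50, 3))

# a fixed colour belongs in the geom, as a hex string with a leading "#"
ggplot(Table_Barrow20, aes(x = LakeMean, y = LakeSize)) +
  geom_point(shape = 18, colour = "#33FF00")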

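You may also want to cast the values with as.numeric(), since a scatter plot is not the best way to display factor-type data, and the data you provided looks continuous.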
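One caveat when casting: calling `as.numeric()` directly on a factor returns its internal level codes rather than the values shown as labels. A minimal sketch, using a made-up factor `f` (not the real column), of going through `as.character()` first:

```r
# Hypothetical factor whose labels are numbers, like a mis-read column.
f <- factor(c("2419723", "737345", "1904419"))

codes  <- as.numeric(f)                # integer level codes, not the lake sizes
values <- as.numeric(as.character(f))  # the labels converted to numbers

codes
values   # 2419723 737345 1904419
```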

Answer 2

I don't think you should use factors, because for a scatter plot with geom_point you want numeric values for x and y.

This should work:

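# as.character() first, so a factor yields its labels rather than its level codes
ggplot(Table_Barrow20, aes(x = as.numeric(as.character(`Lake Size`)),
                           y = as.numeric(as.character(`Lake Mean`)))) +
  geom_point(shape = 18, colour = "#33FF00")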

Update: if you have several sets of points to plot, you can put your data frame into tidy format:

library(tidyverse)
Table_Barrow20  <- data.frame(Buffer.Mean=c(1,2,4),Buffer.Size=c(10,30,20),Lake.Mean=c(3,1,4),Lake.Size=c(15,25,12))
# select columns about buffer and add type variable
df1 <- Table_Barrow20 %>% 
          select(Buffer.Mean,Buffer.Size) %>% 
          rename(Mean=Buffer.Mean,Size=Buffer.Size) %>% 
          mutate(type="Buffer") %>% 
          mutate(Size=as.numeric(Size), Mean=as.numeric(Mean))
# select columns about lake and add type variable
df2 <- Table_Barrow20 %>% 
          select(Lake.Mean,Lake.Size) %>% 
          rename(Mean=Lake.Mean,Size=Lake.Size) %>% 
          mutate(type="Lake") %>% 
          mutate(Size=as.numeric(Size), Mean=as.numeric(Mean))
# bind the two dataframe to make one with all lines
df_tot <- rbind(df1,df2) 
# add "type" column as color so the colors will be different for lake and buffer
ggplot(df_tot, aes(x=Size, y=Mean, col=type)) + 
  geom_point(shape=18)
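
The same long format can probably be reached in one step with `tidyr::pivot_longer()`, assuming tidyr 1.0 or later and column names of the form `<type>.<measure>` as in the example data above. This is an alternative sketch rather than part of the original answer; `df_long` is a name chosen for illustration.

```r
library(tidyr)
library(ggplot2)

Table_Barrow20 <- data.frame(Buffer.Mean = c(1, 2, 4), Buffer.Size = c(10, 30, 20),
                             Lake.Mean = c(3, 1, 4), Lake.Size = c(15, 25, 12))

# Split each column name at the dot: the part before it goes into the "type"
# column, the part after it (".value") names the output columns Mean and Size.
df_long <- pivot_longer(Table_Barrow20,
                        cols = everything(),
                        names_to = c("type", ".value"),
                        names_sep = "\\.")

# Colour the points by type so lake and buffer are distinguishable.
ggplot(df_long, aes(x = Size, y = Mean, colour = type)) +
  geom_point(shape = 18)
```

This avoids the two intermediate data frames, at the cost of relying on a consistent naming pattern in the columns.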
