R: How to generate an xy plot with multiple discrete y values against x

I currently have a data frame like this.

df <- data.frame(networkNO          = seq_along(dg),
                 AverageDegree      = average_degree,
                 AverageBetweenness = average_betweenness,
                 AverageCloseness   = average_closeness,
                 ClusterCoefficient = cluster_coefficient)

So each column holds one measure for a series of networks.

       networkNO AverageDegree AverageBetweenness AverageCloseness ClusterCoefficient
1          1     10.804124         300.453608     0.0012898154          0.4388075
2          2     10.785714          31.660714     0.0085438562          0.4646219
3          3     10.909091          52.688312     0.0055827873          0.4440915
4          4     10.000000          19.435897     0.0131519596          0.5078864
5          5     11.372014        1348.049488     0.0003100285          0.4193862
6          6      8.736842          66.210526     0.0054046865          0.5077356
7          7      1.000000           0.000000     1.0000000000                NaN
8          8      7.755102          49.346939     0.0070593456          0.5193906
9          9      9.000000           6.363636     0.0298526499          0.5279429
10        10      7.538462           2.230769     0.0611896445          0.6666667
11        11      7.297297          34.027027     0.0099660321          0.5391566
12        12      1.000000           0.000000     1.0000000000                NaN
13        13      6.666667          20.111111     0.0156903046          0.5445378
14        14      3.000000           0.000000     0.3333333333          1.0000000
15        15      9.658537          21.341463     0.0122712462          0.4870849
16        16      7.100000           8.050000     0.0290803614          0.5692964

I want to produce a 2D plot with the different measures on the x-axis and the measured values on the y-axis.

How can I do this? How can I produce a boxplot?

This should be what you want:

boxplot(df[-1])  ## exclude column `networkNO`

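However, you will have to rescale your data first. Right now the columns are on very different scales (AverageBetweenness reaches above 1000 while AverageCloseness never exceeds 1), so drawing them directly on one boxplot is a bad idea: the column with the largest range dominates the shared y-axis and the other boxes are squashed flat.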

The following code rescales your columns and stores the result in a new data frame, df1:

## standardise each measure column; `networkNO` is left untouched
df1 <- within(df, {AverageDegree = scale(AverageDegree);
                   AverageBetweenness = scale(AverageBetweenness);
                   AverageCloseness = scale(AverageCloseness);
                   ClusterCoefficient = scale(ClusterCoefficient);})

boxplot(df1[-1])  ## exclude column `networkNO`

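The default method of scale (see ?scale) first centres the data by subtracting the mean, then divides by the standard deviation. Think twice about whether this is what you want: after rescaling, the values on the y-axis no longer carry their original units but measure how many standard deviations each observation lies from its column mean.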
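
To make the centring and dividing step concrete, here is a minimal sketch that checks scale() against the manual formula. It uses only the first five AverageDegree values from the table in the question; the names x, z_scale and z_manual are made up for the illustration.

```r
# First five AverageDegree values from the table in the question
x <- c(10.804124, 10.785714, 10.909091, 10.000000, 11.372014)

# scale() returns a one-column matrix; as.numeric() turns it into a plain vector
z_scale <- as.numeric(scale(x))

# The same thing by hand: subtract the mean, divide by the standard deviation
z_manual <- (x - mean(x)) / sd(x)

# Compare the two results up to floating point tolerance
all.equal(z_scale, z_manual)
```

After this transformation every column has mean 0 and standard deviation 1, which is why the boxes become comparable on one axis.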

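If you don't want to do any kind of scaling, consider drawing a separate boxplot for each column and arranging them in a 2 x 2 grid on the same figure. Here is how: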

par(mfrow = c(2,2))
boxplot(df$AverageDegree, xlab = "AverageDegree")
boxplot(df$AverageBetweenness, xlab = "AverageBetweenness")
boxplot(df$AverageCloseness, xlab = "AverageCloseness")
boxplot(df$ClusterCoefficient, xlab = "ClusterCoefficient")
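
The four calls above differ only in the column name, so they could also be written as a loop. The following is a sketch, not the only way to do it: the small df here is a stand-in built from the first five rows of the table in the question, and the names measures and op are chosen for the example.

```r
# Stand-in for df: the first five rows of the table in the question
df <- data.frame(
  networkNO          = 1:5,
  AverageDegree      = c(10.804124, 10.785714, 10.909091, 10.000000, 11.372014),
  AverageBetweenness = c(300.453608, 31.660714, 52.688312, 19.435897, 1348.049488),
  AverageCloseness   = c(0.0012898154, 0.0085438562, 0.0055827873,
                         0.0131519596, 0.0003100285),
  ClusterCoefficient = c(0.4388075, 0.4646219, 0.4440915, 0.5078864, 0.4193862)
)

# Every column except the network index is a measure to plot
measures <- setdiff(names(df), "networkNO")

op <- par(mfrow = c(2, 2))        # 2 x 2 grid; keep the old setting in op
for (m in measures) {
  boxplot(df[[m]], xlab = m)      # one boxplot per measure, each on its own y-axis
}
par(op)                           # restore the previous layout
```

Saving and restoring par() keeps the 2 x 2 layout from leaking into later plots in the same session.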

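Note that boxplot() has an argument called outline. With outline = FALSE, outliers (points lying beyond the whiskers) are not drawn; the data itself is not changed. You can compare: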

par(mfrow = c(2,2))
boxplot(df$AverageDegree, xlab = "AverageDegree", outline = FALSE)
boxplot(df$AverageBetweenness, xlab = "AverageBetweenness", outline = FALSE)
boxplot(df$AverageCloseness, xlab = "AverageCloseness", outline = FALSE)
boxplot(df$ClusterCoefficient, xlab = "ClusterCoefficient", outline = FALSE)
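
If you want to see which points outline = FALSE hides, the list returned by boxplot() can be inspected. This is a small sketch using the AverageBetweenness column copied from the table in the question; x and b are names made up for the example.

```r
# AverageBetweenness values copied from the table in the question
x <- c(300.453608, 31.660714, 52.688312, 19.435897, 1348.049488, 66.210526,
       0, 49.346939, 6.363636, 2.230769, 34.027027, 0, 20.111111, 0,
       21.341463, 8.05)

# Draw the box without outlier points; the summary is returned invisibly
b <- boxplot(x, outline = FALSE, xlab = "AverageBetweenness")

b$out       # the values classified as outliers, even though they were not drawn
length(x)   # the data vector itself is untouched
```

So outline only affects what is displayed; the box and whiskers are computed from the full data either way.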

Follow-up

How can I draw a dot plot instead of a boxplot?

We can simply use plot():

par(mfrow = c(2,2))
plot(rep(1, nrow(df)), df$AverageDegree, xlab = "AverageDegree", xaxt = "n")
plot(rep(1, nrow(df)), df$AverageBetweenness, xlab = "AverageBetweenness", xaxt = "n")
plot(rep(1, nrow(df)), df$AverageCloseness, xlab = "AverageCloseness", xaxt = "n")
plot(rep(1, nrow(df)), df$ClusterCoefficient, xlab = "ClusterCoefficient", xaxt = "n")
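
As an alternative to plot() with a constant x, base R also provides stripchart(), which draws one-dimensional dot plots directly. This is a sketch of a swapped-in technique rather than what the code above does; avg_degree holds the first eight AverageDegree values from the table in the question.

```r
# First eight AverageDegree values from the table in the question
avg_degree <- c(10.804124, 10.785714, 10.909091, 10.000000,
                11.372014, 8.736842, 1.000000, 7.755102)

# vertical = TRUE puts the values on the y-axis, as in the plots above;
# method = "jitter" spreads points sideways so near-equal values stay visible
stripchart(avg_degree, vertical = TRUE, method = "jitter",
           pch = 1, xlab = "AverageDegree")
```

The jitter is random, so the horizontal positions change from run to run; the y values are exact.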

Perhaps you are also interested in histograms?

par(mfrow = c(2,2))
hist(df$AverageDegree, main = "AverageDegree", xlab = "")
hist(df$AverageBetweenness, main = "AverageBetweenness", xlab = "")
hist(df$AverageCloseness, main = "AverageCloseness", xlab = "")
hist(df$ClusterCoefficient, main = "ClusterCoefficient", xlab = "")