ggplot syntax for data distribution
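I am trying to plot the data distributions of the beforeMinWageLaw and afterMinWageLaw variables, but when I store the data in df instead of seattleData, R shows "Error: Aesthetics must be either length 1 or the same as the data (43): x". How can I fix this? Also, how do I draw a normal probability plot to check the normality of the data? Thanks.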
#Import Data
#seattleData <- read.table(file=file.choose(),
# header=T, sep=",",)
library(ggplot2)
#Define Variables
food_drink_workers <- seattleData$food_drink_workers
MinWage <- seattleData$washington_state_minwage
afterMinWageLaw <- food_drink_workers[304:346]
beforeMinWageLaw <- food_drink_workers[1:303]
df <- data.frame(seattleData)
#Display Data Distribution with ggplot
x <-ggplot(df, aes(x=food_drink_workers)) +
geom_histogram(mapping = aes(y = ..density..), color="black", fill="white") +
geom_density(alpha=.2, fill="blue")
x + geom_vline(xintercept = c(108.8636), linetype = "dashed", color = "red") +
ggtitle("Distribution of the Data") + xlab("Seattle MSA Food and Drink Workers") + ylab("Density")
#Conduct Two Sample t-test
options(scipen = 100)
tTest <- t.test(beforeMinWageLaw, afterMinWageLaw, mu=0, alternative = "less",
conf=.95, var.equal = F, paired = F)
You can download the data here: https://fred.stlouisfed.org/series/SMU53426607072200001SA
You get the error message "Error: Aesthetics must be either length 1 or the same as the data (43): x" because the vector afterMinWageLaw has 43 values while beforeMinWageLaw has 303 values, which is why you can't reference both of them in the same aes(), I guess.
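To make the length mismatch concrete, here is a minimal sketch that reproduces this kind of error. It assumes ggplot2 is installed and uses synthetic random numbers in place of the real seattleData column, so the values themselves mean nothing; only the lengths (303 and 43) follow the question.

```r
library(ggplot2)

# Synthetic stand-in for seattleData$food_drink_workers (346 made-up values).
set.seed(1)
food_drink_workers <- rnorm(346, mean = 108, sd = 15)
beforeMinWageLaw <- food_drink_workers[1:303]
afterMinWageLaw  <- food_drink_workers[304:346]

# A 43-row data frame combined with a 303-element vector inside aes().
df_after <- data.frame(food_drink_workers = afterMinWageLaw)
p <- ggplot(df_after, aes(x = beforeMinWageLaw)) +
  geom_histogram(bins = 30)

# The length check runs when the plot is built, not when it is defined,
# so the error is caught here instead of stopping the script.
result <- tryCatch(ggplot_build(p), error = function(e) e)
is_length_error <- inherits(result, "error")
if (is_length_error) message(conditionMessage(result))
```

The exact wording of the message differs between ggplot2 versions, but it should point at the same problem: an aesthetic whose length matches neither 1 nor the number of rows in the layer's data.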
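I would use separate layers within one plot, so that each layer can have its own aesthetics with a different data length or number of rows. First, I would split your data into two data frames, one before the law and one after. With ggplot you can reference different data frames within a single plot; in your case it would look like this: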
#set row index ranges for before and after law
row_index_range_before <- 1:303;
row_index_range_after <- 304:346;
#define two data frames
df_before <- data.frame(seattleData)[row_index_range_before, ];
df_after <- data.frame(seattleData)[row_index_range_after, ];
#display data distributions of both data frames with ggplot
x <- ggplot() +
geom_histogram(
data = df_before
,mapping = aes(
x = food_drink_workers
,y = ..density..
,color = "blue")
,fill = "white") +
geom_histogram(
data = df_after
,mapping = aes(
x = food_drink_workers
,y = ..density..
,color = "red")
,fill = "white") +
geom_density(
data = df_before
,mapping = aes(
x = food_drink_workers
,y = ..density..
,fill = "blue")
,alpha = .2) +
geom_density(
data = df_after
,mapping = aes(
x = food_drink_workers
,y = ..density..
,fill = "red")
,alpha = .2) +
scale_colour_manual(
name = "Color"
,values = c("blue" = "blue", "red" = "red")
,labels = c("blue" = "Before Law", "red" = "After Law")) +
scale_fill_manual(
name = "Fill"
,values = c("blue" = "blue", "red" = "red")
,labels = c("blue" = "Before Law","red" = "After Law"));
x + geom_vline(
xintercept = c(108.8636)
,linetype = "dashed"
,color = "red") +
ggtitle("Distribution of the Data") +
xlab("Seattle MSA Food and Drink Workers") +
ylab("Density");
With this layered approach you could also reference afterMinWageLaw and beforeMinWageLaw directly as x in aes() and drop the data argument that references the data frames, I think.
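A minimal sketch of that variant, passing the two vectors straight into aes() with no data frame at all. It again uses synthetic values instead of the real series, and it writes after_stat(density), which is the current spelling of ..density.. (available from ggplot2 3.3.0); the bins = 30 setting is an arbitrary choice for illustration.

```r
library(ggplot2)

# Synthetic stand-in for the real series (346 made-up values).
set.seed(1)
food_drink_workers <- rnorm(346, mean = 108, sd = 15)
beforeMinWageLaw <- food_drink_workers[1:303]
afterMinWageLaw  <- food_drink_workers[304:346]

# ggplot() has no default data, so each layer takes its length
# from the vector given in its own aes().
p <- ggplot() +
  geom_histogram(
    aes(x = beforeMinWageLaw, y = after_stat(density), color = "blue"),
    fill = "white", bins = 30) +
  geom_histogram(
    aes(x = afterMinWageLaw, y = after_stat(density), color = "red"),
    fill = "white", bins = 30) +
  scale_colour_manual(
    name = "Color",
    values = c("blue" = "blue", "red" = "red"),
    labels = c("blue" = "Before Law", "red" = "After Law"))

# layer_data() builds the plot and returns the computed bins per layer.
bins_before <- layer_data(p, 1)
bins_after  <- layer_data(p, 2)
```

Because each layer is evaluated on its own, the 303-element and 43-element vectors never have to agree in length.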
To also draw a legend, you need to set color or fill inside aes() and add scale_colour_manual() or scale_fill_manual() to your plot.
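As an alternative sketch (not part of the answer above), the two periods can be kept in one long data frame with a grouping column, so that mapping fill to that column creates the legend without string placeholders. The same data frame can feed a normal Q-Q plot via geom_qq() and geom_qq_line(), which is one way to approach the normal probability plot the question asks about. The column name period, the data frame name df_long, and the synthetic values are all assumptions made for illustration.

```r
library(ggplot2)

# Synthetic stand-in for seattleData$food_drink_workers (346 made-up values).
set.seed(1)
food_drink_workers <- rnorm(346, mean = 108, sd = 15)

# One long data frame; `period` (hypothetical name) marks rows 1:303 and 304:346.
df_long <- data.frame(
  food_drink_workers = food_drink_workers,
  period = factor(rep(c("Before Law", "After Law"), times = c(303, 43)),
                  levels = c("Before Law", "After Law"))
)

# Mapping fill to `period` inside aes() yields one legend entry per period.
p_dist <- ggplot(df_long, aes(x = food_drink_workers, fill = period)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 alpha = 0.4, position = "identity", color = "black") +
  geom_density(alpha = 0.2) +
  scale_fill_manual(name = "Period",
                    values = c("Before Law" = "blue", "After Law" = "red")) +
  labs(title = "Distribution of the Data",
       x = "Seattle MSA Food and Drink Workers", y = "Density")

# Normal Q-Q plot for each period; points close to the line suggest normality.
p_qq <- ggplot(df_long, aes(sample = food_drink_workers)) +
  geom_qq() +
  geom_qq_line() +
  facet_wrap(~ period, scales = "free") +
  labs(title = "Normal Q-Q Plot",
       x = "Theoretical Quantiles", y = "Sample Quantiles")

# Number of points the Q-Q layer will draw (one per observation).
qq_points <- layer_data(p_qq, 1)
```

Typing p_dist or p_qq at the console (or calling print() on them in a script) draws the plots. With the real data, df_long would be built from seattleData instead of rnorm().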