Group values into bins and then plot using plotly (R, dplyr)
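I have a dataset df in which X is the frequency and Category is the 'bin' that the values fall into, so X tells us how many times each Category occurred. (This is a small sample of the actual dataset.)
Fig. 1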
Category X
100 5
101 10
110 20
120 5
125 2
150 1
I created the output above from another dataset, which looks like Fig. 3, using the code in Fig. 2.
Fig. 2
df1 <- aggregate(df$gr, by=list(Category=data$Duration), FUN=length)
Fig. 3
gr Duration
Outdata1 100
Outdata2 101
Outdata3 110
Outdata4 120
Outdata5 125
Outdata6 150
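Here is my plotting example:
library(plotly)

p <- plot_ly(data = df,
             x = ~Category,
             y = ~x,
             name = "name",
             type = "bar",
             orientation = "v") %>%
  layout(
    title = "title",
    barmode = "group",  # barmode is a layout-level attribute, not part of yaxis
    xaxis = list(title = "Time in Seconds",
                 categoryorder = "category ascending",  # "ascending" alone is not a valid value
                 tickangle = -45),
    yaxis = list(title = "example")
  )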
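However, rather than showing each category as an individual value, I would like to group them into 'bins', as in a histogram, so that the categories appear as bins in increments of 10. The categories would then look like this:
100 110 120 130 140 150 instead of 100 101 110 120 125 150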
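To make the intended grouping concrete, here is a minimal base R sketch using only the sample values from Fig. 1. The names `category`, `bin_width`, `bin_start`, and `binned` are illustrative assumptions, and the choice to label each bin by its lower edge is one reading of "increments of 10", not something the post states.

```r
# Sample values from Fig. 1
category <- c(100, 101, 110, 120, 125, 150)
x        <- c(5, 10, 20, 5, 2, 1)

bin_width <- 10  # assumed bin size, taken from the "increments of 10" wording

# Lower edge of the bin each category falls into, e.g. 101 -> 100 and 125 -> 120
bin_start <- floor(category / bin_width) * bin_width

# Sum the frequencies that land in the same bin
binned <- aggregate(x, by = list(bin_start = bin_start), FUN = sum)
names(binned)[2] <- "Counts"
print(binned)
```

With these sample values, 100 and 101 share the 100 bin and 120 and 125 share the 120 bin. Bins with no observations (130 and 140) do not appear as rows.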
This is the input for Fig. 1:
structure(list(Category = structure(c(0, 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34, 36, 38, 39, 40, 42, 43, 44, 47,
48, 49, 50, 51, 52, 53, 55, 56, 57, 58, 60, 63, 65, 66, 67, 68,
69, 70, 71, 72, 74, 77, 79, 80, 82, 84, 87, 89, 90, 91, 96, 97,
98, 103, 110, 114, 116, 124, 125, 126, 133, 134, 143, 149, 152,
154, 155, 157, 158, 161, 163, 164, 173, 177, 179, 183, 184, 185,
189, 190, 193, 196, 198, 201, 207, 211, 214, 217, 227, 229, 234,
235, 248, 265, 270, 285, 293, 307), class = "difftime", units = "secs"),
x = c(1L, 1L, 1L, 5L, 4L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 7L,
2L, 2L, 4L, 3L, 3L, 3L, 1L, 1L, 3L, 1L, 4L, 3L, 2L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 3L, 1L, 3L, 1L, 1L, 4L, 2L, 2L, 4L,
1L, 2L, 2L, 1L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 3L, 1L,
1L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 3L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L
)), row.names = c(NA, -118L), class = "data.frame")
Going by what you have done, I have not filtered the data in any way. Using the tidyverse packages, I would do it like this:
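library(dplyr)
library(ggplot2)

# dfs is the data frame from the question's dput() output
dfs %>%
  mutate(newvar = as.numeric(Category, units = "secs"),  # difftime -> plain seconds
         # ceiling() rather than round() so the top break always covers max(newvar)
         new_cat = cut(newvar, breaks = seq(0, ceiling(max(newvar) / 10) * 10, by = 10),
                       include.lowest = TRUE)) %>%
  group_by(new_cat) %>%
  summarise(Counts = sum(x)) %>%
  ungroup() %>%
  ggplot(aes(x = new_cat, y = Counts)) +
  geom_bar(stat = "identity")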
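Since the question asks for plotly rather than ggplot2, the same binned counts can also be passed to `plot_ly`. The following is a rough sketch under stated assumptions, not a definitive implementation: it uses a small stand-in data frame built from the Fig. 1 sample, the names `df`, `bin_width`, `secs`, and `binned` are illustrative, and it uses left-closed bins (`right = FALSE`) so that a value such as 110 starts a new bin instead of closing the previous one.

```r
library(dplyr)
library(plotly)

# Stand-in for the question's data (assumption): durations in seconds plus counts
df <- data.frame(
  Category = as.difftime(c(100, 101, 110, 120, 125, 150), units = "secs"),
  x = c(5L, 10L, 20L, 5L, 2L, 1L)
)

bin_width <- 10  # assumed bin size

binned <- df %>%
  mutate(
    secs = as.numeric(Category, units = "secs"),  # difftime -> plain seconds
    # Breaks run from the bin containing the minimum to one bin past the maximum,
    # so every value falls inside a left-closed interval [a, b)
    new_cat = cut(
      secs,
      breaks = seq(floor(min(secs) / bin_width) * bin_width,
                   (floor(max(secs) / bin_width) + 1) * bin_width,
                   by = bin_width),
      right = FALSE
    )
  ) %>%
  group_by(new_cat) %>%
  summarise(Counts = sum(x), .groups = "drop")

# Bar chart of the summed counts per bin
p <- plot_ly(binned, x = ~new_cat, y = ~Counts, type = "bar") %>%
  layout(
    xaxis = list(title = "Time in Seconds", tickangle = -45),
    yaxis = list(title = "Count")
  )
p
```

Here `new_cat` is a factor, so the x axis is categorical and shows interval labels. If a continuous axis with visible gaps for empty bins is preferred, one option is to plot the numeric lower edge of each bin instead of the factor.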