Why are my x-axis ticks not sorted properly in a Plotly graph?
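I am trying to plot the sales trend over a few weeks, but the ticks on the x-axis are not in the correct order, which makes my graph look strange.
Graph:
As you can see, the circled week should appear at the start of the axis. Why does this happen even though I sorted the dates in the DataFrame?
Pandas code: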
basic_df = filterDataFrameByDate(df,start_date,end_date)
df = basic_df.groupby(['S2PName',basic_df['S2BillDate'].dt.to_period(flag)], sort=False)['S2PGTotal'].agg([('totSale','sum'),('count','size')]).reset_index()
df.sort_values('S2BillDate',inplace=True)
df['S2BillDate'] = df['S2BillDate'].astype('str')
Another thing I noticed is that when I deselect 'Samosa' in the legend, the ticks are arranged correctly.
Screenshot:
Can someone help me fix this?
Pandas code and data:
Code:
print(df['S2BillDate'].unique())
Output:
<PeriodArray>
['2020-02-03/2020-02-09', '2020-02-10/2020-02-16', '2020-02-17/2020-02-23',
'2020-02-24/2020-03-01']
Length: 4, dtype: period[W-SUN]
Code:
df = basic_df.groupby(['S2PName',basic_df['S2BillDate'].dt.to_period(flag)], sort=False)['S2PGTotal'].agg([('totSale','sum'),('count','size')]).reset_index()
Output:
[537 rows x 4 columns]
S2PName S2BillDate totSale count
0 SAMOSA 2020-02-10/2020-02-16 4057.89 228
1 COFFEE 2020-02-10/2020-02-16 10567.21 582
2 TEA 2020-02-10/2020-02-16 6808.92 445
3 POORI 2020-02-10/2020-02-16 7556.77 179
4 PONGAL 2020-02-10/2020-02-16 4758.97 122
.. ... ... ... ...
411 PEPPER CHICKEN 2020-02-24/2020-03-01 90.00 1
412 SEZWAN CHICKEN FRIED NOODLES 2020-02-24/2020-03-01 199.50 2
413 SEZWAN VEG FRIED RICE 2020-02-24/2020-03-01 69.83 1
414 SEZWAN EGG FRIED RICE 2020-02-24/2020-03-01 89.78 1
415 EGG MASALA 2020-02-24/2020-03-01 50.04 1
I can reproduce your problem. I am using `plotly.express`, but it works the same way with `plotly.graph_objs`.
Data
import pandas as pd
import plotly.express as px

df = pd.DataFrame({
    "SPName": ["SAMOSA"] * 3 + ["COFFEE"] * 4,
    "S2BillDate": ["2020-02-10/2020-02-16",
                   "2020-02-17/2020-02-23",
                   "2020-02-24/2020-03-01",
                   "2020-02-24/2020-03-01",
                   "2020-02-17/2020-02-23",
                   "2020-02-10/2020-02-16",
                   "2020-02-03/2020-02-09"],
    "totSale": [4000, 4500, 5000, 10_000, 12_000, 10_000, 2000],
})
This produces
fig = px.line(df, x="S2BillDate", y="totSale", color="SPName")
fig.update_traces(mode='markers+lines')
fig.show()
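The problem here is how the dates are ordered. Because the x values are strings, Plotly treats the axis as categorical and places categories in the order in which they first appear in the data. If you look, the first point of `COFFEE` is `2020-02-24/2020-03-01`, the second is `2020-02-17/2020-02-23`, and so on.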
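To see why one week ends up out of place, here is a minimal sketch of how a categorical axis is laid out under Plotly's default `categoryorder="trace"` setting, where categories are placed in the order they are first encountered in the trace data. The helper `first_appearance_order` is a hypothetical illustration written for this answer, not part of Plotly.

```python
# Hypothetical helper that mimics the default ordering of a categorical
# axis: categories are listed in the order they are first seen while
# walking through the traces one after another.
def first_appearance_order(traces):
    seen = []
    for x_values in traces:
        for x in x_values:
            if x not in seen:
                seen.append(x)
    return seen


# x values of each trace, in the row order of the sample DataFrame
samosa = ["2020-02-10/2020-02-16", "2020-02-17/2020-02-23", "2020-02-24/2020-03-01"]
coffee = ["2020-02-24/2020-03-01", "2020-02-17/2020-02-23",
          "2020-02-10/2020-02-16", "2020-02-03/2020-02-09"]

axis_order = first_appearance_order([samosa, coffee])
print(axis_order)
```

The week `2020-02-03/2020-02-09` only occurs in the second trace, so it is appended after the three weeks already placed by `SAMOSA`. This would be consistent with the tick order changing when a trace is hidden in the legend.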
A quick fix is to sort the DataFrame before plotting:
df1 = df.sort_values("S2BillDate").reset_index(drop=True)
fig = px.line(df1, x="S2BillDate", y="totSale", color="SPName")
fig.update_traces(mode='markers+lines')
I personally prefer to use dates rather than strings on the x-axis:
# Use the end date of each week as a real datetime value.
# pd.to_datetime is used because a unit-less astype("M8") is rejected by recent pandas versions.
df["Date"] = pd.to_datetime(df["S2BillDate"].str.split("/").str[1])
fig = px.line(df, x="Date", y="totSale", color="SPName")
fig.update_traces(mode='markers+lines')
However, to show the tick text in the format you asked for, you still need to sort `df`, and a bit more code is required:
df = df.sort_values(["Date"]).reset_index(drop=True)
fig = px.line(df, x="Date", y="totSale", color="SPName")
fig.update_traces(mode='markers+lines')
fig.update_layout(
    xaxis=dict(
        type="category",
        tickmode='array',
        tickvals=df["Date"].tolist(),
        ticktext=df["S2BillDate"].tolist(),
    )
)
fig.show()
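If the week column is still a pandas `Period` column, as it is in the question before `astype('str')`, the dates can be derived without splitting strings. The following is a minimal sketch on made-up data: the `bills` frame and its values are hypothetical, and `"W"` is assumed to be the value of the question's `flag`. It keeps the label for tick text and adds a real timestamp column for sorting and plotting.

```python
import pandas as pd

# Hypothetical bill dates standing in for the question's S2BillDate column
bills = pd.DataFrame({
    "S2BillDate": pd.to_datetime(["2020-02-05", "2020-02-12", "2020-02-26"]),
    "S2PGTotal": [100.0, 250.0, 75.0],
})

# Same conversion as in the question; "W" is assumed to be the value of `flag`
week = bills["S2BillDate"].dt.to_period("W")

# Text label for the ticks, e.g. "2020-02-03/2020-02-09"
bills["WeekLabel"] = week.astype(str)
# Real timestamps (start of each week), safe to sort and to plot on a date axis
bills["WeekStart"] = week.dt.start_time

print(bills.sort_values("WeekStart")[["WeekLabel", "WeekStart", "S2PGTotal"]])
```

`WeekStart` can then be passed as `x`, with `WeekLabel` supplied as `ticktext` in the same way as above.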