How to plot a dataframe column of lists as horizontal lines
I have a DataFrame with a column 'all_maxs' that can contain lists of different values.
          c            all_maxs
38  50804.6           [50883.3]
39  50743.9           [50883.3]
40  50649.9           [50883.3]
41  50508.3           [50883.3]
42  50577.6           [50883.3]
43  50703.0           [50883.3]
44  50793.7           [50883.3]
45  50647.8  [50883.3, 50813.1]
46  50732.8  [50883.3, 50813.1]
47  50673.2  [50883.3, 50813.1]
df.plot(y='c')
Current result
I need to plot column 'c', and each value in column 'all_maxs' should be drawn as a horizontal line.
Expected result
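- Verify that the 'all_maxs' values are of list type.
- Extract the values from the lists and plot them as horizontal lines.
- Use df = df.dropna() first if there are any NaN values.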
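The checks above can be sketched as follows. This is only a minimal illustration on a small made-up frame; the sample rows and the name is_list are assumptions, not part of the question.

```python
import pandas as pd

# hypothetical sample: the last row has a missing value in all_maxs
df = pd.DataFrame(
    {'c': [50804.6, 50647.8, 50673.2],
     'all_maxs': [[50883.3], [50883.3, 50813.1], None]},
    index=[38, 45, 47],
)

# drop rows with missing values before inspecting the lists
df = df.dropna()

# True for every row whose all_maxs value is an actual list
is_list = df['all_maxs'].apply(lambda v: isinstance(v, list))
print(is_list.all())
```

If this prints False, the column probably holds strings such as '[50883.3]', and the conversion step is needed.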
Imports and DataFrame
- If needed, use ast.literal_eval to convert the 'all_maxs' column type from str to list.
import pandas as pd
from ast import literal_eval
data =\
{38: {'all_maxs': '[50883.3]', 'c': 50804.6},
39: {'all_maxs': '[50883.3]', 'c': 50743.9},
40: {'all_maxs': '[50883.3]', 'c': 50649.9},
41: {'all_maxs': '[50883.3]', 'c': 50508.3},
42: {'all_maxs': '[50883.3]', 'c': 50577.6},
43: {'all_maxs': '[50883.3]', 'c': 50703.0},
44: {'all_maxs': '[50883.3]', 'c': 50793.7},
45: {'all_maxs': '[50883.3, 50813.1]', 'c': 50647.8},
46: {'all_maxs': '[50883.3, 50813.1]', 'c': 50732.8},
47: {'all_maxs': '[50883.3, 50813.1]', 'c': 50673.2}}
df = pd.DataFrame.from_dict(data, orient='index')
# reorder the columns to match the OP
df = df[['c', 'all_maxs']]
# print a value from all_maxs to see the type
print(type(df.loc[38, 'all_maxs']))  # <class 'str'>
# the all_maxs values are currently strings, which must be converted to list type
df['all_maxs'] = df['all_maxs'].apply(literal_eval)
# print a value from all_maxs again to confirm the conversion
print(type(df.loc[38, 'all_maxs']))  # <class 'list'>
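If the column might mix strings with values that are already lists, calling literal_eval on every value would fail on the non-string ones. One possible guard is sketched below; the helper name to_list and the sample Series are hypothetical and only meant to illustrate the idea.

```python
from ast import literal_eval

import pandas as pd


def to_list(value):
    """Parse value with literal_eval only when it is a string; otherwise return it unchanged."""
    if isinstance(value, str):
        return literal_eval(value)
    return value


# hypothetical mixed column: one string, one real list
s = pd.Series(['[50883.3]', [50883.3, 50813.1]], index=[38, 45], dtype=object)
converted = s.apply(to_list)

# every element is now a list
print(converted.apply(type).tolist())
```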
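Plot
- Plot the DataFrame directly with pandas.DataFrame.plot.
  - xticks=df.index creates an xtick for each value in the index; if many values crowd the x-axis, remove this argument.
- Use matplotlib.pyplot.hlines, which accepts a list of values, to plot the unique values from 'all_maxs' as horizontal lines.
  - Use pandas.DataFrame.explode to move all the values out of the lists, then drop duplicates with .drop_duplicates.
  - y= will be the remaining values in the 'all_maxs' column.
  - xmin= will be the remaining index values, so each line starts where its value first appears.
  - xmax= will be the maximum value in the index plotted from df.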
ax = df.plot(y='c', legend=False, figsize=(8, 5), xticks=df.index)
# extract all the values from all_maxs, drop the duplicates
all_maxs = df.all_maxs.explode().drop_duplicates().to_frame()
# add the horizontal lines
ax.hlines(y=all_maxs.all_maxs, xmin=all_maxs.index, xmax=df.index.max(), color='k')
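To see why this yields one line per level, it can help to look at what explode followed by drop_duplicates returns. The sketch below uses a three-row subset of the data; the name levels and the cast to float are additions for illustration (explode on a column of lists returns object dtype).

```python
import pandas as pd

# three-row subset of the example data, already converted to lists
df = pd.DataFrame(
    {'c': [50793.7, 50647.8, 50732.8],
     'all_maxs': [[50883.3], [50883.3, 50813.1], [50883.3, 50813.1]]},
    index=[44, 45, 46],
)

# one row per list element, keeping the original index, then keep each value's first occurrence
levels = df['all_maxs'].explode().drop_duplicates().astype(float)

# index 44 -> 50883.3, index 45 -> 50813.1
print(levels)
```

Each remaining index label is the first row where that value occurs, which is why it works as xmin for its line.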