matplotlib - is it possible to create a normal sankey chart?

I use the plotly code below to create a Sankey chart.

import pandas as pd
import plotly.graph_objects as go

dataset = pd.read_csv('/Users/i073341/Library/CloudStorage/OneDrive-SAPSE/Data Analysis/result/Opportunity/20220121-170612/omp_cleanSankey.csv')

# Unique node labels in first-seen order (sources first, then targets).
# Deduplicating avoids orphan nodes for labels that are both a source and a target.
labelList = list(dict.fromkeys(list(dataset.source) + list(dataset.target)))
# plotly links refer to nodes by integer position in the label list.
labelIndex = {label: i for i, label in enumerate(labelList)}

fig = go.Figure(data=[go.Sankey(
    node=dict(pad=15, thickness=20, line=dict(color="black", width=0.5),
              label=labelList, color="blue"),
    link=dict(source=dataset.source.map(labelIndex),
              target=dataset.target.map(labelIndex),
              value=dataset.value),
)])

fig.update_layout(autosize=False, width=3000, height=1000, hovermode='x',
                  title="performance Goal user behavior monitor",
                  font=dict(size=16, color='black'))

fig.write_html('/Users/i073341/Library/CloudStorage/OneDrive-SAPSE/test/perfUXRGoal.html', auto_open=False)

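The chart that plotly creates looks like this.

I searched and found that matplotlib can also create Sankey charts, but the Sankey chart it creates looks like this.

Can matplotlib create a Sankey chart in a style similar to the one plotly creates?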

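For lack of a good alternative, I bit the bullet and tried creating my own sankey plot, one that looks more like plotly and sankeymatic. It uses purely Matplotlib and produces flows like the one below. I don't see the plot image in your post, though, so I don't know exactly what you want it to look like.

Full code is at the bottom. You can install it with python -m pip install sankeyflow. The basic workflow is simply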

import matplotlib.pyplot as plt
from sankeyflow import Sankey

# flows and nodes are defined as in the full example below
plt.figure()
s = Sankey(flows=flows, nodes=nodes)
s.draw()
plt.show()

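Note that pySankey uses Matplotlib too, but it only allows for 1 level of bijective flow. SankeyFlow is more flexible: it supports multiple levels and the flows don't have to be bijective, but it requires you to define the nodes.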

from sankeyflow import Sankey
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 10), dpi=144)
nodes = [
    [('Product', 20779), ('Service\nand other', 30949)],
    [('Total revenue', 51728)],
    [('Gross margin', 34768), ('Cost of revenue', 16960)],
    [('Operating income', 22247), ('Other income, net', 268), ('Research and\ndevelopment', 5758), ('Sales and marketing', 5379), ('General and\nadministrative', 1384)],
    [('Income before\nincome taxes', 22515)],
    [('Net income', 18765), ('Provision for\nincome taxes', 3750)]
]
flows = [
    ('Product', 'Total revenue', 20779, {'flow_color_mode': 'source'}),
    ('Service\nand other', 'Total revenue', 30949, {'flow_color_mode': 'source'}),
    ('Total revenue', 'Gross margin', 34768),
    ('Total revenue', 'Cost of revenue', 16960),
    ('Gross margin', 'Operating income', 22247),
    ('Gross margin', 'Research and\ndevelopment', 5758),
    ('Gross margin', 'Sales and marketing', 5379),
    ('Gross margin', 'General and\nadministrative', 1384),
    ('Operating income', 'Income before\nincome taxes', 22247),
    ('Other income, net', 'Income before\nincome taxes', 268, {'flow_color_mode': 'source'}),
    ('Income before\nincome taxes', 'Net income', 18765),
    ('Income before\nincome taxes', 'Provision for\nincome taxes', 3750),
]
]

s = Sankey(
    flows=flows,
    nodes=nodes,
)
s.draw()
plt.show()
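
If your data is already a source/target/value table like the CSV in the question, the nodes list above does not have to be typed by hand. Here is a minimal sketch in plain Python of one way to derive it. Note that nodes_from_flows is a hypothetical helper of my own for illustration, not part of sankeyflow. It assumes the flows are acyclic, places each node one level after its deepest parent, and sizes each node by the larger of its total inflow and outflow.

```python
def nodes_from_flows(flows):
    """Build a list of levels, each a list of (label, value) pairs.

    Hypothetical helper (not part of sankeyflow). Each flow is a tuple
    (source, target, value), optionally followed by an options dict.
    """
    inflow, outflow, parents = {}, {}, {}
    for source, target, value, *_ in flows:  # *_ skips the optional options dict
        for label in (source, target):
            inflow.setdefault(label, 0)
            outflow.setdefault(label, 0)
            parents.setdefault(label, [])
        outflow[source] += value
        inflow[target] += value
        parents[target].append(source)

    level = {}

    def depth(label, path=()):
        # Level = length of the longest chain of flows leading into the node.
        if label in path:
            raise ValueError("flows contain a cycle")
        if label not in level:
            level[label] = 1 + max(
                (depth(p, path + (label,)) for p in parents[label]),
                default=-1,  # nodes without parents end up in level 0
            )
        return level[label]

    n_levels = max(map(depth, inflow), default=-1) + 1
    nodes = [[] for _ in range(n_levels)]
    for label in inflow:  # dicts preserve first-seen order
        nodes[level[label]].append((label, max(inflow[label], outflow[label])))
    return nodes


# Made-up rows in the same shape as a source/target/value CSV.
rows = [
    ("Product", "Total revenue", 20779),
    ("Service", "Total revenue", 30949),
    ("Total revenue", "Gross margin", 34768),
    ("Total revenue", "Cost of revenue", 16960),
]
nodes = nodes_from_flows(rows)
```

The result can then be passed as Sankey(flows=rows, nodes=nodes). With a pandas DataFrame, rows could come from something like list(dataset[['source', 'target', 'value']].itertuples(index=False, name=None)), assuming those column names. One caveat of this leveling rule: a node with no incoming flow always lands in the first level, so a node like 'Other income, net' in the example above would need to be moved by hand if you want it drawn further right.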