How to make networkx edges from pandas dataframe rows
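For context:
I am building a visualization of a protein-protein interaction network. Nodes correspond to proteins, and an edge represents an interaction between two nodes.
Here is my code:
First I import all the modules and files I need: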
import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd
interactome_edges = pd.read_csv("*a_directory*", delimiter = "\t", header = None)
interactome_nodes = pd.read_csv("*a_directory*", delimiter = "\t", header = None)
# A few adjustments for the dataframes
interactome_nodes = interactome_nodes.drop(columns = [0])
interactome_edges.columns = ["node1","node2"]
The nodes dataframe looks like this:
      1
0  MET3
1  IMD3
2  OLE1
3  MUP1
4  PIS1
...
The edges dataframe looks like this:
  node1 node2
0  MET3  MET3
1  IMD3  IMD4
2  OLE1  OLE1
3  MUP1  MUP1
4  PIS1  PIS1
...
Basically, each edge goes from node1 to node2.
Now I iterate over each row of the nodes dataframe and of the edges dataframe and use it as a networkx node or edge.
interactome = nx.Graph()
# Adding Nodes to Graph
for index, row in interactome_nodes.iterrows():
    interactome.add_nodes_from(row)
# Adding Edges to Graph
for index, row in interactome_edges.iterrows():
    interactome.add_edges_from(row["node1", "node2"]) #### Here is the problem
My problem is in the part that adds the edges.
I currently get the following error:
KeyError: ('node1', 'node2')
I have also tried:
for index, row in interactome_edges.iterrows():
    interactome.add_edges_from((row["node1"],row["node2"]))
and:
for index, row in interactome_edges.iterrows():
    interactome.add_edges_from(row["node1"],row["node2"])
and simply:
for index, row in interactome_edges.iterrows():
    interactome.add_edges_from(row)
All of these give me some form of error.
How can I use my node-to-node dataframe as the edges of a networkx graph?
In [9]: import networkx as nx
In [10]: import pandas as pd
In [11]: df = pd.read_csv("a.csv")
In [12]: df
Out[12]:
  node1 node2
0  MET3  MET3
1  IMD3  IMD4
2  OLE1  OLE1
3  MUP1  MUP1
4  PIS1  PIS1
In [13]: G=nx.from_pandas_edgelist(df, "node1", "node2")
In [14]: [e for e in G.edges]
Out[14]:
[('MET3', 'MET3'),
('IMD3', 'IMD4'),
('OLE1', 'OLE1'),
('MUP1', 'MUP1'),
('PIS1', 'PIS1')]
NetworkX has functions for reading from pandas dataframes. I used the edges dataframe you provided; here, from_pandas_edgelist builds the graph directly from it.
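For reference, the loops in the question fail because add_edges_from expects an iterable of edge tuples, while each row is a single edge. Iterating over a row yields individual node names rather than pairs, and row["node1", "node2"] looks up the tuple ("node1", "node2") as one label, hence the KeyError. Below is a minimal sketch of two ways to repair the loop. It assumes the same column names as above and uses a small hand-built dataframe in place of the real file; the node name UNLINKED is made up for illustration.

```python
import networkx as nx
import pandas as pd

# Stand-in for the edge file; the real data would come from pd.read_csv.
edges = pd.DataFrame(
    {
        "node1": ["MET3", "IMD3", "OLE1", "MUP1", "PIS1"],
        "node2": ["MET3", "IMD4", "OLE1", "MUP1", "PIS1"],
    }
)

# Option 1: keep the loop, but add one edge per row with add_edge.
g_loop = nx.Graph()
for _, row in edges.iterrows():
    g_loop.add_edge(row["node1"], row["node2"])

# Option 2: hand add_edges_from an iterable of (node1, node2) tuples.
g_bulk = nx.Graph()
g_bulk.add_edges_from(
    edges[["node1", "node2"]].itertuples(index=False, name=None)
)

# Both should match what from_pandas_edgelist builds.
g_ref = nx.from_pandas_edgelist(edges, "node1", "node2")

# Proteins without any interaction never appear in the edge list, so they
# have to be added separately (hypothetical node name, for illustration).
g_ref.add_nodes_from(["UNLINKED"])
```

For large edge lists, from_pandas_edgelist or Option 2 is probably preferable to iterrows, which tends to be slow.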
The graph can then be drawn and saved:
nx.draw_planar(G, with_labels = True)
plt.savefig("filename2.png")
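One caveat: nx.draw_planar only works when the graph is planar, and a full interactome may well not be. Below is a minimal sketch of a fallback, assuming a spring layout is acceptable for the non-planar case. The helper name pick_layout and the fixed seed are illustrative choices, not part of the answer above.

```python
import networkx as nx


def pick_layout(graph, seed=0):
    """Return node positions: planar if possible, spring layout otherwise."""
    is_planar, _ = nx.check_planarity(graph)
    if is_planar:
        return nx.planar_layout(graph)
    # Fixed seed so the picture is reproducible between runs.
    return nx.spring_layout(graph, seed=seed)


small = nx.Graph([("IMD3", "IMD4"), ("IMD4", "OLE1")])  # planar
dense = nx.complete_graph(5)                            # K5 is not planar

pos_small = pick_layout(small)
pos_dense = pick_layout(dense)
```

The returned positions can be passed to the generic drawing call, for example nx.draw(G, pos=pick_layout(G), with_labels=True), before plt.savefig.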