Remove edges having a weight in a specific range in networkx
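I have a dataframe. I converted that dataframe into a graph. After that, I want to remove the edges whose weight falls in a specific range:
Columns: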
df.columns:
Index(['source', 'target', 'weight'], dtype='object')
Length:
len(df)
1048575
Data types:
df.dtypes
source int64
target float64
weight int64
dtype: object
Now, building a networkx graph:
Graphtype = nx.Graph()
G = nx.from_pandas_edgelist(df, 'source','target', edge_attr='weight', create_using=Graphtype)
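To reproduce the setup on a small scale, here is a minimal sketch using a made-up four-row DataFrame. The values and the names `df_small` and `G_small` are hypothetical, not the asker's data; only the column names and dtypes follow the question.

```python
import networkx as nx
import pandas as pd

# Hypothetical toy data with the same three columns as in the question.
df_small = pd.DataFrame({
    "source": [1, 1, 2, 3],
    "target": [2.0, 3.0, 3.0, 4.0],
    "weight": [100, 500, 750, 20],
})

# Same call as in the question, applied to the toy frame.
G_small = nx.from_pandas_edgelist(
    df_small, "source", "target", edge_attr="weight", create_using=nx.Graph()
)

# Each edge carries its weight in an attribute dictionary.
edge_weights = {(u, v): d["weight"] for u, v, d in G_small.edges(data=True)}
print(edge_weights)
```

Note that the integer `2` in `source` and the float `2.0` in `target` hash and compare equal, so they end up as the same node.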
Graph info:
print(nx.info(G))
Name:
Type: Graph
Number of nodes: 609627
Number of edges: 915549
Average degree: 3.0036
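As far as I know, `nx.info` was deprecated and then removed in NetworkX 3.0, so on a recent install the call above may fail. A minimal sketch of getting the same numbers by hand, using a hypothetical four-node path graph `G_demo` in place of the real data:

```python
import networkx as nx

# Hypothetical stand-in graph: a path on 4 nodes (3 edges).
G_demo = nx.path_graph(4)

n_nodes = G_demo.number_of_nodes()
n_edges = G_demo.number_of_edges()

# In an undirected graph every edge contributes to two node degrees.
avg_degree = 2 * n_edges / n_nodes

print(f"Number of nodes: {n_nodes}")
print(f"Number of edges: {n_edges}")
print(f"Average degree: {avg_degree:.4f}")
```

The same formula applied to the figures in the question gives 2 * 915549 / 609627, which is about 3.0036.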
Degrees:
degrees = sorted(G.degree, key=lambda x: x[1], reverse=True)
degrees
[(a, 1111),
(c, 1107),
(f, 836),
(g, 722),
(h, 608),
(k, 600),
(r, 582),
(z, 557),
(l, 417), etc....
What I want to do is remove edges with specific weights. For example, I want to remove all edges with weight >= 500:
to_remove = [(a,b) for a,b in G.edges(data=True) if "weight" >= 500]
G.remove_edges_from(to_remove)
However, I get the following error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-12-df3a67f18df9> in <module>()
----> 1 to_remove = [(a,b) for a,b in G.edges(data=True) if "weight" >=500]
<ipython-input-12-df3a67f18df9> in <listcomp>(.0)
----> 1 to_remove = [(a,b) for a,b in G.edges(data=True) if "weight" >=500]
ValueError: too many values to unpack (expected 2)
Any idea why I get this message? Or is there perhaps a better way to do this?
Thanks
This is how I solved the problem:
threshold = 500
# collect the (u, v) pairs of all edges whose weight is at or above the
# threshold (the question asks for weight >= 500)
le_ids = [(u, v) for u, v, w in G.edges.data('weight') if w >= threshold]
# remove filtered edges from graph G
G.remove_edges_from(le_ids)
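Since the title asks about a range rather than a single threshold, the same idea extends to a lower and an upper bound. This is only a sketch: the graph `G_demo` and the bounds `low` and `high` are hypothetical, and passing `default=None` is one way to tolerate edges that have no weight at all.

```python
import networkx as nx

# Hypothetical toy graph; the last edge deliberately has no weight.
G_demo = nx.Graph()
G_demo.add_weighted_edges_from([
    ("a", "b", 700), ("b", "c", 30), ("c", "d", 500), ("d", "e", 250),
])
G_demo.add_edge("e", "f")

low, high = 200, 600  # hypothetical bounds, both inclusive

# G.edges.data("weight") yields (u, v, weight) triples; edges without a
# weight get the default (None here) and are skipped.
in_range = [
    (u, v)
    for u, v, w in G_demo.edges.data("weight", default=None)
    if w is not None and low <= w <= high
]

G_demo.remove_edges_from(in_range)
print(in_range)
print(list(G_demo.edges))
```

Only the edges are removed; the nodes stay in the graph even if they end up isolated.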
In this comprehension you are comparing the string "weight" with an int, which doesn't make much sense:
[(a,b) for a,b in G.edges(data=True) if "weight" >= 500]
Also, what is really causing the exception is that if you pass data=True, you get 3-tuples, where the third element is the dictionary of attributes.
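A small sketch that makes the shape of those tuples visible and reproduces the error on a hypothetical two-edge graph (`G_demo` is not the asker's graph):

```python
import networkx as nx

# Hypothetical two-edge graph, only to show the shape of the edge view.
G_demo = nx.Graph()
G_demo.add_edge("a", "b", weight=700)
G_demo.add_edge("b", "c", weight=30)

# With data=True every item is (u, v, attribute_dict).
triples = list(G_demo.edges(data=True))
print(triples)

# Unpacking each item into only two names fails, as in the traceback.
try:
    pairs = [(u, v) for u, v in G_demo.edges(data=True)]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```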
What you probably want to do is:
[(a,b) for a, b, attrs in G.edges(data=True) if attrs["weight"] >= 500]
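Put together with the removal step, the corrected comprehension can be sketched end to end as follows. The graph `G_demo` and its weights are made up for illustration.

```python
import networkx as nx

# Hypothetical toy graph with weights on both sides of the threshold.
G_demo = nx.Graph()
G_demo.add_weighted_edges_from([
    ("a", "b", 700), ("b", "c", 30), ("c", "d", 500), ("d", "e", 499),
])

# Build the list first, then remove: the graph should not be modified
# while its edge view is being iterated.
to_remove = [
    (u, v) for u, v, attrs in G_demo.edges(data=True) if attrs["weight"] >= 500
]
G_demo.remove_edges_from(to_remove)

remaining = sorted(G_demo.edges(data="weight"))
print(remaining)
```

`attrs["weight"]` raises a `KeyError` if some edge has no weight; `attrs.get("weight", 0)` is one possible fallback if that can happen in your data.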