Remove edges having a weight in a specific range in networkx

I have a DataFrame, which I convert to a graph. Afterwards, I want to remove the edges whose weight falls within a specific range:

Columns:

df.columns
Index(['source', 'target', 'weight'], dtype='object')

Length:

len(df)
1048575

Data types:

df.dtypes
source      int64
target    float64
weight      int64
dtype: object

Now, build a networkx graph:

import networkx as nx

Graphtype = nx.Graph()
G = nx.from_pandas_edgelist(df, 'source', 'target', edge_attr='weight', create_using=Graphtype)

Graph info:

print(nx.info(G))
Name: 
Type: Graph
Number of nodes: 609627
Number of edges: 915549
Average degree:   3.0036

Degrees:

degrees = sorted(G.degree, key=lambda x: x[1], reverse=True)
degrees
[(a, 1111),
 (c, 1107),
 (f, 836),
 (g, 722),
 (h, 608),
 (k, 600),
 (r, 582),
 (z, 557),
 (l, 417), etc....

What I want to do is remove edges with specific weights. For example, I want to remove all edges with weight >= 500:

to_remove = [(a,b) for a,b in G.edges(data=True) if "weight" >= 500]
G.remove_edges_from(to_remove)

However, I get the following error message:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-12-df3a67f18df9> in <module>()
----> 1 to_remove = [(a,b) for a,b in G.edges(data=True) if "weight" >=500]

<ipython-input-12-df3a67f18df9> in <listcomp>(.0)
----> 1 to_remove = [(a,b) for a,b in G.edges(data=True) if "weight" >=500]

ValueError: too many values to unpack (expected 2)

Any idea why I get this message? Or is there perhaps a better way to do this? Thanks.

This is how I solved the problem:

threshold = 500

# collect all edges with weight at or above the threshold (the question asks for >= 500)
long_edges = [e for e in G.edges.data('weight') if e[2] >= threshold]
# keep only the (source, target) pair of each matching edge
le_ids = [e[:2] for e in long_edges]

# remove filtered edges from graph G
G.remove_edges_from(le_ids)
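
Since the title asks about a weight range rather than a single threshold, the same idea can be extended to a lower and an upper bound. The following is only a minimal sketch on a toy graph: the helper name `remove_edges_in_weight_range` and the bounds `low` and `high` are hypothetical names chosen for illustration, not part of networkx.

```python
import networkx as nx

def remove_edges_in_weight_range(G, low, high, attr="weight"):
    """Remove every edge whose `attr` lies in [low, high]; return the removed (u, v) pairs.

    Hypothetical helper for illustration; not a networkx function.
    """
    # Build the list first so the graph is not modified while it is being iterated.
    to_remove = [
        (u, v)
        for u, v, w in G.edges.data(attr)
        if w is not None and low <= w <= high  # skip edges that lack the attribute
    ]
    G.remove_edges_from(to_remove)
    return to_remove

# Toy graph standing in for the large graph from the question
G = nx.Graph()
G.add_weighted_edges_from([(1, 2, 100), (2, 3, 500), (3, 4, 750), (4, 5, 1200)])

removed = remove_edges_in_weight_range(G, 500, 1000)
print(removed)            # [(2, 3), (3, 4)]
print(sorted(G.edges()))  # [(1, 2), (4, 5)]
```

To reproduce the single-threshold case (weight >= 500), `high` could be set to `float("inf")`.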

In this comprehension, you are comparing the string "weight" with an int, which doesn't make much sense:

[(a,b) for a,b in G.edges(data=True) if "weight" >= 500]

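Moreover, what actually causes the exception is that when you pass data=True, each edge is yielded as a 3-tuple whose third element is the attribute dictionary, so it cannot be unpacked into just a and b.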
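
To make the tuple shapes concrete, here is a small sketch on a one-edge toy graph (the graph and variable names are made up for illustration). It prints what each way of iterating over the edges yields and reproduces the unpacking error from the question.

```python
import networkx as nx

G = nx.Graph()
G.add_edge(1, 2, weight=700)

# Plain iteration yields 2-tuples (u, v).
plain = list(G.edges())
# data=True yields 3-tuples (u, v, attribute_dict).
with_data = list(G.edges(data=True))
# data('weight') yields 3-tuples (u, v, weight_value).
with_weight = list(G.edges.data('weight'))

print(plain)        # [(1, 2)]
print(with_data)    # [(1, 2, {'weight': 700})]
print(with_weight)  # [(1, 2, 700)]

# Unpacking a 3-tuple into two names reproduces the error from the question.
try:
    [(a, b) for a, b in G.edges(data=True)]
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)  # too many values to unpack (expected 2)
```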

What you probably want to do is:

[(a,b) for a, b, attrs in G.edges(data=True) if attrs["weight"] >= 500]
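
As a quick end-to-end check, the corrected comprehension can be tried on a toy graph before running it on the full data. This is only a sketch; the node labels and weights are invented for illustration.

```python
import networkx as nx

# Toy graph with three weighted edges
G = nx.Graph()
G.add_weighted_edges_from([("a", "b", 120), ("b", "c", 500), ("c", "d", 980)])

# Unpack all three elements and read the weight from the attribute dict.
to_remove = [(a, b) for a, b, attrs in G.edges(data=True) if attrs["weight"] >= 500]
G.remove_edges_from(to_remove)

print(to_remove)                 # [('b', 'c'), ('c', 'd')]
print(list(G.edges(data=True)))  # [('a', 'b', {'weight': 120})]
print(G.number_of_nodes())       # 4: removing edges leaves the nodes in place
```

Note that `attrs["weight"]` raises a KeyError if some edge has no weight attribute; `attrs.get("weight", 0)` is one way to guard against that, assuming a default of 0 is acceptable for your data.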